Best Practice: Preserve
Select a Best Practice below to learn more about the “Preserve” stage in the Data Life Cycle.
What is the “Preserve” stage?
Plan to preserve data in the short term to minimize potential losses (e.g., via accidents), and in the long term so that project stakeholders and others can access, interpret, and use the data in the future. Decide what data to preserve, where to preserve it, and what documentation needs to accompany the data. Map out the processes and resources for the entire Data Life Cycle.
More information can be found in the Best Practices Primer.
Backup your data
To avoid accidental loss of data you should:
Choose and use standard terminology to enable discovery
Terms and phrases that are used to represent categorical data values or for creating content in metadata records should reflect appropriate and accepted vocabularies in your community or institution. Methods used to identify and select the proper termin...
Create and document a data backup policy
A backup policy helps manage users’ expectations and provides specific guidance on the “who, what, when, and how” of the data backup and restore process. There are several benefits to documenting your data backup policy: Helps clarify the policies, p...
Decide what data to preserve
The process of science generates a variety of products that are worthy of preservation. Researchers should consider all elements of the scientific process in deciding what to preserve:
Ensure flexible data services for virtual datasets
For a large dataset to be used effectively by a variety of end users, the following procedures for preparing a virtual dataset are recommended:
Ensure integrity and accessibility when making backups of data
For successful data replication and backup:
Ensure the reliability of your storage media
All storage media, whether hard drives, discs or data tapes, will wear out over time, rendering your data files inaccessible. To ensure ongoing access to both your active data files and your data archives, it is important to continually monitor the cond...
Identify and use relevant metadata standards
Metadata content standards often overlap significantly. You should identify those standards that include the fields needed to describe your data. In order to describe your data, you need to decide what information is required for data us...
Identify data sensitivity
Steps for identifying the sensitivity of data and determining the appropriate security or privacy level are: Determine if the data has any confidentiality concerns. Can an unauthorized individual use the information to do...
Identify data with long-term value
As part of the data life cycle, research data will be contributed to a repository to support preservation and discovery. A research project may generate many different iterations of the same dataset - for example, the raw data from the instruments, as w...
Identify suitable repositories for the data
Shaping the data management plan towards a specific desired repository will increase the likelihood that the data will be accepted into that repository and increase the discoverability of the data within the desired repository. When beginning a data man...
Plan data management early in your project
A Data Management Plan should include the following information:
Plan for effective multimedia management
Multimedia data present unique challenges for data discovery, accessibility, and metadata formatting and should be thoughtfully managed. Researchers should establish their own requirements for management of multimedia during and after a research project...
Preserve information: Keep raw data raw
In order to preserve the raw data for future use:
Provide a citation and document provenance for your dataset
For appropriate attribution and provenance of a dataset, the following information should be included in the data documentation or the companion metadata file:
Provide identifier for dataset used
In order to ensure replicable data access:
Provide version information for use and discovery
Provide versions of data products with defined identifiers to enable discovery and use.
Recognize stakeholders in data ownership
When creating the data management plan, review everyone who may have a stake in the data so that future users can easily track who may need to give permission. Possible stakeholders include but are not limited to:
Store data with appropriate precision
Data should not be entered with higher precision than that at which they were collected (e.g., if a device collects data to 2 decimal places, an Excel file should not present it to 5 decimal places). If the system stores data in higher precision, care needs to be taken when exporting to ASC...
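The precision guidance can be illustrated with a minimal Python sketch. It assumes a hypothetical instrument that records readings to 2 decimal places and a storage system that holds them as floating-point values with spurious extra digits. The function names, column names, and sample values are illustrative only and do not come from the Best Practice itself.

```python
import csv
import io


def format_reading(value: float, decimals: int) -> str:
    """Render a stored value at the precision the instrument recorded."""
    return f"{value:.{decimals}f}"


def export_rows(rows, decimals: int) -> str:
    """Write (sample_id, reading) pairs to CSV text at a fixed precision."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample_id", "reading"])  # hypothetical column names
    for sample_id, value in rows:
        writer.writerow([sample_id, format_reading(value, decimals)])
    return buffer.getvalue()


# Hypothetical readings: stored with more digits than were measured.
rows = [("A1", 12.340000152587890), ("A2", 7.5)]
print(export_rows(rows, decimals=2))
```

Formatting at export time, rather than rounding the stored values, leaves the original data untouched while keeping the exported text file from implying more precision than the device provided.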