Best Practice: Analyze
Select a Best Practice below to learn more about the “Analyze” stage in the Data Life Cycle.
What is the “Analyze” stage?
Create analyses and visualizations to identify patterns, test hypotheses, and illustrate findings. During this process, record your methods, document data processing steps, and ensure your data are reproducible. Learn about these best practices and more.
More information can be found in the Best Practices Primer.
Describe method to create derived data products
When describing the process for creating derived data products, the following information should be included in the data documentation or the companion metadata file: ...
Document steps used in data processing
Different types of new data may be created in the course of a project, for instance visualizations, plots, statistical outputs, a new dataset created by integrating multiple datasets, etc. Whenever possible, document your workflow (the process used to c...
Ensure datasets used are reproducible
When searching for data, whether locally on one’s machine or in external repositories, one may use a variety of search terms. In addition, data are often housed in databases or clearinghouses where a query is required in order to access data. In order to r...
Identify most appropriate software
Follow the steps below to choose the most appropriate software to meet your needs. Identify what you want to achieve (discover data, analyze data, write a paper, etc.). Identify the necessary software features for your project (i.e., functional requi...
Identify outliers
Outliers may not be the result of actual observations, but rather the result of errors in data collection, data recording, or other parts of the data life cycle. The following can be used to identify outliers for closer examination: ...
Identify values that are estimated
Data tables should ideally include values that were acquired in a consistent fashion. However, sometimes instruments fail and gaps appear in the records. For example, a data table representing a series of temperature measurements collected over time fro...
Store data with appropriate precision
Data should not be entered with higher precision than they were collected in (e.g., if a device collects data to 2 dp, an Excel file should not present it to 5 dp). If the system stores data in higher precision, care needs to be taken when exporting to ASC...
Understand the geospatial parameters of multiple data sources
Understand the input geospatial data parameters, including scale, map projection, geographic datum, and resolution, when integrating data from multiple sources. Care should be taken to ensure that the geospatial parameters of the source datasets can be ...
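As a rough illustration of the geospatial practice above, the sketch below records the four parameters named in the text (scale, map projection, geographic datum, and resolution) for each source and reports any that differ before the datasets are integrated. The `GeoParams` class, the `find_mismatches` helper, and the sample dataset names and values are all hypothetical and chosen only for illustration; a real project would normally read these parameters from each dataset's metadata.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GeoParams:
    """Hypothetical record of the four parameters named in the text."""
    scale: str           # e.g. "1:24000"
    projection: str      # e.g. "UTM zone 17N"
    datum: str           # e.g. "NAD83"
    resolution_m: float  # cell size in meters


def find_mismatches(sources):
    """Return {parameter: {source_name: value}} for each parameter that differs."""
    mismatches = {}
    for field in fields(GeoParams):
        values = {name: getattr(params, field.name)
                  for name, params in sources.items()}
        if len(set(values.values())) > 1:
            mismatches[field.name] = values
    return mismatches


# Illustrative (made-up) sources that differ only in their datum.
sources = {
    "elevation": GeoParams("1:24000", "UTM zone 17N", "NAD83", 30.0),
    "landcover": GeoParams("1:24000", "UTM zone 17N", "WGS84", 30.0),
}

report = find_mismatches(sources)
for param, values in report.items():
    print(f"{param} differs: {values}")
```

An empty report only means the recorded parameters agree; it does not confirm that the recorded metadata are themselves correct.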