Best Practice: Analyze
Select a Best Practice below to learn more about the “Analyze” stage in the Data Life Cycle.
What is the “Analyze” stage?
Create analyses and visualizations to identify patterns, test hypotheses, and illustrate findings. During this process, record your methods, document data processing steps, and ensure your data are reproducible. Learn about these best practices and more.
More information can be found in the Best Practices Primer.
Describe method to create derived data products
When describing the process for creating derived data products, the following information should be included in the data documentation or the companion metadata file: (click for more)
Tags: analyze data processing describe provenance
Document steps used in data processing
Different types of new data may be created in the course of a project, for instance visualizations, plots, statistical outputs, a new dataset created by integrating multiple datasets, etc. Whenever possible, document your workflow (the process used to c... (click for more)
Tags: analyze data processing describe integrate provenance replicable data
Ensure datasets used are reproducible
When searching for data, whether locally on one’s machine or in external repositories, one may use a variety of search terms. In addition, data are often housed in databases or clearinghouses where a query is required in order to access data. In order to r... (click for more)
Tags: analyze assure data archives data processing discover provenance replicable data
Identify most appropriate software
Follow the steps below to choose the most appropriate software to meet your needs. Identify what you want to achieve (discover data, analyze data, write a paper, etc.). Identify the necessary software features for your project (i.e. functional requi... (click for more)
Tags: analyze data processing data services
Identify outliers
Outliers may not be the result of actual observations, but rather the result of errors in data collection, data recording, or other parts of the data life cycle. The following can be used to identify outliers for closer examination: (click for more)
Tags: analyze annotation assure quality
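As an illustration of flagging candidate outliers for closer examination, here is a minimal Python sketch using the interquartile-range (IQR) rule. This is one common screening technique, chosen here as an assumption; it is not necessarily among the methods listed in the full best practice. The function name `flag_outliers_iqr`, the multiplier `k`, and the sample readings are hypothetical.

```python
import statistics

def flag_outliers_iqr(values, k=1.5):
    """Pair each value with True if it falls outside the IQR fences.

    Assumption: the IQR rule with multiplier k is an acceptable
    screening method for these data. Flagged values are candidates
    for review, not confirmed errors.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]

# Hypothetical instrument readings with one suspicious value.
readings = [10.1, 10.3, 9.8, 10.0, 10.2, 25.0, 9.9]
flagged = flag_outliers_iqr(readings)
suspects = [v for v, is_outlier in flagged if is_outlier]
print(suspects)
```

Note that the sketch keeps every value and only attaches a flag, so a flagged observation can still be examined and retained if it turns out to be genuine.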
Identify values that are estimated
Data tables should ideally include values that were acquired in a consistent fashion. However, sometimes instruments fail and gaps appear in the records. For example, a data table representing a series of temperature measurements collected over time fro... (click for more)
Tags: analyze assure flag quality
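One way to keep estimated values distinguishable from measured ones is to store a flag alongside each value. The sketch below assumes gaps are recorded as `None` and are filled by linear interpolation between the nearest measured neighbors. The flag codes (`"M"` measured, `"E"` estimated, `"X"` still missing), the function name, and the sample temperatures are all hypothetical, and linear interpolation is only one of many possible estimation methods.

```python
def fill_gaps_with_flags(series):
    """Return (value, flag) pairs for a series with None gaps.

    Assumed flag codes (hypothetical): "M" = measured,
    "E" = estimated by linear interpolation, "X" = missing and
    not estimable because a measured neighbor is absent.
    """
    n = len(series)
    result = []
    for i, value in enumerate(series):
        if value is not None:
            result.append((value, "M"))
            continue
        # Find the nearest measured neighbors on each side of the gap.
        left = next((j for j in range(i - 1, -1, -1) if series[j] is not None), None)
        right = next((j for j in range(i + 1, n) if series[j] is not None), None)
        if left is None or right is None:
            result.append((None, "X"))
        else:
            frac = (i - left) / (right - left)
            estimate = series[left] + frac * (series[right] - series[left])
            result.append((estimate, "E"))
    return result

# Hypothetical temperature record with two gaps.
temps = [12.0, None, 14.0, 15.0, None]
flagged_temps = fill_gaps_with_flags(temps)
print(flagged_temps)
```

Whatever codes are used, they would need to be defined in the data documentation so that later users can separate estimated values from measurements.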
Store data with appropriate precision
Data should not be entered with higher precision than they were collected in (e.g., if a device collects data to 2 dp, an Excel file should not present it to 5 dp). If the system stores data in higher precision, care needs to be taken when exporting to ASC... (click for more)
Tags: analyze measurement preserve storage
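The point about export precision can be sketched in Python as follows. The example assumes values were collected to 2 decimal places but are held in memory as floating-point numbers, and it formats them explicitly when writing delimited text. The function name `write_with_precision`, the column names, and the sample rows are hypothetical.

```python
import csv
import io

def write_with_precision(rows, decimals):
    """Write (sample_id, value) rows as CSV text with fixed decimals.

    Assumption: `decimals` matches the precision at which the
    instrument actually collected the data.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample_id", "value"])
    for sample_id, value in rows:
        # Format explicitly so the export does not imply extra precision.
        writer.writerow([sample_id, f"{value:.{decimals}f}"])
    return buf.getvalue()

# Hypothetical values held as floats in memory.
stored = [("a", 3.1400000000000001), ("b", 2.5)]
csv_text = write_with_precision(stored, decimals=2)
print(csv_text)
```

Formatting at export time, rather than rounding the stored values, leaves the in-memory data untouched while keeping the exported file consistent with the collection precision.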
Understand the geospatial parameters of multiple data sources
Understand the input geospatial data parameters, including scale, map projection, geographic datum, and resolution, when integrating data from multiple sources. Care should be taken to ensure that the geospatial parameters of the source datasets can be ... (click for more)
Tags: analyze documentation geography geospatial integrate metadata provenance