Consider the compatibility of the data you are integrating
Data Life Cycle stage(s): Assure
The integration of multiple data sets from different sources requires that they be compatible. Methods used to create the data should be considered early in the process, to avoid problems later during attempts to integrate data sets. Note that just because data can be integrated does not necessarily mean that they should be, or that the final product can meet the needs of the study. Where possible, clearly state situations or conditions where it is and is not appropriate to use your data, and provide information (such as software used and good metadata) to make integration easier.
Rationale
When using integrated data sets, it is crucial that the data are comparable and compatible to avoid mistakes in analyses and interpretation.
Additional Information
Burley, T.E., and Peine, J.D., 2009, NBII-SAIN Data Management Toolkit, U.S. Geological Survey Open-File Report 2009-1170, 96 p. Available from: http://pubs.usgs.gov/of/2009/1170/
Examples
Water-quality data collected by two separate agencies may be thematically similar but may have been sampled using completely different methods. Differences between such sampling methods can include equipment, sampling protocol, and lab analysis procedures. Analysis performed on integrated water-quality data that were collected using completely different methods would likely produce questionable results.
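One way to catch such differences before integrating is to compare the method metadata of each data set. The following is a minimal sketch, not a prescribed procedure: the field names (`equipment`, `sampling_protocol`, `lab_procedure`, `units`), the function `find_method_mismatches`, and the two agency records are all hypothetical and chosen only for illustration.

```python
# Hypothetical metadata fields; real data sets would define their own.
METHOD_FIELDS = ("equipment", "sampling_protocol", "lab_procedure", "units")


def find_method_mismatches(meta_a, meta_b, fields=METHOD_FIELDS):
    """Return {field: (value_a, value_b)} for each field that differs
    between the two metadata records or is missing from either one."""
    mismatches = {}
    for field in fields:
        value_a = meta_a.get(field)
        value_b = meta_b.get(field)
        if value_a is None or value_b is None or value_a != value_b:
            mismatches[field] = (value_a, value_b)
    return mismatches


# Made-up metadata records for two agencies (illustrative values only).
agency_a = {
    "equipment": "grab sampler",
    "sampling_protocol": "depth-integrated",
    "lab_procedure": "procedure A",
    "units": "mg/L",
}
agency_b = {
    "equipment": "automatic sampler",
    "sampling_protocol": "depth-integrated",
    "lab_procedure": "procedure B",
    "units": "mg/L",
}

mismatches = find_method_mismatches(agency_a, agency_b)
if mismatches:
    print("Review these differences before integrating:")
    for field, (value_a, value_b) in mismatches.items():
        print(f"  {field}: {value_a!r} vs {value_b!r}")
else:
    print("No differences found in the compared method fields.")
```

A check like this only flags differences in the documented fields. Deciding whether the data sets should still be integrated remains a judgment for the people who know the study.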
Cite this best practice:
Eric Lind, Steve Aulenback, Tom Burley, DataONE (May 11, 2011) "Best Practice: Consider the compatibility of the data you are integrating". Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/bestpractices/consider-the-compatibility on Mar 01, 2024