Data Management Skillbuilding Hub

Best Practice: Identify missing values and define missing value codes

Data Life Cycle stage(s): Assure

Missing values should be handled carefully so that they do not distort analyses. The content and structure of data tables are best maintained when consistent codes are used to indicate that a value is missing in a data field. Commonly used approaches for coding missing values include:

  • Use a missing value code that matches the reporting format for the specific parameter. For example, use "-999.99" when the reporting format is a FORTRAN-like F7.2.
  • For character fields, it may be appropriate to use "Not applicable" or "None", depending upon the organization of the data file.
  • It might be useful to use a placeholder value such as "Pending assignment" when compiling draft information, to make it easier to return to incomplete fields.
  • Do not use character codes in an otherwise numeric field.
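As a minimal sketch of the first and last points above, the Python snippet below checks that a missing value code fits an F7.2-style field (7 characters wide, 2 decimal places) and flags character codes in a numeric column. The names `MISSING`, `format_f7_2`, and `non_numeric_entries` are hypothetical and chosen only for illustration.

```python
MISSING = -999.99  # assumed missing value code for an F7.2 field


def format_f7_2(value):
    """Format a number like FORTRAN F7.2: total width 7, 2 decimal places."""
    text = f"{value:7.2f}"
    if len(text) > 7:
        raise ValueError(f"{value!r} does not fit in an F7.2 field")
    return text


def non_numeric_entries(column):
    """Return the entries of a numeric column that cannot be parsed as numbers.

    Note: float() also accepts strings such as "nan" and "inf", so those
    would need a separate check if they are not allowed in the data file.
    """
    bad = []
    for entry in column:
        try:
            float(entry)
        except ValueError:
            bad.append(entry)
    return bad


print(format_f7_2(MISSING))  # prints "-999.99"
print(format_f7_2(12.5))     # prints "  12.50"
print(non_numeric_entries(["12.50", "-999.99", "NA", "3.10"]))  # prints ['NA']
```

Here the code "-999.99" occupies exactly the seven characters the format allows, while a character code such as "NA" is reported as a problem in the numeric column.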

Whatever missing value code is chosen, it should be used consistently throughout all associated data files and identified in the metadata and/or data description files.
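One way to keep the data and its description in step is to let the reading code take the missing value codes from the metadata rather than hard-coding them. The sketch below assumes a hypothetical metadata dictionary (`METADATA`) with one `missing_value_code` per column and a small made-up sample file (`SAMPLE`); neither comes from a real data set or a specific metadata standard.

```python
import csv
import io

# Hypothetical metadata: one declared missing value code per column.
METADATA = {
    "temp_c": {"missing_value_code": "-999.99"},
    "site_note": {"missing_value_code": "Not applicable"},
}

# Made-up sample data for illustration only.
SAMPLE = """site,temp_c,site_note
A,12.50,shaded
B,-999.99,Not applicable
C,14.10,Not applicable
"""


def read_with_missing(text, metadata):
    """Read CSV text, replacing each column's declared missing code with None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        cleaned = {}
        for column, value in row.items():
            code = metadata.get(column, {}).get("missing_value_code")
            cleaned[column] = None if value == code else value
        rows.append(cleaned)
    return rows


rows = read_with_missing(SAMPLE, METADATA)
for row in rows:
    print(row)
```

Because the codes live in one place, a change to the missing value code only has to be made in the metadata, and every file read through this function is interpreted the same way.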

Rationale

Missing values are common in environmental data and affect interpretation, analysis, and calculations. Therefore, they need to be carefully defined and properly described. In addition, many instruments automatically add missing value codes to their data streams, and these codes have to be handled before storage and analysis.
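A small worked example may make the effect on calculations concrete. The readings below are invented for illustration, and `-999.99` is assumed to be the missing value code written by the instrument.

```python
from statistics import mean

MISSING = -999.99  # assumed missing value code in the data stream

# Invented readings; the third one is a missing value.
readings = [12.5, 13.1, MISSING, 12.9]

# Treating the code as a real measurement drags the mean far below zero.
naive = mean(readings)

# Excluding the code first gives the mean of the three valid readings.
valid = [r for r in readings if r != MISSING]
correct = mean(valid)

print(f"naive mean:   {naive:.2f}")
print(f"correct mean: {correct:.2f}")
```

A single unhandled code is enough to make the summary statistic meaningless, which is why the code needs to be declared and filtered before any analysis.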

Additional Information

Agencies that run monitoring programs, such as EPA and USGS, publish online documentation on how to handle missing values, which can be consulted.

Hook, L. A., T. W. Beaty, S. Santhana Vannan, L. Baskaran, and R. B. Cook. 2007. Best Practices for Preparing Environmental Data Sets to Share and Archive. http://dx.doi.org/10.3334/ORNLDAAC/BestPractices-2010

Cook, R. B., R. J. Olson, P. Kanciruk, and L. A. Hook. 2001. Best Practices for Preparing Ecological and Ground-Based Data Sets to Share and Archive. Bulletin of ESA 82: 138-141. http://www.jstor.org/stable/20168543

Borer, E. T., E. W. Seabloom, M. B. Jones, and M. Schildhauer. 2009. Some Simple Guidelines for Effective Data Management. Bulletin of ESA 90: 209-214. https://doi.org/10.1890/0012-9623-90.2.205

Cite this best practice:

DataONE Best Practices Working Group, DataONE (July 01, 2010) "Best Practice: Identify missing values and define missing value codes". Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/bestpractices/identify-missing-values on Mar 01, 2024

