Define the data model
A data model documents and organizes data, how it is stored and accessed, and the relationships among different types of data. The model may be abstract or concrete.
Use these guidelines to create a data model:
- Identify the different data components, considering raw and processed data as well as associated metadata (these are called entities)
- Identify the relationships between the different data components (these are called associations)
- Identify anticipated uses of the data (these are called requirements), with recognition that data may be most valuable in the future for unanticipated uses
- Identify the strengths and constraints of the technology (hardware and software) that you plan to use during your project (this is called a technology assessment phase)
- Build a draft model of the entities and their relationships, attempting to keep the model independent of any specific uses or technology constraints
- Incorporate intended usage and technology constraints as needed to derive the simplest, most general model possible
- Test the model with different scenarios, including best- and worst-case (worst-case includes problems such as invalid raw data, user mistakes, failing algorithms, etc.)
- Repeat these steps to optimize the model
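As an illustration of the guidelines above, here is a minimal sketch of a draft model written in Python. Everything in it is hypothetical and chosen only for the example: the entities `Site` and `RawObservation`, the `Dataset` container, and the field names are assumptions, not part of this best practice. It shows two entities, one association between them (each observation belongs to one site), and a check that rejects the kind of worst-case input mentioned above (invalid raw data).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Site:
    """Entity: a sampling location (hypothetical example)."""
    site_id: str
    name: str


@dataclass(frozen=True)
class RawObservation:
    """Entity: one raw reading plus the metadata that describes it."""
    obs_id: str
    site_id: str             # association: each observation belongs to one Site
    observed_on: date
    value: Optional[float]   # None represents a missing raw reading
    units: str


@dataclass
class Dataset:
    """Holds the entities and enforces the association between them."""
    sites: dict = field(default_factory=dict)
    observations: list = field(default_factory=list)

    def add_site(self, site: Site) -> None:
        self.sites[site.site_id] = site

    def add_observation(self, obs: RawObservation) -> None:
        # Worst-case handling: reject records that violate the model.
        if obs.site_id not in self.sites:
            raise ValueError(f"unknown site: {obs.site_id}")
        if obs.value is None:
            raise ValueError(f"missing value in observation {obs.obs_id}")
        self.observations.append(obs)

    def observations_for(self, site_id: str) -> list:
        """Follow the association from a site to its observations."""
        return [o for o in self.observations if o.site_id == site_id]


# Best-case scenario: a valid site and a valid observation.
dataset = Dataset()
dataset.add_site(Site("S1", "Pond A"))
dataset.add_observation(RawObservation("O1", "S1", date(2011, 5, 11), 7.2, "pH"))

# Worst-case scenario: an observation that refers to a site that does not exist.
try:
    dataset.add_observation(RawObservation("O2", "S9", date(2011, 5, 12), 7.0, "pH"))
except ValueError as err:
    print("rejected:", err)
```

The sketch keeps the entities independent of any storage technology, so the same model could later be mapped to a relational database, flat files, or another system once the technology assessment is complete.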
Considering and creating the data model helps with data planning and identifies potential problems that future data users might encounter.
Examples of different types of data models can be found at http://www.databaseanswers.org/data_models/index.htm
Cite this best practice: Damien Gessler, Todd Grappone, DataONE (May 11, 2011) "Best Practice: Define the data model". Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/bestpractices/define-the-data on May 24, 2019.