Data Management Skillbuilding Hub

Best Practice: Assign descriptive file names

Data Life Cycle stage(s): Describe, Discover

File names should reflect the contents of the file and include enough information to uniquely identify the data file. File names may contain information such as project acronym, study title, location, investigator, year(s) of study, data type, version number, and file type.

When choosing a file name, check for any database management limitations on file name length and use of special characters. In general, lower-case names are less software- and platform-dependent. Avoid using spaces and special characters in file names, directory paths, and field names, because automated processing, URLs, and other systems often use spaces and special characters to parse text strings. Instead, consider using underscores ( _ ) or dashes ( - ) to separate meaningful parts of file names. Avoid $ % ^ & # and similar characters.
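The character rules above can be applied mechanically. The following is a minimal Python sketch, not part of any DataONE tool: the helper name sanitize_file_name and the exact set of allowed characters (letters, digits, underscore, dash) are assumptions made for illustration. It lower-cases the name and replaces each run of spaces or special characters with a single underscore.

```python
import re


def sanitize_file_name(name: str) -> str:
    """Hypothetical helper: lower-case a file name and replace spaces and
    special characters with underscores, keeping the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    # Replace each run of disallowed characters with one underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    # Keep only letters and digits in the extension.
    ext = re.sub(r"[^A-Za-z0-9]+", "", ext)
    cleaned = f"{stem}.{ext}" if ext else stem
    return cleaned.lower()


print(sanitize_file_name("data May2011.csv"))        # data_may2011.csv
print(sanitize_file_name("NPP $results#2001.CSV"))   # npp_results_2001.csv
```

Lower-casing is optional here; it follows the general remark that lower-case names are less platform dependent, and can be dropped if mixed-case names are preferred.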

If versioning is desired, a date string within the file name is recommended to indicate the version.
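One way to add such a date string is sketched below in Python. The helper name add_date_version is hypothetical, and the YYYYMMDD format is an assumption (the guidance above does not prescribe a format); it is chosen here because names in that format sort chronologically.

```python
from datetime import date
from pathlib import Path


def add_date_version(file_name: str, version_date: date) -> str:
    """Hypothetical helper: insert a YYYYMMDD date string before the extension."""
    path = Path(file_name)
    stamp = version_date.strftime("%Y%m%d")  # assumed format, e.g. 20100701
    return f"{path.stem}_{stamp}{path.suffix}"


print(add_date_version("Sevilleta_LTER_NM_2001_NPP.csv", date(2010, 7, 1)))
# Sevilleta_LTER_NM_2001_NPP_20100701.csv
```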

Avoid uninformative file names such as mydata.dat or 1998.dat, which do not identify the project, location, or type of data.

Rationale

Clear, descriptive, and unique file names become important when your data file is placed in a directory or on an FTP site alongside your other data files or the data files of other investigators. File names that reflect the contents of the file and uniquely identify it enable precise search and discovery of particular files.

Additional Information

Hook, Les A., Suresh K. Santhana Vannan, Tammy W. Beaty, Robert B. Cook, and Bruce E. Wilson. 2010. Best Practices for Preparing Environmental Data Sets to Share and Archive. Available online at http://daac.ornl.gov/PI/BestPractices-2010.pdf from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/BestPractices-2010

Borer et al. 2009. Some Simple Guidelines for Effective Data Management. Bull. of ESA 90: 209-214.

Examples

An example of a good data file name:

Sevilleta_LTER_NM_2001_NPP.csv

  • Sevilleta_LTER is the project name
  • NM is the state abbreviation
  • 2001 is the calendar year
  • NPP represents Net Primary Productivity data
  • csv is the file type: ASCII comma-separated values
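The components listed above can also be assembled programmatically, which helps keep names consistent across many files. This is a minimal Python sketch; the function build_file_name and its parameter names are hypothetical, and the component order simply mirrors the example name.

```python
def build_file_name(project: str, location: str, year: int,
                    data_type: str, file_type: str) -> str:
    """Hypothetical helper: join descriptive components with underscores."""
    parts = [project, location, str(year), data_type]
    # Replace spaces inside a component so only underscores separate the parts.
    stem = "_".join(part.strip().replace(" ", "_") for part in parts)
    return f"{stem}.{file_type}"


print(build_file_name("Sevilleta LTER", "NM", 2001, "NPP", "csv"))
# Sevilleta_LTER_NM_2001_NPP.csv
```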

Instead of “data May2011” use “data_May2011” or “data-May2011”

Cite this best practice:

DataONE Best Practices Working Group, DataONE (July 01, 2010) "Best Practice: Assign descriptive file names". Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/bestpractices/assign-descriptive-file on Mar 01, 2024

