Data Management Skillbuilding Hub

Best Practice: Ensure flexible data services for virtual datasets

Ensure flexible data services for virtual datasets

Data Life Cycle stage(s): Describe, Preserve

For a large dataset to be used effectively by a variety of end users, the following steps for preparing a virtual dataset are recommended:

  • Identify data service users

  • Define the data access capabilities needed by each community of users. For example:
    • Spatial subsetting
    • Temporal subsetting
    • Parameter subsetting
    • Coordinate transformation
    • Statistical characterization
  • Define service interfaces based upon Open Standards. For example:
    • Open Geospatial Consortium (Web Map Service [WMS], Web Feature Service [WFS], Web Coverage Service [WCS])
    • W3C (SOAP)
    • IETF (Hypertext Transfer Protocol [HTTP], on which REST-style interfaces are built)
  • Publish metadata for these services based upon Open Standards. For example:
    • Web Services Description Language (WSDL)
    • RSS/Atom (see the Service Casting reference below for an example model for publishing service metadata for a variety of service types)
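
As an illustration of the last two steps, the sketch below builds an OGC WCS 2.0 key-value GetCoverage request with spatial subsetting and then wraps the resulting URL in a minimal Atom entry. Everything specific in it is an assumption made for illustration: the endpoint https://example.org/wcs, the coverage name, the axis labels Lat and Long (real labels depend on the coverage), and the helper names. The Atom entry is a plain fragment, not the Service Casting convention itself, and a complete feed needs further required elements such as an author.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_getcoverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Build a WCS 2.0 key-value GetCoverage URL (hypothetical helper).

    subsets maps an axis label to a (low, high) pair; the labels are
    assumptions and must match those advertised by the real coverage.
    """
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", fmt),
    ]
    # One "subset" parameter per trimmed axis, e.g. subset=Lat(30,45)
    for axis, (low, high) in subsets.items():
        params.append(("subset", f"{axis}({low},{high})"))
    return f"{endpoint}?{urlencode(params)}"


def service_entry(title, url, updated):
    """Return a minimal Atom entry that advertises a service URL."""
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = url
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"rel": "related", "href": url})
    return ET.tostring(entry, encoding="unicode")


# Hypothetical endpoint and coverage, used only to show the call pattern.
url = build_getcoverage_url(
    "https://example.org/wcs",
    "sea_surface_temperature",
    {"Lat": (30, 45), "Long": (-80, -60)},
)
entry_xml = service_entry("SST subset, NW Atlantic", url, "2010-07-01T00:00:00Z")
```

A temporal axis could be trimmed the same way if the coverage exposes one; how time values are written depends on the server and the coverage.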

Rationale:

Some datasets are too large to deliver efficiently in their entirety, or are not directly usable by some users. To enable their effective use by a variety of end users, data collections may be published as “virtual” datasets: products that are extracted from, or computed on, the source data by pre-defined functions rather than delivered as the complete source.
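
The idea of a virtual dataset can be sketched in a few lines: the source records stay where they are, and users receive only what pre-defined functions extract or compute from them. The class and field names below (Observation, VirtualDataset, select, mean) and the sample values are hypothetical, chosen only to show parameter, spatial, and temporal subsetting plus one statistical characterization.

```python
import statistics
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    lat: float
    lon: float
    day: date
    values: dict  # parameter name -> measured value


class VirtualDataset:
    """Serve derived products from source data without copying it."""

    def __init__(self, source):
        self._source = source  # any iterable of Observation

    def select(self, parameter, bbox=None, period=None):
        """Lazily yield values of one parameter inside bbox and period.

        bbox is (south, west, north, east); period is (start, end).
        """
        for obs in self._source:
            if bbox is not None:
                south, west, north, east = bbox
                if not (south <= obs.lat <= north and west <= obs.lon <= east):
                    continue
            if period is not None:
                start, end = period
                if not (start <= obs.day <= end):
                    continue
            if parameter in obs.values:
                yield obs.values[parameter]

    def mean(self, parameter, **filters):
        """Statistical characterization: mean of the selected values."""
        values = list(self.select(parameter, **filters))
        return statistics.mean(values) if values else None


# Made-up sample records standing in for a large source collection.
source = [
    Observation(35.0, -75.0, date(2010, 1, 5), {"sst": 18.0, "salinity": 35.1}),
    Observation(36.0, -74.0, date(2010, 1, 20), {"sst": 20.0}),
    Observation(60.0, 5.0, date(2010, 1, 10), {"sst": 6.0}),
]
vds = VirtualDataset(source)
january_mean = vds.mean(
    "sst",
    bbox=(30, -80, 45, -60),
    period=(date(2010, 1, 1), date(2010, 1, 31)),
)
```

Because select is a generator, a subset is produced record by record on request, which is the property that lets a large collection be served without delivering it whole.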

Additional Information:

Web Services Description Language (WSDL)
Service Casting via RSS/Atom

Cite this best practice:

DataONE Best Practices Working Group, DataONE (July 01, 2010) "Best Practice: Ensure flexible data services for virtual datasets". Accessed through the Data Management Skillbuilding Hub at https://dataoneorg.github.io/Education/bestpractices/ensure-flexible-data on Mar 01, 2024.

