NodeList

This document is OBSOLETE, and has been superseded by information in the DataONE types schema. It will be deleted after review.

A NodeList is a synchronized register of all the nodes in the DataONE environment. It contains the information DataONE needs to orchestrate activities across the distributed Coordinating Nodes and Member Nodes of the network. While some information is provided by the Member Nodes themselves, the node list is maintained dynamically by the Coordinating Nodes. The node list is mutable: it reflects the latest state of the nodes that are part of the system. Each Coordinating Node maintains a replicated copy of the node list.

Registry

  ContactGroup
    groupid
    name
    description
    members

  Contact
    contactid
    role (administrator, manager, ...)
    givenName (first name)
    sn (surname)
    notification
      type (phone, email, IRC, ...)
      connection (phone number, email address, IRC channel)

  Network (1..n, replaces "environment")
    networkid
    name
    description
    adminGroup
    notifyGroup

  Node
    nodeid
    name
    description
    location
    adminGroup
    notifyGroup
    created (date created / registered)
    modified (time stamp for modification)
    lastSynchronization (time stamp)
    objectFormatsSupported (list of object formats the node is known to support)
    synchronize
    replicate
    replicationTarget

    service
      version (schema version supported, MN)
      baseURL (MN)
      name (human readable name for service, e.g. "DataONE-0.6.1", MN)
      activeNetwork (id of network this interface is active for, MN)
      lastChecked (last time service was examined, CN)
      method
        name (MN)
        isactive (set by CN)

The node list is a complex data type, with three main sub-structures: services, synchronization, and health. Some data is provided at node registration time, while other items are generated by DataONE itself in the course of managing objects.

The nodelist schema is expressed in XMLSchema and is available at:

The following list of fields represents the set of information collected and maintained by Coordinating Nodes for every node in the system.

Table 1. Quick reference to the NodeList fields described in more detail below.

===============  ===================================  ================  ===========  ============  =======
Group            Field                                Type              Cardinality  Generated By  Version
===============  ===================================  ================  ===========  ============  =======
General          identifier                           NodeReference     1            CN            0.5
General          name                                 NonEmptyString    1            CN            0.5
General          description                          NonEmptyString    1            CN            0.5
General          baseURL                              anyURI            1            MN            0.5
General          services                             Service           0..n         MN            0.5
General          synchronization                      Synchronization   0..1         CN            0.5
General          health                               NodeHealth        0..1         CN            0.5
General          replicate                            boolean           1            MN            0.5
General          synchronize                          boolean           1            MN            0.5
General          type                                 NodeType          1            CN            0.5
General          environment                          Environment       1            CN            0.5
Services         services.name                        ServiceName       1            MN            0.5
Services         services.version                     string            1            MN            0.5
Services         services.available                   boolean           0..1         MN            0.5
Services         services.method                      ServiceMethod     0..n         MN            0.5
Services         services.method.name                 NMToken           0..1         CN            0.5
Services         services.method.rest                 xs:token          1            MN            0.5
Services         services.method.implemented          boolean           1            MN            0.5
Synchronization  synchronization.lastHarvested        dateTime          1            CN            0.5
Synchronization  synchronization.lastCompleteHarvest  dateTime          1            CN            0.5
Synchronization  synchronization.schedule             Schedule          1            CN            0.5
Synchronization  synchronization.schedule.sec         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.min         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.hour        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.mday        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.mon         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.year        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.wday        crontabEntryType  1            CN            0.5
Health           health.ping                          Ping              1            CN            0.5
Health           health.status                        Status            1            CN            0.5
Health           health.state                         State             1            CN            0.5
Health           health.ping.success                  boolean           0..1         CN            0.5
Health           health.ping.lastSuccess              dateTime          0..1         CN            0.5
Health           health.status.success                boolean           0..1         CN            0.5
Health           health.status.dateChecked            dateTime          0..1         CN            0.5
===============  ===================================  ================  ===========  ============  =======

NodeList fields

NodeList.identifier

A unique identifier for the node, of type NodeReference. This may initially be the same as the baseURL; however, the identifier should not change across future implementations of the same node, whereas the baseURL may change.

  Cardinality: 1
  ValueSpace: NodeReference
  Generated By: CN
  Required Version: 0.5

NodeList.name

A human-readable name for the node. The name is currently used in Mercury to assign a path, so its format should be consistent with DataONE directory naming conventions.

  Cardinality: 1
  ValueSpace: NonEmptyString
  Generated By: CN
  Required Version: 0.5

NodeList.description

A description of the content maintained by this node, plus any other free-form notes.

  Cardinality: 1
  ValueSpace: NonEmptyString
  Generated By: CN
  Required Version: 0.5

NodeList.baseURL

The base URL of the node, of type anyURI. Combined with a services.method.rest value, it must be complete enough to form a valid service call.

  Cardinality: 1
  ValueSpace: anyURI
  Generated By: CN
  Required Version: 0.5
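As a rough illustration of how a baseURL and a services.method.rest path could be combined into a callable URL, here is a minimal Python sketch. The function name method_url and the example host and path are hypothetical and not taken from the DataONE schema; the only assumption is that the REST path is relative to the base URL, as described above.

```python
def method_url(base_url: str, rest_path: str) -> str:
    """Join a node baseURL and a relative REST path into one URL."""
    # Normalise the boundary so exactly one slash separates the two parts,
    # whether or not the inputs carry their own leading/trailing slash.
    return base_url.rstrip("/") + "/" + rest_path.lstrip("/")


# Hypothetical values, for illustration only.
print(method_url("https://mn.example.org/mn/", "object"))
```

Plain string joining is used here rather than urllib.parse.urljoin, because urljoin discards the last path segment of a base URL that lacks a trailing slash.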

NodeList.replicate

A flag to tell the CN whether or not to replicate MN data.

  Cardinality: 1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

NodeList.synchronize

A flag that tells the CN whether or not to synchronize the node. Applies to both CNs and MNs, although CNs are presumed to synchronize.

  Cardinality: 1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

NodeList.type

The type of this node within DataONE. Legal values are “MN” and “CN”.

  Cardinality: 1
  ValueSpace: NodeType
  Generated By: CN
  Required Version: 0.5

NodeList.environment

The system environment the node belongs to. Legal values are “dev”, “test”, “staging”, and “prod”.

  Cardinality: 1
  ValueSpace: Environment
  Generated By: CN
  Required Version: 0.5
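The closed value sets for type and environment lend themselves to enumerations. The following Python sketch shows one possible way to validate them; the class names mirror the ValueSpace names above, but the Python representation itself is an assumption and not part of the DataONE schema.

```python
from enum import Enum


class NodeType(Enum):
    """Legal values of the node type field."""
    MN = "MN"
    CN = "CN"


class Environment(Enum):
    """Legal values of the node environment field."""
    DEV = "dev"
    TEST = "test"
    STAGING = "staging"
    PROD = "prod"


def parse_environment(value: str) -> Environment:
    # Enum lookup by value raises ValueError for anything outside the legal set.
    return Environment(value)


print(parse_environment("staging"))
```

Passing a value outside the legal set, such as "production", raises ValueError, which makes an invalid record fail early instead of propagating.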

services.name

The name of the service exposed by the node.

  Cardinality: 1
  ValueSpace: ServiceName
  Generated By: CN
  Required Version: 0.5

services.version

The version of the service implemented. Because not all Member Nodes can be coordinated to migrate versions simultaneously, the version is needed to ensure continuity of service when the dataone-service-api is upgraded.

  Cardinality: 1
  ValueSpace: string
  Generated By: CN
  Required Version: 0.5

services.available

A flag to indicate whether or not the service is available. Determined by the CN.

  Cardinality: 0..1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

services.method.name

The name of the method implemented by the service.

  Cardinality: 0..1
  ValueSpace: NMToken
  Generated By: CN
  Required Version: 0.5

services.method.rest

The REST path, relative to the baseURL of the node, that calls the method.

  Cardinality: 1
  ValueSpace: xs:token
  Generated By: CN
  Required Version: 0.5

services.method.implemented

A flag to indicate if this method is implemented on the node. Determined by the MN through the addCapabilities method.

  Cardinality: 1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

synchronization.lastHarvested

Set by a CN; contains the time of the last MN synchronization with a CN. The dateTime is taken from the Member Node's frame of reference: it is the latest modification date among the objects harvested.

  Cardinality: 1
  ValueSpace: dateTime
  Generated By: CN
  Required Version: 0.5

synchronization.lastCompleteHarvest

Set by a CN; contains the time of the last complete harvest from an MN. A complete harvest is a full re-harvest of a Member Node that does not rely on the last harvest time. The value of this field should always be the same as or earlier than the lastHarvested field.

  Cardinality: 1
  ValueSpace: dateTime
  Generated By: CN
  Required Version: 0.5
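The ordering constraint between the two harvest timestamps can be checked mechanically. The sketch below is a hypothetical helper, not part of any DataONE API, and assumes both values are available as timezone-aware Python datetime objects.

```python
from datetime import datetime, timezone


def harvest_times_consistent(last_harvested: datetime,
                             last_complete_harvest: datetime) -> bool:
    """Return True if lastCompleteHarvest is the same as or earlier than lastHarvested."""
    return last_complete_harvest <= last_harvested


# Hypothetical timestamps, for illustration only.
complete = datetime(2011, 3, 1, 0, 0, tzinfo=timezone.utc)
incremental = datetime(2011, 3, 7, 2, 30, tzinfo=timezone.utc)
print(harvest_times_consistent(incremental, complete))
```

Both arguments need to be either timezone-aware or naive; Python raises TypeError when the two kinds are compared.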

synchronization.schedule

A set of numerical list or range values that sets the synchronization schedule with an MN, following crontab formatting rules. See the Wikipedia entry for a popular, if not technical, explanation of crontab: http://en.wikipedia.org/wiki/Cron.

  Cardinality: 1
  ValueSpace: Schedule
  Generated By: CN
  Required Version: 0.5
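To illustrate how a schedule built from these seven entries might be evaluated, here is a minimal Python sketch. The Schedule class and the entry_matches helper are hypothetical names, not DataONE code. The sketch assumes each entry is "*", a single number, a range such as "1-5", or a comma-separated list of these; that wday uses 0 for Sunday; and that all seven entries must match. Step values and the day-of-month versus day-of-week rule found in some cron implementations are not handled.

```python
from dataclasses import dataclass
from datetime import datetime


def entry_matches(entry: str, value: int) -> bool:
    """Return True if a crontab-style entry covers the given value."""
    for part in entry.split(","):
        part = part.strip()
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


@dataclass(frozen=True)
class Schedule:
    # Field names follow the synchronization.schedule sub-fields.
    sec: str
    min: str
    hour: str
    mday: str
    mon: str
    year: str
    wday: str

    def matches(self, when: datetime) -> bool:
        # Assumption: 0 = Sunday. Python's weekday() uses 0 = Monday.
        weekday = (when.weekday() + 1) % 7
        return (
            entry_matches(self.sec, when.second)
            and entry_matches(self.min, when.minute)
            and entry_matches(self.hour, when.hour)
            and entry_matches(self.mday, when.day)
            and entry_matches(self.mon, when.month)
            and entry_matches(self.year, when.year)
            and entry_matches(self.wday, weekday)
        )


# Hypothetical schedule: 02:30:00 on weekdays.
nightly = Schedule(sec="0", min="30", hour="2", mday="*", mon="*", year="*", wday="1-5")
print(nightly.matches(datetime(2011, 3, 7, 2, 30, 0)))  # 7 March 2011 was a Monday
```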

health.state

The state of health of the node, based on ping and status calls. Legal values are “up”, “down”, “unknown”.

  Cardinality: 1
  ValueSpace: State
  Generated By: CN
  Required Version: 0.5
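This document does not spell out how the state is derived from the ping and status results. The Python sketch below shows one plausible rule, purely as an assumption: any explicit failure yields "down", success on both checks yields "up", and anything else (for example, a check that has not yet run) yields "unknown". The function name derive_state is hypothetical.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


def derive_state(ping_success: Optional[bool],
                 status_success: Optional[bool]) -> State:
    """Combine ping and status results into a node state (assumed rule)."""
    # Assumption: a single explicit failure is enough to mark the node down.
    if ping_success is False or status_success is False:
        return State.DOWN
    # Assumption: both checks must have succeeded to mark the node up.
    if ping_success is True and status_success is True:
        return State.UP
    # At least one result is missing and none failed.
    return State.UNKNOWN


print(derive_state(True, None))
```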

health.ping.success

A flag showing whether the last mn_health.ping was successful or not.

  Cardinality: 0..1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

health.ping.lastSuccess

The time of last successful mn_health.ping to the node.

  Cardinality: 0..1
  ValueSpace: dateTime
  Generated By: CN
  Required Version: 0.5

health.status.success

A flag showing whether the last mn_health.status method call was successful or not.

  Cardinality: 0..1
  ValueSpace: boolean
  Generated By: CN
  Required Version: 0.5

health.status.dateChecked

The time of the last mn_health.status call to the node.

  Cardinality: 0..1
  ValueSpace: dateTime
  Generated By: CN
  Required Version: 0.5

The node record, expressed in a protocol-buffer-style notation: a set of values that describe a node, its Internet location, the services it supports, and its replication policy.

message Node
{
  required NodeReference identifier = 1;
  required NonEmptyString name = 2;
  required NonEmptyString description = 3;
  required anyURI baseURL = 4;
  repeated Service services = 5;
  optional Synchronization synchronization = 6;
  optional NodeHealth health = 7;
  required boolean replicate = 8;
  required boolean synchronize = 9;
  required NMToken(string) type = 10;

  message Service
  {
    required ServiceName name = 0;
    required string version = 1;
    optional boolean available = 2;
    repeated ServiceMethod method = 3;

    message ServiceMethod
    {
      optional NMToken name = 0;
      required xs:token rest = 1;
      required boolean implemented = 2;
    }
  }

  message Synchronization
  {
    required dateTime lastHarvested = 0;
    required dateTime lastCompleteHarvest = 1;
    required Schedule schedule = 2;

    message Schedule
    {
      required crontabEntryType sec = 0;
      required crontabEntryType min = 1;
      required crontabEntryType hour = 2;
      required crontabEntryType mday = 3;
      required crontabEntryType mon = 4;
      required crontabEntryType year = 5;
      required crontabEntryType wday = 6;
    }
  }

  message NodeHealth
  {
    required Ping ping = 0;
    required Status status = 1;
    required State state = 2;

    message Ping
    {
      optional boolean success = 0;
      optional dateTime lastSuccess = 1;
    }

    message Status
    {
      optional boolean success = 0;
      optional dateTime dateChecked = 1;
    }

    enum State
    {
      UP = 0;
      DOWN = 1;
      UNKNOWN = 2;
    }
  }
}
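For readers who prefer a concrete in-memory representation, part of the structure above can be mirrored with Python dataclasses. This is only a sketch under stated assumptions: the XML Schema types are mapped to plain str and bool, the synchronization and health sub-structures are omitted for brevity, and the helper implemented_paths and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServiceMethod:
    rest: str                    # required xs:token
    implemented: bool            # required boolean
    name: Optional[str] = None   # optional NMToken


@dataclass
class Service:
    name: str                    # required ServiceName
    version: str                 # required string
    available: Optional[bool] = None
    method: List[ServiceMethod] = field(default_factory=list)


@dataclass
class Node:
    identifier: str
    name: str
    description: str
    base_url: str
    replicate: bool
    synchronize: bool
    type: str
    services: List[Service] = field(default_factory=list)

    def implemented_paths(self) -> List[str]:
        """REST paths of every method flagged as implemented."""
        return [m.rest for s in self.services for m in s.method if m.implemented]


# Hypothetical node record, for illustration only.
node = Node(
    identifier="example-node",
    name="Example Member Node",
    description="Illustrative record",
    base_url="https://mn.example.org/mn",
    replicate=False,
    synchronize=True,
    type="MN",
    services=[Service(
        name="example_service",
        version="0.5",
        method=[ServiceMethod(rest="object", implemented=True, name="get"),
                ServiceMethod(rest="log", implemented=False)],
    )],
)
print(node.implemented_paths())
```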