Warning: These documents are under active development and subject to change (version 2.1.0-beta).
The latest release documents are at: https://purl.dataone.org/architecture

NodeList

This document is OBSOLETE, and has been superseded by information in the DataONE types schema. It will be deleted after review.

A NodeList is a synchronized register of all the nodes in the DataONE environment. It contains the information DataONE needs to orchestrate activities across the distributed Coordinating Nodes and Member Nodes of the network. While some information is provided by the Member Nodes themselves, the node list is maintained dynamically by the Coordinating Nodes. The node list is mutable in that it reflects the latest state of the nodes that are part of the system. Replicated copies of the node list are maintained at each of the Coordinating Nodes.

Registry

  ContactGroup
    groupid
    name
    description
    members

  Contact
    contactid
    role (administrator, manager, ...)
    givenName (first name)
    sn (surname)
    notification
      type (phone, email, IRC, ...)
      connection (phone number, email address, IRC channel)

  Network (1..n, replaces "environment")
    networkid
    name
    description
    adminGroup
    notifyGroup

  Node
    nodeid
    name
    description
    location
    adminGroup
    notifyGroup
    created (date created / registered)
    modified (time stamp for modification)
    lastSynchronization (time stamp)
    objectFormatsSupported (list of object formats the node is known to support)
    synchronize
    replicate
    replicationTarget

    service
      version (schema version supported, MN)
      baseURL (MN)
      name (human readable name for service, e.g. "DataONE-0.6.1", MN)
      activeNetwork (id of network this interface is active for, MN)
      lastChecked (last time service was examined, CN)
      method
        name (MN)
        isactive (set by CN)

The node list is a complex data type, with three main sub-structures: services, synchronization, and health. Some data is provided at node registration time, while other items are generated by DataONE itself in the course of managing objects.

The nodelist schema is expressed in XMLSchema and is available at:

The following list of fields represents the set of information collected and maintained by Coordinating Nodes for every node in the system.

Table 1. Quick reference to the NodeList fields described in more detail below.

Group            Field                                Type              Cardinality  Generated By  Version
General          identifier                           NodeReference     1            CN            0.5
General          name                                 NonEmptyString    1            CN            0.5
General          description                          NonEmptyString    1            CN            0.5
General          baseURL                              anyURI            1            MN            0.5
General          services                             Service           0..n         MN            0.5
General          synchronization                      Synchronization   0..1         CN            0.5
General          health                               NodeHealth        0..1         CN            0.5
General          replicate                            boolean           1            MN            0.5
General          synchronize                          boolean           1            MN            0.5
General          type                                 NodeType          1            CN            0.5
General          environment                          Environment       1            CN            0.5
Services         services.name                        ServiceName       1            MN            0.5
Services         services.version                     string            1            MN            0.5
Services         services.available                   boolean           0..1         MN            0.5
Services         services.method                      ServiceMethod     0..n         MN            0.5
Services         services.method.name                 NMToken           0..1         CN            0.5
Services         services.method.rest                 xs:token          1            MN            0.5
Services         services.method.implemented          boolean           1            MN            0.5
Synchronization  synchronization.lastHarvested        dateTime          1            CN            0.5
Synchronization  synchronization.lastCompleteHarvest  dateTime          1            CN            0.5
Synchronization  synchronization.schedule             Schedule          1            CN            0.5
Synchronization  synchronization.schedule.sec         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.min         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.hour        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.mday        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.mon         crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.year        crontabEntryType  1            CN            0.5
Synchronization  synchronization.schedule.wday        crontabEntryType  1            CN            0.5
Health           health.ping                          Ping              1            CN            0.5
Health           health.status                        Status            1            CN            0.5
Health           health.state                         State             1            CN            0.5
Health           health.ping.success                  boolean           0..1         CN            0.5
Health           health.ping.lastSuccess              dateTime          0..1         CN            0.5
Health           health.status.success                boolean           0..1         CN            0.5
Health           health.status.dateChecked            dateTime          0..1         CN            0.5

NodeList fields

NodeList.identifier

A unique identifier for the node, of type NodeReference. This may initially be the same as the baseURL. However, the identifier should not change across future implementations of the same node, whereas the baseURL may change.

Cardinality: 1
ValueSpace: NodeReference
Generated By: CN
Required Version: 0.5

NodeList.name

A human-readable name for the node. The name is currently used in Mercury to assign a path, so its format should be consistent with DataONE directory naming conventions.

Cardinality: 1
ValueSpace: NonEmptyString
Generated By: CN
Required Version: 0.5

NodeList.description

Description of the content maintained by this node, plus any other free-form notes.

Cardinality: 1
ValueSpace: NonEmptyString
Generated By: CN
Required Version: 0.5

NodeList.baseURL

The base URL of the node, of type anyURI. It must be complete enough that, combined with a services.method.rest value, it forms a valid call.

Cardinality: 1
ValueSpace: anyURI
Generated By: CN
Required Version: 0.5
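
As an illustration of how baseURL and services.method.rest combine, the sketch below joins the two into one URL. The function name method_url and the example host are hypothetical, and the slash handling is an assumption; this document does not state how the two parts are joined.

```python
def method_url(base_url: str, rest_path: str) -> str:
    """Join a node baseURL with a services.method.rest path.

    Assumption: exactly one "/" should separate the two parts.
    """
    return base_url.rstrip("/") + "/" + rest_path.lstrip("/")


# Hypothetical values, for illustration only.
print(method_url("https://mn.example.org/mn/", "object"))
```

Normalizing the slashes on both sides keeps the result stable whether or not the registered baseURL ends with a trailing slash.
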
NodeList.replicate

A flag to tell the CN whether or not to replicate MN data.

Cardinality: 1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5

NodeList.synchronize

A flag to tell the CN whether or not to synchronize. Applies to both CNs and MNs (although CNs are presumed to synchronize).

Cardinality: 1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5

NodeList.type

The type of this node within DataONE. Legal values are “MN” and “CN”.

Cardinality: 1
ValueSpace: NodeType
Generated By: CN
Required Version: 0.5

NodeList.environment

The system environment the node belongs to. Legal values are “dev”, “test”, “staging”, and “prod”.

Cardinality: 1
ValueSpace: Environment
Generated By: CN
Required Version: 0.5
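
The legal values for type and environment listed above can be modeled as enumerations. The following is a minimal sketch; the Python class names mirror the ValueSpace names, while the helper parse_node_kind is a hypothetical name introduced only for illustration.

```python
from enum import Enum


class NodeType(Enum):
    MN = "MN"
    CN = "CN"


class Environment(Enum):
    DEV = "dev"
    TEST = "test"
    STAGING = "staging"
    PROD = "prod"


def parse_node_kind(node_type: str, environment: str):
    """Return (NodeType, Environment); raises ValueError on an illegal value."""
    return NodeType(node_type), Environment(environment)


print(parse_node_kind("MN", "prod"))
```

Looking a value up through the Enum constructor rejects anything outside the legal set, so a misspelled environment fails at parse time rather than later.
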
services.name

The name of the service exposed by the node.

Cardinality: 1
ValueSpace: ServiceName
Generated By: CN
Required Version: 0.5

services.version

The version of the service implemented. Because member nodes cannot all be orchestrated to migrate versions simultaneously, the version is needed to ensure business continuity in the event of dataone-service-api upgrades.

Cardinality: 1
ValueSpace: string
Generated By: CN
Required Version: 0.5

services.available

A flag to indicate whether or not the service is available. Determined by the CN.

Cardinality: 0..1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5

services.method.name

The name of the method implemented by the service.

Cardinality: 0..1
ValueSpace: NMToken
Generated By: CN
Required Version: 0.5

services.method.rest

The REST path, relative to the baseURL of the node, that calls the method.

Cardinality: 1
ValueSpace: xs:token
Generated By: CN
Required Version: 0.5

services.method.implemented

A flag to indicate if this method is implemented on the node. Determined by the MN through the addCapabilities method.

Cardinality: 1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5
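
To show how the services.available and services.method.implemented flags might be used together, the sketch below collects the rest paths a client could call. The dictionary layout, the function name usable_rest_paths, the service name "mn_crud", and the rule that a missing available flag counts as available are all assumptions made for illustration.

```python
from typing import Dict, List


def usable_rest_paths(services: List[Dict]) -> List[str]:
    """Collect rest paths of implemented methods on services not marked unavailable."""
    paths = []
    for service in services:
        # "available" is optional (0..1); a missing flag is treated as available here.
        if service.get("available", True) is False:
            continue
        for method in service.get("method", []):
            if method["implemented"]:
                paths.append(method["rest"])
    return paths


# Hypothetical service entries, for illustration only.
services = [
    {"name": "mn_crud", "version": "0.5", "available": True,
     "method": [{"name": "get", "rest": "object", "implemented": True},
                {"name": "create", "rest": "object", "implemented": False}]},
    {"name": "mn_health", "version": "0.5", "available": False,
     "method": [{"name": "ping", "rest": "health/ping", "implemented": True}]},
]
print(usable_rest_paths(services))
```
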
synchronization.lastHarvested

Set by a CN. Contains the time of the last synchronization of the MN with a CN. The dateTime is taken from the frame of reference of the member node; that is, it uses the latest modification date of the objects harvested.

Cardinality: 1
ValueSpace: dateTime
Generated By: CN
Required Version: 0.5

synchronization.lastCompleteHarvest

Set by a CN. Contains the time of the last complete harvest from an MN. A complete harvest is a full re-harvesting from a member node that does not rely on the last harvest time. The value of this field should always be the same as or earlier than the lastHarvested field.

Cardinality: 1
ValueSpace: dateTime
Generated By: CN
Required Version: 0.5
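
The ordering constraint between the two harvest timestamps can be checked directly. The sketch below is a minimal illustration; the function name harvest_times_consistent is hypothetical, and both values are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timezone


def harvest_times_consistent(last_harvested: datetime,
                             last_complete_harvest: datetime) -> bool:
    """True when lastCompleteHarvest is the same as or earlier than lastHarvested."""
    return last_complete_harvest <= last_harvested


# Hypothetical timestamps, for illustration only.
complete = datetime(2010, 1, 1, tzinfo=timezone.utc)
latest = datetime(2010, 6, 1, tzinfo=timezone.utc)
print(harvest_times_consistent(latest, complete))
```
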
synchronization.schedule

A set of numerical list or range values used to set the synchronization schedule with an MN, following crontab formatting rules. See the Wikipedia entry for a popular, if not technical, explanation of crontab: http://en.wikipedia.org/wiki/Cron

Cardinality: 1
ValueSpace: Schedule
Generated By: CN
Required Version: 0.5
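
As a rough illustration of list and range values in a schedule field, the sketch below expands one entry into the set of integers it selects. The function name expand_field is hypothetical, and the accepted syntax (comma-separated values, hyphenated ranges, and "*" as a wildcard) is an assumption based on common crontab usage; the actual crontabEntryType pattern is defined by the schema and is not reproduced here.

```python
from typing import Set


def expand_field(entry: str, lo: int, hi: int) -> Set[int]:
    """Expand one crontab-style entry into the integers it selects.

    lo and hi bound the field and are used only to expand the "*" wildcard.
    """
    if entry == "*":
        return set(range(lo, hi + 1))
    values: Set[int] = set()
    for part in entry.split(","):
        if "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


# Hypothetical "min" entry, for illustration only.
print(sorted(expand_field("0,15,30-32", 0, 59)))
```
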
health.state

The state of health of the node, based on ping and status calls. Legal values are “up”, “down”, “unknown”.

Cardinality: 1
ValueSpace: State
Generated By: CN
Required Version: 0.5
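
This document does not spell out how health.state is derived from the ping and status results. The sketch below shows one possible rule, stated purely as an assumption: the state is "unknown" when either result is missing, "up" when both succeeded, and "down" otherwise. The function name derive_state is hypothetical.

```python
from typing import Optional


def derive_state(ping_success: Optional[bool],
                 status_success: Optional[bool]) -> str:
    """Map ping and status results to "up", "down", or "unknown" (assumed rule)."""
    if ping_success is None or status_success is None:
        # Both success flags are optional (0..1), so either may be absent.
        return "unknown"
    return "up" if ping_success and status_success else "down"


print(derive_state(True, True))
```
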
health.ping.success

A flag showing whether the last mn_health.ping was successful or not.

Cardinality: 0..1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5

health.ping.lastSuccess

The time of the last successful mn_health.ping to the node.

Cardinality: 0..1
ValueSpace: dateTime
Generated By: CN
Required Version: 0.5

health.status.success

A flag showing whether the last mn_health.status method call was successful or not.

Cardinality: 0..1
ValueSpace: boolean
Generated By: CN
Required Version: 0.5

health.status.dateChecked

The time of the last mn_health.status call to the node.

Cardinality: 0..1
ValueSpace: dateTime
Generated By: CN
Required Version: 0.5

The node structure expressed in protocol buffer format. It is a set of values that describe a node, its Internet location, the services it supports, and its replication policy.

// Pseudo-notation: type names refer to the DataONE schema types listed in
// Table 1, not to protobuf built-in types. Field numbers start at 1 because
// protobuf does not allow 0 as a field number.
message Node
{
  required NodeReference identifier = 1;
  required NonEmptyString name = 2;
  required NonEmptyString description = 3;
  required anyURI baseURL = 4;
  repeated Service services = 5;
  optional Synchronization synchronization = 6;
  optional NodeHealth health = 7;
  required boolean replicate = 8;
  required boolean synchronize = 9;
  required NodeType type = 10;

  message Service
  {
    required ServiceName name = 1;
    required string version = 2;
    optional boolean available = 3;
    repeated ServiceMethod method = 4;

    message ServiceMethod
    {
      optional NMToken name = 1;
      required xs:token rest = 2;
      required boolean implemented = 3;
    }
  }

  message Synchronization
  {
    required dateTime lastHarvested = 1;
    required dateTime lastCompleteHarvest = 2;
    required Schedule schedule = 3;

    message Schedule
    {
      required crontabEntryType sec = 1;
      required crontabEntryType min = 2;
      required crontabEntryType hour = 3;
      required crontabEntryType mday = 4;
      required crontabEntryType mon = 5;
      required crontabEntryType year = 6;
      required crontabEntryType wday = 7;
    }
  }

  message NodeHealth
  {
    required Ping ping = 1;
    required Status status = 2;
    required State state = 3;

    message Ping
    {
      optional boolean success = 1;
      optional dateTime lastSuccess = 2;
    }

    message Status
    {
      optional boolean success = 1;
      optional dateTime dateChecked = 2;
    }

    // Enum values, unlike field numbers, may start at 0.
    enum State
    {
      UP = 0;
      DOWN = 1;
      UNKNOWN = 2;
    }
  }
}
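
For readers who prefer a runnable form, part of the structure above can be mirrored with Python dataclasses. This is only a sketch under stated assumptions: schema types such as NodeReference and anyURI are represented as plain strings, the Synchronization and NodeHealth sub-structures are omitted for brevity, and the example node values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServiceMethod:
    rest: str                   # path relative to the node baseURL
    implemented: bool
    name: Optional[str] = None  # cardinality 0..1


@dataclass
class Service:
    name: str
    version: str
    available: Optional[bool] = None  # cardinality 0..1
    method: List[ServiceMethod] = field(default_factory=list)


@dataclass
class Node:
    identifier: str
    name: str
    description: str
    base_url: str
    replicate: bool
    synchronize: bool
    type: str  # "MN" or "CN"
    services: List[Service] = field(default_factory=list)


# Hypothetical node, for illustration only.
node = Node(
    identifier="mn-example", name="Example MN", description="Demo node",
    base_url="https://mn.example.org/mn", replicate=True, synchronize=True,
    type="MN",
    services=[Service(name="mn_health", version="0.5",
                      method=[ServiceMethod(rest="health/ping", implemented=True)])],
)
print(node.services[0].method[0].rest)
```

Optional fields default to None and repeated fields default to empty lists, which matches the 0..1 and 0..n cardinalities in Table 1.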