Glossary

DataONE

Data Observation Network for Earth

https://dataone.org

DataONE Common Library

Part of the DataONE Investigator Toolkit (ITK). Provides functionality commonly needed by projects that interact with the DataONE infrastructure, such as serialization and deserialization of the DataONE types to and from types native to the programming language.

It is a dependency of the DataONE Client Library.

Available for Java and Python.

TODO: We need to point to releases.dataone.org for the Common Libraries. For now, see https://repository.dataone.org/software/cicore/trunk/
DataONE Client Library

Part of the DataONE Investigator Toolkit (ITK). Provides programmatic access to the DataONE infrastructure and may be used to form the basis of larger applications or to extend existing applications to utilize the services of DataONE.

Available for Java and Python.

Java Client Library documentation

Java Client Library source

Python Client Library documentation

Python Client Library source

DataONE Member Node API

The Application Programming Interfaces that a repository must implement in order to join DataONE as a Member Node.

http://mule1.dataone.org/ArchitectureDocs-current/apis/MN_APIs.html

Coordinating Node API

The Application Programming Interfaces that Coordinating Nodes implement to facilitate interactions with MNs and DataONE clients.

http://mule1.dataone.org/ArchitectureDocs-current/apis/CN_APIs.html

GMN

DataONE Generic Member Node

GMN is a complete implementation of a MN, written in Python. It provides an implementation of all MN APIs and can be used by organizations to expose their Science Data to DataONE if they do not wish to create their own, native MN.

GMN can be used as a standalone MN, or it can be used to expose data that is already available on the web to DataONE. When used in this way, GMN provides a DataONE-compatible interface to existing data and does not store the data.

GMN can also be used as a workbench or reference for a third-party MN implementation. If an organization wishes to donate storage space to DataONE, GMN can be set up as a replication target.

Metacat

Metacat is a repository for data and metadata (documentation about data) that helps scientists find, understand and effectively use data sets they manage or that have been created by others. Thousands of data sets are currently documented in a standardized way and stored in Metacat systems, providing the scientific community with a broad range of Science Data that, because the data are well and consistently described, can be easily searched, compared, merged, or used in other ways.

Metacat is implemented in Java.

http://knb.ecoinformatics.org/knb/docs/

Replication target
A MN that accepts replicas (copies) of Science Data from other MNs and thereby helps ensure that Science Data remains available.
Investigator Toolkit (ITK)

The Investigator Toolkit provides a suite of software tools that are useful for the various audiences that DataONE serves. The tools fall into a number of categories, which are described in the overview linked below, along with examples of potential applications that would fit into each category.

http://mule1.dataone.org/ArchitectureDocs-current/design/itk-overview.html

MN
DataONE Member Node.
CN
DataONE Coordinating Node.
client
An application that accesses the DataONE infrastructure on behalf of a user.
SciData
An object (file) that contains scientific observational data.
SciMeta
An object (file) that contains information about a SciData object.
SysMeta
An object (file) that contains system level information about a SciData or a SciMeta object.
subject

In DataONE, a subject is a unique identity, represented as a string. A user or Node that wishes to act as a given subject in the DataONE infrastructure must hold an X.509 certificate for that subject.

DataONE defines a serialization method in which a subject is derived from the DN in an X.509 certificate.
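
As a rough illustration of deriving a subject string from a DN, the sketch below converts an OpenSSL-style, slash-delimited DN into a comma-separated string with the most specific attribute first. The function name `dn_to_subject` and the sample DN are hypothetical, and the ordering and separator are assumptions made for illustration; the exact serialization rules are defined by DataONE and are not reproduced here.

```python
def dn_to_subject(openssl_dn: str) -> str:
    """Turn '/DC=org/DC=example/CN=Jane Doe' into 'CN=Jane Doe,DC=example,DC=org'.

    Hypothetical helper: assumes no attribute value contains '/' or ','.
    """
    # Split on '/', dropping the empty string before the leading slash.
    parts = [part for part in openssl_dn.split("/") if part]
    # Reverse so that the most specific attribute (usually CN) comes first.
    return ",".join(reversed(parts))


subject = dn_to_subject("/DC=org/DC=example/CN=Jane Doe")
print(subject)
```

A production implementation would read the DN from the certificate itself and handle escaping of special characters, which this sketch deliberately ignores.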

Python

A dynamic programming language.

http://www.python.org

Java

A statically typed programming language.

http://java.com

X.509

An ITU-T standard for a public key infrastructure (PKI) for single sign-on (SSO) and Privilege Management Infrastructure (PMI). X.509 specifies, amongst other things, standard formats for public key certificates, certificate revocation lists, attribute certificates, and a certification path validation algorithm.

http://en.wikipedia.org/wiki/X509

CA
Certificate Authority

A certificate authority is an entity that issues digital certificates. The digital certificate certifies the ownership of a public key by the named subject of the certificate. This allows others (relying parties) to rely upon signatures or assertions made by the private key that corresponds to the public key that is certified. In this model of trust relationships, a CA is a trusted third party that is trusted by both the subject (owner) of the certificate and the party relying upon the certificate. CAs are characteristic of many public key infrastructure (PKI) schemes.

http://en.wikipedia.org/wiki/Certificate_authority

CA signing key
The private key which the CA uses for signing CSRs.
Server key
The private key that Apache will use for proving that it is the owner of the certificate that it provides to the client during the SSL handshake.
CSR

Certificate Signing Request

A message sent from an applicant to a CA in order to apply for a certificate.

http://en.wikipedia.org/wiki/Certificate_signing_request

Certificate

A public key certificate (also known as a digital certificate or identity certificate) is an electronic document which uses a digital signature to bind a public key with an identity – information such as the name of a person or an organization, their address, and so forth. The certificate can be used to verify that a public key belongs to an individual.

http://en.wikipedia.org/wiki/Public_key_certificate

CA certificate
A certificate that belongs to a CA and serves as the root certificate in a chain of trust.
Self signed certificate

A certificate that is signed by its own creator. A self signed certificate is not part of a chain of trust, so it is not possible to validate the information stored in the certificate. Because of this, self signed certificates are useful mostly for testing in an implicitly trusted environment.

http://en.wikipedia.org/wiki/Self-signed_certificate

Chain of trust

The Chain of Trust of a Certificate Chain is an ordered list of certificates, containing an end-user subscriber certificate and intermediate certificates (that represent the Intermediate CAs), that enables the receiver to verify that the sender and all intermediate certificates are trustworthy.

http://en.wikipedia.org/wiki/Chain_of_trust
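
The ordering of a certificate chain can be sketched with a toy model. The `Cert` class, the `chain_is_linked` function and the sample names below are hypothetical and exist only for illustration. The check is purely structural: it confirms that each certificate names the next one as its issuer and that the last certificate was issued by a trusted root. It does not verify signatures, validity dates or revocation, all of which real validation requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # identity the certificate was issued to
    issuer: str   # identity of the CA that signed it


def chain_is_linked(chain, trusted_roots):
    """Return True if the chain links end-user -> intermediates -> trusted root.

    Structural check only: no cryptographic signature is verified.
    """
    if not chain:
        return False
    # Each certificate must be issued by the subject of the next certificate.
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    # The last certificate in the list must be issued by a trusted root CA.
    return chain[-1].issuer in trusted_roots


chain = [
    Cert(subject="CN=Jane Doe", issuer="CN=Intermediate CA"),
    Cert(subject="CN=Intermediate CA", issuer="CN=Root CA"),
]
print(chain_is_linked(chain, {"CN=Root CA"}))
```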

OpenSSL
Toolkit implementing the SSL v2/v3 and TLS v1 protocols as well as a full-strength general purpose cryptography library.
SSL

Secure Sockets Layer

A protocol for transmitting private information via the Internet. SSL uses a cryptographic system that uses two keys to encrypt data: a public key known to everyone and a private or secret key known only to the recipient of the message.

SSL handshake

The initial negotiation between two machines that communicate over SSL.

http://developer.connectopensource.org/display/CONNECTWIKI/SSL+Handshake

http://developer.connectopensource.org/download/attachments/34210577/Ssl_handshake_with_two_way_authentication_with_certificates.png

TLS

Transport Layer Security

Successor of SSL.

Client side authentication
SSL Client side authentication is part of the SSL handshake, where the client proves its identity to the web server by providing a certificate to the server. The certificate provided by the client must be signed by a CA that is trusted by the server. Client Side Authentication is not a required part of the handshake. The server can be set up to not allow Client side authentication, to require it or to let it be optional.
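
The three server policies described above (not allowed, optional, required) can be sketched with the `ssl` module from the Python standard library, used here instead of a web server configuration. The function name `make_server_context` and the policy labels are hypothetical. A real server would additionally load its own certificate and key with `load_cert_chain()` and the trusted CA certificates with `load_verify_locations()`.

```python
import ssl


def make_server_context(client_auth: str) -> ssl.SSLContext:
    """Build a server-side TLS context for one of three client-auth policies."""
    modes = {
        "none": ssl.CERT_NONE,          # no client certificate is requested
        "optional": ssl.CERT_OPTIONAL,  # requested, but the client may omit it
        "required": ssl.CERT_REQUIRED,  # handshake fails without a valid one
    }
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.verify_mode = modes[client_auth]
    return context


context = make_server_context("required")
print(context.verify_mode)
```
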
Server Side Authentication
SSL Server Side Authentication is part of the SSL handshake, where the server proves its identity to the client by providing a certificate to the client. The certificate provided by the server must be signed by a CA that is trusted by the client. Server Side Authentication is a required part of the handshake.
Client side certificate
Certificate that is provided by the client during client side authentication.
Server side certificate
Certificate that is provided by the server during server side authentication.
CILogon

The CILogon project facilitates secure access to CyberInfrastructure (CI).

http://www.cilogon.org/

LOA

Levels of Assurance

CILogon operates three Certification Authorities (CAs) with consistent operational and technical security controls. The CAs differ only in their procedures for subscriber authentication, identity validation, and naming. These differing procedures result in different Levels of Assurance (LOA) regarding the strength of the identity contained in the certificate. For this reason, relying parties may decide to accept certificates from only a subset of the CILogon CAs.

http://ca.cilogon.org/loa

REST

Representational State Transfer

A style of software architecture for distributed hypermedia systems such as the World Wide Web.

http://en.wikipedia.org/wiki/Representational_State_Transfer

Data Packaging

Data, in the context of DataONE, is a discrete unit of digital content that is expected to represent information obtained from some experiment or scientific study.

http://mule1.dataone.org/ArchitectureDocs-current/design/DataPackage.html

RDF

Resource Description Framework

http://www.w3.org/RDF/

OAI-ORE

Open Archives Initiative Object Reuse and Exchange

http://www.openarchives.org/ore/

Resource Map

An object (file) that describes one or more aggregations of Web resources. In the context of DataONE, the web resources are DataONE objects such as Science Data and Science Metadata.

http://www.openarchives.org/ore/1.0/toc

RDF

Resource Description Framework

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats.

http://en.wikipedia.org/wiki/Resource_Description_Framework

Tier

A tier designates a certain level of functionality exposed by a MN.

DataONE Member Node Tiers.

MNCore API

A set of MN APIs that implement core functionality.

http://mule1.dataone.org/ArchitectureDocs-current/apis/MN_APIs.html#module-MNCore

MNRead API

A set of MN APIs that implement Read functionality.

http://mule1.dataone.org/ArchitectureDocs-current/apis/MN_APIs.html#module-MNRead
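
Because the MN APIs are exposed over REST, a read call comes down to an HTTP GET on a URL that contains the object identifier. The sketch below only builds such a URL. The base URL, the function name `object_url` and the `/object/<identifier>` path layout are assumptions made for illustration; the MN API reference linked above is the authority on the actual paths.

```python
from urllib.parse import quote


def object_url(base_url: str, pid: str) -> str:
    """Build the URL for retrieving one object from a Member Node.

    Hypothetical helper: the '/object/' path segment is an assumed layout.
    """
    # Identifiers may contain ':', '/' or spaces, so encode every
    # reserved character before placing the identifier in the path.
    encoded_pid = quote(pid, safe="")
    return f"{base_url.rstrip('/')}/object/{encoded_pid}"


url = object_url("https://mn.example.org/mn/v1/", "doi:10.1234/ab cd")
print(url)
```

Encoding with `safe=""` matters here: without it, a slash inside the identifier would be read as a path separator.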

Science Data
An object (file) that contains scientific observational data.
Science Metadata
An object (file) that contains information about a Science Data object.
System Metadata

An object (file) that contains system level information about a Science Data, Science Metadata or other DataONE object.

Overview of System Metadata <http://mule1.dataone.org/ArchitectureDocs-current/design/SystemMetadata.html>

Description of the System Metadata type <http://mule1.dataone.org/ArchitectureDocs-current/apis/Types.html#Types.SystemMetadata>

Identity Provider

A service that authenticates users and issues security tokens.

In the context of DataONE, an Identity Provider is a 3rd party institution where the user has an account. CILogon acts as an intermediary between DataONE and the institution by creating X.509 certificates based on identity assertions made by the institutions.