About the KNB

The Knowledge Network for Biocomplexity (KNB) is an international repository intended to facilitate ecological and environmental research.

Data underlies metadata

Powered by rich, detailed metadata.

For scientists, the KNB is an efficient way to share, discover, access and interpret complex ecological data. Because KNB data come with rich contextual information, scientists can integrate and analyze them with less effort. The data originate from a highly distributed set of field stations, laboratories, research sites, and individual researchers. The foundation of the KNB is the rich, detailed metadata provided by the researchers who collect the data, which promotes both automated and manual integration of data into new projects.

Open-source software.

Learn about the KNB Developer Tools and API

As part of the KNB effort, data management software is developed as free and open source software, so other groups can build upon the tools. The KNB is powered by the Metacat data management system. It is optimized for handling data sets described using the Ecological Metadata Language (EML), but it can store any XML-based metadata document. Learn more about the software behind KNB

Easily share your research and get more exposure with the DataONE network.

As a long-term repository, the KNB allows you to preserve your data for future generations of scientists. You can also share your data with your colleagues today, and get a permanent identifier for all files in your data set. The KNB supports Digital Object Identifiers (DOIs), so your data sets can be confidently referenced in any publication. And as a DataONE Member Node, the KNB will securely replicate your data to other servers, maintaining all of the privacy controls you set. This means that your data remain safe even if the KNB servers themselves experience a catastrophic failure. What are the benefits of KNB being a DataONE Member Node?


Getting Started

Storing and Sharing Your Research

Submitting your data to the KNB is easy. Here are two ways to get started:

KNB Online Data Registry Form

On the web

Upload your data on the KNB website using a simple online form.

Upload now

Morpho Desktop Application

On your desktop

Download the Morpho desktop application, which features a "wizard"-style step-by-step process to describe and package up your data.

Download Morpho

If your project involves large numbers of files (hundreds to thousands), you may want to automate the creation of your metadata. This has been done with scripts in MATLAB, R, and similar environments. Send a note to KNB Help, and we can point you to some examples of scripted metadata creation.
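As a rough illustration of what scripted metadata creation can look like, the Python sketch below walks a directory and writes one EML-style record that lists every file. It is not one of the KNB examples mentioned above. The function name `build_metadata`, the package identifier, and the `system` value are hypothetical, the EML 2.1.1 namespace is an assumption about the version you target, and the output has not been validated against the EML schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: EML 2.1.1 namespace; change this to match the EML version you target.
EML_NS = "eml://ecoinformatics.org/eml-2.1.1"


def build_metadata(data_dir, title, creator_surname, package_id):
    """Return an EML-style XML string that lists every file in data_dir."""
    ET.register_namespace("eml", EML_NS)
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": package_id, "system": "knb"},  # hypothetical values
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    # The same person is used as creator and contact to keep the sketch short.
    for role in ("creator", "contact"):
        person = ET.SubElement(dataset, role)
        name = ET.SubElement(person, "individualName")
        ET.SubElement(name, "surName").text = creator_surname

    # One otherEntity element per file, in sorted order for reproducibility.
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            entity = ET.SubElement(dataset, "otherEntity")
            ET.SubElement(entity, "entityName").text = path.name
            ET.SubElement(entity, "entityType").text = (
                path.suffix.lstrip(".") or "unknown"
            )

    return ET.tostring(root, encoding="unicode")
```

A real record would also need methods, coverage, and attribute-level descriptions, so treat this only as a starting point for a script of your own.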

Use any file format.

File formats
Credit: Blugraphic.com

While the KNB supports the upload of any data file format, sharing data can be greatly enhanced if you use ubiquitous, easy-to-read formats. For instance, while Microsoft Excel files are commonplace, it's better to export these spreadsheets to Comma Separated Values (CSV) text files, which can be read on any computer without having Microsoft products installed.

To prepare for upload, it's good to have your files in order. You might want to take a look at some best practices for managing your data files. For a given project, perhaps you have 6 data files, and one document that describes the methods that you used to collect or analyze your data. Collect these files into a single directory, and give them short but descriptive names. Avoid spaces in your file names; use dashes "-" or underscores "_" instead.
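If a directory already contains file names with spaces, a small script can tidy them before upload. The following is a minimal Python sketch, not a KNB tool; the function names `tidy_name` and `tidy_directory` are hypothetical. It replaces whitespace with underscores and, by default, only reports what it would rename.

```python
import re
from pathlib import Path


def tidy_name(name: str) -> str:
    """Replace each run of whitespace in a file name with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


def tidy_directory(directory: str, dry_run: bool = True) -> list[tuple[str, str]]:
    """Return (old, new) name pairs; rename the files unless dry_run is True."""
    changes = []
    for path in sorted(Path(directory).iterdir()):
        new_name = tidy_name(path.name)
        if path.is_file() and new_name != path.name:
            target = path.with_name(new_name)
            if target.exists():
                # Never overwrite an existing file; leave it for manual review.
                continue
            changes.append((path.name, new_name))
            if not dry_run:
                path.rename(target)
    return changes
```

Run it once with the default `dry_run=True` to review the proposed names, then again with `dry_run=False` to apply them.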

For image files, use common formats like PNG, JPEG, TIFF, etc. Almost all browsers can handle these. If you use specialized software to create your data, try to save your data in well-known formats. For instance, GIS data can be exported to ESRI shapefiles, and data created in MATLAB or other matrix-based programs can be exported as NetCDF (an open binary format).

Search for Data

Search for data here and get helpful search tips on our help page.

Publish data with a DOI for long-term stable access.

DOI Example
Publish your data with a DOI and it will display on the metadata page for your dataset.

Assign your data set a Digital Object Identifier (DOI) so that others can cite your data and use the DOI to find the current location(s) of the data.

Because web addresses can change over time, it is important that your data set not be tied to a specific address on the internet. A DOI is an identifier that can be resolved to the multiple locations where a data set might exist, and client tools can then decide which of those copies is the most efficient to access.
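In practice, this means a citation should point at a DOI resolver rather than at a repository address. The Python sketch below shows one way to turn a DOI into such a link, assuming the public https://doi.org/ resolver; the function name `doi_to_url` and the DOI in the usage note are made up for illustration.

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Turn a DOI such as 'doi:10.5063/EXAMPLE' into a doi.org resolver URL."""
    bare = doi.strip()
    # Strip common prefixes so that all input forms map to the same link.
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if bare.lower().startswith(prefix):
            bare = bare[len(prefix):]
            break
    if not bare.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(bare, safe="/")
```

For example, `doi_to_url("doi:10.5063/EXAMPLE")` returns `https://doi.org/10.5063/EXAMPLE`, which keeps working even if the data set later moves to a different server.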

How to assign a DOI to your data

Assigning a DOI through the KNB requires that you have the proper permissions on the data set. Log in to the KNB web interface with your username and password, and then find your data set by clicking on "My Packages" in the user menu. If you are the owner or have been granted management permissions, you will see a 'Publish' button, which both makes the data set publicly accessible and assigns a DOI to that particular version of the data set. The DOI is registered with DataCite using the EZID service, and will be discoverable through multiple data citation networks, including DataONE and others.

Learn more about DOIs


Licensing and Data Distribution

As a tool dedicated to helping researchers increase collaboration and the pace of science, this repository needs certain rights to copy, store, and redistribute data. By uploading data, metadata, and any other content to this repository, you warrant that you have the rights to the content and are authorized to do so under copyright or any other right that might pertain to the content. Data and facts themselves are not covered under copyright in the US and most countries, since facts in and of themselves are not eligible for copyright. That said, some associated metadata and some particular compilations of data could potentially be covered by copyright in some jurisdictions. Thus, by uploading content, you grant this repository and UCSB all rights needed to copy, store, redistribute, and share data, metadata, and any other content under the access control rules that you specify. By marking content as publicly available, you grant this repository, UCSB, and any other users the right to copy the content and redistribute it to the public without restriction under the terms of CC-BY.


Understanding metadata

Metadata are ultimately "data about data" - the contextual information needed to interpret a set of raw data observations. They provide meaning to data, and are critical when it comes to sharing, integrating, and analyzing data. Too often people collect data for projects and leave them undocumented for years or decades. These data, while potentially of very high value, can become useless over time due to data entropy.

There are a number of resources available that help in understanding metadata. A notable reference book is "Ecological Data: Design, Management and Processing", edited by William Michener and James Brunt (ISBN 1-444-31139-5). This book has an excellent section on metadata and on the idea of data entropy over the course of a scientist's career. Likewise, researchers at the National Center for Ecological Analysis and Synthesis (NCEAS) wrote an excellent primer on understanding and using EML, entitled "Maximizing the value of ecological data with structured metadata: an introduction to Ecological Metadata Language (EML) and principles for metadata creation".

The two resources above should help in understanding metadata, and making the generation of high quality metadata a part of your scientific workflow. For more information, contact KNB Help, and we can further help you or your lab in effectively managing data and metadata.