About

Data underlies metadata

Powered by rich, detailed metadata.

For scientists, this repository is an efficient way to share, discover, access and interpret complex ecological data. Because of the rich contextual information provided with KNB data, scientists are able to integrate and analyze data with less effort. The data originate from a highly distributed set of field stations, laboratories, research sites, and individual researchers. The foundation of this repository is the rich, detailed metadata provided by the researchers who collect the data, which promotes both automated and manual integration of data into new projects.

Open-source software.

As part of our effort, data management software is developed in a free and open-source manner, so other groups can build upon the tools. This repository is powered by the Metacat data management system. It is optimized for handling data sets described using the Ecological Metadata Language (EML), but it can store any XML-based metadata document. Learn more about the software behind this repository.


Getting Started

Storing and Sharing Your Research

Submitting your data is easy. Here are two ways to get started:

Online Data Registry Form

On the web

Upload your data on this website using a simple online form.

Upload now

Morpho Desktop Application

On your desktop

Download the Morpho desktop application, which features a "wizard"-style step-by-step process to describe and package up your data.

Download Morpho

If your project involves large numbers of files (hundreds to thousands), you may want to automate the creation of your metadata. This is certainly possible, and has been done with tools such as Matlab and R.
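As a rough illustration of what such automation can look like, the sketch below walks a directory and writes one metadata entry per file. It is written in Python rather than Matlab or R, and the function name `build_metadata_skeleton` and the element names (`dataset`, `entity`, `entityName`, `sizeBytes`, `format`) are hypothetical: they are only loosely modeled on EML and have not been validated against the EML schema. Treat the output as a starting point to be completed and checked before upload.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_metadata_skeleton(data_dir, title):
    """Return an XML string with one entry per file found in data_dir.

    The layout is a simplified, hypothetical one. It is NOT schema-valid EML.
    """
    root = ET.Element("dataset")
    ET.SubElement(root, "title").text = title

    # Sort the directory listing so the output is stable from run to run.
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        entity = ET.SubElement(root, "entity")
        ET.SubElement(entity, "entityName").text = path.name
        ET.SubElement(entity, "sizeBytes").text = str(path.stat().st_size)
        # Use the file extension as a crude format hint.
        extension = path.suffix.lstrip(".").lower()
        ET.SubElement(entity, "format").text = extension or "unknown"

    return ET.tostring(root, encoding="unicode")
```

Calling `build_metadata_skeleton("my-project", "Stream temperature survey")` returns the XML as a string, which could then be saved to a file and filled in with methods, units, and contact details.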

Use any file format.

File formats
Credit: Blugraphic.com

While this repository supports the upload of any data file format, sharing data can be greatly enhanced if you use ubiquitous, easy-to-read formats. For instance, while Microsoft Excel files are commonplace, it's better to export these spreadsheets to Comma Separated Values (CSV) text files, which can be read on any computer without having Microsoft products installed.

To prepare for upload, it's good to have your files in order. You might want to take a look at some best practices for managing your data files. For a given project, perhaps you have 6 data files and one document that describes the methods that you used to collect or analyze your data. Collect these files into a single directory, and give them short but descriptive names. Avoid spaces in your file names; use dashes "-" or underscores "_" instead.
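If a directory already contains files with spaces in their names, the renaming can be scripted. The following is a minimal Python sketch; the helper names `tidy_name` and `tidy_directory` are made up for this example, and the choice of underscores over dashes is arbitrary. By default it only reports what it would rename, so the plan can be reviewed before any file is touched.

```python
import re
from pathlib import Path


def tidy_name(name):
    """Replace each run of whitespace in a file name with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


def tidy_directory(data_dir, dry_run=True):
    """Rename files in data_dir whose names contain whitespace.

    Returns a list of (old_name, new_name) pairs. With dry_run=True
    nothing is renamed; the list only shows what would change.
    """
    changes = []
    for path in sorted(Path(data_dir).iterdir()):
        new_name = tidy_name(path.name)
        if not path.is_file() or new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Never overwrite an existing file; leave it for manual review.
            continue
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return changes
```

Run `tidy_directory("my-project")` first to inspect the proposed changes, then `tidy_directory("my-project", dry_run=False)` to apply them.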

For image files, use common formats like PNG, JPEG, or TIFF. Almost all browsers can handle these. If you use specialized software to create your data, try to save your data in well-known formats. For instance, GIS data can be exported to ESRI shapefiles, and data created in Matlab or other matrix-based programs can be exported as NetCDF (an open binary format).

Search for Data

Search for data here and get helpful search tips on our help page.


Licensing and Data Distribution

As a tool dedicated to helping researchers increase collaboration and the pace of science, this repository needs certain rights to copy, store, and redistribute data. By uploading data, metadata, and any other content to this repository, you warrant that you have the rights to the content and are authorized to do so under copyright or any other right that might pertain to the content. Data and facts themselves are not covered under copyright in the US and most countries, since facts in and of themselves are not eligible for copyright. That said, some associated metadata and some particular compilations of data could potentially be covered by copyright in some jurisdictions. Thus, by uploading content, you grant this repository and UCSB all rights needed to copy, store, redistribute, and share data, metadata, and any other content under the access control rules that you specify. By marking content as publicly available, you grant this repository, UCSB, and any other users the right to copy the content and redistribute it to the public without restriction under the terms of CC-BY.


Understanding metadata

Metadata are ultimately "data about data" - the contextual information needed to interpret a set of raw data observations. They provide meaning to data, and are critical when it comes to sharing, integrating, and analyzing data. Too often people collect data for projects and leave them undocumented for years or decades. These data, while potentially of very high value, can become useless over time due to data entropy.

A number of resources are available to help in understanding metadata. A notable reference book is "Ecological Data: Design, Management and Processing", edited by William Michener and James Brunt (ISBN 1-444-31139-5). This book has an excellent section on metadata and on the idea of data entropy over the course of a scientist's career. Likewise, researchers at the National Center for Ecological Analysis and Synthesis (NCEAS) wrote an excellent primer on understanding and using EML, entitled "Maximizing the value of ecological data with structured metadata: an introduction to Ecological Metadata Language (EML) and principles for metadata creation".

The two resources above should help you understand metadata and make the generation of high-quality metadata a part of your scientific workflow.