2011-12-01T12:00:00
knb
lter
emlVersion
EML version 2.1.0 or beyond
Check the EML document declaration for version 2.1.0 or higher
eml://ecoinformatics.org/eml-2.1.0 or eml://ecoinformatics.org/eml-2.1.1
notChecked
Validity of this quality report is dependent on this check being valid.
Use an approved namespace.
schemaValid
Document is schema-valid EML
Check document schema validity
schema-valid
notChecked
Validity of this quality report is dependent on this check being valid.
Make the document schema-valid.
parserValid
Document is EML parser-valid
Check document using the EML IDs and references parser
Validates with the EML IDs and references parser
notChecked
Validity of this quality report is dependent on this check being valid.
Resolve issues with IDs and references.
schemaValidDereferenced
Dereferenced document is schema-valid EML
References are dereferenced, and the resulting file validated
schema-valid
notChecked
Validity of this quality report is dependent on this check being valid.
Make sure that references refer to the correct elements.
packageIdPattern
packageId pattern matches "scope.identifier.revision"
Check against LTER requirements for scope.identifier.revision
'knb-lter-abc.n.m', where 'abc' is an LTER site acronym and 'n' and 'm' are whole numbers
notChecked
Value of the packageId attribute must conform to LTER best practices.
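The packageId pattern described above can be sketched as a regular expression. This is only an illustration: the function name is hypothetical, and the assumption that the site acronym is exactly three lowercase letters is ours, not stated in the check.

```python
import re

# Assumption: the site acronym is three lowercase ASCII letters and
# 'n' and 'm' are non-negative whole numbers written in decimal.
PACKAGE_ID_RE = re.compile(r"knb-lter-[a-z]{3}\.[0-9]+\.[0-9]+")


def package_id_matches(package_id: str) -> bool:
    """Return True if package_id looks like 'knb-lter-abc.n.m'."""
    return PACKAGE_ID_RE.fullmatch(package_id) is not None
```

For example, `package_id_matches("knb-lter-abc.12.3")` is accepted, while a value missing the revision, such as `"knb-lter-abc.12"`, is rejected.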
pubDatePresent
'pubDate' element is present
Check for presence of the pubDate element
The date that the dataset was submitted for publication in PASTA must be included.
(The EML schema does not require this element, but when present, it does constrain its
format to YYYY-MM-DD or just YYYY. Citation format uses only the YYYY portion even if a
full date is entered.)
notChecked
'pubDate' is part of the citation. 'pubDate' also qualifies the use of "ongoing" in other metadata elements.
The year of public release of data online should be listed as the 'pubDate'
element. The 'pubDate' should be updated when data and/or metadata are updated or re-released.
The format can be either a 4-digit year (YYYY), or an ISO date (YYYY-MM-DD).
EML Best Practices v.2, p. 17
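The two accepted pubDate formats (YYYY or YYYY-MM-DD) could be checked roughly as follows. This is a minimal sketch; the function name is hypothetical, and rejecting impossible calendar dates goes slightly beyond what the check text states.

```python
import re
from datetime import date


def pub_date_format_ok(value: str) -> bool:
    """Accept a 4-digit year (YYYY) or an ISO date (YYYY-MM-DD)."""
    value = value.strip()
    if re.fullmatch(r"[0-9]{4}", value):
        return True
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        try:
            # Assumption: impossible dates such as 2011-13-01 are rejected.
            date.fromisoformat(value)
        except ValueError:
            return False
        return True
    return False
```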
keywordPresent
keyword element is present
Checks to see if at least one keyword is present
Presence of one or more keyword elements
notChecked
The LTER portal allows searches on keywords. This check is a precursor for checking on keywords from the controlled vocabulary.
Add at least one keyword.
methodsElementPresent
A 'methods' element is present
All datasets should contain a 'methods' element that provides, at a minimum, a link to a separate methods document.
Presence of 'methods' at one or more XPaths.
notChecked
At a minimum, a reference to an external protocol should be given at the dataset level. However, detailed methods at this level are preferable. If further refinement is needed, methods can be defined for individual data entities or even individual attributes.
Since they are mostly for human consumption, one detailed description of all steps taken at the dataset level is frequently sufficient and more user friendly.
EML Best Practices, p. 28
coveragePresent
coverage element is present
At least one coverage element should be present in a dataset.
At least one of geographicCoverage, taxonomicCoverage, or temporalCoverage is present in the EML.
notChecked
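One way to test for the three coverage elements is to walk the parsed EML tree and look for their local names anywhere in the document. This sketch uses Python's standard `xml.etree.ElementTree`; the function name is hypothetical, and it deliberately does not distinguish dataset, entity, and attribute levels.

```python
import xml.etree.ElementTree as ET

COVERAGE_TAGS = ("geographicCoverage", "taxonomicCoverage", "temporalCoverage")


def coverage_found(eml_text: str) -> dict:
    """Report which coverage elements occur anywhere in the EML document."""
    root = ET.fromstring(eml_text)
    found = {tag: False for tag in COVERAGE_TAGS}
    for element in root.iter():
        # Drop a '{namespace}' prefix, if any, so only the local name is compared.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in found:
            found[local_name] = True
    return found
```

Under this sketch, a coveragePresent-style check would pass when `any(coverage_found(text).values())` is true.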
geographicCoveragePresent
geographicCoverage is present
Check that geographicCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level.
geographicCoverage at least at the dataset level.
info
Spatial coverage is appropriate for many, but not all, datasets.
If sampling EML is used within methods, does that obviate geographicCoverage? Or should those sites be repeated or referenced?
EML Best Practices v.2, p. 22-23. "One geographicCoverage element should be included, whose boundingCoordinates describe the extent of the data....Additional geographicCoverage elements may be entered at the dataset level if there are significant distances between study sites and it would be confusing if they were grouped into one bounding box." 6 decimal places.
taxonomicCoveragePresent
taxonomicCoverage is present
Check that taxonomicCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level.
taxonomicCoverage at least at the dataset level.
info
Only datasets to which taxa are pertinent will have taxonomicCoverage.
Could search title, abstract, keywords for any taxonomic name (huge). Could search keywordType="taxonomic".
EML Best Practices v.2, p. 25
temporalCoveragePresent
temporalCoverage is present
Check that temporalCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level.
temporalCoverage at least at the dataset level.
info
LTER wants to search datasets by time; the best place to search is the dataset level temporal coverage.
Most datasets have a temporal range.
EML Best Practices v.2, p. 24
titleLength
Dataset title length is at least 5 words.
If the title is shorter than 5 words, it might be insufficient. The title word count should be between 7 and 20, including prepositions and numbers.
Between 7 and 20 words
notChecked
Best Practices document, page 13, says write a good title. This is the first view of a dataset. To include what, where and when requires at least 7 words.
If the title is too short, ensure that it covers what, where and when. If the title is too long, make it more concise.
EML Best Practices, v.2, p. 13
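The word-count test for titles can be sketched as below. The function name and the returned labels are hypothetical, and splitting on whitespace is an assumed definition of a "word"; the 7 and 20 bounds come from the expected value above.

```python
def classify_title_length(title: str, minimum: int = 7, maximum: int = 20) -> str:
    """Classify a title by whitespace-separated word count."""
    word_count = len(title.split())
    if word_count < minimum:
        return "too short"
    if word_count > maximum:
        return "too long"
    return "ok"
```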
datasetAbstractLength
Dataset abstract element is a minimum of 20 words
Check the length of the dataset abstract and warn if it has fewer than 20 words.
An abstract is 20 words or more.
notChecked
An abstract helps a user determine if the dataset is useful for a specific purpose. An abstract is usually a paragraph.
Add an abstract.
EML Best Practices
duplicateEntityName
There are no duplicate entity names
Checks that content is not duplicated by other entityName elements in the document
entityName is not a duplicate within the document
notChecked
Data Manager requires a non-empty, non-duplicate entityName value for every entity
Declare a non-empty entityName and ensure that there are no duplicate entityName values in the document
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-dataTable.html#numberOfRecords
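A minimal sketch of the non-empty, non-duplicate entityName requirement follows. It assumes the entityName values have already been extracted into a list; the function name and the return shape are hypothetical.

```python
from collections import Counter


def entity_name_problems(names):
    """Return (indexes of empty names, sorted list of duplicated names)."""
    stripped = [name.strip() for name in names]
    empty_indexes = [i for i, name in enumerate(stripped) if not name]
    counts = Counter(name for name in stripped if name)
    duplicates = sorted(name for name, count in counts.items() if count > 1)
    return empty_indexes, duplicates
```

Comparing names after stripping surrounding whitespace is a design choice made here for illustration; an exact string comparison would be stricter.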
entityNameLength
Length of entityName is not excessive (less than 100 characters)
Check that the length of the entity name is less than 100 characters
entityName value is 100 characters or less
notChecked
entityDescriptionPresent
An entity description is present
Check for presence of an entity description.
EML Best practices pp. 32-33, "...should have enough information for a user..."
notChecked
With entityName sometimes serving as a file name rather than a title, it is important to be very descriptive here.
numHeaderLinesPresent
'numHeaderLines' element is present
Check for presence of the 'numHeaderLines' element.
Document contains 'numHeaderLines' element.
info
If data file contains header lines, 'numHeaderLines' must be specified.
Add 'numHeaderLines' element if needed.
numFooterLinesPresent
'numFooterLines' element is present
Check for presence of the 'numFooterLines' element.
Document contains 'numFooterLines' element.
info
If data file contains footer lines, 'numFooterLines' must be specified.
Add 'numFooterLines' element if needed.
onlineURLs
Online URLs are live
Check that each online URL returns a response
true
notChecked
urlReturnsData
URL returns data
Checks whether a URL returns data. Unless the URL is specified to be function="information", the URL should return the resource for download.
A data entity that matches the metadata
notChecked
URL should return a data entity
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-resource.html#UrlType
displayDownloadData
Display downloaded data
Display the first kilobyte of data that is downloaded
Up to one kilobyte of data should be displayed
notChecked
recordDelimiterPresent
Record delimiter is present
Check presence of record delimiter. Check that the record delimiter is one of the suggested values.
A record delimiter from a list of suggested values: \n, \r, \r\n, #x0A, #x0D, #x0D#x0A
notChecked
The record delimiter is not present or is not one of the suggested values.
Add a record delimiter or change the record delimiter to one of the suggested values: \n, \r, \r\n, #x0A, #x0D, #x0D#x0A
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#recordDelimiter
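The comparison against the suggested values can be sketched as a simple set lookup. The function name is hypothetical, and the values are treated as the literal text found in the metadata (for example, a backslash followed by `n`), which is an assumption about how the element is written.

```python
# Literal strings as they would appear in the recordDelimiter element.
SUGGESTED_RECORD_DELIMITERS = {r"\n", r"\r", r"\r\n", "#x0A", "#x0D", "#x0D#x0A"}


def record_delimiter_ok(value) -> bool:
    """Return True if value is present and is one of the suggested values."""
    if value is None:
        return False
    return value.strip() in SUGGESTED_RECORD_DELIMITERS
```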
examineRecordDelimiter
Data are examined and possible record delimiters are displayed
If no record delimiter was specified, we assume that \r\n is the delimiter. Search the first row for any other record delimiters.
No other potential record delimiters expected in the first row.
notChecked
Detection of line endings may be automatic on most systems, so this may not be important.
Ensure that record delimiters are correctly specified.
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#recordDelimiter
fieldDelimiterValid
Field delimiter is a single character
Field delimiters should be one character only
A single character is expected
notChecked
fieldDelimiter should be a single character such as (,;:|) or an escape of a single character such as \t for tab. Decimal or hex values of ASCII characters may be used such as #32, #x20, or 0x20 for the space character.
Change the fieldDelimiter to a single character or its representation, either decimal, hex or escaped.
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#fieldDelimiter
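The accepted fieldDelimiter forms (a literal character, an escape such as \t, or a decimal or hex ASCII code) could be decoded along these lines. The function name is hypothetical, and only the \t escape is handled because it is the only one named above.

```python
import re

# Assumption: \t is the only escape sequence recognized in this sketch.
ESCAPES = {r"\t": "\t"}


def decode_field_delimiter(value: str):
    """Return the single delimiter character, or None if value is not valid."""
    if len(value) == 1:
        return value
    if value in ESCAPES:
        return ESCAPES[value]
    hex_match = re.fullmatch(r"(?:#x|0x)([0-9A-Fa-f]{1,2})", value)
    if hex_match:
        code = int(hex_match.group(1), 16)
        return chr(code) if code < 128 else None
    dec_match = re.fullmatch(r"#([0-9]{1,3})", value)
    if dec_match:
        code = int(dec_match.group(1))
        return chr(code) if code < 128 else None
    return None
```

A result of `None` corresponds to a delimiter that is neither a single character nor a recognized representation of one.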
attributeNamesUnique
Attribute names are unique
Checks if attributeName values are unique in the table. Not required by EML.
Unique attribute names.
notChecked
A good table does not have duplicate column names.
Check attribute names; best practice says these should be unique.
EML Best Practices
displayFirstInsertRow
Display first insert row
Display the first row of data values to be inserted into the database table
The first row of data values should be displayed
notChecked
tooFewFields
Data does not have fewer fields than metadata attributes
Compare number of fields specified in metadata to number of fields found in a data record
notChecked
A record has fewer fields than the specified number of attributes.
Check the row for problems (un-escaped delimiters, unquoted strings, too few fields). Also check these elements in metadata: collapseDelimiters, recordDelimiter, fieldDelimiter, quoteCharacter.
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#dataFormat
tooManyFields
Data does not have more fields than metadata attributes
Compare number of fields specified in metadata to number of fields found in a data record
notChecked
A record has more fields than the specified number of attributes.
Check the row for problems (un-escaped delimiters, unquoted strings, extra fields). Also check these elements in metadata: collapseDelimiters, recordDelimiter, fieldDelimiter, quoteCharacter.
http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#dataFormat
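The field-count comparison behind the tooFewFields and tooManyFields checks can be illustrated with Python's standard `csv` module. This is a sketch under stated assumptions: the function name and return labels are hypothetical, one record is passed in as a string, and collapseDelimiters is not handled.

```python
import csv
import io


def compare_field_count(record: str, attribute_count: int,
                        field_delimiter: str = ",",
                        quote_character: str = '"') -> str:
    """Compare the number of fields in one record with the attribute count."""
    reader = csv.reader(io.StringIO(record),
                        delimiter=field_delimiter,
                        quotechar=quote_character)
    rows = list(reader)
    # An empty record yields no rows, which is treated here as zero fields.
    field_count = len(rows[0]) if rows else 0
    if field_count < attribute_count:
        return "too few"
    if field_count > attribute_count:
        return "too many"
    return "match"
```

Because quoted fields are parsed as a unit, a delimiter inside quotes does not inflate the count, which is why an unquoted string containing the delimiter shows up as "too many".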
databaseTableCreated
Database table created
Status of creating a database table
A database table is expected to be generated from the EML attributes.
notChecked
dataLoadStatus
Data can be loaded into the database
Status of loading the data table into a database
No errors expected during data loading or data loading was not attempted for this data entity
notChecked
numberOfRecords
Number of records in metadata matches number of rows loaded
Compare number of records specified in metadata to number of records found in data
notChecked