Schemas
Metadata standardization is central to the Discovery Portal. It ensures that datasets are described in the same language regardless of the data provider, which makes the metadata searchable. Standardization relies on a schema, which defines which properties describe a dataset and how they are expressed.
The NIAID Data Ecosystem Dataset schema
The Dataset schema is based on a widely used set of schemas, schema.org. Several changes were made when these schemas were adapted:
- Simplification: These schemas tend to be exhaustive. The goal was to focus on the small handful of properties that are actively used by repositories and that are most useful to help researchers find resources.
- Decreased ambiguity / standardization: Sometimes multiple properties could be used for the same purpose. If the terms are synonyms, one is selected to make the metadata more uniform. For instance, the property `author` is used instead of `creator` to express who created the dataset.
- More defined structure: Each property has a defined expected type, a cardinality (whether the property holds one thing or a list of many things), and a marginality (whether the property is required, recommended, or optional for any new data collection).
- Additional biologically relevant fields: A number of fields were added to promote searching of immune-mediated and infectious disease data, such as `species` (the host that is the subject of the dataset), `infectiousAgent` (the pathogen that is the subject of the dataset), `healthCondition` (the health condition, like COVID-19 or diabetes, that is the subject of the dataset), and `funding` (who paid for its creation).
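The ideas in the list above (one selected term per synonym, an expected type, a cardinality, and a marginality for each property) can be sketched as a small validation routine. The following is a minimal Python illustration, not the NIAID Data Ecosystem's actual schema or validator. The `PropertySpec` class, the `normalize` and `validate` functions, the `name` property, and the type, cardinality, and marginality assigned to each property are all assumptions made for the example; only `author`, `creator`, `species`, `infectiousAgent`, `healthCondition`, and `funding` come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySpec:
    expected_type: type  # type expected for each value
    many: bool           # cardinality: True if a list of values is allowed
    marginality: str     # "required", "recommended", or "optional"


# Hypothetical spec. The values below are assumed for illustration and are
# NOT taken from the real NIAID Data Ecosystem Dataset schema.
SPEC = {
    "name": PropertySpec(str, False, "required"),  # assumed property
    "author": PropertySpec(str, True, "required"),
    "species": PropertySpec(str, True, "recommended"),
    "infectiousAgent": PropertySpec(str, True, "recommended"),
    "healthCondition": PropertySpec(str, True, "recommended"),
    "funding": PropertySpec(str, True, "optional"),
}

# Synonyms are mapped to the single selected term.
SYNONYMS = {"creator": "author"}


def normalize(record):
    """Rename synonym properties to the selected term."""
    return {SYNONYMS.get(key, key): value for key, value in record.items()}


def validate(record):
    """Return (errors, warnings) for a normalized record."""
    errors, warnings = [], []
    for key, spec in SPEC.items():
        if key not in record:
            if spec.marginality == "required":
                errors.append(f"missing required property: {key}")
            elif spec.marginality == "recommended":
                warnings.append(f"missing recommended property: {key}")
            continue
        value = record[key]
        if isinstance(value, list) and not spec.many:
            errors.append(f"{key} must be a single value")
        values = value if isinstance(value, list) else [value]
        for item in values:
            if not isinstance(item, spec.expected_type):
                errors.append(f"{key} expects {spec.expected_type.__name__}")
    return errors, warnings


# Example record as a provider might submit it, using the synonym `creator`.
submitted = {
    "name": "Example dataset",
    "creator": ["A. Researcher"],
    "species": ["Homo sapiens"],
    "healthCondition": ["COVID-19"],
}
record = normalize(submitted)
errors, warnings = validate(record)
```

In this sketch a missing required property is reported as an error, a missing recommended property as a warning, and a missing optional property is ignored. A real schema would also describe nested objects, which are left out here for brevity.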
Using community standards promotes interoperability with other data findability projects.
📘 NIAID Data Ecosystem schemas
View the NIAID Data Ecosystem schemas on the Data Discovery Engine
Adapting the NIAID Data Ecosystem schemas for another purpose
Users who are collecting dataset metadata for a lab, consortium, or repository are welcome to use or adapt these schemas for their own purposes. The Data Discovery Engine provides a number of tools that allow users to view, compare, reuse, and extend schemas.
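As a rough picture of what adapting a schema can involve, the sketch below extends a base set of property definitions with a lab-specific field and tightens one existing rule. It is plain Python and does not reflect the Data Discovery Engine's file format or tooling. `BASE_PROPERTIES`, `extend_schema`, and the `assayType` property are hypothetical names used only for illustration, and the marginality values are assumed.

```python
import copy

# Hypothetical base definitions; the values are assumed, not taken from
# the real NIAID Data Ecosystem schemas.
BASE_PROPERTIES = {
    "author": {"type": "string", "many": True, "marginality": "required"},
    "species": {"type": "string", "many": True, "marginality": "recommended"},
    "funding": {"type": "string", "many": True, "marginality": "optional"},
}


def extend_schema(base, additions=None, overrides=None):
    """Return a new property map; the base map is left unmodified."""
    extended = copy.deepcopy(base)
    # Overrides may only change properties that already exist.
    for name, changes in (overrides or {}).items():
        if name not in extended:
            raise KeyError(f"cannot override unknown property: {name}")
        extended[name].update(changes)
    # Additions may only introduce properties that do not exist yet.
    for name, definition in (additions or {}).items():
        if name in extended:
            raise ValueError(f"property already defined: {name}")
        extended[name] = copy.deepcopy(definition)
    return extended


# A lab adds its own field and makes funding information mandatory.
LAB_PROPERTIES = extend_schema(
    BASE_PROPERTIES,
    additions={
        "assayType": {"type": "string", "many": True, "marginality": "recommended"},
    },
    overrides={"funding": {"marginality": "required"}},
)
```

Keeping the base definitions unmodified means the extended schema can still be compared against the original, which is the kind of comparison the tools mentioned above are meant to support.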
View the NIAID Data Ecosystem schemas
Extend a schema using the Data Discovery Engine