Knowledge Center

Where the data comes from

The NIAID Data Ecosystem Discovery Portal harvests metadata from a wide range of NIAID-funded and generalist data sources to make it easier to find datasets, computational tools, and resource catalogs related to allergic, infectious, and immune-mediated diseases.

What is metadata?

Metadata is data about resources: who collected the data, how it was collected, what the resource contains, and so on. For a genome sequence, for example, the sequence of nucleotides is the data, while the author of that sequence is metadata. Other metadata include the date the data was modified, the measurementTechniques used to collect the data, the healthCondition that is the focus of the dataset (such as COVID-19, asthma, or autoimmune diseases), and more. Using the Discovery Portal, users can search these metadata to find datasets of interest.
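To make the distinction concrete, the sketch below separates the data of a made-up record from its metadata and runs a simple search over one metadata field. All values are invented for illustration, and the `matches` helper is hypothetical; it is not how the Discovery Portal implements search.

```python
# Hypothetical record: every value below is invented for illustration only.
record = {
    # The data itself: a (very short) nucleotide sequence.
    "data": "ATGCGTACCTGA",
    # The metadata: information about the data.
    "metadata": {
        "author": "Example Lab",
        "dateModified": "2024-01-15",
        "measurementTechnique": "whole genome sequencing",
        "healthCondition": "COVID-19",
    },
}


def matches(record, field, term):
    """Return True if `term` appears (case-insensitively) in a metadata field."""
    value = record["metadata"].get(field, "")
    return term.lower() in value.lower()


# Searching the metadata finds the record without reading the sequence itself.
found = matches(record, "healthCondition", "covid")
```

The search only looks at the `metadata` dictionary, which is the point: a dataset can be discovered from its description without ever reading the underlying data.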


Where does the data come from?

The Discovery Portal harvests metadata from a curated and continually expanding list of biomedical data repositories. These sources include:

  • Infectious and immune-mediated disease repositories
  • General biomedical data platforms with relevant records
  • Generalist repositories that may include useful datasets even if they are not exclusive to infectious and immune-mediated disease

The Discovery Portal homepage includes a table where you can view the sources and filter/sort by type or research domain. The Sources page provides more details about each source, including the last time the metadata was harvested.


How are the sources chosen?

Interviews with researchers about their needs informed the selection of the initial sources included in the Discovery Portal. The Discovery Portal is continuously growing, and the team is always looking for additional sources to add to the search platform.


How often is metadata harvested?

Currently, metadata is harvested every quarter. The Sources page lists the last time the metadata was collected from each source.

Is the metadata augmented?

Some of the metadata provided by the sources is cleaned so that it is more standardized and easier to search. Mostly, this involves standardizing the names of metadata fields (variables) so they are consistent between sources (read more about data schemas). These changes are tracked on the schema tab of the Sources page, which lists each metadata property from the source alongside its associated property in the NIAID Data Ecosystem.

You can also read more about metadata standardization here.
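As a rough illustration of field-name standardization, the sketch below renames the fields of one source record using a lookup table. The source field names, the mapping, and the `standardize` helper are all hypothetical; the actual mappings for each source are the ones shown on the schema tab of the Sources page.

```python
# Hypothetical mapping from one source's field names to standardized names.
# The real per-source mappings are listed on the Sources page.
FIELD_MAP = {
    "title": "name",
    "creators": "author",
    "updated": "dateModified",
}


def standardize(source_record, field_map):
    """Rename mapped fields; fields without a mapping keep their original name."""
    return {field_map.get(key, key): value for key, value in source_record.items()}


# Invented example record as it might arrive from a source.
raw = {
    "title": "Example dataset",
    "creators": "Example Lab",
    "updated": "2024-01-15",
    "doi": "10.0000/example",
}

standardized = standardize(raw, FIELD_MAP)
```

Once every source is mapped onto the same field names, a single search on a field such as `author` can cover records from all sources.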

📘 Accessing the standardized metadata

The standardized metadata can be freely accessed using the Discovery Portal's open metadata API.
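A minimal sketch of querying a metadata API from Python follows. The base URL is a placeholder, and the `q` and `size` query parameters are assumptions made for illustration; consult the API documentation for the real endpoint and parameter names. The helper names `build_query_url` and `fetch_records` are likewise hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint (an assumption): replace with the documented API URL.
BASE_URL = "https://api.example.org/v1/query"


def build_query_url(term, size=10, base_url=BASE_URL):
    """Build a search URL; the `q` and `size` parameter names are assumed."""
    return f"{base_url}?{urlencode({'q': term, 'size': size})}"


def fetch_records(term, size=10, base_url=BASE_URL):
    """Send the query and parse the JSON response (requires network access)."""
    with urlopen(build_query_url(term, size, base_url), timeout=30) as response:
        return json.load(response)


# Building the URL needs no network access.
url = build_query_url("asthma", size=5)
```

Keeping URL construction separate from the network call makes the query easy to inspect before anything is sent.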

