This is the alpha version of the NIAID Data Ecosystem Discovery Portal.
Frequently Asked Questions
What can the Discovery Portal do for me?
You can use the Discovery Portal to:
- Search across millions of datasets from numerous sources, including datasets you may not have known about, to bring other dimensions into your analyses.
- Download metadata or access via API to gather new insights about what’s available.
- Track research across funding programs or specific scientific areas.
What data is available in the Discovery Portal?
The Discovery Portal retrieves data related to allergic, infectious and immune-mediated disease (IID). It aggregates data across numerous sources, including NIAID-supported repositories, general biomedical repositories, and other generalist sources. See our list of data sources.
How do I suggest a data source that you don't have?
If there are data sources you’d like the Discovery Portal to include, you may suggest a source here.
How do I download the data?
Data is not stored in the Discovery Portal and cannot be downloaded directly from it. You can use the Discovery Portal to search across repositories and find out whether and where data exists. All results link to the data source, so you can access the data from the data provider.
How do you handle controlled-access data?
The Discovery Portal does not provide direct access to data. It is a searchable interface that helps you find out whether and where data exists. When you find data you want to use, you must follow the link to the data provider's site to request access.
When I've found my dataset, can you help me get access to the data?
The Discovery Portal can help you find datasets and link you to the data provider, but it cannot provide direct access to the data. Once you've found data you want to use, click the button "View data in source repository" and follow the data provider's guidelines to gain access from their site.
Can I preview the data before I access it?
Currently, data cannot be previewed within the Discovery Portal. Some data providers offer a preview before you access or download the dataset, but only on their own sites. Click the button "View data in source repository" to see if the data can be previewed.
How do I access metadata via API?
All the metadata we harvest can be accessed through the API at api.data.niaid.nih.gov/. You can also find API documentation there.
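As a minimal sketch of preparing an API request, the snippet below assembles a search URL without sending it. The host comes from the answer above; the `/v1/query` path and the `q` and `size` parameter names are assumptions based on common BioThings-style APIs, and the function name is illustrative. Check the API documentation for the actual endpoints before relying on it.

```python
from urllib.parse import urlencode

# Host is taken from the FAQ. The path and parameter names below are
# ASSUMPTIONS (BioThings-style convention), not confirmed by this page.
API_BASE = "https://api.data.niaid.nih.gov"
QUERY_PATH = "/v1/query"  # assumed path


def build_query_url(terms: str, size: int = 10) -> str:
    """Return a search URL for the given terms without sending a request."""
    if size < 0:
        raise ValueError("size must be non-negative")
    # urlencode percent-encodes values and joins them with "&".
    params = urlencode({"q": terms, "size": size})
    return f"{API_BASE}{QUERY_PATH}?{params}"
```

Calling `build_query_url("influenza vaccine", size=5)` yields a URL you could paste into a browser or pass to an HTTP client of your choice.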
Can I upload my own data?
Data cannot be uploaded to the Discovery Portal. The Discovery Portal is not a repository and does not store data.
How do I use the Advanced Search tool?
Learn all about the Advanced Search tool here.
Can I edit the string query I created using Advanced Search?
You can view your string query in the Advanced Search tool. Click the expandable button "view raw query" at the bottom of the Advanced Search page.
Can I write my own advanced string queries?
Users can write their own fielded queries in the Discovery Portal. Learn more about writing your own fielded queries (including available fields you can search).
How do operators and syntax work in the Discovery Portal?
Learn all about writing your own fielded queries, including the nitty-gritty of operators and syntax here.
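To give a rough idea of what a fielded query looks like, here is a small sketch that joins field and value pairs into a single query string. It assumes a Lucene-style `field:"value"` syntax with `AND`/`OR` operators, and the field names in the usage note are hypothetical. The fielded-query documentation is the authority on which fields and operators the Discovery Portal actually accepts.

```python
from typing import Mapping


def quote_value(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def fielded_query(clauses: Mapping[str, str], operator: str = "AND") -> str:
    """Join field:"value" clauses with a boolean operator (assumed syntax)."""
    if operator not in {"AND", "OR"}:
        raise ValueError("operator must be 'AND' or 'OR'")
    parts = [f"{field}:{quote_value(value)}" for field, value in clauses.items()]
    return f" {operator} ".join(parts)
```

For example, `fielded_query({"name": "influenza", "description": "vaccine trial"})` produces a string that restricts each term to one field; `name` and `description` stand in for whatever fields the documentation lists.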
Why can't I download data directly from the Discovery Portal?
The Discovery Portal is not a repository. You can use it to search across repositories and find out whether and where data exists, which helps support the FAIRness of data. However, you cannot analyze data in the Discovery Portal or download data directly from it.
Why are some metadata fields empty?
The Discovery Portal attempts to standardize metadata that is available; however, it cannot create information that does not exist. If metadata is missing at the source, it will also be absent within the Discovery Portal.
Why am I seeing some results that aren't related to immune-mediated or infectious disease?
Because the Discovery Portal aggregates data from some generalist repositories, search results may include datasets that are not related to allergic, infectious, and immune-mediated disease (IID).
Why am I seeing some duplicate results?
The Discovery Portal searches across some sources that are also data aggregators. This means you may see two records for the same dataset from two different sources.
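If you work with downloaded metadata and want to spot such repeats yourself, one simple approach is to group records by a normalized dataset name. The sketch below is a user-side illustration only, not something the Discovery Portal does; the `name` and `source` keys are hypothetical, and matching on names alone will miss duplicates whose titles differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def normalize_name(name: str) -> str:
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def likely_duplicates(records: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Map each normalized name seen more than once to the sources listing it."""
    sources_by_name: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        # "name" and "source" are hypothetical keys for this illustration.
        sources_by_name[normalize_name(record["name"])].append(record["source"])
    return {
        name: sources
        for name, sources in sources_by_name.items()
        if len(sources) > 1
    }
```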
Why don't you include all data from all infectious disease repositories?
The Discovery Portal does not aggregate every dataset from every source related to allergic, infectious, and immune-mediated disease (IID). It pulls from a list of data sources. If there are data sources you’d like included, you may suggest a source.
How does the Discovery Portal retrieve results?
NIAID Data Ecosystem Discovery Crawlers harvest dataset metadata from a variety of repositories and other sources. The Discovery Portal uses custom translators written in Python to transform the harvested metadata into a common schema derived from schema.org. An API built on the BioThings SDK provides access to this metadata. When you perform a basic search, the Discovery Portal looks for your terms anywhere within the metadata record. You can narrow your search to specific fields through advanced searching, or you can use the filters.
How do you standardize the metadata?
The Discovery Portal standardizes metadata to a common schema based on schema.org, so datasets from different data providers are described in the same way. View the Data Ecosystem schemas on the Data Discovery Engine, or read more about our Schemas. The Sources page describes how the metadata provided by each repository is translated into the Data Ecosystem schema.
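The idea of translating a repository's metadata into a common schema can be sketched as a simple field mapping. This is an illustration under assumptions, not the portal's actual translator: the source field names (`title`, `summary`, `link`, `released`) are hypothetical, while the target names are standard schema.org Dataset properties. Fields that are missing or empty at the source are left out rather than filled in.

```python
from typing import Any, Dict

# Hypothetical source field names mapped to schema.org Dataset properties.
FIELD_MAP = {
    "title": "name",
    "summary": "description",
    "link": "url",
    "released": "datePublished",
}


def translate(source_record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a source record onto common-schema names, skipping empty values."""
    record: Dict[str, Any] = {"@type": "Dataset"}
    for source_field, target_field in FIELD_MAP.items():
        value = source_record.get(source_field)
        # Missing or empty source metadata stays absent in the output.
        if value not in (None, "", []):
            record[target_field] = value
    return record
```

A source record with an empty `summary` comes out with no `description` key at all, which is why some fields appear empty in search results.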
How are my search terms matched to results?
When you perform a basic search, the Discovery Portal looks for your terms anywhere within the metadata record. You can narrow your search to specific fields through advanced searching.
How are results ordered/ranked?
While a basic search will retrieve all results that contain your terms anywhere within the metadata record, the Discovery Portal gives the most weight to results that contain your terms in the dataset name.
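A toy model can make this weighting concrete. The sketch below counts term matches in every field of a record and multiplies matches in the `name` field by a larger weight. The weights, the substring matching, and the function names are all assumptions chosen for illustration; the Discovery Portal's actual ranking is not described here and will differ.

```python
from typing import Any, Dict, List, Sequence


def score(record: Dict[str, Any], terms: Sequence[str],
          name_weight: float = 3.0, other_weight: float = 1.0) -> float:
    """Sum weighted term matches; matches in "name" count more (toy weights)."""
    terms_lc = [term.lower() for term in terms]
    total = 0.0
    for field, value in record.items():
        text = str(value).lower()
        weight = name_weight if field == "name" else other_weight
        total += weight * sum(1 for term in terms_lc if term in text)
    return total


def rank(records: List[Dict[str, Any]], terms: Sequence[str]) -> List[Dict[str, Any]]:
    """Keep records matching any term, ordered from highest score to lowest."""
    matches = [record for record in records if score(record, terms) > 0]
    return sorted(matches, key=lambda record: score(record, terms), reverse=True)
```

With these toy weights, a record whose name contains the search term outranks one that mentions the term only in its description, and records with no match are dropped.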
How often are results updated?
Currently, the Discovery Portal harvests metadata every quarter. The Sources page lists the last time the metadata was collected from the source.
I want to search on additional metadata fields.
The Advanced Search tool allows users to construct queries based on dozens of metadata fields. You can browse or search for fields, preview the number of records for each field, and enter your search terms. Click the button "Advanced Search" above the Discovery Portal search bar to open the tool.
I want to filter on additional metadata fields.
The Discovery Portal provides filters to help users narrow down their searches based on metadata that is available. To suggest enhancements to the filtering capabilities, submit issues or make suggestions on GitHub.
I want to view additional metadata fields before I access the data.
The Discovery Portal provides dataset details to make it easy for users to get a sense of what each resource contains when browsing search results. This is based on metadata that is available at the source. To suggest metadata enhancements, you can submit issues or make suggestions on GitHub.
How do I download the metadata?
To download metadata, go to the Search Results page and click the button "Download Metadata."
Where do I ask questions or send feedback?
For any questions, contact NIAIDDataEcosystem@mail.nih.gov. You can also submit issues or make suggestions on GitHub.
If you have any other questions not covered in the documents above, please reach out to the team at NIAIDDataEcosystem@mail.nih.gov.