2024 Feb 9;7(1):21–27. doi: 10.1097/JP9.0000000000000173

Table 2.

Catalog of databases hosted by OmicsDI

PRIDE (PRoteomics IDEntifications)
-A centralized, standards-compliant, public data repository for mass spectrometry proteomics data. The data include protein and peptide identifications and corresponding expression values, post-translational modifications, and supporting mass spectra evidence, both as raw data and peak list files.
https://www.ebi.ac.uk/pride/archive/
PeptideAtlas
-A multi-organism, publicly accessible compendium of peptides identified in a large set of tandem mass spectrometry proteomics experiments. Mass spectrometer output files are collected for human, mouse, yeast, and several other organisms, and searched using the latest search engines and protein sequences.
http://www.peptideatlas.org/
MassIVE (Mass Spectrometry Interactive Virtual Environment)
-A community resource developed by the NIH-funded Center for Computational Mass Spectrometry to promote the global, free exchange of mass spectrometry data.
https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp
GPMDB (Global Proteome Machine Database)
-The GPMDB utilizes the information obtained from GPM servers to aid in the difficult process of validating peptide MS/MS spectra as well as protein coverage patterns.
https://www.thegpm.org/
jPOST Repository (Japan ProteOme STandard Repository)
-jPOSTrepo is a data repository for sharing raw and processed MS data.
https://jpostdb.org/
Physiome Model Repository
-The repository is intended to provide a “quantitative description of physiological dynamics and functional behavior of the intact organism.” Integration of this database into OmicsDI is at an early testing stage.
http://physiomeproject.org/
EGA (European Genome-Phenome Archive)
-The EGA provides a service for the permanent archiving and distribution of personally identifiable genetic and phenotypic data obtained from biomedical research projects.
https://ega-archive.org/
EVA (European Variation Archive)
-An open-access database of all types of genetic variation data from all species.
https://www.ebi.ac.uk/eva/
ENA (European Nucleotide Archive)
-An open, supported platform for the management, sharing, integration, archiving and dissemination of sequence data.
https://www.ebi.ac.uk/ena/browser/home
LINCS (Library of Integrated Network-based Cellular Signatures)
-The LINCS program is an NIH Common Fund program that includes six Data and Signature Centers: Drug Toxicity Signature Generation Center, HMS LINCS Center, LINCS Center for Transcriptomics, LINCS Proteomic Characterization Center for Signaling and Epigenetics, MEP LINCS Center, and NeuroLINCS Center. The extensive, well-annotated datasets generated by the centers along with relevant experimental information are made openly available with user-friendly search interfaces to access and download the signatures and tools to display the data and perform integrative analysis.
http://lincsportal.ccs.miami.edu/dcic-portal/
https://lincsproject.org
PAXDB (Protein Abundance Database)
-A comprehensive absolute protein abundance database, which contains whole genome protein abundance information across organisms and tissues. Furthermore, it provides information about inter-species variation of protein abundances.
https://pax-db.org/
Cell Collective
-Interactive modeling of biological networks. Integration of this database into OmicsDI is at an early testing stage.
https://cellcollective.org/#
MetaboLights
-MetaboLights is a recommended repository for depositing information derived from metabolomics experiments. The database is cross-species and cross-technique, and covers metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments.
https://www.ebi.ac.uk/metabolights/index
Metabolomics Workbench
-A national and international repository for metabolomics data and metadata that provides analysis tools and access to metabolite standards, protocols, tutorials, training, and more.
https://www.metabolomicsworkbench.org/about/index.php
GNPS (Global Natural Products Social Molecular Networking)
-The GNPS is a platform for providing an overview of the molecular features in mass spectrometry-based metabolomics by comparing fragmentation patterns to identify chemical relationships.
https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp
BioModels
-BioModels is a repository of mathematical models of biological and biomedical systems. It hosts a vast selection of existing literature-based physiologically and pharmaceutically relevant mechanistic models in standard formats.
https://www.ebi.ac.uk/biomodels/
FAIRDOMHub
-The FAIRDOMHub is built upon the FAIRDOM-SEEK software suite, which is an open source web platform for sharing scientific research assets, processes, and outcomes. Integration of this database into OmicsDI is at an early testing stage.
https://fairdomhub.org/
ArrayExpress
-ArrayExpress archives functional genomics data from microarray and sequencing platforms to support reproducible research. Experiments are submitted directly to ArrayExpress or are imported from the NCBI Gene Expression Omnibus (GEO) database.
https://www.ebi.ac.uk/arrayexpress/
dbGaP (database of Genotypes and Phenotypes)
-dbGaP was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans.
https://www.ncbi.nlm.nih.gov/gap/
Expression Atlas
-The Expression Atlas provides information on gene and protein expression patterns under different biological conditions. Gene expression data can be acquired and re-analyzed to detect genes showing interesting baseline and differential expression patterns.
https://www.ebi.ac.uk/gxa/home
GEO (NCBI Gene Expression Omnibus)
-GEO is a public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users to query and download curated gene expression profiles with associated experimental information as well as perform differential expression analysis.
https://www.ncbi.nlm.nih.gov/geo/
NODE (The National Omics Data Encyclopedia)
-NODE provides an integrated, compatible, comparable, and scalable multi-omics resource platform that supports flexible data management and effective data release. NODE uses a hierarchical data architecture to support storage of multi-omics data including sequencing data, MS-based proteomics data, MS or NMR-based metabolomics data, and fluorescence imaging data.
https://www.biosino.org/node/
BioStudies
-The BioStudies database allows users to explore datasets from genomic studies, deposited by a range of data providers. Access to datasets must be approved by the specified Data Access Committee (DAC).
https://www.ebi.ac.uk/biostudies/
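
For readers who want to work with this catalog programmatically, the name, URL, and integration-status note of each row can be held in a small record type. The following is a minimal sketch only: the `CatalogEntry` class, its `early_testing` field, and the helper function are hypothetical names introduced here for illustration, and the list is a hand-copied subset of Table 2 rather than anything retrieved from OmicsDI itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One row of Table 2 (hypothetical record type, not an OmicsDI structure)."""
    name: str
    url: str
    # True where Table 2 states that OmicsDI integration is at an early testing stage.
    early_testing: bool = False


# Hand-copied subset of Table 2; the remaining rows would be added the same way.
CATALOG = [
    CatalogEntry("PRIDE", "https://www.ebi.ac.uk/pride/archive/"),
    CatalogEntry("Physiome Model Repository", "http://physiomeproject.org/", early_testing=True),
    CatalogEntry("Cell Collective", "https://cellcollective.org/#", early_testing=True),
    CatalogEntry("BioModels", "https://www.ebi.ac.uk/biomodels/"),
    CatalogEntry("FAIRDOMHub", "https://fairdomhub.org/", early_testing=True),
    CatalogEntry("MetaboLights", "https://www.ebi.ac.uk/metabolights/index"),
]


def early_testing_names(catalog):
    """Return the names of entries flagged as early-stage integrations, in table order."""
    return [entry.name for entry in catalog if entry.early_testing]


if __name__ == "__main__":
    for name in early_testing_names(CATALOG):
        print(name)
```

Keeping the status note as an explicit boolean, rather than leaving it buried in the description text, makes it straightforward to exclude resources whose integration may still change.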