F1000Res. 2018 Jun 4;6:1618. Originally published 2017 Aug 31. [Version 2] doi: 10.12688/f1000research.12344.2

Table 1. Overview of some representative databases, registries and other tools to find life science data.

A more complete list can be found at FAIRsharing.

| Database/registry | Name | Description | Datatypes | URL |
|---|---|---|---|---|
| Database | Gene Ontology | Repository of functional roles of gene products, including proteins, ncRNAs, and complexes | Functional roles as determined experimentally or through inference. Includes evidence for these roles and links to literature | http://geneontology.org/ |
| Database | Kyoto Encyclopedia of Genes and Genomes (KEGG) | Repository for pathway relationships of molecules, genes and cells, especially molecular networks | Protein, gene, cell, and genome pathway membership data | http://www.genome.jp/kegg/ |
| Database | OrthoDB | Repository for gene ortholog information | Protein sequences and orthologous group annotations for evolutionarily related species groups | http://www.orthodb.org/ |
| Database with analysis layer | eggNOG | Repository for gene ortholog information with functional annotation prediction tool | Protein sequences, orthologous group annotations and phylogenetic trees for evolutionarily related species groups | http://eggnogdb.embl.de/ |
| Database | European Nucleotide Archive (ENA) | Repository for nucleotide sequence information | Raw next-generation sequencing data, genome assembly and annotation data | http://www.ebi.ac.uk/ena |
| Database | Sequence Read Archive (SRA) | Repository for nucleotide sequence information | Raw high-throughput DNA sequencing and alignment data | https://www.ncbi.nlm.nih.gov/sra/ |
| Database | GenBank | Repository for nucleotide sequence information | Annotated DNA sequences | https://www.ncbi.nlm.nih.gov/genbank/ |
| Database | ArrayExpress | Repository for genomic expression data | RNA-seq, microarray, ChIP-seq, Bisulfite-seq and more (see https://www.ebi.ac.uk/arrayexpress/help/experiment_types.html for full list) | https://www.ebi.ac.uk/arrayexpress/ |
| Database | Gene Expression Omnibus (GEO) | Repository for genetic/genomic expression data | RNA-seq, microarray, real-time PCR data on gene expression | https://www.ncbi.nlm.nih.gov/geo/ |
| Database | PRIDE | Repository for proteomics data | Protein and peptide identifications, post-translational modifications and supporting spectral evidence | https://www.ebi.ac.uk/pride/archive/ |
| Database | Protein Data Bank (PDB) | Repository for protein structure information | 3D structures of proteins, nucleic acids and complexes | https://www.wwpdb.org/ |
| Database | MetaboLights | Repository for metabolomics experiments and derived information | Metabolite structures, reference spectra and biological characteristics; raw and processed metabolite profiles | http://www.ebi.ac.uk/metabolights/ |
| Ontology/database | ChEBI | Ontology and repository for chemical entities | Small molecule structures and chemical properties | https://www.ebi.ac.uk/chebi/ |
| Database | Taxonomy | Repository of taxonomic classification information | Taxonomic classification and nomenclature data for organisms in public NCBI databases | https://www.ncbi.nlm.nih.gov/taxonomy |
| Database | BioStudies | Repository for descriptions of biological studies, with links to data in other databases and publications | Study descriptions and supplementary files | https://www.ebi.ac.uk/biostudies/ |
| Database | BioSamples | Repository for information about biological samples, with links to data generated from these samples located in other databases | Sample descriptions | https://www.ebi.ac.uk/biosamples/ |
| Database with analysis layer | IntAct | Repository for molecular interaction information | Molecular interactions and evidence type | http://www.ebi.ac.uk/intact/ |
| Database | UniProtKB (SwissProt and TrEMBL) | Repository for protein sequence and function data. Combines curated (UniProtKB/SwissProt) and automatically annotated, uncurated (UniProtKB/TrEMBL) databases | Protein sequences, protein function and evidence type | http://www.uniprot.org/ |
| Database | European Genome-Phenome Archive | Controlled-access repository for sequence and genotype experiments from human participants whose consent agreements authorise data release for specific research use | Raw, processed and/or analysed sequence and genotype data along with phenotype information | https://www.ebi.ac.uk/ega/ |
| Database with analysis layer | EBI Metagenomics | Repository and analysis service for metagenomics and metatranscriptomics data. Data is archived in ENA | Next-generation sequencing metagenomic and metatranscriptomic data; metabarcoding (amplicon-based) data | https://www.ebi.ac.uk/metagenomics/ |
| Database with analysis layer | MG-RAST | Repository and analysis service for metagenomics data | Next-generation sequencing metagenomic and metabarcoding (amplicon-based) data | http://metagenomics.anl.gov/ |
| Registry | Omics DI | Registry for dataset discovery that currently spans 11 data repositories: PRIDE, PeptideAtlas, MassIVE, GPMDB, EGA, MetaboLights, Metabolomics Workbench, MetabolomeExpress, GNPS, ArrayExpress, ExpressionAtlas | Genomic, transcriptomic, proteomic and metabolomic data | http://www.omicsdi.org |
| Registry | DataMed | Registry for biomedical dataset discovery that currently spans 66 data repositories | Genomic, transcriptomic, proteomic, metabolomic, morphology, cell signalling, imaging and other data | https://datamed.org |
| Registry | Biosharing | Curated registry for biological databases, data standards, and policies | Information on databases, standards and policies including fields of research and usage recommendations by key organisations | https://biosharing.org/ |
| Registry | re3data | Registry for research data repositories across multiple research disciplines | Information on research data repositories, terms of use, research fields | http://www.re3data.org |
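
As an illustration of how Table 1 might be used to locate a suitable resource, the following minimal Python sketch stores a hand-copied subset of its rows as records and filters them by a keyword in the Datatypes column. The `Resource` class, the `RESOURCES` list and the `find_resources` function are hypothetical names introduced here for illustration only; they are not part of the original article or of any of the listed services, and the sketch does not contact any of the URLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One row of Table 1 (hypothetical record type, for illustration)."""
    kind: str       # first column, e.g. "Database" or "Registry"
    name: str
    datatypes: str
    url: str


# Hand-copied subset of Table 1; not a complete or authoritative list.
RESOURCES = [
    Resource("Database", "European Nucleotide Archive (ENA)",
             "Raw next-generation sequencing data, genome assembly and "
             "annotation data",
             "http://www.ebi.ac.uk/ena"),
    Resource("Database", "PRIDE",
             "Protein and peptide identifications, post-translational "
             "modifications and supporting spectral evidence",
             "https://www.ebi.ac.uk/pride/archive/"),
    Resource("Database", "MetaboLights",
             "Metabolite structures, reference spectra and biological "
             "characteristics; raw and processed metabolite profiles",
             "http://www.ebi.ac.uk/metabolights/"),
    Resource("Database with analysis layer", "MG-RAST",
             "Next-generation sequencing metagenomic and metabarcoding "
             "(amplicon-based) data",
             "http://metagenomics.anl.gov/"),
    Resource("Registry", "Omics DI",
             "Genomic, transcriptomic, proteomic and metabolomic data",
             "http://www.omicsdi.org"),
]


def find_resources(keyword, kind=None):
    """Return rows whose Datatypes text contains `keyword` (case-insensitive).

    If `kind` is given, keep only rows whose first column starts with it,
    so kind="Database" also matches "Database with analysis layer".
    """
    needle = keyword.lower()
    return [
        r for r in RESOURCES
        if needle in r.datatypes.lower()
        and (kind is None or r.kind.startswith(kind))
    ]


if __name__ == "__main__":
    # List resources from the subset that mention sequencing data.
    for resource in find_resources("sequencing"):
        print(resource.name, resource.url)
```

A plain substring match is used because the Datatypes column is free text; a real lookup would more likely query a registry such as those listed in the table rather than a hard-coded list.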