2020 Nov 2;7:590842. doi: 10.3389/fmolb.2020.590842

TABLE 3.

Non-exhaustive list of the main public omics databases (listed alphabetically).

| Databases | Types of data | Number of sites | Number of samples | Links |
|---|---|---|---|---|
| ArrayExpress | Healthy + Diseases | 5 types of molecules | 27,462 | https://www.ebi.ac.uk/arrayexpress/ |
| CCLE | Cancer cell lines | 39 tissues | 1,457 | https://portals.broadinstitute.org/ccle |
| ColPortal | Healthy + Diseases | 48 | 253 | https://colportal.imib.es/colportal/index.jsf |
| CPTAC | Cancers | 10 tissues | 772 | https://proteomics.cancer.gov/programs/cptac |
| dbGAP | Healthy + Diseases | 1,513 studies | 2,935,530 | https://www.ncbi.nlm.nih.gov/gap/ |
| ENCODE | Healthy + Diseases | 94 | 7,536 | https://www.encodeproject.org |
| GDC | Cancers | 67 tissues | 84,031 | https://portal.gdc.cancer.gov |
| GEO | Healthy + Diseases | 55,176 entries | 1,957,921 | https://www.ncbi.nlm.nih.gov/geo/browse/ |
| gnomAD | Healthy + Diseases | 9 populations | 71,702 | https://gnomad.broadinstitute.org |
| GTEx | Healthy | 54 tissues | 17,382 | https://www.gtexportal.org/home/ |
| HMDB | Healthy + Diseases | 114,184 metabolites | 25,000 | https://hmdb.ca |
| ICGC | Cancers | 22 tissues | 24,289 | https://icgc.org |
| METABRIC | Breast cancers | 1 tissue | 2,509 | https://www.cbioportal.org/study/summary?id=brca_metabric |
| MGnify | Healthy + Diseases | 20 | 127,417 | https://www.ebi.ac.uk/metagenomics/ |
| Omics discovery index | Healthy + Cancers | 30 tissues | 92,846 | https://www.omicsdi.org |
| PCAWG | Cancers of ICGC | 20 tissues | 2,793 | https://dcc.icgc.org/pcawg |
| PDB | Healthy + Diseases | 5 types of polymer entities | 47,552 | https://www.rcsb.org |
| Roadmap epigenomics | Healthy + Diseases | 310 | 127 | https://egg2.wustl.edu/roadmap/web_portal/index.html |
| TARGET | Pediatric cancers | 16 tissues | 6,197 | https://ocg.cancer.gov/programs/target |
| TCGA | Cancers | 30 tissues | 11,315 | https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga |
| 1000 genomes | Healthy Diseases | 26 populations | 2,504 | https://www.internationalgenome.org/home |

For each database, we report the type of data (healthy or diseases), the number of sites (tissues, populations, experiments) available, the number of human samples available, and the link to the web database. Numbers are given only for the organism Homo sapiens. CCLE, Cancer Cell Line Encyclopedia; CPTAC, Clinical Proteomic Tumor Analysis Consortium; dbGAP, database of Genotypes and Phenotypes; ENCODE, Encyclopedia of DNA Elements; GDC, Genomic Data Commons; GEO, Gene Expression Omnibus; gnomAD, Genome Aggregation Database; GTEx, Genotype-Tissue Expression; HMDB, The Human Metabolome Database; ICGC, International Cancer Genome Consortium; METABRIC, Molecular Taxonomy of Breast Cancer International Consortium; PCAWG, Pan-Cancer Analysis of Whole Genomes; PDB, Protein Data Bank; SRA, Sequence Read Archive; TARGET, Therapeutically Applicable Research To Generate Effective Treatments; TCGA, The Cancer Genome Atlas.