TABLE 3.
Non-exhaustive list of the main public omics databases (listed alphabetically).
| Databases | Types of data | Number of sites | Number of samples | Links |
| ArrayExpress | Healthy + Diseases | 5 types of molecules | 27,462 | https://www.ebi.ac.uk/arrayexpress/ |
| CCLE | Cell line cancers | 39 tissues | 1,457 | https://portals.broadinstitute.org/ccle |
| ColPortal | Healthy + Diseases | 48 | 253 | https://colportal.imib.es/colportal/index.jsf |
| CPTAC | Cancers | 10 tissues | 772 | https://proteomics.cancer.gov/programs/cptac |
| dbGAP | Healthy + Diseases | 1,513 studies | 2,935,530 | https://www.ncbi.nlm.nih.gov/gap/ |
| ENCODE | Healthy + Diseases | 94 | 7,536 | https://www.encodeproject.org |
| GDC | Cancers | 67 tissues | 84,031 | https://portal.gdc.cancer.gov |
| GEO | Healthy + Diseases | 55,176 entries | 1,957,921 | https://www.ncbi.nlm.nih.gov/geo/browse/ |
| gnomAD | Healthy + Diseases | 9 populations | 71,702 | https://gnomad.broadinstitute.org |
| GTEx | Healthy | 54 tissues | 17,382 | https://www.gtexportal.org/home/ |
| HMDB | Healthy + Diseases | 114,184 metabolites | 25,000 | https://hmdb.ca |
| ICGC | Cancers | 22 tissues | 24,289 | https://icgc.org |
| METABRIC | Breast cancers | 1 tissue | 2,509 | https://www.cbioportal.org/study/summary?id=brca_metabric |
| MGnify | Healthy + Diseases | 20 | 127,417 | https://www.ebi.ac.uk/metagenomics/ |
| Omics discovery index | Healthy + Cancers | 30 tissues | 92,846 | https://www.omicsdi.org |
| PCAWG | Cancers of ICGC | 20 tissues | 2,793 | https://dcc.icgc.org/pcawg |
| PDB | Healthy + Diseases | 5 types of polymer entities | 47,552 | https://www.rcsb.org |
| Roadmap epigenomics | Healthy + Diseases | 310 | 127 | https://egg2.wustl.edu/roadmap/web_portal/index.html |
| TARGET | Pediatric cancers | 16 tissues | 6,197 | https://ocg.cancer.gov/programs/target |
| TCGA | Cancers | 30 tissues | 11,315 | https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga |
| 1000 genomes | Healthy + Diseases | 26 populations | 2,504 | https://www.internationalgenome.org/home |
For each database, we report the type of data (healthy or disease), the number of sites (tissues, populations, experiments) available, the number of human samples available, and the link to the web database. Numbers are given only for the organism Homo sapiens. CCLE, Cancer Cell Line Encyclopedia; CPTAC, Clinical Proteomic Tumor Analysis Consortium; dbGAP, database of Genotypes and Phenotypes; ENCODE, Encyclopedia of DNA Elements; GDC, Genomic Data Commons; GEO, Gene Expression Omnibus; gnomAD, Genome Aggregation Database; GTEx, Genotype-Tissue Expression; HMDB, The Human Metabolome Database; ICGC, International Cancer Genome Consortium; METABRIC, Molecular Taxonomy of Breast Cancer International Consortium; PCAWG, Pan-Cancer Analysis of Whole Genomes; PDB, Protein Data Bank; SRA, Sequence Read Archive; TARGET, Therapeutically Applicable Research To Generate Effective Treatments; TCGA, The Cancer Genome Atlas.