Author manuscript; available in PMC: 2018 Jun 6.
Published in final edited form as: Nat Rev Genet. 2018 Feb 26;19(5):299–310. doi: 10.1038/nrg.2018.4

Table 1.

Data types for integrative omics

| Data type | Large-scale research efforts | Utility and advantages | Major caveats |
| --- | --- | --- | --- |
| Genetic variation | Many GWAS consortia, 1000 Genomes, gnomAD and UK Biobank | Unbiased source of genetic basis of disease and direct inference of causality | At least one step removed from the phenotype |
| Epigenetics | ENCODE and Roadmap Epigenomics Project | Functional impact and typically easy to infer causality | Not applicable for all phenotypes |
| Gene expression | GTEx and GEUVADIS | Inexpensive assay for an intermediate step towards the phenotype | Not applicable for all phenotypes |
| Proteomics and metabolomics | CPTAC, EDRN and Common Fund | Likely to be very close to the phenotype | Expensive and difficult to scale (proteomics) |
| Microbiome | Human Microbiome Project | Likely to be very close to the phenotype and measures a combination of genetic and environmental influences | Combination of genetic and environmental influences makes it difficult to infer the direction of causality |

In this table, ‘phenotype’ refers to an organismal phenotype. CPTAC, Clinical Proteomic Tumour Analysis Consortium; EDRN, Early Detection Research Network; ENCODE, Encyclopedia of DNA Elements; GEUVADIS, Genetic European Variation in Health and Disease; gnomAD, Genome Aggregation Database; GTEx, Genotype–Tissue Expression; GWAS, genome-wide association study.