2021 Nov 5;126(7):981–993. doi: 10.1038/s41416-021-01612-6

Table 2. Resources.

| Resource | Description | URL |
| --- | --- | --- |
| ENCODE | The Encyclopedia of DNA Elements (ENCODE) Consortium maintains a portal of publicly available epigenetic datasets from a wide range of assays for identification of functional and regulatory elements, including many variations of RNA-seq, ChIP-seq, DNase-seq and DNA methylation arrays. | https://www.encodeproject.org/ |
| Roadmap Epigenomics | The NIH Roadmap Epigenomics Mapping Consortium is a resource that comprises publicly available epigenomic data from primary cells generated using a number of methods, such as histone modification ChIP-seq, RNA-seq and DNA methylation assays. | http://www.roadmapepigenomics.org/ |
| Vierstra.org | Digital genomic footprinting providing a high-resolution genome-wide consensus transcription-factor footprint index in 243 human cell and tissue types. Accessible through the ENCODE portal and UCSC browser. | https://www.vierstra.org/resources/dgf |
| Descartes | Single-cell ATAC-seq and gene expression data generated in a broad range of human foetal tissues (53 samples representing 15 organs), to create an atlas of linked cell-type-specific enhancers and genes. | https://descartes.brotmanbaty.org/bbi/human-chromatin-during-development/ |
| IHEC | The International Human Epigenome Consortium provides public access to high-resolution reference human epigenome maps via a data portal bringing together ENCODE, Roadmap Epigenomics, CEEHRC (Canadian Epigenetics, Environment and Health Research Consortium), and other data resources. It interfaces with UCSC, Ensembl and WashU browsers as well as Galaxy for data processing. | http://ihec-epigenomes.org/ |
| UCSC genome browser | This widely used browser has many tracks that are useful for annotation, including multiple SNP and variant tracks as well as tracks for resources such as ENCODE-integrated regulation and GTEx gene expression. | https://genome.ucsc.edu/ |
| Ensembl genome browser | An extensive resource of publicly available downloadable data along with a genome browser containing regulatory annotations, again including multiple ENCODE data tracks. | https://www.ensembl.org/index.html |
| WashU Epigenome Browser | A browser specifically designed for epigenetic data; the usual SNPs, variation and ENCODE data are available, as well as additional epigenomic datasets from IHEC. | http://epigenomegateway.wustl.edu/ |
| GTEx | The Genotype-Tissue Expression project is a database of tissue-specific gene expression and regulation data with downloadable and browsable QTLs, levels of expression, H3K27ac ChIP-seq and DNA methylation data. | https://www.gtexportal.org/home/ |
| GEO | Gene Expression Omnibus is a public functional genomics data repository supporting Minimum Information About a Microarray Experiment (MIAME)-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles. | https://www.ncbi.nlm.nih.gov/geo/ |
| METABRIC | The Molecular Taxonomy of Breast Cancer International Consortium is a large dataset of breast tumours and matched normal tissue with clinical, gene expression, copy-number aberration (CNA) and SNP data available via cBioPortal. | https://www.cbioportal.org/study/summary?id=brca_metabric |
| TCGA | The Cancer Genome Atlas is a collection of over 20,000 primary tumours and matched normal tissue across 33 cancer types with datasets encompassing clinical, whole exome, whole genome, DNA methylation, gene expression, microRNA and proteomic profiles. | https://www.cancer.gov/tcga |
| ICGC | The International Cancer Genome Consortium is a collection of 86 cancer genome profiling projects, including datasets generated by the TCGA consortium. These datasets include clinical, whole exome, whole genome, DNA methylation, gene expression, microRNA and proteomic profiles. | https://dcc.icgc.org/ |
| PCAWG | The Pan-Cancer Analysis of Whole Genomes from ICGC and TCGA includes more than 2600 cancer whole genomes across 38 cancer types explored for somatic and germline variation with particular emphasis on non-coding RNAs, cis-regulatory sites and large structural alterations. The data portal contains somatic and germline mutations (controlled access), DNA methylation, gene expression and clinical data. | https://dcc.icgc.org/pcawg |
| CCLE | The Cancer Cell Line Encyclopedia is a data portal including 1457 cancer cell lines encompassing gene and protein expression, DNA methylation, miRNA, mutation and CNA data. | https://portals.broadinstitute.org/ccle |