Bioinformatics. 2019 Jun 4;35(22):4767–4769. doi: 10.1093/bioinformatics/btz456

eFORGE v2.0: updated analysis of cell type-specific signal in epigenomic data

Charles E Breeze ✉, Alex P Reynolds, Jenny van Dongen, Ian Dunham, John Lazar, Shane Neph, Jeff Vierstra, Guillaume Bourque, Andrew E Teschendorff, John A Stamatoyannopoulos, Stephan Beck
Editor: John Hancock
PMCID: PMC6853678  PMID: 31161210

Abstract

Summary

The Illumina Infinium EPIC BeadChip is a new high-throughput array for DNA methylation analysis, extending the earlier 450k array by over 400 000 new sites. Previously, a method named eFORGE was developed to provide insights into cell type-specific and cell-composition effects for 450k data. Here, we present a significantly updated and improved version of eFORGE that can analyze both EPIC and 450k array data. New features include analysis of chromatin states, transcription factor motifs and DNase I footprints, providing tools for epigenome-wide association study interpretation and epigenome editing.

Availability and implementation

eFORGE v2.0 is implemented as a web tool available from https://eforge.altiusinstitute.org and https://eforge-tf.altiusinstitute.org/.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

DNA methylation (DNAm) is the main epigenetic mark assayed in the study of human diseases. The new EPIC BeadChip developed by Illumina can detect DNAm at over 850 000 genomic sites, extending the promoter-centric coverage of the 450k array to enhancers identified by the ENCODE and FANTOM5 projects (Andersson et al., 2014; ENCODE Project Consortium, 2012), constituting a powerful and robust new tool for epigenome-wide association studies (EWAS) (Moran et al., 2016).

The original version of eFORGE (Breeze et al., 2016) employs multiple layers of epigenetic information, including data for open chromatin sites (DNase I hotspots) and histone marks (H3K4me1, H3K4me3, H3K27me3, H3K9me3 and H3K36me3), to detect cell types driving EWAS signal.

The updated version of eFORGE extends and enhances the tool, adding new features, such as simultaneous analysis across 15 chromatin states, detection of transcription factor (TF) motifs associated with EWAS signal, cumulative DNase I footprint analysis, EPIC array support and a browser to analyze TF occupancy in EWAS loci. Notably, we incorporate many of these features into a new web-based suite, eFORGE-TF, to aid the multilevel characterization of TF-associated EWAS mechanisms.

2 Description

eFORGE v2.0 is a software tool written in Perl and Python. eFORGE takes a list of EWAS array probes and tests them for overlap enrichment with epigenetic tracks using an extensive database of 815 individual datasets. eFORGE currently includes tracks for DNase I hotspots, 5 histone marks and 15 chromatin states. Probes can be filtered using a 1-kb proximity filter and can be input in either BED or probe ID format. Statistical enrichment analysis is performed using a binomial test against an array-specific background. eFORGE outputs key information including Benjamini–Yekutieli-corrected P-values, sample IDs and lists of probes overlapping tracks in each sample. Both static and interactive charts and tables are provided to view results (Fig. 1A).
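The enrichment step described above can be illustrated with a minimal Python sketch. It is not the eFORGE implementation: the sample names, overlap counts and background rates are hypothetical, and the background rate is assumed here to be the fraction of all array probes that overlap a given track. The sketch computes an upper-tail binomial P-value per sample and then applies the Benjamini–Yekutieli correction across samples.

```python
from math import comb


def binomial_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical inputs: 200 probes in the list; for each sample, the number of
# probes overlapping the track and the assumed array-wide overlap fraction.
n_probes = 200
samples = {
    "brain_sample": (30, 0.05),
    "blood_sample": (12, 0.05),
    "lung_sample": (9, 0.05),
}

names = list(samples)
raw = [binomial_sf(samples[s][0], n_probes, samples[s][1]) for s in names]
for name, p_raw, q in zip(names, raw, benjamini_yekutieli(raw)):
    print(f"{name}\tp={p_raw:.3g}\tq={q:.3g}")
```

Using an array-specific background rate, rather than a genome-wide one, keeps the test from rewarding regions that the array simply covers more densely.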

Fig. 1.

Output examples of eFORGE v2.0. (A) Enrichment in brain enhancers and TSS flanking chromatin states for the top 200 sites from an Illumina EPIC array study (Moran et al., 2016). (B) TF motifs with the highest enrichment for the top 1000 study probes, including two RFX5 motifs. (C) Distribution of study sites in aggregated RFX5 footprints from brain samples shows 6 of the 1000 top study probes overlapping 5 different positions within the motif (q-value = 0.038, hypergeometric test, BY correction).

eFORGE chromatin state enrichment analysis extends previous eFORGE analyses by breaking down cell type-specific signal into regulatory element classes (e.g. subcategories of promoters, enhancers and transcribed regions). Furthermore, the activity of many of these classes is known to be associated with the binding of sequence-specific TFs. Interestingly, DNAm changes can also arise as a consequence of the binding of sequence-specific TFs (Smith and Meissner, 2013).

To break down cell type-specific signal into particular TF groups, we have generated eFORGE-TF, a software suite that allows for TF analysis at three levels. First, it analyzes the EWAS probe list for TF motif enrichment across all known TF motifs (Fig. 1B). This analysis can optionally be conditioned on DNase I footprints. Second, eFORGE-TF performs a cumulative analysis across an aggregated set of DNase I footprints to visualize DNAm changes in relation to particular base pairs of an aggregated footprint (Fig. 1C). Third, eFORGE-TF provides a locus-specific gallery to inspect individual probes using a combination of TF motifs and DNase I footprints. These three analysis levels provide a robust and comprehensive assessment of TF associations for a given EWAS probe list.
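The cumulative footprint analysis (the second level above) can be pictured with a short Python sketch. This is an illustration under stated assumptions, not eFORGE-TF code: the coordinates are hypothetical, positions are assumed to be 0-based with half-open footprint intervals, and minus-strand instances are flipped so that offsets always read along the motif. The Fig. 1C caption reports a hypergeometric test, so an upper-tail hypergeometric probability is included; how eFORGE-TF defines the population and the success set is not stated in the text, and the arguments used here are placeholders.

```python
from collections import Counter
from math import comb


def motif_offsets(probe_positions, footprints):
    """Count probes at each motif-relative position across footprint instances.

    probe_positions: iterable of (chrom, pos), 0-based CpG coordinates.
    footprints: iterable of (chrom, start, end, strand), 0-based, half-open.
    """
    by_chrom = {}
    for chrom, pos in probe_positions:
        by_chrom.setdefault(chrom, []).append(pos)
    counts = Counter()
    for chrom, start, end, strand in footprints:
        for pos in by_chrom.get(chrom, []):
            if start <= pos < end:
                # Flip minus-strand instances so offset 0 is the motif start.
                offset = pos - start if strand == "+" else end - 1 - pos
                counts[offset] += 1
    return counts


def hypergeom_sf(k, population, successes, draws):
    """Upper tail P(X >= k) when drawing `draws` items without replacement."""
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, min(successes, draws) + 1)
    )
    return hits / total


# Hypothetical probes and footprint instances of a single motif.
probes = [("chr1", 105), ("chr1", 230), ("chr2", 52), ("chr2", 900)]
footprints = [
    ("chr1", 100, 110, "+"),
    ("chr1", 225, 235, "-"),
    ("chr2", 50, 60, "+"),
]

offset_counts = motif_offsets(probes, footprints)
for offset in sorted(offset_counts):
    print(f"motif position {offset}: {offset_counts[offset]} probe(s)")

# Placeholder numbers: 10 background probes, 3 of them in footprints, 2 drawn.
print(f"P(X >= 1) = {hypergeom_sf(1, 10, 3, 2):.4f}")
```

Aggregating by motif-relative offset is what lets sparse per-locus overlaps add up to a visible pattern across many instances of the same motif.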

In each gallery panel, eFORGE-TF includes an interactive browser for locus-specific analysis. This browser overlays multiple levels of information, such as footprints, TF motifs and genomic coordinates, providing an in-depth view into the TF associations of a particular locus. Such a tool can aid both EWAS interpretation and epigenome editing approaches, which can benefit from the characterization of local TF binding sites (Voigt and Reinberg, 2013).

More details, including an example eFORGE analysis, are available in the Supplementary Material.

3 Conclusion

This updated and improved version of eFORGE constitutes an extended and powerful tool for the analysis of Illumina EPIC and 450k array data.

Supplementary Material

btz456_Supplementary_Materials

Acknowledgements

We would like to acknowledge the IHEC Integrative analysis project for supporting this research.

Funding

C.E.B. was supported by a fellowship from the EU-FP7 project EpiTrain [316758]. Research in S.B.’s group was supported by the Wellcome Trust [99148]; Royal Society Wolfson Research Merit Award [WM100023] and EU-FP7 projects EpiTrain [316758]; EPIGENESYS [257082]; and BLUEPRINT [282510].

Conflict of Interest: none declared.

References

  1. Andersson R. et al. (2014) An atlas of active enhancers across human cell types and tissues. Nature, 507, 455–461.
  2. Breeze C.E. et al. (2016) eFORGE: a tool for identifying cell type-specific signal in epigenomic data. Cell Rep., 17, 2137–2150.
  3. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57–74.
  4. Moran S. et al. (2016) Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics, 8, 389–399.
  5. Smith Z.D., Meissner A. (2013) DNA methylation: roles in mammalian development. Nat. Rev. Genet., 14, 204–220.
  6. Voigt P., Reinberg D. (2013) Epigenome editing. Nat. Biotechnol., 31, 1097–1099.

Articles from Bioinformatics are provided here courtesy of Oxford University Press
