2015 Dec 3;17(6):967–979. doi: 10.1093/bib/bbv101

Table 2.

Overview of data and features used for enhancer identification

| Data source | Feature example | Advantage | Disadvantage | Representative methods |
| --- | --- | --- | --- | --- |
| Evolutionary conservation | Conserved motifs across species | Easy to compute | Insufficient information for predicting enhancers' tissue-specific activity | [20] |
| Histone marks | ChIP-seq for H3K4me1 | Provides cell-line-/tissue-specific information that characterizes enhancers and distinguishes categories of enhancers (e.g. poised versus active) | Different cell lines/tissues are associated with different combinations of histone marks | [21, 28, 33, 34] |
| TFBSs | ChIP-seq for P300 | Provides cell-line-/tissue-specific information that characterizes enhancers. High-resolution data for testing activity of enhancer-related TFs | Not available for many cell lines/tissues | [23, 29] |
| Open chromatin | DHS | High discriminative capacity when combined with other data types, e.g. P300-binding sites | Regions with enriched DHS activity do not necessarily correspond to enhancers | [25] |
| Sequence characteristics | k-mers of size 5 | Easy to compute | Insufficient information for predicting enhancers' activity across different tissues | [39, 51] |
| eRNA expression | CAGE data | High accuracy | eRNA regulation mechanisms are unknown, and not all enhancers are known to produce eRNAs | [40] |
| Enhancer-screening data | STARR-seq | High accuracy for testing enhancer activity | Not useful for ab initio discovery of enhancers | [42, 43, 52] |
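
The table lists k-mers of size 5 as a sequence feature that is easy to compute. The sketch below shows one way such a feature could be derived from a candidate region. It is an illustration only, not the procedure of any cited method: the function names `kmer_counts` and `kmer_feature_vector`, the choice to skip windows containing non-ACGT characters, and the normalization to frequencies are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=5):
    """Count overlapping k-mers in a DNA sequence.

    Assumption for this sketch: windows containing any character
    other than A, C, G or T (e.g. N) are skipped.
    """
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def kmer_feature_vector(sequence, k=5):
    """Return k-mer frequencies over all 4**k k-mers in lexicographic order."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    vocabulary = ("".join(p) for p in product("ACGT", repeat=k))
    # Frequencies sum to 1 when at least one valid window exists.
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]
```

For k = 5 the vector has 4^5 = 1024 entries per region. As the table notes, features of this kind are computed from sequence alone and so carry no information about activity in a particular tissue.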