ABSTRACT
Super-enhancers and stretch enhancers represent classes of transcriptional enhancers that have been shown to control the expression of cell identity genes and carry disease- and trait-associated variants. Specifically, super-enhancers are clusters of enhancers defined based on the binding occupancy of master transcription factors, chromatin regulators, or chromatin marks, while stretch enhancers are large chromatin-defined regulatory regions of at least 3,000 base pairs. Several studies have characterized these regulatory regions in numerous cell types and tissues to decipher their functional importance. However, the differences and similarities between these regulatory regions have not been fully assessed. We integrated genomic, epigenomic, and transcriptomic data from ten human cell types to perform a comparative analysis of super and stretch enhancers with respect to their chromatin profiles, cell type-specificity, and ability to control gene expression. We found that stretch enhancers are more abundant, more distal to transcription start sites, cover twice as much the genome, and are significantly less conserved than super-enhancers. In contrast, super-enhancers are significantly more enriched for active chromatin marks and cohesin complex, and more transcriptionally active than stretch enhancers. Importantly, a vast majority of super-enhancers (85%) overlap with only a small subset of stretch enhancers (13%), which are enriched for cell type-specific biological functions, and control cell identity genes. These results suggest that super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers, and importantly, most of the stretch enhancers that are distinct from super-enhancers do not show an association with cell identity genes, are less active, and more likely to be poised enhancers.
KEYWORDS: Gene regulation, genomics, epigenomics, super-enhancers, stretch enhancers
Background
The human body contains several hundred distinct cell types and most of the regulatory code that drives cell type-specific gene expression resides in cis-regulatory elements termed enhancers [1]. Enhancers are noncoding regulatory regions distal to the genes they regulate where transcription factors (TFs) and the transcriptional apparatus bind and orchestrate the gene regulation [1,2]. While estimations predicted approximately one million potential enhancers in the human genome, only a very small fraction of these enhancers are active in a given cell [3,4]. These active enhancers are primarily found in regions of accessible chromatin [5], marked by monomethylation of histone H3 at lysine 4 (H3K4me1) and acetylation of histone H3 at lysine 27 (H3K27ac) [5–8], produce enhancer RNAs (eRNAs) [9], and are significantly loaded with the coactivator protein p300 [10] and RNA polymerase II (RNA Pol II) [8]. Additionally, considerable levels of H3K4me3 are also observed at active enhancers bound by RNA Pol II [11] and some studies have linked H3K4me3 broad peaks with transcriptional elongation, enhancer activity, and cellular identity [12–14]. Further, poised enhancers are enriched for trimethylation of histone H3 on lysine 27 (H3K27me3) and have depleted H3K27ac and RNA Pol II signal [6,15].
Despite critical advances in terms of technology and methodology to identify enhancers genome-wide and to understand their molecular mechanisms and function, a clear understanding is still lacking. Defining cell type-specific enhancers and accurately assigning them to the gene(s) they regulate is of great interest but very challenging due to lack of known cell type-specific signatures. In 2013, Whyte et al. showed that clusters of enhancers, termed super-enhancers, control cell identity. These super-enhancers were defined based on their enrichment for binding of key master regulator TFs, Mediator, and chromatin regulators [16]. These cluster of enhancers are cell type-specific, control the expression of cell identity genes, are sensitive to perturbation, associated with disease, distinctly methylated, and boost the processing of primary microRNA into precursors of microRNAs [16–20]. Concomitantly to the discovery of super-enhancers, Parker et al. showed that large genomic regions with enhancer characteristic termed stretch enhancers, and defined based on their size (>3 kb), control cell identity [21]. These stretch enhancers are known to be cell type-specific and are enriched for disease-associated variants, for instance, carrying small nuclear polymorphisms SNPs associated with type 2 diabetes [21,22]. Further, it has been shown that super-enhancers and stretch enhancers overlap with the known locus control regions (LCR) [17,21,23]. Significant attention has been given to these cell type-specific regulatory regions to understand their molecular mechanisms and functional importance. Since the parallel publications of these two concepts, there has been confusion among some in the research community to differentiate these two classes of regulatory regions. The concepts of super-enhancers and stretch enhancers try to capture the same underlying biological phenomenon, but it is unclear whether these regions, defined using different approaches, have different or equivalent regulatory potential. Some recent studies even referred to these regions collectively as SEs (Super/Stretch Enhancers), assuming them to be equivalent [24–26]. While some studies showed that super-enhancers and stretch enhancers overlap [27,28], a detailed comparative analysis to understand their differences and similarities is lacking.
We performed a comprehensive analysis of super-enhancers and stretch enhancers in 10 human cell lines by integrating ChIP-seq data for histone modifications (H3K27ac, H3K4me1, H3K4me3, and H3K27me3), RNA Pol II, P300, cohesin components (RAD21 and SMC3) and CTCF, chromatin accessibility data (DNase-seq), and transcriptomics data (RNA-seq, GRO-seq, and GRO-cap). Our analyzes revealed significant differences between super and stretch enhancers. Stretch enhancers are more abundant, distal to transcription start sites (TSS) and less conserved than super-enhancers. Comparatively, super-enhancers are significantly more enriched for active chromatin marks, RNA Pol II, and cohesin components, are transcriptionally more active, and more transcribed than stretch enhancers. Finally, only a small fraction of stretch enhancers overlap with super-enhancers, which we named super-stretch enhancers. These super-stretch enhancers are highly transcribed, associated with cell identity genes, and enriched for cell type-specific biological functions.
Results
Genomic distribution and conservation of super-enhancers and stretch enhancers from ten human cell types
To systematically compare super-enhancers and stretch enhancers, we obtained super-enhancers from dbSUPER [29] and stretch enhancers from ten human cell types [21]. These cell types included B-lymphoblastoid cells (GM128278), embryonic stem cells (H1-ES), erythrocytic leukemia cells (K562), hepatocellular carcinoma cells (HepG2), human umbilical vein endothelial cells (HUVEC), human mammary epithelial cells (HMEC), human smooth muscle myoblasts (HSMM), normal human epidermal keratinocytes (NHEK), normal human lung fibroblasts (NHLF), and pancreatic islets (Islets). We looked at the genomic distribution of these regulatory regions and found that stretch enhancers are an order of magnitude more numerous and cover twice as much the human genome than super-enhancers (Figure 1(a,b)). Specifically, we found an average of 745 super-enhancers with mean size 22,812 bp and 11,160 stretch enhancers with mean size 5,060 bp. Further, a majority of super-enhancers (69%) was located close to TSSs (<2 kb), while a majority of stretch enhancers (70%) was located very distal from TSSs (>10 kb) (Figure 1(c), Supplementary Figure S1). This difference in distribution of distances from TSS to super and stretch enhancers is statistically significant (P value <2.2e-16, Wilcoxon rank sum test). Next, we investigated the evolutionary conservation of super- enhancers and stretch enhancers by using phastCons scores for 99 vertebrate genomes aligned to the human genome (hgdownload.cse.ucsc.edu/goldenpath/hg19/phastCons100way/) [30]. Super-enhancers were significantly more conserved than stretch enhancers (P value <2.2e-16, Wilcoxon rank sum test) (Figure 1(d), S2). Taken together, these results indicate that stretch enhancers are more distal to TSSs, more abundant in number, and cover twice as much of the genome as super-enhancers, which are significantly more conserved.
Super-enhancers are enriched for active chromatin marks
We next sought to highlight the potentially distinct chromatin marks found at super-enhancers and stretch enhancers by using ChIP-seq data from the ENCODE project [31] for H3K27ac, H3K4me1, H3K4me3, and H3K27me3. Looking at average ChIP-seq signals, we observed that super-enhancers are highly enriched for active chromatin marks, such as H3K27ac and H3K4me3, while depleted for poised marks such as H3K27me3 (Figure 2(a,b); Supplementary Figure S3, S4). In contrast, stretch enhancers are highly enriched for poised chromatin mark H3K27me3 and depleted for active chromatin marks H3K27ac and H3K4me3 (Figure 2(a,b)). Additionally, super-enhancers and stretch enhancers are similarly marked by H3K4me1 (Figure 2(a,b)).
Active enhancers are primarily found in regions of accessible chromatin [5]. Here, for most of the cell types, we observed significantly higher levels of DNase I hypersensitive sites (DHSs) for super-enhancers than for stretch enhancers (Supplementary Figure S5). Furthermore, we found that stretch enhancers overlap with super-enhancers at key cell identity genes, such as SOX2 (Figure 2(b)), POU5F/OCT4 (Supplementary Figure S6(a)), and NANOG (Supplementary Figure S6(b)) in embryonic stem cells, and are highly enriched for active chromatin marks. Additionally, in Pancreatic islet, most of the binding sites of islet-specific transcription factors including PDX1, NKX2-2, FOXA2, and NKX6-1, are mapped to super-enhancers, while some are mapped to stretch enhancers [32] (Supplementary Figure S7).
Many type-2 diabetes SNPs from the Diabetes genetics replication and meta-analysis (DIAGRAM; red color) [33], as well as fasting glycemia SNPs from the Meta-analyzes of glucose and insulin-related traits consortium (MAGIC; blue color) [34] were observed on and around these enhancer regions. Interestingly, we observed higher ChIP-seq binding signal at stretch enhancers that overlap with super-enhancers. Taken together, these results highlight that super-enhancers are significantly more active and located in open regions than stretch enhancers, which are more likely to be poised.
Super-enhancers are enriched with cohesin and CTCF binding
It is known that enhancers are brought close to their target genes through chromatin looping mechanisms. These long-range enhancer–promoter interactions and DNA looping are mediated by the cohesin complex and CTCF [35,36]. We compared the occupancy of two cohesin components (SMC3 and RAD21) and CTCF at super-enhancers and stretch enhancers in GM12878 and K562 cells. We observed significantly higher SMC3, RAD21, and CTCF ChIP-seq binding signal at super-enhancers than at stretch enhancers (Figure 3(a,b)). This suggests that super-enhancers are regions with frequent interactions mediated by the cohesin complex and CTCF when compared to stretch enhancers.
Super-enhancers are transcriptionally more active than stretch enhancers
RNA Pol II plays a critical role in transcription and a majority of active enhancers recruit RNA Pol II [9]. We observed significantly higher RNA Pol II binding at super-enhancers than at stretch enhancers (Figure 4(a); Supplementary Figure S4). This was expected, since we previously highlighted a significantly higher occupancy of active chromatin marks H3K27ac and H3K4me3 at super-enhancers. To further assess the effect of enhancer activity on gene expression regulation, we associated genes with super-enhancers and stretch enhancers based on proximity and calculated their transcriptional abundance. For all the 10 cell types, we found that the genes near super-enhancers were significantly more expressed than genes near stretch-enhancers (P value <2.2e-16, Wilcoxon rank sum test) (Figure 4(b)).
Recent studies have shown that most of the active enhancers are bi-directionally transcribed to produce RNA transcripts, referred to as eRNAs [9]. These transcribed enhancers exhibit higher in vitro activity, suggesting that production of eRNA is linked to functional activity [37]. Recent techniques based on global run-on sequencing (GRO-seq and GRO-cap) have been developed for the detection of these unstable RNAs generated from enhancer elements [38,39]. We used publicly available data from GRO-seq and GRO-cap assays in K562 and GM12878 to investigate the levels of eRNAs at super- enhancers and stretch enhancers. Super-enhancers and their constituents (i.e., individual enhancers composing the clusters) harbored a significantly higher signal for both GRO-seq and GRO-cap (Figure 4(c,d)) Supplementary Figure S8) than stretch enhancers. Taken together, these results confirm that super-enhancers are more active and transcribed and can greatly enhance the transcription of their nearby genes when compared with stretch enhancers.
A small subset of stretch enhancers overlaps with super-enhancers
Next, we investigated to what extent super-enhancers and stretch enhancers are conserved and overlapped together among cell types. We computed pairwise Jaccard statistics among cell types for super-enhancer and stretch enhancer regions, independently. We observed higher Jaccard statistics for stretch enhancers than for super-enhancers, meaning that stretch enhancers significantly shared across cell types than super-enhancers (P value = 4.311e-10, Wilcoxon signed rank test) (Supplementary Figure S9). Next, we computed the fraction of overlap between super-enhancers and stretch enhancers in each cell type (Figure 5(a)). We observed that most of the super-enhancers overlap with a small fraction of the stretch enhancers in all cell types. On average, a vast majority of super-enhancers (85%) overlap with only a small number of stretch enhancers (13%) (Figure 5(b)). These stretch enhancers that overlap with super-enhancer regions were termed super-stretch enhancers. We then compared these super-stretch enhancers with super-enhancers and stretch-only enhancers, which do not overlap super-enhancers (referred to as stretch from now on in this work) (Supplementary Figure S10).
These super-stretch enhancers are significantly smaller in size than the remaining stretch-only enhancers P value <2.2e-16, Wilcoxon rank sum test) (Figure 5(c)). In line with the fact that the vast majority of super-enhancers are super-stretch enhancers, we recapitulate that these regions were (i) enriched for active chromatin marks, such as H3K27ac and H3K4me3 (Supplementary Figure S11); (ii) enriched for cohesin and CTCF binding (Supplementary Figures S12 and S13(a)); (iii) near highly expressed genes (P value < 2.2e-16, Wilcoxon rank sum test, Supplementary Figure S13(a)); and (iv) significantly transcribed when compared to stretch enhancers (Supplementary Figure S13(b)).
To further test the ability of those super-enhancer that do not overlap with stretch, we grouped the regions into super-only, super-stretch, and stretch-only regions based on the overlap analysis (Supplementary Figure S14(a)). We investigated the average ChIP-seq profile of H3K27ac, H3K4me1, H3K4me3, and RNA Pol II at these three categories in three cell lines (H1, GM12878, and 562) from ENCODE tier-1 (Supplementary Figure S14(b)). We observed that the super-only group has significantly higher signal for active chromatin marks, such as H3K27ac, H3K4me3, and RNA Pol II, and lower signal for H3K4me1, compared to stretch-only and super-stretch groups.
Taken together, these results show that super-enhancers are more active in general and a vast majority of super-enhancers also contain a small fraction of stretch enhancers. Further, the small fraction of stretch enhancers that overlap with super-enhancers (called super-stretch enhancers) are highly enriched for active chromatin marks, highly transcribed, and can greatly enhance the transcription of their associated genes.
Super-stretch enhancers are cell type-specific and control key cell identity genes
We sought to analyze how super-stretch enhancers were associated with cell type-specific genes. We performed k-means clustering based on the H3K27ac histone modification at super, super-stretch, and stretch enhancers in five cell types (GM12878, K562, H1-ES, HepG2, and HUVEC). We observed cell type-specific clusters for all the three groups, but significantly stronger cell type-specific signal at super and super-stretch enhancers than at stretch enhancers (Figure 6(a)). It was expected to observe higher level of H3K27ac signal at super-enhancers, as these were defined solely based on H3K27ac signal while stretch enhancers were defined using several other chromatin marks, including H3K27ac. To test if the signal was coming from the way super-enhancers were identified, we redefined stretch enhancers using same candidate enhancers defined using H3K27ac peaks in H1-ES, GM12878, and K562 cell types. We then divided these stretch enhancers into super-stretch and stretch enhancers, as described above. Interestingly, we found similar patterns for H3K27ac (Supplementary Figure S15(a)) and GRO-seq signal (Supplementary Figure S15(b)). In addition, we found cell type-specific GO terms for super and super-stretch enhancers but not stretch enhancers (Supplementary Figure S16). This reinforces our observations that super-enhancers and stretch enhancers are intrinsically different.
When considering the closest genes to super-enhancers or super-stretch enhancers defined in H1-ES cells, we found that they were highly expressed in H1-ES cells but not in other cell types (Figure 6(b)). This observation holds true for the 10 considered cell types (Supplementary Figure S17). On the contrary, we did not observe such cell type-specific behavior for genes close to stretch enhancers (Supplementary Figure S17). To confirm this cell type-specific behavior associated to super-enhancers and stretch enhancers, we further assessed the biological function of these regulatory regions using the tool GREAT to perform gene ontology enrichment analysis from the genes close to super, super-stretch, and stretch enhancers. In all cell types analyzed, super-enhancers and super-stretch enhancers were found close to genes enriched for corresponding cell type-specific functions. For example, in ES cells, terms like stem cell development, stem cell differentiation, and stem cell activation were enriched for genes associated with super and super-stretch enhancer (Figure 6(c); Supplementary Figure S18).
Next, we performed the overlap of genes associated with super-, super-stretch and stretch enhancers and checked for the known key cell identity genes. We found a majority of key cell identity genes associated with either super-enhancers or super-stretch enhancers. For example, the ESC pluripotency genes SOX2, OCT4 (POU5F1), and NANOG binding were found in super-enhancers and super-stretch enhancers but not stretch enhancers (Figure 6(d)). In K562 cells, proteins such as GATA1, TAL1, and JUN bound at super-enhancers and super-stretch enhancers, while GATA2 and HBG1 bound at super-enhancers. Similarly, in Islets cells, PDX1 and NKX2-2 bound at super-enhancers and super-stretch enhancers and FOXA2 bound at super-stretch enhancers. Taken together, these results suggest that a small subset of stretch enhancers, the one overlapping with super-enhancers (super-stretch), are preferentially associated with genes that have a key role in the cell type-specific biology.
Discussion
In this study, we have performed a comprehensive analysis of super-enhancers and stretch enhancers by comparing their histone modification profiles, chromatin accessibilities, and abilities to regulate cell type-specific gene expression. At the genome scale, stretch enhancers are more abundant, cover twice the genome, and are further away from TSS than super-enhancers; super-enhancers are evolutionary more conserved. Moreover, super-enhancers are found to overlap active chromatin marks, such as H3K27ac and H3K4me3, while stretch enhances are enriched for the poised mark H3K27me3. Super-enhancers are found to be significantly more occupied by the cohesin complex, CTCF, and RNA Pol II and produce eRNAs. Further, a majority of the super-enhancers overlaps with a small fraction of stretch enhancers; the overlapping regions are more cell type-specific and found near genes involved in cell maintenance, differentiation, and development.
We observed that super-enhancers are loaded with active chromatin marks, such as H3K27ac and RNA Pol II and produce eRNAs, while stretch enhances are enriched for H3K27me3, depleted for H3K27ac, and do not produce eRNA. These properties show that super-enhancers are transcriptionally active, while stretch enhancers are poised.
As super-enhancers were originally defined using H3K27ac signal, it was expected that they were containing higher level of H3K27ac signal when compared to stretch enhancers, which were defined based on multiple chromatin marks. Redefining stretch enhancers using solely the H3K27ac mark did not change our observations of the difference between super-enhancer and stretch enhancer. This reinforces the fact that super-enhancers are overall more active than stretch enhancers, which are likely poised enhancers. Further studies will be required to find the specific function of the genomic regions defined as stretch enhancers and that do not overlap super-enhancers.
Conventionally, enhancers and promoters are regarded as distinct cis-regulatory elements and distinguished by enrichment for histone modifications (H3K4me1/2 and H3K27ac enriched in enhancers; H3K4me3 enriched in promoters); however, recent studies have shown that some of the highly transcribed enhancers can function as promoters in vivo [40–42]. The exceptionally high enrichment for H3K27ac and RNA Pol II combined with H3K4me3 (a mark usually associated with promoters) at super-enhancer regions suggests that super-enhancers may also work as promoters to regulate their target gene(s) or these clusters may encompass promoter sequences. This is in line with the enrichment of GRO-seq and GRO-cap signal at these super-enhancer regions, which confirms that these regions initiate transcription.
Stretch enhancers were defined as large (>3 Kb) linear genomic regions with specific chromatin marks [21] while super-enhancers were defined as clusters of enhancers enriched for active chromatin marks. We observed that a small fraction of stretch enhancers that overlap super-enhancers was more active, cell type-specific, and significantly smaller than the rest of these regions. It clearly highlights that length is not an optimal feature to determine cell type-specific regulatory regions. Our analyzes and several other lines of evidence suggest that enhancers are more likely to be cell type-specific, transcriptionally active, and frequently interacting when found in clusters at the genomic scale, regardless of their sizes [32,43]. Whether the individual enhancers within a cluster (super-enhancer) work synergistically or additively is still in active debate [44]. Some studies have shown synergy and hierarchy between the individual enhancers within the clusters [25,45,46], but this synergy is less obvious for some developmentally regulated super-enhancers [47]. Further, locus control regions [23] have been known for decades and whether the newly introduced super- and stretch enhancers represent a new paradigm in transcriptional regulation is still debated [28]. It has been suggested that genes that are regulated by multiple cis-regulatory elements may achieve higher levels of RNA Pol II recruitment due to higher local concentration of required factors than genes regulated by fewer regulatory elements [48,49]. A very recent simulation study, based on the properties of super-enhancers, proposed a conceptual phase separation model for transcriptional gene regulation [50]. If that is the case, then the current approaches to identify clusters of enhancers are not optimal. The concepts of super-enhancers and stretch enhancers try to capture the same underlying biological phenomenon; however, the methods used to identify regions of the genome of particular importance for cell identity are different and imperfect. Further, the current methods do not integrate chromatin interaction data to restrict these clusters to be within the so-called topologically associated domains, where most of the enhancer-promoter interactions and regulation takes place. Hence, we still need to develop new methods to capture cell type-specific enhancers that might not be derived from clusters of enhancers at the genomic scale but rather in the 3D space of the nucleus.
Material and methods
Data description
We used the processed ChIP-seq data for histone modifications and DNase-seq data generated by the Encyclopedia of regulatory DNA elements (ENCODE) to perform all the analysis in this report (hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/) (Table S1). We obtained the processed RNA-seq based gene expression data as RPKM for the ten cell types from [21]. We also used GRO-seq and GRO-cap data in GM12878 and K562, which is listed in supplementary Table S2.
GRO-seq and GRO-cap data analysis
The GRO-seq, and GRO-cap reads were aligned to human genome-build hg19 using bowtie2 [51] with default parameters. The aligned and sorted bam files were used to compute the average signal profile at super-enhancer and stretch enhancers.
Super-enhancers
Super-enhancers were downloaded as BED files for the ten human cell types GM12878, H1-ES, K562, HepG2, HUVEC, HMEC, HSMM, NHEK, NHLF, and Islets from dbSUPER (http://asntech.org/dbsuper/) [29].
Stretch enhancers
We obtained the stretch enhancer annotations for the 10 human cell types (including GM12878, H1-ES, K562, HepG2, HUVEC, HMEC, HSMM, NHEK, NHLF and Islets) from [21]. We also redefined stretch enhancers using only H3K27ac ChIP-seq data in cell lines H1-ES, GM12878, and K562. We used the stitched enhancer regions (typical enhancers and super-enhancers) and ranked them based on length, separating the ones that are larger than 3 kb.
Assigning genes to super-enhancers and stretch enhancers
Genes were assigned to super-enhancers and stretch enhancers based on proximity as descried in [16,17]. It is known that enhancers tend to loop and communicate with target genes [52], and most of these enhancer-promoter interactions occur within a distance of ~50 kb [53]. This approach identified a large proportion of true enhancer/promoter interactions in ESC [54]. Hence, all transcriptionally active genes were assigned to super-enhancers and stretch enhancers within a 50 kb window.
Gene ontology analysis
To perform Gene Ontology analysis, we used the Genomic regions enrichment of annotations tool (GREAT) (http://bejerano.stanford.edu/great/, version 3.0.0) [55] with default parameters. The top 10 GO terms with the lowest P value were reported.
Overlap analysis
We used BEDTools [56] to perform intersection of genomic regions. We considered two regions to overlap if they shared at least 1 bp. To perform pairwise overlap analysis we used the Intervene tool [57].
Visualization and statistical analysis
We generated box plots using the R programming language by extending the whiskers to 1.5x the interquartile range. The P values for box-plots were calculated using Wilcoxon signed-rank tests with the wilcox.test function in R. We used ngs.plot [58] to generate heat maps and normalized binding profiles at the constituents of super-enhancers and stretch enhancers along with their flanking 3 kb regions. We used Intervene (https://asntech.shinyapps.io/intervene/) [57] to generate pairwise heatmaps. For genome-browser screenshots, we used the Biodalliance genome browser with ENCODE data [59].
Funding Statement
AK and AM have been supported by the Norwegian Research Council (project number 187615), Helse Sør-Øst, and the University of Oslo through the Centre for Molecular Medicine Norway (NCMM). XZ is supported by the NSFC grants 61721003 and 61673231. Norwegian Research Council; NSFC [61721003 and 61673231]; Helse Sør-Øst
Acknowledgments
We thank the ENCODE consortium and all the authors of the original studies used in this work for making their data freely available.
Disclosure statement
No potential conflict of interest was reported by the authors.
Authors’ contributions
AK conceived the project and designed the experiments. XZ reviewed and approved experiment design and supervised the project. AM provided suggestions, supported to redefine stretch enhancers, and overviewed the finalization of the project. AK performed the experiments and analyzed the data. AK wrote the manuscript and XZ and AM reviewed it. All authors read and approved the final manuscript.
Supplementary material
Supplemental data for this article can be accessed here.
References
- 1.Heinz S, Romanoski CE, Benner C, et al. The selection and function of cell type-specific enhancers. Nat Rev Mol Cell Biol. 2015;16:144–154. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Barolo S. Shadow enhancers: frequently asked questions about distributed cis-regulatory information and enhancer redundancy. BioEssays. 2012;34:135–141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Thurman RE, Rynes E, Humbert R, et al. The accessible chromatin landscape of the human genome. Nature. 2012;489:75–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Heintzman ND, Hon GC, Hawkins RD, et al. Histone modifications at human enhancers reflect global cell type-specific gene expression. Nature. 2009;459:108–112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Shlyueva D, Stampfel G, Stark A. Transcriptional enhancers: from properties to genome-wide predictions. Nat Rev Genet. 2014;15:272–286. [DOI] [PubMed] [Google Scholar]
- 6.Rada-Iglesias A, Bajpai R, Swigut T, et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature. 2011;470:279–283. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Arnold CD, Gerlach D, Stelzer C, et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science. 2013;339:1074–1077. [DOI] [PubMed] [Google Scholar]
- 8.Bonn S, Zinzen RP, Girardot C, et al. Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development. Nat Genet. 2012;44:148–156. [DOI] [PubMed] [Google Scholar]
- 9.Kim T-K, Hemberg M, Gray JM, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010;465:182–187. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Visel A, Mj B, Li Z, et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457:854–858. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Calo E, Wysocka J. Modification of enhancer chromatin: what, how, and why? Mol Cell. 2013;49:825–837. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Chen K, Chen Z, Wu D, et al. Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes. Nat Genet. 2015;47:1149–1157. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Benayoun BA, Pollina EA, Ucar D, et al. H3K4me3 breadth is linked to cell identity and transcriptional consistency. Cell. 2014;158:673–688. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Dincer A, Gavin DP, Xu K, et al. Deciphering H3K4me3 broad domains associated with gene-regulatory networks and conserved epigenomic landscapes in the human brain. Transl Psychiatry. 2015;5:e679. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Creyghton MP, Cheng AW, Welstead GG, et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A. 2010;107:21931–21936. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Whyte WA, Orlando DA, Hnisz D, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Hnisz D, Abraham BJ, Lee TI, et al. Super-enhancers in the control of cell identity and disease. Cell. 2013;155:934–947. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Lovén J, Hoke HA, Cy L, et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153:320–334. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Suzuki HI, Young RA, Sharp PA. Super-enhancer-mediated RNA processing revealed by integrative microRNA network analysis. Cell. 2017;168:1000–1014.e15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Heyn H, Vidal E, Ferreira HJ, et al. Epigenomic analysis detects aberrant super-enhancer DNA methylation in human cancer. Genome Biol. 2016;17:11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Parker SCJ, Stitzel ML, Taylor DL, et al. Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc Natl Acad Sci U S A. 2013;110:17921–17926. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Quang DX, Erdos MR, Parker SCJ, et al. Motif signatures in stretch enhancers are enriched for disease-associated genetic variants. Epigenetics Chromatin. 2015;8:23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Li Q, Peterson KR, Fang X, et al. Locus control regions. Blood. 2002;100:3077–3086. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Vahedi G, Kanno Y, Furumoto Y, et al. Super-enhancers delineate disease-associated regulatory nodes in T cells. Nature. 2015;520:558–562. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Hnisz D, Schuijers J, Lin CY, et al. Convergence of developmental and oncogenic signaling pathways at transcriptional super-enhancers. Mol Cell. 2015;58:362–370. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Flynn RA, Do BT, Rubin AJ, et al. 7SK-BAF axis controls pervasive transcription at enhancers. Nat Struct Mol Biol. 2016;23:1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Niederriter A, Varshney A, Parker S, et al. Super enhancers in cancers, complex disease, and developmental disorders. Genes. 2015;6:1183–1200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Pott S, Lieb JD. What are super-enhancers? Nat Genet. 2014;47:8–12. [DOI] [PubMed] [Google Scholar]
- 29.Khan A, Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 2016;44:D164–D171. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Rosenbloom KR, Armstrong J, Barber GP, et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res. 2014;43:D670–D681. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Pasquali L, Gaulton KJ, Rodríguez-Seguí SA, et al. Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants. Nat Genet. 2014;46:136–143. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Morris AP, Voight BF, Teslovich TM, et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet. 2012;44:981–990. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Scott RA, Lagou V, Welch RP, et al. Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nat Genet. 2012;44:991–1005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Rao SSP, Huang S-C, Hilaire BGS, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171:305–320.e24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Zuin J, Dixon JR, Mija VDR, et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc Natl Acad Sci USA. 2014;111:996–1001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Andersson R, Gebhard C, Miguel-Escalada I, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–461. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Core LJ, Waterfall JJ, Lis JT. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science. 2008;322:1845–1848. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Kruesi WS, Core LJ, Waters CT, et al. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife. 2013;2:e00808. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Mikhaylichenko O, Bondarenko V, Harnett D, et al. The degree of enhancer or promoter activity is reflected by the levels and directionality of eRNA transcription. Genes Dev. 2018;32:42–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Henriques T, Scruggs BS, Inouye MO, et al. Widespread transcriptional pausing and elongation control at enhancers. Genes Dev. 2018;32:26–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Rennie S, Dalby M, Lloret-Llinares M, et al. Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter-encoded enhancer activities. Nucleic Acids Res. 2018;46:5455–5469. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Ernst J, Kheradpour P, Mikkelsen TS, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473:43–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Dukler N, Gulko B, Huang Y, et al. Is a super-enhancer greater than the sum of its parts? Nat Genet. 2017;49:2–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Proudhon C, Snetkova V, Raviram R, et al. Active and inactive enhancers cooperate to exert localized and long-range control of gene regulation. Cell Rep. 2016;15:2159–2169. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Shin HY, Willi M, Yoo KH, et al. Hierarchy within the mammary STAT5-driven wap super-enhancer. Nat Genet. 2016;48:904–911. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Hay D, Hughes JR, Babbs C, et al. Genetic dissection of the α-globin super-enhancer in vivo. Nat Genet. 2016;48:1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Andersson R. Promoter or enhancer, what’s the difference? Deconstruction of established distinctions and presentation of a unifying model. BioEssays. 2015;37:314–323. [DOI] [PubMed] [Google Scholar]
- 49.Andersson R, Sandelin A, Danko CG. A unified architecture of transcriptional regulatory elements. Trends Genet. 2015;31:426–433. [DOI] [PubMed] [Google Scholar]
- 50.Hnisz D, Shrinivas K, Ra Y, et al. A phase separation model for transcriptional control. Cell. 2017;169:13–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Langmead B, Trapnell C, Pop M, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Ong C-T, Corces VG. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nat Rev Genet. 2011;12:283–293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Chepelev I, Wei G, Wangsa D, et al. Characterization of genome-wide enhancer-promoter interactions reveals co-expression of interacting genes and modes of higher order chromatin organization. Cell Res. 2012;22:490–503. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Dixon JR, Selvaraj S, Yue F, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.McLean CY, Bristor D, Hiller M, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28:495–501. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Shen L, Shao N, Liu X, et al. ngs.plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC genomics. BMC Genomics. 2014;15:284. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Down TA, Piipari M, Hubbard TJP. Dalliance: interactive genome viewing on the web. Bioinformatics. 2011;27:889–890. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.