Significance
The decision-making process of cellular phenotype specification is controlled by the interplay between genetic and epigenetic elements. Intragenic CGIs (iCGIs) associated with developmental regulators have sequence features that favor DNA methylation and bivalent histone modification, i.e., both activating histone H3 lysine 4 trimethylation and repressing H3K27me3 marks. The epigenetic transition from bivalent modification to DNA methylation on iCGIs during differentiation results in cell type-specific activation of their associated genes. This process is accompanied by loss of physical interactions with promoter regions, and the motifs of developmental regulators are enriched at iCGIs, indicating involvement of these regulators in the epigenetic transition. Our work uncovers the role of iCGIs in cell type-specific differentiation.
Keywords: intragenic CpG islands, DNA methylation, bivalent chromatin, embryonic stem cell, differentiation
Abstract
CpG (5′-C-phosphate-G-3′) islands (CGIs) have long been known for their association with enhancers, silencers, and promoters, and for their epigenetic signatures. They are maintained in embryonic stem cells (ESCs) in a poised but inactive state via the formation of bivalent chromatin containing both active and repressive marks. CGIs also occur within coding sequences, where their functional role has remained obscure. Intragenic CGIs (iCGIs) are largely absent from housekeeping genes, but they are found in all genes associated with organ development and cell lineage control. In this paper, we investigated the epigenetic status of iCGIs and found that they too reside in bivalent chromatin in ESCs. Cell type-specific DNA methylation of iCGIs in differentiated cells was linked to the loss of both the H3K4me3 and H3K27me3 marks, and disruption of physical interaction with promoter regions, resulting in transcriptional activation of key regulators of differentiation such as PAXs, HOXs, and WNTs. The differential epigenetic modification of iCGIs appears to be mediated by cell type-specific transcription factors distinct from those bound at promoters, and these transcription factors may be involved in the hypermethylation of iCGIs upon cell differentiation. iCGIs thus play a key role in the cell type-specific regulation of transcription.
To provide a common starting ground for the diverse epigenetic landscapes of differentiated cells, embryonic stem cells (ESCs) have a relatively open chromatin structure (1). At the same time, many promoters of developmentally regulated genes are in an intermediate condition, poised for use in specific cell lineages or for inactivation. These promoters are often marked both with activating [trimethylation of histone H3 lysine 4 (H3K4me3)] and repressing (H3K27me3) epigenetic modifications (2). Such bivalent histone modifications facilitate conversion to active or repressive states by further modification upon differentiation (2). However, the key drivers of lineage-specific histone modifications, in particular those of stem cells, are poorly understood.
Along with histone modifications, DNA methylation is a key regulator of gene transcription in the mammalian life cycle (3). Approximately 70% of the total CpG, 5′-C-phosphate-G-3′, sites in the mammalian genome are methylated, but CpG sites in crowded contexts, such as CpG islands (CGIs), are frequently unmethylated (4). These unmethylated CGIs are highly enriched at promoters, in particular at the promoters of housekeeping genes (4, 5), and serve as major binding sites for activating histone modifiers. SETD1 (SET domain-containing 1) and MLL/KMT2A [mixed-lineage leukemia or lysine (K)-specific methyltransferase 2A] proteins create active H3K4me3 at CGIs, whereas KDM2A [lysine (K)-specific demethylase 2A] and TET1 (tet methylcytosine dioxygenase 1) proteins protect promoters from the spread of gene body-associated epigenetic patterns by removing H3K36me1/2 and DNA methylation, respectively (6). A number of unmethylated CGIs are also highly enriched for the repressive Polycomb-repressive complex 2 (PRC2) (7) and are prone to bivalent modification by the coaccumulation of MLL and PRC2 complexes (8). CGIs thus provide integration sites for diverse regulatory signals mediated by DNA methylation and histone modifications.
CGIs show distinct epigenetic modification patterns depending on their locations, whether promoter-associated, intragenic, or intergenic. Promoter-associated CGIs (pCGIs) are usually marked with H3K4me3, whereas nonpromoter-associated CGIs are frequently methylated (9). The modifications often correlate with expression levels of the associated genes (10). Methylation of pCGIs is regarded as a repressive signature because some methylated CGIs recruit the H3K9 methyltransferase to form repressive heterochromatin (11), but the intragenic CGIs (iCGIs) of actively expressed genes are highly DNA methylated, pointing to a positive role of methylation in this context (12).
Some iCGIs are known to act as orphan promoters, and cell type-specific DNA methylation on iCGIs can repress alternative transcription from iCGIs in a manner that inversely correlates with H3K4me3 levels (9). The effects of iCGI hypermethylation on expression of the associated genes are still unclear. Deaton et al. showed that the cell type-specific DNA methylation of iCGIs tends to correlate with the silencing of the associated genes in the immune system (13). On the other hand, Lee et al. have observed gene repression induced by iCGI hypomethylation in cancerous cells, indicating a positive correlation between methylation and gene expression (14). Therefore, neither the regulatory roles of iCGIs nor the cause of modification changes are currently well understood.
To elucidate the epigenetic signatures required for the maintenance of ESCs and their differentiation into specific cell types, we reanalyzed publicly available epigenetic modification data (the analyzed databases are listed in Dataset S1) and validated the results experimentally with special emphasis on the roles of iCGIs. We observe that iCGIs serve as key platforms for the production of bivalent chromatin at sites in important developmental genes. The hypermethylation of iCGIs linked to the simultaneous loss of both H3K4me3 and H3K27me3 triggers transcriptional activation of the associated gene by releasing repressive interactions with the promoter. We found that iCGIs are highly enriched for binding sites for development-specific transcription factors that enable cell type-specific epigenetic modification of iCGIs. We describe a previously unidentified regulatory mechanism controlling differentiation that is mediated by the epigenetic modification of iCGIs.
Results
iCGIs Show a Distinct Sequence Signature.
To understand the functional differences between CGIs within promoters (pCGIs) and within genes (iCGIs), we examined their epigenetic modification patterns using human ESC data (Gene Expression Omnibus: GSE16256). A 2D density plot of CGIs and their associated modifications, DNA methylation and H3K4me3, revealed that these modifications are mutually exclusive (Fig. 1A). We grouped all CGIs into two types according to whether they had high levels of H3K4me3 modification or DNA methylation, using k-means clustering (CGIm, methylated, or CGIum, unmethylated, H3K4me3-enriched) (SI Appendix, Fig. S1A). pCGIs were primarily H3K4me3-modified, whereas most iCGIs were primarily cytosine-methylated (Fig. 1A; SI Appendix, Fig. S1 B and C; and Dataset S2). Although this differential modification of CGIs is not well understood, H3K4 methyltransferase preferentially binds to 5′-CGG-3′ (hereafter CGG) (15), which indicates sequence-related CGI modifications. Examination of CGG/C and CGA/T frequencies in pCGIs and iCGIs revealed that most iCGIs had low CGG/C frequencies, whereas pCGIs had high CGG/C frequencies (Fig. 1B). When we examined the relationship between the epigenetic modification and the CGG/C ratio [the (CGG/C):(CGA/T) ratio] for each CGI, the H3K4me3 modification preferentially occurred in CGIs with high CGG/C ratios, whereas DNA methylation preferentially occurred in CGIs with low CGG/C ratios (<1.6) (Fig. 1C). By contrast, H3K27me3 modifications gradually increased as CGG/C ratios decreased, but then decreased as DNA methylation became more dominant. Therefore, the distinct modification patterns of CGIs appear to be a result of their sequences.
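The sequence measure used above can be illustrated with a short sketch. It assumes that "CGG/C" denotes a CpG followed by G or C (the trinucleotides CGG and CGC) and that "CGA/T" denotes a CpG followed by A or T (CGA and CGT); this reading, the function name `cgg_c_ratio`, and the choice to count only the given strand are illustrative assumptions, not the published procedure.

```python
from collections import Counter


def cgg_c_ratio(seq: str) -> float:
    """Return the (CGG/C):(CGA/T) ratio of one CGI sequence.

    Assumption: CGG/C = CGG + CGC and CGA/T = CGA + CGT, counted as
    overlapping trinucleotides on the given strand only.
    """
    seq = seq.upper()
    # Count every overlapping trinucleotide in the sequence.
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    strong = counts["CGG"] + counts["CGC"]
    weak = counts["CGA"] + counts["CGT"]
    if weak == 0:
        # Ratio is undefined without any CGA/CGT; signal that explicitly.
        return float("inf") if strong else float("nan")
    return strong / weak


if __name__ == "__main__":
    print(cgg_c_ratio("CGGCGGCGA"))
```

Under these assumptions, a CGI whose ratio falls below the 1.6 mark reported for human ESCs would be expected to lean toward DNA methylation rather than H3K4me3.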
Fig. 1.
Sequence features of iCGIs favor H3K27me3 and DNA methylation. (A) Density scatter plot showing the distribution patterns of H3K4me3 [x axis, log2(RPKM+1)] and DNA methylation levels (y axis, methylation rate from 0 to 1) at pCGIs and iCGIs in human H1 ESCs. Density is color-coded from low (blue) to high (red) as shown in the reference color triangle. (B) Density scatter plots showing the distributions of pCGIs and iCGIs based on their sequence features (x axis, percentile ranks of each CGI’s CGG/C percentage; y axis, CGT/A percentage). (C) Enrichments of H3K4me3 (red line, left y axis), H3K27me3 (blue line, right y axis), and DNA methylation (black line, far-right y axis) on CGIs in human ESCs. CGIs were arranged from left to right by CGG/C to CGT/A ratio. (D) Enrichments of H3K4me3 (red line, left y axis), H3K27me3 (blue line, right y axis), and DNA methylation (black line, far-right y axis) on CGIs in mESCs. CGIs were arranged from left to right by CGG/C to CGT/A ratio. (E) Enrichments of CXXC1 (pink line), MLL2 (green line), SUZ12 (blue line), DNMT3A (black line), and DNMT3B (gray line) on CGIs in mouse ESCs. CGIs were arranged from left to right by CGG/C to CGT/A ratio. (F) Scatter plots showing the distributions of length (x axis) and ratio of (CGG/C) to (CGT/A) (y axis) of pCGIs (Upper) and iCGIs (Lower). The color of each dot represents the extent of DNA methylation of the CGI in hESCs [blue (0), 0%; red (1), 100%] as shown in the color bars to the right of each panel.
Mouse ESCs (GSE30202) contain fewer CGIs (SI Appendix, Fig. S1D) and have a higher preference for the H3K4me3 modification than human ESCs (SI Appendix, Fig. S1 A and E and Dataset S2) (16). Nevertheless, the overall trend of sequence-related CGI methylation appears to be evolutionarily conserved (SI Appendix, Fig. S1F). CGI modification patterns remained relatively constant as CGG/C ratios decreased, but H3K4me3 levels dropped dramatically at CGG/C ratios below 1.1, whereas H3K27me3 and DNA methylation levels increased sharply (Fig. 1D).
In agreement with these modification patterns, binding of DNA methyltransferases (DNMTs) 3A and 3B to CGIs gradually increased as CGG/C ratios decreased, and binding of SUZ12 (a PRC2 component) dramatically increased. As DNA methylation levels increased, binding of CXXC1 (a nonmethylated CpG binding protein) to CGIs with low CGG/C ratios decreased substantially (Fig. 1E). As shown with H3K4me3, DNA methylation and the H3K27me3 modification did not occur at the same iCGIs (SI Appendix, Fig. S1G). Thus, iCGIs with low CGG/C ratios tend to be enriched for methylation. Nevertheless, a small portion of iCGIs appeared to be histone-modified (H3K4me3 and H3K27me3). Indeed, the H3K4me3 levels were substantially lower in the iCGIs of Mll2 mutant mouse ESCs than in those of the wild type (SI Appendix, Fig. S1H). Therefore, distinct binding preferences of epigenetic modifiers for iCGIs appear to influence the epigenetic modifications of the iCGIs and the expression of the associated genes. However, additional factors also appear to influence CGI modifications; iCGIs with sequence features similar to those of pCGIs were more biased toward DNA methylation than the pCGIs (Fig. 1F). Thus, the regulatory mechanism of iCGIs and the correlation between their DNA methylation levels and the expression levels of their associated genes remain puzzling.
Unmethylated iCGIs Are Connected with Bivalent Chromatin in ESCs and Are Associated with Key Developmental Regulators.
iCGIs were found in about 15% of genes (SI Appendix, Fig. S2A), and ontology analysis predicted that they probably have roles in the regulation of developmental genes (SI Appendix, Table S1). iCGIs were not found in most housekeeping genes; instead, they were highly enriched in genes involved in coordinating developmental processes, such as the Hedgehog-, Notch-, and Wnt-signaling pathways (SI Appendix, Fig. S2B). In particular, iCGIs were frequently found in genes encoding human transcription factors showing strong cell type-specific expression patterns (17) (SI Appendix, Fig. S2C). Indeed, all of the major transcription factors associated with specific organ development and lineage determination had iCGIs (SI Appendix, Table S2). Therefore, iCGIs may function as cis-acting elements controlling tissue-specific transcription.
To understand the functions of iCGIs, we examined the possibility that iCGIs correspond to some previously defined type of regulatory element, namely an enhancer, silencer, or alternative promoter. However, the high levels of H3K4me3 modification or DNA methylation patterns of either type of iCGI (iCGIm or iCGIum) were quite different from the well-known signatures of enhancers (SI Appendix, Fig. S2D) (18–20). In addition, there was no overlap between the iCGIs and previously defined enhancers and silencers (SI Appendix, Fig. S2E). iCGIsum were highly enriched for H3K4me3, which is associated with promoter function, and showed weak enrichment for the enhancer marker H3K4me1 (SI Appendix, Fig. S2D). We assessed the signal enrichment in cap analysis of gene expression (CAGE) experiments to test the role of iCGIsum as alternative promoters or active enhancers. CAGE-seq analysis of RNAs with or without a poly(A) tail from human ESCs identified only 227 iCGIsum (13%) as alternative transcription start sites (TSSs) with CAGE signals (SI Appendix, Fig. S2F). These 227 iCGIsum were marked with H3K4me3 but showed weak enrichment of H3K4me1, unlike enhancers, which are more enriched for H3K4me1 than for H3K4me3 (SI Appendix, Fig. S2F). We also performed a luciferase assay to test whether iCGIs can stimulate gene expression through enhancer activity. We fused the mouse iCGIs to luciferase with the upstream SV40 promoter and evaluated the mouse iCGI activities in transient transfection experiments in NIH/3T3 mouse fibroblast cells (SI Appendix, Fig. S2G). Neither iCGIsum nor iCGIsm showed stimulating activity compared with the control SV40 enhancer (SI Appendix, Fig. S2H). Therefore, iCGIs do not seem to act as enhancers, and a substantial number of iCGIsum do not appear to be alternative TSSs. The regulatory roles of the remaining iCGIsum enriched for H3K4me3 remain to be elucidated.
To examine a possible link between iCGIs and other regulatory elements of the genes, we scanned the effects of iCGI modifications on the epigenetic landscapes of the associated genes. To this end, we compared diverse epigenetic modification patterns along the genic structure among the genes with DNA-methylated iCGIs (iCGIsm: red in Fig. 2A; group 3 genes in SI Appendix, Fig. S2A), unmethylated iCGIs (iCGIsum: blue, group 2 genes), or without iCGIs (shaded gray, group 1 genes) (Fig. 2A). Promoters of genes lacking iCGIs were accessible and were associated with high levels of H3K4me3, RNA polymerase II (pol II), CXXC1, and MLL2. The presence of iCGIsum within gene sequences was associated with extra-accessible regions and high H3K4me3 levels along with the accumulation of pol II, CXXC1, and MLL2 centered on the iCGIum. Especially, it was associated with a dramatic increase in H3K27me3 over the entire genic area including the promoter, and this increase was accompanied by strong binding of polycomb complex components at both the promoter and iCGIum, creating a strongly repressed chromatin structure over almost the entire gene. Indeed, the histone modification patterns of pCGIs were highly correlated with those of the associated iCGIs (SI Appendix, Fig. S3A), but the presence of iCGIum appeared to induce even higher levels of H3K27me3 at the associated promoters (SI Appendix, Fig. S3B). Consequently, bivalent iCGIum-containing genes had reduced levels of the transcriptional elongation mark, H3K79me2 (Fig. 2A), and lower expression levels (SI Appendix, Fig. S2A).
Fig. 2.
Epigenetic modification landscapes of iCGI-containing genes and their expression levels. (A) Average signal intensity of epigenetic factors indicated at the left of each panel along the length of CGI promoter-containing genes in human H1 ESCs. The MLL2, CXXC1, and RING1B data are from mouse ESCs. Genic regions [from −1 kb upstream of TSS to transcription end site (TxEnd)] are divided into four areas: promoter (gray zone), iCGI, and the regions between iCGI and promoter or TxEnd. Boundaries of promoters and iCGIs are marked with vertical dotted lines. CGI promoter-containing genes were divided into three groups: without iCGI (gray dotted lines), with iCGIum (blue lines), and with iCGIm (red lines). The y axis indicates the RPKM values for ChIP-seq signals or DNA methylation ratios obtained from bisulfite sequencing data. (B) Average signal intensity of H3K27me3 and H3K36me3 along the length of iCGI-containing genes in human H1 ESCs. H3K27me3 levels are represented according to the number of iCGIsum (Left: gray, 1 iCGIum; green, 2 iCGIsum; blue, ≥3 iCGIsum), and H3K36me3 levels are represented according to the number of iCGIsm (Right: gray, 1 iCGIm; orange, 2 iCGIsm; red, ≥3 iCGIsm). (C) Average signal intensity of H3K27me3 and H3K36me3 along the length of genes containing only iCGIum (blue line), only iCGIm (red line), or both (purple line) in human H1 ESCs (Left: H3K27me3; Right: H3K36me3). (D) A 3D scatter plot showing the levels of three epigenetic marks (H3K27me3, H3K4me3, and DNA methylation) at iCGIs in hESCs. Each spot is color-coded according to the percentile rank of the expression of the associated gene, and the relative ranks are shown in the color bars at the right of each panel (blue: 0; red: 1). Log2(RPKM+1) values are used for H3K4me3 and H3K27me3 ChIP-seq signals, and methylation ratios (0–1) obtained from bisulfite sequencing data are shown for DNA methylation. (E) The 2D scatter plots showing the levels of H3K4me3 (x axis) and H3K27me3 (y axis) on iCGIsum (Left) and iCGIsm (Right) in hESCs. Each spot is color-coded according to the percentile rank of the associated gene.
In contrast to bivalently modified iCGIsum, methylated iCGIsm did not affect the expression of associated genes, except that elevated H3K36me3 levels were associated with high levels of transcription of genic areas (Fig. 2A). Unlike methylated pCGIs, iCGIsm and silenced bivalent iCGIsum were not enriched for the repressive H3K9me3 modification. In agreement, iCGIm-associated promoters displayed active H3K4me3 modification patterns (SI Appendix, Fig. S3C). These iCGI-dependent epigenetic patterns were also observed in genes containing non-CGI promoters (SI Appendix, Fig. S3D) and in mice (SI Appendix, Fig. S3E), demonstrating the epigenetic nature of iCGIs and their regulatory potential.
Although each iCGI type has distinct biochemical characteristics, ∼30% of iCGI-containing genes have more than one type of CGI, which complicates the analysis of their regulatory effects. In general, the effects of iCGIs on gene expression appeared to be additive because the level of repression by H3K27me3 and the level of activation by H3K36me3 correlated with the numbers of bivalent iCGIsum and methylated iCGIsm found within a gene, respectively (Fig. 2B). When both types of iCGIs were present within a gene, an intermediary level of active and repressive histone modifications was found (Fig. 2C). These results indicate that gene expression is influenced by the overall modification patterns of the associated iCGIs.
We examined the correlation between epigenetic modification patterns of all iCGIs and the expression levels of their associated genes. We plotted iCGI modifications in human ESCs using 3D scatter plots (X: DNA methylation; Y: H3K4me3; Z: H3K27me3) with differential coloring for the percentiles of the associated gene expression (Fig. 2D and Dataset S3). When DNA methylation levels were high (iCGIm), iCGIs were devoid of histone modifications and showed high expression levels of the associated genes (Fig. 2 D and E). Conversely, iCGIs with low DNA methylation levels (iCGIum) were either H3K4me3-modified (798 H3K4me3 monovalent iCGIsum) with active expression or contained both active (H3K4me3) and repressive (H3K27me3) modification marks (758 bivalent iCGIsum) in silenced genes (Fig. 2 D and E). These CGI modification patterns were also observed in mouse ESCs (SI Appendix, Fig. S3G), indicating that iCGIs have alternative epigenetic modification states that correlate with the expression levels of the associated genes.
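The three modification states described in this paragraph (methylated iCGIm, H3K4me3 monovalent iCGIum, and bivalent iCGIum) can be pictured as a simple rule-based classifier. The sketch below is only an illustration: the cutoffs `METH_CUTOFF`, `K4_CUTOFF`, and `K27_CUTOFF`, the `ICGI` record, and the `classify` function are hypothetical and are not taken from the analysis, which relied on clustering and the plotted distributions.

```python
import math
from typing import NamedTuple


class ICGI(NamedTuple):
    name: str
    methylation: float  # DNA methylation rate, 0 to 1
    h3k4me3: float      # log2(RPKM + 1) of the ChIP-seq signal
    h3k27me3: float     # log2(RPKM + 1) of the ChIP-seq signal


# Hypothetical cutoffs chosen for illustration only.
METH_CUTOFF = 0.5
K4_CUTOFF = 1.0
K27_CUTOFF = 1.0


def log_signal(rpkm: float) -> float:
    """Transform a raw RPKM value to log2(RPKM + 1), as used in the figures."""
    return math.log2(rpkm + 1.0)


def classify(icgi: ICGI) -> str:
    """Assign an iCGI to one of the modification states named in the text."""
    if icgi.methylation >= METH_CUTOFF:
        return "iCGIm"
    has_k4 = icgi.h3k4me3 >= K4_CUTOFF
    has_k27 = icgi.h3k27me3 >= K27_CUTOFF
    if has_k4 and has_k27:
        return "bivalent iCGIum"
    if has_k4:
        return "H3K4me3 monovalent iCGIum"
    return "other iCGIum"


if __name__ == "__main__":
    example = ICGI("example", 0.05, log_signal(7.0), log_signal(3.0))
    print(classify(example))
```

With real data, the cutoffs would have to be derived from the observed signal distributions rather than fixed in advance.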
Epigenetic Changes on iCGIs Correlate with Developmental Gene Expression During Differentiation.
Two distinct epigenetic states of iCGIs reflecting the expression levels of the associated genes suggest that alteration of their modification state could be reflected in gene expression. To examine this possibility during ESC differentiation, we compared human ESCs (hESCs) with differentiated cells, human neural progenitor cells (hNPCs, GSE16368) and fibroblasts (IMR90, GSE16256), for cell type-specific modification of iCGIs with respect to differential expression (Fig. 3A). In general, the correlation between modification patterns and expression levels observed in the ESCs was also seen in the differentiated cells. However, dramatic changes in iCGI modifications correlated with gene expression changes induced by differentiation. First, many iCGIs (both iCGIsum and iCGIsm) in the differentiated cells were silenced by higher levels of H3K27me3. Second, a substantial proportion of the bivalently modified iCGIsum was absent from the differentiated cells. Approximately one-third of bivalent iCGIsum were hypermethylated in differentiated cells, and this was associated with gene activation (SI Appendix, Fig. S4 A and B). Scatter plot analysis of differential modification levels of iCGIs revealed that hypermethylation was associated with transcriptional activation and a reduction in both H3K4me3 and H3K27me3 marks (Fig. 3B): 11% of iCGIsum had switched to iCGIsm (>0.4% of DNA methylation rate changes, hNPC-hESC) with transcriptional activation in differentiated hNPCs (SI Appendix, Fig. S4C). The genes containing the hypermethylated iCGIs in hNPCs have key roles in transcriptional regulation and brain development (SI Appendix, Fig. S4D). This transcriptional activation associated with hypermethylation of iCGIs is also observed in genes with the methylated promoters (SI Appendix, Fig. S4 E–G).
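The comparison of differentiated cells with ESCs described above rests on two differences per iCGI: the change in DNA methylation rate and the change in the expression percentile rank of the associated gene. A minimal sketch of that bookkeeping follows. It assumes a methylation gain cutoff of 0.4 on the 0 to 1 scale (read here as the 40% cutoff used elsewhere in the text) and the percentile rank difference of 0.3 given in the Fig. 3B legend; the function names, the input dictionaries, and the particular percentile rank definition are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(expr: dict[str, float]) -> dict[str, float]:
    """Map each gene to the fraction of genes with expression <= its own."""
    values = sorted(expr.values())
    n = len(values)
    return {gene: bisect_right(values, v) / n for gene, v in expr.items()}


def activated_by_hypermethylation(
    meth_esc: dict[str, float],
    meth_diff: dict[str, float],
    expr_esc: dict[str, float],
    expr_diff: dict[str, float],
    gene_of: dict[str, str],
    meth_gain: float = 0.4,   # assumed cutoff on the 0-1 methylation scale
    rank_gain: float = 0.3,   # percentile rank difference from the Fig. 3B legend
) -> list[str]:
    """Return iCGIs that gained methylation while their gene gained expression rank."""
    rank_esc = percentile_ranks(expr_esc)
    rank_diff = percentile_ranks(expr_diff)
    hits = []
    for icgi, gene in gene_of.items():
        d_meth = meth_diff[icgi] - meth_esc[icgi]   # differentiated minus ESC
        d_rank = rank_diff[gene] - rank_esc[gene]
        if d_meth > meth_gain and d_rank > rank_gain:
            hits.append(icgi)
    return sorted(hits)
```

The same function applies to any pair of cell types, for example IMR90 versus hESCs, as long as both methylation and expression tables cover the same iCGIs and genes.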
Fig. 3.
Epigenetic modification changes of iCGIs during ESC differentiation. (A) The 3D scatter plots showing the levels of DNA methylation, H3K4me3, and H3K27me3 at iCGIs in hESCs, hNPCs, and IMR90. Each spot is color-coded according to the percentile rank of the expression of the associated gene, as shown in the vertical color bars (blue: 0; red: 1). Each axis represents DNA methylation rate or log2(RPKM+1) values for H3K4me3 and H3K27me3. (B) DNA methylation, H3K4me3, and H3K27me3 changes at 1,369 iCGIs associated with 651 genes, which were differentially expressed (percentile rank differences >0.3) in hNPCs compared with hESCs. Difference values of DNA methylation rate and log2(RPKM+1) for H3K4me3 and H3K27me3 of each iCGI are shown in 2D scatter plots (hNPCs-hESCs). Each spot is color-coded according to the percentile rank difference of the expression of their associated gene as shown in the vertical color bars (blue: −1; red: +1). (C) Heatmap display of various epigenetic marker enrichments for the genes activated by hypermethylation of the associated iCGIs in IMR90 cells. Sixty-six genes showing a more than 40% increase of iCGI methylation are shown. The Top two rows show the levels of the indicated epigenetic marks along the relative locations of CGIs indicated in the leftmost panels. Genes are arranged from Top to Bottom according to the intragenic locations of the CGIs in their gene body. Red gradient indicates the RPKM values of the modifications (or DNA methylation rates) in H1 ESCs or IMR90 cells. The heatmaps in the bottom two rows show differential levels of the epigenetic marks in IMR90 compared with those in hESCs (IMR90-hESC) or differential levels of the epigenetic marks in iPSCs compared with those in IMR90 (iPSC-IMR90). Color scale in heatmap shows the differences as increased (red) and decreased (green).
The loss of H3K4me3 marks specifically occurred at iCGIs and not at other intragenic regions, but DNA methylation increased over entire intragenic regions (SI Appendix, Fig. S4H). Hypermethylation of these intragenic regions was observed until ∼1 kb away from the TSS (SI Appendix, Fig. S4I). H3K27me3 decreased at regions between TSSs and iCGIs to the same extent as at hypermethylated iCGIs (SI Appendix, Fig. S4H). This effect was substantial when iCGIs were located within 10 kb of the TSS, and moderate when iCGIs were located more than 10 kb away from the TSS (SI Appendix, Fig. S4J). Hypomethylation was often associated with loss of transcriptional activity and increased levels of H3K27me3 (Fig. 3B and SI Appendix, Fig. S4B). A comparison between mouse ESCs (mESCs) and mouse neural progenitor cells (mNPCs) revealed an even more striking conversion of bivalently modified iCGIsum into DNA-methylated iCGIsm in the differentiated cells (SI Appendix, Fig. S5 A and B). Therefore, these results suggest an intriguing possibility that iCGIs regulate gene expression by converting bivalent histone modification to DNA methylation at poised developmental genes or by gaining H3K27me3 marks to further silence genes, the expression of which is not required.
To examine how the DNA methylation changes at iCGIs affect gene expression during ESC differentiation, the epigenetic patterns influenced by hypermethylation of iCGIs were examined for the entire genic landscape in differentiated IMR90 and induced pluripotent stem cells (iPSCs) (GSE16256) derived from fibroblasts. As shown above, hypermethylation of iCGIs was accompanied by loss of bivalent histone marks from the iCGIs. The loss of H3K27me3 was not limited to the iCGIs but was observed over the entire genic area including the promoters (Fig. 3C). Conversely, promoters and genic regions gained substantial amounts of H3K4me3 and H3K27 acetylation (H3K27ace), along with H3K36me3 at gene bodies, resulting in higher transcriptional activity, as shown by high global run-on sequencing levels. In addition, pol II that was stalled at promoters and iCGIs was released to genic regions, increasing the elongation efficiency in differentiated IMR90 cells compared with that in ESCs. Consistent with this, reverse differentiation of fibroblasts to iPSCs was accompanied by a transition of the chromatin modification pattern in the opposite direction, along with the silencing of gene expression (Fig. 3C and SI Appendix, Fig. S5C). Therefore, these observations provide evidence that reversible changes of the chromatin structure of iCGIs play an important role in regulating developmental gene expression during ESC maintenance and differentiation.
Methylation of iCGIs Disrupts Promoter Interactions.
The correlation between changes at iCGIs and the activity of their associated promoters raised the possibility of physical interactions between these two elements. Because iCGI function appears to be conserved in mammals, we relied on mESC chromatin interaction data (21) to test this idea. Chromatin interaction analysis of ESCs by paired-end-tag sequencing (ChIA-PET) of pol II-associated DNA revealed hot spots of chromatin interactions. As expected, unmethylated CGIs at promoters showed a high frequency of chromatin interactions (Fig. 4A). In addition, iCGIsum, as well as bivalently silenced promoters, had high levels of interaction. Interestingly, iCGIsum interacted most frequently with promoters (Fig. 4B and Dataset S4), even with their own promoters located more than 50 kb upstream of the iCGI (Fig. 4C). Sixty-four percent of bivalent iCGIsum and 79% of H3K4me3 monovalent iCGIsum physically interacted with other genomic regions (SI Appendix, Fig. S6A). However, these interactions were not seen at methylated CGIs, even at iCGIsm displaying high transcriptional activity (Fig. 4A and SI Appendix, Fig. S6B). To test whether interactions at iCGIs are affected by the hypermethylation that occurs during ESC differentiation to neuronal cells, interaction frequencies of hypermethylated iCGIs and of iCGIsum in mNPCs were compared with their corresponding genes in mESCs. ChIA-PET analysis of mESCs and mNPCs revealed that hypermethylation of iCGIs during differentiation resulted in decreased interaction frequencies in differentiated cells, whereas the interaction frequencies of iCGIsum were unchanged (Fig. 4D). The ability of iCGIs to provide additional binding sites for PRC complexes and to mediate physical interaction with promoter regions may enable strong enrichments of H3K27me3 marks from TSSs to iCGIs, resulting in more stable silencing of key developmental genes in ESCs (Fig. 2A and SI Appendix, Figs. S3B and S6C). 
iCGI hypermethylation can disrupt physical interactions with promoter regions and prevent the binding of PRC complexes to iCGIs, and this may cause the elimination of H3K27me3 marks from TSSs to iCGIs (SI Appendix, Fig. S4 H and G). H3K4me3 monovalent iCGIs also interact with promoter regions in actively expressed genes (SI Appendix, Fig. S6A); however, why transcription starts only in promoter regions and not in iCGIs (except for the 13% CAGE+ iCGIsum in SI Appendix, Fig. S2F) remains to be elucidated.
Fig. 4.
Disruption of CGI interactions by methylation of iCGIs. (A) Histogram showing the enrichment of different types of CGIs and bivalent promoters (Bivalent_P) undergoing ChIA-PET interactions with RNA polymerase II in mouse ESCs (GSE44067). The y axis indicates the RPKM values for the ChIA-PET signal. (B) Donut chart showing the relative genomic distribution of iCGIum-interacting regions. (C) Genome browser view showing the enrichment of ChIA-PET interaction signals, Pol II, H3K4me3, and H3K27me3 at the Insr and Gprin1 loci. Locations of iCGIs are highlighted in cyan. Interacting regions are indicated by red dashed lines with P values. (D) Histogram showing the enrichment of methylated or unmethylated iCGIs undergoing ChIA-PET interactions with RNA polymerase II in mESCs and NPCs. (Upper) The interaction frequencies of hypermethylated iCGIs in NPCs compared with ESCs. (Lower) The interaction frequencies of unmethylated iCGIs in both ESCs and NPCs. The y axis indicates the RPKM values for the ChIA-PET signal. (E) The 3C-qPCR analysis of interactions between promoter CGIs and iCGIs (iCGIum: Insr; iCGIm: Igsf3, Map3k1, and Whrn) within each gene in control DMSO-treated and 5-azacytidine–treated mouse ES14 cells. Two independent 3C-qPCR experiments were performed on two independent biological samples. The data were normalized to a Gapdh loading control.
To validate these findings experimentally, we examined the effect of demethylation on iCGI chromosomal interactions by chromosome conformation capture (3C) analysis of iCGI-containing genes in mouse ES14 cells with and without treatment with a DNMT inhibitor (22). Because all of the pCGIs tested were unmethylated, there was no change in their DNA methylation level upon 5-azacytidine treatment (SI Appendix, Fig. S6E). However, the iCGIm-containing genes Igsf3, Map3k1, and Whrn were markedly demethylated upon treatment with 5-azacytidine. 3C-quantitative real-time PCR (3C-qPCR) measurements showed that the Insr iCGIum had a high interaction frequency, whereas the interaction frequencies of the iCGIsm of Igsf3, Map3k1, and Whrn with their promoters were barely detectable (Fig. 4E). This correlated with a reduction of methylation at these iCGIs upon 5-azacytidine treatment (SI Appendix, Fig. S6E). However, treatment with 5-azacytidine enhanced chromatin interactions in Map3k1, Igsf3, and Whrn between the iCGI and the promoter, but had no effect on other intragenic regions (SI Appendix, Fig. S6 D and E). Therefore, DNA methylation of iCGIs appears to interrupt chromatin interactions with promoter regions.
DNA Hypermethylation of iCGIs Is Required for Active Expression of Their Associated Genes.
To validate the requirement of iCGI methylation for gene regulation, we analyzed the effects of DNMT mutations on iCGI-mediated regulation in mouse ESCs (GSE28254 and GSE29413). We examined the effects of DNA demethylation on accumulation of the repressive histone modifications H3K27me3 and H3K9me3 on iCGIsum and iCGIsm. A DNMT triple-knockout (TKO) caused a complete loss of DNA methylation along with a slight depletion of H3K9me3 on iCGIsum and iCGIsm, reflecting similarities between these two repressive epigenetic markers (SI Appendix, Fig. S7 A and B). However, the loss of DNA methylation had different effects on H3K27me3 depending on the type of iCGI. iCGIsm gained H3K27me3 modifications upon the loss of DNA methylation, whereas bivalent iCGIsum lost the repressive modifications (SI Appendix, Fig. S7B).
Reduced DNA methylation of iCGIs caused transcriptional repression of their associated genes, whereas genes containing hypomethylated pCGIs were activated in TKO cells (SI Appendix, Fig. S7C). In addition, recovery of DNMT3A expression in TKO cells (GSE57577) induced hypermethylation of CGIs in both promoter and intragenic regions, but led to differential effects on gene expression depending on the locations of the CGIs. Genes with pCGIs were repressed, whereas genes with iCGIs were de-repressed in DNMT3A-expressing cells (SI Appendix, Fig. S7D). Therefore, DNA methylation at iCGIs appears to counteract the repressive H3K27me3 modification and activate gene expression.
Cell Type-Specific Gene Expression Is Associated with Differential Methylation of iCGIs.
We examined DNA methylation changes at iCGIsum of ESCs in three different somatic cell types (GM12878: LB, IMR90, and NPC). We identified 355 iCGIs that were markedly hypermethylated (>40%) in at least one somatic cell type (Fig. 5A and Dataset S5). Many iCGIs (cluster 1) gained methylation in all of the differentiated cell types, but three distinct groups of iCGIs were specifically hypermethylated in just one cell type each, resulting in depletion of H3K4me3 and H3K27me3 marks concurrently with the activation of the associated genes in the corresponding cell type (Fig. 5 B–D and SI Appendix, Fig. S8 A–C). These cell type-specifically activated genes were enriched for homophilic cell adhesion (blood cell-specific cluster 2), morphogenic transcription factors (fibroblast-specific cluster 3), and brain development-associated genes (neuron-specific cluster 4) (SI Appendix, Table S3), and similar patterns of iCGI methylation were observed in the analysis of additional cell types (mobilized CD34+ cell, HSC; cortex-derived neural stem cell, Neuro; breast myoepiblast, Myoe) (SI Appendix, Fig. S9), confirming the importance of iCGIs in cell type-specific differentiation. Moreover, many of these differentially regulated iCGI-containing genes encoded homeobox proteins specifically expressed in the corresponding cell types (Fig. 5A). In particular, several key transcription factors including BAI1, LHX2, ZIC1, PAX6, MEIS1, and ONECUT1 were specifically activated in NPCs with hypermethylation of their iCGIs, indicating the essential roles of iCGI-mediated transcriptional control in neuronal differentiation.
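The selection step described above can be illustrated with a short Python sketch. It is not the authors' pipeline. It assumes that the >40% criterion refers to the gain in methylation relative to ESCs, uses made-up methylation fractions, and replaces hierarchical clustering with a simple rule-based labeling. All names are hypothetical.

```python
THRESHOLD = 0.40  # assumed: gain of more than 40 percentage points over ESCs


def classify_icgi(esc_level, somatic_levels, threshold=THRESHOLD):
    """Label an iCGI by the somatic cell types in which it gains methylation.

    esc_level: methylation fraction (0-1) in ESCs.
    somatic_levels: dict mapping cell type -> methylation fraction (0-1).
    Returns None if the iCGI is not hypermethylated anywhere, "common" if it
    is hypermethylated in every cell type, "<cell>-specific" if in exactly
    one, and "mixed" otherwise.
    """
    hyper = sorted(
        cell for cell, level in somatic_levels.items()
        if level - esc_level > threshold
    )
    if not hyper:
        return None
    if len(hyper) == len(somatic_levels):
        return "common"
    if len(hyper) == 1:
        return f"{hyper[0]}-specific"
    return "mixed"


# Hypothetical iCGIs with made-up methylation fractions.
print(classify_icgi(0.05, {"LB": 0.90, "IMR90": 0.80, "NPC": 0.70}))  # common
print(classify_icgi(0.05, {"LB": 0.10, "IMR90": 0.10, "NPC": 0.70}))  # NPC-specific
```

Rule-based labels of this kind only approximate the four clusters in Fig. 5A, which were obtained by hierarchical clustering of the methylation differences.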
Fig. 5.
Cell type-specific regulation of DNA methylation at iCGIs. (A) Heatmap showing the hierarchical clustering of DNA methylation differences at iCGIs between ESCs and indicated somatic cells [LB (lymphoblastoid, GM12878), IMR90 (fibroblast), or hNPC]. A total of 355 iCGIs hypermethylated (more than 40%) in at least one somatic cell type were analyzed using hierarchical clustering. Four clusters were generated (cluster 1: common; cluster 2: LB-specific; cluster 3: fibroblast-specific; cluster 4: NPC-specific). Representative transcription factors of each cluster are indicated on the left. Color scale for the methylation differences is shown. (B) Expression levels of the genes associated with each cluster in LB, IMR90, and NPC. Box-whisker plots showing the FPKM values of the genes from mRNA-seq data of each cell line. (C) H3K27me3 levels of iCGIs in LB, IMR90, and NPC. Box-whisker plots showing the RPKM values of the iCGIs from ChIP-seq data of each cell line. (D) H3K4me3 levels of iCGIs in LB, IMR90, and NPC. Box-whisker plots showing the RPKM values of the iCGIs from ChIP-seq data of each cell line.
Cell Type-Specific Transcription Factor-Binding Sites Are Enriched at iCGIs.
Specific epigenetic changes are usually induced by cell type-specific transcription factors (23). To test whether cell type-specific methylation of iCGIs can be triggered by tissue-specific transcription factors, we analyzed transcription factor-binding sites in the differentially methylated iCGI clusters in Fig. 5A using Homer software (24). Indeed, these iCGI clusters, rather than their pCGIs, were enriched for binding sites for a distinct set of developmental transcription factors known to act in the corresponding cell types (Fig. 6A).
Fig. 6.
Transcription factor-binding motif enrichment at iCGIs. (A) Enrichment of defined transcription factor-binding motifs was calculated using Homer software. Heatmaps show the hierarchical clustering of transcription factor motifs enriched in each cluster of the pCGIs (Left) and iCGIs (Right) defined in Fig. 5 (cluster 1: common; cluster 2: LB-specific; cluster 3: fibroblast-specific; cluster 4: NPC-specific). Black gradient indicates enrichment (minus log P values) of motifs in CGIs, and the motifs with a minus log P value greater than 3 in at least one cluster are shown. (B) Heatmap showing the hierarchical clustering of transcription factor motif enrichments for active promoter CGIs, bivalent promoter CGIs, iCGIsum, and iCGIsm in hESCs. Enrichments (minus log P values) of 88 motifs with a cutoff minus log P value of 4 in at least one region are shown. The representative transcription factors enriched specifically in each category are indicated with different colors. Sequences for the bivalent promoter CGIs were obtained from the bivalent genes defined in Li et al. (49). Active promoter CGIs are from the genes without an iCGI. (C) Expression levels (FPKM) of the transcription factors with binding sites at active promoter CGIs, bivalent promoter CGIs, iCGIsum, and iCGIsm are obtained from the RNA-seq data of hESCs. Horizontal lines indicate the median values. (D) Density plot of iCGI-containing transcription factors with iCGIsum (black line) or iCGIsm (gray line) according to their tissue specificity scores. Tissue specificity scores were obtained from Ravasi et al. (17).
To examine the generality of this phenomenon, we analyzed the enrichment of transcription factor-binding sites among the entire set of promoter-associated CGIs and iCGIs. Grouping of transcription factors according to their calculated probability of binding to different CGI elements revealed four distinct groups of transcription factors specifically enriched in given CGI elements (Fig. 6B and Dataset S6). The transcription factors recognizing the CGIs of active promoters (Active pCGI) encompassed many abundant transcription factors including SP1, NFY, and YY1 (green), whereas NRSF and TBP (gray) were specifically enriched at bivalent pCGIs (Fig. 6B). Although their histone modification patterns are similar to those of bivalent promoters, iCGIsum were enriched for binding sites for unique developmental transcription factors (blue) quite different from those binding to bivalent pCGIs. These binding sites for distinct groups of transcription factors in each CGI type were evolutionarily conserved such that similar enrichment patterns were observed in mESCs (SI Appendix, Fig. S10 and Dataset S6).
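The motif filtering behind Fig. 6 (retain a motif when its minus log P value passes a cutoff in at least one CGI category) can be sketched as follows. This is an illustration under stated assumptions: the scores are invented, the motif names are placeholders, the comparison is taken as strictly greater than the cutoff, and the input is assumed to hold minus log P values already, so the base of the logarithm does not matter here.

```python
def filter_enriched_motifs(scores, cutoff):
    """Keep motifs whose enrichment exceeds the cutoff in at least one group.

    scores: dict mapping motif name -> {CGI group -> minus log P value}.
    """
    return {
        motif: by_group
        for motif, by_group in scores.items()
        if max(by_group.values()) > cutoff
    }


def most_enriched_group(by_group):
    """Return the CGI group in which the motif has its highest score."""
    return max(by_group, key=by_group.get)


# Hypothetical minus log P values for three placeholder motifs.
scores = {
    "motif_A": {"active_pCGI": 9.2, "bivalent_pCGI": 1.1, "iCGIum": 0.8, "iCGIm": 2.0},
    "motif_B": {"active_pCGI": 0.5, "bivalent_pCGI": 1.0, "iCGIum": 6.3, "iCGIm": 0.9},
    "motif_C": {"active_pCGI": 1.2, "bivalent_pCGI": 2.5, "iCGIum": 3.1, "iCGIm": 0.4},
}

kept = filter_enriched_motifs(scores, cutoff=4.0)
for motif, by_group in kept.items():
    print(motif, most_enriched_group(by_group))
```

In this toy input, motif_C is dropped because none of its scores exceeds 4, while the other two motifs are assigned to the category in which they score highest.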
These unique transcription factors recognizing bivalently modified CGIs are expressed in a cell type-specific manner and are thus expressed at much lower levels in ESCs than those recognizing active promoters or methylated iCGIsm (Fig. 6C). Due to their absence in ESCs, most of their target iCGIs might remain in an unmethylated state and only become methylated in differentiated derivatives specifically expressing the corresponding transcription factors. Intriguingly, many of these cell type-specific transcription factors also contained iCGIs and were regulated in a similar way. On the other hand, the methylated iCGIsm of ESCs contained binding sites for abundant transcription factors present in ESCs, such as MYC and MAX (Fig. 6 B and C). These results indicate that the availability of iCGI-binding transcription factors may be the main determinant for differential modification of iCGIs. For this reason, transcription factor genes containing iCGIsum are in a silenced state waiting for a specific developmental cue and thus have substantially higher cell-type specificity scores than those containing methylated iCGIsm (Fig. 6D). These results suggest that cell type-specific expression of both transcription factors and their targets may take advantage of epigenetic modifications of iCGIs to generate differentiated lineages from ESCs. Combinatorial transcription factor occupancy on promoters and iCGIs may lead to diverse transcriptional outputs.
Discussion
Most work concerning the importance of epigenetic modifications has focused on promoter-associated CGIs. The only known role of DNA methylation on iCGIs is to repress alternative TSSs in a cell type-specific manner. Here we provide several lines of evidence suggesting that iCGIs have additional roles in ESC differentiation.
First, iCGI-containing genes, including most cell lineage transcription factors, are important developmental regulators. Many of the genes encoding key regulators of morphogenesis and organ development, such as BMPs, HOXs, PAXs, GATAs, Notch, and WNTs, contain iCGIs. The HOXA gene cluster contains 39 CGIs distributed between 11 HOXA genes (chromosome 7, 155-kb region) (25), and seven PAX genes contain 18 iCGIs. Therefore, iCGIs appear to be a common element for development-specific gene expression.
Second, the epigenetic modifications of iCGIs are evolutionarily conserved and vary in a cell type-specific manner. In both human and mouse, the modification status of iCGIs correlated with the expression of their associated genes. Comparison of various human differentiated cell types with ESCs revealed that cell type-specific hypermethylation of iCGIs correlated with activated transcription of their associated developmentally regulated genes. For example, iCGIs of cell lineage markers (PAX5, RARA, LHX2, and MEIS1) were hypermethylated only in the corresponding cell types that typically express these transcription factors (26–29). Although there are a number of iCGIs commonly methylated in diverse cell types, including ESCs, CGIs are the major targets of controlled de novo DNA methylation during early embryogenesis following genome-wide demethylation (30). Therefore, methylation of iCGIs is highly associated with development.
Third, iCGIs may provide a common platform for bivalent chromatin and DNA methylation. Many developmental genes have been suggested to silence their expression via bivalent histone modification of pCGIs (31, 32). The establishment of bivalent modification and its progress into monovalent modification have been largely speculated to rely on transcription factor-mediated recruitment of specific histone modifiers. However, this may not be enough to provide the necessary regulatory complexity required for cellular differentiation. Additionally, the requirement for DNA methylation to regulate bivalent chromatin has been uncertain. Therefore, identification of the additional layer of regulation mediated by iCGIs expands our understanding of bivalent chromatin and suggests a new regulatory mechanism. In addition, areas encompassed by the iCGIs, which could include more than one gene, can interact with other CGIs and be coregulated by the spread of the repressive histone mark H3K27me3. These CGI interactions with interlocking repressive chromatin can be disrupted by DNA methylation of iCGIs, demonstrating another way in which DNA methylation can affect gene expression. Indeed, DNA hypomethylation in DNMT knockout cells leads to the reappearance of H3K27me3 and H3K4me3 at iCGIs, resulting in gene silencing (6, 33, 34). Therefore, iCGIs reveal a regulatory mechanism distinct from that of enhancers and silencers.
Fourth, iCGIs are enriched for binding sites for developmentally regulated transcription factors that are distinct from those of pCGIs. It is intriguing that many cell type-specific transcription factors can bind to iCGIs to affect target gene expression. A complex combination of transcription factors on promoters and iCGIs can result in subtle changes in transcriptional regulation depending on each cell type. We observed that iCGI-binding transcription factor availability correlated somewhat with differential modification of iCGIs, but we do not yet know the exact role of transcription factors in epigenetic modifications of iCGIs. However, iCGI-binding transcription factors, including glucocorticoid receptors and progesterone receptors, interact with the H3K4 demethylases LSD1, KDM5A, and KDM5B (35) and DNMTs (36), supporting the idea that iCGI-binding transcription factors can recruit chromatin modifiers to iCGIs. LHX2, SCL, HOXD13, GATA3, progesterone receptor, glucocorticoid receptor, NR5A2, and CTCF are key players inducing differentiation into neuronal (ectoderm) or hematopoietic (mesendoderm) cells, but appear to bind specifically to iCGIs rather than to pCGIs (28, 37–41).
Recently, a number of transcription factors binding at promoters were shown to elicit dynamic epigenetic changes during ESC differentiation (42). However, these transcription factors are quite different from those enriched at iCGIs and appear to regulate a different set of target genes using a distinct mechanism that induces DNA methylation loss or H3K27 acetylation accumulation. Although the bivalent promoter regions of the iCGI-containing genes also gained H3K27 acetylation upon activation, they did not show DNA methylation changes. Therefore, different transcription factors appear to regulate development-specific gene activation at different genic locations by distinct mechanisms.
Finally, iCGIs have distinct sequence features optimized for methylation-mediated regulation. We showed that several sequence features (length, CGG/C ratio, and transcription factor-binding sites) of CGIs influence their methylation levels and differ between intragenic and promoter-associated CGIs. Reflecting their sequence differences, the methylation process and its functional effect on iCGIs are different from those on pCGIs. Transcriptional activation at iCGIs could facilitate methylation at H3K36 by SETD2 (SET domain-containing 2), which binds to the C-terminal domain of elongating Pol II (43, 44). High levels of H3K36me3 at iCGIs may promote DNA methylation because the PWWP domain of DNMT3A has been shown to recognize the H3K36me3 mark (45). Although there is no direct evidence that SETD2 binds to methylated DNA, the loss of DNA methylation can promote the binding of H3K36 demethylases KDM2A and NO66 to iCGIs (46, 47), which could repress H3K36me3 accumulation. Therefore, DNA methylation and H3K36me3 appear to promote each other in actively transcribing regions. Intriguingly, the histone variant H2A.B (48) was preferentially associated with iCGIs rather than pCGIs (SI Appendix, Fig. S3F). However, its binding to iCGIs was not dependent on their methylation status; it was enriched at iCGIsum as well as at iCGIsm. The histone variant may contribute to the formation of the distinct epigenetic features and regulatory roles of iCGIs. Therefore, distinctive sequence features of iCGIs may be able to mediate both the arrest and the release of transcriptional elongation, making iCGIs efficient regulators of gene expression during differentiation (SI Appendix, Fig. S11).
These lines of evidence demonstrate regulatory functions associated with iCGIs. The epigenetic transition from bivalent histone modification to DNA methylation at iCGIs appears to be widely used in many developmental processes. Therefore, genes containing iCGIs may be inferred to be important players in developmentally regulated processes, such as cancer progression and aging. Our analyses thus provide fresh insights into the roles of epigenetic regulation in development and disease.
Materials and Methods
Public Datasets Used.
The following are accession numbers of the datasets used: GSE11431, GSE12241, GSE16256, GSE16368, GSE17312, GSE18927, GSE19468, GSE28254, GSE29413, GSE30202, GSE40832, GSE41009, GSE43070, GSE44067, GSE48122, GSE49294, GSE52017, GSE53490, GSE57413, GSE57575, GSE64115, human ENCODE, and mouse ENCODE. Details of analyzed data are listed in Dataset S1.
Sequencing and Data Processing.
Mapping.
All human sequence data were mapped to the University of California at Santa Cruz (UCSC) human reference genome (hg19, downloaded from UCSC). For consistent analysis, all mouse data were mapped to the UCSC mouse reference genome (mm9). Human and mouse data were mapped to the reference genome using Bowtie2 (version 2.1.0). The default parameters were used for the mapping, so no mismatches were allowed in the seed alignment (default setting).
ChIP-seq data analysis.
To analyze the enrichment of histone modification signals, the SAM output was converted to a BED file. Mapped reads were counted at a 50-bp resolution along the genome, and the reads per kilobase per million mapped reads (RPKM) were calculated at the same 50-bp resolution. These RPKM values were averaged between replicate samples to minimize experimental variation.
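As a rough illustration of this step, the Python sketch below bins read start positions into 50-bp windows, converts the counts to RPKM, and averages replicates. It is a simplified stand-in for the actual processing: the helper names are hypothetical, reads are represented only by a 0-based start coordinate on a single chromosome, and bins without reads in a replicate are treated as zero when averaging.

```python
from collections import Counter

BIN_SIZE = 50  # bp, the counting resolution given in the text


def rpkm(read_count, total_mapped_reads, region_bp):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_bp * total_mapped_reads)


def binned_rpkm(read_starts, total_mapped_reads, bin_size=BIN_SIZE):
    """Map bin index -> RPKM for one chromosome of one sample."""
    counts = Counter(pos // bin_size for pos in read_starts)
    return {b: rpkm(c, total_mapped_reads, bin_size) for b, c in counts.items()}


def average_replicates(replicates):
    """Average per-bin RPKM across replicates (missing bins count as 0)."""
    bins = set().union(*replicates)
    return {
        b: sum(rep.get(b, 0.0) for rep in replicates) / len(replicates)
        for b in bins
    }


# Toy example: two replicates with 1 million mapped reads each.
rep1 = binned_rpkm([0, 10, 49, 50, 120], total_mapped_reads=1_000_000)
rep2 = binned_rpkm([5, 60, 61, 62], total_mapped_reads=1_000_000)
print(average_replicates([rep1, rep2]))
```

Dividing by the bin width in base pairs and by the library size makes the signal comparable between samples sequenced to different depths, which is why averaging replicates is done on RPKM values rather than on raw counts.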
Whole-genome bisulfite sequencing data analysis.
For the bisulfite sequencing data, the raw sequences were first trimmed using Trim Galore to remove low-quality base calls and adapter sequences from the 3′ end. The sequences were then aligned to the bisulfite-converted genome index built from the reference genome using Bismark with the Bowtie2 aligner. The standard alignment parameters were used with a multiseed length of 20 bp and 0 mismatches. The Bismark methylation extractor was used to extract the methylation percentage at every called cytosine in bedGraph format.
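The per-cytosine methylation percentage reported by such an extraction is the share of reads that call the base as methylated. The sketch below shows that arithmetic and one possible way to summarize a CGI. Averaging per-site percentages over the island is an assumption made for illustration, since the text does not say how region-level values were derived. Function names, coordinates, and counts are hypothetical.

```python
def methylation_percentage(n_methylated, n_unmethylated):
    """Percent methylation at one cytosine, or None if it has no coverage."""
    covered = n_methylated + n_unmethylated
    if covered == 0:
        return None
    return 100.0 * n_methylated / covered


def mean_region_methylation(sites, start, end, min_coverage=1):
    """Mean per-site methylation (%) within the half-open interval [start, end).

    sites: iterable of (position, n_methylated, n_unmethylated) tuples.
    Sites with fewer than min_coverage reads (and at least 1) are skipped.
    Returns None when no site in the region passes the coverage filter.
    """
    required = max(min_coverage, 1)
    values = [
        methylation_percentage(m, u)
        for pos, m, u in sites
        if start <= pos < end and m + u >= required
    ]
    return sum(values) / len(values) if values else None


# Toy example: three CpG sites, two of them inside a hypothetical iCGI.
sites = [(100, 8, 2), (150, 5, 5), (400, 10, 0)]
print(mean_region_methylation(sites, start=90, end=200))  # prints 65.0
```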
RNA-seq data analysis.
RNA-seq data were mapped to the reference genome using Tophat2 (version 2.0.10). The aligned BAM files were then assembled using the Cufflinks package (version 2.1.1) to extract and compare their FPKM (fragments per kilobase of transcript per million mapped fragments) values. National Center for Biotechnology Information mRNA reference sequence collection (RefSeq) numbers were assigned to each gene name. The expression values of genes with multiple isoforms were averaged. To minimize variation in overall expression values between samples, Cuffnorm (included in the Cufflinks package), which normalizes the read counts across samples for relative comparisons, was used.
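The isoform-averaging step can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the record layout, the function name, the placeholder transcript identifiers, and the FPKM values are all hypothetical, and a plain arithmetic mean across isoforms is assumed.

```python
from collections import defaultdict
from statistics import mean


def average_isoform_fpkm(records):
    """Collapse transcript-level FPKM values to one value per gene.

    records: iterable of (gene_name, transcript_id, fpkm) tuples.
    Returns a dict mapping gene name -> mean FPKM over its isoforms.
    """
    by_gene = defaultdict(list)
    for gene, _transcript_id, fpkm in records:
        by_gene[gene].append(fpkm)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Placeholder transcript IDs and made-up FPKM values.
records = [
    ("PAX6", "transcript_1", 4.0),
    ("PAX6", "transcript_2", 8.0),
    ("LHX2", "transcript_3", 3.0),
]
print(average_isoform_fpkm(records))  # {'PAX6': 6.0, 'LHX2': 3.0}
```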
Detailed descriptions of methods are available in SI Appendix.
Acknowledgments
We thank Dr. Roger D. Kornberg for critical comments and Dr. Yoon-Young Koh for statistical analysis. This work was supported by Samsung Science and Technology Foundation Project SSTF-BA1601-13.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1613300114/-/DCSupplemental.
References
- 1. Chen T, Dent SY. Chromatin modifiers and remodellers: Regulators of cellular differentiation. Nat Rev Genet. 2014;15(2):93–106. doi: 10.1038/nrg3607.
- 2. Bernstein BE, et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006;125(2):315–326. doi: 10.1016/j.cell.2006.02.041.
- 3. Jones PA. Functions of DNA methylation: Islands, start sites, gene bodies and beyond. Nat Rev Genet. 2012;13(7):484–492. doi: 10.1038/nrg3230.
- 4. Beck S, et al. CpG island-mediated global gene regulatory modes in mouse embryonic stem cells. Nat Commun. 2014;5:5490. doi: 10.1038/ncomms6490.
- 5. Zhu J, He F, Hu S, Yu J. On the nature of human housekeeping genes. Trends Genet. 2008;24(10):481–484. doi: 10.1016/j.tig.2008.08.004.
- 6. Hashimoto H, Vertino PM, Cheng X. Molecular coupling of DNA methylation and histone methylation. Epigenomics. 2010;2(5):657–669. doi: 10.2217/epi.10.44.
- 7. Mendenhall EM, et al. GC-rich sequence elements recruit PRC2 in mammalian ES cells. PLoS Genet. 2010;6(12):e1001244. doi: 10.1371/journal.pgen.1001244.
- 8. Hu D, et al. The Mll2 branch of the COMPASS family regulates bivalent promoters in mouse embryonic stem cells. Nat Struct Mol Biol. 2013;20(9):1093–1097. doi: 10.1038/nsmb.2653.
- 9. Maunakea AK, et al. Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature. 2010;466(7303):253–257. doi: 10.1038/nature09165.
- 10. Deaton AM, Bird A. CpG islands and the regulation of transcription. Genes Dev. 2011;25(10):1010–1022. doi: 10.1101/gad.2037511.
- 11. Sarraf SA, Stancheva I. Methyl-CpG binding protein MBD1 couples histone H3 methylation at lysine 9 by SETDB1 to DNA replication and chromatin assembly. Mol Cell. 2004;15(4):595–605. doi: 10.1016/j.molcel.2004.06.043.
- 12. Hawkins RD, et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell. 2010;6(5):479–491. doi: 10.1016/j.stem.2010.03.018.
- 13. Deaton AM, et al. Cell type-specific DNA methylation at intragenic CpG islands in the immune system. Genome Res. 2011;21(7):1074–1086. doi: 10.1101/gr.118703.110.
- 14. Lee SM, et al. HBx induces hypomethylation of distal intragenic CpG islands required for active expression of developmental regulators. Proc Natl Acad Sci USA. 2014;111(26):9555–9560. doi: 10.1073/pnas.1400604111.
- 15. Xu C, Bian C, Lam R, Dong A, Min J. The structural basis for selective binding of non-methylated CpG islands by the CFP1 CXXC domain. Nat Commun. 2011;2:227. doi: 10.1038/ncomms1237.
- 16. Stamatoyannopoulos JA, et al.; Mouse ENCODE Consortium. An encyclopedia of mouse DNA elements (Mouse ENCODE). Genome Biol. 2012;13(8):418. doi: 10.1186/gb-2012-13-8-418.
- 17. Ravasi T, et al. An atlas of combinatorial transcriptional regulation in mouse and man. Cell. 2010;140(5):744–752. doi: 10.1016/j.cell.2010.01.044.
- 18. Hawkins RD, et al. Dynamic chromatin states in human ES cells reveal potential regulatory sequences and genes involved in pluripotency. Cell Res. 2011;21(10):1393–1409. doi: 10.1038/cr.2011.146.
- 19. Andersson R, et al.; FANTOM Consortium. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507(7493):455–461. doi: 10.1038/nature12787.
- 20. Satoh J, Kawana N, Yamamoto Y. ChIP-Seq data mining: Remarkable differences in NRSF/REST target genes between human ESC and ESC-derived neurons. Bioinform Biol Insights. 2013;7:357–368. doi: 10.4137/BBI.S13279.
- 21. Zhang Y, et al. Chromatin connectivity maps reveal dynamic promoter-enhancer long-range associations. Nature. 2013;504(7479):306–310. doi: 10.1038/nature12716.
- 22. Tsuji-Takayama K, et al. Demethylating agent, 5-azacytidine, reverses differentiation of embryonic stem cells. Biochem Biophys Res Commun. 2004;323(1):86–90. doi: 10.1016/j.bbrc.2004.08.052.
- 23. Iwafuchi-Doi M, Zaret KS. Pioneer transcription factors in cell reprogramming. Genes Dev. 2014;28(24):2679–2692. doi: 10.1101/gad.253443.114.
- 24. Heinz S, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38(4):576–589. doi: 10.1016/j.molcel.2010.05.004.
- 25. Rauch T, et al. Homeobox gene methylation in lung cancer studied by genome-wide analysis with a microarray-based methylated CpG island recovery assay. Proc Natl Acad Sci USA. 2007;104(13):5527–5532. doi: 10.1073/pnas.0701059104.
- 26. Medvedovic J, Ebert A, Tagoh H, Busslinger M. Pax5: A master regulator of B cell development and leukemogenesis. Adv Immunol. 2011;111:179–206. doi: 10.1016/B978-0-12-385991-4.00005-2.
- 27. Laursen KB, Wong PM, Gudas LJ. Epigenetic regulation by RARα maintains ligand-independent transcriptional activity. Nucleic Acids Res. 2012;40(1):102–115. doi: 10.1093/nar/gkr637.
- 28. Morales D, Hatten ME. Molecular markers of neuronal progenitors in the embryonic cerebellar anlage. J Neurosci. 2006;26(47):12226–12236. doi: 10.1523/JNEUROSCI.3493-06.2006.
- 29. Spieler D, et al. Restless legs syndrome-associated intronic common variant in Meis1 alters enhancer function in the developing telencephalon. Genome Res. 2014;24(4):592–603. doi: 10.1101/gr.166751.113.
- 30. Guo H, et al. The DNA methylation landscape of human early embryos. Nature. 2014;511(7511):606–610. doi: 10.1038/nature13544.
- 31. Milne TA, et al. MLL associates specifically with a subset of transcriptionally active target genes. Proc Natl Acad Sci USA. 2005;102(41):14765–14770. doi: 10.1073/pnas.0503630102.
- 32. Schmitges FW, et al. Histone methylation by PRC2 is inhibited by active chromatin marks. Mol Cell. 2011;42(3):330–341. doi: 10.1016/j.molcel.2011.03.025.
- 33. Murphy PJ, et al. Single-molecule analysis of combinatorial epigenomic states in normal and tumor cells. Proc Natl Acad Sci USA. 2013;110(19):7772–7777. doi: 10.1073/pnas.1218495110.
- 34. Hagarman JA, Motley MP, Kristjansdottir K, Soloway PD. Coordinate regulation of DNA methylation and H3K27me3 in mouse embryonic stem cells. PLoS One. 2013;8(1):e53880. doi: 10.1371/journal.pone.0053880.
- 35. Stratmann A, Haendler B. Histone demethylation and steroid receptor function in cancer. Mol Cell Endocrinol. 2012;348(1):12–20. doi: 10.1016/j.mce.2011.09.028.
- 36. Hervouet E, Vallette FM, Cartron PF. Dnmt3/transcription factor interactions as crucial players in targeted DNA methylation. Epigenetics. 2009;4(7):487–499. doi: 10.4161/epi.4.7.9883.
- 37. Pandolfi PP, et al. Targeted disruption of the GATA3 gene causes severe abnormalities in the nervous system and in fetal liver haematopoiesis. Nat Genet. 1995;11(1):40–44. doi: 10.1038/ng0995-40.
- 38. Watson LA, et al. Dual effect of CTCF loss on neuroprogenitor differentiation and survival. J Neurosci. 2014;34(8):2860–2870. doi: 10.1523/JNEUROSCI.3769-13.2014.
- 39. Real PJ, et al. SCL/TAL1 regulates hematopoietic specification from human embryonic stem cells. Mol Ther. 2012;20(7):1443–1453. doi: 10.1038/mt.2012.49.
- 40. Shetty AS, et al. Lhx2 regulates a cortex-specific mechanism for barrel formation. Proc Natl Acad Sci USA. 2013;110(50):E4913–E4921. doi: 10.1073/pnas.1311158110.
- 41. Hale MA, et al. The nuclear hormone receptor family member NR5A2 controls aspects of multipotent progenitor cell formation and acinar differentiation during pancreatic organogenesis. Development. 2014;141(16):3123–3133. doi: 10.1242/dev.109405.
- 42. Tsankov AM, et al. Transcription factor binding dynamics during human ES cell differentiation. Nature. 2015;518(7539):344–349. doi: 10.1038/nature14233.
- 43. Sun XJ, et al. Identification and characterization of a novel human histone H3 lysine 36-specific methyltransferase. J Biol Chem. 2005;280(42):35261–35271. doi: 10.1074/jbc.M504012200.
- 44. Brown SJ, Stoilov P, Xing Y. Chromatin and epigenetic regulation of pre-mRNA processing. Hum Mol Genet. 2012;21(R1):R90–R96. doi: 10.1093/hmg/dds353.
- 45. Dhayalan A, et al. The Dnmt3a PWWP domain reads histone 3 lysine 36 trimethylation and guides DNA methylation. J Biol Chem. 2010;285(34):26114–26120. doi: 10.1074/jbc.M109.089433.
- 46. Brien GL, et al. Polycomb PHF19 binds H3K36me3 and recruits PRC2 and demethylase NO66 to embryonic stem cell genes during differentiation. Nat Struct Mol Biol. 2012;19(12):1273–1281. doi: 10.1038/nsmb.2449.
- 47. Blackledge NP, et al. CpG islands recruit a histone H3 lysine 36 demethylase. Mol Cell. 2010;38(2):179–190. doi: 10.1016/j.molcel.2010.04.009.
- 48. Chen Y, Chen Q, McEachin RC, Cavalcoli JD, Yu X. H2A.B facilitates transcription elongation at methylated CpG loci. Genome Res. 2014;24(4):570–579. doi: 10.1101/gr.156877.113.
- 49. Li Q, Lian S, Dai Z, Xiang Q, Dai X. BGDB: A database of bivalent genes. Database (Oxford). 2013;2013:bat057.