Abstract
Polycomb domains safeguard cell identity by maintaining lineage-specific chromatin states enriched in repressive histone modifications, preserving the epigenetic memory of cell lineages. While Polycomb Repressive Complex 2 (PRC2) can re-establish its occupancy after perturbation, the mechanisms that guide de novo Polycomb recruitment remain unclear. To address this, we engineered an auxin-inducible degradation system to reversibly deplete and reintroduce the endogenous PRC2 core subunit Suz12 in mouse embryonic stem cells (mESCs). Genome-wide profiling at an early recovery time point revealed ~1,100 PRC2 nucleation sites, characterized by rapid Suz12 and histone H3K27me3 re-accumulation with strong signal, with minimal impact on gene expression. These sites were significantly enriched at bivalent promoters, coinciding with unmethylated CpG islands and chromatin states associated with developmental regulation, and were largely conserved in differentiated cells. Motif analysis identified G/C-rich DNA sequences associated with E2F and zinc-finger proteins, alongside strong co-occupancy with MTF2 and JARID2, two PRC2 cofactors previously implicated in Polycomb targeting. Notably, a subset of nucleation sites overlapped with long-range chromatin interaction anchors in histone H3K27me3 HiChIP datasets. These findings reveal that PRC2 de novo nucleation sites are associated with a combination of chromatin states, DNA sequence features, cofactor co-occupancy and spatial genome organization, suggesting that epigenetic memory can be re-established through defined genomic and chromatin features.
Keywords: Polycomb, Suz12 Degron, PRC2 de novo recruitment, Epigenetic memory
Author summary
Polycomb group proteins are key epigenetic regulators that silence gene expression by establishing and spreading repressive chromatin domains marked by histone modifications such as histone H3K27me3 and H2AK119ub1, and they are critical for defining cell identity. During differentiation, Polycomb domains are dynamically redistributed, implying a mechanism for de novo targeting to specific loci. How the Polycomb Repressive Complex 2 (PRC2) is initially recruited to these nucleation sites and which features stabilize its binding remain poorly understood. To explore this, we studied the characteristics of the de novo recruitment sites of PRC2 in mouse embryonic stem cells (mESCs) using an auxin-inducible degradation (AID) system to deplete and reintroduce the PRC2 core subunit Suz12. After complete clearance of histone H3K27me3, we identified recruitment sites by ChIP-seq at a very early time point after Suz12 reintroduction. Most nucleation sites were located at bivalent promoters of developmental genes and correlated with unmethylated CpG islands. Motif analysis revealed over-represented sequences, and nucleation sites were co-occupied by accessory partners such as MTF2 and JARID2 and, in a subset of cases, engaged in long-range chromatin interactions. These nucleation sites were also conserved in differentiated cells, highlighting their potential role in developmental regulation. Our study provides insight into how Polycomb domains are established and how epigenetic memory is maintained in stem cells.
Introduction
The Polycomb Repressive Complex 2 (PRC2) is a key epigenetic regulator that fine-tunes gene expression through the deposition of tri-methylation of histone H3 at lysine 27 (H3K27me3), which is associated with gene silencing (1,2). This histone posttranslational modification helps to establish facultative heterochromatin regions, characterized by a low transcriptional rate (3,4). Like gene expression patterns, H3K27me3 chromatin domains are highly cell-type specific (5). Classical examples include PRC2-mediated repression of somatic genes in embryonic stem cells (ESCs) and, conversely, repression of pluripotency genes in differentiated cells (6). Therefore, understanding the dynamic recruitment of Polycomb to chromatin remains a central question, as impairment of PRC2 is associated with defects in embryonic development and with diseases such as cancer (7,8).
During DNA replication, parental histones carrying histone H3K27me3 are recycled to enable the Polycomb machinery to restore local repressive regions (9). After that, two main positive feedback axes reinforce Polycomb recruitment: (i) recognition and catalysis of histone H3K27me3 by PRC2 itself (10), and (ii) the binding of PRC2 to the Polycomb Repressive Complex 1 (PRC1) histone mark (H2AK119ub1) (11). However, the specific mechanisms by which Polycomb identifies its target loci and generates new repressive domains — particularly following domain erosion or during differentiation — remain poorly understood (12,13).
Recent findings have proposed de novo recruitment as an additional mechanism that operates independently of the positive feedback of pre-existing histone marks (14–17). Although unmethylated CpG islands (CGI) are often enriched at these sites (18), CGI content alone cannot explain the diversity and cell-type specificity of Polycomb domains. Additional factors such as cis-regulatory elements, transcription factor interactions, or epigenetic crosstalk may act as nucleation centers from which H3K27me3 domains disperse, feeding back into the steady-state.
Prolonged depletion and subsequent reintroduction of Polycomb group (PcG) proteins have enabled the dissection of Polycomb’s ability to accurately reconstitute lineage-specific chromatin patterns, even in the absence of pre-existing repressive histone marks (14–17). For example, ablation of the PRC2 subunit Suz12 erodes the mark, but its re-expression restores the histone H3K27me3 pattern at CGI (15). Similarly, reintroduction of an EED mutant unable to spread the repressive domains leads to spatial stalling of PRC2 at nucleation sites, forming clusters (14). Additionally, reintroduction of the catalytic subunit EZH2 in Ezh1/Ezh2 double knockouts leads to restoration of histone H3K27me3 levels and PRC2 occupancy (16). While these studies focused on steady-state profiles, they collectively suggest that PRC2 recruitment mechanisms can operate intrinsically, even in the absence of the histone H3K27me3 signal. However, it remains unclear how de novo sites are identified in a developmental and physiological context, and which factors stabilize PRC2 at the domain nucleation sites.
The ability of PRC2 to locate its targets depends on its core subunit Suz12. Suz12 knockout assays have demonstrated that the patterns of histone H3K27 methylation can only be restored upon reintroducing Suz12 (15). This subunit serves as a structural platform that coordinates both the assembly of catalytic core subunits and accessory proteins essential for chromatin recruitment, as none of the core PRC2 subunits possess intrinsic DNA-binding ability (9,19–23). Suz12 possesses two functional components: the VEFS domain that mediates the interaction with other PRC2 members, and the N-terminal region, which couples accessory subunits involved in chromatin binding (1,22). Notably, Suz12 recruitment can occur without pre-existing H3K27me3 or H2AK119ub1, indicating histone mark independent targeting (17).
To investigate the early events of de novo PRC2 recruitment, we implemented an auxin-inducible and reversible system targeting the endogenous Suz12 in mouse embryonic stem cells (mESCs). Auxin treatment led to the rapid degradation of Suz12 and consequent erosion of H3K27me3 domains. Following reintroduction of Suz12, we performed chromatin immunoprecipitation followed by sequencing (ChIP-seq) at a very early time point to map nucleation sites before domain dispersal. Unlike prior studies focused on restored steady-state profiles, our approach captures the earliest stages, revealing that nucleation sites are significantly enriched at bivalent promoters, display higher histone H3K27me3 re-accumulation, overlap with MTF2 and JARID2 cofactors, and coincide with unmethylated CGI and long-range H3K27me3 chromatin loops. This study provides insights into the initial steps that govern PRC2 recruitment and histone H3K27me3 distribution, offering a multidimensional view of early Polycomb recruitment in the context of epigenetic memory.
Results
Establishment of an Auxin-inducible Suz12 depletion system to capture de novo PRC2 recruitment events
Mouse embryonic stem cells (mESCs) retain features of the preimplantation epiblast, including naïve pluripotency and self-renewal capability, and importantly, they can proliferate in the absence of PRC2 (24,25). To investigate early PRC2 recruitment, we used CRISPR-Cas9 to generate a homozygous mESC line carrying an auxin-inducible degradation (AID) system that targets the endogenous Suz12, together with the OsTIR1 receptor (26) (Fig 1A and S1A-G). In this system, auxin induces rapid degradation of the tagged protein (which also carries a fluorescent tag), while auxin removal leads to Suz12 re-accumulation (Fig 1B).
Fig 1. ON/OFF system for Suz12.
(A) Auxin-inducible degradation (AID) system of Suz12 in mouse embryonic stem cells (mESCs). The endogenous Suz12 Open Reading Frame (ORF) was tagged with an AID-mClover cassette via CRISPR-Cas9 in cells expressing an OsTIR1 transgene, enabling auxin (IAA)-mediated proteasomal degradation. (B) Experimental design for PRC2 domain erosion and Suz12 de novo recruitment assay: (1) Steady-state; (2) Auxin treatment for Suz12 degradation and histone H3K27me3 loss; (3) Reintroduction: auxin removal and re-accumulation of Suz12. (C) Western blot showing progressive histone H3K27me3 loss at different auxin treatment time points. (D) Time course of Suz12-mClover recovery after auxin washout, monitored by FACS over 12 hours. At 4.5 hours, Suz12 reaches 11.5% of GFP+ cells. (E-F) ChIP-seq tracks for Suz12 (green) and histone H3K27me3 (orange) 4.5 hours after auxin removal at representative loci, including nucleation sites at the HoxB cluster (E) and the HoxC cluster that shows signal dispersion (F). (G-H) Global ChIP-seq signal for Suz12 (G) and histone H3K27me3 (H) at Control-exclusive peaks across the three conditions. Heatmaps of read density span a ±1.5 kb window centered on the maximum value of the peak signal.
To identify de novo recruitment sites independent of histone H3K27me3, we treated cells with auxin for 96 hours (~6 cell divisions (27)) to allow sustained absence of PRC2 and depletion of histone H3K27me3 via dilution during replication (Fig 1C) (28). At this time, 96.4% of the cells were arrested in G1 phase (Fig S1H), likely due to derepression of cell cycle regulators (e.g., cyclins D1/E1, Ink4a locus) (29). Cells resumed S/G2 phase progression upon auxin washout. To capture early recruitment events before domain spreading, we calibrated the recovery time to 4.5 hours post-washout, corresponding to ~10% of the Suz12 fluorescent signal (Fig 1D, S1I), and performed ChIP-seq for histone H3K27me3 and Suz12 at this time point. Inspection of the ChIP-seq enrichment signals at two Hox gene clusters validated our system. Auxin treatment depleted Suz12 and histone H3K27me3; upon Reintroduction (auxin removal), the signals recovered at the nucleation site within the HoxB cluster but not in the distal HoxC region (Fig 1E and 1F), consistent with a prior report (14).
Genome-wide ChIP-seq analysis confirmed global loss of Suz12 and histone H3K27me3 upon auxin treatment, followed by partial restoration during Reintroduction (4.5 hours after auxin washout) (Fig 1G and 1H). We interpreted this as early de novo recruitment of Suz12 and re-establishment of histone H3K27me3 at pre-perturbation sites. Interestingly, we noticed an increased deposition of histone H3K27me3 at nucleation sites during Reintroduction, which was not observed at dispersal sites. This effect may reflect preferential deposition of histone H3K27me3 at nucleation sites during early de novo recruitment. Consistent with this, prior work reported higher local catalytic rates of H3K27 methylation at putative nucleation sites (1) and classified them as “strong” or “weak” based on H3K27me3 deposition (14). Overall, our AID-Suz12 ON/OFF system enables time-resolved capture of early PRC2 recruitment events before domain spreading.
Genome-wide identification and classification of PRC2 de novo nucleation sites
A detailed analysis of all peaks across the three conditions (Steady-state, Auxin, and Reintroduction) revealed that auxin treatment led to a significant decrease in Suz12 binding, with almost 85% of its target sites lost (Fig 2A). We detected a modest relocalization of Suz12 to new targets in Auxin and Reintroduction conditions (372 and 448 peaks, respectively), which were absent in Steady-state (Control). These likely reflect transient or opportunistic binding under perturbed chromatin states (30,31), even in controlled culture conditions that preserve pluripotency, such as 2i medium. The new targets are mainly intergenic and intronic, with no distinctive features detected. Additionally, a small subset of 896 Suz12 peaks persisted under Auxin treatment. These peaks were largely intergenic and lacked consistent chromatin or transcriptional features. To study the erasure and restoration of Suz12 relative to the Steady-state, we excluded newly acquired peaks from further analysis.
Fig 2. Auxin-inducible degradation of Suz12 enables identification of de novo nucleation sites.
(A) Total number of Suz12 peaks detected across the three conditions. Bar plots indicate peak counts per category, including persistent peaks during Auxin, and newly acquired peaks (gray). (B) Identification of Suz12 nucleation sites based on dynamic signal changes upon recovery. Each point represents a Suz12 ChIP-seq peak plotted by its fold change upon loss (Auxin versus Steady-state) and upon recovery (Reintroduction versus Auxin). The highlighted quadrant contains the peaks that lost signal with Auxin but recovered upon washout (Reintroduction), presumably true nucleation sites. Statistically significant peaks are shown in blue (|log2FoldChange| > 1.5, adjusted P value < 0.005). (C) Peak overlap summary across datasets. (D) ChIP-seq tracks for Suz12 after auxin removal at a representative locus: the Evx2 nucleation site within the HoxD cluster. Track signals for Auxin relative to Steady-state and Reintroduction relative to Auxin (upper; positive signal in red, negative signal in blue). Nucleation and Control-exclusive coordinates (bottom). The diagonal arrow indicates the apparent spreading direction from the Evx2 nucleation site across the HoxD cluster. The gray boxes mark the nucleation peaks. (E) Heatmaps showing normalized Suz12 ChIP-seq signal intensity across the three conditions at classified nucleation and Control-exclusive peaks. Each heatmap is centered on the peak summit within a ±1.5 kb window.
To identify bona fide nucleation sites, we normalized Suz12 ChIP-seq read counts and estimated the fold change over each peak using DESeq2 (|log2FoldChange| > 1.5, adjusted P-value < 0.005). To characterize the binding dynamics of Suz12 under different conditions, we compared signal intensities at each peak between Auxin and Steady-state, and between Reintroduction and Auxin (Fig 2B). This approach allowed us to classify as nucleation sites those peaks whose signal dynamics met the following criteria: (i) a significantly decreased signal upon Auxin treatment compared to the Steady-state, and (ii) a significantly increased signal in the Reintroduction compared to the Auxin (blue quadrant, Fig 2B). Our strategy identified 1,109 Suz12 nucleation sites (corresponding to 19% of the total Steady-state peaks). For analysis purposes, we refer to the remaining peaks that do not meet the nucleation criteria as “Control-exclusive”. Applying the same classification to histone H3K27me3 ChIP-seq data identified 437 nucleation sites (Fig S2A). However, we focused on Suz12, as its binding marks the earliest step in PRC2 recruitment, preceding histone H3K27me3 deposition, and therefore offers a more direct measure of nucleation independent of downstream catalytic activity.
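The two-criterion classification described above can be sketched in a few lines of Python. This is only an illustration of the decision rule, not the pipeline used in the study: the `PeakStats` container, its field names and the toy values are hypothetical, and the per-peak fold changes and adjusted P-values are assumed to come from a DESeq2-style comparison of each pair of conditions.

```python
from dataclasses import dataclass

LFC_CUTOFF = 1.5     # |log2FoldChange| threshold stated in the text
PADJ_CUTOFF = 0.005  # adjusted P-value threshold stated in the text


@dataclass
class PeakStats:
    """Hypothetical container for per-peak differential-binding results."""
    peak_id: str
    lfc_auxin_vs_steady: float    # log2FC, Auxin relative to Steady-state
    padj_auxin_vs_steady: float
    lfc_reintro_vs_auxin: float   # log2FC, Reintroduction relative to Auxin
    padj_reintro_vs_auxin: float


def classify_peak(p: PeakStats) -> str:
    """Apply criteria (i) and (ii): significant loss, then significant regain."""
    lost = (p.lfc_auxin_vs_steady < -LFC_CUTOFF
            and p.padj_auxin_vs_steady < PADJ_CUTOFF)
    regained = (p.lfc_reintro_vs_auxin > LFC_CUTOFF
                and p.padj_reintro_vs_auxin < PADJ_CUTOFF)
    return "nucleation" if lost and regained else "Control-exclusive"


# Toy values, invented purely for illustration.
peaks = [
    PeakStats("peak_a", -3.2, 1e-6, 2.4, 1e-4),  # lost and regained
    PeakStats("peak_b", -2.8, 1e-5, 0.3, 0.6),   # lost, not regained
    PeakStats("peak_c", -1.0, 0.2, 1.9, 1e-3),   # loss not significant
]
labels = {p.peak_id: classify_peak(p) for p in peaks}
print(labels)
```

Requiring both conditions keeps peaks that merely fluctuate in one comparison out of the nucleation set, which is why only the peak with a significant loss and a significant regain is labeled as nucleation here.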
Our intensity-based classification method was both effective and efficient, as it identified most of the putative nucleation sites described in the literature, including sites at genes such as Cyp26b1, Emx1, Evx2, Lhx2, Lmx1b, and Tox2 (14), all of which fell within the first quadrant of our analysis (Fig S2B). We also compared our nucleation set with an independent published dataset that reintroduced an EED-cage mutant (Phe97 or Tyr365 substitution with alanine) unable to spread H3K27me3 domains (14). Remarkably, 908 (81%) of our de novo Suz12 peaks overlapped with the 5,128 PRC2 peaks from the EED-cage mutant ChIP-seq dataset (Fig S2C). This suggests our AID-based approach may capture a more stringent set of early, high-confidence PRC2 recruitment events.
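As an illustration of how such a peak-set overlap can be counted, the following Python sketch reports how many query intervals intersect at least one interval of a reference set. The function name, the half-open coordinate convention and the toy intervals are assumptions made for this example; they are not taken from the study, and dedicated tools would normally be used for genome-scale data.

```python
from collections import defaultdict


def count_overlapping(query, reference):
    """Count query intervals overlapping >= 1 reference interval.

    Intervals are (chrom, start, end) tuples with half-open coordinates,
    so intervals that merely touch do not count as overlapping.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        candidates = by_chrom.get(chrom, ())
        if any(start < r_end and r_start < end for r_start, r_end in candidates):
            hits += 1
    return hits


# Toy coordinates, invented for illustration only.
query_peaks = [("chr2", 100, 200), ("chr2", 500, 600), ("chr11", 50, 80)]
reference_peaks = [("chr2", 150, 300), ("chr11", 80, 120)]

n_hits = count_overlapping(query_peaks, reference_peaks)
fraction = n_hits / len(query_peaks)
print(n_hits, round(fraction, 2))
```

The linear scan per chromosome is adequate for a sketch; for thousands of peaks, sorting the reference intervals and using binary search would be the usual refinement.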
To further support this classification, we assessed the overlap of our nucleation peaks with peaks independently called in each condition. As shown in Fig 2C, our nucleation peaks overlapped strongly with those detected upon Suz12 reintroduction, confirming that our approach effectively captures early PRC2 recruitment events that are absent upon auxin treatment but reappear after washout.
We analyzed the signal dynamics of nucleation and Control-exclusive peaks at the Evx2 nucleation site, from which the signal spreads across the HoxD cluster (14) (Fig 2D). The first two tracks illustrate the positive (red) and negative (blue) changes occurring in the Auxin compared to the Steady-state, as well as the changes after Reintroduction compared to the Auxin condition. We observed a similar behavior, indicating a potential spreading direction, for Pax2 and Lbx1 (Fig S2D). Our findings confirm that the nucleation sites exhibit greater intensity changes than the Control-exclusive peaks, which are not present at our early Reintroduction time point.
We next analyzed whether nucleation sites differ in size. We found that nucleation sites were consistently broader, with a median size of 1,342 bp, whereas Control-exclusive sites had a median size of 714 bp (Fig S2E). This is an intriguing finding given the variable extent of Polycomb domains, which range from 1 kb to over 100 kb (32). To validate our classification, we separately examined Suz12 (Fig 2E) and histone H3K27me3 (Fig S2F-G) global signals at the nucleation and Control-exclusive sites. These visualizations confirmed distinct occupancy patterns that met our selection criteria. Taken together, our analysis demonstrates that the AID-Suz12 system effectively reveals early PRC2 nucleation sites, at which PRC2 overcomes histone H3K27me3 loss and relocates its targets.
Nucleation sites localize to bivalent promoters and poised chromatin states
We next asked whether our mESC PRC2 nucleation sites are preserved in differentiated cells. To address this, we re-analyzed publicly available Suz12 ChIP-seq datasets from thymocytes (33), intestinal epithelium (34), and neural progenitor cells (35). Approximately 70% of nucleation sites overlapped with the 3,512 PRC2 peaks shared across these three lineages (Fig 3A), indicating that most nucleation regions are maintained across tissues derived from the three germ layers.
Fig 3. Suz12 nucleation sites are enriched at bivalent promoters.
(A) Overlap of Suz12 ChIP-seq peaks in thymocytes (33), intestinal epithelium (34), and neural progenitors (35) with our mESC nucleation sites. (B) Genomic distribution of Suz12 peaks across annotated genomic categories. Nucleation sites show ~20% higher localization at promoters and a reduction at intergenic regions. (C) Chromatin state annotations from ChromHMM of mESCs. Nucleation sites are highly enriched at bivalent and repressed chromatin compared with Control-exclusive sites. (D) Representative chromatin landscape at the Lhx2 locus at steady-state. ChIP-seq tracks for H3K27me3, Suz12 (this study), H3K4me3, H3K9ac and H3K27ac show bivalency at nucleation sites (gray boxes). (E) Volcano plot of RNA-seq differential expression (|log2FC| > 1.5, adjusted P < 0.005) of the nearest genes to nucleation sites between Steady-state and Reintroduction conditions.
Next, we examined the genomic distribution of Suz12 nucleation sites versus Control-exclusive sites by annotating peaks for comparative analysis. In both instances, most peaks localized at promoters and intergenic regions (Fig 3B), consistent with PRC2’s known binding at gene promoters (36,37). However, nucleation sites showed a ~4-fold reduction at intergenic regions and ~20% greater localization at promoters (Fig 3B), indicating a bias for de novo recruitment near transcription start sites (TSSs). To further explore the epigenetic context of these peaks, we intersected them with mESC ChromHMM annotations (38), which classify states such as active or bivalent promoter, transcribed/elongation, strong/weak enhancer, insulator, intergenic, heterochromatin and repressed based on histone marks. Nucleation sites displayed markedly higher overlap with bivalent and repressed states (Fig 3C). Specifically, 93% of nucleation sites overlapped with bivalent chromatin (H3K4me3+H3K27me3), compared with 67% of the Control-exclusive peaks. Previous reports indicate that nearly 85% of H3K27me3-marked promoters are bivalent (39), but our results suggest that not all of them can function as nucleation sites. These enrichments were statistically significant when compared to a set of random peaks from the Control-exclusive list (Fig S3A), supporting the idea that de novo PRC2 recruitment is tightly associated with poised chromatin landscapes at promoters. Other studies propose that bivalency is better defined by the coexistence of H3K4me3, H3K9ac and H3K27me3 (40). To illustrate this, we visualized ChIP-seq tracks at the Lhx2 locus, where nucleation peaks coincide with the three marks and Suz12 occupancy (Fig 3D). We also included H3K27ac, which antagonizes Polycomb deposition.
To examine the functional processes associated with nucleation sites, we identified the gene nearest to each site and explored their protein interactions using STRING (41). We identified 627 genes near the de novo recruitment peaks, forming a highly connected network enriched for pattern specification processes and DNA-binding transcription factor activity (Fig S3B). Gene Ontology (GO) analysis of these genes revealed terms for “Cell fate commitment” (GO:0045165, FDR = 6.14E-61), “Regionalization” (GO:0003002, FDR = 7.01E-59), “Pattern specification process” (GO:0007389, FDR = 8.51E-63), and “Embryonic organ morphogenesis” (GO:0048562, FDR = 5.42E-50) (Fig S3C). Together, these data point to a role for nucleation at bivalent promoters in safeguarding stem cell identity and restricting premature differentiation.
To assess the impact of Suz12 removal and reintroduction on these bivalent promoters, we conducted a transcriptional analysis. RNA-seq comparing Steady-state and Reintroduction revealed a total of 1,967 differentially expressed genes (DEGs; |log2FoldChange| > 1.5, adjusted P < 0.005), corresponding to ~4% of the 48,440 annotated genes in the mouse genome (GRCm38 M10, Ensembl 85) (Fig 3E). This is consistent with previous reports in mESCs (42). Focusing on the genes nearest to nucleation sites, only 120 were DEGs after Reintroduction (Fig 3E). Gene ontology analysis of the nucleation-associated DEGs revealed enrichment for “Forebrain development” (GO:0030900, FDR = 4.77E-10), and “Central nervous system development” (GO:0007417, FDR = 2.5E-10) (Fig S3D). These enrichments may reflect the propensity of mESCs to initiate early neurodevelopment in the absence of PRC2. Overall, these results indicate that most nucleation sites reside at bivalent promoters and that Suz12 Reintroduction has relatively marginal transcriptional effects in mESCs.
Unmethylated CGI, Polycomb cofactors and specific DNA motifs define nucleation site identity
In Drosophila melanogaster, Polycomb Response Elements (PREs) serve as specific DNA sequences involved in Polycomb recruitment (43). While no direct mammalian equivalent has been clearly identified, CG-rich sequences at unmethylated CpG islands (CGI) and overrepresentation of GA/GCN tandem repeats have been associated with PRC2 targeting in mammals (14). To examine whether DNA methylation patterns correlate with nucleation sites, we analyzed the UCSC Genome Browser CGI list (44) together with a publicly available DNA methylation dataset in mESCs (45). We found that the number of unmethylated CGI is higher at nucleation sites than at Control-exclusive sites (Fig 4A and Fig S4A). This enrichment was statistically significant (P-value < 0.0001) when compared to the expected overlap from random sampling (Fig 4B). Moreover, nucleation sites had higher G and C nucleotide content than both Control-exclusive sites and random genomic regions (n = 10,000) (Fig 4C). These results are consistent with the established antagonism between Polycomb and DNA methylation and support a strong association between unmethylated CGI and de novo PRC2 recruitment (46).
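One way to picture an observed/expected comparison against random sampling is the Python sketch below. It is not the authors' procedure: the sampling scheme (drawing equally sized subsets of background peaks without replacement), the function name, the iteration count and the toy numbers are all assumptions made for illustration.

```python
import random


def sampling_enrichment(observed, background_flags, sample_size,
                        n_iter=1000, seed=0):
    """Observed/expected ratio and empirical P-value from random subsets.

    background_flags: one boolean per background peak, True if that peak
    overlaps the feature of interest (e.g. an unmethylated CGI).
    """
    rng = random.Random(seed)
    counts = [sum(rng.sample(background_flags, sample_size))
              for _ in range(n_iter)]
    expected = sum(counts) / n_iter
    ratio = observed / expected if expected > 0 else float("inf")
    # The +1 correction keeps the empirical P-value strictly above zero.
    n_extreme = sum(c >= observed for c in counts)
    p_emp = (n_extreme + 1) / (n_iter + 1)
    return ratio, p_emp


# Toy background: 5 of 100 peaks overlap the feature (invented numbers).
background = [True] * 5 + [False] * 95
ratio, p_emp = sampling_enrichment(observed=9,
                                   background_flags=background,
                                   sample_size=10)
print(ratio > 1, p_emp)
```

With this scheme the smallest attainable empirical P-value is 1/(n_iter + 1), so a reported bound such as P < 0.0001 would require at least 10,000 random samples.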
Fig 4. Genomic and sequence features associated with PRC2 nucleation sites.
(A) Number of unmethylated CpG islands (CGI) overlapping with Suz12 nucleation sites or a random set of Control-exclusive sites (n = 1,000). (B) Observed/expected ratio of overlap between nucleation sites and methylated or unmethylated CGI, compared to random samplings of Control-exclusive peaks. The black dashed line marks the random expected ratio. (C) Cytosine and guanine nucleotide content of random genomic regions (n = 10,000), Control-exclusive and nucleation sites. The dashed line marks the average GC content of the mouse genome. P-value < 0.0001. (D) Five selected motifs from the motif enrichment analysis of over-represented sequences that are differentially enriched at nucleation versus Control-exclusive sites. Matches-per-sequence profiles are shown. (E) Overlap of Suz12 nucleation sites with Mtf2 and Jarid2 ChIP-seq peaks in mESCs (47,48).
We then explored whether specific DNA motifs define our nucleation sites by applying the HOMER (49) and MEME (50) motif discovery tools (S1 Table). HOMER analysis for known and novel motifs revealed matches for ZNF669, E2F6, E2F3, and E2F2 motifs. In parallel, MEME identified, among the differentially enriched motifs, a GCC-rich motif that matches the zinc-finger proteins ZNF93 and ZNF610. Further enrichment analysis using Simple Enrichment Analysis (SEA) (51) confirmed overrepresentation of the DNA binding motifs for ZNF610, ZNF669, E2F3, E2F2, and ZNF93 at nucleation sites. In Fig 4D, we present the five SEA motif hits that were shared among the analyses, together with their matches-per-sequence profiles. Overall, the identification of motifs for E2F cell cycle regulators at nucleation sites suggests a potential link to Polycomb epigenetic memory after cell division, while several zinc-finger proteins previously associated with PRC1 or PRC2 could contribute to nucleation site targeting.
Next, we examined the overlap between our nucleation sites and two well-characterized Polycomb cofactors implicated in PRC2 recruitment: the Metal Response Element Binding Transcription Factor 2 (MTF2) and Jumonji and AT-Rich Interaction Domain Containing 2 (JARID2) (52–54). By reanalyzing published ChIP-seq datasets (47,48), we found that MTF2 binding was present at 95% of our mESC nucleation sites (1,057 sites), with a highly significant enrichment (Fisher’s test P-value < 2.2e–16) (Fig 4E). The odds ratio for MTF2 binding at nucleation versus Control-exclusive sites indicated a nearly 10-fold greater likelihood of MTF2 association at nucleation sites, supporting a strong association of MTF2 with de novo PRC2 recruitment. Similarly, JARID2, which has a zinc-finger domain (55), overlapped with 80% of our nucleation sites (880 sites) (Fig 4E), showing a very strong enrichment at PRC2 targets (Fisher’s test P-value reported as 0, i.e., below numerical precision). Likewise, the odds ratio for JARID2 binding at nucleation versus Control-exclusive sites indicated a 16-fold higher enrichment at nucleation regions. Remarkably, 79% of nucleation sites were co-occupied by both MTF2 and JARID2, consistent with a model in which MTF2 initiates nucleation while JARID2 supports domain spreading.
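For readers less familiar with these statistics, the Python sketch below shows how an odds ratio and a one-sided Fisher's exact P-value are derived from a 2×2 table of peak counts (cofactor-bound versus unbound, nucleation versus Control-exclusive). The function names and the toy counts are hypothetical; the Control-exclusive counts behind the values reported here are not given in the text, so none are assumed.

```python
from fractions import Fraction
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]]."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P-value: P(X >= a) under the
    hypergeometric null with all table margins fixed."""
    row1, col1, total = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    tail = sum(
        Fraction(comb(col1, k) * comb(total - col1, row1 - k),
                 comb(total, row1))
        for k in range(a, k_max + 1)
    )
    return float(tail)


# Toy table (invented counts):
#                      cofactor-bound   unbound
# nucleation                  8             2
# Control-exclusive           1             5
or_toy = odds_ratio(8, 2, 1, 5)
p_toy = fisher_exact_greater(8, 2, 1, 5)
print(or_toy, round(p_toy, 4))
```

The exact tail probability is always positive, but for very skewed tables with thousands of peaks it can fall below the smallest value a floating-point number can represent, in which case software prints 0.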
Although MTF2 lacks a strict consensus motif, it preferentially binds GCG-rich, low-methylated DNA, which widens the minor groove (20,52). This relaxed helical shape of DNA is reminiscent of the DNA-RNA hybrids in R-loop structures (56), which can also stabilize PRC2-Ezh2 and PRC1-Ring1B on chromatin (57). Thus, we sought to identify overlap between MTF2 and R-loops in mESCs by using DRIP-seq datasets (58). We found that only 11% of our nucleation sites overlap with both MTF2 and R-loops (Fig S4B), with a modest, non-significant enrichment (Fisher’s test, P-value > 0.05; odds ratio of 1.75). Moreover, when comparing G-quadruplex density (a hallmark of R-loop formation), we found no statistical differences between nucleation sites and random regions (Fig S4C). These results suggest that while MTF2 recruitment to R-loops can occur, it is not a defining feature of PRC2 de novo recruitment. Thus, these data support a model in which PRC2 nucleation sites correlate with unmethylated CGIs, are enriched for E2F/ZNF motifs and are co-occupied by MTF2/JARID2, while R-loops are not broadly required.
Nucleation sites correlate with Polycomb-loop anchors for long-range chromatin interactions
In addition to their role in gene repression, the Polycomb proteins contribute to 3D nuclear organization by forming long-range chromatin loops (59). This prompted us to investigate whether our putative de novo recruitment sites are associated with such spatial interactions. We first compared the genomic distances between our Suz12 nucleation and Control-exclusive sites. Nucleation sites displayed a bimodal distribution: a proximal group with an interpeak distance of <10 kb, and a distal group spanning nearly 8 Mb (Fig 5A and S5A). The proximal group includes 122 cases that span multiple nucleation sites within a gene (e.g., Barx1, Nkx2–2, and Foxa2). Overall, the median interpeak distance for Control-exclusive sites was ~100 kb, whereas nucleation peaks were spaced ~1 Mb apart, with some exceeding 10 Mb (Fig 5A). The 10-fold increase in spacing at nucleation sites suggests that they may be involved in longer-range chromatin interactions. To explore this possibility, we analyzed histone H3K27me3 HiChIP data from mESCs (59) at 10 kb resolution to capture both short and long interactions. We identified 4,254 H3K27me3-mediated interactions, of which 260 were anchored by one nucleation site, with an average distance of 0.33 Mb, and 102 were anchored by two nucleation sites, with an average distance of 0.86 Mb (Fig 5B). Thus, the average nucleation anchor-mediated interaction does not exceed 1 Mb. Among these, the longest loop spanned 7.58 Mb, connecting nucleation sites at Barx1 and Neurog1. All remaining interactions, with non-nucleation anchors, have an average interaction distance of 0.18 Mb. According to a one-way ANOVA test, the interactions anchored by two nucleation sites are significantly longer (P value < 0.0001) than those mediated by one-nucleation or non-nucleation anchors (Fig 5C).
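The grouping of loops by the number of nucleation-containing anchors can be sketched as follows in Python. The data layout (one tuple per loop with two anchor bins), the use of anchor midpoints to measure loop span, and all coordinates are assumptions for illustration and do not reproduce the study's HiChIP processing.

```python
from statistics import mean


def anchor_has_site(chrom, start, end, sites):
    """True if any site (chrom, start, end) overlaps the anchor bin."""
    return any(c == chrom and s < end and start < e for c, s, e in sites)


def summarize_loops(loops, sites):
    """Group loops by number of nucleation anchors (0, 1 or 2) and
    return {n_anchors: (loop_count, mean_span_bp or None)}."""
    spans = {0: [], 1: [], 2: []}
    for chrom, a_start, a_end, b_start, b_end in loops:
        n = (anchor_has_site(chrom, a_start, a_end, sites)
             + anchor_has_site(chrom, b_start, b_end, sites))
        # Loop span measured between anchor-bin midpoints.
        span = abs((b_start + b_end) // 2 - (a_start + a_end) // 2)
        spans[n].append(span)
    return {n: (len(v), mean(v) if v else None) for n, v in spans.items()}


# Toy 10 kb anchor bins and nucleation sites (invented coordinates).
sites = [("chr2", 1_005_000, 1_006_000),
         ("chr2", 1_900_500, 1_901_800),
         ("chr2", 3_000_000, 3_001_000)]
loops = [
    ("chr2", 1_000_000, 1_010_000, 1_900_000, 1_910_000),  # two anchors
    ("chr2", 3_000_000, 3_010_000, 3_300_000, 3_310_000),  # one anchor
    ("chr2", 5_000_000, 5_010_000, 5_200_000, 5_210_000),  # no anchor
]
summary = summarize_loops(loops, sites)
print(summary)
```

The per-group span lists collected this way are the kind of input a one-way ANOVA across the three anchor classes would take.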
Fig 5. Nucleation sites act as anchor points for H3K27me3-mediated chromatin loops.
(A) Interpeak distances of nucleation and Control-exclusive Suz12 peaks. Nucleation sites exhibit significantly longer distances. (B) Global interaction counts involving one or two nucleation sites as loop anchors based on histone H3K27me3 HiChIP data (59). (C) Loop distances by number of nucleation anchors. One-way ANOVA test, P-value < 0.0001. (D) Overlap between nucleation sites and H3K27me3 loop anchors. (E) HiChIP contact matrix (10 kb resolution) showing H3K27me3-associated interactions for the Insm1-Nkx2-Foxa2 region in mESCs. ChIP-seq tracks for histone H3K27me3 (blue), nucleation sites (red) and their H3K27me3-mediated loops shown as a virtual 4C assay. Loops identified on the HiChIP matrix are shown as black circles. (F) Aggregate Peak Analysis (APA) of the H3K27me3 HiChIP comparing the interaction strength for loops without nucleation anchors, with one anchor, or two nucleation anchors at the Chr2 (n = 102 sites). (G) One-way ANOVA test of loop scores. P value of < 0.05. (H) Circos plot of Chr2 with intrachromosomal H3K27me3 interactions (red loops) mediated by at least one nucleation site (yellow) as anchor point. Interactions occur primarily in the parts of the genome that have histone H3K27me3 enrichment (middle ring).
Furthermore, we found that approximately 25% of our Suz12 nucleation sites overlap with these histone H3K27me3 HiChIP-seq peaks (Fig 5D), suggesting a coupling of de novo PRC2 recruitment to higher-order genome architecture. To investigate the spatial features of these loops in more detail, we built contact maps around anchor nucleation sites, integrating the HiChIP contact matrix, nucleation coordinates, histone H3K27me3 enrichment peaks, and a virtual 4C representation of chromatin interactions. These contact maps revealed overlap between histone H3K27me3 peaks, nucleation sites, and loop anchor points. For example, consistent with a previous report (14), nucleation sites near Evx2 formed long-range interactions within the vicinity of the HoxD locus and with other distant nucleation sites (Fig S5B). When centered on Pax1 as the viewpoint, significant interactions were observed with the Foxa2 (0.67 Mb) and Nkx2–2 (0.17 Mb) loci (Fig 5E), two of the strongest interactions, as indicated by the thickness of the loop arcs. We further noticed that H3K27me3 signal intensity at steady state is higher at nucleation-anchored loops than at loops without nucleation anchors (Fig S5C). Therefore, we infer that nucleation-anchored loops act as hubs that concentrate PRC2/H3K27me3, facilitating mark spreading between linked loci.
To quantify interaction strength, we performed Aggregate Peak Analysis (APA), which measures the strength and significance of chromatin loops. When comparing the same number of H3K27me3 loops genome-wide (n = 102), we noticed a higher APA score for loops with nucleation sites as anchors than for loops formed without nucleation sites (Fig 5F). In particular, the loops with two nucleation anchors are the most robust, exhibiting a concentrated signal with defined contrast and a focal pattern. However, a one-way ANOVA test did not reach statistical significance (Fig 5G), although nucleation-anchored loops showed a consistent trend toward higher scores. Hence, the average intensity of chromosomal interactions tends to be higher when two nucleation anchors mediate the loops. The robustness of these loops suggests a stable chromosomal organization that likely plays a crucial role in gene regulation and the maintenance of repressive chromatin domains. The chromosomes (Chr) with the most nucleation sites acting as looping anchors are Chr1 and Chr2. This finding is consistent with previous evidence highlighting the organizational role of Polycomb in nuclear architecture, particularly on chromosomes with large regulatory gene clusters, such as Chr2 and Chr17 (60,61). To visualize this, we generated a circos plot of Chr2 intrachromosomal interactions mediated by at least one nucleation site (Fig 5H). We identified 54 H3K27me3 loops on Chr2, which supports the concept of sub-megabase Polycomb interacting neighborhoods. In sum, nucleation sites preferentially anchor and strengthen Polycomb loops, linking de novo PRC2 recruitment to higher-order genome architecture.
Discussion
In this work we identified early nucleation sites of PRC2 in mouse embryonic stem cells (mESCs) using an inducible and reversible degradation system for Suz12, a core component essential for assembling the complex and its recruitment (1). This approach enabled us to track early stages of de novo Polycomb recruitment genome-wide, revealing over 1000 high-confidence putative nucleation sites. Our findings suggest that de novo PRC2 recruitment is linked to a combination of chromatin states, DNA features, cofactor recruitment, and spatial genome organization (Fig. 6).
Fig. 6. Graphical abstract of features associated with PRC2 recruitment.
Nucleation sites act as magnets for de novo PRC2 recruitment. Potential recruiting partners such as MTF2, JARID2, zinc-finger proteins (ZFPs), and E2F cell cycle regulators are hallmarks of these nucleation sites. Their attractiveness correlates with the presence of unmethylated CGI at gene promoters, characterized by bivalent chromatin. The histone H3K27me3 disperses from nucleation sites while they can interact with one another, strengthening long-distance loop interactions.
We found that after depletion and reintroduction of Suz12, only modest transcriptional changes occurred, supporting the view that de novo recruitment in mESCs primarily establishes a poised chromatin state rather than actively repressing transcription. This functional poising likely preserves developmental potential, allowing rapid gene activation (or silencing) upon differentiation, and is consistent with previous findings that PRC2 loss does not compromise self-renewal but may impair differentiation (62,63). Remarkably, nearly 90% of the nucleation sites are located at bivalent promoters and are approximately twice as broad as Control-exclusive peaks, likely reflecting condensation of Polycomb complexes during early recruitment (64,65). Developmental regulatory genes are frequently marked by this bivalent state, where H3K27me3 and H3K4me3 coexist. Such genes are typically expressed at low levels in mESCs and can undergo rapid activation or repression upon differentiation. Interestingly, ~70% of these sites are shared across tissues from all three germ layers, suggesting a universal role in developmental gene regulation, whereas the remaining ~30% of sites appear stem-cell-specific and may resolve during lineage commitment.
Shortly after Suz12 reintroduction, nucleation sites show strong Suz12 and histone H3K27me3 gains, supporting the notion that these sites act as binding hubs. We found that nucleation events highly correlate with unmethylated CGI and exhibit higher local CG content, consistent with the antagonism between DNA methylation and Polycomb occupancy. This epigenetic context appears to enhance PRC2 accessibility and stabilization, in part via interaction with accessory partners such as MTF2 and JARID2, as the MTF2 and JARID2 ChIP-seq datasets showed 95% and 80% overlap, respectively, with our nucleation sites. JARID2 facilitates PRC2 recruitment through recognition of PRC1-mediated H2AK119ub1, supporting localized PRC2.2 de novo recruitment that relies on the CGG and GA sequence motifs (54). Although JARID2 can restore Suz12 binding in the absence of MTF2 (14), it does so with slower kinetics. Conversely, most evidence strengthens the idea that MTF2 ensures PRC2 recruitment at key Polycomb genes, whereas JARID2 is essential for dispersing new H3K27me3 domains (13). Thus, these chromatin-binding proteins likely play essential roles in domain nucleation, increasing the likelihood of PRC2 binding at de novo recruitment sites compared to other PRC2 targets. Nevertheless, additional interactions with other protein partners or RNA molecules may contribute to nucleation, such as lncRNAs or chromatin-associated RNAs that have been demonstrated to play an important role in Polycomb targeting (66–70).
At the sequence level, nucleation sites are enriched for GCC/GC-rich motifs. While Polycomb Response Elements (PREs) can specify targeting in invertebrates (43,71), in mammals, the only genomic signature associated with Polycomb recruitment is the enrichment of GC-rich sequences at unmethylated CGI promoters, often accompanied by GA or GCN tandem repeats (14,72,73). Within this CG-rich context, our motif analysis resolved two major classes: E2F cell cycle regulators and zinc-finger proteins (e.g., ZNF93, ZNF610 and ZNF669), suggesting sequence-encoded docking sites for PRC2 cofactors. In particular, E2F2/3 are activators of G1-S genes and are critical for cell cycle re-entry following arrest (74,75). Their enrichment at nucleation sites may reflect a mechanism to orchestrate Polycomb epigenetic memory after cell division. While evidence for E2F-mediated recruitment of PRC2 is limited, E2F6 can act as an accessory protein of PRC1 complexes in quiescence and germline gene silencing (75–77), and it has also been associated with PRC2 during silencing of meiotic genes (76). However, only a small fraction of Suz12 co-localizes with E2F6 (77). Regarding the zinc-finger proteins, ZFP277 interacts with PRC1 through the PCGF4 subunit at the Ink4a/Arf locus (78). Also, POGZ associates with PRC1 during neuronal differentiation (79), while ZFP217 and ZFP516 indirectly interact with PRC2 through the transcriptional corepressor Ctbp2 during mESC differentiation (80). These examples are consistent with the notion that zinc-finger proteins could participate in PRC2 recruitment (81).
Together, these observations support a model in which clustered GC/GCC motifs and specific TFs (E2Fs, ZNFs) recruit Polycomb cofactors to a subset of CGI promoters, facilitating PRC2 nucleation and H3K27me3 spreading; future experiments editing these motifs at endogenous loci, or reconstituting them ectopically, will define their involvement in PRC2 nucleation and their roles in maintaining the epigenetic landscape.
Beyond the DNA sequences and chromatin context, we observed that PRC2 nucleation sites correlate with a spatial network of long-range chromatin loops ranging from 1 Mb to 10 Mb, which are significantly longer than the average H3K27me3 loop distance in Control-exclusive conditions (~100 kb). These long-range interactions among nucleation sites are more robust when both anchor points are nucleation sites. However, only about 25% of nucleation sites overlapped with H3K27me3 HiChIP-detected loop anchors. This observation suggests that only a subset of nucleation sites form long-range contacts that may seed spreading of H3K27me3 in cis and in trans.
Finally, our findings must be interpreted considering models that emphasize the role of PRC1 in initiating Polycomb recruitment and long-range interactions (8). In particular, RING1B-mediated monoubiquitylation of H2AK119 has been shown to promote PRC2 binding (54,82). Future studies should also explore how PRC1 architecture and activity intersect with PRC2 nucleation and memory.
Concluding remarks
Our data indicate that early PRC2 nucleation events in mESCs are concentrated at broad, GC-rich bivalent promoters, frequently co-occupied by MTF2 and JARID2, and embedded within a subset of H3K27me3-linked long-range interactions. These features (chromatin state, sequence composition, cofactor occupancy, and 3D architecture) provide a coherent framework for how PRC2 re-establishes its occupancy after domain erasure, with limited transcriptional impact at early time points and with many sites conserved across germ-layer lineages. The predominance of nucleation at bivalent promoters highlights how this chromatin state serves as a privileged entry point for Polycomb targeting, balancing transcriptional flexibility with repressive stability. While these associations are compelling, they are correlative. Definitive tests will require motif disruption and transplant experiments at endogenous loci, timed depletion/restoration of MTF2 and JARID2, and dissection of PRC1 contributions. Mapping the order and dependency of these events should clarify how nucleation sites act as hubs to initiate local deposition and promote spreading within repressive neighborhoods. By focusing on the onset of de novo recruitment rather than steady-state profiles, this work outlines sequence- and chromatin-encoded entry points that can be functionally interrogated to explain how Polycomb domains, and the landscape they embody, operate to acquire and maintain a stable epigenetic memory.
Methods
Cell culture and auxin treatment
J1 mouse embryonic stem cells (mESCs) were cultured under standard conditions on 0.1% gelatin-coated plates in ES medium: DMEM High Glucose supplemented with 15% FBS (Biowest L0101–500), 0.1 mM 2-mercaptoethanol, 2 mM L-glutamine, 0.1 mM MEM non-essential amino acids (NEAA), 1% nucleoside mix, 50 U/mL Penicillin/Streptomycin (P/S), leukemia inhibitory factor (LIF, made in house) and 2i inhibitors: 1 μM PD0325901 and 3 μM CHIR99021. Cells were passaged every other day using 0.05% Trypsin/EDTA. HEK293T cells were used for lentiviral production in DMEM supplemented with 10% FBS, 2 mM L-glutamine, 0.5 mM sodium pyruvate, and 50 U/mL P/S. All cells were grown at 37°C in a humidified incubator with 5% CO2. Auxin (indole-3-acetic acid, IAA; Sigma, catalog no. I5148) was dissolved in water to a 200 mM stock solution and stored at −20°C. For AID-mediated degradation, auxin was titrated and used at a final concentration of 50 μM in culture media.
Cloning
To generate the CRISPR donor plasmid, four fragments (5’ coding and 3’ UTR homology arms, mAID-mClover-NeoR cassette, and pBluescript backbone) were amplified with Phusion High-Fidelity DNA polymerase (ThermoScientific) and assembled by Gibson assembly with the In-Fusion kit (Takara). Homology arms were amplified from mESC genomic DNA. The mAID-mClover-NeoR sequence was subcloned from pMK289, from Masato T. Kanemaki’s laboratory (83), and the backbone from pBluescript SK (−). To prevent recleavage by Cas9, we disrupted the PAM sequence within the donor template. The gRNA was selected using Benchling (https://benchling.com/faq) and cloned by Gibson assembly into pSpCas9(BB)-2A-Puro (PX459) V2.0 (Addgene 62988). The pRAIDRS-P7 NLS-mOrange-AID from Ran Brosh (26) was modified by Gibson assembly to exclude the AID sequence. Primer sequences are in the S1 Table. All plasmid constructs were propagated in One Shot Stbl3™ competent cells (Invitrogen) and NEB® Stable Competent E. coli (New England Biolabs).
Suz12 Auxin-Inducible Degradation knock in
mESCs were co-transfected with the gRNA plasmid and the donor plasmid (2:1) using Polyethylenimine (PEI 1:3) and selected with neomycin (1.8 μg/mL). mClover-positive cells were sorted as GFP+ using a BD FACSMelody Cell Sorter. Single mESC colonies with typical morphology were manually picked for genotyping by PCR using primers surrounding the target site, along with Sanger sequencing verification of in-frame insertion. Clones with correct insertion were validated by western blot and transduced with lentiviral particles to integrate the OsTIR1 receptor; lentiviral particles were generated by co-transfecting the pRAIDRS-P7 NLS-mOrange-notAID along with psPAX2 and pMD2.G using PEI, and particles were concentrated using Amicon Ultra-15 (Millipore). Double-positive GFP/Orange cells were sorted and the integration of OsTIR1 was verified by PCR genotyping. After auxin treatment, cells were fixed with ice-cold 4% paraformaldehyde and the GFP median fluorescence was measured by FACS.
Western blot
Cells were lysed in RIPA buffer (1% NP-40, 0.5% deoxycholate, 0.1% SDS, 150 mM NaCl, 50 mM Tris-HCl pH 8, with Protease Inhibitor Cocktail, Roche) and incubated on ice for 30 minutes with occasional vortexing. Lysates were centrifuged at 14,000 × g for 15 minutes at 4°C and protein content was determined in the supernatant. For histone extraction, 1 × 10⁷ cells were resuspended in Triton Extraction Buffer (PBS, 0.5% Triton X-100, 2 mM PMSF, 0.02% sodium azide) followed by a 10-minute centrifugation at 400 × g at 4°C, and the pellet was resuspended in 0.2 N HCl (4 × 10⁷ cells/mL). Histones were acid extracted overnight at 4°C. Protein content was measured with DC Protein Assay Reagents (BioRad, 5000116). 20–40 μg of total histones were resolved by SDS–PAGE and transferred to PVDF (0.45 μm, Immobilon-P) or nitrocellulose (0.2 μm, Amersham) membranes. Membranes were blocked with TBST (10 mM Tris-HCl pH 7.9, 150 mM NaCl and 0.05% Tween-20) with 5% skim milk, incubated overnight with primary antibodies and washed three times with TBST. HRP-conjugated secondary antibodies were applied, and blots were developed with Immobilon HRP substrate and imaged using a C-DiGit Blot Scanner (LI-COR) with minor adjustments. Antibodies: rabbit anti-GFP (Abcam, ab290, 1:2000), rabbit anti-SUZ12 (Thermo Fisher, 39357, 1:1500), rabbit anti-H3K27me3 (Millipore, 07449, 1:1000), rabbit anti-H3 (Abcam, ab1791, 1:10,000), mouse anti-Actin (Abcam, ab197277, 1:10,000), mouse anti-GAPDH (Sigma, G8795, 1:5000), goat anti-rabbit (Santa Cruz, sc2030, 1:10,000) and goat anti-mouse (Santa Cruz, sc2031, 1:10,000).
Chromatin immunoprecipitation (ChIP)
The protocol was performed as described by Lee et al. (84,85) with modifications: cells were dissociated with 0.05% trypsin, fixed with 1% formaldehyde solution for 10 min at RT, and quenched with fresh 125 mM glycine for 5 min at RT. After PBS washes, nuclei were isolated using sequential lysis buffers: B1 (50 mM HEPES-KOH, pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% Glycerol, 0.5% NP-40, 0.25% Triton X-100, 1x protease inhibitors; 10 min at 4°C), B2 (10 mM Tris-HCl, pH 8, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, and 1x protease inhibitors; 10 min at 4°C), and B3 (10 mM Tris-HCl, pH 8 at 4°C, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% Na-deoxycholate, 0.5% N-Lauroylsarcosine sodium salt, and 1x protease inhibitors). Chromatin was fragmented to an average size of 250–500 bp using a Diagenode Bioruptor and Triton X-100 was added to a final concentration of 1%. ChIP was performed using 1 μg of antibody per 2 million cells, incubated with rotation overnight at 4°C. Antibodies: rabbit anti-GFP (Abcam, ab290), rabbit anti-H3K27me3 (Cell Signaling, 9733). Dynabeads (10 μL per μg Ab) were pre-washed in PBS + 0.5% BSA and rotated for 2 hours at 4°C before being added to the IPs. Beads were then washed five times with ChIP RIPA buffer at 4°C (50 mM HEPES-KOH, pH 7.5 at 4°C, 500 mM LiCl, 1 mM EDTA, 1% NP-40, and 0.7% Na-deoxycholate) and once with TE + 50 mM NaCl. DNA was eluted in freshly prepared elution buffer (50 mM Tris, pH 8, 10 mM EDTA, 1% SDS) at 65°C for 20 min and then de-crosslinked overnight at 65°C. Samples were treated with RNase A and Proteinase K and DNA was purified by phenol:chloroform:isoamyl alcohol (P:C:IA). Libraries were prepared using the NEBNext Ultra II DNA Library Prep Kit (NEB), quantified by Qubit dsDNA HS Assay, and quality-checked with an Agilent Bioanalyzer HS DNA chip. Libraries were sequenced as 150 bp paired-end reads on the Illumina HiSeq platform.
RNA extraction and sequencing
Total RNA was purified in triplicate from mESCs with TRIzol (Life Technologies), and RNA integrity (RIN) was corroborated using an Agilent Bioanalyzer. We followed the manufacturer’s TruSeq RNA Sample Prep v2 (Illumina, USA) instructions for transcriptome sequencing. The mRNA was purified using poly-T oligo-attached magnetic beads. After purification, the mRNA was fragmented and converted into first-strand cDNA using reverse transcriptase and random primers. The second-strand cDNA was then synthesized using DNA polymerase. Actinomycin D was added to prevent DNA-dependent synthesis and improve strand specificity. The cDNA fragments underwent a single ‘A’ base addition and subsequent adapter ligation. After purification and PCR enrichment, the final cDNA library was suitable for subsequent cluster generation and DNA sequencing. We sequenced the libraries on a NextSeq 500 platform (Illumina) in 75 bp paired-end read format.
Preprocessing and mapping of next-generation sequencing data
All raw FASTQ files generated in this study or retrieved from public repositories were processed using a uniform pipeline. Adapter sequences and low-quality reads were trimmed with Trim Galore v0.6.10 with default parameters (86). Files were verified with FastQC v0.12.1 (87). Trimmed reads were aligned to the mm10 mouse reference genome using Bowtie2 v2.5.3; BAM files were then obtained with samtools v1.20 (samtools view -bS) and sorted and indexed with samtools sort and samtools index with default parameters (88,89).
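The order of the preprocessing steps above can be sketched as a small Python helper that only assembles the command lines without running them. This is a hedged illustration rather than our actual scripts (those are in the repository listed under Code and data availability). The function name `build_commands`, the file names, the output layout, and the assumption of paired-end input with Trim Galore's `_val_1`/`_val_2` output naming are all illustrative.

```python
from pathlib import Path


def build_commands(r1, r2, index_prefix, out_dir, sample):
    """Return command lines (as argument lists) for one paired-end sample.

    Nothing is executed here; each list could be passed to subprocess.run.
    """
    out = Path(out_dir)
    # Assumption: Trim Galore names paired-end outputs
    # <input name without .fastq.gz>_val_1.fq.gz and ..._val_2.fq.gz.
    trimmed1 = out / (Path(r1).name.replace(".fastq.gz", "") + "_val_1.fq.gz")
    trimmed2 = out / (Path(r2).name.replace(".fastq.gz", "") + "_val_2.fq.gz")
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.bam"
    sorted_bam = out / f"{sample}.sorted.bam"
    return [
        ["trim_galore", "--paired", "-o", str(out), r1, r2],
        ["fastqc", "-o", str(out), str(trimmed1), str(trimmed2)],
        ["bowtie2", "-x", index_prefix, "-1", str(trimmed1),
         "-2", str(trimmed2), "-S", str(sam)],
        ["samtools", "view", "-bS", "-o", str(bam), str(sam)],
        ["samtools", "sort", "-o", str(sorted_bam), str(bam)],
        ["samtools", "index", str(sorted_bam)],
    ]


# Hypothetical sample and index names, for illustration only.
commands = build_commands("s1_R1.fastq.gz", "s1_R2.fastq.gz", "mm10", "out", "s1")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as argument lists avoids shell quoting issues if they are later executed with `subprocess.run`.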
ChIP-Seq data analysis
Our ChIP-Seq data and public dataset BAM files were processed using a standardized pipeline. Peak calling was performed per replicate using MACS2 v2.2.7.1 (callpeak with default parameters) (90). The peaks obtained from our auxin-induction experiment were merged into non-redundant union sets for each immunoprecipitation antibody with BEDTools v2.30.0 (bedtools merge with default parameters) (91). We then used the unified peak sets to count reads overlapping each peak using FeatureCounts v2.0.6 (92). These counts were used to build a matrix suitable for DESeq2 v1.38.0 within an R v4.2.3 environment; example scripts with the conditions and parameters used are publicly available (https://github.com/cperalta22/suz12_nucleation_mesc) (93). Plots for downstream visualization were generated with ggplot2 v3.4.2. BED files obtained from peak calling of our samples and of re-analyzed public datasets, as well as the ChromHMM annotation, were compared and intersected using the command intervene venn with the parameter --save-overlaps from the package Intervene v0.6.5 (94). Fisher’s exact tests were performed with BEDTools v2.30.0 using bedtools fisher with default parameters. Peak annotation and visualization were conducted using ChIPseeker (Galaxy v1.28.3+galaxy0) (95). To create visualization files and plots such as BigWig tracks and heatmaps, we calculated scaling factors for all samples in each immunoprecipitation condition with deepTools v3.5.0 (multiBamSummary with --scalingFactors) and passed these factors to bamCoverage (--scaleFactor -v --extendReads --binSize 5). We created new BigWig files for every plot that required a different set of BigWig files as input. Signal heatmaps were obtained with deepTools computeMatrix and plotHeatmap.
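The construction of a non-redundant union peak set can be illustrated with a short Python sketch that mimics the default behavior of bedtools merge, in which overlapping or book-ended intervals on the same chromosome are combined. It is a simplified re-implementation for illustration, not the tool itself; the function name `merge_intervals` and the toy replicate peaks are hypothetical.

```python
def merge_intervals(peaks):
    """Merge overlapping or book-ended intervals per chromosome.

    peaks: iterable of (chrom, start, end) tuples in BED-style coordinates.
    Returns a sorted list of merged (chrom, start, end) tuples.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Extend the current union interval.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(interval) for interval in merged]


# Toy peak calls from two replicates (hypothetical coordinates).
replicate1 = [("chr1", 100, 200), ("chr1", 500, 600)]
replicate2 = [("chr1", 150, 250), ("chr1", 600, 700), ("chr2", 10, 20)]

union_peaks = merge_intervals(replicate1 + replicate2)
print(union_peaks)
```

Counting reads over one shared union set, rather than over per-replicate peaks, gives every sample the same rows in the count matrix used for differential analysis.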
Comparative BigWig tracks were created with bigwigCompare from the deepTools suite, taking as input the normalized and scaled merged BigWig files for each experimental condition evaluated. Gene ontology and protein interaction analyses were performed with STRING v12.0 (41). Motif analysis was performed with the MEME Suite (v5.5.7) identification tools (96), using Motif Analysis of Large Nucleotide Datasets (MEME-ChIP) (50) and Simple Enrichment Analysis (SEA v5.5.7) (51), both with default parameters. HOMER Motif Discovery and Analysis (v5.1) (49) was run with default parameters.
Differential gene expression analysis
With sorted BAM files as input, we estimated raw read abundance over the Ensembl gene annotation of the mm10 mouse genome using FeatureCounts v2.0.6. We then estimated gene expression differences with DESeq2 v1.38.0 running under R v4.2.3. Scripts with the conditions and parameters used are available at: https://github.com/cperalta22/suz12_nucleation_mesc. Plots for downstream analysis were made using ggplot2 v3.4.2.
Code and data availability
Raw sequencing data generated in this study are available via the Gene Expression Omnibus under accession numbers GSE305054 (ChIP-seq) and GSE305055 (RNA-seq). In addition, we used publicly available datasets under the following accession numbers: thymocytes GSM1498452 (33), intestinal epithelium GSM3020554 (34), neural progenitors GSM878558 (35), H3K4me3 GSM6261533 (97), H3K9ac GSM8107956 (98), H3K27ac GSE280487 (99), Mtf2 GSM6585902 (47), Jarid2 GSM6585904 (48), R-loops GSM1720620 (58), HiChIP-seq GSE150907 (59), and DNA methylation GSE266926 (45). Scripts with the code and full parameter list of the commands mentioned in this Methods section are available in the following repository: https://github.com/cperalta22/suz12_nucleation_mesc. In addition, we forked the original repository (https://github.com/guifengwei/ChromHMM_mESC_mm10) containing the BED files for the ChromHMM annotation of mESCs; the fork is available at: https://github.com/cperalta22/ChromHMM_mESC_mm10.
Random distribution analysis
A set of random peaks was selected from a list of background (Control-exclusive) peaks using a Perl script (v5.34) with the random function. The number of random peaks selected was equal to the number of nucleation sites. The number of random peaks overlapping different chromatin states was assessed with intersectBed from BEDTools (v2.31.1) using the -wa option. The enrichment of G-quadruplexes within random peaks and nucleation sites was calculated using computeMatrix in scale-regions mode from deepTools (v3.5.6). The process was repeated 1,000 times with a bash script to obtain a random distribution. The resulting random distributions were compared to the distribution of nucleation sites and plotted in R (v4.5.0) with ggplot2 (v3.5.2). The coordinates for chromatin states were obtained from https://github.com/guifengwei/ChromHMM_mESC_mm10 and the enrichment for G-quadruplexes from GSM5259790 (100).
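The resampling logic described above can be sketched in Python as follows. This is a simplified stand-in for the Perl, bash and BEDTools steps, not the scripts used in this study. The function names, the seed, and the toy coordinates are hypothetical. The add-one empirical P value is an assumption added for illustration, since the comparison described above is between distributions.

```python
import random


def overlaps_any(peak, features):
    """True if the peak overlaps at least one feature on the same chromosome."""
    chrom, start, end = peak
    return any(c == chrom and s < end and start < e for c, s, e in features)


def count_overlaps(peaks, features):
    """Number of peaks that overlap any feature."""
    return sum(overlaps_any(p, features) for p in peaks)


def null_distribution(background, features, n_sites, n_iter=1000, seed=0):
    """Overlap counts for n_iter random draws of n_sites background peaks."""
    rng = random.Random(seed)
    return [
        count_overlaps(rng.sample(background, n_sites), features)
        for _ in range(n_iter)
    ]


def empirical_p(observed, null):
    """Add-one fraction of random draws reaching the observed count (assumption)."""
    return (1 + sum(x >= observed for x in null)) / (1 + len(null))


# Toy data: 20 "nucleation" peaks all overlap a chromatin state,
# whereas only 10 of 200 background peaks do.
nucleation = [("chr2", i * 1000, i * 1000 + 200) for i in range(20)]
background = [("chr1", i * 1000, i * 1000 + 200) for i in range(200)]
state = [("chr2", i * 1000 + 50, i * 1000 + 100) for i in range(20)]
state += [("chr1", i * 1000 + 50, i * 1000 + 100) for i in range(10)]

observed = count_overlaps(nucleation, state)
null = null_distribution(background, state, n_sites=len(nucleation))
print(observed, max(null), empirical_p(observed, null))
```

Drawing the same number of background peaks as nucleation sites in every iteration keeps the observed and random counts directly comparable.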
CpG methylation and CG percentage
A table with single-base-pair methylation levels derived from control mESCs was downloaded from GSE266926 (45). Only sites with >50% methylation were considered methylated. A BED file of CpG islands (CGI) was downloaded from the UCSC Genome Browser table (mm10 version). The number of methylated CpGs at each CGI was calculated by overlapping the CpG BED file with the mESC methylation file using intersectBed with the -c option. Islands with ≥20% methylated CpGs were considered methylated, and islands with <20% methylated CpGs were considered unmethylated. Methylated and unmethylated CGI were overlapped with nucleation sites or with a random set of peaks selected from Control-exclusive peaks (1,100 peaks). The randomization was repeated 1,000 times. Observed/Expected values were obtained by dividing the values for nucleation sites by those for random peaks. For CG percentage, the FASTA sequences of nucleation sites, Control-exclusive peaks, and a set of 10,000 random peaks (obtained with bedtools random) of the same size as nucleation sites were obtained. CG percentage was calculated with an in-house Perl script. All results were plotted in R (v4.5.0) with ggplot2 (v3.5.2).
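The thresholds described above can be expressed as a minimal Python sketch. It is not the in-house Perl script: the function names are hypothetical, the denominator for the 20% cutoff is assumed to be the CpGs with methylation data in each island, and the expected value is assumed to be the mean over the random sets.

```python
def is_methylated_cpg(level_percent):
    """A CpG counts as methylated when its methylation level is >50%."""
    return level_percent > 50


def classify_cgi(n_methylated, n_cpgs, threshold=0.20):
    """Label a CpG island from its methylated and total CpG counts."""
    if n_cpgs == 0:
        return "no_data"
    fraction = n_methylated / n_cpgs
    return "methylated" if fraction >= threshold else "unmethylated"


def gc_percent(sequence):
    """Percentage of G and C among unambiguous bases (N is ignored)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / acgt


def observed_over_expected(observed, random_counts):
    """Observed count divided by the mean count over random sets (assumption)."""
    return observed / (sum(random_counts) / len(random_counts))


# Toy island with per-CpG methylation levels in percent (hypothetical values).
levels = [80, 60, 10, 0, 5, 0, 0, 0, 0, 0]
n_meth = sum(is_methylated_cpg(x) for x in levels)
print(classify_cgi(n_meth, len(levels)))
print(gc_percent("GGCCAATT"))
print(observed_over_expected(30, [10, 20, 30]))
```

In this toy island, two of ten CpGs exceed 50% methylation, which places it exactly at the 20% cutoff and therefore in the methylated class.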
HiChIP analysis
Histone H3K27me3 HiChIP data from mouse embryonic stem cells (mESCs) were obtained from (59). The .hic file was converted into a multi-resolution .mcool file using the hic2cool utility (https://github.com/4dn-dcic/hic2cool). Chromatin loop detection was performed at 10 kb resolution using the pyHICCUPS script from the hicpeaks package (101) with default parameters. The resulting BEDPE file was intersected with a BED file containing nucleation site coordinates. Aggregate Peak Analysis (APA) was carried out using the apa-analysis script from hicpeaks (101). Visualization of HiChIP contact matrices at selected genomic regions was performed using the tadlib package (102,103). A circos plot showing H3K27me3 loops on chromosome 2 was generated in Python using pyCirclize (https://moshi4.github.io/pyCirclize/; retrieved July 1, 2025) (104).
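The intersection of loop anchors with nucleation sites can be illustrated with the following Python sketch, which groups loops by the number of anchors (0, 1 or 2) that overlap a nucleation site. It is a simplified illustration and not the code used here. The function names and toy coordinates are hypothetical, and loop distance is assumed to be the distance between anchor midpoints.

```python
from statistics import mean


def anchor_hits_site(chrom, start, end, sites):
    """True if a loop anchor overlaps at least one nucleation site."""
    return any(c == chrom and s < end and start < e for c, s, e in sites)


def classify_loops(loops, sites):
    """Group loop distances by the number of nucleation-site anchors.

    loops: BEDPE-like tuples (chrom1, start1, end1, chrom2, start2, end2).
    Returns {0: [...], 1: [...], 2: [...]} with midpoint distances in bp.
    """
    groups = {0: [], 1: [], 2: []}
    for c1, s1, e1, c2, s2, e2 in loops:
        n_anchors = (anchor_hits_site(c1, s1, e1, sites)
                     + anchor_hits_site(c2, s2, e2, sites))
        # Assumption: intrachromosomal distance between anchor midpoints.
        distance = abs((s2 + e2) // 2 - (s1 + e1) // 2)
        groups[n_anchors].append(distance)
    return groups


# Toy nucleation sites and 10 kb loop anchors (hypothetical coordinates).
sites = [("chr2", 10_500, 11_000), ("chr2", 510_200, 510_900)]
loops = [
    ("chr2", 10_000, 20_000, "chr2", 510_000, 520_000),
    ("chr2", 10_000, 20_000, "chr2", 200_000, 210_000),
    ("chr2", 300_000, 310_000, "chr2", 400_000, 410_000),
]

groups = classify_loops(loops, sites)
for n_anchors, distances in groups.items():
    print(n_anchors, len(distances), mean(distances) if distances else None)
```

The per-group distance lists produced this way are the kind of input that a test such as a one-way ANOVA would compare across the three anchor classes.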
Acknowledgments
Itzel Alejandra Hernández-Romero conducted this study to fulfill the requirements of the Programa de Doctorado en Ciencias Bioquímicas of Universidad Nacional Autónoma de Mexico (UNAM), and received a doctoral scholarship from Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI, #CVU 886138). We thank Beatriz Aguirre López for technical assistance with in-house rLIF production and José Fernando Becerra-Vélez for technical advice. At IFC, we thank the Taller, Biblioteca, Bioterio and Unidades de Biología Molecular e Imagenologia: Laura Ongay-Larios, Guadalupe Codiz, Minerva Mora and Ruth Rincón. We also thank Ran Brosh for donating the pRAIDRS system plasmid (26), and Masato T. Kanemaki for donating the pMK289 (mAID-mClover-NeoR) plasmid (83).
Funding:
This work was funded by grants from PAPIIT-UNAM (IN203820 and IN217824) and CONACYT Ciencia Básica (0284867) to VJV. Furlan Lab is supported by SECIHTI 303068 and PAPIIT IN210323 grants. Research in the Wang laboratory is supported by the NIH (R21HD116446, HD114122 and R01CA285299). IAHR was supported by a SECIHTI scholarship (777482). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Footnotes
Competing Interests: The authors have declared that no competing interests exist.
References
- 1.Laugesen A, Højfeldt JW, Helin K. Molecular Mechanisms Directing PRC2 Recruitment and H3K27 Methylation. Mol Cell. 2019. Apr 4;74(1):8–18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Khan AA, Lee AJ, Roh TY. Polycomb group protein-mediated histone modifications during cell differentiation. Epigenomics. 2015;7(1):75–84. [DOI] [PubMed] [Google Scholar]
- 3.Lee TI, Jenner RG, Boyer LA, Guenther MG, Levine SS, Kumar RM, et al. Control of developmental regulators by Polycomb in human embryonic stem cells. Cell. 2006. Apr 21;125(2):301–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Boyer LA, Plath K, Zeitlinger J, Brambrink T, Medeiros LA, Lee TI, et al. Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature. 2006. May 18;441(7091):349–53. [DOI] [PubMed] [Google Scholar]
- 5.Liu Y, Shao Z, Yuan GC. Prediction of Polycomb target genes in mouse embryonic stem cells. Genomics. 2010. Jul;96(1):17–26. [DOI] [PubMed] [Google Scholar]
- 6.Obier N, Lin Q, Cauchy P, Hornich V, Zenke M, Becker M, et al. Polycomb protein EED is required for silencing of pluripotency genes upon ESC differentiation. Stem Cell Rev Rep. 2015. Feb;11(1):50–61. [DOI] [PubMed] [Google Scholar]
- 7.Kloet SL, Makowski MM, Baymaz HI, van Voorthuijsen L, Karemaker ID, Santanach A, et al. The dynamic interactome and genomic targets of Polycomb complexes during stem-cell differentiation. Nat Struct Mol Biol. 2016. Jul;23(7):682–90. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Loubiere V, Martinez AM, Cavalli G. Cell Fate and Developmental Regulation Dynamics by Polycomb Proteins and 3D Genome Architecture. Bioessays. 2019. Mar;41(3):e1800222. [DOI] [PubMed] [Google Scholar]
- 9.Margueron R, Justin N, Ohno K, Sharpe ML, Son J, Drury WJ, et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature. 2009. Oct 8;461(7265):762–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Alabert C, Groth A. Chromatin replication and epigenome maintenance. Nat Rev Mol Cell Biol. 2012. Feb 23;13(3):153–67. [DOI] [PubMed] [Google Scholar]
- 11.Blackledge NP, Farcas AM, Kondo T, King HW, McGouran JF, Hanssen LLP, et al. Variant PRC1 complex-dependent H2A ubiquitylation drives PRC2 recruitment and polycomb domain formation. Cell. 2014. Jun 5;157(6):1445–59.
- 12.Bauer M, Trupke J, Ringrose L. The quest for mammalian Polycomb response elements: are we there yet? Chromosoma. 2016. Jun 1;125(3):471–96.
- 13.Hernández-Romero IA, Valdes VJ. De Novo Polycomb Recruitment and Repressive Domain Formation. Epigenomes. 2022. Aug 22;6(3):25.
- 14.Oksuz O, Narendra V, Lee CH, Descostes N, LeRoy G, Raviram R, et al. Capturing the Onset of PRC2-Mediated Repressive Domain Formation. Mol Cell. 2018. Jun 21;70(6):1149–1162.e5.
- 15.Højfeldt JW, Laugesen A, Willumsen BM, Damhofer H, Hedehus L, Tvardovskiy A, et al. Accurate H3K27 methylation can be established de novo by SUZ12-directed PRC2. Nat Struct Mol Biol. 2018. Mar;25(3):225–32.
- 16.Lavarone E, Barbieri CM, Pasini D. Dissecting the role of H3K27 acetylation and methylation in PRC2 mediated control of cellular identity. Nat Commun. 2019. Apr 11;10(1):1679.
- 17.Tavares L, Dimitrova E, Oxley D, Webster J, Poot R, Demmers J, et al. RYBP-PRC1 complexes mediate H2A ubiquitylation at polycomb target sites independently of PRC2 and H3K27me3. Cell. 2012. Feb 17;148(4):664–78.
- 18.Lynch MD, Smith AJH, De Gobbi M, Flenley M, Hughes JR, Vernimmen D, et al. An interspecies analysis reveals a key role for unmethylated CpG dinucleotides in vertebrate Polycomb complex recruitment. The EMBO Journal. 2012. Jan 18;31(2):317–29.
- 19.Pasini D, Bracken AP, Hansen JB, Capillo M, Helin K. The polycomb group protein Suz12 is required for embryonic stem cell differentiation. Mol Cell Biol. 2007. May;27(10):3769–79.
- 20.Loh CH, van Genesen S, Perino M, Bark MR, Veenstra GJC. Loss of PRC2 subunits primes lineage choice during exit of pluripotency. Nat Commun. 2021. Nov 30;12(1):6985.
- 21.Chen S, Jiao L, Shubbar M, Yang X, Liu X. Unique Structural Platforms of Suz12 Dictate Distinct Classes of PRC2 for Chromatin Binding. Mol Cell. 2018. Mar 1;69(5):840–852.e5.
- 22.Kasinath V, Poepsel S, Nogales E. Recent Structural Insights into Polycomb Repressive Complex 2 Regulation and Substrate Binding. Biochemistry. 2019. Feb 5;58(5):346–54.
- 23.Cao R, Zhang Y. SUZ12 is required for both the histone methyltransferase activity and the silencing function of the EED-EZH2 complex. Mol Cell. 2004. Jul 2;15(1):57–67.
- 24.Sim YJ, Kim MS, Nayfeh A, Yun YJ, Kim SJ, Park KT, et al. 2i Maintains a Naive Ground State in ESCs through Two Distinct Epigenetic Mechanisms. Stem Cell Reports. 2017. May 9;8(5):1312–28.
- 25.Boroviak T, Loos R, Bertone P, Smith A, Nichols J. The ability of inner-cell-mass cells to self-renew as embryonic stem cells is acquired following epiblast specification. Nat Cell Biol. 2014. Jun;16(6):516–28.
- 26.Brosh R, Hrynyk I, Shen J, Waghray A, Zheng N, Lemischka IR. A dual molecular analogue tuner for dissecting protein function in mammalian cells. Nat Commun. 2016. May 27;7(1):11742.
- 27.Mulas C, Kalkan T, von Meyenn F, Leitch HG, Nichols J, Smith A. Defined conditions for propagation and manipulation of mouse embryonic stem cells. Development. 2019. Mar 26;146(6):dev173146.
- 28.Jadhav U, Manieri E, Nalapareddy K, Madha S, Chakrabarti S, Wucherpfennig K, et al. Replicational Dilution of H3K27me3 in Mammalian Cells and the Role of Poised Promoters. Mol Cell. 2020. Apr 2;78(1):141–151.e5.
- 29.Aoto T, Saitoh N, Sakamoto Y, Watanabe S, Nakao M. Polycomb group protein-associated chromatin is reproduced in post-mitotic G1 phase and is required for S phase progression. J Biol Chem. 2008. Jul 4;283(27):18905–15.
- 30.Huseyin MK, Klose RJ. Live-cell single particle tracking of PRC1 reveals a highly dynamic system with low target site occupancy. Nat Commun. 2021. Feb 9;12:887.
- 31.Klose RJ, Cooper S, Farcas AM, Blackledge NP, Brockdorff N. Chromatin Sampling—An Emerging Perspective on Targeting Polycomb Repressor Proteins. PLOS Genetics. 2013. Aug 22;9(8):e1003717.
- 32.Veronezi GMB, Ramachandran S. Nucleation and spreading maintain Polycomb domains every cell cycle. Cell Reports. 2024. Apr;43(4):114090.
- 33.A O, A A, K P, B J, S LG, S C, et al. Ikaros mediates gene silencing in T cells through Polycomb repressive complex 2. Nature communications [Internet]. 2015. Sep 11 [cited 2025 Jul 26];6. Available from: https://pubmed.ncbi.nlm.nih.gov/26549758/
- 34.U J, A C, Kk B, H X, Nk O, V SV, et al. Extensive Recovery of Embryonic Enhancer and Gene Memory Stored in Hypomethylated Enhancer DNA. Molecular cell [Internet]. 2019. Feb 5 [cited 2025 Jul 26];74(3). Available from: https://pubmed.ncbi.nlm.nih.gov/30905509/
- 35.Mb J, Pp W, Kd A, Ea M, Rn D, Jl H, et al. Single-cell analysis reveals transcriptional heterogeneity of neural progenitors in human cortex. Nature neuroscience [Internet]. 2015. May [cited 2025 Jul 26];18(5). Available from: https://pubmed.ncbi.nlm.nih.gov/25734491/
- 36.Szczurek AT, Dimitrova E, Kelley JR, Blackledge NP, Klose RJ. The Polycomb system sustains promoters in a deep OFF state by limiting pre-initiation complex formation to counteract transcription. Nat Cell Biol. 2024. Sep 11;1–12.
- 37.Illingworth RS, Gruenewald-Schneider U, Webb S, Kerr ARW, James KD, Turner DJ, et al. Orphan CpG islands identify numerous conserved promoters in the mammalian genome. PLoS Genet. 2010. Sep 23;6(9):e1001134.
- 38.Wei G. guifengwei/ChromHMM_mESC_mm10 [Internet]. 2025. [cited 2025 Jun 30]. Available from: https://github.com/guifengwei/ChromHMM_mESC_mm10
- 39.Mantsoki A, Devailly G, Joshi A. CpG island erosion, polycomb occupancy and sequence motif enrichment at bivalent promoters in mammalian embryonic stem cells. Sci Rep. 2015. Nov 19;5:16791.
- 40.Trovato M, Bunina D, Yildiz U, Fernandez-Novel Marx N, Uckelmann M, Levina V, et al. Histone H3.3 lysine 9 and 27 control repressive chromatin at cryptic enhancers and bivalent promoters. Nat Commun. 2024. Aug 30;15(1):7557.
- 41.STRING: functional protein association networks [Internet]. [cited 2024 Sep 19]. Available from: https://string-db.org/
- 42.Riising EM, Comet I, Leblanc B, Wu X, Johansen JV, Helin K. Gene Silencing Triggers Polycomb Repressive Complex 2 Recruitment to CpG Islands Genome Wide. Molecular Cell. 2014. Aug;55(3):347–60.
- 43.Müller J, Kassis JA. Polycomb response elements and targeting of Polycomb group proteins in Drosophila. Curr Opin Genet Dev. 2006. Oct;16(5):476–84.
- 44.UCSC Genome Browser Home [Internet]. [cited 2025 Aug 5]. Available from: https://genome.ucsc.edu/
- 45.Elder E, Lemieux A, Legault LM, Caron M, Bertrand-Lehouillier V, Dupas T, et al. Rescuing DNMT1 fails to fully reverse the molecular and functional repercussions of its loss in mouse embryonic stem cells. Nucleic Acids Res. 2025. Feb 8;53(4):gkaf130.
- 46.Li Y, Zheng H, Wang Q, Zhou C, Wei L, Liu X, et al. Genome-wide analyses reveal a role of Polycomb in promoting hypomethylation of DNA methylation valleys. Genome Biology. 2018. Feb 8;19(1):18.
- 47.M N, F M, Cy I, Wl S. Tissue-Specific Tumour Suppressor and Oncogenic Activities of the Polycomb-like Protein MTF2. Genes [Internet]. 2023. Sep 27 [cited 2025 Jul 26];14(10). Available from: https://pubmed.ncbi.nlm.nih.gov/37895228/
- 48.D L, S S, R P, M D, L M, Hf J, et al. Jarid2 is a PRC2 component in embryonic stem cells required for multi-lineage differentiation and recruitment of PRC1 and RNA Polymerase II to developmental regulators. Nature cell biology [Internet]. 2010. Jun [cited 2025 Jul 26];12(6). Available from: https://pubmed.ncbi.nlm.nih.gov/20473294/
- 49.Homer Software and Data Download [Internet]. [cited 2024 Dec 20]. Available from: http://homer.ucsd.edu/homer/motif/
- 50.Machanick P, Bailey TL. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics. 2011. Jun 15;27(12):1696–7.
- 51.Bailey TL, Grant CE. SEA: Simple Enrichment Analysis of motifs [Internet]. bioRxiv; 2021. [cited 2025 Jun 30]. p. 2021.08.23.457422. Available from: https://www.biorxiv.org/content/10.1101/2021.08.23.457422v1
- 52.Perino M, van Mierlo G, Loh C, Wardle SMT, Zijlmans DW, Marks H, et al. Two Functional Axes of Feedback-Enforced PRC2 Recruitment in Mouse Embryonic Stem Cells. Stem Cell Reports. 2020. Dec 8;15(6):1287–300.
- 53.Kasinath V, Faini M, Poepsel S, Reif D, Feng XA, Stjepanovic G, et al. Structures of human PRC2 with its cofactors AEBP2 and JARID2. Science. 2018. Feb 23;359(6378):940–4.
- 54.Cooper S, Grijzenhout A, Underwood E, Ancelin K, Zhang T, Nesterova TB, et al. Jarid2 binds mono-ubiquitylated H2A lysine 119 to mediate crosstalk between Polycomb complexes PRC1 and PRC2. Nat Commun. 2016. Nov 28;7:13661.
- 55.Landeira D, Fisher AG. Inactive yet indispensable: the tale of Jarid2. Trends in Cell Biology. 2011. Feb 1;21(2):74–80.
- 56.R S, S S, H Sr, M M, Cj B, F C. Interplay between DNA sequence and negative superhelicity drives R-loop structures. Proceedings of the National Academy of Sciences of the United States of America [Internet]. 2019. Mar 26 [cited 2025 Jul 26];116(13). Available from: https://pubmed.ncbi.nlm.nih.gov/30850542/
- 57.Davidovich C, Cech TR. The recruitment of chromatin modifiers by long noncoding RNAs: lessons from PRC2. RNA. 2015. Dec;21(12):2007–22.
- 58.Skourti-Stathaki K, Triglia ET, Warburton M, Voigt P, Bird A, Pombo A. R-Loops Enhance Polycomb Repression at a Subset of Developmental Regulator Genes. Molecular Cell. 2019. Mar 7;73(5):930–945.e4.
- 59.Kraft K, Yost KE, Murphy SE, Magg A, Long Y, Corces MR, et al. Polycomb-mediated genome architecture enables long-range spreading of H3K27 methylation. Proc Natl Acad Sci U S A. 2022. May 31;119(22):e2201883119.
- 60.Vieux-Rochas M, Fabre PJ, Leleu M, Duboule D, Noordermeer D. Clustering of mammalian Hox genes with other H3K27me3 targets within an active nuclear domain. Proc Natl Acad Sci U S A. 2015. Apr 14;112(15):4672–7.
- 61.Pauler FM, Sloane MA, Huang R, Regha K, Koerner MV, Tamir I, et al. H3K27me3 forms BLOCs over silent genes and intergenic regions and specifies a histone banding pattern on a mouse autosomal chromosome. Genome Res. 2009. Feb;19(2):221–33.
- 62.Miller SA, Damle M, Kim J, Kingston RE. Full methylation of H3K27 by PRC2 is dispensable for initial embryoid body formation but required to maintain differentiated cell identity. Development. 2021. Apr 1;148(7):dev196329.
- 63.E W, Wy C, J H, G C, K G, J T, et al. Polycomb-like 2 associates with PRC2 and regulates transcriptional networks during mouse embryonic stem cell self-renewal and differentiation. Cell stem cell [Internet]. 2010. May 2 [cited 2025 Jul 26];6(2). Available from: https://pubmed.ncbi.nlm.nih.gov/20144788/
- 64.Guo Y, Wang GG. Modulation of the high-order chromatin structure by Polycomb complexes. Front Cell Dev Biol [Internet]. 2022. Oct 5 [cited 2025 Jun 19];10. Available from: https://www.frontiersin.org/journals/cell-and-developmental-biology/articles/10.3389/fcell.2022.1021658/full
- 65.Eeftens JM, Kapoor M, Michieletto D, Brangwynne CP. Polycomb condensates can promote epigenetic marks but are not required for sustained chromatin compaction. Nat Commun. 2021. Oct 7;12(1):5888.
- 66.Wang X, Goodrich KJ, Gooding AR, Naeem H, Archer S, Paucek RD, et al. Targeting of Polycomb Repressive Complex 2 to RNA by Short Repeats of Consecutive Guanines. Molecular Cell. 2017. Mar 16;65(6):1056–1067.e5.
- 67.Rosenberg M, Blum R, Kesner B, Aeby E, Garant JM, Szanto A, et al. Motif-driven interactions between RNA and PRC2 are rheostats that regulate transcription elongation. Nat Struct Mol Biol. 2021. Jan;28(1):103–17.
- 68.Song J, Yao L, Gooding AR, Thron V, Kasinath V, Cech TR. Diverse RNA Structures Induce PRC2 Dimerization and Inhibit Histone Methyltransferase Activity [Internet]. bioRxiv; 2024. [cited 2025 Jun 19]. p. 2024.08.29.610323. Available from: https://www.biorxiv.org/content/10.1101/2024.08.29.610323v1
- 69.Long Y, Bolanos B, Gong L, Liu W, Goodrich KJ, Yang X, et al. Conserved RNA-binding specificity of polycomb repressive complex 2 is achieved by dispersed amino acid patches in EZH2. Shilatifard A, editor. eLife. 2017. Nov 29;6:e31558.
- 70.R A, M R, V L, Jt L. An evolving landscape of PRC2-RNA interactions in chromatin regulation. Nature reviews Molecular cell biology [Internet]. 2025. Aug [cited 2025 Jul 26];26(8). Available from: https://pubmed.ncbi.nlm.nih.gov/40307460/
- 71.Alhaj Abed J, Ghotbi E, Ye P, Frolov A, Benes J, Jones RS. De novo recruitment of Polycomb-group proteins in Drosophila embryos. Development. 2018. Nov 27;145(23):dev165027.
- 72.Mendenhall EM, Koche RP, Truong T, Zhou VW, Issac B, Chi AS, et al. GC-Rich Sequence Elements Recruit PRC2 in Mammalian ES Cells. PLoS Genet. 2010. Dec 9;6(12):e1001244.
- 73.Wachter E, Quante T, Merusi C, Arczewska A, Stewart F, Webb S, et al. Synthetic CpG islands reveal DNA sequence determinants of chromatin structure. Ferguson-Smith AC, editor. eLife. 2014. Sep 26;3:e03397.
- 74.Kent LN, Leone G. The broken cycle: E2F dysfunction in cancer. Nat Rev Cancer. 2019. Jun;19(6):326–38.
- 75.Bertoli C, Skotheim JM, de Bruin RAM. Control of cell cycle transcription during G1 and S phases. Nat Rev Mol Cell Biol. 2013. Aug;14(8):518–28.
- 76.Leseva M, Santostefano KE, Rosenbluth AL, Hamazaki T, Terada N. E2f6-mediated repression of the meiotic Stag3 and Smc1β genes during early embryonic development requires Ezh2 and not the de novo methyltransferase Dnmt3b. Epigenetics. 2013. Aug;8(8):873–84.
- 77.Shirahama Y, Yamamoto K. The E2F6 Transcription Factor is Associated with the Mammalian SUZ12-Containing Polycomb Complex. Kurume Med J. 2023. Feb 6;67(4):171–83.
- 78.Negishi M, Saraya A, Mochizuki S, Helin K, Koseki H, Iwama A. A Novel Zinc Finger Protein Zfp277 Mediates Transcriptional Repression of the Ink4a/Arf Locus through Polycomb Repressive Complex 1. PLOS ONE. 2010. Aug 24;5(8):e12373.
- 79.Chavez J, Wolf T, Geng Z, Tai YT, Bright K, Stafford J, et al. The zinc-finger protein POGZ associates with Polycomb repressive complex 1 to regulate bone morphogenetic protein signaling during neuronal differentiation [Internet]. bioRxiv; 2025. [cited 2025 Aug 6]. p. 2025.01.07.631780. Available from: https://www.biorxiv.org/content/10.1101/2025.01.07.631780v1
- 80.Kwak S, Kim TW, Kang BH, Kim JH, Lee JS, Lee HT, et al. Zinc finger proteins orchestrate active gene silencing during embryonic stem cell differentiation. Nucleic Acids Res. 2018. Jul 27;46(13):6592–607.
- 81.Sun A, Li F, Liu Z, Jiang Y, Zhang J, Wu J, et al. Structural and biochemical insights into human zinc finger protein AEBP2 reveals interactions with RBBP4. Protein Cell. 2018. Aug;9(8):738–42.
- 82.Tamburri S, Lavarone E, Fernández-Pérez D, Conway E, Zanotti M, Manganaro D, et al. Histone H2AK119 Mono-Ubiquitination Is Essential for Polycomb-Mediated Transcriptional Repression. Molecular Cell. 2020. Feb 20;77(4):840–856.e5.
- 83.Natsume T, Kiyomitsu T, Saga Y, Kanemaki MT. Rapid Protein Depletion in Human Cells by Auxin-Inducible Degron Tagging with Short Homology Donors. Cell Rep. 2016. Apr 5;15(1):210–8.
- 84.Lee TI, Johnstone SE, Young RA. Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat Protoc. 2006. Aug;1(2):729–48.
- 85.Bar C, Valdes VJ, Ezhkova E. Chromatin Immunoprecipitation of Low Number of FACS-Purified Epidermal Cells. In: Botchkareva NV, Westgate GE, editors. Molecular Dermatology: Methods and Protocols [Internet]. New York, NY: Springer US; 2020. [cited 2025 Jul 29]. p. 197–215. Available from: 10.1007/978-1-0716-0648-3_17
- 86.Krueger F, James F, Ewels P, Afyounian E, Weinstein M, Schuster-Boeckler B, et al. FelixKrueger/TrimGalore: v0.6.10 - add default decompression path [Internet]. Zenodo; 2023. [cited 2025 Jun 30]. Available from: https://zenodo.org/records/7598955
- 87.Babraham Bioinformatics - FastQC A Quality Control tool for High Throughput Sequence Data [Internet]. [cited 2025 Jun 30]. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- 88.Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012. Mar 4;9(4):357–9.
- 89.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009. Aug 15;25(16):2078–9.
- 90.Gaspar JM. Improved peak-calling with MACS2 [Internet]. bioRxiv; 2018. [cited 2025 Jun 30]. p. 496521. Available from: https://www.biorxiv.org/content/10.1101/496521v1
- 91.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010. Mar 15;26(6):841–2.
- 92.Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014. Apr 1;30(7):923–30.
- 93.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
- 94.Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017. May 31;18(1):287.
- 95.Yu G, Wang LG, He QY. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015. Jul 1;31(14):2382–3.
- 96.MEME-ChIP - Submission form [Internet]. [cited 2024 Oct 6]. Available from: https://meme-suite.org/meme/tools/meme-chip
- 97.Mu M, Li X, Dong L, Wang J, Cai Q, Hu Y, et al. METTL14 regulates chromatin bivalent domains in mouse embryonic stem cells. Cell Rep. 2023. Jun 27;42(6):112650.
- 98.Li J, Xi Y, Li W, McCarthy R, Stratton S, Zou W, et al. TRIM28 interacts with EZH2 and SWI/SNF to activate genes that promote mammosphere formation. Oncogene. 2017. May 25;36(21):2991–3001.
- 99.Paldi F, Szalay MF, Stefano MD, Jost D, Reboul H, Cavalli G. Transient histone deacetylase inhibition induces cellular memory of gene expression and three-dimensional genome folding [Internet]. bioRxiv; 2024. [cited 2025 Aug 29]. p. 2024.11.21.624660. Available from: https://www.biorxiv.org/content/10.1101/2024.11.21.624660v1
- 100.Tsai RX, Fang KC, Yang PC, Hsieh YH, Chiang IT, Chen Y, et al. TERRA regulates DNA G-quadruplex formation and ATRX recruitment to chromatin. Nucleic Acids Res. 2022. Nov 28;50(21):12217–34.
- 101.Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014. Dec 18;159(7):1665–80.
- 102.Wang XT, Cui W, Peng C. HiTAD: detecting the structural and functional hierarchies of topologically associating domains from chromatin interactions. Nucleic Acids Res. 2017. Nov 2;45(19):e163.
- 103.Wang XT, Dong PF, Zhang HY, Peng C. Structural heterogeneity and functional diversity of topologically associating domains in mammalian genomes. Nucleic Acids Res. 2015. Sep 3;43(15):7237–46.
- 104.pyCirclize [Internet]. [cited 2025 Jul 1]. Available from: https://moshi4.github.io/pyCirclize/
Data Availability Statement
Raw sequencing data generated in this study are available from the Gene Expression Omnibus under accession numbers GSE305054 (ChIP-seq) and GSE305055 (RNA-seq). In addition, we used publicly available datasets under the following accession numbers: thymocytes GSM1498452 (33), intestinal epithelium GSM3020554 (34), neural progenitors GSM878558 (35), H3K4me3 GSM6261533 (97), H3K9ac GSM8107956 (98), H3K27ac GSE280487 (99), Mtf2 GSM6585902 (47), Jarid2 GSM6585904 (48), R-loops GSM1720620 (58), HiChIP-seq GSE150907 (59), and DNA methylation GSE266926 (45). Scripts containing the code and full parameter lists for the commands described in the Methods section are available in the following repository: https://github.com/cperalta22/suz12_nucleation_mesc. We also forked the original repository (https://github.com/guifengwei/ChromHMM_mESC_mm10) containing the BED files for the ChromHMM annotation of mESCs; the fork is available at https://github.com/cperalta22/ChromHMM_mESC_mm10.






