Summary
CTCF is a critical regulator of genome architecture and gene expression that binds thousands of sites on chromatin. CTCF genomic localization is controlled by recognition of a DNA sequence motif and regulated by DNA modifications. However, CTCF does not bind to all its potential sites in all cell types, raising the question of whether underlying chromatin structure can regulate CTCF occupancy. Here we report that R-loops facilitate CTCF binding through formation of associated G-quadruplex (G4) structures. R-loops and G4s co-localize with CTCF at many genomic regions in mouse embryonic stem cells and promote CTCF binding to its cognate DNA motif in vitro. R-loop attenuation reduces CTCF binding in vivo. Deletion of a specific G4-forming motif in a gene reduces CTCF binding and alters gene expression. Conversely, chemical stabilization of G4s results in CTCF gains and accompanying alterations in chromatin organization, suggesting a pivotal role for G4 structures in reinforcing long-range genome interactions through CTCF.
Graphical Abstract
eTOC
CTCF is a highly conserved protein that is essential for genome organization. While CTCF recognizes specific DNA sequences, it does not bind to all its potential sites. Wulfridge et al. report that CTCF-bound sites are enriched for R-loops and G-quadruplex chromatin structures. These structures facilitate CTCF binding to promote looping interactions, which contributes to proper gene expression.
Introduction
Transcription factors localize to the genome by recognizing specific double-stranded DNA (dsDNA) sequences. Fundamental biological processes such as transcription and replication can disrupt dsDNA, alter the binding of regulatory protein complexes, and impact gene expression and genome organization. R-loops are three-stranded chromatin structures that contain a DNA:RNA hybrid and a displaced single-strand DNA (ssDNA)1. R-loops are predominantly formed during replication or co-transcriptionally but can also occur when regulatory RNAs interact with specific genomic regions2–5. R-loop formation is promoted by G-rich sequences in the non-template DNA strand, which can fold into G-quadruplex (G4) structures that stabilize R-loops6. G4 forming regions are closely associated with transcription factor hubs7,8. While R-loop accumulation is closely linked to DNA damage4,5,9,10, it is becoming increasingly clear that both R-loops and their associated G4s have important roles in gene regulation3,8.
The three-dimensional organization of the eukaryotic genome is an important contributor to proper gene expression11–14. Mammalian genomes fold into A/B compartments15,16, topologically associating domains13,17–20 (TADs), subTADs21, and loops16. TADs are self-interacting chromatin domains demarcated by boundaries which restrict long-range looping interactions between genomic elements to fine-tune and optimize gene expression levels22. The architectural protein CCCTC binding factor (CTCF) is a highly conserved zinc finger protein14 essential for the formation of many TADs, subTADs, and loops21,23. CTCF binding is enriched at a subset of TAD and subTAD boundaries and functions by blocking cohesin-based loop extrusion24–26. Together, these data highlight the central role for CTCF in folding of the mammalian genome.
Given the integral role for CTCF in genome organization, it is important to ascertain how its occupancy is regulated. While CTCF is well-established to recognize a specific dsDNA motif that dictates its genomic binding27–29, it is also influenced by chromatin modifications30–33 and underlying chromatin structures7. Recently, R-loops were reported as potential modulators of CTCF binding34,35. Proteomic identification of R-loop interactors revealed an enrichment for CTCF36,37 and a positive association of CTCF and R-loops was also observed at transcriptional termination sites34. Some long non-coding RNAs (lncRNAs) localize to chromatin by interacting with complementary DNA sequences to form R-loops38,39. The HOTTIP lncRNA, which is upregulated in acute myeloid leukemia (AML), localizes to some genomic sites through R-loops and contributes to aberrant genome organization and gene expression in AML35. HOTTIP lncRNA localizes to the TAD boundary of the β-catenin encoding gene, CTNNB1, through R-loops to facilitate CTCF interactions. Stabilization of this TAD is associated with increased β-catenin expression, which drives leukemic transcription programs in AML. Decreases in R-loop levels at these boundary sites are associated with reduced CTCF enrichment and decreased β-catenin expression. This intriguing observation raises the question of whether R-loops play a broader role in regulating CTCF occupancy genome-wide.
CTCF targeting to some sites is regulated through its RNA interactions40–44. R-loops contain an RNA component and can also contain G4s. Two contrasting reports implicate G4 structures in CTCF regulation, with one study indicating that CTCF can bind G4 DNAs with similar affinity as its dsDNA consensus motif45 and the second showing a lack of interaction between CTCF and G4s in vitro7. Nevertheless, the frequent co-occurrence and proximity of G4 structures and CTCF motifs on the same strand at loop boundaries46 suggests a potential collaboration between the two. In this study, we evaluate the genomic contribution of R-loops to CTCF localization and establish a molecular basis for enhanced CTCF interactions through R-loops and their associated G4s.
Results
CTCF-bound regions are highly enriched for R-loops and G-quadruplexes.
To examine the genome-wide co-occurrence of CTCF and R-loops, we performed CTCF CUT&RUN47 in mouse embryonic stem cells (mESCs) and used MapR48, a sensitive and high-resolution RNase H-based method that identifies the genome-wide locations of R-loops. CTCF CUT&RUN identified 61,541 mESC CTCF peaks that were shared across two replicates. We performed clustering of MapR signal over CTCF peaks to classify them into two groups. This clustering identified a “high R-loop” group of 17,189 CTCF peaks that have both strong CTCF and strong R-loop signals, and a “low R-loop” group of 44,352 peaks with little to no R-loop signal and slightly reduced CTCF signal overall (Figures 1A–1C, Figures S1A and S1B). As R-loops are closely associated with G4 structures6, we examined the presence of putative G4 motifs49 in the DNA sequence of high and low R-loop CTCF peaks. 4,918 of 17,189 high R-loop peaks (28.6%) contained a putative G4 motif, compared to 4,464 of 44,352 (10.1%) low R-loop peaks (Figure S1C). Analysis of previously published G4 CUT&TAG data from mESCs50 across our two CTCF peak groups showed that G4 CUT&TAG signal was markedly stronger in the high R-loop group (Figure 1A and 1B, Figures S1A and S1B). Intersection of G4 CUT&TAG peaks with high and low R-loop CTCF sites showed that 5,438 of 17,189 high R-loop peaks (31.6%) overlapped a G4 CUT&TAG peak, compared to only 2,487 of 44,352 (5.6%) low R-loop peaks (Figure S1D). Genomic annotation of CTCF groups revealed that high R-loop CTCF sites occupy a higher proportion of promoter regions (Figure 1D, top). Notably, although only 30% of all CTCF sites are high R-loop peaks, 49% of promoter CTCF sites are high R-loop peaks, representing a 1.63-fold enrichment (p = 8.74 × 10^−544, hypergeometric test). In contrast, the majority of low R-loop CTCF peaks were in intronic and intergenic regions (Figure 1D, bottom).
Furthermore, high R-loop CTCF sites were more likely to occupy promoters of actively transcribed genes: 4,045 of the 8,630 high R-loop CTCF sites at promoters (46.9%) were at genes expressed in mESCs, compared to only 2,440 of 9,130 (26.7%) promoter-associated low R-loop CTCF sites. Consistent with this enrichment at promoters and active transcription, RNA polymerase II occupancy is higher at high R-loop CTCF sites (Figure S1E). Thus, our analyses reveal a subset of high R-loop containing CTCF bound regions that are associated with active promoters.
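The promoter enrichment statistic quoted above can be illustrated with a short sketch. A p-value on the order of 10^−544 is smaller than the smallest positive double-precision number, so a hypergeometric tail of that size has to be evaluated in log space. The code below is a minimal illustration under that assumption and is not the analysis code used in this study; the function names and the example counts are hypothetical.

```python
import math


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log10_hypergeom_upper_tail(total, marked, drawn, observed):
    """log10 of P(X >= observed) for a hypergeometric variable X.

    total    -- all items (e.g. all CTCF peaks)
    marked   -- items in the class of interest (e.g. high R-loop peaks)
    drawn    -- items in the tested subset (e.g. promoter peaks)
    observed -- marked items found in the subset
    """
    upper = min(marked, drawn)
    if observed > upper:
        return float("-inf")
    lower = max(observed, drawn - (total - marked), 0)
    log_denominator = log_choose(total, drawn)
    log_terms = [
        log_choose(marked, x)
        + log_choose(total - marked, drawn - x)
        - log_denominator
        for x in range(lower, upper + 1)
    ]
    # log-sum-exp keeps the sum finite even when every term underflows
    peak = max(log_terms)
    log_sum = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return log_sum / math.log(10)


def fold_enrichment(total, marked, drawn, observed):
    """Observed fraction in the subset divided by the background fraction."""
    return (observed / drawn) / (marked / total)


# Hypothetical counts, chosen only to show the call signature.
example_log10_p = log10_hypergeom_upper_tail(60000, 17000, 18000, 9000)
example_fold = fold_enrichment(60000, 17000, 18000, 9000)
```

With the rounded percentages given in the text, the fold enrichment is simply 49% divided by 30%, or about 1.63.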
Figure 1. A subset of CTCF peaks localize with R-loops and G-quadruplex structures.
(A) Heatmap of CTCF, MapR, and G4 CUT&TAG signal across 61,541 CTCF peaks in mESCs. Peaks were grouped into high and low R-loop sites via k-means clustering (k = 2 clusters) based on MapR signal alone.
(B) Genome browser view of the Alms1 gene showing RPM-normalized CTCF CUT&RUN, MapR, and G4 CUT&TAG signal in mESCs. Low and high R-loop CTCF peaks are highlighted.
(C) Profile plot of CTCF CUT&RUN signal over 17,189 high and 44,352 low R-loop CTCF sites in mESCs.
(D) Pie charts displaying distribution of high and low R-loop CTCF sites across genomic features.
(E) Volcano plot showing log2 fold changes in CTCF binding between WT and ADNP KO mESCs on the x-axis and −log10 FDR on the y-axis. 2,592 sites gain (red) and 111 sites lose (blue) CTCF. Red dotted line, 0.05 FDR cutoff for significance.
(F) Heatmap of MapR and CTCF CUT&RUN signal in WT and ADNP KO mESCs across 2,592 randomly sampled CTCF peaks that are invariant (top) or 2,592 gained CTCF peaks in ADNP KO (bottom).
(G) Genome browser view of the Mmrn2 and Bmpr1a genes showing RPM-normalized CTCF CUT&RUN, MapR, and ADNP CUT&RUN signal in WT and ADNP KO mESCs. Invariant and gained CTCF peaks are highlighted.
See also Figure S1
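The two-group partition described in panel (A) can be sketched as follows. This is a simplified illustration only: it assumes each peak is summarized by a single mean MapR value, whereas heatmap clustering is typically run on binned signal profiles, and the function name is hypothetical rather than taken from the tools used in this study.

```python
def two_means_1d(values, iterations=100):
    """Lloyd's algorithm with k = 2 on one-dimensional data.

    Returns (labels, centers); label 1 marks the higher-signal cluster.
    """
    low, high = min(values), max(values)
    if low == high:
        # No spread in the data, so there is nothing to split.
        return [0] * len(values), (low, high)
    labels = []
    for _ in range(iterations):
        # With two centers on a line, assignment reduces to a midpoint cut.
        boundary = (low + high) / 2
        labels = [1 if v > boundary else 0 for v in values]
        low_members = [v for v, lab in zip(values, labels) if lab == 0]
        high_members = [v for v, lab in zip(values, labels) if lab == 1]
        new_low = sum(low_members) / len(low_members)
        new_high = sum(high_members) / len(high_members)
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving
        low, high = new_low, new_high
    return labels, (low, high)


# Hypothetical per-peak MapR summaries.
mapr_signal = [0.1, 0.2, 0.15, 5.0, 6.0, 0.05]
labels, centers = two_means_1d(mapr_signal)
```

Peaks labelled 1 would correspond to a "high R-loop" group and peaks labelled 0 to a "low R-loop" group in this toy setting.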
CTCF is recruited to regions that accumulate R-loops upon ADNP loss.
CTCF occupancy at its consensus motifs is regulated by many epigenetic factors41,42,44,51, the most well-characterized being DNA methylation32,33. More recently, the activity dependent neuroprotective protein (ADNP), which forms the ChAHP complex with the heterochromatin protein HP1 and the chromatin remodeler CHD4, was discovered to bind a subset of CTCF recognition motifs located at SINE B2 transposable elements and prevent CTCF enrichment to these regions51. Interestingly, ADNP resolves R-loop structures in vivo and its deletion results in R-loop accumulation at ADNP binding sites52. The function of ADNP as an R-loop and CTCF regulator provided a unique opportunity to examine how R-loop changes at specific sites might affect CTCF occupancy. We performed CTCF CUT&RUN in ADNP knockout (KO) mESCs and compared to WT mESCs to identify sites where CTCF was changed upon ADNP loss. 2,703 sites display significantly altered CTCF occupancy in ADNP KO compared to WT (Figure 1E). Of these, 2,592 sites (96%) gained CTCF and only 111 sites (4%) lost CTCF upon ADNP deletion. CTCF protein levels are unaffected in ADNP KO (Figure S1F). Analysis of R-loops and CTCF at the 2,592 sites that gain CTCF showed a clear and significant increase in both in ADNP KO compared to WT mESCs (Figure 1F, bottom and Figures S1G and S1H). In contrast, MapR signal in a control group of 2,592 “invariant” CTCF sites shows no change between WT and ADNP KO (Figure 1F, top) and CTCF signal is unchanged or slightly decreased, consistent with previous reports51. At the Bmpr1a and Fgl1 genes, R-loop and CTCF signal at intronic ADNP binding sites are barely detectable in WT mESCs and are increased in ADNP KO (Figure 1G and Figure S1I). However, flanking ‘invariant’ CTCF binding sites that are present in WT mESCs remain unchanged in ADNP KO (Figure 1G and Figure S1I).
Of the 2,592 sites that gain CTCF and R-loops upon ADNP loss, 2,248 (86.7%) overlap a SINE B2 element, while only 16.2% of invariant CTCF sites contain SINE B2, consistent with the ChAHP complex’s known role in competing with CTCF at these elements. Consistent with previously published data showing that the ChAHP complex acts through an H3K9me3-independent mechanism53 and does not contain H3K9me3 at its binding sites, we found no enrichment for H3K9me3 across sites that show CTCF gains (Figure S1J). DNA methylation of the CTCF motif is well-established to inhibit CTCF binding. However, whole genome bisulfite sequencing analyses of DNA methylation in WT E14 mESCs54 show that methylation levels across gained CTCF sites are very low (Figure S1K), suggesting that the CTCF increases in ADNP KO are not regulated by DNA methylation. Together, our results suggest that R-loop accumulation stimulates CTCF binding and that ADNP counters this process.
CTCF genome interactions are sensitive to RNase H treatment.
We showed that R-loop accumulation stimulates CTCF binding. Next, we asked whether R-loops are necessary for genome-wide CTCF localization. R-loops consist of a DNA:RNA hybrid that is sensitive to treatment with RNase H48,55,56. We treated permeabilized mESCs with RNase H and then performed CTCF CUT&RUN. We also performed MapR to confirm reduction in R-loops and CUT&RUN for EZH2, a Polycomb repressive complex 2 (PRC2) component that shows reduced chromatin association in the presence of R-loops57. As a control for comparison, we performed CUT&RUN and MapR simultaneously in mESCs without RNase H treatment.
We identified 38,219 CTCF peaks from control mESCs which showed high correlation to our previous CTCF experiment and also partition into high and low R-loop containing peaks (Figures S2A, S2B). We confirmed that at CTCF peaks called from control mESC samples, R-loop signal is significantly (p < 2.2 × 10^−16) diminished upon RNase H treatment (Figure 2A). Strikingly, CTCF signal is also significantly (p < 2.2 × 10^−16) lost from its binding sites upon RNase H treatment (Figure 2A–2C). Consistent with a previous report showing that RNase H overexpression in mESCs promotes PRC2 binding to chromatin57, in our experiments, EZH2 signal is increased at CTCF peaks in RNase H treated samples (Figure 2A). However, the EZH2 levels observed at these CTCF sites are extremely low compared to bona fide PRC2 peaks (Figure S2C). These results indicate that CTCF binding to its genome-wide motifs is sensitive to R-loop removal.
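The Figure 2 legend names Welch’s two-sided t-test as the source of these p-values. As a minimal sketch of what that test computes, the code below derives the Welch t statistic and the Welch–Satterthwaite degrees of freedom from two samples of per-peak read densities. Converting these to a p-value needs a Student t distribution, which is left to a statistics package. The function name and example values are hypothetical and are not taken from the study.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two independent samples.

    Unlike Student's t-test, the two samples are not assumed to share a
    common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample (n - 1) variance
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical read densities for two conditions.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```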
Figure 2. Genomic co-dependency of CTCF and R-loops.
(A) Top, heatmap of MapR, CTCF CUT&RUN, and EZH2 CUT&RUN signal in mESCs untreated or treated with RNase H across 38,219 CTCF peaks called in untreated mESCs. Bottom, boxplots summarizing read densities from heatmaps. p-value, Welch’s two-sided t-test. Box, 25th percentile – median – 75th percentile. Whiskers extend to 1.5x interquartile range; outliers not displayed.
(B) Profile plot of CTCF CUT&RUN signal over 38,219 CTCF sites in untreated and RNase H-treated mESCs. *, p-value < 2.2 × 10^−16 (Welch’s two-sided t-test).
(C) Genome browser view of the Gm33738 gene showing RPM-normalized CTCF CUT&RUN and MapR signal in untreated and RNase H-treated mESCs.
(D) Heatmap of CTCF CUT&RUN and MapR signal in CTCF-AID mESCs untreated or treated with auxin across 28,146 CTCF peaks called in CTCF-AID mESCs.
(E) Heatmap of MapR signal in CTCF-AID mESCs untreated or treated with auxin and G4 CUT&TAG signal in WT mESCs across 3,437 CTCF peaks that overlap a G4 peak (top) or 3,437 randomly sampled CTCF peaks that do not overlap G4 (bottom).
(F) Profile plot of MapR signal in untreated or auxin-treated CTCF-AID mESCs over CTCF sites that overlap a G4 peak (top) and CTCF sites that do not overlap G4 in untreated mESCs (bottom).
(G) Genome browser view of the Cdk20 and Gm31218 genes showing RPM-normalized CTCF CUT&RUN and MapR signal in untreated and auxin-treated CTCF-AID mESCs and G4 CUT&TAG signal in WT mESCs.
See also Figure S2
CTCF is an RNA binding protein40,42,58 and its localization to chromatin depends on its RNA interactions42. Consequently, transcription inhibition results in reduced chromatin occupancy of CTCF42. Transcription inhibition also reduces R-loop levels genome-wide48,59. To determine if R-loop disruption that occurs upon transcription inhibition contributes to altered CTCF localization, we examined changes in CTCF binding upon transcription inhibition in relation to differential presence of R-loops. We called CTCF peaks from ChIP signal of mock treated mESCs42, then separated CTCF sites into those that overlapped an R-loop peak, based on our MapR data, and those without R-loop overlap. Of 93,637 CTCF ChIP peaks, 14,633 peaks overlap with a MapR peak and therefore were classified as CTCF peaks with R-loops, while the remaining 79,004 CTCF ChIP peaks do not contain R-loops (Figure S2D). We found that transcription inhibition decreases CTCF binding at sites with (Figure S2E, left) or without R-loops (Figure S2E, right). However, the decrease in CTCF enrichment is greater at CTCF sites with R-loops (Figure S2E). At sites that do not contain R-loops but still show CTCF reduction upon transcription inhibition, CTCF may be recruited through RNAs that do not engage in R-loop formation. Altogether our data provide strong evidence that in addition to previously reported epigenetic mechanisms such as DNA methylation, CTCF genomic occupancy may also be modulated by the presence of R-loops.
R-loops that do not contain G-quadruplex structures are sensitive to CTCF depletion.
Next, we asked whether CTCF has any role in stabilizing R-loop structures. We examined this using an auxin inducible degradation mESC system (EN52.9.1) where CTCF can be targeted for degradation upon addition of auxin23. Upon treatment with auxin, AID-tagged CTCF is rapidly degraded and by 8 hours is barely detectable by western blot (Figure S2F). We performed CTCF CUT&RUN and MapR in EN52.9.1 cells with or without auxin treatment. CTCF CUT&RUN confirmed near-total loss of CTCF signal upon auxin treatment (Figure 2D). Surprisingly, R-loop signal is also lost across many CTCF sites upon auxin treatment (Figure 2D) including around the Ptgr2 gene (Figure S2G). Promoter R-loops that do not contain CTCF appear unaffected (Figures S2G and S2H).
The ssDNA component of the R-loop may or may not form a G4, depending on the distribution of guanine nucleotides on the non-template strand56. G4 structures can contribute to the stabilization of R-loops6,56. To determine why some R-loops are more sensitive to CTCF loss than others, we examined the relationship between CTCF occupancy and R-loops based on the presence of G4. We identified 3,437 CTCF sites that also overlap a G4 CUT&TAG peak. As expected, the sites with high G4 signal also contain high R-loop signal, indicative of stable R-loops (Figure 2E). Interestingly, acute depletion of CTCF does not influence R-loops that contain G4s (Figure 2E and 2F, top, Figure 2G). However, R-loops without G4 show a decrease in the absence of CTCF (Figure 2E and 2F, bottom, Figure 2G). ADNP binds some genomic sites that are also targeted by CTCF51. We examined if R-loop reduction upon CTCF loss occurs because of ADNP recruitment and ADNP-mediated R-loop resolution activity52. ADNP CUT&RUN in EN52.9.1 cells before and after auxin treatment showed that ADNP levels are not increased at R-loops that decrease upon CTCF depletion (Figure S2I), indicating ADNP-independent R-loop loss. Thus, CTCF binding may have a protective effect on R-loops that is more evident at R-loops that lack G4s. This could occur if CTCF reduced access of R-loop resolvers to CTCF-adjacent R-loops; depletion of CTCF would expose such R-loops to removal, although R-loops with G4s would remain intrinsically more stable and resistant to resolution.
G-quadruplex formation proximal to CTCF consensus motif strengthens CTCF interactions.
Our results indicate that a subset of CTCF sites coincides with R-loops and G4 structures (Figure 1) and that removal of R-loops results in diminished CTCF occupancy (Figure 2). These observations suggest a model in which R-loops with associated G4s facilitate CTCF recruitment. Therefore, we sought to biochemically test how R-loops affect CTCF binding in electrophoretic mobility shift assays (EMSA). We expressed and purified a GST-CTCF fragment that contained all 11 zinc fingers (Figure S3A) and specifically binds dsDNA sequences containing the CTCF recognition motif in vitro (Figure S3B, top)60. We confirmed that our recombinant CTCF protein shows no binding to a dsDNA substrate without the core motif (Figure S3B, bottom). We next tested the binding of GST-CTCF fragment to a panel of reconstituted nucleic acid substrates (Figure 3A). First, we tested if CTCF is able to bind its core motif in the context of a DNA:RNA hybrid. We found that CTCF cannot bind DNA:RNA hybrids (Figure 3B, lanes 5-8) or ssDNA that contains the core motif (Figure 3B, lanes 9-12). This result excludes the possibility that CTCF is recruited directly through the hybrid or ssDNA components of R-loops and indicates that, as previously established, the core motif in dsDNA is the determining factor for CTCF binding.
Figure 3. R-loops and G-quadruplexes reinforce CTCF binding to its consensus motif.
(A) Representative schematics and descriptions of the nucleic acid substrates used in EMSA experiments. CTCF motif sequences are highlighted in red and RNA components are colored in blue. Full sequences of all substrates are provided in Table S1.
(B) EMSA with 0, 200, 400, or 800 nM CTCF and 0.1 nM dsDNA (left), DNA:RNA hybrid (center), or ssDNA (right) substrates. For all resolution assays, schematic structures of substrates from A are shown. Quantification of 3 or more independent experiments is shown as mean values ± SEM. Two-sided Student’s t test p-values are shown where applicable.
(C) EMSA with 0, 200, 400, or 800 nM CTCF and 0.1 nM of substrates. Substrates are dsDNA (left), dsDNA with a downstream non-G4-containing R-loop (center), or dsDNA with a downstream G4-containing R-loop (right).
(D) EMSA with 0, 200, 400, or 800 nM CTCF and 0.1 nM of substrates. Substrates are dsDNA (left), dsDNA with a downstream non-G4-containing DNA bubble (center), or dsDNA with a downstream G4-containing DNA bubble (right).
(E) EMSA with 0, 200, 400, or 800 nM CTCF and 0.1 nM of dsDNA (left) or G4-forming ssDNA (right) substrates.
(F) EMSA with 0, 200, 400, or 800 nM CTCF and 0.1 nM of substrates. Substrates are dsDNA (left), dsDNA with a downstream G4-containing DNA bubble (center), or a mixture of dsDNA and G4-forming ssDNA.
See also Figure S3
Next, we analyzed the proximity of CTCF, R-loops, and G4 on a genomic scale. We found that 33% of CTCF peaks are within 1000 base pairs of an R-loop peak (Figure S3C) and 13% are within 1000 base pairs of a G4 peak (Figure S3D). This represents an over ten-fold enrichment compared to a set of randomly permuted genomic regions of which only 3.2% and 1.1% are within 1000 base pairs of R-loop or G4 peak centers, respectively (Figures S3C and S3D). We also confirmed the in situ juxtaposition of CTCF and G4 using a proximity ligation assay (Figure S3E). These data indicate that genomic sites bound by CTCF are close to, but not necessarily spanned by R-loops, consistent with previous reports showing that G4 peaks are slightly offset from CTCF peak centers46. Given this observation, we tested CTCF binding to its core motif in dsDNA flanked by an R-loop. We standardized the length of fragments in EMSA experiments to prevent any differences arising from different DNA lengths (Table S1). Because of the close association between R-loops and G4s, we also designed R-loop substrates containing a G4 forming sequence in the ssDNA strand (R-loop with G4) such that the midpoints of the CTCF and G4 motifs in our template were 26 bp apart. EMSA experiments using dsDNA, R-loops, and R-loops with G4 show that the presence of an R-loop flanking the CTCF motif promotes binding by recombinant CTCF (Figure 3C, lanes 1-8). Interestingly, G4 presence on the ssDNA strand of the R-loop further stimulates CTCF binding (Figure 3C, lanes 9-12). Our results demonstrate that while R-loop formation improves CTCF binding to its core motif, G4 formation on the ssDNA strand further boosts this binding.
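The genomic proximity analysis at the start of this paragraph can be sketched in a few lines. The code below is an illustration only. It assumes peaks are given as (chromosome, start, end) tuples and that distance is measured between peak centers, which may differ from the definition used for Figures S3C and S3D. All names and coordinates are hypothetical.

```python
import bisect
from collections import defaultdict


def centers_by_chrom(peaks):
    """Group peak centers by chromosome and sort them for binary search."""
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start + end) // 2)
    for positions in grouped.values():
        positions.sort()
    return grouped


def fraction_within(query_peaks, target_peaks, max_distance=1000):
    """Fraction of query peaks whose center lies within max_distance bp
    of at least one target peak center on the same chromosome."""
    targets = centers_by_chrom(target_peaks)
    hits = 0
    for chrom, start, end in query_peaks:
        center = (start + end) // 2
        positions = targets.get(chrom, [])
        # first target center at or beyond the left edge of the window
        i = bisect.bisect_left(positions, center - max_distance)
        if i < len(positions) and positions[i] <= center + max_distance:
            hits += 1
    return hits / len(query_peaks)


# Hypothetical coordinates.
ctcf = [("chr1", 1000, 1200), ("chr1", 50000, 50200), ("chr2", 100, 300)]
rloops = [("chr1", 1900, 2100), ("chr2", 5000, 5200)]
observed_fraction = fraction_within(ctcf, rloops)
```

Running the same function on a set of randomly permuted regions gives the background fraction. With the percentages reported above, the ratios are 33% / 3.2% (about 10.3) for R-loops and 13% / 1.1% (about 11.8) for G4s.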
CTCF is well-established to bind RNA35,40–43,58. To distinguish the relative contribution of RNA versus G4 structures to CTCF binding, we performed EMSAs with DNA substrates resembling R-loops but without the RNA component. We call these DNA structures ‘bubbles with or without G4’. We found that CTCF binding is enhanced when the consensus motif is next to bubbles, both with or without G4 (Figure 3D, lanes 5-12). Interestingly, CTCF binds equally to either R-loops or bubbles containing G4 (Figure 3C and 3D, lanes 9-12). Next, we asked whether CTCF binding is enhanced because of the G-rich sequences downstream of the core motif, or if formation of G4 secondary structures upon DNA displacement is needed. We generated a fully double-stranded substrate containing G4 sequence downstream of the CTCF motif and compared it in EMSAs to the bubble-with-G4 substrate. CTCF binding to G4-containing dsDNA is markedly weaker than to a G4-containing DNA bubble, indicating that formation of the G4 secondary structure itself is necessary for strongest CTCF binding (Figure S3F). Altogether, our results indicate that the main determinant of enhanced CTCF binding is proximity of the core motif to G4 (Figure 3D). While RNA presence within R-loops does not significantly augment CTCF binding in vitro (Figure 3C), RNA may contribute to some extent to CTCF binding in vivo.
CTCF binding to G4 is controversial. Tikhonova et al. showed that CTCF binds almost equally to G4-containing DNA or dsDNA containing the CTCF recognition motif45. However, Spiegel et al. showed that CTCF cannot bind G4 DNA structures to any appreciable extent7. We also examined if CTCF could bind G4-containing ssDNA under the same conditions where it showed motif-specific interactions with dsDNA (Figure S3B) and found that CTCF did not bind G4s (Figure 3E). Finally, we asked whether CTCF binding to its core motif could be enhanced by the presence of G4 DNAs in the binding reaction, or whether G4 structures had to be present in the context of the motif. We compared CTCF binding to dsDNA (Figure 3F, lanes 1-4), bubble with G4 (Figure 3F, lanes 5-8), or dsDNA mixed with G4-containing ssDNA (Figure 3F, lanes 9-12). Our results clearly demonstrate that CTCF binding is increased only when G4s are present on the same substrate as the consensus motif (Figure 3F, lanes 5-8).
The first and tenth zinc finger domains of CTCF (ZF1 and ZF10) are known to bind RNA42. Interestingly, although ZF1 and ZF10 deletions both result in loss of CTCF binding from a subset of sites, the exact regions are almost entirely different between mutants. We examined R-loop and G4 signals at regions that are differentially bound by CTCFΔZF1 and CTCFΔZF10 and found that CTCF sites that are lost in ΔZF1, but not ΔZF10, are strongly centrally enriched for R-loop and G4 signal compared to unaffected sites (Figure S3G). We expressed and purified ΔZF1 and ΔZF10 CTCF proteins and performed EMSAs with dsDNA and bubble-with-G4 substrates containing the CTCF motif. Both mutants bind the CTCF motif-containing dsDNA substrate with the same affinity as wildtype CTCF (Figure S3H). However, ΔZF1, but not ΔZF10, fails to show increased binding to the bubble-with-G4 substrate (Figure S3I). Our results pinpoint an important role for the ZF1 domain in strengthening CTCF binding to G4-proximal motifs.
Loss of a G-quadruplex motif impairs CTCF binding and alters gene expression.
Our biochemical assays indicate that G4 proximity to CTCF binding sites enhances CTCF interactions (Figure 3C and 3D). We thus asked if CTCF genomic binding could be affected by G4 deletion in vivo. Praf2, a gene on chromosome X, is expressed in mESCs and contains R-loops, a CTCF motif in the first exon, and a G4 motif in the first intron 26 bp downstream of the CTCF site (Figure 4A). We replaced the G4 motif in Praf2 with a non-G4 sequence that preserves the G/C and CpG content of the region but removes all instances of the GGG repeats characteristic of G4 motifs61,62, while confirming that the CTCF binding site in Praf2ΔG4 mESCs was unaffected (Figure S4A). To confirm that the replacement sequence does not form G4, we tested its ability to bind the BG4 antibody, which is well-established to recognize G4 structures, in vitro via EMSA. We found that the WT G4 sequence from the Praf2 intron binds BG4 as evidenced by a clear gel shift, while the mutated sequence does not (Figure S4B).
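The design constraints on the replacement sequence (same G/C content, same CpG count, no GGG runs) can be checked computationally. The sketch below is illustrative only. The two 20-nt sequences are invented toy examples, not the Praf2 sequences, and the regular expression is the commonly used canonical pattern of four G-runs separated by loops of 1 to 7 nt, which is an assumption rather than the motif definition used in this study.

```python
import re

# Canonical putative G4 pattern (assumed): four runs of >= 3 G
# separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_count(seq):
    """Number of CpG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def has_putative_g4(seq):
    """True if the given strand matches the canonical G4 pattern."""
    return G4_PATTERN.search(seq.upper()) is not None


# Toy sequences for illustration (hypothetical, not from Praf2).
wild_type = "TCGGGAGGGTCGGGAGGGCA"
replacement = "TCGGAGGTGCGGAGGTGGCC"

checks = {
    "same_gc": gc_fraction(wild_type) == gc_fraction(replacement),
    "same_cpg": cpg_count(wild_type) == cpg_count(replacement),
    "no_ggg": "GGG" not in replacement,
    "g4_lost": has_putative_g4(wild_type) and not has_putative_g4(replacement),
}
```

A sequence-level check of this kind only confirms that the motif is gone. Whether a structure forms still has to be tested experimentally, as done here with the BG4 antibody.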
Figure 4. Deletion of a G4 motif in Praf2 reduces CTCF binding and gene expression.
(A) Genome browser view of the Praf2 gene showing spike-in normalized CTCF CUT&RUN, MapR, and BG4 CUT&RUN signal in mESCs. Locations of the CTCF and G4 motifs in the region are indicated.
(B) Genome browser view of the Praf2 gene showing spike-in normalized MapR signal in WT and Praf2ΔG4 deletion clones. Locations of the CTCF and G4 motifs in the region are indicated.
(C) Genome browser view of the Praf2 gene showing spike-in normalized CTCF CUT&RUN signal in WT and Praf2ΔG4 deletion clones. Locations of the CTCF and G4 motifs in the region are indicated.
(D) Bar chart quantifying average CTCF signal in WT and Praf2ΔG4 clones across the Praf2 peak displayed in (C).
(E) Genome browser view of the Krt78 gene showing spike-in normalized CTCF CUT&RUN signal in WT and Praf2ΔG4 deletion clones.
(F) Bar chart of Praf2 gene expression in WT and Praf2ΔG4 deletion clones. Bar chart represents mean values ± SEM from independent RNA-Seq samples. Individual values from each sample are shown as dots.
(G) Volcano plot showing log2 fold changes in gene expression between WT and Praf2ΔG4 deletion mESCs on the x-axis and −log10 p-value on the y-axis. Praf2 gene expression is highlighted in red.
See also Figure S4
We examined CTCF binding and R-loops at the Praf2 gene in WT mESCs and two independent clones that lack the G4 motif (Praf2ΔG4). We found that R-loops form similarly across the promoter and first exon on Praf2 in both WT and Praf2ΔG4 mESCs (Figure 4B). However, CTCF binding is considerably reduced in both Praf2ΔG4 clones compared to WT mESCs (Figure 4C and 4D). CTCF binding at other genomic sites is unchanged between WT and Praf2ΔG4 mESCs, demonstrating that CTCF reduction is highly specific to the Praf2 locus (Figure 4E and Figure S4C). To test if reduced CTCF occupancy in Praf2 had a functional consequence on gene expression, we examined expression of Praf2 in WT and Praf2ΔG4 mESCs. Compared to WT mESCs, Praf2 expression is significantly reduced in both Praf2ΔG4 clones (Figure 4F). Analysis of global gene expression revealed that Praf2 is the most significantly downregulated gene in Praf2ΔG4 clones (Figure 4G). Expression of Gpkow, which is adjacent to Praf2, is not significantly changed (Figure S4D). We conclude that loss of a G4 motif can result in compromised binding of CTCF that affects gene expression.
G-quadruplex stabilization strengthens CTCF binding and enhances chromatin loop formation.
We showed that abrogating G4s reduced CTCF binding to a specific genomic site. Next, we asked if G4 stabilization is sufficient to increase CTCF binding. Numerous chemicals stabilize G4 structures in vitro and in vivo63. Pyridostatin (PDS) is a synthetic small molecule G4 stabilizer that is well-documented to increase cellular G4 levels64–66. Treatment of mESCs with PDS results in an increase in G4 observable by immunofluorescence using the BG4 antibody (Figure S5A). BG4 CUT&RUN also confirmed a genome-wide increase in G4 signal (Figure 5A). In addition, we tested the effects of a second G4 stabilizing drug, PhenDC3, which also stabilizes G4s in vitro and in vivo67,68 but unlike PDS does not trigger DNA damage. As with PDS, treatment of mESCs with PhenDC3 significantly increases cellular G4 levels (Figures S5A and S5B). G4 signal increase is also seen at CTCF peaks upon PDS or PhenDC3 treatment (Figure S5C). Co-treatment of cells with actinomycin D, which inhibits transcription and reduces R-loops and associated G4s, results in decreased BG4 signal (Figure S5A). Notably, treatment with either PDS or PhenDC3 results in a global increase in CTCF binding, which is attenuated in cells co-treated with actinomycin D (Figure 5B and 5C). The sites with the greatest CTCF increases are highly correlated between PDS-treated and PhenDC3-treated cells, indicating that the effect of different G4 stabilizers on CTCF is similar (Figure S5D). To identify patterns that might predict G4 stabilization-related CTCF increases, we sorted CTCF peaks by their signal increase upon PDS treatment (Figure 5B). The top quartile of CTCF sites shows greatest increase in CTCF binding in response to both PDS and PhenDC3 treatment, again indicating consistency between drug effects (Figure 5B). Interestingly, CTCF binding sites in the top 25% have lower motif strength compared to CTCF peaks in the bottom 75% (Figure S5E). 
These results suggest that G4 stabilization may improve CTCF binding to motifs that normally only bind CTCF weakly in the absence of G4.
Figure 5. G-quadruplex stabilization and concurrent CTCF increases alter genome organization.
(A) Heatmap of spike-in normalized BG4 CUT&RUN signal in mESCs untreated or treated with 2 μM pyridostatin across 15,035 BG4 peaks.
(B) Left, heatmap of CTCF ChIP signal in mESCs treated with mock, 2 μM pyridostatin (PDS), 2 μM PhenDC3, 2 μM PDS + 5 μg/ml actinomycin D, or 2 μM PhenDC3 + 5 μg/ml actinomycin D across 78,285 CTCF peaks called across mock-treated, PDS-treated, and PhenDC3-treated mESCs. Peaks are grouped into the quartile with the highest 25% of CTCF signal increase in PDS-treated cells, and the remaining 75% of peaks. Right, profile plot of CTCF signal in mock and drug-treated conditions across the same quartile groupings shown in heatmaps.
(C) Genome browser view of the Thop1 gene showing CTCF ChIP (RPM) and BG4 CUT&RUN (spike-in normalized) signal in mock, PDS-treated, or PhenDC3-treated mESCs, with or without actinomycin D (ActD) co-treatment.
(D) 5-kb resolution Hi-C maps of a 4-Mb region of chromosome 2 showing observed contacts in mock (left) and PDS-treated (right) mESCs. Topologically associated domain (TAD) boundaries are indicated with arrows.
(E) Bar chart indicating numbers of TADs called in mock and PDS-treated mESCs.
(F) 5-kb resolution Hi-C map of a 4-Mb region of chromosome 2 showing observed contacts in mock-treated mESCs (top) and insulation scores in mock (blue) and PDS-treated (red) mESCs (bottom). TAD boundaries with increased boundary strength in PDS-treated cells are indicated by arrows.
(G) Bar chart indicating numbers of chromatin loops called in mock and PDS-treated mESCs.
(H) Normalized aggregate peak analysis (APA) at 10-kb resolution comparing chromatin loop strength in mock and PDS-treated mESCs across loops called in PDS-treated cells, grouped by whether they are loops shared with mock (top) or PDS-only (bottom). Peak to lower left (P2LL) ratios are indicated in the lower left corners of APA plots. Rightmost plots display the difference in scores between mock and PDS-treated cells.
(I) 5-kb resolution Hi-C map of a 400-Kb region of chromosome 2 showing observed contacts in mock (upper right triangle) and PDS-treated (lower left triangle) mESCs. Loops with increased interaction in PDS-treated mESCs are indicated with arrows. CTCF CUT&RUN tracks for mock (blue) and PDS-treated (red) mESCs are displayed below the map. Peaks with gained CTCF in PDS-treated mESCs are indicated with arrows.
(J) 5-kb resolution Hi-C map of a 700-Kb region of chromosome 4 showing observed contacts in mock (upper right triangle) and PDS-treated (lower left triangle) mESCs. Loop with increased interaction in PDS-treated mESCs is indicated with arrows. CTCF CUT&RUN tracks for mock (blue) and PDS-treated (red) mESCs are displayed below the map. Peaks with gained CTCF in PDS-treated mESCs are indicated with arrows.
See also Figure S5
We examined the consequences of G4 stabilization for higher-order genome conformation by performing Hi-C on untreated and PDS-treated mESCs. PDS-treated cells did not show marked changes in A/B compartmentalization (Figure S5F). Topologically associated domains (TADs) were also not globally affected, with a similar number of TADs identified (Figure 5D and 5E). However, treatment with PDS resulted in lower insulation scores at TAD boundaries, indicating lower contact frequency between adjacent TADs and an increase in boundary strength (Figure 5F, Figures S5G and S5H). Interestingly, unlike TADs, which were similar between untreated and PDS-treated mESCs, chromatin loop analysis identified substantially more loops in PDS-treated cells (13,162 in PDS versus 8,491 in untreated; Figure 5G). We separated loops called in PDS into “shared” and “PDS-only” loops, defining a loop as shared if both anchors overlapped the anchors of a loop called in the mock condition and PDS-only if one or both anchors did not overlap with mock loop anchors. Aggregate peak analysis (APA) showed that at PDS-only loops, contacts are increased in PDS compared to mock (Figure 5H, bottom). Interestingly, contacts are also slightly increased at shared loops upon PDS treatment (Figure 5H, top), suggesting that CTCF increases resulting from G4 stabilization strengthen existing contacts in addition to establishing new loops. We observed that increased chromatin contacts are often accompanied by CTCF gains in PDS-treated cells (Figure 5I and 5J and Figure S5I). Comparison of CTCF signal differences on a genome-wide scale showed that CTCF peaks within loop anchors that are only identified in PDS have a greater increase in signal upon G4 stabilization, compared to CTCF peaks within shared loop anchors (Figure S5J).
Together, our results suggest that G4 stabilization in the proximity of CTCF motifs facilitates CTCF binding and increases long-range chromatin contacts, revealing a role for G4 in modulation of higher order genome architecture.
Bivalent promoters activated upon neurodifferentiation accrue G4 and CTCF.
G4 stabilization by PDS and PhenDC3 treatment, while globally increasing CTCF signal, results in greater CTCF gains at some genomic sites (top quartile, Figure 5B). We examined the genes associated with the top quartile of CTCF sites with the most binding increase upon PDS treatment. Interestingly, of 10,636 genes associated with top-quartile CTCF sites, 5,144 (48.4%) are also associated with bivalent regions in mESCs, defined as being marked by both the activating H3K4me3 and repressive H3K27me3 histone modifications. Many bivalent genes encode important developmental regulators that are poised for rapid activation or silencing during development69,70, suggesting the possibility that G4-related CTCF binding could be involved in this process.
We focused on bivalent genes that are activated upon neurodifferentiation and found that G4 signal at these regions increases in NPCs compared to ESCs (Figure 6A, left). Consistent with our findings that G4 stabilization positively correlates with CTCF enrichment, CTCF levels also increase upon differentiation at these activated bivalent regions (Figure 6A, right). At these same regions, G4 stabilization with PDS also induces CTCF gains (Figure S6A). For example, at the Epha7 bivalent gene, which has roles in neuronal differentiation and cortical development71,72, CTCF levels in mESCs increase upon stabilization of G4 by PDS (Figure S6B). This same locus shows increase in G4 and CTCF signal in neural progenitor cells (NPCs) (Figure S6B), suggesting that PDS-induced G4 stabilization and CTCF increase in mESCs at Epha7 may mimic a process that normally occurs during neuronal differentiation. Visualization of Hi-C contacts in mESCs and NPCs shows that the Epha7 gene forms local chromatin loops in mESCs, which are strengthened upon differentiation to NPCs (Figure S6C). Similarly, at the Pdcd4 bivalent gene, CTCF and G4 levels are increased upon PDS treatment, and these same genomic regions accumulate CTCF and G4 in neural progenitors (Figure 6B). Looping interactions at the Pdcd4 gene are reinforced in NPCs compared to ESCs (Figure 6C). Strikingly, this same loop is also strengthened when G4s are stabilized in ESCs by PDS treatment (Figure 6D and 6E). Our data suggest that formation of R-loop mediated G4 structures may regulate CTCF-dependent chromatin looping interactions to fine-tune gene expression during differentiation and development (Figure 6F).
Figure 6. Induction of G4 and CTCF by PDS resembles differentiation activation at bivalent genes.
(A) Top, schematic representing mESC differentiation to neural progenitor cells (NPCs). Bottom, profile plot of G4 CUT&TAG signal and CTCF ChIP signal in mESCs and NPCs across 6,631 bivalent chromatin regions associated with genes activated during NPC differentiation.
(B) Genome browser view of the Pdcd4 gene showing CTCF ChIP (RPM) and G4 CUT&RUN (spike-in normalized) signal in mock and PDS-treated mESCs, CTCF ChIP and G4 CUT&TAG signal in mESCs and NPCs (RPM), and H3K4me3 and H3K27me3 ChIP signal in mESCs (fold change over input).
(C) 5-kb resolution Hi-C map of a 700-Kb region of chromosome 19 showing observed contacts in mESCs (upper right triangle) and NPCs (lower left triangle). Pixels with increased interaction in NPCs are highlighted by blue arrows, squares, and insets. The location of the Pdcd4 promoter (region depicted in B) is indicated by black arrows on the sides.
(D) 5-kb resolution Hi-C map of a 700-Kb region of chromosome 19 showing observed contacts in mock (upper right triangle) and PDS-treated (lower left triangle) mESCs. Pixels with increased interaction in PDS-treated cells are highlighted by blue arrows, squares, and insets. The location of the Pdcd4 promoter (region depicted in B) is indicated by black arrows on the sides.
(E) Violin plot and boxplot of normalized interaction strength of 100 pixels centered around the Pdcd4 chromatin loop showing increased interactions in PDS-treated mESCs (area depicted in inset in D). Data points for individual pixels are represented as dots. Box, 25th percentile – median – 75th percentile. Whiskers extend to 1.5x interquartile range.
(F) Model of R-loop and G4 reinforced CTCF binding. CTCF recognizes and binds to its consensus motifs genome-wide. At weaker sequences that are more divergent from the consensus motif, CTCF weakly binds in the absence of other factors, and this level of binding is not sufficient to participate in chromatin looping. Upon R-loop formation near the weak CTCF site, G4s strengthen CTCF binding and promote higher CTCF occupancy, resulting in the formation of de novo chromatin loops.
See also Figure S6
Discussion
In this study, we identify a role for R-loops and G4s in promoting CTCF binding genome-wide. R-loops can impact the transcription potential of genes73. R-loop stabilization inhibits expression of the Arabidopsis COOLAIR lncRNA that regulates the floral repressor FLC74. On the other hand, R-loops can also have stimulatory effects on transcription. The HOTTIP lncRNA was recently shown to localize to specific genomic sites in trans, a subset of which are also occupied by CTCF and cohesin75. HOTTIP localizes to the TAD boundary of the β-catenin encoding gene, CTNNB1, through R-loops35. Abrogating R-loops at these boundary sites reduced HOTTIP enrichment, and CTCF recruitment proposed to occur through direct interactions with HOTTIP was also decreased. Our studies reveal an alternate mechanism in which co-transcriptional R-loops can reinforce CTCF binding through the formation of proximal G4 structures. Of note, the HOTTIP binding motif contains a partial G4 sequence that can promote CTCF recruitment and boundary stabilization. Therefore, in this case, it is possible that both RNA and G4 mediated mechanisms co-operate to stabilize TADs that drive β-catenin expression. Interestingly, genome-wide analysis of G4s and TADs indicates that G4-rich TAD boundaries contain higher CTCF and interact more frequently with each other46. Furthermore, G4 presence has been associated with enhancer-promoter looping, increased CTCF-mediated insulation, and stronger TAD boundaries8,46. TAD boundaries with higher levels of transcription were shown to contain more CTCF and display enhanced insulation76. While boundary-associated RNAs are implicated in the recruitment of CTCF to these TAD boundaries76, it is also possible that at these regions co-transcriptional R-loop associated G4 structures contribute to CTCF stabilization.
Both R-loops and CTCF are dynamically regulated during cell differentiation77–79. Our data show that G4 stabilization in mESCs results in CTCF gains at bivalent genes, which have important and well-described roles in development69,70. We also found that some bivalent promoters display increased G4, CTCF and genome contacts in neural progenitors. Early transcription events during differentiation may result in the formation of G4 structures in proximity to CTCF binding sites. The recruitment of CTCF to these sites could result in the establishment of stable chromatin loops favoring cell type-specific expression. Thus, G4s formed during differentiation and development may help modulate CTCF occupancy at low affinity motifs and may be a driving force for cell fate specification.
G4s are also formed during DNA replication80 where they stall DNA polymerase81. G4 formation on the leading or lagging strands proximate to a newly replicated CTCF motif could serve as markers for efficient recruitment and inheritance of genomic CTCF occupancy. Our findings also have important implications for diseases characterized by aberrant accrual of R-loops82, as the effects of their accumulation may extend beyond genome instability and DNA damage. It is possible that in these diseases, R-loops and their associated G4s not only promote DNA damage, but additionally alter CTCF profiles to result in pathogenic rewiring of the genome, which can initiate and sustain disease-specific gene expression signatures83. Thus, advancing small molecule strategies to adjust G4 levels in vivo84,85 will have important therapeutic implications for the treatment of diseases where pathogenic gene expression programs are driven by aberrant genome organization.
Limitations of the study
In this study, we find that formation of G4 structures near a CTCF consensus motif promotes CTCF binding. We identify a domain of CTCF, zinc finger 1, that mediates this increased interaction, as deletion of this domain mitigates the ability of G4 to stimulate CTCF binding. However, we also demonstrate that CTCF does not bind G4 DNA alone, suggesting that the increased binding affinity associated with G4s is not the result of a direct interaction. Thus, the exact structural basis of how G4s can alter CTCF binding remains an open question. Interestingly, previous structural analysis of CTCF shows that zinc finger 1 is not directly involved in binding of the consensus motif86. An intriguing alternate explanation is that formation of the G4 structure, which involves opening of dsDNA and displacement of the opposite strand, could alter topology or coiling of the motif-containing DNA in a way that facilitates CTCF binding87. This process could be mediated by zinc finger 1 by stabilizing CTCF in a DNA topology-dependent manner. Future structural analyses of CTCF in complex with substrates containing G4s can reveal how G4s strengthen CTCF recognition of its consensus motif.
STAR★Methods
Resource Availability
Lead Contact
Lead Contact: Kavitha Sarma (kavitha@sarmalab.com).
Materials Availability
Material generated in this study is available from the lead contact.
Data and code availability
Sequencing data have been deposited at GEO and original western blot and EMSA images have been deposited at Mendeley Data. All data are publicly available as of the date of publication. Accession numbers and DOI are listed in the key resources table.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
Anti-CTCF (D31H2) | Cell Signaling | Cat# 3418; RRID:AB_2086791 |
Anti-ADNP | Yan et al., 2022 | N/A |
Anti-EZH2 | Cell Signaling | Cat# 5246; RRID:AB_10694683 |
BG4 (scFv) | Absolute Antibody | Ab00174-30.126 |
Monoclonal ANTI-FLAG M2 antibody | Sigma | A2220 |
Rabbit Anti-Mouse | Fisher | Cat# SA5-10192; RRID:AB_2556772 |
Rabbit Anti-FLAG | Sigma | Cat# F7425; RRID:AB_439687 |
Goat Anti-Rabbit IgG Alexa 555 | Fisher | Cat# 21429; RRID:AB_2535850 |
Bacterial and Virus Strains | ||
BL21 | Fisher | C601003 |
Chemicals, Peptides, and Recombinant Proteins | ||
NEBuilder HiFi | NEB | E2621 |
GST-ΔRNH-MNase | Yan et al., 2019 | N/A |
GST-MNase | Yan et al., 2019 | N/A |
pA-MNase | Skene and Henikoff, 2017 | N/A |
IPTG | Fisher | BP17755-10 |
Glutathione Sepharose 4B | Sigma | GE17-0756-01 |
RNase H | NEB | M0297 |
Indole-3-acetic acid | Sigma | I5148 |
Trizol | Fisher | 15596026 |
FastSelect -rRNA HMR | Qiagen | 334387 |
Pyridostatin hydrochloride | Sigma | SML2690 |
Phen-DC3 trifluoromethanesulfonate | MedChemExpress | HY-15594A |
Actinomycin D | Cell Signaling | 15021S |
Critical Commercial Assays | ||
Duolink In Situ Mouse/Rabbit | Sigma | DUO92101 |
Ultra II Directional RNA Library Prep Kit | NEB | E7760 |
Hi-C+ for High-Coverage Kit | Arima Genomics | N/A |
NEBNext Ultra II DNA Library Prep Kit | NEB | E7645 |
Deposited Data | ||
CUT&RUN, MapR, ChIP-Seq, RNA-Seq, and Hi-C data | This paper | GSE208145 |
Unprocessed gel images for Figures 3, S1, S2, and S3 | This paper | doi:10.17632/gyhk7sd38w.1 |
G4 CUT&TAG data | Lyu et al., 2022 | GSE173103 |
RNA-Seq and CUT&RUN data | Yan et al., 2022 | GSE171401 |
ChIP-Seq data | Liang et al., 2022 | GSE175631 |
ChIP-Seq data | Saldaña-Meyer et al., 2019 | GSE125595 |
WGBS data | Habibi et al., 2013 | GSE41923 |
H3K4me3 data | ENCODE | ENCFF469DBC |
H3K27me3 data | ENCODE | ENCFF012GHA |
HiC contact maps | Bonev et al., 2017 | N/A |
Experimental Models: Cell Lines | ||
E14 | ||
ADNPKO E14 | Yan et al., 2022 | N/A |
EN52.9.1 | Nora et al., 2017 | N/A |
E14-Praf2-GDel1 | This paper | N/A |
E14-Praf2-GDel2 | This paper | N/A |
Oligonucleotides | ||
Oligos and gBlocks (plasmid construction) | IDT (See Table S2) | N/A |
Oligos (CTCF EMSAs) | IDT (See Table S1) | N/A |
Recombinant DNA | ||
pETDuet-GST-CTCF-ZF1-11 | Ronen Marmorstein’s lab | N/A |
pETDuet-GST-CTCF-ΔZF1 | This paper | N/A |
pETDuet-GST-CTCF-ΔZF10 | This paper | N/A |
PX459 | Feng Zhang’s lab | Addgene:62988 |
PX459-Praf2-gRNA | This paper | N/A |
pcDNA3 | Invitrogen | Addgene:2092 |
pcDNA3-Praf2-gDel | This paper | N/A |
Software and Algorithms | ||
Bowtie 2.2.9 | Langmead and Salzberg, 2012 | RRID:SCR_005476 |
Samtools | Li et al., 2009 | RRID:SCR_002105 |
Integrative Genome Viewer | Robinson et al., 2011; Thorvaldsdóttir et al., 2013 | RRID:SCR_011793 |
deepTools 3.4.1 | Ramírez et al., 2016 | RRID:SCR_016366 |
MACS2 2.2.7.1 | Zhang et al., 2008; https://github.com/taoliu/MACS/ | RRID:SCR_013291 |
Juicer | Durand et al., 2016 | RRID:SCR_017226 |
STAR 2.7.3 | Dobin et al., 2013 | RRID:SCR_004463 |
RSEM 1.3.3 | Li and Dewey, 2011 | RRID:SCR_013027 |
Juicebox | Durand et al., 2016 | RRID:SCR_021172 |
FAN-C 0.9.23 | Kruse et al., 2020 | https://github.com/vaquerizaslab/fanc |
bedtools 2.30.0 | Quinlan and Hall, 2010 | RRID:SCR_006646 |
DiffBind 2.12.0 | Stark and Brown, 2011; http://bioconductor.org/packages/DiffBind/ | RRID:SCR_012918 |
ChIPseeker 1.20.0 | Yu et al., 2015 | http://bioconductor.org/packages/ChIPseeker/ |
DeepBind 0.11 | Alipanahi et al., 2015 | http://tools.genes.toronto.edu/deepbind/ |
limma 3.40.6 | Ritchie et al., 2015 | RRID:SCR_010943 |
edgeR 3.26.8 | Robinson et al., 2010 | RRID:SCR_012802 |
ChromHMM 1.20.0 | Ernst and Kellis, 2012 | RRID:SCR_018141 |
Experimental Model and Study Participant Details
Mouse ESC culture
Male mouse embryonic stem cells (mESCs) were cultured feeder-free on 0.1% gelatin coated plates in “LIF/2i” media consisting of DMEM, 15% fetal bovine serum (Gibco), 1x non-essential amino acids, 1X GlutaMAX (Gibco 35050), 25mM HEPES, 100U/ml Pen-Strep, 55μM 2-mercaptoethanol, 3μM glycogen synthase kinase (GSK) inhibitor (Millipore 361559), 1μM MEK1/2 inhibitor (Millipore 444966), and LIF (Sigma, ESGRO). CTCF-AID mESCs (EN52.9.1) were a gift from Elphège Nora and Benoit Bruneau23. Cell lines have not been authenticated.
Method Details
Generation of G4 deletion mESCs
Guide RNA sequence targeting the Praf2 region was inserted into PX459, a gift from Feng Zhang88 (Addgene plasmid: 62988). Plasmids containing the Praf2 promoter lacking the G4 motif and containing a scrambled replacement sequence flanked by 500bp homology arms on either side were used as donor repair templates. Donor and guide RNA containing plasmids were co-transfected into mESCs using Lipofectamine 3000, selected using puromycin, and confirmed by Sanger sequencing across the G4 region. gBlock and oligo sequences are listed in Table S2.
CTCF expression and purification
pETDuet-GST-CTCF-ZF1-11 expression plasmid was a gift from Ronen Marmorstein (UPenn). CTCF zinc finger 1 and 10 deletion plasmids (pETDuet-GST-CTCF-ZF1Δ, pETDuet-GST-CTCF-ZF10Δ) were generated via amplification from pETDuet-GST-CTCF-ZF1-11 using primers as listed in Table S2, followed by assembly using NEBuilder (NEB E2621). pETDuet-GST-CTCF-ZF1-11, pETDuet-GST-CTCF-ZF1Δ, and pETDuet-GST-CTCF-ZF10Δ expression plasmids were transformed into BL21 (DE3) (ThermoFisher C601003) and grown in a 37°C shaker until the culture reached OD600 of 0.8. Protein expression was induced by the addition of 0.3 mM IPTG (Fisher scientific BP17755-10). 200 μM ZnSO4 was added to the culture at the time of protein induction. Cells were grown in an 18°C shaker overnight, then harvested and lysed in cold PBS containing 10 μM ZnSO4. Lysate was sonicated on a BRANSON Sonifier 450 at power setting 7 with 3 cycles of 10s each, with incubation on ice between cycles. GST-CTCF-ZF1-11, GST-CTCF-ZF1Δ, and GST-CTCF-ZF10Δ were purified from lysates using GST-agarose beads (Affymetrix 78820) according to manufacturer’s instructions. Purified proteins were dialyzed against 2L of dialysis buffer (25 mM HEPES pH 7.5, 250 mM NaCl, 10 μM ZnSO4, 20% glycerol) at 4°C for 2 hours, aliquoted, and stored at −80°C.
Electrophoretic Mobility Shift Assays
Duplex DNA and DNA:RNA hybrids without G4 were prepared by mixing equimolar amounts of forward and reverse oligos in duplex buffer (10 mM Tris pH 7.6, 100 mM NaCl, 1 mM EDTA), incubating at 95°C for 5 min in a thermal cycler, then slowly cooling to 21°C at 0.1°C/s to anneal. To form R-loops without G4, duplex DNA and RNA were mixed at a 1:3 ratio in assembly buffer (90 mM Tris pH 7.5, 90 mM Borate, 10 mM MgCl2) and incubated at room temperature for 2 hours. Excess RNA was removed by purifying R-loops using NucAway spin columns (Ambion AM10070). Duplex DNA containing G4 was assembled by mixing equimolar amounts of forward and reverse oligos in G4-duplex buffer (20 mM Tris pH 7.4, 100 mM NaCl, 10 mM MgCl2, 1 mM DTT). R-loops containing G4 were formed by mixing equimolar amounts of forward DNA, reverse DNA, and RNA in G4-RNA buffer (20 mM Tris pH 7.4, 100 mM NaCl, 1 mM DTT). Single-stranded DNA containing Praf2 WT or mutated G4 sequence was assembled in G4 buffer (20 mM Tris pH 7.4, 50 mM NaCl, 50 mM KCl, 10 mM MgCl2, 1 mM DTT). Duplex DNA, single-stranded G4, and R-loops containing G4 substrates were heated at 94°C in a thermal cycler, slowly cooled to 37°C at 0.1°C/s, and incubated at 37°C for an additional 30 minutes. For binding assays, purified protein or antibody was incubated with substrates for 30 min at 30°C in binding buffer (50 mM Tris pH 8.0, 100 mM NaCl, 10 μg/ml BSA, 1 mM DTT, 0.1 mM EDTA, 5% Glycerol, 0.05% NP-40). For CTCF EMSAs, poly(dI-dC) was included in the binding reaction as a non-specific competitor. After incubation, mixtures were loaded on 8% (CTCF EMSAs) or 10% (BG4 EMSAs) native TBE gels in 0.5x TBE buffer and run at 4°C for 90 min at 100 V. CTCF gels were visualized on an Amersham Typhoon imager on Cy2 channel. BG4 gels were visualized via SYBR Gold stain. Oligo sequences and information on specific oligos used to assemble substrates for each assay can be found in Table S1.
G-Quadruplex Immunofluorescence and Proximity Ligation Assay
100,000 mESCs were cytospun onto glass slides. Slides were incubated for 5 minutes in ice cold PBS followed by 7 minutes in ice cold CSKT buffer (100mM NaCl, 300mM sucrose, 10mM PIPES pH 6.8, 3mM MgCl2, 0.5% Triton), fixed in 4% paraformaldehyde solution at room temperature for 10 minutes, and stored in 70% ethanol at 4°C. For immunofluorescence experiments, all antibodies were diluted in blocking solution and each incubation was followed by three washes with PBST (0.1% Tween 20 in PBS) at room temperature, 2 minutes each. Before BG4 immunostaining, slides were dehydrated with increasing concentrations of ethanol (70%, 80%, 90%, and 100%) for 2 minutes each. After air drying the slides for 5 minutes, the cells were blocked in blocking solution (1% BSA, 0.1% Tween 20 in PBS) for 20 minutes at room temperature and incubated with BG4-FLAG antibody (10 ng/μl) for 2 hours at 37°C, rabbit anti-FLAG antibody (Merck Millipore, F7425, 1:800 dilution) for 1 hour at room temperature, goat anti-rabbit IgG Alexa 555 (Invitrogen, A21429, 1:500) for 30 minutes at room temperature, and mounted in Vectashield containing DAPI (Vector Labs H-1200). For proximity ligation assays, mouse ESCs were incubated with BG4-FLAG antibody (10ng/μl), mouse monoclonal anti-FLAG M2 antibody (5ng/μl), and rabbit monoclonal anti-CTCF antibody (5ng/μl) and processed using the Duolink kit (Sigma DUO92101). Control reactions were done in parallel without mouse monoclonal anti-FLAG M2 antibody.
MapR and CUT&RUN
MapR was performed as previously described48,89 on 5 million cells per sample with the addition of heterologous Drosophila spike-in DNA. CUT&RUN was performed as previously described47 using ADNP52, CTCF (Cell Signaling 3418S), EZH2 (Cell Signaling 5246S), and BG4 (Absolute Antibody Ab00174-30.126) antibodies on 5 million cells per sample, with the addition of heterologous Drosophila spike-in DNA. For RNase H experiments, immobilized cells were treated with 150 U of E. coli RNase H for 1 hour at room temperature with rotation, prior to incubation with antibodies. For CTCF-AID experiments, EN52.9.1 cells were treated with 500 μM indole-3-acetic acid for 8 hours prior to harvest for MapR and CUT&RUN. For BG4 CUT&RUN experiments, following overnight incubation with BG4 antibody, cells were serially incubated with mouse anti-FLAG antibody (Sigma F1804) for 1 hour at 4°C and rabbit anti-mouse antibody for 1 hour at 4°C prior to pA-MNase incubation, with wash steps in between incubation steps.
RNA-Seq
RNA was extracted from mESCs using Trizol (Invitrogen) and subjected to Turbo DNase digestion (Ambion AM2238). RNA was rRNA-depleted using FastSelect -rRNA HMR (Qiagen) and converted to cDNA using Ultra II Directional RNA Library Prep Kit (NEB E7760).
ChIP-Seq and Hi-C
For ChIP-Seq and Hi-C experiments, mESCs were treated with 2 μM pyridostatin, 2 μM PhenDC3, or mock for 24 hours prior to harvest and crosslinking. For actinomycin D-treated samples, 5 μg/ml actinomycin D was added at 8 hours prior to harvest. ChIP-Seq was performed as previously described on 15 million cells per sample90. Crosslinked DNA was sheared via sonication to an average size of 200-400 bp on a Covaris ME220 instrument (10% duty cycle, 75 W peak incident power, 1000 cycles per burst, 6 min). Immunoprecipitation was performed with 500 ng of CTCF antibody (Cell Signaling 3418S). Hi-C was performed using the Arima Genomics Hi-C+ for High-Coverage Kit on 1 million cells per sample.
Library preparation and sequencing
DNA samples from MapR, CUT&RUN, ChIP-Seq, and RNA-Seq were end-repaired using End-Repair Mix (Enzymatics), A-tailed using Klenow exonuclease minus (Enzymatics), purified using MinElute columns (Qiagen), and ligated to Illumina adapters (NEB #E7600) using T4 DNA ligase (Enzymatics). Size selection for fragments >150 bp was performed using AMPure XP magnetic beads (Beckman Coulter). Libraries were PCR amplified with dual index primers (NEB #E7600) using Q5 DNA polymerase (NEB #M0491) and purified with MinElute. Sequencing was performed on a NextSeq 500 instrument (Illumina) with 38x2 paired-end cycle setting for MapR, CUT&RUN, and RNA-Seq and 61x2 paired-end cycle setting for ChIP-Seq. Libraries for Hi-C samples were prepared using the NEBNext Ultra II DNA Library Prep Kit with modifications according to the Arima Genomics Hi-C Kit. Sequencing of Hi-C libraries was performed on a NextSeq 2000 instrument using 61x2 paired-end cycle settings.
Quantification and Statistical Analysis
Statistical test and measurement details
Statistical details of experiments, including tests used, significance p-values, and definitions of centers, dispersions, and precisions, are listed when relevant alongside text, figures, and figure legends.
Immunofluorescence quantification
50 imaged cells were selected from each treatment group and relative fluorescence (sum intensity per μm2) was quantified using NIS-Elements Basic Research.
Read alignment and data processing
CUT&RUN, MapR, and ChIP-Seq reads were mapped to the mm10 mouse genome using Bowtie2 version 2.2.991 with default paired-end settings. Discordantly aligning reads were removed from CTCF CUT&RUN BAM files from WT and Praf2ΔG4 samples, BG4 CUT&RUN BAM files, and ChIP-Seq BAM files using Samtools92. PCR read duplicates were flagged and removed from paired-end ChIP-Seq BAM files using Samtools. CUT&RUN, MapR, and ChIP-Seq BigWig tracks were generated using the bamCoverage function in deepTools version 3.4.193 using the parameters “--binSize 5 --extendReads --blackListFileName”, which removes a known set of ENCODE blacklist regions94, and either RPM normalization with the “--normalizeUsing CPM” option or normalization to a spike-in factor, calculated based on the number of reads aligning to the Drosophila genome, using the “--scaleFactor” option. For ChIP-Seq samples, BAM files were combined between replicates prior to BigWig generation. RNA-Seq reads were aligned to mm10 using STAR version 2.7.395 and RSEM version 1.3.396 was used to obtain estimated counts and TPM values. Hi-C reads were combined between replicates, then aligned to the mm10 genome using Juicer97 with default settings and an annotation file based on the Arima restriction enzyme digest. Juicer output was used for visualization in Juicebox98. Hi-C contact maps were generated using Juicebox and FAN-C99 using Knight-Ruiz normalization. Insulation scores of Hi-C data were generated from Juicer .hic files using the insulation function in FAN-C, which is based on a method by Crane et al. 2015100, with geometric-mean normalization, using 10kb resolution data and a window size of 250kb. Signal plots of normalized tracks were generated using the computeMatrix and plotProfile functions in deepTools. Heatmaps were generated using the plotHeatmap function in deepTools. Read density values used for scatterplots and boxplots were calculated using the multiBigwigSummary function in deepTools.
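The two track normalization options described above can be illustrated with a minimal sketch. The exact spike-in formula is not stated here, so the sketch assumes a commonly used convention in which the scale factor is an arbitrary constant divided by the number of reads aligning to the Drosophila genome. The function names and the default constant of 10,000 are hypothetical and chosen only for illustration.

```python
def cpm_scale_factor(mapped_reads: int) -> float:
    """Factor that converts raw coverage to reads per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return 1_000_000 / mapped_reads


def spike_in_scale_factor(spike_in_reads: int, constant: float = 10_000) -> float:
    """Spike-in factor under the ASSUMED convention constant / spike-in reads.

    `constant` is a hypothetical value; it only needs to be identical
    across the samples that are being compared.
    """
    if spike_in_reads <= 0:
        raise ValueError("spike_in_reads must be positive")
    return constant / spike_in_reads


if __name__ == "__main__":
    # Hypothetical Drosophila-aligned read counts for two samples.
    for name, reads in {"mock": 20_000, "PDS": 40_000}.items():
        print(name, spike_in_scale_factor(reads))
```

Under this assumption, a sample with twice as many spike-in reads receives half the scale factor, and the resulting value is what would be supplied to the “--scaleFactor” option of bamCoverage.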
Sequencing analysis
CUT&RUN peaks were called with MACS2 2.2.184101 using the parameters “-f BAMPE -g mm --keep-dup all”. Permutation of CTCF peaks to obtain random genomic regions was performed using the bedtools shuffle tool v2.30.0 with parameters “-excl [blacklist regions] -seed 10 -f 0 -noOverlapping”, which generates a set of random genomic regions with identical number and region widths as the set of CTCF peaks. ChIP-Seq peaks were called with MACS2 with parameters “-f BAMPE -g mm --keep-dup 1”. Differential binding analysis of CUT&RUN data was performed in R version 3.6.1 using DiffBind version 2.12.0102. Sites of differential occupancy were called using the edgeR method and an FDR cutoff of 0.05. Genomic annotation of peaks was performed using ChIPseeker version 1.20.0103. Motif strength of CTCF sites was measured using DeepBind 0.11104 on 200 bp sequence around the centers of CTCF peaks. Analysis of RNA-Seq data was performed in R using limma version 3.40.6105 and edgeR version 3.26.8106 and filtering of genes using the edgeR built-in function “filterByExpr”. An activated bivalent gene was defined as a gene with significant upregulation in NPCs compared to ESCs (log2FC > 0, adjusted p-value ≤ 0.05) and an associated bivalent chromatin region as defined by a chromatin state map generated from mESC data107. Hi-C domains and loops were called using Juicer Tools. TADs were called with Arrowhead and loops were called with HICCUPS on the CPU setting, using default parameters. TAD boundary strength was calculated by subtracting the lowest insulation score from the highest insulation score within a 200-kb flanking window centered around each TAD boundary, with a higher positive value indicating greater insulation strength. Aggregate peak analysis (APA) matrices were generated using the APA function in Juicer Tools.
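The TAD boundary strength calculation can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: insulation scores are assumed to be held as one value per fixed-size bin along a chromosome, and the number of bins taken on each side of the boundary (`flank_bins`) is left as a parameter because the window could be read as 200 kb in total or 200 kb per side. The function name is hypothetical.

```python
from typing import Sequence


def boundary_strength(scores: Sequence[float], boundary_bin: int, flank_bins: int) -> float:
    """Highest minus lowest insulation score in a window centered on a boundary.

    scores       -- per-bin insulation scores along one chromosome
    boundary_bin -- index of the bin containing the TAD boundary
    flank_bins   -- bins included on each side (ASSUMED interpretation of the window)
    """
    start = max(0, boundary_bin - flank_bins)
    end = min(len(scores), boundary_bin + flank_bins + 1)
    # Skip missing values (NaN is the only float not equal to itself).
    window = [s for s in scores[start:end] if s == s]
    if not window:
        raise ValueError("no valid insulation scores in window")
    return max(window) - min(window)


if __name__ == "__main__":
    example_scores = [0.25, 0.0, -0.5, 0.125, 0.5]  # made-up values
    print(boundary_strength(example_scores, boundary_bin=2, flank_bins=2))
```

A deeper insulation minimum at the boundary relative to its flanks gives a larger value, matching the convention that a higher positive value indicates greater insulation strength.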
For shared vs PDS-only loop definition, a loop was defined as shared if the left anchor in the PDS condition overlapped the left anchor of a loop called in the mock condition, and the right anchor in the PDS condition overlapped the right anchor of that same mock loop. Subtraction maps of APA matrices were generated by subtracting mock from PDS contact values for each individual pixel.
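The shared versus PDS-only definition and the pixel-wise subtraction above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code. The `Loop` record, the function names, the use of half-open coordinates, and the rule that any overlap of at least 1 bp counts are assumptions, since the text does not specify them.

```python
from typing import List, NamedTuple, Tuple


class Loop(NamedTuple):
    """One called loop: two anchors on the same chromosome (hypothetical record)."""
    chrom: str
    left_start: int
    left_end: int
    right_start: int
    right_end: int


def _overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def split_shared_and_pds_only(
    pds_loops: List[Loop], mock_loops: List[Loop]
) -> Tuple[List[Loop], List[Loop]]:
    """Classify each PDS loop as shared with mock or PDS-only.

    A PDS loop is shared when a single mock loop overlaps it at both the
    left anchor and the right anchor.
    """
    shared: List[Loop] = []
    pds_only: List[Loop] = []
    for p in pds_loops:
        is_shared = any(
            p.chrom == m.chrom
            and _overlaps(p.left_start, p.left_end, m.left_start, m.left_end)
            and _overlaps(p.right_start, p.right_end, m.right_start, m.right_end)
            for m in mock_loops
        )
        (shared if is_shared else pds_only).append(p)
    return shared, pds_only


def subtraction_map(
    pds: List[List[float]], mock: List[List[float]]
) -> List[List[float]]:
    """Pixel-wise PDS minus mock for two APA matrices of equal shape."""
    return [[p - m for p, m in zip(p_row, m_row)] for p_row, m_row in zip(pds, mock)]
```

Requiring both anchors to match the same mock loop, rather than any two mock anchors, follows the wording "that same mock loop" in the definition. The pairwise scan is quadratic in the number of loops, which is acceptable for illustration; an interval index would be preferable for large loop sets.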
External data and data availability
We obtained FASTQ files for mESC and NPC G4 CUT&TAG from (GSE173103)50, gene expression counts for mESC and NPC RNA-Seq and BigWig files for ADNP CUT&RUN in WT mESCs and MapR in ADNP KO mESCs from (GSE171401)52, BigWig files for RNA polymerase II from (GSE175631)108, ChIP-Seq FASTQ and BigWig files for transcription inhibition and CTCF-ZFΔ mutants from (GSE125595)42, WGBS methylation data from (GSE41923)54, and BigWig files for H3K4me3 (ENCFF469DBC) and H3K27me3 (ENCFF012GHA) ChIP from ENCODE109. Hi-C contact maps for ESCs and NPCs78 were available from Juicebox. All FASTQs, BigWigs, and RNA-Seq counts generated during this project have been uploaded to GEO with accession code GSE208145.
Supplementary Material
Table S1. Oligo sequences and combinations used for EMSA experiments, Related to Figure 3
Table S2. Oligos and gBlocks used in this study, Related to Figure 4
Highlights
CTCF-bound regions are enriched for R-loops and G-quadruplexes.
G4s proximal to the CTCF consensus motif strengthen CTCF interactions in vitro.
G4 stabilization strengthens CTCF binding genome-wide.
Strengthened CTCF binding through G4s enhances chromatin looping.
Acknowledgments
We thank Ronen Marmorstein for the CTCF expression plasmid and Feng Zhang for the PX459 plasmid. We thank Elphège Nora and Benoit Bruneau for the CTCF-AID mESCs (EN52.9.1). We are grateful to Roberto Bonasio and Paul Lieberman for critical reading of the manuscript. This work was supported by NIH grants DP2-NS105576, R01NS127828, and R01GM143229 to K.S., F32GM143832 to P.W. A.G. is supported by grants from the NIH (R01 HL141326 and CA252223) and S.O. by the NIH training grant T32-CA009171. J.E.P-C acknowledges support from NIH (1DP1MH129957), Chan Zuckerberg Initiative Neurodegenerative Disease Pairs Award (2020-221479-5022), and the Friedreich’s Ataxia Research Alliance.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Declaration of interests
The authors declare no competing interests.
References
- 1. Thomas M, White RL, and Davis RW (1976). Hybridization of RNA to double-stranded DNA: formation of R-loops. Proc Natl Acad Sci U S A 73, 2294–2298. 10.1073/pnas.73.7.2294.
- 2. Garcia-Muse T, and Aguilera A (2019). R Loops: From Physiological to Pathological Roles. Cell 179, 604–618. 10.1016/j.cell.2019.08.055.
- 3. Niehrs C, and Luke B (2020). Regulatory R-loops as facilitators of gene expression and genome stability. Nat Rev Mol Cell Biol 21, 167–178. 10.1038/s41580-019-0206-3.
- 4. Petermann E, Lan L, and Zou L (2022). Sources, resolution and physiological relevance of R-loops and RNA-DNA hybrids. Nat Rev Mol Cell Biol. 10.1038/s41580-022-00474-x.
- 5. Brickner JR, Garzon JL, and Cimprich KA (2022). Walking a tightrope: The complex balancing act of R-loops in genome stability. Mol Cell 82, 2267–2297. 10.1016/j.molcel.2022.04.014.
- 6. Duquette ML, Handa P, Vincent JA, Taylor AF, and Maizels N (2004). Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev 18, 1618–1629. 10.1101/gad.1200804.
- 7. Spiegel J, Cuesta SM, Adhikari S, Hänsel-Hertsch R, Tannahill D, and Balasubramanian S (2021). G-quadruplexes are transcription factor binding hubs in human chromatin. Genome Biology 22, 117. 10.1186/s13059-021-02324-z.
- 8. Robinson J, Raguseo F, Nuccio SP, Liano D, and Di Antonio M (2021). DNA G-quadruplex structures: more than simple roadblocks to transcription? Nucleic Acids Research 49, 8419–8431. 10.1093/nar/gkab609.
- 9. Gan W, Guan Z, Liu J, Gui T, Shen K, Manley JL, and Li X (2011). R-loop-mediated genomic instability is caused by impairment of replication fork progression. Genes Dev 25, 2041–2056. 10.1101/gad.17010011.
- 10. Costantino L, and Koshland D (2018). Genome-wide Map of R-Loop-Induced Damage Reveals How a Subset of R-Loops Contributes to Genomic Instability. Mol Cell 71, 487–497.e483. 10.1016/j.molcel.2018.06.037.
- 11. Bonev B, and Cavalli G (2016). Organization and function of the 3D genome. Nat Rev Genet 17, 661–678. 10.1038/nrg.2016.112.
- 12. Dekker J, and Mirny L (2016). The 3D Genome as Moderator of Chromosomal Communication. Cell 164, 1110–1121. 10.1016/j.cell.2016.02.007.
- 13. Dixon JR, Gorkin DU, and Ren B (2016). Chromatin Domains: The Unit of Chromosome Organization. Mol Cell 62, 668–680. 10.1016/j.molcel.2016.05.018.
- 14. Rowley MJ, and Corces VG (2018). Organizational principles of 3D genome architecture. Nat Rev Genet 19, 789–800. 10.1038/s41576-018-0060-8.
- 15. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293. 10.1126/science.1181369.
- 16. Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, Sanborn AL, Machol I, Omer AD, Lander ES, and Aiden EL (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680. 10.1016/j.cell.2014.11.021.
- 17. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, and Ren B (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380. 10.1038/nature11082.
- 18. Nora EP, Lajoie BR, Schulz EG, Giorgetti L, Okamoto I, Servant N, Piolot T, van Berkum NL, Meisig J, Sedat J, et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385. 10.1038/nature11049.
- 19. Hou C, Li L, Qin ZS, and Corces VG (2012). Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol Cell 48, 471–484. 10.1016/j.molcel.2012.08.031.
- 20. Sexton T, Yaffe E, Kenigsberg E, Bantignies F, Leblanc B, Hoichman M, Parrinello H, Tanay A, and Cavalli G (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472. 10.1016/j.cell.2012.01.010.
- 21. Phillips-Cremins JE, Sauria ME, Sanyal A, Gerasimova TI, Lajoie BR, Bell JS, Ong CT, Hookway TA, Guo C, Sun Y, et al. (2013). Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153, 1281–1295. 10.1016/j.cell.2013.04.053.
- 22. Beagan JA, Pastuzyn ED, Fernandez LR, Guo MH, Feng K, Titus KR, Chandrashekar H, Shepherd JD, and Phillips-Cremins JE (2020). Three-dimensional genome restructuring across timescales of activity-induced neuronal gene expression. Nat Neurosci 23, 707–717. 10.1038/s41593-020-0634-6.
- 23. Nora EP, Goloborodko A, Valton AL, Gibcus JH, Uebersohn A, Abdennur N, Dekker J, Mirny LA, and Bruneau BG (2017). Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell 169, 930–944.e922. 10.1016/j.cell.2017.05.004.
- 24. Fudenberg G, Imakaev M, Lu C, Goloborodko A, Abdennur N, and Mirny LA (2016). Formation of Chromosomal Domains by Loop Extrusion. Cell Rep 15, 2038–2049. 10.1016/j.celrep.2016.04.085.
- 25. Sanborn AL, Rao SS, Huang SC, Durand NC, Huntley MH, Jewett AI, Bochkov ID, Chinnappan D, Cutkosky A, Li J, et al. (2015). Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc Natl Acad Sci U S A 112, E6456–6465. 10.1073/pnas.1518552112.
- 26. Alipour E, and Marko JF (2012). Self-organization of domain structures by DNA-loop-extruding enzymes. Nucleic Acids Res 40, 11202–11212. 10.1093/nar/gks925.
- 27. Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, and Goodwin GH (1990). A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5′-flanking sequence of the chicken c-myc gene. Oncogene 5, 1743–1753.
- 28. Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, and Ren B (2007). Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231–1245. 10.1016/j.cell.2006.12.048.
- 29. Essien K, Vigneau S, Apreleva S, Singh LN, Bartolomei MS, and Hannenhalli S (2009). CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features. Genome Biol 10, R131. 10.1186/gb-2009-10-11-r131.
- 30. Kanduri C, Pant V, Loukinov D, Pugacheva E, Qi CF, Wolffe A, Ohlsson R, and Lobanenkov VV (2000). Functional association of CTCF with the insulator upstream of the H19 gene is parent of origin-specific and methylation-sensitive. Curr Biol 10, 853–856. 10.1016/s0960-9822(00)00597-2.
- 31. Bell AC, and Felsenfeld G (2000). Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature 405, 482–485. 10.1038/35013100.
- 32. Hark AT, Schoenherr CJ, Katz DJ, Ingram RS, Levorse JM, and Tilghman SM (2000). CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature 405, 486–489. 10.1038/35013106.
- 33. Wang H, Maurano MT, Qu H, Varley KE, Gertz J, Pauli F, Lee K, Canfield T, Weaver M, Sandstrom R, et al. (2012). Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res 22, 1680–1688. 10.1101/gr.136101.111.
- 34. Sanz LA, Hartono SR, Lim YW, Steyaert S, Rajpurkar A, Ginno PA, Xu X, and Chédin F (2016). Prevalent, Dynamic, and Conserved R-Loop Structures Associate with Specific Epigenomic Signatures in Mammals. Mol Cell 63, 167–178. 10.1016/j.molcel.2016.05.032.
- 35. Luo H, Zhu G, Eshelman MA, Fung TK, Lai Q, Wang F, Zeisig BB, Lesperance J, Ma X, Chen S, et al. (2022). HOTTIP-dependent R-loop formation regulates CTCF boundary activity and TAD integrity in leukemia. Mol Cell 82, 833–851.e811. 10.1016/j.molcel.2022.01.014.
- 36. Cristini A, Groh M, Kristiansen MS, and Gromak N (2018). RNA/DNA Hybrid Interactome Identifies DXH9 as a Molecular Player in Transcriptional Termination and R-Loop-Associated DNA Damage. Cell Rep 23, 1891–1905. 10.1016/j.celrep.2018.04.025.
- 37. Mosler T, Conte F, Longo GMC, Mikicic I, Kreim N, Möckel MM, Petrosino G, Flach J, Barau J, Luke B, et al. (2021). R-loop proximity proteomics identifies a role of DDX41 in transcription-associated genomic instability. Nat Commun 12, 7314. 10.1038/s41467-021-27530-y.
- 38. Ariel F, Lucero L, Christ A, Mammarella MF, Jegu T, Veluchamy A, Mariappan K, Latrasse D, Blein T, Liu C, et al. (2020). R-Loop Mediated trans Action of the APOLO Long Noncoding RNA. Mol Cell 77, 1055–1065.e1054. 10.1016/j.molcel.2019.12.015.
- 39. Feretzaki M, Renck Nunes P, and Lingner J (2019). Expression and differential regulation of human TERRA at several chromosome ends. RNA 25, 1470–1480. 10.1261/rna.072322.119.
- 40. Saldaña-Meyer R, González-Buendía E, Guerrero G, Narendra V, Bonasio R, Recillas-Targa F, and Reinberg D (2014). CTCF regulates the human p53 gene through direct interaction with its natural antisense transcript, Wrap53. Genes Dev 28, 723–734. 10.1101/gad.236869.113.
- 41. Sun S, Del Rosario BC, Szanto A, Ogawa Y, Jeon Y, and Lee JT (2013). Jpx RNA activates Xist by evicting CTCF. Cell 153, 1537–1551. 10.1016/j.cell.2013.05.028.
- 42. Saldaña-Meyer R, Rodriguez-Hernaez J, Escobar T, Nishana M, Jácome-López K, Nora EP, Bruneau BG, Tsirigos A, Furlan-Magaril M, Skok J, and Reinberg D (2019). RNA Interactions Are Essential for CTCF-Mediated Genome Organization. Mol Cell 76, 412–422.e415. 10.1016/j.molcel.2019.08.015.
- 43. Hansen AS, Hsieh TS, Cattoglio C, Pustova I, Saldaña-Meyer R, Reinberg D, Darzacq X, and Tjian R (2019). Distinct Classes of Chromatin Loops Revealed by Deletion of an RNA-Binding Region in CTCF. Mol Cell 76, 395–411.e313. 10.1016/j.molcel.2019.07.039.
- 44. Oh HJ, Aguilar R, Kesner B, Lee HG, Kriz AJ, Chu HP, and Lee JT (2021). Jpx RNA regulates CTCF anchor site selection and formation of chromosome loops. Cell 184, 6157–6173.e6124. 10.1016/j.cell.2021.11.012.
- 45. Tikhonova P, Pavlova I, Isaakova E, Tsvetkov V, Bogomazova A, Vedekhina T, Luzhin AV, Sultanov R, Severov V, Klimina K, et al. (2021). DNA G-Quadruplexes Contribute to CTCF Recruitment. Int J Mol Sci 22. 10.3390/ijms22137090.
- 46. Hou Y, Li F, Zhang R, Li S, Liu H, Qin ZS, and Sun X (2019). Integrative characterization of G-Quadruplexes in the three-dimensional chromatin structure. Epigenetics 14, 894–911. 10.1080/15592294.2019.1621140.
- 47. Skene PJ, and Henikoff S (2017). An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, e21856. 10.7554/eLife.21856.
- 48. Yan Q, Shields EJ, Bonasio R, and Sarma K (2019). Mapping Native R-Loops Genome-wide Using a Targeted Nuclease Approach. Cell Rep 29, 1369–1380.e1365. 10.1016/j.celrep.2019.09.052.
- 49. Verma A, Halder K, Halder R, Yadav VK, Rawal P, Thakur RK, Mohd F, Sharma A, and Chowdhury S (2008). Genome-wide computational and expression analyses reveal G-quadruplex DNA motifs as conserved cis-regulatory elements in human and related species. J Med Chem 51, 5641–5649. 10.1021/jm800448a.
- 50. Lyu J, Shao R, Kwong Yung PY, and Elsässer SJ (2022). Genome-wide mapping of G-quadruplex structures with CUT&Tag. Nucleic Acids Res 50, e13. 10.1093/nar/gkab1073.
- 51. Kaaij LJT, Mohn F, van der Weide RH, de Wit E, and Bühler M (2019). The ChAHP Complex Counteracts Chromatin Looping at CTCF Sites that Emerged from SINE Expansions in Mouse. Cell 178, 1437–1451.e1414. 10.1016/j.cell.2019.08.007.
- 52. Yan Q, Wulfridge P, Doherty J, Fernandez-Luna JL, Real PJ, Tang H-Y, and Sarma K (2022). Proximity labeling identifies a repertoire of site-specific R-loop modulators. Nature Communications 13, 53. 10.1038/s41467-021-27722-6.
- 53. Ostapcuk V, Mohn F, Carl SH, Basters A, Hess D, Iesmantavicius V, Lampersberger L, Flemr M, Pandey A, Thomä NH, et al. (2018). Activity-dependent neuroprotective protein recruits HP1 and CHD4 to control lineage-specifying genes. Nature 557, 739–743. 10.1038/s41586-018-0153-8.
- 54. Habibi E, Brinkman AB, Arand J, Kroeze LI, Kerstens HH, Matarese F, Lepikhov K, Gut M, Brun-Heath I, Hubner NC, et al. (2013). Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369. 10.1016/j.stem.2013.06.002.
- 55. Ginno PA, Lott PL, Christensen HC, Korf I, and Chédin F (2012). R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell 45, 814–825. 10.1016/j.molcel.2012.01.017.
- 56. Chen L, Chen JY, Zhang X, Gu Y, Xiao R, Shao C, Tang P, Qian H, Luo D, Li H, et al. (2017). R-ChIP Using Inactive RNase H Reveals Dynamic Coupling of R-loops with Transcriptional Pausing at Gene Promoters. Mol Cell 68, 745–757.e745. 10.1016/j.molcel.2017.10.008.
- 57. Chen PB, Chen HV, Acharya D, Rando OJ, and Fazzio TG (2015). R loops regulate promoter-proximal chromatin architecture and cellular differentiation. Nat Struct Mol Biol 22, 999–1007. 10.1038/nsmb.3122.
- 58. Kung JT, Kesner B, An JY, Ahn JY, Cifuentes-Rojas C, Colognori D, Jeon Y, Szanto A, del Rosario BC, Pinter SF, et al. (2015). Locus-specific targeting to the X chromosome revealed by the RNA interactome of CTCF. Mol Cell 57, 361–375. 10.1016/j.molcel.2014.12.006.
- 59. Dumelie JG, and Jaffrey SR (2017). Defining the location of promoter-associated R-loops at near-nucleotide resolution using bisDRIP-seq. eLife 6. 10.7554/eLife.28306.
- 60. Mawhinney MT, Liu R, Lu F, Maksimoska J, Damico K, Marmorstein R, Lieberman PM, and Urbanc B (2018). CTCF-Induced Circular DNA Complexes Observed by Atomic Force Microscopy. J Mol Biol 430, 759–776. 10.1016/j.jmb.2018.01.012.
- 61. Sen D, and Gilbert W (1988). Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature 334, 364–366. 10.1038/334364a0.
- 62. Huppert JL, and Balasubramanian S (2005). Prevalence of quadruplexes in the human genome. Nucleic Acids Res 33, 2908–2916. 10.1093/nar/gki609.
- 63. Asamitsu S, Obata S, Yu Z, Bando T, and Sugiyama H (2019). Recent Progress of Targeted G-Quadruplex-Preferred Ligands Toward Cancer Therapy. Molecules 24. 10.3390/molecules24030429.
- 64. Biffi G, Tannahill D, McCafferty J, and Balasubramanian S (2013). Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5, 182–186. 10.1038/nchem.1548.
- 65. De Magis A, Kastl M, Brossart P, Heine A, and Paeschke K (2021). BG-flow, a new flow cytometry tool for G-quadruplex quantification in fixed cells. BMC Biol 19, 45. 10.1186/s12915-021-00986-6.
- 66. Rodriguez R, Müller S, Yeoman JA, Trentesaux C, Riou JF, and Balasubramanian S (2008). A novel small molecule that alters shelterin integrity and triggers a DNA-damage response at telomeres. J Am Chem Soc 130, 15758–15759. 10.1021/ja805615w.
- 67. Piazza A, Boulé JB, Lopes J, Mingo K, Largy E, Teulade-Fichou MP, and Nicolas A (2010). Genetic instability triggered by G-quadruplex interacting Phen-DC compounds in Saccharomyces cerevisiae. Nucleic Acids Res 38, 4337–4348. 10.1093/nar/gkq136.
- 68. De Cian A, Delemos E, Mergny JL, Teulade-Fichou MP, and Monchaud D (2007). Highly efficient G-quadruplex recognition by bisquinolinium compounds. J Am Chem Soc 129, 1856–1857. 10.1021/ja067352b.
- 69. Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K, et al. (2006). A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326. 10.1016/j.cell.2006.02.041.
- 70. Voigt P, Tee WW, and Reinberg D (2013). A double take on bivalent promoters. Genes Dev 27, 1318–1338. 10.1101/gad.219626.113.
- 71. Clifford MA, Athar W, Leonard CE, Russo A, Sampognaro PJ, Van der Goes MS, Burton DA, Zhao X, Lalchandani RR, Sahin M, et al. (2014). EphA7 signaling guides cortical dendritic development and spine maturation. Proc Natl Acad Sci U S A 111, 4994–4999. 10.1073/pnas.1323793111.
- 72. Leonard CE, Baydyuk M, Stepler MA, Burton DA, and Donoghue MJ (2020). EphA7 isoforms differentially regulate cortical dendrite development. PLoS One 15, e0231561. 10.1371/journal.pone.0231561.
- 73. Skourti-Stathaki K, and Proudfoot NJ (2014). A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression. Genes Dev 28, 1384–1396. 10.1101/gad.242990.114.
- 74. Sun Q, Csorba T, Skourti-Stathaki K, Proudfoot NJ, and Dean C (2013). R-loop stabilization represses antisense transcription at the Arabidopsis FLC locus. Science 340, 619–621. 10.1126/science.1234848.
- 75. Wang F, Tang Z, Shao H, Guo J, Tan T, Dong Y, and Lin L (2018). Long noncoding RNA HOTTIP cooperates with CCCTC-binding factor to coordinate HOXA gene expression. Biochem Biophys Res Commun 500, 852–859. 10.1016/j.bbrc.2018.04.173.
- 76. Islam Z, Saravanan B, Walavalkar K, Farooq U, Singh AK, Radhakrishnan S, Thakur J, Pandit A, Henikoff S, and Notani D (2023). Active enhancers strengthen insulation by RNA-mediated CTCF binding at chromatin domain boundaries. Genome Res 33, 1–17. 10.1101/gr.276643.122.
- 77. Yan P, Liu Z, Song M, Wu Z, Xu W, Li K, Ji Q, Wang S, Liu X, Yan K, et al. (2020). Genome-wide R-loop Landscapes during Cell Differentiation and Reprogramming. Cell Rep 32, 107870. 10.1016/j.celrep.2020.107870.
- 78. Bonev B, Mendelson Cohen N, Szabo Q, Fritsch L, Papadopoulos GL, Lubling Y, Xu X, Lv X, Hugnot JP, Tanay A, and Cavalli G (2017). Multiscale 3D Genome Rewiring during Mouse Neural Development. Cell 171, 557–572.e524. 10.1016/j.cell.2017.09.043.
- 79. Beagan JA, Duong MT, Titus KR, Zhou L, Cao Z, Ma J, Lachanski CV, Gillis DR, and Phillips-Cremins JE (2017). YY1 and CTCF orchestrate a 3D chromatin looping switch during early neural lineage commitment. Genome Res 27, 1139–1152. 10.1101/gr.215160.116.
- 80. Bryan TM (2019). Mechanisms of DNA Replication and Repair: Insights from the Study of G-Quadruplexes. Molecules 24. 10.3390/molecules24193439.
- 81. Weitzmann MN, Woodford KJ, and Usdin K (1996). The development and use of a DNA polymerase arrest assay for the evaluation of parameters affecting intrastrand tetraplex formation. J Biol Chem 271, 20958–20964. 10.1074/jbc.271.34.20958.
- 82. Richard P, and Manley JL (2017). R Loops and Links to Human Disease. J Mol Biol 429, 3168–3180. 10.1016/j.jmb.2016.08.031.
- 83. Misteli T (2010). Higher-order genome organization in human disease. Cold Spring Harb Perspect Biol 2, a000794. 10.1101/cshperspect.a000794.
- 84. Mitteaux J, Lejault P, Wojciechowski F, Joubert A, Boudon J, Desbois N, Gros CP, Hudson RHE, Boulé JB, Granzhan A, and Monchaud D (2021). Identifying G-Quadruplex-DNA-Disrupting Small Molecules. J Am Chem Soc 143, 12567–12577. 10.1021/jacs.1c04426.
- 85. Waller ZA, Sewitz SA, Hsu ST, and Balasubramanian S (2009). A small molecule that disrupts G-quadruplex DNA structure and enhances gene expression. J Am Chem Soc 131, 12628–12633. 10.1021/ja901892u.
- 86. Hashimoto H, Wang D, Horton JR, Zhang X, Corces VG, and Cheng X (2017). Structural Basis for the Versatile and Methylation-Dependent Binding of CTCF to DNA. Mol Cell 66, 711–720.e713. 10.1016/j.molcel.2017.05.004.
- 87. Davidson IF, Barth R, Zaczek M, van der Torre J, Tang W, Nagasaka K, Janissen R, Kerssemakers J, Wutz G, Dekker C, and Peters JM (2023). CTCF is a DNA-tension-dependent barrier to cohesin-mediated loop extrusion. Nature 616, 822–827. 10.1038/s41586-023-05961-5.
- 88. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, and Zhang F (2013). Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8, 2281–2308. 10.1038/nprot.2013.143.
- 89. Wulfridge P, and Sarma K (2021). A nuclease- and bisulfite-based strategy captures strand-specific R-loops genome-wide. eLife 10. 10.7554/eLife.65146.
- 90. Trizzino M, Barbieri E, Petracovici A, Wu S, Welsh SA, Owens TA, Licciulli S, Zhang R, and Gardini A (2018). The Tumor Suppressor ARID1A Controls Global Transcription via Pausing of RNA Polymerase II. Cell Rep 23, 3933–3945. 10.1016/j.celrep.2018.05.097.
- 91. Langmead B, and Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359. 10.1038/nmeth.1923.
- 92. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, and Durbin R (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079. 10.1093/bioinformatics/btp352.
- 93. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dündar F, and Manke T (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160–165. 10.1093/nar/gkw257.
- 94. Amemiya HM, Kundaje A, and Boyle AP (2019). The ENCODE Blacklist: Identification of Problematic Regions of the Genome. Scientific Reports 9, 9354. 10.1038/s41598-019-45839-z.
- 95. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21. 10.1093/bioinformatics/bts635.
- 96. Li B, and Dewey CN (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323. 10.1186/1471-2105-12-323.
- 97. Durand NC, Shamim MS, Machol I, Rao SS, Huntley MH, Lander ES, and Aiden EL (2016). Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst 3, 95–98. 10.1016/j.cels.2016.07.002.
- 98. Durand NC, Robinson JT, Shamim MS, Machol I, Mesirov JP, Lander ES, and Aiden EL (2016). Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst 3, 99–101. 10.1016/j.cels.2015.07.012.
- 99. Kruse K, Hug CB, and Vaquerizas JM (2020). FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data. Genome Biology 21, 303. 10.1186/s13059-020-02215-9.
- 100. Crane E, Bian Q, McCord RP, Lajoie BR, Wheeler BS, Ralston EJ, Uzawa S, Dekker J, and Meyer BJ (2015). Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244. 10.1038/nature14450.
- 101. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, and Liu XS (2008). Model-based Analysis of ChIP-Seq (MACS). Genome Biology 9, R137. 10.1186/gb-2008-9-9-r137.
- 102. Stark R, and Brown G (2011). DiffBind: differential binding analysis of ChIP-Seq peak data.
- 103. Yu G, Wang LG, and He QY (2015). ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383. 10.1093/bioinformatics/btv145.
- 104. Alipanahi B, Delong A, Weirauch MT, and Frey BJ (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol 33, 831–838. 10.1038/nbt.3300.
- 105. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, and Smyth GK (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43, e47. 10.1093/nar/gkv007.
- 106. Robinson MD, McCarthy DJ, and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. 10.1093/bioinformatics/btp616.
- 107. Pintacuda G, Wei G, Roustan C, Kirmizitas BA, Solcan N, Cerase A, Castello A, Mohammed S, Moindrot B, Nesterova TB, and Brockdorff N (2017). hnRNPK Recruits PCGF3/5-PRC1 to the Xist RNA B-Repeat to Establish Polycomb-Mediated Chromosomal Silencing. Mol Cell 68, 955–969.e910. 10.1016/j.molcel.2017.11.013.
- 108. Liang S, Silva JC, Suska O, Lukoszek R, Almohammed R, and Cowling VH (2022). CMTR1 is recruited to transcription start sites and promotes ribosomal protein and histone gene expression in embryonic stem cells. Nucleic Acids Res 50, 2905–2922. 10.1093/nar/gkac122.
- 109. ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74. 10.1038/nature11247.
Data Availability Statement
Sequencing data have been deposited at GEO and original western blot and EMSA images have been deposited at Mendeley Data. All data are publicly available as of the date of publication. Accession numbers and DOI are listed in the key resources table.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
Anti-CTCF (D31H2) | Cell Signaling | Cat# 3418; RRID:AB_2086791 |
Anti-ADNP | Yan et al., 2022 | N/A |
Anti-EZH2 | Cell Signaling | Cat# 5246; RRID:AB_10694683 |
BG4 (scFv) | Absolute Antibody | Ab00174-30.126 |
Monoclonal ANTI-FLAG M2 antibody | Sigma | A2220 |
Rabbit Anti-Mouse | Fisher | Cat# SA5-10192; RRID:AB_2556772 |
Rabbit Anti-FLAG | Sigma | Cat# F7425; RRID:AB_439687 |
Goat Anti-Rabbit IgG Alexa 555 | Fisher | Cat# 21429; RRID:AB_2535850 |
Bacterial and Virus Strains | ||
BL21 | Fisher | C601003 |
Chemicals, Peptides, and Recombinant Proteins | ||
NEBuilder HiFi | NEB | E2621 |
GST-ΔRNH-MNase | Yan et al., 2019 | N/A |
GST-MNase | Yan et al., 2019 | N/A |
pA-MNase | Skene and Henikoff, 2017 | N/A |
IPTG | Fisher | BP17755-10 |
Glutathione Sepharose 4B | Sigma | GE17-0756-01 |
RNase H | NEB | M0297 |
Indole-3-acetic acid | Sigma | I5148 |
Trizol | Fisher | 15596026 |
FastSelect -rRNA HMR | Qiagen | 334387 |
Pyridostatin hydrochloride | Sigma | SML2690 |
Phen-DC3 trifluoromethanesulfonate | MedChemExpress | HY-15594A |
Actinomycin D | Cell Signaling | 15021S |
Critical Commercial Assays | ||
Duolink In Situ Mouse/Rabbit | Sigma | DUO92101 |
NEBNext Ultra II Directional RNA Library Prep Kit | NEB | E7760 |
Hi-C+ for High-Coverage Kit | Arima Genomics | N/A |
NEBNext Ultra II DNA Library Prep Kit | NEB | E7645 |
Deposited Data | ||
CUT&RUN, MapR, ChIP-Seq, RNA-Seq, and Hi-C data | This paper | GSE208145 |
Unprocessed gel images for Figures 3, S1, S2, and S3 | This paper | doi:10.17632/gyhk7sd38w.1 |
G4 CUT&Tag data | Lyu et al., 2022 | GSE173103 |
RNA-Seq and CUT&RUN data | Yan et al., 2022 | GSE171401 |
ChIP-Seq data | Liang et al., 2022 | GSE175631 |
ChIP-Seq data | Saldaña-Meyer et al., 2019 | GSE125595 |
WGBS data | Habibi et al., 2013 | GSE41923 |
H3K4me3 data | ENCODE | ENCFF469DBC |
H3K27me3 data | ENCODE | ENCFF012GHA |
Hi-C contact maps | Bonev et al., 2017 | N/A |
Experimental Models: Cell Lines | ||
E14 | ||
ADNPKO E14 | Yan et al., 2022 | N/A |
EN52.9.1 | Nora et al., 2017 | N/A |
E14-Praf2-GDel1 | This paper | N/A |
E14-Praf2-GDel2 | This paper | N/A |
Oligonucleotides | ||
Oligos and gBlocks (plasmid construction) | IDT (See Table S2) | N/A |
Oligos (CTCF EMSAs) | IDT (See Table S1) | N/A |
Recombinant DNA | ||
pETDuet-GST-CTCF-ZF1-11 | Ronen Marmorstein’s lab | N/A |
pETDuet-GST-CTCF-ΔZF1 | This paper | N/A |
pETDuet-GST-CTCF-ΔZF10 | This paper | N/A |
PX459 | Feng Zhang’s lab | Addgene:62988 |
PX459-Praf2-gRNA | This paper | N/A |
pcDNA3 | Invitrogen | Addgene:2092 |
pcDNA3-Praf2-gDel | This paper | N/A |
Software and Algorithms | ||
Bowtie 2.2.9 | Langmead and Salzberg, 2012 | RRID:SCR_005476 |
Samtools | Li et al., 2009 | RRID:SCR_002105 |
Integrative Genomics Viewer | Robinson et al., 2011; Thorvaldsdóttir et al., 2013 | RRID:SCR_011793 |
deepTools 3.4.1 | Ramírez et al., 2016 | RRID:SCR_016366 |
MACS2 2.2.7.1 | Zhang et al., 2008; https://github.com/taoliu/MACS/ | RRID:SCR_013291 |
Juicer | Durand et al., 2016 | RRID:SCR_017226 |
STAR 2.7.3 | Dobin et al., 2013 | RRID:SCR_004463 |
RSEM 1.3.3 | Li and Dewey, 2011 | RRID:SCR_013027 |
Juicebox | Durand et al., 2016 | RRID:SCR_021172 |
FAN-C 0.9.23 | Kruse et al., 2020 | https://github.com/vaquerizaslab/fanc |
bedtools 2.30.0 | Quinlan and Hall, 2010 | RRID:SCR_006646 |
DiffBind 2.12.0 | Stark and Brown, 2011; http://bioconductor.org/packages/DiffBind/ | RRID:SCR_012918 |
ChIPseeker 1.20.0 | Yu et al., 2015 | http://bioconductor.org/packages/ChIPseeker/ |
DeepBind 0.11 | Alipanahi et al., 2015 | http://tools.genes.toronto.edu/deepbind/ |
limma 3.40.6 | Ritchie et al., 2015 | RRID:SCR_010943 |
edgeR 3.26.8 | Robinson et al., 2010 | RRID:SCR_012802 |
ChromHMM 1.20.0 | Ernst and Kellis, 2012 | RRID:SCR_018141 |