Abstract
CCCTC-binding factor (CTCF), a ubiquitously expressed and highly conserved protein, is known to play a critical role in chromatin structure. Post-translational modifications (PTMs) diversify the functions of proteins to regulate numerous cellular processes. However, the effects of PTMs on the genome-wide binding of CTCF and the organization of three-dimensional (3D) chromatin structure are not fully understood. In this study, we profiled the PTMs of CTCF and demonstrated that CTCF can be O-GlcNAcylated and arginine methylated. Functionally, we demonstrated that O-GlcNAcylation inhibits CTCF binding to chromatin. Meanwhile, deficiency of CTCF O-GlcNAcylation results in the disruption of loop domains and the alteration of chromatin loops associated with cellular development. Furthermore, the deficiency of CTCF O-GlcNAcylation increases the expression of developmental genes and negatively regulates maintenance and establishment of stem cell pluripotency. In conclusion, these results provide key insights into the role of PTMs in 3D chromatin structure.
Subject terms: Chromatin structure, Self-renewal
CTCF, which is known to play a critical role in chromatin structure, undergoes post-translational modifications (PTMs). In this research, O-GlcNAcylation was found to inhibit CTCF binding, impacting 3D chromatin structure, gene expression and cellular development.
Introduction
The mammalian genome is organized into 3D structures ranging from compartments to topologically associating domains (TADs) and loops1–3. These structures are highly dynamic during development, and their disruption can lead to abnormal gene transcription and diseases4,5. CTCF, a ubiquitously expressed and highly conserved DNA-binding protein, plays a critical role in chromatin architecture organization6–9. CTCF has been found to form TADs via cohesin-mediated loop extrusion3,9–12, and the phase separation of CTCF organizes inter-A compartment interactions13.
The chromatin binding of CTCF is important for chromatin structure organization14–16. DNA methylation, RNA binding, and protein–protein interactions have been reported to play roles in CTCF binding17–22. However, the mechanisms regulating CTCF binding to chromatin are not fully understood23–25. PTMs are essential mechanisms to extend protein functions beyond what is dictated by gene transcription and to regulate cellular physiology26,27. Despite the potential diversity of CTCF PTMs28–30, only a few have been identified, such as acetylation, sumoylation, poly(ADP-ribosyl)ation, and phosphorylation31–36. To understand the functional significance of CTCF PTMs, potential PTMs should be systematically mapped from the N terminus to the C terminus. Furthermore, previous studies have found that PTMs of CTCF can affect its chromatin binding at a few loci. For example, the phosphorylation of CTCF decreases its DNA-binding activity at the rRNA gene locus, and mutating CTCF at the major phosphorylation sites leads to the repression of binding to the chicken and human c-myc promoters37,38. However, the effect of PTMs on the genome-wide binding of CTCF has not been fully elucidated, and the effect of CTCF PTMs on 3D chromatin structure is likewise not fully understood.
Here, we profiled the PTMs of CTCF in mouse embryonic stem cells (mESCs) and showed that CTCF can undergo O-GlcNAcylation and arginine methylation. To investigate the role of CTCF O-GlcNAcylation, we profiled 3D chromatin architecture, CTCF binding (ChIP-seq), chromatin accessibility, and gene expression in the context of O-GlcNAcylation deficiency. These findings provide information on the importance of CTCF PTMs in organizing 3D genome structure.
Results
The PTM profiling of CTCF
It has been reported that CTCF can be phosphorylated, acetylated, sumoylated, and poly(ADP-ribosyl)ated31–34,39,40. Many previously reported PTMs were identified by a candidate-based approach, in which PTM sites are predicted from known databases and then validated38. Other PTMs of CTCF have been identified by enriching for the PTM of interest, such as the enrichment of phosphorylation by antibody or phos-tag SDS/PAGE37. However, these strategies carry high levels of inherent bias, which is not conducive to finding new CTCF PTMs.
To understand the function of CTCF PTMs, potential PTMs should be mapped from the N terminus to the C terminus with an unbiased method41. Hence, we mapped CTCF PTMs in an unbiased manner; a schematic overview is shown in Fig. 1a. In brief, CTCF was purified from the lysates of WT mESCs and separated by SDS-PAGE (Supplementary Fig. 1a). After Coomassie blue staining, the purified protein was analyzed using liquid chromatography tandem MS (LC–MS/MS), followed by analysis of the raw MS files via Byonic database42. The PTMs can be classified into four types, among which phosphorylation and acetylation have been found previously (Fig. 1b; Supplementary Fig. 1b)31,39. Importantly, we observed that CTCF can also be O-GlcNAcylated and arginine methylated, which has not been reported previously (Fig. 1b). To further confirm the results, purified CTCF proteins were enriched with antibodies against O-GlcNAcylation and arginine methylation, respectively (Fig. 1a). LC–MS/MS analysis revealed one O-GlcNAcylated peptide spanning amino acids 668-672 (TNQPK) (Fig. 1c, d) and four arginine-methylated peptides, each showing a mass shift of 14 or 28 Da on an arginine residue (Supplementary Fig. 1c–f).
Taken together, we profiled the PTMs of CTCF with an unbiased method and identified that CTCF can be O-GlcNAcylated and arginine methylated.
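The mass shifts mentioned above can be related to modification types with a small lookup. The following is only a minimal sketch: the monoisotopic masses are standard chemical values rather than numbers from this study, and the function name and tolerance are hypothetical choices made for illustration.

```python
# Standard monoisotopic mass additions in Da (textbook values, not from this study).
METHYL = 14.01565    # net addition of CH2 per methyl group
HEXNAC = 203.07937   # one N-acetylhexosamine, e.g. O-GlcNAc


def classify_mass_shift(delta_da, tol=0.02):
    """Return a label for an observed residue mass shift, or None if unmatched.

    `tol` (Da) is an illustrative tolerance, not an instrument setting.
    """
    candidates = {
        "mono-methyl": METHYL,
        "di-methyl": 2 * METHYL,
        "HexNAc (O-GlcNAc)": HEXNAC,
    }
    for label, expected in candidates.items():
        if abs(delta_da - expected) <= tol:
            return label
    return None


for shift in (14.016, 28.031, 203.079):
    print(shift, classify_mass_shift(shift))
```

Note that a shift of about 28 Da identifies dimethylation but cannot by itself distinguish the asymmetric from the symmetric form, which is why antibody-based validation is still needed.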
Validation of O-GlcNAcylation and arginine methylation
O-GlcNAc-specific antibody RL2 and succinylated wheat germ agglutinin (sWGA) have high affinity for O-GlcNAcylated proteins. To further verify the O-GlcNAcylation of CTCF, we purified RL2- and sWGA-bound proteins from mESC lysates under denaturing conditions to disrupt protein–protein interactions and probed for CTCF by western blot. Endogenous CTCF specifically bound to RL2 and sWGA (Fig. 2a), showing that CTCF is O-GlcNAcylated. We reconfirmed the O-GlcNAcylation of CTCF by purifying CTCF from mESC lysates and performing western blot with RL2, which recognizes O-GlcNAcylated serine and threonine residues (Fig. 2b). Meanwhile, mutation of the CTCF amino acid involved in O-GlcNAcylation decreased the RL2 signal (Fig. 2c). We also removed terminal O-linked glycosidic modifications using β-N-Acetyl-hexosaminidase (β-hex)43, which decreased the O-GlcNAcylation signal of CTCF (Fig. 2d). To further confirm the modification, we examined extracted ion chromatograms. The peak area of the O-GlcNAcylated peptide was 23,340,846,486 and the peak area of the unmodified peptide was 15,501,338,529. Therefore, the O-GlcNAcylated form accounts for 60.09% of the total peak area (modified plus unmodified), suggesting that about 60% of CTCF is O-GlcNAcylated at this site. Since phosphorylation can also occur at serine or threonine residues, we asked whether phosphorylation is present at T668. We performed high-throughput phosphoproteomics and identified approximately 11,000 phosphorylation sites in mESCs. The phosphoproteomic analysis did not detect a phosphorylation signal at T668, indicating a low probability of phosphorylation at this site. Furthermore, we investigated the distribution of CTCF O-GlcNAcylation across mammalian tissues and found that CTCF O-GlcNAcylation is present in multiple important tissues (Fig. 2e).
Meanwhile, we tested CTCF O-GlcNAcylation in mouse embryonic fibroblasts (MEFs), pre-iPSCs (a late intermediate reprogramming stage), and mESCs. The results showed that the signal of CTCF O-GlcNAcylation increases with pluripotency (Fig. 2f).
There are three types of arginine methylation: mono-methylarginine (MMA), asymmetrical dimethylarginine (ADMA), and symmetrical dimethylarginine (SDMA)44. To further verify the arginine methylation of CTCF, we used antibodies against ADMA, SDMA, and MMA, respectively, to purify arginine-methylated proteins under denaturing conditions, and probed for CTCF by western blot (Supplementary Fig. 2a–c). We also purified CTCF and performed western blots for ADMA, SDMA, and MMA to reconfirm the results (Supplementary Fig. 2d–f). Moreover, the level of arginine methylation decreased in response to treatment with the arginine methylation inhibitor, periodate oxidized adenosine (AdOx) (Supplementary Fig. 2g). In conclusion, we demonstrated that CTCF carries all three types of arginine methylation.
Taken together, we uncovered the PTM map of CTCF (Fig. 1b) and demonstrated that CTCF can also be O-GlcNAcylated and arginine methylated.
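The 60.09% figure above follows directly from the two extracted-ion peak areas. As a minimal sketch, the arithmetic can be written as below; the function name is hypothetical, and the calculation assumes that the modified and unmodified peptides ionize with comparable efficiency, which the text does not state.

```python
def site_occupancy(modified_area, unmodified_area):
    """Fraction of the peptide signal that carries the modification."""
    total = modified_area + unmodified_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return modified_area / total


# Peak areas reported in the text for the T668-containing peptide.
occupancy = site_occupancy(23_340_846_486, 15_501_338_529)
print(f"{occupancy:.2%}")  # prints 60.09%
```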
The O-GlcNAcylation of CTCF is regulated by OGT
O-GlcNAcylation, an O-β-glycosidic attachment of a single N-acetylglucosamine (GlcNAc) to serine and threonine residues, is involved in the regulation of diverse cellular processes45. Because previous studies have shown that O-GlcNAcylation plays important roles in diverse aspects of protein function, such as protein–protein interaction, protein stability, and enzyme activity46–48, we focused on the regulatory mechanism and function of CTCF O-GlcNAcylation.
To investigate the regulatory mechanism of CTCF O-GlcNAcylation, we focused on the process of O-GlcNAcylation. It has been reported that O-GlcNAc transferase (OGT) adds the modification while O-GlcNAcase (OGA) removes it45. We confirmed the interaction of OGT with CTCF (Fig. 2g). We then analyzed the interaction domain of OGT and CTCF with ZDOCK49 and found that OGT interacts with CTCF in the region of O-GlcNAcylation (Fig. 2h). Furthermore, we knocked down Ogt and found that the level of CTCF O-GlcNAcylation was decreased (Fig. 2i).
Taken together, these results show that the O-GlcNAcylation of CTCF is regulated by OGT.
Deficiency of O-GlcNAcylation enhances chromatin binding of CTCF
To examine the function of CTCF O-GlcNAcylation, we used CTCF-mAID mESCs as donor cells14 and transduced the cells with vectors encoding WT_CTCF or T668A O-GlcNAc-deficient CTCF (MUT_CTCF), respectively (Fig. 3a). Then, we induced the rapid degradation of endogenously tagged CTCF using 0.5 mM auxin. The clone in which the expression level of exogenous CTCF was similar to that of the endogenous CTCF was collected for further functional experiments (Fig. 3b–d, Supplementary Fig. 3a). After treatment with auxin for 48 h to replace the endogenous CTCF, the expression of the core pluripotency factors OCT4, SOX2 and NANOG was similar between WT_CTCF and MUT_CTCF mESCs, and the cells were morphologically identical and undifferentiated (Supplementary Fig. 3b, c). We used these cells to examine the function of CTCF O-GlcNAcylation in chromatin binding and 3D chromatin structure.
We first investigated the change in genome-wide CTCF binding in response to the mutation of O-GlcNAcylation (Supplementary Fig. 3d). To identify specific differences between the chromatin binding of O-GlcNAc-deficient CTCF (MUT_CTCF) and WT_CTCF, we performed spike-in normalized ChIP-seq50,51 and found that the global chromatin binding of O-GlcNAc-deficient CTCF was increased significantly (Fig. 3e). Consistently, immunoblots showed that chromatin-bound CTCF was increased and non-chromatin-bound CTCF was decreased in the MUT_CTCF mESCs (Fig. 3h). The enhanced peaks were preferentially located in introns, intergenic regions and insulators (Fig. 3f, Supplementary Fig. 3e). Since a gain of CTCF occupancy is associated with a gain of chromatin accessibility, we performed the assay for transposase-accessible chromatin using high-throughput sequencing (ATAC-seq). Chromatin accessibility at the regions of enhanced peaks was higher in MUT_CTCF than in WT_CTCF (Fig. 3g). Meanwhile, there were more CTCF motifs at the enhanced peaks, which indicates that more CTCF proteins could bind to these peaks (Supplementary Fig. 3f). Thereafter, we defined the genes with enhanced peaks located in their promoters (3000 base pairs upstream of gene transcriptional start sites) as enhanced-CTCF-peak-related genes. Gene Ontology (GO) analysis demonstrated that these genes were highly correlated with developmental processes (Fig. 3i, j). To further validate the results, we selected additional clones of WT_CTCF and MUT_CTCF with consistent expression levels and obtained consistent results (Supplementary Fig. 3g, h).
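To illustrate the idea behind spike-in normalization, the sketch below derives per-sample scale factors from the number of reads mapped to the spike-in (Drosophila) genome. This is one common convention (scaling every sample to the sample with the fewest spike-in reads) and is an assumption here; the exact normalization used in this study is not described in this section. The function name and read counts are hypothetical.

```python
def spike_in_scale_factors(spike_in_reads):
    """Map sample name -> factor that equalizes spike-in read counts.

    Assumed convention: the sample with the fewest spike-in reads gets a
    factor of 1.0 and all other samples are scaled down to match it.
    """
    if not spike_in_reads or min(spike_in_reads.values()) <= 0:
        raise ValueError("every sample needs a positive spike-in read count")
    reference = min(spike_in_reads.values())
    return {sample: reference / n for sample, n in spike_in_reads.items()}


# Hypothetical spike-in read counts, for illustration only.
factors = spike_in_scale_factors({"WT_CTCF": 400_000, "MUT_CTCF": 200_000})
print(factors)
```

Multiplying each sample's signal track by its factor puts the samples on a common scale defined by the constant amount of spike-in chromatin, so that a global gain or loss of binding is not masked by per-sample depth normalization.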
Taken together, these results demonstrate that O-GlcNAcylation inhibits CTCF binding to chromatin.
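The promoter-based gene assignment described above (peaks within 3000 base pairs upstream of the transcriptional start site) can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it assumes strand-aware promoters and half-open coordinates, and the gene names and positions are hypothetical.

```python
def promoter_interval(tss, strand, upstream=3000):
    """Promoter as the `upstream` bp preceding the TSS, strand-aware."""
    if strand == "+":
        return max(0, tss - upstream), tss
    return tss, tss + upstream


def genes_with_promoter_peaks(genes, peaks):
    """Return names of genes whose promoter overlaps at least one peak.

    genes: iterable of (name, chrom, tss, strand)
    peaks: iterable of (chrom, start, end), half-open intervals
    """
    peaks = list(peaks)
    hits = set()
    for name, chrom, tss, strand in genes:
        p_start, p_end = promoter_interval(tss, strand)
        for p_chrom, start, end in peaks:
            if p_chrom == chrom and start < p_end and end > p_start:
                hits.add(name)
                break
    return hits


# Hypothetical genes and one hypothetical enhanced peak.
genes = [
    ("GeneA", "chr1", 10_000, "+"),
    ("GeneB", "chr1", 10_000, "-"),
    ("GeneC", "chr2", 5_000, "+"),
]
peaks = [("chr1", 8_000, 8_200)]
print(genes_with_promoter_peaks(genes, peaks))  # {'GeneA'}
```

The nested loop is quadratic and only suitable for small inputs; genome-scale analyses normally rely on an interval index or a dedicated overlap tool.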
Deficiency of CTCF O-GlcNAcylation disturbs a subset of loop domains
To measure changes in 3D chromatin structure upon mutation of O-GlcNAcylation, we generated Hi-C profiles of WT_CTCF and MUT_CTCF mESCs (Supplemental Table 4), which revealed high correlation between replicates (Supplementary Fig. 4a). Our WT_CTCF Hi-C showed a high correlation with ESCs in Bonev et al. (Supplementary Fig. 4b) and demonstrated similarity at compartments, TADs and loops52 (Supplementary Fig. 4d–f). Specifically, 7908 and 7975 TADs, 11,207 and 11,140 loops were identified in WT_CTCF and MUT_CTCF cells respectively, and 7769 TADs and 9931 loops were identified in Bonev’s data (Supplementary Fig. 4c).
First, we investigated the extent to which O-GlcNAcylation mutation affects the compartments. The relative contact probability and contact maps did not show a significant change (Fig. 4a, b). The compartment states were maintained upon mutation of O-GlcNAcylation (Fig. 4c). Overall, the mutation of CTCF O-GlcNAcylation does not affect the distribution of A/B compartments. Then, we analyzed the influence on loop domains, which are contact domains whose endpoints form a chromatin loop. Using an approach developed by Rao et al.3, we identified 3193 loop domains in WT_CTCF and 3263 in MUT_CTCF. Loop domains from the two samples whose anchors overlapped at both ends were merged into a single entry to generate a non-redundant set. In the end, we obtained a total of 4431 merged loop domains, among which 356 were enhanced and 375 were weakened (Fig. 4d, e).
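The merging step described above, in which loop domains sharing both anchors are collapsed into one entry, can be sketched as below. This is a minimal illustration under stated assumptions: anchors are half-open intervals, a merged entry takes the union of the overlapping anchors, and a single greedy pass is used (so chains of partially overlapping domains are not guaranteed to collapse fully). The coordinates are hypothetical.

```python
def overlaps(s1, e1, s2, e2):
    """True if half-open intervals [s1, e1) and [s2, e2) intersect."""
    return s1 < e2 and s2 < e1


def merge_loop_domains(domains):
    """Merge loop domains whose two anchors both overlap.

    domains: iterable of (chrom, a1_start, a1_end, a2_start, a2_end)
    """
    merged = []
    for chrom, a1s, a1e, a2s, a2e in sorted(domains):
        for i, (m_chrom, m1s, m1e, m2s, m2e) in enumerate(merged):
            if (m_chrom == chrom
                    and overlaps(a1s, a1e, m1s, m1e)
                    and overlaps(a2s, a2e, m2s, m2e)):
                # Same loop domain: keep the union of each anchor.
                merged[i] = (chrom, min(a1s, m1s), max(a1e, m1e),
                             min(a2s, m2s), max(a2e, m2e))
                break
        else:
            merged.append((chrom, a1s, a1e, a2s, a2e))
    return merged


# Hypothetical loop domains called in two samples.
wt = [("chr1", 100, 110, 500, 510), ("chr1", 1000, 1010, 2000, 2010)]
mut = [("chr1", 105, 115, 505, 515), ("chr1", 1000, 1010, 3000, 3010)]
unique = merge_loop_domains(wt + mut)
print(len(unique))  # 3
```

In this toy input, the first WT and first MUT domains share both anchors and collapse into one entry, whereas the two domains that share only one anchor remain separate.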
Taken together, these results demonstrate that deficiency of CTCF O-GlcNAcylation disturbs a subset of loop domains.
Deficiency of CTCF O-GlcNAcylation alters enhancer-promoter interactions associated with cellular development
To test whether the O-GlcNAcylation of CTCF plays a role in regulating chromatin loops, we defined loops with CTCF ChIP-seq peaks located at both anchors as CTCF-related loops and performed aggregate peak analysis to quantify the differences (Fig. 5a). To examine the types of interaction among the differential loops, we divided CTCF-related loops into enhancer-enhancer (E-E), enhancer-promoter (E-P), promoter-promoter (P-P) and insulator-insulator (I-I) interactions53. The O-GlcNAc-deficient CTCF-related loops were preferentially I-I and E-P interactions (Fig. 5b). Since E-P interactions contribute to gene transcription for cellular development54, we then analyzed differential CTCF-related E-P interactions. In total, 201 E-P interactions were strengthened and 316 were weakened (Fig. 5c). GO analysis revealed that the genes with promoters overlapping the anchors of strengthened CTCF-related E-P interactions were highly related to developmental processes, whereas the genes with promoters overlapping the anchors of weakened CTCF-related E-P interactions were closely associated with metabolic processes and regulation of the cell cycle (Fig. 5d). These results indicate that deficiency of CTCF O-GlcNAcylation may contribute to cellular development.
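The division of loops into E-E, E-P, P-P and I-I classes can be illustrated with a small lookup keyed on the annotation of the two anchors. This is a sketch only: how each anchor is annotated as enhancer, promoter or insulator is not specified in this section, so the labels are taken as given, and the "other" category for mixed or unannotated pairs is an assumption.

```python
from collections import Counter

# An unordered pair of anchor annotations maps to an interaction class.
LOOP_CLASSES = {
    frozenset(["enhancer"]): "E-E",
    frozenset(["enhancer", "promoter"]): "E-P",
    frozenset(["promoter"]): "P-P",
    frozenset(["insulator"]): "I-I",
}


def classify_loop(anchor1_type, anchor2_type):
    """Return the interaction class of a loop, or 'other' if not covered."""
    key = frozenset([anchor1_type, anchor2_type])
    return LOOP_CLASSES.get(key, "other")


# Hypothetical anchor annotations for four loops.
loops = [
    ("promoter", "enhancer"),
    ("insulator", "insulator"),
    ("enhancer", "enhancer"),
    ("insulator", "promoter"),
]
counts = Counter(classify_loop(a, b) for a, b in loops)
print(counts)
```

Using a `frozenset` as the key makes the classification independent of anchor order, so promoter-enhancer and enhancer-promoter pairs fall into the same E-P class.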
Taken together, these results demonstrate that deficiency of CTCF O-GlcNAcylation alters enhancer-promoter interactions which are closely associated with cellular development.
Deficiency of CTCF O-GlcNAcylation upregulates the expression of developmental genes
To test whether mutation of CTCF O-GlcNAcylation has any effect on gene expression, we compared the RNA-seq profiles of MUT_CTCF mESCs and WT_CTCF mESCs (Fig. 6, Supplementary Fig. 5a). GO analysis revealed that the up-regulated genes were associated with developmental processes, while the down-regulated genes were associated with cell adhesion and metabolic processes (Supplementary Fig. 5b, c). These results can partially account for the differentiated phenotype observed when MUT_CTCF mESCs are cultured for a long time (Fig. 7). Next, we asked whether the changes in 3D chromatin structure and CTCF binding have any effect on gene expression. Since deficiency of O-GlcNAcylation enhanced the chromatin binding of CTCF, we sought to determine whether the strengthened CTCF-related E-P interactions are related to the enhanced chromatin binding of CTCF. We analyzed the difference in read counts of CTCF peaks on strengthened CTCF-related E-P interactions and found that overall CTCF binding was enhanced (Fig. 6a). Further analysis revealed that 65.52% of strengthened CTCF-related E-P interactions had enhanced CTCF peaks on them (Fig. 6b). We then examined the group of genes targeted by strengthened CTCF-related E-P interactions with enhanced CTCF peaks and found that 63.64% of these genes have increased expression (Fig. 6b). GSEA showed that these genes were enriched in MUT_CTCF mESCs, indicating that they tend to be up-regulated in MUT_CTCF mESCs (Fig. 6c). GO enrichment analysis further revealed that these genes were enriched for biological processes related to the function of mESCs, including the ability to proliferate indefinitely in vitro (self-renewal) and to differentiate into cells from all three germ layers (pluripotency) (Fig. 6d). Specific examples of these genes include Hox genes, such as Hoxd8 and Hoxd9, which are part of an important gene family involved in limb development and stem cell differentiation55 (Fig. 6f).
GO analysis suggested that mutation of O-GlcNAcylation may affect the ability of proliferation and lead to differentiation of MUT_CTCF mESC cell lines.
Taken together, these results indicate that after the deficiency of O-GlcNAcylation, strengthened CTCF-related E-P interactions with enhanced CTCF peaks upregulate the expression of developmental genes (Fig. 6e).
Deficiency of CTCF O-GlcNAcylation negatively regulates maintenance and establishment of pluripotency
Emerging evidence suggests that CTCF plays an important role in cell fate transitions. For example, overexpression of CTCF improves reprogramming56, knockout of CTCF impedes cell differentiation from ESCs to neural precursor cells (NPCs)16 and CTCF is a barrier for 2C-like reprogramming57. Meanwhile, a large number of studies have shown that O-GlcNAcylation plays a role in the pluripotency of mESCs. For instance, O-GlcNAcylation of OCT4 and ESRRB facilitates the maintenance of pluripotency47,58, whereas O-GlcNAcylation of SOX2 inhibits pluripotency46. Importantly, we demonstrated above that deficiency of CTCF O-GlcNAcylation affects chromatin structure associated with cellular development. Hence, we wondered whether O-GlcNAcylation of CTCF is important for cell fate transitions.
To answer this question, WT_CTCF and MUT_CTCF mESCs were cultured for a prolonged period. Cell viability was reduced in MUT_CTCF cells (Fig. 7a), and MUT_CTCF cells grew more slowly than WT_CTCF cells due to an elongated G1 phase (Fig. 7b). Moreover, mutation of O-GlcNAcylation reduced total colony numbers and increased the proportion of partially differentiated populations (Fig. 7c, d, g). The expression of the ectodermal marker genes Pax3 and Fgf5 was increased (Fig. 7e), while that of the core pluripotency factors Oct4 and Nanog was decreased (Fig. 7f). These results indicate that O-GlcNAcylation is required for the maintenance and self-renewal of mESCs. To study the potential roles of O-GlcNAcylation of CTCF in differentiation, we performed embryoid body (EB) differentiation, which is a spontaneous differentiation of mESCs into cells of all three germ layers. During EB formation, the EBs were larger in MUT_CTCF cells (Fig. 7h) and the expression of the ectodermal markers Fgf5 and Pax3 was increased in MUT_CTCF cells compared with WT_CTCF cells (Fig. 7i). Furthermore, we performed directed differentiation from mESCs to NPCs. As shown in Supplementary Fig. 6a, the expression of the NPC markers was decreased, indicating that mutation of O-GlcNAcylation of CTCF inhibits differentiation into NPCs. At the same time, we explored the potential roles of O-GlcNAcylation of CTCF in the establishment of pluripotency during somatic cell reprogramming. We found that mutation of O-GlcNAcylation decreased reprogramming efficiency (Fig. 7j).
Taken together, our data demonstrate that O-GlcNAcylation of CTCF is important for the maintenance and establishment of pluripotency.
Discussion
In this study, we profiled the PTMs of CTCF and demonstrated that CTCF can also be O-GlcNAcylated and arginine methylated, providing a basis for further functional studies on the PTMs of CTCF. Functionally, we revealed that O-GlcNAcylation inhibits CTCF binding to chromatin, and that deficiency of CTCF O-GlcNAcylation results in the disruption of loop domains and the alteration of chromatin loops associated with cellular development. These findings provide a molecular mechanism for the regulation of 3D chromatin structure and indicate the importance of PTMs in chromatin organization. Some of the altered loops were weakened, and we briefly explored potential mechanisms for this weakening. Since a previous study has indicated the impact of OCT4 on CTCF loops22, we speculated that OCT4 also has an effect on the weakened CTCF loops. To investigate this, we analyzed the binding of OCT4 on weakened CTCF-related E-P interactions. The permutation test indicated that OCT4 binding was significantly higher both on the anchors of weakened CTCF-related E-P interactions and on the CTCF peaks of weakened CTCF-related E-P interactions (Supplementary Fig. 7a, b). Meanwhile, we conducted ChIP-qPCR experiments and found that, after the mutation of O-GlcNAcylation, the chromatin binding of OCT4 decreased on the anchors of the weakened loops (Supplementary Fig. 7c, d). However, the binding of OCT4 on the anchors of the strengthened loops remained unchanged (Supplementary Fig. 7c). These results suggest that the weakened loops may be attributed to the reduced binding of OCT4. As for why the mutation of CTCF O-GlcNAcylation leads to a decrease in OCT4 binding on the anchors of weakened loops, we found that the mutation of O-GlcNAcylation weakened the interaction between CTCF and OCT4 (Supplementary Fig. 7e).
Therefore, we speculate that at the anchor of weakened loops, the mutation of O-GlcNAcylation weakens the interaction between CTCF and OCT4, leading to a decrease in OCT4’s chromatin binding, which may subsequently result in a reduction in the strength of CTCF-related loops.
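A permutation test of the kind mentioned above for OCT4 binding can be sketched as follows. The exact design used in the study (background set, number of permutations, test statistic) is not given in this section, so everything here is an assumption: the statistic is the number of target regions overlapping an OCT4 peak, random region sets of the same size are drawn from a background pool, and a one-sided empirical p-value is reported. The data are hypothetical.

```python
import random


def permutation_p_value(target_hits, background_hits, n_perm=10_000, seed=0):
    """One-sided empirical p-value for enrichment of hits among targets.

    target_hits: list of 0/1 flags (1 = region overlaps an OCT4 peak)
    background_hits: 0/1 flags for the pool that random sets are drawn from
    """
    rng = random.Random(seed)
    observed = sum(target_hits)
    k = len(target_hits)
    at_least = sum(
        sum(rng.sample(background_hits, k)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (1 + at_least) / (1 + n_perm)


# Hypothetical data: 18 of 20 target anchors overlap a peak, versus 10% of
# a background pool of 1000 regions.
target = [1] * 18 + [0] * 2
background = [1] * 100 + [0] * 900
p = permutation_p_value(target, background)
print(p < 0.01)  # True
```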
CTCF is a widely expressed protein with thousands of chromatin-binding sites6,8,59,60. Depletion of its zinc finger domain or RNA-binding domain has different effects on the chromatin binding of CTCF and 3D chromatin structure19,25,61,62. The PTM profiling performed in this study shows that PTMs are widely distributed across various regions of CTCF (Fig. 1b), and the role of PTMs in different regions is also worth investigating. Meanwhile, CTCF has been reported to favor binding to a consensus motif, which is present at 55,000–65,000 sites in the genome, but 30%–60% of CTCF-binding sites show cell-type-specific patterns7,18,63–65. Deficiency of O-GlcNAcylation selectively enhances chromatin binding of CTCF at the promoters of developmental genes in mESCs (Fig. 3), indicating that PTMs play a role in the selection of CTCF binding sites. Furthermore, cell-type-specific CTCF-binding sites result in selective gene expression66,67. We found that the level of CTCF O-GlcNAcylation varies across different tissues (Fig. 2), which may be involved in CTCF-regulated gene expression in different cell types.
CTCF plays a key role in organismal development. Studies have shown that removal of CTCF from oocytes leads to embryonic lethality and that homozygous null mutant embryos die before implantation68–70. Depletion of CTCF from cardiac progenitors results in lethality in mice71. Knockout of the CTCF gene in neural cells also causes lethality72. We showed that CTCF O-GlcNAcylation is common among different cell types (Fig. 2); however, the role of CTCF O-GlcNAcylation in organismal development remains to be studied. Abnormalities in CTCF-mediated chromatin structures lead to disease5,73. Because we found that CTCF O-GlcNAcylation regulates 3D chromatin structures, whether O-GlcNAcylation is related to abnormal 3D structure in disease is worth studying.
Methods
Cell culture
The ESC line R174 was used in this study and cultured in gelatin-coated dishes with ESC medium, which consisted of DMEM (Hyclone, #SH30022.01), 15% (v/v) fetal bovine serum (FBS, Lonsera, #S712-012S), 0.1 mM β-mercaptoethanol (Sigma, #M6250), 2 mM L-glutamine (Thermo Fisher, #35050061), 0.1 mM nonessential amino acids (Thermo Fisher, #11140050), 1% (v/v) nucleoside mix (Sigma), and 1000 U/mL recombinant LIF (Millipore).
Mouse CTCF-EGFP-AID ESCs were a gift from the Bruneau lab14 and were cultured in gelatin-coated dishes with ESC medium. To deplete CTCF, the cells were cultured in ESC medium with 0.5 mM auxin (Sigma, #I5148-2g) for 2 days. To establish CTCF-EGFP-AID ESCs expressing the T668A O-GlcNAcylation-deficient mutant CTCF (MUT_CTCF) or WT CTCF (WT_CTCF) at a level similar to that of endogenous CTCF, the MUT_CTCF and WT_CTCF fragments were subcloned into pJD10575, Fugene HD (Promega, #E2311) was used for transfection, and the cells were selected with Hygromycin B (Thermo Fisher, #10687010). Cell clones were picked and then cultured in ESC medium with 0.5 mM auxin for 2 days to degrade endogenous CTCF. The clone in which the expression level of exogenous CTCF was similar to that of endogenous CTCF was identified with anti-CTCF antibody and was used in the functional experiments.
To knock down Ogt, we used Lipofectamine™ 2000 for siRNA transfection. Gene expression was measured by RT-qPCR after 72 h. The cells with the highest knockdown efficiency were used for downstream experiments.
NPC differentiation
The cell line was cultured in N2B27 medium which consisted of a 1:1 mixture of DMEM supplemented with 1× N2 (Gibco, #17502048), NEAA, 1 mM L-glutamine, and 0.1 mM β-mercaptoethanol with Neurobasal (Thermo Fisher, #21103049) supplemented with B27 (Gibco, #17504044). After 7 days, the cells were harvested and the total mRNA was extracted for RT-qPCR.
Antibodies
Antibodies used in this study were anti-CTCF (Active Motif, #61311), anti-CTCF (Abclonal, #A19588), anti-RL2 (abcam, #ab2739), anti-OGT (GeneTex, # GTX109939), anti-Mono-Methyl Arginine (MMA) antibody (CST, #8015), anti-Symmetric Di-Methyl Arginine (SDMA) antibody (PTMBIO, #PTM-617RM), anti-Asymmetric Di-Methyl Arginine (ADMA) antibody (PTMBIO, #PTM-605RM), anti-Acetyllysine antibody (PTMBIO, #PTM-105RM), anti-GAPDH (Abclonal, #AC001), anti-OCT4 (Santa, #sc-5279), anti-SOX2 (Santa, #sc-36582-3), anti-NANOG (Bethyl, #A300-397A), anti-H3 (Santa, #sc-17576). Goat anti-rabbit-IgG (H + L)–HRP (CST, #7074s), and goat anti-mouse-IgG (H + L)-HRP (CST, #7076s) were used as secondary antibodies for western blotting.
The purification of CTCF
A total of 50 million cells were harvested by trypsinization, washed with cold PBS, and frozen in liquid nitrogen. Whole-cell pellets were lysed in lysis buffer (50 mM HEPES pH = 7.6, 250 mM NaCl, 0.1% NP-40, 0.2 mM EDTA, 0.2 mM PMSF, 1× protease inhibitor cocktail) on ice. The lysates were cleared by centrifugation at 16,000 × g for 10 min at 4 °C and the supernatant was collected as the protein extract. Antibody-based purification was performed to detect the PTMs of endogenous CTCF. Briefly, CTCF antibody was conjugated to Protein G agarose (Roche, #11243233001) by incubating in IP DNP buffer (20 mM HEPES pH = 7.6, 0.2 mM EDTA, 1.5 mM MgCl2, 100 mM KCl, 20% glycerol, 0.02% NP-40, 1× protease inhibitor cocktail) overnight at 4 °C and washed twice. Then, the protein extract was added to the antibody-conjugated Protein G agarose and rotated at 4 °C. After 12 h, the supernatant was removed and the antibody-conjugated Protein G agarose was washed twice with IP DNP buffer. 2× SDS loading buffer was added to the antibody-conjugated Protein G agarose and the protein was eluted by boiling for 5 min at 95 °C. SDS-PAGE analysis was used to assess purification efficiency, and the remaining protein was stored at −80 °C for mass spectrometry and western blot.
Mass spectrometric analysis of CTCF PTMs
A Coomassie-stained SDS-PAGE gel band was excised and incubated with 200 μL of 100 mM ammonium bicarbonate with 25% ACN at 37 °C. After 30 min, the supernatant was removed and the gel slices were incubated with 10 mM DTT solution at 60 °C for 30 min. The DTT solution was removed, 100 mM IAA solution was added, and the sample was incubated at 37 °C for 15 min in the dark with shaking. The IAA solution was then removed and 100 mM ammonium bicarbonate/25% ACN was added to rinse the gel slices. For in-gel digestion, 500 μL of ACN was added to shrink the gel pieces and was removed after 15 min. Then, 100 mM ammonium bicarbonate was added to rehydrate the gel pieces, and 1 mg/mL trypsin stock solution was added to the sample for overnight digestion at 37 °C. The gel pieces were extracted three times by adding 50 μL of 25% ACN/0.1% TFA solution and incubating at 37 °C for 5–15 min. The supernatant was transferred to a new tube and the peptides were completely dried under vacuum. The peptides were desalted using reverse-phase solid-phase extraction cartridges (Sep-Pak C-18), completely dried under vacuum, and resuspended in 0.1% formic acid before LC–MS/MS analysis. The mass spectrometry experiments were performed twice, with protein coverage of 68.26% and 65.33%, respectively. Only modifications identified in both mass spectrometry experiments were considered as modifications of CTCF.
Enrichment of O-GlcNAcylated, methylated, and acetylated proteins
A total of 25 million cells were used for each enrichment of O-GlcNAcylated, methylated, or acetylated proteins. Proteins were extracted as described above and denatured by boiling for 10 min at 95 °C. For enrichment of O-GlcNAcylated proteins, WGA and RL2 purifications were carried out as described elsewhere76. Denatured proteins were incubated overnight at 4 °C with agarose-bound WGA resin (Vector Laboratories, #AL-1023S) or with RL2-conjugated Protein G agarose. The agarose was then washed twice with lysis buffer and the eluted proteins were analyzed by western blot. For enrichment of methylated proteins, anti-Mono-Methyl Arginine (MMA) antibody, anti-Symmetric Di-Methyl Arginine (SDMA) antibody and anti-Asymmetric Di-Methyl Arginine (ADMA) antibody were used, respectively, as above. For enrichment of acetylated proteins, anti-Acetyllysine antibody was used as above. Agarose without WGA or IgG-conjugated agarose was used as a control.
Identification of phosphorylation modification
2 × 10⁷ cells were collected for protein extraction and proteins were digested into peptides. Phosphorylated peptides were enriched and identified by mass spectrometry according to ref. 77.
The treatment of β-N-Acetyl-hexosaminidase (β-hex), and periodate oxidized adenosine (AdOx)
The treatment of β-N-Acetyl-hexosaminidase (β-hex, NEB, #P0721S) was performed according to the manufacturer’s instructions, and 4 μg purified CTCF was used. For the treatment of periodate oxidized adenosine (AdOx, TargetMol, #T22231), mESCs were treated with 30 μM AdOx and DMSO for 36 h and harvested for the purification of CTCF.
Co-immunoprecipitation
10 million cells were collected and nuclear extracts were prepared from mESCs as described78. Endogenous CTCF was immunoprecipitated with 5 µg of CTCF antibody pre-bound to Protein G agarose, and co-immunoprecipitated OGT was detected by western blot with an antibody against OGT. The reciprocal immunoprecipitation of OGT was performed in the same way.
Tissue protein extraction
Tissue protein used in Fig. 2e was generously provided by Dr. Ma from Peking University. The tissue was dissected on ice in PBS/10% FBS. Each tissue was separated with fine-pointed forceps and placed into a separate tube. The dissected tissues were weighed and then washed with PBS on ice. T-PER Reagent (Thermo, #78510) was added to the tissue sample at a ratio of ~1 g of tissue to 20 mL of reagent. After homogenization, the tissues were centrifuged for 5 min at 10,000 × g. The supernatant was collected, and the protein was quantified with a BCA Protein Assay Kit. For the purification of CTCF, 200 mg of each tissue was used for protein extraction, and 500 µg of protein was used as total input.
Western blot
Samples were separated on SDS-PAGE gels and transferred to PVDF membranes (BIO-RAD, #1620177). The membranes were blocked in 5% BSA for 1 h at room temperature, incubated with the primary antibody overnight at 4 °C, washed three times, and incubated with peroxidase-labeled secondary antibody for 1 h at room temperature. After three washes with TBST, bands were visualized with ECL substrate (BIO-RAD, #1705061) and imaged with a CCD camera. All uncropped and unprocessed scans of blots are provided in the Source Data file.
Immunofluorescence
Cells were grown on gelatin-coated glass for 24 h and then fixed in 4% paraformaldehyde (PFA, Solarbio, #P1110) for 15 min. After washing with PBS, cells were permeabilized with 0.25% Triton X-100 (AMERSCO, #0694-1L) and blocked with 10% bovine serum albumin. After 1 h, cells were incubated with CTCF antibodies in 3% BSA overnight at 2–8 °C. After washing 3 times with PBS, cells were incubated with secondary antibodies for 1 h. The fixed cells were imaged using confocal microscopy.
ChIP-seq
Cells were crosslinked in 1% formaldehyde (Sigma, #F8775) for 10 min at room temperature and quenched with 125 mM glycine (Sigma, #G7126) for 5 min. The cells were then collected and incubated in lysis buffer I (50 mM HEPES-KOH, pH = 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100, protease inhibitors). After 10 min, the cells were collected, resuspended in lysis buffer II (10 mM Tris-HCl, pH = 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, protease inhibitors), and rotated for 10 min. For sonication, the cells were collected and resuspended in sonication buffer (20 mM Tris-HCl pH = 8.0, 150 mM NaCl, 2 mM EDTA pH = 8.0, 0.1% SDS, 1% Triton X-100, protease inhibitors). Sonicated lysates were cleared once by centrifugation at 16,000 × g for 10 min at 4 °C, and the supernatant was transferred to a 15 ml conical tube. Spike-in Drosophila chromatin (Active Motif, #53083) was added to the supernatant, and 50 μL of the mixture was saved as input. 10 mg of anti-CTCF (Active Motif, #61311) and 4 mg of spike-in antibody (Active motif, #61686) were added together with the beads. The remainder of the mixture was incubated with the antibody-bound magnetic beads overnight at 4 °C to enrich for DNA fragments. The next day, beads were washed with wash buffer (50 mM HEPES-KOH pH = 7.5, 500 mM LiCl, 1 mM EDTA pH = 8.0, 0.7% Na-Deoxycholate, 1% NP-40) and then with TE buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 50 mM NaCl). Beads were removed by incubation at 65 °C for 30 min in elution buffer (50 mM Tris-HCl pH = 8.0, 10 mM EDTA, 1% SDS), and the supernatant was reverse crosslinked overnight at 65 °C. To purify the eluted DNA, 200 μL TE was added to dilute the SDS, and 8 μL of 10 mg/ml RNase A (Thermo Fisher, #EN0531) was added to degrade RNA. After 2 h, protein was degraded by addition of 4 μL of 20 mg/ml proteinase K (Thermo Fisher, #25530049) and incubation at 55 °C for 2 h.
Phenol:chloroform:isoamyl alcohol extraction (G-CLONE, #EX0128) was performed, followed by ethanol precipitation. The DNA pellet was then resuspended in 50 μL TE. Libraries were prepared with the NEBNext Ultra II DNA library kit (NEB, #E7645). Two biological replicates were performed for each cell line.
ATAC-seq
Cells were collected, washed with PBS, and incubated in lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.5% NP-40) for 10 min at 4 °C. After 5 min of centrifugation, the TruePrep™ DNA Library Prep Kit V2 for Illumina® (Vazyme, #TD501) was used to fragment the DNA. After 30 min, 100 μL VAHTS DNA Clean Beads (Vazyme, #N411) were added to the sample. The DNA was then collected with a magnet and washed with 80% ethanol. H2O was added to elute the DNA, and library preparation was performed with the TruePrep™ Index Kit V2 for Illumina® (Vazyme, #TD202). Libraries were amplified for 12–15 cycles and size-selected with VAHTS DNA Clean Beads (Vazyme, #N411). Two biological replicates were performed for each cell line.
In situ Hi-C
In situ Hi-C was performed as in Rao et al.3 with some modifications. Cells were crosslinked as above. Cells were incubated in lysis buffer (10 mM Tris-HCl pH = 8.0, 10 mM NaCl, 0.2% Igepal CA630, protease inhibitors cocktail) for 15 min at 4 °C and washed twice. Cells were collected and incubated in 50 μL of 0.5% SDS at 65 °C for 8 min. Then 25 μL 10% Triton-X and 145 μL H2O were added to quench the SDS. To digest chromatin, 25 μL of 10× NEBuffer2 and 20 μL of MboI (New England Biolabs, #R0147) were added and incubated overnight at 37 °C. The next day, restriction fragments were biotinylated by supplementing the reaction with 37.5 μL biotin-14-dATP (Life Technologies, #19524016), 1.5 μL of 10 mM dCTP (Invitrogen, #18253013), 1.5 μL of 10 mM dGTP (Invitrogen, #18254011), 1.5 μL of 10 mM dTTP (Invitrogen, #18255018), and 8 μL of DNA polymerase I, large (Klenow) fragment (New England Biolabs, #M0210), and incubating at 37 °C for 4 h. The end-repaired chromatin was added into 663 μL H2O, 120 μL NEB T4 ligase buffer, 100 μL 10% Triton-X-100, 12 μL 10 mg/mL BSA, and 5 μL T4 DNA ligase (New England Biolabs, #M0202) and incubated for 4 h at room temperature. To reverse crosslink, 50 μL of 20 mg/ml proteinase K, 120 μL 10% SDS, and 130 μL 5 M NaCl were added in order and incubated at 68 °C overnight. The DNA was precipitated with ethanol and resuspended in 130 μL of Tris-buffer (10 mM Tris-HCl, pH = 8.0) for sonication. The volume was then brought to 300 μL. To isolate biotin-labeled ligation junctions, 150 μL of 10 mg/ml Dynabeads MyOne Streptavidin T1 beads (Life Technologies, #65602) were washed with 400 μL of 1× Tween washing buffer (5 mM Tris-HCl pH = 7.5, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween 20), collected with a magnet, resuspended in 300 μL of 2× binding buffer (10 mM Tris-HCl pH = 7.5, 1 mM EDTA, 2 M NaCl), and added to the sample. After 45 min, biotinylated DNA was bound to the beads.
To repair ends and remove biotin from unligated ends, 88 μL 1× NEB T4 DNA ligase buffer, 2 μL dNTPs, 5 μL T4 PNK (New England Biolabs, #M0201), 4 μL T4 DNA polymerase I (New England Biolabs, #M0203), and 1 μL DNA polymerase I, large fragment (Klenow) were added and incubated for 30 min at room temperature. The beads were washed twice in 1× Tween Wash Buffer for 2 min at 55 °C. A-tailing was performed by incubating in 90 μL 1× NEB buffer 2, 5 μL dATP, and 5 μL Klenow exo at 37 °C for 30 min. The adapter was ligated by incubating in a mixture of 50 μL 1× Quick Ligation Buffer, 2 μL Quick Ligase (New England Biolabs, #M2200), and 3 μL Illumina indexed adapter (New England Biolabs, #E7337) for 15 min at room temperature, followed by addition of 2.5 μL USER Enzyme. Libraries were prepared with a NEBNext DNA Library Prep Kit, amplified for 10–12 cycles, and size-selected with AMPure XP beads (Beckman Coulter, #A63881). Three biological replicates were performed for each cell line.
RNA isolation and RNA-seq
Cell pellets were homogenized in RNAzol reagent (MRC, #RN190-500) and processed according to the manufacturer’s instructions. Then the total mRNA was extracted for sequencing. Two biological replicates were performed for each cell line.
Flow cytometry analysis
Single-cell suspensions were prepared and fixed. The Foxp3/Transcription Factor Staining Buffer Set (eBioscience, #00-5523-00) was used according to the manufacturer’s instructions to detect the expression of CTCF. For cell cycle analysis, two days after auxin treatment, 1000 WT_CTCF and MUT_CTCF mESCs were seeded into individual wells of a six-well plate. After 4 days, the cells were collected, fixed in 75% ethanol, and washed three times. DAPI (Sigma, #10236276001) was used for staining, and cells were analyzed by flow cytometry. Experiments were conducted in three independent triplicates.
Cell viability assay
The cell counting kit 8 (CCK8) (DOJINDO, #CK04) was used according to the manufacturer’s instructions. Experiments were conducted in three independent triplicates.
Colony formation assay
Two days after auxin treatment, 1000 WT_CTCF and MUT_CTCF mESCs were seeded into individual wells of a six-well plate. After 5 days, the colonies were stained for alkaline phosphatase (AP, Yeasen, #40749ES60). Colonies of undifferentiated cells (UD), partially differentiated cells (PD), and differentiated cells (D) in each well were counted. Experiments were conducted in three independent triplicates.
EB differentiation assay
10⁵ WT_CTCF and MUT_CTCF mESCs were cultured in standard ESC medium without LIF. Images were taken at days 2, 4, 8, and 14.
Prediction of protein–protein interaction
Predicted protein structures of CTCF (Q61164) and OGT (Q8CGY8) were obtained from the AlphaFold Protein Structure Database. Protein–protein interactions were predicted with the ZDOCK Server49 (https://zdock.umassmed.edu/) and visualized with PyMOL.
Quality control of sequencing reads
All Illumina sequencing reads used in the study were first quality-controlled with Trim Galore. Specifically, we trimmed bases with quality below 20 and adapter sequences from the 3′ end, and discarded reads shorter than 50 nt.
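The trimming and length-filtering rule can be illustrated with a minimal Python sketch. It assumes Phred+33 quality strings and uses simple trailing-base trimming. Trim Galore delegates trimming to Cutadapt, whose algorithm differs in detail, and adapter removal is omitted here. The function names are hypothetical and the read is a toy example.

```python
MIN_QUALITY = 20  # Phred cutoff stated in the text
MIN_LENGTH = 50   # minimum read length (nt) stated in the text


def trim_3prime(seq, qual, cutoff=MIN_QUALITY, offset=33):
    """Drop bases from the 3' end while their Phred score is below cutoff."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < cutoff:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, min_length=MIN_LENGTH):
    """Return True if the trimmed read is long enough to retain."""
    return len(seq) >= min_length


# Toy read: 60 high-quality bases followed by 5 low-quality bases.
seq = "A" * 65
qual = "I" * 60 + "#" * 5  # 'I' = Q40, '#' = Q2 in Phred+33
trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
retained = keep_read(trimmed_seq)
```

In this toy case the five low-quality 3′ bases are removed and the remaining 60-nt read passes the length filter.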
RNA-seq data analysis
RNA-seq reads were aligned to the mm9 reference genome using STAR79 with default parameters. Uniquely mapped reads were counted with HTSeq-count. Differentially expressed genes were detected using edgeR80. Genes were considered differentially expressed when the p-value was < 0.05 and the fold change was above 2.0. GSEA analysis was performed with GSEA_Linux_4.0.3 (ref. 81).
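The differential-expression thresholds can be expressed as a short filter. The sketch below is illustrative only: it assumes that the fold-change cutoff applies in both directions (up- and down-regulation) and that the input carries a log2 fold change and p-value per gene, as in a typical edgeR results table. Gene names and values are made up.

```python
import math

P_CUTOFF = 0.05   # p-value threshold stated in the text
FC_CUTOFF = 2.0   # fold-change threshold stated in the text


def is_differential(log2_fc, p_value, fc_cutoff=FC_CUTOFF, p_cutoff=P_CUTOFF):
    """Apply both thresholds; the fold change is tested symmetrically (assumption)."""
    return p_value < p_cutoff and abs(log2_fc) > math.log2(fc_cutoff)


# Toy results table (hypothetical genes and values).
results = [
    {"gene": "geneA", "logFC": 1.8, "PValue": 0.001},
    {"gene": "geneB", "logFC": -1.3, "PValue": 0.02},
    {"gene": "geneC", "logFC": 0.4, "PValue": 0.001},
    {"gene": "geneD", "logFC": 2.5, "PValue": 0.20},
]
de_genes = [r["gene"] for r in results if is_differential(r["logFC"], r["PValue"])]
```

Here geneC fails the fold-change threshold and geneD fails the p-value threshold, so only geneA and geneB are retained.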
ChIP-seq and ATAC-seq data analysis
ChIP-seq and ATAC-seq reads were aligned to the mm9 reference genome using Bowtie2 (ref. 82) with default parameters, followed by removal of multi-mapped reads and PCR duplicates with SAMtools. Alignment track bigWig files and ChIP-seq and ATAC-seq profile plots were generated using deepTools83.
For spike-in normalization, ChIP-seq reads were also aligned to the BDGP6 (Drosophila) reference genome, followed by removal of multi-mapped reads. Spike-in reads were counted, and scale factors between samples were determined for downstream analyses. For differential binding analysis, we first called CTCF peaks with the MACS2 subcommand pipeline (https://github.com/macs3-project/MACS/wiki/Advanced:-Call-peaks-using-MACS2-subcommands), multiplying by the scale factors when generating bedGraph files. We then concatenated the WT_CTCF and MUT_CTCF peaks to obtain the full set of CTCF peaks and used bedtools multicov to count reads over all CTCF peaks. Read counts were then normalized by the scale factors, and the R package DESeq2 was used to perform differential analysis.
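The text does not give the formula used for the spike-in scale factors. As a hedged illustration, the sketch below assumes one common convention: each sample is scaled down to the sample with the fewest spike-in (Drosophila) reads. The function names and read counts are hypothetical.

```python
def spike_in_scale_factors(spike_counts):
    """Scale every sample to the one with the fewest spike-in reads (assumed convention)."""
    if not spike_counts or min(spike_counts.values()) <= 0:
        raise ValueError("every sample needs a positive spike-in read count")
    reference = min(spike_counts.values())
    return {sample: reference / n for sample, n in spike_counts.items()}


def normalize_counts(peak_counts, scale):
    """Multiply per-peak read counts by a sample's scale factor."""
    return [count * scale for count in peak_counts]


# Toy spike-in read counts per sample (hypothetical numbers).
spike_counts = {"WT_CTCF": 400_000, "MUT_CTCF": 500_000}
factors = spike_in_scale_factors(spike_counts)
mut_normalized = normalize_counts([100, 50], factors["MUT_CTCF"])
```

Under this convention, a sample with more spike-in reads receives a factor below 1, so its per-peak counts are reduced before comparison.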
In situ Hi-C data analysis
Hi-C reads were processed using the HiC-Pro84 pipeline: reads were aligned to the mm9 reference genome using Bowtie2, reads with mapping quality >10 were assigned to MboI restriction fragments, and interaction pairs were reconstructed. Singleton and multi-hit pairs were filtered out, followed by removal of failed ligation products (dangling-end pairs, re-ligation pairs, self-circle pairs) and pairs for which the ligation product could not be reconstructed. The remaining pairs were then de-duplicated and used to build contact matrices. Contact matrices were normalized by iterative correction and eigenvector decomposition (ICE).
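The idea behind iterative correction can be sketched in a few lines of Python. This is a simplified illustration on a small dense symmetric matrix, not the implementation used by the pipeline. Production tools typically also filter low-coverage bins and operate on sparse matrices. The function name, iteration limit, and tolerance are assumptions.

```python
def ice_normalize(matrix, max_iter=200, tol=1e-6):
    """Iteratively rescale a symmetric contact matrix until row sums are equal."""
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    bias = [1.0] * n                # accumulated per-bin bias
    for _ in range(max_iter):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break
        mean = sum(nonzero) / len(nonzero)
        # Bins with no contacts are left untouched (factor 1).
        factors = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
        if max(abs(f - 1.0) for f in factors) < tol:
            break
    return m, bias


# Toy 3-bin contact matrix (made-up counts).
raw = [[4.0, 2.0, 1.0], [2.0, 3.0, 2.0], [1.0, 2.0, 6.0]]
balanced, bias = ice_normalize(raw)
row_sums = [sum(row) for row in balanced]
```

After convergence, every bin has approximately the same total contact count, which is the equal-visibility assumption that ICE enforces.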
Compartments were identified by CscoreTool85 on 500 kb-resolution contact matrices. Loop domains were identified using the method proposed by Rao et al.86. Loops were annotated by HiCCUPS with default parameters (-r 5000,10000,25000 -f 0.1). Differential loops were identified by diffloop87. CTCF-related loops were defined as loops with CTCF ChIP-seq peaks located on both anchors. Aggregate peak analysis of loops was performed with GENOVA88. Correlations between contact matrices were analyzed with HiCRep89.
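The definition of CTCF-related loops amounts to an interval-overlap test on both anchors. The following Python sketch illustrates it under the assumption of half-open (chrom, start, end) intervals and a minimum overlap of 1 bp. The coordinates and function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def has_peak(anchor, peaks):
    """True if any peak overlaps the loop anchor."""
    return any(overlaps(anchor, peak) for peak in peaks)


def ctcf_related_loops(loops, peaks):
    """Keep loops whose two anchors both overlap a CTCF peak."""
    return [lp for lp in loops if has_peak(lp[0], peaks) and has_peak(lp[1], peaks)]


# Toy CTCF peaks and loops (made-up coordinates).
peaks = [
    ("chr1", 1_002_000, 1_002_400),
    ("chr1", 1_503_000, 1_503_300),
    ("chr2", 200_100, 200_500),
]
loops = [
    (("chr1", 1_000_000, 1_005_000), ("chr1", 1_500_000, 1_505_000)),  # peaks on both anchors
    (("chr1", 1_500_000, 1_505_000), ("chr1", 2_000_000, 2_005_000)),  # peak on one anchor
    (("chr2", 200_000, 205_000), ("chr2", 900_000, 905_000)),          # peak on one anchor
]
ctcf_loops = ctcf_related_loops(loops, peaks)
```

Only the first toy loop is kept, because it is the only one with a peak on each anchor.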
Definition of regulatory regions
Enhancers were defined as H3K27ac peaks that did not overlap with a promoter. Insulators were defined according to ref. 53 as the subset of CTCF ChIP-seq peaks that overlapped SMC1 ChIA-PET anchors.
Genomic feature analysis
BEDTools90 was used for genomic interval analyses, including intersecting, expanding, flanking, and randomly shuffling intervals.
Gene ontology analysis
Gene symbols were first converted to Entrez IDs with the R package biomaRt (version 2.42.0)91. Gene ontology analysis was performed according to ref. 92.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We thank the members of the Ding laboratory for discussion, technical advice, and support. We thank Dr. Ma from Peking University for sharing tissue protein. This research was funded by grants from the National Key R&D Program of China (2023YFA1800900), the National Natural Science Foundation of China (31970811 and 32170798), and the Guangdong Basic and Applied Basic Research Foundation (2021B1515120063) to J.D.; the National Natural Science Foundation of China (32100497) and the Natural Science Foundation of Guangdong Province, China (2023A1515010197) to C.W.; the Natural Science Foundation of Guangdong Province, China (2021A1515010938, 2023A1515010148) and the Fundamental Research Funds for the Central Universities, Sun Yat-sen University (23PTPY88) to J.S.; and the Science and Technology Development Fund, Macau SAR (file 0077/2020/A2) and the University of Macau (file: MYRG2022-00251-FHS) to W.C.
Author contributions
X.T. conceived the experiments. X.T., K.L., L.Q., Y.S., L.L. conducted experiments. P.Z., X.T. performed the bioinformatics analysis. X.T. and P.Z. wrote the paper with input from all other authors. X.L., C.W., J.W., S.J., J.S., W.C., J.Z. revised the manuscript. H.Y., H.C. provided guidance in bioinformatics analysis. J.D., Y.M., L.F. and C.X. supervised the project.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Data availability
Previously published raw reads of ChIP-seq data were downloaded from GSE29218 (CTCF)1, GSE85185 (CTCF)14, and GSM2417096 (H3K27ac)93 and re-processed as described in Methods. Previously published raw reads of Hi-C data were downloaded from GSE96107 (ref. 52) and re-processed as described in Methods. All datasets are available in GEO under the accession number GSE255897. Other information is available from the corresponding author upon request. Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Xiuxiao Tang, Pengguihang Zeng.
Contributor Information
Chengfang Xu, Email: xuchengf@mail.sysu.edu.cn.
Lili Fan, Email: fanlili@jnu.edu.cn.
Yi-Liang Miao, Email: miaoyl@mail.hzau.edu.cn.
Junjun Ding, Email: dingjunj@mail.sysu.edu.cn.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-024-47048-3.
References
- 1. Dixon JR, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. doi: 10.1038/nature11082.
- 2. Lieberman-Aiden E, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326:289–293. doi: 10.1126/science.1181369.
- 3. Rao SSP, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021.
- 4. Hnisz D, et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science. 2016;351:1454–1458. doi: 10.1126/science.aad9024.
- 5. Lupianez DG, et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell. 2015;161:1012–1025. doi: 10.1016/j.cell.2015.04.004.
- 6. Klenova EM, et al. CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol. Cell. Biol. 1993;13:7612–7624. doi: 10.1128/mcb.13.12.7612.
- 7. Handoko L, et al. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat. Genet. 2011;43:630–638. doi: 10.1038/ng.857.
- 8. Ong CT, Corces VG. CTCF: an architectural protein bridging genome topology and function. Nat. Rev. Genet. 2014;15:234–246. doi: 10.1038/nrg3663.
- 9. Phillips-Cremins JE, et al. Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell. 2013;153:1281–1295. doi: 10.1016/j.cell.2013.04.053.
- 10. Davidson IF, et al. DNA loop extrusion by human cohesin. Science. 2019;366:1338–1345. doi: 10.1126/science.aaz3418.
- 11. Fudenberg G, et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 2016;15:2038–2049. doi: 10.1016/j.celrep.2016.04.085.
- 12. Kim Y, Shi Z, Zhang H, Finkelstein IJ, Yu H. Human cohesin compacts DNA by loop extrusion. Science. 2019;366:1345–1349. doi: 10.1126/science.aaz4475.
- 13. Wei C, et al. CTCF organizes inter-A compartment interactions through RYBP-dependent phase separation. Cell Res. 2022;32:744–760.
- 14. Nora EP, et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell. 2017;169:930–944.e22. doi: 10.1016/j.cell.2017.05.004.
- 15. Sanborn AL, et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA. 2015;112:E6456–E6465. doi: 10.1073/pnas.1518552112.
- 16. Kubo N, et al. Promoter-proximal CTCF binding promotes distal enhancer-dependent gene activation. Nat. Struct. Mol. Biol. 2021;28:152. doi: 10.1038/s41594-020-00539-5.
- 17. Bell AC, Felsenfeld G. Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature. 2000;405:482–485. doi: 10.1038/35013100.
- 18. Wang H, et al. Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res. 2012;22:1680–1688. doi: 10.1101/gr.136101.111.
- 19. Hansen AS, et al. Distinct classes of chromatin loops revealed by deletion of an RNA-binding region in CTCF. Mol. Cell. 2019;76:395. doi: 10.1016/j.molcel.2019.07.039.
- 20. Sun S, et al. Jpx RNA activates Xist by evicting CTCF. Cell. 2013;153:1537–1551. doi: 10.1016/j.cell.2013.05.028.
- 21. Donohoe ME, Zhang LF, Xu N, Shi Y, Lee JT. Identification of a Ctcf cofactor, Yy1, for the X chromosome binary switch. Mol. Cell. 2007;25:43–56. doi: 10.1016/j.molcel.2006.11.017.
- 22. Wang J, et al. Phase separation of OCT4 controls TAD reorganization to promote cell fate transitions. Cell Stem Cell. 2021;28:1868. doi: 10.1016/j.stem.2021.04.023.
- 23. Plasschaert RN, et al. CTCF binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation. Nucleic Acids Res. 2014;42:7487. doi: 10.1093/nar/gku470.
- 24. Oh HJ, et al. Jpx RNA regulates CTCF anchor site selection and formation of chromosome loops. Cell. 2021;184:6157. doi: 10.1016/j.cell.2021.11.012.
- 25. Soochit W, et al. CTCF chromatin residence time controls three-dimensional genome organization, gene expression and DNA methylation in pluripotent cells. Nat. Cell Biol. 2021;23:881. doi: 10.1038/s41556-021-00722-w.
- 26. Zhang ZH, et al. Identification of lysine succinylation as a new post-translational modification. Nat. Chem. Biol. 2011;7:58–63. doi: 10.1038/nchembio.495.
- 27. Wang YC, Peterson SE, Loring JF. Protein post-translational modifications and regulation of pluripotency in human stem cells. Cell Res. 2014;24:143–160. doi: 10.1038/cr.2013.151.
- 28. Xue Y, et al. GPS: a comprehensive www server for phosphorylation sites prediction. Nucleic Acids Res. 2005;33:W184–W187. doi: 10.1093/nar/gki393.
- 29. Hornbeck PV, et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015;43:D512–D520. doi: 10.1093/nar/gku1267.
- 30. Huang KY, et al. dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins. Nucleic Acids Res. 2016;44:D435–D446. doi: 10.1093/nar/gkv1240.
- 31. Caiafa P, Zlatanova J. CCCTC-binding factor meets poly(ADP-ribose) polymerase-1. J. Cell. Physiol. 2009;219:265–270. doi: 10.1002/jcp.21691.
- 32. Kitchen NS, Schoenherr CJ. Sumoylation modulates a domain in CTCF that activates transcription and decondenses chromatin. J. Cell. Biochem. 2010;111:665–675. doi: 10.1002/jcb.22751.
- 33. Wang J, Wang YM, Lu L. De-SUMOylation of CCCTC binding factor (CTCF) in hypoxic stress-induced human corneal epithelial cells. J. Biol. Chem. 2012;287:12469–12479. doi: 10.1074/jbc.M111.286641.
- 34. Yu WQ, et al. Poly(ADP-ribosyl)ation regulates CTCF-dependent chromatin insulation. Nat. Genet. 2004;36:1105–1110. doi: 10.1038/ng1426.
- 35. Luo H, et al. LATS kinase-mediated CTCF phosphorylation and selective loss of genomic binding. Sci. Adv. 2020;6:eaaw4651. doi: 10.1126/sciadv.aaw4651.
- 36. Del Rosario BC, et al. Exploration of CTCF post-translation modifications uncovers Serine-224 phosphorylation by PLK1 at pericentric regions during the G2/M transition. Elife. 2019;8:e42341.
- 37. Sekiya T, Murano K, Kato K, Kawaguchi A, Nagata K. Mitotic phosphorylation of CCCTC-binding factor (CTCF) reduces its DNA binding activity. FEBS Open Bio. 2017;7:397–404. doi: 10.1002/2211-5463.12189.
- 38. Klenova EM, et al. Functional phosphorylation sites in the C-terminal region of the multivalent multifunctional transcriptional factor CTCF. Mol. Cell. Biol. 2001;21:2221–2234. doi: 10.1128/MCB.21.6.2221-2234.2001.
- 39. Delgado MD, Chernukhin IV, Bigas A, Klenova EM, Leon J. Differential expression and phosphorylation of CTCF, a c-myc transcriptional regulator, during differentiation of human myeloid cells. FEBS Lett. 1999;444:5–10. doi: 10.1016/S0014-5793(99)00013-7.
- 40. Tang JB, Chen YH. Identification of a tyrosine-phosphorylated CCCTC-binding nuclear factor in capacitated mouse spermatozoa. Proteomics. 2006;6:4800–4807. doi: 10.1002/pmic.200600256.
- 41. Wesseling H, et al. Tau PTM profiles identify patient heterogeneity and stages of Alzheimer’s disease. Cell. 2020;183:1699. doi: 10.1016/j.cell.2020.10.029.
- 42. Bern M, Kil YJ, Becker C. Byonic: advanced peptide and protein identification software. Curr. Protoc. Bioinform. 2012;Chapter 13:Unit13 20.
- 43. Gandy JC, Rountree AE, Bijur GN. Akt1 is dynamically modified with O-GlcNAc following treatments with PUGNAc and insulin-like growth factor-1. FEBS Lett. 2006;580:3051–3058. doi: 10.1016/j.febslet.2006.04.051.
- 44. Wu Q, Schapira M, Arrowsmith CH, Barsyte-Lovejoy D. Protein arginine methylation: from enigmatic functions to therapeutic targeting. Nat. Rev. Drug Discov. 2021;20:509–530. doi: 10.1038/s41573-021-00159-8.
- 45. Yang XY, Qian KV. Protein O-GlcNAcylation: emerging mechanisms and functions. Nat. Rev. Mol. Cell Biol. 2017;18:452–465. doi: 10.1038/nrm.2017.22.
- 46. Myers SA, et al. SOX2 O-GlcNAcylation alters its protein-protein interactions and genomic occupancy to modulate gene expression in pluripotent cells. Elife. 2016;5:e10647.
- 47. Hao Y, et al. Next-generation unnatural monosaccharides reveal that ESRRB O-GlcNAcylation regulates pluripotency of mouse embryonic stem cells. Nat. Commun. 2019;10:4065. doi: 10.1038/s41467-019-11942-y.
- 48. Nie H, et al. O-GlcNAcylation of PGK1 coordinates glycolysis and TCA cycle to promote tumor growth. Nat. Commun. 2020;11:36. doi: 10.1038/s41467-019-13601-8.
- 49. Pierce BG, et al. ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics. 2014;30:1771–1773. doi: 10.1093/bioinformatics/btu097.
- 50. Egan B, et al. An alternative approach to ChIP-Seq normalization enables detection of genome-wide changes in histone H3 lysine 27 trimethylation upon EZH2 inhibition. PLoS ONE. 2016;11:e0166438. doi: 10.1371/journal.pone.0166438.
- 51. Ahn JH, et al. Phase separation drives aberrant chromatin looping and cancer development. Nature. 2021;595:591–595. doi: 10.1038/s41586-021-03662-5.
- 52. Bonev B, et al. Multiscale 3D genome rewiring during mouse neural development. Cell. 2017;171:557–572.e24. doi: 10.1016/j.cell.2017.09.043.
- 53. Weintraub AS, et al. YY1 is a structural regulator of enhancer-promoter loops. Cell. 2017;171:1573. doi: 10.1016/j.cell.2017.11.008.
- 54. Schoenfelder S, Fraser P. Long-range enhancer-promoter contacts in gene expression control. Nat. Rev. Genet. 2019;20:437–455. doi: 10.1038/s41576-019-0128-0.
- 55. Favier B, Dolle P. Developmental functions of mammalian Hox genes. Mol. Hum. Reprod. 1997;3:115–131. doi: 10.1093/molehr/3.2.115.
- 56. Song YW, et al. CTCF functions as an insulator for somatic genes and a chromatin remodeler for pluripotency genes during reprogramming. Cell Rep. 2022;39:110626. doi: 10.1016/j.celrep.2022.110626.
- 57. Olbrich T, et al. CTCF is a barrier for 2C-like reprogramming. Nat. Commun. 2021;12:4856. doi: 10.1038/s41467-021-25072-x.
- 58. Jang H, et al. O-GlcNAc regulates pluripotency and reprogramming by directly acting on core components of the pluripotency network. Cell Stem Cell. 2012;11:62–74. doi: 10.1016/j.stem.2012.03.001.
- 59. Filippova GN, et al. An exceptionally conserved transcriptional repressor, CTCF, employs different combinations of zinc fingers to bind diverged promoter sequences of avian and mammalian c-myc oncogenes. Mol. Cell. Biol. 1996;16:2802–2813. doi: 10.1128/MCB.16.6.2802.
- 60. Arzate-Mejia RG, Recillas-Targa F, Corces VG. Developing in 3D: the role of CTCF in cell differentiation. Development. 2018;145:dev137729. doi: 10.1242/dev.137729.
- 61. Nakahashi H, et al. A genome-wide map of CTCF multivalency redefines the CTCF code. Cell Rep. 2013;3:1678–1689. doi: 10.1016/j.celrep.2013.04.024.
- 62. Saldana-Meyer R, et al. RNA interactions are essential for CTCF-mediated genome organization. Mol. Cell. 2019;76:412–422.e5. doi: 10.1016/j.molcel.2019.08.015.
- 63. Chen H, Tian Y, Shu W, Bo X, Wang S. Comprehensive identification and annotation of cell type-specific and ubiquitous CTCF-binding sites in the human genome. PLoS ONE. 2012;7:e41374. doi: 10.1371/journal.pone.0041374.
- 64. Kim TH, et al. Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007;128:1231–1245. doi: 10.1016/j.cell.2006.12.048.
- 65. Cuddapah S, et al. Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 2009;19:24–32. doi: 10.1101/gr.082800.108.
- 66. Plasschaert RN, et al. CTCF binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation. Nucleic Acids Res. 2014;42:774–789. doi: 10.1093/nar/gkt910.
- 67. Essien K, et al. CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features. Genome Biol. 2009;10:R131. doi: 10.1186/gb-2009-10-11-r131.
- 68. Heath H, et al. CTCF regulates cell cycle progression of alphabeta T cells in the thymus. EMBO J. 2008;27:2839–2850. doi: 10.1038/emboj.2008.214.
- 69. Moore JM, et al. Loss of maternal CTCF Is associated with peri-implantation lethality of Ctcf null embryos. PLoS ONE. 2012;7:e34915. doi: 10.1371/journal.pone.0034915.
- 70. Wan LB, et al. Maternal depletion of CTCF reveals multiple functions during oocyte and preimplantation embryo development. Development. 2008;135:2729–2738. doi: 10.1242/dev.024539.
- 71. Gomez-Velazquez M, et al. CTCF counter-regulates cardiomyocyte development and maturation programs in the embryonic heart. PLoS Genet. 2017;13:e1006985. doi: 10.1371/journal.pgen.1006985.
- 72. Hirayama T, Tarusawa E, Yoshimura Y, Galjart N, Yagi T. CTCF is required for neural development and stochastic expression of clustered Pcdh genes in neurons. Cell Rep. 2012;2:345–357. doi: 10.1016/j.celrep.2012.06.014.
- 73. Zheng H, Xie W. The role of 3D genome organization in development and cell differentiation. Nat. Rev. Mol. Cell Biol. 2019;20:535–550. doi: 10.1038/s41580-019-0132-4.
- 74. Sene KH, et al. Gene function in early mouse embryonic stem cell differentiation. BMC Genom. 2007;8:85. doi: 10.1186/1471-2164-8-85.
- 75. Ding JJ, et al. Tex10 coordinates epigenetic control of super-enhancer activity in pluripotency and reprogramming. Cell Stem Cell. 2015;16:653–668. doi: 10.1016/j.stem.2015.04.001.
- 76. Vella P, et al. Tet proteins connect the O-linked N-acetylglucosamine transferase Ogt to chromatin in embryonic stem cells. Mol. Cell. 2013;49:645–656. doi: 10.1016/j.molcel.2012.12.019.
- 77. Li S, et al. Integrative proteomic characterization of adenocarcinoma of esophagogastric junction. Nat. Commun. 2023;14:778. doi: 10.1038/s41467-023-36462-8.
- 78. Ding JJ, Xu HL, Faiola F, Ma’ayan A, Wang JL. Oct4 links multiple epigenetic pathways to the pluripotency network. Cell Res. 2012;22:155–167. doi: 10.1038/cr.2011.179.
- 79. Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
- 80. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 81. Subramanian A, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
- 82. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–U54. doi: 10.1038/nmeth.1923.
- 83.Ramirez F, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165. doi: 10.1093/nar/gkw257. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Servant N, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16:259.
- 85.Zheng XB, Zheng YX. CscoreTool: fast Hi-C compartment analysis at high resolution. Bioinformatics. 2018;34:1568–1570. doi: 10.1093/bioinformatics/btx802.
- 86.Rao SSP, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171:305–320.e24. doi: 10.1016/j.cell.2017.09.026.
- 87.Lareau CA, Aryee MJ. diffloop: a computational framework for identifying and analyzing differential DNA loops from sequencing data. Bioinformatics. 2018;34:672–674. doi: 10.1093/bioinformatics/btx623.
- 88.van der Weide RH, et al. Hi-C analyses with GENOVA: a case study with cohesin variants. NAR Genom. Bioinform. 2021;3:lqab040. doi: 10.1093/nargab/lqab040.
- 89.Yang T, et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 2017;27:1939–1949. doi: 10.1101/gr.220640.117.
- 90.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 91.Smedley D, et al. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 2015;43:W589–W598. doi: 10.1093/nar/gkv350.
- 92.Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
- 93.Chronis C, et al. Cooperative binding of transcription factors orchestrates reprogramming. Cell. 2017;168:442–459.e20. doi: 10.1016/j.cell.2016.12.016.
Data Availability Statement
Previously published raw reads of ChIP-seq data were downloaded from GSE29218 (CTCF; ref. 1), GSE85185 (CTCF; ref. 14), and GSM2417096 (H3K27ac; ref. 93) and re-processed as described in Methods. Previously published raw reads of Hi-C data were downloaded from GSE96107 (ref. 52) and re-processed as described in Methods. All datasets are available in GEO under the accession number GSE255897. Any other information is available from the corresponding author upon request. Source data are provided with this paper.