iScience. 2024 Nov 22;27(12):111452. doi: 10.1016/j.isci.2024.111452

A negatively charged region within carboxy-terminal domain maintains proper CTCF DNA binding

Lian Liu 1, Yuanxiao Tang 1, Yan Zhang 1, Qiang Wu 1,2
PMCID: PMC11667065  PMID: 39720519

Summary

As an essential regulator of higher-order chromatin structures, CCCTC-binding factor (CTCF) is a highly conserved protein with a central DNA-binding domain of 11 tandem zinc fingers (ZFs), which are flanked by amino (N-) and carboxy (C-) terminal domains of intrinsically disordered regions. Here we report that CRISPR deletion of the entire C-terminal domain of alternating charge blocks decreases CTCF DNA binding but deletion of the C-terminal fragment of 116 amino acids results in increased CTCF DNA binding and aberrant gene regulation. Through a series of genetic targeting experiments, in conjunction with electrophoretic mobility shift assay (EMSA), circularized chromosome conformation capture (4C), qPCR, chromatin immunoprecipitation with sequencing (ChIP-seq), and assay for transposase-accessible chromatin with sequencing (ATAC-seq), we uncovered a negatively charged region (NCR) responsible for weakening CTCF DNA binding and chromatin accessibility. AlphaFold prediction suggests an autoinhibitory mechanism of CTCF via NCR as a flexible DNA mimic domain, possibly competing with DNA binding for the positively charged ZF surface area. Thus, the unstructured C-terminal domain plays an intricate role in maintaining proper CTCF-DNA interactions and 3D genome organization.

Subject areas: Experimental systems for structural biology, Molecular physiology, Molecular Structure, Properties of biomolecules


Highlights

  • CTCF C-terminal domain (CTD) contains alternating charge blocks

  • A negatively charged region (NCR) within CTD suppresses CTCF DNA binding

  • CRISPR deletion of CTCF NCR alters gene expression and 3D genome architecture

  • CTCF autoinhibition by NCR as a DNA mimic regulates its genome affinity



Introduction

CCCTC-binding factor (CTCF) is a principal architectural protein for the construction of 3D genomes and is highly conserved in bilateria.1,2,3,4,5 Together with the cohesin complex, CTCF mediates the formation of long-distance chromatin loops between distant sites, known as CBS (CTCF binding site) elements, through an ATP-dependent active process known as “loop extrusion,” leading to higher-order chromatin structures such as TADs (topologically associating domains).6,7,8,9,10,11,12,13 Interestingly, CTCF/cohesin-mediated chromatin loops are preferentially formed between pairs of CBS elements in a forward-reverse convergent orientation.14,15,16,17 In particular, topological chromatin loops are formed between tandem-arrayed CBS elements via cohesin-mediated dynamic loop extrusion, leading to balanced promoter choice.18,19,20 The dynamic cohesin loop extrusion and its asymmetric blocking by oriented CTCF binding on numerous CBS elements distributed throughout mammalian genomes constitute a general principle in 3D genome organization and play an important role in gene regulation. Finally, other DNA-binding zinc-finger (ZF) proteins such as YY1, MAZ, PATZ1, and ZNF263 may collaborate with CTCF to form long-distance chromatin contacts.21,22

The clustered protocadherin (cPcdh) genes are an excellent model to investigate the relationships between CTCF/cohesin-mediated chromatin looping and gene expression programs. The 53 highly similar human cPCDH genes are organized into three tandemly linked clusters, PCDHα, PCDHβ, and PCDHγ, spanning a large region of ∼1 Mb of genomic DNA.23 The PCDHα gene cluster comprises an upstream region of 15 variable exons and a downstream region of 3 constant exons. Similarly, PCDHγ comprises an upstream region of 22 variable exons and a downstream region of 3 constant exons. Each variable exon is separately spliced to the respective set of 3 constant exons within the PCDHα or PCDHγ gene cluster. By contrast, the PCDHβ gene cluster comprises only 16 variable exons with no constant exons.

Similar to the intriguing Dscam gene for generating enormous diversity of cell-recognition codes in flies, the cPCDH genes generate an exquisite diversity for neuronal self-avoidance and nonself discrimination in vertebrates.24,25 Unlike the competitive RNA pairing-mediated mutually exclusive splicing mechanism of Dscam, the cPCDH diversity is generated by a combination of balanced promoter choice and cis-alternative splicing determined by CTCF-directed DNA looping.14,26 In this complicated 3D genome configuration, CTCF directionally binds to tandem arrays of oriented CBS elements associated with Pcdh variable promoters and super-enhancers. CTCF-mediated chromatin loops are then formed between pairs of convergent forward-reverse CBS elements. For example, in the PCDHα gene cluster, there are two forward CBS elements flanking each of the 13 alternate variable exons and two reverse CBS elements flanking the HS5-1 enhancer.27 A “double-clamping” chromatin interaction between these convergent pairs of CBS elements determines the cPCDH promoter choice.14 In summary, CTCF/cohesin-mediated loop extrusion brings remote super-enhancers into close contact with target variable promoters to form long-distance chromatin loops, and this looping process is essential for establishing proper expression patterns of the cPCDH genes in the brain.14,17,18,19,28,29,30

CTCF contains a central domain of 11 ZFs organized in a tandem array flanked by intrinsically disordered regions of the N-terminal domain (NTD) and C-terminal domain (CTD) (Figure 1A).1,31,32 The central domain of CTCF binds to DNA directly through ZFs 3–7 and ZFs 9–11.17,33,34,35 Recently, several lines of evidence suggest that ZF1 and ZF2 recognize base pairs downstream of the CTCF core motif.36,37,38,39,40 Remarkably, CTCF also interacts with RNA to mediate chromatin loop formation and to regulate gene expression patterns.41,42 Finally, the intrinsically disordered NTD, but not CTD, of CTCF interacts with the cohesin complex to anchor chromatin loops between distant DNA elements.4,12,43,44,45,46 Here, through a series of genetic deletions combined with chromosome conformation capture and gene expression analyses, we found that a negatively charged region (NCR) within the disordered CTD is important for proper CTCF DNA binding, higher-order chromatin organization, and gene regulation.

Figure 1.


Deletion of CTCF C-terminal domain results in decreased CTCF binding and gene dysregulation

(A) Schematic of the CTCF zinc-finger domain (ZFD) and the flanking N-terminal domain (NTD) and C-terminal domain (CTD).

(B and C) Amino acid composition of CTCF NTD (B) and CTD (C).

(D) Computationally predicted disorder propensity indicates that the NTD and CTD are intrinsically disordered regions (IDRs).

(E) Multiple sequence alignment of CTD of the vertebrate CTCF proteins. The negatively charged region (NCR) is highlighted in a yellow background.

(F) CTCF and Rad21 ChIP-seq peaks at the human cPCDH gene complex. The human cPCDH locus comprises three tandem linked gene clusters: PCDHα, PCDHβ, and PCDHγ. The PCDHα and PCDHγ clusters each consist of a variable region with multiple highly similar and unusually large exons and a constant region with three small exons. Each variable exon is separately cis-spliced to a single set of cluster-specific constant exons. The PCDHβ cluster contains 16 variable exons but with no constant exon. These three clusters form a superTAD with two subTADs: PCDHα and PCDHβγ. Each subTAD has its own downstream super-enhancer. Var, variable; Con, constant; HS, hyper-sensitive site; SE, super-enhancer.

(G and H) Heatmaps of CTCF (G) and Rad21 (H) normalized signals at CBS elements in the cPCDH locus. Student’s t test, ∗p < 0.05.

(I and J) Heatmaps of CTCF (I) and Rad21 (J) normalized signals at genome-wide CBS elements. Student’s t test, ∗∗∗∗p < 0.0001.

(K–M) Three types of CTCF motifs in WT and ΔCTD cells.

(N) QHR-4C interaction profiles of the PCDHα gene cluster using HS5-1 as an anchor.

(O) RNA-seq shows decreased expression levels of PCDHα6, PCDHα12, and PCDHαc2 upon deletion of CTD. FPKM, fragments per kilobase of exon per million fragments mapped. Data are presented as mean ± SD; Student’s t test, ∗∗∗∗p < 0.0001.

See also Figures S1–S3.

Results

Deletion of CTCF CTD results in decreased DNA binding

We analyzed the amino acid (aa) composition of CTCF NTD and CTD and found that the NTD of CTCF contains all 20 types of aa while the CTD of CTCF contains only 15 aa types, suggesting that CTCF CTD has an aa compositional bias and is a low-complexity region (Figures 1B and 1C). Computational analyses suggest that both NTD and CTD regions have high intrinsic disorder scores, especially the CTD region (Figure 1D).47 In addition, the CTD region is highly conserved in vertebrates (Figure 1E). Owing to the lethality of CTCF deletion in mice or in cultured cells,48,49 we tried to delete the CTCF NTD or CTD in HEC-1-B cells17 and found that deletion of NTD, but not CTD, is lethal in cultured cells. Specifically, we screened 254 single-cell clones for deletion of NTD and could not find a single homozygous cell clone. Therefore, we focused our genetic dissection on CTCF CTD.
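The composition analysis described above amounts to counting which of the 20 standard residue types occur in a region and at what frequency. The following is a minimal sketch of that idea, not the authors' pipeline; the function names and the toy sequence are hypothetical and the sequence is not the real CTCF CTD.

```python
from collections import Counter

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def aa_composition(seq):
    """Return {residue: fraction} for the standard amino acids found in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in STANDARD_AA)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in sorted(counts.items())}


def n_aa_types(seq):
    """Number of distinct standard residue types present in seq."""
    return len(aa_composition(seq))


# Hypothetical toy region for illustration only (NOT the CTCF CTD sequence).
toy_region = "KKPSEEDEEDSEPKRGRKPPSEEE"
comp = aa_composition(toy_region)
print(n_aa_types(toy_region), "distinct residue types out of 20")
for aa, frac in comp.items():
    print(f"{aa}\t{frac:.3f}")
```

A region that uses markedly fewer than 20 residue types, as reported for the CTD, is one simple indicator of compositional bias.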

We screened for CTD-deletion clones by CRISPR DNA-fragment editing programmed with dual single guide RNAs (sgRNAs) and a donor construct containing FLAG sequences for tagging50,51 and obtained two single-cell clones with precise deletion of the CTCF CTD (ΔCTD) (Figures S1A–S1D). We performed CTCF chromatin immunoprecipitation with sequencing (ChIP-seq) with these two ΔCTD clones as well as with wild-type (WT) clones as a control (Figure S1B) and found that almost every CTCF peak within the three PCDH gene clusters is decreased upon CTD deletion, suggesting that CTD has an important role in CTCF binding to DNA (Figures 1F, S2A, and S2B). Aggregated peak analysis showed that there is a significant decrease of CTCF binding at the PCDH CBS elements (Figure 1G). DNA-bound CTCF anchors the cohesin complex via its NTD but not CTD.4,12,43,44,45,46 To test this, we performed ChIP-seq experiments with a specific antibody against Rad21, a cohesin subunit, and found that cohesin is colocalized with CTD-deleted CTCF, suggesting that CTD-deleted CTCF is still able to anchor cohesin at the cPCDH locus (Figures 1F and S2A). However, there is a significant decrease of cohesin enrichments upon deletion of CTCF CTD (Figures 1F, 1H, S2A, and S2C). We then analyzed genome-wide CTCF and cohesin enrichments and found that both CTCF and cohesin enrichments are significantly decreased upon CTD deletion (Figures 1I and 1J). However, computational analyses revealed no alteration of CTCF motifs for any of the three types of CBS elements upon CTD deletion, suggesting that deletion of CTD does not alter the DNA binding specificity of the central ZF domain (Figures 1K–1M), despite the fact that CTCF enrichments are decreased for all three types of CBS elements (Figures S2D–S2F).

We next performed quantitative high-resolution circularized chromosome conformation capture experiments (QHR-4C, see STAR Methods)19 with the HS5-1 enhancer as an anchor and found that there is a significant decrease of long-distance chromatin interactions between the HS5-1 enhancer and its target promoters (Figure 1N). Finally, we performed RNA sequencing (RNA-seq) experiments and found that, consistent with decreased chromatin interactions between enhancers and promoters, there is a significant decrease of expression levels of members of the PCDHα gene cluster upon CTCF CTD deletion (Figure 1O).

Deletion of CTCF CTD affects gene regulation

We next analyzed the RNA-seq data using DESeq2 with adjusted p value <0.05 and |log2FC| >1 (FC, fold change) as cutoffs. We found 150 up-regulated genes (Table S1, ingenuity pathway analysis [IPA]; Figure S3A) with mean log2FC of 1.70 and 207 down-regulated genes (Table S2; Figure S3B) with mean log2FC of −1.84 (Figure S3C). We also found that the down-regulated genes are closer to CBS elements than the up-regulated genes (Figure S3D).
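To make the cutoffs concrete, the sketch below applies an adjusted p value threshold of 0.05 and an absolute log2FC threshold of 1 to a table of per-gene results and reports the mean log2FC of each class. It is only an illustration in plain Python, assuming results have already been exported from DESeq2; the gene names and values are invented, and `classify_gene` is a hypothetical helper, not a DESeq2 function.

```python
def classify_gene(log2fc, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns' (not significant) from its statistics."""
    # A missing adjusted p value (NA in exported results) is treated as 'ns'.
    if padj is None or padj >= padj_cutoff:
        return "ns"
    if log2fc > lfc_cutoff:
        return "up"
    if log2fc < -lfc_cutoff:
        return "down"
    return "ns"


def mean(values):
    return sum(values) / len(values) if values else float("nan")


# Invented example rows: (gene, log2FC, adjusted p value).
toy_results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -2.1, 0.01),
    ("GENE_C", 0.4, 0.0001),  # significant but small effect
    ("GENE_D", 3.0, 0.2),     # large effect but not significant
    ("GENE_E", -1.2, 0.04),
    ("GENE_F", 1.6, 0.03),
    ("GENE_G", 2.5, None),    # no adjusted p value available
]

classes = {"up": [], "down": [], "ns": []}
for gene, lfc, padj in toy_results:
    classes[classify_gene(lfc, padj)].append(lfc)

print("up:", len(classes["up"]), "mean log2FC", round(mean(classes["up"]), 2))
print("down:", len(classes["down"]), "mean log2FC", round(mean(classes["down"]), 2))
```

Using the absolute value of log2FC is what allows the same cutoff to yield both up-regulated and down-regulated sets.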

Deletion of C-terminal 116 aa leads to increased CTCF binding

The CTD of CTCF contains an internal RNA-binding region (RBR) and a downstream region of 116 aa.42,52 To investigate its function, we generated targeted deletion by screening single-cell CRISPR clones using CRISPR DNA-fragment editing with Cas9 programmed by dual sgRNAs.17,50 We obtained two clones with deletion of the C-terminal 116 aa (ΔCT116) (Figures S1E–S1G). We performed CTCF ChIP-seq experiments and found, remarkably, that there is a significant increase of CTCF enrichments in the cPCDH gene complex upon deletion of the C-terminal 116 aa (Figures 2A, 2B, S4A, and S4B). In addition, we performed Rad21 ChIP-seq experiments and found that there is a significant increase of cohesin enrichments at the cPCDH locus (Figures 2A, 2C, S4A, and S4C), consistent with the model of CTCF asymmetrical blocking of cohesin “loop extrusion.”

Figure 2.


Deleting 116 amino acids from the C terminus leads to increased CTCF binding and affects gene expression

(A) CTCF and Rad21 ChIP-seq peaks at the cPCDH gene complex.

(B and C) Heatmaps of CTCF (B) and Rad21 (C) normalized signals at the cPCDH CBS elements. Student’s t test, ∗p < 0.05.

(D and E) Heatmaps of global CTCF (D) and Rad21 (E) ChIP-seq signals. Student’s t test, ∗∗∗∗p < 0.0001.

(F) QHR-4C interaction profiles of the PCDHα gene cluster using HS5-1 as an anchor.

(G) RNA-seq indicates increased expression levels of PCDHα6, PCDHα12, and PCDHαc2. Data are presented as mean ± SD; Student’s t test, ∗∗∗∗p < 0.0001.

See also Figures S1, S3, and S4.

We next performed genome-wide analyses and found similar increases in CTCF and cohesin enrichments upon deletion of the C-terminal 116 aa (Figures 2D and 2E). Genome-wide analyses of CTCF motifs showed no alteration in any of the three types of CBS elements (Figures S4D–S4I). We also performed QHR-4C experiments using the HS5-1 enhancer as an anchor and found a significant increase of chromatin contacts with the target promoters of PCDHα6 and PCDHα12 (Figure 2F). Finally, RNA-seq experiments showed a significant increase of expression levels of PCDHα6, PCDHα12, and PCDHαc2 (Figure 2G). These data demonstrate that the unstructured region of the C-terminal 116 aa inhibits CTCF binding to DNA.

Deletion of the CTCF C-terminal 116 aa affects gene expression

RNA-seq experiments revealed 432 up-regulated genes (Table S3; Figure S3E) with mean log2FC of 2.36 and 461 down-regulated genes (Table S4; Figure S3F) with mean log2FC of −1.98 in ΔCT116 cells (Figure S3G). Interestingly, we found that the up-regulated genes are closer to increased CTCF peaks (Figure S3H).

Rescue with a series of C-terminal truncated CTCFs uncovers an NCR

We next generated a series of C-terminally truncated CTCF constructs with V5 tags and transfected them into ΔCTD cells (Figure 3A). Specifically, truncated CTCF with 116 aa deleted at the C terminus was constructed as CTCF1-611. In addition, CTCF1-637 contains an additional region of 26 highly conserved, mostly negatively charged amino acids (the NCR). Finally, CTCF1-663 contains a further downstream region of 26 aa with 9 proline residues and 7 positively charged lysine or arginine residues. We generated stable cell lines by infecting with lentiviruses containing these constructs and verified their expression by western blots (Figures 3A and 3B).
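The construct boundaries can be checked with simple residue arithmetic. The sketch below assumes a full-length protein of 727 aa, which is inferred here from CTCF1-611 lacking the C-terminal 116 aa (611 + 116); the label `PRO_BASIC` for the proline- and lysine/arginine-rich segment is a hypothetical name used only for illustration.

```python
def segment_length(start, end):
    """Length of an inclusive, 1-based residue range."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


C_TERMINAL_DELETION = 116
FULL_LENGTH = 611 + C_TERMINAL_DELETION  # inferred from CTCF1-611 plus 116 aa

# Last residue retained in each truncated construct named in the text.
CONSTRUCT_ENDS = {"CTCF1-611": 611, "CTCF1-637": 637, "CTCF1-663": 663}

NCR = (612, 637)        # negatively charged region
PRO_BASIC = (638, 663)  # hypothetical label for the further downstream 26 aa

print("NCR length:", segment_length(*NCR))
print("Downstream segment length:", segment_length(*PRO_BASIC))
for name, end in CONSTRUCT_ENDS.items():
    print(name, "lacks", FULL_LENGTH - end, "C-terminal residues")
```

Both added segments come out at 26 residues, matching the lengths stated for the NCR and the region downstream of it.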

Figure 3.


Rescue with a series of C-terminal truncated CTCF proteins reveals an NCR

(A) Schematic illustration of different truncated CTCFs used in constructing stable cell lines. All truncated CTCFs are tagged with V5.

(B) Western blots of stable cell lines with the V5 antibody indicating truncated CTCF proteins in ΔCTD cells.

(C) Binding profiles of CTCFs of different lengths at the cPCDH locus.

(D) Heatmaps of CTCF ChIP-seq signals at the cPCDH locus. Student’s t test, ∗p < 0.05,∗∗p < 0.01,∗∗∗p < 0.001.

(E) Heatmaps of different truncated CTCFs, showing global binding profiles. Student’s t test, ∗∗∗∗p < 0.0001.

(F–H) CTCF enrichments at the three types of CTCF motifs. Student’s t test, ∗∗∗∗p < 0.0001.

See also Figure S5.

We performed ChIP-seq experiments with a specific antibody against V5 tag to investigate CTCF binding profiles and found that the binding strength of CTCF1-611 at the cPCDH locus and throughout the entire genome is the highest among these four CTCF transgenes including full-length CTCF (Figures 3C–3E, S5A, and S5B). Specifically, CTCF1-611 has the highest affinity for all three types of genome-wide CBS elements (Figures 3F–3H and S5C). In conjunction with data of endogenous C-terminal truncation (Figure 2), we concluded that CTCF1-611 has the highest DNA binding affinity and that the NCR of 26 aa from 612 to 637 suppresses CTCF-DNA interactions.

NCR deletion increases CTCF binding and cPCDH expression

To investigate the endogenous function of NCR, we genetically deleted it by screening single-cell CRISPR clones and obtained two cell clones (ΔNCR) (Figures S1H–S1J). We performed CTCF ChIP-seq experiments with these clones and found that there is a significant increase of CTCF enrichments in the cPCDH locus compared with WT controls (Figures 4A, 4B, S6A, and S6B). We also performed Rad21 ChIP-seq and found a similar increase of cohesin enrichments at the cPCDH locus (Figures 4A, 4C, S6A, and S6C). In addition, genome-wide CTCF and cohesin enrichments are also significantly increased upon NCR deletion (Figures 4D and 4E). Furthermore, CTCF or cohesin enrichments are significantly increased at all three types of CBS elements (Figures S6D–S6I). We also performed ChIP-qPCR to further validate increased CTCF binding in ChIP-seq (Figure S6J). We next performed QHR-4C experiments using these single-cell clones with HS5-1 as an anchor and found there is a significant increase of long-distance chromatin interactions with the target promoters of PCDHα6 and PCDHα12 (Figure 4F). Finally, we performed RNA-seq experiments and found that there is a significant increase of expression levels of PCDHα6 and PCDHα12 upon NCR deletion (Figure 4G). These data suggest an important function of NCR in CTCF binding and gene regulation.

Figure 4.


NCR deletion increases CTCF binding and affects gene regulation

(A) CTCF and Rad21 ChIP-seq signals at the cPCDH locus, showing increased CTCF and cohesin enrichments at CBS elements upon NCR deletion (Δ612-637).

(B and C) Heatmaps of CTCF (B) and Rad21 (C) ChIP-seq signals at the cPCDH locus. Student’s t test, ∗p < 0.05.

(D and E) Heatmaps of CTCF (D) and Rad21 (E) ChIP-seq signals, indicating genome-wide CTCF and Rad21 enrichments upon NCR deletion. Student’s t test, ∗∗∗∗p < 0.0001.

(F) Quantitative high-resolution 4C (QHR-4C) experiments with HS5-1 as an anchor, indicating increased chromatin interactions of the HS5-1 enhancer with PCDHα6 or PCDHα12 promoters.

(G) RNA-seq indicates increased expression levels of PCDHα6, PCDHα12, and PCDHαc2 upon NCR deletion. Data are presented as mean ± SD; Student’s t test, ∗∗∗∗p < 0.0001.

(H) Volcano plots of differential gene expression analyses for WT and ΔNCR cells. Red dots, genes passing both cutoffs (|log2FC| > 1 and adjusted p value <0.05). Blue dots, genes passing only the adjusted p value cutoff (<0.05). Yellow dots, genes passing only the fold-change cutoff (|log2FC| > 1). FC, fold change.

(I) TSS distances of up-regulated, down-regulated, or unchanged (NC) genes to the closest increased CTCF peaks in ΔNCR cells (data are shown as a cumulative distribution function, CDF).

See also Figures S1, S3, and S6 and Tables S5 and S6.

Deletion of NCR affects gene regulation

Genome-wide analyses of RNA-seq data identified 312 up-regulated genes (Table S5; Figure S3I) with mean log2FC of 1.72 and 125 down-regulated genes (Table S6; Figure S3J) with mean log2FC of −1.55 (Figure 4H). We also found that the up-regulated genes are closer to increased CTCF peaks (Figure 4I). For example, we found that expression levels of EHD2 (EH domain containing 2) correlate with the change of CTCF binding at the promoter region (Figures S3K–S3M).
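The distance comparison behind a plot like Figure 4I can be sketched as follows: for each transcription start site (TSS), find the nearest peak, then summarize each gene group as an empirical cumulative distribution function. This is an illustrative sketch under simplifying assumptions (a single chromosome, peaks reduced to single coordinates, invented positions); the function names are hypothetical and it is not the authors' analysis code.

```python
from bisect import bisect_left


def nearest_distance(pos, sorted_peaks):
    """Distance from pos to the closest coordinate in an ascending list."""
    if not sorted_peaks:
        raise ValueError("no peaks given")
    i = bisect_left(sorted_peaks, pos)
    candidates = []
    if i < len(sorted_peaks):
        candidates.append(sorted_peaks[i] - pos)   # nearest peak at or right of pos
    if i > 0:
        candidates.append(pos - sorted_peaks[i - 1])  # nearest peak left of pos
    return min(candidates)


def ecdf(values):
    """Empirical CDF as a list of (value, cumulative fraction) pairs."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (k + 1) / n) for k, x in enumerate(xs)]


# Invented coordinates on one hypothetical chromosome.
peaks = sorted([1000, 5000, 20000])
up_tss = [1200, 4800, 19000]
down_tss = [10000, 30000, 12000]

up_dist = [nearest_distance(t, peaks) for t in up_tss]
down_dist = [nearest_distance(t, peaks) for t in down_tss]
print("up-regulated:", ecdf(up_dist))
print("down-regulated:", ecdf(down_dist))
```

A group whose curve rises earlier along the distance axis lies closer to the peaks, which is the sense in which the up-regulated genes are described as closer to increased CTCF peaks.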

A role of NCR in rewiring chromatin accessibility

We next performed the assay for transposase-accessible chromatin with sequencing (ATAC-seq) and found that chromatin accessibility is increased at most ATAC-seq peaks in the cPCDH locus upon CTCF NCR deletion (Figure S7). We also observed significantly increased ATAC-seq signals genome-wide at increased CTCF peaks (Figure S8A). We next calculated the log2FC of ATAC-seq peaks and found that the majority of peaks are increased and a minority are decreased (Figure S8B). Specifically, we identified 7,373 increased differential accessibility regions (DARs) and 1,311 decreased DARs genome-wide (Figure S8C). Interestingly, we found that most increased DARs are located at gene promoters and that many decreased DARs are located in intergenic regions (Figure S8C).
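A per-peak log2FC of the kind mentioned above can be sketched as the log ratio of normalized signal in mutant versus control cells. The pseudocount and the cutoff of 1 used below are assumptions made for illustration; the source does not state how the fold change was stabilized or which threshold defined a DAR, and the signal values are invented.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2((treated + c) / (control + c)); c > 0 avoids division by zero."""
    if treated < 0 or control < 0 or pseudocount < 0:
        raise ValueError("signals and pseudocount must be non-negative")
    num = treated + pseudocount
    den = control + pseudocount
    if num == 0 or den == 0:
        raise ValueError("zero signal requires a positive pseudocount")
    return math.log2(num / den)


def call_direction(lfc, cutoff=1.0):
    """Hypothetical labeling of a peak by the sign and size of its log2FC."""
    if lfc >= cutoff:
        return "increased"
    if lfc <= -cutoff:
        return "decreased"
    return "unchanged"


# Invented normalized signals: (mutant, control).
toy_peaks = {"peak_1": (7.0, 1.0), "peak_2": (1.0, 7.0), "peak_3": (3.0, 3.0)}
for name, (mutant, control) in toy_peaks.items():
    lfc = log2_fold_change(mutant, control)
    print(name, round(lfc, 2), call_direction(lfc))
```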

We noted that ∼3/4 of the DARs (6,403) overlap with CBS elements and ∼1/4 (2,281) do not (Figure S8D). Among the DARs overlapping CBS elements, 5,431 are increased and 972 are decreased (Figure S8E). We calculated the distance from non-CBS DARs to the nearest CBS elements and found that increased DARs are closer to CBS elements than decreased DARs (Figure S8F). Integrated analysis with RNA-seq data showed that increased DARs correlate with enhanced levels of gene expression (Figure S8G). Together, these data suggest that CTCF NCR regulates chromatin accessibility and gene expression.
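The DAR counts reported above are mutually consistent, which can be verified with a few lines of arithmetic. In the sketch below, the split of non-CBS DARs into increased and decreased is not stated in the text; it is derived here by subtraction and should be read as implied rather than reported.

```python
# Counts stated in the text.
increased_total = 7373
decreased_total = 1311
cbs_overlapping = 6403
non_cbs = 2281
cbs_increased = 5431
cbs_decreased = 972

total_dars = increased_total + decreased_total

# Derived by subtraction (implied by the reported numbers, not reported directly).
non_cbs_increased = increased_total - cbs_increased
non_cbs_decreased = decreased_total - cbs_decreased

print("total DARs:", total_dars)
print("fraction overlapping CBS:", round(cbs_overlapping / total_dars, 3))
print("fraction not overlapping CBS:", round(non_cbs / total_dars, 3))
print("non-CBS increased / decreased:", non_cbs_increased, non_cbs_decreased)
```

The two partitions, by direction and by CBS overlap, sum to the same total, and the overlapping fraction is close to the stated three quarters.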

AlphaFold prediction suggests allosteric autoinhibition of CTCF DNA binding by NCR in CTD

We first validated the inhibitory role of NCR in CTCF DNA binding directly in vitro by the electrophoretic mobility shift assay (EMSA) (Figures 5A–5E). We then used AlphaFold3 to predict the conformation of the CTCF ZF array and CTD with or without DNA ligands (Figures 5F–5I).53 Intriguingly, AlphaFold3 modeling suggests that apo-state CTCF adopts an autoinhibited conformation, with NCR folding back and interacting with the positively charged surface of the ZF array via electrostatic contacts (Figure 5F). In particular, a cluster of seven negatively charged residues within NCR appears to act as a DNA mimic and can fold onto the positively charged DNA-binding surface of the CTCF ZF array via multiple electrostatic interactions (Figure 5G). In addition, both the negatively charged residues of NCR and positively charged residues of ZFs are highly conserved across vertebrates (Figure 5H). Binding to DNA targets induces large conformational changes of CTCF and releases the flexible CTD (Figure 5I). Thus, AlphaFold modeling suggests allosteric autoregulation of CTCF DNA binding via DNA-mimicking NCR within the flexible CTD (Figure 5J).

Figure 5.


AlphaFold3 modeling suggests autoregulation of CTCF DNA binding by NCR cis-interactions

(A) Western blot of recombinant human CTCF proteins.

(B–E) EMSA of full-length human CTCF, ΔNCR, and ΔCT116 with three different PCDH promoter or exonic CBS elements (B–D) with quantifications (E). Data are presented as mean ± SD; Student’s t test, ∗∗∗∗p < 0.0001.

(F) The surface representation of the AlphaFold3 model of human CTCF in the absence of DNA. NTD, ZF, CTD, and NCR are in orange, green, blue, and red, respectively.

(G) Negatively charged residues of NCR form nine pairs of electrostatic interactions with the zinc-finger domain.

(H) Pairwise interactions between NCR (top) and ZFs (bottom).

(I) The surface representation of the AlphaFold3 model of CTCF with DNA ligand.

(J) The model of CTCF autoinhibition via large conformational changes with NCR as a flexible DNA mimic.

Discussion

Combinatorial and patterned expression of diverse cadherin-like cPcdh genes in single cells in the brain enables a molecular logic of self-avoidance between neurites from the same neurons as well as a functional assembly of synaptic connectivity between neurons of the same developmental origin.29,54,55,56,57 This complicated cPcdh expression program is achieved by ATP-dependent active cohesin “loop extrusion,” which brings remote super-enhancers into close contact with target variable promoters via CTCF-mediated anchoring at tandem-arrayed directional CBS elements.14,17,18,19,27,30,58,59,60 In particular, tandem-arrayed CTCF sites function as topological chromatin insulators to balance distance- and context-dependent promoter choice to activate cell-specific gene expression in the brain.18,19,30,58,59 Consequently, the dynamic interactions between CTCF and its recognition sites at variable promoters and super-enhancers are central for establishing cPcdh expression programs during brain development. In this work we systematically investigated the function of CTCF CTD and uncovered an NCR for maintaining proper CTCF binding to DNA in the large cPCDH gene complex and throughout the entire genome.

CTCF is a key 3D genome architectural protein that recognizes a large range of genomic sites via the central domain of 11 tandem ZFs and anchors loop-extruding cohesin via the YDF motif of the NTD.4,12,43,44,45,46 In addition, other ZF architectural proteins such as ZNF143, MAZ, PATZ1, and ZNF263 may collaborate with CTCF to maintain proper CTCF insulation at TAD boundaries.22,60,61 The CTCF last ZF and CTD contain an internal RBR of 38 aa, and this region helps CTCF clustering and searching for authentic CBS elements.42,52,62,63 Through a series of genetic deletion and rescue experiments, we uncovered an important NCR immediately downstream of the positively charged RBR. Specifically, we showed that NCR deletion leads to a significant increase of CTCF enrichments at all three types of CBS elements throughout the whole genome. In particular, NCR deletion results in increased CTCF binding at cPCDH variable promoters and super-enhancers, accompanied by increased chromatin accessibility and long-distance DNA looping. NCR may play an inhibitory role in CTCF clustering via RNA and in dynamic recognition of cognate genomic target sites.63,64,65 NCR either repulses DNA directly or weakens the strength of CTCF interaction with RNA and thus inhibits CTCF clustering and searching for cognate CBS elements. Either way NCR appears to be important in maintaining proper CTCF affinity for cognate genomic sites and specific chromatin looping at the cPCDH gene complex and likely throughout the entire genome. Thus, CTCF has an intricate self-adjusting mechanism to control the dynamic binding to genomic sites.

One intriguing allosteric self-adjusting mechanism of CTCF DNA binding is suggested by AlphaFold3 prediction (Figure 5J). Given the large conformational change of CTCF induced by DNA binding, NCR could be released from interacting with the ZF array upon CTCF recognition of genomic CBS elements. Thus, NCR functions as a DNA mimic and self-associates with the positively charged surface of the ZF array via electrostatic interactions in the CTCF apo state. Indeed, CTCF CTD interacts with its ZF array in in vitro pull-down experiments.66 The self-association of disordered flexible NCR may be a potential general mechanism of DNA- and RNA-binding proteins. For example, an NCR of ZNF410 regulates its ZF array to bind DNA via a cis-allosteric inhibitory mechanism.65 In addition, autoinhibitory intramolecular interactions proofread U2AF recognition of authentic polypyrimidine tracts during RNA binding.67

NCR contains an acidic array of 10 glutamates and 5 aspartates for a total of 15 negatively charged aa residues (Figure 3A). In addition, there are four serine residues immediately upstream of NCR that may be phosphorylated by casein kinase II, thus switching to negative charges upon phosphorylation.68 Numerous CTCF mutations are related to multiple cancers or a group of neurodevelopmental diseases known as CTCF-related disorders (CRDs).69,70 The large spectrum of neurodevelopmental diseases or CRDs may be related to dysregulation of clustered protocadherins.71,72 Interestingly, mutations within the CTCF NCR that alter its electric charge are associated with several types of cancers. For example, CTCF mutations of E616K or E626K are associated with melanoma and lung cancers, respectively.69,70 The exact pathogenetic mechanisms are not known but are very likely related to disruptions of the alternating positive-negative aa block patterns of RBR and NCR within CTD and selective partitioning into CTCF-specific trapping zones.63,73 Specifically, the positively charged RBR and negatively charged NCR within the C-terminal domain of CTCF may constitute recently noticed alternating charge block patterns and may participate in the selective partitioning of phase separation or in the formation of protein condensates during chromatin looping in the 3D genome.73,74
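The effect of a charge-reversing substitution such as glutamate to lysine can be illustrated with a simple residue-count estimate of net charge. The sketch below is a rough approximation that counts Lys/Arg as +1 and Asp/Glu as −1 and ignores histidine, termini, and phosphorylation; the peptide is a hypothetical acidic stretch, not the real NCR sequence, and the function names are illustrative.

```python
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(seq):
    """Approximate net charge from Lys/Arg (+1) and Asp/Glu (-1) counts only."""
    return sum(CHARGE.get(ch, 0) for ch in seq.upper())


def substitute(seq, position, new_residue):
    """Return seq with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[: position - 1] + new_residue + seq[position:]


# Hypothetical acidic peptide for illustration (NOT the CTCF NCR sequence).
toy_acidic = "EEDSEEDEEP"
mutant = substitute(toy_acidic, 1, "K")  # a Glu-to-Lys change at position 1

print("wild-type net charge:", net_charge(toy_acidic))
print("E-to-K mutant net charge:", net_charge(mutant))
print("change:", net_charge(mutant) - net_charge(toy_acidic))
```

Under this approximation a single E-to-K substitution shifts the net charge by +2, because one negative charge is removed and one positive charge is added.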

Limitations of the study

While our genetic experiments, in conjunction with chromosome conformation capture and RNA-seq, demonstrated that CTCF CTD, in particular NCR, plays a crucial role in maintaining proper CTCF binding at the cPCDH gene complex, and subsequent PCDH chromatin looping and gene regulation, whether this close correlation between chromatin looping and gene expression could be generalized to the entire genome remains to be tested. In addition, while our ATAC-seq experiments showed that enhanced CTCF DNA binding correlates mostly with higher chromatin accessibility, the exact mechanism is not known but is most likely related to pioneer factors.

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Qiang Wu (qiangwu@sjtu.edu.cn).

Materials availability

All unique/stable reagents generated in this study are available from the lead contact with a completed materials transfer agreement.

Data and code availability

  • High-throughput sequencing files (ChIP-seq, RNA-seq, ATAC-seq, and QHR-4C) have been deposited in the NCBI Gene Expression Omnibus (GEO) database under the accession numbers GSE261210, GSE261212, GSE261213, and GSE261209, respectively.

  • This paper does not report original code.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Acknowledgments

We are very grateful to the reviewers who made constructive comments and excellent suggestions of AlphaFold and EMSA analyses, which we have appreciatively implemented in the generalized allosteric autoinhibition model of DNA- and RNA-binding proteins. We are also grateful for advice on bioinformatics from Dr. Jingwei Li and all members of our laboratory for discussion. This study was supported by grants to Q.W. from the National Natural Science Foundation of China (32330016) and the National Key R&D Program of China (2022YFC3400200).

Author contributions

Q.W. conceived the research. Y.Z. contributed resources. L.L. performed experiments. L.L. and Y.T. analyzed data. L.L. and Q.W. wrote the manuscript.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies

Anti-CTCF Millipore Cat# 07–729;
RRID: AB_441965
Anti-Rad21 Abcam Cat# AB992;
RRID: AB_2176601
Anti-V5 Abcam Cat# AB15828; RRID: AB_443253
Goat anti-Rabbit IgG (H + L) Highly Cross-Adsorbed Secondary Antibody ThermoFisher Scientific Cat# A32735; RRID: AB_2633284
Anti-GAPDH Abmart Cat# P30008; RRID: AB_2936506
Anti-MYC Abmart Cat# M20002;
RRID:AB_2861172
Anti-Flag Abmart Cat# M2008;
RRID:AB_2713960

Bacterial and virus strains

Stbl3 competent cells This study N/A

Chemicals, peptides, and recombinant proteins

EcoRI NEB Cat# R0101S
BsaI NEB Cat# R0535S
DpnII NEB Cat# R0176L
BamHI NEB Cat# R0136S
NEB buffer 2 NEB Cat# B7002S
T4 DNA ligase NEB Cat# M0202L
Formaldehyde ThermoFisher Scientific Cat# 28908
RNase A ThermoFisher Scientific Cat# EN0531
Proteinase K NEB Cat# P8107S
Glycogen ThermoFisher Scientific Cat# R0561
TRIzol Reagent ThermoFisher Scientific Cat# 15596026
Anti-FLAG M2 Magnetic Beads Sigma-Aldrich Cat# M8823
Streptavidin Dynabeads ThermoFisher Scientific Cat# 65001
AMPure XP beads Beckman Coulter Cat# A63881
Lipofectamine 3000 Invitrogen Cat# L30001
Polybrene Sigma-Aldrich Cat# TR-1003-G
Puromycin dihydrochloride Sigma-Aldrich Cat# P8833

Critical commercial assays

MinElute Gel Extraction Kit QIAGEN Cat# 28606
VAHTS™ Universal DNA Library Prep Kit for Illumina V2 Vazyme Cat# ND606
VAHTS™ Multiplex Oligos set 4 for Illumina Vazyme Cat# N321
QIAquick PCR Purification Kit QIAGEN Cat# 28106
VAHTS Universal V6 RNA-seq Library Prep Kit for Illumina Vazyme Cat# NR604
VAHTS mRNA Capture Beads Vazyme Cat# N401
VAHTS® RNA Multiplex Oligos Set 1 for Illumina Vazyme Cat# N323
Vazyme Hyperactive ATAC-Seq Library Prep Kit for Illumina Vazyme Cat# TD711
TruePrep Index Kit V2 for Illumina Vazyme Cat# TD202
pClone007 Versatile Simple Vector Kit TsingKe Cat# TSV-007VSm
BCA protein assay kit Beyotime Cat# P0009
TnT® T7 Quick Coupled Transcription/Translation System Promega Cat# L1170
LightShift® Chemiluminescent EMSA Kit Thermo Cat# 20148
SYBR qPCR Master Mix Vazyme Cat# Q711

Deposited data

High-throughput sequencing files (QHR-4C) This study GEO: GSE261209
High-throughput sequencing files (ChIP-seq) This study GEO: GSE261210
High-throughput sequencing files (RNA-seq) This study GEO: GSE261212
High-throughput sequencing files (ATAC-seq) This study GEO: GSE261213
Raw imaging files This study, Mendeley data https://data.mendeley.com/preview/345thpxnbt?a=4721a09e-e875-4859-9e00-8eaa95ebf5b9
Structures predicted by AlphaFold3 This study https://www.modelarchive.org/doi/10.5452/ma-jd5dd; https://www.modelarchive.org/doi/10.5452/ma-u27pa

Experimental models: Cell lines

Human: HEC-1-B ATCC Cat# HTB-113, RRID: CVCL_0294
Human: HEK293T ATCC Cat# CRL-3216, RRID: CVCL_0063
Human: HEC-1-B ΔCTD clone1 This paper N/A
Human: HEC-1-B ΔCTD clone2 This paper N/A
Human: HEC-1-B ΔCT116 clone1 This paper N/A
Human: HEC-1-B ΔCT116 clone2 This paper N/A
Human: HEC-1-B ΔNCR clone1 This paper N/A
Human: HEC-1-B ΔNCR clone2 This paper N/A
Human: HEC-1-B WT-FLAG clone1 This paper N/A
Human: HEC-1-B WT-FLAG clone2 This paper N/A
Human: HEC-1-B CTCF1-611 This paper N/A
Human: HEC-1-B CTCF1-637 This paper N/A
Human: HEC-1-B CTCF1-663 This paper N/A
Human: HEC-1-B CTCF-FL This paper N/A

Oligonucleotides

See Table S7 This study N/A

Recombinant DNA

Plasmid: pMD2.G Addgene Cat# 12259, RRID: Addgene_12259
psPAX2 Addgene Cat# 12260, RRID: Addgene_12260
Plasmid: pGL3-U6-sgRNA-PGK-Puro Li et al.50 https://academic.oup.com/jmcb/article/7/4/284/901042
pcDNA3.1-Cas9-WT J. Xi (Peking University) N/A
Plasmid: pClone007-ΔCTD homology arm This paper N/A
Plasmid: pClone007-ΔCT116 homology arm This paper N/A
Plasmid: pClone007-ΔNCR homology arm This paper N/A
Plasmid: pClone007-WT-FLAG homology arm This paper N/A
Plasmid: pGL3- ΔCTD-sg1 This paper N/A
Plasmid: pGL3- ΔCTD-sg2 This paper N/A
Plasmid: pGL3- ΔCT116-sg1 This paper N/A
Plasmid: pGL3- ΔNCR-sg1 This paper N/A
Plasmid: pGL3- ΔNCR-sg2 This paper N/A
Plasmid: pLVX-CTCF1-611 This paper N/A
Plasmid: pLVX-CTCF1-637 This paper N/A
Plasmid: pLVX-CTCF1-663 This paper N/A
Plasmid: pLVX-CTCF-FL This paper N/A
Plasmid: pTNT-CTCF-FL This paper N/A
Plasmid: pTNT-CTCF-ΔNCR This paper N/A
Plasmid: pTNT-CTCF-ΔCT116 This paper N/A
Plasmid: pClone007-EMSA-a6 This paper N/A
Plasmid: pClone007-EMSA-a12 This paper N/A

Software and algorithms

Bowtie2 software (v2.3.4.2) Langmead et al.75 http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
MACS2 (v2.1.2) Zhang et al.76 https://github.com/taoliu/MACS
MEME v4.12.0 Bailey et al.77 https://meme-suite.org/meme/
STAR (v2.7.3a) Dobin et al.78 https://github.com/alexdobin/STAR
Cufflinks (v2.2.1) Trapnell et al.79 http://cole-trapnell-lab.github.io/cufflinks/
Deeptools (v3.5.3) Ramírez et al.80 https://deeptools.readthedocs.io/en/latest/
Samtools (v1.12) Li et al.81 http://www.htslib.org/doc/
Bedtools (v2.30.0) Quinlan et al.82 https://bedtools.readthedocs.io/en/latest/index.html
R3Cseq (v1.38.0) Thongjuea et al.83 https://bioconductor.org/packages/release/bioc/html/r3Cseq.html
DESeq2 (v1.32.0) Love et al.84 https://bioconductor.org/packages/release/bioc/html/DESeq2.html
DiffBind (v3.4) Ross-Innes et al.85 https://bioconductor.org/packages/release/bioc/html/DiffBind.html
FIMO (v5.5.5) Grant et al.86 http://meme-suite.org/tools/fimo
ChIPseeker (v1.30.3) Yu et al.87 https://guangchuangyu.github.io/software/ChIPseeker
PicardTools N/A http://broadinstitute.github.io/picard/
UCSC Genome Browser N/A https://genome.ucsc.edu/
ggplot2 (v3.4.2) Open source https://ggplot2.tidyverse.org/
ImageJ Schneider et al.88 https://imagej.net/ij/index.html
PyMOL Molecular Graphics System, Version 2.3.0 Schrodinger, LLC https://www.pymol.org/

Experimental model and study participant details

Cells and culture conditions

Human HEC-1-B cells (ATCC) were cultured as previously described in MEM medium (Hyclone) supplemented with 10% (v/v) FBS (Gibco), 2 mM GlutaMAX (Gibco), 1 mM sodium pyruvate (Sigma), and 1% penicillin-streptomycin (Gibco).17 Briefly, HEC-1-B cells were maintained at 37°C in a humidified incubator containing 5% CO2. The culture medium was changed every 24 h and cells were passaged every 72 h. For passaging, the medium was removed and cells were washed with PBS, digested with trypsin (Gibco) for 5 min at 37°C in a humidified incubator containing 5% CO2, and quenched with PBS supplemented with 10% FBS. Digested cells were collected by centrifugation at 500 rpm for 5 min at room temperature. Pelleted cells were resuspended in medium and seeded onto new plates.

HEK293T cells (ATCC) were cultured in Dulbecco’s modified Eagle’s medium (Hyclone) supplemented with 10% (v/v) FBS (Gibco) and 1% penicillin-streptomycin (Gibco). Cells were maintained at 37°C in a humidified incubator containing 5% CO2. The culture medium was changed every 24 h and HEK293T cells were passaged every 48 h. HEK293T cells were passaged in the same way as HEC-1-B cells, except that the trypsin digestion time was shortened to 2 min.

Method details

Plasmid construction

For all CRISPR/Cas9 experiments, the sgRNA plasmids were constructed as described before.50 Briefly, pGL3-U6 vector was linearized by BsaI (NEB) to generate the cloning backbone with 5′ overhangs of ‘TGGC’ and ‘TTTG’ at the two ends. In addition, a pair of complementary oligonucleotides (Table S7) containing the sgRNA targeting sequences with 5′ overhangs of ‘ACCG’ or ‘AAAC’ was annealed and ligated to BsaI-digested linearized vector backbone using T4 DNA ligase (NEB). The complete sgRNA sequences are under the control of the U6 promoter and will be transcribed by Pol III in mammalian cells. Finally, the Cas9 plasmid was obtained as a gift from Peking University.
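The oligonucleotide pair described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' design script: it assumes that the 'ACCG' overhang is placed on the sense oligonucleotide and the 'AAAC' overhang on the antisense oligonucleotide, and the 20-nt target shown is a hypothetical sequence rather than one taken from Table S7.

```python
from typing import Tuple

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase/lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def sgrna_oligos(target: str) -> Tuple[str, str]:
    """Build the two oligonucleotides to anneal for a BsaI-cut sgRNA vector.

    Assumption (not stated in the text): 'ACCG' goes on the sense oligo
    and 'AAAC' on the antisense oligo.
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    top = "ACCG" + target
    bottom = "AAAC" + reverse_complement(target)
    return top, bottom


# Hypothetical target sequence, for illustration only.
target = "GATTACAGATTACAGATTAC"
top, bottom = sgrna_oligos(target)
print(top)
print(bottom)
```

After annealing, the targeting portions of the two oligonucleotides are fully complementary, and the 4-nt 5′ extensions remain single-stranded for ligation into the linearized backbone.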

For donor plasmids used in establishing ΔCTD and ΔCT116 stable cell lines via homologous recombination (HR), the donor was designed such that ΔCTD or ΔCT116 was tagged with FLAG sequences (Table S7) for tracing CTCF proteins with FLAG-specific antibodies, because deletion of the endogenous CTCF C-terminal fragment removes the epitope (659–675 AAs) recognized by the CTCF antibody (Millipore). In addition, as a control, we FLAG-tagged WT CTCF at its C-terminus at the endogenous locus via HR. To generate donor plasmids for CRISPR screening of single-cell clones through the HR pathway, two homologous arms of ∼1 kb each flanking the target sites were amplified from genomic DNA by PCR with primers (Table S7). The amplified homologous arms were purified and a donor DNA fragment for HR was generated through overlapping PCR. Finally, the donor DNA fragment was ligated to a T-vector through TA cloning. For ΔNCR cells, the epitope of the CTCF antibody was intact and the donor plasmid was constructed without FLAG.

All rescue CTCF constructs were V5-tagged to distinguish them from the endogenous FLAG-tagged CTCF in ΔCTD cells. To construct lentiviral pLVX vectors containing a series of truncated CTCF C-terminal domains, the pLVX vector was first linearized by EcoRI and BamHI (NEB) and purified as the cloning backbone. CTCF1-611, CTCF1-637, CTCF1-663, and full-length CTCF were cloned from a cDNA library of HEC-1-B cells with the same 5′ primer containing an EcoRI restriction endonuclease site and different 3′ primers containing a BamHI restriction endonuclease site in conjunction with the V5 tag sequence (Table S7). Truncated CTCF fragments amplified from cDNA were digested with restriction endonucleases and ligated into the linearized pLVX vector using T4 DNA ligase.

ChIP-seq

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiments were performed as described before.14 Briefly, 5×10⁶ cells were collected and crosslinked with 1% formaldehyde, quenched with 2 M glycine added to a final concentration of 125 mM, and washed twice with ice-cold PBS. Crosslinked cells were lysed twice with the ChIP buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 0.15 M NaCl, and 1×protease inhibitors, Roche) with slow rotation at 4°C for 10 min. The lysed cells were centrifuged at 2,500 g at 4°C for 10 min to isolate cell nuclei. The isolated nuclei were resuspended in the ChIP buffer and sonicated using a Bioruptor Sonicator (with the high energy setting at a train of 30-s sonication with 30-s interval for 30 cycles) to fragment DNA to 200–400 bp. The sonicated mixture was centrifuged at 14,000 g at 4°C for 10 min and the supernatant was then precleared with protein A/G magnetic beads (Thermo 26162) for 3 h at 4°C with slow rotation. Antibodies (CTCF: Millipore 07–729, Rad21: Abcam ab992, V5: Abcam ab15828) or anti-FLAG antibody-conjugated magnetic beads (Sigma M8823) were added to the precleared solution and incubated at 4°C overnight with slow rotation to precipitate CTCF or cohesin protein-DNA complexes.
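For the glycine quench, the volume of 2 M stock needed to reach a final concentration of 125 mM follows from a simple mass balance. The sketch below is only illustrative: the 10 mL sample volume is an assumed value (the crosslinking volume is not given in the text), and the function name is hypothetical.

```python
def stock_volume_to_add(sample_volume_ml: float, stock_mM: float, final_mM: float) -> float:
    """Volume of stock (mL) to add so that the mixture reaches final_mM.

    Solves stock_mM * v = final_mM * (sample_volume_ml + v) for v,
    i.e. it accounts for the dilution caused by the added stock itself.
    """
    if not 0 < final_mM < stock_mM:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return sample_volume_ml * final_mM / (stock_mM - final_mM)


# Assumed 10 mL crosslinking reaction (hypothetical; not stated in the Methods).
glycine_ml = stock_volume_to_add(sample_volume_ml=10.0, stock_mM=2000.0, final_mM=125.0)
print(f"Add {glycine_ml:.3f} mL of 2 M glycine")
```

The same function applies to the QHR-4C quench by changing `final_mM` to 200.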

Protein A/G beads were added and incubated for 3 h at 4°C with slow rotation to capture the antibody-protein-DNA complex. The ChIP buffer, high salt buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 0.4 M NaCl), no salt buffer (high salt buffer without NaCl), LiCl buffer (50 mM HEPES pH 7.5, 1 mM EDTA, 1% NP-40, 0.7% sodium deoxycholate, 0.5 M LiCl), and 10 mM Tris-HCl buffer (pH 7.5) were used sequentially to wash the beads. The elution buffer (50 mM Tris-HCl pH 8.0, 10 mM EDTA, 1% SDS) was then used to elute ChIP DNA from the beads at 65°C for 1 h with 1,000 rpm shaking. The eluted protein-DNA complex was reverse-crosslinked at 65°C overnight with 1,000 rpm shaking to dissociate DNA. Finally, proteinase K (NEB) was added and incubated for 2 h at 55°C to digest proteins, and RNase A (Thermo) was added and incubated for 2 h at 37°C to digest RNA.

The DNA was purified by adding an equal volume of phenol-chloroform and mixing by vigorous shaking. The mixture was centrifuged at 4°C at 14,000 g for 10 min to separate the proteins and DNA. The supernatant containing DNA was transferred to a new tube. A 2.5-fold volume of ice-cold ethanol, 1/10 volume of 3 M NaAc (pH 5.2), and 1.5 μL of glycogen (Thermo) were added to precipitate DNA at −80°C for 1 h. The sample was centrifuged at 14,000 g at 4°C for 30 min to pellet DNA. The DNA pellet was washed with 70% ethanol and the sample was centrifuged again at 14,000 g at 4°C for 10 min. The supernatant was removed and the pelleted DNA was air-dried for 5 min to remove residual ethanol. The DNA was then dissolved in nuclease-free water. The concentration of DNA was measured using Qubit (Invitrogen).

To prepare DNA library for deep sequencing, we used the Universal DNA Library Prep Kit (Vazyme ND606). Briefly, DNA was first end-repaired and ligated to adapters from the Multiplex Oligos Set (Vazyme, N321). Adapter-ligated DNA was then purified using AMPure XP beads (Beckman) and the final ChIP library was amplified by PCR. The library was sequenced on an Illumina NovaSeq platform. All of the ChIP-seq experiments were performed with at least two biological replicates.

ChIP-qPCR

The chromatin immunoprecipitation followed by quantitative PCR (ChIP-qPCR) experiments were performed as described before.14 ChIP steps were the same as in the ChIP-seq experiments described above. PCR primers were designed around CTCF binding sites (Table S7). qPCR was then carried out on the ABI QS6 platform using the SYBR qPCR master mix.

Western blot

1×10⁶ cells were collected and lysed in the RIPA buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1% Triton X-100, 1% sodium deoxycholate, 0.1% SDS, and 1× protease inhibitors) on ice for 30 min. The sample was sonicated and centrifuged at 14,000 g for 15 min at 4°C to remove cell debris. The protein concentration of the supernatant was measured using the BCA protein assay kit (Beyotime). The supernatant proteins were mixed with 5× SDS protein loading buffer with DTT and incubated at 95°C for 10 min to denature the proteins. The denatured protein mixture was centrifuged at 14,000 g for 15 min at room temperature and separated by SDS-PAGE.

The separated proteins in the gel were transferred to a nitrocellulose membrane by electrophoretic transfer. After transfer, the membrane was blocked with 5% non-fat milk in PBST for 2 h at room temperature and then washed 3 times with PBST. The corresponding primary antibodies were incubated with the membrane overnight at 4°C with slow shaking. The membrane was washed 3 times with PBST. The secondary antibody (Invitrogen) was then incubated with the membrane for 90 min at room temperature with slow shaking. Finally, the membrane was washed 3 times with PBST and scanned with the Odyssey System (LI-COR Biosciences). The intensity of bands was measured with the ImageJ software.

Recombinant CTCF protein production

The recombinant CTCF mutant proteins were prepared using rabbit reticulocyte lysate (Promega L1170) as previously described.17 First, we linearized the empty pTNT plasmid with EcoRI and NotI (NEB). We then cloned CTCF variants carrying different deletions of the C-terminal domain from cDNA through PCR. The PCR products containing EcoRI and NotI restriction endonuclease sites were gel-purified and digested with EcoRI and NotI. The digested DNAs were ligated to linearized pTNT vectors using T4 DNA ligase. The plasmids were used as templates to express CTCF mutants using the rabbit reticulocyte lysate at 30°C for 90 min. The recombinant CTCF proteins were finally analyzed by Western blot.

Electrophoretic mobility shift assay (EMSA)

The EMSA experiments were performed as described before.17 We first cloned the probe sequences containing CBS from genomic DNA. The DNA fragments were then gel-purified and ligated into a T-vector, and the sequences were verified by Sanger sequencing. We then performed PCR with a high-fidelity DNA polymerase and 5′ biotin-labeled primers, using the plasmids as templates, and gel-purified the PCR products as EMSA probes. The concentrations of probes were measured by NanoDrop. We performed the EMSA experiments using LightShift Chemiluminescent EMSA reagents (Thermo) according to the manufacturer’s manual. Briefly, equal amounts of protein were incubated in the binding buffer (containing 10 mM Tris, 50 mM KCl, 5 mM MgCl2, 0.1 mM ZnSO4, 1 mM dithiothreitol, 0.1% (v/v) Nonidet P-40 (NP-40), 50 ng/μL poly (dI-dC), and 2.5% (v/v) glycerol) on ice for 20 min to reduce the background. We then added the same amounts of probes to the binding buffer and incubated at room temperature for 30 min. The binding mixtures were electrophoresed on 5% non-denaturing polyacrylamide gels in ice-cold 0.5×TBE buffer (45 mM Tris-borate, 1 mM EDTA, pH 8.0) and then transferred from the gels to nylon membranes. The membranes were cross-linked under UV for 12 min, incubated in the blocking buffer for 20 min, and treated with streptavidin-horseradish peroxidase conjugate for 20 min. We washed the membranes 4 times with the washing buffer and detected the chemiluminescent signal using the ChemiDoc XRS+ system (Bio-Rad). The intensity of bands was measured with the ImageJ software. All of the EMSA experiments were performed with at least two biological replicates.

CRISPR genome editing and single-cell cloning

CRISPR genome editing was performed as described before.50,51 Cas9 and sgRNA plasmids, and donor constructs were transfected into cells in 12-well plates at 70% confluency using Lipofectamine 3000 (Invitrogen). Medium containing Lipofectamine 3000 and plasmid DNA was replaced with fresh medium after 6 h. At 48 h after transfection, the transfected cells were incubated with medium containing 2 μg/mL puromycin, which was refreshed daily. After 4 days, the cells were allowed to recover for 2 days in fresh medium without puromycin. The cells were then dissociated with trypsin and seeded into 96-well plates at a density of about one cell per well. After incubation for about 1 week in 96-well plates, the seeded cells were checked under a microscope and single-cell clones were marked manually.
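Seeding at about one cell per well does not place exactly one cell in every well, which is why wells are inspected and single-cell clones marked manually. As a rough illustration, and assuming that cell numbers per well follow a Poisson distribution (an assumption of this sketch, not a statement from the Methods), the expected fractions of empty, single-cell, and multi-cell wells can be estimated as follows.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k cells when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean_cells_per_well = 1.0  # "about one cell per well"
p_empty = poisson_pmf(0, mean_cells_per_well)
p_single = poisson_pmf(1, mean_cells_per_well)
p_multi = 1.0 - p_empty - p_single

# Expected number of wells with exactly one cell on a 96-well plate.
expected_single_wells = 96 * p_single

print(f"empty: {p_empty:.3f}, single: {p_single:.3f}, multiple: {p_multi:.3f}")
print(f"expected single-cell wells per plate: {expected_single_wells:.1f}")
```

Under this assumption only a little over a third of wells are expected to start from a single cell, and cell survival after sorting or dilution would lower the usable number further.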

PCR genotyping was performed when seeded single-cell clones reached ∼80% confluence in 96-well plates. The cells were digested using trypsin and PCR-genotyped with a pair of primers matching sequences outside of the homologous arms (Table S7). The positive PCR products were purified and confirmed by Sanger sequencing. The identified single-cell clones were cultured for another 4–5 passages and confirmed again by PCR genotyping and Sanger sequencing.

Given that the CTCF antibody used in ChIP-seq recognizes 659–675 AAs of CTCF, we added a FLAG tag in ΔCTD and ΔCT116 cell lines via homologous recombination. We also inserted a FLAG tag in WT cells as an experimental control. For ΔCTD cells, we genotyped 191 single-cell clones and found 3 single-cell clones with CTD deletion. For WT-FLAG cells, we genotyped 160 single-cell clones and found 4 of them with FLAG tag insertion. For ΔCT116 cells, we genotyped 192 single-cell clones and found 8 of them with CT116 deletion. For ΔNCR cells, we genotyped 240 single-cell clones and found 6 of them with NCR deletion. For each genotype, we used two independent single-cell clones for subsequent experiments.
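The clone counts above can be converted into screening efficiencies with a short calculation. The snippet below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# (positive clones, genotyped clones) for each edited line, as reported above.
screens = {
    "ΔCTD": (3, 191),
    "WT-FLAG": (4, 160),
    "ΔCT116": (8, 192),
    "ΔNCR": (6, 240),
}

# Percentage of genotyped single-cell clones carrying the intended edit.
efficiency = {
    name: 100.0 * positive / genotyped
    for name, (positive, genotyped) in screens.items()
}

for name, percent in efficiency.items():
    print(f"{name}: {percent:.2f}% of genotyped clones were positive")
```

All four screens recover the intended edit in fewer than 5% of genotyped clones, which is consistent with the need to genotype well over a hundred clones per edit.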

Lentivirus packaging

Lentiviruses were used to build stable transgenic cell lines expressing different CTCF rescue constructs in ΔCTD cells. pLVX plasmids containing truncated CTCF sequences and a puromycin selection marker, together with helper plasmids (psPAX2, pMD2.G, Addgene), were transfected into HEK293T cells in 6-well plates at 70% confluence using Lipofectamine 3000 reagents. The medium containing Lipofectamine and plasmid DNA was replaced with fresh medium after 6 h. The medium containing lentivirus was first harvested 48 h after transfection, replaced with fresh medium, and harvested again after another 24 h. The harvested medium was then centrifuged at 1,300 rpm at 4°C for 5 min to remove cells and filtered through a 0.45 μm filter (Merck), and the virus was collected and stored at −80°C.

Lentivirus infection and stable cell line establishment

All lentivirus particles expressing different truncated CTCF constructs were thawed on ice and used to infect ΔCTD cells in the presence of 8 μg/mL polybrene. Two days after infection, the infected cells were incubated for 4 days with medium containing 2 μg/mL puromycin, which was replaced daily. After selection, cells were incubated with fresh medium without puromycin for 2 days to recover. The cells were then cultured for another 12 days and harvested for Western blot and ChIP-seq experiments.

RNA-seq

The RNA-seq experiment was performed as previously described.17 Briefly, about 1×10⁶ cells in 6-well plates were collected and washed 3 times with ice-cold PBS. One mL of TRIzol (Invitrogen) was added to the cells and mixed thoroughly. After incubation for 5 min at room temperature, 0.2 mL of chloroform was added and mixed well by hand. The sample was then incubated for 2 min at room temperature and centrifuged at 12,000 rpm for 10 min at 4°C. The supernatant was transferred to a new tube and mixed with an equal volume of isopropanol. The sample was incubated at room temperature for 10 min and centrifuged at 12,000 rpm for 15 min at 4°C to pellet RNA. After removing the supernatant, 1 mL of 75% ethanol was added to wash the RNA. Finally, RNA pellets were dissolved in nuclease-free water. The quantity and concentration of the total RNA were measured by NanoDrop (Thermo).
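Several centrifugation steps in these protocols are given in rpm rather than in g. Converting between the two requires the rotor radius, which is not reported here, so the sketch below uses a hypothetical radius of 8 cm purely for illustration. It applies the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimeters.

```python
def rpm_to_rcf(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


# Hypothetical rotor radius; the Methods do not state the rotor used.
ASSUMED_RADIUS_CM = 8.0

rcf_12000 = rpm_to_rcf(12_000, ASSUMED_RADIUS_CM)
print(f"12,000 rpm at r = {ASSUMED_RADIUS_CM} cm is about {rcf_12000:,.0f} x g")
```

Because the force scales with the square of the speed and linearly with the radius, the same rpm setting can correspond to quite different forces on different instruments.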

Oligo (dT)-coupled magnetic beads (Vazyme) were used to purify mRNA from total RNA. The mRNA library preparation was performed using the Universal V6 RNA-seq Library Prep Kit (Vazyme) according to the manual. Briefly, mRNA was fragmented by heating at 94°C for 8 min. Next, reverse transcription for the first-strand cDNA was performed and then the double-strand cDNA was synthesized. Adapters from the RNA Multiplex Oligos Set (Vazyme) were added to cDNA and the adapter-ligated cDNA was purified with AMPure XP beads (Beckman). Finally, the library was amplified through PCR and sequenced on an Illumina NovaSeq platform. All of the RNA-seq experiments were performed with at least two biological replicates.

ATAC-seq

ATAC-seq was performed as previously described.59 Briefly, cells in 12-well plates at 80% confluence were digested with trypsin and collected by centrifugation at 500 rpm at room temperature for 5 min. 50,000 cells were collected and washed twice with PBS. The cells were then washed twice with ice-cold buffer by centrifugation at 2,300 rpm at 4°C for 5 min. The cells were then lysed in freshly prepared lysis buffer containing NP-40, Tween 20, and digitonin on ice for 5 min and centrifuged for 10 min at 2,300 rpm at 4°C to collect the cell nuclei.

Chromatin was fragmented with Tn5 transposase in conjunction with adaptors at 37°C for 30 min. The fragmentation reaction was terminated by adding the stop buffer at room temperature for 5 min. The fragmented DNA was purified with ATAC DNA Extract Beads (Vazyme). Finally, PCR was performed to amplify the library using primers in the TruePrep Index Kit V2 (Vazyme) and the library was sequenced on an Illumina NovaSeq platform. All of the ATAC-seq experiments were performed with at least two biological replicates.

QHR-4C

Quantitative high-resolution circularized chromosome conformation capture (QHR-4C) experiments were performed as previously described.19 Briefly, 1×10⁶ cells were collected and crosslinked with 2% formaldehyde at room temperature for 10 min with slow rotation. The crosslinking reaction was quenched by adding 2 M glycine to a final concentration of 200 mM. The crosslinked cells were washed twice with ice-cold PBS by centrifugation for 5 min at 800 g at 4°C and then lysed twice in ice-cold lysis buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 5 mM EDTA, 0.5% NP-40, 1% Triton X-100, and 1× protease inhibitors) for 10 min at 4°C with slow rotation. The cell nuclei were collected by centrifugation at 800 g at 4°C for 5 min. The pelleted nuclei were resuspended in 73 μL of nuclease-free water, 10 μL of DpnII buffer (NEB), and 2.5 μL of 10% SDS and incubated for 1 h at 37°C with shaking at 900 rpm. 12.5 μL of 20% Triton X-100 was then added and the sample was incubated for 1 h at 37°C with shaking at 900 rpm. 2 μL of DpnII (NEB) was added to digest the chromatin overnight at 37°C with shaking at 900 rpm, and the enzyme was inactivated at 65°C for 20 min.
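The component volumes listed for the DpnII digestion add up to a 100 μL reaction, which makes the final detergent concentrations easy to check. The calculation below is a simple bookkeeping sketch; it ignores the volume of the nuclear pellet itself, which is an assumption of the sketch.

```python
# (component, volume in microliters), in the order of addition described above.
components = [
    ("nuclease-free water", 73.0),
    ("DpnII buffer", 10.0),
    ("10% SDS", 2.5),
    ("20% Triton X-100", 12.5),
    ("DpnII", 2.0),
]

total_volume_ul = sum(volume for _, volume in components)

# Final percentages in the complete reaction (pellet volume ignored).
final_sds_percent = 2.5 * 10.0 / total_volume_ul
final_triton_percent = 12.5 * 20.0 / total_volume_ul

# SDS percentage during the first 1-h incubation, before Triton X-100 is added.
sds_percent_before_triton = 2.5 * 10.0 / (73.0 + 10.0 + 2.5)

print(f"total volume: {total_volume_ul} uL")
print(f"SDS: {sds_percent_before_triton:.2f}% initially, {final_sds_percent:.2f}% final")
print(f"Triton X-100: {final_triton_percent:.2f}% final")
```

The Triton X-100 is thus present at a roughly tenfold excess over SDS in the final reaction.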

The nuclei were then collected through centrifuging at 1,000 g for 1 min. The supernatant was carefully removed. The proximal ligation was performed with resuspended nuclei by adding 100 μL of T4 ligation buffer (NEB) containing 1 μL of T4 ligase and incubated at 16°C for 24 h. 1 μL of proteinase K was then added to the ligation mixture to digest the proteins and incubated for 4 h at 65°C for reverse crosslinking. The DNA was purified by phenol-chloroform as described above in ChIP-seq. Finally, the purified DNA was dissolved in 50 μL nuclease-free water and sonicated with the Bioruptor Sonicator (with the low energy setting by a train of 30-s sonication with 30-s interval for 12 cycles) to fragment DNA to sizes of 200–600 bp.

The PCR amplification was performed using 5′ biotin-labeled primers (Table S7) to capture DNA anchored at the HS5-1 enhancer. To maximize the PCR product, 100 μL of reaction system and 60 cycles of PCR were used. The PCR product was incubated at 95°C for 5 min and immediately chilled on ice to obtain single-strand DNA (ssDNA). The biotin-labeled ssDNA was collected by incubating with Streptavidin Beads (Invitrogen) for 2 h at room temperature and the beads were washed twice with the washing buffer (5 mM Tris-HCl pH 7.5, 1 M NaCl, 0.5 mM EDTA).

To prepare library for deep sequencing, adapters containing sequences that match the 3′ end of the Illumina P7 sequence were ligated to ssDNA at 16°C for 24 h. The adapters were generated through annealing of two complementary primers (Table S7) in the annealing buffer (25 mM NaCl, 10 mM Tris-HCl pH 7.5, 0.5 mM EDTA). The beads with ssDNA-adapter were washed twice by the washing buffer. Finally, the library was amplified through PCR with two primers. The forward primer contains the Illumina P5 sequences and the sequences adjacent to the HS5-1 anchor. The reverse primer contains the Illumina P7 sequences and indexes. The amplified library was sequenced on an Illumina NovaSeq platform. All of the QHR-4C experiments were performed with at least two biological replicates.

Data analysis of ChIP-seq

Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 The MarkDuplicates module of Picard tools (http://broadinstitute.github.io/picard/) was used to remove duplicates and Samtools81 was used to sort and index BAM files. ChIP-seq peaks were called by MACS276 with default parameters. The read counts were normalized to reads per kilobase per million mapped reads (RPKM) using the bamCoverage module of Deeptools80 with a bin size of 20 bp. The plotHeatmap module of Deeptools was used to generate heatmaps. Normalized read counts were converted to bedGraph format for visualization in the UCSC genome browser. Differential binding analyses were performed using DiffBind with default parameters.85 Motif analyses of CTCF were performed using the MEME suite77 and FIMO.86
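For orientation, the per-bin RPKM normalization can be written out explicitly. The function below is a minimal sketch of the standard definition (reads in the bin, divided by the bin length in kilobases and by the number of mapped reads in millions); it is not the Deeptools implementation, and the read counts in the example are invented for illustration.

```python
def rpkm(reads_in_bin: int, bin_size_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads for a single bin."""
    if bin_size_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("bin size and total mapped reads must be positive")
    bin_size_kb = bin_size_bp / 1_000
    mapped_millions = total_mapped_reads / 1_000_000
    return reads_in_bin / (bin_size_kb * mapped_millions)


# Hypothetical numbers: 10 reads in a 20-bp bin, 20 million mapped reads.
signal = rpkm(reads_in_bin=10, bin_size_bp=20, total_mapped_reads=20_000_000)
print(f"RPKM = {signal:.1f}")
```

Normalizing by the number of mapped reads is what allows tracks from libraries of different sequencing depth to be compared on the same scale.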

Data analysis of RNA-seq

Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using STAR78 with default parameters and the FPKMs were calculated using Cufflinks.79 Differential analysis of gene expression was performed using DESeq2.84 Volcano plots were generated by ggplot2. Distances from TSSs to the nearest CTCF peaks were calculated using Bedtools.82
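The idea behind the TSS-to-nearest-peak calculation can be illustrated with a simplified stand-in. The sketch below works on a single chromosome and measures the distance to peak midpoints, whereas Bedtools operates on full intervals across all chromosomes with its own distance conventions; the coordinates are invented and the function name is hypothetical.

```python
import bisect
from typing import List


def distance_to_nearest(position: int, sorted_midpoints: List[int]) -> int:
    """Distance (bp) from `position` to the closest value in a sorted list."""
    if not sorted_midpoints:
        raise ValueError("no peaks on this chromosome")
    # Index of the first midpoint at or to the right of the position.
    i = bisect.bisect_left(sorted_midpoints, position)
    # Only the neighbors on either side of the insertion point can be closest.
    neighbors = sorted_midpoints[max(i - 1, 0): i + 1]
    return min(abs(position - midpoint) for midpoint in neighbors)


# Hypothetical CTCF peak midpoints and TSS positions on one chromosome.
peak_midpoints = sorted([1_200, 100, 500])
tss_positions = [50, 450, 500, 2_000]

distances = [distance_to_nearest(tss, peak_midpoints) for tss in tss_positions]
print(distances)
```

Sorting the peaks once and using a binary search keeps each lookup logarithmic in the number of peaks, which matters when tens of thousands of TSSs and peaks are compared.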

Data analysis of ATAC-seq

Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 ATAC-seq peaks were called by MACS2.76 Differentially accessible regions (DARs) were analyzed using DESeq2.84 ATAC-seq peak annotation was performed with ChIPseeker.87 Bedtools was used in overlapping analyses of DARs and CBS elements and in calculating the distance of DARs to the nearest CBS elements.82

Data analysis of QHR-4C

Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 Reads were normalized by r3Cseq program (version 1.20).83

Quantification and statistical analysis

All statistical tests were performed with R scripts. Statistical significance values were calculated using the Student’s t test. p < 0.05 is shown as ∗, p < 0.01 as ∗∗, p < 0.001 as ∗∗∗, and p < 0.0001 as ∗∗∗∗.
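The significance notation above maps onto a simple threshold function. The sketch below is written in Python for illustration only (the analyses themselves were done in R); returning an empty string for p ≥ 0.05 is an assumption, since the text does not say how non-significant comparisons are labeled.

```python
def significance_stars(p_value: float) -> str:
    """Map a p value to the asterisk notation used in the figures."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p value must lie between 0 and 1")
    # Thresholds are checked from the most to the least stringent.
    for threshold, stars in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < threshold:
            return stars
    return ""  # assumption: no label for non-significant results


for p in (0.2, 0.03, 0.004, 0.0005, 0.00002):
    print(p, significance_stars(p))
```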

Published: November 22, 2024

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2024.111452.

Supplemental information

Document S1. Figures S1–S8
Table S1. Up-regulated genes upon CTD deletion, related to Figure S3
Table S2. Down-regulated genes upon CTD deletion, related to Figure S3
Table S3. Up-regulated genes upon CT116 deletion, related to Figure S3
Table S4. Down-regulated genes upon CT116 deletion, related to Figure S3
Table S5. Up-regulated genes upon NCR deletion, related to Figures 4, 5, and S3
Table S6. Down-regulated genes upon NCR deletion, related to Figures 4, 5, and S3
Table S7. Oligonucleotides used in this study, related to all Figures

References

  • 1.Klenova E.M., Nicolas R.H., Paterson H.F., Carne A.F., Heath C.M., Goodwin G.H., Neiman P.E., Lobanenkov V.V. CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol. Cell Biol. 1993;13:7612–7624. doi: 10.1128/mcb.13.12.7612-7624.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Heger P., Marin B., Bartkuhn M., Schierenberg E., Wiehe T. The chromatin insulator CTCF and the emergence of metazoan diversity. Proc. Natl. Acad. Sci. USA. 2012;109:17507–17512. doi: 10.1073/pnas.1111941109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Rowley M.J., Corces V.G. Organizational principles of 3D genome architecture. Nat. Rev. Genet. 2018;19:789–800. doi: 10.1038/s41576-018-0060-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Li Y., Haarhuis J.H.I., Sedeño Cacciatore Á., Oldenkamp R., van Ruiten M.S., Willems L., Teunissen H., Muir K.W., de Wit E., Rowland B.D., Panne D. The structural basis for cohesin-CTCF-anchored loops. Nature. 2020;578:472–476. doi: 10.1038/s41586-019-1910-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Wu Q., Liu P., Wang L. Many facades of CTCF unified by its coding for three-dimensional genome architecture. J. Genet. Genom. 2020;47:407–424. doi: 10.1016/j.jgg.2020.06.008. [DOI] [PubMed] [Google Scholar]
  • 6.Parelho V., Hadjur S., Spivakov M., Leleu M., Sauer S., Gregson H.C., Jarmuz A., Canzonetta C., Webster Z., Nesterova T., et al. Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell. 2008;132:422–433. doi: 10.1016/j.cell.2008.01.011. [DOI] [PubMed] [Google Scholar]
  • 7.Rubio E.D., Reiss D.J., Welcsh P.L., Disteche C.M., Filippova G.N., Baliga N.S., Aebersold R., Ranish J.A., Krumm A. CTCF physically links cohesin to chromatin. Proc. Natl. Acad. Sci. USA. 2008;105:8309–8314. doi: 10.1073/pnas.0801273105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Wendt K.S., Yoshida K., Itoh T., Bando M., Koch B., Schirghuber E., Tsutsumi S., Nagae G., Ishihara K., Mishiro T., et al. Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature. 2008;451:796–801. doi: 10.1038/nature06634. [DOI] [PubMed] [Google Scholar]
  • 9.Zuin J., Dixon J.R., van der Reijden M.I.J.A., Ye Z., Kolovos P., Brouwer R.W.W., van de Corput M.P.C., van de Werken H.J.G., Knoch T.A., van IJcken W.F.J., et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc. Natl. Acad. Sci. USA. 2014;111:996–1001. doi: 10.1073/pnas.1317788111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Sanborn A.L., Rao S.S.P., Huang S.C., Durand N.C., Huntley M.H., Jewett A.I., Bochkov I.D., Chinnappan D., Cutkosky A., Li J., et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. USA. 2015;112:E6456–E6465. doi: 10.1073/pnas.1518552112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Fudenberg G., Imakaev M., Lu C., Goloborodko A., Abdennur N., Mirny L.A. Formation of Chromosomal Domains by Loop Extrusion. Cell Rep. 2016;15:2038–2049. doi: 10.1016/j.celrep.2016.04.085. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.van Ruiten M.S., Rowland B.D. On the choreography of genome folding: A grand pas de deux of cohesin and CTCF. Curr. Opin. Cell Biol. 2021;70:84–90. doi: 10.1016/j.ceb.2020.12.001. [DOI] [PubMed] [Google Scholar]
  • 13.Popay T.M., Dixon J.R. Coming full circle: On the origin and evolution of the looping model for enhancer-promoter communication. J. Biol. Chem. 2022;298. doi: 10.1016/j.jbc.2022.102117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Guo Y., Monahan K., Wu H., Gertz J., Varley K.E., Li W., Myers R.M., Maniatis T., Wu Q. CTCF/cohesin-mediated DNA looping is required for protocadherin α promoter choice. Proc. Natl. Acad. Sci. USA. 2012;109:21081–21086. doi: 10.1073/pnas.1219280110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Rao S.S.P., Huntley M.H., Durand N.C., Stamenova E.K., Bochkov I.D., Robinson J.T., Sanborn A.L., Machol I., Omer A.D., Lander E.S., Aiden E.L. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.de Wit E., Vos E.S.M., Holwerda S.J.B., Valdes-Quezada C., Verstegen M.J.A.M., Teunissen H., Splinter E., Wijchers P.J., Krijger P.H.L., de Laat W. CTCF Binding Polarity Determines Chromatin Looping. Mol. Cell. 2015;60:676–684. doi: 10.1016/j.molcel.2015.09.023. [DOI] [PubMed] [Google Scholar]
  • 17.Guo Y., Xu Q., Canzio D., Shou J., Li J., Gorkin D.U., Jung I., Wu H., Zhai Y., Tang Y., et al. CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell. 2015;162:900–910. doi: 10.1016/j.cell.2015.07.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Canzio D., Nwakeze C.L., Horta A., Rajkumar S.M., Coffey E.L., Duffy E.E., Duffié R., Monahan K., O'Keeffe S., Simon M.D., et al. Antisense lncRNA Transcription Mediates DNA Demethylation to Drive Stochastic Protocadherin α Promoter Choice. Cell. 2019;177:639–653.e15. doi: 10.1016/j.cell.2019.03.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Jia Z., Li J., Ge X., Wu Y., Guo Y., Wu Q. Tandem CTCF sites function as insulators to balance spatial chromatin contacts and topological enhancer-promoter selection. Genome Biol. 2020;21. doi: 10.1186/s13059-020-01984-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Liang Z., Zhao L., Ye A.Y., Lin S.G., Zhang Y., Guo C., Dai H.Q., Ba Z., Alt F.W. Contribution of the IGCR1 regulatory element and the 3'Igh CTCF-binding elements to regulation of Igh V(D)J recombination. Proc. Natl. Acad. Sci. USA. 2023;120. doi: 10.1073/pnas.2306564120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Weintraub A.S., Li C.H., Zamudio A.V., Sigova A.A., Hannett N.M., Day D.S., Abraham B.J., Cohen M.A., Nabet B., Buckley D.L., et al. YY1 Is a Structural Regulator of Enhancer-Promoter Loops. Cell. 2017;171:1573–1588.e28. doi: 10.1016/j.cell.2017.11.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Ortabozkoyun H., Huang P.Y., Gonzalez-Buendia E., Cho H., Kim S.Y., Tsirigos A., Mazzoni E.O., Reinberg D. Members of an array of zinc-finger proteins specify distinct Hox chromatin boundaries. Mol. Cell. 2024;84:3406–3422.e6. doi: 10.1016/j.molcel.2024.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Wu Q., Maniatis T. A striking organization of a large family of human neural cadherin-like cell adhesion genes. Cell. 1999;97:779–790. doi: 10.1016/S0092-8674(00)80789-8. [DOI] [PubMed] [Google Scholar]
  • 24.Zipursky S.L., Sanes J.R. Chemoaffinity Revisited: Dscams, Protocadherins, and Neural Circuit Assembly. Cell. 2010;143:343–353. doi: 10.1016/j.cell.2010.10.009. [DOI] [PubMed] [Google Scholar]
  • 25.Dong H., Li J., Wu Q., Jin Y. Confluence and convergence of Dscam and Pcdh codes. Trends Biochem. Sci. 2023;48:1044–1057. doi: 10.1016/j.tibs.2023.09.001. [DOI] [PubMed] [Google Scholar]
  • 26.Tasic B., Nabholz C.E., Baldwin K.K., Kim Y., Rueckert E.H., Ribich S.A., Cramer P., Wu Q., Axel R., Maniatis T. Promoter choice determines splice site selection in protocadherin α and -γ pre-mRNA splicing. Mol. Cell. 2002;10:21–33. doi: 10.1016/S1097-2765(02)00578-6. [DOI] [PubMed] [Google Scholar]
  • 27.Monahan K., Rudnick N.D., Kehayova P.D., Pauli F., Newberry K.M., Myers R.M., Maniatis T. Role of CCCTC binding factor (CTCF) and cohesin in the generation of single-cell diversity of Protocadherin-α gene expression. Proc. Natl. Acad. Sci. USA. 2012;109:9125–9130. doi: 10.1073/pnas.1205074109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Allahyar A., Vermeulen C., Bouwman B.A.M., Krijger P.H.L., Verstegen M.J.A.M., Geeven G., van Kranenburg M., Pieterse M., Straver R., Haarhuis J.H.I., et al. Enhancer hubs and loop collisions identified from single-allele topologies. Nat. Genet. 2018;50:1151–1160. doi: 10.1038/s41588-018-0161-5. [DOI] [PubMed] [Google Scholar]
  • 29.Lv X., Li S., Li J., Yu X.Y., Ge X., Li B., Hu S., Lin Y., Zhang S., Yang J., et al. Patterned cPCDH expression regulates the fine organization of the neocortex. Nature. 2022;612:503–511. doi: 10.1038/s41586-022-05495-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Kiefer L., Chiosso A., Langen J., Buckley A., Gaudin S., Rajkumar S.M., Servito G.I.F., Cha E.S., Vijay A., Yeung A., et al. WAPL functions as a rheostat of Protocadherin isoform diversity that controls neural wiring. Science. 2023;380:eadf8440. doi: 10.1126/science.adf8440. [DOI] [PubMed] [Google Scholar]
  • 31.Martinez S.R., Miranda J.L. CTCF terminal segments are unstructured. Protein Sci. 2010;19:1110–1116. doi: 10.1002/pro.367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Bonchuk A., Kamalyan S., Mariasina S., Boyko K., Popov V., Maksimenko O., Georgiev P. N-terminal domain of the architectural protein CTCF has similar structural organization and ability to self-association in bilaterian organisms. Sci. Rep. 2020;10. doi: 10.1038/s41598-020-59459-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Nakahashi H., Kieffer Kwon K.R., Resch W., Vian L., Dose M., Stavreva D., Hakim O., Pruett N., Nelson S., Yamane A., et al. A genome-wide map of CTCF multivalency redefines the CTCF code. Cell Rep. 2013;3:1678–1689. doi: 10.1016/j.celrep.2013.04.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Hashimoto H., Wang D., Horton J.R., Zhang X., Corces V.G., Cheng X. Structural Basis for the Versatile and Methylation-Dependent Binding of CTCF to DNA. Mol. Cell. 2017;66:711–720.e3. doi: 10.1016/j.molcel.2017.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Yin M., Wang J., Wang M., Li X., Zhang M., Wu Q., Wang Y. Molecular mechanism of directional CTCF recognition of a diverse range of genomic sites. Cell Res. 2017;27:1365–1377. doi: 10.1038/cr.2017.131. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Xu D., Ma R., Zhang J., Liu Z., Wu B., Peng J., Zhai Y., Gong Q., Shi Y., Wu J., et al. Dynamic nature of CTCF tandem 11 zinc fingers in multivalent recognition of DNA as revealed by NMR spectroscopy. J. Phys. Chem. Lett. 2018;9:4020–4028. doi: 10.1021/acs.jpclett.8b01440. [DOI] [PubMed] [Google Scholar]
  • 37.Soochit W., Sleutels F., Stik G., Bartkuhn M., Basu S., Hernandez S.C., Merzouk S., Vidal E., Boers R., Boers J., et al. CTCF chromatin residence time controls three-dimensional genome organization, gene expression and DNA methylation in pluripotent cells. Nat. Cell Biol. 2021;23:881–893. doi: 10.1038/s41556-021-00722-w. [DOI] [PubMed] [Google Scholar]
  • 38.Lebeau B., Zhao K., Jangal M., Zhao T., Guerra M., Greenwood C.M.T., Witcher M. Single base-pair resolution analysis of DNA binding motif with MoMotif reveals an oncogenic function of CTCF zinc-finger 1 mutation. Nucleic Acids Res. 2022;50:8441–8458. doi: 10.1093/nar/gkac658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Hyle J., Djekidel M.N., Williams J., Wright S., Shao Y., Xu B., Li C. Auxin-inducible degron 2 system deciphers functions of CTCF domains in transcriptional regulation. Genome Biol. 2023;24:14. doi: 10.1186/s13059-022-02843-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Yang J., Horton J.R., Liu B., Corces V.G., Blumenthal R.M., Zhang X., Cheng X. Structures of CTCF-DNA complexes including all 11 zinc fingers. Nucleic Acids Res. 2023;51:8447–8462. doi: 10.1093/nar/gkad594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Saldaña-Meyer R., Rodriguez-Hernaez J., Escobar T., Nishana M., Jácome-López K., Nora E.P., Bruneau B.G., Tsirigos A., Furlan-Magaril M., Skok J., Reinberg D. RNA Interactions Are Essential for CTCF-Mediated Genome Organization. Mol. Cell. 2019;76:412–422.e5. doi: 10.1016/j.molcel.2019.08.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Hansen A.S., Hsieh T.H.S., Cattoglio C., Pustova I., Saldaña-Meyer R., Reinberg D., Darzacq X., Tjian R. Distinct Classes of Chromatin Loops Revealed by Deletion of an RNA-Binding Region in CTCF. Mol. Cell. 2019;76:395–411.e13. doi: 10.1016/j.molcel.2019.07.039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Nishana M., Ha C., Rodriguez-Hernaez J., Ranjbaran A., Chio E., Nora E.P., Badri S.B., Kloetgen A., Bruneau B.G., Tsirigos A., Skok J.A. Defining the relative and combined contribution of CTCF and CTCFL to genomic regulation. Genome Biol. 2020;21:108. doi: 10.1186/s13059-020-02024-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Nora E.P., Caccianini L., Fudenberg G., So K., Kameswaran V., Nagle A., Uebersohn A., Hajj B., Saux A.L., Coulon A., et al. Molecular basis of CTCF binding polarity in genome folding. Nat. Commun. 2020;11:5612. doi: 10.1038/s41467-020-19283-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Pugacheva E.M., Kubo N., Loukinov D., Tajmul M., Kang S., Kovalchuk A.L., Strunnikov A.V., Zentner G.E., Ren B., Lobanenkov V.V. CTCF mediates chromatin looping via N-terminal domain-dependent cohesin retention. Proc. Natl. Acad. Sci. USA. 2020;117:2020–2031. doi: 10.1073/pnas.1911708117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Hansen A.S. CTCF as a boundary factor for cohesin-mediated loop extrusion: evidence for a multi-step mechanism. Nucleus. 2020;11:132–148. doi: 10.1080/19491034.2020.1782024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Hu G., Katuwawala A., Wang K., Wu Z., Ghadermarzi S., Gao J., Kurgan L. flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat. Commun. 2021;12:4438. doi: 10.1038/s41467-021-24773-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Moore J.M., Rabaia N.A., Smith L.E., Fagerlie S., Gurley K., Loukinov D., Disteche C.M., Collins S.J., Kemp C.J., Lobanenkov V.V., Filippova G.N. Loss of maternal CTCF is associated with peri-implantation lethality of null embryos. PLoS One. 2012;7. doi: 10.1371/journal.pone.0034915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.González-Buendía E., Pérez-Molina R., Ayala-Ortega E., Guerrero G., Recillas-Targa F. In: Cancer Cell Signaling: Methods and Protocols. Robles-Flores M., editor. Springer; 2014. Experimental Strategies to Manipulate the Cellular Levels of the Multifunctional Factor CTCF; pp. 53–69. [DOI] [PubMed] [Google Scholar]
  • 50.Li J., Shou J., Guo Y., Tang Y., Wu Y., Jia Z., Zhai Y., Chen Z., Xu Q., Wu Q. Efficient inversions and duplications of mammalian regulatory DNA elements and gene clusters by CRISPR/Cas9. J. Mol. Cell Biol. 2015;7:284–298. doi: 10.1093/jmcb/mjv016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Shou J., Li J., Liu Y., Wu Q. Precise and Predictable CRISPR Chromosomal Rearrangements Reveal Principles of Cas9-Mediated Nucleotide Insertion. Mol. Cell. 2018;71:498–509.e4. doi: 10.1016/j.molcel.2018.06.021. [DOI] [PubMed] [Google Scholar]
  • 52.Saldaña-Meyer R., González-Buendía E., Guerrero G., Narendra V., Bonasio R., Recillas-Targa F., Reinberg D. CTCF regulates the human p53 gene through direct interaction with its natural antisense transcript, Wrap53. Genes Dev. 2014;28:723–734. doi: 10.1101/gad.236869.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Abramson J., Adler J., Dunger J., Evans R., Green T., Pritzel A., Ronneberger O., Willmore L., Ballard A.J., Bambrick J., et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi: 10.1038/s41586-024-07487-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Lefebvre J.L., Kostadinov D., Chen W.V., Maniatis T., Sanes J.R. Protocadherins mediate dendritic self-avoidance in the mammalian nervous system. Nature. 2012;488:517–521. doi: 10.1038/nature11305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Thu C.A., Chen W.V., Rubinstein R., Chevee M., Wolcott H.N., Felsovalyi K.O., Tapia J.C., Shapiro L., Honig B., Maniatis T. Single-cell identity generated by combinatorial homophilic interactions between alpha, beta, and gamma protocadherins. Cell. 2014;158:1045–1059. doi: 10.1016/j.cell.2014.07.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Rubinstein R., Thu C.A., Goodman K.M., Wolcott H.N., Bahna F., Mannepalli S., Ahlsen G., Chevee M., Halim A., Clausen H., et al. Molecular logic of neuronal self-recognition through protocadherin domain interactions. Cell. 2015;163:629–642. doi: 10.1016/j.cell.2015.09.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Mountoufaris G., Chen W.V., Hirabayashi Y., O'Keeffe S., Chevee M., Nwakeze C.L., Polleux F., Maniatis T. Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. Science. 2017;356:411–414. doi: 10.1126/science.aai8801. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Zhou Y., Xu S., Zhang M., Wu Q. Systematic functional characterization of antisense eRNA of protocadherin composite enhancer. Genes Dev. 2021;35:1383–1394. doi: 10.1101/gad.348621.121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Ge X., Huang H., Han K., Xu W., Wang Z., Wu Q. Outward-oriented sites within clustered CTCF boundaries are key for intra-TAD chromatin interactions and gene regulation. Nat. Commun. 2023;14:8101. doi: 10.1038/s41467-023-43849-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Zhang M., Huang H., Li J., Wu Q. ZNF143 deletion alters enhancer/promoter looping and CTCF/cohesin geometry. Cell Rep. 2024;43. doi: 10.1016/j.celrep.2023.113663. [DOI] [PubMed] [Google Scholar]
  • 61.Huang H., Wu Q. Pushing the TAD boundary: Decoding insulator codes of clustered CTCF sites in 3D genomes. Bioessays. 2024;46. doi: 10.1002/bies.202400121. [DOI] [PubMed] [Google Scholar]
  • 62.Kung J.T., Kesner B., An J.Y., Ahn J.Y., Cifuentes-Rojas C., Colognori D., Jeon Y., Szanto A., del Rosario B.C., Pinter S.F., et al. Locus-specific targeting to the X chromosome revealed by the RNA interactome of CTCF. Mol. Cell. 2015;57:361–375. doi: 10.1016/j.molcel.2014.12.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Hansen A.S., Amitai A., Cattoglio C., Tjian R., Darzacq X. Guided nuclear exploration increases CTCF target search efficiency. Nat. Chem. Biol. 2020;16:257–266. doi: 10.1038/s41589-019-0422-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Hansen A.S., Pustova I., Cattoglio C., Tjian R., Darzacq X. CTCF and cohesin regulate chromatin loop stability with distinct dynamics. Elife. 2017;6. doi: 10.7554/eLife.25776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Kaur G., Ren R., Hammel M., Horton J.R., Yang J., Cao Y., He C., Lan F., Lan X., Blobel G.A., et al. Allosteric autoregulation of DNA binding via a DNA-mimicking protein domain: a biophysical study of ZNF410-DNA interaction using small angle X-ray scattering. Nucleic Acids Res. 2023;51:1674–1686. doi: 10.1093/nar/gkac1274. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Pant V., Kurukuti S., Pugacheva E., Shamsuddin S., Mariano P., Renkawitz R., Klenova E., Lobanenkov V., Ohlsson R. Mutation of a single CTCF target site within the H19 imprinting control region leads to loss of Igf2 imprinting and complex patterns of de novo methylation upon maternal inheritance. Mol. Cell Biol. 2004;24:3497–3504. doi: 10.1128/Mcb.24.8.3497-3504.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Kang H.S., Sánchez-Rico C., Ebersberger S., Sutandy F.X.R., Busch A., Welte T., Stehle R., Hipp C., Schulz L., Buchbender A., et al. An autoinhibitory intramolecular interaction proof-reads RNA recognition by the essential splicing factor U2AF2. Proc. Natl. Acad. Sci. USA. 2020;117:7140–7149. doi: 10.1073/pnas.1913483117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Klenova E.M., Chernukhin I.V., El-Kady A., Lee R.E., Pugacheva E.M., Loukinov D.I., Goodwin G.H., Delgado D., Filippova G.N., León J., et al. Functional phosphorylation sites in the C-terminal region of the multivalent multifunctional transcriptional factor CTCF. Mol. Cell Biol. 2001;21:2221–2234. doi: 10.1128/mcb.21.6.2221-2234.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Valverde de Morales H.G., Wang H.L.V., Garber K., Cheng X., Corces V.G., Li H. Expansion of the genotypic and phenotypic spectrum of CTCF-related disorder guides clinical management: 43 new subjects and a comprehensive literature review. Am. J. Med. Genet. 2023;191:718–729. doi: 10.1002/ajmg.a.63065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Bailey M.H., Tokheim C., Porta-Pardo E., Sengupta S., Bertrand D., Weerasinghe A., Colaprico A., Wendl M.C., Kim J., Reardon B., et al. Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell. 2018;173:371–385.e18. doi: 10.1016/j.cell.2018.02.060. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Flaherty E., Maniatis T. The role of clustered protocadherins in neurodevelopment and neuropsychiatric diseases. Curr. Opin. Genet. Dev. 2020;65:144–150. doi: 10.1016/j.gde.2020.05.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Jia Z., Wu Q. Clustered Protocadherins Emerge as Novel Susceptibility Loci for Mental Disorders. Front. Neurosci. 2020;14. doi: 10.3389/fnins.2020.587819. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Lyons H., Veettil R.T., Pradhan P., Fornero C., De La Cruz N., Ito K., Eppert M., Roeder R.G., Sabari B.R. Functional partitioning of transcriptional regulators by patterned charge blocks. Cell. 2023;186:327–345.e28. doi: 10.1016/j.cell.2022.12.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Ahn J.H., Davis E.S., Daugird T.A., Zhao S., Quiroga I.Y., Uryu H., Li J., Storey A.J., Tsai Y.H., Keeley D.P., et al. Phase separation drives aberrant chromatin looping and cancer development. Nature. 2021;595:591–595. doi: 10.1038/s41586-021-03662-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/Nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W., Liu X.S. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Bailey T.L., Johnson J., Grant C.E., Noble W.S. The MEME Suite. Nucleic Acids Res. 2015;43:W39–W49. doi: 10.1093/nar/gkv416. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Trapnell C., Williams B.A., Pertea G., Mortazavi A., Kwan G., van Baren M.J., Salzberg S.L., Wold B.J., Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 2010;28:511–515. doi: 10.1038/nbt.1621. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Ramírez F., Ryan D.P., Grüning B., Bhardwaj V., Kilpert F., Richter A.S., Heyne S., Dündar F., Manke T. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165. doi: 10.1093/nar/gkw257. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Thongjuea S., Stadhouders R., Grosveld F.G., Soler E., Lenhard B. r3Cseq: an R/Bioconductor package for the discovery of long-range genomic interactions from chromosome conformation capture and next-generation sequencing data. Nucleic Acids Res. 2013;41. doi: 10.1093/nar/gkt373. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Ross-Innes C.S., Stark R., Teschendorff A.E., Holmes K.A., Ali H.R., Dunning M.J., Brown G.D., Gojis O., Ellis I.O., Green A.R., et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature. 2012;481:389–393. doi: 10.1038/nature10730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Grant C.E., Bailey T.L., Noble W.S. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011;27:1017–1018. doi: 10.1093/bioinformatics/btr064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Yu G., Wang L.G., He Q.Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015;31:2382–2383. doi: 10.1093/bioinformatics/btv145. [DOI] [PubMed] [Google Scholar]
  • 88.Schneider C.A., Rasband W.S., Eliceiri K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Document S1. Figures S1–S8
mmc1.pdf (2.6MB, pdf)
Table S1. Up-regulated genes upon CTD deletion, related to Figure S3
mmc2.xlsx (26.2KB, xlsx)
Table S2. Down-regulated genes upon CTD deletion, related to Figure S3
mmc3.xlsx (31.1KB, xlsx)
Table S3. Up-regulated genes upon CT116 deletion, related to Figure S3
mmc4.xlsx (54KB, xlsx)
Table S4. Down-regulated genes upon CT116 deletion, related to Figure S3
mmc5.xlsx (56.5KB, xlsx)
Table S5. Up-regulated genes upon NCR deletion, related to Figures 4, 5, and S3
mmc6.xlsx (42.6KB, xlsx)
Table S6. Down-regulated genes upon NCR deletion, related to Figures 4, 5, and S3
mmc7.xlsx (23.8KB, xlsx)
Table S7. Oligonucleotides used in this study, related to all Figures
mmc8.xlsx (15.2KB, xlsx)

Data Availability Statement

  • High-throughput sequencing files (ChIP-seq, RNA-seq, ATAC-seq, and QHR-4C) have been deposited in the NCBI Gene Expression Omnibus (GEO) database under accession numbers GSE261210, GSE261212, GSE261213, and GSE261209, respectively.

  • This paper does not report original code.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from iScience are provided here courtesy of Elsevier
