Summary
As an essential regulator of higher-order chromatin structures, CCCTC-binding factor (CTCF) is a highly conserved protein with a central DNA-binding domain of 11 tandem zinc fingers (ZFs), which are flanked by amino (N-) and carboxy (C-) terminal domains of intrinsically disordered regions. Here we report that CRISPR deletion of the entire C-terminal domain of alternating charge blocks decreases CTCF DNA binding but deletion of the C-terminal fragment of 116 amino acids results in increased CTCF DNA binding and aberrant gene regulation. Through a series of genetic targeting experiments, in conjunction with electrophoretic mobility shift assay (EMSA), circularized chromosome conformation capture (4C), qPCR, chromatin immunoprecipitation with sequencing (ChIP-seq), and assay for transposase-accessible chromatin with sequencing (ATAC-seq), we uncovered a negatively charged region (NCR) responsible for weakening CTCF DNA binding and chromatin accessibility. AlphaFold prediction suggests an autoinhibitory mechanism of CTCF via NCR as a flexible DNA mimic domain, possibly competing with DNA binding for the positively charged ZF surface area. Thus, the unstructured C-terminal domain plays an intricate role in maintaining proper CTCF-DNA interactions and 3D genome organization.
Subject areas: Experimental systems for structural biology, Molecular physiology, Molecular Structure, Properties of biomolecules
Highlights
• CTCF C-terminal domain (CTD) contains alternating charge blocks
• A negatively charged region (NCR) within CTD suppresses CTCF DNA binding
• CRISPR deletion of CTCF NCR alters gene expression and 3D genome architecture
• CTCF autoinhibition by NCR as a DNA mimicry regulates its genome affinity
Introduction
CCCTC-binding factor (CTCF) is a principal architectural protein for the construction of 3D genomes and is highly conserved in bilateria.1,2,3,4,5 Together with the cohesin complex, CTCF mediates the formation of long-distance chromatin loops between distant sites, known as CBS (CTCF binding site) elements, through an ATP-dependent active process known as “loop extrusion,” leading to higher-order chromatin structures such as TADs (topologically associating domains).6,7,8,9,10,11,12,13 Interestingly, CTCF/cohesin-mediated chromatin loops are preferentially formed between pairs of CBS elements in a forward-reverse convergent orientation.14,15,16,17 In particular, topological chromatin loops are formed between tandem-arrayed CBS elements via cohesin-mediated dynamic loop extrusion, leading to balanced promoter choice.18,19,20 The dynamic cohesin loop extrusion and its asymmetric blocking by oriented CTCF binding on numerous CBS elements distributed throughout mammalian genomes constitute a general principle in 3D genome organization and play an important role in gene regulation. Finally, other DNA-binding zinc-finger (ZF) proteins such as YY1, MAZ, PATZ1, and ZNF263 may collaborate with CTCF to form long-distance chromatin contacts.21,22
The clustered protocadherin (cPcdh) genes are an excellent model to investigate the relationships between CTCF/cohesin-mediated chromatin looping and gene expression programs. The 53 highly similar human cPCDH genes are organized into three tandemly linked clusters of PCDHα, PCDHβ, and PCDHγ, spanning a large region of ∼1 Mb of genomic DNA.23 The PCDHα gene cluster comprises an upstream region of 15 variable exons and a downstream region with 3 constant exons. Similarly, PCDHγ comprises an upstream region of 22 variable exons and a downstream region of 3 constant exons. Each variable exon is separately spliced to the respective set of 3 constant exons within the PCDH α or γ gene cluster. By contrast, the PCDHβ gene cluster comprises only 16 variable exons with no constant exons.
Similar to the intriguing Dscam gene for generating enormous diversity of cell-recognition codes in fly, the cPCDH genes generate an exquisite diversity for neuronal self-avoidance and nonself discrimination in vertebrates.24,25 In contrast to the competitive RNA pairing-mediated mutually exclusive splicing mechanism of Dscam, the cPCDH diversity is generated by a combination of balanced promoter choice and cis-alternative splicing determined by CTCF-directed DNA looping.14,26 In this complicated 3D genome configuration, CTCF directionally binds to tandem arrays of oriented CBS elements associated with Pcdh variable promoters and super-enhancers. CTCF-mediated chromatin loops are then formed between pairs of convergent forward-reverse CBS elements. For example, in the PCDHα gene cluster, there are two forward CBS elements flanking each of the 13 alternate variable exons and two reverse CBS elements flanking the HS5-1 enhancer.27 A “double-clamping” chromatin interaction between these convergent pairs of CBS elements determines the cPCDH promoter choice.14 In summary, CTCF/cohesin-mediated loop extrusion bridges remote super-enhancers in close contact with target variable promoters to form long-distance chromatin loops, and this looping process is essential for establishing proper expression patterns of the cPCDH genes in the brain.14,17,18,19,28,29,30
CTCF contains a central domain of 11 ZFs organized in a tandem array flanked by intrinsically disordered regions of the N-terminal domain (NTD) and C-terminal domain (CTD) (Figure 1A).1,31,32 The central domain of CTCF binds to DNA directly through ZFs 3–7 and ZFs 9–11.17,33,34,35 Recently, several lines of evidence suggest that ZF1 and ZF2 recognize base pairs downstream of the CTCF core motif.36,37,38,39,40 Remarkably, CTCF also interacts with RNA to mediate chromatin loop formation and to regulate gene expression patterns.41,42 Finally, the intrinsically disordered NTD, but not CTD, of CTCF interacts with the cohesin complex to anchor chromatin loops between distant DNA elements.4,12,43,44,45,46 Here, through a series of genetic deletions in conjunction with chromosome conformation capture and gene expression analyses, we found that a negatively charged region (NCR) within the disordered CTD is important for proper CTCF DNA binding, higher-order chromatin organization, and gene regulation.
Results
Deletion of CTCF CTD results in decreased DNA binding
We analyzed the amino acid (aa) composition of CTCF NTD and CTD and found that the NTD of CTCF contains all of the 20 types of aa while the CTD of CTCF contains only 15 aa types, suggesting that CTCF CTD has an aa compositional bias and is a low-complexity region (Figures 1B and 1C). Computational analyses suggest that both NTD and CTD regions have high intrinsic disorder scores, especially the CTD region (Figure 1D).47 In addition, the CTD region is highly conserved in vertebrates (Figure 1E). Owing to the lethality of CTCF deletion in mice or in cultured cells,48,49 we tried to delete the CTCF NTD or CTD in HEC-1-B cells17 and found that deletion of NTD, but not CTD, is lethal in cultured cells. Specifically, we screened 254 single-cell clones for deletion of NTD and could not find a single homozygous cell clone. Therefore, we focused our genetic dissection on CTCF CTD.
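The compositional comparison described above (20 aa types in one region versus 15 in another) amounts to counting distinct residue types per sequence. The following is a minimal sketch of that count, not the analysis pipeline used in this study; the function names and both toy sequences are hypothetical and do not correspond to the real CTCF NTD or CTD.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def aa_composition(seq: str) -> Counter:
    """Count each standard amino acid in a protein sequence (case-insensitive)."""
    return Counter(res for res in seq.upper() if res in STANDARD_AA)


def n_aa_types(seq: str) -> int:
    """Number of distinct standard amino acid types present in the sequence."""
    return len(aa_composition(seq))


def missing_aa_types(seq: str) -> list:
    """Standard amino acids that never occur in the sequence, sorted."""
    return sorted(STANDARD_AA - set(aa_composition(seq)))


# Hypothetical toy sequences for illustration only (NOT real CTCF sequences).
toy_complex = "ACDEFGHIKLMNPQRSTVWY"
toy_biased = "EEDDKKRRSSPPEEDDGG"

if __name__ == "__main__":
    for name, seq in (("toy_complex", toy_complex), ("toy_biased", toy_biased)):
        print(name, n_aa_types(seq), "types; missing:", missing_aa_types(seq))
```

A region that lacks several residue types entirely, as the toy biased sequence does here, is the kind of pattern that would be read as compositional bias.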
We screened for CTD-deletion clones by CRISPR DNA-fragment editing programmed with dual single guide RNAs (sgRNAs) and a donor construct containing FLAG sequences for tagging50,51 and obtained two single-cell clones with precise deletion of the CTCF CTD (ΔCTD) (Figures S1A–S1D). We performed CTCF chromatin immunoprecipitation with sequencing (ChIP-seq) with these two ΔCTD clones as well as with wild-type (WT) clones as a control (Figure S1B) and found that almost every CTCF peak within the three PCDH gene clusters is decreased upon CTD deletion, suggesting that CTD has an important role in CTCF binding to DNA (Figures 1F, S2A, and S2B). Aggregated peak analysis showed that there is a significant decrease of CTCF binding at the PCDH CBS elements (Figure 1G). DNA-bound CTCF anchors cohesin complex via its NTD but not CTD.4,12,43,44,45,46 To test this, we performed ChIP-seq experiments with a specific antibody against Rad21, a cohesin subunit, and found that cohesin is colocalized with CTD-deleted CTCF, suggesting that CTD-deleted CTCF is still able to anchor cohesin at the cPCDH locus (Figures 1F and S2A). However, there is a significant decrease of cohesin enrichments upon deletion of CTCF CTD (Figures 1F, 1H, S2A, and S2C). We then analyzed genome-wide CTCF and cohesin enrichments and found that both CTCF and cohesin enrichments are significantly decreased upon CTD deletion (Figures 1I and 1J). However, computational analyses revealed no alteration of CTCF motifs of all three types of CBS elements upon CTD deletion, suggesting that deletion of CTD does not alter the DNA binding specificity of the central ZF domain (Figures 1K–1M), despite the fact that CTCF enrichments are decreased for all three types of CBS elements (Figures S2D–S2F).
We next performed quantitative high-resolution circularized chromosome conformation capture experiments (QHR-4C, see STAR Methods)19 with the HS5-1 enhancer as an anchor and found that there is a significant decrease of long-distance chromatin interactions between the HS5-1 enhancer and its target promoters (Figure 1N). Finally, we performed RNA sequencing (RNA-seq) experiments and found that, consistent with decreased chromatin interactions between enhancers and promoters, there is a significant decrease of expression levels of members of the PCDHα gene cluster upon CTCF CTD deletion (Figure 1O).
Deletion of CTCF CTD affects gene regulation
We next analyzed the RNA-seq data using DESeq2 with adjusted p value <0.05 and |log2FC| >1 (FC, fold change) as cutoffs. We found 150 up-regulated genes (Table S1, ingenuity pathway analysis [IPA]; Figure S3A) with mean log2FC of 1.70 and 207 down-regulated genes (Table S2; Figure S3B) with mean log2FC of −1.84 (Figure S3C). We also found that the down-regulated genes are closer to CBS elements than the up-regulated genes (Figure S3D).
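The cutoffs stated above can be illustrated with a short filtering step applied to a table of differential expression results. This is only a sketch under stated assumptions: the record type `DEResult`, the function names, and the toy gene entries are hypothetical, and the study itself used DESeq2 in R rather than this code.

```python
from dataclasses import dataclass


@dataclass
class DEResult:
    gene: str      # gene identifier (hypothetical names below)
    log2fc: float  # log2 fold change, mutant versus WT
    padj: float    # multiple-testing-adjusted p value


def classify(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Split results into up- and down-regulated lists.

    A gene passes only if padj < padj_cutoff and |log2FC| > lfc_cutoff.
    A NaN padj fails the comparison and is therefore excluded.
    """
    up = [r for r in results if r.padj < padj_cutoff and r.log2fc > lfc_cutoff]
    down = [r for r in results if r.padj < padj_cutoff and r.log2fc < -lfc_cutoff]
    return up, down


def mean_log2fc(rows):
    """Mean log2FC of a list of results (NaN for an empty list)."""
    return sum(r.log2fc for r in rows) / len(rows) if rows else float("nan")


# Toy, made-up values for illustration only.
toy = [
    DEResult("geneA", 2.0, 0.01),
    DEResult("geneB", -1.5, 0.001),
    DEResult("geneC", 3.0, 0.2),     # large change but not significant
    DEResult("geneD", 0.5, 0.001),   # significant but small change
    DEResult("geneE", 1.4, 0.04),
]

if __name__ == "__main__":
    up, down = classify(toy)
    print([r.gene for r in up], mean_log2fc(up))
    print([r.gene for r in down], mean_log2fc(down))
```

Applying the fold-change cutoff to the absolute value is what allows the same threshold to yield both the up-regulated and the down-regulated gene sets.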
Deletion of C-terminal 116 aa leads to increased CTCF binding
The CTD of CTCF contains an internal RNA-binding region (RBR) and a downstream region of 116 aa.42,52 To investigate its function, we generated targeted deletion by screening single-cell CRISPR clones using CRISPR DNA-fragment editing with Cas9 programmed by dual sgRNAs.17,50 We obtained two clones with deletion of the C-terminal 116 aa (ΔCT116) (Figures S1E–S1G). We performed CTCF ChIP-seq experiments and found, remarkably, that there is a significant increase of CTCF enrichments in the cPCDH gene complex upon deletion of the C-terminal 116 aa (Figures 2A, 2B, S4A, and S4B). In addition, we performed Rad21 ChIP-seq experiments and found that there is a significant increase of cohesin enrichments at the cPCDH locus (Figures 2A, 2C, S4A, and S4C), consistent with the model of CTCF asymmetrical blocking of cohesin “loop extrusion.”
We next performed genome-wide analyses and found similar increases in CTCF and cohesin enrichments upon deletion of the C-terminal 116 aa (Figures 2D and 2E). Genome-wide analyses of CTCF motifs showed no alteration of all three types of the CBS elements (Figures S4D–S4I). We also performed QHR-4C experiments using the HS5-1 enhancer as an anchor and found a significant increase of chromatin contacts with the target promoters of PCDHα6 and PCDHα12 (Figure 2F). Finally, RNA-seq experiments showed a significant increase of expression levels of PCDHα6, PCDHα12, and PCDHαc2 (Figure 2G). These data demonstrate that the unstructured region of C-terminal 116 aa inhibits CTCF binding to DNA.
Deletion of the CTCF C-terminal 116 aa affects gene expression
RNA-seq experiments revealed 432 up-regulated genes (Table S3; Figure S3E) with mean log2FC of 2.36 and 461 down-regulated genes (Table S4; Figure S3F) with mean log2FC of −1.98 in ΔCT116 cells (Figure S3G). Interestingly, we found that the up-regulated genes are closer to increased CTCF peaks (Figure S3H).
Rescue with a series of C-terminal truncated CTCFs uncovers an NCR
We next generated a series of C-terminal truncated CTCF with V5 tags and transfected them into ΔCTD cells (Figure 3A). Specifically, truncated CTCF with 116 aa deleted at the C terminus was constructed as CTCF1-611. In addition, CTCF1-637 contains an additional region of 26 highly conserved mostly negatively charged amino acids (NCR). Finally, CTCF1-663 contains a further downstream region of 26 aa with 9 proline residues and 7 positively charged lysine or arginine residues. We generated stable cell lines by infecting with lentiviruses containing these constructs and verified their expression by western blots (Figures 3A and 3B).
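The residue ranges that distinguish these constructs follow directly from their end coordinates. The sketch below works through that arithmetic; the full length of 727 aa is inferred here from 611 + 116 rather than quoted from the text, and the function name `added_segment` is hypothetical.

```python
# Full length inferred from "116 aa deleted at the C terminus" giving CTCF1-611.
FULL_LENGTH = 611 + 116  # 727 aa (an inference, stated as an assumption)

# C-terminal end coordinate of each construct (1-based, inclusive).
CONSTRUCT_ENDS = {
    "CTCF1-611": 611,
    "CTCF1-637": 637,
    "CTCF1-663": 663,
    "CTCF-FL": FULL_LENGTH,
}


def added_segment(shorter_end: int, longer_end: int):
    """Residues present in the longer construct but not the shorter one.

    Returns (start, end, length) with 1-based inclusive coordinates.
    """
    if longer_end <= shorter_end:
        raise ValueError("longer_end must exceed shorter_end")
    start = shorter_end + 1
    return start, longer_end, longer_end - start + 1


if __name__ == "__main__":
    names = list(CONSTRUCT_ENDS)
    for short, long_ in zip(names, names[1:]):
        seg = added_segment(CONSTRUCT_ENDS[short], CONSTRUCT_ENDS[long_])
        print(f"{short} -> {long_}: residues {seg[0]}-{seg[1]} ({seg[2]} aa)")
```

Under these assumptions the first step adds residues 612–637 (the 26-aa NCR), the second adds residues 638–663 (the 26-aa proline- and lysine/arginine-rich segment), and the last adds the remaining residues 664–727.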
We performed ChIP-seq experiments with a specific antibody against V5 tag to investigate CTCF binding profiles and found that the binding strength of CTCF1-611 at the cPCDH locus and throughout the entire genome is the highest among these four CTCF transgenes including full-length CTCF (Figures 3C–3E, S5A, and S5B). Specifically, CTCF1-611 has the highest affinity for all three types of genome-wide CBS elements (Figures 3F–3H and S5C). In conjunction with data of endogenous C-terminal truncation (Figure 2), we concluded that CTCF1-611 has the highest DNA binding affinity and that the NCR of 26 aa from 612 to 637 suppresses CTCF-DNA interactions.
NCR deletion increases CTCF binding and cPCDH expression
To investigate the endogenous function of NCR, we genetically deleted it by screening single-cell CRISPR clones and obtained two cell clones (ΔNCR) (Figures S1H–S1J). We performed CTCF ChIP-seq experiments with these clones and found that there is a significant increase of CTCF enrichments in the cPCDH locus compared with WT controls (Figures 4A, 4B, S6A, and S6B). We also performed Rad21 ChIP-seq and found a similar increase of cohesin enrichments at the cPCDH locus (Figures 4A, 4C, S6A, and S6C). In addition, genome-wide CTCF and cohesin enrichments are also significantly increased upon NCR deletion (Figures 4D and 4E). Furthermore, CTCF or cohesin enrichments are significantly increased at all three types of CBS elements (Figures S6D–S6I). We also performed ChIP-qPCR to further validate increased CTCF binding in ChIP-seq (Figure S6J). We next performed QHR-4C experiments using these single-cell clones with HS5-1 as an anchor and found there is a significant increase of long-distance chromatin interactions with the target promoters of PCDHα6 and PCDHα12 (Figure 4F). Finally, we performed RNA-seq experiments and found that there is a significant increase of expression levels of PCDHα6 and PCDHα12 upon NCR deletion (Figure 4G). These data suggest an important function of NCR in CTCF binding and gene regulation.
Deletion of NCR affects gene regulation
Genome-wide analyses of RNA-seq data identified 312 up-regulated genes (Table S5; Figure S3I) with mean log2FC of 1.72 and 125 down-regulated genes (Table S6; Figure S3J) with mean log2FC of −1.55 (Figure 4H). We also found that the up-regulated genes are closer to increased CTCF peaks (Figure 4I). For example, we found that expression levels of EHD2 (EH domain containing 2) correlate with the change of CTCF binding at the promoter region (Figures S3K–S3M).
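The statement that one gene set lies closer to CTCF peaks than another rests on a nearest-feature distance computed for each gene. The following is a minimal sketch of such a calculation on a single chromosome, assuming point coordinates for both genes and peaks; the function names and all coordinates are hypothetical, and dedicated interval tools would normally be used on real data.

```python
from bisect import bisect_left
from statistics import median


def nearest_distance(pos: int, sorted_peaks: list) -> int:
    """Distance from pos to the closest entry of an ascending list of peak positions."""
    if not sorted_peaks:
        raise ValueError("no peaks supplied")
    i = bisect_left(sorted_peaks, pos)
    candidates = []
    if i < len(sorted_peaks):
        candidates.append(sorted_peaks[i] - pos)   # nearest peak at or right of pos
    if i > 0:
        candidates.append(pos - sorted_peaks[i - 1])  # nearest peak left of pos
    return min(candidates)


def median_nearest(positions, peaks) -> float:
    """Median nearest-peak distance for a set of gene positions."""
    peaks = sorted(peaks)
    return median(nearest_distance(p, peaks) for p in positions)


# Made-up coordinates on one hypothetical chromosome.
toy_peaks = [1000, 5000, 20000]
toy_up_tss = [1200, 4800, 5100]
toy_down_tss = [10000, 15000, 30000]

if __name__ == "__main__":
    print("up-regulated:", median_nearest(toy_up_tss, toy_peaks))
    print("down-regulated:", median_nearest(toy_down_tss, toy_peaks))
```

Comparing the two distance distributions, rather than only their medians, would be the natural next step in a real analysis.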
A role of NCR in rewiring chromatin accessibility
We next performed the assay for transposase-accessible chromatin with sequencing (ATAC-seq) and found that chromatin accessibility is increased at most ATAC-seq peaks in the cPCDH locus upon CTCF NCR deletion (Figure S7). We also observed significantly increased ATAC-seq signals genome-wide at increased CTCF peaks (Figure S8A). We next calculated the log2FC of ATAC-seq peaks and found that the majority of ATAC-seq peaks are increased and a minority are decreased (Figure S8B). Specifically, we identified 7,373 increased differential accessibility regions (DARs) and 1,311 decreased DARs genome-wide (Figure S8C). Interestingly, we found that most increased DARs are located at gene promoters and that many decreased DARs are located in the intergenic regions (Figure S8C).
We noted that ∼3/4 of DARs (6,403) overlap with CBS elements and ∼1/4 of DARs (2,281) do not (Figure S8D). Among all DARs overlapping with CBS elements, there are 5,431 increased DARs and 972 decreased DARs (Figure S8E). We calculated the distance of non-CBS DARs to the nearest CBS elements and found that increased DARs are closer to CBS elements than decreased DARs (Figure S8F). Integrated analysis with RNA-seq data showed that increased DARs correlate with enhanced levels of gene expression (Figure S8G). Together, these data suggest that CTCF NCR regulates chromatin accessibility and gene expression.
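The DAR counts reported in this section can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the non-CBS increased and decreased counts are derived here by subtraction, assuming the reported categories partition the DAR set, and are not values quoted from the figures.

```python
def dar_summary() -> dict:
    """Cross-check the reported DAR counts and derive the implied non-CBS split."""
    increased, decreased = 7373, 1311        # all DARs, by direction
    cbs_overlap, non_cbs = 6403, 2281        # all DARs, by CBS overlap
    cbs_increased, cbs_decreased = 5431, 972  # CBS-overlapping DARs, by direction

    total = increased + decreased
    # Both partitions must describe the same set of DARs.
    assert total == cbs_overlap + non_cbs
    assert cbs_overlap == cbs_increased + cbs_decreased

    return {
        "total": total,
        "frac_cbs": cbs_overlap / total,
        "frac_increased": increased / total,
        # Derived by subtraction (an assumption, not reported in the text).
        "non_cbs_increased": increased - cbs_increased,
        "non_cbs_decreased": decreased - cbs_decreased,
    }


if __name__ == "__main__":
    for key, value in dar_summary().items():
        print(key, round(value, 3) if isinstance(value, float) else value)
```

The two partitions agree on a total of 8,684 DARs, and the CBS-overlapping fraction is consistent with the "∼3/4" stated above.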
AlphaFold prediction suggests allosteric autoinhibition of CTCF DNA binding by NCR in CTD
We first validated the inhibitory role of NCR in CTCF DNA binding in vitro directly by the electrophoretic mobility shift assay (EMSA) (Figures 5A–5E). We then used AlphaFold3 to predict the conformation model of the CTCF ZF array and CTD with or without DNA ligands (Figures 5F–5I).53 Intriguingly, AlphaFold3 modeling suggests that CTCF in the apo state adopts an autoinhibited conformation with NCR folding back and interacting with the positively charged surface of the ZF array via electrostatic contacts (Figure 5F). In particular, a cluster of seven negatively charged residues within NCR appears to act as a DNA mimic and can fold onto the positively charged DNA binding surface of the CTCF ZF array via multiple electrostatic interactions (Figure 5G). In addition, both the negatively charged residues of NCR and positively charged residues of ZFs are highly conserved across vertebrates (Figure 5H). Binding to DNA targets induces large conformational changes of CTCF and releases the flexible CTD (Figure 5I). Thus, AlphaFold modeling suggests allosteric autoregulation of CTCF DNA binding via DNA-mimicking NCR within the flexible CTD (Figure 5J).
Discussion
Combinatorial and patterned expression of diverse cadherin-like cPcdh genes in single cells in the brain enables a molecular logic of self-avoidance between neurites from the same neurons as well as a functional assembly of synaptic connectivity between neurons of the same developmental origin.29,54,55,56,57 This complicated cPcdh expression program is achieved by ATP-dependent active cohesin “loop extrusion,” which brings remote super-enhancers into close contact with target variable promoters via CTCF-mediated anchoring at tandem-arrayed directional CBS elements.14,17,18,19,27,30,58,59,60 In particular, tandem-arrayed CTCF sites function as topological chromatin insulators to balance distance- and context-dependent promoter choice to activate cell-specific gene expression in the brain.18,19,30,58,59 Consequently, the dynamic interactions between CTCF and its recognition sites at variable promoters and super-enhancers are central for establishing cPcdh expression programs during brain development. In this work we systematically investigated the function of CTCF CTD and uncovered an NCR for maintaining proper CTCF binding to DNA in the large cPCDH gene complex and throughout the entire genome.
CTCF is a key 3D genome architectural protein that recognizes a large range of genomic sites via the central domain of 11 tandem ZFs and anchors loop-extruding cohesin via the YDF motif of the NTD.4,12,43,44,45,46 In addition, other ZF architectural proteins such as ZNF143, MAZ, PATZ1, and ZNF263 may collaborate with CTCF to maintain proper CTCF insulation at TAD boundaries.22,60,61 The CTCF last ZF and CTD contain an internal RBR of 38 aa, and this region helps CTCF clustering and searching for authentic CBS elements.42,52,62,63 Through a series of genetic deletion and rescue experiments, we uncovered an important NCR immediately downstream of the positively charged RBR. Specifically, we showed that NCR deletion leads to a significant increase of CTCF enrichments at all three types of CBS elements throughout the whole genome. In particular, NCR deletion results in increased CTCF binding at cPCDH variable promoters and super-enhancers, accompanied by increased chromatin accessibility and long-distance DNA looping. NCR may play an inhibitory role in CTCF clustering via RNA and in dynamic recognition of cognate genomic target sites.63,64,65 NCR either repels DNA directly or weakens the strength of CTCF interaction with RNA and thus inhibits CTCF clustering and searching for cognate CBS elements. Either way, NCR appears to be important in maintaining proper CTCF affinity for cognate genomic sites and specific chromatin looping at the cPCDH gene complex and likely throughout the entire genome. Thus, CTCF has an intricate self-adjusting mechanism to control the dynamic binding to genomic sites.
One intriguing allosteric self-adjusting mechanism of CTCF DNA binding is suggested by AlphaFold3 prediction (Figure 5J). Given the large conformational change of CTCF induced by DNA binding, NCR could be released from interacting with the ZF array upon CTCF recognition of genomic CBS elements. Thus, NCR functions as a DNA mimic and self-associates with the positively charged surface of the ZF array via electrostatic interactions in the CTCF apo state. Indeed, CTCF CTD interacts with its ZF array in in vitro pull-down experiments.66 The self-association of disordered flexible NCR may be a potential general mechanism of DNA- and RNA-binding proteins. For example, an NCR of ZNF410 regulates its ZF array to bind DNA via a cis-allosteric inhibitory mechanism.65 In addition, autoinhibitory intramolecular interactions proofread U2AF recognition of authentic polypyrimidine tracts during RNA binding.67
NCR contains an acidic array of 10 glutamates and 5 aspartates for a total of 15 negatively charged aa residues (Figure 3A). In addition, there are four serine residues immediately upstream of NCR that may be phosphorylated by casein kinase II, thus switching to negative charges upon phosphorylation.68 Numerous CTCF mutations are related to multiple cancers or a group of neurodevelopmental diseases known as CTCF-related disorders (CRDs).69,70 The large spectrum of neurodevelopmental diseases or CRDs may be related to dysregulation of clustered protocadherins.71,72 Interestingly, mutations within the CTCF NCR that alter its electrostatic charges are associated with several types of cancers. For example, CTCF mutations of E616K or E626K are associated with melanoma and lung cancers, respectively.69,70 The exact pathogenetic mechanisms are not known but are very likely related to disruptions of alternate positive-negative aa block patterns of RBR and NCR within CTD and selective partitioning into CTCF-specific trapping zones.63,73 Specifically, the positively charged RBR and negatively charged NCR within the C-terminal domain of CTCF may constitute recently noticed alternating charge block patterns and may participate in the selective partitioning of phase separation or in the formation of protein condensates during chromatin looping in the 3D genome.73,74
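The effect of a glutamate-to-lysine substitution on the charge of an acidic segment can be illustrated by counting charged residues. This is a rough sketch that treats D/E as −1 and K/R as +1 and ignores histidine, termini, and phosphorylation; the peptide is a made-up toy chosen only to contain 10 E and 5 D in 26 residues and is NOT the real CTCF NCR sequence, and the function names are hypothetical.

```python
ACIDIC = frozenset("DE")  # counted as -1 each
BASIC = frozenset("KR")   # counted as +1 each


def charge_counts(seq: str):
    """Return (n_negative, n_positive, net_charge) under the simple +/-1 model."""
    seq = seq.upper()
    neg = sum(res in ACIDIC for res in seq)
    pos = sum(res in BASIC for res in seq)
    return neg, pos, pos - neg


def substitute(seq: str, position: int, new_res: str) -> str:
    """Replace the residue at a 1-based position within the given segment."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    return seq[: position - 1] + new_res + seq[position:]


# Hypothetical 26-aa toy peptide (10 E, 5 D); not the real NCR.
TOY_ACIDIC = "EDEEGSDEEAPEDEGSEDAEPGESDG"

if __name__ == "__main__":
    print("toy peptide:", charge_counts(TOY_ACIDIC))
    mutant = substitute(TOY_ACIDIC, 1, "K")  # an E-to-K change in the toy peptide
    print("E1K mutant: ", charge_counts(mutant))
```

In this simple model each E-to-K change shifts the net charge by +2, since one negative residue is lost and one positive residue is gained.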
Limitations of the study
While our genetic experiments, in conjunction with chromosome conformation capture and RNA-seq, demonstrated that CTCF CTD, in particular NCR, plays a crucial role in maintaining proper CTCF binding at the cPCDH gene complex, and subsequent PCDH chromatin looping and gene regulation, whether this close correlation between chromatin looping and gene expression could be generalized to the entire genome remains to be tested. In addition, while our ATAC-seq experiments showed that enhanced CTCF DNA binding correlates mostly with higher chromatin accessibility, the exact mechanism is not known but is most likely related to pioneer factors.
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Qiang Wu (qiangwu@sjtu.edu.cn).
Materials availability
All unique/stable reagents generated in this study are available from the lead contact with a completed materials transfer agreement.
Data and code availability
• High-throughput sequencing files (ChIP-seq, RNA-seq, ATAC-seq, and QHR-4C) have been deposited into the NCBI Gene Expression Omnibus (GEO) database with the accession numbers GSE261210, GSE261212, GSE261213, and GSE261209, respectively.
• This paper does not report original code.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Acknowledgments
We are very grateful to the reviewers who made constructive comments and excellent suggestions of AlphaFold and EMSA analyses, which we have appreciatively implemented in the generalized allosteric autoinhibition model of DNA- and RNA-binding proteins. We are also grateful for advice on bioinformatics from Dr. Jingwei Li and all members of our laboratory for discussion. This study was supported by grants to Q.W. from the National Natural Science Foundation of China (32330016) and the National Key R&D Program of China (2022YFC3400200).
Author contributions
Q.W. conceived the research. Y.Z. contributed resources. L.L. performed experiments. L.L. and Y.T. analyzed data. L.L. and Q.W. wrote the manuscript.
Declaration of interests
The authors declare no competing interests.
STAR★Methods
Key resources table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
Anti-CTCF | Millipore | Cat# 07–729; RRID: AB_441965 |
Anti-Rad21 | Abcam | Cat# AB992; RRID: AB_2176601 |
Anti-V5 | Abcam | Cat# AB15828; RRID: AB_443253 |
Goat anti-Rabbit IgG (H + L) Highly Cross-Adsorbed Secondary Antibody | ThermoFisher Scientific | Cat# A32735; RRID: AB_2633284 |
Anti-GAPDH | Abmart | Cat# P30008; RRID: AB_2936506 |
Anti-MYC | Abmart | Cat# M20002; RRID:AB_2861172 |
Anti-Flag | Abmart | Cat# M2008; RRID:AB_2713960 |
Bacterial and virus strains | ||
Stbl3 competent cells | This study | N/A |
Chemicals, peptides, and recombinant proteins | ||
EcoRI | NEB | Cat# R0101S |
BsaI | NEB | Cat# R0535S |
DpnII | NEB | Cat# R0176L |
BamHI | NEB | Cat# R0136S |
NEB buffer 2 | NEB | Cat# B7002S |
T4 DNA ligase | NEB | Cat# M0202L |
Formaldehyde | ThermoFisher Scientific | Cat# 28908 |
RNase A | ThermoFisher Scientific | Cat# EN0531 |
Proteinase K | NEB | Cat# P8107S |
Glycogen | ThermoFisher Scientific | Cat# R0561 |
TRIzol Reagent | ThermoFisher Scientific | Cat# 15596026 |
Anti-FLAG M2 Magnetic Beads | Sigma-Aldrich | Cat# M8823 |
Streptavidin Dynabeads | ThermoFisher Scientific | Cat# 65001 |
AMPure XP beads | Beckman Coulter | Cat# A63881 |
Lipofectamine 3000 | Invitrogen | Cat# L30001 |
Polybrene | Sigma-Aldrich | Cat# TR-1003-G |
Puromycin dihydrochloride | Sigma-Aldrich | Cat# P8833 |
Critical commercial assays | ||
MinElute Gel Extraction Kit | QIAGEN | Cat# 28606 |
VAHTS™ Universal DNA Library Prep Kit for Illumina V2 | Vazyme | Cat# ND606 |
VAHTS™ Multiplex Oligos set 4 for Illumina | Vazyme | Cat# N321 |
QIAquick PCR Purification Kit | QIAGEN | Cat# 28106 |
VAHTS Universal V6 RNA-seq Library Prep Kit for Illumina | Vazyme | Cat# NR604 |
VAHTS mRNA Capture Beads | Vazyme | Cat# N401 |
VAHTS® RNA Multiplex Oligos Set 1 for Illumina | Vazyme | Cat# N323 |
Vazyme Hyperactive ATAC-Seq Library Prep Kit for Illumina | Vazyme | Cat# TD711 |
TruePrep Index Kit V2 for Illumina | Vazyme | Cat# TD202 |
pClone007 Versatile Simple Vector Kit | TsingKe | Cat# TSV-007VSm |
BCA protein assay kit | Beyotime | Cat# P0009 |
TnT® T7 Quick Coupled Transcription/Translation System | Promega | Cat# L1170 |
LightShift® Chemiluminescent EMSA Kit | Thermo | Cat# 20148 |
SYBR qPCR Master Mix | Vazyme | Cat# Q711 |
Deposited data | ||
High-throughput sequencing files (QHR-4C) | This study | GEO: GSE261209 |
High-throughput sequencing files (ChIP-seq) | This study | GEO: GSE261210 |
High-throughput sequencing files (RNA-seq) | This study | GEO: GSE261212 |
High-throughput sequencing files (ATAC-seq) | This study | GEO: GSE261213 |
Raw imaging files | This study, Mendeley data | https://data.mendeley.com/preview/345thpxnbt?a=4721a09e-e875-4859-9e00-8eaa95ebf5b9 |
Structures predicted by AlphaFold3 | This study | https://www.modelarchive.org/doi/10.5452/ma-jd5dd; https://www.modelarchive.org/doi/10.5452/ma-u27pa |
Experimental models: Cell lines | ||
Human: HEC-1-B | ATCC | Cat# HTB-113, RRID: CVCL_0294 |
Human: HEK293T | ATCC | Cat# CRL-3216, RRID: CVCL_0063 |
Human: HEC-1-B ΔCTD clone1 | This paper | N/A |
Human: HEC-1-B ΔCTD clone2 | This paper | N/A |
Human: HEC-1-B ΔCT116 clone1 | This paper | N/A |
Human: HEC-1-B ΔCT116 clone2 | This paper | N/A |
Human: HEC-1-B ΔNCR clone1 | This paper | N/A |
Human: HEC-1-B ΔNCR clone2 | This paper | N/A |
Human: HEC-1-B WT-FLAG clone1 | This paper | N/A |
Human: HEC-1-B WT-FLAG clone2 | This paper | N/A |
Human: HEC-1-B CTCF1-611 | This paper | N/A |
Human: HEC-1-B CTCF1-637 | This paper | N/A |
Human: HEC-1-B CTCF1-663 | This paper | N/A |
Human: HEC-1-B CTCF-FL | This paper | N/A |
Oligonucleotides | ||
See Table S7 | This study | N/A |
Recombinant DNA | ||
Plasmid: pMD2.G | Addgene | Cat# 12259, RRID: Addgene_12259 |
psPAX2 | Addgene | Cat# 12260, RRID: Addgene_12260 |
Plasmid: pGL3-U6-sgRNA-PGK-Puro | Li et al.50 | https://academic.oup.com/jmcb/article/7/4/284/901042 |
pcDNA3.1-Cas9-WT | J. Xi (Peking University) | N/A |
Plasmid: pClone007-ΔCTD homology arm | This paper | N/A |
Plasmid: pClone007-ΔCT116 homology arm | This paper | N/A |
Plasmid: pClone007-ΔNCR homology arm | This paper | N/A |
Plasmid: pClone007-WT-FLAG homology arm | This paper | N/A |
Plasmid: pGL3- ΔCTD-sg1 | This paper | N/A |
Plasmid: pGL3- ΔCTD-sg2 | This paper | N/A |
Plasmid: pGL3- ΔCT116-sg1 | This paper | N/A |
Plasmid: pGL3- ΔNCR-sg1 | This paper | N/A |
Plasmid: pGL3- ΔNCR-sg2 | This paper | N/A |
Plasmid: pLVX-CTCF1-611 | This paper | N/A |
Plasmid: pLVX-CTCF1-637 | This paper | N/A |
Plasmid: pLVX-CTCF1-663 | This paper | N/A |
Plasmid: pLVX-CTCF-FL | This paper | N/A |
Plasmid: pTNT-CTCF-FL | This paper | N/A |
Plasmid: pTNT-CTCF-ΔNCR | This paper | N/A |
Plasmid: pTNT-CTCF-ΔCT116 | This paper | N/A |
Plasmid: pClone007-EMSA-a6 | This paper | N/A |
Plasmid: pClone007-EMSA-a12 | This paper | N/A |
Software and algorithms | ||
Bowtie2 software (v2.3.4.2) | Langmead et al.75 | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml |
MACS2 (v2.1.2) | Zhang et al.76 | https://github.com/taoliu/MACS |
MEME (v4.12.0) | Bailey et al.77 | https://meme-suite.org/meme/ |
STAR (v2.7.3a) | Dobin et al.78 | https://github.com/alexdobin/STAR |
Cufflinks (v2.2.1) | Trapnell et al.79 | http://cole-trapnell-lab.github.io/cufflinks/ |
Deeptools (v3.5.3) | Ramírez et al.80 | https://deeptools.readthedocs.io/en/latest/ |
Samtools (v1.12) | Li et al.81 | http://www.htslib.org/doc/ |
Bedtools (v2.30.0) | Quinlan et al.82 | https://bedtools.readthedocs.io/en/latest/index.html |
R3Cseq (v1.38.0) | Thongjuea et al.83 | https://bioconductor.org/packages/release/bioc/html/r3Cseq.html |
DESeq2 (v1.32.0) | Love et al.84 | https://bioconductor.org/packages/release/bioc/html/DESeq2.html |
DiffBind (v3.4) | Ross-Innes et al.85 | https://bioconductor.org/packages/release/bioc/html/DiffBind.html |
Fimo (v5.5.5) | Grant et al.86 | http://meme-suite.org/tools/fimo |
ChIPseeker (v1.30.3) | Yu et al.87 | https://guangchuangyu.github.io/software/ChIPseeker |
PicardTools | N/A | http://broadinstitute.github.io/picard/ |
UCSC Genome Browser | N/A | https://genome.ucsc.edu/ |
ggplot2 (v3.4.2) | Open source | https://ggplot2.tidyverse.org/ |
ImageJ | Schneider et al.88 | https://imagej.net/ij/index.html |
PyMOL (v2.3.0) | Schrödinger, LLC | https://www.pymol.org/ |
Experimental model and study participant details
Cells and culture conditions
Human HEC-1-B cells (ATCC) were cultured as previously described in MEM medium (Hyclone) supplemented with 10% (v/v) FBS (Gibco), 2 mM GlutaMAX (Gibco), 1 mM sodium pyruvate (Sigma), and 1% penicillin-streptomycin (Gibco).17 Briefly, HEC-1-B cells were maintained at 37°C in a humidified incubator containing 5% CO2. The culture medium was changed every 24 h, and cells were passaged every 72 h. For passaging, the medium was removed, and cells were washed with PBS, digested with trypsin (Gibco) for 5 min at 37°C in a humidified incubator containing 5% CO2, and quenched with PBS supplemented with 10% FBS. Digested cells were collected by centrifugation at 500 rpm for 5 min at room temperature. Pelleted cells were resuspended in medium and seeded onto new plates.
HEK293T cells (ATCC) were cultured in Dulbecco’s modified Eagle’s medium (Hyclone) supplemented with 10% (v/v) FBS (Gibco) and 1% penicillin-streptomycin (Gibco). Cells were maintained at 37°C in a humidified incubator containing 5% CO2. Medium of cultured cells was changed every 24 h. HEK293T cells were passed every 48 h. The passage process of HEK293T is similar to HEC-1-B except with shortened trypsin digesting time of 2 min.
Method details
Plasmid construction
For all CRISPR/Cas9 experiments, the sgRNA plasmids were constructed as described before.50 Briefly, pGL3-U6 vector was linearized by BsaI (NEB) to generate the cloning backbone with 5′ overhangs of ‘TGGC’ and ‘TTTG’ at the two ends. In addition, a pair of complementary oligonucleotides (Table S7) containing the sgRNA targeting sequences with 5′ overhangs of ‘ACCG’ or ‘AAAC’ was annealed and ligated to BsaI-digested linearized vector backbone using T4 DNA ligase (NEB). The complete sgRNA sequences are under the control of the U6 promoter and will be transcribed by Pol III in mammalian cells. Finally, the Cas9 plasmid was obtained as a gift from Peking University.
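The oligo design described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sense oligo carries the 'ACCG' overhang and the antisense oligo carries 'AAAC' followed by the reverse complement of the target; the 20-nt target sequence and the function names are hypothetical and are not taken from Table S7.

```python
from __future__ import annotations

# Overhangs quoted in the text for the annealed oligo pair.
SENSE_OVERHANG = "ACCG"
ANTISENSE_OVERHANG = "AAAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sgrna_oligo_pair(target: str) -> tuple[str, str]:
    """Build the two oligos (5'->3') to anneal for one sgRNA target.

    Assumption: sense oligo = ACCG + target,
    antisense oligo = AAAC + reverse complement of the target.
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must be a non-empty string of A, C, G, T")
    return SENSE_OVERHANG + target, ANTISENSE_OVERHANG + reverse_complement(target)


if __name__ == "__main__":
    # Hypothetical 20-nt target used only for illustration.
    sense, antisense = sgrna_oligo_pair("GATTACAGATTACAGATTAC")
    print(sense)
    print(antisense)
```

The actual targeting sequences used in this study are those listed in Table S7.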
For donor plasmids used in establishing ΔCTD and ΔCT116 stable cell lines via homologous recombination (HR), the donor was designed such that ΔCTD or ΔCT116 was tagged with FLAG sequences (Table S7) for tracing CTCF proteins with FLAG-specific antibodies, because deletion of the endogenous CTCF C-terminal fragment removed the epitope (659–675 AAs) recognized by the CTCF antibody (Millipore). In addition, as a control, we FLAG-tagged WT CTCF at its C-terminus at the endogenous locus via HR. To generate donor plasmids for CRISPR screening of single-cell clones through the HR pathway, two homologous arms of ∼1 kb each flanking the target sites were amplified from genomic DNA by PCR with primers (Table S7). The amplified homologous arms were purified, and a donor DNA fragment for HR was generated through overlapping PCR. Finally, the donor DNA fragment was ligated into a T-vector through TA cloning. For ΔNCR cells, the epitope recognized by the CTCF antibody was intact, and the donor plasmid was constructed without FLAG.
All rescue CTCF constructs were V5-tagged to distinguish them from the endogenous FLAG-tagged CTCF in ΔCTD cells. To construct lentiviral pLVX vectors containing a series of truncated CTCF C-terminal domains, the pLVX vector was first linearized with EcoRI and BamHI (NEB) and purified as the cloning backbone. CTCF1-611, CTCF1-637, CTCF1-663, and full-length CTCF were cloned from a cDNA library of HEC-1-B cells with the same 5′ primers containing EcoRI restriction endonuclease sites and different 3′ primers containing BamHI restriction endonuclease sites in conjunction with the V5 tag sequence (Table S7). Truncated CTCF fragments amplified from cDNA were digested with the restriction endonucleases and ligated into the linearized pLVX vector using T4 DNA ligase.
ChIP-seq
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiments were performed as described before.14 Briefly, 5 × 10⁶ cells were collected and crosslinked with 1% formaldehyde, quenched with 2 M glycine at a final concentration of 125 mM, and washed twice with ice-cold PBS. Crosslinked cells were lysed twice in the ChIP buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 0.15 M NaCl, and 1× protease inhibitors, Roche) with slow rotation at 4°C for 10 min. The lysed cells were centrifuged at 2,500 g at 4°C for 10 min to isolate cell nuclei. The isolated nuclei were resuspended in the ChIP buffer and sonicated using a Bioruptor Sonicator (high energy setting, 30 cycles of 30-s sonication with 30-s intervals) to fragment DNA to 200–400 bp. The sonicated mixture was centrifuged at 14,000 g at 4°C for 10 min, and the supernatant was then precleared with protein A/G magnetic beads (Thermo 26162) for 3 h at 4°C with slow rotation. Antibodies (CTCF: Millipore 07–729, Rad21: Abcam ab992, V5: Abcam ab15828) or anti-FLAG antibody-conjugated magnetic beads (Sigma M8823) were added to the precleared solution and incubated at 4°C overnight with slow rotation to precipitate the CTCF or cohesin protein-DNA complex.
Protein A/G beads were added and incubated for 3 h at 4°C with slow rotation to capture the antibody-protein-DNA complex. The ChIP buffer, high salt buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 0.4 M NaCl), no salt buffer (high salt buffer without NaCl), LiCl buffer (50 mM HEPES pH 7.5, 1 mM EDTA, 1% NP-40, 0.7% sodium deoxycholate, 0.5 M LiCl), and 10 mM Tris-HCl buffer (pH 7.5) were used sequentially to wash the beads. The elution buffer (50 mM Tris-HCl pH 8.0, 10 mM EDTA, 1% SDS) was then used to elute ChIP DNA from the beads at 65°C for 1 h with shaking at 1,000 rpm. The eluted protein-DNA complex was reverse-crosslinked at 65°C overnight with shaking at 1,000 rpm to dissociate DNA. Finally, proteinase K (NEB) was added and incubated for 2 h at 55°C to digest proteins, and RNase A (Thermo) was added and incubated for 2 h at 37°C to digest RNA.
The DNA was purified by adding an equal volume of phenol-chloroform and mixing by vigorous shaking. The mixture was centrifuged at 14,000 g at 4°C for 10 min to separate the proteins and DNA. The supernatant containing DNA was transferred to a new tube. A 2.5-fold volume of ice-cold ethanol, 1/10 volume of 3 M NaAc (pH 5.2), and 1.5 μL of glycogen (Thermo) were added to precipitate DNA at −80°C for 1 h. The sample was centrifuged at 14,000 g at 4°C for 30 min to pellet DNA. The DNA pellet was washed with 70% ethanol, and the sample was centrifuged at 14,000 g at 4°C for 10 min. The supernatant was removed and the pelleted DNA was air-dried for 5 min to remove residual ethanol. The DNA was then dissolved in nuclease-free water. The concentration of DNA was measured using Qubit (Invitrogen).
To prepare DNA library for deep sequencing, we used the Universal DNA Library Prep Kit (Vazyme ND606). Briefly, DNA was first end-repaired and ligated to adapters from the Multiplex Oligos Set (Vazyme, N321). Adapter-ligated DNA was then purified using AMPure XP beads (Beckman) and the final ChIP library was amplified by PCR. The library was sequenced on an Illumina NovaSeq platform. All of the ChIP-seq experiments were performed with at least two biological replicates.
ChIP-qPCR
The chromatin immunoprecipitation followed by quantitative PCR (ChIP-qPCR) experiments were performed as described before.14 ChIP steps were the same as in the ChIP-seq experiments described above. PCR primers were designed around CTCF binding sites (Table S7). qPCR was then carried out on the ABI QS6 platform using the SYBR qPCR master mix.
Western blot
1 × 10⁶ cells were collected and lysed in RIPA buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1% Triton X-100, 1% sodium deoxycholate, 0.1% SDS, and 1× protease inhibitors) on ice for 30 min. The sample was sonicated and centrifuged at 14,000 g for 15 min at 4°C to remove cell debris. The protein concentration of the supernatant was measured using the BCA protein assay kit (Beyotime). The supernatant proteins were mixed with 5× SDS protein loading buffer containing DTT and incubated at 95°C for 10 min to denature the proteins. The denatured protein mixture was centrifuged at 14,000 g for 15 min at room temperature and separated by SDS-PAGE.
The separated proteins in the gel were transferred to a nitrocellulose membrane by electrophoretic transfer. After transfer, the membrane was blocked with 5% non-fat milk in PBST for 2 h at room temperature and then washed 3 times with PBST. The corresponding antibodies were incubated with the membrane overnight at 4°C with slow shaking. The membrane was washed 3 times with PBST. The secondary antibody (Invitrogen) was then incubated with the membrane for 90 min at room temperature with slow shaking. Finally, the membrane was washed 3 times with PBST and scanned with the Odyssey System (LI-COR Biosciences). The band intensities were measured with ImageJ software.
Recombinant CTCF protein production
Recombinant CTCF mutant proteins were prepared using rabbit reticulocyte lysate (Promega L1170) as previously described.17 First, we linearized the empty pTNT plasmid with EcoRI and NotI (NEB). We then amplified CTCF fragments carrying different deletions of the C-terminal domain from cDNA by PCR. The PCR products containing EcoRI and NotI restriction endonuclease sites were gel-purified and digested with EcoRI and NotI. The digested DNAs were ligated into linearized pTNT vectors using T4 DNA ligase. The plasmids were used as templates to express CTCF mutants in the rabbit reticulocyte lysate at 30°C for 90 min. The recombinant CTCF proteins were finally analyzed by Western blots.
Electrophoretic mobility shift assay (EMSA)
The EMSA experiments were performed as described before.17 We first cloned the probe sequences containing CBS elements from genomic DNA. The DNAs were then gel-purified and ligated into a T-vector. The sequences were verified by Sanger sequencing. We finally performed PCR with a high-fidelity DNA polymerase and 5′ biotin-labeled primers, using the plasmids as templates, and gel-purified the PCR products as EMSA probes. The concentrations of probes were measured by NanoDrop. We performed the EMSA experiments using LightShift Chemiluminescent EMSA reagents (Thermo) according to the manufacturer's manual. Briefly, equal amounts of protein were incubated in the binding buffer (containing 10 mM Tris, 50 mM KCl, 5 mM MgCl2, 0.1 mM ZnSO4, 1 mM dithiothreitol, 0.1% (v/v) Nonidet P-40 (NP-40), 50 ng/μL poly (dI-dC), and 2.5% (v/v) glycerol) on ice for 20 min to reduce the background. We added the same amounts of probes to the binding buffer and then incubated at room temperature for 30 min. The binding mixtures were then electrophoresed on 5% non-denaturing polyacrylamide gels in ice-cold 0.5× TBE buffer (45 mM Tris-borate, 1 mM EDTA, pH 8.0), and the gels were then transferred to nylon membranes. The membranes were cross-linked under UV for 12 min, incubated in the blocking buffer for 20 min, and treated with streptavidin-horseradish peroxidase conjugate for 20 min. We washed the membranes 4 times with the washing buffer and imaged them by chemiluminescence using the ChemiDoc XRS+ system (Bio-Rad). The intensity of bands was measured with ImageJ software. All of the EMSA experiments were performed with at least two biological replicates.
CRISPR genome editing and single-cell cloning
CRISPR genome editing was performed as described before.50,51 Cas9 and sgRNA plasmids and donor constructs were transfected into cells in 12-well plates at 70% confluency using Lipofectamine 3000 (Invitrogen). Medium containing Lipofectamine 3000 and plasmid DNA was replaced with fresh medium after 6 h. At 48 h after transfection, the transfected cells were incubated with medium containing 2 μg/mL puromycin, which was refreshed daily. After 4 days, the cells were allowed to recover for 2 days in fresh medium without puromycin. The cells were then dissociated with trypsin and seeded into 96-well plates at a density of about one cell per well. After incubation for about 1 week in 96-well plates, the seeded cells were checked under a microscope and single-cell clones were marked manually.
PCR genotyping was performed when seeded single-cell clones reached ∼80% confluence in 96-well plates. The cells were digested using trypsin and PCR-genotyped with a pair of primers matching sequences outside of the homologous arms (Table S7). The positive PCR products were purified and sequenced by Sanger sequencing for confirmation. The identified single-cell clones were incubated for another 4–5 passages and confirmed again by PCR genotyping and Sanger sequencing.
Given that the CTCF antibody used in ChIP-seq recognizes 659–675 AAs of CTCF, we added a FLAG tag in ΔCTD and ΔCT116 cell lines via homologous recombination. We also inserted a FLAG tag in WT cells as an experimental control. For ΔCTD cells, we genotyped 191 single-cell clones and found 3 single-cell clones with CTD deletion. For WT-FLAG cells, we genotyped 160 single-cell clones and found 4 of them with FLAG tag insertion. For ΔCT116 cells, we genotyped 192 single-cell clones and found 8 of them with CT116 deletion. For ΔNCR cells, we genotyped 240 single-cell clones and found 6 of them with NCR deletion. For each genotype, we used two independent single-cell clones for subsequent experiments.
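For a quick sense of the screening yields, the clone counts reported above can be converted to percentages. The short Python sketch below only performs this arithmetic on the numbers stated in the text; the variable and function names are illustrative.

```python
# Positive clones and total genotyped clones, as reported in the text.
SCREENS = {
    "ΔCTD": (3, 191),
    "WT-FLAG": (4, 160),
    "ΔCT116": (8, 192),
    "ΔNCR": (6, 240),
}


def efficiency_percent(positive: int, screened: int) -> float:
    """Percentage of genotyped single-cell clones carrying the intended edit."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return 100.0 * positive / screened


if __name__ == "__main__":
    for name, (positive, screened) in SCREENS.items():
        pct = efficiency_percent(positive, screened)
        print(f"{name}: {positive}/{screened} = {pct:.1f}%")
```

By this calculation, the yields range from roughly 1.6% (ΔCTD) to roughly 4.2% (ΔCT116).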
Lentivirus packaging
Lentiviruses were used to build stable transgenic cell lines expressing different CTCF rescue constructs in ΔCTD cells. pLVX plasmids containing truncated CTCF sequences and a puromycin selection marker, together with helper plasmids (psPAX2, pMD2.G, Addgene), were transfected into HEK293T cells in 6-well plates at 70% confluence using Lipofectamine 3000 reagents. The medium containing Lipofectamine and plasmid DNA was replaced with fresh medium after 6 h. The medium containing lentivirus was first harvested 48 h after transfection, replaced with fresh medium, and harvested again after another 24 h. The harvested medium was centrifuged at 1,300 rpm at 4°C for 5 min to remove cells and then passed through a 0.45 μm filter (Merck), and the virus was collected and stored at −80°C.
Lentivirus infection and stable cell line establishment
All lentivirus particles expressing different truncated CTCF constructs were thawed on ice and used to infect ΔCTD cells in the presence of polybrene at 8 μg/mL. Two days after infection, the infected cells were incubated for 4 days with medium containing 2 μg/mL puromycin, which was replaced daily. After selection, cells were incubated with fresh medium without puromycin for 2 days for recovery. The cells were then cultured for another 12 days and harvested for Western blot and ChIP-seq experiments.
RNA-seq
The RNA-seq experiment was performed as previously described.17 Briefly, about 1 × 10⁶ cells in 6-well plates were collected and washed 3 times with ice-cold PBS. One mL of TRIzol (Invitrogen) was added to the cells and mixed thoroughly. After incubation for 5 min at room temperature, 0.2 mL of chloroform was added and mixed well by hand. The sample was then incubated for 2 min at room temperature and centrifuged at 12,000 rpm for 10 min at 4°C. The supernatant was transferred to a new tube and mixed with an equal volume of isopropanol. The sample was incubated at room temperature for 10 min and centrifuged at 12,000 rpm for 15 min at 4°C to pellet RNA. After removing the supernatant, 1 mL of 75% ethanol was added to wash the RNA. Finally, RNA pellets were dissolved in nuclease-free water. The concentration of the total RNA was measured by NanoDrop (Thermo).
Oligo (dT)-coupled magnetic beads (Vazyme) were used to purify mRNA from total RNA. The mRNA library preparation was performed using the Universal V6 RNA-seq Library Prep Kit (Vazyme) according to the manual. Briefly, mRNA was fragmented by heating at 94°C for 8 min. Next, reverse transcription for the first-strand cDNA was performed and then the double-strand cDNA was synthesized. Adapters from the RNA Multiplex Oligos Set (Vazyme) were added to cDNA and the adapter-ligated cDNA was purified by AMPure XP beads (Beckman). Finally, the library was amplified by PCR and sequenced on an Illumina NovaSeq platform. All of the RNA-seq experiments were performed with at least two biological replicates.
ATAC-seq
ATAC-seq was performed as previously described.59 Briefly, cells in 12-well plates at 80% confluence were digested with trypsin and collected by centrifugation at 500 rpm at room temperature for 5 min. 50,000 cells were collected and washed twice with PBS. The cells were then washed twice with ice-cold buffer by centrifugation at 2,300 rpm at 4°C for 5 min. The cells were then lysed in freshly prepared lysis buffer containing NP-40, Tween 20, and digitonin on ice for 5 min and centrifuged for 10 min at 2,300 rpm at 4°C to collect the cell nuclei.
Chromatin was fragmented with Tn5 transposase loaded with adaptors at 37°C for 30 min. The fragmentation reaction was terminated by adding the stop buffer at room temperature for 5 min. The fragmented DNA was purified with ATAC DNA Extract Beads (Vazyme). Finally, PCR was performed to amplify the library using primers from the TruePrep Index Kit V2 (Vazyme), and the library was sequenced on an Illumina NovaSeq platform. All of the ATAC-seq experiments were performed with at least two biological replicates.
QHR-4C
Quantitative high-resolution circularized chromosome conformation capture (QHR-4C) experiments were performed as previously described.19 Briefly, 1 × 10⁶ cells were collected and crosslinked with 2% formaldehyde at room temperature for 10 min with slow rotation. The crosslinking reaction was quenched by adding 2 M glycine to a final concentration of 200 mM. The crosslinked cells were washed twice with ice-cold PBS by centrifugation for 5 min at 800 g at 4°C and then lysed twice in ice-cold lysis buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 5 mM EDTA, 0.5% NP-40, 1% Triton X-100, and 1× protease inhibitors) for 10 min at 4°C with slow rotation. The cell nuclei were collected by centrifugation at 800 g at 4°C for 5 min. The pelleted nuclei were resuspended in 73 μL of nuclease-free water, 10 μL of DpnII buffer (NEB), and 2.5 μL of 10% SDS and incubated for 1 h at 37°C with shaking at 900 rpm. 12.5 μL of 20% Triton X-100 was then added, and the sample was incubated for 1 h at 37°C with shaking at 900 rpm. 2 μL of DpnII (NEB) was added to digest the chromatin overnight at 37°C with shaking at 900 rpm, and the enzyme was inactivated at 65°C for 20 min.
The nuclei were then collected through centrifuging at 1,000 g for 1 min. The supernatant was carefully removed. The proximal ligation was performed with resuspended nuclei by adding 100 μL of T4 ligation buffer (NEB) containing 1 μL of T4 ligase and incubated at 16°C for 24 h. 1 μL of proteinase K was then added to the ligation mixture to digest the proteins and incubated for 4 h at 65°C for reverse crosslinking. The DNA was purified by phenol-chloroform as described above in ChIP-seq. Finally, the purified DNA was dissolved in 50 μL nuclease-free water and sonicated with the Bioruptor Sonicator (with the low energy setting by a train of 30-s sonication with 30-s interval for 12 cycles) to fragment DNA to sizes of 200–600 bp.
The PCR amplification was performed using 5′ biotin-labeled primers (Table S7) to capture DNA anchored at the HS5-1 enhancer. To maximize the PCR product, a 100-μL reaction volume and 60 cycles of PCR were used. The PCR product was incubated at 95°C for 5 min and immediately chilled on ice to obtain single-strand DNA (ssDNA). The biotin-labeled ssDNA was collected by incubating with Streptavidin Beads (Invitrogen) for 2 h at room temperature and the beads were washed twice with the washing buffer (5 mM Tris-HCl pH 7.5, 1 M NaCl, 0.5 mM EDTA).
To prepare library for deep sequencing, adapters containing sequences that match the 3′ end of the Illumina P7 sequence were ligated to ssDNA at 16°C for 24 h. The adapters were generated through annealing of two complementary primers (Table S7) in the annealing buffer (25 mM NaCl, 10 mM Tris-HCl pH 7.5, 0.5 mM EDTA). The beads with ssDNA-adapter were washed twice by the washing buffer. Finally, the library was amplified through PCR with two primers. The forward primer contains the Illumina P5 sequences and the sequences adjacent to the HS5-1 anchor. The reverse primer contains the Illumina P7 sequences and indexes. The amplified library was sequenced on an Illumina NovaSeq platform. All of the QHR-4C experiments were performed with at least two biological replicates.
Data analysis of ChIP-seq
Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 The MarkDuplicates module of Picard tools (http://broadinstitute.github.io/picard/) was used to remove duplicates, and Samtools81 was used to sort and index BAM files. ChIP-seq peaks were called by MACS276 with default parameters. The read counts were normalized to reads per kilobase per million mapped reads (RPKM) using the bamCoverage module of Deeptools80 with a bin size of 20 bp. The plotHeatmap module of Deeptools was used to generate heatmaps. Normalized read counts were converted to bedGraph format for visualization in the UCSC genome browser. Differential binding analyses were performed using DiffBind with default parameters.85 Motif analyses of CTCF were performed using the MEME suite77 and FIMO.86
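To make the RPKM normalization concrete, the following Python sketch computes the value for a single bin from its read count, the bin size, and the total number of mapped reads. It is only an illustration of the standard per-bin definition (reads per bin divided by bin length in kb and by mapped reads in millions), not a reimplementation of bamCoverage; the function name and the example numbers are hypothetical.

```python
def rpkm(reads_in_bin: float, bin_size_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads for one genomic bin.

    RPKM = reads_in_bin / (bin length in kb * mapped reads in millions),
    written here as a single expression to limit floating-point error.
    """
    if bin_size_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("bin_size_bp and total_mapped_reads must be positive")
    return reads_in_bin * 1e9 / (bin_size_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical example: 10 reads in a 20-bp bin, 20 million mapped reads.
    print(rpkm(10, 20, 20_000_000))
```

Because the bin is only 20 bp (0.02 kb), even modest read counts translate into comparatively large RPKM values, which is worth keeping in mind when comparing tracks generated with different bin sizes.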
Data analysis of RNA-seq
Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using STAR78 with default parameters and the FPKMs were calculated using Cufflinks.79 Differential analysis of gene expression was performed using DESeq2.84 Volcano plots were generated by ggplot2. Distances from TSSs to the nearest CTCF peaks were calculated by Bedtools.82
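The TSS-to-nearest-peak calculation can be illustrated with a small Python sketch. It is a simplified stand-in for the Bedtools step, assuming each CTCF peak is reduced to a single coordinate (for example, its summit) and each gene to a single TSS position; the function names and the example coordinates are hypothetical.

```python
from bisect import bisect_left


def nearest_distance(position, sorted_points):
    """Distance from `position` to the closest value in a sorted list, or None if empty."""
    if not sorted_points:
        return None
    i = bisect_left(sorted_points, position)
    candidates = []
    if i < len(sorted_points):
        candidates.append(sorted_points[i] - position)  # closest point at or to the right
    if i > 0:
        candidates.append(position - sorted_points[i - 1])  # closest point to the left
    return min(candidates)


def tss_to_nearest_peak(tss_by_gene, peaks_by_chrom):
    """Map each gene to the distance between its TSS and the nearest peak on the same chromosome."""
    sorted_peaks = {chrom: sorted(points) for chrom, points in peaks_by_chrom.items()}
    return {
        gene: nearest_distance(pos, sorted_peaks.get(chrom, []))
        for gene, (chrom, pos) in tss_by_gene.items()
    }


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    peaks = {"chr1": [9000, 1000, 5000]}
    tss = {"geneA": ("chr1", 4200), "geneB": ("chr1", 9500), "geneC": ("chr2", 100)}
    print(tss_to_nearest_peak(tss, peaks))
```

Genes on chromosomes without any peak are reported as `None` rather than being assigned an arbitrary distance.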
Data analysis of ATAC-seq
Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 ATAC-seq peaks were called by MACS2.76 Differentially accessible regions (DARs) were identified using DESeq2.84 ATAC-seq peak annotation was performed with ChIPseeker.87 Bedtools was used to analyze the overlap between DARs and CBS elements and to calculate the distance from DARs to the nearest CBS elements.82
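The overlap analysis between DARs and CBS elements amounts to an interval intersection test. The Python sketch below shows the idea under the assumption of BED-style half-open intervals (chromosome, start, end); it uses a simple quadratic scan for clarity rather than the optimized approach of Bedtools, and all names and coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_overlapping(dars, cbs_elements):
    """Number of DARs that overlap at least one CBS element."""
    return sum(any(overlaps(dar, cbs) for cbs in cbs_elements) for dar in dars)


if __name__ == "__main__":
    # Hypothetical intervals for illustration only.
    dars = [("chr5", 100, 200), ("chr5", 500, 600), ("chr6", 100, 200)]
    cbs = [("chr5", 150, 170), ("chr5", 600, 620)]
    print(count_overlapping(dars, cbs), "of", len(dars), "DARs overlap a CBS element")
```

With half-open coordinates, intervals that merely touch (one ends at 600, the other starts at 600) are not counted as overlapping.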
Data analysis of QHR-4C
Raw FASTQ files were aligned to the human reference genome (GRCh37/hg19) using Bowtie2.75 Reads were normalized by r3Cseq program (version 1.20).83
Quantification and statistical analysis
All statistical tests were performed with R scripts. Statistical significance values were calculated using the Student’s t test. Significance is indicated as follows: ∗, p < 0.05; ∗∗, p < 0.01; ∗∗∗, p < 0.001; ∗∗∗∗, p < 0.0001.
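The significance notation can be expressed as a small lookup function. The Python sketch below is only an illustration of the thresholds stated above (the analyses in this study were run with R scripts); the function name, the use of ASCII asterisks, and the "ns" label for non-significant values are assumptions.

```python
def significance_stars(p: float) -> str:
    """Map a p value to the asterisk notation used in the figures."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumed label; the text does not define one


if __name__ == "__main__":
    for p in (0.2, 0.03, 0.004, 0.0005, 0.00002):
        print(p, significance_stars(p))
```

The thresholds are strict inequalities, so a p value exactly equal to a cutoff falls into the less significant category.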
Published: November 22, 2024
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.isci.2024.111452.
References
- 1.Klenova E.M., Nicolas R.H., Paterson H.F., Carne A.F., Heath C.M., Goodwin G.H., Neiman P.E., Lobanenkov V.V. CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol. Cell Biol. 1993;13:7612–7624. doi: 10.1128/mcb.13.12.7612-7624.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Heger P., Marin B., Bartkuhn M., Schierenberg E., Wiehe T. The chromatin insulator CTCF and the emergence of metazoan diversity. Proc. Natl. Acad. Sci. USA. 2012;109:17507–17512. doi: 10.1073/pnas.1111941109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Rowley M.J., Corces V.G. Organizational principles of 3D genome architecture. Nat. Rev. Genet. 2018;19:789–800. doi: 10.1038/s41576-018-0060-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Li Y., Haarhuis J.H.I., Sedeño Cacciatore Á., Oldenkamp R., van Ruiten M.S., Willems L., Teunissen H., Muir K.W., de Wit E., Rowland B.D., Panne D. The structural basis for cohesin-CTCF-anchored loops. Nature. 2020;578:472–476. doi: 10.1038/s41586-019-1910-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Wu Q., Liu P., Wang L. Many facades of CTCF unified by its coding for three-dimensional genome architecture. J. Genet. Genom. 2020;47:407–424. doi: 10.1016/j.jgg.2020.06.008. [DOI] [PubMed] [Google Scholar]
- 6.Parelho V., Hadjur S., Spivakov M., Leleu M., Sauer S., Gregson H.C., Jarmuz A., Canzonetta C., Webster Z., Nesterova T., et al. Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell. 2008;132:422–433. doi: 10.1016/j.cell.2008.01.011. [DOI] [PubMed] [Google Scholar]
- 7.Rubio E.D., Reiss D.J., Welcsh P.L., Disteche C.M., Filippova G.N., Baliga N.S., Aebersold R., Ranish J.A., Krumm A. CTCF physically links cohesin to chromatin. Proc. Natl. Acad. Sci. USA. 2008;105:8309–8314. doi: 10.1073/pnas.0801273105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Wendt K.S., Yoshida K., Itoh T., Bando M., Koch B., Schirghuber E., Tsutsumi S., Nagae G., Ishihara K., Mishiro T., et al. Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature. 2008;451:796–801. doi: 10.1038/nature06634. [DOI] [PubMed] [Google Scholar]
- 9.Zuin J., Dixon J.R., van der Reijden M.I.J.A., Ye Z., Kolovos P., Brouwer R.W.W., van de Corput M.P.C., van de Werken H.J.G., Knoch T.A., van IJcken W.F.J., et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc. Natl. Acad. Sci. USA. 2014;111:996–1001. doi: 10.1073/pnas.1317788111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Sanborn A.L., Rao S.S.P., Huang S.C., Durand N.C., Huntley M.H., Jewett A.I., Bochkov I.D., Chinnappan D., Cutkosky A., Li J., et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. USA. 2015;112:E6456–E6465. doi: 10.1073/pnas.1518552112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Fudenberg G., Imakaev M., Lu C., Goloborodko A., Abdennur N., Mirny L.A. Formation of Chromosomal Domains by Loop Extrusion. Cell Rep. 2016;15:2038–2049. doi: 10.1016/j.celrep.2016.04.085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.van Ruiten M.S., Rowland B.D. On the choreography of genome folding: A grand pas de deux of cohesin and CTCF. Curr. Opin. Cell Biol. 2021;70:84–90. doi: 10.1016/j.ceb.2020.12.001. [DOI] [PubMed] [Google Scholar]
- 13.Popay T.M., Dixon J.R. Coming full circle: On the origin and evolution of the looping model for enhancer-promoter communication. J. Biol. Chem. 2022;298 doi: 10.1016/j.jbc.2022.102117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Guo Y., Monahan K., Wu H., Gertz J., Varley K.E., Li W., Myers R.M., Maniatis T., Wu Q. CTCF/cohesin-mediated DNA looping is required for protocadherin α promoter choice. Proc. Natl. Acad. Sci. USA. 2012;109:21081–21086. doi: 10.1073/pnas.1219280110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Rao S.S.P., Huntley M.H., Durand N.C., Stamenova E.K., Bochkov I.D., Robinson J.T., Sanborn A.L., Machol I., Omer A.D., Lander E.S., Aiden E.L. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.de Wit E., Vos E.S.M., Holwerda S.J.B., Valdes-Quezada C., Verstegen M.J.A.M., Teunissen H., Splinter E., Wijchers P.J., Krijger P.H.L., de Laat W. CTCF Binding Polarity Determines Chromatin Looping. Mol. Cell. 2015;60:676–684. doi: 10.1016/j.molcel.2015.09.023. [DOI] [PubMed] [Google Scholar]
- 17.Guo Y., Xu Q., Canzio D., Shou J., Li J., Gorkin D.U., Jung I., Wu H., Zhai Y., Tang Y., et al. CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell. 2015;162:900–910. doi: 10.1016/j.cell.2015.07.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Canzio D., Nwakeze C.L., Horta A., Rajkumar S.M., Coffey E.L., Duffy E.E., Duffié R., Monahan K., O'Keeffe S., Simon M.D., et al. Antisense lncRNA Transcription Mediates DNA Demethylation to Drive Stochastic Protocadherin α Promoter Choice. Cell. 2019;177:639–653.e15. doi: 10.1016/j.cell.2019.03.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Jia Z., Li J., Ge X., Wu Y., Guo Y., Wu Q. Tandem CTCF sites function as insulators to balance spatial chromatin contacts and topological enhancer-promoter selection. Genome Biol. 2020;21 doi: 10.1186/s13059-020-01984-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Liang Z., Zhao L., Ye A.Y., Lin S.G., Zhang Y., Guo C., Dai H.Q., Ba Z., Alt F.W. Contribution of the IGCR1 regulatory element and the 3'Igh CTCF- binding elements to regulation of Igh V(D)J recombination. Proc. Natl. Acad. Sci. USA. 2023;120 doi: 10.1073/pnas.2306564120. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Weintraub A.S., Li C.H., Zamudio A.V., Sigova A.A., Hannett N.M., Day D.S., Abraham B.J., Cohen M.A., Nabet B., Buckley D.L., et al. YY1 Is a Structural Regulator of Enhancer-Promoter Loops. Cell. 2017;171:1573–1588.e28. doi: 10.1016/j.cell.2017.11.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Ortabozkoyun H., Huang P.Y., Gonzalez-Buendia E., Cho H., Kim S.Y., Tsirigos A., Mazzoni E.O., Reinberg D. Members of an array of zinc-finger proteins specify distinct Hox chromatin boundaries. Mol. Cell. 2024;84:3406–3422.e6. doi: 10.1016/j.molcel.2024.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Wu Q., Maniatis T. A striking organization of a large family of human neural cadherin-like cell adhesion genes. Cell. 1999;97:779–790. doi: 10.1016/S0092-8674(00)80789-8. [DOI] [PubMed] [Google Scholar]
- 24.Zipursky S.L., Sanes J.R. Chemoaffinity Revisited: Dscams, Protocadherins, and Neural Circuit Assembly. Cell. 2010;143:343–353. doi: 10.1016/j.cell.2010.10.009. [DOI] [PubMed] [Google Scholar]
- 25.Dong H., Li J., Wu Q., Jin Y. Confluence and convergence of Dscam and Pcdh codes. Trends Biochem. Sci. 2023;48:1044–1057. doi: 10.1016/j.tibs.2023.09.001. [DOI] [PubMed] [Google Scholar]
- 26.Tasic B., Nabholz C.E., Baldwin K.K., Kim Y., Rueckert E.H., Ribich S.A., Cramer P., Wu Q., Axel R., Maniatis T. Promoter choice determines splice site selection in protocadherin α and -γ pre-mRNA splicing. Mol. Cell. 2002;10:21–33. doi: 10.1016/S1097-2765(02)00578-6. [DOI] [PubMed] [Google Scholar]
- 27.Monahan K., Rudnick N.D., Kehayova P.D., Pauli F., Newberry K.M., Myers R.M., Maniatis T. Role of CCCTC binding factor (CTCF) and cohesin in the generation of single-cell diversity of Protocadherin-α gene expression. Proc. Natl. Acad. Sci. USA. 2012;109:9125–9130. doi: 10.1073/pnas.1205074109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Allahyar A., Vermeulen C., Bouwman B.A.M., Krijger P.H.L., Verstegen M.J.A.M., Geeven G., van Kranenburg M., Pieterse M., Straver R., Haarhuis J.H.I., et al. Enhancer hubs and loop collisions identified from single-allele topologies. Nat. Genet. 2018;50:1151–1160. doi: 10.1038/s41588-018-0161-5.
- 29.Lv X., Li S., Li J., Yu X.Y., Ge X., Li B., Hu S., Lin Y., Zhang S., Yang J., et al. Patterned cPCDH expression regulates the fine organization of the neocortex. Nature. 2022;612:503–511. doi: 10.1038/s41586-022-05495-2.
- 30.Kiefer L., Chiosso A., Langen J., Buckley A., Gaudin S., Rajkumar S.M., Servito G.I.F., Cha E.S., Vijay A., Yeung A., et al. WAPL functions as a rheostat of Protocadherin isoform diversity that controls neural wiring. Science. 2023;380:eadf8440. doi: 10.1126/science.adf8440.
- 31.Martinez S.R., Miranda J.L. CTCF terminal segments are unstructured. Protein Sci. 2010;19:1110–1116. doi: 10.1002/pro.367.
- 32.Bonchuk A., Kamalyan S., Mariasina S., Boyko K., Popov V., Maksimenko O., Georgiev P. N-terminal domain of the architectural protein CTCF has similar structural organization and ability to self-association in bilaterian organisms. Sci. Rep. 2020;10 doi: 10.1038/s41598-020-59459-5.
- 33.Nakahashi H., Kieffer Kwon K.R., Resch W., Vian L., Dose M., Stavreva D., Hakim O., Pruett N., Nelson S., Yamane A., et al. A genome-wide map of CTCF multivalency redefines the CTCF code. Cell Rep. 2013;3:1678–1689. doi: 10.1016/j.celrep.2013.04.024.
- 34.Hashimoto H., Wang D., Horton J.R., Zhang X., Corces V.G., Cheng X. Structural Basis for the Versatile and Methylation-Dependent Binding of CTCF to DNA. Mol. Cell. 2017;66:711–720.e3. doi: 10.1016/j.molcel.2017.05.004.
- 35.Yin M., Wang J., Wang M., Li X., Zhang M., Wu Q., Wang Y. Molecular mechanism of directional CTCF recognition of a diverse range of genomic sites. Cell Res. 2017;27:1365–1377. doi: 10.1038/cr.2017.131.
- 36.Xu D., Ma R., Zhang J., Liu Z., Wu B., Peng J., Zhai Y., Gong Q., Shi Y., Wu J., et al. Dynamic nature of CTCF tandem 11 zinc fingers in multivalent recognition of DNA as revealed by NMR spectroscopy. J. Phys. Chem. Lett. 2018;9:4020–4028. doi: 10.1021/acs.jpclett.8b01440.
- 37.Soochit W., Sleutels F., Stik G., Bartkuhn M., Basu S., Hernandez S.C., Merzouk S., Vidal E., Boers R., Boers J., et al. CTCF chromatin residence time controls three-dimensional genome organization, gene expression and DNA methylation in pluripotent cells. Nat. Cell Biol. 2021;23:881–893. doi: 10.1038/s41556-021-00722-w.
- 38.Lebeau B., Zhao K., Jangal M., Zhao T., Guerra M., Greenwood C.M.T., Witcher M. Single base-pair resolution analysis of DNA binding motif with MoMotif reveals an oncogenic function of CTCF zinc-finger 1 mutation. Nucleic Acids Res. 2022;50:8441–8458. doi: 10.1093/nar/gkac658.
- 39.Hyle J., Djekidel M.N., Williams J., Wright S., Shao Y., Xu B., Li C. Auxin-inducible degron 2 system deciphers functions of CTCF domains in transcriptional regulation. Genome Biol. 2023;24:14. doi: 10.1186/s13059-022-02843-3.
- 40.Yang J., Horton J.R., Liu B., Corces V.G., Blumenthal R.M., Zhang X., Cheng X. Structures of CTCF-DNA complexes including all 11 zinc fingers. Nucleic Acids Res. 2023;51:8447–8462. doi: 10.1093/nar/gkad594.
- 41.Saldaña-Meyer R., Rodriguez-Hernaez J., Escobar T., Nishana M., Jácome-López K., Nora E.P., Bruneau B.G., Tsirigos A., Furlan-Magaril M., Skok J., Reinberg D. RNA Interactions Are Essential for CTCF-Mediated Genome Organization. Mol. Cell. 2019;76:412–422.e5. doi: 10.1016/j.molcel.2019.08.015.
- 42.Hansen A.S., Hsieh T.H.S., Cattoglio C., Pustova I., Saldaña-Meyer R., Reinberg D., Darzacq X., Tjian R. Distinct Classes of Chromatin Loops Revealed by Deletion of an RNA-Binding Region in CTCF. Mol. Cell. 2019;76:395–411.e13. doi: 10.1016/j.molcel.2019.07.039.
- 43.Nishana M., Ha C., Rodriguez-Hernaez J., Ranjbaran A., Chio E., Nora E.P., Badri S.B., Kloetgen A., Bruneau B.G., Tsirigos A., Skok J.A. Defining the relative and combined contribution of CTCF and CTCFL to genomic regulation. Genome Biol. 2020;21:108. doi: 10.1186/s13059-020-02024-0.
- 44.Nora E.P., Caccianini L., Fudenberg G., So K., Kameswaran V., Nagle A., Uebersohn A., Hajj B., Saux A.L., Coulon A., et al. Molecular basis of CTCF binding polarity in genome folding. Nat. Commun. 2020;11:5612. doi: 10.1038/s41467-020-19283-x.
- 45.Pugacheva E.M., Kubo N., Loukinov D., Tajmul M., Kang S., Kovalchuk A.L., Strunnikov A.V., Zentner G.E., Ren B., Lobanenkov V.V. CTCF mediates chromatin looping via N-terminal domain-dependent cohesin retention. Proc. Natl. Acad. Sci. USA. 2020;117:2020–2031. doi: 10.1073/pnas.1911708117.
- 46.Hansen A.S. CTCF as a boundary factor for cohesin-mediated loop extrusion: evidence for a multi-step mechanism. Nucleus. 2020;11:132–148. doi: 10.1080/19491034.2020.1782024.
- 47.Hu G., Katuwawala A., Wang K., Wu Z., Ghadermarzi S., Gao J., Kurgan L. flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat. Commun. 2021;12:4438. doi: 10.1038/s41467-021-24773-7.
- 48.Moore J.M., Rabaia N.A., Smith L.E., Fagerlie S., Gurley K., Loukinov D., Disteche C.M., Collins S.J., Kemp C.J., Lobanenkov V.V., Filippova G.N. Loss of maternal CTCF is associated with peri-implantation lethality of null embryos. PLoS One. 2012;7 doi: 10.1371/journal.pone.0034915.
- 49.González-Buendía E., Pérez-Molina R., Ayala-Ortega E., Guerrero G., Recillas-Targa F. In: Cancer Cell Signaling: Methods and Protocols. Robles-Flores M., editor. Springer; 2014. Experimental Strategies to Manipulate the Cellular Levels of the Multifunctional Factor CTCF; pp. 53–69.
- 50.Li J., Shou J., Guo Y., Tang Y., Wu Y., Jia Z., Zhai Y., Chen Z., Xu Q., Wu Q. Efficient inversions and duplications of mammalian regulatory DNA elements and gene clusters by CRISPR/Cas9. J. Mol. Cell Biol. 2015;7:284–298. doi: 10.1093/jmcb/mjv016.
- 51.Shou J., Li J., Liu Y., Wu Q. Precise and Predictable CRISPR Chromosomal Rearrangements Reveal Principles of Cas9-Mediated Nucleotide Insertion. Mol. Cell. 2018;71:498–509.e4. doi: 10.1016/j.molcel.2018.06.021.
- 52.Saldaña-Meyer R., González-Buendía E., Guerrero G., Narendra V., Bonasio R., Recillas-Targa F., Reinberg D. CTCF regulates the human p53 gene through direct interaction with its natural antisense transcript, Wrap53. Genes Dev. 2014;28:723–734. doi: 10.1101/gad.236869.113.
- 53.Abramson J., Adler J., Dunger J., Evans R., Green T., Pritzel A., Ronneberger O., Willmore L., Ballard A.J., Bambrick J., et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi: 10.1038/s41586-024-07487-w.
- 54.Lefebvre J.L., Kostadinov D., Chen W.V., Maniatis T., Sanes J.R. Protocadherins mediate dendritic self-avoidance in the mammalian nervous system. Nature. 2012;488:517–521. doi: 10.1038/nature11305.
- 55.Thu C.A., Chen W.V., Rubinstein R., Chevee M., Wolcott H.N., Felsovalyi K.O., Tapia J.C., Shapiro L., Honig B., Maniatis T. Single-cell identity generated by combinatorial homophilic interactions between alpha, beta, and gamma protocadherins. Cell. 2014;158:1045–1059. doi: 10.1016/j.cell.2014.07.012.
- 56.Rubinstein R., Thu C.A., Goodman K.M., Wolcott H.N., Bahna F., Mannepalli S., Ahlsen G., Chevee M., Halim A., Clausen H., et al. Molecular logic of neuronal self-recognition through protocadherin domain interactions. Cell. 2015;163:629–642. doi: 10.1016/j.cell.2015.09.026.
- 57.Mountoufaris G., Chen W.V., Hirabayashi Y., O'Keeffe S., Chevee M., Nwakeze C.L., Polleux F., Maniatis T. Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. Science. 2017;356:411–414. doi: 10.1126/science.aai8801.
- 58.Zhou Y., Xu S., Zhang M., Wu Q. Systematic functional characterization of antisense eRNA of protocadherin composite enhancer. Genes Dev. 2021;35:1383–1394. doi: 10.1101/gad.348621.121.
- 59.Ge X., Huang H., Han K., Xu W., Wang Z., Wu Q. Outward-oriented sites within clustered CTCF boundaries are key for intra-TAD chromatin interactions and gene regulation. Nat. Commun. 2023;14:8101. doi: 10.1038/s41467-023-43849-0.
- 60.Zhang M., Huang H., Li J., Wu Q. ZNF143 deletion alters enhancer/promoter looping and CTCF/cohesin geometry. Cell Rep. 2024;43 doi: 10.1016/j.celrep.2023.113663.
- 61.Huang H., Wu Q. Pushing the TAD boundary: Decoding insulator codes of clustered CTCF sites in 3D genomes. Bioessays. 2024;46 doi: 10.1002/bies.202400121.
- 62.Kung J.T., Kesner B., An J.Y., Ahn J.Y., Cifuentes-Rojas C., Colognori D., Jeon Y., Szanto A., del Rosario B.C., Pinter S.F., et al. Locus-specific targeting to the X chromosome revealed by the RNA interactome of CTCF. Mol. Cell. 2015;57:361–375. doi: 10.1016/j.molcel.2014.12.006.
- 63.Hansen A.S., Amitai A., Cattoglio C., Tjian R., Darzacq X. Guided nuclear exploration increases CTCF target search efficiency. Nat. Chem. Biol. 2020;16:257–266. doi: 10.1038/s41589-019-0422-3.
- 64.Hansen A.S., Pustova I., Cattoglio C., Tjian R., Darzacq X. CTCF and cohesin regulate chromatin loop stability with distinct dynamics. Elife. 2017;6 doi: 10.7554/eLife.25776.
- 65.Kaur G., Ren R., Hammel M., Horton J.R., Yang J., Cao Y., He C., Lan F., Lan X., Blobel G.A., et al. Allosteric autoregulation of DNA binding via a DNA-mimicking protein domain: a biophysical study of ZNF410-DNA interaction using small angle X-ray scattering. Nucleic Acids Res. 2023;51:1674–1686. doi: 10.1093/nar/gkac1274.
- 66.Pant V., Kurukuti S., Pugacheva E., Shamsuddin S., Mariano P., Renkawitz R., Klenova E., Lobanenkov V., Ohlsson R. Mutation of a single CTCF target site within the H19 imprinting control region leads to loss of Igf2 imprinting and complex patterns of de novo methylation upon maternal inheritance. Mol. Cell Biol. 2004;24:3497–3504. doi: 10.1128/Mcb.24.8.3497-3504.2004.
- 67.Kang H.S., Sánchez-Rico C., Ebersberger S., Sutandy F.X.R., Busch A., Welte T., Stehle R., Hipp C., Schulz L., Buchbender A., et al. An autoinhibitory intramolecular interaction proof-reads RNA recognition by the essential splicing factor U2AF2. Proc. Natl. Acad. Sci. USA. 2020;117:7140–7149. doi: 10.1073/pnas.1913483117.
- 68.Klenova E.M., Chernukhin I.V., El-Kady A., Lee R.E., Pugacheva E.M., Loukinov D.I., Goodwin G.H., Delgado D., Filippova G.N., León J., et al. Functional phosphorylation sites in the C-terminal region of the multivalent multifunctional transcriptional factor CTCF. Mol. Cell Biol. 2001;21:2221–2234. doi: 10.1128/mcb.21.6.2221-2234.2001.
- 69.Valverde de Morales H.G., Wang H.L.V., Garber K., Cheng X., Corces V.G., Li H. Expansion of the genotypic and phenotypic spectrum of CTCF-related disorder guides clinical management: 43 new subjects and a comprehensive literature review. Am. J. Med. Genet. 2023;191:718–729. doi: 10.1002/ajmg.a.63065.
- 70.Bailey M.H., Tokheim C., Porta-Pardo E., Sengupta S., Bertrand D., Weerasinghe A., Colaprico A., Wendl M.C., Kim J., Reardon B., et al. Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell. 2018;173:371–385.e18. doi: 10.1016/j.cell.2018.02.060.
- 71.Flaherty E., Maniatis T. The role of clustered protocadherins in neurodevelopment and neuropsychiatric diseases. Curr. Opin. Genet. Dev. 2020;65:144–150. doi: 10.1016/j.gde.2020.05.041.
- 72.Jia Z., Wu Q. Clustered Protocadherins Emerge as Novel Susceptibility Loci for Mental Disorders. Front. Neurosci. 2020;14 doi: 10.3389/fnins.2020.587819.
- 73.Lyons H., Veettil R.T., Pradhan P., Fornero C., De La Cruz N., Ito K., Eppert M., Roeder R.G., Sabari B.R. Functional partitioning of transcriptional regulators by patterned charge blocks. Cell. 2023;186:327–345.e28. doi: 10.1016/j.cell.2022.12.013.
- 74.Ahn J.H., Davis E.S., Daugird T.A., Zhao S., Quiroga I.Y., Uryu H., Li J., Storey A.J., Tsai Y.H., Keeley D.P., et al. Phase separation drives aberrant chromatin looping and cancer development. Nature. 2021;595:591–595. doi: 10.1038/s41586-021-03662-5.
- 75.Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/Nmeth.1923.
- 76.Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W., Liu X.S. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
- 77.Bailey T.L., Johnson J., Grant C.E., Noble W.S. The MEME Suite. Nucleic Acids Res. 2015;43:W39–W49. doi: 10.1093/nar/gkv416.
- 78.Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
- 79.Trapnell C., Williams B.A., Pertea G., Mortazavi A., Kwan G., van Baren M.J., Salzberg S.L., Wold B.J., Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 2010;28:511–515. doi: 10.1038/nbt.1621.
- 80.Ramírez F., Ryan D.P., Grüning B., Bhardwaj V., Kilpert F., Richter A.S., Heyne S., Dündar F., Manke T. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165. doi: 10.1093/nar/gkw257.
- 81.Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- 82.Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 83.Thongjuea S., Stadhouders R., Grosveld F.G., Soler E., Lenhard B. r3Cseq: an R/Bioconductor package for the discovery of long-range genomic interactions from chromosome conformation capture and next-generation sequencing data. Nucleic Acids Res. 2013;41 doi: 10.1093/nar/gkt373.
- 84.Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 85.Ross-Innes C.S., Stark R., Teschendorff A.E., Holmes K.A., Ali H.R., Dunning M.J., Brown G.D., Gojis O., Ellis I.O., Green A.R., et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature. 2012;481:389–393. doi: 10.1038/nature10730.
- 86.Grant C.E., Bailey T.L., Noble W.S. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011;27:1017–1018. doi: 10.1093/bioinformatics/btr064.
- 87.Yu G., Wang L.G., He Q.Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015;31:2382–2383. doi: 10.1093/bioinformatics/btv145.
- 88.Schneider C.A., Rasband W.S., Eliceiri K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089.
Data Availability Statement
- High-throughput sequencing files (ChIP-seq, RNA-seq, ATAC-seq, and QHR-4C) have been deposited in the NCBI Gene Expression Omnibus (GEO) database under the accession numbers GSE261210, GSE261212, GSE261213, and GSE261209, respectively.
- This paper does not report original code.
- Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.