Abstract
An increasing number of genes involved in chromatin structure and epigenetic regulation has been implicated in a variety of developmental disorders, often including intellectual disability. By trio exome sequencing and subsequent mutational screening, we have now identified two de novo frameshift mutations and one de novo missense mutation in CTCF in individuals with intellectual disability, microcephaly, and growth retardation. Furthermore, an individual with a larger deletion including CTCF was identified. CTCF (CCCTC-binding factor) is one of the most important chromatin organizers in vertebrates and is involved in various chromatin regulation processes such as higher-order chromatin organization, enhancer function, and maintenance of three-dimensional chromatin structure. Transcriptome analyses in all three individuals with point mutations revealed deregulation of genes involved in signal transduction and emphasized the role of CTCF in enhancer-driven expression of genes. Our findings indicate that haploinsufficiency of CTCF affects genomic interaction of enhancers and their regulated gene promoters that drive developmental processes and cognition.
Main Text
Chromatin organization that controls compartmentalization of distinct functional domains has a fundamental impact on correct temporal and spatial gene expression required for proper development and cognition in the mammalian genome.1,2 An increasing number of genes involved in chromatin structure and epigenetic regulation has been implicated in various developmental disorders, often including intellectual disability (ID).3 Examples are EHMT1 (MIM 607001), encoding euchromatin histone methyl transferase 1, in Kleefstra syndrome4 (MIM 610253), and ARID1B (MIM 614556) and other genes encoding subunits of the SWI/SNF complex in unspecific ID (MIM 614562) or Coffin-Siris syndrome (MIM 135900).5–7 We have now identified de novo mutations in the key chromatin organizer CTCF (MIM 604167) in individuals with intellectual disability.
We performed trio exome sequencing in an individual (I1) with mild ID, short stature, microcephaly, cleft palate, and congenital heart defect (Figure 1 and Table 1). The SureSelect Human All Exon Kit V3 (50 Mb, ∼21,000 genes) (Agilent Technologies) was used for enrichment, and sequencing was carried out with 50 bp single reads on a SOLiD 4 system (Life Technologies). On average, we obtained more than 100 million reads per individual. Read mapping to the UCSC Genome Browser hg19 reference genome was performed with SOLiD LifeScope software v.2.5 and yielded approximately 82 million mappable reads per individual. The mean target coverage was 72×, and 75% of the target sequence was covered at least ten times. Variant calling was performed with the LifeScope software v.2.5 with high stringency settings and GATK8 v.1.4 after local realignment of indels. Only variants called by LifeScope with high stringency settings and by GATK were selected. The study was approved by the ethics committee of the Medical Faculty, University of Erlangen-Nuremberg, and informed consent was obtained from parents or guardians of the affected individuals.
Figure 1. Identified Defects in CTCF
(A) Schemes of the genomic and protein structure of CTCF with localization and electropherograms of the mutations. The mutations are named according to isoform 1 of CTCF (RefSeq NM_006565.3).
(B) Conservation of the missense mutation. The position of the mutation at amino acid 567 is indicated by a blue bar and highly conserved throughout all indicated species.
(C) Clinical pictures of the affected individuals with unspecific facial gestalt.
Table 1.
Summarized Clinical Findings in the Affected Individuals
| | Individual 1 | Individual 2 | Individual 3 | Individual 4 |
|---|---|---|---|---|
| Defect in CTCF (NM_006565.3) | c.375dupT (p.Val126Cysfs∗14) | c.1186dupA (p.Arg396Lysfs∗13) | c.1699C>T (p.Arg567Trp) | 280 kb deletion, eight genes |
| Gender | male | male | male | female |
| Age at last investigation | 9 years, 6 months | 9 years | 3 years, 11 months | 15 years |
| Birth | 34 weeks | 40 weeks | 39 weeks | 40 weeks |
| Birth weight | SGA | 2,620 g | 2,990 g | 2,900 g |
| Birth length | SGA | 50 cm | 54 cm | 49 cm |
| Birth OFC | SGA | 33 cm | 34 cm | ND |
| Feeding difficulties | first years | first week | tube feeding | yes |
| Muscular hypotonia | ND | yes | yes | yes |
| Weight | 20 kg, −2.35 SD | 25.6 kg, −0.96 SD | 13.4 kg, −1.15 SD | 40.5 kg, −1.74 SD |
| Height | 125.5 cm, −3.15 SD | 129.5 cm, −0.71 SD | 100 cm, −1.9 SD | 156 cm, −1.94 SD |
| OFC | 48.2 cm, −3.51 SD | 49.5 cm, −2.61 SD | 47.5 cm, −2.91 SD | 54 cm, −0.84 SD |
| Developmental delay/intellectual disability | IQ 64 | Dev. delay, IQ 79–86 | severe | moderate |
| Age of walking | 24 months | 18 months | 23 months | 30 months |
| Age of first words | 4 years | >18 months | 2 words at 3 years | 2 years |
| Behavioral anomalies | no | easily overstrained | autistic behavior | sleeping disturbances, autistic behavior, temper tantrums |
| Brain anomalies | ultrasound: wide ventricles, plexus cyst | ND | normal MRI | CT at 1 year: dilated left ventricle |
| Recurrent infections | no | yes | no | no |
| Hypermetropia | yes, plus strabismus | no | yes | yes, plus strabismus |
| Hearing | tested normal | appeared normal | tested normal | tested normal |
| Urogenital anomalies | inguinal hernia | cryptorchidism | cryptorchidism, phimosis | ND |
| Minor, unspecific facial dysmorphisms | small mouth, prominent incisors, small other teeth, thin upper lip | pointed nose, thin upper lip | thin lips | high forehead, hypertelorism, thick eyebrows, long eyelashes, epicanthic folds, low-set posteriorly rotated ears, long philtrum, thin lips |
| Other anomalies | ASD, PDA, mild aortic coarctation, cleft palate, prominent finger joints, single palmar crease, sacral dimple, camptodactyly V | clinodactyly finger V, single palmar crease, dental anomalies | none | hypertrichosis, sandal gaps, broad 1st toe |
| Normal previous testing | 22q11.2-FISH, UPD7, UPD14, Fragile-X, Affymetrix 6.0 Mapping SNP array | Affymetrix 6.0 Mapping SNP array, Fragile-X | muscle biopsy, metabolic screening, 22q11.2-MLPA, Affymetrix 6.0 Mapping SNP array, Fragile-X, methylation test for Angelman syndrome | ND |
Abbreviations are as follows: ASD, atrial septum defect; FISH, fluorescence in situ hybridization; MLPA, multiplex ligation-dependent probe amplification; ND, no data; PDA, persistent ductus arteriosus; SD, standard deviation; SGA, small for gestational age.
For individual I1, a total of 29,523 variants (SNVs and indels) were annotated. We examined the data for de novo mutations by excluding variants present in dbSNP 135 or our in-house database of 234 exomes, in noncoding regions or in either of the parents (Tables S1 and S2 available online). We detected two de novo mutations. One of these is the frameshifting mutation c.375dupT (p.Val126Cysfs∗14) in CTCF (RefSeq accession number NM_006565.3) (Figure S1). No convincing truncating mutations or copy-number variants in CTCF were observed in dbSNP 135, 1000 Genomes, the Exome Variant Server (EVS), our in-house databases comprising more than 1,500 exomes (234 from Erlangen and 1,298 from Nijmegen), or 820 molecularly karyotyped healthy controls.
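The filtering logic described above can be illustrated with a minimal sketch. It assumes variants have already been called and annotated; the `Variant` record, the `candidate_de_novo` function, and all coordinates below are hypothetical and are not taken from the study's pipeline or data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Key = Tuple[str, int, str, str]


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not the study's data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    coding: bool  # True if the variant lies in a coding region


def key(v: Variant) -> Key:
    """Identity of a variant, ignoring annotation fields."""
    return (v.chrom, v.pos, v.ref, v.alt)


def candidate_de_novo(child: Iterable[Variant],
                      mother: Iterable[Variant],
                      father: Iterable[Variant],
                      known: Set[Key]) -> List[Variant]:
    """Keep child variants that are coding, absent from known-variant
    databases, and absent from both parents."""
    parental = {key(v) for v in mother} | {key(v) for v in father}
    return [v for v in child
            if v.coding and key(v) not in known and key(v) not in parental]


# Toy example with invented coordinates.
child = [
    Variant("chrA", 1000, "C", "CT", True),   # survives all filters
    Variant("chrB", 500, "A", "G", True),     # inherited from the mother
    Variant("chrC", 700, "G", "A", False),    # noncoding
    Variant("chrD", 900, "T", "C", True),     # present in a known database
]
mother = [Variant("chrB", 500, "A", "G", True)]
father = []
known = {("chrD", 900, "T", "C")}

print(candidate_de_novo(child, mother, father, known))
```

In practice, candidates surviving such a filter would still need confirmation by an independent method, because apparent de novo calls can result from missed parental calls.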
CCCTC-binding factor CTCF is an important chromatin organizer involved in a range of gene regulation processes. When bound to insulator elements, CTCF can prevent spreading of inactive heterochromatin into neighboring regions and shield particular gene promoters from enhancer function.9 This enhancer blocking by CTCF might be methylation sensitive.10,11 CTCF is involved in maintaining three-dimensional chromatin structure,12 imprinting,10 X inactivation,13 and nucleosome positioning.9,14–17 The crucial role of CTCF in development9 is reflected in early implantation lethality upon complete CTCF deficiency in mice.18 In another mouse model in which 70% of genomic Ctcf was depleted in specific neurons, postnatal growth retardation, abnormal behavior, and brain abnormalities were observed, thus underscoring the function of CTCF in cognition and other developmental processes.19 Several lines of evidence indicate that CTCF also plays a role in disease-related phenotypes in humans. Deregulation of CTCF binding has been implicated in overgrowth and growth retardation disorders resulting from aberrant methylation of the imprinted H19/IGF2 locus.20,21 Genomic CTCF binding sites significantly overlap with SNPs associated with human height.22 Also, cohesins that mediate enhancer blocking by recruiting CTCF binding to insulator sites23 are mutated in Cornelia-de-Lange syndrome (MIM 122470), a disorder with severe ID and multiple congenital anomalies. Somatic mutations in CTCF, among other chromatin modifiers, were observed in two cases of acute leukemia.24,25 So far, germline mutations in CTCF have not been reported.
We next screened CTCF (RefSeq NM_006565.3) in 399 individuals with intellectual disability by unidirectional direct sequencing (ABI BigDye Terminator Sequencing Kit v.3; Life Technologies) of all coding exons with exon-intron boundaries with an automated capillary sequencer (ABI 3730; Life Technologies). We identified two further mutations in two boys: a de novo frameshift mutation c.1186dupA (p.Arg396Lysfs∗13) (I2) and a de novo missense mutation c.1699C>T (p.Arg567Trp) (I3) (Figure 1). All mutations were excluded in dbSNP 135, 1000 Genomes, EVS, and our in-house databases. I2 had borderline intelligence but developmental delay, pronounced learning difficulties, and behavioral problems. Furthermore, microcephaly was noted. I3 had severe ID with autistic features, microcephaly, and severe feeding difficulties, still requiring tube feeding at the age of 4 years (Table 1). Shared clinical features in all three individuals with de novo mutations in CTCF comprised ID of variable severity (with I3 being most severely affected), head circumference and/or body height either in the low normal range or below –2 standard deviations, and feeding difficulties (Figure 1, Table 1). Mice with reduced levels of CTCF in neurons19 have overlapping phenotypes with affected individuals, thus emphasizing its role in physical and cognitive development. Apart from a mild congenital heart defect and cleft palate in I1, no gross malformations or specific dysmorphisms were observed in the individuals reported here.
To test the consequences of the mutations, CTCF mRNA expression and protein levels were determined. Expression analysis on lymphocyte cDNA, reverse transcribed from RNA extracted with the PAXgene Blood System (Becton Dickinson), was done by quantitative RT-PCR with the SYBR Green master mix (Thermo Fisher Scientific) on an ABI 7900HT instrument (Life Technologies). We found reduced expression of CTCF in both individuals with frameshift mutations (I1 and I2). Sequencing of the cDNA confirmed almost complete absence of the mutated allele (Figure S3). This observation is consistent with loss of function or haploinsufficiency, possibly through nonsense-mediated mRNA decay. A search of the Decipher database yielded one girl with intellectual disability (I4) and a de novo deletion of eight genes including CTCF (Figure S2), again supporting the notion that haploinsufficiency of CTCF gives rise to the ID phenotype. In I3 (harboring the missense mutation) the mutation was still detectable in cDNA, and CTCF mRNA expression levels as well as CTCF protein levels were unaltered (Figure S3). This missense mutation is located in the splice donor consensus site of exon 9. RT-PCR did not reveal an aberrant splice product (data not shown). Two of three prediction programs indicated a deleterious effect of this missense variant (Table S3). This variant was absent in dbSNP 135, 1000 Genomes, EVS, and in-house exomes. Molecular modeling indicated that the c.1699C>T (p.Arg567Trp) exchange does not cause steric problems in the CTCF structure, suggesting that the mutation has no significant effect on protein stability. Modeling DNA-bound CTCF indicated that replacement of an arginine by a tryptophan at position 567 would result in weaker interactions with the DNA backbone compared to the wild-type and in novel nonpolar interactions formed with the bases of the DNA (Figure S4). Therefore, the c.1699C>T (p.Arg567Trp) mutation might affect both DNA binding affinity and specificity. This is compatible with a disease model of either functional haploinsufficiency or a dominant-negative effect. We note that this individual harbors a second de novo variant in the same exon that we showed to be on the same allele (c.1650C>T [p.(=)]) (Figure S5). Because it is not located in the splice site and does not result in an amino acid change, its pathogenic relevance is not obvious. However, a contributory effect in addition to the missense mutation cannot be excluded.
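As a consistency check on the variant names used here, the relation between coding-DNA positions and codon numbers can be sketched as follows. The sketch assumes that c. numbering starts at the A of the ATG start codon and that the standard genetic code applies; the helper names are hypothetical, and the codon sequence is inferred from the genetic code alone, not read from the reference transcript.

```python
from typing import Tuple

# Slice of the standard genetic code: the four CGN arginine codons and
# the codons they become when the first base changes from C to T.
STANDARD_CODE = {
    "CGA": "Arg", "CGC": "Arg", "CGG": "Arg", "CGT": "Arg",
    "TGA": "Ter", "TGC": "Cys", "TGG": "Trp", "TGT": "Cys",
}


def codon_of(cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-DNA position to
    (codon number, position within the codon, 1 to 3)."""
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions are 1-based")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def first_base_c_to_t(codon: str) -> str:
    """Apply a C>T substitution to the first base of a codon."""
    if not codon.startswith("C"):
        raise ValueError("codon does not start with C")
    return "T" + codon[1:]


for pos in (375, 1186, 1650, 1699):
    print(pos, codon_of(pos))

# Which CGN arginine codons yield tryptophan after a first-base C>T change?
arg_to_trp = [c for c in ("CGA", "CGC", "CGG", "CGT")
              if STANDARD_CODE[first_base_c_to_t(c)] == "Trp"]
print(arg_to_trp)
```

Under these assumptions, c.1699 is the first base of codon 567, and only CGG can give tryptophan through a first-base C>T change, which agrees with p.Arg567Trp. Likewise, c.375 is the last base of codon 125, so a duplication there first alters codon 126, and c.1650 is a third-base position, in line with a synonymous change.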
Because of the known role of CTCF in transcriptional regulation and chromatin organization, we performed whole-transcriptome (mRNA) sequencing on lymphocyte RNA from I1, I2, I3, and eight healthy control individuals. Total RNA was amplified with the Ovation RNA-Seq system (NuGEN). Amplified ds-cDNA was then used for library preparation and sequenced on a SOLiD4 system (Life Technologies). On average, 30 of 81 million reads per individual (38%) were mapped to the UCSC Genome Browser hg19 reference genome with Bowtie26 v.0.12.7 with a SNP fraction of 0.001. Mapped reads were assigned to transcripts of the March 9, 2012, version of the Illumina iGenomes transcriptome with HTSeq v.0.5.3p9 with default settings, resulting in an average of 7.5 million assigned reads per individual. Differential expression between affected and control individuals was determined with the count-based DESeq R package.27 Gene expression differed between affected and control individuals when considering genes of moderate to high expression (reads per kilobase per million mapped reads [RPKM] > 10; applies to 5,088 genes) (Figures 2 and S6). The gene expression patterns of the two individuals with frameshift mutations (I1 and I2) were more similar to each other than to I3 with the missense mutation (Figure 2). This divergent expression profile might provide a possible explanation for the more severe clinical phenotype associated with the missense mutation. Nonetheless, the overlap of down- and upregulated genes in I3 compared to I1 and I2 was significant (p < 10⁻¹⁹⁷ and p < 10⁻³¹, respectively, chi-square test) (Figure S7), supporting a shared pathogenic mechanism. Twelve of the deregulated genes were validated by quantitative RT-PCR (Figure S8). Six were randomly selected, and five were neuronal genes with a deregulated gene expression pattern that was consistent with that in brain tissues of neuron-specific knockout mice.19 In addition, because protocadherin (Pcdh) genes were shown to be deregulated in these mice,19 we also validated PCDH9 (MIM 603581), the only PCDH gene expressed at a reasonable level in lymphocytes (Figure S8).
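The RPKM threshold used above follows from the standard definition of the measure. The sketch below is illustrative only: the function name and the example numbers are hypothetical, and whether the study normalized by mapped or by assigned reads is not stated, so the choice of denominator here is an assumption.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of transcript per million reads.

    RPKM = read_count / (transcript_length_bp / 1e3) / (total_reads / 1e6)
         = read_count * 1e9 / (transcript_length_bp * total_reads)
    """
    if transcript_length_bp <= 0 or total_reads <= 0:
        raise ValueError("length and total read count must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_reads)


# Hypothetical gene: 150 reads on a 2,000 bp transcript in a library of
# 7.5 million reads sits exactly at the RPKM > 10 boundary.
value = rpkm(150, 2000, 7_500_000)
print(value)
print(value > 10)  # the filter in the text is a strict inequality
```

With a library of this size, a 2 kb transcript therefore needs more than 150 assigned reads to pass the expression filter.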
Figure 2. Gene Expression and Promoter-Enhancer Interaction in CTCF-Deficient Individuals
(A) Gene expression similarity between individuals for the 816 differentially expressed genes with corrected p value < 0.05 (698 down, 118 up). Both individuals and genes are clustered by the correlation distance. Blue, downregulated; red, upregulated.
(B) Variation in mean peak-per-gene density for CTCF motif-containing lymphocyte ChIP-seq peaks of different gene categories. Colored lines show mean density in consecutive nonoverlapping 200 kb windows within a 2 Mb region around gene transcription start sites. CTCF peak-to-gene ratio is enriched near downregulated genes relative to similarly expressed or upregulated genes (p < 10⁻⁴ and p < 10⁻²⁰, respectively, for the region between −500 kb and +500 kb around transcription start site, Wilcoxon test). Peaks were called via MACS software with default settings, with the ENCODE UW Gm12878 lymphoblastoid cell line ChIP-seq input track used as background control. The BEDTools suite40 was used for overlap determination and the R statistical software package41 for mean density calculations and statistics. Genes less than 1 Mb from chromosome ends were excluded. Legend: All, all genes expressed at RPKM > 10 (4,724 genes); Similar, set of nondifferentially expressed RPKM > 10 genes (1,841 genes); Down/Up, RPKM > 10 genes that are down- or upregulated in affected individuals (671/108 genes).
(C) Enrichment of promoters that interact exclusively with enhancers and depletion of promoter-promoter interactions in downregulated genes (numbers above bars indicate absolute gene counts; chi-square test, ∗p < 10⁻³, ∗∗∗p < 10⁻⁹). The control set of nondifferentially expressed genes consists of the RPKM > 10 genes with an affected-control fold change of less than 1.2 and a p value of more than 0.3 (1,986 genes). Legend: All, all genes expressed at RPKM > 10; Similar, nondifferentially expressed subset; Down/Up, subsets that are down- or upregulated in affected individuals.
Evaluating our set of 698 downregulated genes with regard to Gene Ontology28 terms, we found enrichment primarily for genes involved in cellular response to extracellular stimuli (Figures S9 and S10 and Table S4), processes that are implicated in developmental and cognitive disorders.29 These Gene Ontology terms are also consistent with findings in neuron-specific knockout mice.19 Interestingly, the set of 118 upregulated genes was much smaller, which was also in line with the gene expression pattern in neuron-specific knockout mice.19 These upregulated genes were dominated by ribosomal genes, suggesting upregulation of mRNA translation as a possible compensatory mechanism.
To investigate whether heterozygous CTCF mutations might be associated with disorganized chromatin domains, we performed chromatin immunoprecipitation followed by deep sequencing (ChIP-seq). After isolating lymphocytes from heparinized blood of a human control individual with a swelling buffer, cells were cross-linked and harvested as described previously.30 Chromatin was sonicated with a Bioruptor sonicator (Diagenode), and ChIP was performed with a CTCF antibody (Millipore; 07-729). After barcoding with NEXTflex adaptors, three samples were sequenced in one lane on a HiSeq 2000 (Illumina). Out of 46 million 50 bp single-end reads passing the Illumina chastity filter, 44 million (96%) were mapped to the human GRCh37 genome (UCSC Genome Browser hg19; assembly February 2009) via BWA.31 Prior to peak-calling and visualization, all duplicate reads and reads mapping to repeat regions were removed. We found 27,072 CTCF binding sites, of which 14,729 contained CTCF motifs. CTCF motifs were detected in ChIP-seq peaks with the FIMO program32 of the MEME suite33 v.4.8.1 with the CTCF position weight matrix from the JASPAR database,34 with genome-wide UCSC Genome Browser hg19 nucleotide frequencies as background and a p value threshold of 10⁻⁴ (multiple testing corrected p value < 0.06). The peak-per-gene density was clearly higher in the genomic regions containing downregulated genes than in those with upregulated or similarly expressed genes (Figure 2 and Table S5) (p < 10⁻⁴ and p < 10⁻²⁰, respectively, in 1 Mb windows around gene promoters, Wilcoxon test). Similarly, the peaks near downregulated genes had higher peak scores than those near similarly expressed genes (p < 0.001, Wilcoxon test) (Figure S11).
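The idea of counting peaks in a window around each promoter can be sketched in a few lines. This is a simplified illustration restricted to a single chromosome, with hypothetical function names and invented coordinates; the study itself used BEDTools and R for these calculations, and its exact density definition may differ from the plain count shown here.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def peaks_in_window(tss: int, sorted_peaks: Sequence[int],
                    half_window: int = 500_000) -> int:
    """Count peak positions within +/- half_window of a transcription
    start site. `sorted_peaks` must be sorted in ascending order."""
    lo = bisect_left(sorted_peaks, tss - half_window)
    hi = bisect_right(sorted_peaks, tss + half_window)
    return hi - lo


def mean_peaks_per_gene(tss_list: Sequence[int], peaks: Sequence[int],
                        half_window: int = 500_000) -> float:
    """Mean number of peaks per gene for one gene category."""
    if not tss_list:
        raise ValueError("gene category is empty")
    ordered: List[int] = sorted(peaks)
    counts = [peaks_in_window(t, ordered, half_window) for t in tss_list]
    return sum(counts) / len(counts)


# Invented positions on one chromosome.
peaks = [100, 400_000, 900_000, 2_000_000]
downregulated_tss = [500_000, 2_400_000]
print(mean_peaks_per_gene(downregulated_tss, peaks))
```

Per-gene counts computed this way for two gene categories could then be compared with a rank-based test such as the Wilcoxon test named in the text.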
Thus, in individuals with CTCF haploinsufficiency, the higher number of CTCF binding sites per gene for downregulated relative to upregulated or unaltered genes suggests that activation of these genes is regulated by CTCF. A recent CTCF-associated chromatin interactome map in murine embryonic stem cells has demonstrated a role of CTCF in connecting distinct domains for gene expression regulation.17 Additionally, CTCF-mediated chromatin interaction has recently been shown to poise an inducible gene for increased transcription in response to extracellular stimulation.35 We therefore hypothesized that CTCF mutations may result in destabilized chromatin interactions required for gene expression.
To test this hypothesis, we first investigated whether differentially regulated genes are associated with CTCF chromatin loops. Because data on chromatin interaction analysis with paired-end-tag sequencing (ChIA-PET) of CTCF were not available from lymphocytes, we obtained K562 (a chronic myelogenous leukemia cell line) RNA polymerase II and CTCF ChIA-PET interactions from the ENCODE36 data collection center at the UCSC Genome Browser database.37 K562 CTCF interactions were filtered for those with CTCF motif-containing ChIP-seq peaks from heparinized blood lymphocytes at both anchors (7,453 of 25,721 interactions). We found that downregulated but not upregulated genes are located within CTCF loops more frequently than expected (p < 10⁻³ and p = 0.25, respectively, hypergeometric test) and that similarly expressed genes are underrepresented (p < 10⁻⁵).
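A hypergeometric enrichment test of the kind mentioned here asks how likely it is to see at least the observed number of loop-associated genes in a gene set drawn at random from all expressed genes. The following is a minimal standard-library sketch; the function name, parameter names, and numbers are hypothetical and do not reproduce the study's counts.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, successes: int,
                         draws: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: all genes considered
    successes:  genes located within loops
    draws:      size of the gene set of interest (e.g. downregulated genes)
    k:          genes of that set found within loops
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("inconsistent hypergeometric parameters")
    upper = min(successes, draws)
    # math.comb(n, r) returns 0 when r > n, so impossible terms vanish.
    favourable = sum(comb(successes, i) * comb(population - successes, draws - i)
                     for i in range(max(k, 0), upper + 1))
    return favourable / comb(population, draws)


# Toy example: 10 genes, 5 within loops, a set of 5 genes all within loops.
print(hypergeom_upper_tail(5, 10, 5, 5))
```

Depletion, as reported for the similarly expressed genes, would be assessed with the corresponding lower tail, P(X <= k).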
We used ChIA-PET interaction data of RNA polymerase II from K562 cells38 to investigate the chromatin interaction pattern of affected gene promoters. We found that the downregulated genes in the individuals reported here were enriched for promoters that exclusively interact with enhancers and depleted for genes whose promoters cluster with other promoters (Figures 2 and S12). Furthermore, EP300 ChIP-seq peaks, which are indicative of active enhancers,39 were clearly enriched in the vicinity of downregulated genes relative to similarly expressed and upregulated genes in both the ENCODE lymphoblastoid and K562 cell lines (Figure S13) (p < 10⁻⁶ to p < 10⁻³ for all comparisons, with a 1 Mb window around promoter, Wilcoxon test). These data indicate that CTCF deficiency predominantly affects expression of enhancer-regulated genes. CTCF-associated loops possibly stabilize promoter-enhancer interactions,17 thereby increasing their efficiency and the levels of gene expression. These data are consistent with a model in which CTCF leads to stabilization of distinct transcription domains.
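The comparison of promoter interaction classes between gene sets is a comparison of proportions, for which Figure 2C reports a chi-square test. A minimal sketch for a 2 × 2 table is given below. The counts are invented, the function name is hypothetical, and no continuity correction is applied; whether the study used one is not stated.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p value)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


# Invented counts: rows = downregulated / control genes,
# columns = enhancer-only promoters / other promoters.
stat, p = chi_square_2x2(10, 20, 20, 10)
print(stat, p)
```

For tables with small expected counts, an exact test would usually be preferred over this approximation.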
Taken together, de novo mutations in CTCF in humans cause variable impairment of cognition and growth. Haploinsufficiency is probably the disease mechanism of the phenotypes. Our data suggest that CTCF is required for enhancer-driven gene activation and genomic interaction of enhancers and their regulated gene promoters in development.
Acknowledgments
We thank the affected individuals and their parents for participation in this study. We thank Christine Zeck-Papp, Angelika Diem, Christian Gilissen, Nienka Wieskamp, Eva Janssen-Megens, and Kim Berentsen for excellent technical assistance. We thank Harald Rabe, Kinderzentrum St. Martin (Regensburg, Germany), and Maria Kibaek, Odense University Hospital (Denmark), for referring individuals. We thank Han Brunner for critical reading of the manuscript and helpful suggestions. C.Z. was supported by the IZKF (Interdisziplinäres Zentrum für Klinische Forschung) Erlangen and by a grant from the Deutsche Forschungsgemeinschaft (Zw184/1-1). A. Reis was supported by a grant from the German Ministry of Education and Research (01GS08160). M.O. was supported by the BioRange program of the Netherlands Bioinformatics Centre (NBIC), which is supported by the Netherlands Genomics Initiative (NGI).
Supplemental Data
Web Resources
The URLs for data presented herein are as follows:
1000 Genomes, http://browser.1000genomes.org
Berkeley Drosophila Genome Project NNSplice 0.9, http://www.fruitfly.org/seq_tools/splice.html
Gene Expression Omnibus (GEO), http://www.ncbi.nlm.nih.gov/geo/
HSF v.2.4, http://www.umd.be/HSF/
Illumina iGenomes transcriptome, http://cufflinks.cbcb.umd.edu/igenomes.html
NHLBI Exome Sequencing Project (ESP) Exome Variant Server, http://evs.gs.washington.edu/EVS/
Online Mendelian Inheritance in Man (OMIM), http://www.omim.org/
UCSC Genome Browser, http://genome.ucsc.edu
Accession Numbers
RNAseq and ChIPseq data have been deposited in the NCBI Gene Expression Omnibus (GEO), accessible through GEO Series accession number GSE46833.
References
- 1. Day J.J., Sweatt J.D. Epigenetic mechanisms in cognition. Neuron. 2011;70:813–829. doi: 10.1016/j.neuron.2011.05.019.
- 2. Zentner G.E., Scacheri P.C. The chromatin fingerprint of gene enhancer elements. J. Biol. Chem. 2012;287:30888–30896. doi: 10.1074/jbc.R111.296491.
- 3. van Bokhoven H. Genetic and epigenetic networks in intellectual disabilities. Annu. Rev. Genet. 2011;45:81–104. doi: 10.1146/annurev-genet-110410-132512.
- 4. Kleefstra T., Brunner H.G., Amiel J., Oudakker A.R., Nillesen W.M., Magee A., Geneviève D., Cormier-Daire V., van Esch H., Fryns J.P. Loss-of-function mutations in euchromatin histone methyl transferase 1 (EHMT1) cause the 9q34 subtelomeric deletion syndrome. Am. J. Hum. Genet. 2006;79:370–377. doi: 10.1086/505693.
- 5. Hoyer J., Ekici A.B., Endele S., Popp B., Zweier C., Wiesener A., Wohlleber E., Dufke A., Rossier E., Petsch C. Haploinsufficiency of ARID1B, a member of the SWI/SNF-a chromatin-remodeling complex, is a frequent cause of intellectual disability. Am. J. Hum. Genet. 2012;90:565–572. doi: 10.1016/j.ajhg.2012.02.007.
- 6. Santen G.W., Aten E., Sun Y., Almomani R., Gilissen C., Nielsen M., Kant S.G., Snoeck I.N., Peeters E.A., Hilhorst-Hofstee Y. Mutations in SWI/SNF chromatin remodeling complex gene ARID1B cause Coffin-Siris syndrome. Nat. Genet. 2012;44:379–380. doi: 10.1038/ng.2217.
- 7. Tsurusaki Y., Okamoto N., Ohashi H., Kosho T., Imai Y., Hibi-Ko Y., Kaname T., Naritomi K., Kawame H., Wakui K. Mutations affecting components of the SWI/SNF complex cause Coffin-Siris syndrome. Nat. Genet. 2012;44:376–378. doi: 10.1038/ng.2219.
- 8. McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., Garimella K., Altshuler D., Gabriel S., Daly M., DePristo M.A. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
- 9. Herold M., Bartkuhn M., Renkawitz R. CTCF: insights into insulator function during development. Development. 2012;139:1045–1057. doi: 10.1242/dev.065268.
- 10. Bell A.C., Felsenfeld G. Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature. 2000;405:482–485. doi: 10.1038/35013100.
- 11. Yu D.H., Ware C., Waterland R.A., Zhang J., Chen M.H., Gadkari M., Kunde-Ramamoorthy G., Nosavanh L.M., Shen L. Developmentally programmed 3′ CpG island methylation confers tissue- and cell-type-specific transcriptional activation. Mol. Cell. Biol. 2013;33:1845–1858. doi: 10.1128/MCB.01124-12.
- 12. Gerasimova T.I., Byrd K., Corces V.G. A chromatin insulator determines the nuclear localization of DNA. Mol. Cell. 2000;6:1025–1035. doi: 10.1016/s1097-2765(00)00101-5.
- 13. Chao W., Huynh K.D., Spencer R.J., Davidow L.S., Lee J.T. CTCF, a candidate trans-acting factor for X-inactivation choice. Science. 2002;295:345–347. doi: 10.1126/science.1065982.
- 14. Fu Y., Sinha M., Peterson C.L., Weng Z. The insulator binding protein CTCF positions 20 nucleosomes around its binding sites across the human genome. PLoS Genet. 2008;4:e1000138. doi: 10.1371/journal.pgen.1000138.
- 15. Ohlsson R., Bartkuhn M., Renkawitz R. CTCF shapes chromatin by multiple mechanisms: the impact of 20 years of CTCF research on understanding the workings of chromatin. Chromosoma. 2010;119:351–360. doi: 10.1007/s00412-010-0262-0.
- 16. Phillips J.E., Corces V.G. CTCF: master weaver of the genome. Cell. 2009;137:1194–1211. doi: 10.1016/j.cell.2009.06.001.
- 17. Handoko L., Xu H., Li G., Ngan C.Y., Chew E., Schnapp M., Lee C.W., Ye C., Ping J.L., Mulawadi F. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat. Genet. 2011;43:630–638. doi: 10.1038/ng.857.
- 18. Moore J.M., Rabaia N.A., Smith L.E., Fagerlie S., Gurley K., Loukinov D., Disteche C.M., Collins S.J., Kemp C.J., Lobanenkov V.V., Filippova G.N. Loss of maternal CTCF is associated with peri-implantation lethality of Ctcf null embryos. PLoS ONE. 2012;7:e34915. doi: 10.1371/journal.pone.0034915.
- 19. Hirayama T., Tarusawa E., Yoshimura Y., Galjart N., Yagi T. CTCF is required for neural development and stochastic expression of clustered Pcdh genes in neurons. Cell Rep. 2012;2:345–357. doi: 10.1016/j.celrep.2012.06.014.
- 20. Gicquel C., Rossignol S., Cabrol S., Houang M., Steunou V., Barbu V., Danton F., Thibaud N., Le Merrer M., Burglen L. Epimutation of the telomeric imprinting center region on chromosome 11p15 in Silver-Russell syndrome. Nat. Genet. 2005;37:1003–1007. doi: 10.1038/ng1629.
- 21. Prawitt D., Enklaar T., Gärtner-Rupprecht B., Spangenberg C., Oswald M., Lausch E., Schmidtke P., Reutzel D., Fees S., Lucito R. Microdeletion of target sites for insulator protein CTCF in a chromosome 11p15 imprinting center in Beckwith-Wiedemann syndrome and Wilms’ tumor. Proc. Natl. Acad. Sci. USA. 2005;102:4085–4090. doi: 10.1073/pnas.0500037102.
- 22. Schaub M.A., Boyle A.P., Kundaje A., Batzoglou S., Snyder M. Linking disease associations with regulatory information in the human genome. Genome Res. 2012;22:1748–1759. doi: 10.1101/gr.136127.111.
- 23. Wendt K.S., Yoshida K., Itoh T., Bando M., Koch B., Schirghuber E., Tsutsumi S., Nagae G., Ishihara K., Mishiro T. Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature. 2008;451:796–801. doi: 10.1038/nature06634.
- 24. Dolnik A., Engelmann J.C., Scharfenberger-Schmeer M., Mauch J., Kelkenberg-Schade S., Haldemann B., Fries T., Krönke J., Kühn M.W., Paschka P. Commonly altered genomic regions in acute myeloid leukemia are enriched for somatic mutations involved in chromatin remodeling and splicing. Blood. 2012;120:e83–e92. doi: 10.1182/blood-2011-12-401471.
- 25. Mullighan C.G., Zhang J., Kasper L.H., Lerach S., Payne-Turner D., Phillips L.A., Heatley S.L., Holmfeldt L., Collins-Underwood J.R., Ma J. CREBBP mutations in relapsed acute lymphoblastic leukaemia. Nature. 2011;471:235–239. doi: 10.1038/nature09727.
- 26. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- 27. Anders S., Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:R106. doi: 10.1186/gb-2010-11-10-r106.
- 28. Gene Ontology Consortium. Creating the gene ontology resource: design and implementation. Genome Res. 2001;11:1425–1433. doi: 10.1101/gr.180801.
- 29. Vaillend C., Poirier R., Laroche S. Genes, plasticity and mental retardation. Behav. Brain Res. 2008;192:88–105. doi: 10.1016/j.bbr.2008.01.009.
- 30. Denissov S., van Driel M., Voit R., Hekkelman M., Hulsen T., Hernandez N., Grummt I., Wehrens R., Stunnenberg H. Identification of novel functional TBP-binding sites and general factor repertoires. EMBO J. 2007;26:944–954. doi: 10.1038/sj.emboj.7601550.
- 31. Li H., Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 32. Grant C.E., Bailey T.L., Noble W.S. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011;27:1017–1018. doi: 10.1093/bioinformatics/btr064.
- 33. Bailey T.L., Boden M., Buske F.A., Frith M., Grant C.E., Clementi L., Ren J., Li W.W., Noble W.S. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37(Web Server issue):W202–W208. doi: 10.1093/nar/gkp335.
- 34. Portales-Casamar E., Thongjuea S., Kwon A.T., Arenillas D., Zhao X., Valen E., Yusuf D., Lenhard B., Wasserman W.W., Sandelin A. JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. Nucleic Acids Res. 2010;38(Database issue):D105–D110. doi: 10.1093/nar/gkp950.
- 35. Larkin J.D., Cook P.R., Papantonis A. Dynamic reconfiguration of long human genes during one transcription cycle. Mol. Cell. Biol. 2012;32:2738–2747. doi: 10.1128/MCB.00179-12.
- 36. Dunham I., Kundaje A., Aldred S.F., Collins P.J., Davis C.A., Doyle F., Epstein C.B., Frietze S., Harrow J., Kaul R., ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
- 37. Kuhn R.M., Haussler D., Kent W.J. The UCSC genome browser and associated tools. Brief. Bioinform. 2013;14:144–161. doi: 10.1093/bib/bbs038.
- 38. Li G., Ruan X., Auerbach R.K., Sandhu K.S., Zheng M., Wang P., Poh H.M., Goh Y., Lim J., Zhang J. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell. 2012;148:84–98. doi: 10.1016/j.cell.2011.12.014.
- 39. Visel A., Blow M.J., Li Z., Zhang T., Akiyama J.A., Holt A., Plajzer-Frick I., Shoukry M., Wright C., Chen F. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457:854–858. doi: 10.1038/nature07730.
- 40. Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 41. Ihaka R., Gentleman R.C. R: A language for data analysis and graphics. J. Comput. Graph. Statist. 1996;5:299–314.