Abstract
Mammalian circadian rhythm is established by the negative feedback loops consisting of a set of clock genes, which lead to the circadian expression of thousands of downstream genes in vivo. As genome-wide transcription is organized under the high-order chromosome structure, it is largely uncharted how circadian gene expression is influenced by chromosome architecture. We focus on the function of chromatin structure proteins cohesin as well as CTCF (CCCTC-binding factor) in circadian rhythm. Using circular chromosome conformation capture sequencing, we systematically examined the interacting loci of a Bmal1-bound super-enhancer upstream of a clock gene Nr1d1 in mouse liver. These interactions are largely stable in the circadian cycle and cohesin binding sites are enriched in the interactome. Global analysis showed that cohesin-CTCF co-binding sites tend to insulate the phases of circadian oscillating genes while cohesin-non-CTCF sites are associated with high circadian rhythmicity of transcription. A model integrating the effects of cohesin and CTCF markedly improved the mechanistic understanding of circadian gene expression. Further experiments in cohesin knockout cells demonstrated that cohesin is required at least in part for driving the circadian gene expression by facilitating the enhancer-promoter looping. This study provided a novel insight into the relationship between circadian transcriptome and the high-order chromosome structure.
Author Summary
Circadian rhythm regulates daily oscillations of many physiological processes in a wide range of organisms. In mammals, circadian rhythm drives the cycling expression of thousands of downstream genes. The temporal control of transcription takes place under high-order chromosome structure, which is established by looping distant loci on the linear DNA double strands. The most important chromatin structure proteins studied so far are cohesin and CTCF. Using circular chromosome conformation capture technologies, we found that cohesin binding sites are enriched in interacting regions of an enhancer bound by a key circadian transcription factor, Bmal1. Globally, cohesin and CTCF have disparate functions on transcriptional regulation. We developed a quantitative model integrating the effects of cohesin and CTCF in circadian gene regulation. With further computational and experimental approaches, we validated several cases of circadian oscillating genes where cohesin facilitates the enhancer-promoter looping. Taken together, this study showed that circadian gene expression is orchestrated under the long-range interactions mediated by cohesin.
Introduction
Circadian rhythm is a daily oscillation of physiological processes and behaviors in a wide variety of living systems [1,2]. In mammals, the endogenous clock is established by interconnected transcriptional-translational feedback loops including a series of clock genes, for instance, Bmal1, Clock, Nr1d1, Nr1d2, Per and Cry family genes [3,4]. The transcription factor complex Bmal1-Clock drives Nr1d1, Nr1d2, Per and Cry family gene expression via the cis-regulatory E-box element. Conversely, Per and Cry proteins repress the transcriptional activity of Bmal1-Clock through protein-protein interaction. In addition, the transcription repressors Nr1d1 (Rev-erbα) and Nr1d2 (Rev-erbβ) inhibit the transcription of Bmal1 through the retinoic acid-related orphan receptor response element (RRE). Other clock genes like Dbp, Tef, Dec1, and Dec2 are also involved in the feedback loops. These genes constitute the molecular makeup of the central clock system, which oscillates robustly across different tissues and generates the circadian expression of thousands of downstream genes. In mammals, the master clock residing in the suprachiasmatic nucleus (SCN) directs tissue-specific circadian clocks in peripheral tissues. Circadian oscillating genes (COGs) showing a 24-hour rhythm in mRNA expression level in mouse liver have been intensively studied by transcriptomic profiling technologies [5,6]. High-throughput studies of circadian transcription factor binding [6,7] and histone modifications [6,8] by ChIP-Seq, and of enhancer RNAs by GRO-Seq [9], have hinted at circadian regulation in intergenic regions distal to gene promoters. Furthermore, the cycling profiles of many COGs were found to be inconsistent with the proximal binding of circadian transcription factors [10]. Thus, long-range chromosome interactions between promoters and enhancers may be required for a deeper understanding of the temporal organization of widespread COGs.
Over the past few years, the development of comprehensive chromosomal interaction mapping technologies has advanced our understanding of the three-dimensional architecture of chromosomes [11]. It was found that the boundaries of chromatin interaction domains are enriched for binding sites of CTCF (CCCTC-binding factor) [12,13], which is commonly accepted as a barrier protein binding to insulators [14]. Cohesin is another chromosome structure protein with a crucial function in sister chromatid cohesion and chromosome remodeling [15]. The cohesin complex contains four subunits, Smc1, Smc3, Scc1 (also called Rad21), and Scc3 (known as Stag1 and Stag2 in mammalian cells), which form an open-close ring structure to hold DNA [16,17]. Cohesin cooperates with Mediator or CTCF [18,19] in controlling gene expression independent of its function in sister chromatid cohesion [20]. The co-binding sites of CTCF and cohesin repress gene expression by insulating enhancer action [18,21]. In comparison, CTCF-independent cohesin binding sites are reported to be cell type specific and predominantly associated with transcription factor binding sites [22,23].
The high-order chromosome structure conveys important information for transcription [24], which should also apply to the regulation of COGs. An earlier study in mouse embryonic fibroblast (MEF) cells analyzed the chromosomal interactions anchored to the COG Dbp [25]. However, the roles of chromosome structure proteins were not explored. In this study, we systematically identified long-range interactions involving a Bmal1-bound super-enhancer upstream of the clock gene Nr1d1 in mouse liver. Notably, we found that cohesin binding sites are enriched in these interactions. With bioinformatics analysis and further experiments in cohesin-deficient MEF cells, our study provides the first line of evidence that cohesin can influence genome-wide circadian expression by mediating long-range chromosome interactions.
Results
The interactome of a circadian super-enhancer is enriched with cohesin
To study the effect of high-order chromatin structure on circadian rhythm, we focused on a pioneer-like transcription factor in circadian regulation: Bmal1 [26]. We identified 3,244 Bmal1 enhancers [27] in mouse liver from published Bmal1 binding sites and histone marks of enhancers. Among them, the top 3% with the highest Bmal1 binding signals were defined as super-enhancers [28] (Methods, S1 Table). To reveal the long-range interactions involving circadian enhancers, we selected a Bmal1 super-enhancer located ~8 kb upstream of the clock gene Nr1d1 (S1A Fig). This enhancer harbors the strongest Bmal1 binding site in mouse liver, with rhythmic binding (S1B Fig). Using this enhancer as the bait, we detected its interacting regions in mouse liver by circular chromosome conformation capture sequencing (4C-Seq) at CT6 (CT: circadian time, n = 3) and CT18 (n = 3), when Bmal1 binding is at its peak and trough, respectively. Genomic regions consistently enriched in 4C signals in at least two out of three biological replicates at a given time point were identified as enhancer interacting regions, resulting in 49 regions at CT6 and 51 regions at CT18 within 2 Mb of the enhancer (FDR = 0.01, Methods and S2 Table). A highly interacting region spanning approximately 150 kb around the bait region shows markedly elevated signals at both CT6 and CT18 (Fig 1A and S1C Fig).
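The replicate-consistency rule described above (a region is kept when it is called in at least two of three biological replicates) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the window labels are hypothetical, and the per-replicate enrichment calling and FDR control described in Methods are not reproduced here.

```python
from collections import Counter


def consistent_regions(replicate_calls, min_replicates=2):
    """Return regions called in at least `min_replicates` replicates.

    replicate_calls: list of sets, one per biological replicate; each set
    holds hashable region identifiers (hypothetical window labels here).
    """
    counts = Counter(region for calls in replicate_calls for region in set(calls))
    return sorted(region for region, n in counts.items() if n >= min_replicates)


# Hypothetical window labels for three replicates at one time point
# (illustrative only, not real 4C-Seq calls).
ct6_replicates = [
    {"w1", "w2", "w3"},
    {"w2", "w3", "w4"},
    {"w3", "w5"},
]

print(consistent_regions(ct6_replicates))  # ['w2', 'w3']
```

Here w2 is called in two replicates and w3 in all three, so both pass the two-of-three rule, while w1, w4, and w5 are each seen only once and are dropped.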
We next obtained 3,018 COGs and their circadian phases from published microarray data of high temporal resolution in mouse liver [5]. Among them, Fbxl20, Cdk12, Med24, Thra, and Nr1d1 show interactions with the enhancer at CT6. Quantitative chromosome conformation capture (3C-qPCR) analysis was performed to validate the interactions between selected COGs and the enhancer at CT6. In all cases tested, the interactions identified by 4C are highly consistent with the 3C-qPCR results (Fig 1A and 1B). Cdk12 shows a weak interaction with the enhancer. Ormdl3, a non-interacting COG at CT6, shows a lower interaction with the enhancer than the control. In comparison, Nr1d1, Thra, and Med24 demonstrate strong interactions with the enhancer, 6- to 30-fold over the nearby control regions. The highly interacting region identified in our study falls into one of the topologically associating domains (TADs) identified by Hi-C in mouse embryonic stem (ES) cells [12] (S2 Fig). The circadian phases of Thra and Med24 are both around CT0 (Fig 2A). The closeness of their circadian phases suggests that they are likely co-regulated by the same enhancer [8]. Interestingly, the interactions with the bait within the highly interacting region are significantly enriched for chromatin loops from cohesin ChIA-PET (chromatin interaction analysis by paired-end tag) data [29] (Fig 1A, Chi-squared test p < 10^−16) but are devoid of chromatin loops from CTCF ChIA-PET data in mouse ES cells [30]. The interactome data in mouse ES cells imply the potential involvement of cohesin in the long-range interactions with the Nr1d1 enhancer. When we examined broader regions of interactions, nearly 50% of enhancer interacting regions at CT6 or CT18 overlapped with cohesin-non-CTCF sites, compared to merely 20% for cohesin-CTCF sites or random sites obtained by the permutation of cohesin-CTCF or cohesin-non-CTCF sites (Fig 2B).
Furthermore, we profiled all Bmal1 super-enhancers in mouse liver and found that they have high occupancy of cohesin (Fig 2C). Therefore, our 4C-Seq data suggested that cohesin is implicated in facilitating circadian enhancer and promoter interactions.
Cohesin-non-CTCF binding sites are associated with high circadian rhythmicity of transcription
To globally investigate the relationship between circadian gene expression and chromosome structure proteins, we collected circadian cistrome data consisting of different DNA-binding proteins including architectural proteins cohesin and CTCF [22], core circadian transcription factors Bmal1 and Nr1d1 [6], as well as a non-circadian transcription factor Gabpa [22] from published ChIP-Seq datasets in mouse liver (Methods). All datasets were analyzed from the raw data and with the same pipeline. None of the components of cohesin and CTCF are circadian oscillating in their expression levels in mouse liver [3]. Because of the distinct function of cohesin from CTCF [22], we further classified the cohesin binding sites into cohesin-CTCF co-binding sites and cohesin-non-CTCF binding sites. Compared to cohesin-non-CTCF, the number of CTCF-non-cohesin sites is much fewer and therefore has been omitted from the analysis. In total, we obtained 10,948, 28,883, 23,662, 41,690, and 32,899 binding sites for Bmal1, Nr1d1, Gabpa, cohesin-CTCF, and cohesin-non-CTCF in mouse liver respectively.
We defined a nucleotide-level circadian index using circadian time-series GRO-Seq data [9] to quantify circadian transcriptional activities across the whole genome in mouse liver (Methods). In our definition, a higher circadian index indicates stronger rhythmicity. As expected, the binding centers of Bmal1 and Nr1d1 have overall higher circadian indices than the other factors or the genomic background. Interestingly, the profile of cohesin-non-CTCF sites lies between those of Bmal1/Nr1d1 and the random sites, which suggests a positive role of cohesin-non-CTCF sites in circadian rhythmicity (Fig 3A, S3A Fig). This phenomenon is again observed when the circadian index is defined using circadian time-series RNA-Seq data [6] (S3B Fig). Moreover, both Bmal1 and Nr1d1 binding sites preferentially overlap with cohesin-non-CTCF binding sites rather than cohesin-CTCF binding sites in mouse liver (Fisher’s exact test p < 10^−22, Fig 3B). Therefore, cohesin-non-CTCF sites are associated with high circadian rhythmicity of transcription.
Cohesin-CTCF co-binding sites insulate circadian phases of COGs
Cohesin-CTCF co-binding sites are known to act as genomic insulators [32]. To study whether cohesin-CTCF sites affect circadian gene expression in mouse liver, we compared the phase differences of two neighboring COGs separated by a given binding site to the genomic background (Methods, S3C Fig). The phase differences of two adjacent windows were significantly smaller than the phase differences of two windows randomly picked from the genome (Mann-Whitney U test p = 10^−10, S3C Fig). This demonstrates that neighboring COGs across the genome tend to have similar circadian phases. Interestingly, the phase differences across cohesin-CTCF sites show a bimodal distribution and are significantly larger than those in the genomic background (Mann-Whitney U test p = 0.03, Fig 3C). In contrast, Bmal1 and Nr1d1 binding reduced the phase differences of COGs across their binding sites (Mann-Whitney U test p = 0.03 and 0.006 respectively). This indicates that Bmal1 and Nr1d1 might drive the oscillation of genes in similar phases in both directions flanking the binding sites. The effect of cohesin-non-CTCF sites was again similar to those of Bmal1 or Nr1d1, although the difference from the background is only marginal (Mann-Whitney U test p = 0.1). In comparison, the distribution of phase differences across Gabpa sites was similar to that of the genomic background. These results reveal that cohesin-CTCF co-binding sites tend to disrupt the phase continuity of neighboring COGs.
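Because circadian phases live on a 24-hour circle, the difference between two phases is not a plain subtraction. The sketch below assumes that a phase difference is the shortest distance around the clock; the exact definition used in Methods is not given in this section, and the phase values are illustrative.

```python
def phase_difference(phase_a, phase_b, period=24.0):
    """Shortest circular distance (in hours) between two circadian phases.

    Assumption: the difference is measured the short way around a clock of
    length `period`, so the result is always in [0, period / 2].
    """
    d = abs(phase_a - phase_b) % period
    return min(d, period - d)


# Illustrative phases in circadian time (CT).
print(phase_difference(23.5, 1.0))  # 1.5, not 22.5
print(phase_difference(6.0, 18.0))  # 12.0, the maximum possible difference
print(phase_difference(7.0, 11.0))  # 4.0
```

Under this convention, two genes peaking at CT23.5 and CT1 count as nearly in phase, which is the behavior needed when asking whether neighboring COGs share similar phases.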
We then asked whether the COGs within the same domain defined by interacting cohesin-CTCF sites show similar circadian phases. Because cohesin-CTCF domains have not been mapped in mouse liver, we inferred tissue/cell type invariant cohesin loops from the ChIA-PET data in mouse ES cells [29], selecting only loops with both anchors overlapping cohesin-CTCF binding sites in mouse liver (S3D Fig). The variance of circadian phases was used to measure the phase difference of two or more COGs. We observed that the phase variance of the genomic background increases as an exponential function of the domain size (S3D Fig, p < 10^−16, Pearson’s correlation coefficient = 0.37). To account for this size effect, we divided cohesin-CTCF domains into three categories according to their sizes and compared them to the genomic background of the corresponding sizes (Methods). The phase variances were smaller in cohesin-CTCF domains of medium and large sizes compared to the genomic background, at significance levels of p = 0.01 and 0.003 respectively (Fig 3D, Mann-Whitney U test). In summary, the chromosomal domains defined by cohesin-CTCF co-binding sites tend to lock the phases of COGs.
A cohesin/CTCF dependent model of circadian gene regulation
In light of the above observations, we proposed a model that incorporates the effects of cohesin-mediated enhancer-promoter interactions on gene regulation in chromosomal domains defined by the co-binding of cohesin and CTCF (Fig 4A). We adopted the concept of regulatory potential to quantify the regulation of a gene by a given circadian transcription factor [33]. The regulatory potentials of Bmal1 on all annotated genes in the mouse genome were calculated with or without considering the effect of cohesin and CTCF (Methods, Fig 4B and S3 Table). In the background model, the regulatory potential $B_i$ of Bmal1 on a given gene $i$ was computed as the sum of contributions from all available Bmal1 binding sites $j$ within 2 Mb of the gene, that is, $B_i = \sum_j S_j e^{-D_{ij}/\lambda_1}$, where $D_{ij}$ is the distance between gene $i$ and Bmal1 binding site $j$ and $S_j$ is the strength of Bmal1 binding at site $j$. Here we assumed that the regulatory effect of a transcription factor on its target gene decays exponentially with the distance from the binding site to the target gene, and $\lambda_1$ is the characteristic distance. In the cohesin/CTCF dependent model, the contribution of gene $i$ and Bmal1 binding site $j$ was further multiplied by three factors corresponding to the enhancing effects of a cohesin-non-CTCF site either near gene $i$ ($\mathrm{CNC}_i$) or near Bmal1 binding site $j$ ($\mathrm{CNC}_j$) as well as the insulating effect of cohesin-CTCF sites ($\mathrm{CC}_{ij}$), i.e., $B_i = \sum_j S_j e^{-D_{ij}/\lambda_1} \cdot \mathrm{CNC}_i \cdot \mathrm{CNC}_j \cdot \mathrm{CC}_{ij}$ (Methods). Finally, the regulatory potentials were normalized to their ranks across all genes to ensure the robustness of model parameters.
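The background model described above (binding strengths summed over sites within 2 Mb, each weighted by an exponential decay in distance, followed by rank normalization) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the characteristic distance of 10 kb, the site coordinates, and the tie handling in the ranking are assumptions made for the example, and the cohesin and CTCF factors defined in Methods are left out.

```python
import math


def regulatory_potential(gene_pos, sites, char_dist, max_dist=2_000_000):
    """Background-model potential: sum of S_j * exp(-D_ij / lambda_1).

    sites: list of (position, strength) pairs on the same chromosome.
    char_dist: characteristic decay distance lambda_1 in bp (value assumed).
    Sites farther than `max_dist` from the gene contribute nothing.
    """
    total = 0.0
    for pos, strength in sites:
        distance = abs(pos - gene_pos)
        if distance <= max_dist:
            total += strength * math.exp(-distance / char_dist)
    return total


def rank_normalize(values):
    """Map values to ranks scaled into (0, 1]; ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank / len(values)
    return ranks


# Hypothetical binding sites: (position in bp, binding strength).
sites = [(0, 2.0), (10_000, 1.0), (3_000_000, 5.0)]
potential = regulatory_potential(gene_pos=0, sites=sites, char_dist=10_000)
print(round(potential, 4))  # 2.3679: 2.0 + 1.0 * exp(-1); the 3 Mb site is ignored
print(rank_normalize([0.5, 2.0, 1.0]))
```

In the cohesin/CTCF dependent model, each term of the sum would additionally be multiplied by the three site-specific factors before summation, so the same loop structure applies.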
Compared with the background model, the regulatory potentials of Bmal1 on COGs were significantly higher in the cohesin/CTCF dependent model (Fig 4C, Kolmogorov-Smirnov test p = 10^−16). It is known that Bmal1 binding peaks around CT6 [7] and that the phases of COGs directly controlled by Bmal1 are typically between CT6 and CT12. We found that the phases of COGs with top-ranked Bmal1 regulatory potentials in the cohesin/CTCF dependent model are more enriched in CT6-CT12, following the binding peak of Bmal1 at CT6, than in the background model (Fig 4D). Using Nr1d1 ChIP-Seq data, we observed that Nr1d1 regulatory potentials in the cohesin/CTCF dependent model could also distinguish COGs from non-COGs (S4A and S4B Fig). The fact that most core circadian clock genes have higher regulatory potentials in the cohesin/CTCF dependent model suggests that chromosome structure proteins might facilitate the transcription of core components of the circadian clock (S4C Fig). Taken together, our cohesin/CTCF dependent model integrates circadian transcription factors and chromatin organizers to better explain circadian gene expression.
To validate the regulatory potentials of circadian transcription factors, we examined the differentially expressed genes in the livers of Bmal1 knockout (KO) (Fan et al., manuscript in preparation) and Nr1d1 KO mice [9]. We observed that under-expressed genes in Bmal1 KO have higher Bmal1 regulatory potentials in cohesin/CTCF dependent model than those in the background model, while over-expressed genes in Bmal1 KO have similar regulatory potentials between two models (Fig 4E). In contrast, over-expressed genes rather than under-expressed genes in Nr1d1 KO showed much higher Nr1d1 regulatory potentials in cohesin/CTCF dependent model than those in the background model (Fig 4E). This is consistent with the current notion that Bmal1 functions as an activator and Nr1d1 as a repressor in circadian regulation.
Gene expression changes upon in vitro cohesin knock-out
Knock-out of the cohesin subunits Smc3, Scc1, and Scc3 leads to embryonic lethality in mice [34]. To establish a knock-out system of cohesin in vitro, we transfected post-mitotic Smc3-flox/flox MEF cells with Cre/GFP adenovirus such that the expression of Smc3 decreased by 80–90% in Smc3-/- cells compared to control cells (Methods). We measured the mRNA levels of four clock genes in Smc3-/- cells by RT-PCR assays after synchronizing the cells with dexamethasone treatment (Fig 5A). All genes showed significant oscillations in both KO and control cells (cosine fitting, p < 0.05) except for Nr1d1 in KO cells. Nr1d1 was under-expressed in KO cells (ANOVA, p = 10^−7). The peak-trough ratio of Bmal1 dropped from 3.7 in control to 2.4 in KO cells. The circadian oscillations of Dbp and Per3 were not affected by cohesin KO. Although the core clock genes have consistent cycling expression in vivo across tissues [3], far fewer genes oscillate in vitro in cell lines than in vivo. To examine gene regulation by circadian transcription factors in MEFs, we performed Bmal1 ChIP-Seq in control MEF cells (Methods). However, only 244 Bmal1 binding sites were identified (S4 Table), including those on the promoters of the core clock genes Nr1d1 (S1A Fig), Nr1d2, Cry1, Cry2, Per1, Bhlhe41, and Dbp (S4 Table). The lack of Bmal1 binding sites on most hepatic COGs is consistent with the fact that they do not oscillate in synchronized MEF cells.
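Rhythmicity and peak-trough ratios such as those reported above are commonly derived from a cosine fit to the time series. The sketch below is an illustration only and not the authors' analysis: it assumes evenly spaced samples covering a whole number of 24-hour periods (which makes the least-squares solution a pair of simple sums), it uses synthetic data, and it does not compute the p-value of the fit.

```python
import math


def cosinor_fit(times, values, period=24.0):
    """Fit y = mesor + amplitude * cos(w * (t - phase)) by least squares.

    Assumes evenly spaced samples spanning a whole number of periods, with
    more than two samples per period, so that the cosine and sine regressors
    are orthogonal and the coefficients reduce to simple sums.
    """
    n = len(values)
    w = 2.0 * math.pi / period
    mesor = sum(values) / n
    a = 2.0 / n * sum(y * math.cos(w * t) for t, y in zip(times, values))
    b = 2.0 / n * sum(y * math.sin(w * t) for t, y in zip(times, values))
    amplitude = math.hypot(a, b)
    phase = (math.atan2(b, a) / w) % period  # time of peak, in hours
    return mesor, amplitude, phase


def peak_trough_ratio(mesor, amplitude):
    """Ratio of fitted peak to fitted trough (requires mesor > amplitude)."""
    return (mesor + amplitude) / (mesor - amplitude)


# Synthetic series: mesor 10, amplitude 4, peak at 8 h, sampled every 4 h for 48 h.
times = [4.0 * k for k in range(12)]
values = [10.0 + 4.0 * math.cos(2.0 * math.pi / 24.0 * (t - 8.0)) for t in times]

mesor, amplitude, phase = cosinor_fit(times, values)
print(round(mesor, 3), round(amplitude, 3), round(phase, 3))  # 10.0 4.0 8.0
print(round(peak_trough_ratio(mesor, amplitude), 3))          # 2.333
```

A drop in the fitted amplitude relative to the mesor lowers the peak-trough ratio, which is the kind of dampening described for Bmal1 in the KO cells.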
To reveal the broader impact of cohesin on gene expression, we then applied RNA-Seq to measure gene expression in Smc3-/- MEFs vs. control MEF cells. In total, 248 and 1,064 genes were identified as over-expressed and under-expressed respectively in cohesin KO (|log2 fold change| > 0.8, Fig 5B and S4 Table). The promoter regions of differentially expressed genes upon cohesin KO were enriched with cohesin binding sites in MEFs (Fisher’s exact test p = 0.002). Interestingly, Gene Set Enrichment Analysis [35] of the canonical pathways showed that genes involved in the circadian clock were significantly enriched among the under-expressed genes (FDR = 10^−8). To extrapolate our results in cohesin KO MEFs to mouse liver, we next focused on tissue/cell type invariant enhancer-promoter interactions mediated by cohesin. We found that 22% of differentially expressed genes in cohesin KO have their promoter regions situated near an anchor of cohesin loops in mouse ES cells [29], suggesting that they are regulated by invariant enhancer-promoter loops. To identify the invariant cohesin loops, we required that both anchors of the cohesin loop in ES cells are also bound by cohesin in mouse liver. Furthermore, one anchor of the loop had to be situated within 15 kb of either a Bmal1 or an Nr1d1 binding site in liver, and the other anchor within 5 kb of the transcription start site of a hepatic COG that was also differentially expressed in cohesin KO MEFs. We also required that the circadian phases of candidate genes fall into either the Bmal1-controlled phase regime (CT6-CT12) or the Nr1d1-controlled phase regime (CT20-CT2). The candidate pairs of COGs and enhancers identified are listed in Table 1.
Table 1. The candidate COGs interacting with circadian enhancers via invariant cohesin-mediated loops.
Gene Symbol | Circadian Phase (CT) | Transcription Factor | Transcription Factor binding site position (kb) | Cohesin KO-CN Log2-Fold-Change (LFC) |
---|---|---|---|---|
Tmtc2 | 11 | Bmal1 | 200 | -2.5 |
Rnf43 | 7.5 | Bmal1 | 39 | -1.5 |
Phldb2 | 7 | Bmal1 | 126 | -1.1 |
Cdo1 | 6 | Bmal1 | -130, -121 | -1.2 |
Kcnb1 | 11 | Bmal1 | -128 | -1 |
Atr | 11 | Bmal1 | -317, -315, -305 | -1.4 |
1200009I06Rik | 23.5 | Nr1d1 | -30 | -0.8 |
Ahnak | 20.5 | Nr1d1 | 106, 108, 131 | -0.9 |
Dapk1 | 23.5 | Nr1d1 | -121, -116, 129, 135 | -1.2 |
Npas2 | 1 | Nr1d1 | -179, -169 | -1.1 |
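The selection rules behind Table 1 combine distance cutoffs with a circular phase window, and the Nr1d1 regime (CT20-CT2) wraps around midnight. The sketch below shows one way to encode them. The dictionary keys are hypothetical names chosen for this example, and treating the phase windows as inclusive at both ends is an assumption (it is consistent with Cdo1 at CT6 appearing in Table 1).

```python
# Phase regimes from the text; inclusive bounds are an assumption.
PHASE_REGIMES = {"Bmal1": (6.0, 12.0), "Nr1d1": (20.0, 2.0)}


def in_phase_regime(phase, start, end, period=24.0):
    """True if `phase` lies in [start, end] on a circular clock.

    Handles windows that wrap past the end of the cycle, e.g. CT20-CT2.
    """
    phase, start, end = phase % period, start % period, end % period
    if start <= end:
        return start <= phase <= end
    return phase >= start or phase <= end


def is_candidate(loop, tf, gene_phase, differentially_expressed):
    """Apply the loop, distance, expression, and phase criteria to one pair.

    `loop` is a dict with hypothetical keys: whether both ES-cell anchors are
    cohesin-bound in liver, and the distances (bp) from the anchors to the
    TF binding site and to the gene's transcription start site.
    """
    start, end = PHASE_REGIMES[tf]
    return bool(
        loop["anchors_bound_in_liver"]
        and loop["tf_site_to_anchor_bp"] <= 15_000
        and loop["tss_to_anchor_bp"] <= 5_000
        and differentially_expressed
        and in_phase_regime(gene_phase, start, end)
    )


# Phases taken from Table 1; the loop distances below are made up.
print(in_phase_regime(23.5, 20.0, 2.0))  # True (Dapk1, Nr1d1 regime)
print(in_phase_regime(7.0, 6.0, 12.0))   # True (Phldb2, Bmal1 regime)
example_loop = {"anchors_bound_in_liver": True,
                "tf_site_to_anchor_bp": 8_000,
                "tss_to_anchor_bp": 2_000}
print(is_candidate(example_loop, "Bmal1", 7.0, True))  # True
```

All ten phases listed in Table 1 fall inside the window of their assigned transcription factor under this inclusive convention.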
We then used 3C-qPCR experiments to confirm the presence of enhancer-promoter interactions in two such cases, Phldb2 and Ahnak, in mouse liver (Fig 6B and S5B Fig). We also found that both interactions were significantly weakened in cohesin KO MEF cells compared to control cells (Fig 6C and S5C Fig). Phldb2 encodes a microtubule-anchoring factor that binds to phosphoinositides and filamin [36]. Phldb2 shows a circadian phase at CT7 in mouse liver and interacts with a Bmal1-bound enhancer situated 126 kb upstream in the intron of another gene, Plxd2 (Fig 6A). This Bmal1 binding site was confirmed by ChIP-PCR in mouse liver (S6A Fig). Ahnak protein is a mediator in the calcium signaling and transforming growth factor β signaling pathways [37]. Ahnak shows a circadian phase at CT21 and interacts with an Nr1d1-controlled enhancer (S5A Fig). The promoters of Phldb2 and Ahnak are devoid of any Bmal1 or Nr1d1 binding sites in liver. Furthermore, we found conserved histone modification marks of active transcription and cohesin binding sites at these two genes and their enhancer loci in both MEFs and liver. This supports the view that the cohesin-mediated loops at Phldb2 and Ahnak are invariant between tissues or cell types. Finally, to show that these interactions are functional for gene regulation, we used the CRISPR-Cas9 system to delete the cohesin binding site near the enhancer of Phldb2 in Hepa1-6 cells and found a significant reduction of 41% in the expression of Phldb2 (Fig 6D and S6B Fig). Phldb2 does not oscillate in synchronized MEFs or Hepa1-6 cells due to the lack of Bmal1 binding at its enhancer in these cells (Fig 6A). The binding of Bmal1 on the enhancer of Phldb2 confers its circadian expression in liver. Taken together, our results suggest that the stable and invariant enhancer-promoter loop mediated by cohesin is a prerequisite for temporal gene regulation in circadian rhythm.
Discussion
In complex organisms, it is known that genome-wide transcription is highly organized under high-order chromosome structure. In particular, distal enhancers are considered to play a key role in gene regulation through long-range interactions. Given its far-reaching effect on gene expression, the circadian clock is an ideal system to investigate the interplay between chromosome architecture and temporal regulation of gene expression under homeostasis. It was proposed that orchestrated transcription takes place at the so-called “transcription factories” where genes from distant loci across the genome are physically in contact. COGs within the same transcription factory may be regulated by common circadian regulators such as Bmal1. Our 4C-Seq data for a super-enhancer upstream of Nr1d1 provided evidence of physical interactions between the enhancer and multiple COGs. This super-enhancer contains the binding sites of both Bmal1 and Nr1d1 (S1A Fig), a common feature in the circadian cistrome [6]. Among the interacting genes within 2 Mb of the super-enhancer, the circadian phases of Fbxl20, Nr1d1, and Eif1 follow the phase of Bmal1 binding, while the phases of Ormdl3, Med24, and Thra suggest that they are more likely co-regulated by Nr1d1 (Fig 2A). We also found that strong interactions within 150 kb of the super-enhancer were independent of circadian time and restricted to a cell type invariant TAD. These results hint that the long-range interaction acts as a stable backbone rather than a dynamic driving force for circadian regulation. Similar findings of stable interactions have been reported in other temporal processes such as animal development [38], although chromosome domains are highly dynamic during the stages of the cell cycle [39]. By comparison with cohesin ChIA-PET data in mouse ES cells [29], we found multiple cohesin-mediated loops coinciding with the highly interacting regions of the enhancer.
The enrichment of cohesin binding signals in both 4C interacting regions and on Bmal1 super-enhancers conforms to the general role of cohesin in organizing genome structure for gene regulation, although this may not be unique for circadian regulation.
The availability of cohesin and CTCF ChIP-Seq data in mouse liver provided us a unique opportunity to investigate the genome-wide association between cohesin and CTCF binding sites with circadian genomic features. To examine the effect of cohesin in circadian system, we designed a pipeline to capture the continuous change of circadian rhythmicity of transcription across the binding sites of several proteins including Bmal1, Nr1d1, Gabpa, CTCF, and cohesin at 2 kb resolution. This unbiased approach allowed us to include un-annotated transcripts as well as unconventional transcription that does not take place from transcription start sites [40]. We observed that the profile of cohesin-non-CTCF binding sites resembles that of Bmal1 as compared to other non-circadian transcription factors. It suggests that cohesin-non-CTCF sites have a positive effect on circadian rhythmicity of transcription although cohesin itself is not known to be a circadian regulator. From circadian phase analysis, we noted that neighboring circadian genes tend to have similar circadian phases while the co-binding of CTCF and cohesin leads to the insulation of circadian phases. Our finding is consistent with a recent study reporting that CTCF attenuates the transcription of circadian oscillating genes by mediating their contacts to the nuclear lamina [41]. Previous study showed that the nearest genes around Bmal1 sites were not always rhythmically expressed or peaking in the phase regime predicted for a Bmal1 controlled gene [7]. Based on 4C-Seq and bioinformatics analysis, our model incorporating the disparate effects of cohesin-non-CTCF and cohesin-CTCF provides a better predictor of the circadian expression of genes and their phases.
In this study, we have utilized a range of datasets from mouse liver and several cell lines. We chose mouse liver for our main analysis because it shows genome-wide circadian oscillation of gene expression [5] and has the most comprehensive circadian transcriptomic and cistromic data to date. Mouse cell lines were used when genetic manipulations were not possible in liver, as cohesin deficiency is embryonic lethal in vivo [34]. Although synchronized MEF cells have been widely used for circadian studies [25,42–44], far fewer genes oscillate in MEFs, and only core circadian genes oscillate in both liver and MEFs. This is largely due to the lack of Bmal1 binding sites in MEFs, as shown by our Bmal1 ChIP-Seq data in MEFs. For this reason, we used cohesin-deficient MEFs only to select candidate genes regulated by the invariant enhancer-promoter interactions mediated by cohesin, even though these genes, including Phldb2 and Ahnak, do not oscillate in MEFs. However, the histone marks and cohesin binding sites were highly conserved on the enhancer loci of the two cases between liver and MEFs, indicating that these are tissue/cell type invariant TADs. This is further supported by the cohesin ChIA-PET data in ES cells, even though ES cells lack a functional circadian clock [45]. We used Hepa1-6 cells here because CRISPR-Cas9 experiments are convenient in these cells. These data in cell lines collectively suggest that these invariant enhancer-promoter interactions are both cohesin-dependent and functional in gene regulation. These DNA loops were confirmed to be present in liver as well, and the binding of Bmal1 at these enhancers confers circadian expression on these two genes in liver. This picture is in line with our model that the cohesin-mediated enhancer-promoter loop provides a stable and tissue/cell type invariant backbone and that circadian gene regulation is a result of dynamic Bmal1 binding on the stable chromosome structure.
We are also aware that the DNA loops mediated by architectural proteins seem to be developmentally regulated at specific loci within the TADs [46]. Whether our finding has general applicability for long-range circadian regulation still awaits future studies with other experimental strategies. Overall, our study sheds new light on the transcriptional landscape of circadian genes under high-order chromosome structure.
Methods
Ethics statement
All animal experiments performed in this study were approved by the Institutional Animal Care and Use Committee of Shanghai Institutes for Biological Sciences and conformed to institutional guidelines for vertebrate studies.
Identification of Bmal1 bound super-enhancers
The general strategy for screening Bmal1 bound super-enhancers followed the pipeline described in [28]. We first defined 3,244 Bmal1 enhancers in mouse liver with the following rules: (1) the co-occurrence of H3K4me1 and H3K27ac marks [47,48], (2) positioning at least 1 kb away from any transcription start sites of annotated genes [49], (3) overlapping with Bmal1 binding sites at ZT8 from Koike et al.’s data [6] (see Methods section ChIP-Seq data analysis), (4) at least 100 bp in length. H3K4me1 and H3K27ac ChIP-Seq data in the livers of eight-week-old mice were used [31]. Because the signals on Bmal1 binding site do not show broad distribution, we skipped the step of merging enhancers in close distance. To obtain confident super-enhancers, the read numbers per million reads per kilobase from Koike et al.’s and Rey et al.’s Bmal1 ChIP-Seq experiments were added to rank Bmal1 enhancers [6,7]. Finally, 97 Bmal1 enhancers ranked at top 3% were defined as Bmal1 super-enhancers in mouse liver.
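The final ranking step can be checked with a small calculation: 3% of 3,244 enhancers is 97.32, which gives 97 super-enhancers when rounded down. The sketch below illustrates ranking by the sum of two normalized signals and keeping the top 3%. Rounding down is an assumption that happens to match the reported count, and the enhancer names and signal values are synthetic.

```python
def top_percent(enhancers, percent=3):
    """Return the top `percent` of enhancers ranked by summed signal.

    enhancers: list of (name, signal_a, signal_b), where the two signals
    stand in for normalized read densities from two ChIP-Seq datasets.
    The cutoff count is rounded down (an assumption for this sketch).
    """
    n_top = len(enhancers) * percent // 100
    ranked = sorted(enhancers, key=lambda e: e[1] + e[2], reverse=True)
    return ranked[:n_top]


# Count check against the numbers in the text.
print(3244 * 3 // 100)  # 97

# Synthetic enhancers with increasing signal, for illustration only.
synthetic = [(f"e{i}", float(i), float(i)) for i in range(100)]
print([name for name, _, _ in top_percent(synthetic)])  # ['e99', 'e98', 'e97']
```

Summing the two datasets before ranking means an enhancer needs strong signal in at least one experiment, and preferably both, to reach the top of the list.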
Circular chromosome conformation capture sequencing (4C-Seq)
4C-Seq assays were performed as previously described [50,51] with modifications. Briefly, six-week-old male C57BL/6 mice were entrained to 24 hr cycles of 12 hr light and 12 hr dark for one week and then switched to constant darkness. Three mice each were sacrificed in the dark at CT6 and CT18. Mouse liver cells were quickly dispersed and filtered through a 40 μm cell strainer to make a single-cell suspension. Approximately 50 million cells were fixed in 1% formaldehyde for 10 min at room temperature before being quenched with 0.125 M glycine. Cells were then lysed in cold lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 0.2% NP-40, 1× protease inhibitor) for 15 min on ice. After being washed twice, cell nuclei were re-suspended in Buffer 2.1 (New England Biolabs) containing 0.1% SDS and incubated for 10 min at 65°C. Triton X-100 was added to a final concentration of 1% to quench the SDS, and the nuclei were centrifuged to remove SDS and Triton. Nuclei were then digested overnight with 800 U HindIII (New England Biolabs) at 37°C with shaking. After inactivation with 1.6% (final concentration) SDS at 65°C for 20 min, samples were washed, re-suspended in ligation buffer, and ligated with 100 U T4 DNA ligase (Thermo Fisher Scientific) at 16°C for 4 hr and then at room temperature for 30 min. Ligated chromatin was digested with proteinase K before DNA purification. The purified DNA was further digested with DpnII (New England Biolabs) and then circularized using T4 DNA ligase (Thermo Fisher Scientific). After purification, 200 ng of DNA from the 4C library was used as the template for PCR amplification using DyNAzyme EXT (Finnzymes). Primers specific to the bait region (S5 Table) were used to amplify the interactome of interest in a 25 μl reaction volume under the following PCR conditions: 1 cycle at 94°C for 2 min; (94°C 30 sec; 60°C 30 min; 72°C 2 min) ×18 cycles; 1 cycle of 72°C 7 min. 
PCR products (1 μl) were used as the templates for a second PCR reaction utilizing the primers with the addition of Illumina adaptors in a 50 μl volume under the same PCR conditions. The PCR-amplified library was purified and sequenced with a 100 bp read length using Illumina HiSeq 2000 (S6 Table).
4C-Seq sequencing reads were de-multiplexed using the bait primers, i.e. by removing the sequence upstream of the HindIII restriction site (AAGCTT) and downstream of the DpnII restriction site (GATC). The reads were then aligned to the mouse genome (mm9) by Bowtie [52]. Self-ligated reads and non-cut reads were removed [53]. Only the reads uniquely mapped to HindIII restriction sites on the cis-chromosome of the bait were kept and assigned to the HindIII restriction fragments defined by two neighboring restriction sites. Peak calling was performed with a custom-designed pipeline generally following FourSig [54]. Previous interactome studies reported that 99% of interactions spanned less than 1 Mb and that inter-chromosomal interactions were difficult to validate [55]. Hence, we only considered intra-chromosomal interactions within 2 Mb of the bait. The highly interacting region (150 kb to the bait, S1C Fig) was masked out during peak calling on other regions. A sliding window of 3 fragments was used to calculate the cutoff based on the comparison between 100 permutations of raw reads and the true data. The distribution of cutoffs under FDR = 0.01 was profiled and the final cutoff was determined as the 95% quantile. For the highly interacting region, this cutoff was multiplied by the ratio of reads between the highly interacting region and other regions. The merged peaks in the highly interacting region and other regions were then considered as the peaks in each sample. We required that the peaks at each time point were consistently called in at least two out of three biological replicates. In total, 49 and 51 peaks were obtained at CT6 and CT18 respectively. To compare highly interacting regions with ChIA-PET, we selected 1000 random regions of the same size and applied the Chi-squared test to evaluate the significance of the difference between overlapped loops in highly interacting regions and in the random regions. The Gene Expression Omnibus (GEO) accession number for the 4C dataset is GSE68830.
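The permutation-based cutoff can be sketched as follows. This is a deliberately simplified reading of the description above, not FourSig itself and not the authors' pipeline: per-fragment read counts are shuffled, the (1 − FDR) quantile of the 3-fragment window sums is taken in each permutation, and the final cutoff is the 95% quantile of those per-permutation values. All function names and the nearest-rank quantile rule are assumptions made for illustration.

```python
import random
from typing import List


def window_sums(counts: List[int], size: int = 3) -> List[int]:
    """Sum of read counts in every sliding window of `size` fragments."""
    return [sum(counts[i:i + size]) for i in range(len(counts) - size + 1)]


def quantile(values: List[float], q: float) -> float:
    """Simple nearest-rank quantile (an illustrative choice)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def permutation_cutoff(counts: List[int], n_perm: int = 100,
                       fdr: float = 0.01, final_q: float = 0.95,
                       window: int = 3, seed: int = 0) -> float:
    """Estimate a window-sum cutoff from shuffled fragment counts."""
    rng = random.Random(seed)
    per_permutation = []
    for _ in range(n_perm):
        shuffled = counts[:]
        rng.shuffle(shuffled)  # destroys the positional clustering of reads
        per_permutation.append(quantile(window_sums(shuffled, window), 1.0 - fdr))
    # Final cutoff: 95% quantile of the per-permutation cutoffs.
    return quantile(per_permutation, final_q)


def call_windows(counts: List[int], cutoff: float, window: int = 3) -> List[int]:
    """Start indices of windows whose read sum exceeds the cutoff."""
    return [i for i, s in enumerate(window_sums(counts, window)) if s > cutoff]
```

In this sketch a window is reported only when its read sum exceeds what shuffled data produce, so isolated high-count fragments are less likely to be called than clusters of adjacent high-count fragments.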
Quantitative chromosome conformation capture (3C-qPCR)
3C-qPCR was performed as previously described with modifications [56]. Briefly, 10 μg of cross-linked nuclei were collected and shaken in 1 ml lysis buffer (1% SDS, 0.5% Triton X-100, proteinase inhibitor cocktail in TE buffer) for 1 hr at 37°C, followed by centrifugation for 3 min at 1000 rpm at room temperature. After removing the supernatant, the pellet was re-suspended in 500 μl digestion buffer (1% Triton X-100, 1× RE buffer, PI, 20 μl Quickcut HindIII in H2O) and digested overnight at 37°C with shaking. The reaction was terminated by adding SDS to a final concentration of 1.5% and incubating at 65°C for 30 min. SDS and RE buffer were removed by centrifugation and the pellet was re-suspended for the subsequent ligation. Reverse crosslinking was performed in the presence of proteinase K at 60°C overnight, followed by RNase A treatment at 37°C for 1 hr. The genomic DNA was extracted with phenol-chloroform. All 3C primers were designed with Primer Premier 6 (S5 Table).
ChIP-Seq data analysis
The ChIP-Seq data of CTCF, Rad21, Stag1, Stag2, and Gabpa in mouse liver were downloaded from the ArrayExpress database (accession: E-MTAB-941) [22]. The ChIP-Seq data of Bmal1 at CT8 in mouse liver were downloaded from the Gene Expression Omnibus (GEO) database (accession: GSE39860) [6]. The ChIP-Seq data of Nr1d1 at CT10 in mouse liver were downloaded from GEO (accession: GSE26345) [57]. The ChIP-Seq data of CTCF and Smc1 in MEFs were downloaded from GEO (accession: GSE22557) [19]. Rad21, Stag1, Stag2, and Smc1 are subunits of cohesin. Gabpa is a non-oscillating transcription factor in mouse liver chosen as a negative control. It is known that Bmal1 and Nr1d1 rhythmically bind to the genome and that their binding peaks are around CT6 and CT10, respectively.
To ensure that different datasets are directly comparable, all these ChIP-Seq data were analyzed with the same pipeline, described below. Raw reads in FASTQ files were mapped to the mouse genome (mm9 assembly) by Bowtie [52]. Only reads uniquely mapped with no more than two mismatches were considered valid. Peak calling was implemented by MACS with default parameters and a cutoff of p < 10^−5 [58]. The signal files generated by MACS were normalized to per million total reads. Broad peaks containing multiple summits were split by PeakSplitter [59] to determine the peak regions accurately, requiring a signal larger than 1 read per million. Peaks generated by PeakSplitter were considered as the binding sites and the centers of the peaks were considered as the binding centers. The binding sites of Smc1 were taken to represent the binding sites of cohesin in MEFs. The binding sites of cohesin in liver were defined as the union of the binding regions of Rad21, Stag1, and Stag2. Consequently, we obtained 10,948, 28,883, 23,662, 50,683, and 74,589 binding sites for Bmal1, Nr1d1, Gabpa, CTCF, and cohesin in mouse liver respectively. In MEFs, we obtained 5,738 and 8,756 binding sites for CTCF and cohesin respectively. The cohesin binding sites that overlap with CTCF binding sites in liver were defined as cohesin-CTCF sites (41,690) and the cohesin binding sites not overlapping with CTCF binding sites were defined as cohesin-non-CTCF sites (32,899).
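The split of cohesin binding sites into the two classes amounts to an interval-overlap test. The sketch below is illustrative only: it assumes sites are given as (chromosome, start, end) tuples with half-open coordinates and that any overlap of at least 1 bp counts, neither of which is stated in the text. The quadratic loop is kept for readability; for genome-scale peak lists an interval tool such as BEDTools would be more practical.

```python
from typing import List, Tuple

Site = Tuple[str, int, int]  # (chromosome, start, end), half-open (assumed)


def overlaps(a: Site, b: Site) -> bool:
    """True if two sites share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_cohesin_sites(cohesin_sites: List[Site],
                           ctcf_sites: List[Site]) -> Tuple[List[Site], List[Site]]:
    """Split cohesin sites into cohesin-CTCF and cohesin-non-CTCF sites."""
    cohesin_ctcf, cohesin_non_ctcf = [], []
    for site in cohesin_sites:
        if any(overlaps(site, ctcf) for ctcf in ctcf_sites):
            cohesin_ctcf.append(site)
        else:
            cohesin_non_ctcf.append(site)
    return cohesin_ctcf, cohesin_non_ctcf
```

By construction the two classes partition the cohesin sites, consistent with the reported counts (41,690 + 32,899 = 74,589).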
Circadian rhythmicity of transcription
GRO-Seq (accession: GSE59486) [9] and RNA-Seq (accession: GSE39860) [6] data in mouse liver sampled every 3 or 4 hours over 1 day or 2 days were downloaded from GEO to obtain the genome-wide circadian gene expression. For each DNA binding factor, including Bmal1, Gabpa, CTCF, and cohesin, the regions 20 kb upstream and 20 kb downstream of the binding centers were extracted. These regions were further divided into 2 kb bins as the basic unit for analyzing circadian rhythmicity of transcription across the genome. A 2 kb bin was considered valid if it contained at least one read at more than 7 (GRO-Seq) or 10 (RNA-Seq) time points. To exclude binding sites in regions without any transcript, a binding site was considered for downstream analysis only if there was at least one valid bin in its proximity, i.e. within 20 kb upstream or downstream. BEDTools [60] was used to calculate the normalized read coverage in these bins at each time point. JTK_CYCLE [61] was applied to detect circadian oscillation. We defined the minus logarithm of the Bonferroni-adjusted p value of JTK_CYCLE, i.e. -log2(p), as the circadian index to measure circadian rhythmicity. To generate a meta-site for each binding factor, we computed the mean circadian index in each bin in the proximity of the binding sites. The mock meta-site was obtained from 100,000 randomly selected sites of 40 kb in length over the whole genome.
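The binning and the circadian index can be sketched as below. The code assumes that the adjusted p value has already been produced by JTK_CYCLE and is passed in as a number; the function names and the handling of coordinates are illustrative and not taken from the authors' implementation.

```python
import math
from typing import List, Tuple


def flanking_bins(center: int, flank: int = 20_000,
                  bin_size: int = 2_000) -> List[Tuple[int, int]]:
    """2 kb bins covering 20 kb upstream and downstream of a binding center."""
    return [(start, start + bin_size)
            for start in range(center - flank, center + flank, bin_size)]


def is_valid_bin(reads_per_timepoint: List[int], min_timepoints: int) -> bool:
    """A bin is valid if it has at least one read at MORE than
    `min_timepoints` time points (7 for GRO-Seq, 10 for RNA-Seq)."""
    covered = sum(1 for reads in reads_per_timepoint if reads >= 1)
    return covered > min_timepoints


def circadian_index(adjusted_p: float) -> float:
    """Circadian index: -log2 of the adjusted p value from JTK_CYCLE."""
    if not 0.0 < adjusted_p <= 1.0:
        raise ValueError("adjusted p value must lie in (0, 1]")
    return -math.log2(adjusted_p)


def meta_site_profile(index_per_site: List[List[float]]) -> List[float]:
    """Mean circadian index per bin position across binding sites
    (every inner list holds the indices of one site's bins, in order)."""
    n_sites = len(index_per_site)
    return [sum(column) / n_sites for column in zip(*index_per_site)]
```

Each binding center yields 20 bins in this scheme, and the meta-site is simply the column-wise mean over all sites that passed the valid-bin filter.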
Phase analysis of COGs from microarray data
The circadian time-series microarray data in mouse liver sampled every 1 hour for 48 hours were downloaded from GEO (accession: GSE11923) [5] to analyze the phases of COGs. We chose this time-series data for phase analysis because of its high temporal resolution. The raw data in CEL files were normalized by robust multi-array average (RMA). JTK_CYCLE was performed to obtain circadian phases at the probeset level on the microarray. R package mouse4302.db was used to annotate the gene symbols of 45,101 probesets. If one gene corresponds to multiple probesets, we only kept the one with the minimum Benjamini and Hochberg (BH) q value from JTK_CYCLE. 3,018 COGs were selected with the threshold of BH q value < 0.01. The genomic locations of these genes were obtained from UCSC genome (mm9 assembly).
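The probeset-to-gene reduction described above can be written compactly. In this sketch each probeset is a dictionary with hypothetical keys `gene`, `bh_q`, and `phase`; the key names, and the decision to skip probesets without a gene symbol, are assumptions for illustration.

```python
from typing import Dict, List, Optional


def select_cogs(probesets: List[dict], q_cutoff: float = 0.01) -> Dict[str, dict]:
    """Keep one probeset per gene (minimum BH q value), then apply the cutoff."""
    best: Dict[str, dict] = {}
    for probe in probesets:
        gene: Optional[str] = probe["gene"]
        if gene is None:
            continue  # assumption: unannotated probesets are ignored
        if gene not in best or probe["bh_q"] < best[gene]["bh_q"]:
            best[gene] = probe
    # Circadian oscillating genes (COGs): BH q value below the threshold.
    return {gene: probe for gene, probe in best.items()
            if probe["bh_q"] < q_cutoff}
```

The phase reported for each COG is then the phase of its retained probeset.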
To examine whether the neighboring COGs tend to have similar phases, we scanned the whole genome for COGs with neighboring double windows consisting of the upstream 20 kb and downstream 20 kb windows (S3C Fig). The phase differences were computed between two COGs situated in each of the double windows. If multiple COGs were found in one window, only the COG closest to the other window was retained. Next we increased the distance of two windows apart to 10, 20, 30, 40, and 50 kb and re-calculated the phase differences of COGs in the double windows. For a random genomic background, a pair of two 20 kb windows were randomly selected on the genome and searched for COGs. The phase differences were calculated for 1,000 such random pairs of windows. Compared to the strategy of just considering contiguous genes [62], our fixed-size window approach eliminates the distance factor between neighboring genes. Mann-Whitney U test was applied to detect the significance of difference in the distributions of phase difference between double windows and randomly chosen windows.
To obtain neighboring COGs separated by the binding sites of Bmal1, Nr1d1, Gabpa, cohesin-non-CTCF, and cohesin-CTCF, we searched for the transcription start sites of COGs within 20 kb upstream and 20 kb downstream of the protein binding centers, provided that the whole transcripts did not overlap with the binding centers. We selected the binding sites flanked by COGs on both sides and calculated the phase difference between the COGs on opposite sides of each binding site. The Mann-Whitney U test was applied to assess whether the distribution of phase differences across the binding sites of each factor differed significantly from that of the genomic background.
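A minimal sketch of the double-window comparison follows. Two points are assumptions rather than statements from the text: the phase difference is taken on a 24 hr circle (so that phases 23 and 1 differ by 2 hr), and each COG is represented only by its transcription start site and phase on a single chromosome.

```python
from typing import List, Optional, Tuple

Cog = Tuple[int, float]  # (TSS position in bp, phase in hours)


def phase_difference(phase_a: float, phase_b: float,
                     period: float = 24.0) -> float:
    """Shortest distance between two phases on a circle of length `period`."""
    diff = abs(phase_a - phase_b) % period
    return min(diff, period - diff)


def double_window_phase_difference(cogs: List[Cog], left_start: int,
                                   window: int = 20_000,
                                   gap: int = 0) -> Optional[float]:
    """Phase difference between the COGs in two windows separated by `gap`.

    If a window holds several COGs, the one closest to the other window
    is used. Returns None when either window is empty.
    """
    left_end = left_start + window
    right_start = left_end + gap
    right_end = right_start + window
    left = [c for c in cogs if left_start <= c[0] < left_end]
    right = [c for c in cogs if right_start <= c[0] < right_end]
    if not left or not right:
        return None
    nearest_left = max(left, key=lambda c: c[0])    # closest to right window
    nearest_right = min(right, key=lambda c: c[0])  # closest to left window
    return phase_difference(nearest_left[1], nearest_right[1])
```

Setting `gap` to 10,000 through 50,000 bp reproduces the scans with windows placed further apart; for the binding-site analysis, the boundary between the two windows would be placed at the binding center.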
Calculation of phase variance
The phase variance is calculated based on a method used to measure the dispersion of directional data [63]. In brief, the phase $p_i$ of COG $i$ is represented in polar coordinates as a vector of unit length, $p_i = (\cos\theta_i, \sin\theta_i)$, $i = 1,\dots,n$. The mean of phases $p_0$ is defined as the direction of the vector resulting from the vector summation $p_0 = \sum_{i=1}^{n} p_i = (C, S)$, where $C = \sum_{i=1}^{n} \cos\theta_i$ and $S = \sum_{i=1}^{n} \sin\theta_i$. The dispersion of phases is measured by the length of this vector, $|p_0| = \sqrt{C^2 + S^2}$. Hence, the phase variance is defined as $1 - |p_0|/n$ after normalization by the sample size $n$. R package circular was used to calculate the phase variance.
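The definition 1 − |p0|/n can be checked numerically with a short sketch. It assumes that phases are given in hours and mapped to angles on a 24 hr circle; the analysis itself used the R package circular, so this Python function is only an illustration of the formula.

```python
import math
from typing import Sequence


def phase_variance(phases_hours: Sequence[float], period: float = 24.0) -> float:
    """Circular variance 1 - |p0|/n of a set of phases.

    Each phase is mapped to a unit vector (cos(theta), sin(theta)) with
    theta = 2*pi*phase/period (the mapping from hours is an assumption).
    """
    n = len(phases_hours)
    if n == 0:
        raise ValueError("at least one phase is required")
    angles = [2.0 * math.pi * phase / period for phase in phases_hours]
    c = sum(math.cos(a) for a in angles)  # x component of the summed vector
    s = sum(math.sin(a) for a in angles)  # y component of the summed vector
    return 1.0 - math.hypot(c, s) / n
```

Identical phases give a variance of 0, whereas two phases 12 hr apart cancel out and give a variance of 1, which is the behaviour expected of a dispersion measure for directional data.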
We collected 23,724 intra-chromatin interactions from cohesin ChIA-PET data in mouse embryonic stem cells [29]. The invariant domains in mouse liver were inferred if the two anchors of a cohesin loop both overlapped with cohesin-CTCF binding sites in mouse liver. As a result, we obtained 16,837 invariant cohesin-CTCF domains. To explore the relationship between phase variance and window size, we scanned the whole genome with windows of different sizes, $5 \times 4^{i}$ kb, $i = 1,2,\dots,5$, to extract COGs and calculate the phase variance (S3D Fig). The Pearson’s correlation coefficient (PCC) was calculated between the phase variance and the log2-transformed window size. The p value for testing the null hypothesis (PCC = 0) was computed based on Pearson’s product moment correlation coefficient. To reduce the size effect in the comparison of phase variances between cohesin-CTCF domains and background, we classified cohesin-CTCF domains into small, medium, and large categories with sizes of $[10 \times 4^{i}, 10 \times 4^{i+1}]$ kb ($i = 1,2,3$) respectively. The genomic background for each category was generated by scanning across the genome with a window of size $5 \times 4^{i+1}$ kb ($i = 1,2,3$).
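As a worked example of the window arithmetic, the sketch below reads the window sizes as 5×4^i kb and the category bounds as 10×4^i to 10×4^(i+1) kb, with 4 raised to the power i. That reading of the exponents, the function names, and the half-open treatment of the shared category boundaries (160 kb and 640 kb) are assumptions.

```python
from typing import List, Optional

DOMAIN_CATEGORIES = ("small", "medium", "large")  # i = 1, 2, 3


def scan_window_sizes_kb() -> List[int]:
    """Window sizes 5 * 4**i kb for i = 1..5."""
    return [5 * 4 ** i for i in range(1, 6)]


def domain_category(size_kb: float) -> Optional[str]:
    """Category of a cohesin-CTCF domain, or None if outside all ranges."""
    for i, label in zip((1, 2, 3), DOMAIN_CATEGORIES):
        if 10 * 4 ** i <= size_kb < 10 * 4 ** (i + 1):
            return label
    return None


def background_window_kb(label: str) -> int:
    """Background scan window 5 * 4**(i + 1) kb matched to a category."""
    i = DOMAIN_CATEGORIES.index(label) + 1
    return 5 * 4 ** (i + 1)
```

Under this reading the scan windows are 20, 80, 320, 1,280, and 5,120 kb; the categories span 40 to 160, 160 to 640, and 640 to 2,560 kb; and the matched background windows of 80, 320, and 1,280 kb fall inside their respective categories.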
A model of cohesin/CTCF dependent circadian gene regulation
We first defined a background model only considering the circadian regulation from nearby Bmal1 binding sites. In the background model, the regulatory potential Bi of Bmal1 on gene i is given by , where j is Bmal1 binding site located within 2 Mb to gene i, Sj is the weight representing the signal of Bmal1 binding site j in ChIP-Seq data, and Dij is the distance between gene i and Bmal1 binding site j. For cohesin/CTCF dependent model, the effects of cohesin-non-CTCF and cohesin-CTCF sites were multiplied upon the background model. For a given gene or Bmal1 site, we searched for the nearby cohesin-non-CTCF site within 5 kb that may facilitate gene regulation. We assigned a weight larger than 1 to the gene or Bmal1 binding site to increase the circadian regulatory potential. Between each pair of gene and Bmal1 binding site, we counted the number of cohesin-CTCF sites in between and assigned a weight less than 1 to reduce the circadian regulatory potential of Bmal1 on that gene. Taken together, the regulatory potential Pi of Bmal1 on gene i is given by , where NDi and NDj are the distances between the nearest cohesin-non-CTCF sites to gene i or Bmal1 site j respectively, SCi and SCj are the weights representing their signals on cohesin ChIP-Seq data, and mij is the number of cohesin-CTCF sites between gene i and Bmal1 j. If there is no cohesin-non-CTCF within 5 kb of gene i or Bmal1 site j, or was assigned to 0. The weights S and SC are defined by , where r is the rank of ChIP signal among all Bmal1 or cohesin binding sites respectively. The parameters λ1, λ2, λ3 are set to be 2000000/4, 5000/4, and (total number of peaks in ChIP-Seq)/4 respectively as suggested by an empirical model of gene regulation [64]. To render the circadian regulatory potentials directly comparable between two models, we finally converted them to their respective ranks in the models as Rank(Pi)/n where n is the total number of genes considered.
Smc3-/- mouse embryonic fibroblast cells
Smc3-flox/flox MEF (mouse embryonic fibroblast) cells were originally derived from the European Conditional Mouse Mutagenesis program [65] (http://www.informatics.jax.org/allele/MGI:4434007). The Cre/GFP adenovirus and GFP adenovirus (10^10 pfu/ml) were purchased from Hanbio Biotechnology, Shanghai. MEF cells were cultured with 10% FBS in DMEM (Life Technologies). To avoid the loss of viability in Smc3-/- cells when they enter mitosis, we infected the cells at the G0/G1 stage of the cell cycle. The medium was changed two days after the cells reached complete confluence. 10^9 pfu of GFP and Cre/GFP adenovirus were used in an 8-hr treatment for wild type and Smc3-/- MEF cells respectively. To allow the cells to recover from viral infection, we changed the medium to serum-free DMEM and kept the cells at high confluence for 6 days. MEF cells were then synchronized with dexamethasone (Sigma) at a final concentration of 100 nM for 1 hr. The cells were rinsed with PBS and cultured in serum-free DMEM. Wild type and Smc3-/- MEF cells were collected at 20, 24, 28, 32, 36, and 40 hr after synchronization. Total RNA was extracted using Trizol reagent and reverse-transcribed into cDNA by SuperScript II RT (Life Technologies). RNA-Seq libraries for the 20 hr and 32 hr samples were prepared using the Illumina TruSeq RNA Sample Prep Kit V2 and were subjected to deep sequencing with 1×100 bp reads on a HiSeq 2000 at the CAS-MPG Partner Institute for Computational Biology Omics Core, Shanghai, China (S6 Table). RNA-Seq reads were mapped to the mouse reference genome (mm9 assembly) by Tophat [66]. HTSeq was used to count the number of uniquely mapped reads located on the exons of genes [67]. Only genes with at least one read in all samples were kept for downstream analysis. Treating the 20 hr and 32 hr samples as biological replicates, we applied DESeq to select differentially expressed genes between cohesin knockout and control cells with log2 fold change > 0.8 [68]. 
The Gene Expression Omnibus (GEO) accession number for RNA-Seq dataset is GSE68831.
Bmal1 ChIP in MEF cells was performed following the protocol by Shimomura et al. [69] with modifications. Briefly, 10^7 cells were washed with PBS and cross-linked with 1% formaldehyde for 10 min on a rocker at room temperature. The cross-linking was quenched with 2.5 M glycine at a final concentration of 125 mM. The nuclei were extracted at 4°C from the homogenate using lysis buffers containing protease inhibitors [50 mM Hepes-KOH, pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100], [10 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA], and [10 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% Na-Deoxycholate, 0.5% N-lauroylsarcosine]. DNA was fragmented by sonication into 150–300 bp fragments at 4°C. 50 μl of DNA fragments were stored at 4°C as the input DNA. The rest of the DNA fragments were incubated on a rocker at 4°C for 6 hr with 1:1 ChIP buffer [20% Triton, NaDOC, NaCl, TE, inhibitor] and 4 μl Bmal1 antibody (Santa Cruz: sc-8550). Then 15 μl of protein A/G agarose beads were added and incubated on a rocker at 4°C overnight. Co-immunoprecipitated DNA was washed with 1 ml of each of the following buffers: [5% Triton, 1% SDS, 1% NaDOC, 93% TE] twice, [5% Triton, 1% SDS, 1% NaDOC, 6% NaCl, 87% TE] twice, [10% LiCl, 5% NP40, 5% Na-DOC, 80% TE] twice, [10% Triton, 90% TE], and TE. The DNA was then reverse cross-linked at 50°C for 2 hr with 100 μl TE, 3 μl 10% SDS, and 5 μl proteinase K. The QIAquick PCR Purification Kit (QIAGEN) was used to purify ChIP DNA. Input and ChIP DNA libraries were prepared using the Illumina TruSeq ChIP Sample Prep Kit and were subjected to deep sequencing with 1×100 bp reads on a HiSeq 2000 at the CAS-MPG Partner Institute for Computational Biology Omics Core, Shanghai, China (S6 Table). ChIP-Seq data analysis was performed with the same pipeline described above. The Gene Expression Omnibus (GEO) accession number for the Bmal1 ChIP-Seq dataset is GSE77162.
Genome editing with the CRISPR-Cas9 system
The CRISPR-Cas9 method [70] was used to delete the cohesin binding site near the enhancer of Phldb2 in Hepa1-6 cells. The gRNA target sequences (GTCTTTCACGTGGGACGGAT and GAGACCTCAAGGACATGTGC) were designed by E-CRISP [71]. The homologous arms for the donor plasmids are (chr16: 45967525–45967702) and (chr16: 45967935–45968129). The regulatory module (hPGK promoter/PuroR) was amplified from the commercially available expression vector pLKO.1. The two homologous arms and PGK/PuroR were assembled into the pGEM-T Easy vector (Promega). Hepa1-6 cells were cultured with 10% FBS in DMEM (Life Technologies) and co-transfected with two gRNA/Cas9 vectors and linearized donor DNA. The cells were then selected with 3 μg/ml puromycin (Merck Millipore) for 2 weeks. Gel electrophoresis analysis of the homologous arms, control region, and the regulatory module (PGK-PuroR) in WT and CRISPR-Cas9-treated cells validated the successful deletion of the target DNA region (S6B Fig). Primers used in PCR and RT-PCR are listed in S5 Table.
The whole-genome scans in this study were implemented in Java language (JDK 6). All statistical analyses were performed in R 2.11.
Supporting Information
Acknowledgments
We thank Nikolai Petkau and Dr. Gabriela Whelan in Prof. Gregor Eichele’s lab (Max Planck Institute for Biophysical Chemistry) for providing Bmal1 antibody and Smc3-flox/flox MEF cells, Dr. Gang Wei and Dr. Zhen Shao (PICB) for critical reading of the manuscript, and Xuelong Wang (PICB) for the help in ChIP assay. We are grateful to the experimental support of the Uli Schwarz public laboratory platform in PICB.
Data Availability
All sequencing data are available from the NCBI GEO database (accession number GSE68832).
Funding Statement
This work was supported by Chinese Academy of Sciences Strategic Priority Research Program Grant XDB02060006 (JY) and Natural Science Foundation of China Grant 31571209 (JY), National Basic Research Program of China grant 2013CB966802 (ZZ), and National Natural Science Foundation of China grant 31370762 (ZZ). JY is an Independent Research Group leader supported by both Chinese Academy of Sciences and German Max-Planck Society. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Takahashi JS, Hong H-K, Ko CH, McDearmon EL. The genetics of mammalian circadian order and disorder: implications for physiology and disease. Nat Rev Genet. 2008;9: 764–75. doi:10.1038/nrg2430
- 2. Mohawk JA, Green CB, Takahashi JS. Central and peripheral circadian clocks in mammals. Annu Rev Neurosci. 2012;35: 445–62. doi:10.1146/annurev-neuro-060909-153128
- 3. Yan J, Wang H, Liu Y, Shao C. Analysis of gene regulatory networks in the mammalian circadian rhythm. PLoS Comput Biol. 2008;4: e1000193. doi:10.1371/journal.pcbi.1000193
- 4. Ueda HR, Hayashi S, Chen W, Sano M, Machida M, Shigeyoshi Y, et al. System-level identification of transcriptional circuits underlying mammalian circadian clocks. Nat Genet. 2005;37: 187–92.
- 5. Hughes ME, DiTacchio L, Hayes KR, Vollmers C, Pulivarthy S, Baggs JE, et al. Harmonics of circadian gene transcription in mammals. PLoS Genet. 2009;5: e1000442. doi:10.1371/journal.pgen.1000442
- 6. Koike N, Yoo S-H, Huang H-C, Kumar V, Lee C, Kim T-K, et al. Transcriptional Architecture and Chromatin Landscape of the Core Circadian Clock in Mammals. Science. 2012;338: 349–354.
- 7. Rey G, Cesbron F, Rougemont J, Reinke H, Brunner M, Naef F. Genome-wide and phase-specific DNA-binding rhythms of BMAL1 control circadian output functions in mouse liver. PLoS Biol. 2011;9: e1000595. doi:10.1371/journal.pbio.1000595
- 8. Vollmers C, Schmitz RJ, Nathanson J, Yeo G, Ecker JR, Panda S. Circadian Oscillations of Protein-Coding and Regulatory RNAs in a Highly Dynamic Mammalian Liver Epigenome. Cell Metab. Elsevier Inc.; 2012;16: 833–845. doi:10.1016/j.cmet.2012.11.004
- 9. Fang B, Everett LJ, Jager J, Briggs E, Armour SM, Feng D, et al. Circadian Enhancers Coordinate Multiple Phases of Rhythmic Gene Transcription In Vivo. Cell. Elsevier Inc.; 2014;159: 1140–1152. doi:10.1016/j.cell.2014.10.022
- 10. Menet JS, Rodriguez J, Abruzzi KC, Rosbash M. Nascent-Seq reveals novel features of mouse circadian transcriptional regulation. Elife. 2012;1: e00011. doi:10.7554/eLife.00011
- 11. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326: 289–93.
- 12. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. Nature Publishing Group; 2012;485: 376–380. doi:10.1038/nature11082
- 13. Botta M, Haider S, Leung IXY, Lio P, Mozziconacci J. Intra- and inter-chromosomal interactions correlate with CTCF binding genome wide. Mol Syst Biol. Nature Publishing Group; 2010;6: 426. doi:10.1038/msb.2010.79
- 14. Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, et al. Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007;128: 1231–45.
- 15. Nasmyth K, Haering CH. Cohesin: its roles and mechanisms. Annu Rev Genet. 2009;43: 525–58. doi:10.1146/annurev-genet-102108-134233
- 16. Huis in ‘t Veld PJ, Herzog F, Ladurner R, Davidson IF, Piric S, Kreidl E, et al. Characterization of a DNA exit gate in the human cohesin ring. Science. 2014;346: 968–972.
- 17. Gligoris TG, Scheinost JC, Burmann F, Petela N, Chan K-L, Uluocak P, et al. Closing the cohesin ring: Structure and function of its Smc3-kleisin interface. Science. 2014;346: 963–967.
- 18. Wendt KS, Yoshida K, Itoh T, Bando M, Koch B, Schirghuber E, et al. Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature. 2008;451: 796–801. doi:10.1038/nature06634
- 19. Kagey MH, Newman JJ, Bilodeau S, Zhan Y, Orlando DA, van Berkum NL, et al. Mediator and cohesin connect gene expression and chromatin architecture. Nature. Nature Publishing Group; 2010;467: 430–5. doi:10.1038/nature09380
- 20. Peric-Hupkes D, van Steensel B. Linking cohesin to gene regulation. Cell. 2008;132: 925–8. doi:10.1016/j.cell.2008.03.001
- 21. Parelho V, Hadjur S, Spivakov M, Leleu M, Sauer S, Gregson HC, et al. Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell. 2008;132: 422–33. doi:10.1016/j.cell.2008.01.011
- 22. Faure A, Schmidt D, Watt S, Schwalie P, Wilson M, Xu H, et al. Cohesin regulates tissue-specific expression by stabilising highly occupied cis-regulatory modules. Genome Res. 2012;22: 2163–2175. doi:10.1101/gr.136507.111
- 23. Schmidt D, Schwalie PC, Ross-Innes CS, Hurtado A, Brown GD, Carroll JS, et al. A CTCF-independent role for cohesin in tissue-specific transcription. Genome Res. 2010;20: 578–88. doi:10.1101/gr.100479.109
- 24. Misteli T. Beyond the sequence: cellular organization of genome function. Cell. 2007;128: 787–800.
- 25. Aguilar-Arnal L, Hakim O, Patel VR, Baldi P, Hager GL, Sassone-Corsi P. Cycles in spatial and temporal chromosomal organization driven by the circadian clock. Nat Struct Mol Biol. Nature Publishing Group; 2013;20: 1206–13. doi:10.1038/nsmb.2667
- 26. Menet JS, Pescatore S, Rosbash M. CLOCK:BMAL1 is a pioneer-like transcription factor. Genes Dev. 2014;28: 8–13. doi:10.1101/gad.228536.113
- 27. Whyte W, Orlando D, Hnisz D, Abraham B. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. Elsevier Inc.; 2013;153: 307–319.
- 28. Pott S, Lieb JD. What are super-enhancers? Nat Genet. 2014;47: 8–12.
- 29. Dowen JM, Fan ZP, Hnisz D, Ren G, Abraham BJ, Zhang LN, et al. Control of Cell Identity Genes Occurs in Insulated Neighborhoods in Mammalian Chromosomes. Cell. Elsevier Inc.; 2014;159: 374–387. doi:10.1016/j.cell.2014.09.030
- 30. Handoko L, Xu H, Li G, Ngan CY, Chew E, Schnapp M, et al. CTCF-mediated functional chromatin interactome in pluripotent cells. Nat Genet. 2011;43: 630–638. doi:10.1038/ng.857. Available: http://www.ncbi.nlm.nih.gov/pubmed/21685913
- 31. Shen Y, Yue F, McCleary DF, Ye Z, Edsall L, Kuan S, et al. A map of the cis-regulatory sequences in the mouse genome. Nature. Nature Publishing Group; 2012;488: 116–20. doi:10.1038/nature11243
- 32. Phillips-Cremins JE, Sauria MEG, Sanyal A, Gerasimova TI, Lajoie BR, Bell JSK, et al. Architectural Protein Subclasses Shape 3D Organization of Genomes during Lineage Commitment. Cell. Elsevier Inc.; 2013; 1–15.
- 33. Wang H, Zang C, Taing L, Arnett KL, Wong YJ, Pear WS, et al. NOTCH1-RBPJ complexes drive target gene expression through dynamic interactions with superenhancers. Proc Natl Acad Sci U S A. 2014;111: 705–10. doi:10.1073/pnas.1315023111
- 34. Singh VP, Gerton JL. Cohesin and human disease: lessons from mouse models. Curr Opin Cell Biol. 2015;37: 9–17. doi:10.1016/j.ceb.2015.08.003
- 35. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102: 15545–50.
- 36. Kishi M, Kummer TT, Eglen SJ, Sanes JR. LL5beta: a regulator of postsynaptic differentiation identified in a screen for synaptically enriched transcripts at the neuromuscular junction. J Cell Biol. 2003;169: 355–366.
- 37. Lee IH, Sohn M, Lim HJ, Yoon S, Oh H, Shin S, et al. Ahnak functions as a tumor suppressor via modulation of TGFbeta/Smad signaling pathway. Oncogene. 2014;33: 4675–4684. doi:10.1038/onc.2014.69
- 38. Sexton T, Cavalli G. The Role of Chromosome Domains in Shaping the Functional Genome. Cell. Elsevier Inc.; 2015;160: 1049–1059. doi:10.1016/j.cell.2015.02.040
- 39. Naumova N, Imakaev M, Fudenberg G, Zhan Y, Lajoie BR, Mirny LA, et al. Organization of the Mitotic Chromosome. Science. 2013;342: 948–53.
- 40. Kim T-K, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. Nature Publishing Group; 2010;465: 182–7.
- 41. Zhao H, Sifakis EG, Sumida N, Millán-Ariño L, Scholz BA, Svensson JP, et al. PARP1- and CTCF-Mediated Interactions between Active and Repressed Chromatin at the Lamina Promote Oscillating Transcription. Mol Cell. 2015; 1–14.
- 42. Liu Y, Hu W, Murakawa Y, Yin J, Wang G, Landthaler M, et al. Cold-induced RNA-binding proteins regulate circadian gene expression by controlling alternative polyadenylation. Sci Rep. 2013;3: 2054. doi:10.1038/srep02054
- 43. Valekunja UK, Edgar RS, Oklejewicz M, van der Horst GTJ, O’Neill JS, Tamanini F, et al. Histone methyltransferase MLL3 contributes to genome-scale circadian transcription. Proc Natl Acad Sci U S A. 2013;110: 1554–9. doi:10.1073/pnas.1214168110
- 44. Fustin J-M, Doi M, Yamaguchi Y, Hida H, Nishimura S, Yoshida M, et al. RNA-methylation-dependent RNA processing controls the speed of the circadian clock. Cell. Elsevier Inc.; 2013;155: 793–806. doi:10.1016/j.cell.2013.10.026
- 45. Yagita K, Horie K, Koinuma S, Nakamura W, Yamanaka I, Urasaki A, et al. Development of the circadian oscillator during differentiation of mouse embryonic stem cells in vitro. Proc Natl Acad Sci U S A. 2010;107: 3846–51. doi:10.1073/pnas.0913256107
- 46. Ong C-T, Corces VG. CTCF: an architectural protein bridging genome topology and function. Nat Rev Genet. Nature Publishing Group; 2014;15: 234–46. doi:10.1038/nrg3663
- 47. Rivera CM, Ren B. Mapping Human Epigenomes. Cell. Elsevier Inc.; 2013;155: 39–55. doi:10.1016/j.cell.2013.09.011
- 48. Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507: 455–461. doi:10.1038/nature12787
- 49. Nord AS, Blow MJ, Attanasio C, Akiyama JA, Holt A, Hosseini R, et al. Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. Cell. Elsevier; 2013;155: 1521–31. doi:10.1016/j.cell.2013.11.033
- 50. Zhao Z, Tavoosidana G, Sjölinder M, Göndör A, Mariano P, Wang S, et al. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat Genet. 2006;38: 1341–7.
- 51. Gheldof N, Leleu M, Noordermeer D, Rougemont J, Reymond A. Detecting Long-Range Chromatin Interactions Using the Chromosome Conformation Capture Sequencing (4C-seq) Method. In: Deplancke B, Gheldof N, editors. Gene regulatory networks. Totowa, NJ: Humana Press; 2012. pp. 211–225.
- 52. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10: R25. doi:10.1186/gb-2009-10-3-r25
- 53.van de Werken HJG, Landan G, Holwerda SJB, Hoichman M, Klous P, Chachik R, et al. Robust 4C-seq data analysis to screen for regulatory DNA interactions. Nat Methods. 2012;9: 969–972. 10.1038/nmeth.2173
- 54.Williams RL, Starmer J, Mugford JW, Calabrese JM, Mieczkowski P, Yee D, et al. fourSig: a method for determining chromosomal interactions in 4C-Seq data. Nucleic Acids Res. 2014;42: e68. 10.1093/nar/gku156
- 55.Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed Y Bin, et al. An oestrogen-receptor-alpha-bound human chromatin interactome. Nature. Nature Publishing Group; 2009;462: 58–64. 10.1038/nature08497
- 56.Hagège H, Klous P, Braem C, Splinter E, Dekker J, Cathala G, et al. Quantitative analysis of chromosome conformation capture assays (3C-qPCR). Nat Protoc. 2007;2: 1722–33.
- 57.Feng D, Liu T, Sun Z, Bugge A, Mullican SE, Alenghat T, et al. A circadian rhythm orchestrated by histone deacetylase 3 controls hepatic lipid metabolism. Science. 2011;331: 1315–9.
- 58.Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9: R137. 10.1186/gb-2008-9-9-r137
- 59.Salmon-Divon M, Dvinge H, Tammoja K, Bertone P. PeakAnalyzer: genome-wide annotation of chromatin binding and modification loci. BMC Bioinformatics. 2010;11: 415. 10.1186/1471-2105-11-415. Available: http://www.biomedcentral.com/1471-2105/11/415
- 60.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26: 841–2. 10.1093/bioinformatics/btq033
- 61.Hughes ME, Hogenesch JB, Kornacker K. JTK_CYCLE: an efficient non-parametric algorithm for detecting rhythmic components in genome-scale datasets. J Biol Rhythms. 2010;25: 372–380.
- 62.Kang B, Li Y, Li Y. Anti-clustering of circadian gene expression in mouse liver genome. 2012 IEEE 6th Int Conf Syst Biol. IEEE; 2012; 273–279.
- 63.Jammalamadaka SR, Sengupta A. Topics in Circular Statistics. World Scientific Publishing Company; 2001.
- 64.Tang Q, Chen Y, Meyer C, Geistlinger T, Lupien M, Wang Q, et al. A comprehensive view of nuclear receptor cancer cistromes. Cancer Res. 2011;71: 6940–7. 10.1158/0008-5472.CAN-11-2091
- 65.White JK, Gerdin A-K, Karp NA, Ryder E, Buljan M, Bussell JN, et al. Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes. Cell. 2013;154: 452–64. 10.1016/j.cell.2013.06.022
- 66.Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25: 1105–11. 10.1093/bioinformatics/btp120
- 67.Anders S, Pyl PT, Huber W. HTSeq—A Python framework to work with high-throughput sequencing data. Bioinformatics. 2014;31: 166–169. 10.1093/bioinformatics/btu638
- 68.Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. BioMed Central Ltd; 2010;11: R106. 10.1186/gb-2010-11-10-r106
- 69.Shimomura K, Kumar V, Koike N, Kim T-K, Chong J, Buhr ED, et al. Usf1, a suppressor of the circadian Clock mutant, reveals the nature of the DNA-binding of the CLOCK:BMAL1 complex in mice. Elife. 2013;2: e00426. 10.7554/eLife.00426
- 70.Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F. Genome engineering using the CRISPR-Cas9 system. Nat Protoc. 2013;8: 2281–308. 10.1038/nprot.2013.143
- 71.Heigwer F, Kerr G, Boutros M. E-CRISP: fast CRISPR target site identification. Nat Methods. Nature Publishing Group; 2014;11: 122–3. 10.1038/nmeth.2812
Associated Data
Data Availability Statement
All sequencing data are available from the NCBI GEO database (accession number GSE68832).