Abstract
DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability1,2. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs)3–6, subTADs7 and loops8 in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase.
Subject terms: Epigenetics, Epigenomics, Chromatin structure, Origin selection
A study shows that the three-dimensional conformation of the human genome influences the positioning of DNA replication initiation zones, highlighting cohesin-mediated loop anchors as essential determinants of their precise location.
Main
The interphase human genome folds into TADs and nested subTADs. TADs were originally defined in first-generation Hi-C and 5C data as megabase (Mb)-scale, self-interacting chromatin segments in which DNA sequences exhibit substantially higher contact frequency within—compared to between—domains3–6. Molecular and computational advances over the past decade have resulted in ultrahigh-resolution genome folding maps with substantially improved signal-to-noise ratios8–11. Such technical advances have enabled the discovery of fine-grained A/B compartments8, nested subTADs within TADs7, punctate dot structures indicative of long-range looping interactions8, and stripes indicative of loop extrusion12–14. In light of the critical importance of dissecting the link between specific higher-order chromatin architectural features and genome function, a leading challenge is to classify subtypes of TADs/subTADs in Hi-C maps by their fine-grained structural features. Clearly defining structural classes of TADs/subTADs can in turn facilitate the careful dissection of each boundary’s molecular composition, organizing principles and unique cause-and-effect relationship across a range of genome functions.
Here we ascertain the functional link between distinct structural classes of TADs/subTADs and DNA replication. Replication initiates from tens of thousands of origins licensed in excess across the human genome in telophase and throughout G1 (refs.1,2). A small proportion of licensed origins subsequently fire in orchestrated temporal waves during S phase2. It is established that origins fire at one or more sites chosen stochastically within ≈40 kb regions (IZs)15–17. Nevertheless, a consensus sequence encoding origin or IZ placement has not been definitively identified in humans. Waves of early and late replication correlate with A and B compartments, respectively, and the temporal transitions from early to late replication can in some cases align with TAD boundaries3,18,19. However, the role of fine-scale genome folding patterns during interphase (such as loops, subTADs and TADs detectable in high-resolution Hi-C data) in the genomic placement of initiated origins following entry into S phase is not known.
We recently developed a high-resolution Repli-seq method to identify the placement of IZs across the genome at 50-kb resolution16. We first compared the genomic locations of IZs replicating across early, early–mid and late S phase to our high-resolution Hi-C data developed in the 4D Nucleome Consortium from H1 human embryonic stem (ES) cells11. We noticed that high-efficiency, early-S-phase IZs colocalize to strongly insulated boundaries demarcated by corner-dot TADs/subTADs on one or both sides (Fig. 1a and Extended Data Figs. 1, 2a and 3). By contrast, low-efficiency IZs that fire late in S phase can colocalize with boundaries between TADs/subTADs devoid of corner-dots (Fig. 1a and Extended Data Figs. 2b and 3). Our qualitative observations suggest that early and late IZs are enriched at genomic locations serving as boundaries of corner-dot and dotless TAD/subTADs, respectively.
To quantify the link between TAD/subTAD boundaries and IZ genomic placement, we identified a total of 23,851 chromatin domains genome-wide in Hi-C data for human ES cells using our graph-theory-based method 3DNetMod20 (Supplementary Methods and Supplementary Table 1). We also applied statistical methods developed by our laboratory and others to identify dot-like structures representative of bona fide looping interactions8,21,22. We identified 16,922 dots genome wide in ensemble Hi-C maps of human ES cells. Such dots represent punctate groups of adjacent pixels with significantly higher contact frequency compared to the surrounding local chromatin domain structure (Fig. 1a, green rectangles, Supplementary Methods and Supplementary Table 2). After co-registration of dots with domains, we identified 8,279 corner-dot TADs/subTADs and 15,572 dotless TADs/subTADs genome wide in human ES cells (Supplementary Table 3). We stratified boundaries into three groups, including those that are structurally demarcated by: adjacent corner-dot TADs/subTADs on both sides (double-dot boundaries, n = 6,318); corner-dot TADs/subTADs on only one side and dotless on the other (single-dot boundaries, n = 2,163); and adjacent dotless TADs/subTADs on both sides (dotless boundaries, n = 1,089) (Supplementary Table 4). By applying a range of parameter stringencies and methods for dot calling, we could modify the proportion of boundaries classified as double-dot, single-dot and dotless, but the colocalization of dot boundaries with IZs was evident regardless of statistical methodology (Supplementary Methods and Extended Data Fig. 4). We combined all double-dot and single-dot boundaries into dot boundaries, as they showed similar IZ localization patterns (Supplementary Table 4).
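The boundary stratification described above can be illustrated with a minimal sketch. It is not the 3DNetMod pipeline or the dot-calling procedure from the Supplementary Methods. It assumes, for illustration only, that domains on one chromosome are already called, sorted, non-nested and annotated with a hypothetical `has_corner_dot` flag. Nested subTADs, which the real data contain, are ignored here.

```python
from collections import Counter

def classify_boundary(left_has_dot, right_has_dot):
    """Label a boundary by how many of its two flanking domains carry a corner dot."""
    n_dots = int(left_has_dot) + int(right_has_dot)
    return {2: "double-dot", 1: "single-dot", 0: "dotless"}[n_dots]

def stratify_boundaries(domains):
    """Classify each boundary between consecutive domains.

    domains: (start, end, has_corner_dot) tuples, sorted by start and
    non-nested (a simplifying assumption, not the paper's data model).
    Returns (boundary_position, label) pairs.
    """
    return [
        (left[1], classify_boundary(left[2], right[2]))
        for left, right in zip(domains, domains[1:])
    ]

# Toy coordinates in arbitrary units; values are invented for illustration.
toy_domains = [(0, 100, True), (100, 250, True), (250, 400, False), (400, 520, False)]
boundaries = stratify_boundaries(toy_domains)
counts = Counter(label for _, label in boundaries)
print(boundaries)
print(counts)
```

In this toy input, each of the three boundary types occurs once. As a consistency check on the reported domain counts, 8,279 corner-dot plus 15,572 dotless TADs/subTADs gives the stated total of 23,851 domains.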
Cohesin is essential for the formation of TADs/subTADs through loop extrusion and stalling against boundaries insulated by the architectural protein CTCF12,13,23–25. We reasoned that the density and orientation of CTCF-binding sites might reveal an architectural protein signature at boundaries linked to placement of active origins that fire in S phase. We observed a substantially higher density of co-bound CTCF + cohesin-binding sites at dot boundaries overlapping early IZs compared to those that do not overlap any IZs (Fig. 1b and Supplementary Tables 5 and 6). We also examined sites that bind only cohesin, as they can earmark CTCF-independent enhancer–promoter interactions7,23, but we did not see a notable difference in the number of sites that bind only cohesin across dot versus dotless TAD/subTAD boundaries (Fig. 1b). Together, our data indicate that boundaries colocalizing with human early-S-phase IZs exhibit enriched occupancy of motifs co-bound by CTCF and cohesin, but not cohesin alone, thus confirming and substantially expanding on observations in previous reports linking cohesin generally to a small subset of replication origins in Drosophila26 and humans27.
Recent reports have uncovered that convergently oriented CTCF motifs anchor long-range looping interactions formed by cohesin-mediated extrusion12,14,23,28,29. We observed that most dot boundaries are marked by two or more CTCF + cohesin-bound motifs arranged in a convergent or divergent orientation (hereafter called complex motif orientation; Fig. 1c), and this molecular signature was further enriched when dot boundaries colocalize with early replicating IZs. By contrast, nearly all dotless boundaries have only one or no CTCF + cohesin-bound motifs (Fig. 1c). Dotless boundaries colocalized with late IZs were most often anchored by one CTCF motif. We therefore establish six boundary classes by stratifying dot (classes 1–3) and dotless (classes 4–6) boundaries into those localized with CTCF + cohesin-bound motifs in a complex orientation (classes 1 and 4), tandem or single-motif orientation (classes 2 and 5), or no bound motifs (classes 3 and 6; Fig. 1d).
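The six-class scheme can be written as a small lookup. The sketch below assumes each boundary is summarized by a dot flag and by the strands of its CTCF + cohesin-bound motifs listed in genomic order. The function names and the strand encoding (`"+"` and `"-"`) are hypothetical choices for illustration. Any boundary with motifs on both strands is treated as complex, because two or more motifs of mixed strand necessarily include an adjacent convergent or divergent pair.

```python
def motif_orientation(strands):
    """Summarize bound CTCF motif strands at a boundary.

    strands: list of "+" or "-" for CTCF + cohesin-bound motifs,
    ordered by genomic position (hypothetical encoding).
    """
    if not strands:
        return "none"
    if len(set(strands)) == 1:
        # One motif, or several motifs all on the same strand.
        return "tandem_or_single"
    # Motifs on both strands: at least one convergent or divergent pair.
    return "complex"

def boundary_class(has_dot, strands):
    """Map a boundary to classes 1-6 as defined in the text."""
    offset = {"complex": 1, "tandem_or_single": 2, "none": 3}[motif_orientation(strands)]
    # Dot boundaries take classes 1-3; dotless boundaries take classes 4-6.
    return offset if has_dot else offset + 3

examples = [
    (True, ["+", "-"]),
    (True, ["+", "+"]),
    (True, []),
    (False, ["-", "+", "+"]),
    (False, ["-"]),
    (False, []),
]
classes = [boundary_class(dot, strands) for dot, strands in examples]
print(classes)
```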
We next formulated a statistical test to quantify IZ enrichment at boundaries compared to the background expectation across autosomes (Supplementary Methods and Supplementary Table 7). Consistent with our qualitative observations, high-efficiency IZs firing in early S phase were significantly enriched at dot boundaries marked by CTCF + cohesin-binding sites in complex orientations compared to a null distribution of random intervals matched by size and A/B compartment distribution (class 1; Fig. 1d,e, Extended Data Fig. 5b–d and Supplementary Methods). By contrast, low-efficiency IZs firing in late S phase were depleted at dot boundaries and significantly enriched at dotless boundaries with tandem + single CTCF + cohesin-bound motifs or no bound motifs (classes 5 and 6; Fig. 1d,e, Extended Data Fig. 5b–d and Supplementary Methods). We note that our null distribution was created with random intervals matched to real IZs by their size and compartment distribution, reinforcing that the enrichment reflects a strong localization at boundaries above the known link between early and late replication and A and B compartments, respectively (Supplementary Methods).
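The logic of testing against a matched null can be sketched with a generic permutation test. This is an illustrative stand-in, not the statistical test defined in the Supplementary Methods. It assumes a single chromosome, half-open intervals, and a hypothetical `regions` dictionary listing the spans of each compartment. Each random interval keeps the size and compartment label of the IZ it replaces, and all coordinates below are invented.

```python
import random

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def count_at_boundaries(intervals, boundaries):
    """Number of intervals that overlap at least one boundary."""
    return sum(
        any(overlaps(s, e, bs, be) for bs, be in boundaries)
        for s, e, _ in intervals
    )

def matched_random(iz, regions, rng):
    """Draw a random interval with the same size and compartment as iz."""
    start, end, comp = iz
    size = end - start
    candidates = [(s, e) for s, e in regions[comp] if e - s >= size]
    s, e = rng.choice(candidates)
    new_start = rng.randint(s, e - size)
    return (new_start, new_start + size, comp)

def enrichment(izs, boundaries, regions, n_perm=1000, seed=0):
    """Return observed count, mean null count and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = count_at_boundaries(izs, boundaries)
    null = [
        count_at_boundaries([matched_random(iz, regions, rng) for iz in izs], boundaries)
        for _ in range(n_perm)
    ]
    p_value = (1 + sum(n >= observed for n in null)) / (n_perm + 1)
    return observed, sum(null) / n_perm, p_value

# Toy data: three 40-kb IZs in compartment A, each sitting on a 10-kb boundary.
regions = {"A": [(0, 1_000_000)], "B": [(1_000_000, 2_000_000)]}
boundaries = [(100_000, 110_000), (500_000, 510_000), (900_000, 910_000)]
izs = [(90_000, 130_000, "A"), (490_000, 530_000, "A"), (890_000, 930_000, "A")]
observed, null_mean, p_value = enrichment(izs, boundaries, regions)
print(observed, null_mean, p_value)
```

Because the random intervals are drawn only from the same compartment as the real IZs, any excess of observed over null overlap cannot be attributed to the compartment preference alone, which is the point made in the paragraph above.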
We sought to independently verify our observed link between IZs and boundaries with an orthogonal technique for assaying replication origin activity. Small nascent strand sequencing (SNS-seq) identifies approximately 10 origins per 100 kb of the genome and enriches for high-efficiency origins localized in early replicating regions30. A previous report using ENCODE (Encyclopedia of DNA Elements) phase I pilot microarray data of 1% of the human genome reported enrichment of the cohesin subunit RAD21 at approximately 300 replication origins27. Here, using genome folding features from high-resolution Hi-C data, we find that SNS-seq data from human ES cells30 exhibits heightened origin enrichment specifically at class 1 dot boundaries (Extended Data Fig. 5e). Thus, through two independent replication mapping techniques, we observe a strong enrichment of high-efficiency, early-S-phase IZs at a subset of genetically encoded corner-dot TAD/subTAD boundaries. The colocalization of IZs with TAD boundaries generally has been further confirmed recently with super-resolution imaging31.
Transcription correlates with origin placement and efficiency15,17,32–35. To ascertain whether transcription at boundaries could explain our results, we stratified dot boundaries with a complex CTCF orientation (class 1), dotless boundaries with a complex CTCF orientation (class 4) and dotless boundaries with no CTCF occupancy (class 6) into those that also had transcribed genes and those that were devoid of genes or had only inactive genes (Extended Data Fig. 6 and Supplementary Table 8). Boundaries with transcribed genes in the absence of the dot features (Extended Data Fig. 6b) or in the absence of CTCF + cohesin (Extended Data Fig. 6c) did not exhibit precise localization of high-efficiency early IZs. These results are consistent with the literature, as a large proportion of active promoters are not sites of efficient replication initiation, suggesting that further distinguishing features encode human origins36. It is also particularly noteworthy that we see enrichment of early IZs at dot boundaries with a complex CTCF motif orientation only when transcribed genes were also present (Extended Data Fig. 6a). Our data suggest that transcription alone is not sufficient to localize high-efficiency early IZs at boundaries. Transcription may cooperate with CTCF and cohesin-based loop extrusion to position high-efficiency IZs replicating in early S phase.
To understand whether cohesin and TAD/subTAD structural integrity are functionally necessary for origin placement in S phase, we examined IZs after global genome folding disruption using wild-type HCT116 cells engineered to degrade the cohesin subunit RAD21 within hours using a degron23. Such a system is uniquely suited to test the role of cohesin-mediated extrusion on IZs decoupled from transcription, as only hours of RAD21 degradation results in genome-wide ablation of nearly all loops with minimal short-term effect on transcription23. We synchronized HCT116 RAD21–mAID cells in mitosis, degraded RAD21 with auxin throughout G1, and then assessed replication initiation across S phase (Extended Data Fig. 7 and Supplementary Methods). We identified the same dot and dotless TADs/subTADs and boundary classes in Hi-C from wild-type HCT116 (untreated HCT116 RAD21–mAID) cells as in human ES cells (Fig. 2a and Supplementary Tables 9–14). Consistent with previous reports23, our observations show that nearly all dot and dotless boundaries were destroyed following short-term cohesin knockdown in HCT116 cells (Fig. 2b,d and Extended Data Fig. 8). Therefore, although the molecular composition of boundaries influences their structural features of insulation strength and corner-dot presence, most are dependent on cohesin.
Previous studies have reported that replication timing domains are not globally altered following genome-wide disruption of cohesin-mediated loops37–39. Analyses in these studies relied on the log ratio of DNA synthesized in the first or second halves of S phase (two-fraction early/late Repli-seq)40, the resolution of which renders it difficult to discern IZs. Moreover, previously published two-fraction Repli-seq signals were often quantile normalized37,39, which obscures the localized disruption in IZ placement and timing shifts at specific TAD/subTAD boundaries. We generated and analysed high-resolution 16-fraction Repli-seq data (Fig. 2c,e and Supplementary Table 15), as well as single-molecule optical replication mapping (ORM) data17 (Fig. 2f), in both wild-type and cohesin-knockdown HCT116 cells (Extended Data Fig. 7 and Supplementary Methods). As in human ES cells, we observed that 16-fraction Repli-seq data exhibit focal enrichment of high-efficiency/early IZs specifically at dot boundaries marked by CTCF + cohesin co-bound motifs in a complex orientation in wild-type HCT116 cells (class 1; Fig. 2c). Enrichment of early IZs occurs only at boundaries that colocalize with cohesin (Extended Data Fig. 9). Moreover, as in human ES cells, low-efficiency, late IZs were enriched at weak dotless boundaries in wild-type HCT116 cells (Fig. 2c). Using single-molecule ORM data, which can directly assess IZ efficiency as the percentage of molecules that initiate within a particular IZ, we detected enriched origin initiation specifically at class 1 boundaries (Fig. 2f). Together, our single-molecule and ensemble replication initiation data indicate that early-S-phase IZs fire at a key subset of genetically encoded dot boundaries.
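The two read-outs contrasted above can be made concrete with a short sketch. The first function computes a two-fraction log2(early/late) value for one genomic bin. The second computes an ORM-style efficiency as the percentage of molecules spanning an IZ that show an initiation event inside it. The pseudocount, the requirement that a molecule span the whole IZ, and the tuple layout are assumptions made for illustration, not the published processing steps.

```python
import math

def log2_early_late(early_reads, late_reads, pseudocount=1.0):
    """Two-fraction Repli-seq style score for one bin.

    The pseudocount avoids division by zero; its value is an assumption.
    """
    return math.log2((early_reads + pseudocount) / (late_reads + pseudocount))

def iz_efficiency(molecules, iz_start, iz_end):
    """Percentage of spanning molecules with an initiation event inside the IZ.

    molecules: (mol_start, mol_end, initiation_positions) tuples
    (hypothetical layout). Molecules that do not cover the whole IZ
    are ignored. Returns None if no molecule spans the IZ.
    """
    spanning = [m for m in molecules if m[0] <= iz_start and m[1] >= iz_end]
    if not spanning:
        return None
    fired = sum(
        any(iz_start <= pos < iz_end for pos in m[2])
        for m in spanning
    )
    return 100.0 * fired / len(spanning)

# Toy values, invented for illustration.
score = log2_early_late(early_reads=7, late_reads=1)
molecules = [
    (0, 300, [150]),      # spans the IZ and initiates inside it
    (0, 300, []),         # spans, no initiation
    (50, 400, [350]),     # spans, initiation outside the IZ
    (0, 250, []),         # spans, no initiation
    (120, 180, [150]),    # does not span the IZ, so it is ignored
]
efficiency = iz_efficiency(molecules, iz_start=100, iz_end=200)
print(score, efficiency)
```

A single log ratio per bin collapses S phase into two halves, whereas the per-molecule count gives a direct estimate of how often a given IZ is used.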
Following ablation of cohesin-mediated boundaries (Fig. 2b,d and Extended Data Fig. 8), we observe severe disruption of high-efficiency early-S-phase IZs specifically at class 1 boundaries, as evidenced by a diffuse and delocalized Repli-seq signal (class 1; Fig. 2c,e). Consistent with our qualitative observations, early wave IZs were less numerous and increased in width specifically at dot boundaries with a complex CTCF motif orientation after loss of cohesin (Extended Data Fig. 10 and Supplementary Table 16). We also noticed that low-efficiency IZs shift to replicating at the end of S phase (fractions 14–16) at dotless boundaries following cohesin knockdown (classes 4–6, Fig. 2c,e and Extended Data Fig. 10). Independently conducted ORM analyses confirmed our observations of IZ disruption by cohesin removal (Fig. 2f). Cell cycle progression and 5-bromodeoxyuridine incorporation were not substantially affected by RAD21 knockdown39 (Extended Data Fig. 7). Together, our ensemble and single-molecule IZ data demonstrate that disruption of cohesin-mediated loops during G1 alters the genomic placement where origins or clusters of origins fire during early S phase.
On the basis of our observations, we reason that a failure of cohesin to unload, and therefore the creation of new long-range loops due to more cohesin molecules stalled at complex CTCF boundaries in G1 phase, might result in an increased number of high-efficiency origins or a narrowing of their genomic placement in S phase. Recently, it was reported that knockdown of the gene encoding the cohesin unloading factor WAPL results in increased long-range loops41. We examined the genomic placement of IZs in S phase with 16-fraction Repli-seq in wild-type HCT116 cells engineered with an improved degron system (AID2) to degrade WAPL throughout G1 phase42. First, we created Hi-C libraries in wild-type and WAPL-knockdown HCT116 cells (Fig. 3a,b and Extended Data Fig. 7). Consistent with published results, our observations show that dots indicative of loops are more numerous, and traverse a longer genomic distance, compared with those in wild-type HCT116 cells (Fig. 3a,b and Supplementary Table 18). We observed that the gain-of-looping phenotype following WAPL knockdown occurs most strongly at dot boundaries with a complex CTCF motif orientation (class 1; Fig. 3c). At class 1 boundaries, we observe that early IZs become significantly narrower following WAPL knockdown (Fig. 3d,e and Supplementary Table 17). We note that IZs tighten and refine following gain of looping in the WAPL-knockdown condition at the same boundaries where IZs grow more diffuse following cohesin knockdown (Fig. 3a,b and Extended Data Fig. 10). Together, the findings from our gain and loss of structural boundary experiments further support a model in which cohesin-based loop extrusion in interphase deterministically informs the placement of the subset of origins that fire during S phase.
We finally sought to understand whether specific boundaries are necessary and sufficient to regulate IZ firing. We used targeted CRISPR–Cas9 genome editing to delete an 80-kb section of the genome containing a complex array of more than 10 CTCF + cohesin-binding sites with complex motif orientations anchoring a long-range chromatin loop that separates late from early replication timing domains (Fig. 4a). The loop anchor was chosen because it also partially overlaps an early-S-phase IZ, but does not encompass the full IZ, thus allowing us to ablate the loop while keeping much of the IZ intact. We observed a striking local delay of replication timing from early to late following deletion of the 80-kb loop anchor, consistent with the loss of an early IZ (Fig. 4a,c(i)). As a negative control, we deleted a different 30-kb loop anchored by two tandemly oriented CTCF-binding sites within an adjacent late replication timing domain, but not overlapping an IZ (Fig. 4b). Deletion of this 30-kb loop anchor disrupted the dot boundary but preserved the timing and genomic location of DNA replication (Fig. 4b,c(ii)). The direct overlap of IZs with boundaries precludes our ability to fully decouple them, and overlap of functional elements remains a technical challenge for functional perturbative studies in the genome biology field at large. Nevertheless, our data provide evidence that replication at a specific early IZ can undergo a striking shift to late S phase following ablation of a boundary. These data are consistent with our cohesin-knockdown observations and our model in which boundaries marked by a complex CTCF motif orientation inform the precise placement of high-efficiency IZs.
As the direct overlap of IZs with boundaries is not amenable to clean, single-variable ‘loss-of-structure’ perturbative experiments, we also examined a ‘gain-of-structure’ approach in which we assessed whether the introduction of an engineered ectopic boundary was sufficient to induce changes in replication initiation. We mapped replication with two-fraction Repli-seq in published HAP1 cell lines in which we have previously demonstrated a gain in boundary following insertion of an established 2 kb-sized cell-type-invariant boundary element43. We observed a striking shift from late to early replication directly at the location of the engineered boundary (Fig. 4d), consistent with the possibility that boundaries can be sufficient for de novo early IZ firing. Together, our data reveal that both global and local gain and loss of structural boundaries can deterministically influence the placement of IZs.
It is well established that the initiation of DNA replication involves two mutually exclusive steps1,2. The first step, origin licensing, begins in telophase with the loading of two copies of the mini-chromosome maintenance (MCM2–7) complex2,44. MCM2–7 is initially loaded in excess at tens of thousands of sites across the human genome in an inactive form as a double hexamer that encircles double-stranded DNA (yellow double hexamers in Fig. 4e). The second step, origin activation, occurs at the onset of S phase. Origin activation involves mechanisms that both prevent further MCM loading and recruit multiple extra factors to initiate the unwinding of the double helix and DNA synthesis2,44. In mammalian systems, a critical mystery remains regarding the mechanisms governing the selection of a subset of MCM-bound, licensed origins for activation in S phase.
Here we propose a model in which cohesin-mediated loop extrusion and stalling at dot boundaries marked by CTCF + cohesin-binding sites oriented in convergent and divergent directions is required for the positioning of high-efficiency replication origins (Fig. 4e). We propose two possible models to explain the strong localization of high-efficiency IZs to a subset of cohesin-dependent, genetically encoded boundaries: cohesin could directly push licensed MCM double hexamers or other origin activation cofactors along the genome before stalling at high-density arrays of CTCF + cohesin-bound motifs in complex orientations; alternatively, cohesin might pass over many licensed, MCM-bound origins and selectively participate in the activation of those already loaded at boundaries. We also posit that low-efficiency IZs might fire at weaker dotless boundaries later in S phase because cohesin only temporarily pauses during its traversal along the genome, and thus cannot aggregate initiation activity (Fig. 4e). In the cell types from our study, cohesin-mediated loop extrusion is required for IZ placement, and the changes in replication timing are subtle and indirect owing to the altered distance of nearby genomic regions to the nearest initiation site. We note that although we do not see evidence for a dominant role for cohesin on the larger replication timing program, we cannot rule out that cohesin knockdown might have a more profound effect on the replication timing program in other cell types, species and experimental designs.
Previous studies using mass spectrometry and co-immunoprecipitation have reported the direct binding of cohesin to DNA replication factors, such as MCM7, MCM6, MCM4, RFC1 and DNA polymerase α27,45. The MCM complex has the ability to slide after loading and can be pushed by polymerase during transcription46–48. However, the extent and rate at which this occurs on chromatin in the presence of nucleosomes (≈11 nm) is still an open question. The internal diameter of cohesin is 40 nm, whereas the MCM2–7 double hexamer is only 15 nm. The findings of a recent Hi-C and imaging study suggest that, despite their small size, MCM complexes could also serve as boundaries to block cohesin-based loop extrusion49. TAD boundaries and loops persist through S phase50, but MCMs are removed from chromatin after IZs fire1,2. Therefore, we favour a model in which cohesin pushes licensed MCMs in G1, leading to the localization and activation of a key subset of origins at boundaries with a complex CTCF motif orientation in S phase (Fig. 4e). Nevertheless, both proposed models remain exciting areas for future mechanistic dissection.
Understanding the structure–function relationship of the human genome remains a major challenge for human geneticists and chromatin biologists. Here we stratify TADs and subTADs by their structural and molecular features. We conduct global and local perturbative studies to reveal that genetically encoded TAD/subTAD boundaries formed by cohesin-mediated loop extrusion in G1/pre-S functionally inform genome function in the case of the initiation of DNA replication in S phase. Our work sheds light on the question of whether and how the location of fired origins is deterministically encoded in humans by the genome, epigenome and higher-order chromatin folding.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-022-04803-0.
Acknowledgements
We thank members of the 4D Nucleome community and the laboratories of J.E.P.-C. and D.M.G. for helpful discussions. J.E.P.-C. is a New York Stem Cell Foundation – Robertson Investigator and an Alfred P. Sloan Foundation Fellow. J.D. is an investigator of the Howard Hughes Medical Institute. This research was supported by National Institute of Mental Health grants (1R01MH120269 and 1DP1MH129957; J.E.P.-C.), 4D Nucleome Common Fund grants (1U01HL12999801, 1U01DK127405 and 1U01DA052715; J.E.P.-C.), a National Science Foundation (NSF) CAREER Award (CBE-1943945; J.E.P.-C.), an NSF Emerging Frontiers in Research Innovation grant (1933400; J.E.P.-C.), 4D Nucleome Common Fund grants (DK107980 and HG011536; J.D.) and National Institutes of Health grants (R01HG010658 and U54DK107965; D.M.G.). The PhD fellowship to D.S. was provided by the CNRS 80|Prime interdisciplinary programme and W.W. was supported by a COFUND IC-3i International PhD fellowship.
Author contributions
Conceptualization: D.J.E., P.A.Z., D.M.G., J.E.P.-C. Experimentation: C.G., L.Z., Z.S., K.K., W.G., L.Y., J.H.G., T.S., H.Y., F.Y., J.D., D.M.G., J.E.P.-C. Computation and visualization: D.J.E., P.A.Z., A.L.C., M.K.M., R.J.B., K.R.T., S.V.V., D.S., W.W., C.-L.C., D.M.G., J.E.P.-C. Funding acquisition: J.D., D.M.G., J.E.P.-C. Project administration: D.M.G., J.E.P.-C. Writing: D.J.E., P.A.Z., A.L.C., J.E.P.-C. Review and editing: D.J.E., P.A.Z., A.L.C., M.K.M., J.D., C.-L.C., D.M.G., J.E.P.-C. Critical reagents: D.Z., M.T.K.
Peer review
Peer review information
Nature thanks Benjamin Rowland and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
All new raw data created in this manuscript have been uploaded to the 4D Nucleome portal and will be freely released for full distribution to the public (see specific details below). Processed data files for all figures and extended data figures are provided as Supplementary Tables 1–19. ORM data have been uploaded to the National Center for Biotechnology Information, BioProject database accession number PRJNA788726 (http://genome.ucsc.edu/s/dsaulebe/ORM%20data%20HCT116). Two-fraction Repli-seq data for Blobel engineered lines (raw data and processed log2[early/late] from three conditions) were obtained from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190117.
Group 1 data (16-fraction Repli-seq data for H1 human ES cells) are available from the 4D Nucleome portal as follows: H1 human ES raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESXRBILXJ/; H1 human ES read-depth-normalized array for visualization, https://data.4dnucleome.org/files-processed/4DNFIEEYFQ7C/; H1 human ES scaled, read-depth-normalized array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI3N8GHKR/; H1 human ES early, early–mid and late IZs on read-depth-normalized array, https://data.4dnucleome.org/files-processed/4DNFIRF7WZ3H/.
Group 2 data (16-fraction Repli-seq data for wild-type HCT116 cells) are available from the 4D Nucleome portal as follows: wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 mitochondria-normalized array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFIPIQTMJ9/; wild-type HCT116 early, early–mid and late IZs on mitochondria-normalized array, https://data.4dnucleome.org/files-processed/4DNFI95K53YS/.
Group 3 data (16-fraction Repli-seq data for wild-type and cohesin-knockdown HCT116 pairing) are available from the 4D Nucleome portal as follows: RAD21-knockdown HCT116 raw, https://data.4dnucleome.org/experiment-sets/4DNES92AU9JR/; RAD21-knockdown HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI3ZMWG5T/; RAD21-knockdown HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIGOMS9G7/; wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; wild-type HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIYO3H24N/.
Group 4 data (16-fraction Repli-seq data for wild-type and WAPL-knockdown HCT116 pairing) are available from the 4D Nucleome portal as follows: WAPL-knockdown HCT116 raw, https://data.4dnucleome.org/experiment-sets/4DNES72NE7SL/; WAPL-knockdown HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI7MI88QR/; WAPL-knockdown HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIDI1QJVA/; wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; wild-type HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFILNNSFMD/.
Group 5 data (16-fraction Repli-seq data visualization) are available from the 4D Nucleome portal as follows: wild-type HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; RAD21-knockdown HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI3ZMWG5T/; WAPL-knockdown HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI7MI88QR/.
Hi-C data for wild-type and WAPL-knockdown HCT116 pairing are available from the 4D Nucleome portal as follows: WAPL-knockdown HCT116 raw Hi-C, https://data.4dnucleome.org/experiment-set-replicates/4DNES1JP4KZ1/; WAPL-knockdown HCT116 normalized balanced Hi-C matrices, https://data.4dnucleome.org/files-processed/4DNFIY5939F3/; WAPL-knockdown HCT116 loops, https://data.4dnucleome.org/files-processed/4DNFILP7BD5H/; wild-type HCT116 raw Hi-C, https://data.4dnucleome.org/experiment-set-replicates/4DNESNSTBMBY/; wild-type HCT116 normalized balanced Hi-C matrices, https://data.4dnucleome.org/files-processed/4DNFI5MR78O6/; wild-type HCT116 loops, https://data.4dnucleome.org/files-processed/4DNFIOQLL854/.
Two-fraction Repli-seq data for human iPS wild-type and two CRISPR-engineered lines (raw data and processed log2[early/late] from three conditions) are available from the 4D Nucleome portal as follows: wild-type human iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNESDYES9QD/; wild-type human iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFI5WEY784/; human engineered clone 1 80-kb-IZ-deletion iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNESE3WCUAQ/; human engineered clone 1 80-kb-IZ-deletion iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFIZMB415V/; human engineered clone 2 30-kb-control-deletion iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNES66YWJU7/; human engineered clone 2 30-kb-control-deletion iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFIWDMF7HW/.
5C data for human iPS wild-type and two engineered lines (primer bed file, raw heatmaps and processed heatmaps from three conditions) are available from the 4D Nucleome portal as follows: wild-type human iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNESLRDUPZ6/; wild-type human iPS line balanced 5C data: replicate 1, https://data.4dnucleome.org/files-processed/4DNFIXM8V3ZB/, replicate 2, https://data.4dnucleome.org/files-processed/4DNFIDB6M1ZN/; wild-type human engineered clone 1 80-kb-boundary-deletion iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNES39F1QWU/; wild-type human engineered clone 1 80-kb-boundary-deletion iPS line balanced 5C data, https://data.4dnucleome.org/files-processed/4DNFIA8P94BX/; wild-type human engineered clone 2 30-kb-control-deletion iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNES3PDMUHG/; wild-type human engineered clone 2 30-kb-control-deletion iPS line balanced 5C data: replicate 1, https://data.4dnucleome.org/files-processed/4DNFI7WZYRHP/, replicate 2, https://data.4dnucleome.org/files-processed/4DNFI7V4VXAQ/.
Code availability
We freely release all custom code for loop, TAD and subTAD detection at the following Bitbucket links: TAD/subTAD detection, https://bitbucket.org/creminslab/cremins_lab_tadsubtad_calling_pipeline_11_6_2021; loop detection, https://bitbucket.org/creminslab/cremins_lab_loop_calling_pipeline_11_6_2021/src/initial/.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Daniel J. Emerson, Peiyao A. Zhao, Ashley L. Cook
Extended data
Extended data is available for this paper at https://doi.org/10.1038/s41586-022-04803-0.
Supplementary information
The online version contains supplementary material available at https://doi.org/10.1038/s41586-022-04803-0.
References
1. Bellush JM, Whitehouse I. DNA replication through a chromatin environment. Philos. Trans. R. Soc. B. 2017;372:20160287. doi: 10.1098/rstb.2016.0287.
2. Mechali M. Eukaryotic DNA replication origins: many choices for appropriate answers. Nat. Rev. Mol. Cell Biol. 2010;11:728–738. doi: 10.1038/nrm2976.
3. Dixon JR, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. doi: 10.1038/nature11082.
4. Nora EP, et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature. 2012;485:381–385. doi: 10.1038/nature11049.
5. Hou C, Li L, Qin ZS, Corces VG. Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol. Cell. 2012;48:471–484. doi: 10.1016/j.molcel.2012.08.031.
6. Sexton T, et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell. 2012;148:458–472. doi: 10.1016/j.cell.2012.01.010.
7. Phillips-Cremins JE, et al. Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell. 2013;153:1281–1295. doi: 10.1016/j.cell.2013.04.053.
8. Rao SS, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021.
9. Krietenstein N, et al. Ultrastructural details of mammalian chromosome architecture. Mol. Cell. 2020;78:554–565. doi: 10.1016/j.molcel.2020.03.003.
10. Hsieh TS, et al. Resolving the 3D landscape of transcription-linked mammalian chromatin folding. Mol. Cell. 2020;78:539–553. doi: 10.1016/j.molcel.2020.03.002.
11. Akgol Oksuz B, et al. Systematic evaluation of chromosome conformation capture assays. Nat. Methods. 2021;18:1046–1055. doi: 10.1038/s41592-021-01248-7.
12. Fudenberg G, et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 2016;15:2038–2049. doi: 10.1016/j.celrep.2016.04.085.
13. Vian L, et al. The energetics and physiological impact of cohesin extrusion. Cell. 2018;173:1165–1178. doi: 10.1016/j.cell.2018.03.072.
14. Sanborn AL, et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA. 2015;112:E6456. doi: 10.1073/pnas.1518552112.
15. Petryk N, et al. Replication landscape of the human genome. Nat. Commun. 2016;7:10208. doi: 10.1038/ncomms10208.
16. Zhao PA, Sasaki T, Gilbert DM. High-resolution Repli-Seq defines the temporal choreography of initiation, elongation and termination of replication in mammalian cells. Genome Biol. 2020;21:76. doi: 10.1186/s13059-020-01983-8.
17. Wang W, et al. Genome-wide mapping of human DNA replication by optical replication mapping supports a stochastic model of eukaryotic replication. Mol. Cell. 2021;81:2975–2988. doi: 10.1016/j.molcel.2021.05.024.
18. Ryba T, et al. Evolutionarily conserved replication timing profiles predict long-range chromatin interactions and distinguish closely related cell types. Genome Res. 2010;20:761–770. doi: 10.1101/gr.099655.109.
19. Pope BD, et al. Topologically associating domains are stable units of replication-timing regulation. Nature. 2014;515:402–405. doi: 10.1038/nature13986.
20. Norton HK, et al. Detecting hierarchical genome folding with network modularity. Nat. Methods. 2018;15:119–122. doi: 10.1038/nmeth.4560.
21. Gilgenast TG, Phillips-Cremins JE. Systematic evaluation of statistical methods for identifying looping interactions in 5C data. Cell Syst. 2019;8:197–211. doi: 10.1016/j.cels.2019.02.006.
22. Fernandez LR, Gilgenast TG, Phillips-Cremins JE. 3DeFDR: statistical methods for identifying cell type-specific looping interactions in 5C and Hi-C data. Genome Biol. 2020;21:219. doi: 10.1186/s13059-020-02061-9.
23. Rao SSP, et al. Cohesin loss eliminates all loop domains. Cell. 2017;171:305–320. doi: 10.1016/j.cell.2017.09.026.
24. Schwarzer W, et al. Two independent modes of chromatin organization revealed by cohesin removal. Nature. 2017;551:51–56. doi: 10.1038/nature24281.
25. Davidson IF, et al. Rapid movement and transcriptional re-localization of human cohesin on DNA. EMBO J. 2016;35:2671–2685. doi: 10.15252/embj.201695402.
26. Pherson M, Misulovin Z, Gause M, Dorsett D. Cohesin occupancy and composition at enhancers and promoters are linked to DNA replication origin proximity in Drosophila. Genome Res. 2019;29:602–612. doi: 10.1101/gr.243832.118.
27. Guillou E, et al. Cohesin organizes chromatin loops at DNA replication factories. Genes Dev. 2010;24:2812–2822. doi: 10.1101/gad.608210.
28. de Wit E, et al. CTCF binding polarity determines chromatin looping. Mol. Cell. 2015;60:676–684. doi: 10.1016/j.molcel.2015.09.023.
29. Tang Z, et al. CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription. Cell. 2015;163:1611–1627. doi: 10.1016/j.cell.2015.11.024.
30. Besnard E, et al. Unraveling cell type-specific and reprogrammable human replication origin signatures associated with G-quadruplex consensus motifs. Nat. Struct. Mol. Biol. 2012;19:837–844. doi: 10.1038/nsmb.2339.
31. Li Y, et al. Transcription-coupled structural dynamics of topologically associating domains regulate replication origin efficiency. Genome Biol. 2021;22:206. doi: 10.1186/s13059-021-02424-w.
32. Sequeira-Mendes J, et al. Transcription initiation activity sets replication origin efficiency in mammalian cells. PLoS Genet. 2009;5:e1000446. doi: 10.1371/journal.pgen.1000446.
33. Chen YH, et al. Transcription shapes DNA replication initiation and termination in human cells. Nat. Struct. Mol. Biol. 2019;26:67–77. doi: 10.1038/s41594-018-0171-0.
34. Cayrou C, et al. The chromatin environment shapes DNA replication origin organization and defines origin classes. Genome Res. 2015;25:1873–1885. doi: 10.1101/gr.192799.115.
35. Liu Y, et al. Transcription shapes DNA replication initiation to preserve genome integrity. Genome Biol. 2021;22:176. doi: 10.1186/s13059-021-02390-3.
36. Cadoret JC, et al. Genome-wide studies highlight indirect links between human replication origins and gene regulation. Proc. Natl Acad. Sci. USA. 2008;105:15837–15842. doi: 10.1073/pnas.0805208105.
37. Oldach P, Nieduszynski CA. Cohesin-mediated genome architecture does not define DNA replication timing domains. Genes. 2019;10:196. doi: 10.3390/genes10030196.
38. Cremer M, et al. Cohesin depleted cells rebuild functional nuclear compartments after endomitosis. Nat. Commun. 2020;11:6146. doi: 10.1038/s41467-020-19876-6.
39. Sima J, et al. Identifying cis elements for spatiotemporal control of mammalian DNA replication. Cell. 2019;176:816–830. doi: 10.1016/j.cell.2018.11.036.
40. Hiratani I, et al. Global reorganization of replication domains during embryonic stem cell differentiation. PLoS Biol. 2008;6:e245. doi: 10.1371/journal.pbio.0060245.
41. Haarhuis JHI, et al. The cohesin release factor WAPL restricts chromatin loop extension. Cell. 2017;169:693–707. doi: 10.1016/j.cell.2017.04.013.
42. Yesbolatova A, et al. The auxin-inducible degron 2 technology provides sharp degradation control in yeast, mammalian cells, and mice. Nat. Commun. 2020;11:5701. doi: 10.1038/s41467-020-19532-z.
43. Zhang D, et al. Alteration of genome folding via contact domain boundary insertion. Nat. Genet. 2020;52:1076–1087. doi: 10.1038/s41588-020-0680-8.
44. Dimitrova DS, Prokhorova TA, Blow JJ, Todorov IT, Gilbert DM. Mammalian nuclei become licensed for DNA replication during late telophase. J. Cell Sci. 2002;115:51–59. doi: 10.1242/jcs.115.1.51.
45. Ryu MJ, et al. Direct interaction between cohesin complex and DNA replication machinery. Biochem. Biophys. Res. Commun. 2006;341:770–775. doi: 10.1016/j.bbrc.2006.01.029.
46. Gros J, et al. Post-licensing specification of eukaryotic replication origins by facilitated Mcm2-7 sliding along DNA. Mol. Cell. 2015;60:797–807. doi: 10.1016/j.molcel.2015.10.022.
47. Powell SK, et al. Dynamic loading and redistribution of the Mcm2-7 helicase complex through the cell cycle. EMBO J. 2015;34:531–543. doi: 10.15252/embj.201488307.
48. Sasaki T, et al. The Chinese hamster dihydrofolate reductase replication origin decision point follows activation of transcription and suppresses initiation of replication within transcription units. Mol. Cell. Biol. 2006;26:1051–1062. doi: 10.1128/MCB.26.3.1051-1062.2006.
49. Dequeker BJH, et al. MCM complexes are barriers that restrict cohesin-mediated loop extrusion. Nature. 2022. doi: 10.1038/s41586-022-04730-0.
50. Nagano T, et al. Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature. 2017;547:61–67. doi: 10.1038/nature23001.