Abstract
The boundaries of topologically associating domains (TADs) are delimited by insulators and/or active promoters; however, how they are initially established during embryogenesis remains unclear. Here, we examined this during the first hours of Drosophila embryogenesis. DNA-FISH confirms that intra-TAD pairwise proximity is established during zygotic genome activation (ZGA) but with extensive cell-to-cell heterogeneity. Most newly formed boundaries are occupied by combinations of CTCF, BEAF-32, and/or CP190. Depleting each insulator individually from chromatin revealed that TADs can still establish, although with lower insulation, with a subset of boundaries (~10%) being more dependent on specific insulators. Some weakened boundaries have aberrant gene expression due to unconstrained enhancer activity. However, the majority of misexpressed genes have no obvious direct relationship to changes in domain-boundary insulation. Deletion of an active promoter (thereby blocking transcription) at one boundary had a greater impact than deleting the insulator-bound region itself. This suggests that cross-talk between insulators and active promoters and/or transcription might reinforce domain boundary insulation during embryogenesis.
Different boundaries require different subsets of insulator proteins, and in some cases an active promoter, for full insulation.
INTRODUCTION
The organization of chromosomes into self-interacting chromatin domains, commonly called topologically associating domains (TADs), is a widespread feature in the animal kingdom (1–4). TADs are computationally defined from chromosome conformation capture data as genomic regions with enriched chromatin interaction frequencies when averaged across a population of cells (2, 4–6). Enhancers and their target genes are often contained within the same TAD (3, 7), although this organization varies from cell to cell (5). At some loci, domain boundaries insulate regulatory elements (enhancers and promoters) within the TAD from regulatory elements in neighboring domains, thus delimiting the domain in which enhancers drive transcriptional activation (8–11). However, at other loci, this does not appear to be the case as removal of the boundary or rearrangement of the TAD has little apparent impact on gene expression (8, 10, 12).
Most domain boundaries in mouse and human cells are occupied by the CCCTC-binding factor (CTCF), which binds to motifs in a convergent orientation at the two boundaries (13, 14). Mechanistically, TADs are thought to form by a loop-extrusion mechanism in vertebrates, whereby a chromatin loop is extruded by the cohesin complex until it stalls at CTCF-bound regions (15, 16). Depletion of CTCF in mouse embryonic stem cells or differentiated tissues reduced almost all (~80%) domain structure (17). Collectively, these studies demonstrate that CTCF is essential for the maintenance of the majority of TADs in vertebrates. However, it remains unclear how these domains are initially established during embryogenesis. Their formation coincides with zygotic genome activation (ZGA) or just after in zebrafish (18), mice (19, 20), and Drosophila (21).
In addition to CTCF, Drosophila has a number of other insulator proteins (22). Of the direct DNA binding factors, CTCF, BEAF-32, and Su(Hw) bind to the majority of domain boundaries along with the architectural cofactors CP190 or Mod(Mdg4) (4, 23–29). Many insulator proteins, including BEAF-32 (30) and CP190 (31), also bind very close to gene promoters, and a number of these factors have been proposed to function as transcriptional regulators (30–33). Drosophila boundaries also typically overlap transcribing promoters (4, 34), representing ~77% of boundaries in Drosophila Kc167 cells (23), and it is currently not clear whether the enrichment of insulator proteins at domain boundaries is required for boundary formation or secondary to their role in the transcription of genes located at boundaries. TADs, especially in Drosophila, also reflect chromatin state, which, in turn, reflects transcription, and partitioning of chromatin states between domains of histone acetylation and methylation (especially H3K27me3) has been proposed to lead to TAD formation in Drosophila (35–37). Actually, transcriptional state is sufficient to predict Hi-C domain structure in a number of species (34), including flies. However, the relative contribution of insulator protein binding, transcription, or a combination of both to the formation and/or maintenance of TADs during embryonic development remains unclear.
A number of studies have begun to address this by genetic deletion or depletion of different factors in trans. For example, a complete deletion of CTCF (removing both the maternal and zygotic supply) in vivorevealed that CTCF is not required for Drosophila embryogenesis (38) or for the maintenance of TAD structure in the larval nervous system (29), but it is required for boundary insulation and correct gene expression at a small subset of loci (29, 38). Depletion of BEAF-32 in Kc167 cells also had little global impact on TAD structure (23), while depletion in BG-3 cells was reported to affect ~20% of strong boundaries (39). Depletion of both CP190 and Chromator together in BG-3 cells affected a subset of boundaries with repressive chromatin (H3K27me3), while it had little impact on active regions (39). Similarly, genetic deletion of CP190 affected insulation on the subset of Drosophila boundaries that do not contain active promoters [~23% in Kc167 cells (23)] and had little impact on others (28).
The current lack of global regulators required for the maintenance of TAD structure in studies that focused on late developmental stages or cell lines led us to speculate that insulator proteins are perhaps more relevant for the initial establishment of TAD boundaries in Drosophila rather than for their maintenance. Here, we assessed the functional requirement for three major insulator proteins in the establishment of TADs in early Drosophila embryos. We first reassessed the precise timing of when intra-TAD interactions and high-frequency loops are first established in individual embryos by measuring pairwise distances within three domains in single cells using DNA fluorescence in situ hybridization (FISH). Although the genes within these loci are expressed at different stages of embryogenesis, the pairwise interaction frequencies within the three TADs show similar temporal dynamics in their establishment across the very early stages of ZGA, spanning nuclear cycle (NC) 12 to 14. This not only confirms previous Hi-C measurements (21, 40) but also revealed extensive cell-to-cell heterogeneity in these intra-TAD distances at NC14, in line with the extensive cell-to-cell variation in TADs previously observed at this stage (41). We then profiled the genome-wide binding of five of the main insulator proteins [BEAF-32, CTCF, Su(Hw), CP190, and GAGA factor (GAF)] in a narrow window during the major wave of ZGA. All five proteins bind extensively along the genome and are present in various combinations at TAD boundaries.
We depleted BEAF-32, CTCF, or CP190 individually, removing their maternal contribution, and confirmed that all three proteins are depleted from chromatin in NC14 embryos. Evaluating their requirement for the establishment of chromatin topology, using both Hi-C and DNA-FISH, revealed that the majority of TADs are still able to form in the absence of these proteins, although with lower insulation. Boundaries that are occupied by different combinations of insulator proteins are associated with different levels of insulation at NC14 in wild-type (WT) embryos. After insulator protein depletion, 6 to 10% of boundaries have different sensitivities to the loss of a single factor, suggesting that it is not simply redundancy between any combination of insulators. Examining the impact of insulator protein depletion on gene expression revealed that ZGA occurs largely unperturbed. A few hundred genes are misexpressed, which appears to arise through multiple mechanisms. A subset of down-regulated genes (~2 to 20%) are bound by insulator proteins at their promoter and could be regulated directly or through the disruption of enhancer-promoter looping, while 6 to 13% of the up-regulated genes may occur through enhancer hijacking at weakened TAD boundaries. However, the majority (>80%) of all misexpressed genes have no obvious relationship to changes in topology and may represent more secondary indirect effects. To disentangle the interplay between insulator proteins and transcription itself, we dissected one TAD boundary that contains an active promoter during ZGA and a bound insulator binding site. Depletion of the activator protein Zelda (Zld) in trans and genetic deletion of the regulatory elements in cis indicate that removal of the active promoter (and thereby transcription) had a greater impact on TAD structure than removal of the insulator-bound region itself. This suggests that an active promoter or transcription (or some other aspect of the transcription factor’s function at the promoter) may feedback and reinforce domain insulation as embryogenesis proceeds.
RESULTS
The establishment of intra-TAD proximity during ZGA
TADs were initially detected by Hi-C in early Drosophila embryos starting at NC14, during the major wave of ZGA (21, 40), which we confirmed here by performing Hi-C on tightly staged 2- to 3-hour embryos (predominantly NC14) (examples in Fig. 1, A to C). To more precisely quantify the timing of intra-TAD interactions, we performed DNA-FISH measuring pairwise distances within three TADs across different stages of the ZGA. One TAD has relatively uniform interaction frequencies throughout the domain, as measured by Hi-C (referred to as a neutral TAD; Fig. 1A), and contains genes that are not expressed during embryogenesis, with the exception of one gene (CG9304). The other two contain high-frequency loops between paralogous genes (scyl/chrb and tsh/tio; Fig. 1, B and C), which were shown to form at the doc locus even before the establishment of TADs (41). The DNA-FISH probes (~7 to 8.5 kb) were designed to target the outermost loop and an equidistant (in the linear genome) control probe outside the TAD (Fig. 1, A to C).
For the neutral TAD, at NC12 (the minor wave of ZGA), the pairwise distance between intra-TAD and inter-TAD probes is indistinguishable, with a small proportion of alleles (9%) showing overlap, defined as distances of <250 nm (Fig. 1, D and E, bottom dashed line). This indicates that in the vast majority of cells (>90%), the intra-TAD proximity of these regions has not yet formed. However, at NC14, the overlap between the intra-TAD probes greatly increased (34% of cells have pairwise distances of <250mn), concomitant with the major wave of ZGA, which was not observed for the inter-TAD probes (1% of alleles) (Fig. 1E). This proximity is maintained in later developmental stages (16 to 18 hours; Fig. 1E). Similarly, the high-frequency loops in the scyl/chrb and tsh/tio TADs are not present at the earlier stages (NC11 and NC12); the distances between the intra-TAD probes and the control probe are indistinguishable (Fig. 1, F to I). The formation of the loops initiates at NC13 (as observed by the increased proximity between the intra-TAD probes), extending a previous finding at the doc locus (41), and, to a much greater extent, at NC14: 16% of cells versus 6% have the loop anchor regions in close proximity (<250 nm) in the scyl-chrb TAD and 20% versus 2% in tsh-tio TAD (Fig. 1, G and I).
These single-cell measurements of intra-TAD distances also revealed extensive cell-to-cell heterogeneity within these TADs at NC14 (Fig. 1, E, G, and I). Although the pairwise intra-TAD proximity is higher at NC14 compared to that at NC11 and NC12, only 16 to 34% of cells (Fig. 1, E, G, and I) have pairwise proximity within 250 nm, indicating that there is still extensive cell-to-cell heterogeneity at NC14—for example, in 18% of cells, the outer-loop anchors within the tsh/tio TAD are at distances of >600 nm at NC14, indicating that the loop is not present in these cells at this stage (Fig. 1I, top dashed line). Similar extensive cell-to-cell heterogeneity was observed at the level of TADs during these stages (41, 42). This cell-to-cell variability remained constant for the neutral TAD even at the end of embryogenesis (16 to 18 hours), where 30% of cells have distances of >250 nm (Fig. 1E). However, this changed quite markedly for the other two TADs where the number of cells with high proximity (<250 nm) between the loop anchors increased from 16 to 20% at NC14 to 50 to 54% of cells at mid-embryogenesis (6 to 8 hours, stage 11; Fig. 1, G and I). This may reflect the smaller size of nuclei at these later developmental stages, although this did not seem to affect intra-TAD pairwise distances within the neutral TAD. Alternatively, it may represent some reinforcement or stabilization of these high-frequency loops as embryogenesis proceeds. The cell-to-cell heterogeneity at NC14 also suggests that very defined TADs and intra-TAD proximity are not required for the regulation of gene expression during early stages of embryogenesis.
Domain boundaries are occupied by diverse combinations of insulator proteins during the establishment of TADs
To determine how TADs are initially established during ZGA, we assessed the requirement of insulator proteins given the role of CTCF in TAD formation in vertebrates, the occupancy of these proteins at TAD boundaries in Drosophila [shown for later embryonic stages and cell lines (23, 43, 44)], and the availability of these proteins in early embryos due to their maternal deposition. As the occupancy of these insulators had not been assessed at NC14, we first performed chromatin immunoprecipitation sequencing (ChIP-seq) on tightly staged NC14 embryos (2 hours and 10 min. to 2 hours and 40 min.) for the four most studied Drosophila insulator proteins BEAF-32, CTCF, Su(Hw), and CP190 (22, 43) during NC14. Biological replicates for each factor are highly correlated (fig. S1, A and B), and the DNA binding motif for each insulator protein is highly enriched under their ChIP peaks (fig. S1C), attesting to the quality of these NC14 ChIP datasets. All four insulator proteins are significantly bound to thousands of genomic regions at NC14 (Fig. 2, A to C): BEAF-32, 2917; CTCF, 1319; Su(Hw), 6134; and CP190, 5490 peaks at <1% Irreproducible Discovery Rate (IDR) (Materials and Methods). Each insulator protein binds to many regions alone and in combinations with each other (Fig. 2, B and C), although the three insulators with direct DNA binding [BEAF-32, CTCF, and Su(Hw)] have different distributions. BEAF-32, CTCF, and CP190 combinatorial binding is more common: 51, 58, and 60% of their peaks colocalize with at least one other insulator protein, respectively, while this proportion was 34% for Su(Hw) (Fig. 2C, left). The most frequent combinations are [BEAF-32 & CP190] and [Su(Hw) & CP190], representing 35% of BEAF, 20% of Su(Hw), and 41% of CP190 peaks. Triple binding is rare in comparison to single or double and was always observed between CP190 and two other factors. Only 60 regions are cobound by all four factors, and all 60 of which are at TAD boundaries (Fig. 2C).
We also performed ChIP-seq on the transcription factor Trithorax-like (Trl/GAF) at NC14, which was recently proposed to bind to promoter-tethering elements to establish enhancer-promoter communication in Drosophila (45). Although GAF binds to TAD boundaries at NC14 (fig. S1E), we observed very little cobinding between GAF and the other insulator proteins examined, including CP190 (fig. S1, B and E).
CP190 is recruited to chromatin indirectly through protein-protein interactions with other insulator proteins and therefore does not have a canonical DNA binding motif (46–48). This is reflected in the extensive colocalization of CP190 with the other three insulator proteins (60% of peaks overlap; Fig. 2C). This is in keeping with the de novo motif enrichments under CP190 overlapping peaks, where Su(Hw), CTCF, and BEAF-32 motifs are within the first five most enriched motifs (fig. S1C). Motif analyses of 40% of CP190-only peaks identified additional putative recruiters of CP190 during these early stages of embryogenesis (fig. S1D), including the insulator protein Pita (49) and the transcriptional regulators Knirps, Jim, Nautilus, and Visual system homeobox 2.
We next assessed how insulator occupancy relates to the newly established TAD boundaries during ZGA. As the ability to define TAD boundaries in a given condition varies greatly depending on the resolution of the Hi-C matrices and the TAD calling algorithm used, we used our Hi-C data from tightly staged 2- to 3-hour embryos to call TADs at multiple base pair resolutions (2, 5, and 10 kb) and q values (0.1, 0.05, and 0.01) (fig. S2A). The 10-kb resolution and q value of 0.1 gave the most consistent results, showing a better visual overlap with TADs while avoiding splitting larger domains. By using these thresholds, we identified 772 high-confidence TAD boundaries in 2- to 3-hour embryos (Fig. 2A and fig. S2, A and B). Approximately 90% (703 of 772) of these domain boundaries have at least one of the four analyzed insulator proteins binding within the 10-kb boundary window at NC14 (Fig. 2C, right), with CP190 being the most frequent (80%, 624 of 772). TAD boundaries are preferentially bound by combinations of insulators rather than single proteins, with 70% of all boundaries being occupied by two or more of the four insulators (Fig. 2C, right). This proportion is likely an underestimate given that there are other Drosophila insulator proteins that we did not profile here. To assess that, we examined available ChIP data for Ibf1 (50), Ibf2 (50), Pita (51), ZIPIC (51), and Zw5 (51), which were performed in either cell lines or at later developmental stages (fig. S1F). Even assuming that the binding of these factors is the same at NC14, different combinations of BEAF-32, CTCF, Su(Hw), and CP190, without the other factors, cumulatively represent the largest binding classes at TAD boundaries at NC14. The most prevalent combinations (from our NC14 data) are [BEAF-32 & CP190] and [BEAF-32, Su(Hw), & CP190], present in 17 and 18% of TAD boundaries, respectively (Fig. 2C, right).
Two trends were previously reported in mice or Drosophila cell lines. In mammals, both a higher number and a higher intensity of CTCF ChIP peaks were proposed to provide robustness to TAD boundaries (12, 52). Here, during the establishment of TADs in early Drosophila embryos, we observed a similar trend for stronger insulation at boundaries with multiple BEAF-32 or CP190 peaks, although this does not hold true for multiple CTCF peaks (note that a lower insulation score reflects higher boundary insulation; Fig. 2D). However, ChIP peak intensity at TAD boundaries for either BEAF-32, CTCF, or CP190 is not correlated with stronger insulation (Fig. 2E). In Drosophila Kc167 cells, the strength of TAD boundaries (as measured by insulation score) is correlated with the number of cobound insulator proteins (heterotypic binding) (24). We observe a similar trend in NC14 embryos: 26% (202 of 772) of domain boundaries are occupied by either all four (60 of 772) or three [BEAF-32, Su(Hw), and CP190] factors (Fig. 2C, right), and these are among those with the highest insulation (Fig. 2F). For boundaries occupied by other combinations of three or less insulator proteins, the insulation score varies depending on the identity of the bound insulators (Fig. 2F). For example, boundaries occupied by [BEAF-32, CTCF, & CP190] or by [CTCF, Su(Hw), & CP190] have weaker insulation than those bound by [BEAF-32, Su(Hw), & CP190] (Fig. 2F). Similarly, insulation strength varies in boundaries occupied by different combinations of two insulator proteins depending on their identity: For example, the median insulation of [BEAF-32 & CP190]–occupied boundaries is between the medians of the triple combinations [BEAF-32, CTCF, & CP190] and [BEAF-32, Su(Hw), & CP190]. CP190-only boundaries are an interesting exception, which have almost as strong insulation as those bound by all four proteins (Fig. 2F), suggesting that other CP190 recruiters are important for insulation at this stage of embryogenesis.
In summary, our results indicate a strong diversity in the occupancy of TAD boundaries by combinations of insulator proteins at NC14. The binding of more proteins has a higher likelihood that a site will form a boundary. However, it is not simply the number of bound insulator proteins that is the sole determinant of insulation; the nature of the bound factors (and presumably the genomic context) influences the strength of insulation at a boundary.
Genetic depletion leads to very efficient removal of insulator proteins from chromatin in early embryos
To investigate the functional contribution of each insulator protein to the establishment of TADs and gene expression in early embryos, we removed the maternal deposition of each protein in females in the developing oocyte. For CTCF, we used our previously generated knockout allele that completely removes both the maternal and zygotic CTCF mRNA and protein (38). Genetic knockout of BEAF-32, Su(Hw), or CP190 leads to strongly reduced fertility, sterility, or homozygous lethality, respectively (53–55). We therefore used an alternative depletion strategy, through RNA interference (RNAi)–mediated knockdown in the female germ line (56). We could successfully obtain embryos after knockdown of maternal BEAF-32 or CP190 in the female germ line. Unfortunately, knockdown of maternal Su(Hw) led to female sterility, and we could therefore not obtain embryos to study the contribution of Su(Hw) in the establishment of TADs. We therefore focused on the role of CTCF (using genetic deletion of the maternal and zygotic contribution) and BEAF-32 and CP190 (by RNAi knockdown).
The efficiency of protein depletion in NC14 embryos was assessed by two metrics: first, by Western blot, which indicates that the three insulator proteins (BEAF-32, CTCF, or CP190) are globally very strongly depleted to almost undetectable levels (Fig. 3, A and B); second, by quantitative CUT&Tag (C&T) using spike-ins to determine the protein’s depletion from chromatin. For both WT and insulator-depleted embryos, C&T was performed on 50,000 Drosophila melanogaster nuclei (isolated from NC14 embryos) combined with 50,000 nuclei from another Drosophila species (Drosophila virilis) (isolated from 2- to 4-hour embryos). This D. virilis spike-in was added to every sample (both WT and insulator-depleted D. melanogaster nuclei) to control for the efficiency of tagmentation between samples and thereby enables a more accurate quantification of the reduction in ChIP peaks. Such spike-in controls are particularly important in cases like this where we expect almost a complete loss of binding in the depletion condition.
The occupancy profiles from the spiked-in D. virilis nuclei for all three insulator proteins were nearly identical for nuclei pooled with D. melanogaster WT and insulator depletion–matched samples, indicating that all C&T experiments worked efficiently (Fig. 3C). In contrast, in D. melanogaster nuclei, there was a marked reduction in chromatin binding for BEAF-32, CTCF, and CP190 in their respective depletion conditions (Fig. 3, D and E), while the binding was still present in the pooled D. virilis nuclei (Fig. 3C). We could detect almost no significant peaks overlapping any of the three insulator proteins’ WT peaks following their depletion (Fig. 3, D and E). This is important, as even low levels of insulator protein are enough to sustain chromatin topology in other models (17, 57). We did observe some nonspecific peaks in the insulator-depleted samples not present in WT conditions. As these were only present after the depletion of the protein (i.e., in the D. melanogaster nuclei) and not in the D. virilis nuclei, this is likely to be spurious Tn5 activity (ATAC-seq-like signal) in the absence of the correct epitope for the primary antibodies. This is purely a technical artifact of C&T, while the binding of BEAF-32, CTCF, and CP190 at endogenous WT peaks as measured by ChIP-seq is severely reduced or completely absent (Fig. 3, D and E). These results confirm that our depletion conditions are very efficient at removing these factors from their bound regions and therefore can be used to assess the function of BEAF-32, CTCF, and CP190 in the establishment of TADs.
Insulator depletion does not inhibit TAD formation but does weaken specific boundaries with different properties
We performed Hi-C in tightly staged 2- to 3-hour embryos (which are highly enriched for NC14) that were maternally depleted of each individual insulator protein, as described above. Unexpectedly, depletion of either BEAF-32, CTCF, or CP190 did not prevent the establishment of the majority of TADs, as observed in the Hi-C matrices (Fig. 4A). A similar number of TADs are detected across all genotypes (fig. S2B), which was robust to different TAD calling parameters. Chromosomal interactions across genomic distances are also similar for all genotypes for both WT and the different insulator-depleted embryos (fig. S2C). However, quantifying the insulation scores of TAD boundaries revealed a subtle genome-wide decrease in insulation (i.e., an increase in insulation score) in all three depletions (fig. S2D), although most boundaries are still detectable. Moreover, all three depletions are more highly correlated between each other compared to the WT replicates (fig. S2E). Similarly, quantifying interaction frequencies within TADs and across the boundary with the neighboring TADs (intra- versus inter-TAD interaction frequencies) revealed that all three insulator depletions have higher inter-TAD interactions compared to WT (fig. S2D), again suggesting weakened boundaries. However, although significant, this difference between the depletions and WT is subtle.
Despite these subtle genome-wide trends, some individual loci are more strongly affected. For example, some loci have weakened boundaries in specific genotypes, which are observed as gains of interactions across the TAD boundary, as seen in the differential Hi-C matrices (Fig. 4B). To systematically identify regions exhibiting differential interactions (disrupted TAD boundaries), we used Pareidolia, which extends the computer vision–based algorithm Chromosight (58), for differential analysis between two conditions. This identified 148 of 878 (17%), 119 of 869 (14%), and 151 of 878 (17%) boundaries with differential interactions (using a cutoff of <−0.1) between WT and BEAF-32–, CTCF-, or CP190-depleted embryos, respectively (table S1). This changed to 43 (4.9%), 43 (4.9%), and 81 (9%) boundaries using a more stringent cutoff of <−0.2 (table S1). Clustering the top 100 disrupted boundaries in each genotype shows the highest reduction in differential boundary score (from Chromosight) for that protein’s depletion, as expected (Fig. 4C, green/blue). This also revealed a number of boundaries whose insulation is affected by more than one genotype (Fig. 4C, green/blue). We complemented this with a visual curation of all affected TAD boundaries. A large fraction of the computationally identified top 100 affected TAD boundaries have an obvious alteration in the contact matrix by visual inspection (60, 52, and up to 75 for BEAF-32, CTCF, and CP190 depletions, respectively; table S1). This represents roughly 6 to 10% of all TADs. CP190 depletion had the strongest effects on boundary loss (as seen in the example in Fig. 4B and subsequent figures), in agreement with recent findings at later stages (28) and consistent with its presence at ~80% of all boundaries (Fig. 2) and its proposed insulator cofactor role.
To assess whether the TAD boundaries that are sensitive to different insulator protein depletion have specific features, we divided boundaries into different classes based on their combinatorial insulator binding (as in Fig. 2) and analyzed which classes had the largest loss in insulation score. The violin plots are ordered from the largest loss of insulation (greatest change in insulation score; Fig. 4D, left) to boundaries that are not bound by that insulator protein for comparison (Fig. 4D, right). This revealed that depletion of BEAF-32, CTCF, or CP190 led to a stronger reduction in insulation at domain boundaries occupied by these respective factors (note that a higher insulation score loss indicates a larger decrease in boundary insulation; Fig. 4D), which, although expected, confirms the specificity of the depletions. It also revealed interesting differences between BEAF-32 and CTCF and CP190. The boundaries that are only bound by BEAF-32 (or [BEAF-32 & CP190]) are the most severely affected after BEAF-32 depletion (Fig. 4D), suggesting that at these boundaries, BEAF-32 is essential for full insulation. However, at other boundaries co-occupied by BEAF-32 and other factors, BEAF-32 depletion has little impact (Fig. 4D), suggesting that, in these contexts, it acts redundantly. Conversely, depletion of CTCF or CP190 has stronger effects on combinatorially bound boundaries. For example, depletion of CTCF had a stronger reduction in insulation at boundaries bound by all four insulator proteins, while CP190 had a stronger impact on boundaries occupied by [CTCF, Su(Hw), & CP190], suggesting that both insulators function by more cooperative interactions (Fig. 4D). This was also evident when plotting the quantitative signal of insulator cobinding at affected boundaries upon BEAF-32, CTCF, or CP190 depletion (fig. S3A). This suggests that CTCF and CP190 act more cooperatively with other factors. For CTCF, at least part of this cooperativity is likely through CP190 itself, as CTCF was recently shown to be essential for CP190 recruitment at a subset of boundaries at later stages of embryogenesis (29). Within our top 100 boundaries affected by CTCF depletion, 39 overlap CP190 peaks at both NC14 and later stages (29), ~50% (19 of 39) of which are dependent on CTCF for CP190 recruitment (at least at later stages) (29).
Differences between BEAF-32, CTCF, and CP190 are also apparent when looking at the relationship between the changes in insulation score and the number of ChIP peaks for that factor. TAD boundaries that overlap more BEAF-32 peaks (within the 10-kb window) in WT embryos are more dependent on BEAF-32 (i.e., had a larger reduction in insulation after depletion) (fig. S3B). However, this trend is not significant for CTCF and CP190 (fig. S3B), and there is no correlation between the ChIP peak height and insulator score for all three proteins (fig. S3C).
Examining the distance of an affected boundary to the closest insulator ChIP peak also revealed differences between each insulator protein. Boundaries affected in CTCF and CP190 are closer to a CTCF peak and further away from a BEAF-32 or CP190 peak in comparison to a random background of unaffected boundaries (fig. S3D). In BEAF-32 depletion, affected boundaries are further away from CP190 peaks than random but not significantly closer to BEAF-32 peaks. Boundaries affected in all three depletions were further from CP190 peaks (fig. S3D), which might be due to the widespread presence of CP190 at 80% of boundaries (and thus present in many nonaffected boundaries). In most cases, the significant difference in the distance distributions concentrates at about 50 kb from the TAD boundaries and is therefore difficult to reconcile with a simple direct relationship between loss of binding at the boundary and loss of boundary insulation, but again, it hints at a functional connection between CTCF and CP190.
Together, these results indicate that the initial establishment of TADs during Drosophila embryogenesis is generally robust to the loss of a single insulator protein. However, TAD boundary insulation is globally decreased, and some specific TAD boundaries are more sensitive to the loss of a given insulator protein than others. Our observations indicate that CP190 and CTCF function more combinatorially compared to BEAF-32, although BEAF-32 and CP190 cobind much more extensively throughout the genome. This suggests that these insulators act together in a more complex (or specific) manner than simple redundancy between any of the insulators. Specific combinations of insulator proteins are needed for full insulation at different boundaries.
Depletion of insulator proteins leads to transcriptional defects through different mechanisms
To investigate how insulator protein depletion during the establishment of TADs affects gene expression, we performed RNA sequencing (RNA-seq) in manually selected embryos at NC14, in triplicates. Principal components (PC) analysis illustrates that in all three genotypes, PC1 clearly separates depletion and WT replicates (fig. S4A). To avoid any potential confounding trans-effects, we removed maternally deposited genes from the analysis [using RNA-seq data from unfertilized oocytes (10)] and thereby focused only on genes that started to be expressed at the ZGA. Examining these strictly zygotically expressed genes identified 325, 436, and 597 differentially expressed genes (DEGs) in BEAF-32–, CTCF-, and CP190-depleted embryos, respectively, compared to stage-matched WT embryos [|og2 fold change| > 0.7 and false discovery rate (FDR) < 0.05] (fig. S4B and table S2). This is largely balanced between the numbers of up- and down-regulated genes (fig. S4B). Around 25% of all DEGs (263 of 1030) have expression changes in the same direction in two of the three genotypes. Thus, the transcriptional responses following depletion of insulator proteins are largely genotype specific. However, CTCF and CP190 depletion resulted in slightly more overlap in misexpressed genes (Fig. 5A), in keeping with the similarity in their most affected TAD boundaries (discussed above; Fig. 4), and in the requirement of CTCF for CP190 recruitment at a subset of boundaries (28).
We also examined the expression of genes that are activated during the minor and major waves of ZGA (59) and are involved in the first spatial patterning events of the embryo. The majority of these genes had no significant changes in their mRNA levels after the depletion of BEAF-32, CTCF, or CP190, with a few exceptions including the up-regulation of the homeotic gene scr in CTCF mutant embryos (table S3). This indicates that ZGA and the activation of the majority of the anterior-posterior and dorsal-ventral patterning genes [some of which are regulated by known distant-acting elements, e.g., (60–62)] do not require these insulator proteins.
Insulator proteins have also been implicated in directly regulating gene expression (63, 64). In Drosophila, for example, BEAF-32 (30) and CP190 (31) were proposed to have an activator role by directly binding to a subset of promoters, while CTCF was proposed to have either a transcriptional repressor (65) or activator (17, 64, 66) function. A subset of DEGs have insulator protein binding directly at their promoter [±500 base pairs (bp) from the transcription start site (TSS)] at this stage of embryogenesis (ZGA NC14 embryos) (Fig. 5B): 5% (15 of 325) of BEAF-32, 2% (10 of 436) of CTCF, and 16% (94 of 597) of CP190 DEGs. For BEAF and CTCF, these DEGs were roughly balanced between genes that were up- and down-regulated (7 of 15 and 4 of 10 down-regulated, respectively). However, 65% (61 of 94) of DEGs with CP190 binding at their promoter have reduced expression upon CP190 depletion (Fig. 5B). Therefore, our results show little evidence for direct promoter regulation by BEAF-32 or CTCF at these stages; however, CP190 may regulate the expression of a subset of genes more directly, independent of a role in TAD formation (Fig. 5B). This could involve the regulation of promoter-enhancer loops or a more direct regulatory role on the transcription machinery at the promoter. However, this represents only a small fraction (16% or less) of all misexpressed genes after CP190 depletion.
To explore the relationship between TAD boundary disruption and gene misexpression, we focused on the top 100 most affected TAD boundaries in each genotype and measured their distance to the nearest DEG (Fig. 5C). In all three depletions, the nearest DEG is located up to 300 kb away from a disrupted boundary, with a median distance of ~125, 100, and 70 kb for BEAF-32, CTCF, and CP190, respectively (Fig. 5C). This distance distribution is comparable to the background (i.e., the distance of 100 unaffected boundaries to the nearest DEG) [Fig. 5C, compare colored line (disrupted TADs) to gray (background)] and is larger than the average TAD size that most of these genes reside in. This indicates that DEGs are generally not enriched near disrupted TAD boundaries and that the TAD boundary is not constraining enhancer activity in the majority of cases, as we and others observed previously (8, 10).
However, there are examples of DEGs close to the disrupted TAD boundary and therefore could be good candidates for enhancer hijacking, where an enhancer in one TAD leads to the misregulation of a gene in a neighboring TAD after disruption of the boundary that normally segregates the two. Such examples have been previously observed in both mammals (9, 11, 67) and Drosophila (10, 68). Here, using this more rapid depletion, in a single generation, we searched for DEGs (zygotic only) with up-regulated expression in the neighboring TAD of TADs with affected boundaries after each depletion. Taking the top 100 affected boundaries in each genotype, we identified 10 (6%; 10 of 164 DEGs), 34 (12%; 34 of 275), and 40 (13%; 40 of 298) potential cases in BEAF-32, CTCF, and CP190 depletions, where a zygotic gene is up-regulated in the neighboring TAD (table S4). We chose three of these potential hijacking candidates that also have characterized enhancers that are active at NC14 to examine in more detail by fluorescent in-situ hybridisation (FISH). While one had no detectable change in expression (hairy locus), the other two (wingless and short gastrulation) had (Fig. 5, D and E).
Wingless (wg) is the ligand of one of the most characterized developmental signaling pathways whose expression initiates at NC14 to control the anterior-posterior patterning of the early embryo. wg and its enhancers are contained between two CTCF/CP190 binding sites, which separate them from the neighboring wnt4 gene, which is only activated later in development (Fig. 5D). One of the CTCF/CP190 ChIP peaks overlaps a TAD boundary between wg and Wnt4. This peak is lost and the TAD boundary is disrupted after CP190 (and CTCF) depletion (Fig. 5D). Our RNA-seq experiments indicate no change in wg expression but a slight yet significant increase in wnt4 expression, located in the neighboring domain upon CP190 depletion (Fig. 5D, arrowhead). Using RNA FISH, we observed a few cells that acquire wnt4 misexpression, which largely colocalizes with the high levels of wg in a posterior stripe (Fig. 5F, inset) or in a patch in the anterior of the embryo. This ectopic expression pattern suggests that the wg enhancer(s) controlling this expression pattern start(s) activating the wnt4 promoter upon loss of CP190.
An even more marked example is at the short gastrulation (sog) locus. sog is an essential gene expressed in two bands along the anterior-posterior axis of the embryo that will give rise to the neuroectoderm while being excluded from the presumptive mesoderm in between. Similar to wg, the sog locus is flanked by CTCF- and CP190-bound regions on both sides—the 3′ CTCF-CP190–bound region overlaps a disrupted domain boundary in both CTCF- and CP190-depleted embryos (Fig. 5E). Our RNA-seq measurements showed no alteration in sog expression but an up-regulation of two genes (CG12708 and CG15646) in the neighboring domain on the right-hand side of the boundary, in both CTCF- and CP190-depleted embryos (Fig. 5E, arrowheads). In situ hybridization using a probe that overlaps both genes shows that they are normally only expressed in a small domain in the anterior end of the embryo in NC14 stage embryos (Fig. 5G). However, they become misexpressed in a sog-like pattern in both CP190- and CTCF-depleted embryos (Fig. 5G), again suggesting that a sog enhancer can now communicate with these genes’ promoters.
Together, our results indicate that these insulator proteins likely regulate gene expression through multiple mechanisms at these stages of embryogenesis. This includes potentially acting directly at the promoter to activate gene expression (2 to 20% of down-regulated genes) and by constraining enhancer activity at domain boundaries (6 to 13% of up-regulated genes), exemplified by enhancer hijacking at the wg and sog loci. In these hijacking cases, neither the promoters [of both the patterning genes and misexpressed genes (wnt4, CG12708, and CG15646)] nor the enhancers (or their surrounding regions) are bound by insulators, indicating that these insulators do not mediate enhancer-promoter tethering but rather modulate their communication indirectly, by maintaining the nearby boundary. However, both of these mechanisms can only account for a minority (<20%) of all DEGs. The majority of DEGs have no obvious direct relationship to changes in topology and likely represent indirect effects. DEGs are enriched in various metabolic and developmental processes, including transcriptional regulators (by gene ontology term enrichments), and secondary targets of these factors are likely among the remaining >80% DEGs. There are very few cases of DEGs that directly overlap (within 10 kb) a disrupted TAD boundary after insulator protein depletion. This indicates that TAD boundaries can be diminished while maintaining normal transcription of the housekeeping genes present at that boundary and also perhaps that redundancy between insulator proteins and transcription itself helps to maintain insulation at boundaries.
An active promoter and an insulator-bound region are both required for full insulation of a TAD boundary
To explore whether an active promoter at a boundary (i.e., either the active promoter itself or transcription) can add to the overall robustness of a boundary and thereby help maintain boundary insulation after depletion of insulator proteins, we functionally dissected one boundary. We selected a boundary that overlaps both an insulator-bound region and an active promoter that drives strong expression of the main isoform of the gene bitesize (btsz) during NC14 (69). This promoter is bound by the transcription factor Zld at NC14 (Fig. 6A), which is essential for btsz expression at this stage of embryogenesis (21, 70). The boundary is also occupied by CP190 at NC14, which binds ~5 kb upstream of the btsz promoter (Fig. 6A). Both BEAF-32 and CTCF do not bind at the boundary, but CTCF binds to a site ~12 kb upstream of the promoter, while BEAF-32 cobinds with CP190 to the boundaries of the two adjacent domains (Fig. 6A). In contrast to Zld, there is no significant change in btsz expression upon depletion of any of the three insulator proteins (Fig. 6B).
To measure the impact of insulator protein depletion on boundary function, we used both Hi-C and DNA-FISH. Differential Hi-C maps indicate an increase in interactions crossing the domain boundary upon depletion of any of the three insulator proteins (Fig. 6C). This was confirmed independently using Chromosight (diff_score in relation to WT: −0.61, −0.48, and −0.49 for BEAF-32, CTCF, and CP190 depletions, respectively) and by calculating changes in insulation score (−0.71, −0.07, −0.04, and 0.13 for WT, BEAF-32, CTCF, and CP190 depletions, respectively). Concordantly, DNA-FISH measurements of the distance between the centers of the two adjacent domains (probes indicated in Fig. 6A) showed a decrease in domain distances (i.e., higher proximity) following insulator protein depletion: 12% of cells had these domains within 250 nm in WT embryos (Fig. 6D, bottom dashed line), which shifted to 31, 19, and 25% after the depletion of BEAF-32, CTCF, and CP190, respectively. Thus, both approaches indicate that removing any of the three insulators affects btsz domain boundary function, even if these insulators (BEAF-32 and CTCF) are not directly binding at the central domain boundary.
To assess the role of the active promoter, we first removed the major transcriptional regulator of this promoter, Zld (21, 70). Zld depletion led to an even stronger loss of boundary function (52% of cells with distances of <250 nm; Fig. 6D), indicating that at this locus, the activation of the promoter is more important for boundary function than insulator binding. To confirm that these effects are due to regulation in cis and not to potentially secondary trans-effects, we used CRISPR-Cas9 to delete three elements within the region: (i) the btsz promoter and promoter-proximal region (~3.5 kb), (ii) the CP190-bound region (~1 kb) leaving the btsz promoter intact, and (iii) both the btsz promoter and CP190-bound region (a 6-kb deletion spanning the entire boundary) (Fig. 6A, bottom). Deletion of the promoter (~3.5 kb) or the entire boundary (~6 kb) completely abolished btsz expression in NC14 embryos, as expected, while the insulator deletion had no detectable effect on btsz expression, confirming that the promoter is still active (Fig. 6E). Accordingly, since btsz is an essential gene, the promoter and the entire boundary deletions are homozygous lethal, while flies with the insulator binding site deletion are homozygous viable and fertile. To measure the insulation across the domain boundary in these lines, we performed DNA-FISH on embryos at NC14 (similar to Fig. 6D). Deletion of the insulator binding site led to a reduction in insulation between the two domains as seen in their increased proximity: The number of cells with distances of <250 nm changed from 12 to 16% in the WT and insulator deletion, respectively (Fig. 6F, bottom dashed line). However, the changes were much more marked in embryos with the active promoter deletion (Fig. 6F), changing from 12% (WT) to 32% (promoter deletion) of cells with distances of <250 nm. Therefore, at this locus, deletion of the active promoter had a more pronounced effect on boundary function compared to deletion of the insulator-bound region itself (Fig. 6F).
In summary, although deletion of the insulator binding site in cis (Fig. 6F) and depletion of insulator proteins in trans (Fig. 6D) have an effect on boundary function, the deletion of the promoter region and its activator protein, Zld, also had a marked effect on boundary insulation. This suggests that an active promoter (or transcription or some other property of the occupied promoter) is required for full boundary function and acts together with insulator protein binding, perhaps to reinforce boundary insulation. Of the insulators tested, the depletion of BEAF-32 had the most prominent effect (Fig. 6D), although it is not bound to the central boundary. This indicates that insulator proteins can influence domain boundary function in a long-range manner at some loci, perhaps by perturbing the compaction of the neighboring domains.
DISCUSSION
Genomic regions with different insulator binding signatures form domain boundaries in Drosophila
In mammals, deletion of individual CTCF sites typically has limited effects on TAD insulation. The fusion between two TADs usually requires deletion of multiple CTCF sites (12, 52, 71). In Drosophila, we observed, in line with others, that TAD boundaries are typically occupied combinatorically by multiple insulator proteins, which can be a mixture between multiple peaks for the same factor (homotypic) and multiple peaks of different factors (heterotypic) binding: e.g., for NC14, see Fig. 2, and for later embryonic stages, see (4, 23–26). For example, we observe both insulators cobinding at a single site (i.e., within a 200-bp window; Fig. 2C, left), consistent with the recruitment of CP190 by either BEAF-32 or CTCF (29, 31, 43, 48), in addition to more distributed binding within the same boundary region (Fig. 2C, right), with up to four different peaks for the same factor within a 10-kb boundary region. This suggests a potential basis for the robustness of TAD boundaries—perhaps the loss of a single factor is not enough to completely abolish boundary function. However, our data indicate that the strength of insulation is not simply dependent on the number of different cobound insulator proteins. In other words, this is not simple redundancy but rather the identity of the combination of factors present at each boundary matters. Different combinations of factors are associated with boundaries of different insulation strengths, with BEAF-32 appearing to act more redundantly, while CTCF and CP190 may act more cooperatively between each other and with other factors (Fig. 4D). These results highlight the diversity of Drosophila boundaries and the difficulty to predict whether an insulator-bound site will be able to form a boundary or not.
Insulator proteins are individually not required for the formation of most TADs
Our depletion strategy allowed us to achieve near-full depletion of these insulator proteins from chromatin, so they are not present at the moment when TADs are being established (Fig. 3). Using C&T with spike-ins was crucial to confidently and quantitatively compare insulator protein binding between control and insulator-depleted embryos. This revealed that there are no statistically significant peaks identified, of the thousands of WT peaks, in insulator-depleted embryos. This is important, as even low levels of insulator protein are enough to sustain chromatin topology in other models (17, 57).
In the absence of the insulator proteins BEAF-32, CTCF, or CP190, we observed that most Drosophila TADs (~90%) can still form with little apparent changes (Fig. 4), although 86% of TAD boundaries are occupied by at least one of these three proteins (Fig. 2). This robustness may be explained by alternative, not necessarily exclusive, hypotheses. One commonly proposed mechanism is redundancy between these insulator proteins, as multiple different proteins have been identified in Drosophila, as reviewed in (22). In mammals, TADs are formed by cohesin-mediated loop extrusion, which stalls at CTCF-bound sites (15, 16). Although there is currently no experimental evidence for cohesin-mediated loop extrusion in Drosophila, cohesin (or another loop extruder) could, in principle, be stalled by any of the C2H2 zinc finger insulator proteins in Drosophila, including CTCF. As we removed each one of these three insulator proteins individually, another factor could have compensated for their loss and facilitated stalling of loop extrusion to create the TAD boundary. Our data suggest that at each boundary, this is not simply the presence of “any” other insulator—there appears to be some specificity, as different boundaries have different sensitivities to each insulator protein’s depletion (Fig. 4D).
Alternatively, as the majority [almost 80% according to Ramirez et al. (23)] of Drosophila TAD boundaries overlap an active promoter of constitutively expressed genes (4, 23, 34), an active promoter (or transcription) may be involved in maintaining TAD boundary function. The injection of pharmacological inhibitors that block transcription had little apparent impact on the establishment of TADs at this stage (21, 72). However, our data dissecting the function of one boundary at the btsz TAD indicate that an active promoter (or transcription) is required for full boundary insulation. Deleting the promoter region had a stronger impact compared to deleting the insulator-bound region itself (Fig. 6, D and F). This is suggestive of a role of active promoters (or transcription) in boundary insulation, although we cannot exclude that the functional “boundary entity” is not polymerase II occupancy at the promoter (or formation of a preinitiation complex) or Zld binding itself, as discussed below. This promoter-associated function is also not the only requirement for boundary function (i.e., it does not abrogate insulation completely), but perhaps, it plays a role in the reinforcement of TAD boundaries after they are formed.
Last, TAD boundaries may be strongly locus specific in Drosophila, i.e., each boundary may rely on a specific combination of factors in addition to the surrounding genomic context to function. This is supported by the different subsets of boundaries that are more sensitive to depletion of different factors, as we observed here during ZGA and as has also been observed in other studies at different stages/contexts (23, 28, 29, 38, 39, 73). For example, for the factors depleted here, most disrupted boundaries were genotype specific and affected in the range of 6 to 10% of all TADs. Nevertheless, boundaries affected by depletion of CTCF and CP190 had more overlap compared to BEAF-32 depletion (Fig. 4). Even for the most affected TAD boundaries, the absence of the insulator protein did not lead to a complete loss of the boundary or complete fusion between neighboring TADs but rather a weakening of the boundary. As TADs are just forming at this stage and still quite fuzzy (due to extensive cell-to-cell variation; Fig. 1), we assumed that they would be easier to disrupt or break, but this does not seem to be the case. They are seemingly as resilient to perturbation here as seen in other contexts (12, 23, 28, 29, 38, 39, 52, 73).
Insulator proteins contribute to gene expression through different mechanisms
By definition, the major event that is occurring during ZGA is the activation of transcription in the embryo, making this a very interesting stage to examine the requirement of these proteins for both chromatin topology and the initiation of gene expression. After the depletion of each of the three factors, a few hundred zygotically expressed genes had significant changes in their expression (Fig. 5). However, it is interesting to note that this did not include many of the “classic” minor and major wave early patterning genes (59). However, there are some exceptions: (i) 1 of 10 pair-rule genes [even-skipped (eve)] was slightly down-regulated in CP190 depletion, and (ii) 2 of 8 homeotic genes had a change in expression: Scr was slightly up-regulated in BEAF-32 and down-regulated in CTCF depletions, while Ultrabithorax (Ubx) was down-regulated in CP190 depletion (table S3). None of the 13 gap genes were misregulated in any genotype. Therefore, ZGA can still occur largely unperturbed after the depletion of each of these insulator proteins, despite their occupancy at the majority of TAD boundaries and thousands of intra-TAD sites.
These insulator proteins may regulate a small fraction of genes (from 2 to 20% of the down-regulated genes in each genotype) by directly binding at their promoter, which can be due to the regulation of transcriptional initiation or by mediating communication with other cis-regulatory elements. We also identified a small number of cases of enhancer hijacking (6 to 13% depending on the genotype), where an enhancer in one TAD with a weakened boundary misappropriately activates a gene in the neighboring TAD (Fig. 5). For example, in the wg and sog loci, the TAD boundary seems to have restricted the enhancer’s activity such that weakening the boundary leads to enhancer activation of an additional target gene in the neighboring TAD. It is interesting to note that such enhancer hijacking cases varied in their effect size. In the wg locus, for example, disruption of the boundary is associated with a gain of wnt4 expression in a very small number of cells, while boundary disruption at the hairy locus had no obvious effect on the neighboring gene’s expression. In comparison, disruption of a boundary at the sog locus led to CG12708/CG15646 misexpression in the majority of sog-expressing cells. These observations, together with recent studies, indicate that the impact of boundary perturbations on gene expression, if any, depends on multiple factors (74). However, together, both mechanisms (direct promoter regulation and enhancer hijacking) can only account for a minority (<20% at best) of all misexpressed genes after these insulators’ depletion. DEGs are generally not enriched near disrupted TAD boundaries; the median distance of misexpressed genes is ~125, 100, and 70 kb from a disrupted boundary for BEAF-32, CTCF, and CP190 depletions, respectively (Fig. 5C). This suggests that these genes change in expression by secondary effects, perhaps by misregulated transcription factors or by other, as yet not understood mechanisms.
A potential role of active promoters or transcription in boundary strength
Using genetic deletions, we dissected one complex TAD boundary that simultaneously overlaps an active promoter and an insulator-bound region during ZGA (Fig. 6). In that boundary, deleting the active promoter (or removing its direct activator, Zld) caused a stronger effect on boundary insulation, in comparison to deleting the insulator binding site, which still had an effect on boundary insulation. This indicates that the boundary requires both insulators and the active promoter region for full insulation strength. At this time, it is still unclear what endows the active promoter region with “boundary activity.” It could be the assembly of a preinitiation complex, which is of substantial size (estimated ~100 proteins), and may therefore be sufficient to confer insulation by, for example, blocking loop extrusion. Alternatively, the formation of a transcription bubble, which is associated with local negative supercoiling, could also provide insulation, perhaps by blocking or slowing down loop extrusion. Paused polymerase could also play a role, as paused promoters can have insulator-like activity in enhancer-blocking transgenic-reporter assays (75), or perhaps, it is the movement of polymerase during transcriptional elongation. Many insulator proteins, including CTCF in mammals, have been proposed to also act as “normal” transcriptional activators. Although very speculative, this activation of transcription at promoters could also serve to reinforce these proteins’ ability to insulate regulatory domains.
Together, our results suggest that Drosophila domain boundaries are established in part by different combinations of insulator proteins, which are influencing the topology of each locus to a different degree. Approximately 80% of Drosophila boundaries contain promoters, and an active promoter or transcription at the boundary may also influence boundary strength as we show here for the btsz boundary. Future studies are needed to understand the relationship between insulator combinatorial binding and their genomic context to dissect the rules that govern the formation of domain boundaries in Drosophila and the cross-talk between boundary function and active promoters (or transcription).
MATERIALS AND METHODS
The Western blots, FISH, Hi-C, RNA-seq, ChIP-seq data processing, RNA-seq analysis, and Hi-C data analysis were performed with standard procedures. Detailed methods for each are provided in the Supplementary Materials, while the more specific methods used in this study are provided below.
Genetic deletion/knockdown of CTCF, BEAF-32, and CP190
Maternal knockout of CTCF depletion was performed as described previously in (38). Briefly, CTCF knockout flies were rescued with an FRT-flanked 5-kb CTCF genomic rescue transgene and developed into viable and fertile adults. The excision of the CTCF rescue cassette from male and female germ lines was achieved through nanos-GAL4:VP16 (NGVP16)–driven expression of UAS-FLP, and the resulting maternal/zygotic CTCF-depleted embryos were collected. This thereby generates complete genetic loss-of-function embryos.
BEAF-32 and CP190 depletion was performed by RNAi-mediated knockdown using stocks carrying short hairpin RNAs (shRNAs) as described previously (56). We first compared the efficiency of different germline GAL4 drivers and different temperatures. Virgin females carrying either BEAF-32 or CP190 shRNA [Vienna Drosophila Resource Center (VDRC), #330274 and Bloomington Drosophila Stock Center (BDSC), #33903, respectively] were crossed to males from either the MTD-Gal4 (BDSC, #31777) or Mat-tub-Gal4 lines (BDSC, #7063). These GAL4 lines express GAL4 either during all stages of oogenesis (MTD-Gal4) or only during late stages (Mat-tub-Gal4) (56). The crosses were incubated at 25° or 29°C, leading to different GAL4 efficiencies. F1 virgin females carrying one copy of either the BEAF-32 or CP190 shRNAs and one copy of the GAL4 transgene(s) were crossed to males carrying the BEAF-32 or CP190 shRNA, respectively. F2 embryos were collected and used for all experiments. For BEAF-32 depletion, the optimal combination was the mat-tub-GAL4 driver and 25°C, as other conditions (either the MTD-gal4 driver or the 29°C temperature) highly increased sterility. For CP190 depletions, the MTD-Gal4 driver and 29°C were used as this resulted in stronger levels of protein depletion (assessed by Western blot).
CRISPR deletions in the btsz locus
To generate flies with CRISPR deletions (Fig. 6), CRISPR donor and guide RNA (gRNA) plasmids were constructed following the strategy for “gene replacement with pHD-DsRed-attP” described in (76). gRNA sequences were generated by annealed oligo cloning and inserted into the Bbs I site of the pU6-BbsI-gRNA vector. To generate the homology arms, we polymerase chain reaction (PCR)–amplified regions between 2 and 3 kb starting directly from the upstream or downstream cutting site of each gRNA and inserted those into the Sap I and Aar I sites of the pHD-DsRed-attP vector, using the In-Fusion cloning kit (#639650, Takara Bio USA Inc.). Primers and gRNA oligos are listed in table S5. Both gRNA and homology repair template plasmids were injected into embryos carrying a vasa-Cas9 transgene on the third chromosome (BDSC #51324, w[1118]; PBac{y[+mDint2]=vas-Cas9}). Hatching adults were crossed to flies carrying a third chromosome balancer marked by green fluorescent protein (GFP), and the progeny was screened for both the DsRed and GFP markers. PCR genotyping was used to confirm the editing. DsRed+/GFP+ siblings were crossed, and the progeny was assessed for viability. Flies carrying a disruption of the btsz promoter or the whole boundary were not able to be maintained as homozygous stocks and were therefore maintained as a transheterozygous stock over a marker balancer chromosome.
Embryo collections
Freshly hatched adults were placed in embryo collection vials with standard apple cap plates. For DNA-FISH experiments, following three 1-hour pre-lays, the flies were allowed to lay for 3 hours, and embryos were directly collected. For genomic experiments, following three 1-hour pre-lays, the flies were either allowed to lay for 30 min after which the embryos were aged for 2h10 to reach the interval 2 hours and 10 min. - 2 hours and 40 min. (ChIP-seq at NC14), or the flies were allowed to lay for 1 hour, after which the embryos were aged for 2 hours to reach the interval of 2 to 3 hours (C&T, Hi-C, Western blot, and RNA-seq). The embryos were then dechorionated using 50% bleach and washed with deionized water and PBT 0.1% (phosphate-buffered saline containing 0.1% Triton X-100). Embryos used for FISH were cross-linked in 4% formaldehyde for 20 min at room temperature, devitelinized, and stored in 100% methanol at −20°C. Embryos used for Western blot and RNA-seq were kept on ice-cold PBT, and NC14 embryos were manually selected using an embryo needle, based on morphological indicators (77) under a stereoscope, and then directly placed in sample buffer (Western blot) or snap-frozen in liquid nitrogen (RNA-seq). Embryos used for ChIP-seq, C&T, and Hi-C (2 to 3 hours) were cross-linked in 1.8% formaldehyde for 15 min or 3% formaldehyde for 30 min at room temperature. Fixation was stopped by addition of PBT 0.1% + 125 mM glycine, followed by a wash with PBT 0.1%, then air-dried on tissue, and snap-frozen in liquid nitrogen.
3D DNA-FISH and combined DNA/RNA FISH
DNA-FISH probes were generated by PCR amplification from D. melanogaster genomic DNA (7- to 8.5-kb fragments) and TA cloning into a PGEMT-Easy vector (Promega, #A1360) (primers listed in table S5), except for the btsz locus, for which bacterial artificial chromosomes were used from DGRC (CHORI22-94I16 and CHORI22-115C08). The Nick Translation kit (Abbott Bioscience, #7J0001) was used to fluorescently label probes.
Embryos fixed with 1.8% formaldehyde and stored in 100% methanol were rehydrated, washed three times in 2× SSCT (2× SSC + 0.1% Tween), and then washed once in 20% formamide and in 50% formamide (both in 2× SSCT) at room temperature on a rotating shaker. This was followed by two 1-hour washes in 50% formamide at 37°C while rotating. The 50% formamide was removed, and the embryos were denatured at 80°C for 15 min in a water bath, placed on ice, and mixed with hybridization mix containing the fluorescent DNA probes. Following overnight hybridization at 37°C, embryos were washed twice in 50% formamide while rotating at 37°C, 1× with 20% formamide, and 3× in SSCT while rotating at room temperature. For DNA-FISH using a fluorescent marker (btsz CRISPR deletions; Fig. 6), the protocol continued using the HCR RNA FISH (Molecular Instruments) protocol to detect the GFP balancer chromosome, following the manufacturer’s instructions. Embryos were placed in ProLong Gold mounting medium with 4′,6-diamidino-2-phenylindole (DAPI; Thermo Fisher Scientific, #P36931) and mounted onto a slide.
Slides were imaged using a Leica SP8 confocal microscope with 100× objective (HC PL APO CS2 100×/numerical aperture 1.4/oil), a 405-nm laser, a white-light laser (470 to 670 nm), and HyD detectors. The z-stack step size was 200 nm. For all DNA-FISH samples, we acquired z-stacks covering a single layer of nuclei in the center of NC14 embryos. At least three embryos and hundreds of alleles were used per condition.
To precisely stage each single embryo, the number of nuclei in a 50-μm2 window was counted in each image, according to (78). Images were deconvolved using the Huygens Professional software (SVI) with default parameters. For the quantification of distances between DNA-FISH probes in each image, we used a custom FIJI plugin (“analyze FISH spots”) developed in-house by the European Molecular Biology Laboratory (EMBL) Advanced Light Microscopy Facility. The plugin has two main functions: (i) detect spots across different channels and (ii) automatically calculate the three-dimensional (3D) distances between spots. Briefly, the x, y, and z coordinates of FISH spots are determined on the basis of a manually provided value for signal intensity and background in each channel following visual inspection. The plugin displays the spots in the image, and the multipoint tool is used to manually select “clusters” of nuclear spots within a nucleus in the different channels along the z-stack (two or three channels depending on the number of DNA-FISH probes used). After all spot clusters are manually selected, the FIJI plugin calculates the pairwise 3D distances between the centers of mass of each spot per channel in all clusters. These distances were used in the DNA-FISH violin plots throughout this study. A Kolmogorov-Smirnov test was used to compare the distribution of distances between a given genotype and WT samples.
ChIP-seq and C&T of insulator proteins in NC14 embryos
ChIP-seq was performed as described in (79). After sonication and chromatin extraction, the chromatin was aliquoted into fresh tubes and stored at −80°C until use. The quality of the sheared chromatin was determined by agarose gel electrophoresis to observe chromatin fragment size distribution. The following antibodies were used: rabbit anti-CTCF (gift from R. Reinkawitz), rat anti-CP190 (gift from P. Georgiev), goat anti-Su(Hw) (gift from P. Geyer), and rabbit anti-GAF (gift from J. Lis), which were incubated overnight with chromatin in radioimmunoprecipitation assay (RIPA) buffer [140 mM NaCl, 10 mM tris-HCl (pH 8.0), 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% Na-deoxycholate, and 1× Roche cOmplete Protease inhibitors] in a total volume of 900 μl. We used 6 μg of chromatin for CTCF, 4 μg for CP190, 2 μg for Su(Hw), and 10 μg for GAF. Chromatin was fixed with 1.8% formaldehyde for 15 min for CTCF and Su(Hw) and with 3% formaldehyde for 30 min for CP190 and GAF. The next day, 25 μl of magnetic protein A/G beads (Dynabeads, Invitrogen, 10002D and 10004D) was washed with 1 ml of RIPA buffer and added to the samples for an additional 3-hour incubation on the rotating wheel at 4°C. For the BEAF-32 ChIP, 25 μl of protein G beads was combined with 100 μl of the BEAF-32 antibody (DSHB, #1553420) and 300 μl of RIPA buffer for 2 hours. This was followed by two washes with RIPA and resuspension in 100 μl of RIPA, which was added to the purified chromatin and incubated on the rotating wheel at 4°C overnight. The ChIPs were then washed for 10 min on the rotating wheel with 1× 1 ml of RIPA, 4× 1 ml of RIPA-500 [500 mM NaCl, 10 mM tris-HCl (pH 8.0), 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% Na-deoxycholate, and 1× Roche cOmplete Protease inhibitors], 1× 1 ml of LiCl buffer [250 mM LiCl, 10 mM tris-HCl (pH 8.0), 1 mM EDTA, 0.5% IGEPAL CA-630 CA-630, and 0.5% Na-deoxycholate], and 2× 1 ml of TE buffer [10 mM tris (pH 8.0) and 1 mM EDTA] on a magnetic rack in the cold room. The chromatin was then ribonuclease-treated (#10109142001, Roche) and reverse cross-linked overnight with proteinase K (0.5 mg/ml) and 0.5% SDS at 65°C. The next day, the DNA was purified with phenol-chloroform purification and precipitated with ethanol, sodium acetate (pH 5.3), and glycogen to obtain pure DNA. Library preparation was performed using the NEBNext Ultra II DNA Library Prep Kit for Illumina [New England Biolabs (NEB), #E7645S], following the manufacturer’s instructions.
For quantitative C&T, spike-in with the same number of D. virilis nuclei was used. Nuclei were counted using the BD LSRFortessaTM X-20 flow cytometer at the EMBL Flow Cytometry Facility and snap-frozen. We thawed fixed D. melanogaster and D. virilis nuclei (50,000 each) and then mixed the nuclei from both species for a total of 100,000 nuclei per sample. A conA bead slurry was added to the sample containing both D. melanogaster and D. virilis nuclei and placed on a rotating wheel for 10 min at 4°C. The nuclei-bead complex was washed and permeabilized, and 1 μl of primary antibody was added to each sample. For the CTCF and CP190 C&T experiments, we used the same antibodies as listed for ChIP-seq experiments, and for the BEAF-32 C&T, we used a primary antibody gifted by C. Hart. The tubes were placed on a rotating wheel and slowly rotated (5 rpm) overnight at 4°C. A secondary antibody solution (1:100) was added to the beads and incubated on a nutator at room temperature for 1 hour. We used the following secondary antibodies: guinea pig anti-rabbit immunoglobulin G (IgG; H + L) (Antibodies-online, ABIN6923140), rabbit anti-rat IgG (H + L) (Thermo Fisher Scientific, A18917), and rabbit anti-mouse IgG (H + L) (Abcam, ab46540). Samples were then washed three times, and C&T was performed as described previously (80). The PCR products were purified with Agencourt AMPure XP beads (Beckman Coulter, #A63881), quantified with Qubit, and ran on the Bioanalyzer using hs DNA reagents and chips. Final libraries were multiplexed and sequenced with 75-bp paired-end reads using an Illumina NextSeq 500 platform at the EMBL Genomics Core Facility.
Identifying disrupted TAD boundaries using Chromosight and Pareidolia
TAD boundary changes in 10-kb bin cooler matrices between WT and insulator-depleted samples were detected using Chromosight (58) and quantified with the Pareidolia tool (https://github.com/koszullab/pareidolia; release v1.2.0, accessed on 1 March 2022) (subsample = True, density_thresh = None, pearson_thresh = 0.0, and cnr_thresh = 0.0) using the 10-kb TAD boundaries as defined by HiCExplorer. Pareidolia quantifies the correlation of the Hi-C signal to the expected kernel (here, for TAD boundaries) and reports the difference (depletion minus WT, producing a negative score upon loss of correlation) together with a signal-to-noise ratio (snr) score that indicated how good was the separation between signal and noise in the evaluated submatrix. TAD boundaries with an snr of less than 5 were excluded. As Pareidolia does not provide any significance value on the reported differential score, we opted for a stratification approach where we compared the top changing boundaries (here, 100) to a set of 200 “stable” boundaries selected as the 200 boundaries with the smallest absolute differential score.
Acknowledgments
We thank all Furlong laboratory members for helpful discussions and comments. This work was technically supported by the EMBL’s Genomics core facility and by the public FlyBase and RedFly databases. We are grateful to the following groups for providing antibodies: R. Renkawitz (CTCF), P. Georgiev (CP190), P. Geyer [Su(Hw)], J. Lis (GAF), and C. Hart (BEAF-32); we thank C. Rushlow for providing Zld mutant flies.
Funding: E.E.M.F. is supported by grants from the Deutsche Forschungsgemeinschaft (DFG SPP 2202 and CRC1550) agreement FU 750, Baden-Württemberg Stiftung (BWST-ISF2019-032), and European Research Council (ERC advanced grant) agreement 787611 (DeCRyPT).
Author contributions: G.R.C., R.R.V., T.P., and S.F. performed the experiments. C.G. performed the computational analyses, including ChIP-seq, C&T, and Hi-C analyses and integration between the datasets. T.B.N.C. and P.L. performed Hi-C (Chromosight) analyses. A.R. performed RNA-seq analyses. S.F. generated CRISPR deletion lines. G.R.C. and E.E.M.F. conceptualized the project, designed and analyzed the experiments, and wrote the manuscript with input from the authors.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All sequencing data (Hi-C, ChIP-seq, C&T, and RNA-seq) from this study were deposited in EMBL-EBI ArrayExpress with accession numbers E-MTAB-9158 (Hi-C), E-MTAB-9156 (ChIP-seq), E-MTAB-11993 (C&T), and E-MTAB-11978 (RNA-seq). All processed ChIP-seq, C&T, and RNA-seq data are available on the Furlong laboratory website (http://furlonglab.embl.de/data).
Supplementary Materials
This PDF file includes:
Other Supplementary Material for this : manuscript includes the following:
REFERENCES AND NOTES
- 1.Dekker J., Heard E., Structural and functional diversity of topologically associating domains. FEBS Lett. 589, 2877–2884 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Dixon J. R., Selvaraj S., Yue F., Kim A., Li Y., Shen Y., Hu M., Liu J. S., Ren B., Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Nora E. P., Lajoie B. R., Schulz E. G., Giorgetti L., Okamoto I., Servant N., Piolot T., van Berkum N. L., Meisig J., Sedat J., Gribnau J., Barillot E., Bluthgen N., Dekker J., Heard E., Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Sexton T., Yaffe E., Kenigsberg E., Bantignies F., Leblanc B., Hoichman M., Parrinello H., Tanay A., Cavalli G., Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012). [DOI] [PubMed] [Google Scholar]
- 5.Bintu B., Mateo L. J., Su J. H., Sinnott-Armstrong N. A., Parker M., Kinrot S., Yamaya K., Boettiger A. N., Zhuang X., Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science 362, eaau1783 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Giorgetti L., Galupa R., Nora E. P., Piolot T., Lam F., Dekker J., Tiana G., Heard E., Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell 157, 950–963 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Symmons O., Uslu V. V., Tsujimura T., Ruf S., Nassari S., Schwarzer W., Ettwiller L., Spitz F., Functional and topological characteristics of mammalian regulatory domains. Genome Res. 24, 390–400 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Akdemir K. C., Le V. T., Chandran S., Li Y., Verhaak R. G., Beroukhim R., Campbell P. J., Chin L., Dixon J. R., Futreal P. A.; PCAWG Structural Variation Working Group; PCAWG Consortium , Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer. Nat. Genet. 52, 294–305 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Flavahan W. A., Drier Y., Liau B. B., Gillespie S. M., Venteicher A. S., Stemmer-Rachamimov A. O., Suva M. L., Bernstein B. E., Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Ghavi-Helm Y., Jankowski A., Meiers S., Viales R. R., Korbel J. O., Furlong E. E. M., Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression. Nat. Genet. 51, 1272–1282 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Lupianez D. G., Kraft K., Heinrich V., Krawitz P., Brancati F., Klopocki E., Horn D., Kayserili H., Opitz J. M., Laxova R., Santos-Simarro F., Gilbert-Dussardier B., Wittler L., Borschiwer M., Haas S. A., Osterwalder M., Franke M., Timmermann B., Hecht J., Spielmann M., Visel A., Mundlos S., Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Despang A., Schopflin R., Franke M., Ali S., Jerkovic I., Paliou C., Chan W. L., Timmermann B., Wittler L., Vingron M., Mundlos S., Ibrahim D. M., Functional dissection of the Sox9-Kcnj2 locus identifies nonessential and instructive roles of TAD architecture. Nat. Genet. 51, 1263–1271 (2019). [DOI] [PubMed] [Google Scholar]
- 13.de Wit E., Vos E. S., Holwerda S. J., Valdes-Quezada C., Verstegen M. J., Teunissen H., Splinter E., Wijchers P. J., Krijger P. H., de Laat W., CTCF binding polarity determines chromatin looping. Mol. Cell 60, 676–684 (2015). [DOI] [PubMed] [Google Scholar]
- 14.Rao S. S., Huntley M. H., Durand N. C., Stamenova E. K., Bochkov I. D., Robinson J. T., Sanborn A. L., Machol I., Omer A. D., Lander E. S., Aiden E. L., A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Fudenberg G., Imakaev M., Lu C., Goloborodko A., Abdennur N., Mirny L. A., Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Sanborn A. L., Rao S. S., Huang S. C., Durand N. C., Huntley M. H., Jewett A. I., Bochkov I. D., Chinnappan D., Cutkosky A., Li J., Geeting K. P., Gnirke A., Melnikov A., McKenna D., Stamenova E. K., Lander E. S., Aiden E. L., Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. U.S.A. 112, E6456–E6465 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Nora E. P., Goloborodko A., Valton A. L., Gibcus J. H., Uebersohn A., Abdennur N., Dekker J., Mirny L. A., Bruneau B. G., Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944.e22 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Wike C. L., Guo Y., Tan M., Nakamura R., Shaw D. K., Diaz N., Whittaker-Tademy A. F., Durand N. C., Aiden E. L., Vaquerizas J. M., Grunwald D., Takeda H., Cairns B. R., Chromatin architecture transitions from zebrafish sperm through early embryogenesis. Genome Res. 31, 981–994 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Du Z., Zheng H., Huang B., Ma R., Wu J., Zhang X., He J., Xiang Y., Wang Q., Li Y., Ma J., Zhang X., Zhang K., Wang Y., Zhang M. Q., Gao J., Dixon J. R., Wang X., Zeng J., Xie W., Allelic reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232–235 (2017). [DOI] [PubMed] [Google Scholar]
- 20.Ke Y., Xu Y., Chen X., Feng S., Liu Z., Sun Y., Yao X., Li F., Zhu W., Gao L., Chen H., Du Z., Xie W., Xu X., Huang X., Liu J., 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. Cell 170, 367–381.e20 (2017). [DOI] [PubMed] [Google Scholar]
- 21.Hug C. B., Grimaldi A. G., Kruse K., Vaquerizas J. M., Chromatin architecture emerges during zygotic genome activation independent of transcription. Cell 169, 216–228.e19 (2017). [DOI] [PubMed] [Google Scholar]
- 22.Ozdemir I., Gambetta M. C., The role of insulation in patterning gene expression. Genes (Basel) 10, 767 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Ramirez F., Bhardwaj V., Arrigoni L., Lam K. C., Gruning B. A., Villaveces J., Habermann B., Akhtar A., Manke T., High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat. Commun. 9, 189 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Van Bortle K., Nichols M. H., Li L., Ong C. T., Takenaka N., Qin Z. S., Corces V. G., Insulator function and topological domain border strength scale with architectural protein occupancy. Genome Biol. 15, R82 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Van Bortle K., Ramos E., Takenaka N., Yang J., Wahi J. E., Corces V. G., Drosophila CTCF tandemly aligns with other insulator proteins at the borders of H3K27me3 domains. Genome Res. 22, 2176–2187 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Wang Q., Sun Q., Czajkowsky D. M., Shao Z., Sub-kb Hi-C in D. melanogaster reveals conserved characteristics of TADs between insect and mammalian cells. Nat. Commun. 9, 188 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Chathoth K. T., Zabet N. R., Chromatin architecture reorganization during neuronal cell differentiation in Drosophila genome. Genome Res. 29, 613–625 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Kaushal A., Dorier J., Wang B., Mohana G., Taschner M., Cousin P., Waridel P., Iseli C., Semenova A., Restrepo S., Guex N., Aiden E. L., Gambetta M. C., Essential role of Cp190 in physical and regulatory boundary formation. Sci. Adv. 8, eabl8834 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Kaushal A., Mohana G., Dorier J., Ozdemir I., Omer A., Cousin P., Semenova A., Taschner M., Dergai O., Marzetta F., Iseli C., Eliaz Y., Weisz D., Shamim M. S., Guex N., Lieberman Aiden E., Gambetta M. C., CTCF loss has limited effects on global genome architecture in Drosophila despite critical regulatory functions. Nat. Commun. 12, 1011 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Jiang N., Emberly E., Cuvier O., Hart C. M., Genome-wide mapping of boundary element-associated factor (BEAF) binding sites in Drosophila melanogaster links BEAF to transcription. Mol. Cell. Biol. 29, 3556–3568 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Bartkuhn M., Straub T., Herold M., Herrmann M., Rathke C., Saumweber H., Gilfillan G. D., Becker P. B., Renkawitz R., Active promoters and insulators are marked by the centrosomal protein 190. EMBO J. 28, 877–888 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Li J., Gilmour D. S., Distinct mechanisms of transcriptional pausing orchestrated by GAGA factor and M1BP, a novel transcription factor. EMBO J. 32, 1829–1841 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Soshnev A. A., Baxley R. M., Manak J. R., Tan K., Geyer P. K., The insulator protein suppressor of Hairy-wing is an essential transcriptional repressor in the Drosophila ovary. Development 140, 3613–3623 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Rowley M. J., Nichols M. H., Lyu X., Ando-Kuri M., Rivera I. S. M., Hermetz K., Wang P., Ruan Y., Corces V. G., Evolutionarily conserved principles predict 3D chromatin organization. Mol. Cell 67, 837–852.e7 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.El-Sharnouby S., Fischer B., Magbanua J. P., Umans B., Flower R., Choo S. W., Russell S., White R., Regions of very low H3K27me3 partition the Drosophila genome into topological domains. PLOS ONE 12, e0172725 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Matthews N. E., White R., Chromatin architecture in the fly: Living without CTCF/cohesin loop extrusion?: Alternating chromatin states provide a basis for domain architecture in Drosophila. Bioessays 41, e1900048 (2019). [DOI] [PubMed] [Google Scholar]
- 37.Ulianov S. V., Khrameeva E. E., Gavrilov A. A., Flyamer I. M., Kos P., Mikhaleva E. A., Penin A. A., Logacheva M. D., Imakaev M. V., Chertovich A., Gelfand M. S., Shevelyov Y. Y., Razin S. V., Active chromatin and transcription play a key role in chromosome partitioning into topologically associating domains. Genome Res. 26, 70–84 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Gambetta M. C., Furlong E. E. M., The insulator protein CTCF is required for correct hox gene expression, but not for embryonic development in Drosophila. Genetics 210, 129–136 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Chathoth K. T., Mikheeva L. A., Crevel G., Wolfe J. C., Hunter I., Beckett-Doyle S., Cotterill S., Dai H., Harrison A., Zabet N. R., The role of insulators and transcription in 3D chromatin organization of flies. Genome Res. 32, 682–698 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Ogiyama Y., Schuettengruber B., Papadopoulos G. L., Chang J. M., Cavalli G., Polycomb-dependent chromatin looping contributes to gene silencing during drosophila development. Mol. Cell 71, 73–88.e5 (2018). [DOI] [PubMed] [Google Scholar]
- 41.Espinola S. M., Gotz M., Bellec M., Messina O., Fiche J. B., Houbron C., Dejean M., Reim I., Cardozo Gizzi A. M., Lagha M., Nollmann M., Cis-regulatory chromatin loops arise before TADs and gene activation, and are independent of cell fate during early Drosophila development. Nat. Genet. 53, 477–486 (2021). [DOI] [PubMed] [Google Scholar]
- 42.Cardozo Gizzi A. M., Cattoni D. I., Fiche J. B., Espinola S. M., Gurgo J., Messina O., Houbron C., Ogiyama Y., Papadopoulos G. L., Cavalli G., Lagha M., Nollmann M., Microscopy-based chromosome conformation capture enables simultaneous visualization of genome organization and transcription in intact organisms. Mol. Cell 74, 212–222.e5 (2019). [DOI] [PubMed] [Google Scholar]
- 43.Negre N., Brown C. D., Shah P. K., Kheradpour P., Morrison C. A., Henikoff J. G., Feng X., Ahmad K., Russell S., White R. A., Stein L., Henikoff S., Kellis M., White K. P., A comprehensive map of insulator elements for the Drosophila genome. PLOS Genet. 6, e1000814 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Yang J., Sung E., Donlin-Asp P. G., Corces V. G., A subset of Drosophila Myc sites remain associated with mitotic chromosomes colocalized with insulator proteins. Nat. Commun. 4, 1464 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Batut P. J., Bing X. Y., Sisco Z., Raimundo J., Levo M., Levine M. S., Genome organization controls transcriptional dynamics during development. Science 375, 566–570 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Mohan M., Bartkuhn M., Herold M., Philippen A., Heinl N., Bardenhagen I., Leers J., White R. A., Renkawitz-Pohl R., Saumweber H., Renkawitz R., The Drosophila insulator proteins CTCF and CP190 link enhancer blocking to body patterning. EMBO J. 26, 4203–4214 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Pai C. Y., Lei E. P., Ghosh D., Corces V. G., The centrosomal protein CP190 is a component of the gypsy chromatin insulator. Mol. Cell 16, 737–748 (2004). [DOI] [PubMed] [Google Scholar]
- 48.Vogelmann J., Le Gall A., Dejardin S., Allemand F., Gamot A., Labesse G., Cuvier O., Negre N., Cohen-Gonsaud M., Margeat E., Nollmann M., Chromatin insulator factors involved in long-range DNA interactions and their role in the folding of the Drosophila genome. PLOS Genet. 10, e1004544 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Maksimenko O., Bartkuhn M., Stakhov V., Herold M., Zolotarev N., Jox T., Buxa M. K., Kirsch R., Bonchuk A., Fedotova A., Kyrchanova O., Renkawitz R., Georgiev P., Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin. Genome Res. 25, 89–99 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Cuartero S., Fresan U., Reina O., Planet E., Espinas M. L., Ibf1 and Ibf2 are novel CP190-interacting proteins required for insulator function. EMBO J. 33, 637–647 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Zolotarev N., Fedotova A., Kyrchanova O., Bonchuk A., Penin A. A., Lando A. S., Eliseeva I. A., Kulakovskiy I. V., Maksimenko O., Georgiev P., Architectural proteins Pita, Zw5,and ZIPIC contain homodimerization domain and support specific long-range interactions in Drosophila. Nucleic Acids Res. 44, 7228–7241 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Rodriguez-Carballo E., Lopez-Delisle L., Zhan Y., Fabre P. J., Beccari L., El-Idrissi I., Huynh T. H. N., Ozadam H., Dekker J., Duboule D., The HoxD cluster is a dynamic and resilient TAD boundary controlling the segregation of antagonistic regulatory landscapes. Genes Dev. 31, 2264–2281 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Butcher R. D., Chodagam S., Basto R., Wakefield J. G., Henderson D. S., Raff J. W., Whitfield W. G., The Drosophila centrosome-associated protein CP190 is essential for viability but not for cell division. J. Cell Sci. 117, 1191–1199 (2004). [DOI] [PubMed] [Google Scholar]
- 54.Klug W. S., Bodenstein D., King R. C., Oogenesis in the suppressor of hairy-wing mutant of Drosophila melanogaster. I. Phenotypic characterization and transplantation experiments. J. Exp. Zool. 167, 151–156 (1968). [DOI] [PubMed] [Google Scholar]
- 55.Roy S., Gilbert M. K., Hart C. M., Characterization of BEAF mutations isolated by homologous recombination in Drosophila. Genetics 176, 801–813 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Staller M. V., Yan D., Randklev S., Bragdon M. D., Wunderlich Z. B., Tao R., Perkins L. A., Depace A. H., Perrimon N., Depleting gene activities in early Drosophila embryos with the "maternal-Gal4-shRNA" system. Genetics 193, 51–61 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Zuin J., Dixon J. R., van der Reijden M. I., Ye Z., Kolovos P., Brouwer R. W., van de Corput M. P., van de Werken H. J., Knoch T. A., van IJcken I. W. F., Grosveld F. G., Ren B., Wendt K. S., Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc. Natl. Acad. Sci. U.S.A. 111, 996–1001 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Matthey-Doret C., Baudry L., Breuer A., Montagne R., Guiglielmoni N., Scolari V., Jean E., Campeas A., Chanut P. H., Oriol E., Meot A., Politis L., Vigouroux A., Moreau P., Koszul R., Cournac A., Computer vision for pattern detection in chromosome contact maps. Nat. Commun. 11, 5795 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.De Renzis S., Elemento O., Tavazoie S., Wieschaus E. F., Unmasking activation of the zygotic genome using chromosomal deletions in the Drosophila embryo. PLoS Biol. 5, e117 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Celniker S. E., Sharma S., Keelan D. J., Lewis E. B., The molecular genetics of the bithorax complex of Drosophila: Cis-regulation in the abdominal-B domain. EMBO J. 9, 4277–4286 (1990). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Pankratz M. J., Seifert E., Gerwin N., Billi B., Nauber U., Jackle H., Gradients of Krüppel and knirps gene products direct pair-rule gene stripe patterning in the posterior region of the Drosophila embryo. Cell 61, 309–317 (1990). [DOI] [PubMed] [Google Scholar]
- 62.Small S., Blair A., Levine M., Regulation of even-skipped stripe 2 in the Drosophila embryo. EMBO J. 11, 4047–4057 (1992). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Braccioli L., de Wit E., CTCF: A Swiss-army knife for genome organization and transcription regulation. Essays Biochem. 63, 157–165 (2019). [DOI] [PubMed] [Google Scholar]
- 64.Kubo N., Ishii H., Xiong X., Bianco S., Meitinger F., Hu R., Hocker J. D., Conte M., Gorkin D., Yu M., Li B., Dixon J. R., Hu M., Nicodemi M., Zhao H., Ren B., Promoter-proximal CTCF binding promotes distal enhancer-dependent gene activation. Nat. Struct. Mol. Biol. 28, 152–161 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Filippova G. N., Fagerlie S., Klenova E. M., Myers C., Dehner Y., Goodwin G., Neiman P. E., Collins S. J., Lobanenkov V. V., An exceptionally conserved transcriptional repressor, CTCF, employs different combinations of zinc fingers to bind diverged promoter sequences of avian and mammalian c-myc oncogenes. Mol. Cell. Biol. 16, 2802–2813 (1996). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Pena-Hernandez R., Marques M., Hilmi K., Zhao T., Saad A., Alaoui-Jamali M. A., del Rincon S. V., Ashworth T., Roy A. L., Emerson B. M., Witcher M., Genome-wide targeting of the epigenetic regulatory protein CTCF to gene promoters by the transcription factor TFII-I. Proc. Natl. Acad. Sci. U.S.A. 112, E677–E686 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Hanssen L. L. P., Kassouf M. T., Oudelaar A. M., Biggs D., Preece C., Downes D. J., Gosden M., Sharpe J. A., Sloane-Stanley J. A., Hughes J. R., Davies B., Higgs D. R., Tissue-specific CTCF-cohesin-mediated chromatin architecture delimits enhancer interactions and function in vivo. Nat. Cell Biol. 19, 952–961 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Fujioka M., Sun G., Jaynes J. B., The Drosophila eve insulator Homie promotes eve expression and protects the adjacent gene from repression by polycomb spreading. PLOS Genet. 9, e1003883 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Serano J., Rubin G. M., The Drosophila synaptotagmin-like protein bitesize is required for growth and has mRNA localization sequences within its open reading frame. Proc. Natl. Acad. Sci. U.S.A. 100, 13368–13373 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Liang H. L., Nien C. Y., Liu H. Y., Metzstein M. M., Kirov N., Rushlow C., The zinc-finger protein Zelda is a key activator of the early zygotic genome in Drosophila. Nature 456, 400–403 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Anania C., Acemel R. D., Jedamzick J., Bolondi A., Cova G., Brieske N., Kuhn R., Wittler L., Real F. M., Lupianez D. G., In vivo dissection of a clustered-CTCF domain boundary reveals developmental principles of regulatory insulation. Nat. Genet. 54, 1026–1036 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Li L., Lyu X., Hou C., Takenaka N., Nguyen H. Q., Ong C. T., Cubenas-Potts C., Hu M., Lei E. P., Bosco G., Qin Z. S., Corces V. G., Widespread rearrangement of 3D chromatin organization underlies polycomb-mediated stress-induced silencing. Mol. Cell 58, 216–231 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Bag I., Chen S., Rosin L. F., Chen Y., Liu C. Y., Yu G. Y., Lei E. P., M1BP cooperates with CP190 to activate transcription at TAD borders and promote chromatin insulator activity. Nat. Commun. 12, 4170 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Galouzis C. C., Furlong E. E. M., Regulating specificity in enhancer-promoter communication. Curr. Opin. Cell Biol. 75, 102065 (2022). [DOI] [PubMed] [Google Scholar]
- 75.Chopra V. S., Cande J., Hong J. W., Levine M., Stalled Hox promoters as chromosomal boundaries. Genes Dev. 23, 1505–1509 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Gratz S. J., Ukken F. P., Rubinstein C. D., Thiede G., Donohue L. K., Cummings A. M., O’Connor-Giles K. M., Highly specific and efficient CRISPR/Cas9-catalyzed homology-directed repair in Drosophila. Genetics 196, 961–971 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.J. A. Campos-Ortega, V. Hartenstein, The Embryonic Development of Drosophila Melanogaster (Springer, 2013). [Google Scholar]
- 78.Blythe S. A., Wieschaus E. F., Zygotic genome activation triggers the DNA replication checkpoint at the midblastula transition. Cell 160, 1169–1181 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Bonn S., Zinzen R. P., Perez-Gonzalez A., Riddell A., Gavin A. C., Furlong E. E., Cell type-specific chromatin immunoprecipitation from multicellular complex samples using BiTS-ChIP. Nat. Protoc. 7, 978–994 (2012). [DOI] [PubMed] [Google Scholar]
- 80.Kaya-Okur H. S., Janssens D. H., Henikoff J. G., Ahmad K., Henikoff S., Efficient low-cost chromatin profiling with CUT&Tag. Nat. Protoc. 15, 3264–3283 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Harrison M. M., Li X. Y., Kaplan T., Botchan M. R., Eisen M. B., Zelda binding in the early Drosophila melanogaster embryo marks regions subsequently activated at the maternal-to-zygotic transition. PLOS Genet. 7, e1002266 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Schindelin J., Arganda-Carreras I., Frise E., Kaynig V., Longair M., Pietzsch T., Preibisch S., Rueden C., Saalfeld S., Schmid B., Tinevez J. Y., White D. J., Hartenstein V., Eliceiri K., Tomancak P., Cardona A., Fiji: An open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Schor I. E., Bussotti G., Males M., Forneris M., Viales R. R., Enright A. J., Furlong E. E. M., Non-coding RNA expression, function, and variation during Drosophila embryogenesis. Curr. Biol. 28, 3547–3561.e9 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Martin M., Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 1–10 (2011). [Google Scholar]
- 85.Dobin A., Davis C. A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T. R., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Li B., Dewey C. N., RSEM: Accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Love M. I., Huber W., Anders S., Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Ramani V., Deng X., Qiu R., Gunderson K. L., Steemers F. J., Disteche C. M., Noble W. S., Duan Z., Shendure J., Massively multiplex single-cell Hi-C. Nat. Methods 14, 263–266 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.