eLife. 2021 Aug 3;10:e69937. doi: 10.7554/eLife.69937

CLAMP and Zelda function together to promote Drosophila zygotic genome activation

Jingyue Duan 1,†,‡, Leila Rieder 2, Megan M Colonnetta 3, Annie Huang 1, Mary Mckenney 1, Scott Watters 4, Girish Deshpande 1,3, William Jordan 1, Nicolas Fawzi 4, Erica Larschan 1
Editors: Yukiko M Yamashita 5, Kevin Struhl 6
PMCID: PMC8367384  PMID: 34342574

Abstract

During the essential and conserved process of zygotic genome activation (ZGA), chromatin accessibility must increase to promote transcription. Drosophila is a well-established model for defining mechanisms that drive ZGA. Zelda (ZLD) is a key pioneer transcription factor (TF) that promotes ZGA in the Drosophila embryo. However, many genomic loci that contain GA-rich motifs become accessible during ZGA independent of ZLD. Therefore, we hypothesized that other early TFs that function with ZLD have not yet been identified, especially those that are capable of binding to GA-rich motifs such as chromatin-linked adaptor for male-specific lethal (MSL) proteins (CLAMP). Here, we demonstrate that Drosophila embryonic development requires maternal CLAMP to (1) activate zygotic transcription; (2) increase chromatin accessibility at promoters of specific genes that often encode other essential TFs; and (3) enhance chromatin accessibility and facilitate ZLD occupancy at a subset of key embryonic promoters. Thus, CLAMP functions as a pioneer factor that plays a targeted yet essential role in ZGA.

Research organism: D. melanogaster

Introduction

During zygotic genome activation (ZGA), dramatic reprogramming occurs in the zygotic nucleus to initiate global transcription and prepare the embryo for further development (Jukam et al., 2017). Chromatin changes that activate the zygotic genome during ZGA rely on cooperation among transcription factors (TFs) (Lee et al., 2014). However, only pioneer TFs (Cirillo and Zaret, 1999; Mayran and Drouin, 2018) can bind to closed chromatin before ZGA because most TFs cannot bind to nucleosomal DNA (Soufi et al., 2015).

In Drosophila, the pioneer TF Zelda (ZLD; zinc-finger early Drosophila activator) plays a key role during ZGA (Liang et al., 2008). ZLD exhibits several critical characteristics of pioneer TFs, including (1) binding to nucleosomal DNA (Sun et al., 2015; McDaniel et al., 2019); (2) regulating transcription of early zygotic genes (Harrison et al., 2011); and (3) modulating chromatin accessibility to increase the ability of other non-pioneer TFs to bind to DNA (Schulz et al., 2015). However, a large subset of ZLD binding sites (60%) are highly enriched for GA-rich motifs and have constitutively open chromatin even in the absence of ZLD (Schulz et al., 2015). Therefore, we and others (Schulz et al., 2015) hypothesized that other pioneer TFs which directly bind to GA-rich motifs work together with ZLD to activate the zygotic genome.

GAGA-associated factor (GAF; Farkas et al., 1994) and chromatin-linked adaptor for male-specific lethal (MSL) proteins (CLAMP; Soruco et al., 2013) are two of few known TFs that can bind to GA-rich motifs and regulate transcriptional activation in Drosophila (Fuda et al., 2015; Kaye et al., 2018). GAF performs several essential functions in early embryos, including chromatin remodeling (Shimojima et al., 2003; Judd et al., 2021; Gaskill et al., 2021) and RNA Pol II recruitment (Li et al., 2013; Fuda et al., 2015; Duarte et al., 2016), and is required for embryonic nuclear divisions (Bhat et al., 1996).

CLAMP is a GA-binding TF essential for early embryonic development (Rieder et al., 2017) that binds to promoters and plays several vital roles including opening chromatin on the male X chromosome to recruit the MSL dosage compensation complex (Urban et al., 2017b; Rieder et al., 2019) and activating coordinated regulation of the histone genes at the histone locus (Rieder et al., 2017). Therefore, we hypothesized that CLAMP functions with ZLD as a pioneer factor to promote ZGA.

Here, we first demonstrate that depleting maternal CLAMP disrupts transcription of critical early zygotic genes causing significant phenotypic changes in early embryos. Next, we define several mechanisms by which CLAMP regulates ZGA: (1) CLAMP activates zygotic transcription via direct binding to target genes; (2) CLAMP binds directly to nucleosomal DNA and increases chromatin accessibility of promoters of a subset of genes that often encode other essential TFs; and (3) CLAMP and ZLD regulate each other’s occupancy at promoters which further regulates the transcription of their target genes. Overall, we determine that CLAMP is an essential pioneer factor that functions with ZLD to regulate ZGA.

Results

Depletion of maternal CLAMP disrupts expression of genes that regulate zygotic patterning and cytoskeletal organization in blastoderm embryos

We previously reported that nearly 100% (99.87%) of maternal clamp RNAi embryos never hatch and die at early embryonic stages (Rieder et al., 2017), demonstrating that maternally deposited CLAMP is critical for embryonic development. To assess embryonic phenotypic patterning after maternal clamp depletion, we first identified three key early zygotic genes (even-skipped, runt, and neurotactin; Figure 1—figure supplement 1A) that have significantly (adjusted p<0.05, DESeq2, RNA-seq) reduced expression in early embryos when maternal CLAMP is depleted (Rieder et al., 2017). We then used single-molecule fluorescence in situ hybridization (smFISH) for even-skipped and runt, and immunostaining for Neurotactin (NRT), to determine how the depletion of maternal clamp or zld alters phenotypic patterning and cytoskeletal integrity in blastoderm stage embryos (Figure 1). We validated the knockdown of clamp and zld in early embryos by qRT-PCR and Western blotting (Figure 1—figure supplement 1B,C and Figure 1—source data 1).

Figure 1. Novel pioneer factor CLAMP is essential for early embryonic development.

(A) Control maternal egfp depletion (left), maternal clamp depletion (middle), and maternal zld depletion (right) syncytial blastoderm stage embryos probed using smFISH for the pair-rule patterning genes run (green) and eve (red). Embryos were co-labeled with Hoechst (blue) to visualize nuclei. Scale bar represents 10 µm. Quantification (%) of eve/run defective embryos is on the right, p-values were calculated with the Fisher’s exact test; number of embryos is in parentheses. (B) Control maternal egfp depletion (left), maternal clamp depletion (middle), and maternal zld depletion (right) syncytial blastoderm stage embryos were assessed for integrity of the developing cytoskeleton using anti-NRT antibody (green) and Hoechst (blue) to label nuclei. Scale bar represents 10 µm. Quantification (%) of NRT defective embryos is on the right, p-values were calculated with the Fisher’s exact test; number of embryos is in parentheses. (C) Electrophoretic mobility shift assay (EMSA) showing the binding of increasing amounts of CLAMP DNA-binding domain (DBD) fused to MBP to 5C2 naked DNA or 5C2 in vitro reconstituted nucleosomes (Nucs). Concentrations (µM) of CLAMP DBD increase from left to right. (D) EMSA showing the binding of increasing amounts of full-length (FL) CLAMP (fused to MBP) to 5C2 DNA or 5C2 Nucs. Concentrations (µM) of CLAMP FL increase from left to right. (E) Effect of maternal clamp RNAi on maternally deposited (orange) or zygotically transcribed (yellow) gene expression log2 (clamp-i/MTD) in 0–2 hr (left) or 2–4 hr (right). Maternal versus zygotic gene categories were as defined in Lott et al., 2011. p-values of significant expression changes between maternal and zygotic genes were calculated by Mann-Whitney U-test and noted on the plot. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; smFISH, single-molecule fluorescence in situ hybridization.

Figure 1—source data 1. Original western blots and EMSA images.

Figure 1—figure supplement 1. Novel pioneer factor CLAMP is essential for early embryonic development.

(A) Expression of early zygotic genes (even-skipped, runt, and neurotactin) in MTD and clamp-i embryos measured by RNA-seq (Rieder et al., 2017). Log2fold change and p-value were calculated using DESeq2. (B) Expression of mRNAs (clamp and zld) in MTD, clamp-i, and zld-i embryos in 0–2 hr and 2–4 hr embryos measured by RT-qPCR. Log2fold change was calculated using the ΔΔCt method (Rao et al., 2013) and normalized to reference gene pka. (C) Western blot of CLAMP, ZLD, and reference control Actin in MTD, clamp-i, and zld-i embryos in 0–2 hr and 2–4 hr embryos. MTD: MTD-Gal4 line. clamp-i: MTD-Gal4-clamp mRNAi line, zld-i: MTD-Gal4-zld mRNAi line. (D) Genome browser tracks for a region of the CES 5C2 locus (red bar is 500 bp) used to make in vitro reconstituted nucleosomes (Urban et al., 2017a). Locations of three CLAMP binding motifs are marked as black dots. CLAMP chromatin immunoprecipitation-sequencing (ChIP-seq) normalized sequencing reads are shown in green. MNase-seq MACC scores (dark blue) show chromatin accessibility in S2 cells in control (egfp-i), clamp, and msl2 RNAi treatment. The nucleosome (Nuc) profile is shown in purple. The dashed red rectangle highlights the genomic region (240 bp) used to reconstitute nucleosomes in vitro. (E) Effect of maternal zld RNAi (Schulz et al., 2015) on maternally deposited (orange) or zygotically transcribed (yellow) gene expression log2 (clamp-i/MTD) in 0–2 hr (left) or 2–4 hr (right) measured by RNA-seq. Maternal versus zygotic gene categories were as defined in Lott et al., 2011. p-values of significant expression changes between maternal and zygotic genes were calculated by Mann-Whitney U-test and noted on the plot. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; MBP, maltose-binding protein.
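The ΔΔCt calculation named in panel B can be sketched as follows. This is a minimal illustration rather than the authors' analysis script: the Ct values are hypothetical, the function name `log2_fold_change_ddct` is introduced here for illustration, and the sketch assumes roughly 100% amplification efficiency, so that one PCR cycle corresponds to a twofold change.

```python
def log2_fold_change_ddct(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Return log2 fold change (knockdown vs. control) by the delta-delta-Ct method.

    Assumes ~100% primer efficiency, i.e. one cycle equals a twofold change.
    """
    # Normalize the target gene to the reference gene within each sample.
    d_ct_kd = ct_target_kd - ct_ref_kd
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the knockdown sample with the control sample.
    dd_ct = d_ct_kd - d_ct_ctrl
    # A higher Ct means less template, so the sign is flipped.
    return -dd_ct


# Hypothetical Ct values (not taken from the paper).
log2fc = log2_fold_change_ddct(
    ct_target_kd=25.5, ct_ref_kd=18.5,      # knockdown: target, reference
    ct_target_ctrl=22.0, ct_ref_ctrl=18.0,  # control: target, reference
)
fold = 2 ** log2fc
print(f"log2 fold change = {log2fc}, fold change = {fold}")
```

With these illustrative numbers the target transcript is detected three cycles later in the knockdown after normalization, which corresponds to an eightfold reduction.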

Both even-skipped (eve) and runt (run) play an important role in embryonic segmentation (Manoukian and Krause, 1992). Eve also establishes sharp boundaries between parasegments (Fujioka et al., 1995). Strikingly, when maternally deposited clamp is depleted, we observed the complete disruption of classic seven stripe pair-rule gene expression patterns using smFISH (Figure 1A, middle). Additionally, the nuclei in the embryonic syncytium were disassociated compared to control egfp RNAi embryos (Figure 1A, left). Furthermore, the expression of eve and run was significantly reduced in clamp maternal depletion embryos that also failed to form sharp stripe boundaries. We observed similar, but slightly stronger, phenotypic changes in zld maternal depletion embryos (Figure 1A, right), indicating that CLAMP and ZLD have critical roles in establishing embryonic patterning in pre-cellular blastoderm embryos. Moreover, all of the embryos depleted for maternal clamp or maternal zld (n=10) show defective eve/run localization (p<0.05, Fisher’s exact test).

Next, we used immunostaining to examine the localization of NRT, a cell adhesion glycoprotein that is expressed early during Drosophila embryonic cellularization in a lattice surrounding syncytial blastoderm nuclei (Hortsch et al., 1990). In clamp maternal depletion embryos (Figure 1B, middle), we observed dramatically disrupted cellularization and reduced NRT levels. These embryos fail to form the wild-type pattern of cytoskeletal elements, which can be seen in the egfp RNAi control embryos (Figure 1B, left). Embryos depleted for maternal zld also reveal similar patterns of discordant nuclei. More than 50% of embryos depleted for maternal clamp (n=23) and 100% of embryos depleted for maternal zld (n=10) show NRT disruption (Figure 1B, right) (p<0.05, Fisher’s exact test). Overall, smFISH and immunostaining results suggest that both maternally deposited CLAMP and ZLD are essential for early embryonic patterning and development.
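The Fisher's exact tests used for these embryo counts can be illustrated with a small standard-library sketch. It is not the authors' code: the control counts (0 of 10 defective) are a hypothetical assumption, since only the knockdown counts are stated in the text, and `fisher_exact_two_sided` is a helper written here for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: knockdown, control. Columns: defective, normal.
# Knockdown 10/10 defective follows the text; control 0/10 is hypothetical.
p_value = fisher_exact_two_sided(10, 0, 0, 10)
print(f"p = {p_value:.3g}")
```

Under this assumed control group the test is significant at the p<0.05 threshold reported above; a real analysis would use the observed control counts from Figure 1.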

CLAMP binds to nucleosomal DNA in vivo and in vitro

One of the intrinsic characteristics of pioneer TFs is their capacity to bind nucleosomal DNA and compacted chromatin (Cirillo and Zaret, 1999). To test the hypothesis that CLAMP is a pioneer TF, we performed electrophoretic mobility shift assays (EMSAs) that test the intrinsic capability of CLAMP to directly interact with nucleosomes in vitro. First, we identified a 240-bp region of the X-linked 5C2 locus (Figure 1—figure supplement 1D) that CLAMP binds in cultured S2 cells and that exhibits decreased chromatin accessibility in the absence of CLAMP (Urban et al., 2017b). This region is also occupied by a nucleosome (Figure 1—figure supplement 1D), suggesting that CLAMP promotes accessibility of this region while binding to nucleosomes.

We then performed in vitro nucleosome assembly using 240 bp of DNA from the 5C2 locus that contains three CLAMP-binding motifs, and we used 5C2 naked DNA as a control. We found that both the CLAMP DNA-binding domain (DBD; Figure 1C and Figure 1—source data 1) and full-length (FL) protein (Figure 1D and Figure 1—source data 1) can bind and shift both 5C2 naked DNA and nucleosomes assembled with 5C2 DNA. Increased protein concentration results in a secondary ‘super’ shift species (Figure 1C and D and Figure 1—source data 1), indicating that multiple CLAMP molecules may occupy the three CLAMP-binding motifs. Both FL CLAMP and CLAMP DBD are fused to the maltose-binding protein (MBP), which we previously demonstrated does not bind to DNA independent of CLAMP or alter the specificity of CLAMP binding (Kaye et al., 2018). Previously, we determined that CLAMP binds specifically to GAGA-repeats in vivo and in vitro (Soruco et al., 2013; Kaye et al., 2018). Here, we further demonstrate that the zinc-finger protein CLAMP can directly bind to nucleosomal DNA and generates multiple shift species consistent with the potential to bind to multiple binding sites simultaneously.

CLAMP regulates zygotic genome activation

To define how CLAMP regulates early embryonic patterning, we examined the effect of maternal CLAMP depletion on the expression of maternally deposited or zygotically transcribed genes (Lott et al., 2011) using RNA-seq data (Rieder et al., 2017). We found that the expression levels of zygotically transcribed genes but not maternally deposited genes were significantly downregulated in embryos lacking CLAMP (p<0.001, Mann-Whitney U-test) (Figure 1E). Therefore, CLAMP has a specific effect on the transcription of zygotic genes similar to that which has been previously reported for ZLD (Liang et al., 2008; Harrison et al., 2011; McDaniel et al., 2019) and confirmed in this study (Figure 1—figure supplement 1E) using stage 5 embryos lacking maternal ZLD (GSE65837, Schulz et al., 2015).
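The comparison between maternal and zygotic gene classes relies on the Mann-Whitney U-test. The sketch below shows the idea with the standard library only. It is an illustration under stated assumptions: the log2 fold-change values are invented, the helper `mann_whitney_u` is hypothetical, and the p-value uses a normal approximation without tie or continuity correction, which a statistics package would normally refine.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u_x - mean_u) / sd_u
    return u_x, erfc(abs(z) / sqrt(2))


# Hypothetical log2(clamp-i/MTD) values, for illustration only.
zygotic = [-2.1, -1.5, -1.8, -0.9, -1.2]
maternal = [0.1, -0.2, 0.3, 0.0, -0.1]
u_stat, p_value = mann_whitney_u(zygotic, maternal)
print(f"U = {u_stat}, p = {p_value:.3g}")
```

A U statistic of zero for the zygotic group means every zygotic value ranks below every maternal value, the pattern expected if only zygotic genes are downregulated.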

CLAMP regulates chromatin accessibility in early embryos

An essential characteristic of pioneer TFs is that they can establish and maintain the accessibility of their DNA target sites, allowing other TFs to bind to DNA and activate transcription (Zaret and Carroll, 2011; Iwafuchi-Doi et al., 2016). We previously used MNase-seq (Urban et al., 2017b) to determine that CLAMP guides MSL complex to GA-rich sequences by promoting an accessible chromatin environment on the male X chromosome in cultured S2 cells. Furthermore, GA-rich motifs are enriched in regions that remain accessible in the absence of the pioneer factor ZLD (Schulz et al., 2015). Therefore, we hypothesized that CLAMP regulates chromatin accessibility at some ZLD-independent GA-rich loci during ZGA.

To test our hypothesis, we performed the assay for transposase-accessible chromatin using sequencing (ATAC-seq) on 0–2 hr (pre-ZGA) and 2–4 hr (post-ZGA) embryos with wild-type (wt) levels of CLAMP (maternal triple GAL4 [MTD] driver alone; Ni et al., 2011) and on embryos depleted for maternal CLAMP using RNAi driven by the MTD driver (clamp-i). We identified differentially accessible (DA) regions (Figure 2A and Figure 2—source data 1) by comparing ATAC-seq reads between MTD and clamp-i embryos using DiffBind (Stark and Brown, 2019). Principal component analysis plots (Figure 2—figure supplement 1A & D) show that the first principal component (PC) explains 86–87% of the variation between MTD and clamp-i embryos. However, we also observed that 5–6% of the variation among sample replicates is explained by PC2, suggesting the presence of some developmental diversity within sample groups.

Figure 2. CLAMP regulates chromatin accessibility of a subset of the early zygotic genome.

(A) Differential accessibility (DA) analysis (left) of ATAC-seq from MTD embryos versus clamp-i embryos in 0–2 hr or 2–4 hr. Blue dots indicate non-DA sites. Pink dots indicate significant (FDR<0.1) differential peaks after maternal clamp RNAi, identified by DiffBind (DESeq2) (DA peaks). The number of peaks and representative genes in each class is noted on the plot. Average ATAC-seq signal (right) in reads per genome coverage (RPGC) 1× normalization in 0–2 hr or 2–4 hr embryos after maternal clamp RNAi centered on open chromatin (≤100 bp) peaks identified significant changes upon maternal clamp RNAi. (B) Example of IGV views of genomic locus iab-8 bound by CLAMP (ChIP-seq) which shows significantly decreased CLAMP binding and ATAC-seq signal after clamp RNAi. (C) Genomic features of regions that require CLAMP for chromatin accessibility (clamp-i down DA regions, ATAC-seq) compared with all CLAMP binding sites (ChIP-seq occupancy). (D) Top motifs enriched in regions that require CLAMP for chromatin accessibility (clamp-i down DA regions, ATAC-seq). Enrichment p-value and percentage of sequences are noted. (E) Violin plot comparing gene expression (RNA-seq data) in CLAMP-mediated changes and unchanged differential accessibility regions in 0–2 hr or 2–4 hr embryos after maternal clamp RNAi. p-values of significant expression changes of CLAMP down-DA and non-DA were calculated by Mann-Whitney U-test and noted on the plot. ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; FDR, false discovery rate.

Figure 2—source data 1. ATAC-seq read counts in peak region in replicates of MTD and RNAi samples (DiffBind analysis).
Page 1. clamp-i versus MTD in 0–2 hr embryos. Page 2. clamp-i versus MTD in 2–4 hr embryos.

Figure 2—figure supplement 1. CLAMP regulates chromatin accessibility throughout ZGA.

(A) PCA plot for sample replicates in MTD and clamp-i embryos at 0–2 hr time point. (B) Pearson correlation heatmap of peaks dependent on CLAMP in MTD versus clamp-i embryos at 0–2 hr time point. (C) GO terms for genes that require CLAMP for chromatin accessibility at 0–2 hr time point. (D) PCA plots for sample replicates in MTD and clamp-i embryos at 2–4 hr time point. (E) Pearson correlation heatmap of peaks dependent on CLAMP in MTD versus clamp-i embryos at 2–4 hr time point. (F) GO terms for genes that require CLAMP for chromatin accessibility at 2–4 hr time point. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; GO, Gene Ontology; PCA, principal component analysis; ZGA, zygotic genome activation.

Despite some variation among replicates, the high Pearson correlation for DA regions between replicates indicates robust reproducibility of these sites (Figure 2—figure supplement 1B & E). We identified a subset of genomic regions that exhibit significantly reduced chromatin accessibility in the absence of CLAMP (0–2 hr: 277; 2–4 hr: 50; Figure 2A and Figure 2—source data 1), indicating that chromatin accessibility of these genomic loci (DA sites) requires CLAMP. Moreover, DA sites include promoters of many genes essential for early embryogenesis such as Nrt, Abd-B, mod(mdg4), and zld, which encodes the ZLD TF (Figure 2A). A smaller number of loci (0–2 hr: 20; 2–4 hr: 36) increased their accessibility in the absence of CLAMP (Figure 2A). Gene Ontology (GO) analysis indicates that CLAMP increases accessibility of chromatin regions that are mainly within DNA-binding, RNA Pol II binding, and enhancer-binding TF encoding genes (Figure 2—figure supplement 1C & F). Although CLAMP regulates chromatin accessibility non-redundantly with other factors at only a subset of genomic loci, these CLAMP target genes are key for early development, consistent with the dramatic patterning defects observed after depleting maternal CLAMP (Figure 1A).

Furthermore, a subset (26.7% [74/277] at 0–2 hr; 90% [45/50] at 2–4 hr) of DA regions are directly bound by CLAMP, suggesting that CLAMP directly regulates their chromatin accessibility. For example, the iab-8 promoter, which is located within the essential Drosophila Hox cluster that controls body plan patterning, is directly bound by CLAMP and shows a reduction in chromatin accessibility after clamp RNAi (Figure 2B). We also defined the distribution of DA sites and CLAMP binding sites throughout the genome (Figure 2C). While DA sites were significantly (p<0.05, Fisher’s exact test) enriched at promoter regions (49.16%), CLAMP binds almost equally frequently to both promoters (29.08%) and introns that are not first introns (27.72%). Therefore, CLAMP is required to establish or maintain open chromatin largely at promoters, but may also play other roles in intronic regions. Furthermore, motif analysis identified both GA-rich motifs and ZLD motifs enriched at regions that require CLAMP for their accessibility in 0–2 hr embryos (Figure 2D). These data suggest that CLAMP may also regulate the accessibility of some ZLD binding sites, a hypothesis that we discuss further below.

We next determined whether CLAMP-mediated chromatin accessibility could specifically drive early transcription by examining the relationship between the chromatin accessibility (DA, ATAC-seq) changes and gene expression of the nearest gene as measured by RNA-seq (Rieder et al., 2017). We observed significant (p<0.05, Mann-Whitney U-test) reduction in expression after maternal CLAMP depletion of genes at which CLAMP mediates chromatin accessibility (DA genes) compared with genes at which CLAMP does not regulate chromatin accessibility (non-DA genes) (Figure 2E). Overall, our results indicate that CLAMP promotes chromatin accessibility and transcription of a subset of other essential TF genes during ZGA, which is consistent with the extensive developmental defects caused by maternal CLAMP depletion.

CLAMP and ZLD regulate each other’s binding to a subset of promoters

To directly determine how CLAMP and ZLD impact each other’s binding, we performed ChIP-seq for CLAMP and ZLD in control MTD embryos and embryos that were maternally depleted for each factor with RNAi at the same two time points we used for our ATAC-seq experiments: before ZGA (0–2 hr) and during and after ZGA (2–4 hr) (Figure 3A–B, Figure 3—figure supplement 1A–B and Table 1). Overall, there are more ZLD peaks (0–2 hr: 6974; 2–4 hr: 8035) across the whole genome than CLAMP peaks (0–2 hr: 4962, 2–4 hr: 7564) in control MTD embryos. As we hypothesized, CLAMP and ZLD peaks significantly overlap (p<0.05, hypergeometric test, N=15,682 total fly genes) (Figure 3—figure supplement 1C).
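The overlap significance reported here comes from a hypergeometric test against a universe of N=15,682 fly genes. A minimal sketch of that calculation is given below. Only N is taken from the text; the set sizes and overlap are hypothetical placeholders, and `hypergeom_upper_tail` is a helper written for illustration rather than the authors' implementation.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'.

    Here: N = gene universe, K = genes bound by factor A,
    n = genes bound by factor B, k = genes bound by both.
    """
    # Exact integer arithmetic, converted to float only at the end.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return numerator / comb(N, n)


N_GENES = 15682          # universe size stated in the text
genes_bound_a = 2000     # hypothetical
genes_bound_b = 1500     # hypothetical
genes_bound_both = 250   # hypothetical

expected_overlap = genes_bound_a * genes_bound_b / N_GENES
p_overlap = hypergeom_upper_tail(genes_bound_both, N_GENES, genes_bound_a, genes_bound_b)
print(f"expected overlap ~ {expected_overlap:.0f} genes, p = {p_overlap:.3g}")
```

The test asks how often an overlap at least as large as the observed one would arise if the two gene sets were drawn independently from the same universe, so the result depends directly on the choice of N.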

Figure 3. CLAMP and ZLD depend on each other for binding at a subset of sites.

(A) ZLD occupancy in 0–2 hr and 2–4 hr MTD and maternal clamp RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads in a 2-kb region centered at each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (B) CLAMP occupancy in 0–2 hr and 2–4 hr MTD and maternal zld RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads in a 2-kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (C) Differential binding (DB) analysis of ZLD ChIP-seq. Mean difference (MA) plots of ZLD peaks in MTD embryos versus clamp-i embryos in 0–2 hr (left) or 2–4 hr (right). Blue dots indicate non-DB sites. Pink dots indicate significant (FDR<0.05) differential peaks identified by DiffBind (DESeq2). The number of peaks changed in each direction is noted on the plot. (D) DB analysis of CLAMP ChIP-seq. MA plots of CLAMP peaks from MTD embryos versus zld-i embryos in 0–2 hr (left) or 2–4 hr (right). Blue dots indicate non-DB sites. Pink dots indicate significant (FDR<0.05) DB peaks identified by DiffBind (DESeq2). Number of peaks in each direction is noted on the plot. (E) Stacked bar plots of CLAMP and ZLD down-DB (left) and CLAMP and ZLD non-DB peaks (right) distribution fraction in the Drosophila genome (dm6) in 0–2 hr and 2–4 hr embryos. (F) Box plot of the peak sizes in CLAMP and ZLD down-DB and non-DB peaks in 0–2 hr and 2–4 hr embryos. p-values of significant size difference between down-DB and non-DB peaks were calculated by Mann-Whitney U-test and noted on the plot. (G) Average profiles of ATAC-seq signal coverage show chromatin accessibility at ZLD down-DB (orange line) and non-DB (gray line) sites in 0–2 hr (left panel) or 2–4 hr (right panel) MTD and clamp-i embryos. Number of sites is noted on the plot. 
ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins. 

Figure 3—figure supplement 1. CLAMP and ZLD depend on each other for chromatin binding.

(A) CLAMP occupancy in 0–2 hr and 2–4 hr MTD and maternal clamp RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads, in a 2-kb region centered around each peak called in control MTD embryos. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (B) ZLD occupancy in 0–2 hr and 2–4 hr MTD and maternal zld RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads, in a 2-kb region centered around each peak called in control MTD embryos. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (C) CLAMP (green) and ZLD (orange) peaks and shared peaks where both CLAMP and ZLD are present in 0–2 hr and 2–4 hr embryos. p-values represent the significance (hypergeometric test, N=15,682 total fly genes) of overlap. (D) ZLD up-DB, down-DB, and non-DB peaks in 0–2 hr and 2–4 hr MTD and maternal clamp RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads in a 2-kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (E) CLAMP up-DB, down-DB, and non-DB peaks in 0–2 hr and 2–4 hr MTD and maternal zld RNAi embryos. Data is displayed as a heatmap of z-score normalized ChIP-seq (log2 IP/input) reads in a 2-kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. (F) Venn diagram showing the number of overlapping sites between ZLD and CLAMP down-DB or ZLD and CLAMP non-DB sites in 0–2 hr and 2–4 hr. p-values represent the significance (hypergeometric test, N=15,682 total fly genes) of overlap. (G) Example IGV views of genomic loci in iab-8 at which both CLAMP and ZLD peaks are down-DB, that is, each factor depends on the other to bind. (H) Average profiles of ChIP-seq signal in log2 (IP/input) show the size of down-DB or non-DB peaks of CLAMP in MTD versus zld-i embryos (upper panel) and ZLD in MTD versus clamp-i embryos (lower panel). ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins.

Figure 3—figure supplement 2. CLAMP and ZLD depend on each other for chromatin binding.

(A) Example IGV views of genomic loci: ChIP-seq in MTD and zld-i embryos at Hbn promoter region (upper panel) with CLAMP differential binding (down-DB) peak; Fdl (lower panel) with CLAMP non-DB peak at an intron. Peaks called by MACS2 are marked in dark gray (non-DB) and gray (down-DB). CLAMP and ZLD motifs are marked in green and orange, respectively. (B) Top three motifs called by Homer for CLAMP down-DB, CLAMP non-DB, ZLD down-DB, and ZLD non-DB peaks sites in 0–2 hr and 2–4 hr embryos. (C) Average profiles of ATAC-seq signal coverage in NC14 +12 min wt and zld- embryos show chromatin accessibility at CLAMP down-DB (green line) and non-DB (gray line) sites defined in 0–2 hr (left panel) or 2–4 hr (right panel) MTD and zld-i embryos. Number of sites is noted on the plot. ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; NC14, nuclear cycle 14; wt, wild-type.
Figure 3—figure supplement 3. CLAMP and ZLD depend on each other for chromatin binding.

Upper panels: Genomic distribution of ZLD lost (down-DB) or gained (up-DB) binding sites upon maternal clamp RNAi at 0–2 hr and 2–4 hr. Number of sites is noted on the plot. Lower panels: Genomic distribution of CLAMP lost (down-DB) or gained (up-DB) binding sites upon maternal zld RNAi at 0–2 hr and 2–4 hr. Number of sites is noted on the plot. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; DB, differential binding.

Table 1. The number of total and differentially bound peaks for CLAMP and ZLD in control MTD, clamp-i, and zld-i embryos.

Table 1—source data 1. ChIP-seq read counts in peak regions in replicates of MTD and RNAi samples (DiffBind analysis).
Page 1. ZLD ChIP-seq in clamp-i versus MTD in 0–2 hr embryos. Page 2. ZLD ChIP-seq in clamp-i versus MTD in 2–4 hr embryos. Page 3. CLAMP ChIP-seq in zld-i versus MTD in 0–2 hr embryos. Page 4. CLAMP ChIP-seq in zld-i versus MTD in 2–4 hr embryos.
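| ChIP-seq peaks | CLAMP: MTD | CLAMP: clamp-i | CLAMP: zld-i | ZLD: MTD | ZLD: clamp-i | ZLD: zld-i |
| --- | --- | --- | --- | --- | --- | --- |
| 0–2 hr | 4962 | 3488 | 4746 | 6974 | 3687 | 4650 |
| 2–4 hr | 7564 | 4064 | 8279 | 8035 | 4687 | 6420 |

| Differential binding (DiffBind, DESeq2) | CLAMP, MTD versus zld-i: Up-DB | CLAMP, MTD versus zld-i: Down-DB | CLAMP, MTD versus zld-i: Non-DB | ZLD, MTD versus clamp-i: Up-DB | ZLD, MTD versus clamp-i: Down-DB | ZLD, MTD versus clamp-i: Non-DB |
| --- | --- | --- | --- | --- | --- | --- |
| 0–2 hr | 54 | 390 | 4184 | 8 | 274 | 3144 |
| 2–4 hr | 3 | 30 | 7351 | 223 | 1289 | 5672 |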

Next, we defined the sites that showed differential binding (DB; Figure 3C–D and Table 1) of CLAMP and ZLD in the absence of each other’s maternally deposited mRNA using DiffBind (Stark and Brown, 2019). We found a significant reduction of ZLD binding in the absence of CLAMP: there were 274 (0–2 hr) and 1289 (2–4 hr) sites where ZLD binding decreased in clamp-i embryos compared to MTD controls (down-DB) (Figure 3C, Figure 3—figure supplement 1D, and Table 1). Fewer ZLD binding sites increased in occupancy after clamp RNAi: 8 (0–2 hr) and 233 (2–4 hr) sites (up-DB). 390 (0–2 hr) and 30 (2–4 hr) CLAMP down-DB sites were found upon loss of ZLD (Figure 3D, Figure 3—figure supplement 1E, and Table 1). We identified very few sites where CLAMP occupancy increases after zld RNAi (up-DB sites: 0–2 hr: 54, 2–4 hr: 3). Moreover, depletion of either maternal zld or clamp mRNA altered the genomic distribution of CLAMP and ZLD: the most common pattern we observed was that promoter-bound peaks were lost (down-DB) and peaks in introns were gained (up-DB) (Figure 3—figure supplement 3).
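To put the differential binding counts in proportion, the short sketch below computes the share of tested peaks that lose binding (down-DB) for each factor and time point. The counts are copied from Table 1; using the sum of up-DB, down-DB, and non-DB peaks as the denominator is an assumption made here for illustration, not a calculation reported by the authors.

```python
# Differential binding counts from Table 1, stored as (up-DB, down-DB, non-DB).
# CLAMP rows: MTD versus zld-i. ZLD rows: MTD versus clamp-i.
TABLE1_DB = {
    ("CLAMP", "0-2 hr"): (54, 390, 4184),
    ("CLAMP", "2-4 hr"): (3, 30, 7351),
    ("ZLD", "0-2 hr"): (8, 274, 3144),
    ("ZLD", "2-4 hr"): (223, 1289, 5672),
}


def down_db_percent(counts):
    """Percentage of tested peaks classified as down-DB.

    Assumption: the tested set is the sum of the three DB classes.
    """
    up, down, non = counts
    return 100.0 * down / (up + down + non)


percent_down = {key: round(down_db_percent(c), 1) for key, c in TABLE1_DB.items()}
for (factor, stage), pct in percent_down.items():
    print(f"{factor} {stage}: {pct}% of tested peaks are down-DB")
```

Under this assumption, roughly 18% of tested ZLD peaks lose binding at 2–4 hr after clamp RNAi, compared with well under 1% of CLAMP peaks at 2–4 hr after zld RNAi, which matches the asymmetry described in the text.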

The CLAMP down-DB sites and the ZLD down-DB sites also significantly overlap with each other (p<0.05, hypergeometric test, N=15,682 total fly genes) at both time points (Figure 3—figure supplement 1F). For example, iab-8, an essential Hox cluster gene at which CLAMP regulates chromatin accessibility (DA), is also one of the 95 genomic loci at which CLAMP and ZLD promote each other’s occupancy (Figure 3—figure supplement 1F–G).

To further understand how CLAMP and ZLD bind to dependent (down-DB) and independent (non-DB) sites, we determined the genomic distribution and size of occupancy of these two types of sites. Overall, dependent peaks (down-DB) are much broader in size and are located at promoters (Figure 3E, Figure 3—figure supplement 1H and Figure 3—figure supplement 2A). In contrast, independent sites (non-DB) are narrower and located within introns (Figure 3E, Figure 3—figure supplement 1H and Figure 3—figure supplement 2A). On average, the peak size of dependent sites (down-DB: 400–500 bp) is almost double that of independent sites (non-DB: 200–250 bp) with significant differences in peak size for both TFs at both time points (p<0.001, Mann-Whitney U-test) (Figure 3F).

Previous proteomic studies (Urban et al., 2017a; Hamm et al., 2017; Hamm et al., 2015) found no evidence that CLAMP and ZLD directly contact each other at the protein level, suggesting that CLAMP and ZLD regulate each other via binding to their DNA motifs. Therefore, we analyzed the motifs enriched at dependent (down-DB) and independent (non-DB) sites. We found that dependent sites are enriched for motifs specific for the required protein, which are not present at the independent sites (Figure 3—figure supplement 2B). For example, the ZLD motif is only enriched at sites where CLAMP requires ZLD for binding (CLAMP down-DB) but not at sites where CLAMP binds independently of ZLD (CLAMP non-DB). Similarly, the CLAMP motif is only enriched at sites where ZLD requires CLAMP for binding (ZLD down-DB) (Figure 3—figure supplement 2B). Therefore, the presence of specific CLAMP and ZLD motifs correlates with their ability to promote each other’s binding.

Given the cooperative relationship between CLAMP and ZLD binding to chromatin, we measured chromatin accessibility (ATAC-seq coverage) changes at their dependent and independent sites that we defined from ChIP-seq data (Figure 3G and Figure 3—figure supplement 2C). We found that the average ATAC-seq signals were significantly reduced at sites where ZLD is dependent on CLAMP to bind (ZLD down-DB sites) in clamp-i embryos compared to MTD controls (Figure 3G). Furthermore, the accessibility at sites where ZLD binds independently of CLAMP (ZLD non-DB) is lower than that at ZLD DB sites but remains unchanged upon clamp RNAi (Figure 3G). Therefore, the chromatin accessibility changes we observe over broader regions are enriched at specific loci where CLAMP promotes ZLD binding.

Sites where ZLD regulates CLAMP binding (CLAMP down-DB) have high chromatin accessibility while sites where CLAMP binds independently (CLAMP non-DB) of ZLD showed low chromatin accessibility (Figure 3—figure supplement 2C). Interestingly, accessibility slightly increases upon the loss of ZLD at sites where CLAMP requires ZLD for binding at 0–2 hr (Figure 3—figure supplement 2C). However, an active TF binding to DNA can prevent Tn5 cleavage at genomic regions (Yan et al., 2020). Therefore, loss of ZLD and CLAMP binding could result in a perceived accessibility increase, as measured by ATAC-seq, which does not necessarily reflect a repressive function for ZLD.

In summary, CLAMP and ZLD increase each other’s occupancy by binding to their motifs and altering chromatin accessibility. These data support a model in which CLAMP and ZLD increase each other’s occupancy at promoters of a subset of genes that often encode other TFs.

CLAMP and ZLD function together to regulate transcription during ZGA

CLAMP and ZLD both specifically regulate zygotic transcription (Figure 1E and Figure 1—figure supplement 1E; Liang et al., 2008; Harrison et al., 2011; McDaniel et al., 2019; Rieder et al., 2017). To further understand how CLAMP and ZLD function to regulate ZGA, we compared the transcriptional roles of CLAMP and ZLD in early embryos at genes that have different temporal expression patterns as defined in Li X-Y et al., 2014. We found that both CLAMP and ZLD are present at genes expressed throughout ZGA although CLAMP binding is more often present at mid- and late-transcribed zygotic genes (categories defined in Li X-Y et al., 2014), while ZLD binding is more often present at early transcribed zygotic genes (Figure 4A–B).

Figure 4. CLAMP and ZLD function together in zygotic genome activation.


(A) Percentage of CLAMP binding sites in 0–2 hr and 2–4 hr embryos distributed in maternal (n=646), early (n=69), mid- (n=73), late- (n=104), later (n=74), and silent (n=921) genes (peaks within a 1-kb promoter region and gene body). Gene categories were defined in Li X-Y et al., 2014. (B) Percentage of ZLD binding sites in 0–2 hr and 2–4 hr embryos distributed in maternal (n=646), early (n=69), mid- (n=73), late- (n=104), later (n=74), and silent (n=921) genes (peaks within a 1-kb promoter region and gene body). Gene categories were defined in Li X-Y et al., 2014. (C) Gene expression changes caused by maternal clamp RNAi (Rieder et al., 2017) at genes with strong, weak, and no CLAMP binding as measured by ChIP-seq in 0–2 hr (left) or 2–4 hr (right) embryos. p-values for significant expression changes among CLAMP binding classes were calculated by Mann-Whitney U-test and are noted on the plot. (D) Gene expression changes caused by maternal zld RNAi (Schulz et al., 2015) at genes with strong, weak, and no ZLD binding as measured by ChIP-seq in 0–2 hr (left) or 2–4 hr (right) embryos. p-values for significant expression changes among ZLD binding classes were calculated by Mann-Whitney U-test and are noted on the plot. (E) Gene expression changes caused by maternal zld RNAi (Schulz et al., 2015) at genes with CLAMP down-DB and non-DB sites that were defined by ChIP-seq in wt versus zld- 0–2 hr and 2–4 hr embryos. p-values for significant expression changes between CLAMP down-DB and non-DB sites were calculated by Mann-Whitney U-test and are noted on the plot. (F) Gene expression changes caused by maternal clamp RNAi (Rieder et al., 2017) at genes with ZLD down-DB and non-DB sites that were defined by ChIP-seq in MTD versus clamp-i 0–2 hr and 2–4 hr embryos. p-values for significant expression changes between ZLD down-DB and non-DB sites were calculated by Mann-Whitney U-test and are noted on the plot.
ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; DB, differential binding; wt, wild-type.

We next asked whether the ability of CLAMP to bind to genes directly regulates zygotic gene activation by integrating ChIP-seq with RNA-seq data (Schulz et al., 2015; Rieder et al., 2017). We found that genes strongly bound or weakly bound by CLAMP (ChIP-seq data) showed a significant (p<0.001, Mann-Whitney U-test) level of gene expression reduction after clamp RNAi (Rieder et al., 2017) compared to unbound genes (Figure 4C). We also observed a significant (p<0.001, Mann-Whitney U-test) change in gene expression in maternal zld-i embryos (Schulz et al., 2015) for the genes that are strongly bound by ZLD (ChIP-seq data) (Figure 4D). Also, the magnitude of the transcriptional changes is similar for genes that are bound by CLAMP or ZLD. Together, these data indicate that CLAMP regulates the transcription of zygotic genes by directly binding to target genes.

To investigate whether CLAMP and ZLD could regulate each other’s binding to precisely drive the transcription of target genes, we plotted the gene expression changes caused by depleting maternal zld or clamp at the genes closest to where they regulate each other’s binding (Figure 4E and Figure 4F). The depletion of maternal zld significantly (p=4.3E−5, Mann-Whitney U-test) reduces the expression of genes where ZLD regulates CLAMP binding (down-DB) more than sites where CLAMP binds independently of ZLD (non-DB) (Figure 4E). Therefore, ZLD may specifically regulate zygotic genes at which ZLD promotes CLAMP binding. Also, compared to genes where ZLD binds independent of CLAMP, genes where ZLD binding is regulated by CLAMP had a significant (p<0.001, Mann-Whitney U-test) expression reduction after clamp RNAi at both 0–2 hr and 2–4 hr time points (Figure 4F). Thus, CLAMP may regulate the transcription of genes targeted by ZLD by promoting ZLD binding.

Furthermore, sites where CLAMP and ZLD require each other for binding are enriched for motifs specific for the required protein (Figure 3—figure supplement 2B). Therefore, the presence of specific CLAMP and ZLD motifs correlates with their ability to promote each other’s binding which further regulates the expression of each other’s target genes.

CLAMP and ZLD regulate gene expression via modulating chromatin accessibility

To determine how direct binding of CLAMP and ZLD relates to zygotic chromatin accessibility, we integrated ChIP-seq and ATAC-seq data. First, we defined four classes of CLAMP-related peaks (DA with CLAMP, DA without CLAMP, non-DA with CLAMP, and non-DA without CLAMP in Table 2 and Table 2—source data 1). We also obtained ZLD-related ATAC-seq data (Hannon et al., 2017; Soluri et al., 2020) that was generated from embryos laid by wt mothers or mothers with zld germline clones (zld-) at the nuclear cycle 14 (NC14) +12 min time point and integrated it with ChIP-seq data from the closest time point from this study (0–2 hr embryos). In this way, we defined four classes of genomic loci related to ZLD: DA with ZLD, DA without ZLD, non-DA with ZLD, and non-DA without ZLD (Table 2 and Table 2—source data 1).
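The four-class definition above combines a differential accessibility (DA) call with the presence or absence of a ChIP-seq peak. A minimal sketch of that labeling logic follows; it assumes half-open intervals on a single chromosome, and the function names and coordinates are hypothetical rather than taken from the pipeline used in this study.

```python
def overlaps(peak, others):
    """True if the half-open interval `peak` overlaps any interval in `others`."""
    start, end = peak
    return any(start < o_end and o_start < end for o_start, o_end in others)


def classify_peak(peak, is_da, chip_peaks, factor="CLAMP"):
    """Label one ATAC-seq peak with one of the four classes described above."""
    accessibility = "DA" if is_da else "non-DA"
    binding = "with" if overlaps(peak, chip_peaks) else "without"
    return f"{accessibility} {binding} {factor}"


# HYPOTHETICAL coordinates (start, end) on one chromosome.
clamp_chip_peaks = [(1000, 1400), (5000, 5250)]
print(classify_peak((1200, 1600), True, clamp_chip_peaks))
print(classify_peak((3000, 3300), False, clamp_chip_peaks))
```

The same function can label the ZLD-related classes by passing `factor="ZLD"` together with ZLD ChIP-seq peaks.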

Table 2. The number of peaks in four types of CLAMP or ZLD mediated regions.

Table 2—source data 1. Peaks locations in each CLAMP or ZLD-related category.
Page 1 Type I (n=5): both DA, CLAMP ZLD co-bound; Page 2 Type II (n=23): CLAMP DA and ZLD non-DA, CLAMP ZLD co-bound; Page 3 Type III (n=123): ZLD DA and CLAMP non-DA, CLAMP ZLD co-bound; Page 4 Type IV (n=374): both non-DA, CLAMP ZLD co-bound; Page 5 DA with CLAMP 0–2 hr; Page 6 DA without CLAMP 0–2 hr; Page 7 non-DA with CLAMP 0–2 hr; Page 8 non-DA without CLAMP 0–2 hr; Page 9 DA with ZLD NC14 +12 min; Page 10 DA without ZLD NC14 +12 min; Page 11 non-DA with ZLD NC14 +12 min; Page 12 non-DA without ZLD NC14 +12 min.
ATAC-seq peaks (0–2 hr) | DA w/ CLAMP | DA w/o CLAMP | Non-DA w/ CLAMP | Non-DA w/o CLAMP
16,597 | 74 | 203 | 1239 | 15,081
ATAC-seq peaks (Hannon et al., 2017) (NC14 +12 min) | DA w/ ZLD | DA w/o ZLD | Non-DA w/ ZLD | Non-DA w/o ZLD
19,146 | 976 | 2782 | 2010 | 13,378
CLAMP ZLD co-bound open chromatin regions | Type I (Both DA) | Type II (CLAMP DA) | Type III (ZLD DA) | Type IV (Both non-DA)
525 | 5 | 23 | 123 | 374

Next, we generated heatmaps to visualize ATAC-seq read coverage and CLAMP and ZLD ChIP-seq occupancy at their related classes of loci in MTD (wt), clamp-i, and zld-i (zld-) embryos (Figure 5—figure supplement 1A). As expected, MTD and clamp-i embryo heatmaps revealed that CLAMP-related DA regions (DA with CLAMP, DA without CLAMP) show a significant decrease in accessibility in embryos lacking CLAMP. Regions dependent on ZLD to open (DA with ZLD, DA without ZLD) also show a significant accessibility reduction in the absence of ZLD. Moreover, ChIP-seq read enrichment for protein binding in each RNAi or germline clone embryo class corresponds to our classification.

Interestingly, both CLAMP and ZLD protein occupancy on chromatin were reduced when the other TF was depleted, especially at regions where the bound protein is not required for chromatin accessibility (non-DA) (Figure 5—figure supplement 1A). For example, ZLD occupancy was reduced upon clamp RNAi at ZLD non-DA regions which are bound by ZLD to a level that resembles the ZLD occupancy in zld-i embryos (Figure 5—figure supplement 1A). We also found that CLAMP is enriched (ChIP-seq signal) at these ZLD non-DA regions, supporting our hypothesis that CLAMP facilitates ZLD occupancy at some of these loci.

To determine the relationship between CLAMP and ZLD in regulating chromatin accessibility at loci bound by both factors, we identified the subset of genomic loci (n=525) that are co-bound by both CLAMP and ZLD and have open zygotic chromatin (Figure 5A, Table 2 and Table 2—source data 1). We divided these co-bound loci into four types: 1% (n=5) of these loci show reduced accessibility after either clamp-i or zld-i (Type I, both DA, Figure 5A); 23 and 123 loci are specifically dependent on CLAMP or ZLD for their accessibility, respectively (Type II, CLAMP-DA and Type III, ZLD-DA, Figure 5A); the majority (374 out of 525) of CLAMP ZLD co-bound loci remain open when either protein is absent (Type IV, both non-DA, Figure 5A), suggesting either that CLAMP and ZLD function redundantly at some of these loci or that other TFs regulate their accessibility.
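The four types described above follow from two yes/no calls per locus (CLAMP-dependent accessibility and ZLD-dependent accessibility). The sketch below encodes that rule and recomputes the share of each type from the counts reported in Table 2; the function name is ours and is only illustrative.

```python
def cobound_type(clamp_da, zld_da):
    """Assign a CLAMP/ZLD co-bound locus to Type I-IV from its two DA calls."""
    if clamp_da and zld_da:
        return "Type I"    # accessibility reduced after either depletion
    if clamp_da:
        return "Type II"   # depends on CLAMP only
    if zld_da:
        return "Type III"  # depends on ZLD only
    return "Type IV"       # unchanged after either single depletion


# Counts of co-bound open chromatin regions as reported in Table 2.
counts = {"Type I": 5, "Type II": 23, "Type III": 123, "Type IV": 374}
total = sum(counts.values())
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}
print(total, percentages)
```

The recomputed shares agree with the text: Type I is about 1% of the 525 co-bound loci, and Type IV is the majority at roughly 71%.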

Figure 5. CLAMP and ZLD regulate gene expression via modulating chromatin accessibility.

(A) Four classes of CLAMP and ZLD co-bound peaks defined by combining ATAC-seq (this study or Hannon et al., 2017; Soluri et al., 2020) and ChIP-seq peaks in 0–2 hr MTD and RNAi embryos. Data are displayed as a heatmap of z-score normalized ATAC-seq and ChIP-seq reads in a 2-kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. Type I (n=5): both DA, differentially accessible regions that depend on CLAMP or ZLD and have both proteins bound. Type II (n=23): CLAMP-DA and ZLD non-DA, differentially accessible regions that depend on CLAMP, not on ZLD, and have both proteins bound. Type III (n=123): ZLD-DA and CLAMP non-DA, differentially accessible regions that depend on ZLD, not on CLAMP, and have both proteins bound. Type IV (n=374): both non-DA, regions whose accessibility is independent of CLAMP or ZLD and that have both proteins bound. (B) Left: Gene expression changes caused by maternal clamp RNAi (Rieder et al., 2017) in 0–2 hr embryos at genes that fall into the four classes of CLAMP and ZLD co-bound peaks. p-values for significant expression changes among classes were calculated by Mann-Whitney U-test and are noted on the plot. Right: Gene expression changes caused by maternal zld RNAi (Schulz et al., 2015) in stage 5 embryos at genes that fall into the four classes of CLAMP and ZLD co-bound peaks. p-values for significant expression changes among classes were calculated by Mann-Whitney U-test and are noted on the plot. (C) Upper panel: Venn diagram showing the number of overlapping sites between GAF-dependent DA sites (Gaskill et al., 2021), ZLD non-DA with ZLD bound, and CLAMP non-DA with CLAMP bound peaks. Lower panel: Venn diagram showing the number of overlapping sites between GAF-dependent DA sites (Gaskill et al., 2021), ZLD non-DA, and CLAMP non-DA peaks. ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins.


Figure 5—figure supplement 1. CLAMP-mediated chromatin accessibility is correlated with CLAMP and ZLD binding.


(A) CLAMP-related and ZLD-related peaks defined by combining ATAC-seq (this study or Hannon et al., 2017; Soluri et al., 2020) and ChIP-seq peaks in 0–2 hr MTD and RNAi embryos. Data are displayed as a heatmap of z-score normalized ATAC-seq and ChIP-seq (log2 IP/input) reads, in a 2-kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in control MTD embryos. Peak types and numbers are marked on the plot: DA w/ CLAMP (n=74): differentially accessible regions that depend on CLAMP and have CLAMP binding. DA w/o CLAMP (n=203): differentially accessible regions that depend on CLAMP and have no CLAMP binding. Non-DA w/ CLAMP (n=1239): non-differentially accessible regions that have CLAMP binding. DA w/ ZLD (n=986): differentially accessible regions that depend on ZLD and have ZLD binding. DA w/o ZLD (n=2797): differentially accessible regions that depend on ZLD and have no ZLD binding. Non-DA w/ ZLD (n=2301): non-differentially accessible regions that have ZLD binding. Note that non-DA w/o CLAMP (n=15,081) and non-DA w/o ZLD (n=17,758) peaks are omitted from this plot. (B) Top motifs enriched in each type of CLAMP and ZLD co-bound regions. Enrichment p-value and percentage of sequences are noted. ChIP-seq, chromatin immunoprecipitation-sequencing; CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins.

Notably, at sites where CLAMP is required for chromatin accessibility (Type II, CLAMP-DA, n=23), ZLD occupancy is entirely ablated in clamp-i embryos (Figure 5A). CLAMP occupancy levels are also reduced after maternal zld RNAi at sites where ZLD is required for chromatin accessibility (Type III, ZLD-DA, n=123). Overall, we observed that CLAMP and/or ZLD occupancy is reduced at most of their co-bound regions when either one of the TFs is depleted, which is consistent with their inter-dependent binding relationship. Moreover, clamp-i has a stronger impact on ZLD occupancy than zld-i has on CLAMP occupancy.

To assess how CLAMP/ZLD-modulated chromatin accessibility impacts transcription, we examined the effect of maternal clamp (Rieder et al., 2017) or zld (Schulz et al., 2015) depletion on expression (RNA-seq data) of genes that fall into the four types of CLAMP/ZLD co-occupied sites (Figure 5B). We found that the expression levels of genes (Type II, CLAMP-DA, n=23) that require CLAMP for chromatin accessibility are significantly (p<0.05, Mann-Whitney U-test) downregulated in embryos lacking CLAMP compared to the Type IV (both non-DA) CLAMP and ZLD-independent group (n=374) (Figure 5B). Genes (Type III, ZLD-DA, n=123) dependent on ZLD for their accessibility also show a significant (p<0.001, Mann-Whitney U-test) reduction in expression upon maternal CLAMP depletion, suggesting CLAMP also might contribute to the regulation of genes at which ZLD regulates chromatin accessibility, likely by increasing ZLD binding.

In embryos depleted for maternal ZLD (Schulz et al., 2015), we found that genes that fall into the Type III (ZLD-DA) ZLD-mediated chromatin accessibility group significantly (p<0.001, Mann-Whitney U-test) decreased in expression (Figure 5B), compared with the Type IV (both non-DA, n=374) group. Interestingly, genes within the CLAMP and ZLD-independent Type IV (both non-DA, n=374) group do not show significant expression fold changes after depleting either maternal clamp or zld, supporting the hypothesis that CLAMP and ZLD function redundantly at these loci and/or other proteins play a major role in regulating chromatin accessibility and transcription of these genes.

Motif analysis demonstrates that CLAMP and ZLD motifs are enriched at genomic loci that are regulated by each factor as well as independent sites (Type IV), in addition to the motif for another GA-binding protein, GAF (Figure 5—figure supplement 1B). We next determined whether GAF alters chromatin accessibility at loci at which depletion of CLAMP or ZLD individually does not alter accessibility (Type IV) and that are bound by all three factors. Indeed, we found that approximately 10% of loci that require GAF for their chromatin accessibility (n=104) (Gaskill et al., 2021) overlap with regions where depleting CLAMP or ZLD individually does not alter accessibility (CLAMP non-DA and/or ZLD non-DA) (Figure 5C, upper panel). When we do not require occupancy of ZLD and CLAMP at their non-DA sites, the overlap with the GAF-dependent regions is approximately 97% (Figure 5C, lower panel). These results suggest GAF might function at these CLAMP/ZLD independent sites, supporting a model in which multiple TFs coordinately regulate early zygotic chromatin accessibility during ZGA (Hamm and Harrison, 2018).

Together, our results reveal that CLAMP and ZLD regulate chromatin accessibility, which alters the occupancy of both factors and regulates zygotic transcription. Furthermore, GAF and/or other TFs might function at sites that are not altered by depleting CLAMP or ZLD individually, suggesting that multiple TFs promote chromatin accessibility during ZGA. It is also possible that CLAMP and ZLD are functionally redundant at the subset of genomic loci at which they regulate each other’s occupancy, but depleting either factor individually is not sufficient to alter chromatin and expression.

Discussion

Two questions central to early embryogenesis of all metazoans are how and where early TFs work together to drive chromatin changes and ZGA. Here, we defined a novel function of CLAMP as a new pioneer TF that has a targeted yet essential function in early embryonic development. We found that CLAMP directly binds to nucleosomal DNA (Figure 1), establishes and/or maintains chromatin accessibility at promoters of genes that often encode other TFs (Figure 2), and facilitates the binding of ZLD to promoters (Figure 3) to regulate activation of zygotic gene transcription (Figure 4). We discovered that CLAMP and ZLD regulate each other’s binding by mediating chromatin accessibility, which further regulates their target gene expression (Figure 5). Overall, we provide new insight into how CLAMP and ZLD function together to enhance each other’s occupancy and increase chromatin accessibility, which drives ZGA.

CLAMP and ZLD act together to define an open chromatin landscape and activate transcription in early embryos

We defined multiple classes of CLAMP-dependent and ZLD-dependent genomic loci in early embryos, which provides insight into how CLAMP and ZLD regulate chromatin accessibility and zygotic transcription during ZGA (Figure 6): (1) CLAMP promotes ZLD enrichment at sites where CLAMP increases chromatin accessibility and further regulates ZLD target gene expression. These loci remain open and transcriptionally active even upon ZLD depletion. (2) ZLD facilitates CLAMP occupancy at sites where ZLD regulates chromatin accessibility and promotes CLAMP target gene expression. When maternal CLAMP is depleted, these loci remain accessible and genes are actively transcribed. (3) GAF and/or other TFs could play major roles in opening chromatin at locations that are co-bound by CLAMP and ZLD but are not altered in accessibility after depleting CLAMP or ZLD individually. CLAMP and ZLD could also function redundantly at some of these loci because they alter each other’s occupancy at these loci but do not change accessibility or expression after depletion of either maternal CLAMP or ZLD individually. Overall, our data suggest that CLAMP functions with ZLD to regulate chromatin accessibility and gene expression of the early zygotic genome.

Figure 6. Model for how CLAMP and ZLD pioneer factors function together to define chromatin accessibility in early embryos.


CLAMP and ZLD function together at promoters to regulate each other’s occupancy and gene expression of genes encoding other key TFs. We defined CLAMP and ZLD co-bound peaks in early embryos, which revealed roles for CLAMP and ZLD in defining chromatin accessibility and activating zygotic transcription at a subset of the zygotic genome. CLAMP-dependent regions: CLAMP promotes ZLD enrichment at these sites where CLAMP binding increases chromatin accessibility and regulates target gene expression. These sites are closed and lack binding of ZLD when maternal clamp is depleted, and they remain open and transcription is activated when maternal zld is depleted. ZLD-dependent regions: ZLD modulates chromatin opening and transcription at these sites that are bound by CLAMP but do not depend on CLAMP for chromatin accessibility. These sites are closed and lack binding of CLAMP when maternal zld is depleted, and they remain open and active when maternal clamp is depleted. CLAMP/ZLD-independent regions: GAF or other TFs open chromatin at locations co-bound by CLAMP and ZLD where chromatin accessibility is not altered when each factor is depleted individually. CLAMP and ZLD could also function redundantly at some of these loci. These sites remain accessible and transcriptionally active upon either maternal zld or clamp depletion. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; TF, transcription factor.

Although we have demonstrated an instrumental role for CLAMP in defining a subset of the open chromatin landscape in early embryos, our data show that CLAMP does not increase chromatin accessibility at promoters of all zygotic genes independent of ZLD. Consistent with our results in the early embryo, CLAMP regulates chromatin accessibility at only a few hundred genomic loci in male S2 (258 sites) and female Kc (102 sites) cell lines. Unlike ZLD, which plays a global role in regulating chromatin accessibility at promoters throughout the genome, depletion of CLAMP alone mainly drives changes at promoters of specific genes that often encode TFs that are important for early development, consistent with phenotypic data. These findings indicate that CLAMP and ZLD regulate ZGA in different ways: ZLD mediates chromatin opening globally, while CLAMP functions in a more targeted way at certain essential early TF genes. However, both proteins are critical to ZGA and loss of either is catastrophic in terms of overall embryonic development.

Moreover, ZLD binding and/or chromatin accessibility is not regulated by maternal depletion of CLAMP at all GA-rich sites in the genome. GAF is also enriched at these same ZLD-bound regions where ZLD is not required for chromatin accessibility (Schulz et al., 2015; Gaskill et al., 2021). Both CLAMP and GAF are deposited maternally (Rieder et al., 2017; Hamm et al., 2017) and bind to similar GA-rich motifs (Kaye et al., 2018). To test whether GAF compensates for the depletion of CLAMP or ZLD, we tried to perform GAF RNAi in the current study to prevent GAF from compensating for CLAMP depletion. However, we and other laboratories could not achieve depletion of GAF in early embryos by RNAi, likely due to autoregulation of its own promoter and its prion-like self-perpetuating function (Tariq et al., 2013).

We previously demonstrated that competition between CLAMP and GAF at GA-rich binding sites is essential for MSL complex recruitment in S2 cells (Kaye et al., 2018). Furthermore, CLAMP excludes GAF at the histone locus which co-regulates genes that encode the histone proteins (Rieder et al., 2017). However, we also observed synergistic binding between CLAMP and GAF at many additional binding sites (Kaye et al., 2018). The relationship between CLAMP and GAF in early embryos remains unclear. It is very possible that the competitive relationship has not been established in early embryos, since dosage compensation has not yet been initiated (Prayitno et al., 2019). Using GAF-dependent loci defined by Gaskill et al., 2021, we found that genomic loci where GAF functions largely overlap with regions where depletion of CLAMP or ZLD alone does not alter chromatin accessibility, indicating that GAF may function independently of CLAMP or ZLD or is functionally redundant. Future studies are required to distinguish between these models by examining how GAF and CLAMP affect each other’s binding to co-bound loci and simultaneously eliminating both factors.

The GA-rich sequences targeted by CLAMP and GAF are distinct from each other in vivo and in vitro. GAF binding sites typically have 3.5 GA repeats; however, GAF is able to bind to as few as three bases (GAG) within the hsp70 promoter and in vitro (Wilkins and Lis, 1999). In contrast, CLAMP binding sites contain an 8-bp core with a less well-conserved second GA dinucleotide within the core (GA__GAGA) (Alekseyenko et al., 2008). CLAMP binding sites also include a GAGAG pentamer at a lower frequency than GAF binding sites, and flanking bases surrounding the 8-bp core are critical for CLAMP binding (Kaye et al., 2018). Therefore, GAF and CLAMP may have overlapping and non-overlapping functions at different loci, tissues, or developmental stages. Moreover, another TF, Pipsqueak (Psq) also binds to sites containing the GAGAG motif, and has multiple functions during oogenesis and embryonic pattern formation and functions with Polycomb in three-dimensional genome organization (Lehmann et al., 1998; Gutierrez-Perez et al., 2019). In the future, an optogenetic inactivation approach could be used to remove CLAMP, GAF, and/or Psq simultaneously in a spatial and temporal manner (McDaniel et al., 2019).
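To make the 8-bp core described above concrete, the following sketch scans both strands of a DNA sequence for GA__GAGA. It assumes that the two blank positions can be any base, which is our simplification of "less well-conserved"; real motif analyses use position weight matrices and also consider the flanking bases noted above. The function name and example sequences are hypothetical.

```python
import re

# ASSUMPTION: the two blanks in GA__GAGA are treated as "any base".
# Lookaheads are used so that overlapping matches are all reported.
CORE_PLUS = re.compile(r"(?=GA[ACGT]{2}GAGA)")
# Reverse complement of GA NN GAGA, for sites on the opposite strand.
CORE_MINUS = re.compile(r"(?=TCTC[ACGT]{2}TC)")


def find_core_sites(seq):
    """Return sorted (0-based start, strand) pairs for each 8-bp core match."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in CORE_PLUS.finditer(seq)]
    hits += [(m.start(), "-") for m in CORE_MINUS.finditer(seq)]
    return sorted(hits)


print(find_core_sites("TTGAAAGAGATT"))
print(find_core_sites("gagagagaga"))
```

A strict consensus scan like this will miss weaker sites that a scoring matrix would catch, so it is best read as an illustration of the core pattern rather than a binding-site predictor.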

CLAMP and ZLD regulate each other’s binding via their own motifs

ZLD is an essential TF that regulates activation of the first set of zygotic genes during the minor wave of ZGA and thousands of genes transcribed during the major wave of ZGA at nuclear cycle 14 (Liang et al., 2008; Harrison et al., 2011). ZLD also establishes and maintains chromatin accessibility of specific regions and facilitates TF binding and early gene expression (Sun et al., 2015; Schulz et al., 2015). CLAMP regulates histone gene expression (Rieder et al., 2017), X chromosome dosage compensation (Soruco et al., 2013), and establishes/maintains chromatin accessibility (Urban et al., 2017b). Nonetheless, it remained unclear whether and how CLAMP and ZLD functionally interact during ZGA. Here, we demonstrate that CLAMP and ZLD function together at a subset of promoters that often encode other transcriptional regulators.

ZLD regulates CLAMP occupancy earlier than CLAMP regulates ZLD occupancy. Genomic loci at which CLAMP is dependent on ZLD early (0–2 hr) in development often become independent from ZLD later (2–4 hr), with the caveat that ZLD depletion is not as effective later in development. Therefore, CLAMP may require the pioneering activity of ZLD to access specific loci before ZGA, but ZLD may no longer be necessary once CLAMP binding is established. Also, our results suggest that CLAMP is a potent regulator of ZLD binding, especially in 2–4 hr embryos. ZLD can bind to many more promoter regions at 0–2 hr, while CLAMP mainly binds to introns early in development but occupies promoters later at 2–4 hr. Therefore, CLAMP may require ZLD to increase chromatin accessibility of these promoter regions (Schulz et al., 2015).

In addition to its role in embryonic development, CLAMP also plays an essential role in targeting the MSL male dosage compensation complex to the X chromosome (Soruco et al., 2013). Drosophila embryos initiate X chromosome counting in nuclear cycle 12 and start the sex determination cascade prior to the major wave of ZGA at nuclear cycle 14 (Gergen, 1987; ten Bosch et al., 2006). However, most dosage compensation is initiated much later in embryonic development (Prayitno et al., 2019). Our data support a model in which CLAMP functions early in the embryo prior to MSL complex assembly to open up specific chromatin regions for MSL complex recruitment (Urban et al., 2017b; Rieder et al., 2019). Moreover, ZLD likely functions primarily as an early pioneer factor, whereas CLAMP has pioneer functions in both early and late-ZGA embryos. Consistent with this hypothesis, CLAMP binding is enriched at both early and late zygotic genes. In contrast, ZLD binds more frequently to early genes, suggesting that there may be a sequential relationship between occupancy of these two TFs at some loci during early embryogenesis.

The different characteristics of dependent and independent CLAMP and ZLD binding sites also provide insight into how early TFs work together to regulate ZGA. At dependent sites, there are often relatively broad peaks of CLAMP and ZLD that are significantly enriched for clusters of motifs for the required protein. Our CLAMP gel shift assays and those previously reported (Kaye et al., 2018) also show multiple shifted bands consistent with possible multimerization. CLAMP contains two central disordered prion-like glutamine-rich regions (Kaye et al., 2018), a domain that is critical for transcriptional activation and multimerization in vivo in several TFs, including GAF (Wilkins and Lis, 1999). Moreover, glutamine-rich repeats alone can be sufficient to mediate stable protein multimerization in vitro (Stott et al., 1995). Therefore, it is reasonable to hypothesize that the CLAMP glutamine-rich domain also functions in CLAMP multimerization.

In contrast, ZLD fails to form dimers or multimers (Hamm et al., 2015; Hamm et al., 2017), indicating that ZLD most likely binds as a monomer. There is no evidence that CLAMP and ZLD have any direct protein-protein interaction at sites where they depend on each other to bind. For example, mass spectrometry results that identified dozens of CLAMP-associated proteins did not identify ZLD (Urban et al., 2017b). No data has validated any protein-protein interactions of ZLD with itself as a multimer or between ZLD and any other TFs (Hamm et al., 2017). In the future, simultaneous ablation of maternal CLAMP and ZLD will allow the analysis of potential functional redundancy at a subgroup of genomic loci. Our study suggests that regulating the chromatin landscape in early embryos to drive ZGA requires the function of multiple pioneer TFs.

Materials and methods

Recombinant protein expression and purification of CLAMP

MBP-tagged CLAMP DBD was expressed and purified as described previously (Kaye et al., 2018). MBP-tagged (pTHMT, Peti and Page, 2007) FL CLAMP protein was expressed in Escherichia coli BL21 Star (DE3) cells (Life Technologies). Bacterial cultures were grown to an optical density of 0.7–0.9 before induction with 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) for 4 hr at 37°C.

Cell pellets were harvested by centrifugation and stored at −80°C. Cell pellets were resuspended in 20 mM Tris, 1 M NaCl, 0.1 mM ZnCl2, and 10 mM imidazole pH 8.0 with one EDTA-free protease inhibitor tablet (Roche) and lysed using an Emulsiflex C3 (Avestin). The lysate was cleared by centrifugation at 20,000 rpm for 50 min at 4°C, filtered using a 0.2 μm syringe filter, and loaded onto a HisTrap HP 5 ml column. The protein was eluted with a gradient from 10 to 300 mM imidazole in 20 mM Tris, 1.0 M NaCl pH 8.0, and 0.1 mM ZnCl2. Fractions containing MBP-CLAMP FL were loaded onto a HiLoad 26/600 Superdex 200 pg column equilibrated in 20 mM Tris, 1.0 M NaCl, pH 8.0. Fractions containing FL CLAMP were identified by SDS-PAGE and concentrated using a centrifugation filter with a 10-kDa cutoff (Amicon, Millipore) and frozen as aliquots.

In vitro assembly of nucleosomes

The 240 bp 5C2 DNA fragment used for nucleosome in vitro assembly was amplified from 276 bp 5C2 fragments (50 ng/µl, IDT gBlocks gene fragments) by PCR (see 276 bp 5C2 and primer sequences below) using OneTaq Hot Start 2× Master Mix (New England Biolabs). The DNA was purified using the PCR Clean-Up Kit (Qiagen) and concentrated to 1 µg/µl by SpeedVac Vacuum (Eppendorf). The nucleosomes were assembled using the EpiMark Nucleosome Assembly Kit (New England Biolabs) following the kit’s protocol.

5C2 (276 bp), bold sequences are CLAMP-binding motifs, underlined sequences are primer binding sequences:

  • TCGACGACTAGTTTAAAGTTATTGTAGTTCTTAGAGCAGAATGTATTTTAAATATCAATGTTTCGATGTAGAAATTGAATGGTTTAAATCACGTTCACACAACTTAGAAAGAGATAGCGATGGCGGTGTGAAAGAGAGCGAGATAGTTGGAAGCTTCATGGAAATGAAAGAGAGGTAGTTTTTGGAAATGAAAGTTGTACTAGAAATAAGTATTTTATGTATATAGAATATCGAAGTACAGAAATTCGAAGCGATCTCAACTTGAATATTATATCG

Primers for 5C2 region (product is 240 bp):

  • Forward: TTGTAGTTCTTAGAGCAGAATGT

  • Reverse: GTTGAGATCGCTTCGAATTT
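The relationship between the 276 bp template, the two primers, and the 240 bp product can be checked in silico. The following is a minimal sketch, not part of the original protocol; the function names `reverse_complement` and `amplicon_length` are hypothetical helpers introduced here for illustration, and the sketch assumes exact primer matches on an uppercase template.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def amplicon_length(template: str, forward: str, reverse: str) -> int:
    """Length of the product spanned by two primers on the template's top strand.

    The forward primer is searched for directly. The reverse primer matches
    the bottom strand, so its reverse complement is searched for on the
    top strand.
    """
    start = template.find(forward)
    reverse_site = template.find(reverse_complement(reverse))
    if start == -1 or reverse_site == -1 or reverse_site < start:
        raise ValueError("primers do not flank a product on this template")
    return reverse_site + len(reverse) - start


# Primer sequences as listed above.
FORWARD = "TTGTAGTTCTTAGAGCAGAATGT"
REVERSE = "GTTGAGATCGCTTCGAATTT"
```

Calling `amplicon_length(template, FORWARD, REVERSE)` with the 5C2 sequence pasted in as `template` returns the predicted product size, which can be compared with the 240 bp stated above.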

Electrophoretic mobility shift assays

DNA or nucleosome probes at 35 nM (700 fmol/reaction) were incubated with MBP-tagged CLAMP DBD protein or MBP-tagged FL CLAMP protein in a binding buffer. The binding reaction buffer conditions are similar to conditions previously used to test ZLD nucleosome binding (McDaniel et al., 2019) in 20 µl total volume: 7.5 µl BSA/HEGK buffer (12.5 mM HEPES, pH 7.0, 0.5 mM EDTA, 0.5 mM EGTA, 5% glycerol, 50 mM KCl, 0.05 mg/ml BSA, 0.2 mM PMSF, 1 mM DTT, 0.25 mM ZnCl2, and 0.006% NP-40), 10 µl probe mix (5 ng poly[d-(IC)], 5 mM MgCl2, 700 fmol probe), and 2.5 µl protein dilution (0.5 µM, 1 µM, and 2.5 µM) at room temperature for 60 min. Reactions were loaded onto 6% DNA retardation gels (Thermo Fisher Scientific) and run in 0.5× Tris–borate–EDTA buffer for 2 hr. Gels were post-stained with GelRed Nucleic Acid Stain (Thermo Fisher Scientific) for 30 min and visualized using the ChemiDoc MP imaging system (Bio-Rad).
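The concentrations in the binding reaction follow from simple dilution arithmetic. The sketch below is illustrative only. It assumes that the listed protein values (0.5, 1, and 2.5 µM) are the concentrations of the dilutions added to the reaction rather than final concentrations, which the text does not state explicitly. The function names are hypothetical.

```python
def final_concentration_nM(amount_fmol: float, volume_ul: float) -> float:
    """Convert an amount in fmol and a volume in µl to nM (1 fmol/µl = 1 nM)."""
    return amount_fmol / volume_ul


def diluted_concentration_uM(stock_uM: float, added_ul: float, total_ul: float) -> float:
    """Concentration after adding `added_ul` of stock to a `total_ul` reaction."""
    return stock_uM * added_ul / total_ul


# 700 fmol of probe in a 20 µl reaction.
probe_nM = final_concentration_nM(700, 20)

# Assumption: 2.5 µl of each protein dilution is added to the 20 µl reaction.
protein_final_uM = [diluted_concentration_uM(c, 2.5, 20) for c in (0.5, 1.0, 2.5)]
```

Under these assumptions the probe is at 35 nM, matching the value stated above, and each protein dilution is reduced eightfold in the final reaction.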

Fly stocks and crosses

To deplete maternally deposited clamp or zld mRNA throughout oogenesis, we crossed a maternal triple driver (MTD-GAL4, Bloomington, #31777) line (Ni et al., 2011) with a Transgenic RNAi Project (TRiP) clamp RNAi line (Bloomington, #57008), a TRiP zld RNAi line (from C. Rushlow lab) or egfp RNAi line (Bloomington, #41552). The egfp RNAi line was used as control in smFISH immunostaining and imaging experiments. The MTD-GAL4 line alone was used as the control line in ATAC-seq and ChIP-seq experiments.

Briefly, MTD-GAL4 virgin females (5–7 days old) were mated with TRiP UAS-RNAi males to obtain MTD-GAL4/UAS-RNAi daughters. The MTD driver induces RNAi during oogenesis in these daughters, so the targeted mRNA is depleted in their eggs. MTD-GAL4/UAS-RNAi daughters were then mated with males to produce embryos depleted of maternal clamp or zld mRNA, which were used for ATAC-seq and ChIP-seq experiments. The embryonic phenotypes of the maternal zld TRiP RNAi line were confirmed previously (Sun et al., 2015). Maternal clamp embryonic phenotypes of the TRiP clamp RNAi line were confirmed by immunofluorescent staining in our study. Moreover, we validated CLAMP or ZLD protein knockdown in early embryos by Western blotting using the Western Breeze Kit (Invitrogen) and measured clamp and zld mRNA levels by qRT-PCR (Figure 1—figure supplement 1B,C and Figure 1—source data 1).

Embryo collections

To optimize egg collections, young (5–7 days old) females and males were mated. To ensure mothers do not lay older embryos during collections, we first starved flies for 2 hr in the empty cages and discarded the first 2 hr grape agar plates with yeast paste (Plate set #0). When we collected eggs for the experiments, we put flies in the cages with grape agar plates (Plate set #1) with yeast paste for egg laying for 2 hr. Then, we replaced Plate set #1 with a new set of plates (Plate set #2) at the 2 hr time point. We kept Plate set #1 embryos (without any adult flies) to further develop for another 2 hr to obtain 2–4 hr embryos. At the same time, we obtained newly laid 0–2 hr embryos from Plate set #2. Therefore, this strategy successfully prevented cross-contamination between 0–2 hr (Plate set #2) and 2–4 hr embryos (Plate set #1).

smFISH, Immunostaining and Imaging

For whole embryo single-molecule fluorescence in situ hybridization (smFISH) and immunostaining and subsequent imaging, standard protocols were used (Little and Gregor, 2018). smFISH probes complementary to run were a gift from Thomas Gregor, and those complementary to eve were a gift from Shawn Little. The concentrations of the different dyes and antibodies were as follows: Hoechst (Invitrogen, 3 µg/ml), anti-NRT (Developmental Studies Hybridoma Bank BP106, 1:10), AlexaFluor secondary antibodies (Invitrogen Molecular Probes, 1:1000). Imaging was done using a Nikon A1 point-scanning confocal microscope with a 40× oil objective. Image processing and intensity measurements were done using ImageJ software (NIH). Figures were assembled using Adobe Photoshop CS4.

ATAC-seq in embryos

We conducted ATAC-seq following the protocol from Blythe and Wieschaus, 2016. 0–2 hr or 2–4 hr embryos were laid on grape agar plates, dechorionated by 1 min exposure to 6% bleach (Clorox), and then washed three times in deionized water. We homogenized 10 embryos and lysed them in 50 µl lysis buffer (10 mM Tris pH 7.5, 10 mM NaCl, 3 mM MgCl2, and 0.1% NP-40). We collected nuclei by centrifuging at 500 × g at 4°C and resuspended nuclei in 5 µl TD buffer with 2.5 µl Tn5 enzyme (Illumina Tagment DNA TDE1 Enzyme and Buffer Kits). We incubated samples at 37°C for 30 min at 800 rpm (Eppendorf Thermomixer) for fragmentation, and then purified samples with Qiagen MinElute columns before PCR amplification. We amplified libraries by adding 10 µl DNA to 25 µl NEBNext HiFi 2× PCR mix (New England Biolabs) and 2.5 µl of a 25 µM solution of each of the Ad1 and Ad2 primers. We used 13 PCR cycles to amplify samples from 0–2 hr embryos and 12 PCR cycles to amplify samples from 2–4 hr embryos. Next, we purified libraries with 1.2× Ampure SPRI beads. We performed three biological replicates for each genotype (n=2) and time point (n=2). We measured the concentrations of the 12 ATAC-seq libraries by Qubit and determined library quality by Bioanalyzer. We sequenced libraries on an Illumina HiSeq 4000 sequencer at GeneWiz (South Plainfield, NJ) using the 2 × 150 bp mode. ATAC-seq data are deposited at NCBI GEO under accession number GSE152596.

ChIP-seq in embryos

We performed ChIP-seq as previously described (Blythe and Wieschaus, 2015). We collected and fixed ~100 embryos from each MTD-GAL4 and RNAi cross 0–2 hr or 2–4 hr after egg lay. We used 3 µl of rabbit anti-CLAMP (Soruco et al., 2013) and 2 µl of rat anti-ZLD (from C. Rushlow lab) per sample. We performed three biological ChIP replicates for each protein (n=2), genotype (n=3), and time point (n=2). In total, we prepared 36 libraries using the NEBNext Ultra ChIP-seq Kit (New England Biolabs) and sequenced libraries on the Illumina HiSeq 2500 sequencer using the 2 × 150 bp mode. ChIP-seq data are deposited at NCBI GEO under accession number GSE152598.

Computational analyses

ATAC-seq analysis

Prior to sequencing, the Fragment Analyzer showed the library top peaks were in the 180–190 bp range, which is comparable to the previously established embryo ATAC-seq protocol (Haines, 2017). Demultiplexed reads were trimmed of adapters using TrimGalore (Krueger, 2017) and mapped to the Drosophila genome dm6 version using Bowtie2 (v. 2.3.0) with the options --very-sensitive --no-mixed --no-discordant --dovetail -X 2000 -k 2. We used Picard tools (v. 2.9.2) and SAMtools (v. 1.9, Li et al., 2009) to remove the reads that were unmapped, failed primary alignment, or duplicated (-F 1804), and to retain properly paired reads (-f 2) with MAPQ >30. After quality trimming and mapping, Picard reported that the mean fragment sizes for all ATAC-seq mapped reads were between 125 and 161 bp. As expected, we observed three classes of peaks: (1) a sharp peak at <100 bp (open chromatin); (2) a peak at ~200 bp (mono-nucleosome); and (3) other larger peaks (multi-nucleosomes).
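The read filter described above combines SAM flag bits. As an illustrative sketch (not the authors' code), the value 1804 can be decoded into its component flags, and the stated keep/discard rule can be written as a predicate. The bit meanings follow the SAM format specification; the helper names `decode_flag` and `keep_read` are hypothetical, and the MAPQ test uses the strict ">30" given in the text.

```python
# SAM flag bits as defined in the SAM format specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails platform/vendor quality checks",
    0x400: "read is PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(value: int) -> list:
    """List the names of all flag bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if value & bit]


def keep_read(flag: int, mapq: int, exclude: int = 1804,
              require: int = 2, min_mapq: int = 30) -> bool:
    """Apply the filter as stated in the text.

    A read is kept if none of the `exclude` bits are set, all `require`
    bits are set, and its mapping quality is greater than `min_mapq`.
    """
    return (flag & exclude) == 0 and (flag & require) == require and mapq > min_mapq


excluded_categories = decode_flag(1804)
```

Decoding 1804 shows that the exclusion mask covers five categories: unmapped reads, reads with an unmapped mate, non-primary alignments, reads failing quality checks, and duplicates.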

After mapping, we used SAMtools to select fragments ≤100 bp within the bam files to focus on open chromatin. Peak regions for open chromatin were called using MACS2 (v. 2.1.1, Zhang et al., 2008) with parameters -f BAMPE -g dm --call-summits. The ENCODE blacklist was used to filter out problematic regions in dm6 (Amemiya et al., 2019). Bam files and peak bed files were used in DiffBind (v. 3.10, Stark and Brown, 2019) to count reads (dba.count), normalize library size (dba.normalize), and call (dba.contrast) DA regions with the DESeq2 method. Peak regions (201 bp) were centered by peak summits and extended 100 bp on each side. Sites were defined as DA with statistically significant differences between conditions using absolute cutoffs of FC>0.5 and FDR<0.1. We report all accessible peaks from DiffBind in Figure 2—source data 1.
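The window and threshold arithmetic in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the DiffBind implementation: the function names are hypothetical, the window is treated as a closed interval around the summit, and "FC" is assumed to be the fold value reported by DiffBind, compared by absolute value.

```python
def summit_window(summit: int, flank: int):
    """Closed interval centered on the summit, extended `flank` bp on each side."""
    return summit - flank, summit + flank


def window_width(start: int, end: int) -> int:
    """Width in bp of a closed interval [start, end]."""
    return end - start + 1


def is_open_chromatin_fragment(template_length: int, max_len: int = 100) -> bool:
    """True for paired-end fragments no longer than `max_len` bp.

    `template_length` may be negative for the reverse mate, so the
    absolute value is used.
    """
    return 0 < abs(template_length) <= max_len


def is_differential(fold: float, fdr: float,
                    min_abs_fold: float = 0.5, max_fdr: float = 0.1) -> bool:
    """Apply the absolute fold-change and FDR cutoffs given in the text."""
    return abs(fold) > min_abs_fold and fdr < max_fdr
```

With `flank=100` the window is 201 bp wide, as stated for the ATAC-seq peaks. The same helper with `flank=250` and `max_fdr=0.05` reproduces the 501 bp window and stricter FDR cutoff used for ChIP-seq.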

We used DeepTools (v. 3.1.0, Ramírez et al., 2014) to generate enrichment heatmaps (CPM normalization), and average profiles were generated in DeepStats (Gautier, 2020). We used 1× depth (reads per genome coverage, RPGC) normalization in DeepTools bamCoverage to make the coverage Bigwig files, which were uploaded to IGV (Robinson et al., 2011) for genomic track visualizations. Homer (v. 4.11, Givler and Lilienthal, 2005) was used for de novo motif searches. Visualizations and statistical tests were conducted in R (R Development Core Team, 2014). Specifically, we annotated peaks to their genomic regions using the R package Chipseeker (Yu et al., 2015) and we performed gene ontology enrichment analysis using clusterProfiler (Yu et al., 2012). Boxplots and violin plots were generated using the ggplot2 (Wickham, 2009) package.
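The two normalizations named above differ only in the scaling constant. The sketch below follows the definitions as we understand them from the deepTools documentation: CPM rescales a bin count to one million mapped reads, and RPGC rescales coverage so that the genome-wide average is 1×. The function names and all numeric inputs are hypothetical, including the genome size used in the usage note.

```python
def cpm(bin_count: float, total_mapped_reads: int) -> float:
    """Counts per million mapped reads for a single bin."""
    return bin_count * 1e6 / total_mapped_reads


def rpgc_scale_factor(total_mapped_reads: int, fragment_length: float,
                      effective_genome_size: int) -> float:
    """Factor that rescales raw coverage to 1x average genome coverage.

    Mean coverage is (mapped reads * fragment length) / effective genome
    size; the scale factor is its reciprocal.
    """
    mean_coverage = total_mapped_reads * fragment_length / effective_genome_size
    return 1.0 / mean_coverage
```

For example, a library of 1,000,000 mapped reads with 150 bp fragments on an illustrative 150 Mb effective genome already averages 1× coverage, so its RPGC scale factor is 1.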

ChIP-seq analysis

Briefly, we trimmed ChIP-seq raw reads with TrimGalore (Krueger, 2017) with a minimal phred score of 20, 36 bp minimal read length, and Illumina adaptor removal. We then mapped cleaned reads to the D. melanogaster genome (UCSC dm6) with Bowtie2 (v. 2.3.0) using the --very-sensitive-local flag. We used Picard tools (v. 2.9.2) and SAMtools (v. 1.9, Li et al., 2009) to remove the PCR duplicates. We used MACS2 (v. 2.1.1, Zhang et al., 2008) to identify peaks with default parameters and MSPC (v. 4.0.0, Jalili et al., 2015) to obtain consensus peaks from three replicates. The peak number for each sample is summarized in Table 1. The ENCODE blacklist was used to filter out problematic regions in dm6 (Amemiya et al., 2019). We identified DB and non-DB regions between MTD and RNAi samples using DiffBind (v. 3.10, Stark and Brown, 2019) with the DESeq2 method. Peak regions (501 bp) were centered by peak summits and extended 250 bp on each side. The DB and non-DB peak numbers are summarized in Table 1. DB was defined with absolute FC>0.5 and FDR<0.05 (Table 1—source data 1).

We used DeepTools (v. 3.1.0, Ramírez et al., 2014) to generate enrichment heatmaps and average profiles. Bigwig files were generated with DeepTools bamCompare (scale factor method: SES; Normalization: log2) and uploaded to IGV (Robinson et al., 2011) for genomic track visualization. We used Homer (v. 4.11, Givler and Lilienthal, 2005) for de novo motif searches and genomic annotation. Intervene (Khan and Mathelier, 2017) was used for intersection and visualization of multiple peak region sets. Visualizations and statistical tests were conducted in R (R Development Core Team, 2014). Specifically, we annotated peaks to their genomic regions using the R package Chipseeker (Yu et al., 2015) and we performed gene ontology enrichment analysis using clusterProfiler (Yu et al., 2012). Boxplots and violin plots were generated using the ggplot2 (Wickham, 2009) package.

ATAC-seq and ChIP-seq data integration

We used the Bedtools (Quinlan and Hall, 2010) intersection tool to intersect peaks in CLAMP ChIP-seq binding regions with CLAMP DA or non-DA peaks. Based on the intersection of the peaks, we defined four types of CLAMP-related peaks: (1) DA with CLAMP, (2) DA without CLAMP, (3) non-DA with CLAMP, and (4) non-DA without CLAMP. Similarly, we defined ZLD-related peaks by intersecting ZLD DA or non-DA peaks and ATAC-seq data sets (Hannon et al., 2017; Soluri et al., 2020) from wt and zld germline clone (zld-) embryos at the NC14 +12 min stage. Specifically, we defined four classes of ZLD-related genomic loci: (1) DA with ZLD, (2) DA without ZLD, (3) non-DA with ZLD, and (4) non-DA without ZLD. We used DeepTools (v. 3.1.0, Ramírez et al., 2014) to generate enrichment heatmaps for each subclass of peaks. Peak locations in each CLAMP- or ZLD-related category are summarized in Table 2—source data 1.
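The four-way classification can be illustrated with a small pure-Python sketch. It stands in for the Bedtools intersection rather than reproducing it: intervals are assumed to be half-open BED-style tuples, any overlap of at least 1 bp counts as bound, and the names `overlaps` and `classify_peaks` are hypothetical.

```python
def overlaps(a, b) -> bool:
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_peaks(atac_peaks, da_flags, chip_peaks, factor="CLAMP"):
    """Label each ATAC-seq peak by DA status and by overlap with a ChIP-seq peak.

    `atac_peaks` and `da_flags` are parallel lists; `da_flags[i]` is True
    when peak i was called differentially accessible.
    """
    labels = []
    for peak, is_da in zip(atac_peaks, da_flags):
        bound = any(overlaps(peak, chip) for chip in chip_peaks)
        status = "DA" if is_da else "non-DA"
        binding = "with" if bound else "without"
        labels.append(f"{status} {binding} {factor}")
    return labels


# Toy coordinates for illustration only.
atac = [("chrX", 100, 301), ("chrX", 1000, 1201), ("chr2L", 50, 251), ("chr2L", 900, 1101)]
da = [True, True, False, False]
chip = [("chrX", 250, 751), ("chr2L", 0, 60)]
labels = classify_peaks(atac, da, chip)
```

A linear scan over ChIP-seq peaks is used here for clarity; for genome-scale peak sets a sorted sweep or an interval tree would be the more practical choice.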

ATAC-seq and RNA-seq data integration

We annotated genes near differential (down-DA) ATAC-seq peaks in R using the detailRanges function from the csaw package (Lun and Smyth, 2016). Then we plotted the expression of these genes using previously published RNA-seq data (Rieder et al., 2017).

ChIP-seq and RNA-seq data integration

To define strong, weak, and unbound genes close to peaks in CLAMP or ZLD ChIP-seq data, we used a peak binding score of 100, as reported by MACS2 (-log10(p-value)), as the cutoff value. We defined the following categories: (1) strong binding peaks: score greater than 100; (2) weak binding peaks: score less than 100; (3) unbound peaks: the rest of the peaks, which are neither strong nor weak. Then, we annotated all peaks using Homer annotatePeaks (v. 4.11, Givler and Lilienthal, 2005). We then obtained the log2 fold change (clamp-i/MTD or zld-i/yw) of gene expression in the RNA-seq data set for each protein binding group: CLAMP (Rieder et al., 2017) or ZLD (Schulz et al., 2015). Boxplots and violin plots were generated using the ggplot2 (Wickham, 2009) package.
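The grouping and the fold-change calculation can be sketched as below. This is an illustration only. The function names are hypothetical, a gene with no nearby peak is represented by `None`, and a score of exactly 100 is assigned to the weak group, which is an assumption because the text does not specify that case.

```python
import math


def binding_class(peak_score, cutoff: float = 100.0) -> str:
    """Assign a binding group from a MACS2 -log10(p-value) peak score.

    `peak_score` is None when no peak is associated with the gene.
    Assumption: a score equal to the cutoff is treated as weak.
    """
    if peak_score is None:
        return "unbound"
    return "strong" if peak_score > cutoff else "weak"


def log2_fold_change(knockdown_expression: float, control_expression: float) -> float:
    """log2 of knockdown over control expression (e.g. clamp-i/MTD)."""
    return math.log2(knockdown_expression / control_expression)
```

A negative log2 fold change indicates lower expression in the RNAi embryos than in the control, so genes that depend on the bound factor for activation are expected to shift below zero.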

Data sets

RNA-seq data sets from wt and maternal clamp depletion by RNAi were from GSE102922 (Rieder et al., 2017). RNA-seq data sets from yw wt and zld maternal RNAi were from GSE65837 (Schulz et al., 2015). ATAC-seq data from wt and zld germline clones were from GSE86966 (Hannon et al., 2017). Processed ATAC-seq data identifying differential peaks between wt and zld germline mutations were from Soluri et al., 2020.

Acknowledgements

The authors thank Dr. Melissa Harrison, Tyler Gibson, and Marissa Gaskill for sending the GAF-dependent region bed file and for helpful discussions. The authors thank members of the Larschan lab for feedback and discussions. This work was supported by NIH Grants F32GM109663, K99HD092625, and R00HD092625 to Dr. Leila Rieder and R35GM126994 to Dr. Erica Larschan, and in part by NSF Grant 1845734 and NIH Grant R01GM118530 to Dr. Nicolas L Fawzi.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Jingyue Duan, Email: jd774@cornell.edu.

Erica Larschan, Email: erica_larschan@brown.edu.

Yukiko M Yamashita, Whitehead Institute/MIT, United States.

Kevin Struhl, Harvard Medical School, United States.

Funding Information

This paper was supported by the following grants:

  • National Institute of General Medical Sciences F32GM109663 to Leila Rieder.

  • National Institute of General Medical Sciences K99HD092625 to Leila Rieder.

  • National Institute of General Medical Sciences R00HD092625 to Leila Rieder.

  • National Institute of General Medical Sciences R35GM126994 to Erica Larschan.

  • National Science Foundation 1845734 to Nicolas Fawzi.

  • National Institute of General Medical Sciences R01GM118530 to Nicolas Fawzi.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.

Conceptualization, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing - review and editing.

Visualization, Methodology, Writing - review and editing.

Validation.

Validation.

Supervision, Validation, Visualization.

Supervision, Validation, Visualization, Writing - review and editing.

Resources, Methodology.

Supervision, Funding acquisition, Methodology.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing.

Additional files

Transparent reporting form

Data availability

Sequencing data have been deposited in GEO under accession code GSE152613.

The following dataset was generated:

Rieder L, Colonnetta MM, Huang A, Mckenney M, Watters S, Deshpande G, Jordan W, Fawzi N, Larschan E. 2020. CLAMP and Zelda function together as pioneer transcription factors to promote Drosophila zygotic genome activation. NCBI Gene Expression Omnibus. GSE152613

The following previously published datasets were used:

Rieder LE, Koreski KP, Boltz KA, Kuzu G, Urban JA, Bowman S, Zeidman A, Jordan WT, Tolstorukov MY, Marzluff WF, Duronio RJ, Larschan EN. 2019. Histone locus regulation by the Drosophila dosage compensation adaptor protein CLAMP. NCBI Gene Expression Omnibus. GSE102922

Rieder L. 2015. Zelda determines chromatin accessibility during the Drosophila maternal-to-zygotic transition. NCBI Gene Expression Omnibus. GSE65837

Rieder L. 2017. Concentration dependent binding states of the Bicoid Homeodomain Protein. NCBI Gene Expression Omnibus. GSE86966

References

  1. Alekseyenko AA, Peng S, Larschan E, Gorchakov AA, Lee OK, Kharchenko P, McGrath SD, Wang CI, Mardis ER, Park PJ, Kuroda MI. A sequence motif within chromatin entry sites directs MSL establishment on the Drosophila X chromosome. Cell. 2008;134:599–609. doi: 10.1016/j.cell.2008.06.033.
  2. Amemiya HM, Kundaje A, Boyle AP. The ENCODE blacklist: identification of problematic regions of the genome. Scientific Reports. 2019;9:9354. doi: 10.1038/s41598-019-45839-z.
  3. Bhat KM, Farkas G, Karch F, Gyurkovics H, Gausz J, Schedl P. The GAGA factor is required in the early Drosophila embryo not only for transcriptional regulation but also for nuclear division. Development. 1996;122:1113–1124. doi: 10.1242/dev.122.4.1113.
  4. Blythe SA, Wieschaus EF. Zygotic genome activation triggers the DNA replication checkpoint at the midblastula transition. Cell. 2015;160:1169–1181. doi: 10.1016/j.cell.2015.01.050.
  5. Blythe SA, Wieschaus EF. Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis. eLife. 2016;5:e20148. doi: 10.7554/eLife.20148.
  6. Cirillo LA, Zaret KS. An early developmental transcription factor complex that is more stable on nucleosome core particles than on free DNA. Molecular Cell. 1999;4:961–969. doi: 10.1016/s1097-2765(00)80225-7.
  7. Duarte FM, Fuda NJ, Mahat DB, Core LJ, Guertin MJ, Lis JT. Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation. Genes & Development. 2016;30:1731–1746. doi: 10.1101/gad.284430.116.
  8. Farkas G, Gausz J, Galloni M, Reuter G, Gyurkovics H, Karch F. The Trithorax-like gene encodes the Drosophila GAGA factor. Nature. 1994;371:806–808. doi: 10.1038/371806a0.
  9. Fuda NJ, Guertin MJ, Sharma S, Danko CG, Martins AL, Siepel A, Lis JT. GAGA factor maintains nucleosome-free regions and has a role in RNA polymerase II recruitment to promoters. PLOS Genetics. 2015;11:e1005108. doi: 10.1371/journal.pgen.1005108.
  10. Fujioka M, Jaynes JB, Goto T. Early even-skipped stripes act as morphogenetic gradients at the single cell level to establish engrailed expression. Development. 1995;121:4371–4382. doi: 10.1242/dev.121.12.4371.
  11. Gaskill MM, Gibson TJ, Larson ED, Harrison MM. GAF is essential for zygotic genome activation and chromatin accessibility in the early Drosophila embryo. eLife. 2021;10:e66668. doi: 10.7554/eLife.66668.
  12. Gautier R. gtrichard/deepStats: deepStats. 0.3.1. Zenodo. 2020. doi: 10.5281/zenodo.3361799.
  13. Gergen JP. Dosage compensation in Drosophila: evidence that daughterless and Sex-lethal control X chromosome activity at the blastoderm stage of embryogenesis. Genetics. 1987;117:477–485. doi: 10.1093/genetics/117.3.477.
  14. Givler T, Lilienthal P. Using HOMER software, NREL’s Micropower Optimization Model, to Explore the Role of Gen-Sets in Small Solar Power Systems; Case Study: Sri Lanka (No: NREL/TP-710-36774). National Renewable Energy Lab. 2005.
  15. Gutierrez-Perez I, Rowley MJ, Lyu X, Valadez-Graham V, Vallejo DM, Ballesta-Illan E, Lopez-Atalaya JP, Kremsky I, Caparros E, Corces VG, Dominguez M. Ecdysone-Induced 3D chromatin reorganization involves active enhancers bound by pipsqueak and polycomb. Cell Reports. 2019;28:2715–2727. doi: 10.1016/j.celrep.2019.07.096.
  16. Haines J. ATAC-seq on nuclei from frozen, sliced, Drosophila Melanogaster embryo halves. PLOS ONE. 2017;1:76. doi: 10.17504/protocols.io.j9zcr76.
  17. Hamm DC, Bondra ER, Harrison MM. Transcriptional activation is a conserved feature of the early embryonic factor zelda that requires a cluster of four zinc fingers for DNA binding and a low-complexity activation domain. Journal of Biological Chemistry. 2015;290:3508–3518. doi: 10.1074/jbc.M114.602292.
  18. Hamm DC, Larson ED, Nevil M, Marshall KE, Bondra ER, Harrison MM. A conserved maternal-specific repressive domain in Zelda revealed by Cas9-mediated mutagenesis in Drosophila Melanogaster. PLOS Genetics. 2017;13:e1007120. doi: 10.1371/journal.pgen.1007120.
  19. Hamm DC, Harrison MM. Regulatory principles governing the maternal-to-zygotic transition: insights from Drosophila melanogaster. Open Biology. 2018;8:180183. doi: 10.1098/rsob.180183.
  20. Hannon CE, Blythe SA, Wieschaus EF. Concentration dependent chromatin states induced by the bicoid morphogen gradient. eLife. 2017;6:e28275. doi: 10.7554/eLife.28275.
  21. Harrison MM, Li XY, Kaplan T, Botchan MR, Eisen MB. Zelda binding in the early Drosophila Melanogaster embryo marks regions subsequently activated at the maternal-to-zygotic transition. PLOS Genetics. 2011;7:e1002266. doi: 10.1371/journal.pgen.1002266.
  22. Hortsch M, Patel NH, Bieber AJ, Traquina ZR, Goodman CS. Drosophila neurotactin, a surface glycoprotein with homology to serine esterases, is dynamically expressed during embryogenesis. Development. 1990;110:1327–1340. doi: 10.1242/dev.110.4.1327.
  23. Iwafuchi-Doi M, Donahue G, Kakumanu A, Watts JA, Mahony S, Pugh BF, Lee D, Kaestner KH, Zaret KS. The pioneer transcription factor FoxA maintains an accessible nucleosome configuration at enhancers for Tissue-Specific gene activation. Molecular Cell. 2016;62:79–91. doi: 10.1016/j.molcel.2016.03.001.
  24. Jalili V, Matteucci M, Masseroli M, Morelli MJ. Using combined evidence from replicates to evaluate ChIP-seq peaks. Bioinformatics. 2015;31:2761–2769. doi: 10.1093/bioinformatics/btv293.
  25. Judd J, Duarte FM, Lis JT. Pioneer-like factor GAF cooperates with PBAP (SWI/SNF) and NURF (ISWI) to regulate transcription. Genes & Development. 2021;35:147–156. doi: 10.1101/gad.341768.120.
  26. Jukam D, Shariati SAM, Skotheim JM. Zygotic Genome Activation in Vertebrates. Developmental Cell. 2017;42:316–332. doi: 10.1016/j.devcel.2017.07.026.
  27. Kaye EG, Booker M, Kurland JV, Conicella AE, Fawzi NL, Bulyk ML, Tolstorukov MY, Larschan E. Differential occupancy of two GA-Binding proteins promotes targeting of the Drosophila dosage compensation complex to the male X chromosome. Cell Reports. 2018;22:3227–3239. doi: 10.1016/j.celrep.2018.02.098.
  28. Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287. doi: 10.1186/s12859-017-1708-7.
  29. Krueger F. Trim Galore: a wrapper script to automate quality and adapter trimming as well as quality control. v3. Bioinformatics. 2017. https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/
  30. Lee MT, Bonneau AR, Giraldez AJ. Zygotic genome activation during the maternal-to-zygotic transition. Annual Review of Cell and Developmental Biology. 2014;30:581–613. doi: 10.1146/annurev-cellbio-100913-013027.
  31. Lehmann M, Siegmund T, Lintermann KG, Korge G. The pipsqueak protein of Drosophila Melanogaster binds to GAGA sequences through a novel DNA-binding domain. Journal of Biological Chemistry. 1998;273:28504–28509. doi: 10.1074/jbc.273.43.28504.
  32. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  33. Li J, Liu Y, Rhee HS, Ghosh SK, Bai L, Pugh BF, Gilmour DS. Kinetic competition between elongation rate and binding of NELF controls promoter-proximal pausing. Molecular Cell. 2013;50:711–722. doi: 10.1016/j.molcel.2013.05.016.
  34. Li XY, Harrison MM, Villalta JE, Kaplan T, Eisen MB. Establishment of regions of genomic activity during the Drosophila maternal to zygotic transition. eLife. 2014;3:e03737. doi: 10.7554/eLife.03737.
  35. Liang HL, Nien CY, Liu HY, Metzstein MM, Kirov N, Rushlow C. The zinc-finger protein Zelda is a key activator of the early zygotic genome in Drosophila. Nature. 2008;456:400–403. doi: 10.1038/nature07388.
  36. Little SC, Gregor T. Single mRNA molecule detection in Drosophila. Methods in Molecular Biology. 2018;1649:127–142. doi: 10.1007/978-1-4939-7213-5_8.
  37. Lott SE, Villalta JE, Schroth GP, Luo S, Tonkin LA, Eisen MB. Noncanonical compensation of zygotic X transcription in early Drosophila melanogaster development revealed through single-embryo RNA-seq. PLOS Biology. 2011;9:e1000590. doi: 10.1371/journal.pbio.1000590.
  38. Lun AT, Smyth GK. csaw: a Bioconductor package for differential binding analysis of ChIP-seq data using sliding windows. Nucleic Acids Research. 2016;44:e45. doi: 10.1093/nar/gkv1191.
  39. Manoukian AS, Krause HM. Concentration-dependent activities of the even-skipped protein in Drosophila embryos. Genes & Development. 1992;6:1740–1751. doi: 10.1101/gad.6.9.1740.
  40. Mayran A, Drouin J. Pioneer transcription factors shape the epigenetic landscape. The Journal of Biological Chemistry. 2018;293:13795–13804. doi: 10.1074/jbc.R117.001232.
  41. McDaniel SL, Gibson TJ, Schulz KN, Fernandez Garcia M, Nevil M, Jain SU, Lewis PW, Zaret KS, Harrison MM. Continued activity of the pioneer factor Zelda is required to drive zygotic genome activation. Molecular Cell. 2019;74:185–195. doi: 10.1016/j.molcel.2019.01.014.
  42. Ni JQ, Zhou R, Czech B, Liu LP, Holderbaum L, Yang-Zhou D, Shim HS, Tao R, Handler D, Karpowicz P, Binari R, Booker M, Brennecke J, Perkins LA, Hannon GJ, Perrimon N. A genome-scale shRNA resource for transgenic RNAi in Drosophila. Nature Methods. 2011;8:405–407. doi: 10.1038/nmeth.1592.
  43. Peti W, Page R. Strategies to maximize heterologous protein expression in Escherichia coli with minimal cost. Protein Expression and Purification. 2007;51:1–10. doi: 10.1016/j.pep.2006.06.024.
  44. Prayitno K, Schauer T, Regnard C, Becker PB. Progressive dosage compensation during Drosophila embryogenesis is reflected by gene arrangement. EMBO Reports. 2019;20:e48138. doi: 10.15252/embr.201948138.
  45. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
  46. R Development Core Team . Vienna, Austria: R Foundation for Statistical Computing; 2014. https://www.R-project.org/ [Google Scholar]
  47. Ramírez F, Dündar F, Diehl S, Grüning BA, Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Research. 2014;42:W187–W191. doi: 10.1093/nar/gku365. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Rao X, Huang X, Zhou Z, Lin X. An improvement of the 2ˆ(-delta delta CT) method for quantitative real-time polymerase chain reaction data analysis. Biostatistics, Bioinformatics and Biomathematics. 2013;3:71–85. [PMC free article] [PubMed] [Google Scholar]
  49. Rieder LE, Koreski KP, Boltz KA, Kuzu G, Urban JA, Bowman SK, Zeidman A, Jordan WT, Tolstorukov MY, Marzluff WF, Duronio RJ, Larschan EN. Histone locus regulation by the Drosophila dosage compensation adaptor protein CLAMP. Genes & Development. 2017;31:1494–1508. doi: 10.1101/gad.300855.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Rieder LE, Jordan WT, Larschan EN. Targeting of the Dosage-Compensated male X-Chromosome during early Drosophila development. Cell Reports. 2019;29:4268–4275. doi: 10.1016/j.celrep.2019.11.095. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nature Biotechnology. 2011;29:24–26. doi: 10.1038/nbt.1754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Schulz KN, Bondra ER, Moshe A, Villalta JE, Lieb JD, Kaplan T, McKay DJ, Harrison MM. Zelda is differentially required for chromatin accessibility, transcription factor binding, and gene expression in the early Drosophila embryo. Genome Research. 2015;25:1715–1726. doi: 10.1101/gr.192682.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Shimojima T, Okada M, Nakayama T, Ueda H, Okawa K, Iwamatsu A, Handa H, Hirose S. Drosophila FACT contributes to Hox gene expression through physical and functional interactions with GAGA factor. Genes & Development. 2003;17:1605–1616. doi: 10.1101/gad.1086803. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Soluri IV, Zumerling LM, Payan Parra OA, Clark EG, Blythe SA. Zygotic pioneer factor activity of Odd-paired/Zic is necessary for late function of the Drosophila segmentation network. eLife. 2020;9:e53916. doi: 10.7554/eLife.53916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Soruco MM, Chery J, Bishop EP, Siggers T, Tolstorukov MY, Leydon AR, Sugden AU, Goebel K, Feng J, Xia P, Vedenko A, Bulyk ML, Park PJ, Larschan E. The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes & Development. 2013;27:1551–1556. doi: 10.1101/gad.214585.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Soufi A, Garcia MF, Jaroszewicz A, Osman N, Pellegrini M, Zaret KS. Pioneer transcription factors target partial DNA motifs on nucleosomes to initiate reprogramming. Cell. 2015;161:555–568. doi: 10.1016/j.cell.2015.03.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Stark R, Brown G. DiffBind: Differential Binding Analysis of ChIP-Seq Peak Data. 3.10. Bioconductor. 2019. https://bioconductor.org/packages/release/bioc/html/DiffBind.html
  58. Stott K, Blackburn JM, Butler PJ, Perutz M. Incorporation of glutamine repeats makes protein oligomerize: implications for neurodegenerative diseases. PNAS. 1995;92:6509–6513. doi: 10.1073/pnas.92.14.6509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Sun Y, Nien CY, Chen K, Liu HY, Johnston J, Zeitlinger J, Rushlow C. Zelda overcomes the high intrinsic nucleosome barrier at enhancers during Drosophila zygotic genome activation. Genome Research. 2015;25:1703–1714. doi: 10.1101/gr.192542.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Tariq M, Wegrzyn R, Anwar S, Bukau B, Paro R. Drosophila GAGA factor polyglutamine domains exhibit prion-like behavior. BMC Genomics. 2013;14:374. doi: 10.1186/1471-2164-14-374. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. ten Bosch JR, Benavides JA, Cline TW. The TAGteam DNA motif controls the timing of Drosophila pre-blastoderm transcription. Development. 2006;133:1967–1977. doi: 10.1242/dev.02373. [DOI] [PubMed] [Google Scholar]
  62. Urban J, Kuzu G, Bowman S, Scruggs B, Henriques T, Kingston R, Adelman K, Tolstorukov M, Larschan E. Enhanced chromatin accessibility of the dosage compensated Drosophila male X-chromosome requires the CLAMP zinc finger protein. PLOS ONE. 2017a;12:e0186855. doi: 10.1371/journal.pone.0186855. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Urban JA, Urban JM, Kuzu G, Larschan EN. The Drosophila CLAMP protein associates with diverse proteins on chromatin. PLOS ONE. 2017b;12:e0189772. doi: 10.1371/journal.pone.0189772. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Wickham H. Ggplot2: Elegant Graphics for Data Analysis. Use R! New York: Springer-Verlag; 2009. [DOI] [Google Scholar]
  65. Wilkins RC, Lis JT. DNA distortion and multimerization: novel functions of the glutamine-rich domain of GAGA factor. Journal of Molecular Biology. 1999;285:515–525. doi: 10.1006/jmbi.1998.2356. [DOI] [PubMed] [Google Scholar]
  66. Yan F, Powell DR, Curtis DJ, Wong NC. From reads to insight: a hitchhiker's guide to ATAC-seq data analysis. Genome Biology. 2020;21:22. doi: 10.1186/s13059-020-1929-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology. 2012;16:284–287. doi: 10.1089/omi.2011.0118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Yu G, Wang LG, He QY. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015;31:2382–2383. doi: 10.1093/bioinformatics/btv145. [DOI] [PubMed] [Google Scholar]
  69. Zaret KS, Carroll JS. Pioneer transcription factors: establishing competence for gene expression. Genes & Development. 2011;25:2227–2241. doi: 10.1101/gad.176826.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biology. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: Yukiko M Yamashita1

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The authors showed that CLAMP functions as a pioneer factor during zygotic genome activation in Drosophila on a specific subset of target genes, particularly at target genes with GA-rich motifs that are not regulated by the well-established pioneer factor Zelda.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "CLAMP and Zelda function together as pioneer transcription factors to promote Drosophila zygotic genome activation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The reviewers have opted to remain anonymous.

Our decision has been reached after consultation between the reviewers. The manuscript by Duan and Rieder et al. describes, for the first time, how the CLAMP transcription factor acts as a pioneer TF in the fly embryo. They demonstrate that CLAMP can bind to nucleosome-bound DNA and that it binds to and generates accessible chromatin at a set of gene promoters in the early embryo, and that without this activity these genes fail to be transcribed during ZGA. They further describe fascinating cooperativity between CLAMP and ZLD, a previously identified pioneer TF in the fly embryo. The work will be of broad interest to both the developmental biology and transcription biology fields.

All the reviewers appreciated the importance and the potential impact of the work, but all raised issues with experimental approaches particularly regarding the following issues.

1. Bioinformatics analysis

2. Writing – all reviewers noted that the manuscript was not written clearly enough for readers to follow the logic easily. Sometimes, the authors appeared to make contradictory statements.

Details can be found in individual reviews.

In light of eLife's policy to invite revisions only when revision experiments are unlikely to change the major conclusions of the paper, we decided that this manuscript must be rejected at this point. However, we would like to note that all the reviewers are quite enthusiastic about this manuscript, and if you can address these concerns, we would be ready to review the revised manuscript as a new submission, which will be handled by the same set of editors/reviewers.

Reviewer #1:

In this manuscript, Duan, Rieder and collaborators examine the potential role of the transcription factor (TF) CLAMP as a pioneer transcription factor and potential co-regulator of zygotic genome activation (ZGA) in Drosophila. First the authors perform an in vitro characterisation of the biochemical properties of CLAMP with respect to being able to bind nucleosomal DNA and the effect of the depletion of the protein in regulating transcription. The authors then perform ATAC-seq, and ChIP-seq in embryos depleted of CLAMP or Zelda (a known pioneer transcription factor in Drosophila, which is a master regulator of ZGA) to examine their binding patterns in relation to each other and how these affect chromatin accessibility during early embryonic development before (0-2 hours post fertilisation (hpf)) and after (2-4 hpf) ZGA.

The question is of interest since previous reports have described an enrichment of a GA-rich motif, similar to the one recognised by CLAMP, in open chromatin at developmental time points around ZGA, raising the question of whether CLAMP might be involved in this developmental transition.

Unfortunately, I find the manuscript very densely written and very difficult to follow. It has taken me much longer than anticipated to review the work since the concepts are not clearly introduced, and the description and interpretation of the data are confusing. Despite the dense writing, the manuscript lacks critical information regarding how analyses were done, which precludes a proper evaluation of the conclusions. For the parts of the manuscript where this is possible, it is unclear whether some of the analytical strategies are the most adequate to address the question, and whether the data support the authors' conclusions.

1. The manuscript contains no information regarding the phenotypic effect of CLAMP depletion (or of Zelda depletion, which would serve as control). Are these embryos healthy? Do they stop developing? Are they viable at least until ZGA? How pure are the embryo collections? What is the proportion of embryos further than nuclear cycle 13,14 in the 0-2 hpf collections? In addition, the validation of the knockdowns is buried in Figure 3 —figure supplement 1C,D, and is not convincing. This is a critical piece of information which is necessary to interpret the results for all remaining figures in the manuscript.

2. The authors claim that CLAMP is a pioneer transcription factor that directly activates zygotically-transcribed genes (line 140), but in my opinion, this is not demonstrated in a convincing manner. First, the western blot presented in Figure 3 —figure supplement 1D doesn't allow for an evaluation of the presence of Zld at the protein level in CLAMP depleted embryos (see point 1 above). Furthermore, the analysis presented in Figure 1 —figure supplement 1A, shows that there is a widespread change in expression for zygotically expressed genes upon CLAMP depletion irrespective of the level of CLAMP binding, suggesting that overall the observed changes correspond to indirect effects.

3. The manuscript doesn't contain critical information regarding how the ATAC-seq data were analysed. Did the authors use only short/long/all fragments for their analyses? This is important since one can then evaluate whether the authors are measuring open chromatin (short fragments) or nucleosome positioning (long fragments). In addition, how are differentially accessible regions calculated? Is the genome binned? If so, how big are these bins? I am not sure that I fully understand how this analysis is done, and I don't know what "log concentration" refers to in Figure 2A, but I find it difficult to reconcile the close to zero correlation values reported for the ATAC-seq datasets in the MTD and CLAMP depleted samples when the majority of dots in panel 2A are not different between the two conditions.

4. The genomic track plot presented in Figure 2B doesn't help in the interpretation of the data. First, the panel is missing the genomic coordinates, so one cannot determine which peaks are presented in the figure. Second, the panel shows multiple CLAMP binding sites that don't seem to present an open chromatin signature, suggesting that only a fraction of the binding is related to open chromatin in the control samples. It would be interesting, and, in my opinion, much more straightforward to test, how much of the binding that leads to open chromatin (irrespective of whether this would be scored as differentially accessible regions, which might be confounded for the reasons outlined in points 1 and 2 above) occurs together, for example, with Zld binding.

5. I find the scatterplots and the regression analysis presented in Figure 2E very unconvincing. Although statistically significant, the correlations are extremely weak, and if the authors computed R^2, it would be extremely low, highlighting that the majority of the variance observed in these measurements remains unaccounted for. So overall, in my opinion, most of the authors' conclusions in Figure 2 are not supported without further analysis.
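The reviewer's point about R^2 can be illustrated with a minimal sketch. It assumes paired per-gene fold changes are available as two plain lists; the function names `pearson_r` and `variance_explained` are hypothetical helpers written for this illustration, not part of the authors' analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def variance_explained(r):
    """For a simple linear regression, R^2 equals the squared Pearson r."""
    return r * r

# Illustrative value only: a correlation of 0.2 can be statistically
# significant with thousands of points, yet leaves 96% of the variance
# unexplained.
print(variance_explained(0.2))
```

With many data points, a small r can reach a low p-value, which is why the squared value is the more informative summary here.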

6. I find the analytical strategy in Figure 3A-C inappropriate and difficult to understand. First, the authors use average plots in which the number of regions differs by orders of magnitude across the comparisons. The manuscript does not contain specific details regarding how these plots are produced, but these can be severely affected by outliers, especially in the cases when the number of regions is low. Second, the authors define four classes of regions, depending on whether they are bound or not bound by the factor and differentially accessible, and a control region (not bound and not differentially accessible). The split into these classes is very confusing and difficult to follow throughout the rest of the manuscript. In my opinion, this might be easier to follow if the data were presented as heatmaps including the common set of regions then split into different classes. Of note, the authors use comparisons across the average plots to "validate" (line 231) or "confirm" (line 243) their analysis. I disagree that this is validation because the groups have been chosen presumably based on thresholding the same data, and a validation could only come from orthogonal data, which is not used here. If the authors want to use average plots, these should at least include a shaded area representing a confidence interval. Only one of the lines in Figure 6D shows such shading, although what this represents is not explained. This significantly affects the majority of analyses and interpretations in Figures 3, 4, 5 and 6.

7. There is a significant difference in the shape of the average profiles for the ATAC-seq data for the fourth group in both comparisons in Figure 3A. This is meant to be the control group in both cases, of non-DA, non-bound peaks. Since the controls are qualitatively different between the CLAMP and Zelda depleted experiments, I wonder whether strong conclusions can be obtained from the comparisons of the other average profiles. In this respect, the p-values in Table 1 need to be corrected for multiple testing. Under Bonferroni correction, the p-value of the overlap between DA Zld-bound and DA CLAMP-bound is not significant, which is in disagreement with the authors' conclusions (lines 247-248). This also affects the motivation for the analysis in Figure 4 (lines 312-313).
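As a minimal sketch of the Bonferroni correction the reviewer refers to: each p-value is multiplied by the number of tests (capped at 1) before comparison with the significance level. The function name `bonferroni` and the example p-values are hypothetical and are not taken from Table 1.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each input p-value.

    Each p-value is multiplied by the number of tests and capped at 1.0.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted <= alpha))
    return results

# Hypothetical p-values for three overlap tests (illustration only).
for adjusted, significant in bonferroni([0.01, 0.02, 0.04]):
    print(round(adjusted, 4), significant)
```

In this made-up example, a test with an unadjusted p-value of 0.04 would pass a 0.05 threshold on its own but not after correction for three comparisons.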

8. Figure 3D is misleading since the scales in both plots are different and do not let the reader appreciate the extent of the changes for the different classes. It might help with the visualisation if the authors would use a violin plot, or if they would plot the data without the outliers and including notches for the boxes. In my opinion, the results in Figure 3D indicate that there is a downregulation of gene expression at CLAMP-bound genes irrespective of changes in chromatin accessibility. This would challenge the authors' own conclusions of a pioneer role for CLAMP at those sites that don't change accessibility. It is therefore unclear how the authors arrive at the conclusions in lines 294-296.

9. Does the list of motifs included in Figure 4A contain the full set of significant motifs found in these regions? Without this information, it is not possible to evaluate the statement in lines 319-320. Furthermore, this can be due to the fact that there is a significantly lower number of DA CLAMP-bound regions in the 2-4 hours data compared with the 0-2 hours data, which might affect the significance of these enrichments. In my opinion, a better alternative would be to show the number of Zelda events that occur in these regions. Without this information, the conclusion stated in this paragraph (lines 320-322) is not supported.

10. The authors state that the majority of Zelda binding sites are not affected upon CLAMP depletion (line 345). However, I find this statement puzzling, since Figure 4 —figure supplement 1A shows a significant effect in the level of Zelda binding upon CLAMP KD throughout most of the regions shown in the heatmap.

11. Related to the point above, it would be useful if the authors could clarify how the data presented in Figure 4 —figure supplement 1A and Figure 4 —figure supplement 1D are different. Also, please note that this figure is missing the labelling of the x-axis. This is also the case for Figure 4 —figure supplement 1E.

12. The conclusions reached by the authors in lines 354-356 regarding the effect of Zld in CLAMP binding are not supported since, as the authors acknowledge, the experimental design is confounded by the up-regulation of Zld in the 2-4 hours time point.

13. I am unable to understand the interpretation of the results presented in Figure 4F and 4G. In any case, the results in Figure 4G might be confounded by the increase in Zld expression at 2-4 hours, as mentioned by the authors before.

14. Figure 5A lacks genomic coordinates, which makes it impossible to interpret the plot. In addition, the scales of the signal are also not readable, which makes it impossible to evaluate the robustness of the binding represented and the comparison. Related to this figure, does the difference in peak size depend on the number of individual binding events? I am unable to follow the results presented in Figure 5 —figure supplement 1E,F, and the interpretation of the results of Figure 5E.

15. I find it difficult to understand the statement in lines 458-459 because I do not understand what is the nature of the interdependent relationship between Zld and CLAMP binding to chromatin.

16. It is unclear whether the data in Figure 6C refers to the dependent or independent sites, since both seem to gain accessibility upon Zld depletion. I find this observation difficult to reconcile with the results presented in Figure 4 —figure supplement 1A,E that suggest that Zld depletion leads to an overall reduction of CLAMP binding. I would have expected then to observe a loss of accessibility, but not a gain. How do the authors explain this puzzling observation? In any case, since the gain in accessibility seems to be independent of CLAMP binding, since it occurs in both groups, can the authors be confident that this is biological and not due to technical differences between the libraries? What is the overlap between the sites that are reported here and those reported gaining accessibility in Schulz et al.?

17. The results in Figure 6D are also very difficult to interpret, especially given the limited effect of Zld depletion on CLAMP binding at 2-4 hours. How do the authors explain these results?

18. I don't understand the authors' reasoning for the statement in lines 490-492. The authors' own analysis shows that the overlap between the sets of downregulated genes at 2-4 hours is not better than one could expect by random chance (Figure 6F). How do the authors then conclude that there is co-regulation for hundreds of genes after ZGA?

Reviewer #2:

In the current manuscript under review, Duan et al. address the question of the role of GA-repeat binding factor CLAMP on the process of ZGA. The question of ZGA, particularly that of which pioneer factors establish patterns of chromatin accessibility and promote the expression of the first zygotic transcripts has received heavy attention in recent years. Notably, although another pioneer, Zelda, has a critical role for driving ZGA for a subset of zygotic genes by several measures, the vast majority of genomic locations either require a combination of Zelda and another factor, or another factor entirely. Several prior studies have pointed to enrichment for a GA-repeat motif within this class of sites. Identifying and characterizing the role of such a second maternal pioneer would represent a significant advance for the field as well as more broadly across biological fields as the question of pioneering touches on several key aspects of transcriptional regulation and epigenetics.

While Duan et al. present data that (1) CLAMP binds its motif even in the nucleosome-associated state; (2) CLAMP loss of function leads to some amount of reduced chromatin accessibility; (3) Some CLAMP and Zld DNA binding is interdependent; (4) Loss of CLAMP function affects gene transcription-- the manuscript in its current state is far from suitable for publication. My primary concern is that the data presentation of the genomics studies is extremely difficult to follow, that supporting data tables are either incompletely annotated or missing, in many cases it is nearly impossible to read the plot labels in the figures, and that the biological significance of the observations is not fully substantiated. In addition, certain controls have not been provided or even incorporated into experimental design. Also, there are issues with the presentation of the study, with factually incorrect statements and missing or unclear description of methods, and missing references (in some cases leading to factually incorrect statements).

This could be an important paper and it is therefore important that the presentation is watertight. I provide the comments below fully aware of current constraints on daily life, and in the spirit of wanting to minimize additional work for the authors. I think that overall the data already exists to improve the manuscript (or at least it should). But there is a fundamental question of whether the data are over-interpreted, and whether the effect of Clamp is as significant as the authors claim, at least within the framework of the process of ZGA.

1) The presentation of the genomics data analysis is very difficult to follow. I inspected the bigWig files for the ATAC data and had a hard time finding genomic regions where there is clear-cut evidence for CLAMP's role as a pioneer factor. Loading up the four ATAC conditions (two timepoints each control or clamp-i), as well as the Rieder CLAMP ChIP (NC14) and the 3h Harrison Zld ChIP-seq, I can find only a handful of regions where CLAMP has a clear all-or-nothing effect on chromatin accessibility, and these (few) sites are at regions where there is little Zld binding. These sites I did find by scrolling through nearly the entire genome are: 3' to CG11448, within iab-8, and possibly at the promoters of Vsx1 and 2. There are, however, numerous examples of regions where a 'differential enrichment' analysis could possibly yield a statistically significant difference between control and clamp-i, but there remains substantial accessible chromatin in the knockdown conditions. This latter phenomenon cannot be construed as evidence for pioneer activity, since it is expected that in the absence of the pioneer, the locus would be inaccessible. I am left with the question of whether the effect of Clamp on chromatin accessibility is oversold in this study.

The example regions plotted in the Figures also reveal potential issues in the analysis or interpretation of data: Figure 2B, CG11023: I had questions about what was going on in this plot which were cleared up by checking the bigWig files. For instance, I was curious why the light blue peak region indicator included regions with no ATAC signal in either control or clamp-i. Why also is there Clamp ChIP signal in this region? Upon inspection of the data, this plot shows base one of chr2L, and the blank region in the ATAC is presumably due (understandably) to mapping issues at the very telomeric end of the chromosome. Why is a peak called here? Why does the peak end within a peak of ATAC signal and not include this whole region? Significantly more concerning is that when I examine this region, my conclusion is that this whole region is likely very low signal that I would be reluctant to score both as "open" as well as "bound by Clamp". On the basis of this, I am reluctant to say that the bioinformatic analysis has been performed with sufficient rigor. Admittedly, this is based on one example image, but I would also point out that the authors have both only provided limited example regions, and have not provided a sufficiently documented 'peaks list' that includes regions that they feel are (1) bound by CLAMP, (2) bound by Zelda, (3) score as a member of the various groupings used to compare regions throughout the text (e.g. DA-Clamp bound, et cetera). The peaks list that the authors do provide is in a strange format and the column labels are not included in that file (nor can I find anywhere a description of that file, but I may have missed that in the submission materials). Nevertheless, it does not appear to indicate membership in any of the different classes from what I can tell.

It is similarly difficult to evaluate the conclusion that CLAMP has anything at all to do with ZGA (see below). Specifically, however, to the bioinformatics analysis: when RNAseq data is analyzed, is it limited to zygotic genes only (as defined either in DeRenzis 2007, or in the Li paper cited in the manuscript?), and is the magnitude of the effect large enough to warrant the conclusion that Clamp is required for ZGA? For comparison, loss of Zelda function results in near zero transcripts produced from a subset of zygotic genes (and corresponding elimination full stop of chromatin accessibility at those loci). I'm worried that the authors are placing too much weight on "significant" p-values without considering if the magnitude of the effect supports the stated conclusions. If the effect of clamp-i is minimal on transcription and chromatin accessibility, which it may be based on my limited examination of the raw data, I see no way to justify the conclusion that Clamp has any major role in ZGA.

I also have a difficult time finding any Zld-bound loci that convincingly show loss of accessibility in the clamp-i data.

Reviewer #3:

The manuscript submitted by Duan and Rieder et al. describes, for the first time, how the CLAMP transcription factor acts as a pioneer TF in the fly embryo. They demonstrate that CLAMP can bind to nucleosome-bound DNA and that it binds to and generates accessible chromatin at a set of gene promoters in the early embryo, and that without this activity these genes fail to be transcribed during ZGA. They further describe fascinating cooperativity between CLAMP and ZLD, a previously identified pioneer TF in the fly embryo. Their results are compelling and rigorous, and the work will be of broad interest to both the developmental biology and transcription biology fields.

One concern is Figure 2E. This scatterplot and correlation are not particularly convincing. The fact that the positive correlation is very minor needs to be emphasized properly in the text. Moreover, in our opinion, there is a better way of doing this. ATAC-seq and RNA-seq are very different assays, and as such the fold change upon depletion of a factor should not be expected to be correlated in magnitude between assays, only in direction. The dynamic range of change you can expect in RNA-seq is much greater than that in ATAC-seq because mRNA is much more abundant than its cognate DNA for transcribed genes. We think the authors should simply display a Venn diagram of genes/promoters that move in the same direction, i.e. what fraction of the genes have the same directionality of change. We do not think that comparing the magnitude of these changes is particularly useful or informative in this case.
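The direction-only comparison suggested here can be sketched as follows. This is a minimal illustration that assumes per-gene log2 fold changes are available as dictionaries keyed by gene name; the function names and the gene labels are hypothetical.

```python
def direction(value):
    """Sign of a fold change: 1 (up), -1 (down) or 0 (no change)."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0

def directional_concordance(atac_lfc, rna_lfc):
    """Count genes whose accessibility and expression change the same way.

    Only genes present in both inputs are compared; magnitudes are ignored.
    """
    counts = {"both_down": 0, "both_up": 0, "discordant": 0}
    for gene in sorted(set(atac_lfc) & set(rna_lfc)):
        a = direction(atac_lfc[gene])
        r = direction(rna_lfc[gene])
        if a == -1 and r == -1:
            counts["both_down"] += 1
        elif a == 1 and r == 1:
            counts["both_up"] += 1
        else:
            counts["discordant"] += 1
    return counts

# Made-up log2 fold changes for placeholder genes (illustration only).
atac = {"geneA": -0.4, "geneB": -0.2, "geneC": 0.3, "geneD": -0.1}
rna = {"geneA": -3.1, "geneB": 1.2, "geneC": 2.0, "geneE": -0.5}
print(directional_concordance(atac, rna))
```

The counts correspond to the regions of the Venn diagram the reviewers describe: the two concordant sets and the remainder.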

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your work entitled "CLAMP and Zelda function together as pioneer transcription factors to promote Drosophila zygotic genome activation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The reviewers have opted to remain anonymous.

We are sorry to say that, after consultation with the reviewers, we have decided that your work will not be considered further for publication by eLife. Although all the reviewers appreciated the improvement that the authors made since the last submission, reviewer #3 noticed major issues with the analysis of the sequencing data that affect the major conclusion. During consultation session among the reviewers and editors, others agreed that these are major issues, precluding the publication of your manuscript, at least in its current form. If these issues are addressed appropriately, we would be happy to re-consider the manuscript, but per eLife's policy that revision only be invited when the major conclusion is unlikely to change as a result of revision, we are declining the current version of the work.

Reviewer #1:

The authors have satisfactorily addressed all our previous concerns, and the resubmitted manuscript is much better written and easier to understand than the previous submission. We feel the additional experiments the authors performed are sufficient to address concerns raised in the first round of review. This manuscript demonstrates clearly that CLAMP is acting as a pioneer factor in Drosophila embryos, and there is cooperativity between CLAMP and ZELDA at certain gene promoters.

Reviewer #2:

The authors have satisfactorily addressed all my questions. In my opinion the manuscript has vastly improved with the new data and analysis, and I do not have additional questions about the work.

Reviewer #3:

In the revised manuscript, Duan and colleagues have addressed some of the issues that were raised upon the original review. The authors have generally improved the presentation of their data and have rendered the results easier to interpret. However, despite these improvements, upon inspection of the differential enrichment analysis, the magnitude of effect of Clamp on differential chromatin accessibility is significantly overstated.

It is appreciated that the authors re-sequenced some of their lower depth samples for the resubmitted version. It is also appreciated that the authors have now provided the annotated tables from the differential enrichment analysis. In my original review, I mentioned that I manually searched nearly the entire genome while struggling to find more than a few examples of convincing loss of ATAC signal in the clamp-i data. I have now reviewed the differential enrichment analysis ("Table 4-source data 2", referred to as "Table 1 Source Data 2" in the text (line 681)) and note the following issue:

- The authors appear to have relied not on the FDR but rather on individual, independently calculated p-values for reckoning the number of differentially accessible peaks in the differential enrichment analysis. Table 1 reports 76 "Up", 1675 "Down", and 9465 "None" effects on accessibility in clamp-i embryos versus control. In the supplied source data, it is clear that the authors set a p-value cutoff of 0.05 for calculating these numbers. What isn't mentioned in the text is that this cutoff corresponds to a 32% FDR. Typically, a 5% FDR rate is chosen to minimize incorrect rejections of the null hypothesis that arise due to multiple testing. Using this standard, the total number of differential peaks in the 0-2 hour comparison is only 95, with 73 sites showing a reduction, and 22 showing an increase. Again, a manual inspection of a sampling of these regions shows marginal differences in magnitude between the few regions that do pass significance testing at 5% FDR.

Even fewer regions are differentially accessible in the 2-4 hour sample at a 5% FDR (total = 54, 33 "down", 21 "up").
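The gap between a raw p-value cutoff and an FDR cutoff can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. The p-values below are invented for illustration and are not taken from the source data, and the function name `bh_adjust` is hypothetical; the authors' analysis used DiffBind, whose internals are not reproduced here.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values (illustrative assumption, not real peak statistics).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
q = bh_adjust(toy_p)

n_raw = sum(p < 0.05 for p in toy_p)  # sites passing the raw p-value cutoff
n_fdr = sum(x < 0.05 for x in q)      # sites passing a 5% FDR cutoff
```

In this toy set, fewer sites pass the 5% FDR threshold than the raw p < 0.05 threshold, which is the same kind of shrinkage described above.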

On the basis of this observation, it would be hard to argue that CLAMP is playing a major role in regulating chromatin accessibility at ZGA. To me, this substantially casts doubt on the central premise of this manuscript and in fact suggests that CLAMP has only a minor effect on accessibility at this time.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "CLAMP and Zelda function together to promote Drosophila zygotic genome activation" for further consideration by eLife. Your revised article has been evaluated by Kevin Struhl (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

The authors have made a concerted effort to address the reviewer's concerns, and save for the remaining minor issues below, the manuscript is suitable for publication. While the reanalysis of the data has led to the conclusion that Clamp does not alter chromatin accessibility at as many sites in as non-redundant a way as Zelda, the work does document an interesting and critical interplay of pioneer transcription factors in early embryonic development, and it begins to uncover the molecular underpinnings of that interplay. We think this work will be of broad interest and will help clarify how transcription factors act to establish chromatin accessibility and set up the first steps in early embryonic transcription regulation.

1) Clamp as Pioneer: the authors have convincingly shown that Clamp binds to nucleosomal DNA using gel shift assays, and this result alone is probably sufficient to call it a pioneer factor in our view. However, the authors have also convincingly shown that the scope of Clamp pioneering accessibility of chromatin is very small compared to Zelda, but that like Zelda, loss of function is catastrophic in terms of overall development. Any use of "pioneer-like" can be replaced with "pioneer". We also recommend that the authors carefully edit the Discussion to accurately describe the magnitude of Clamp's effect on accessibility, and to update the summation of results pending the outcome of points 2 and 3 below.

2) The reviewers agree that part of the new analysis presented in Figure 3 was not performed in an ideal manner to support the conclusions. The observation at line 245, for instance, is premature:

"Depletion of either maternal zld or clamp mRNA altered the genomic distribution of CLAMP and Zld: both factors shifted their occupancy from promoters to introns."

We request the authors either repeat this analysis more rigorously or eliminate the section entirely. The current analysis is performed by comparing independently called peak lists and placing emphasis on regions that are present or absent in each set. This approach is highly susceptible to thresholding artifacts associated with peak calling. All reviewers agree that a more rigorous approach would be either to perform this analysis on a single, union peak set followed by differential enrichment analysis, or coverage data between different treatments could be compared directly by generating XY-scatter plots of summed reads in each peak from a union peak list. If the conclusion of this section is correct, the genomic regions of interest should be significantly off the diagonal, and this can be statistically addressed.
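The union-peak comparison requested above can be sketched as follows. This is a minimal illustration under stated assumptions: the peak coordinates and read positions are invented, reads are reduced to single positions, no sequencing-depth normalization is applied, and the helper names (`union_peaks`, `summed_reads`, `log2_ratio`) are hypothetical rather than part of any published pipeline.

```python
import math


def union_peaks(*peak_lists):
    """Merge overlapping or abutting (chrom, start, end) intervals into one set."""
    merged = []
    for chrom, start, end in sorted(p for peaks in peak_lists for p in peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def summed_reads(peak, read_positions):
    """Count reads (chrom, position) inside a half-open peak interval."""
    chrom, start, end = peak
    return sum(1 for c, pos in read_positions if c == chrom and start <= pos < end)


def log2_ratio(control_count, treated_count, pseudocount=1.0):
    """Log2 ratio of treated over control; 0 means the peak lies on the diagonal."""
    return math.log2((treated_count + pseudocount) / (control_count + pseudocount))


# Toy inputs (illustrative assumptions only).
control_peaks = [("chr2L", 100, 200), ("chr2L", 500, 600)]
rnai_peaks = [("chr2L", 150, 250)]
control_reads = [("chr2L", 120), ("chr2L", 180), ("chr2L", 520),
                 ("chr2L", 530), ("chr2L", 540), ("chr2L", 550)]
rnai_reads = [("chr2L", 130), ("chr2L", 210), ("chr2L", 560)]

union = union_peaks(control_peaks, rnai_peaks)
ratios = [log2_ratio(summed_reads(p, control_reads), summed_reads(p, rnai_reads))
          for p in union]
```

Because both samples are counted over the same union intervals, a peak that merely fell below the calling threshold in one sample still receives a count, which is what protects this comparison from thresholding artifacts.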

3) The authors demonstrate that the knockdown efficiency of Zld RNAi is poor during the 2-4h timepoint (e.g. Figure 3, Fig. Supp. 1B). We caution the authors against drawing any strong conclusions about the effect of Zld on Clamp in the 2-4h time period. Please consider revising or eliminating the text beginning at line 262, where the weak effect of Zld on Clamp binding at 2-4 hours can possibly be attributed to incomplete knockdown.

4) For most of the heatmaps throughout the manuscript: the titles of the heatmaps incorrectly refer to "peaks", regardless of the data type presented in the heatmap. This can be confusing since the y-axis of the heatmap is some set of "peaks," and the data presented in the heatmap is ATAC-seq coverage or ChIP-seq coverage for a particular factor/genotype/timepoint. To improve readability, please revise heatmap plots to indicate the peak set on the y-axis, and relevant sample information in the header/title of each plot.

5) Paragraph beginning at line 341. Here, the authors are examining "gene expression changes caused by depleting maternal Zld at genes where CLAMP regulates Zld binding." The next sentence, however, talks about "genes where Zld regulates CLAMP binding." (Genes where "CLAMP regulates Zld binding" are never mentioned again.) This makes the logic of this paragraph difficult to interpret. Please revise.

6) In general, the reader has to work hard to clearly interpret the results section of this manuscript, particularly for Figures 4-5. Please consider editing the text related to Figures 4-5 for clarity.

eLife. 2021 Aug 3;10:e69937. doi: 10.7554/eLife.69937.sa2

Author response


[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Essential revisions:

Reviewer #1:

In this manuscript, Duan, Rieder and collaborators examine the potential role of the transcription factor (TF) CLAMP as a pioneer transcription factor and potential co-regulator of zygotic genome activation (ZGA) in Drosophila. First the authors perform an in vitro characterisation of the biochemical properties of CLAMP with respect to being able to bind nucleosomal DNA and the effect of the depletion of the protein in regulating transcription. The authors then perform ATAC-seq, and ChIP-seq in embryos depleted of CLAMP or Zelda (a known pioneer transcription factor in Drosophila, which is a master regulator of ZGA) to examine their binding patterns in relation to each other and how these affect chromatin accessibility during early embryonic development before (0-2 hours post fertilisation (hpf)) and after (2-4 hpf) ZGA.

The question is of interest since previous reports have described an enrichment of a GA-rich motif similar to that recognised by CLAMP in open chromatin at developmental time points around ZGA, posing the question as to whether CLAMP might be involved in this developmental transition.

Unfortunately, I find the manuscript very densely written and very difficult to follow. It has taken me much longer than anticipated to review the work since the concepts are not clearly introduced, and the description and interpretation of the data are confusing. Despite the dense writing, the manuscript lacks critical information regarding how analyses were done, which precludes a proper evaluation of the conclusions. For the parts of the manuscript where this is possible, it is unclear whether some of the analytical strategies are the most adequate to address the question, and whether the data support the authors' conclusions.

1. The manuscript contains no information regarding the phenotypic effect of CLAMP depletion (or of Zelda depletion, which would serve as control). Are these embryos healthy? Do they stop developing? Are they viable at least until ZGA?

We thank the reviewer for this important comment. We have performed new experiments to assess the phenotypic effect of clamp depletion. Overall, the phenotypes caused by the maternal depletion of clamp are very similar to those caused by the maternal depletion of zld including failed cellularization. Embryos stall their growth at the blastoderm stage when maternal clamp is depleted.

Furthermore, we performed smFISH or immunostaining to localize early patterning genes including eve, run, and Nrt before and after depletion of clamp or zld. We chose to measure the localization patterns of these three target genes because they showed significantly reduced expression in RNA-seq experiments. Localization of each early patterning gene was disrupted after depletion of maternal clamp.

We have added additional text regarding this phenotypic effect in the introduction, methods, and main text. The new imaging data are in Figures 1A-B.

Here is a summary of our new results:

A) By comparing maternal depletion of CLAMP and ZLD, we found that the phenotypes caused by the maternal depletion of clamp are very similar to those caused by the maternal depletion of zld, including failed cellularization. Furthermore, embryos stall at the blastoderm stage when maternal clamp is depleted.

B) We characterized the expression pattern of the pair-rule genes eve and run, as well as the distribution of the cell adhesion glycoprotein Nrt because they showed significant expression reduction in RNA-seq data obtained after depleting maternal clamp. These results (Figure 1) also support the conclusion that the loss of maternal clamp in the early embryo is catastrophic because the seven segmentation stripes formed by the pair-rule genes fail to form.

How pure are the embryo collections? What is the proportion of embryos further than nuclear cycle 13,14 in the 0-2 hpf collections?

Thank you for the question about embryo collection which is a key issue. The embryo collection was precisely timed. To ensure that female flies laid all older embryos before we started our collections, we first starved flies for 2 hours in empty cages and discarded these first grape agar plates (Plate set #0).

When we collected embryos for the experiments, we put flies in the cages with grape agar plates including yeast paste to promote egg laying for 2 hours (Plate set #1). Next, at the two-hour time point, we replaced the first plates with a set of new plates (Plate set #2).

We kept Plate set #1 embryos (without any adult flies) to further develop for another 2 hours to obtain 2-4hr embryos. At the same time, we obtained newly laid 0-2hr embryos from Plate set #2. Therefore, this strategy successfully prevented the potential cross contamination of 0-2hrs embryos (Plate set #2) with the 2-4hr embryos (Plate set #1).

The description for embryo collections was added to Materials and methods.

“Embryo collections

To optimize egg collections, young (5-7 day old) females and males were mated.

To ensure mothers do not lay older embryos during collections, we first starved flies for 2 hours in the empty cages and discarded the first 2-hour grape agar plates with yeast paste (Plate set #0). When we collected eggs for the experiments, we put flies in the cages with grape agar plates (Plate set #1) with yeast paste for egg laying for 2 hours. Then, we replaced Plate set #1 with a new set of plates (Plate set #2) at the 2hr time point. We kept Plate set #1 embryos (without any adult flies) to further develop for another 2 hours to obtain 2-4hr embryos. At the same time, we obtained newly laid 0-2hr embryos from Plate set #2. Therefore, this strategy successfully prevented cross contamination between 0-2hr embryos (Plate set #2) and the 2-4hr embryos (Plate set #1).”

In addition, the validation of the knockdowns is buried in Figure 3 —figure supplement 1C,D, and is not convincing. This is a critical piece of information which is necessary to interpret the results for all remaining figures in the manuscript.

Thank you for bringing up this important concern. Both clamp and zld RNAi knockdown lines which we used were previously validated by multiple publications as described below. Furthermore, our qPCR results support efficient and specific knockdown of both factors. Unfortunately, the ZLD western blot we reported is the best one we obtained after multiple attempts. We have inquired with several laboratories but have been unable to obtain an aliquot of a ZLD antibody that works well on a western blot.

A) The zld RNAi line was made and published by the Rushlow lab (Sun et al., 2015; Yamada et al., 2019) and has been previously validated. See Methods in Sun et al., 2015: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4617966/#s3title

B) The clamp RNAi line has also been used previously and validated by Rieder et al. (2017). Below is the western blot from Figure 5A in Rieder et al. (2017) for the same stage embryos:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5588930/figure/RIEDERGAD300855F5/

We added the following information regarding validation of fly lines to the Materials and methods section:

“Fly stocks and crosses

To deplete maternally deposited clamp or zld mRNA throughout oogenesis, we crossed a maternal triple driver (MTD-GAL4, Bloomington, #31777) line (Ni et al., 2011) with a Transgenic RNAi Project (TRiP) clamp RNAi line (Bloomington, #57008), a TRiP zld RNAi line (from C. Rushlow lab) or egfp RNAi line (Bloomington, #41552). The egfp RNAi line was used as the control in smFISH, immunostaining, and imaging experiments. The MTD-GAL4 line alone was used as the control line in ATAC-seq and ChIP-seq experiments. Briefly, the MTD-GAL4 virgin females (5-7 days old) were mated with TRiP UAS-RNAi males to obtain MTD-Gal4/UAS-RNAi line daughters. The MTD drives RNAi during oogenesis in these daughters: therefore, the targeted mRNA will be depleted in their eggs. Then MTD-Gal4/UAS-RNAi daughters were mated with males to produce embryos with depleted maternal clamp or zld mRNA targeted for ATAC-seq and ChIP-seq experiments. The embryonic phenotypes of the maternal zld- TRiP RNAi line were confirmed previously (Sun et al., 2015). Maternal clamp- embryonic phenotypes of the TRiP clamp RNAi line were confirmed by immunofluorescent staining in our study (Figure 1A-B). Moreover, we validated CLAMP or ZLD protein knockdown in early embryos by western blotting using the Western Breeze kit (Invitrogen) and measured clamp and zld mRNA levels by qRT-PCR (Figure 1 —figure supplement 1B, C).”

2. The authors claim that CLAMP is a pioneer transcription factor that directly activates zygotically-transcribed genes (line 140), but in my opinion, this is not demonstrated in a convincing manner. First, the western blot presented in Figure 3 —figure supplement 1D doesn't allow for an evaluation of the presence of Zld at the protein level in CLAMP depleted embryos (see point 1 above).

Thank you for this comment. As addressed above, the ZLD Western blot is unfortunately not possible to redo due to lack of antibody availability from several labs.

Furthermore, the analysis presented in Figure 1 —figure supplement 1A, shows that there is a widespread change in expression for zygotically expressed genes upon CLAMP depletion irrespective of the level of CLAMP binding, suggesting that overall the observed changes correspond to indirect effects.

Thank you for bringing up the key issue of direct and indirect effects. Similar to the majority of transcription factors, CLAMP does not have a direct effect on transcription at all of its targets. Furthermore, in the absence of CLAMP, GAF can bind to some CLAMP target sites (Kaye, Cell Reports 2018; Rieder, 2017). Therefore, we would not expect a direct linear relationship between CLAMP occupancy and its impact on gene expression.

However, the density plot in the current Figure 5-Supplementary Figure 1A does show that increased binding of CLAMP increases the chance that a gene will have its expression reduced in the absence of CLAMP.

Similar trends, but again not a perfect correlation, were observed in an analysis of Zelda occupancy and Zelda-dependent gene expression (Figure 2A-B in Harrison et al., 2011: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002266).

As suggested by Reviewer #3, we have replaced the density plot with a violin plot to improve the data visualization (current Figure 5B-C).

3. The manuscript doesn't contain critical information regarding how the ATAC-seq data were analysed. Did the authors use only short/long/all fragments for their analyses? This is important since one can then evaluate whether the authors are measuring open chromatin (short fragments) or nucleosome positioning (long fragments).

We apologize for not providing more detailed information describing the ATAC-seq analysis methods. We have now included a full description in Materials and Methods and provided all ATAC-seq analysis code as Figure2-supplementary file 1.

“ATAC-seq analysis

Detailed ATAC-seq analysis pipeline code is provided as Figure2-supplementary file 1. Prior to sequencing, the Fragment Analyzer showed that the library top peaks were in the 180-190bp range, which is comparable to the embryo ATAC-seq online protocol (Haines, 2017). Demultiplexed reads were trimmed of adapters using TrimGalore (Krueger, 2017) and mapped to the Drosophila genome dm6 version using Bowtie2 (v. 2.3.0) with option -X 2000. We used Picard tools (v. 2.9.2) and SAMtools (v.1.9, Li et al., 2009) to remove the reads that were unmapped, failed primary alignment, or duplicated (-F 1804), and retain properly paired reads (-f 2) with MAPQ >30. After quality trimming and mapping, the Picard tool reported that the mean fragment size for all ATAC-seq mapped reads was between 125 and 161bp. As expected, we observed three peaks: (1) a sharp peak at <100 bp (open chromatin); (2) a peak at ~200bp (mono-nucleosome); and (3) other larger peaks (multi-nucleosomes). After mapping, we used SAMtools to select fragments <= 100bp within the bam file to focus on open chromatin. Peak regions for open chromatin regions were called using MACS2 (version 2.1.1, Zhang et al., 2008). The peak numbers were summarized in Table 1 and peaks in each group were reported in Table 1 -Table supplementary 1. ENCODE blacklist was used to filter out problematic regions in dm6 (Amemiya et al., 2019). DiffBind with the DESeq2 method (v. 3.10, Stark and Brown, 2019) was used to identify differentially accessible regions. The DA and non-DA numbers were summarized in Table 1 and DA peaks were reported in Table 1 -Table supplementary 2.

We used DeepTools (version 3.1.0, Ramírez et al., 2014) to generate enrichment heatmaps (CPM normalization) and average profiles were generated in DeepStats (Gautier RICHARD, 2020). Bigwig files generated by DeepTools bamCoverage (CPM normalization) were uploaded to IGV (Robinson et al., 2011) for genomic track visualizations (Figure 2B). Then, we used Homer (v 4.11, Givler and Lilienthal, 2005) for de novo motif searches (Figure 2D). Visualizations and statistical tests were conducted in R (R Core Team, 2014). Specifically, we annotated peaks to their genomic regions using the R package ChIPseeker (Figure 2C, Yu et al., 2015) and we did gene ontology enrichment analysis using clusterProfiler (Yu et al., 2012). Boxplots and violin plots were generated using the ggplot2 (Wickham, 2009) package. Intervene (Khan and Mathelier, 2017) was used for intersection and visualization of multiple peak region sets (Figure 2E).”
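The read-filtering criteria described in this Methods text can be expressed as a small predicate. This is a minimal sketch, assuming each alignment is represented only by its SAM FLAG, MAPQ, and TLEN fields; the function name `keep_read` is hypothetical, and the authors' actual pipeline used SAMtools and Picard rather than this code.

```python
# SAM flag bits excluded by "-F 1804":
# 0x4 read unmapped, 0x8 mate unmapped, 0x100 not primary alignment,
# 0x200 fails quality checks, 0x400 PCR or optical duplicate.
EXCLUDE_FLAGS = 0x4 | 0x8 | 0x100 | 0x200 | 0x400
PROPER_PAIR = 0x2  # bit required by "-f 2"


def keep_read(flag, mapq, tlen, min_mapq=30, max_fragment=100):
    """Return True if an alignment passes the filters described in the text."""
    if flag & EXCLUDE_FLAGS:
        return False
    if not flag & PROPER_PAIR:
        return False
    if mapq <= min_mapq:  # the text states MAPQ > 30
        return False
    # TLEN is signed in SAM; its absolute value is the fragment length.
    return 0 < abs(tlen) <= max_fragment


# Usage: a properly paired first-in-pair read (flag 99) from an 80 bp fragment.
example_kept = keep_read(flag=99, mapq=42, tlen=80)
```

Restricting to fragments of at most 100 bp keeps the sub-nucleosomal reads that report on open chromatin and drops the mono- and multi-nucleosome fragments.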

In addition, how are differentially accessible regions calculated? Is the genome binned? If so, how big are these bins? I am not sure that I fully understand how this analysis is done, and I don't know what "log concentration" refers to in Figure 2A, but I find difficult to reconcile the close to zero correlation values reported for the ATAC-seq datasets in the MTD and CLAMP depleted samples when the majority of dots in panel 2A are not different between the two conditions.

The differentially accessible regions were called using the DiffBind DEseq2 method, which is commonly used in ATAC-seq data analysis: https://rockefelleruniversity.github.io/RU_ATAC_Workshop.html

The genome is not binned in this method, but in the read counting step, DiffBind centers the peak at summit and extends 50bp upstream and downstream of the summit to count reads in a 100bp window.
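The summit-centred counting window described here can be sketched as below. This is an illustration only: the half-open interval convention and the helper names are assumptions, and DiffBind's own handling of window boundaries may differ in detail.

```python
def summit_window(summit, flank=50):
    """Return a half-open [start, end) window of 2 * flank bp around a summit."""
    return max(0, summit - flank), summit + flank


def count_in_window(summit, read_positions, flank=50):
    """Count read positions that fall inside the summit-centred window."""
    start, end = summit_window(summit, flank)
    return sum(1 for pos in read_positions if start <= pos < end)


window = summit_window(1000)
n_reads = count_in_window(1000, [940, 950, 999, 1049, 1050])
```

Counting in a fixed-width window makes read counts comparable between peaks of different called widths.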

The “log concentration” was automatically generated by DiffBind. Here is a detailed explanation from the software developer:

“For the X axis, "concentration" refers to the mean (normalized) number of reads across all the samples for that binding site. This is reported as a log2 value, so as you go from left to right, the overall binding affinity (read density) is doubling.

The dark spot near the origin is a cluster of sites that have very low read counts and also do not change significantly (Figure 2A). The main dark region shows sites with increasing binding activity (high x-axis values) that do not change significantly between conditions (y-axis value close to 0). Both of the dense blue areas are shifted slightly below a fold change of 0 (y-axis), indicating a tendency to see more reads in the MTD group.

The pink points are "significantly differentially bound/accessible" sites. The absolute values of the fold changes are greater than 2 because the y-axis is also on a log2 scale indicating at least a 4-fold change in binding affinity. The red dots on the outer diagonal lines are sites that have no binding in one condition and substantial binding in the other condition.”
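The two axes described in this explanation can be illustrated with a short calculation. This is a simplified sketch: the read counts and the pseudocount are invented, and DiffBind applies its own normalization before computing these quantities.

```python
import math


def ma_coordinates(control_reads, treated_reads, pseudocount=1.0):
    """Return (log2 mean concentration, log2 fold change) for one site."""
    c = control_reads + pseudocount
    t = treated_reads + pseudocount
    concentration = math.log2((c + t) / 2)  # x-axis: log2 of the mean read count
    fold = math.log2(t / c)                 # y-axis: log2(treated / control)
    return concentration, fold


# Toy site: 63 reads in control, 15 after depletion (illustrative numbers).
conc, fold = ma_coordinates(63, 15)
linear_change = 2 ** abs(fold)  # a log2 fold change of 2 is a 4-fold change
```

A one-unit step along either axis is therefore a doubling, which is why an absolute log2 fold change above 2 corresponds to at least a 4-fold difference.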

To make the plot easier to understand, we changed the x- and y-axis labels of Figure 2A to “MTD ATAC-seq (CPM)” and “Clamp-i ATAC-seq (CPM)”.

4. The genomic track plot presented in Figure 2B doesn't help in the interpretation of the data. First, the panel is missing the genomic coordinates, so one cannot determine what are the peaks that are presented in the figure. Second the panel shows multiple CLAMP binding sites that don't seem to present an open chromatin signature, suggesting that only a fraction of the binding is related to open chromatin in the control samples. It would be interesting, and, in my opinion, much more straightforward to test, how much of the binding that leads to open chromatin (irrespective of whether this would be scored as differential accessible regions, which might be confounded for the reasons outlined in points 1 and 2 above) occurs together, for example, with Zld binding.

We thank the reviewer for the suggestion; we have updated Figure 2B with all the information you suggest.

You are correct that not all CLAMP binding sites are involved in opening chromatin. For example, CLAMP binding peaks associated with introns are less likely to be involved in opening chromatin than those at promoters (Figure 2C). Moreover, we have integrated CLAMP and ZLD ChIP-seq and ATAC-seq peak types to define the relationship between CLAMP binding, ZLD binding and their ability to open chromatin (Figure 3A-B). Our major conclusion about the relationship between CLAMP and ZLD occupancy is that they often function together to open chromatin at promoters, where they regulate each other’s occupancy (Figure 2C, 3A-B and 4F).

5. I find the scatterplots and the regression analysis presented in Figure 2E very unconvincing. Although statistically significant, the correlations are extremely weak, and if the authors would compute R^2 this would be extremely low, highlighting that the majority of the variance observed in these measurements remains unaccounted for. So overall, in my opinion, most of the authors' conclusions in Figure 2 are not supported without further analysis.

We thank the reviewer for this point. We agree with Reviewer #3, whose statement is supported by the literature: “the ATAC-seq and RNA-seq are very different assays and as such it is to be expected that the fold change upon depletion of a factor should not be expected to be correlated in magnitude between assays, only the change in direction.” We therefore replaced the original scatter plot with a Venn diagram which shows the significant overlap between genes where CLAMP opens chromatin (ATAC-seq) and genes which are activated by CLAMP (mRNA-seq) (current Figure 2E).

6. I find the analytical strategy in Figure 3A-C inappropriate and difficult to understand. First, the authors use average plots in which the number of regions differs by orders of magnitude across the comparisons. The manuscript does not contain specific details regarding how these plots are produced, but these can be severely affected by outliers, especially in the cases when the number of regions is low. Second, the authors define four classes of regions, depending on whether they are bound or not bound by the factor and differentially accessible, and a control region (not bound and not differentially accessible). The split in these classes is very confusing and difficult to follow throughout the rest of the manuscript. In my opinion, this might be easier to follow if the data were presented as heatmaps including the common set of regions then split into different classes.

We thank the reviewer for this comment. We chose to use average profiles because they have been used previously to study the role of Zelda in regulating chromatin accessibility. For example, the previously published work on Zelda regulating chromatin accessibility near its binding sites also divided peaks into 4 groups in the same way we reported (Figure 2). Schulz et al., 2015:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4617967/

We apologize for not providing details of the analysis in the methods. As reviewers requested, we have now provided all of our code as a supplementary file.

Thank you for the suggestion to generate heatmaps instead of average profiles. We have now replaced most average profiles plots with heatmaps to allow the better analysis of the distribution of patterns at individual loci (Figure 3).

Of note, the authors use comparisons across the average plots to "validate" (line 231) or "confirm" (line 243) their analysis. I disagree that this is validation because the groups have been chosen presumably based on thresholding the same data, and a validation could only come from orthogonal data, which is not used here.

Thank you for pointing out that we should revise our word choice to be more accurate. We have revised the “validate” and “confirm” to “revealed” and “correspond.”

If the authors want to use average plots, these should at least include a shaded area representing a confidence interval. Only one of the lines in Figure 6D shows such a shading, although what this represents is not explained. This significantly affects the majority of analyses and interpretations in Figures 3, 4, 5, and 6.

Thank you for pointing this out. All average plots were generated in DeepTools and DeepStats, using “--plotType se” to plot the average line with a confidence interval (see code for plot). However, when numbers of similar sites are large, it is not possible to see the shading because the confidence intervals are very small. The shading can be seen when there are fewer sites (e.g. Figure 6D: 30 sites). As we describe above, we have replaced most average profiles with heatmaps to provide more insight into the distribution of individual sites.
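Why the shaded band becomes invisible for large site sets can be seen from how the standard error scales with the number of sites. The sketch below uses invented signal values and assumes that the band reflects the standard error of the mean, which is what the `se` plot type denotes in deepTools; it is not the authors' plotting code.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Two toy site sets with the same spread but different sizes.
few_sites = [1.0, 3.0] * 15      # 30 sites
many_sites = [1.0, 3.0] * 1500   # 3000 sites

se_few = standard_error(few_sites)
se_many = standard_error(many_sites)
```

With a hundred times more sites and the same spread, the band is roughly ten times narrower, so it can vanish behind the plotted line.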

7. There is a significant difference in the shape of the average profiles for the ATAC-seq data for the fourth group in both comparisons in Figure 3A. This is meant to be the control group in both cases, of non-DA, non-bound peaks. Since the controls are qualitative different between the CLAMP and Zelda depleted experiments, I wonder whether strong conclusions can be obtained from the comparisons of the other average profiles. In this respect, the p-values in Table 1 need to be corrected for multiple testing. Under Bonferroni correction, the p-value of the overlap between DA Zld-bound and DA CLAMP-bound is not significant, which is in disagreement with the authors' conclusions (line 247-248). This also affects the motivation for the analysis in Figure 4 (lines 312-313).

Thank you for pointing out that the shape of the ATAC-seq data average profiles differed among the four groups in our previous analysis. Now, we have selected fragment sizes which are all less than 100bp, so we did not compare the shapes of these data. Moreover, we have replaced all average profiles in the previous manuscript with heatmaps for better data interpretation.

We apologize for not being clear in describing the method used in the p-value calculation in Table 1 (now Table 4).

Briefly, the problem of gene overlap testing can be described by a hypergeometric distribution where one gene list A defines the number of white balls in a jar and the other gene list B defines the number of white balls that are drawn from a jar. Assume the total number of genes is N, the number of genes in list A is A and the number of genes in list B is B. The intersection between A and B is k.

Therefore, we calculated the probability of having k or more genes in the intersection:

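library(gmp)  # provides chooseZ() for exact big-integer binomial coefficients

# Hypergeometric tail probability of an overlap of k or more genes.
# N: total number of genes; k: size of the intersection.
# As written, k is added to A and B below, so the function treats A and B
# as the numbers of genes unique to each list (excluding the intersection).
enrich_pvalue <- function(N, A, B, k)
{
  m <- A + k          # full size of list A
  n <- B + k          # full size of list B
  i <- k:min(m, n)    # overlap sizes at least as large as the observed one
  as.numeric(sum(chooseZ(m, i) * chooseZ(N - m, n - i)) / chooseZ(N, n))
}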

We added the code for the calculation in the supplementary ATAC-seq pipeline files.
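For readers who want to check the calculation outside R, the same tail probability can be sketched in Python. This is an illustrative port, not the authors' supplementary code; it mirrors the function as written, including the step that adds k to A and B, and uses exact integer arithmetic from the standard library. The numbers in the usage example are invented.

```python
from fractions import Fraction
from math import comb


def enrich_pvalue(N, A, B, k):
    """Probability of an overlap of k or more genes under the hypergeometric model.

    Mirrors the R function: k is added to A and B to give the list sizes m and n.
    """
    m = A + k
    n = B + k
    favourable = sum(comb(m, i) * comb(N - m, n - i)
                     for i in range(k, min(m, n) + 1))
    return float(Fraction(favourable, comb(N, n)))


# Toy example: 10 genes in total, lists of size 3 and 3 sharing 1 gene.
p_toy = enrich_pvalue(N=10, A=2, B=2, k=1)
```

In the toy case the result equals 1 - C(7,3)/C(10,3) = 85/120, the chance that two random 3-gene lists drawn from 10 genes share at least one gene.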

8. Figure 3D is misleading since the scales in both plots are different and do not let the reader appreciate the extent of the changes for the different classes. It might help with the visualisation if the authors would use a violin plot, or if they would plot the data without the outliers and including notches for the boxes. In my opinion, the results in Figure 3D indicate that there is a downregulation of gene expression at CLAMP-bound genes irrespective of changes in chromatin accessibility. This would challenge the authors' own conclusions of a pioneer role for CLAMP at those sites that don't change accessibility. It is therefore unclear how the authors arrive at the conclusions in lines 294-296.

Thank you for pointing this out. We have now taken this figure out and moved all of the transcription data to the new Figure 5. Also, we plotted the relationship between CLAMP binding and transcription reduction after CLAMP depletion in a violin plot as you suggested. Statistical analysis of these plots supports our conclusions that there is a greater chance that a gene will be downregulated by CLAMP depletion if it is more strongly bound by CLAMP.

9. Does the list of motifs included in Figure 4A contain all the set of significant motifs found in these regions? Without this information, it is not possible to evaluate the statement in lines 319-320.

We thank the reviewer for this comment. Figure 4A only shows the top 3 most significant motifs found in these regions. The Homer software reported a long list of motifs and therefore we only showed the top three most significant motifs consistent with previous reports (Schulz et al., 2015). After generating heatmaps, we have performed new analysis combining motif calls and heatmaps where we include the top three most significant motifs for each cluster of sites (current Figure 3A).

Furthermore, this can be due to the fact that there is a significantly lower number of DA CLAMP-bound in 2-4 hours compared with the 0-2 hours data, which might affect the significance of these enrichments. In my opinion, a better alternative would be to show the number of Zelda events that occur in these regions. Without this information, the conclusion stated in this paragraph (lines 320-322) is not supported.

We thank the reviewer for this suggestion. To address this important concern, we have performed a new heatmap visualization of these data which shows both CLAMP and Zelda binding at clusters of sites where chromatin accessibility is regulated by each factor (Figure 3).

10. The authors state that the majority of Zelda binding sites are not affected upon CLAMP depletion (line 345). However, I find this statement puzzling, since Figure 4 —figure supplement 1A shows a significant effect in the level of Zelda binding upon CLAMP KD throughout most of the regions shown in the heatmap.

Thank you for this comment. However, this statement was based on the result of statistical analysis identifying significant differential binding of Zelda in the absence of CLAMP across replicates. The Diffbind DEseq2 method reported 274 (0-2hr) and 1,289 (2-4hr) statistically significant down-DB sites where Zelda binding decreased in the absence of CLAMP compared to MTD controls. Diffbind identified many more nonsignificant binding sites (0-2h: 3,144; 2-4h: 5,672). In this way, we determined that the majority of Zelda binding sites were not affected by CLAMP depletion in a statistically significant way.

To avoid confusion, we removed this specific statement from the text.
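The proportions implied by the counts quoted above can be checked with a short calculation. This is only an illustrative sketch: it assumes the down-DB and nonsignificant counts together make up the set being compared (any up-DB sites are not reported in the response and are ignored here), and the function name is hypothetical.

```python
# Counts quoted in the response above (DiffBind output).
counts = {
    "0-2hr": {"down_db": 274, "non_significant": 3144},
    "2-4hr": {"down_db": 1289, "non_significant": 5672},
}


def down_db_fraction(down_db, non_significant):
    """Fraction of down-DB sites among down-DB plus nonsignificant sites.

    Assumption: these two categories form the whole comparison set.
    """
    return down_db / (down_db + non_significant)


for stage, c in counts.items():
    frac = down_db_fraction(c["down_db"], c["non_significant"])
    print(f"{stage}: {frac:.1%} of these Zelda sites are down-DB")
```

Under that assumption, the down-DB sites are a minority of the sites considered at both time points, which is the sense in which "the majority" was used.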

11. Related to the point above, it would be useful if the authors could clarify how the data presented in Figure 4 —figure supplement 1A and Figure 4 —figure supplement 1D are different. Also, please note that this figure is missing the labelling of the x-axis. This is also the case for Figure 4 —figure supplement 1E.

We thank the reviewer for this comment:

1) Figure 4 —figure supplement 1A shows all peaks ranked by their binding intensity compared with the input control.

2) Figure 4 —figure supplement 1D shows peaks plotted separately for upregulated, down-regulated, and non-significantly bound peaks/regions.

We apologize for missing information in the plot. All plots have been remade and the missing labels were added.

12. The conclusions reached by the authors in lines 354-356 regarding the effect of Zld in CLAMP binding are not supported since, as the authors acknowledge, the experimental design is confounded by the up-regulation of Zld in the 2-4 hours time point.

Thank you for this comment. Our conclusion regarding the effect of maternally deposited Zelda on CLAMP binding is supported at 0-2hr because 390 peaks were identified as differentially regulated by DiffBind DESeq2.

We acknowledge in the manuscript that Zelda levels recover due to the expression of zygotic Zelda. However, the focus of this study is the maternal effects of both proteins.

We did remove this statement to avoid causing confusion.

13. I am unable to understand the interpretation of the results presented in Figure 4F and 4G. In any case, the results in Figure 4G might be confounded by the increased in Zld expression at 2-4 hours, as mentioned by the authors before.

We apologize for the data complexity in combining ATAC-seq and ChIP-seq analysis.

We have updated the analysis by generating a heatmap as you suggested (Figure 3). The heatmap makes the different classes of ATAC-seq and ChIP-seq binding sites much easier to interpret and supports our conclusions.

14. Figure 5A lacks genomic coordinates, which makes it impossible to interpret the plot. In addition, the scales of the signal are also not readable, which makes it impossible to evaluate the robustness of the binding represented and the comparison.

We apologize for the missing information on the figure. We have added the genomic coordinates to the plot and remade the figure in Illustrator to improve the readability of the labels.

Related to this figure, does the difference in peak size depend on the number of individual binding events?

Thank you for this question. The difference in peak size could be associated with the number of individual binding sites, since the broader peak regions contain multiple motifs. Furthermore, we have previously reported that broader CLAMP ChIP-seq peaks correlate with the presence of multiple clustered motifs (Soruco et al., 2013).

I am unable to follow the results presented in Figure 5 —figure supplement 1E,F, and the interpretation of the results of Figure 5E.

Thank you for this statement.

1) Figure 5 and Figure S1E and F: These plots show the number of protein binding motifs in Down-DB and Non-DB sites when these sites are defined based on the depletion of the other protein.

A) We found that Zelda motifs are more significantly enriched at sites where CLAMP occupancy is decreased after depleting Zelda.

B) Similarly, CLAMP motifs are significantly enriched at sites where CLAMP is required for Zelda binding.

Therefore, we conclude that CLAMP and Zelda regulate each other’s binding via directly binding to their own binding sites because the motifs for the required protein are enriched. This supports the conclusion that CLAMP and Zelda directly regulate each other’s binding. To avoid confusion, we removed this part of our analysis in the results.

2) Figure 5E (current Figure 5D): We apologize for the data complexity which results from combining ChIP-seq and RNA-seq analysis but it is important to integrate these two data types. Here, we plot transcription data for genes where TF occupancy is altered by the reduction of the other TF.

Briefly, the y-axis shows the reduction in CLAMP target gene expression after depleting Zelda (log 2 fold change compared to control). The x-axis shows differentially bound and non-differentially bound genes for 0-2hr (left) and 2-4hr (right) embryos. We found a significant reduction of CLAMP target gene expression after Zelda depletion only in the differentially bound (down-DB) group at the 0-2hr time point. Therefore, genes where Zelda impacts CLAMP binding show altered target gene expression.

15. I find it difficult to understand the statement in lines 458-459 because I do not understand what is the nature of the interdependent relationship between Zld and CLAMP binding to chromatin.

Thank you for this statement. As suggested by Reviewer #3, we changed “interdependent” to “cooperative” throughout the manuscript. In our model, CLAMP and Zld help each other bind to the promoters of target genes, which regulates target gene expression. Many of these target genes encode DNA binding transcription factors which drive genome activation.

16. It is unclear whether the data in Figure 6C refers to the dependent or independent sites, since both seem to gain accessibility upon Zld depletion. I find this observation difficult to reconcile with the results presented in Figure 4 —figure supplement 1A,E that suggest that Zld depletion leads to an overall reduction of CLAMP binding. I would have expected then to observe a loss of accessibility, but not a gain. How do the authors explain this puzzling observation?

Thank you for bringing up this important point. One possible explanation could be technical and related to the ATAC-seq technology: the increased accessibility of these regions upon Zelda depletion could be because Zelda is simply stably bound at these sites which inhibits access of the Tn5 enzyme used in ATAC-seq. Removing Zelda makes these sites more accessible for transposition. We have added this statement to the Results section:

“Interestingly, the accessibility slightly increases upon the loss of ZLD in CLAMP downDB at 0-2hr (Figures 5E). An active TF binding to DNA could prevent Tn5 cleavage at those regions (Yan et al., 2020). Therefore, loss of ZLD and CLAMP binding could result in a perceived accessibility increase, as measured by ATAC-seq.”

In any case, since the gain in accessibility seems to be independent of CLAMP binding, since it occurs in both groups, can the authors be confident that this is biological and not due to technical differences between the libraries? What is the overlap between the sites that are reported here and those reported gaining accessibility in Schulz et al.?

Thank you for this question. Please note that we did not generate libraries for the ATAC-seq data from zld-i embryos. These data were obtained from Hannon et al. (2017). In their processed data, there are sites which show significant differential accessibility in both directions upon Zelda germline depletion.

17. The results in Figure 6D are also very difficult to interpret, especially given the limited effect of Zld depletion on CLAMP binding at 2-4 hours. How do the authors explain these results?

Thank you for this question. In Figure 6D, we focused on peaks that show differential binding. Although the effect of ZLD depletion on CLAMP binding is modest, we can still capture the profile for the most significant peaks (n=30). Chromatin accessibility was decreased by depleting Zelda at regions where CLAMP occupancy is also significantly reduced. These data support a model in which Zelda increases chromatin accessibility to promote CLAMP recruitment.

18. I don't understand the authors reasoning for the statement in lines 490-492. The authors' own analysis shows that the overlap between the set of downregulated genes at 2-4 hours is not better than one could expect by random chance (Figure 6F). How do the authors then conclude that there is co-regulation for hundreds of genes after ZGA?

Thank you for this question. The p-value does not show significance of the overlap between downregulated genes at 2-4hr compared to 0-2hr. However, we did not want to ignore the 373 overlapping genes by concluding that Zelda and CLAMP become independent of each other after ZGA. We removed this plot and replaced it with the current Figure 5, which includes all of the transcription-related analysis in the new version of the manuscript.

Reviewer #2:

In the current manuscript under review, Duan et al. address the question of the role of GA-repeat binding factor CLAMP on the process of ZGA. The question of ZGA, particularly that of which pioneer factors establish patterns of chromatin accessibility and promote the expression of the first zygotic transcripts has received heavy attention in recent years. Notably, although another pioneer, Zelda, has a critical role for driving ZGA for a subset of zygotic genes by several measures, the vast majority of genomic locations either require a combination of Zelda and another factor, or another factor entirely. Several prior studies have pointed to enrichment for a GA-repeat motif within this class of sites. Identifying and characterizing the role of such a second maternal pioneer would represent a significant advance for the field as well as more broadly across biological fields as the question of pioneering touches on several key aspects of transcriptional regulation and epigenetics.

While Duan et al. present data that (1) CLAMP binds its motif even in the nucleosome-associated state; (2) CLAMP loss of function leads to some amount of reduced chromatin accessibility; (3) Some CLAMP and Zld DNA binding is interdependent; (4) Loss of CLAMP function affects gene transcription-- the manuscript in its current state is far from suitable for publication. My primary concern is that the data presentation of the genomics studies is extremely difficult to follow, that supporting data tables are either incompletely annotated or missing, in many cases it is nearly impossible to read the plot labels in the figures, and that the biological significance of the observations is not fully substantiated. In addition, certain controls have not been provided or even incorporated into experimental design. Also, there are issues with the presentation of the study, with factually incorrect statements and missing or unclear description of methods, and missing references (in some cases leading to factually incorrect statements).

This could be an important paper and it is therefore important that the presentation is watertight. I provide the comments below fully aware of current constraints on daily life, and in the spirit of wanting to minimize additional work for the authors. I think that overall the data already exists to improve the manuscript (or at least it should). But there is a fundamental question of whether the data are over-interpreted, and whether the effect of Clamp is as significant as the authors claim, at least within the framework of the process of ZGA.

We thank the reviewer for their careful assessment which has helped us greatly to significantly improve the manuscript. We added supporting data tables and missing information, changed figure labels, and provided additional experiments and computational approaches to strengthen the manuscript. All of your questions are addressed below:

1) The presentation of the genomics data analysis is very difficult to follow. I inspected the bigWig files for the ATAC data and had a hard time finding genomic regions where there is clear-cut evidence for CLAMP's role as a pioneer factor. Loading up the four ATAC conditions (two timepoints each control or clamp-i), as well as the Rieder CLAMP ChIP (NC14) and the 3h Harrison Zld ChIP-seq, I can find only a handful of regions where CLAMP has a clear all-or-nothing effect on chromatin accessibility, and these (few) sites are at regions where there is little Zld binding. These sites I did find by scrolling through nearly the entire genome are: 3' to CG11448, within iab-8, and possibly at the promoters of Vsx1 and 2. There are, however, numerous examples of regions where a 'differential enrichment' analysis could possibly yield a statistically significant difference between control and clamp-i, but there remains substantial accessible chromatin in the knockdown conditions. This latter phenomenon cannot be construed as evidence for pioneer activity, since it is expected that in the absence of the pioneer, the locus would be inaccessible. I am left with the question of whether the effect of Clamp on chromatin accessibility is oversold in this study.

We thank the reviewer for careful assessment which we have addressed as follows:

We re-sequenced our ATAC-seq data to obtain more reads and therefore higher quality data. We have added details of data analysis in Materials and methods and the code in the supplementary file. Here are some key points of the revision:

1) We have updated the candidate region in the current Figure 2B to show a more representative region with differential accessibility based on computational analysis.

2) We added an average profile showing the significant reduction of ATAC-seq reads in clamp-i embryos at DA sites compared with controls (Figure 2A).

3) We also added a heatmap of different classes of ATAC-seq peaks (Figure 3) which includes binding profiles for both CLAMP and Zelda. Furthermore, CLAMP binding is stronger at DA sites bound by CLAMP compared to nonDA sites bound by CLAMP.

4) We have added tables to the manuscript that summarize the numbers and fractions of peaks displaying an effect, to improve the data presentation. Moreover, in supplementary tables, we report the locations of DA peaks called by DiffBind and of peaks in all 4 defined CLAMP- or ZLD-related classes.

Overall, we observe a significant reduction of read enrichment at differentially accessible regions upon clamp depletion. Similar to Zelda, CLAMP has both direct and indirect effects on chromatin accessibility, and these effects do not occur at all factor binding sites, as we state in the manuscript. For example, CLAMP has stronger effects on chromatin accessibility at promoters than at introns (Figure 2C).

The example regions plotted in the Figures also reveal potential issues in the analysis or interpretation of data: Figure 2B, CG11023: I had questions about what was going on in this plot which were cleared up by checking the bigWig files. For instance, I was curious why the light blue peak region indicator included regions with no ATAC signal in either control or clamp-i. Why also is there Clamp ChIP signal in this region? Upon inspection of the data, this plot shows base one of chr2L, and the blank region in the ATAC is presumably due (understandably) to mapping issues at the very telomeric end of the chromosome. Why is a peak called here? Why does the peak end within a peak of ATAC signal and not include this whole region? Significantly more concerning is that when I examine this region, my conclusion is that this whole region is likely very low signal that I would be reluctant to score both as "open" as well as "bound by Clamp". On the basis of this, I am reluctant to say that the bioinformatic analysis has been performed with sufficient rigor.

We apologize for not carefully choosing a representative region. We updated the region (promoter of Mod(mdg4)) in the current Figure 2B. We also made sure that this example region is both differentially accessible and differentially bound by CLAMP, to demonstrate the direct impact of CLAMP binding on chromatin accessibility. We also found that ZLD is differentially bound at this region upon CLAMP depletion (data not shown in the IGV plot). Moreover, we have now added a section on ATAC-seq and ChIP-seq data integration to our Materials and methods section and provide all peaks, and the peaks in each sub-group, in bed format as supplementary tables.

“ATAC-seq and ChIP-seq data integration

We first selected the top 6,000 ATAC-seq peaks based on the count per million value rank from our ATAC-seq data or the ATAC-seq in Hannon et al. (2017). Then, we used Bedtools (Quinlan and Hall, 2010) intersection tool to intersect peaks in CLAMP ChIP-seq binding regions with CLAMP DA or non-DA peaks. Based on the intersection of the peaks, we defined 4 types of CLAMP related peaks: (1) DA with CLAMP, (2) DA without CLAMP, (3) Non-DA with CLAMP, (4) Non-DA without CLAMP (Figure 3A, Figure 3—figure supplement 1A, and Table 2).

Similarly, we defined ZLD related peaks by intersecting ZLD DA or non-DA peaks and ATAC-seq datasets (Hannon et al., 2017; Soluri et al., 2020) from wildtype (wt) and zld germline clone (zld-) embryos at the NC14 +12 min stage. Specifically, we defined four classes of genomic loci for ZLD-related classes: (1) DA with ZLD, (2) DA without ZLD, (3) Non-DA with ZLD, (4) Non-DA, without ZLD (Figure 3B and Table 2). We used DeepTools (version 3.1.0, Ramírez et al., 2014) to generate enrichment heatmaps (CPM normalization) and average profiles for each subclass of peaks (Figure 3A-B). We used Homer (v 4.11, Givler and Lilienthal, 2005) for de novo motif searches. Peaks locations in each CLAMP or ZLD-related category were summarized in Table2 – Source Data 1.”
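The four-class definition quoted above amounts to crossing two overlap tests per ATAC-seq peak. The following is a minimal sketch of that logic only, not the authors' pipeline (which used Bedtools): the function names and toy coordinates are hypothetical, intervals are assumed to be half-open (chrom, start, end) tuples, and any shared base counts as an overlap.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_peaks(atac_peaks, da_peaks, chip_peaks, factor="CLAMP"):
    """Label each ATAC-seq peak as DA/Non-DA and with/without the factor."""
    classes = {}
    for peak in atac_peaks:
        is_da = any(overlaps(peak, d) for d in da_peaks)
        is_bound = any(overlaps(peak, c) for c in chip_peaks)
        accessibility = "DA" if is_da else "Non-DA"
        binding = "with" if is_bound else "without"
        classes[peak] = f"{accessibility} {binding} {factor}"
    return classes


# Hypothetical toy coordinates, for illustration only.
atac = [("chr2L", 100, 300), ("chr2L", 1000, 1200),
        ("chr3R", 50, 150), ("chr3R", 900, 990)]
da = [("chr2L", 150, 250), ("chr3R", 60, 100)]
chip = [("chr2L", 280, 400), ("chr3R", 950, 1000)]

for peak, label in classify_peaks(atac, da, chip).items():
    print(peak, label)
```

A naive all-against-all comparison like this is quadratic in the number of peaks; for a few thousand peaks that is tolerable, but an interval tool such as Bedtools is the practical choice for genome-scale data.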

Admittedly, this is based on one example image, but I would also point out that the authors have both only provided limited example regions, and have not provided a sufficiently documented 'peaks list' that includes regions that they feel are (1) bound by CLAMP, (2) bound by Zelda, (3) score as a member of the various groupings used to compare regions throughout the text (e.g. DA-Clamp bound, et cetera). The peaks list that the authors do provide is in a strange format and the column labels are not included in that file (nor can I find anywhere a description of that file, but I may have missed that in the submission materials). Nevertheless, it does not appear to indicate membership in any of the different classes from what I can tell.

Thank you for the suggestion. During the revision, we re-sequenced 3 ATAC-seq libraries that had fewer than 5M usable (uniquely aligned) reads in our original analysis. After re-sequencing, we reached an average of 25M reads per ATAC-seq sample (Figure 2-supplemental table 1). We have re-analyzed the data and updated all the plots in the manuscript. We have added Figure 2-supplementary file 1 and Figure 4-supplementary file 1 to show the pipeline we used for ATAC-seq and ChIP-seq analysis. The code for each step and the plots for each figure are all included. Moreover, we summarized peak numbers and fractions in main Tables 1-4 and provided all peaks, and the peaks in each sub-class, in bed format as supplementary tables.

It is similarly difficult to evaluate the conclusion that CLAMP has anything at all to do with ZGA (see below). Specifically, however, to the bioinformatics analysis: when RNAseq data is analyzed, is it limited to zygotic genes only (as defined either in DeRenzis 2007, or in the Li paper cited in the manuscript?), and is the magnitude of the effect large enough to warrant the conclusion that Clamp is required for ZGA?

Thank you for this question. We separated maternal and zygotic genes only when visualizing the impact of CLAMP on each category. For the overlaps with CLAMP and ZLD DB sites, non-DB sites, and binding, we used all genes. Also, we have now moved all of the transcription data to the new Figure 5. For example, Figure 5B-C shows that genes strongly bound by CLAMP showed a significantly greater (p < 0.001, Mann-Whitney U-test) reduction in expression after clamp RNAi than weakly bound or unbound genes. We observed similar results (p < 0.001, Mann-Whitney U-test) for genes strongly bound by ZLD, whose expression is reduced in zld- germline clone embryo RNA-seq data (Combs and Eisen, 2017). We noted the number of genes in each binding group (strong, weak, and no binding) on the plot; the numbers are similar between CLAMP-regulated (e.g. strong = 250 and 463 at the two time points) and ZLD-regulated (e.g. strong = 207 and 436 at the two time points) genes. Therefore, we showed that direct binding of CLAMP or ZLD to genes regulates their transcriptional activation to a similar magnitude during ZGA.

Moreover, we plotted the relationship between CLAMP binding and transcription reduction after CLAMP depletion in a violin plot as you suggested. Statistical analysis of these plots supports our conclusions that there is a greater chance that a gene will be downregulated by CLAMP depletion if it is more strongly bound by CLAMP.

For comparison, loss of Zelda function results in near zero transcripts produced from a subset of zygotic genes (and corresponding elimination full stop of chromatin accessibility at those loci). I'm worried that the authors are placing too much weight on "significant" p-values without considering if the magnitude of the effect supports the stated conclusions. If the effect of clamp-i is minimal on transcription and chromatin accessibility, which it may be based on my limited examination of the raw data, I see no way to justify the conclusion that Clamp has any major role in ZGA.

Thank you for the question. We have compared the impact of ZLD binding on gene expression and CLAMP binding on gene expression in the current Figure 5 B-C and they show very similar impact.

Moreover, we have performed new imaging experiments on early patterning genes which show similar phenotypes for embryos depleted of either CLAMP or ZLD (Figure 1). Like Zelda-depleted embryos, CLAMP-depleted embryos also show failed cellularization consistent with a defect in ZGA. Interestingly, CLAMP has a stronger role in promoting Zelda occupancy than Zelda does in promoting CLAMP occupancy (Figure 4A). Therefore, we are convinced that CLAMP promotes ZGA in early embryos.

I also have a difficult time finding any Zld-bound loci that convincingly show loss of accessibility in the clamp-i data.

Thank you for this comment. We have now re-sequenced our ATAC-seq data which has improved their quality. In the new Figure 3A we show that CLAMP regulates the accessibility of sites that are bound by Zelda and have Zelda motifs.

Reviewer #3:

The manuscript submitted by Duan and Rieder et al. describes, for the first time, how the CLAMP transcription factor acts as a pioneer TF in the fly embryo. They demonstrate that CLAMP can bind to nucleosome-bound DNA and that it binds to and generates accessible chromatin at a set of gene promoters in the early embryo, and that without this activity these genes fail to be transcribed during ZGA. They further describe fascinating cooperativity between CLAMP and ZLD, a previously identified pioneer TF in the fly embryo. Their results are compelling and rigorous, and the work will be of broad interest to both the developmental biology and transcription biology fields.

We thank the reviewer for their careful assessment to help us to improve the manuscript. We have performed new experiments, re-sequenced our ATAC-seq libraries, added supporting data tables, changed figure labels, and provided more experimental/bioinformatics details to strengthen the manuscript. All your questions are addressed below.

One concern is in Figure 2E, This scatterplot and correlation is not particularly convincing. The fact that the positive correlation is very minor needs to be emphasized properly in the text. Moreover, in our opinion, there is a better way of doing this. ATAC-seq and RNA-seq are very different assays and as such it is to be expected that the fold change upon depletion of a factor should not be expected to be correlated in magnitude between assays, only the change in direction. The dynamic range of change you can expect in RNA-seq is much greater than that in ATAC-seq because mRNA is much more abundant than its cognate DNA for transcribed genes. We think the authors should simply display a Venn diagram of genes/promoters that move in the same direction, i.e. what fraction of the genes have the same directionality of change. We do not think that comparing the magnitude of these changes is particularly useful or informative in this case.

Thank you for your constructive suggestion. We have followed your advice and replaced the correlation plot with a Venn diagram to show overlap between ATAC-seq and mRNA-seq data at the gene level.
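The direction-only comparison that the reviewer recommends can be sketched as follows. This is an illustrative example under stated assumptions, not the analysis used in the manuscript: the gene names and log2 fold changes are hypothetical, each gene is assumed to have a single ATAC-seq value (for example, at its promoter), and no significance filter is applied.

```python
def direction(log2fc):
    """Reduce a log2 fold change to its sign: 'down', 'up', or 'none'."""
    if log2fc < 0:
        return "down"
    if log2fc > 0:
        return "up"
    return "none"


def concordant_genes(atac_lfc, rna_lfc):
    """Return (genes changing in the same direction, genes seen in both assays)."""
    shared = set(atac_lfc) & set(rna_lfc)
    same = {
        gene for gene in shared
        if direction(atac_lfc[gene]) != "none"
        and direction(atac_lfc[gene]) == direction(rna_lfc[gene])
    }
    return same, shared


# Hypothetical values: the magnitudes differ widely between assays,
# but only the sign of the change is compared.
atac = {"geneA": -1.2, "geneB": -0.4, "geneC": 0.8}
rna = {"geneA": -3.5, "geneB": 0.9, "geneC": 2.1, "geneD": -1.0}

same, shared = concordant_genes(atac, rna)
print(f"{len(same)} of {len(shared)} shared genes change in the same direction")
```

The two returned sets correspond to the intersection region of a Venn diagram and its denominator, which sidesteps the difference in dynamic range between the assays.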

[Editors’ note: what follows is the authors’ response to the second round of review.]

Reviewer #3 (Recommendations for the authors):

In the revised manuscript, Duan and colleagues have addressed some of the issues that were raised upon the original review. The authors have generally improved the presentation of their data and have rendered the results easier to interpret. However, despite these improvements, upon inspection of the differential enrichment analysis, the magnitude of effect of Clamp on differential chromatin accessibility is significantly overstated.

It is appreciated that the authors re-sequenced some of their lower depth samples for the resubmitted version. It is also appreciated that the authors have now provided the annotated tables from the differential enrichment analysis. In my original review, I mentioned that I manually searched nearly the entire genome while struggling to find more than a few examples of convincing loss of ATAC signal in the clamp-i data. I have now reviewed the differential enrichment analysis ("Table 4-source data 2", referred to as "Table 1 Source Data 2" in the text (line 681)) and note the following issue:

- The authors appear to have relied not on the FDR but rather on individual, independently calculated p-values for reckoning the number of differentially accessible peaks in the differential enrichment analysis. Table 1 reports 76 "Up", 1675 "Down", and 9465 "None" effects on accessibility in clamp-i embryos versus control. In the supplied source data, it is clear that the authors set a p-value cutoff of 0.05 for calculating these numbers. What isn't mentioned in the text is that this cutoff corresponds to a 32% FDR. Typically, a 5% FDR rate is chosen to minimize incorrect rejections of the null hypothesis that arise due to multiple testing. Using this standard, the total number of differential peaks in the 0-2 hour comparison is only 95, with 73 sites showing a reduction, and 22 showing an increase. Again, a manual inspection of a sampling of these regions shows marginal differences in magnitude between the few regions that do pass significance testing at 5% FDR.

Even fewer regions are differentially accessible in the 2-4 hour sample at a 5% FDR (total = 54, 33 "down", 21 "up").

On the basis of this observation, it would be hard to argue that CLAMP is playing a major role in regulating chromatin accessibility at ZGA. To me, this substantially casts doubt on the central premise of this manuscript and in fact suggests that CLAMP has only a minor effect on accessibility at this time.
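The gap between a raw p-value cutoff and an FDR cutoff can be illustrated with the Benjamini-Hochberg adjustment, a standard FDR procedure. This is a minimal sketch under stated assumptions: the p-values are hypothetical, and the function is a generic implementation rather than the specific routine used by DiffBind or DESeq2.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for eight peaks.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
qvals = benjamini_hochberg(pvals)

n_raw = sum(p < 0.05 for p in pvals)
n_fdr = sum(q < 0.05 for q in qvals)
print(f"p < 0.05: {n_raw} peaks; FDR < 0.05: {n_fdr} peaks")
```

Because each p-value is scaled by the number of tests divided by its rank, peaks that barely pass a raw cutoff tend to drop out after adjustment, and the effect grows with the number of peaks tested.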

We appreciate the reviewer’s comments and concerns. Because embryos were pooled for ATAC-seq within a 2-hour time interval to match our ChIP-seq data, we do see variability among sample replicates. Therefore, fewer peaks were identified after multiple hypothesis correction because many peaks are only altered in one replicate.

Therefore, consistent with ENCODE guidelines, we reanalyzed our ATAC-seq data using a more stringent FDR cutoff. We updated several parameters for each tool:

Mapping

bowtie2 \
--local \
-p16 -t --very-sensitive-local \
--phred33 -N 1 -X 2000 \

Current, changed the local alignment:

bowtie2 \
-p16 -t --very-sensitive --no-mixed --no-discordant \
--dovetail -X 2000 -k 2

Peak calling MACS2

macs2 callpeak -f BAMPE -g dm --keep-dup all --nolambda -q 0.01 --cutoff-analysis

Current:

macs2 callpeak -f BAMPE --gsize dm --qvalue 0.01 --call-summits

Moreover, we found that multiple publications used FDR < 0.1 for ATAC-seq data analysis, including previous analyses of MOF (Samata et al., Cell 2020) and of CLAMP in S2/Kc cells (Albig et al., NAR 2018). Therefore, we updated our DiffBind analysis using a newer version (v.3.12) that normalizes sample library size with “dba.normalize” and defined our differential peaks with FDR < 0.1.

Using this cutoff, we obtained 277 (0-2 hours) and 50 (2-4 hours) loci that are differentially accessible in our ATAC-seq data. We further restricted the downstream analysis to loci that are bound by both CLAMP and ZLD (Figure 4). Please see Figure 2-supplementary table 1 and Table 2-Source Data 1 for peak location details.

Because the number of CLAMP-dependent differentially accessible peaks is reduced in our new analysis, we have intensively revised the manuscript to remove all statements that might sound overstated and have discussed the clear difference between CLAMP and ZLD regarding the roles they play in chromatin accessibility. We have clarified that ZLD functions at many more sites than CLAMP and have not defined CLAMP as a pioneer factor but as a factor with “pioneer-like” function. However, CLAMP's function in promoting ZLD recruitment is critical for zygotic genome activation. Moreover, many CLAMP target genes are key early TFs and components of the Hox cluster, which likely explains the early patterning defects and lethality we report (Figure 1A).

CLAMP also regulates the chromatin accessibility of the zld gene locus itself.

Here is an example from the manuscript:

“Although we have demonstrated an instrumental role for CLAMP in defining a subset of the open chromatin landscape in early embryos, our data show that CLAMP does not increase chromatin accessibility at promoters of all zygotic genes independent of ZLD. Consistent with our results in the early embryo, CLAMP was found to regulate chromatin accessibility at only a few hundred genomic loci in male S2 (258 sites) and female Kc (102 sites) cell lines (Albig et al., 2019). Unlike ZLD which plays a global role in regulating chromatin accessibility at promoters throughout the genome, depletion of CLAMP alone mainly drives changes at promoters of specific genes that often encode transcription factors which are important for early development, consistent with phenotypic data.”

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The authors have made a concerted effort to address the reviewer's concerns, and save for the remaining minor issues below, the manuscript is suitable for publication. While the reanalysis of the data has led to the conclusion that Clamp does not alter chromatin accessibility at as many sites in as non-redundant a way as Zelda, the work does document an interesting and critical interplay of pioneer transcription factors in early embryonic development, and it begins to understand the molecular underpinnings of that interplay. We think this work will be of broad interest and will help clarify how transcription factors act to establish chromatin accessibility and set-up the first steps in early embryonic transcription regulation.

We thank all of the reviewers and editors for your constructive comments and suggestions which have helped us greatly improve our manuscript. We have addressed the remaining minor issues and edited the manuscript language thoroughly for clarity.

1) Clamp as Pioneer: the authors have convincingly shown that Clamp binds to nucleosomal DNA using gel shift assays, and this result alone is probably sufficient to call it a pioneer factor in our view. However, the authors have also convincingly shown that the scope of Clamp pioneering accessibility of chromatin is very small compared to Zelda, but that like Zelda, loss of function is catastrophic in terms of overall development. Any use of "pioneer-like" can be replaced with "pioneer". We also recommend that the authors carefully edit the Discussion to accurately describe the magnitude of Clamp's effect on accessibility, and to update the summation of results pending the outcome of points 2 and 3 below.

Thank you for your support of our findings. We replaced “pioneer-like” with “pioneer” throughout the manuscript. We also edited the discussion of CLAMP's role in modulating chromatin accessibility to emphasize its more targeted yet essential function in zygotic development:

“Although we have demonstrated an instrumental role for CLAMP in defining a subset of the open chromatin landscape in early embryos, our data show that CLAMP does not increase chromatin accessibility at promoters of all zygotic genes independent of ZLD. Consistent with our results in the early embryo, CLAMP regulates chromatin accessibility at only a few hundred genomic loci in male S2 (258 sites) and female Kc (102 sites) cell lines. Unlike ZLD, which plays a global role in regulating chromatin accessibility at promoters throughout the genome, depletion of CLAMP alone mainly drives changes at promoters of specific genes that often encode transcription factors that are important for early development, consistent with phenotypic data. These findings indicate that CLAMP and ZLD regulate ZGA in different ways: ZLD mediates chromatin opening globally, while CLAMP functions in a more targeted way at certain essential early TF genes. However, both proteins are critical to ZGA and loss of either is catastrophic in terms of overall embryonic development.”

2) The reviewers agree that part of the new analysis presented in Figure 3 was not performed in an ideal manner to support the conclusions. The observation at line 245, for instance, is premature:

"Depletion of either maternal zld or clamp mRNA altered the genomic distribution of CLAMP and Zld: both factors shifted their occupancy from promoters to introns."

We request the authors either repeat this analysis more rigorously or eliminate the section entirely. The current analysis is performed by comparing independently called peak lists and placing emphasis on regions that are present or absent in each set. This approach is highly susceptible to thresholding artifacts associated with peak calling. All reviewers agree that a more rigorous approach would be either to perform this analysis on a single, union peak set followed by differential enrichment analysis, or coverage data between different treatments could be compared directly by generating XY-scatter plots of summed reads in each peak from a union peak list. If the conclusion of this section is correct, the genomic regions of interest should be significantly off the diagonal, and this can be statistically addressed.

We thank the reviewers for their comments on the analysis methods related to Figure 3C-D. We used an approach from the widely cited R package ChIPseeker (Yu et al., 2015) to compare the genomic distribution of peak sets across multiple samples, but based on your comments we now realize that the conclusion we drew from this single analysis was premature.

We agree that Figure 3C-D can only indicate that the peak distribution is different in each genotype but cannot statistically determine a significant shift in occupancy from promoters to introns.

Thank you for your suggested analysis approach. Instead of generating a single union peak list for differential analysis as the reviewers suggested, we used the DiffBind package (Stark R and Brown G, 2011), which overlaps and merges peak sets across the compared datasets, counts reads in the overlapping intervals, and uses DESeq2 to identify sites with statistically significant differential binding. We therefore performed a new analysis of the genomic distribution of the up- and down-regulated differentially bound (DB) peaks identified using DiffBind/DESeq2.
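The merge-and-count step described above can be illustrated with a minimal sketch. It is not the DiffBind or DESeq2 implementation: the function names (`union_peaks`, `count_reads`, `log2_fold_changes`), the single-chromosome interval representation, the one-position-per-read simplification, and the pseudocount are all assumptions made for illustration. The sketch only shows how a shared set of intervals lets read counts from two conditions be compared directly. It does not perform any statistical test.

```python
import math
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) intervals on one chromosome.
Interval = Tuple[int, int]


def union_peaks(*peak_sets: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals from all peak sets into one union list."""
    merged: List[Interval] = []
    for start, end in sorted(p for peaks in peak_sets for p in peaks):
        if merged and start < merged[-1][1]:
            # Overlaps the previous merged interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads(peaks: List[Interval], read_positions: List[int]) -> List[int]:
    """Count reads (simplified to one position each) falling in each peak."""
    return [sum(start <= pos < end for pos in read_positions)
            for start, end in peaks]


def log2_fold_changes(control: List[int], treated: List[int],
                      pseudocount: float = 1.0) -> List[float]:
    """Per-peak log2(treated / control), with an assumed pseudocount
    to avoid division by zero."""
    return [math.log2((t + pseudocount) / (c + pseudocount))
            for c, t in zip(control, treated)]
```

Because both conditions are counted over the same union intervals, a peak that narrowly misses the calling threshold in one sample still contributes its reads, which is the thresholding artifact the reviewers raised. In a scatter plot of control versus knockdown counts, peaks with a log2 fold change far from zero would lie off the diagonal. Deciding which of them are significant would still require a replicate-aware test such as the one DESeq2 provides.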

We have added the results of this new analysis for all up- and down-DB peaks as Figure 3-figure supplement 3, which replaces Figure 3C-D. We removed the text from the previous section and moved it to follow the introduction of the differential binding analysis:

“Moreover, depletion of either maternal zld or clamp mRNA altered the genomic distribution of CLAMP and ZLD: the most common pattern we observed was that promoter-bound peaks were lost (down-DB) and peaks in introns were gained (up-DB) (Figure 3-figure supplement 3).”

3) The authors demonstrate that the knockdown efficiency of Zld RNAi is poor during the 2-4h timepoint (e.g. Figure 3, Fig. Supp. 1B). We caution the authors against drawing any strong conclusions about the effect of Zld on Clamp in the 2-4h time period. Please consider revising or eliminating the text beginning at line 262, where the weak effect of Zld on Clamp binding at 2-4 hours can possibly be attributed to incomplete knockdown.

Thank you for the suggestion. This section is now revised to eliminate any potential overstatement:

“390 (0-2 hr) and 30 (2-4 hr) CLAMP down-DB sites were found upon loss of ZLD (Figures 3D, Figure 3-figure supplement 1E, and Table 1). We identified very few sites where CLAMP occupancy increases after zld RNAi (up-DB sites: 0-2hr: 54, 2-4hr: 3).”

4) For most of the heatmaps throughout the manuscript: the titles of the heatmaps incorrectly refer to "peaks", regardless of the data type presented in the heatmap. This can be confusing since the y-axis of the heatmap is some set of "peaks," and the data presented in the heatmap is ATAC-seq coverage or ChIP-seq coverage for a particular factor/genotype/timepoint. To improve readability, please revise heatmap plots to indicate the peak set on the y-axis, and relevant sample information in the header/title of each plot.

Thank you for the careful assessment. We have now updated all heatmap titles and axes to reflect the data types.

For example, we replaced “CLAMP peaks” with “CLAMP occupancy” in the title and with “ChIP-seq signal intensity” on the y-axis/color key.

Similarly, we replaced “ATAC-seq peaks” with “Chromatin accessibility” in the title and with “ATAC-seq signal intensity” on the y-axis/color key.

5) Paragraph beginning at line 341. Here, the authors are examining "gene expression changes caused by depleting maternal Zld at genes where CLAMP regulates Zld binding." The next sentence, however, talks about "genes where Zld regulates CLAMP binding." (Genes where "CLAMP regulates Zld binding" are never mentioned again.) This makes the logic of this paragraph difficult to interpret. Please revise.

Thank you for pointing this out. This paragraph has been revised in the manuscript to the following:

“To investigate whether CLAMP and ZLD could regulate each other’s binding to precisely drive the transcription of target genes, we plotted the gene expression changes caused by depleting maternal zld or clamp at the genes closest to where they regulate each other’s binding (Figure 4E & Figure 4F). The depletion of maternal zld significantly (p = 4.3e-5, Mann-Whitney U-test) reduces the expression of genes where ZLD regulates CLAMP binding (down-DB) more than sites where CLAMP binds independently of ZLD (non-DB) (Figure 4E). Therefore, ZLD may specifically regulate zygotic genes at which ZLD promotes CLAMP binding. Also, compared to genes where ZLD binds independent of CLAMP, genes where ZLD binding is regulated by CLAMP had a significant (p < 0.001, Mann-Whitney U-test) expression reduction after clamp RNAi at both 0-2hr and 2-4hr time points (Figure 4F). Thus, CLAMP may regulate the transcription of genes targeted by ZLD by promoting ZLD binding.”
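For readers unfamiliar with the test named in the revised paragraph, the following is a minimal sketch of a two-sided Mann-Whitney U test that compares expression changes between two gene groups (for example, down-DB versus non-DB). It is an illustration only and not the authors' analysis code. The function name `mann_whitney_u` is hypothetical, and the p-value uses the normal approximation with tie correction and no continuity correction, which is an assumption; statistical packages may use exact or continuity-corrected variants and give slightly different values.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, approximate two-sided p-value).

    Tied values receive average ranks, and the variance includes the
    standard tie correction. Normal approximation, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Items i..j-1 are tied and share ranks i+1..j; use their average.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # All values identical: no evidence of a difference.
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_two_sided
```

In this setting, `x` would hold the log2 fold changes in expression after RNAi for genes nearest down-DB sites and `y` those for genes nearest non-DB sites. A rank-based test suits such data because it makes no assumption that the fold changes are normally distributed.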

6) In general, the reader has to work hard to clearly interpret the results section of this manuscript, particularly for Figures 4-5. Please consider editing the text related to Figures 4-5 for clarity.

Thank you. We have edited the Figures 4-5 text (highlighted in manuscript) for clarity and accessibility.

Associated Data

    Data Citations

    1. Rieder L, Colonnetta MM, Huang A, Mckenney M, Watters S, Deshpande G, Jordan W, Fawzi N, Larschan E. 2020. CLAMP and Zelda function together as pioneer transcription factors to promote Drosophila zygotic genome activation. NCBI Gene Expression Omnibus. GSE152613
    2. Rieder LE, Koreski KP, Boltz KA, Kuzu G, Urban JA, Bowman S, Zeidman A, Jordan WT, Tolstorukov MY, Marzluff WF, Duronio RJ, Larschan EN. 2019. Histone locus regulation by the Drosophila dosage compensation adaptor protein CLAMP. NCBI Gene Expression Omnibus. GSE102922
    3. Rieder L. 2015. Zelda determines chromatin accessibility during the Drosophila maternal-to-zygotic transition. NCBI Gene Expression Omnibus. GSE65837
    4. Rieder L. 2017. Concentration dependent binding states of the Bicoid Homeodomain Protein. NCBI Gene Expression Omnibus. GSE86966

    Supplementary Materials

    Figure 1—source data 1. Original western blots and EMSA images.
    Figure 2—source data 1. ATAC-seq read counts in peak region in replicates of MTD and RNAi samples (DiffBind analysis).

    Page 1. clamp-i versus MTD in 0–2 hr embryos. Page 2. clamp-i versus MTD in 2–4 hr embryos.

    Table 1—source data 1. ChIP-seq read counts in peak regions in replicates of MTD and RNAi samples (DiffBind analysis).

    Page 1. ZLD ChIP-seq in clamp-i versus MTD in 0–2 hr embryos. Page 2. ZLD ChIP-seq in clamp-i versus MTD in 2–4 hr embryos. Page 3. CLAMP ChIP-seq in zld-i versus MTD in 0–2 hr embryos. Page 4. CLAMP ChIP-seq in zld-i versus MTD in 2–4 hr embryos.

    Table 2—source data 1. Peaks locations in each CLAMP or ZLD-related category.

    Page 1. Type I (n=5): both DA, CLAMP ZLD co-bound. Page 2. Type II (n=23): CLAMP DA and ZLD non-DA, CLAMP ZLD co-bound. Page 3. Type III (n=88): ZLD DA and CLAMP non-DA, CLAMP ZLD co-bound. Page 4. Type IV (n=434): both non-DA, CLAMP ZLD co-bound. Page 5. DA with CLAMP 0–2 hr. Page 6. DA without CLAMP 0–2 hr. Page 7. non-DA with CLAMP 0–2 hr. Page 8. non-DA without CLAMP 0–2 hr. Page 9. DA with ZLD NC14 +12 min. Page 10. DA without ZLD NC14 +12 min. Page 11. non-DA with ZLD NC14 +12 min. Page 12. non-DA without ZLD NC14 +12 min.

    Transparent reporting form

    Data Availability Statement

    Sequencing data have been deposited in GEO under accession code GSE152613.

    The following dataset was generated:

    Rieder L, Colonnetta MM, Huang A, Mckenney M, Watters S, Deshpande G, Jordan W, Fawzi N, Larschan E. 2020. CLAMP and Zelda function together as pioneer transcription factors to promote Drosophila zygotic genome activation. NCBI Gene Expression Omnibus. GSE152613

    The following previously published datasets were used:

    Rieder LE, Koreski KP, Boltz KA, Kuzu G, Urban JA, Bowman S, Zeidman A, Jordan WT, Tolstorukov MY, Marzluff WF, Duronio RJ, Larschan EN. 2019. Histone locus regulation by the Drosophila dosage compensation adaptor protein CLAMP. NCBI Gene Expression Omnibus. GSE102922

    Rieder L. 2015. Zelda determines chromatin accessibility during the Drosophila maternal-to-zygotic transition. NCBI Gene Expression Omnibus. GSE65837

    Rieder L. 2017. Concentration dependent binding states of the Bicoid Homeodomain Protein. NCBI Gene Expression Omnibus. GSE86966


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
