Abstract
Proper development depends on precise spatiotemporal gene expression patterns. Most developmental genes are regulated by multiple enhancers and often by multiple core promoters that generate similar transcripts. We hypothesize that multiple promoters may be required either because enhancers prefer a specific promoter or because multiple promoters serve as a redundancy mechanism. To test these hypotheses, we studied the expression of the knirps locus in the early Drosophila melanogaster embryo, which is mediated by multiple enhancers and core promoters. We found that one of these promoters resembles a typical “sharp” developmental promoter, while the other resembles a “broad” promoter usually associated with housekeeping genes. Using synthetic reporter constructs, we found that some, but not all, enhancers in the locus show a preference for one promoter, indicating that promoters provide both redundancy and specificity. By analyzing the reporter dynamics, we identified specific burst properties during the transcription process, namely burst size and frequency, that are most strongly tuned by the combination of promoter and enhancer. Using locus-sized reporters, we discovered that enhancers with no promoter preference in a synthetic setting have a preference in the locus context. Our results suggest that the presence of multiple promoters in a locus is due both to enhancer preference and a need for redundancy and that “broad” promoters with dispersed transcription start sites are common among developmental genes. They also imply that it can be difficult to extrapolate expression measurements from synthetic reporters to the locus context, where other variables shape a gene’s overall expression pattern.
Keywords: promoter, enhancer, knirps, Drosophila melanogaster, enhancer–core-promoter specificity
Introduction
Diverse processes in biology, from early development to the maintenance of homeostasis, rely on the regulation of gene expression. Enhancers and promoters are the primary regions of the genome that encode these gene regulatory programs. Both enhancers and promoters are characterized by clusters of sequence motifs that act as platforms for protein binding, allowing for the integration of a spectrum of signals in the cellular environment. Most studies that dissect enhancer or promoter function investigate each in isolation, which assumes that their function is largely modular. In practice, this means that we assume an enhancer drives generally the same pattern, regardless of promoter, and that promoter strength is independent of the interacting enhancer. However, there is evidence that there can be significant “interaction terms” between promoters and enhancers, with enhancer pattern shaped by promoter sequence, and promoter strength influenced by an interacting enhancer (Gehrig et al. 2009; Qin et al. 2010; Hoppe et al. 2020).
Therefore, a key question is precisely how the sequences of an enhancer and a promoter combine to dictate overall expression output. Adding to the complexity of this question, developmental genes often have multiple enhancers, and many metazoan genes have alternative promoters (Schibler and Sierra 1987; Schröder et al. 1988; Landry et al. 2003; Brown et al. 2014). In a locus, multiple enhancers exist either because they drive distinct expression patterns or, in the case of seemingly redundant shadow enhancers, because they buffer noise in the system (Kvon et al. 2021). RAMPAGE data, which consists of a genome-wide survey of promoter usage, from the 24 h of Drosophila embryonic development shows that >40% of developmentally expressed Drosophila genes have more than one promoter (Batut et al. 2013). Yet, the role of multiple promoters has been relatively less explored. In some cases, alternative promoters drive distinct transcripts, but hunchback is a notable example of a gene with two highly conserved promoters that produce identical proteins (Schröder et al. 1988; Ling et al. 2019).
This suggests there may be additional explanations for the prevalence of multiple promoters. One possibility is molecular compatibility—promoters can preferentially engage with different enhancers depending on the motif composition and proteins recruited to each (van Arensbergen et al. 2014; Wang et al. 2016). For example, enhancers bound by either the Caudal or Dorsal transcription factors (TFs) tend to interact with downstream promoter element (DPE)-containing promoters (Juven-Gershon et al. 2008; Zehavi et al. 2014) and Bicoid-dependent hunchback transcription seems to depend on the presence of a TATA box and Zelda site at one promoter (Ling et al. 2019). Another possibility is that having multiple promoters provides redundancy needed for robust gene expression, much like shadow enhancers.
To distinguish between these hypotheses, an ideal model is a gene with (1) multiple promoters that contain different promoter motifs and drive similar transcripts and with (2) multiple enhancers bound by different TFs. The Drosophila developmental gene knirps (kni) fits these criteria. It is a key developmental TF that acts in concert with other gap genes to direct anterior-posterior axis patterning of the early embryo. kni has two core promoters that produce transcripts encoding nearly identical proteins (only differing by five amino acids at the N-terminus) and that are both used during the blastoderm stage (Figure 1A and Supplementary Figure S1, C and D). Here, we define the core promoter as the region encompassing the transcription start site (TSS) and the 40 bp upstream and downstream of the TSS (Vo Ngoc et al. 2017). Also, like many early developmental genes, kni’s precise pattern of expression in the blastoderm is coordinated by multiple enhancers (Figure 1A). These characteristics make the kni locus a good system in which to examine the roles of multiple promoters in a single gene locus.
Figure 1.
knirps as a case study. The knirps (kni) locus was chosen to study how the motif content of endogenous enhancers and promoters affects transcription dynamics. This locus was selected because its blastoderm stage expression is controlled by multiple enhancers that drive different expression patterns and multiple core promoters that contain different promoter motifs. (A) The kni locus comprises multiple enhancers that together drive expression of a ventral, anterior band (left) and a posterior stripe, as shown in the in situ at the top left (Perry et al. 2011). Enhancers that drive similar expression patterns have been displayed together in boxes with a representative in situ hybridization (Schroeder et al. 2004; Perry et al. 2011). The four enhancers selected for study are in color and labeled in bold text; the others are in gray. kni also has two promoters represented in two shades of purple, which drive transcripts that encode slightly different protein products (differing by only five amino acids). (B) A total of eight MS2 reporter constructs containing pairs of each of the four enhancers matched with each of the two kni promoters were made. (C) The two kni promoters are shown in black, consisting of the RAMPAGE-defined transcription start clusters (TSCs) in gray and an additional ± 40 bp from the TSCs. The two kni promoters can be distinguished by their motif content (with promoter 1 consisting of a series of Inr motifs and a DPE motif and promoter 2 consisting of an Inr, two overlapping TATA Boxes and a DPE motif). They also differ in the “sharpness” of their region of transcription initiation (indicated with the horizontal gray bar), with promoter 1 (125 bp) being significantly broader than promoter 2 (4 bp). The scores of the promoter motifs are plotted above the black line on a log scale, with the color of the bar indicating the motif. The RAMPAGE tags have been plotted in gray below each promoter, indicating the experimentally identified initiation sites. 
RAMPAGE is a 5′-complete cDNA sequencing method that combines two 5′ selection approaches—template-switching and cap-trapping—to accurately profile promoter activity.
We used several approaches to delineate the roles of these two promoters. To examine the molecular compatibility of different kni enhancer-promoter pairs in a controlled setting, we created reporter constructs of eight kni enhancer-promoter pairs driving expression of an MS2 reporter. We found that some kni enhancers can interact with multiple promoters similarly, while others have a strong preference for one. By using the MS2 system to measure the transcription dynamics, we also determined the molecular events that lead to these preferences. Next, analysis of kni locus reporters demonstrated that locus context can affect promoter-enhancer preferences and indicates that promoters both have different jobs and provide some amount of redundancy. Finally, we explored the role of different promoter motifs in specifying expression dynamics by using constructs with promoter mutations. Examining the kni locus has allowed us to (1) determine how transcription dynamics are impacted by molecular compatibility, (2) determine the roles of multiple promoters in a locus, and (3) probe how the motif content of promoters produces a particular expression output.
Materials and methods
Datasets used in this study
The experimentally validated promoters and their experimentally determined TSSs were obtained from the Eukaryotic Promoter Database (EPD) New (Dreos et al. 2017). They were cross-referenced with the RNA Annotation and Mapping of Promoters for Analysis of Gene Expression (RAMPAGE) data obtained from five species of Drosophila (Batut and Gingeras 2017) to form a high-confidence set of promoters for which promoter usage during development could be evaluated. Single embryo RNA-seq obtained by Lott et al. was indexed (with a k of 17 for an average mapping rate of 96%) and quantified using Salmon v0.12.01. The resulting transcript-specific data was used to further resolve kni promoter usage during nuclear cycle 14 (Lott et al. 2011; Patro et al. 2017). Housekeeping genes were defined as in Corrales et al. (2017): a gene was considered housekeeping if its expression exceeded the 40th percentile of expression in each of 30 time points and conditions in RNA-seq data collected by modENCODE. A list of these genes can be found in the Supplementary materials (Supplementary File S1).
To study TF-promoter motif co-occurrence, we collected a total of ∼1000 enhancer-gene pairs expressed during development in Drosophila. The majority were identified by traditional enhancer trapping (REDfly and CRM Activity Database 2, or CAD2) and consist of nonredundant experimentally characterized enhancers (Halfon et al. 2008; Bonn et al. 2012). About 15% were identified through functional characterization of ∼7000 enhancer candidates using a high-throughput tiling screen (Vienna Tile, or VT); these VT enhancers have been limited to those expressed during stages 4–6. The remaining 1% of enhancer-gene pairs have been identified through 4C-seq (Ghavi-Helm et al. 2014) and are active 3–4 h after egg laying (stages 6 and 7). We then found the shortest, minimally overlapping subset of enhancers with an allowed base pair overlap of 25% of the median enhancer length using SelectSmallestFeature.py available at the Halfon Lab GitHub (https://github.com/HalfonLab/UtilityPrograms). A list of these enhancer-promoter pairs and their dm6 release coordinates can be found in the Supplementary materials (Supplementary File S2).
Motif prediction in promoters and enhancers
For enhancers, TF binding site prediction was performed using Patser (Hertz and Stormo 1999) with position weight matrices (PWMs) from the FlyFactor Survey (Zhu et al. 2011) and a GC content of 0.406. Each element in the PWM was adjusted with a pseudocount relative to the intergenic frequency of the corresponding base totaling 0.01. For TFs that had multiple PWMs available, the PWMs built from the largest number of aligned sequences were chosen; that of Stat92E was taken from an older version of the FlyFactor Survey. To systematically choose a score cutoff for the predicted motifs, the aligned sequences were scored using Patser, and a 25th percentile score threshold was chosen such that 75% of the aligned sequences would be called “true” binding sites. These binding site predictions were further fine-tuned by determining which TF binding sites are overrepresented within an enhancer, i.e., occur at a rate higher than expected. To do so, we compared the number of predicted TF binding sites to the number expected under a background rate calculated from all intergenic sequences, using the binomial distribution with a P-value threshold of 0.25. With this score threshold (which retains 75% of the aligned sequences) and a P-value threshold of 0.25, a median of five TFs is predicted to control each enhancer, which falls within a biologically reasonable range. For promoters, the transcription start clusters (TSCs) (Batut and Gingeras 2017) and the adjoining ± 40 bp were scanned for Inr, TATA Box, DPE, and MTE motifs using ElemeNT and the PWMs from Sloutskin et al. (2015).
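The score cutoff and the overrepresentation test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the nearest-rank percentile rule, and the idea of a single per-position background probability `p_site` are all hypothetical choices made for the example.

```python
import math


def percentile_threshold(scores, pct=25.0):
    """Score at the given percentile of the aligned-site scores (nearest-rank rule)."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_overrepresented(n_sites, n_positions, p_site, alpha=0.25):
    """True if seeing n_sites or more predicted sites is unlikely under the background rate.

    n_positions: number of scorable positions in the enhancer (assumed input).
    p_site: background probability of a predicted site per position (assumed input).
    """
    return binomial_tail(n_sites, n_positions, p_site) <= alpha
```

With the nearest-rank rule, roughly three quarters of the aligned sequences score at or above the returned cutoff; the exact fraction depends on how ties and rank boundaries are handled.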
Evaluation of total binding capacity of enhancers
Total binding capacity is a measure of the cumulative ability of an enhancer to bind a TF, and thus it takes into account the binding affinity of every $w$-mer in the enhancer for a TF binding site of length $w$ (Wunderlich et al. 2012). To calculate the total binding capacity, we start by using Patser to computationally score each possible site in the enhancer for the motifs of TFs regulating early axis specification. Taking the exponential of the score, normalizing this exponential by the enhancer length $l$, and summing these values gives us an overall binding capacity for each enhancer and TF combination, which is roughly equal to the sum of the probabilities that a TF is bound to each potential site in the enhancer.
Hence, we use the following formula
$$\mathrm{TBC} = \frac{1}{l}\sum_{i=1}^{l-w+1}\exp\left(\sum_{j=1}^{w}\ln\frac{f_j(b_{i+j-1})}{p(b_{i+j-1})}\right)$$
to calculate the total binding capacity of a given sequence and a given TF (Wunderlich et al. 2012). Here, $l$ is the length of the sequence being considered, $w$ is the width of the PWM of the TF, $b_i$ is the base at position $i$ of the sequence, $f_j(b)$ is the frequency of seeing base $b$ at position $j$ of the PWM, and $p(b)$ is the background frequency of base $b$. Note that
$$\sum_{j=1}^{w}\ln\frac{f_j(b_{i+j-1})}{p(b_{i+j-1})}$$
is equivalent to the score given to the $w$-mer at position $i$ in the sequence calculated using Patser, as described above (Hertz and Stormo 1999).
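The calculation described above (exponentiate the log-odds score of each candidate site, sum over sites, and normalize by the sequence length) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name and data layout are assumptions, only the forward strand is scanned, and the PWM is assumed to already contain pseudocounts so that no frequency is zero.

```python
import math


def total_binding_capacity(seq, pwm, background):
    """Sum exp(log-odds score) over all sites of PWM width, divided by sequence length.

    seq: DNA string over the alphabet A, C, G, T.
    pwm: list of dicts (one per motif position) mapping base -> frequency (> 0).
    background: dict mapping base -> background frequency.
    """
    width = len(pwm)
    total = 0.0
    for i in range(len(seq) - width + 1):
        # Natural-log odds score of the site starting at position i
        score = sum(
            math.log(pwm[j][seq[i + j]] / background[seq[i + j]])
            for j in range(width)
        )
        total += math.exp(score)
    return total / len(seq)


# Background frequencies implied by a GC content of 0.406 (value taken from the text)
BACKGROUND = {"A": 0.297, "T": 0.297, "C": 0.203, "G": 0.203}
```

A quick sanity check is that a PWM identical to the background gives every site a score of 0, so the result reduces to the number of candidate sites divided by the sequence length.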
Selection of enhancers to study
knirps enhancers expressed in the blastoderm were identified using REDfly (Halfon et al. 2008), and the shortest, minimally overlapping subset of enhancers was obtained using SelectSmallestFeature.py available at the Halfon Lab GitHub (https://github.com/HalfonLab/UtilityPrograms). The enhancers in this subset were categorized by the expression patterns they drove, and a representative enhancer was picked from each of these categories.
Generation of transgenic reporter fly lines
As described in Fukaya et al. (2016), the four kni enhancers were each cloned into the pBphi vector, directly upstream of kni promoter 1, 2, or 2ΔTATAΔDPE; 24 MS2 repeats; and a yellow reporter gene. Similarly, the kni locus and its promoter knockouts (Δp1 and Δp2) were each cloned into the pBphi vector, directly upstream of 24 MS2 repeats and a yellow reporter gene by Applied Biological Materials (Richmond, BC, Canada). We defined kni_-5 as chr3L:20699503–20700905(–), kni_proximal_minimal as chr3L:20694587–20695245(–), kni_KD as chr3L:20696543–20697412(–), VT33935 as chr3L:20697271–20699384(–), promoter 1 as chr3L:20695324–20695479(–), promoter 2 as chr3L:20694506–20694631(–), and the kni locus as chr3L:20693955–20701078(–), using the Drosophila melanogaster dm6 release coordinates. Promoter motif knockouts (for p2ΔTATAΔDPE and locus Δp2) involved making the minimal number of mutations that would both inactivate the motif and introduce the fewest new motifs or TF binding sites (TATA: TATATATATC > TAGATGTATC, Inr: TCAGTT > TCGGTT, and DPE: AGATCA > ATACCA). The locus Δp1 construct involved replacing promoter 1 with a region of the lambda genome predicted to have the minimal number of relevant TF binding sites as an alternative to mutating each of the Inr sites, of which there were many (Supplementary Figure S5). The precise sequences for each reporter construct are given in a series of GenBank files included in the Supplementary Materials (Supplementary Files S3–S17).
Using phiC31-mediated integration, each reporter construct was integrated into the same site on chr2L by injection into yw; PBac{y[+]-attP-3B}VK00002 (BDRC stock #9723) embryos by BestGene Inc. (Chino Hills, CA, USA). To visualize MS2 expression, female flies expressing RFP-tagged histones and GFP-tagged MCP (yw; His-RFP/Cyo; MCP-GFP/TM3.Sb) were crossed with males containing one of the MS2 reporter constructs.
Sample preparation and image acquisition
As in Garcia et al. (2013), live embryos were collected before nuclear cycle 14 (nc14), dechorionated, mounted with glue on a permeable membrane, immersed in Halocarbon 27 oil, and put under a glass coverslip. Individual embryos were then imaged on a Nikon A1R point scanning confocal microscope using a 60×/1.4 N.A. oil immersion objective and laser settings of 40 µW for 488 nm and 35 µW for 561 nm. Settings were chosen such that the MS2 signal was not saturated for any of the constructs. To track transcription, 21-slice Z-stacks, at 0.5 µm steps, were taken throughout nc14 at roughly 30 s intervals. To identify the Z-stack’s position in the embryo, the whole embryo was imaged at the end of nc14 at 20× using the same laser power settings. To quantify expression along the anterior-posterior (AP) axis, each transcriptional spot’s location was placed in 2.5% AP bins across the length of the embryo, with the first bin at the anterior of the embryo. Embryos were imaged at ambient temperature, which was on average 26.5°C. For each AP bin of interest, data from at least three embryos were collected.
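The binning step can be illustrated with a short Python sketch. The function names and the `(embryo_id, position)` input layout are hypothetical, and positions are assumed to be expressed as a fraction of embryo length with 0 at the anterior tip.

```python
def ap_bin(x_fraction, bin_width=0.025):
    """Map a fractional AP position (0 = anterior, 1 = posterior) to a 0-based bin index."""
    if not 0.0 <= x_fraction <= 1.0:
        raise ValueError("AP position must lie in [0, 1]")
    # The small epsilon guards against floating-point error at bin edges
    return int(x_fraction / bin_width + 1e-9)


def bins_with_enough_embryos(spots, min_embryos=3):
    """Return the sorted bin indices that contain spots from at least min_embryos embryos.

    spots: iterable of (embryo_id, x_fraction) pairs.
    """
    embryos_per_bin = {}
    for embryo_id, x in spots:
        embryos_per_bin.setdefault(ap_bin(x), set()).add(embryo_id)
    return sorted(b for b, ids in embryos_per_bin.items() if len(ids) >= min_embryos)
```

Under this convention, a spot at 20% embryo length falls in bin index 8 and a spot at 62.5% in bin index 25.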
Burst calling and calculation of transcription parameters
Tracking of nuclei and transcriptional puncta was done using a version of the image analysis MATLAB pipeline downloaded from the Garcia lab GitHub repository on January 8, 2020 and described in Garcia et al. (2013). For every spot of transcription imaged, background fluorescence at each time point is estimated as the offset of fitting the 2D maximum projection of the Z-stack image centered around the transcriptional spot to a Gaussian curve, using MATLAB lsqnonlin. This background estimate is subtracted from the raw spot fluorescence intensity. The resulting fluorescence traces across nc14 are then smoothed by the LOWESS method with a span of 10%. These smoothed traces are then used to quantify transcriptional properties and noise. Traces consisting of fewer than three timeframes are not included in the calculations.
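For readers unfamiliar with LOWESS, the smoothing step can be sketched in pure Python as a locally weighted linear regression with tricube weights. This is a generic illustration and an assumption about the flavor of LOWESS used (no robustness iterations, span interpreted as a fraction of the points in the trace); it is not the MATLAB pipeline itself, and the function names are hypothetical.

```python
import math


def lowess_smooth(t, y, span=0.10):
    """Smooth y(t) by local linear regression with tricube weights."""
    n = len(t)
    k = min(n, max(3, math.ceil(span * n)))  # number of points in each local window
    smoothed = []
    for i in range(n):
        # Indices of the k points nearest in time to t[i]
        idx = sorted(range(n), key=lambda j: abs(t[j] - t[i]))[:k]
        d_max = max(abs(t[j] - t[i]) for j in idx)
        # Slightly inflate d_max so the farthest neighbour keeps a small positive weight
        scale = d_max * 1.0001 + 1e-12
        w = [(1 - (abs(t[j] - t[i]) / scale) ** 3) ** 3 for j in idx]
        sw = sum(w)
        mt = sum(wj * t[j] for wj, j in zip(w, idx)) / sw
        my = sum(wj * y[j] for wj, j in zip(w, idx)) / sw
        var = sum(wj * (t[j] - mt) ** 2 for wj, j in zip(w, idx))
        cov = sum(wj * (t[j] - mt) * (y[j] - my) for wj, j in zip(w, idx))
        slope = cov / var if var > 0 else 0.0
        smoothed.append(my + slope * (t[i] - mt))
    return smoothed


def keep_trace(trace, min_frames=3):
    """Traces with fewer than three time frames are excluded, as stated in the text."""
    return len(trace) >= min_frames
```

Because each local fit is linear, a trace that is already a straight line passes through unchanged, which is a convenient check on the implementation.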
To quantify the transcription properties of interest, we used the smoothed traces to determine at which time points the promoter was “on” or “off” (Waymack et al. 2020). A promoter was considered “on” if the slope of its trace, i.e., the change in fluorescence, between one point and the next was greater than or equal to the instantaneous fluorescence value calculated for one mRNA molecule (FRNAP, described below). Once called “on,” the promoter is considered active until the slope of the fluorescence trace becomes less than or equal to the negative instantaneous fluorescence value of one mRNA molecule, at which point it is considered inactive until the next time point it is called “on.” The instantaneous fluorescence of a single mRNA was chosen as the threshold because we reasoned that an increase in fluorescence greater than or equal to that of a single transcript is indicative of an actively producing promoter, just as a decrease in fluorescence greater than that associated with a single transcript indicates that transcripts are primarily dissociating from, not being newly initiated at, this locus. Visual inspection of fluorescence traces agreed well with the burst calling produced by this method (Supplementary Figure S6) (Waymack et al. 2020).
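The thresholding rule described above amounts to a two-state machine driven by the frame-to-frame change in fluorescence. The sketch below is a minimal illustration, assuming that the "slope" is the difference between consecutive smoothed values (not divided by the frame interval); the function name is hypothetical.

```python
def call_bursts(trace, f_rnap):
    """Return one boolean per interval between frames: True while the promoter is "on".

    trace: smoothed fluorescence values at consecutive time points.
    f_rnap: instantaneous fluorescence of a single mRNA, in the same units as trace.
    """
    on = False
    states = []
    for prev, curr in zip(trace, trace[1:]):
        slope = curr - prev
        if not on and slope >= f_rnap:
            on = True  # rise of at least one transcript's worth of signal
        elif on and slope <= -f_rnap:
            on = False  # fall of at least one transcript's worth of signal
        states.append(on)
    return states
```

Note the hysteresis: once "on", the promoter stays "on" through small decreases and is switched "off" only by a drop at least as large as the single-mRNA fluorescence.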
Using these smoothed traces and “on” and “off” time points of promoters, we measured burst size, burst frequency, burst duration, polymerase initiation rate, and noise. Burst size is defined as the integrated area under the curve of each transcriptional burst, from one “on” frame to the next “on” frame, with the value of 0 set as the floor of the background-subtracted fluorescence trace (Supplementary Figure S6C). Frequency is defined as the number of bursts in nc14 divided by the time between the first time the promoter is called active and either 50 min into nc14 or the end of the movie, whichever comes first (Supplementary Figure S6E). The time of first activity was used for frequency calculations because the different enhancer constructs showed different characteristic times to first transcriptional burst during nc14. Duration is defined as the amount of time occurring between the frame a promoter is considered “on” and the frame it is next considered “off” (Supplementary Figure S6F). Polymerase initiation rate is defined as the slope at the midpoint between the frame a promoter is considered “on” and the frame it is next considered “off” (Supplementary Figure S6G). The temporal coefficient of variation (CV) of each transcriptional spot $i$ was calculated using the formula:
$$\mathrm{CV}(i) = \frac{\sigma\left(m_i(t)\right)}{\mu\left(m_i(t)\right)}$$
where $m_i(t)$ is the fluorescence of spot $i$ at time $t$, and $\sigma$ and $\mu$ denote the standard deviation and mean over time. For these, and all other measurements, we control for the embryo position of the fluorescence trace by first individually analyzing the trace and then using all the traces in each AP bin (anterior-posterior; the embryo is divided into 41 bins each containing 2.5% of the embryo’s length) to calculate summary statistics of the transcriptional dynamics and noise values at that AP position.
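A few of these quantities can be computed directly from a trace and its per-interval "on"/"off" calls, as in the sketch below. It is illustrative only: the function names are hypothetical, the CV uses the sample standard deviation divided by the mean (an assumption about the estimator), and a burst still in progress at the end of the movie is counted as a burst.

```python
import statistics


def temporal_cv(trace):
    """Coefficient of variation (standard deviation / mean) of one spot over time."""
    mean = statistics.fmean(trace)
    if mean == 0:
        raise ValueError("CV is undefined for a zero-mean trace")
    return statistics.stdev(trace) / mean


def burst_durations(states, dt_min):
    """Duration in minutes of each run of consecutive "on" intervals."""
    durations, run = [], 0
    for on in states:
        if on:
            run += 1
        elif run:
            durations.append(run * dt_min)
            run = 0
    if run:
        durations.append(run * dt_min)
    return durations


def burst_frequency(states, dt_min, cutoff_min=50.0):
    """Bursts per minute between first activity and the cutoff or the end of the movie."""
    if True not in states:
        return 0.0
    first_on = states.index(True) * dt_min
    end = min(cutoff_min, len(states) * dt_min)
    n_bursts = sum(
        1 for i, on in enumerate(states) if on and (i == 0 or not states[i - 1])
    )
    window = end - first_on
    return n_bursts / window if window > 0 else 0.0
```

Here `states` is a list of booleans, one per interval between frames, and `dt_min` is the frame interval in minutes (about 0.5 min for the roughly 30 s sampling described in the text).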
All original MATLAB code used for burst calling, noise measurements, and other image processing is available at the Wunderlich Lab GitHub (Waymack et al. 2020) with a copy archived at https://github.com/elifesciences-publications/KrShadowEnhancerCode. Updates to include calculations of polymerase initiation rate are also available at the Wunderlich Lab GitHub (https://github.com/WunderlichLab).
Conversion of integrated fluorescence to mRNA molecules
To convert arbitrary fluorescence units into physiologically relevant units, we calibrated our fluorescence measurements in terms of mRNA molecules. As in Lammers et al., for our microscope, we determined a calibration factor, α, between our MS2 signal integrated over nc13, FMS2, and the number of mRNAs generated by a single allele from the same reporter construct in the same time interval, NFISH, using the hunchback P2 enhancer reporter construct (Garcia et al. 2013; Lammers et al. 2020). Using this conversion factor, we calculated the integrated fluorescence of a single mRNA (F1) as well as the instantaneous fluorescence of an mRNA molecule (FRNAP). For our microscope, FRNAP is 379 AU/RNAP, and F1 is 1338 AU/RNAP·min. We can use these values to convert both integrated and instantaneous fluorescence into total mRNAs produced and number of nascent mRNAs present at a single time point, by dividing by F1 and FRNAP, respectively.
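The unit conversion is a pair of divisions, sketched below with the two calibration values reported in the text. The function names and the trapezoidal integration are assumptions made for illustration; the calibration constants are specific to the authors' microscope and would differ on another instrument.

```python
F_RNAP = 379.0  # AU per RNAP: instantaneous fluorescence of one mRNA (from the text)
F_1 = 1338.0    # AU per RNAP * min: integrated fluorescence of one mRNA (from the text)


def integrate_trace(trace, dt_min):
    """Trapezoidal integral of a fluorescence trace, in AU * min."""
    return sum((a + b) / 2.0 * dt_min for a, b in zip(trace, trace[1:]))


def nascent_mrnas(instantaneous_au):
    """Number of nascent mRNAs present at a single time point."""
    return instantaneous_au / F_RNAP


def total_mrnas(integrated_au_min):
    """Total number of mRNAs produced over the integrated interval."""
    return integrated_au_min / F_1
```

For example, an instantaneous spot intensity of 758 AU corresponds to two nascent mRNAs under this calibration.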
Regression modeling and statistical analysis
To quantify the effect of enhancer, promoter, and interaction terms on burst parameters, we considered models of the form
$$g\left(E[Y]\right) = \beta_0 + \beta_{E}\,\mathrm{enhancer} + \beta_{P}\,\mathrm{promoter} + \beta_{E \times P}\,(\mathrm{enhancer} \times \mathrm{promoter})$$
where $Y$ is the burst property of interest, $g$ is the link function, and the $\beta$ terms are the fitted coefficients (Supplementary Figure S7A). Model selection involved considering (1) the type of model, (2) the distribution that best fit the burst property data, and (3) the appropriate predictors to include. We approached model selection with no specific expectations, opting to use generalized linear models (GLMs) because they fit the data better than linear models (LMs). In addition, they were not much improved upon by adding random effects (GLMMs), suggesting that embryo-to-embryo variation is small.
Similarly, the appropriate distribution for each burst property was determined by fitting various distributions to the data and comparing their goodness of fit. As expected, total RNA produced and burst size (in transcripts per burst) were best described by a negative binomial distribution, as has been commonly used to describe count data. For the other burst properties, for which the appropriate distribution was less clear, we found that burst frequency was best fit by the Weibull distribution and burst duration and initiation rate were best fit by the gamma distribution. These choices were supported by the lower Akaike information criterion (AIC) values produced when comparing them to models using alternative distributions. They also seem reasonable given examples of other applications of these distributions. To keep the interpretation consistent across models, we chose to use an identity link function for all models (Figure 3B); using the canonical link functions associated with each of these distributions produced the same trends (Supplementary Figure S7).
Figure 3.
Expression levels are mainly determined by burst frequency and size. (A) To parse the effects of the enhancer, the promoter, and their interactions on all burst properties, we built generalized linear models (GLMs). Y represents the burst property under study, and the enhancers, promoters, and their interaction terms are the explanatory variables whose contributions are additive in this model. The coefficients of each of these explanatory variables are representative of their contribution to the total value of the burst property. All burst property data was taken from the anterior-posterior bin of maximum expression (20% and 62.5%) for the anterior band and the posterior stripe, respectively. Note that modeling attempts to fit the means, but medians are shown in violin plots. (B) The classic-p1 construct was chosen as a reference and is represented in gray, and the effects of enhancer, promoter, and their interactions are represented in green, purple, and brown, respectively. The size of the bar is representative of the effect size of each variable on the response variable. The coefficients and the 95% confidence intervals for each independent variable relative to that of a reference construct (classic-p1) are plotted as a bar graph. The asterisks, *P < 0.05, **P < 0.01, ***P < 0.001, represent contributions significantly different from zero. In panels (C–G), (left) split violin plots (and their associated box plots) of burst properties for all eight constructs are plotted with promoter 1 in light purple and promoter 2 in purple. The black boxes span the lower to upper quartiles, with the white dot within the box indicating the median. Whiskers extend 1.5 × IQR (interquartile range) beyond the upper and lower quartiles. (Right) Bar graphs representing the relative contributions of enhancer, promoter, and their interactions to each burst property are plotted as described in (B).
The double hash marks on the axes indicate that 90% of the data is being shown. (C) Expression levels are significantly affected by the enhancer identity (green bars), promoter identity (purple bars), and the interaction terms (brown bars), with interaction terms representing the role of molecular compatibility. (D) Burst frequency is dominated by the enhancer and promoter terms, with promoter 2 consistently producing higher burst frequencies regardless of enhancer. (E) Burst size is in part determined by initiation rate and burst duration. Burst size is significantly affected by the enhancer, promoter, and interaction terms. (F) Burst duration is reasonably consistent regardless of enhancer or promoter, but some enhancer and interaction terms have small but significant effects. (G) Pol II initiation rate is significantly affected by enhancer, promoter, and interaction terms, with promoter 2 consistently driving higher initiation rates than promoter 1. However, the difference between promoter 1 and promoter 2 initiation rates is enhancer-dependent, as indicated by the significant interaction terms.
The predictors we included were the enhancer and promoter and any interaction terms between the enhancer and promoter. In each case, dropping the interaction terms produced higher AIC values, suggesting that the interaction terms are important and should not be dropped from the model.
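To make the meaning of the enhancer, promoter, and interaction coefficients concrete, the sketch below computes them from group means under treatment coding with a reference construct. This is a simplification, not a maximum-likelihood GLM fit: it assumes a saturated two-factor model with an identity link in which the fitted mean of each enhancer-promoter group equals its sample mean (which holds for exponential-family GLMs but is not guaranteed for a Weibull model). The function name and the toy numbers in the usage note are hypothetical.

```python
from statistics import fmean


def treatment_coefficients(data, ref_enh, ref_prom):
    """Intercept, main effects, and interactions relative to a reference construct.

    data: dict mapping (enhancer, promoter) -> list of measurements.
    """
    means = {key: fmean(vals) for key, vals in data.items()}
    enhancers = sorted({e for e, _ in means})
    promoters = sorted({p for _, p in means})
    b0 = means[(ref_enh, ref_prom)]
    coefs = {"intercept": b0}
    for e in enhancers:
        if e != ref_enh:
            coefs[f"enh[{e}]"] = means[(e, ref_prom)] - b0
    for p in promoters:
        if p != ref_prom:
            coefs[f"prom[{p}]"] = means[(ref_enh, p)] - b0
    for e in enhancers:
        for p in promoters:
            if e != ref_enh and p != ref_prom:
                # Deviation of this pair from the sum of the two main effects
                coefs[f"enh[{e}]:prom[{p}]"] = (
                    means[(e, p)] - means[(e, ref_prom)] - means[(ref_enh, p)] + b0
                )
    return coefs
```

Called with made-up values such as `{("classic", "p1"): [10, 12], ("classic", "p2"): [20, 22], ("anterior", "p1"): [30, 30], ("anterior", "p2"): [35, 37]}` and the reference `("classic", "p1")`, a nonzero interaction coefficient indicates that the promoter effect differs between the two enhancers.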
To determine any significant differences in mean expression levels, we performed Welch’s t-tests. To quantify the variability explained by different predictors, we calculated the Cragg and Uhler pseudo R-squared measures of the model including only the predictor in question and divided by that of the full model described above.
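For reference, the Welch statistic and its approximate degrees of freedom can be computed as in the sketch below. The function name is hypothetical, and the sketch stops short of a P-value, which requires the Student t distribution (available in statistical packages but not in the Python standard library).

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df
```

Unlike Student's t-test, this version does not assume that the two groups share a common variance, which suits comparisons between constructs with very different expression levels.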
Results
Selection of enhancers and promoters tested
knirps has two conserved promoters that drive very similar transcripts (Figure 1A; Supplementary Figure S1, A and B). Most previous studies discuss the role of a single kni promoter (promoter 1), though in practice, many of the constructs used in these studies contained both promoters since promoter 2 is located in a kni intron (Pankratz et al. 1992; Pelegri and Lehmann 1994; Bothma et al. 2015; El-Sherif and Levine 2016). While more transcripts initiate from promoter 1 throughout most of development (Supplementary Figure S1C), both promoters appear to be active during nuclear cycle 14, 2–3 h after fertilization, based on two different measures of promoter usage (Supplementary Figure S1, C and D) (Lott et al. 2011; Batut and Gingeras 2017). These two promoters are distinguished by their motif content and by their “shape” (Figure 1C). Promoter 1 is composed of multiple putative Initiator (Inr) motifs, each of which can specify a TSS. These putative Inr motifs enable promoter 1 to drive transcription initiation in a 125 bp window, characteristic of a “broad” or “dispersed” promoter typically associated with housekeeping genes (Juven-Gershon et al. 2008; Sloutskin et al. 2015). There is a single putative DPE element in promoter 1; however, its significance is somewhat unclear, as it is only at the canonical distance from a single, somewhat weak, Inr motif within the initiation window. Promoter 2 is composed of putative Inr, TATA Box, and DPE motifs. This motif structure leads promoter 2 to initiate transcription in a 4 bp region, which is characteristic of the “sharp” or “focused” promoter shape typically associated with developmental genes (Figure 1C). This two-promoter structure, with one broad and one sharp, might be more prevalent than expected (Supplementary Figure S1, E–G, discussed below).
To select key early embryonic kni enhancers, we considered the expression patterns driven by the enhancers and their overlap in the locus. We split the enhancers into three groups based on their expression patterns and selected one representative enhancer per group—enhancers driving a diffuse posterior stripe (kni_proximal_minimal, or the intronic stripe enhancer), enhancers driving a sharp posterior stripe (kni_KD, the “classic” kni posterior stripe enhancer), and enhancers driving the anterior band (kni_-5, or the anterior enhancer) (Figure 1A). Among the enhancers driving a sharp posterior stripe, we decided to examine another enhancer, VT33935 (or the tile stripe enhancer), in addition to kni_KD (Pankratz et al. 1992). Hereafter these enhancers will be referred to by their simpler, more intuitive names. The tile stripe enhancer was identified in a high-throughput tiling screen for enhancer activity (Kvon et al. 2014) and has only minimal overlap with the classic stripe enhancer but drives the same posterior stripe of expression. This suggests it may be an important contributor to kni regulation.
As these enhancers drive different expression patterns, they are likely regulated by different TFs or at least different combinations of TFs. To corroborate this expectation, we computationally predicted the putative TF inputs to these enhancers by scanning each enhancer using the motifs of TFs regulating early axis specification and calculated an overall binding capacity for each enhancer-TF pair (Figure 2A). We found that the classic stripe and the tile stripe enhancers seem to be regulated by similar TFs, which suggests that together they comprise one larger enhancer. Here, we studied them separately, as historically the classic stripe enhancer has been considered the canonical enhancer driving posterior stripe expression (Pankratz et al. 1992). Since the classic stripe, tile stripe, and intronic stripe enhancers drive overlapping expression patterns, they can be considered a set of shadow enhancers. Despite their similar expression output, the intronic stripe enhancer seems to have different TF inputs than the other two, including different predicted repressors and autoregulation by Kni itself (Perry et al. 2011). The anterior enhancer is the only enhancer that controls expression of a ventral, anterior band. Accordingly, this is the only enhancer of the four that has predicted dorsal-ventral TF inputs (Dorsal and Twist) (Figure 2A) (Schroeder et al. 2004). In sum, computational analyses of the total binding capacity of these enhancers suggest that they are bound by different TFs in line with their distinct expression patterns (Figure 2A).
Figure 2.
The kni enhancers differ in their predicted capacity to bind different transcription factors and their ability to drive transcription with different promoters. (A) The logarithm of the predicted TF binding capacity of each of the kni enhancers is plotted as circles around the enhancer, with the color indicating the TF and the circle size increasing with higher binding capacity. The TFs are categorized by their role in regulating anterior-posterior (AP) or dorsal-ventral (DV) patterning and broadly by their roles as activators (green arc) or repressors (pink arc). The classic and tile stripe enhancers, which drive the same posterior stripe of expression, share similar predicted TFs. The anterior enhancer, the only enhancer with a DV component, is the only one predicted to be bound by DV TFs. The intronic stripe enhancer drives a similar expression pattern to the classic and tile stripe enhancers, but has different predicted TF binding capacities. (B) The Drosophila embryo with the kni expression pattern at nuclear cycle 14 (nc14) is shown. Using measurements from enhancer-promoter reporters, the total RNA produced by each construct during nuclear cycle 14 is plotted against position along the embryo length (AP axis). The enhancers can be separated into two classes—those that produce high expression with either promoter (anterior and intronic stripe enhancers) and those that produce much higher expression with promoter 2 (classic and tile stripe enhancers). The constructs containing promoter 1 are denoted with a dashed line and those containing promoter 2 with a solid line. The error bands around the lines are 95% confidence intervals. In panels (C, D), the temporal coefficient of variation (CV) is plotted against the total RNA produced in nc14 at the anterior-posterior bin of maximum expression (20% and 62.5%) for the anterior band and the posterior stripe, respectively, with the error bars representing 95% confidence intervals. 
(C) Here, the data points are colored by the construct’s promoter. (D) Here, the data points are colored by the construct’s enhancer. There is a general trend of mean expression levels being anti-correlated with CV, or noise. However, some constructs, notably those containing the anterior enhancer, have higher noise than others with similar output levels.
By using this set of endogenously interacting enhancers and promoters with varied motif content, we can elucidate the functional value of having multiple promoters. In particular, we can determine whether multiple promoters exist because different enhancers work with different promoters, multiple promoters provide necessary redundancy in the system, or some combination of the two.
Some enhancers tolerate promoters of different shapes and composition
To characterize the inherent ability of promoters and enhancers to drive expression, without complicating factors like enhancer competition, promoter competition, or variable enhancer-promoter distances, we created a series of eight transgenic enhancer-promoter reporter lines. Each reporter contains one enhancer and one promoter directly adjacent to each other, followed by MS2 stem loops inserted in the 5′ UTR of the yellow gene (Figure 1B, see Materials and Methods for details). These tagged transcripts are bound by MCP-GFP fusion proteins, yielding fluorescent puncta at the site of nascent transcription. The fluorescence intensity of each spot is proportional to the number of transcripts in production at a given moment (Garcia et al. 2013).
When considering the expression output driven by these enhancer-promoter combinations, several outcomes are possible. One possible outcome is that one promoter is simply stronger than the other, consistently driving higher expression regardless of which enhancer it is paired with. Another possibility is that some enhancers drive similar expression with either promoter. This would suggest that the particular set (and orientation) of the TFs recruited to those enhancers allows them to transcend the differences in promoter architecture. Both of these potential outcomes represent cases in which the promoters are redundant. Finally, each enhancer may drive higher expression with one promoter than with the other, with the preferred promoter differing between enhancers. This would suggest that promoter motifs and shape affect a promoter's ability to successfully interact with enhancers with different bound TFs to drive expression. This outcome would be evidence of promoter specificity.
When comparing the mean expression levels, we found that some enhancers (anterior and intronic stripe) have relatively mild preferences for one promoter over the other (Figure 2B; two-sided t-test comparing anterior-promoter1 vs anterior-promoter2, P = 0.12 and intronic_stripe-promoter1 vs intronic_stripe-promoter2, P = 9.5 × 10⁻⁵). Although the difference is statistically significant for the intronic stripe enhancer, the effect size is relatively small, with the largest difference in mean expression being 1.2-fold. This suggests that the TFs recruited to these enhancers can interact with very different promoters more or less equally well. This would be a case of promoter redundancy. On the other hand, the classic and tile stripe enhancers respectively drive 2.9- and 3.2-fold higher expression with promoter 2 than promoter 1 at 62.5% embryo length (Figure 2B; one-sided t-test P < 2.2 × 10⁻¹⁶ for both). This suggests that the TFs recruited to the classic and tile stripe enhancers limit their ability to successfully drive expression with promoter 1, which is a dispersed promoter. This is evidence of promoter specificity. Taken together, these results imply that a simple model of promoter strength is not sufficient to account for the data. Instead, it is the combination of the proteins recruited to both enhancers and promoters that sets expression levels, with some enhancers interacting equally well with both promoters and others having a preference.
These differences in enhancer preference or lack thereof may be mediated by the particular TFs recruited to them and the motifs present in the promoters. Previous researchers have found that the developmental TFs, Caudal (Cad) and Dorsal (Dl), tend to regulate genes with DPE motifs and drive lower expression when DPE has been eliminated (Juven-Gershon et al. 2008; Zehavi et al. 2014). In addition, computational analysis of TF-promoter motif co-occurrence patterns indicates that Bicoid (Bcd) shows a similar enrichment for DPE-containing promoters (Supplementary Figure S2). A study also indicated that Bcd can work in conjunction with Zelda to activate a TATA Box-containing promoter, but this combination does not appear to be widely generalizable (Ling et al. 2019). In accordance with that, we find that all four kni enhancers, which are predicted to bind Cad and Bcd, drive relatively high expression with the DPE-containing promoter 2. Interestingly, in the case of the anterior and intronic stripe enhancers, we find that they can also drive similarly high expression with the series of mostly weak Inr sites that compose promoter 1. This indicates that while some factors mediating enhancer-promoter preference have been identified, there are additional factors we have yet to discover that are playing a role.
We also calculated the expression noise associated with each construct and plotted it against the expression output of each. Previous studies have suggested that TATA-containing promoters generally drive noisier expression (Ravarani et al. 2016; Ramalingam et al. 2021). Among our constructs, expression noise is generally inversely correlated with mean expression (Figure 2, C and D, and Supplementary Figure S3), and the TATA-containing promoter 2 does not have uniformly higher noise than the TATA-less promoter 1. However, some constructs, notably those containing the anterior enhancer, have higher noise than others with similar output levels, suggesting that, in this case, promoters alone do not determine expression noise.
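As a minimal sketch of the noise metric, the temporal coefficient of variation of a single fluorescence trace can be computed as the standard deviation over time divided by the mean over time. The function name `temporal_cv` and the use of the population standard deviation are assumptions made for illustration; the exact procedure used for the figures (for example, how traces are averaged across nuclei) is described in the Materials and Methods and is not reproduced here.

```python
import numpy as np

def temporal_cv(trace):
    """Temporal coefficient of variation of one fluorescence trace.

    Assumption: CV = population standard deviation over time / mean over time.
    `trace` is a hypothetical 1D sequence of fluorescence values sampled
    at successive time points during nc14.
    """
    x = np.asarray(trace, dtype=float)
    mean = x.mean()
    if mean == 0:
        raise ValueError("CV is undefined for a trace with zero mean")
    return x.std() / mean
```

Under this definition, a constant trace has a CV of 0, and a trace that alternates between 0 and twice its mean has a CV of 1.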
Simple model of transcription and molecular basis of burst properties
To unravel the molecular events that result in the expression differences between reporters, we consider our results in the context of the two-state model of transcription (Peccoud and Ycart 1995; Tunnacliffe and Chubb 2020). Here, the promoter is either (1) in the inactive state (“OFF”), in which RNA polymerase cannot initiate transcription or (2) in the active state (“ON”), in which it can (Supplementary Figure S4A). The promoter transitions between these two states with rates kact and kinact, with the transitions involving both the interaction of the enhancer and promoter and the assembly of the necessary transcriptional machinery. This interaction may be through direct enhancer-promoter looping or through the formation of a transcriptional hub, a nuclear region with a high concentration of TFs, co-factors, and RNA polymerase (Lim and Levine 2021). In its active state, the promoter produces mRNA at rate r, and given our ability to observe only nascent transcripts, the mRNA decay rate µ denotes the diffusion of mRNA away from the gene locus.
We track these molecular events by analyzing the transcription dynamics driven by each reporter and quantifying several properties. Total expression is simply the integrated signal driven by each reporter. The burst duration is the period of active transcription and is dependent on kinact, the rate of promoter inactivation (Supplementary Figure S4B). The burst size, or number of transcripts produced per burst, depends on the burst duration, RNA Pol II initiation rate, and other factors like elongation rate. (Short, aborted transcripts and paused Pol II are not visible in MS2 measurements). The burst frequency, or the inverse of the time between two bursts, depends on both kact and kinact. We used this model to characterize how the transcription output produced is affected by different combinations of the kni enhancers and promoters.
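The relationships between the rate constants and the burst properties can be sketched with the textbook two-state (telegraph) model. This is an illustration under stated assumptions, not the analysis pipeline used here: ON and OFF dwell times are taken to be exponentially distributed, burst frequency is defined as the inverse of the mean time between the starts of successive bursts, and burst size ignores elongation and other unmeasured factors. The function and key names are hypothetical.

```python
def two_state_summary(k_act, k_inact, r):
    """Steady-state burst properties of a two-state promoter model.

    k_act   : OFF -> ON transition rate (per unit time)
    k_inact : ON -> OFF transition rate (per unit time)
    r       : initiation rate while the promoter is ON (transcripts per unit time)
    """
    if min(k_act, k_inact) <= 0 or r < 0:
        raise ValueError("k_act and k_inact must be positive, r non-negative")
    mean_on = 1.0 / k_inact    # mean burst duration
    mean_off = 1.0 / k_act     # mean time spent OFF between bursts
    burst_size = r * mean_on   # mean transcripts initiated per burst
    burst_frequency = 1.0 / (mean_on + mean_off)  # bursts per unit time
    return {
        "burst_duration": mean_on,
        "burst_size": burst_size,
        "burst_frequency": burst_frequency,
        # Mean output rate; equals r * k_act / (k_act + k_inact)
        "mean_rate": burst_size * burst_frequency,
    }
```

In this simplified picture, mean output is the product of burst size and burst frequency, so a change in total expression can come from either property or from both.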
Using GLMs to parse the role of enhancers, promoters, and their interactions
To determine the role of enhancers, promoters, and their interactions in setting reporter expression levels, we built separate generalized linear models (GLMs) to describe each transcriptional property. GLMs allow us to parse the contribution of each component in a statistically rigorous manner. We visually represented the model (Figure 3A) using a bar graph in which the relative contributions of enhancer, promoter, and their interactions are represented in bars of green, purple, and brown, respectively (Figure 3B). To build the GLM, we must pick a reference construct. Here, we selected classic-p1, as it drives the lowest total expression; it is represented by the gray bar. The size of the bars represents the effect size of each variable on the transcriptional property of interest, normalized to the effect size of classic-p1. These effect sizes are additive. The sign of the bars (positive or negative) indicates whether they are adding or detracting from the property of interest, respectively. Since the differences in expression driven by different enhancer-promoter pairs are generally consistent across the AP axis, we used the expression levels at the location of maximum expression along the AP axis (20% and 62.5% for the anterior band and posterior stripe, respectively, Figure 2B).
To develop an intuition for this formalism, we first built a GLM to describe total expression output. Using the GLM, we can see that enhancer, promoter, and interaction terms each play a large and statistically significant role in determining the expression output (Figure 3C), consistent with our qualitative interpretation above. To determine which molecular events modulate overall expression output, we then applied this same GLM structure to each burst property.
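The reference-coded model structure can be sketched as follows. This is an illustration only: it assumes a Gaussian GLM with an identity link (equivalent to ordinary least squares), which this section does not specify, and the numbers in `toy_observations` are arbitrary placeholders, not measurements. The names `design_row` and `fit_effects` are hypothetical.

```python
import numpy as np

ENHANCERS = ["classic", "tile", "intronic", "anterior"]  # first entry is the reference
NON_REFERENCE = ENHANCERS[1:]

def design_row(enhancer, promoter):
    """Dummy-coded row: intercept, enhancer terms, p2 term, enhancer+p2 interactions."""
    is_p2 = promoter == "p2"
    row = [1.0]  # intercept corresponds to the reference construct classic-p1
    row += [1.0 if enhancer == e else 0.0 for e in NON_REFERENCE]
    row += [1.0 if is_p2 else 0.0]
    row += [1.0 if (enhancer == e and is_p2) else 0.0 for e in NON_REFERENCE]
    return row

def fit_effects(observations):
    """Least-squares fit; observations is a list of (enhancer, promoter, value)."""
    X = np.array([design_row(e, p) for e, p, _ in observations])
    y = np.array([v for _, _, v in observations], dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = (["reference"] + NON_REFERENCE + ["p2"]
             + [f"{e}+p2" for e in NON_REFERENCE])
    return dict(zip(names, beta))

# Arbitrary placeholder values (NOT data from this study), one per construct.
toy_observations = [
    ("classic", "p1", 1.0), ("classic", "p2", 3.0),
    ("tile", "p1", 1.0), ("tile", "p2", 3.5),
    ("intronic", "p1", 4.0), ("intronic", "p2", 5.0),
    ("anterior", "p1", 5.0), ("anterior", "p2", 5.5),
]
effects = fit_effects(toy_observations)
# Effects are additive: anterior-p2 = reference + anterior + p2 + anterior+p2
predicted_anterior_p2 = (effects["reference"] + effects["anterior"]
                         + effects["p2"] + effects["anterior+p2"])
```

With this coding, a negative interaction term means that promoter 2 raises the property less with that enhancer than it does with the reference enhancer. Dividing each coefficient by the `reference` coefficient would give effect sizes relative to classic-p1.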
Both burst size and frequency determine expression levels
The classic and tile stripe enhancers drive significantly higher expression with promoter 2 than promoter 1 (Figures 2B and 3C), and this increase is due to both increases in burst frequency (Figure 3D) and burst size (Figure 3E). Burst size largely depends on RNA polymerase II (Pol II) initiation rate, burst duration, and elongation rate (Supplementary Figure S4, A and B). Our 5′-labeled transcripts allow us to measure initiation rate and burst duration, but not elongation rate. Burst duration is relatively consistent between the p1 and p2 versions of the classic and tile stripe reporters, as can be seen in the violin plots and in the small purple bar and the brown p2+tile interaction bars (Figure 3F). In contrast, initiation rate is higher for the p2 versions of the classic and tile stripe reporters, and the large p2+tile interaction bar indicates that the effect of p2 on initiation rate is particularly pronounced in this combination (Figure 3G). Taken together, this indicates that the classic and tile enhancers drive higher transcription through increases in burst frequency and burst size, with the increase in burst size determined, at least in part, by an increase in initiation rate.
The intronic stripe and anterior enhancers drive similar expression with p1 and p2 (Figures 2B and 3C). The p2 versions of these reporters burst more frequently than their p1 counterparts (Figure 3D), but their burst sizes mirror the trends seen in the total expression levels (Figure 3E). The interaction between p2 and these enhancers has a negative effect on the burst size produced, as indicated by the negative anterior+p2 and intronic+p2 interaction terms in burst size (Figure 3E, brown bars), which is achieved in part by a tuning down of burst duration (Figure 3F). Although the p2 versions of the anterior and intronic stripe enhancers still have higher initiation rates than the p1 versions, the negative intronic+p2 interaction term (Figure 3G, brown bar) indicates that p2 increases initiation rate less with the intronic enhancer than with other enhancers. This indicates that relatively similar expression levels driven by the anterior and intronic stripe enhancers with p1 and p2 are due to tuning of their burst sizes, via both burst duration and, to a lesser extent, initiation rate.
Given that some enhancers had a strong promoter preference (Figure 2B), we further investigated which properties were most affected by interaction terms, which are a proxy for molecular compatibility. Molecular compatibility is important in determining the burst size (Figure 3E, brown bars). Of the variability in burst size explained by this model, enhancers and interaction terms account for 67.8% and 23.5% of the variance, respectively (Figure 3E). This suggests that the TFs and cofactors recruited to each reporter may act synergistically to both recruit RNA Pol II to the promoter and promote its successful initiation and elongation.
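One way to apportion explained variability among enhancer, promoter, and interaction terms is a two-way sum-of-squares decomposition of the construct means. The sketch below assumes a balanced design with one mean per enhancer-promoter pair; it is not necessarily the procedure used to obtain the percentages reported above, and the function name `variance_shares` is hypothetical.

```python
import numpy as np

def variance_shares(cell_means):
    """Fraction of between-construct variation attributable to each term.

    cell_means: 2D array-like, rows = enhancers, columns = promoters.
    Assumes a balanced design, so the three sums of squares are additive.
    """
    m = np.asarray(cell_means, dtype=float)
    grand = m.mean()
    enh_eff = m.mean(axis=1, keepdims=True) - grand    # enhancer main effects
    prom_eff = m.mean(axis=0, keepdims=True) - grand   # promoter main effects
    interaction = m - grand - enh_eff - prom_eff       # what main effects miss
    ss_enh = m.shape[1] * np.sum(enh_eff ** 2)
    ss_prom = m.shape[0] * np.sum(prom_eff ** 2)
    ss_int = np.sum(interaction ** 2)
    total = ss_enh + ss_prom + ss_int
    if total == 0:
        raise ValueError("all construct means are identical")
    return {"enhancer": ss_enh / total,
            "promoter": ss_prom / total,
            "interaction": ss_int / total}
```

For a purely additive table such as `[[1, 2], [3, 4]]`, the interaction share is 0. A large interaction share indicates that the effect of a promoter depends on which enhancer it is paired with.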
Conversely, burst frequency is dependent on promoter and enhancer identity, with negligible interaction terms (Figure 3D). Since burst frequency depends on the promoter-enhancer activation and inactivation rates (kact and kinact), this suggests that both enhancers and promoters play a large role in determining the likelihood of promoter activation and inactivation, with molecular compatibility only minimally affecting this likelihood. It is somewhat surprising that molecular compatibility plays only a small role in determining kact and kinact since one might expect the interactions between the proteins recruited to promoters and enhancers would determine the likelihood and duration of promoter activation. This may be the result of the design of these constructs, with promoter and enhancer immediately adjacent to each other, and this may differ in a more natural context (see below).
Despite promoter 2’s compatibility with the anterior enhancer, promoter 1 primarily drives anterior expression in the locus context
The constructs measured thus far only contain a single enhancer and promoter, and therefore measure the inherent ability of a promoter and enhancer to drive expression. However, in the native locus, other complications like differing enhancer-promoter distances, enhancer competition, or promoter competition may impact expression output. To measure the effect of these complicating factors on promoter redundancy and specificity, we cloned the entire kni locus into a reporter construct and measured the expression patterns and dynamics of the wild-type locus reporter (wt) and reporters with either promoter 1 or 2 knocked out (Δp1 and Δp2) (Figure 4A). Due to the large number of Inr motifs, we made the Δp1 construct by replacing promoter 1 with a piece of lambda phage DNA selected for its small number of predicted TF binding sites (Supplementary Figure S5). To make the Δp2 construct, we inactivated the TATA, Inr, and DPE motifs by making several mutations (see Materials and Methods for additional details). Previous work suggests that inactivating the Inr motif alone in a Drosophila developmental promoter can reduce expression by ∼32-fold and mutating Inr and TATA together can reduce expression further (Qi et al. 2020). However, we acknowledge that there may be other unperturbed motifs remaining in the Δp2 construct.
Figure 4.
The synthetic enhancer-promoter constructs are insufficient to capture the behavior of the knirps promoters within the endogenous locus. (A) We cloned the entire kni locus into an MS2 reporter construct and measured the expression levels and dynamics of the wild-type (wt) locus reporter, and reporters with either promoter 1 or 2 knocked out (Δp1 and Δp2). We made the Δp1 reporter by replacing promoter 1 with a piece of lambda phage DNA. To make the Δp2 construct, we inactivated the TATA, Inr, and DPE motifs (see Materials and Methods for additional details). In panels (B–G), all data were taken from the anterior-posterior bin of maximum expression (20% and 62.5%) for the anterior band and the posterior stripe, respectively. (B) The bin of maximum expression is highlighted in light teal on the embryo. To compare the expression produced by the synthetic enhancer-promoter reporters with the locus reporters, we plotted bar graphs of the summed total RNA produced at the location of maximum expression in the anterior (left) and posterior (right) for six cases—enhancer-promoter1 reporters (light purple), enhancer-promoter2 reporters (purple), both enhancer-promoter1 and -promoter2 reporters (dark purple), the wt locus reporter (black), the locus Δp2 reporter (light gray), and the locus Δp1 reporter (dark gray). In panels (C–F), violin plots (and their associated box plots) of burst properties for all three reporters are plotted with the wt, Δp2, and Δp1 reporters in black, light gray, and gray, respectively. The internal boxes span the lower to upper quartiles, with the dot indicating the median. Whiskers extend to 1.5*IQR (interquartile range) ± the upper and lower quartile, respectively. The double hash marks on the axes indicate that 95% of the data is being shown. (C) The coefficient of variation is inversely correlated with total RNA produced shown in (B). 
In the anterior, the Δp2 reporter, which produces the same total RNA as the wt reporter, also produces the same amount of noise. (D) In the anterior of the embryo, burst frequency of the Δp2 reporter is less than that of the wt reporter even though they produce the same expression levels and noise. In the posterior, knocking out promoter 2 has a larger impact on burst frequency than knocking out promoter 1. (E) In both the anterior and posterior, burst size is directly correlated with total RNA produced. Note that in the posterior of the embryo, knocking out promoter 2 has a larger impact on burst size than knocking out promoter 1. Burst size is in part dependent on Pol II initiation rate and burst duration. While (F) burst duration is reasonably consistent regardless of promoter knockout, (G) in the posterior, Pol II initiation rate is correlated with burst size. This suggests that differences in burst size are mediated in part by differences in Pol II initiation rate.
In the anterior, the anterior enhancer is solely responsible for driving expression. Therefore, by comparing the expression output from the wild-type locus reporter and the anterior enhancer-promoter reporters (anterior-p1 and anterior-p2) in the anterior, we can measure the effect of the locus context, i.e., multiple promoters, differing promoter-enhancer distance, or other DNA sequence features. If the anterior enhancer-promoter reporters capture their ability to drive expression in the locus context, we would expect the locus reporter to drive expression equal to the sum of the anterior-p1 and anterior-p2 reporters. In contrast to this expectation, in the anterior band, the locus reporter drives a much lower level of expression than the sum of the two anterior enhancer reporters (Figure 4B, dark purple vs black bar). In fact, the level is similar to the expression output of the anterior enhancer with either individual promoter, suggesting that the expression output of the anterior enhancer is altered by the locus context.
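The additive expectation used in this comparison can be written as a simple ratio of observed locus output to the summed output of the relevant single enhancer-promoter reporters. The helper below is a sketch with a hypothetical name; its inputs stand in for total RNA values measured at the bin of maximum expression.

```python
def additivity_ratio(observed_locus, single_reporter_outputs):
    """Observed locus output divided by the additive expectation.

    observed_locus          : total RNA from the locus reporter
    single_reporter_outputs : total RNA from each relevant enhancer-promoter
                              reporter (e.g. anterior-p1 and anterior-p2)
    Returns about 1 for additive behavior and less than 1 for sub-additive behavior.
    """
    expected = sum(single_reporter_outputs)
    if expected <= 0:
        raise ValueError("summed reporter output must be positive")
    return observed_locus / expected
```

For instance, if two single reporters produce equal output and the locus reporter matches either one alone, the ratio is 0.5.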
The observed sub-additive behavior may arise in several ways. It may be that promoter competition similarly reduces the expression output of both p1 and p2 in the anterior. In this case, knocking out either promoter would produce wild-type levels of expression, as competition would be eliminated. Alternatively, the ability to drive expression in the locus context could be uneven between the promoters. If this is the case, we would expect the promoter knockouts to have different effects on expression.
Consistent with the second scenario, we find that when promoter 2 is eliminated in the kni locus reporter, the expression in the anterior remains essentially the same (two-sided t-test comparing mean expression levels of wt vs Δp2, P = 0.62), while a promoter 1 knockout has a significant impact on expression levels (one-sided t-test comparing mean expression levels of wt vs Δp1, P < 2.2 × 10⁻¹⁶; Figure 4B). Thus, promoter 1 is sufficient to produce wild-type expression levels and patterns in the locus. The noise and the burst properties of the promoter 2 knockout, excepting the burst frequency, are also nearly identical to those of the wild-type locus reporter, further supporting the claim of promoter 1 sufficiency in the anterior (Figure 4, C–G).
When promoter 1 is eliminated from the locus, expression is cut to about one-third of that of the wild-type locus construct, which is also lower than the expression output of the anterior-p2 construct. Thus, unlike promoter 1, promoter 2 loses its ability to drive wild-type levels of expression in the context of the locus. As promoter 2 is ∼650 bp upstream of promoter 1, this extra distance between the anterior enhancer and promoter 2 may be sufficient to reduce promoter 2’s ability to drive expression. Alternatively, other features of the kni locus, such as the binding of other proteins or topological constraints, may interfere with the ability of the anterior enhancer to effectively interact with promoter 2. The drop in expression is mediated by a tuning down of burst frequency and size (Figure 4, D and E). In sum, the anterior enhancer preferentially drives expression via promoter 1 in the locus, even though enhancer-promoter constructs indicate that it is equally capable of driving expression with promoter 2. When promoter 1 is absent from the locus, promoter 2 can drive a smaller amount of expression, suggesting that it can serve as a backup, albeit an imperfect one.
In the posterior, both promoters are required for wild-type expression levels
The posterior stripe is controlled by three enhancers, with the intronic stripe enhancer producing similar levels of transcription with either promoter, and the other two enhancers strongly preferring promoter 2 and driving lower expression overall (Figure 2B). Therefore, when considering the posterior stripe, the expression output of the locus reporter may differ from the individual enhancer-reporter constructs due to promoter competition, enhancer competition, different promoter-enhancer distances, or other DNA features. By comparing the sum of the six relevant enhancer-promoter reporters to the output of the locus reporter, we can see that the locus construct drives considerably lower expression levels than the additive prediction (Figure 4B, dark purple vs black bar). In fact, the locus reporter output levels are similar to the sum of the enhancer-promoter 2 reporters, suggesting that promoter 2 could be solely responsible for expression in the posterior, despite the intronic stripe enhancer’s ability to effectively drive expression with promoter 1. If promoter 2 is sufficient for posterior stripe expression, we would predict that the promoter 1 knockout would have a relatively small effect, while a promoter 2 knockout would greatly decrease expression in the posterior.
In contrast to this expectation, both promoter 1 and promoter 2 knockouts have a sizable effect on expression output, indicating that both are required for wild-type expression levels in the posterior (Figure 4B, light gray and gray bars). Specifically, knocking out promoter 2 severely reduces expression in the posterior stripe, producing about half the expression of the summed outputs of the enhancer-promoter1 constructs (Figure 4B, light gray vs light purple bars). Knocking out promoter 1 also reduces expression in the posterior stripe but not as severely as knocking out promoter 2 (Figure 4B, gray vs light gray bars). The promoter 1 knockout generates about half the expression of the summed expression output of the enhancer-promoter2 constructs (Figure 4B, gray vs purple bars). In both cases, the results indicate that the differences in locus context cause the enhancers to act sub-additively, even when only one promoter is present.
The promoter knockouts also allow us to examine how they tune expression output. In the posterior, knocking out either promoter decreases expression output by decreasing both burst frequency and burst size (Figure 4, D and E). These results show that, in the posterior, both promoters are required to produce wild-type expression levels when considered in the locus setting (Figure 4B, light gray and gray vs black bars). This is despite the fact that enhancer-promoter reporters indicate that, in the absence of enhancer competition for the promoter, promoter 2 alone would suffice (Figure 4B, purple vs black bars).
Burst size and Pol II initiation rate are key burst properties tuned by promoter motif
Studying these enhancers and promoters in the locus context demonstrated that distance and competition affect a promoter’s ability to drive expression, but now we narrow our focus to promoter 2’s remarkable compatibility with enhancers that drive different expression patterns and that are predicted to bind very different sets of TFs. To dissect how its promoter motifs enable promoter 2 to be so broadly compatible, we again made enhancer-promoter reporter lines in which one enhancer and one promoter are directly adjacent to each other, but this time we replaced the promoter with a mutated promoter 2 in which the TATA Box and DPE motifs have been eliminated (p2ΔTATAΔDPE, Figure 5A, see Materials and Methods for details). This allows us to determine whether a single, strong Inr site (p2ΔTATAΔDPE) can perform similarly to a series of mostly weak Inr sites (promoter 1) and to clarify the role of TATA Box and DPE motifs in tuning burst properties.
Figure 5.
Pol II initiation rate is a key burst property that is tuned by promoter motif. (A) We made enhancer-promoter reporters containing each of the four enhancers matched with a mutated promoter 2 (p2ΔTATAΔDPE) in which the TATA Box and DPE motifs have been eliminated (see Materials and Methods for details). In panels (B–F), (left) split violin plots (and their associated box plots) of burst properties for all twelve constructs are plotted with promoter 1 in light purple, promoter 2 in purple, and mutated promoter 2 in yellow. The data for the promoter 1 and 2 reporters are the same as in Figure 3 but are provided for comparison. The black boxes span the lower to upper quartiles, with the white dot within the box indicating the median. Whiskers extend to 1.5*IQR (interquartile range) ± the upper and lower quartile, respectively. (Right) Bar graphs of the burst properties produced by p2ΔTATAΔDPE relative to promoter 1 (in light purple) and to promoter 2 (in purple) are shown. The error bars show the 95% confidence intervals. Promoter comparisons that show no difference between the burst properties are expected to show a relative quantity of 1 (gray dashed line). All burst property data was taken from the anterior-posterior bin of maximum expression (20% and 62.5%) for the anterior band and the posterior stripe, respectively. (B) We observe enhancers fall into two classes. Some enhancers drive less expression with p2ΔTATAΔDPE than with promoter 1 (light purple bars) and others more. The enhancers (anterior and intronic stripe) that drive less expression are also nearly equally compatible with both promoters 1 and 2, whereas the enhancers that drive more expression (classic and tile stripe) strongly preferred promoter 2. Comparing p2ΔTATAΔDPE with promoter 2 (dark purple bars), we see that eliminating TATA Box and DPE motifs reduces expression output for all enhancers. 
(C) When comparing p2ΔTATAΔDPE with either promoter 1 or promoter 2, burst frequency is not substantially affected. (D) Comparing the burst size of p2ΔTATAΔDPE reporters with either that of promoter 1 or promoter 2 reporters, we see the same behavior as with total RNA (shown in panel B). This suggests that burst size is the main mediator of the increase or decrease in total RNA produced. Burst size is in part dependent on Pol II initiation rate and burst duration. As (E) burst duration is reasonably consistent regardless of promoter, it appears that (F) changes in burst size are mediated in part by tuning Pol II initiation rate.
Promoter 2 is characterized by two overlapping TATA Boxes, an Inr motif, and a DPE motif. Previously, much research has focused on comparing TATA-dependent and DPE-dependent promoters; however, many promoters contain both. Here, we consider how the presence of both may impact transcription. We know that each of these motifs recruits subunits of TFIID, with TATA Box recruiting TBP or TRF1 (Kim et al. 1993; Hansen et al. 1997; Holmes and Tjian 2000), Inr recruiting TAF1 and 2 (Chalkley and Verrijzer 1999; Wu et al. 2001), and DPE recruiting TAF6 and 9 (Shao et al. 2005), as well as other co-factors like CK2 and Mot1 (Lewis et al. 2005; Hsu et al. 2008). Strict spacing between the TATA Box and Inr and between Inr and DPE facilitates assembly of all these factors and others into a pre-initiation complex (Burke and Kadonaga 1996; Emami et al. 1997). A promoter with all three motifs will likely behave similarly, with the addition of each motif further tuning the composition, configuration, or flexibility of the transcriptional complex. Given this, elimination of the TATA Box and DPE motifs may weaken the promoter severely through loss of cooperative interactions, especially for the classic and tile stripe enhancers, which are significantly more compatible with promoter 2 than promoter 1. Alternatively, the single strong Inr site may be sufficient to recruit the necessary transcription machinery, especially in the case of the anterior and intronic stripe enhancers, which work well with the series of mostly weak Inr sites that composes promoter 1.
Enhancers less compatible with promoter 1 (classic and tile stripe enhancers) drive higher expression with p2ΔTATAΔDPE than with promoter 1 (Figure 5B, light purple bars). This suggests that either the classic and tile stripe enhancers do not effectively interact with a promoter with many Inr sites or that there are other sequence features in promoter 2 that also contribute to its expression output. In contrast, we see that promoter 1-compatible enhancers (anterior and intronic stripe enhancers) drive lower expression with p2ΔTATAΔDPE than with promoter 1 (Figure 5B, light purple bars). For all enhancers, the difference in expression between p2ΔTATAΔDPE and promoter 1 appears to be mediated mainly through a change in burst size due, in part, to a change in initiation rates (Figure 5, D–F). Initiation rate is not sufficient to explain the difference in burst size, and this discrepancy is likely due to differences in elongation rate or other unmeasured factors.
Given that all four enhancers are compatible with promoter 2, and promoter 2 appears to achieve higher expression by tuning burst frequency and Pol II initiation rates (Figure 3, D and G, purple bars), we posit that TATA Box and DPE are what allow promoter 2 to do so. When comparing p2ΔTATAΔDPE with promoter 2, we see that all enhancers produce lower expression (Figure 5B, purple bars), and this is mediated mainly through a decrease in burst size (Figure 5D) and, for some enhancers, a smaller decrease in burst frequency (Figure 5C). Notably, burst size and polymerase initiation rate, which are dependent on molecular compatibility (Figure 3, E and G, brown bars), are also most affected by the elimination of the TATA Box and DPE motifs (Figure 5, D and F), suggesting that molecular compatibility may play an important role in mediating high expression output. In conclusion, enhancers seem to fall into classes, which behave in similar ways with particular promoters. The TATA Box and DPE elements in promoter 2 increase expression output via increases in burst size and initiation rate, suggesting that, in concert with the TFs recruited to the enhancer, the proteins these motifs recruit are needed to efficiently load Pol II and promote its elongation.
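The way these burst properties combine can be illustrated with a minimal sketch. It assumes a deliberately simplified picture in which mean output is the product of burst frequency and burst size, and burst size is the product of initiation rate and burst duration; as noted above, initiation rate alone did not fully explain the measured differences in burst size, so this is not the analysis used for the reporters. The `BurstProfile` class, the `log2_fold_contributions` function, and all numerical values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstProfile:
    """Hypothetical summary of one enhancer-promoter reporter."""
    frequency: float        # bursts per minute
    initiation_rate: float  # Pol II loading events per minute during a burst
    duration: float         # minutes the promoter stays active per burst

    @property
    def size(self) -> float:
        # Simplifying assumption: burst size = initiation rate x burst duration.
        return self.initiation_rate * self.duration

    @property
    def mean_output(self) -> float:
        # Simplifying assumption: mean output = burst frequency x burst size.
        return self.frequency * self.size


def log2_fold_contributions(reference: BurstProfile, test: BurstProfile) -> dict[str, float]:
    """Split the log2 fold change in mean output into per-property terms.

    Because the output is modeled as a product, the three terms sum to the total.
    """
    return {
        "frequency": math.log2(test.frequency / reference.frequency),
        "initiation_rate": math.log2(test.initiation_rate / reference.initiation_rate),
        "duration": math.log2(test.duration / reference.duration),
    }


# Made-up values for illustration only; not measurements from this work.
intact = BurstProfile(frequency=0.5, initiation_rate=10.0, duration=2.0)
mutant = BurstProfile(frequency=0.4, initiation_rate=6.0, duration=2.0)

contributions = log2_fold_contributions(intact, mutant)
total = math.log2(mutant.mean_output / intact.mean_output)
for name, value in contributions.items():
    print(f"{name}: {value:+.2f} log2 units")
print(f"total: {total:+.2f} log2 units")
```

In this toy comparison, most of the drop in output comes from the initiation-rate term and a smaller part from burst frequency, mirroring the qualitative pattern described for p2ΔTATAΔDPE. Working in log space is a design choice that makes the contributions additive, so they can be compared directly.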
The two-promoter structure of the kni promoters—one broad, one sharp—is found across embryonically expressed genes
Based on our locus reporters, both kni promoters appear important for normal expression. Given this, we wanted to determine the prevalence of a two-promoter structure, with one broad and one sharp. To do so, we used the RAMPAGE data set, which includes a genome-wide survey of promoter usage during the 24 h of Drosophila embryonic development (Batut and Gingeras 2017), and cross-referenced these promoters with those in the Eukaryotic Promoter Database, which is a collection of experimentally validated promoters (Dreos et al. 2017). We found that 13% of embryonically expressed genes have at least two promoters. When we considered the two most commonly used promoters, we found a clear trend of a broader primary (most used) promoter (median = 91 bp) and a sharper secondary promoter (median = 42 bp) (Supplementary Figure S1C). This trend is still present if the genes are split into developmental and housekeeping genes, with developmental promoters (median = 43 bp) generally more focused than housekeeping promoters (median = 90 bp), as expected (Supplementary Figure S1, D and E). Among the primary promoters of developmental genes, 58% consist of a series of mostly weak Inr sites, much like kni promoter 1. This suggests that this promoter shape and motif content in developmental promoters may be more common than previously expected and should be explored.
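The kind of comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the RAMPAGE analysis: promoter width is taken here to be the span holding the central 10th to 90th percentile of transcription start site reads, which is an assumed definition, and the input format, the `promoter_width` and `summarize` functions, and the toy read counts are all hypothetical.

```python
from statistics import median


def promoter_width(tss_counts: dict[int, int], lower: float = 0.1, upper: float = 0.9) -> int:
    """Width in bp of the window holding the central share of TSS reads.

    The 10th-90th percentile window is an assumed definition for illustration.
    """
    total = sum(tss_counts.values())
    if total <= 0:
        raise ValueError("promoter has no reads")
    cumulative = 0
    start = None
    for position in sorted(tss_counts):
        cumulative += tss_counts[position]
        fraction = cumulative / total
        if start is None and fraction >= lower:
            start = position
        if fraction >= upper:
            return position - start + 1
    raise AssertionError("unreachable: cumulative fraction always reaches 1")


def summarize(genes: dict[str, list[dict[int, int]]]) -> dict[str, float]:
    """Compare the two most used promoters of every multi-promoter gene."""
    primary_widths, secondary_widths = [], []
    for promoters in genes.values():
        if len(promoters) < 2:
            continue
        # Rank promoters by usage (total reads), most used first.
        ranked = sorted(promoters, key=lambda p: sum(p.values()), reverse=True)
        primary_widths.append(promoter_width(ranked[0]))
        secondary_widths.append(promoter_width(ranked[1]))
    return {
        "fraction_multi_promoter": len(primary_widths) / len(genes),
        "median_primary_width": median(primary_widths),
        "median_secondary_width": median(secondary_widths),
    }


# Toy data: position -> read count. Values are invented for illustration.
broad = {200: 5, 210: 10, 220: 10, 230: 10, 240: 15, 250: 15, 260: 10, 270: 10, 280: 10, 290: 5}
sharp = {100: 3, 101: 54, 102: 3}
genes = {"geneA": [sharp, broad], "geneB": [{500: 20, 501: 20}]}

summary = summarize(genes)
print(summary)
```

With real data, the same summary could be computed separately for developmental and housekeeping gene sets to reproduce the split described above; the width definition and any broad-versus-sharp threshold would need to match those used in the original analysis.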
Discussion
We dissected the kni gene locus as a case study of the role of multiple promoters in controlling a single gene’s transcription dynamics. Synthetic enhancer-promoter reporters allowed us to measure the ability of kni enhancer-promoter pairs to drive expression in the absence of complicating factors like promoter or enhancer competition. Using these reporters, we found that some promoters are broadly compatible with many enhancers, whereas others only drive high levels of expression with some enhancers. This indicates that promoters have multiple roles, providing both redundancy and specificity. A detailed analysis of the transcription dynamics of these reporters indicates that the molecular compatibility of the proteins recruited to the enhancer and promoter tunes expression levels by altering the size and frequency of transcriptional bursts.
In the context of the whole locus, we found that some enhancer-promoter pairs drive lower expression than their corresponding synthetic reporters, due to the effects of promoter and enhancer competition, distance, or other factors. In fact, while the synthetic reporters indicate that both promoters can drive similarly high levels of expression in the anterior, in the locus, promoter 1 drives most of the expression, with promoter 2 supporting some low levels of expression in the absence of promoter 1. In the posterior, both promoters appear to be necessary to achieve wild-type levels of expression with enhancer competition leading to sub-additive expression. By mutating promoter motifs in the synthetic enhancer-reporter constructs, we found that the effects of promoter motif mutations fall into two different classes, depending on the enhancer that is paired with the promoter. This suggests that there may be several discrete ways that a promoter can be activated by an enhancer, depending on the proteins recruited to each. Returning to our original hypotheses to explain the presence of two promoters in a single locus, we find that both differing enhancer-promoter preferences and a need for expression robustness in the face of promoter mutation may play a role.
Our work has highlighted the importance of both of kni’s promoters. Previous studies have almost exclusively focused on kni’s promoter 1 (Pankratz et al. 1992; Pelegri and Lehmann 1994), which unexpectedly looks like a typical housekeeping gene promoter, with a dispersed shape and composed of a series of mostly weak Inr sites (Vo Ngoc et al. 2017). It is kni’s promoter 2, with its focused site of initiation and composition of TATA Box, Inr, and DPE motifs, that looks like a canonical developmental promoter (Vo Ngoc et al. 2017). Interestingly, despite only discussing promoter 1, in practice, studies interrogating the behavior of multiple kni enhancers often included both promoters, as promoter 2 is found in a kni intron (Bothma et al. 2015; El-Sherif and Levine 2016). Our analysis suggests that both promoters are needed for normal kni expression.
There is growing evidence that promoter motifs play a role in modulating different aspects of transcription dynamics. However, the role of each motif can vary from one locus to the next. In the “TATA-only” Drosophila snail promoter, the TATA Box affects burst size by tuning burst duration (Pimmett et al. 2021). In the mouse PD1 proximal promoter, which consists of a CAAT Box, TATA Box, Sp1, and Inr motif, the TATA box may tune burst size and frequency (Hendy et al. 2017). A study of a synthetic Drosophila core promoter and the ftz promoter found that the TATA box tunes burst size by modulating burst amplitude and that Inr, MTE, and DPE tune burst frequency (Yokoshi et al. 2021). TATA Box also appears to be associated with increased expression noise, as TATA-containing promoters tend to drive larger, but less frequent transcriptional bursts (Ramalingam et al. 2021). In contrast to TATA Box, Inr appears to be associated with promoter pausing, e.g., by adding a paused promoter state in the Inr-containing Drosophila Kr and Ilp4 promoters (Pimmett et al. 2021). In fact, a Pol II ChIP-seq study indicates that paused developmental genes appear to be enriched for GAGA, Inr, DPE, and PB motifs (Ramalingam et al. 2021).
Similarly, the TFs bound at enhancers can affect transcription dynamics in diverse ways. Exploration of the role of TFs in modulating burst properties has indicated that BMP and Notch can tune burst frequency and duration, respectively (Falo-Sanjuan et al. 2019; Lee et al. 2019; Hoppe et al. 2020). Studies that consider both promoters and enhancers simultaneously have come to differing conclusions. Work in human Jurkat cells, wherein 8000 genomic loci were integrated with one of three promoters, showed that burst frequency is modulated at weakly expressed loci and burst size is modulated at strongly expressed loci (Dar et al. 2012). Work in Drosophila embryos and mouse fibroblasts and stem cells suggests that stronger enhancers produce more bursts, and promoters tune burst size (Fukaya et al. 2016; Larsson et al. 2019). On the whole, this work indicates that promoter motifs and the TFs binding enhancers can act to tune burst properties in a myriad of ways. Given the wide range of possibilities, it is likely that setting, i.e., the combination of promoter motifs and the interacting enhancers, is particularly important in determining the resulting transcription dynamics.
Our work supports this notion. Notably, eliminating the TATA Box and DPE from promoter 2 seems to reinforce the idea that we have two classes of enhancers that behave in distinct ways with these promoters, likely due to the different TFs bound at these enhancers. We find that burst size and initiation rate are key properties tuned by the proteins recruited to the enhancer and promoter. Our observation is in contrast to previous studies in which Pol II initiation rate seems constant despite swapping two promoters with different motif content, altering BMP levels, or modulating the strength of the TF’s activation domains (Senecal et al. 2014; Hoppe et al. 2020). Initiation rate is also tightly constrained for gap genes (Zoller et al. 2018). We suggest that the differences between our work, where initiation rate depends on molecular compatibility, and other work, where initiation rate is controlled by other factors, again reinforce the idea that the role of any particular promoter motif or TF binding site can be highly context-dependent.
Together, our work and previous studies demonstrate that deriving a general set of rules to predict transcription dynamics from sequence is a challenge because the space of promoter motif content and enhancer TF binding site arrangements is large. The proteins recruited to both promoters and enhancers can combine to make transcriptional complexes with different constituent proteins, post-translational modifications, and conformations that may even vary as a function of time. Due to the vast possibility space and context-dependent rules, most work has only scratched the surface of how promoter motifs or enhancers can modulate burst properties, suggesting a field rich for future investigation.
Data availability
Transgenic fly strains and plasmids are available upon request. Supplementary Files are available on figshare: https://doi.org/10.25386/genetics.16599167. Supplementary File S1 contains the gene names, the dm6 release coordinates, and the FlyBase numbers (FBgns) that matched the housekeeping gene names and coordinates identified by Corrales et al. (2017). Supplementary File S2 contains the enhancer-promoter pairs (and their dm6 release coordinates) used in the computational analysis presented in Supplementary Figure S2. Supplementary Files S3–S17 contain GenBank files describing the plasmids used to make all the transgenic fly strains produced for this work. Supplementary Files S18–S35 are representative videos of the transcriptional dynamics of each transgenic reporter.
Acknowledgments
The authors wish to thank Leonila Lagunes, Srikiran Chandrasekaran, and all the Wunderlich lab members for helpful comments on the manuscript and Ali Mortazavi, Kyoko Yokomori, Kevin Thornton, and Rahul Warrior for useful discussion on the project. The authors thank Flo Ramirez for data analysis that inspired some of this work.
Funding
This work is supported by NIH-NICHD R00 HD073191 and NIH-NICHD R01 HD095246 (to Z.W.) and the US DoE P200A120207 and NIH-NIBIB T32 EB009418 (to L.L.).
Conflicts of interest
The authors declare that there is no conflict of interest.
Contributor Information
Lily Li, Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA.
Rachel Waymack, Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA.
Mario Gad, Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA.
Zeba Wunderlich, Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA; Department of Biology, Boston University, Boston, MA 02215, USA.
Literature cited
- Batut P, Dobin A, Plessy C, Carninci P, Gingeras TR. 2013. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Res. 23:169–180. doi:10.1101/gr.139618.112.
- Batut PJ, Gingeras TR. 2017. Conserved noncoding transcription and core promoter regulatory code in early Drosophila development. eLife. 6:e29005. doi:10.7554/eLife.29005.
- Bonn S, Zinzen RP, Girardot C, Gustafson EH, Perez-Gonzalez A, et al. 2012. Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development. Nat Genet. 44:148–156. doi:10.1038/ng.1064.
- Bothma JP, Garcia HG, Ng S, Perry MW, Gregor T, et al. 2015. Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo. eLife. 4:e07956. doi:10.7554/eLife.07956.
- Brown JB, Boley N, Eisman R, May GE, Stoiber MH, et al. 2014. Diversity and dynamics of the Drosophila transcriptome. Nature. 512:393–399. doi:10.1038/nature12962.
- Burke TW, Kadonaga JT. 1996. Drosophila TFIID binds to a conserved downstream basal promoter element that is present in many TATA-box-deficient promoters. Genes Dev. 10:711–724. doi:10.1101/gad.10.6.711.
- Chalkley GE, Verrijzer CP. 1999. DNA binding site selection by RNA polymerase II TAFs: a TAF(II)250-TAF(II)150 complex recognizes the initiator. EMBO J. 18:4835–4845. doi:10.1093/emboj/18.17.4835.
- Corrales M, Rosado A, Cortini R, van Arensbergen J, van Steensel B, et al. 2017. Clustering of Drosophila housekeeping promoters facilitates their expression. Genome Res. 27:1153–1161. doi:10.1101/gr.211433.116.
- Dar RD, Razooky BS, Singh A, Trimeloni TV, McCollum JM, et al. 2012. Transcriptional burst frequency and burst size are equally modulated across the human genome. Proc Natl Acad Sci USA. 109:17454–17459. doi:10.1073/pnas.1213530109.
- Dreos R, Ambrosini G, Groux R, Cavin Périer R, Bucher P. 2017. The eukaryotic promoter database in its 30th year: focus on non-vertebrate organisms. Nucleic Acids Res. 45:D51–D55. doi:10.1093/NAR/GKW1069.
- El-Sherif E, Levine M. 2016. Shadow enhancers mediate dynamic shifts of gap gene expression in the Drosophila embryo. Curr Biol. 26:1164–1169. doi:10.1016/j.cub.2016.02.054.
- Emami KH, Jain A, Smale ST. 1997. Mechanism of synergy between TATA and initiator: synergistic binding of TFIID following a putative TFIIA-induced isomerization. Genes Dev. 11:3007–3019. doi:10.1101/gad.11.22.3007.
- Falo-Sanjuan J, Lammers NC, Garcia HG, Bray SJ. 2019. Enhancer priming enables fast and sustained transcriptional responses to notch signaling. Dev Cell. 50:411–425.e8. doi:10.1016/j.devcel.2019.07.002.
- Fukaya T, Lim B, Levine M. 2016. Enhancer control of transcriptional bursting. Cell. 166:358–368. doi:10.1016/j.cell.2016.05.025.
- Garcia HG, Tikhonov M, Lin A, Gregor T. 2013. Quantitative imaging of transcription in living Drosophila embryos links polymerase activity to patterning. Curr Biol. 23:2140–2145. doi:10.1016/j.cub.2013.08.054.
- Gehrig J, Reischl M, Kalmár E, Ferg M, Hadzhiev Y, et al. 2009. Automated high-throughput mapping of promoter-enhancer interactions in zebrafish embryos. Nat Methods. 6:911–916. doi:10.1038/nmeth.1396.
- Ghavi-Helm Y, Klein FA, Pakozdi T, Ciglar L, Noordermeer D, et al. 2014. Enhancer loops appear stable during development and are associated with paused polymerase. Nature. 512:96–100. doi:10.1038/nature13417.
- Halfon MS, Gallo SM, Bergman CM. 2008. REDfly 2.0: an integrated database of cis-regulatory modules and transcription factor binding sites in Drosophila. Nucleic Acids Res. 36:D594–D598. doi:10.1093/nar/gkm876.
- Hansen SK, Takada S, Jacobson RH, Lis JT, Tjian R. 1997. Transcription properties of a cell type-specific TATA-binding protein, TRF. Cell. 91:71–83. doi:10.1016/S0092-8674(01)80010-6.
- Hendy O, Campbell J, Weissman JD, Larson DR, Singer DS. 2017. Differential context-specific impact of individual core promoter elements on transcriptional dynamics. Mol Biol Cell. 28:3360–3370. doi:10.1091/mbc.E17-06-0408.
- Hertz GZ, Stormo GD. 1999. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics. 15:563–577. doi:10.1093/bioinformatics/15.7.563.
- Holmes MC, Tjian R. 2000. Promoter-selective properties of the TBP-related factor TRF1. Science. 288:867–870. doi:10.1126/science.288.5467.867.
- Hoppe C, Bowles JR, Minchington TG, Sutcliffe C, Upadhyai P, et al. 2020. Modulation of the promoter activation rate dictates the transcriptional response to graded BMP signaling levels in the Drosophila embryo. Dev Cell. 54:727–741.e7. doi:10.1016/j.devcel.2020.07.007.
- Hsu JY, Juven-Gershon T, Marr MT, Wright KJ, Tjian R, et al. 2008. TBP, Mot1, and NC2 establish a regulatory circuit that controls DPE-dependent versus TATA-dependent transcription. Genes Dev. 22:2353–2358. doi:10.1101/gad.1681808.
- Juven-Gershon T, Hsu JY, Kadonaga JT. 2008. Caudal, a key developmental regulator, is a DPE-specific transcriptional factor. Genes Dev. 22:2823–2830. doi:10.1101/gad.1698108.
- Juven-Gershon T, Hsu JY, Theisen JW, Kadonaga JT. 2008. The RNA polymerase II core promoter—the gateway to transcription. Curr Opin Cell Biol. 20:253–259. doi:10.1016/j.ceb.2008.03.003.
- Kim JL, Nikolov DB, Burley SK. 1993. Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature. 365:520–527. doi:10.1038/365520a0.
- Kvon EZ, Kazmar T, Stampfel G, Yáñez-Cuna JO, Pagani M, et al. 2014. Genome-scale functional characterization of Drosophila developmental enhancers in vivo. Nature. 512:91–95. doi:10.1038/nature13395.
- Kvon EZ, Waymack R, Gad M, Wunderlich Z. 2021. Enhancer redundancy in development and disease. Nat Rev Genet. 22:324–336. doi:10.1038/s41576-020-00311-x.
- Lammers NC, Galstyan V, Reimer A, Medin SA, Wiggins CH, et al. 2020. Multimodal transcriptional control of pattern formation in embryonic development. Proc Natl Acad Sci USA. 117:836–847. doi:10.1073/pnas.1912500117.
- Landry JR, Mager DL, Wilhelm BT. 2003. Complex controls: the role of alternative promoters in mammalian genomes. Trends Genet. 19:640–648. doi:10.1016/j.tig.2003.09.014.
- Larsson AJM, Johnsson P, Hagemann-Jensen M, Hartmanis L, Faridani OR, et al. 2019. Genomic encoding of transcriptional burst kinetics. Nature. 565:251–254. doi:10.1038/s41586-018-0836-1.
- Lee CH, Shin H, Kimble J. 2019. Dynamics of Notch-Dependent transcriptional bursting in its native context. Dev Cell. 50:426–435.e4. doi:10.1016/j.devcel.2019.07.001.
- Lewis BA, Sims RJ, Lane WS, Reinberg D. 2005. Functional characterization of core promoter elements: DPE-specific transcription requires the protein kinase CK2 and the PC4 coactivator. Mol Cell. 18:471–481. doi:10.1016/j.molcel.2005.04.005.
- Lim B, Levine MS. 2021. Enhancer-promoter communication: hubs or loops? Curr Opin Genet Dev. 67:5–9. doi:10.1016/j.gde.2020.10.001.
- Ling J, Umezawa KY, Scott T, Small S. 2019. Bicoid-dependent activation of the target gene hunchback requires a two-motif sequence code in a specific basal promoter. Mol Cell. 75:1178–1187.e4. doi:10.1016/j.molcel.2019.06.038.
- Lott SE, Villalta JE, Schroth GP, Luo S, Tonkin LA, et al. 2011. Noncanonical compensation of zygotic X transcription in early Drosophila melanogaster development revealed through single-embryo RNA-Seq. PLoS Biol. 9:e1000590. doi:10.1371/journal.pbio.1000590.
- Pankratz MJ, Busch M, Hoch M, Seifert E, Pankratz MJ, et al. 1992. Spatial control of the gap gene knirps in the Drosophila embryo by posterior morphogen system. Science. 255:986–989. doi:10.1126/science.1546296.
- Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. 2017. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 14:417–419. doi:10.1038/nmeth.4197.
- Peccoud J, Ycart B. 1995. Markovian modeling of gene-product synthesis. Theor Popul Biol. 48:222–234. doi:10.1006/tpbi.1995.1027.
- Pelegri F, Lehmann R. 1994. A role of polycomb group genes in the regulation of gap gene expression in Drosophila. Genetics. 136:1341–1353. doi:10.1093/genetics/136.4.1341.
- Perry MW, Boettiger AN, Levine M. 2011. Multiple enhancers ensure precision of gap gene-expression patterns in the Drosophila embryo. Proc Natl Acad Sci USA. 108:13570–13575. doi:10.1073/pnas.1109873108.
- Pimmett V, Dejean M, Fernandez C, Trullo A, Betrand E, et al. 2021. Quantitative imaging of transcription in living Drosophila embryos reveals the impact of core promoter motifs on promoter state dynamics. BioRxiv. doi:10.1101/2021.01.22.427786.
- Qi Z, Jung C, Bandilla P, Ludwig C, Heron M, et al. 2020. Large-scale analysis of Drosophila core promoter function using synthetic promoters. BioRxiv. 49:2020.10.15.339325. doi:10.1101/2020.10.15.339325.
- Qin JY, Zhang L, Clift KL, Hulur I, Xiang AP, et al. 2010. Systematic comparison of constitutive promoters and the doxycycline-inducible promoter. PLoS One. 5:e10611. doi:10.1371/journal.pone.0010611.
- Ramalingam V, Natarajan M, Johnston J, Zeitlinger J. 2021. TATA and paused promoters active in differentiated tissues have distinct expression characteristics. Mol Syst Biol. 17:e9866. doi:10.15252/msb.20209866.
- Ravarani CNJ, Chalancon G, Breker M, de Groot NS, Babu MM. 2016. Affinity and competition for TBP are molecular determinants of gene expression noise. Nat Commun. 7:10417. doi:10.1038/ncomms10417.
- Schibler U, Sierra F. 1987. Alternative promoters in developmental gene expression. Annu Rev Genet. 21:237–257. doi:10.1146/annurev.ge.21.120187.001321.
- Schröder C, Tautz D, Seifert E, Jäckle H. 1988. Differential regulation of the two transcripts from the Drosophila gap segmentation gene hunchback. EMBO J. 7:2881–2887.
- Schroeder MD, Pearce M, Fak J, Fan HQ, Unnerstall U, et al. 2004. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol. 2:E271. doi:10.1371/journal.pbio.0020271.
- Senecal A, Munsky B, Proux F, Ly N, Braye FE, et al. 2014. Transcription factors modulate c-Fos transcriptional bursts. Cell Rep. 8:75–83. doi:10.1016/j.celrep.2014.05.053.
- Shao H, Revach M, Moshonov S, Tzuman Y, Gazit K, et al. 2005. Core promoter binding by Histone-like TAF complexes. Mol Cell Biol. 25:206–219. doi:10.1128/mcb.25.1.206-219.2005.
- Sloutskin A, Danino YM, Orenstein Y, Zehavi Y, Doniger T, et al. 2015. ElemeNT: a computational tool for detecting core promoter elements. Transcription. 6:41–50. doi:10.1080/21541264.2015.1067286.
- Tunnacliffe E, Chubb JR. 2020. What is a transcriptional burst? Trends Genet. 36:288–297. doi:10.1016/j.tig.2020.01.003.
- van Arensbergen J, van Steensel B, Bussemaker HJ. 2014. In search of the determinants of enhancer-promoter interaction specificity. Trends Cell Biol. 24:695–702. doi:10.1016/j.tcb.2014.07.004.
- Vo Ngoc L, Wang Y-L, Kassavetis GA, Kadonaga JT. 2017. The punctilious RNA polymerase II core promoter. Genes Dev. 31:1289–1301. doi:10.1101/gad.303149.117.
- Wang X, Hou J, Quedenau C, Chen W. 2016. Pervasive isoform-specific translational regulation via alternative transcription start sites in mammals. Mol Syst Biol. 12:875. doi:10.15252/msb.20166941.
- Waymack R, Fletcher A, Enciso G, Wunderlich Z. 2020. Shadow enhancers can suppress input transcription factor noise through distinct regulatory logic. eLife. 9:1–57. doi:10.7554/ELIFE.59351.
- Wu C-HH, Madabusi L, Nishioka H, Emanuel P, Sypes M, et al. 2001. Analysis of core promoter sequences located downstream from the TATA element in the hsp70 promoter from Drosophila melanogaster. Mol Cell Biol. 21:1593–1602. doi:10.1128/MCB.21.5.1593-1602.2001.
- Wunderlich Z, Bragdon MD, Eckenrode KB, Lydiard-Martin T, Pearl-Waserman S, et al. 2012. Dissecting sources of quantitative gene expression pattern divergence between Drosophila species. Mol Syst Biol. 8:604. doi:10.1038/msb.2012.35.
- Yokoshi M, Cambón M, Fukaya T. 2021. Regulation of transcriptional bursting by core promoter elements in the Drosophila embryo. BioRxiv. doi:10.1101/2021.03.18.435761.
- Zehavi Y, Kuznetsov O, Ovadia-Shochat A, Juven-Gershon T. 2014. Core promoter functions in the regulation of gene expression of Drosophila dorsal target genes. J Biol Chem. 289:11993–12004. doi:10.1074/jbc.M114.550251.
- Zhu LJ, Christensen RG, Kazemian M, Hull CJ, Enuameh MS, et al. 2011. FlyFactorSurvey: a database of Drosophila transcription factor binding specificities determined using the bacterial one-hybrid system. Nucleic Acids Res. 39:D111–D117. doi:10.1093/nar/gkq858.
- Zoller B, Little SC, Gregor T. 2018. Diverse spatial expression patterns emerge from unified kinetics of transcriptional bursting. Cell. 175:835–847.e25. doi:10.1016/j.cell.2018.09.056.