Proceedings of the National Academy of Sciences of the United States of America. 2005 Mar 28;102(14):4966–4971. doi: 10.1073/pnas.0409414102

Quantitative analysis of binding motifs mediating diverse spatial readouts of the Dorsal gradient in the Drosophila embryo

Dmitri Papatsenko 1,*, Michael Levine 1
PMCID: PMC555988  PMID: 15795372

Abstract

Dorsal is a sequence-specific transcription factor that is distributed in a broad nuclear gradient across the dorsal–ventral (DV) axis of the early Drosophila embryo. It initiates gastrulation by regulating at least 30–50 target genes in a concentration-dependent fashion. Previous studies identified 18 enhancers that are directly regulated by different concentrations of Dorsal. Here, we employ computational methods to determine the basis for these distinct transcriptional outputs. Orthologous enhancers were identified in a variety of divergent Drosophila species, and their comparison revealed several conserved sequence features responsible for DV patterning. In particular, the quality of Dorsal and Twist recognition sequences correlates with the DV coordinates of gene expression relative to the Dorsal gradient. These findings are entirely consistent with a gradient threshold model for DV patterning, whereby the quality of individual Dorsal binding sites determines in vivo occupancy of target enhancers by the Dorsal gradient. Linked Dorsal and Twist binding sites constitute a conserved composite element in certain “type 2” Dorsal target enhancers, which direct gene expression in ventral regions of the neurogenic ectoderm in response to intermediate levels of the Dorsal gradient. Similar motif arrangements were identified in orthologous loci in the distant mosquito genome, Anopheles gambiae. We discuss how Dorsal and Twist work either additively or synergistically to activate different target enhancers.

Keywords: fitting expression, site quality, expression patterns, development


Whole-genome sequence assemblies of divergent Drosophila genomes provide an unprecedented opportunity for the quantitative analysis of regulatory DNAs. Previous studies suggest that orthologous enhancers from distant species, such as Drosophila melanogaster and Drosophila virilis, direct similar or identical patterns of gene expression (1–3), even when known binding motifs are not well conserved (4, 5). This evolutionary turnover of regulatory elements is rapid (6, 7) and facilitates the computational analysis of structure–function relationships by comparing orthologous enhancers among different Drosophilids. Here, we use the genome assemblies of four distant Drosophila species (D. melanogaster, Drosophila pseudoobscura, D. virilis, and Drosophila mojavensis) to investigate different mechanisms of dorsal–ventral (DV) patterning and gastrulation (8, 9).

The DV patterning of the Drosophila embryo is initiated by Dorsal, a Rel-containing sequence-specific transcription factor (10, 11). The Dorsal protein is distributed throughout the cytoplasm of growing oocytes but enters nuclei shortly after fertilization. A Dorsal nuclear gradient is established 90 min after fertilization, with peak levels in the ventral-most regions of precellular embryos (12, 13). The Dorsal nuclear gradient controls DV patterning and gastrulation by regulating a variety of target genes in a concentration-dependent fashion (8). Secondary gradients are established by some of the target genes, such as Twist, and the encoded proteins work together with Dorsal to control DV patterning (14, 15). Nearly 50 genes exhibiting localized expression patterns along the DV axis have been identified by classical genetics screens, computational surveys, and microarray expression screens (9, 16, 17). It is estimated that 30 of the 50 genes correspond to direct transcriptional targets of the Dorsal gradient; enhancers have been identified for 18 of the genes, representing one of the largest collections of regulatory DNAs engaged in a common developmental process (8).

The 18 Dorsal target enhancers can be separated into at least three functional categories. Type 1 enhancers (hbr, htl, mes3, phm, sna, and twist) are activated by peak levels of the Dorsal gradient in ventral regions of the embryo. Several of the associated type 1 genes restrict FGF signaling within the presumptive mesoderm (16, 18). Type 2 enhancers (brk, m8, rho, sim, vn, and vnd) are activated by intermediate levels of the gradient, and several of the associated genes trigger EGF signaling in ventral regions of the presumptive neurogenic ectoderm (16). Finally, type 3 enhancers (dpp, ind, ths, tld, sog, and zen) are regulated by the lowest concentrations of Dorsal throughout the neurogenic ectoderm. Most of the associated genes are responsible for producing a broad TGFβ/Dpp signaling gradient in the dorsal ectoderm of gastrulating embryos (19).

It has been suggested that the quality and number of Dorsal binding sites might be an important determinant of distinct threshold readouts of the gradient (20, 21). To determine whether Dorsal works through a “gradient threshold” model, we analyzed the 18 Dorsal target enhancers from D. melanogaster, as well as the orthologous enhancers from the following three divergent Drosophilids: D. pseudoobscura, D. virilis, and D. mojavensis. This expansion of the data set from 18 to 72 provides a better foundation for statistical evaluation and minimizes scoring errors caused, for instance, by inaccuracies in the binding motif models (22–24).

In the current study, we examined whether the distribution and quality of Dorsal and Twist motifs correlated with distinct threshold readouts of the gradient. Evidence is presented that enhancer sequences responding to low levels of Dorsal typically contain better Dorsal recognition sequences than enhancers responding to high levels of the gradient. We also present evidence for conserved Dorsal–Twist composite elements in a subset of type 2 enhancers that mediate localized expression in ventral regions of the neurogenic ectoderm in response to intermediate levels of the Dorsal gradient. Some of these composite elements are conserved in orthologous genes in the distantly related genome of the mosquito, Anopheles gambiae.

Methods

Identification of Orthologous Enhancers. Enhancer sequences for 18 different Dorsal target genes in D. melanogaster were annotated based on the positions of restriction fragments or the sequences of the PCR primers used in transgenic assays (25). Minimal fragments (minimal elements) reproducing endogenous expression patterns were selected for the analysis. Genome assemblies for seven Drosophila species and the mosquito, A. gambiae, were downloaded from the Lawrence Berkeley National Laboratory web resource (http://rana.lbl.gov/drosophila/multipleflies.html). In the case of Drosophila, the orthologous loci containing enhancer sequences were retrieved by using National Center for Biotechnology Information BLAST (26), whereas the Anopheles orthologs were identified with the help of the Ensembl genome browser (27). The exact positions of enhancers in the orthologous loci of Drosophila species were identified by using local alignment procedures (Lasergene and DNASTAR) (28), and the positions of conserved blocks in enhancers were mapped by using the motif extraction algorithm MEME (29). The orthologous enhancer sequences and the distribution of conserved motifs are available from D.P. upon request.

Calculation of the Feature Scores. For a binding motif alignment containing n columns and m rows, we calculated the match score Mw for a given word w by using the standard position weight matrix (PWM) score (or matrix score) (30).

[Equation 1: PWM match score Mw]

The score depends on two frequencies: the frequency of the ith character α of the word w (α ∈ {A, C, G, T}) in column i of the binding motif alignment, and the frequency of α in the Drosophila genome. The pseudocount parameter is a (a = 1). The pseudocount prevents a zero value under the logarithm (when a character is absent from a column of the alignment) and provides additional control over ambiguous data sets, such as binding motif alignments containing a low number of sites (small m).
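To make the scoring step concrete, here is a minimal sketch of a pseudocount-smoothed log-odds PWM score. The exact smoothing formula is an assumption on our part (one common form, in which the pseudocount is distributed according to the background frequencies), and the toy alignment, the uniform background, and the function name `pwm_score` are purely illustrative.

```python
import math

def pwm_score(word, alignment, background, a=1.0):
    """Log-odds match score of `word` against a motif alignment.

    Assumed form (not taken from the source): in each column i the
    smoothed frequency is (count + a * background) / (m + a), and the
    score is the sum over columns of log2(smoothed / background).
    """
    m = len(alignment)        # number of aligned sites (rows)
    n = len(alignment[0])     # motif width (columns)
    if len(word) != n:
        raise ValueError("word length must equal motif width")
    score = 0.0
    for i, base in enumerate(word):
        count = sum(1 for site in alignment if site[i] == base)
        smoothed = (count + a * background[base]) / (m + a)
        score += math.log2(smoothed / background[base])
    return score

# Toy alignment and uniform background, both hypothetical.
sites = ["GGAATTTCC", "GGAAATTCC", "GGGATTTCC", "GGAATTCCC"]
bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
best = pwm_score("GGAATTTCC", sites, bg)
poor = pwm_score("TTTTTTTTT", sites, bg)
```

With a = 1, a base that never occurs in a column still receives a small nonzero frequency, so the logarithm stays finite even for a small alignment.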

Cutoff values for the matrix match score M were set by maximizing a ratio that reflects the fraction of matches coinciding with known, experimentally verified sites:

[Equation 2: cutoff-selection ratio expressed in terms of TH, FP, FN, r, and t]

Here, TH, FP, and FN are the numbers of true hits, false-positives, and false-negatives, respectively; r is the total number of known sites; and t is a pseudocount that reflects a penalty for the high FN rate (t = r). In the case of overlapping matches scoring above the cutoff, a local maximum within 3 bases was considered a match.
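The handling of overlapping matches can be sketched as a greedy selection of local maxima. This is one possible reading of the rule, under two stated assumptions: "above the cutoff" means strictly greater than the cutoff, and "within 3 bases" means start positions differing by 3 or fewer. The function name `pick_matches` and the score list are hypothetical.

```python
def pick_matches(scores, cutoff, window=3):
    """Keep only the best-scoring match in each neighborhood.

    `scores[pos]` is the PWM score of the word starting at `pos`.
    Candidates above `cutoff` are visited from best to worst, and a
    candidate is kept only if no already-kept match lies within
    `window` positions of it.
    """
    candidates = [(s, pos) for pos, s in enumerate(scores) if s > cutoff]
    candidates.sort(key=lambda c: (-c[0], c[1]))  # best score first
    kept = []
    for s, pos in candidates:
        if all(abs(pos - q) > window for q, _ in kept):
            kept.append((pos, s))
    return sorted(kept)

# Hypothetical per-position scores: a cluster at 1-3 and a lone hit at 7.
scores = [0, 5.0, 7.2, 6.1, 0, 0, 0, 8.0, 0]
matches = pick_matches(scores, cutoff=4.0)
```

Here the three overlapping candidates at positions 1 to 3 collapse to the single best one at position 2, while the isolated match at position 7 is kept.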

We calculated five feature scores, S, for each enhancer sequence: (i) the total number of matches above the matrix cutoff M0, (ii) the sum of the match scores, (iii) the average score of the matches, (iv) the score of the best match, and (v) the match density score. For a sequence of length L containing N matches above cutoff M0, the match density was approximated by a Poisson distribution (25, 31),

[Equation 3: Poisson match density for N matches in a sequence of length L]

where p is the match probability corresponding to the selected cutoff M0, calculated as the fraction of D. melanogaster genome positions producing a match score higher than M0.
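The five features can be computed together, as in the sketch below. The density term is an assumption: it uses the plain Poisson probability of observing exactly N matches with expected count pL, which may differ from the exact form of the match density score. The function name, the dictionary keys, and the four match scores are hypothetical. The length (299 bases), the number of sites (four), and the match probability (8.5 × 10⁻⁴) are values quoted elsewhere in the text, combined here only for illustration.

```python
import math

def feature_scores(match_scores, seq_len, p):
    """Five per-enhancer features from the match scores above the cutoff.

    The density is the Poisson probability of exactly N matches in a
    sequence of length `seq_len`, with expected count p * seq_len
    (an assumed form).
    """
    n = len(match_scores)
    expected = p * seq_len
    poisson = math.exp(-expected) * expected ** n / math.factorial(n)
    return {
        "n_matches": n,
        "sum": sum(match_scores),
        "average": sum(match_scores) / n if n else 0.0,
        "best": max(match_scores) if n else 0.0,
        "poisson_prob": poisson,
    }

# Hypothetical scores for four matches in a 299-base sequence.
feats = feature_scores([7.0, 8.0, 9.0, 8.0], seq_len=299, p=8.5e-4)
```

A small Poisson probability indicates that the matches are denser than expected by chance at the chosen cutoff.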

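Statistical Tests. To determine whether the observed differences in the mean feature scores $\bar{S}_j$ are significant among the J functional enhancer groups comprising N samples in total, we performed ANOVA and calculated F-ratios in the standard one-way form:

$$F = \frac{\sum_{j=1}^{J} n_j\,(\bar{S}_j - \bar{S})^2 \,/\, (J-1)}{\sum_{j=1}^{J}\sum_{i=1}^{n_j} (S_{ij} - \bar{S}_j)^2 \,/\, (N-J)} \qquad [4]$$

where $n_j$ is the number of samples in group j, $S_{ij}$ is the score of sample i in group j, and $\bar{S}$ is the mean over all N samples.

The corresponding p values p(F, (J–1), (N–J)) were calculated by using publicly available software (www.physics.csbsju.edu/stats). Correlation between various data sets (r, correlation coefficient or cc) was calculated by using the standard Pearson formula.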
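The F-ratio itself is simple to compute from raw scores. The sketch below implements the standard one-way ANOVA ratio of between-group to within-group mean squares. The function name `anova_f` and the toy groups are ours, and the p value lookup is omitted because it needs the F distribution.

```python
def anova_f(groups):
    """One-way ANOVA F-ratio for a list of groups of scores."""
    n_total = sum(len(g) for g in groups)
    j = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-group sum of squares around each group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (j - 1)) / (ss_within / (n_total - j))

# Toy data: three groups of three scores each (illustrative only).
f_ratio = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

The degrees of freedom passed to the p value calculation are J − 1 and N − J, here 2 and 6.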

To retrieve the most frequent site combinations, distances were calculated between all matches for a single motif or for two motifs within a range of 300 bp. The resulting distance histogram was smoothed over three data points, and the highest observed values were used to construct the filters.
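The distance analysis can be sketched as follows for two motifs. We assume that "smoothed by three data points" means a centered three-point moving average and that distances are measured between match start positions; neither detail is specified in the text. The function names and positions are hypothetical.

```python
from collections import Counter

def distance_histogram(pos_a, pos_b, max_dist=300):
    """Count pairwise distances (1..max_dist) between two sets of matches."""
    counts = Counter()
    for a in pos_a:
        for b in pos_b:
            d = abs(a - b)
            if 0 < d <= max_dist:
                counts[d] += 1
    return [counts.get(d, 0) for d in range(max_dist + 1)]

def smooth3(hist):
    """Centered three-point moving average (shorter window at the ends)."""
    out = []
    for i in range(len(hist)):
        window = hist[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out

# Hypothetical match positions for two motifs in one sequence.
hist = distance_histogram([100, 200], [114, 214, 253])
smoothed = smooth3(hist)
peak = max(range(len(smoothed)), key=smoothed.__getitem__)
```

Peaks in the smoothed histogram then indicate candidate spacings for composite elements.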

Results

Orthologous enhancers were identified in the four most evolutionarily divergent species that have been sequenced to date: D. melanogaster, D. pseudoobscura, D. virilis, and D. mojavensis. The selected data set contains two paired clades, each composed of lineages that are roughly equidistant from a common ancestor (Fig. 1). This sampling should minimize possible scoring bias toward any single clade (see below).

Fig. 1.


Pattern of conservation. (A–E) Multiple alignments of the most conserved blocks found in the Dorsal target enhancers. Seventy-seven bases of the m8 enhancer (B) were found to be identical in all seven Drosophila species. (F) Phylogenetic relationships between the Drosophilids analyzed in this study. Selected species are marked by rectangles.

The initial set of 18 Dorsal target enhancers was expanded to create a final data set of 65.5 kb encompassing 72 enhancer sequences, consisting of 24 sequences (six enhancers from each of the four Drosophila species) for each of the three major Dorsal gradient patterning thresholds. Despite the evolutionary divergence of the four Drosophilids under study, some of the enhancers contain extended blocks of near perfect conservation. Nearly 20% of the total enhancer sequences were found within these blocks (see Fig. 1; additional information is available from D.P. upon request).

Only half of all putative Dorsal binding sites (122 of 233, match probability pm = 8.5 × 10⁻⁴) reside within the extended blocks of conserved sequences. However, 80% (45 of 56) of the optimal (pm = 2.4 × 10⁻⁵) Dorsal sites (GGAATTTCC) are contained in these blocks. There is a similar enrichment of optimal Twist sites within the extended blocks of conservation. Only ≈25% of all Twist sites reside within the blocks (pm = 1.4 × 10⁻³), but they represent 42% of the optimal sites (CACATGT; pm = 6.1 × 10⁻⁵). These observations are consistent with previous findings that binding sites undergo rapid evolutionary changes (4, 5, 7), although optimal sites might be better conserved than putative low-affinity sites.

Scoring Enhancers Based on the Quality of Binding Sites. The number of binding motifs has been used to predict gene expression levels (32, 33). These models are based on the additive contribution of each binding motif (linear regression models) and have been particularly useful for predicting the levels of gene expression in the yeast cell cycle (34, 35). However, metazoans possess more sophisticated mechanisms of transcriptional control. The spatial and temporal patterns of developmental gene expression depend on the interaction of multiple transcriptional activators and repressors with complex enhancers, which often contain multiple binding sites with varying affinities (8, 36, 37). Consequently, the number of binding motifs might not be the best scoring feature to model gene expression.

The following features were used to measure the Dorsal and Twist binding sites in the 72 target enhancers: the total number of matches above a minimum “cutoff” value in a sequence (see Methods), the sum of the match scores, the average match score, the best match score, and the relative match density (cluster density). Unfortunately, the enhancers comprising the data set had a 10-fold variation in sequence length, from 300 bp to 3 kb. This variation complicated the analysis because match density strongly correlates with enhancer size (correlation up to 0.5; see Methods). Although cluster density has been used to identify enhancers in Drosophila and other genomes (17, 25, 38, 39), it does not appear to be the best feature for scoring enhancer sequences and modeling threshold readouts of the Dorsal gradient. There is no positive correlation between enhancer size and the number of sites or the sum of all motif scores (above the cutoff). For example, sna is regulated by the largest enhancer (2,900 bases) but contains only three Dorsal sites above the cutoff (for more information, contact D.P.). One of the smallest enhancers (299 bases) regulates rho; it contains four sites. Higher cutoff values help overcome possible scoring errors due to size differences, which sometimes arise from ambiguities in the definition of enhancer borders.

Fitting Expression Types with Single Motif Models. Each enhancer in the data set was assigned a score based on matches to Dorsal and Twist PWMs by using the four methods described above: number of matches, the sum of all matches, the average match score, and the best score. The scores obtained by using each method were then compared with the spatial expression types (see Introduction). ANOVA was used to determine the statistical significance of the differences in mean scores among the functional groups (types 1, 2, and 3). According to a gradient threshold model, we expected to observe significant differences in the mean Dorsal scores among the three groups, with a progressive increase in the mean scores (motif quality) from type 1 to type 3 enhancers. Indeed, the mean scores for type 1 enhancers are consistently lower than those seen for types 2 or 3. These scores represent the average match quality (overall pattern goodness) and the best match score (score of the site with the closest match to the binding matrix). By both criteria, type 1 enhancers contain the poorest Dorsal binding motifs (pA < 0.0001; see Fig. 2 and Table 1, data sets D and E). In contrast, type 1 enhancers are not discriminated from the others when scoring is based on the number of Dorsal motifs or the sum of all scores (Table 1, data set C).

Fig. 2.


Distribution of feature scores for Dorsal and Twist motifs. (A and B) Scatter plots show the distribution of the enhancer scores (y axis) for the Dorsal (A) and Twist (B) motifs. The Roman numerals beneath the plots indicate each of the three major patterning thresholds. For example, the htl and sna enhancers are type 1 enhancers that are activated only by high levels of the Dorsal gradient. The sog and zen enhancers are type 3 enhancers that are regulated by low levels of the gradient. Scores were calculated as an average score by using all of the matches found in all four orthologous sequences for each enhancer. (C–J) Distribution of enhancer scores using different criteria (see below). The bottom and top marks correspond to minimal and maximal score values, respectively; the score range covered by the box contains 50% of all data points (the second and third quartiles of distribution). Horizontal bar shows the median of the distribution. (C–E) Box plots show comparison of different features by using the Dorsal motif as follows: total number of matches (C), the average PWM score (see Methods) calculated for each enhancer (D), and PWM score of the best match (E). The average PWM score and the score of the best match display the best separation of the type 1 enhancers from types 2 and 3. (F) Distribution of average PWM scores for the Twist motif. (G–J) Distribution of the average match scores for the Dorsal motif within the following individual species: D. melanogaster (G), D. pseudoobscura (H), D. virilis (I), and D. mojavensis (J). In the most distant species, D. mojavensis, Dorsal scores progressively improve from the type 1 to 3 enhancers.

Table 1. Statistical evaluation of the mean scores

Data set*   N sequences    Motif    Feature        Type   Mean    σ       F       P
A           18 (×4)        Dorsal   Avg. match                            6.37    0.010
                                                   I      6.57    0.39
                                                   II     7.49    0.47
                                                   III    7.59    0.72
B           18 (×4)        Twist    Avg. match                            3.13    0.073
                                                   I      5.88    0.54
                                                   II     6.42    0.30
                                                   III    5.91    0.38
C           72             Dorsal   No. matches                           3.14    0.050
                                                   I      3.83    3.10
                                                   II     5.58    2.55
                                                   III    4.13    2.01
C           72             Dorsal   Sum of match                          3.44    0.038
                                                   I      20.7    17.5
                                                   II     30.8    13.1
                                                   III    22.3    11.5
D           72             Dorsal   Avg. match                            20.06   <0.0001
                                                   I      6.42    0.86
                                                   II     7.78    0.94
                                                   III    7.99    1.01
E           72             Dorsal   Best match                            15.56   <0.0001
                                                   I      7.24    1.53
                                                   II     9.05    1.41
                                                   III    9.44    1.44
F           72             Twist    Avg. match                            6.47    0.0027
                                                   I      5.90    0.69
                                                   II     6.49    0.49
                                                   III    5.87    0.80
G           18 (D. mel)    Dorsal   Avg. match                            3.13    0.073
                                                   I      6.87    0.76
                                                   II     8.10    0.28
                                                   III    7.79    1.31
H           18 (D. pse)    Dorsal   Avg. match                            4.55    0.029
                                                   I      6.45    0.87
                                                   II     7.64    0.65
                                                   III    7.58    0.79
I           18 (D. vir)    Dorsal   Avg. match                            10.23   0.0016
                                                   I      5.96    0.77
                                                   II     7.91    1.27
                                                   III    8.46    0.92
J           18 (D. moj)    Dorsal   Avg. match                            3.95    0.042
                                                   I      6.39    0.99
                                                   II     7.48    1.29
                                                   III    8.14    0.97

*Roman numerals I–III indicate expression types.

D. mel, D. melanogaster; D. pse, D. pseudoobscura; D. vir, D. virilis; D. moj, D. mojavensis.

Avg. match, average match score; No. matches, number of matches; Sum of match, sum of match scores; Best match, best match score.
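As a rough consistency check on Table 1, the F-ratio for data set D can be recomputed from the reported group means and standard deviations alone. The sketch below assumes 24 sequences per expression type (as described for the 72-sequence data set) and treats σ as the sample standard deviation. Both assumptions, and the function name `f_from_summary`, are ours. Under these assumptions the result is about 19.8, close to the tabulated 20.06. The small gap is plausibly due to rounding of the tabulated means and σ values.

```python
def f_from_summary(means, sds, n_per_group):
    """One-way ANOVA F-ratio from group means and sample SDs.

    Valid only for equal group sizes: the within-group mean square is
    then the plain average of the group variances.
    """
    j = len(means)
    grand_mean = sum(means) / j
    ms_between = (n_per_group
                  * sum((m - grand_mean) ** 2 for m in means) / (j - 1))
    ms_within = sum(s ** 2 for s in sds) / j
    return ms_between / ms_within

# Data set D of Table 1: Dorsal average match score, types I, II, III.
f_d = f_from_summary([6.42, 7.78, 7.99], [0.86, 0.94, 1.01], 24)
```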

In the preceding analysis, enhancer scores were derived from the average of all four Drosophilids. Similar analyses were performed with individual Drosophila species (Fig. 2 and Table 1, data sets G–J). The Dorsal pattern quality is progressively improved from the type 1 to 3 enhancers in the most distant species, D. virilis and D. mojavensis (Fig. 2 and Table 1, data set J). The observed variation between different data sets (species) is likely a result of errors present in the Dorsal consensus sequences, which are based on sites from D. melanogaster where the enhancers were first characterized (see Discussion). In general, statistical tests have shown agreement between the quality of the best Dorsal binding motifs and the type of target enhancer. For example, type 3 enhancers tend to have the best Dorsal sites, and they are regulated by the lowest levels of the Dorsal gradient (Fig. 2 and Table 1, data set A). However, the agreement is not perfect; the type 3 tld enhancer has a much lower score (6.21) than the expected value (7.59; see Table 1 and Discussion).

Twist binding motifs also were examined in the different enhancers. The twist gene contains a type 1 enhancer and is selectively transcribed in ventral regions (presumptive mesoderm) of early embryos. Despite this localized transcription, the encoded Twist basic helix–loop–helix protein diffuses dorsally, into ventral regions of the neurogenic ectoderm (40), and thereby forms a steep gradient near the boundary of the mesoderm and neurogenic ectoderm. According to a gradient threshold model (see above), we expected to find better Twist motif patterns in type 2 than type 1 enhancers, because there are lower levels of Twist protein in the regions where the type 2 enhancers are active. This hypothesis was found to be correct (see Fig. 2 and Table 1, data sets B and F). Moreover, the type 2 enhancers display better Twist scores than those seen for type 3. The expectation for type 3 was not clear because the Twist gradient does not appear to extend into dorsal regions of the neurogenic ectoderm where these enhancers are active. Type 3 enhancers display a broad variation in the Twist scores with the mean value close to that of the type 1 group.

The analysis of the Twist scores indicates that only half of all type 1 enhancers contain perfect matches to the Twist consensus sequence. However, the enhancers lacking the perfect Twist sites tend to contain higher quality Dorsal sites, and, conversely, type 1 enhancers with good matches to the Twist consensus sequence tend to contain poorer Dorsal sites (see below).

Alternative Modes of Dorsal and Twist Action. Dorsal and Twist activate many common target genes in a synergistic fashion (14, 15). It is conceivable that this synergy depends on linked Dorsal and Twist binding sites that foster direct protein–protein interactions. To investigate this possibility, we determined the most frequent distances and orientations of Dorsal and Twist motifs in the data set (24, 41).

The two motifs are most frequently separated by an average of 14, 20, and 53 bp (Fig. 3). Just over two-thirds of the linked sites are seen in type 2 enhancers, even though these enhancers represent only one-third of the complete data set. Type 2 enhancers contain 9 of the 13 linked sites displaying the 14-bp arrangement, 12 of the 19 linked sites separated by 20 bp, and 14 of the 19 linked sites separated by 53 bp (see Fig. 3 B). Most of the linked sites display a preferred orientation, which we designate as Twist pointing “toward” Dorsal. This orientation might foster preferential Twist–Dorsal protein–protein interactions. Dorsal probably binds DNA as a homodimer, but Twist might function as a heterodimer with other basic helix–loop–helix subunits such as Daughterless (20). A preferred orientation of Twist might ensure that the correct basic helix–loop–helix subunit makes contact with Dorsal at the neighboring site.

Fig. 3.


Extraction of the most frequent site arrangements. (A) Histogram of distance occurrences for the combination of Dorsal and Twist motifs. Three peaks correspond to the most frequently observed distance ranges (shown on the right). (B) Alignment of sequences corresponding to the most frequent Dorsal–Twist combinations separated by 13–15 bp (peak 1 in A).

The linked, oriented Twist and Dorsal sites can be thought of as “composite” elements that produce optimal synergy between the two transcription factors. To determine whether composite elements are conserved in evolution, we analyzed orthologous genes in the distant A. gambiae genome. We focused on the type 2 genes rho and vn because they contain conserved 14-bp Twist–Dorsal composite elements (Fig. 3). There is at least one match (match probability, pm = 2.5 × 10⁻⁵) to this composite element in each of the Anopheles orthologs (see Fig. 4, which is published as supporting information on the PNAS web site). Furthermore, there are two copies of the composite element within the large intron 1 of the Anopheles vn gene, which coincide with highly significant Dorsal binding site clusters (Fig. 4D), suggesting that one or both sequences are likely to correspond to type 2 Dorsal target enhancers. The composite elements found in the Anopheles rho and vnd loci do not map near Dorsal clusters, so it is less clear whether they identify type 2 enhancer elements.

We also investigated the possibility that Dorsal and Twist act in an additive/compensatory fashion by comparing (correlating) the enhancer scores for the two motifs. If enhancers with poor Dorsal sites contain optimal Twist sites, one would see a negative correlation between the motif scores. No correlation (cc = 0.04 and 0.00, respectively) was detected in either type 2 or 3 enhancers. However, a significant negative correlation (–0.41) was seen for the type 1 group. Nearly half of these enhancers displayed a strong negative dependence between the Dorsal and Twist scores. This observation suggests that Dorsal and Twist might compensate for one another in ventral regions of early embryos where there are peak levels of both activators.
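The compensation test reduces to a Pearson correlation between per-enhancer Dorsal and Twist scores within one expression type. A minimal sketch follows. The function name `pearson` and the score lists are hypothetical and are not values from the data set.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Made-up scores for four enhancers of one type (illustration only).
dorsal = [6.0, 6.5, 7.0, 7.5]
twist = [6.8, 6.1, 6.3, 5.4]
cc = pearson(dorsal, twist)
```

A clearly negative coefficient, as in this toy case, is the pattern expected if poor sites for one factor tend to be offset by good sites for the other.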

Discussion

We have presented evidence that the quality and arrangement of Dorsal and Twist binding sites are critical determinants of type 1 and 2 spatial readouts of the Dorsal nuclear gradient in the early Drosophila embryo. The analysis of orthologous type 2 enhancers in divergent Drosophilids suggests that linked Twist and Dorsal sites might constitute composite elements fostering optimal transcriptional synergy. The negative dependence between Dorsal and Twist scores in type 1 enhancers suggests that the evolutionary loss of high-affinity Dorsal binding sites can be compensated by an increase in the quality of Twist sites and vice versa. Thus, Dorsal and Twist synergistically activate type 2 enhancers, but regulate type 1 enhancers in an additive/compensatory fashion.

Binding Motif Quality Is a Critical Determinant of the Dorsal Gradient Response. Four different scoring methods were used to assess the quality, or “goodness,” of Dorsal and Twist binding motifs in each enhancer as follows: number of sites, sum of site scores, average site score, and score of the best site. Each of the putative binding sites in each enhancer was given a score based on its match to the weighted matrix (see Methods). Based on the match scores, each enhancer was assigned a total score that we compared with spatial expression. Comparison of the different scoring methods showed that the average score and the score of the best site provide the most accurate fit between the type of expression pattern and motif distribution. Apparently these two features (predictor variables) most adequately reflect motif pattern goodness and appear to be relatively insensitive to parameter variations and scoring errors.

According to a gradient response hypothesis, the goodness of the Dorsal binding sites and the enhancer scores should progressively improve from type 1 to 3 enhancers. Indeed, six of the eight enhancers with the lowest scores are type 1 enhancers that are activated by high levels of the Dorsal gradient. Four of the six enhancers with “intermediate” scores are type 2 enhancers that are activated by intermediate levels of the Dorsal gradient. And, finally, five of the eight enhancers with the best scores are type 3 enhancers that are regulated by the lowest levels of the Dorsal gradient. These findings suggest that the overall quality of Dorsal binding motifs is one of the prime determinants for directing different readouts of the Dorsal gradient. The importance of Dorsal binding affinities was suggested by earlier studies on the twist and zen enhancers, which are regulated by high and low levels of the gradient, respectively (14, 20). However, the current quantitative analysis includes a larger sampling of Dorsal target enhancers and a wider range of scoring techniques. It is now possible to conclude that the basic principles inferred from the manipulation of just a few enhancers are likely to be general and apply to most of the enhancers from our data set.

Comparison of Dorsal target enhancers among divergent Drosophilids provides further support for the gradient response hypothesis. In D. melanogaster and D. pseudoobscura, the mean scores fail to discriminate type 2 and 3 enhancers (Fig. 2 G and H). However, discrimination is observed in distant Drosophilids and is particularly striking in D. mojavensis (see Fig. 2 J). One explanation for the better fit is that the Dorsal recognition matrix used to evaluate the binding sites is biased toward a subset of enhancers from D. melanogaster where the original footprint data were generated (42, 43). According to this view, divergent Drosophilids contribute unbiased new data. Consequently, scoring errors caused by the bias present in D. melanogaster sequences might be diminished.

Dorsal binding affinities (in the context of overall pattern goodness) are important, but clearly are not the sole determinant of the threshold response. For example, the type 3 tld enhancer is repressed by low levels of the Dorsal gradient but, nonetheless, possesses a very low score, similar to those seen for the type 1 htl and sna enhancers, which are activated by the highest levels of the Dorsal gradient in the ventral mesoderm (see Fig. 2 A). It has been proposed that repression is achieved through the concerted action of Dorsal and “corepressors” such as Cut, Dead Ringer, and Capicua (44–47). Perhaps the tld enhancer contains optimal corepressor sites, which foster cooperative occupancy of Dorsal at neighboring, low-affinity sites. In general, linear readouts of the Dorsal gradient would be obscured by extensive cooperative binding interactions, so we regard this provisional model for the tld enhancer as an exception.

The analysis of ≈20 Bicoid target enhancers suggests that there is no clear correlation between the quality of Bicoid binding motifs and the spatial limits of gene expression across the anterior–posterior axis of early embryos (51). There are several possible explanations for the apparently distinct modes of Dorsal and Bicoid gradient thresholds. In particular, Dorsal dimers bound to neighboring sites might not interact in a cooperative fashion, whereas Bicoid monomers might cooperatively occupy adjacent sites (22–24). Such cooperativity would diminish the correlation between the quality of individual sites and the pattern of gene expression. It appears that there are multiple mechanisms for the gradient threshold response in different systems.

Role of Twist in Generating Differential Patterns of Expression. The twist gene is an immediate early target of the Dorsal gradient and is directly activated by high levels of Dorsal in the ventral mesoderm (14). The encoded Twist protein forms a steep gradient across the presumptive mesoderm–neuroectoderm border in the early embryo. Low levels of the Twist protein can be detected in nuclei that map four to five cells beyond the limit of the mesoderm (40, 48).

To determine whether the quality of Twist binding sites is also important for differential patterns of gene expression, scores were assigned to each of the 72 enhancers based on matches to the Twist PWM. We found that six of the eight enhancers with the highest scores (see Fig. 2B) are likely targets of the Twist gradient. Indeed, Twist binding sites have been shown to be essential for the expression of the type 2 enhancers rho, vnd, and sim (9). These three enhancers possess higher Twist scores than does the type 1 sna enhancer, which is also directly activated by Twist (49). Overall, the analysis of Twist sites clearly discriminates type 2 enhancers from the other classes (Fig. 2 B and F). Not only do type 2 enhancers contain the best Twist motifs, but they also exhibit the smallest deviations in overall scores (σ = 0.49; see Table 1, data set F). Type 3 enhancers produced a broad range of Twist scores, typically lower than those seen for type 2 enhancers. There is currently no evidence that Twist is required for the expression of any type 3 enhancers.

Synergistic and Compensatory Modes of Dorsal and Twist Activation. The present analysis suggests that Dorsal and Twist work in an additive, or compensatory, fashion on type 1 enhancers but function synergistically on type 2 enhancers. Type 1 enhancers sometimes exhibit compensatory changes in Dorsal and Twist binding motifs. For example, htl enhancers possess the poorest quality Dorsal motifs but have the highest quality Twist sites among all type 1 enhancers. Conversely, twist enhancers have poor Twist motifs but good Dorsal sites. These observations suggest that the Dorsal and Twist proteins do not always require a direct interaction with one another. Instead, the linear combination (33) of the scores for the two motifs provides a good predictive model for the type 1 response.
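The compensatory behavior described for type 1 enhancers follows directly from an additive model. The sketch below assumes a simple weighted sum; the weights, the enhancer labels, and the score values are hypothetical and are not measurements from this study. It shows only that a poor Dorsal score paired with a good Twist score can yield the same combined score as the reverse arrangement.

```python
def combined_score(dorsal_score, twist_score, w_dorsal=1.0, w_twist=1.0):
    """Linear combination of two motif scores (weights are assumptions)."""
    return w_dorsal * dorsal_score + w_twist * twist_score


# Hypothetical (Dorsal score, Twist score) pairs, for illustration only.
toy_enhancers = {
    "weak Dorsal, strong Twist": (2.0, 8.0),
    "strong Dorsal, weak Twist": (8.0, 2.0),
}

for label, (dorsal, twist) in toy_enhancers.items():
    print(label, combined_score(dorsal, twist))
```

Under this model neither protein needs to contact the other, which is consistent with the absence of a fixed site arrangement in type 1 enhancers.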

Dorsal–Twist synergy is suggested by the fixed arrangement of the two binding motifs in type 2 enhancers. All 72 enhancers in the data set were surveyed for the most frequent distances and orientations between Dorsal and Twist binding sites (Fig. 3A). Similar Dorsal–Twist arrangements are shared between two enhancers, rho and vn. These enhancers regulate coordinate genes engaged in a common process: the Rhomboid membrane protease might process the Vein EGF ligand to pattern ventral regions of the neurogenic ectoderm (50). Both enhancers contain the 14-bp spacing of linked Dorsal and Twist sites. A similar, but not identical, arrangement of Dorsal and Twist was detected in different vnd enhancers (20-bp spacing). The rho, vn, and vnd enhancers contain nearly half of the linked sites seen for all 72 enhancers. They direct very similar patterns of gene expression and are engaged in the specification and patterning of ventral regions of the neurogenic ectoderm. The composite Twist–Dorsal element shared in the vn and rho enhancers is conserved in the mosquito, A. gambiae, and the match seen in the vn locus coincides with a significant cluster of Dorsal sites (see Fig. 4D).
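A survey of this kind can be sketched as a tally of (distance, orientation) pairs over all Dorsal–Twist site combinations within each enhancer. The example below is illustrative only: the data layout, the function names, the `max_distance` cutoff, and the site coordinates are assumptions, and the way distance is measured (here, the signed difference between site start positions) may differ from the procedure actually used.

```python
from collections import Counter
from itertools import product


def linked_arrangements(dorsal_sites, twist_sites, max_distance=30):
    """List (distance, orientation) for nearby Dorsal-Twist site pairs.

    Each site is a (position, strand) tuple with strand '+' or '-'.
    Distance is signed: positive when the Twist site lies downstream.
    """
    arrangements = []
    for (d_pos, d_strand), (t_pos, t_strand) in product(dorsal_sites, twist_sites):
        distance = t_pos - d_pos
        if distance != 0 and abs(distance) <= max_distance:
            arrangements.append((distance, d_strand + t_strand))
    return arrangements


def survey(enhancers, max_distance=30):
    """Count how often each arrangement occurs across a set of enhancers."""
    counts = Counter()
    for sites in enhancers.values():
        counts.update(
            linked_arrangements(sites["dorsal"], sites["twist"], max_distance)
        )
    return counts


# Toy coordinates, invented for illustration; not data from the paper.
TOY_ENHANCERS = {
    "enhancer_1": {"dorsal": [(100, "+")], "twist": [(114, "+"), (300, "-")]},
    "enhancer_2": {"dorsal": [(50, "+"), (400, "-")], "twist": [(64, "+")]},
    "enhancer_3": {"dorsal": [(10, "-")], "twist": [(30, "+")]},
}

for arrangement, n in survey(TOY_ENHANCERS).most_common():
    print(arrangement, n)
```

Arrangements that recur across enhancers far more often than expected by chance are the candidates for composite elements.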

Conclusions

The preceding analyses are entirely consistent with a gradient threshold model for the differential regulation of Dorsal target genes. In vivo occupancy of Dorsal target enhancers is determined by the overall quality of the Dorsal binding sites. Dorsal and Twist function in a concerted fashion to regulate more than half of all target enhancers. However, there is a fundamental difference in the behavior of type 1 and 2 enhancers. The two activators function in a strictly additive fashion on type 1 enhancers, such that low-quality Dorsal sites can be compensated by high-quality Twist sites and vice versa. In contrast, the two proteins are likely to function in a synergistic fashion on type 2 enhancers, which contain tightly organized binding sites that sometimes constitute composite elements. The fact that many of the composite elements contain a specific orientation of Twist and Dorsal sites, along with their strong evolutionary conservation in the Anopheles genome, suggests that they foster optimal synergy between the Dorsal and Twist proteins. Future studies are needed to determine whether distinct composite elements underlie the type 3 expression pattern.

Supplementary Material

Supporting Figure
pnas_102_14_4966__.html (1.5KB, html)

Acknowledgments

We thank A. Stathopoulos for sharing the unpublished sequence of the ind enhancer, R. Zinzen for the unpublished sequence of the m8 enhancer, and M. Ronshaugen for helpful discussions and critical remarks. The perl scripts are available from D.P. on request. This work was supported by National Institutes of Health Grant GM 46638 and the Moore Foundation.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: DV, dorsal–ventral; PWM, position-weighted matrix.

References

Supplementary Materials

Supporting Figure
pnas_102_14_4966__.html (1.5KB, html)
pnas_102_14_4966__1.pdf (128.1KB, pdf)
