Proceedings of the National Academy of Sciences of the United States of America. 2010 Feb 1;107(7):2854–2859. doi: 10.1073/pnas.0915066107

Predicting strength and function for promoters of the Escherichia coli alternative sigma factor, σE

Virgil A Rhodius 1,1, Vivek K Mutalik 1,2
PMCID: PMC2840333  PMID: 20133665

Abstract

Sequenced bacterial genomes provide a wealth of information but little understanding of transcriptional regulatory circuits largely because accurate prediction of promoters is difficult. We examined two important issues for accurate promoter prediction: (1) the ability to predict promoter strength and (2) the sequence properties that distinguish between active and weak/inactive promoters. We addressed promoter prediction using natural core promoters recognized by the well-studied alternative sigma factor, Escherichia coli σE, as a representative of group 4 σs, the largest σ group. To evaluate the contribution of sequence to promoter strength and function, we used modular position weight matrix models comprised of each promoter motif and a penalty score for suboptimal motif location. We find that a combination of select modules is moderately predictive of promoter strength and that imposing minimal motif scores distinguished active from weak/inactive promoters. The combined -35/-10 score is the most important predictor of activity. Our models also identified key sequence features associated with active promoters. A conserved “AAC” motif in the -35 region is likely to be a general predictor of function for promoters recognized by group 4 σs. These results provide valuable insights into sequences that govern promoter strength, distinguish active and inactive promoters for the first time, and are applicable to both in vivo and in vitro measures of promoter strength.

Keywords: -10 motif, -35 motif, motif, transcription initiation


Promoters are constructed of multiple poorly conserved motifs separated by variable-length spacers, making them difficult to predict accurately (1–3). Here, we examine the ability of models to predict promoter strength (rate of transcription) and to distinguish between functional and weak/nonfunctional promoters as a means of understanding the principles governing construction of promoters. We use promoters recognized by Escherichia coli σE, a group 4 extracytoplasmic function (ECF) σ, as a testbed. Group 4 σs consist of two structurally conserved domains (2 and 4), have control functions ranging from cell envelope morphogenesis to antibiotic resistance, and are the most abundant σs in the bacterial kingdom (4), making it important to be able to predict their promoters in novel organisms. Their promoters require both the -10 and -35 motifs, have few redundant motifs, and contain high information content relative to promoters of housekeeping σs (4–6), making it feasible to determine the contribution of each motif to promoter activity.

Position weight matrix (PWM) models are commonly used in predicting transcription factor binding sites and promoters because they are simple and have a predictive success comparable to that of more complex models (7, 8). Typically, these models identify many real promoters but also make many false predictions, reflecting their poor ability to discriminate between functional promoters and nonfunctional sequences. We previously used PWMs to successfully predict σE promoters across multiple genomes (5). However, a significant fraction of these predictions were weak or inactive under physiological conditions (9) and the most active σE promoters were missed (10–12), indicating significant problems with the model. PWM scores correlate with protein-DNA binding energies, and their assumptions of independence and additivity of the binding energy of each nucleotide position (13–15) have been validated for transcription factor binding sites (16–19). It is unclear how well PWM scores correlate with promoter strength, which is a function not just of DNA binding, but also of DNA melting and promoter escape. An examination of low-complexity promoter libraries based on a few parent σ70 (20–22) or σ28 promoters (23) has yielded conflicting results.

The utility of PWMs to predict promoter strength of high complexity natural promoters has not been examined. Here, we measured the strength of the 60 known σE core promoters (sequences from -35 to +20 relative to the transcription start site) (9) and find that a PWM score comprised of the entire promoter, similar to our previous model (5), is not predictive. However, the use of separate PWMs for each promoter motif enabled us to evaluate their contribution to strength, and the combined score of a subset of motifs gives modest correlation with strength. Importantly, imposing minimal scores for each motif allowed successful discrimination between active and weak/nonactive promoters. Comparison of active and weak/nonactive promoters provides critical insight into the design of active promoters and a rationale for improving the accuracy of PWM-based promoter predictions.

Results

Promoter Strength of Natural Promoters.

Promoter strength was determined in vitro and in vivo. In vitro measurements used single-round transcription assays, which provide a measure of promoter occupancy and reduce the effects of promoter clearance or pausing. Each assay contained a test and control promoter (to permit ready comparison between reactions), each linear promoter fragment comprised of unique promoter sequences from -35 to +20, a common vector sequence upstream, and the efficient rpoC terminator (24) downstream (Fig. S1). The strength of the same promoter sequences was determined in vivo in strains overexpressing σE during exponential growth in M9 minimal medium using fusions to GFP (9). Using a cutoff of 2-fold above background, 37 of the 60 natural promoters were active in vitro and a largely overlapping set of 40 were active in vivo (Fig. 1). Most promoters exhibited similar relative activities in the two assays, but there are also differences. The rpoE promoter is much more active than expected in vivo, whilst several promoters including rpoH, ygiM, and yfeY are much less active in vivo (Fig. 1 and Fig. S2). The two “active promoter” sets (37 in vitro and 40 in vivo) were used to construct separate promoter PWM models.

Fig. 1.


Relative in vitro and in vivo activities of natural promoter libraries. Promoter activities as determined by single-round in vitro transcriptions (Fig. S1; Materials and Methods) or in vivo gfp fluorescent assays after σE overexpression (9). Each bar indicates the average of 3 independent experiments; error bars represent 1 standard deviation. Weak/nonactive promoters are marked: (*) in vitro and in vivo; (#) in vitro only; (^) in vivo only. Outliers in the optimized in vitro model are marked with a star and in the in vivo model with a triangle.

Correlating Promoter Scores with in Vitro Measures of Promoter Strength.

PWMs were constructed for each active promoter motif (Fig. 2) with a penalty term applied for nonoptimal spacer and discriminator lengths (S + D penalty; see Fig. S3). These models were then used to score all active promoters to derive the correlation (R) of each motif score with promoter strength. The -10, discriminator, and start motif PWMs and the S + D penalty score exhibited moderate correlation with promoter strength; the -35 motif was neutral; and the spacer and initial transcribed region (ITR) were slightly negatively correlated (Table 1A). The total promoter score that was calculated by summing all seven modules (six PWMs plus S + D penalty) gave little correlation with promoter strength (R = 0.26; Table 2). Eliminating the negatively correlated spacer and ITR PWMs improved the correlation to R = 0.57 (Table 2; Fig. 3A). This five-module model gave a better correlation than any other combination of modules.
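The module-selection step described above can be illustrated with a minimal Python sketch. It assumes each promoter already has one score per module; the dictionary keys (`sd_penalty`, `m35`, and so on) and all numeric values are hypothetical toy data, not measurements from this study. The sketch sums the chosen modules and computes Pearson's R between the total score and the natural log of strength.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def total_score(promoter, modules):
    """Sum the scores of the selected modules for one promoter."""
    return sum(promoter[m] for m in modules)

# Module names are illustrative labels, not identifiers from the paper.
FIVE_MODULES = ("sd_penalty", "m35", "m10", "disc", "start")
ALL_MODULES = FIVE_MODULES + ("spacer", "itr")

# Hypothetical toy promoters: per-module scores plus a measured strength.
promoters = [
    {"sd_penalty": -0.2, "m35": 7.1, "spacer": 1.0, "m10": 5.0,
     "disc": 0.6, "start": 0.9, "itr": 0.5, "strength": 40.0},
    {"sd_penalty": -0.8, "m35": 6.0, "spacer": 1.9, "m10": 3.9,
     "disc": 0.3, "start": 0.7, "itr": 1.8, "strength": 12.0},
    {"sd_penalty": -0.5, "m35": 6.6, "spacer": 0.8, "m10": 4.4,
     "disc": 0.5, "start": 0.8, "itr": 1.1, "strength": 25.0},
    {"sd_penalty": -1.1, "m35": 5.8, "spacer": 1.5, "m10": 3.6,
     "disc": 0.1, "start": 0.6, "itr": 2.0, "strength": 6.0},
]

log_strength = [math.log(p["strength"]) for p in promoters]
for label, modules in (("7 modules", ALL_MODULES), ("5 modules", FIVE_MODULES)):
    scores = [total_score(p, modules) for p in promoters]
    print(label, "R =", round(pearson_r(scores, log_strength), 2))
```

Comparing R across module subsets in this way mirrors the comparison of the seven-module and five-module models, although the toy numbers say nothing about the real data.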

Fig. 2.


Sequence logo of 37 promoters active in vitro. Promoter sequences were aligned with respect to their -35, -10, and start motifs to account for variable motif positions (see Fig. S2). PWMs were constructed using the motifs illustrated.

Table 1.

Properties of motif scores and spacer penalties based on in vitro or in vivo promoter models

                    S + D penalty  -35        Spacer  -10   Disc  Start  ITR        -10 and -35*
A) Correlation (R) of motif scores with promoter strength for active promoters
In vitro            0.37           0.06       -0.21   0.49  0.22  0.21   -0.12      0.45
In vivo             0.01           0.27       -0.34   0.39  0.37  0.10   0.06       0.60
B) Average active and inactive promoter motif scores and significant difference
In vitro
Active              -0.60          6.5        1.3     4.1   0.44  0.79   1.4        11
Inactive            -0.72          3.0        1.2     3.8   0.07  0.70   -1.2       6.8
p-value (t-test)    0.7            9 × 10^-7  0.9     0.5   0.2   0.6    1 × 10^-5  2 × 10^-7
In vivo
Active              -0.52          6.0        1.6     4.3   0.38  0.76   1.1        10
Inactive            -1.2           3.8        0.75    3.5   0.23  0.77   -0.57      7.3
p-value (t-test)    0.1            2 × 10^-3  0.1     0.1   0.6   0.9    1 × 10^-4  4 × 10^-5
C) Number of inactive promoters scoring below motif cut-off thresholds
In vitro 1 7 1 3 5 15
In vivo 1 2 1 4 6 11

*Combined -10 and -35 motif scores.

Table 2.

Correlation of in vitro and in vivo promoter models with strength

Promoter strength measurements  Model          Correlation (R) between strength and score  Outliers
                                               Init*   Opt    Val
In vitro                        37P 7 modules  0.26
In vitro                        37P 5 modules  0.57    0.73   0.73                         yfeK, yfeY, yfgC, yicJ, c4860
In vivo                         40P 5 modules  0.57    0.77   0.71                         yfeY, yfgC, rpoE, yfiO, yhjJ, ybjW

*Initial fit after summing module scores.

Opt, optimized fit after removal of outliers.

Val, validated promoters fit.

Fig. 3.


Scatter plots of PWM promoter scores and promoter activity. Promoter score fits with promoter strength (in vitro or in vivo) using PWM models based on 37 promoters active in vitro (panels A–D) and 40 promoters active in vivo (panels E and F). Scores are with five-module PWM models (S + D penalty, -35, -10, discriminator, start). Correlation scores (R) for each model with promoter strength are shown. (A) Initial fit of 37 promoter in vitro scoring model. (B) Optimized fit of 37 promoter in vitro model after removing five outliers. (C) 10-fold cross-validation scores of optimized in vitro promoter model. (D) Optimized in vitro promoter model scoring active and weak/nonactive promoters. The trend line shows the fit based on the active promoters. (E) Optimized fit of the 40 promoter in vivo model after removing six outliers, scoring active and weak/nonactive promoters. The trend line and correlation score are based on the active promoters. (F) 10-fold cross-validation scores of optimized in vivo promoter model.

A small minority of promoters may have unusual sequence properties that detract from the model and would likely present as outliers in the correlation of score with promoter strength. Five outliers were identified and removed based on their high residuals and leverage properties on the general fit of the model (see Materials and Methods); this optimized model gave an improved fit of R = 0.73 (Table 2; Fig. 3B). The model was then tested by 10-fold cross-validation to demonstrate that it does not overfit the data (i.e., describe random noise in the data; thereby reducing its predictive utility). Promoters were divided into 10 groups, and a model constructed from 9 groups (training set) was used to score promoters in the 10th group (validation set). The training sets were rebuilt using different combinations of groups, enabling each promoter to be validated by an independent model. This gave a correlation of promoter validation scores with promoter strength of R = 0.73, demonstrating good predictive utility (Fig. 3C). Importantly, neither partial least squares regression (PLSR) (which enables differential weighting of modules to promoter score), nor dinucleotide frequency improved the model (Tables S1, S2).
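The cross-validation procedure described above can be sketched generically in Python. This is an illustration under stated assumptions, not the authors' implementation: `build_model` and `score` are hypothetical callbacks standing in for PWM construction and promoter scoring, and the fold assignment (a seeded shuffle) is an arbitrary choice.

```python
import random

def k_fold_validation_scores(items, build_model, score, k=10, seed=0):
    """Score every item with a model built from the folds that exclude it."""
    order = list(range(len(items)))
    random.Random(seed).shuffle(order)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [order[i::k] for i in range(k)]
    validation = [None] * len(items)
    for held_out in folds:
        held = set(held_out)
        training = [items[i] for i in order if i not in held]
        model = build_model(training)      # training set: the other folds
        for i in held_out:                 # validation set: the held-out fold
            validation[i] = score(model, items[i])
    return validation

# Toy usage with hypothetical stand-ins: the "model" is the training mean and
# the "score" is the deviation of a held-out value from that mean.
values = [float(v) for v in range(20)]
scores = k_fold_validation_scores(
    values,
    build_model=lambda train: sum(train) / len(train),
    score=lambda mean, value: value - mean,
)
print(scores[:3])
```

Each item receives exactly one validation score from a model that never saw it, which is the property that lets the validation correlation be read as a check against overfitting.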

Correlating Promoter Scores with in Vivo Measures of Promoter Strength.

Using the same approach for modeling the 40 active promoters in vivo, the -35, -10, and discriminator PWMs weakly correlate with strength, with only the spacer negatively correlating (Table 1A). Importantly, the same set of five modules (S + D, -35, -10, discriminator, start) provided the best optimized in vivo model, performing similarly to the in vitro model although a slightly different set of outliers was excluded (R = 0.77; Table 2; Fig. 3E). Testing the model with 10-fold cross-validation demonstrated that it does not overfit the data (R = 0.71; Table 2; Fig. 3F). PLSR did not improve the model (Tables S1, S2). We conclude that largely similar PWM models moderately describe promoter strength both in vivo and in vitro.

The optimized in vitro and in vivo models removed a total of nine outliers (with two common to both models; Table 2; Fig. 1). Five outliers (yicJ, yfeK, yhjJ, c4860, and ybjW) have little activity suggesting that they are borderline “functional” promoters, and the remaining four outliers (rpoE, yfeY, yfgC, and yfiO) have discrepant strengths in vitro and in vivo (Fig. 1). These promoters may be subject to additional regulation in vivo, or have kinetic steps that strongly influence their strength under different conditions.

Active and Inactive Promoters Can Be Distinguished by Requiring Minimal Scores for Motifs.

Models that successfully identify promoters also make many false predictions. Our library of active and weak/inactive σE promoters and PWM models provides an ideal opportunity to identify sequence determinants that distinguish active from weak/inactive promoters. The complexity of this task is illustrated by the fact that scoring the inactive promoters with the optimized five-module promoter models results in higher than expected scores that overlap with the active promoters (Fig. 3D, E), suggesting that additional sequence features beyond total promoter score are required to distinguish the two groups.

The active and inactive promoters have similar sequence logos but exhibit key differences that enable their distinction (Fig. 4). The average PWM motif scores of the -35, ITR, and combined -35/-10 motifs are significantly higher in the active promoters compared to the inactive promoters (Table 1B). Additionally, active and inactive promoters have significant differences in binding energy scores at specific positions within the -35 and the ITR for both the in vitro and in vivo datasets and within the -10 for the in vivo dataset (Fig. 4). Together, these observations suggest a minimum motif score requirement for promoter function. Indeed, 18 of the 23 promoters inactive in vitro and 15 of the 20 promoters inactive in vivo had at least one motif below threshold (defined as the lowest score for the corresponding motif in the active promoter set) (Fig. 5). These observations suggest that individual motif scores are more important than total promoter score as predictors of promoter function. Notably, the combined -10/-35 motif scores are below threshold in over 50% of inactive promoters, and the -35, discriminator, and ITR are below threshold in many other inactive promoters (Table 1C). Active and inactive promoters also exhibited significant differential cross-correlations of certain motif scores (Fig. 6). The -35 motif/spacer penalty (S + D penalty) (in vitro and in vivo) and spacer/start (in vitro only) are anticorrelated in active promoters and correlated in inactive promoters, indicating that the combination of good -35 motifs with optimal spacer lengths, and good spacer and start sequences, is not favored in active promoters.
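The threshold rule described above (a motif is below threshold when it scores lower than the lowest score for that motif among active promoters) can be sketched in Python. The motif keys and all scores are hypothetical toy values; the combined score is assumed here to be the simple sum of the -35 and -10 scores.

```python
def with_combined(promoter):
    """Return a copy with a combined -10/-35 score (assumed to be their sum)."""
    scored = dict(promoter)
    scored["m10_m35"] = promoter["m35"] + promoter["m10"]
    return scored

def motif_thresholds(active):
    """Lowest score observed for each motif among the active promoters."""
    return {m: min(p[m] for p in active) for m in active[0]}

def motifs_below_threshold(promoter, thresholds):
    """List the motifs of one promoter that score below the active-set minimum."""
    return [m for m, cutoff in thresholds.items() if promoter[m] < cutoff]

# Hypothetical active set: the weakest -35 and weakest -10 sit in different promoters.
active = [with_combined(p) for p in (
    {"m35": 6.5, "m10": 3.0, "itr": 1.4},
    {"m35": 5.0, "m10": 4.5, "itr": 0.2},
)]
thresholds = motif_thresholds(active)

# Hypothetical candidate: each single motif clears its cutoff, the combination does not.
candidate = with_combined({"m35": 5.2, "m10": 3.1, "itr": 0.5})
print(motifs_below_threshold(candidate, thresholds))  # ['m10_m35']
```

In this toy case the candidate passes every individual cutoff yet fails the combined one, which is the kind of promoter that a combined -10/-35 threshold can flag when the individual thresholds cannot.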

Fig. 4.


Comparison of active and inactive promoter sequences. Sequence logos of (A) 37 active in vitro promoters minus five outliers; (B) 23 inactive in vitro promoters; (C) 40 active in vivo promoters minus six outliers; (D) 20 inactive in vivo promoters. Positions with significantly different scores (p < 0.05 by t-test) between active and inactive promoters are indicated by red circles.

Fig. 5.


Heat maps of promoter module scores clustered against promoter score and promoter strength. The heat maps illustrate the z-score of each module for all promoters based on the in vitro (A) or in vivo (B) optimized promoter models. Module z-scores are calculated from the average and standard deviation of all scores of that module in the active promoters; this enables relative comparison of scores between modules. High and low scores are indicated by red and green, respectively. “-10 & -35” indicates combined scores of the -10 and -35 modules. Promoter score is based on a five-module optimized PWM model. Promoters are rank-ordered by promoter strength in vitro (A) and in vivo (B); active and weak promoters are indicated. The heat map is vertically clustered using Euclidean clustering to indicate the similarity of module scores with each other and with promoter score and strength. Modules of weak promoters scoring below the lowest module score in the active promoters are highlighted with a yellow box.

Fig. 6.


Cross-correlations of module scores of active and inactive promoters. Cross-correlation scores (R) of modules of (A) 37 active in vitro promoters minus five outliers; (B) 23 inactive in vitro promoters; (C) 40 active in vivo promoters minus six outliers; (D) 20 inactive in vivo promoters. Positive to negative correlations are indicated by a sliding red-yellow-green color scale.

Discussion

We present a unique comprehensive analysis of the core promoter requirements of natural promoters. Our analysis focused on promoters recognized by σE, a member of the largest and simplest σ subfamily: the group 4 (ECF) σs. We developed a modular PWM model that was moderately predictive of promoter strength and showed that imposition of minimal motif scores dramatically improved the ability of the model to distinguish active from weak/inactive promoters. These results have significant implications for understanding the structure of promoters. Our results also point to the significant challenges remaining before we are able to predict full-length promoters and promoters recognized by more complex σs with more flexible promoter recognition requirements.

Two features of our approach enabled significant progress over previous modeling attempts. Our use of modular PWMs enabled independent evaluation of the contribution of every feature of the promoter to its strength. As a consequence, we were able to identify the promoter components that best predicted the strength of active promoters. Importantly, the same model predicted strength measured either in vivo or in vitro, demonstrating that the model captured general features contributing to promoter strength, rather than specialized features that reflect the method of measurement. Second, we used a comprehensive set of promoters that spanned the entire range of promoter activity, including many with marginal activity. This enabled us to develop a set of criteria that could distinguish active from inactive promoters both in vivo and in vitro. Importantly, these criteria were largely distinct from those that determined the strength of active promoters (Table 1A, C).

Imposition of minimal scores for each motif distinguished inactive from active promoters. The single most important predictor of promoter inactivity is a combined -10/-35 motif score that is below threshold. This cutoff identified twice as many inactive promoters in vitro and 5 times as many inactive promoters in vivo as the individual -10 and -35 scores. Additionally, the ITR was an important predictor of functionality, even though this feature detracts from the promoter strength model (Table 1A). The five positions in the ITR with significantly different binding energy scores between active and inactive promoters cluster on the same face of the DNA helix (Fig. 4), suggesting they may identify a unique process in transcription.

Our findings of minimal motif score requirements for promoter function correlate well with work on transcription factors where the correlation between binding energy and PWM score of the target binding sites (15) breaks down as sites are increasingly mutated away from consensus. Loss of specific protein-DNA interactions results in an abrupt transition to nonspecific electrostatic interactions (25). The same is likely true for our promoters. When a motif scores below a certain threshold, specific protein-DNA contacts critical for promoter function are lost. High scores in other motifs in general cannot compensate for loss of this contact, rendering the promoter nonfunctional. However, there is some interdependency of motifs. First, the finding that a combined -35/-10 motif cutoff is a better predictor than either motif alone (Table 1C) indicates that a moderately low scoring -10 motif can only be compensated by an excellent -35 motif, and vice versa. Second, functional promoters exhibited anticorrelation of certain modules (Fig. 6), suggesting that it is disadvantageous for σE promoters to have high scores in all motifs and optimal spacer lengths. For E. coli, σ70 consensus promoters are thought to be limited at promoter escape due to the energetic cost of breaking these optimal contacts (26–28), and the same is likely to be true for σE promoters.

Comparison of the active and inactive promoter sets also pointed to two features in the -35 motif that are important for function. We find that two positions in the -35 motif (underlined; GGAACTT) are significantly overrepresented in the functional class (Fig. 4). Importantly, structural analysis indicates that these positions are responsible for two of the three σE sequence-specific -35 contacts (29), suggesting that their absence significantly disrupts -35 recognition. Additionally, the AAC unit (the central AAC of GGAACTT), suggested to provide a rigid structural unit that facilitates recognition by σE (29), is missing in ∼45% of inactive promoters but present in almost all active promoters (> 90%), indicating that it is important for promoter function. Importantly, this is the first indication that correlations above the single nucleotide level contribute to promoter function. Additionally, these two features may be generally diagnostic of functionality for promoters of the group 4 σs, as the type of sequence-specific recognition carried out by σE is likely to be conserved across this group (29) and the AAC unit is present in the promoter consensus for most group 4 σs (6).

A complete promoter model must include the upstream sequences (UP-elements) that bind the α subunits of RNA polymerase and contribute to promoter activity in all groups of σs (5, 9, 30–33). Strong UP-sequences contain tracts of As and Ts (34, 35) that generate a narrowed minor groove required for α-binding (36). These tracts suggest a dependency on adjacent nucleotide positions that may require alternative approaches to PWMs to capture their contribution to promoter strength. We are currently investigating this at σE promoters.

Housekeeping promoters, such as those regulated by E. coli σ70, will be difficult to model using the approaches described here. In contrast to σE and other group 4 σs, the housekeeping σs require only a subset of motifs for an active promoter, leading to poorly conserved promoter sequences (1, 37). The extensive use of activators by housekeeping σs exacerbates promoter degeneracy because protein-protein interactions between the activator and RNA polymerase compensate for poor promoter motifs. Activators also alter the placement of the upstream α-binding sites (38), increasing the complexity of their prediction. Consequently, tractable prediction models for housekeeping promoters will likely focus on only the small subset of promoters that are activator-independent and contain moderately well-conserved motifs.

Materials and Methods

Please refer to SI Text for detailed methods.

Biological Methods.

The 60 natural σE-dependent promoters have native promoter sequence from -35 to +20, flanked by vector sequence upstream and the rpoC terminator downstream (Fig. S1), and were generated by PCR (9) (Table S3). Single-round transcriptions were used as a measure of promoter activity. Assays were performed as in refs. 5 and 39 with modifications described in the Supporting Information. Promoter strength was determined in vivo as described in ref. 9 from strains overexpressing σE in M9 complete minimal medium supplemented with 0.2% glucose and 1 mM IPTG at 30 °C. Promoter activities are in Table S4.

Scoring σE Promoters Using PWMs.

Position weight matrices were constructed using the method of ref. 40 from aligned promoter sequences for each of the elements shown in Fig. 2 (see Table S5). Sequences were scored as described in ref. 5. Spacer and discriminator length penalty scores, calculated as described in ref. 5, were applied to promoters with suboptimal spacing between the +1, -10, and -35 motifs (Fig. S3). Total promoter score was calculated by summing the PWM and penalty scores. Based on ref. 13, promoter score, Sp, is taken to be proportional to the log of promoter strength, Sp ∝ ln(Ka), where Ka is occupancy or promoter strength. The fit of Sp with ln(Ka) was assessed by Pearson’s correlation coefficient (R). Outliers with both high residual y-variance and high leverage were identified using the software “The Unscrambler v9.8” (CAMO Software AS, Norway; http://www.camo.no). Sequence logos of aligned motifs were generated using WebLogo v2.8 [http://weblogo.berkeley.edu//; (41)].
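For orientation, a generic log-odds PWM can be sketched in Python as follows. This is a common textbook construction and is not necessarily the method of ref. 40: the pseudocount, the uniform background frequency of 0.25, and the example sites are all assumptions made for illustration. Natural logs are used so that summed scores are on the same scale as ln(Ka).

```python
import math

BASES = "ACGT"

def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites.

    Each column maps a base to ln(frequency / background), with a pseudocount
    added to every base so that unseen bases get a finite score.
    """
    length = len(aligned_sites[0])
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in aligned_sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log(counts[b] / total / background) for b in BASES})
    return pwm

def score_site(pwm, site):
    """Sum the per-position weights, assuming positions contribute independently."""
    return sum(column[base] for column, base in zip(pwm, site))

# Hypothetical aligned -35-like sites, for illustration only.
sites = ["GGAACTT", "GGAACTT", "CGAACTT", "GGAACTA"]
pwm = build_pwm(sites)
print(round(score_site(pwm, "GGAACTT"), 2), round(score_site(pwm, "TTTTTTT"), 2))
```

A total promoter score in this framework would add the scores of the chosen motif PWMs to the spacer and discriminator length penalties, which are taken from ref. 5 and are not reproduced here.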


ACKNOWLEDGMENTS.

We thank Carol A. Gross for her extensive help and support; Pieter deHaseth, Hao Li, and Chuck Turnbough for many helpful discussions; and Steve Busby, Richard Gourse, John Helmann, Peter von Hippel, Hao Li, and members of the Gross lab for critical comments on the manuscript. This work was supported by the National Institutes of Health Grant GM57755 (to Carol A. Gross).

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/cgi/content/full/0915066107/DCSupplemental.

References

  • 1. Hook-Barnard IG, Hinton DM. Transcription initiation by mix and match elements: flexibility for polymerase binding to bacterial promoters. Gene Regul Syst Bio. 2007;1:275–293.
  • 2. Huerta AM, Collado-Vides J. Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals. J Mol Biol. 2003;333(2):261–278. doi: 10.1016/j.jmb.2003.07.017.
  • 3. Hertz GZ, Stormo GD. Escherichia coli promoter sequences: analysis and prediction. Methods Enzymol. 1996;273:30–42. doi: 10.1016/s0076-6879(96)73004-5.
  • 4. Helmann JD. The extracytoplasmic function (ECF) sigma factors. Adv Microb Physiol. 2002;46:47–110. doi: 10.1016/s0065-2911(02)46002-x.
  • 5. Rhodius VA, Suh WC, Nonaka G, West J, Gross CA. Conserved and variable functions of the sigma(E) stress response in related genomes. PLoS Biol. 2006;4(1):43–59. doi: 10.1371/journal.pbio.0040002.
  • 6. Staron A, et al. The third pillar of bacterial signal transduction: classification of the extracytoplasmic function (ECF) sigma factor protein family. Mol Microbiol. 2009;74(3):557–581. doi: 10.1111/j.1365-2958.2009.06870.x.
  • 7. Stormo GD. DNA binding sites: representation and discovery. Bioinformatics. 2000;16(1):16–23. doi: 10.1093/bioinformatics/16.1.16.
  • 8. Horton PB, Kanehisa M. An assessment of neural network and statistical approaches for prediction of E. coli promoter sites. Nucleic Acids Res. 1992;20(16):4331–4338. doi: 10.1093/nar/20.16.4331.
  • 9. Mutalik VK, Nonaka G, Ades SE, Rhodius VA, Gross CA. Promoter strength properties of the complete sigma E regulon of Escherichia coli and Salmonella enterica. J Bacteriol. 2009;191(23):7279–7287. doi: 10.1128/JB.01047-09.
  • 10. Papenfort K, et al. Sigma E-dependent small RNAs of Salmonella respond to membrane stress by accelerating global omp mRNA decay. Mol Microbiol. 2006;62(6):1674–1688. doi: 10.1111/j.1365-2958.2006.05524.x.
  • 11. Thompson KM, Rhodius VA, Gottesman S. SigmaE regulates and is regulated by a small RNA in Escherichia coli. J Bacteriol. 2007;189(11):4243–4256. doi: 10.1128/JB.00020-07.
  • 12. Johansen J, Rasmussen AA, Overgaard M, Valentin-Hansen P. Conserved small non-coding RNAs that belong to the sigma(E) regulon: role in down-regulation of outer membrane proteins. J Mol Biol. 2006;364(1):1–8. doi: 10.1016/j.jmb.2006.09.004.
  • 13. Berg OG, von Hippel PH. Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters. J Mol Biol. 1987;193(4):723–750. doi: 10.1016/0022-2836(87)90354-8.
  • 14. Staden R. Computer methods to locate signals in nucleic acid sequences. Nucleic Acids Res. 1984;12(1 Pt 2):505–519. doi: 10.1093/nar/12.1part2.505.
  • 15. Stormo GD, Fields DS. Specificity, free energy and information content in protein-DNA interactions. Trends Biochem Sci. 1998;23(3):109–113. doi: 10.1016/s0968-0004(98)01187-6.
  • 16. Benos PV, Bulyk ML, Stormo GD. Additivity in protein-DNA interactions: how good an approximation is it? Nucleic Acids Res. 2002;30(20):4442–4451. doi: 10.1093/nar/gkf578.
  • 17. Fields DS, He Y, Al-Uzri AY, Stormo GD. Quantitative specificity of the Mnt repressor. J Mol Biol. 1997;271(2):178–194. doi: 10.1006/jmbi.1997.1171.
  • 18. Sarai A, Takeda Y. Lambda repressor recognizes the approximately 2-fold symmetric half-operator sequences asymmetrically. Proc Natl Acad Sci USA. 1989;86(17):6513–6517. doi: 10.1073/pnas.86.17.6513.
  • 19. Takeda Y, Sarai A, Rivera VM. Analysis of the sequence-specific interactions between Cro repressor and operator DNA by systematic base substitution experiments. Proc Natl Acad Sci USA. 1989;86(2):439–443. doi: 10.1073/pnas.86.2.439.
  • 20. Mulligan ME, Hawley DK, Entriken R, McClure WR. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity. Nucleic Acids Res. 1984;12(1 Pt 2):789–800. doi: 10.1093/nar/12.1part2.789.
  • 21. Szoke PA, Allen TL, deHaseth PL. Promoter recognition by Escherichia coli RNA polymerase: effects of base substitutions in the -10 and -35 regions. Biochemistry. 1987;26(19):6188–6194. doi: 10.1021/bi00393a035.
  • 22. Djordjevic M, Bundschuh R. Formation of the open complex by bacterial RNA polymerase—a quantitative model. Biophys J. 2008;94(11):4233–4248. doi: 10.1529/biophysj.107.116970.
  • 23. Wozniak CE, Hughes KT. Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters. J Mol Biol. 2008;379(5):936–952. doi: 10.1016/j.jmb.2008.04.043.
  • 24. McDowell JC, Roberts JW, Jin DJ, Gross C. Determination of intrinsic transcription termination efficiency by RNA polymerase elongation rate. Science. 1994;266(5186):822–825. doi: 10.1126/science.7526463.
  • 25. von Hippel PH. From “simple” DNA-protein interactions to the macromolecular machines of gene expression. Annu Rev Biophys Biomol Struct. 2007;36:79–105. doi: 10.1146/annurev.biophys.34.040204.144521.
  • 26. Ellinger T, Behnke D, Knaus R, Bujard H, Gralla JD. Context-dependent effects of upstream A-tracts. Stimulation or inhibition of Escherichia coli promoter function. J Mol Biol. 1994;239(4):466–475. doi: 10.1006/jmbi.1994.1389.
  • 27. Ellinger T, Behnke D, Bujard H, Gralla JD. Stalling of Escherichia coli RNA polymerase in the +6 to +12 region in vivo is associated with tight binding to consensus promoter elements. J Mol Biol. 1994;239(4):455–465. doi: 10.1006/jmbi.1994.1388.
  • 28. Miroslavova NS, Busby SJ. Investigations of the modular structure of bacterial promoters. Biochem Soc Symp. 2006;73:1–10. doi: 10.1042/bss0730001.
  • 29. Lane WJ, Darst SA. The structural basis for promoter -35 element recognition by the group IV sigma factors. PLoS Biol. 2006;4(9):1–10. doi: 10.1371/journal.pbio.0040269.
  • 30. Gourse RL, Ross W, Gaal T. UPs and downs in bacterial transcription initiation: the role of the alpha subunit of RNA polymerase in promoter recognition. Mol Microbiol. 2000;37(4):687–695. doi: 10.1046/j.1365-2958.2000.01972.x.
  • 31. Haugen SP, Ross W, Gourse RL. Advances in bacterial promoter recognition and its control by factors that do not bind DNA. Nat Rev Microbiol. 2008;6(7):507–519. doi: 10.1038/nrmicro1912.
  • 32. Typas A, Hengge R. Differential ability of sigma(s) and sigma70 of Escherichia coli to utilize promoters containing half or full UP-element sites. Mol Microbiol. 2005;55(1):250–260. doi: 10.1111/j.1365-2958.2004.04382.x.
  • 33. Koo BM, Rhodius VA, Campbell EA, Gross CA. Dissection of recognition determinants of Escherichia coli sigma32 suggests a composite -10 region with an ‘extended -10’ motif and a core -10 element. Mol Microbiol. 2009;72(4):815–829. doi: 10.1111/j.1365-2958.2009.06690.x.
  • 34.Estrem ST, et al. Bacterial promoter architecture: subsite structure of UP elements and interactions with the carboxy-terminal domain of the RNA polymerase alpha subunit. Genes Dev. 1999;13(16):2134–2147. doi: 10.1101/gad.13.16.2134. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Estrem ST, Gaal T, Ross W, Gourse RL. Identification of an UP element consensus sequence for bacterial promoters. Proc Natl Acad Sci USA. 1998;95(17):9761–9766. doi: 10.1073/pnas.95.17.9761. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Benoff B, et al. Structural basis of transcription activation: the CAP-alpha CTD-DNA complex. Science. 2002;297(5586):1562–1566. doi: 10.1126/science.1076376. [DOI] [PubMed] [Google Scholar]
  • 37.Shultzaberger RK, Chen Z, Lewis KA, Schneider TD. Anatomy of Escherichia coli sigma70 promoters. Nucleic Acids Res. 2007;35(3):771–788. doi: 10.1093/nar/gkl956. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Browning DF, Busby SJ. The regulation of bacterial transcription initiation. Nat Rev Microbiol. 2004;2(1):57–65. doi: 10.1038/nrmicro787. [DOI] [PubMed] [Google Scholar]
  • 39.Rhodius V, Savery N, Kolb A, Busby S. Assays for transcription factor activity. Methods Mol Biol. 2001;148:451–464. doi: 10.1385/1-59259-208-2:451. [DOI] [PubMed] [Google Scholar]
  • 40.Stormo GD. Consensus patterns in DNA. Methods Enzymol. 1990;183:211–221. doi: 10.1016/0076-6879(90)83015-2. [DOI] [PubMed] [Google Scholar]
  • 41.Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Research. 2004;14(6):1188–1190. doi: 10.1101/gr.849004. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences