Abstract
Several regulators within the AraC family control the expression of genes known or thought to be required for virulence of bacterial pathogens. One of these, Rns, activates transcription from an unprecedented variety of binding-site locations. Although nearly all prokaryotic activators bind within a small region upstream of and adjacent to the promoter that they regulate, Rns does not bind within this region to activate its own promoter, Prns. Instead, to activate Prns, Rns requires one binding site 224.5 bp upstream and one downstream of the transcription start site. We show in this study that several other AraC family activators recognize the same binding sites as Rns and share with it the ability to utilize a downstream binding site. Like Rns, other members of this group of activators positively regulate the expression of virulence factors in pathogenic bacteria. These regulators are also able to activate transcription from promoter-proximal upstream binding sites, since they are able to substitute for Rns at Pcoo, a promoter with only upstream binding sites. Thus, Rns is the prototype for a group of regulators, including CfaR, VirF, AggR, and CsvR, that activate transcription from locations that are more diverse than those of any other known activator.
Rns, a transcriptional regulator belonging to the AraC family (16), is present in some strains of human disease-associated enterotoxigenic Escherichia coli (ETEC). Rns is required for the expression of CS1 and CS2 pili, as well as for its own expression (6, 15). To fully activate the CS1 pilin promoter, Pcoo, Rns uses two DNA binding sites upstream of the transcription start site, site I centered at bp −109.5 and site II centered at bp −37.5 (24). As is the case for nearly all other prokaryotic activators of ς70-dependent promoters (17), both sites are upstream of the promoter −10 hexamer. When bound at site II, which overlaps the −35 hexamer, Rns may make direct contacts with the ς or α subunits of RNA polymerase, as has been shown for other activators (4, 19, 33). A potential Rns binding site has also been identified upstream of the CS2 pilin genes at the same position as site II (24), suggesting that the mechanism of transcriptional activation may be similar at both promoters.
Although Rns binding sites are located within the expected region at the CS1 and CS2 pilin promoters, the arrangement of Rns binding sites at Prns is unprecedented for an activator. Site 1 is centered 224.5 bp upstream of the transcription start site (25), well outside the promoter-proximal region where activators typically bind. Despite its distance from Prns, site 1 is required for Rns-dependent expression from this promoter because nucleotide substitutions within site 1 that interfere with Rns binding in vitro abolish Rns activation of Prns in vivo. Although other activators of ς70 promoters may also have promoter-distal upstream binding sites, these are invariably accompanied by promoter-proximal binding sites for the activator or for an auxiliary regulator (17). In contrast, there are no additional Rns binding sites between site 1 and the transcription start site of Prns, and, by itself, Rns facilitates the formation of an RNA polymerase open complex at Prns in vitro (25).
Rns has two additional binding sites near Prns; however, both of these sites are downstream of the transcription start site: site 2 is centered at bp +43.5 and site 3 is at bp +83.5. The locations of these sites suggest that Rns negatively regulates its own transcription because proteins that bind downstream of the −10 hexamer invariably act as repressors (17). However, nucleotide substitutions within site 3 abolished Rns-dependent expression from Prns, demonstrating that Rns, unlike nearly all other characterized prokaryotic activators, is not restricted to upstream binding sites (25). Thus, Rns is capable of activating transcription from an unprecedented variety of binding site locations, and this suggests there may be no intrinsic limitation to binding-site locations for activators.
Although Rns is unusual in the location of its binding sites, it probably binds to DNA in the same manner as other AraC family members do, because it shares with them a conserved secondary structure (16). Like most AraC proteins, Rns has two predicted helix-turn-helix (HTH) motifs within its carboxy terminus. For other AraC family members (3, 32) and presumably also for Rns, each motif participates in DNA binding by placing a recognition helix in the major groove of DNA. The recognition helices of Rns are identical or very similar to those of a group of regulators within the AraC family (Fig. 1), which suggests these regulators bind to DNA sequences similar to Rns binding sites. However, a direct comparison between the binding sites for these regulators is not possible because only those that are more distantly related to Rns, UreR (39) and VirF of Yersinia spp. (43), have binding sites that have been clearly defined experimentally. Attempts to characterize others biochemically (41, 42) have been hampered by their insolubility, a trait common to members of the AraC family (16).
FIG. 1.
Identity of Rns to other regulators within the AraC family. The carboxy termini of regulators with significant homology to Rns are shown. Amino acids identical to those of Rns are shaded. Numbering is relative to Rns, a 31-kDa protein. The predicted HTH motifs of Rns, which are thought to be involved in DNA binding, are overlined. MarA, for which there is structural information, is also shown with its HTH motifs underlined. Abbreviations: Ec, E. coli; Sf, S. flexneri; Pm, P. mirabilis; Yp, Y. pestis.
While little is known about their actual binding sites, the regulators with HTH motifs similar to those of Rns (Fig. 1) are required for the expression of genes known or thought to be required for virulence of bacterial pathogens. For example, like Rns, several of these activators are needed for the expression of pili. CfaR and CsvR activate the expression of colonization factor antigen I (CFA/I) and CS4 pili, respectively, in different human disease-associated ETEC strains (7, 44). AggR is required for the expression of aggregative adherence fimbriae I and II (AAF/I and AAF/II) in enteroaggregative E. coli (14, 27). FapR positively regulates the expression of 987P pili in porcine strains of ETEC (20). PerA positively regulates the expression of bundle-forming pili in enteropathogenic E. coli (EPEC) (41).
Rns-related regulators may also regulate the expression of other types of virulence factors, either directly or indirectly through regulatory cascades. In addition to activating the expression of bundle-forming pili, PerA may activate the expression of virulence-associated genes within the 35.6-kb locus of enterocyte attachment and effacement (LEE) (21) indirectly through the activator Ler (22). PerA also positively regulates the expression of a polycistronic mRNA encoding Tir, CesT, and EaeA (22), but it is not clear whether this occurs through Ler or through another regulator yet to be identified (22, 37). VirF from Shigella flexneri, the causative agent of bacillary dysentery, positively regulates the expression of genes needed for S. flexneri to invade, replicate within, and spread between epithelial cells of the colon. VirF regulates some virulence genes directly, as in the case of virG, and regulates others indirectly by activating VirB, a regulator unrelated to the AraC family (12). UreR, which has been isolated from uropathogenic strains of E. coli and Proteus mirabilis, directly activates the expression of genes encoding the structural subunits of urease in the presence of urea (9). Although UreR is the only Rns-related activator that has been shown to require an inducer, it positively regulates its own expression, as do Rns and PerA (5, 10, 15).
The similarities of other activators within the AraC family to Rns suggest the possibility that, like Rns, some may be able to activate transcription from binding sites outside the upstream promoter-proximal region. In this work, we found that Rns can substitute for several related regulators and that all but one of these are also able to substitute for Rns at Pcoo and Prns. We also found that site 3, downstream of the transcription start site, is required by each of these regulators to activate Prns. Since the identity of Rns to some of these activators is limited to their HTH motifs, this motif may be an indicator of the ability of AraC family members to substitute for each other. In this study we also demonstrate that regulators interchangeable with Rns recognize the same binding sites as Rns and that a prototypical binding site for the group can be defined from the known binding sites for Rns. This prototypical site can then be used to predict the location of the binding sites for Rns-related virulence regulators and may facilitate the identification of new virulence genes by sequence analysis. Some of these predicted binding sites are downstream of transcription start sites and even within genes. Thus, transcriptional activation from downstream binding sites is not limited to a single activator or a single promoter. Rather, Rns represents a new class of transcriptional regulators that play a pivotal role in the virulence of bacterial pathogens and whose activity is not restricted to promoter-proximal upstream binding sites.
MATERIALS AND METHODS
Plasmids and strains.
Plasmids, strains, and reporter phage are summarized in Table 1. Reporter phage were constructed by cloning the promoter region of interest upstream of the promoterless lacZYA operon carried by plasmids pRS550, pRS551, or pRS415 (36). The resulting fusions were then transferred to a resident prophage, λRS45, by homologous recombination as described previously (36).
TABLE 1.
Plasmids and reporter phage used in this study
| Feature | Name | Relevant characteristics^a | Reference |
|---|---|---|---|
| Plasmids | | | |
| Rns | pEU2080 | rns expressed from the arabinose-inducible promoter of pBAD24 | 24 |
| PerA | pTB101-T2 | perABC expressed from IPTG-inducible promoter of pTB101 | 41 |
| VirF | pATM323 | virF expressed from the arabinose-inducible promoter of pBAD18 | Maurelli^b |
| FapR | pEU2137 | fapR expressed from the arabinose-inducible promoter of pBAD24 | This study |
| AggR | pJPN52 | aggR cloned into pBR328, constitutively expressed | 27 |
| CfaR | pNTP503 | cfaR cloned into pBR322, constitutively expressed | 44 |
| UreRPm | pSKW4 | ureR of P. mirabilis cloned into pBR322, expressed from a urea-inducible promoter | 11 |
| Phage | | | |
| Pcoo-lacZ | λEU2108 | CS1 pilin promoter region from −426 to +514 | 24 |
| Prns-lacZ | λEU2103 | rns promoter region from −707 to +9 | 25 |
| Prns3-lacZ | λEU2129 | derivative of λEU2103 carrying substitutions within Rns binding site 3 from −93 AAAA −90 to −93 GGCG −90 | 25 |
| PvirB-lacZ | λMAD102 | virB promoter region from −259 to +308 | Maurelli^b |
| PfasA-lacZ | λEU2134 | 987P pilin promoter region from >−1000 to +173 | This study |
| PureREc-lacZ | λSEF06 | E. coli ureR promoter region from −247 to +286 | 11 |
| PureDEc-lacZ | λSEF09 | E. coli ureD promoter region from −391 to +200 | 11 |
| PureRPm-lacZ | λPROT11 | P. mirabilis ureR promoter, undefined region | 11 |
| PperA-lacZ | λEU2125 | perA promoter region from −193 to +147 | This study |
| Ptir-lacZ | λEU2133 | tir cesT eaeA promoter region from −480 to +2430 | This study |
| LEE1-lacZ | λLEE1 | LEE1 operon promoter region from −665 to +65 | 22 |
^a All numbering is relative to the beginning of the relevant open reading frame.
^b T. Maurelli, unpublished data.
Enzymatic assays.
Lysogens of E. coli strain DH5α (45) were used for analysis of the PureR and PureD promoters, and lysogens of strain MC4100 (8) were used for analysis of all other promoters. All strains were grown with aeration at 37°C in Luria-Bertani (LB) medium with 100 μg of ampicillin per ml. For induction, strains carrying plasmids with regulators expressed from the promoter ParaBAD (18) were grown to log phase with 0.2% glucose, washed, and diluted into LB medium with 0.1% arabinose. For strains carrying plasmids with regulators expressed from Ptac (40), isopropyl-β-D-thiogalactopyranoside (IPTG) was added at a final concentration of 30 μM to log-phase cells. Aliquots of cells in log phase were assayed for β-galactosidase activity (23) before and 2 h after induction with IPTG or arabinose. UreR-dependent expression of β-galactosidase was assayed on MacConkey agar containing 300 mM urea. For regulators that were not under the control of inducible promoters, the strains were assayed in log phase.
Sequence analysis.
A binding-site consensus and scoring index (28) was defined by aligning the five known Rns binding sites as previously described (25). This scoring index combines the redundancy index (RI) and the Berg-von Hippel function (BvH) and is represented graphically by the Rns binding-site sequence logo (34). The RI ranks the importance of each position in the consensus in terms of the conservation at that position within the alignment (35). The BvH relates the occurrence of a query nucleotide to that of the consensus base (2). Each potential binding site is assigned an overall index ranking that is the cumulative sum, over all positions, of BvH times RI. Potential binding sites with index rankings as good as or better than those of known Rns binding sites were defined as having high similarity to Rns binding sites. The index rankings of other sites were used to define them as having moderate or low similarity to known binding sites by assigning arbitrary cutoff values for each category.
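The scoring index described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this study: the aligned sites are invented placeholder sequences (not the Rns binding sites), the information content of each alignment column stands in for the RI, the BvH term is written as a log ratio of query-base to consensus-base counts, and the 0.5 pseudocount is an arbitrary small-sample correction. The function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical aligned sites, invented for illustration only.
# These are NOT the Rns binding sites used in the study.
KNOWN_SITES = [
    "TATTTTAGCA",
    "TATTTAAGCA",
    "AATTTTAGCT",
    "TATCTTAGCA",
    "TATTTTTGCA",
]

PSEUDOCOUNT = 0.5  # assumed small-sample correction


def column_counts(sites):
    """Count each base at every aligned position."""
    return [Counter(column) for column in zip(*sites)]


def position_weight(counts):
    """Stand-in for the RI: information content (bits) of one column."""
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def bvh_term(counts, base):
    """Log ratio of the query base count to the consensus base count."""
    consensus_count = max(counts.values())
    return math.log((counts[base] + PSEUDOCOUNT) / (consensus_count + PSEUDOCOUNT))


def consensus(sites=KNOWN_SITES):
    """Most frequent base at each aligned position."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(sites))


def site_score(query, sites=KNOWN_SITES):
    """Sum over positions of weight times BvH term; 0 is a perfect match."""
    columns = column_counts(sites)
    if len(query) != len(columns):
        raise ValueError("query length must match the alignment length")
    return sum(position_weight(c) * bvh_term(c, b) for c, b in zip(columns, query))
```

In this sketch a score of zero corresponds to the consensus and more negative scores indicate greater divergence, so the lowest score among the aligned sites would serve as the cutoff for "high" similarity, in the spirit of the definition given in the text.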
RESULTS AND DISCUSSION
Substitution for other regulators by Rns.
The ability of Rns to substitute for other regulators was assayed using single-copy promoter-lacZ reporter fusions integrated into the chromosome of E. coli K-12. Each reporter strain was transformed with a plasmid carrying rns expressed from an arabinose-inducible promoter, a plasmid carrying the promoter's cognate regulator, or, as a negative control, the relevant vector plasmid carrying neither regulator. For the purposes of this study, these assays were sufficiently sensitive to determine which activators can substitute for one another, but they should not be interpreted as a comparison of the activation efficiency of different regulators because the regulators may be expressed at different levels.
Previously, it was concluded that Rns can substitute for VirF of S. flexneri because Rns increased the expression of β-galactosidase from an mxiC-lacZ fusion (29). We wished to reexamine this conclusion by assaying the ability of Rns to activate the promoter of virB because VirF positively regulates mxiC indirectly through virB (1). We found that when expression of Rns or VirF was induced, the level of β-galactosidase from PvirB increased (Table 2). Even before the expression of Rns was induced, the expression of β-galactosidase from PvirB was higher than that from the negative control strain, which suggests that even a low level of Rns expression is sufficient to activate PvirB (Table 2). These results show that Rns and VirF both activate the same promoter and are consistent with the conclusions of Porter et al. (29). Although a reporter for an AggR-regulated promoter was not available for these studies, Rns can probably substitute for this regulator because Rns can substitute for another regulator, CfaR (7), which can substitute for AggR (27). Rns is also more closely related to AggR than to FapR (Fig. 1), and Rns activated PfasA (Table 2), the 987P fimbria promoter, which is regulated by FapR in porcine ETEC strains (13).
TABLE 2.
Rns substitution for other regulators
| Promoter | Regulator | β-Galactosidase expression^a, preinduction | β-Galactosidase expression^a, postinduction | Ratio^b |
|---|---|---|---|---|
| PvirB | Rns | 134 | 1,082 | 8 |
| | VirF | 57 | 1,113 | 20 |
| | Vector | 22 | 10 | 1 |
| PfasA | Rns | 49 | 235 | 5 |
| | FapR | 279 | 953 | 3 |
| | Vector | 48 | 53 | 1 |
| PperA | Rns | 207 | 449 | 2 |
| | Vector | 204 | 386 | 2 |
| | PerABC | 1,215 | 15,665 | 13 |
| | Vector | 281 | 337 | 1 |
| LEE1 | Rns | 531 | 395 | 1 |
| | Vector | 423 | 661 | 2 |
| | PerABC | 1,595 | 19,133 | 12 |
| | Vector | 356 | 302 | 1 |
| Ptir | Rns | 75 | 164 | 2 |
| | Vector | 127 | 122 | 1 |
| | PerABC | 288 | 370 | 1 |
| | Vector | 235 | 232 | 1 |
| PureRPm | Rns | 85 | 125 | 2 |
| | Vector | 77 | 38 | 1 |
| | UreR^c | Lac+ | | |
| | Vector^c | Lac− | | |
| PureREc | Rns | 87 | 118 | 1 |
| | Vector | 90 | 45 | 1 |
| | UreR^c | Lac+ | | |
| | Vector^c | Lac− | | |
| PureDEc | Rns | 131 | 121 | 1 |
| | Vector | 105 | 100 | 1 |
| | UreR^c | Lac+ | | |

^a Values are means of at least three samples, and in all cases the standard deviation was less than 15% of the mean. Values are expressed in β-galactosidase units.
^b Ratio of β-galactosidase expression from the strain after induction of the regulator compared to preinduction conditions.
^c For these strains, β-galactosidase expression was determined by their phenotype on MacConkey agar with 300 mM urea.
The levels of β-galactosidase from PperA and LEE1, PerA-regulated promoters in EPEC, were not significantly higher when the expression of Rns was induced than in a control strain carrying neither Rns nor PerA (Table 2). Although the level of expression from Ptir, a promoter indirectly regulated by PerA in EPEC, was twofold higher when Rns was induced than under noninducing conditions, this level was comparable to that for the negative control strain grown under the same inducing conditions. This indicates that Rns does not activate Ptir. Similarly, promoters regulated by UreR in P. mirabilis and in uropathogenic strains of E. coli (11) were not activated by Rns, although the cognate activator UreR caused expression of β-galactosidase from each of these promoters in the presence of urea (Table 2 and data not shown). Thus, FapR appears to be the most distantly related AraC family member for which Rns can substitute.
Substitution for Rns by other virulence regulators.
To determine whether regulators homologous to Rns can substitute for Rns, we assayed their ability to increase the expression of β-galactosidase from Pcoo-lacZ and Prns-lacZ reporters integrated into the chromosome of E. coli K-12. Each regulator was provided in trans from a plasmid. Some regulators were expressed from their native promoters (AggR, UreR, and CfaR), while others were under the control of inducible promoters. In each case, a strain without the gene for the regulator served as a negative control and a strain with rns expressed from an arabinose-inducible promoter served as a positive control. When the expression of VirF was induced, the expression of β-galactosidase from both Pcoo and Prns increased (Table 3). Although this result is expected from the ability of Rns to substitute for VirF (Table 2), it disagrees with the previous conclusions of Porter et al. (29). Presumably their assay, which measured CS1 pilin expression from a Western blot of CooA, was too insensitive to detect activation. AggR and CfaR, which were expressed constitutively from their native promoters, also resulted in significantly higher β-galactosidase expression from both Pcoo and Prns than in negative control strains (Table 3). These results show that CfaR, AggR, and VirF are able to substitute for Rns at both Pcoo and Prns.
TABLE 3.
Substitution of Rns by other regulators
β-Galactosidase expression^a from each promoter:

| Regulator^b | Pcoo-lacZ, preinduction | Pcoo-lacZ, postinduction | Pcoo-lacZ, ratio^d | Prns-lacZ, preinduction | Prns-lacZ, postinduction | Prns-lacZ, ratio | Prns3-lacZ^c, preinduction | Prns3-lacZ^c, postinduction | Prns3-lacZ^c, ratio |
|---|---|---|---|---|---|---|---|---|---|
| Rns | 105 | 3,862 | 37 | 320 | 2,038 | 6 | 424 | 474 | 1 |
| VirF | 161 | 816 | 5 | 247 | 780 | 3 | 195 | 284 | 2 |
| FapR | 53 | 92 | 2 | 197 | 169 | 1 | | | |
| Vector | 150 | 253 | 2 | 221 | 260 | 1 | | | |
| AggR | 3,121 | | 17 | 1,564 | | 9 | 139 | | 1 |
| CfaR | 5,016 | | 28 | 530 | | 3 | 246 | | 1 |
| Vector | 178 | | 1 | 168 | | 1 | 247 | | 1 |
| PerABC | 129 | 245 | 2 | 117 | 345 | 3 | | | |
| Vector | 73 | 43 | 1 | 114 | 245 | 2 | | | |
| UreRPm^e | Lac− | | | Lac− | | | | | |

^a β-Galactosidase units. Values are means of at least three samples, and in all cases the standard deviation was less than 15% of the mean.
^b Regulators were expressed in trans from inducible (Rns, VirF, FapR, PerA, and UreR) or constitutive (AggR and CfaR) promoters. For the constitutively expressed regulators and their vector control, a single expression value is given for each promoter.
^c Prns3-lacZ carries nucleotide substitutions within Rns binding site 3. The other promoter constructs are wild type.
^d Ratio of β-galactosidase expression from strain grown under postinduction compared to preinduction conditions or to vector control for strains carrying constitutively expressed AggR and CfaR.
^e For this strain, β-galactosidase expression was determined by its phenotype on MacConkey agar with 300 mM urea.
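The two kinds of ratio reported in Table 3 can be illustrated with a short calculation using values from the table. This is only a sketch: the function names are hypothetical, and because the published ratios are whole numbers that may have been derived from unrounded data, small differences from the table are to be expected.

```python
def induction_ratio(preinduction_units, postinduction_units):
    """Ratio used for regulators expressed from inducible promoters."""
    return postinduction_units / preinduction_units


def ratio_to_vector(regulator_units, vector_units):
    """Ratio used for constitutively expressed regulators (AggR, CfaR)."""
    return regulator_units / vector_units


# Rns at Pcoo-lacZ: 105 units before and 3,862 units after induction.
rns_at_pcoo = induction_ratio(105, 3862)

# CfaR at Pcoo-lacZ: 5,016 units compared with 178 units for the vector control.
cfar_at_pcoo = ratio_to_vector(5016, 178)

print(f"Rns at Pcoo: {rns_at_pcoo:.1f}-fold")
print(f"CfaR at Pcoo: {cfar_at_pcoo:.1f}-fold")
```

Both values agree with the whole-number ratios listed for these regulators in Table 3 (37 and 28, respectively).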
Surprisingly, FapR was unable to activate Pcoo or Prns (Table 3), although Rns did substitute for FapR to activate PfasA (Table 2). This may be because the location of a binding site relative to the promoter is more critical for FapR activity than for Rns activity. This would be consistent with our finding that there are sites similar to Rns binding sites upstream of fasA and within fapR but that the arrangement of these sites is different from that of Rns binding sites in operons regulated by Rns (see below).
As expected from our finding that Rns cannot substitute for PerA or UreR, neither of these regulators was able to substitute for Rns at Pcoo or Prns (Table 3 and data not shown). PerA was also unable to activate these promoters when strains were grown in Dulbecco's modified Eagle's medium (data not shown), conditions that produce the highest PerA activity (30). UreR did not activate either promoter, since the E. coli K-12 strains carrying the Pcoo-lacZ and Prns-lacZ reporters were Lac− on indicator plates when transformed with the plasmid carrying UreR expressed from its native promoter, even in the presence of the inducer urea (data not shown). The inability of UreR and Rns to substitute for one another is also consistent with the fact that the consensus UreR binding site (39) is not similar to Rns binding sites. It has previously been shown that another distantly related activator, VirF of Yersinia pestis, is also unable to substitute for Rns (7).
Rns and related activators utilize the same binding site downstream of the transcription start site of Prns.
The homology of the DNA binding domains of Rns, CfaR, AggR, VirF, and CsvR (Fig. 1) and their interchangeability imply that these regulators recognize similar DNA binding sites. In this study we have also shown that CfaR, AggR, and VirF can substitute for Rns to activate expression from Prns. This suggests that these other regulators are able to function as transcriptional activators when bound downstream of a promoter, because it has been shown that Rns requires binding site 3, downstream of the transcription start site, to activate Prns. However, because it is extremely rare for activators to utilize downstream binding sites, we sought to confirm this prediction experimentally by assaying the effect of a mutation in Rns binding site 3 on the ability of these regulators to activate Prns.
These experiments were performed as described for assays using the wild-type Prns-lacZ reporter fusion, except that the Prns3-lacZ reporter was used. This construct carries nucleotide substitutions within Rns binding site 3 at bp +77 to +80 (numbering relative to the transcription start site of Prns) from AAAA to GGCG. This mutation abolished both the binding of Rns to site 3 in vitro and the ability of Rns to activate Prns in vivo (Table 3) (25). Similarly, we found that this mutation abolished the ability of VirF, AggR, and CfaR to activate Prns (Table 3), although each of these regulators activated expression from wild-type Prns. Thus, Rns and the activators with which it is interchangeable are members of an unusual group of regulators because they can activate transcription when bound either upstream or downstream of a promoter, as at Pcoo and Prns, respectively. The activator CsvR may also belong to this group, because it has a higher percent identity to Rns than either VirF or AggR does and it has been shown to substitute for Rns and CfaR (44).
Positions of predicted binding sites in promoters regulated by Rns and related activators.
Although VirF, CfaR, and AggR recognize the same binding sites as Rns does and utilize a downstream binding site to activate Prns, we wanted to determine if this promoter is a singular example or if other promoters regulated by these activators have downstream binding sites. However, Rns is the only activator within this group for which binding sites have been defined. Therefore we used the five experimentally determined Rns binding sites to define a binding site scoring index for this group of activators which is represented by the Rns binding-site logo (Fig. 2A). This scoring index was then used to search for potential binding sites, and these were ranked high, moderate, and low based on their similarity to the group of known Rns binding sites.
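The search described above can be sketched as a sliding-window scan of both DNA strands followed by ranking against cutoffs. Everything specific in this sketch is an assumption made for illustration: the motif is an invented placeholder rather than the Rns consensus, the scorer simply counts matches to that motif instead of applying the scoring index, and the cutoff values are arbitrary.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical motif, invented for illustration; not the Rns consensus.
MOTIF = "TATTTTAGCA"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def match_count(window):
    """Placeholder scorer: number of positions identical to MOTIF."""
    return sum(a == b for a, b in zip(window, MOTIF))


def scan_both_strands(sequence, width, score_fn):
    """Yield (start, strand, score) for every window on both strands."""
    sequence = sequence.upper()
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield start, "+", score_fn(window)
        yield start, "-", score_fn(reverse_complement(window))


def rank(score, high_cutoff, moderate_cutoff):
    """Assign a similarity category from two arbitrary cutoffs."""
    if score >= high_cutoff:
        return "high"
    if score >= moderate_cutoff:
        return "moderate"
    return "low"


def find_sites(sequence, high_cutoff=10, moderate_cutoff=8):
    """Return windows ranked high or moderate as (start, strand, score, rank)."""
    hits = []
    for start, strand, score in scan_both_strands(sequence, len(MOTIF), match_count):
        label = rank(score, high_cutoff, moderate_cutoff)
        if label != "low":
            hits.append((start, strand, score, label))
    return hits
```

Scoring the reverse complement of each window is what allows a site to be reported on either strand; start positions here are zero-based offsets into the input sequence rather than coordinates relative to an open reading frame.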
FIG. 2.
Rns binding-site logo and locations of predicted binding sites. (A) The consensus of the five known Rns binding sites is represented by the binding-site logo, where the height of each nucleotide is proportional to its frequency and the information content of that position. (B and C) Triangles represent predicted binding sites with similarity to the known Rns binding sites near the genes encoding activators homologous to Rns (B) or near virulence genes regulated by these activators (C). Right- and left-pointing triangles indicate binding sites located on the coding and noncoding strands, respectively. The known Rns binding sites are marked by asterisks, and dark, light, and no triangle shading indicates high, moderate, and low similarity of the predicted binding sites to the known binding sites, respectively. Numbering is relative to the beginning of the adjacent open reading frame. Transcriptional start sites that have been reported are shown as wavy arrows. The cognate regulator is shown to the left of each virulence gene, and the system encoded by these genes is shown to the right. Only a limited region of virF upstream sequence was available from GenBank. The nucleotide sequence of the CsvR-regulated locus, CS4, has not been reported.
To assess the accuracy of the search algorithm, the sequences of rns and the CS1 pilin genes were searched first. Within 2 kb of Pcoo, the only sites that ranked high were the two known binding sites that are upstream of this promoter, as expected (Fig. 2C). The three known binding sites near Prns were also ranked high, and, unexpectedly, so were two sites within rns, at bp +47.5 and +451.5 (Fig. 2B). For consistency, numbering is relative to the beginning of the open reading frame because the transcriptional start sites for many of the genes under consideration in this section have not been determined. The function of the predicted binding sites within rns is unclear since deletions that remove both of these sites have no detectable effect on transcription from Prns in vivo under normal laboratory growth conditions. However, DNase I footprinting showed that a maltose-binding protein–Rns fusion protein bound to the site at bp +47.5 (G. P. Munson and J. R. Scott, unpublished data). Although similar binding studies have not yet been done for the potential binding site at bp +451.5, the accuracy of the search algorithm suggests that further investigations into the function of this and other predicted sites are warranted.
Another indication of the accuracy of the search algorithm is the correlation between a predicted binding site ranked high and deletion analysis of PvirB. Nucleotide sequence analysis revealed one potential binding site centered 150.5 bp upstream of virB on the coding strand (Fig. 2C). Deletions of sequences upstream of bp −163 had no effect on VirF activation of PvirB in vivo, but an upstream deletion extending to bp −153, within the predicted binding site, abolished VirF activation of PvirB (42). A DNase I footprint for VirF also includes this protected site; however, it was not possible to identify a discrete VirF binding site from the footprint because it was extensive (42). In contrast to the frequency of high-scoring sites near Pcoo, Prns, and PvirB, a random sampling of the E. coli K-12 genome found that high-scoring sites are relatively infrequent, approximately one for every 10 kbp. This suggests that the identification of false-positive binding sites is rare.
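The background frequency quoted above can be turned into a rough chance expectation for a search window. The sketch below assumes that chance high-scoring sites occur independently and uniformly along the sequence (a Poisson model), which is an assumption made here for illustration and not something established in this study; the function names are hypothetical.

```python
import math

# Background frequency taken from the text: about one site per 10 kbp.
HIGH_SITES_PER_BP = 1 / 10_000


def expected_chance_sites(window_bp, rate=HIGH_SITES_PER_BP):
    """Expected number of chance high-scoring sites in a window."""
    return window_bp * rate


def prob_at_least(k, mean):
    """Poisson probability of observing k or more chance sites."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


mean_in_2kb = expected_chance_sites(2000)
print(f"expected chance sites in 2 kb: {mean_in_2kb:.2f}")
print(f"P(at least 1): {prob_at_least(1, mean_in_2kb):.3f}")
print(f"P(at least 2): {prob_at_least(2, mean_in_2kb):.3f}")
```

Under these assumptions, a 2-kb window is expected to contain about 0.2 chance sites, and two or more high-scoring sites in such a window would arise by chance only about 2% of the time.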
When cfaR and csvR were searched, binding sites ranked as high were found in an arrangement nearly identical to the known and potential sites upstream of and within rns (Fig. 2B). Rns binding site 1 is centered at bp −394.5 (relative to the open reading frame) on the coding strand and is required for autoregulation of Rns despite its distance from the transcriptional start. A predicted binding site is similarly positioned upstream of cfaR, centered at bp −399.5, and upstream of csvR, centered at bp −351.5. Another site required for positive autoregulation is site 3, centered at bp −86.5, downstream of Prns (25). Predicted binding sites are found at similar positions, bp −86.5 for cfaR and bp −87.5 for csvR. Although it is not known if CfaR and CsvR are autoregulatory, these findings suggest that they are and that they use an arrangement of binding sites similar to that of Rns to positively regulate their own expression. CsvR has also been shown to substitute for CfaR and Rns (44). Although site 2 at bp −126.5 and the sites at bp +47.5 and +451.5 do not appear to be required for positive autoregulation of rns under laboratory conditions (25; Munson and Scott, unpublished), sites at similar locations were found upstream of and within cfaR and csvR. A function has not been attributed to these sites, but their conservation at rns, cfaR, and csvR suggests that they may play a regulatory role that is yet to be discovered.
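Because Rns binding sites are numbered relative to the transcription start site in some sections and relative to the open reading frame in this one, a small conversion helps when comparing coordinates. The 170-bp offset used below is not stated in the text; it is inferred from the paired values given for site 1 (−224.5 and −394.5), and the function names are hypothetical. The sketch also treats the offset as a simple subtraction and ignores any convention about the absence of a position zero.

```python
# Offset inferred from the paired coordinates of site 1 in the text
# (-224.5 relative to the start site, -394.5 relative to the ORF).
TSS_TO_ORF_OFFSET = 170.0


def tss_to_orf(position):
    """Convert a Prns start-site-relative position to ORF-relative."""
    return position - TSS_TO_ORF_OFFSET


def orf_to_tss(position):
    """Convert an rns ORF-relative position to start-site-relative."""
    return position + TSS_TO_ORF_OFFSET


# Site centers relative to the Prns transcription start site, from the text.
PRNS_SITES_TSS = {"site 1": -224.5, "site 2": 43.5, "site 3": 83.5}

for name, position in PRNS_SITES_TSS.items():
    print(f"{name}: {position:+.1f} (start site) = {tss_to_orf(position):+.1f} (ORF)")
```

The converted values for sites 2 and 3 (−126.5 and −86.5) match the open-reading-frame coordinates given in this section, and the same offset maps the site 3 substitutions at bp +77 to +80 onto the −93 to −90 interval listed in Table 1.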
Several potential binding sites were found upstream of and within aggR and fapR (Fig. 2B), although the arrangement of these sites was not identical to that of the sites at Prns. FapR is not thought to positively regulate its own expression (13), which correlates with the absence of sites in locations required for Rns autoactivation. Since it is not yet known if AggR is autoregulatory, the significance of predicted sites near and within aggR is unclear. However, we have shown that AggR activates Pcoo and Prns, two promoters with dramatically different arrangements of binding sites, and so it is possible that other binding-site arrangements may also be utilized by Rns and related activators. Like FapR, VirF is not autoregulatory. Rather, expression of VirF is regulated by the two-component system CpxAR (26). This correlates with the lack of conserved potential binding sites upstream of virF, although a binding site ranked high was found within virF at bp +155.5.
Additional sites, some nearly identical in sequence to known Rns binding sites, were identified near genes known to be regulated by Rns or the regulators with which it is interchangeable (Fig. 2C). However, only low-scoring sites were identified near the putative promoters for CS2, CFA/I, and AAF/I pilin genes (Fig. 2C), even though the expression of these pili is Rns, CfaR, and AggR dependent, respectively. One possible explanation is that the five known Rns binding sites on which the consensus motif is based do not represent the full range of actual binding sites for Rns and related activators. In this case, more divergent binding sites will not be detected or will be ranked low even though they may be high-affinity binding sites. Additional DNA binding studies are required to address this issue and should result in more accurate binding-site predictions by increasing the sample size of known Rns binding sites. Even though the search algorithm may not detect all Rns binding sites, it is likely that those ranked high or moderate are actual binding sites. The locations of some high- and moderate-scoring sites within open reading frames indicate that at least some of them are downstream of transcription start sites, although the promoters for many of these genes have not been identified. This further suggests that these activators may not be limited to binding sites within a narrow region upstream of a promoter. Thus, Rns and related activators appear to have a more diverse repertoire of binding-site locations than was previously thought possible for prokaryotic activators.
Conclusions and evolutionary considerations.
As an activator, Rns presents an unprecedented variability in the arrangement of the binding sites with which it interacts. However, in this study we have shown that Rns is not unique: VirF, AggR, and CfaR can also activate expression from both Pcoo, with its more typical arrangement of binding sites, and Prns. Since the identity of VirF and AggR to Rns is limited to the HTH motifs, this suggests that identity within this region may by itself be a good indicator of which regulators within the AraC family can substitute for one another. Moreover, each of these regulatory proteins requires binding site 3, downstream of the transcription start site, to activate Prns. Thus, Rns is the prototype for a group of activators within the AraC family whose activity is not restricted to upstream binding sites. Although the number of regulators that function as activators when bound downstream of a promoter is still very small (25, 31, 38), they may be more common than previously thought because the AraC family has over 150 members (16), many of which have not been characterized.
At their native promoters, each of these activators probably binds DNA sequences similar to the prototypical binding site (Fig. 2A) defined by the five characterized Rns binding sites. Because these regulators are associated with the expression of virulence factors, this information may provide a powerful tool to search for other virulence genes by computational analysis of genomic sequences. However, we have restricted our present analysis to genes that are known to be regulated by Rns and its homologs because the small sample size of five known binding sites may limit the accuracy of predictions. The experimental verification of some of these predictions will increase our confidence in this type of analysis, the sample size of known binding sites, and, presumably, the accuracy of binding-site identification. Eventually, this type of analysis may provide a starting point for genetic and biochemical analysis of virulence regulons, and these methods should be applicable to other homologous groups of regulators.
The genes encoding Rns and the activators with which it is interchangeable have an unusually low G+C content of less than or equal to 30%, while the average G+C content for E. coli is 49%. This extremely low G+C content suggests that Rns and its homologs have recently been acquired from an organism that has yet to be identified. Perhaps the ability to utilize downstream binding sites is common in the hypothetical ancestor from which these activators were derived. The further characterization of Rns and its regulation of Prns should provide a new perspective on activators and transcription initiation.
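The G+C comparison above is simple arithmetic and can be sketched as follows. The 49% genome average is the value stated in the text; the function names and the `margin` cutoff are hypothetical and are not part of the analysis performed in this study.

```python
def gc_fraction(sequence):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    counted = sum(seq.count(b) for b in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / counted


def is_atypical(gene_seq, genome_gc=0.49, margin=0.15):
    """Flag a gene whose G+C fraction is more than `margin` below the genome average.

    genome_gc defaults to the E. coli average quoted in the text; margin is an
    arbitrary illustrative cutoff, not a value from the study.
    """
    return gc_fraction(gene_seq) < genome_gc - margin
```

A gene at or below 30% G+C, such as those discussed here, would be flagged with these defaults, whereas a gene near the genome average would not.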
ACKNOWLEDGMENTS
This work was supported by Public Health Service award AI24870 from the NIAID. G.P.M. was supported in part by Public Health Service award AI10145.
We thank the following for kindly providing strains: Carleen Collins, Jim Kaper, Tony Maurelli, Jim Nataro, Dieter Schifferli, Gary Schoolnik, and Bob Simons. We thank Michael O'Neill for providing software for the analysis of potential binding sites and for many helpful discussions. We thank Annette Woodring for assistance with enzymatic assays.
REFERENCES
- 1. Adler B, Sasakawa C, Tobe T, Makino S, Komatsu K, Yoshikawa M. A dual transcriptional activation system for the 230 kb plasmid genes coding for virulence-associated antigens of Shigella flexneri. Mol Microbiol. 1989;3:627–635. doi: 10.1111/j.1365-2958.1989.tb00210.x.
- 2. Berg O G, von Hippel P H. Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters. J Mol Biol. 1987;193:723–750. doi: 10.1016/0022-2836(87)90354-8.
- 3. Bhende P M, Egan S M. Amino acid-DNA contacts by RhaS: an AraC family transcription activator. J Bacteriol. 1999;181:5185–5192. doi: 10.1128/jb.181.17.5185-5192.1999.
- 4. Busby S, Ebright R H. Promoter structure, promoter recognition, and transcription activation in prokaryotes. Cell. 1994;79:743–746. doi: 10.1016/0092-8674(94)90063-9.
- 5. Bustamante V H, Calva E, Puente J L. Analysis of cis-acting elements required for bfpA expression in enteropathogenic Escherichia coli. J Bacteriol. 1998;180:3013–3016. doi: 10.1128/jb.180.11.3013-3016.1998.
- 6. Caron J, Coffield L M, Scott J R. A plasmid-encoded regulatory gene, rns, required for expression of the CS1 and CS2 adhesins of enterotoxigenic Escherichia coli. Proc Natl Acad Sci USA. 1989;86:963–967. doi: 10.1073/pnas.86.3.963.
- 7. Caron J, Scott J R. A rns-like regulatory gene for colonization factor antigen I (CFA/I) that controls expression of CFA/I pilin. Infect Immun. 1990;58:874–878. doi: 10.1128/iai.58.4.874-878.1990.
- 8. Casadaban M J. Transposition and fusion of the lac genes to selected promoters in Escherichia coli using bacteriophage lambda and Mu. J Mol Biol. 1976;104:541–555. doi: 10.1016/0022-2836(76)90119-4.
- 9. D'Orazio S E, Collins C M. The plasmid-encoded urease gene cluster of the family Enterobacteriaceae is positively regulated by UreR, a member of the AraC family of transcriptional activators. J Bacteriol. 1993;175:3459–3467. doi: 10.1128/jb.175.11.3459-3467.1993.
- 10. D'Orazio S E, Collins C M. UreR activates transcription at multiple promoters within the plasmid-encoded urease locus of the Enterobacteriaceae. Mol Microbiol. 1995;16:145–155. doi: 10.1111/j.1365-2958.1995.tb02399.x.
- 11. D'Orazio S E, Thomas V, Collins C M. Activation of transcription at divergent urea-dependent promoters by the urease gene regulator UreR. Mol Microbiol. 1996;21:643–655. doi: 10.1111/j.1365-2958.1996.tb02572.x.
- 12. Dorman C J, Porter M E. The Shigella virulence gene regulatory cascade: a paradigm of bacterial gene control mechanisms. Mol Microbiol. 1998;29:677–684. doi: 10.1046/j.1365-2958.1998.00902.x.
- 13. Edwards R A, Schifferli D M. Differential regulation of fasA and fasH expression of Escherichia coli 987P fimbriae by environmental cues. Mol Microbiol. 1997;25:797–809. doi: 10.1046/j.1365-2958.1997.5161875.x.
- 14. Elias W P, Jr, Czeczulin J R, Henderson I R, Trabulsi L R, Nataro J P. Organization of biogenesis genes for aggregative adherence fimbria II defines a virulence gene cluster in enteroaggregative Escherichia coli. J Bacteriol. 1999;181:1779–1785. doi: 10.1128/jb.181.6.1779-1785.1999.
- 15. Froehlich B, Husmann L, Caron J, Scott J R. Regulation of rns, a positive regulatory factor for pili of enterotoxigenic Escherichia coli. J Bacteriol. 1994;176:5385–5392. doi: 10.1128/jb.176.17.5385-5392.1994.
- 16. Gallegos M T, Schleif R, Bairoch A, Hofmann K, Ramos J L. AraC/XylS family of transcriptional regulators. Microbiol Mol Biol Rev. 1997;61:393–410. doi: 10.1128/mmbr.61.4.393-410.1997.
- 17. Gralla J D, Collado-Vides J. Organization and function of transcription regulatory elements. In: Neidhardt F C, Curtiss III R, Ingraham J L, Lin E C C, Low K B, Magasanik B, Resnikoff W W, Riley M, Schaechter M, Umbarger H E, editors. Escherichia coli and Salmonella: cellular and molecular biology. 2nd ed. Washington, D.C.: ASM Press; 1996. pp. 1232–1245.
- 18. Guzman L M, Belin D, Carson M J, Beckwith J. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J Bacteriol. 1995;177:4121–4130. doi: 10.1128/jb.177.14.4121-4130.1995.
- 19. Ishihama A. Protein-protein communication within the transcription apparatus. J Bacteriol. 1993;175:2483–2489. doi: 10.1128/jb.175.9.2483-2489.1993.
- 20. Klaasen P, de Graaf F K. Characterization of FapR, a positive regulator of expression of the 987P operon in enterotoxigenic Escherichia coli. Mol Microbiol. 1990;4:1779–1783. doi: 10.1111/j.1365-2958.1990.tb00556.x.
- 21. McDaniel T K, Kaper J B. A cloned pathogenicity island from enteropathogenic Escherichia coli confers the attaching and effacing phenotype on E. coli K-12. Mol Microbiol. 1997;23:399–407. doi: 10.1046/j.1365-2958.1997.2311591.x.
- 22. Mellies J L, Elliott S J, Sperandio V, Donnenberg M S, Kaper J B. The Per regulon of enteropathogenic Escherichia coli: identification of a regulatory cascade and a novel transcriptional activator, the locus of enterocyte effacement (LEE)-encoded regulator (Ler). Mol Microbiol. 1999;33:296–306. doi: 10.1046/j.1365-2958.1999.01473.x.
- 23. Miller J H. Experiments in molecular genetics. Cold Spring Harbor, N.Y: Cold Spring Harbor Laboratory; 1972.
- 24. Munson G P, Scott J R. Binding site recognition by Rns, a virulence regulator in the AraC family. J Bacteriol. 1999;181:2110–2117. doi: 10.1128/jb.181.7.2110-2117.1999.
- 25. Munson G P, Scott J R. Rns, a virulence regulator within the AraC family, requires binding sites upstream and downstream of its own promoter to function as an activator. Mol Microbiol. 2000;36:1391–1402. doi: 10.1046/j.1365-2958.2000.01957.x.
- 26. Nakayama S, Watanabe H. Identification of cpxR as a positive regulator essential for expression of the Shigella sonnei virF gene. J Bacteriol. 1998;180:3522–3528. doi: 10.1128/jb.180.14.3522-3528.1998.
- 27. Nataro J P, Yikang D, Yingkang D, Walker K. AggR, a transcriptional activator of aggregative adherence fimbria I expression in enteroaggregative Escherichia coli. J Bacteriol. 1994;176:4691–4699. doi: 10.1128/jb.176.15.4691-4699.1994.
- 28. O'Neill M C. Consensus methods for finding and ranking DNA binding sites. Application to Escherichia coli promoters. J Mol Biol. 1989;207:301–310. doi: 10.1016/0022-2836(89)90256-8.
- 29. Porter M E, Smith S G, Dorman C J. Two highly related regulatory proteins, Shigella flexneri VirF and enterotoxigenic Escherichia coli Rns, have common and distinct regulatory properties. FEMS Microbiol Lett. 1998;162:303–309. doi: 10.1111/j.1574-6968.1998.tb13013.x.
- 30. Puente J L, Bieber D, Ramer S W, Murray W, Schoolnik G K. The bundle-forming pili of enteropathogenic Escherichia coli: transcriptional regulation by environmental signals. Mol Microbiol. 1996;20:87–100. doi: 10.1111/j.1365-2958.1996.tb02491.x.
- 31. Qi Y, Hulett F M. PhoP-P and RNA polymerase sigmaA holoenzyme are sufficient for transcription of Pho regulon promoters in Bacillus subtilis: PhoP-P activator sites within the coding region stimulate transcription in vitro. Mol Microbiol. 1998;28:1187–1197. doi: 10.1046/j.1365-2958.1998.00882.x.
- 32. Rhee S, Martin R G, Rosner J L, Davies D R. A novel DNA-binding motif in MarA: the first structure for an AraC family transcriptional activator. Proc Natl Acad Sci USA. 1998;95:10413–10418. doi: 10.1073/pnas.95.18.10413.
- 33. Rhodius V A, Busby S J. Positive activation of gene expression. Curr Opin Microbiol. 1998;1:152–159. doi: 10.1016/s1369-5274(98)80005-2.
- 34. Schneider T D, Stephens R M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990;18:6097–6100. doi: 10.1093/nar/18.20.6097.
- 35. Schneider T D, Stormo G D, Gold L, Ehrenfeucht A. Information content of binding sites on nucleotide sequences. J Mol Biol. 1986;188:415–431. doi: 10.1016/0022-2836(86)90165-8.
- 36. Simons R W, Houman F, Kleckner N. Improved single and multicopy lac-based cloning vectors for protein and operon fusions. Gene. 1987;53:85–96. doi: 10.1016/0378-1119(87)90095-3.
- 37. Sperandio V, Mellies J L, Nguyen W, Shin S, Kaper J B. Quorum sensing controls expression of the type III secretion gene transcription and protein secretion in enterohemorrhagic and enteropathogenic Escherichia coli. Proc Natl Acad Sci USA. 1999;96:15196–15201. doi: 10.1073/pnas.96.26.15196.
- 38. Szalewska-Palasz A, Wegrzyn A, Blaszczak A, Taylor K, Wegrzyn G. DnaA-stimulated transcriptional activation of orilambda: Escherichia coli RNA polymerase beta subunit as a transcriptional activator contact site. Proc Natl Acad Sci USA. 1998;95:4241–4246. doi: 10.1073/pnas.95.8.4241.
- 39. Thomas V J, Collins C M. Identification of UreR binding sites in the Enterobacteriaceae plasmid-encoded and Proteus mirabilis urease gene operons. Mol Microbiol. 1999;31:1417–1428. doi: 10.1046/j.1365-2958.1999.01283.x.
- 40. Tobe T, Sasakawa C, Okada N, Honma Y, Yoshikawa M. vacB, a novel chromosomal gene required for expression of virulence genes on the large plasmid of Shigella flexneri. J Bacteriol. 1992;174:6359–6367. doi: 10.1128/jb.174.20.6359-6367.1992.
- 41. Tobe T, Schoolnik G K, Sohel I, Bustamante V H, Puente J L. Cloning and characterization of bfpTVW, genes required for the transcriptional activation of bfpA in enteropathogenic Escherichia coli. Mol Microbiol. 1996;21:963–975. doi: 10.1046/j.1365-2958.1996.531415.x.
- 42. Tobe T, Yoshikawa M, Mizuno T, Sasakawa C. Transcriptional control of the invasion regulatory gene virB of Shigella flexneri: activation by virF and repression by HNS. J Bacteriol. 1993;175:6142–6149. doi: 10.1128/jb.175.19.6142-6149.1993.
- 43. Wattiau P, Cornelis G R. Identification of DNA sequences recognized by VirF, the transcriptional activator of the Yersinia yop regulon. J Bacteriol. 1994;176:3878–3884. doi: 10.1128/jb.176.13.3878-3884.1994.
- 44. Willshaw G A, Smith H R, McConnell M M, Rowe B. Cloning of regulator genes controlling fimbrial production by enterotoxigenic Escherichia coli. FEMS Microbiol Lett. 1991;66:125–129. doi: 10.1016/0378-1097(91)90320-a.
- 45. Woodcock D M, Crowther P J, Doherty J, Jefferson S, DeCruz E, Noyer-Weidner M, Smith S S, Michael M Z, Graham M W. Quantitative evaluation of Escherichia coli host strains for tolerance to cytosine methylation in plasmid and phage recombinants. Nucleic Acids Res. 1989;17:3469–3478. doi: 10.1093/nar/17.9.3469.


