Abstract
The genes coding for the β (rpoB) and β′ (rpoC) subunits of RNA polymerase are fused in the gastric pathogen Helicobacter pylori but separate in other taxonomic groups. To better understand how the unique fused structure evolved, we determined DNA sequences at and around the rpoB-rpoC junction in 10 gastric and nongastric species of Helicobacter and in members of the related genera Wolinella, Arcobacter, Sulfurospirillum, and Campylobacter. We found the fusion to be specific to Helicobacter and Wolinella genera; rpoB and rpoC overlap in the other genera. The fusion may have arisen by a frameshift mutation at the site of rpoB and rpoC overlap. Loss of good Shine-Dalgarno sequences might then have fixed the fusion in the Helicobacteraceae, even if fusion itself did not confer a selective advantage.
A most unexpected natural fusion of rpoB and rpoC, the genes coding for the β and β′ subunits of DNA-dependent RNA polymerase (RNAP), respectively, was discovered while sequencing the genome of Helicobacter pylori 26695 (7), an ε-group proteobacterium that is the primary cause of peptic ulcer disease and an early risk factor for gastric cancer (4). We found that this extraordinary rpoB-rpoC fusion is typical of H. pylori as a species and also of one other gastric Helicobacter tested (H. felis) and that it results in a stable fused β-β′ subunit of RNAP (8). In contrast, Campylobacter jejuni and Campylobacter fetus, species that are related to H. pylori but colonize intestinal rather than gastric sites and often cause diarrheal disease, have separate rpoB and rpoC genes (8), as do all other eubacterial species studied to date.
If the rpoB-rpoC fusion were a characteristic and specific feature of all gastric helicobacters, it might contribute to the special ability of these bacteria to colonize their unique gastric niche. For example, one can speculate that the tethered structure of RNAP β and β′ is useful for H. pylori and other gastric helicobacters in facilitating the multisubunit RNAP assembly in the hostile, urea-rich or low-pH gastric environment. A simple prediction of such a model is that rpoB and rpoC genes might be separate in helicobacters that colonize nongastric sites. To test this prediction and to better understand how this unusual gene structure evolved, we studied the distribution of translational fusion of rpoB and rpoC.
Our collection of ε-group proteobacteria included 10 species of Helicobacter. Two of these species colonize gastric sites, and the rest colonize intestinal sites. In addition, three species of Arcobacter, which are significant animal and occasional human pathogens (1); Wolinella succinogenes; Campylobacter rectus (formerly known as Wolinella recta [5]); and one species of Sulfurospirillum, S. barnesii, all of which are Helicobacter related, were included in the analysis. Two primers that target sequences flanking the rpoB-rpoC junction and that are complementary to highly conserved sequences in the H. pylori 26695 rpoB-rpoC gene (8) were used for PCR amplification with genomic DNA from these organisms. In all cases, a single major PCR fragment ca. 500 bp in length was amplified. Each fragment was cloned and sequenced, and the DNA sequences were used to derive amino acid sequences of the RNAP subunit(s). The translational fusion of rpoB and rpoC was maintained in all helicobacters, gastric and nongastric, as well as in W. succinogenes. Based on our previous results with H. pylori (6, 8), we conclude that these organisms use a natural fusion polypeptide as the sole source of the largest RNAP subunits, β and β′. In contrast, equivalent DNA sequence analysis showed that the rpoB and rpoC reading frames are separate in all three Arcobacter species, as well as in C. rectus and Sulfurospirillum. Hence, the β and β′ subunits are separate in these organisms. The likely position of an ATG codon coding for the β′ subunit Met1 was inferred based on sequence comparisons with β′ sequences from other organisms and on the presence of an appropriately spaced A- and G-rich sequence that could serve as a ribosome-binding site. The analysis suggests that in arcobacters and in C. rectus, the appropriate ATG codon overlaps the last two codons of the rpoB gene, as is also the case in C. jejuni and C. fetus (8).
It is worth noting that the inferred initiating ATG is found in the same reading frame, −1, relative to the rpoB reading frame in all these organisms with separate rpoB and rpoC genes (Fig. 1). In Sulfurospirillum, the rpoB-rpoC gene overlap is more extensive, as the two genes overlap by 7 codons (Fig. 1).
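The reading-frame logic described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, where the start codon was inferred from sequence comparisons and from a candidate ribosome-binding site; it only shows, under simplifying assumptions, how a junction fragment read in the rpoB frame could be labeled as fused or separate, and how the relative frame and the overlap of a candidate ATG could be computed. The function names, the `window` parameter, and both toy sequences are hypothetical and are not data from this paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq, frame=0):
    """Index of the first stop codon read in `frame`, or None if the frame stays open."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def signed_frame(offset):
    """Express an offset relative to the upstream frame as 0, +1, or -1."""
    r = offset % 3
    return -1 if r == 2 else r


def classify_junction(seq, window=30):
    """Classify a junction fragment whose first base is codon position 1 of the upstream gene.

    Assumption (hypothetical heuristic): the downstream start is the first ATG
    within `window` nt of the upstream stop whose frame stays open to the end
    of the fragment.
    """
    seq = seq.upper()
    stop = first_stop(seq, 0)
    if stop is None:
        # The upstream frame reads through the whole fragment: translational fusion.
        return {"state": "fused"}
    upstream_end = stop + 3  # first position after the upstream stop codon
    for start in range(max(0, stop - window), min(len(seq) - 2, upstream_end + window)):
        if seq[start:start + 3] == "ATG" and first_stop(seq, start) is None:
            return {
                "state": "separate",
                "start": start,
                "frame": signed_frame(start),
                # Overlap counted in nucleotides, including the upstream stop codon.
                "overlap_nt": max(0, upstream_end - start),
            }
    return {"state": "separate", "start": None, "frame": None, "overlap_nt": None}


# Toy fragments (hypothetical, not sequences determined in this work).
TOY_SEPARATE = "GCTGAAGCATGAAAGATTTACTTGAAG"  # ATG begins at the third base of a codon
TOY_FUSED = "GCTGAAGCAGGAAAAGATTTACTT"        # no stop codon in the upstream frame

print(classify_junction(TOY_SEPARATE))
print(classify_junction(TOY_FUSED))
```

In the toy separate fragment, the ATG starts at the third position of an upstream codon, which corresponds to the −1 frame in the convention used here.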
The resultant collection of deduced amino acid sequences was aligned by using the Clustal method, and the phylogenetic tree shown in Fig. 2A was built with the DNASTAR program. The previously determined sequences of the rpoB-rpoC junction of C. jejuni, C. fetus, H. felis (8), and H. pylori (7) were also included in this analysis. As can be seen, the phylogenetic tree reveals two major clusters: members of the family Helicobacteraceae (the helicobacters and W. succinogenes) and members of the family Campylobacteraceae (arcobacters, campylobacters, and S. barnesii). The campylobacters and arcobacters form their own coherent groups within their cluster, with Sulfurospirillum just “outside” the campylobacters. Helicobacters also form a coherent group with W. succinogenes just outside. Overall, these data correlate very well (with some minor differences) with 16S rRNA data (Fig. 2B) as well as with an rpoB-rpoC tree built using DNA sequences (data not shown). Previous rRNA sequence analysis had suggested the reclassification of C. rectus from its initial placement as a Wolinella (5). Our analysis firmly supports this reclassification, based on (i) sequence analysis per se and (ii) the fact that rpoB and rpoC in this organism are separate genes.
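As a rough illustration of how a tree can be derived from aligned sequences, the following Python sketch computes pairwise p-distances from a pre-computed alignment and clusters them by UPGMA. This is a swapped-in, simplified technique chosen for brevity; the trees in Fig. 2 were built with Clustal alignment and the DNASTAR program, and the clustering method used there is not stated here. The sequence names and residues below are hypothetical placeholders, not sequences from this study.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing residues, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned):
    """Cluster aligned sequences by UPGMA; return the topology as a Newick-like string."""
    sizes = {name: 1 for name in aligned}
    dist = {
        frozenset((a, b)): p_distance(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    }
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        # Distance from the new cluster to each remaining one: size-weighted average.
        for c in sizes:
            if c in (a, b):
                continue
            d = (
                dist[frozenset((a, c))] * sizes[a]
                + dist[frozenset((b, c))] * sizes[b]
            ) / (sizes[a] + sizes[b])
            dist[frozenset((merged, c))] = d
        # Drop every distance that involves one of the two joined clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Hypothetical toy alignment: two "H" and two "C" sequences of equal length.
TOY_ALIGNMENT = {
    "H1": "MKLVTDAE",
    "H2": "MKLVTDAD",
    "C1": "MRIVSEAE",
    "C2": "MRIVSQGE",
}

print(upgma(TOY_ALIGNMENT))  # prints ((C1,C2),(H1,H2));
```

UPGMA assumes a constant rate of change across lineages, so for real data a dedicated phylogenetics package with an explicit substitution model would be the more defensible choice.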
The most striking feature of the phylogenetic tree presented in Fig. 2 is that all members of the Helicobacteraceae, both gastric and nongastric, have fused rpoB and rpoC genes (shown by bold lines in Fig. 2A). In contrast, in members of the Campylobacteraceae, these two genes are separate. Although the rpoB and rpoC genes are separated by an untranslated linker of 20 to 100 bp in most bacterial species, in the Campylobacteraceae, these two genes partially overlap. Interestingly, the only other eubacterial species with an rpoB-rpoC overlap is Aquifex aeolicus, apparently the most deeply branching member of the eubacteria, based on 16S rRNA sequence analysis (2). However, the A. aeolicus rpoB and rpoC sequences strongly resemble proteobacterial sequences based on their primary sequence and the presence of long dispensable regions typical of proteobacteria (data not shown). In addition, phylogenetic analysis of primary sigma factors also places A. aeolicus and H. pylori together (3). These results strongly suggest the horizontal transfer of rpo genes during the evolution of A. aeolicus.
The fused rpoB-rpoC structure of the Helicobacteraceae probably originated in the common ancestor of present-day helicobacters and wolinellas by a simple frameshift mutation (either an insertion of 1 bp or a deletion of 2 bp) at the site of the rpoB and rpoC overlap. This original frameshift mutation might have been an evolutionary accident that was neither specifically selected against nor required (at least initially) for gastric colonization. Indeed, an engineered H. pylori strain with separated rpoB and rpoC genes is viable and can colonize conventional mice, at least in the short term (6). Additional experiments are needed, however, to examine more closely the possible contribution of the rpoB-rpoC fusion to H. pylori fitness during chronic infection and severe inflammatory responses.
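The frame arithmetic behind this proposal can be checked with a short Python sketch. If the downstream gene starts in the −1 frame (offset 2 modulo 3 relative to the upstream codons), an insertion of 1 bp or a deletion of 2 bp within the overlap brings it to offset 0, whereas a deletion of 1 bp does not. The toy sequence, the mutation positions, and the helper names are hypothetical and serve only to illustrate the arithmetic; they are not sequences from this study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame=0):
    """Split a sequence into complete codons starting at `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def frame_after_indel(offset, net_change):
    """Offset (mod 3) of the downstream gene after a net insertion (+) or deletion (-)."""
    return (offset + net_change) % 3


def is_fused(seq):
    """True if the upstream frame (frame 0) contains no stop codon in the fragment."""
    return not any(c in STOP_CODONS for c in codons(seq))


# Hypothetical overlap: upstream frame ends ...GCA TGA; downstream ATG starts at index 8.
TOY_OVERLAP = "GCTGAAGCATGAAAGATTTACTTGAAG"
DOWNSTREAM = codons(TOY_OVERLAP, 8)[1:]  # downstream codons following the toy ATG

insertion = TOY_OVERLAP[:9] + "C" + TOY_OVERLAP[9:]   # +1 bp inside the overlap
deletion = TOY_OVERLAP[:9] + TOY_OVERLAP[11:]         # -2 bp inside the overlap
single_del = TOY_OVERLAP[:9] + TOY_OVERLAP[10:]       # -1 bp: frames stay out of register

for label, seq in [("+1 bp", insertion), ("-2 bp", deletion), ("-1 bp", single_del)]:
    print(label, "fused" if is_fused(seq) else "not fused", codons(seq))
```

In this toy case both the 1-bp insertion and the 2-bp deletion also remove the upstream stop codon, because the stop lies within the overlap; a mutation placed downstream of the stop would shift the frame without producing a fusion.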
The sequences reported in this paper have been deposited in GenBank (accession no. AF136503 to AF136518).
Acknowledgments
This work was supported by the Burroughs Wellcome Career Award, a Charles and Johanna Busch Biomedical grant, and NIH grant RO1 GM 59295 (to K.S.); NIH grants DK48029 and AI138166 (to D.E.B.); and NIDCR grants DE-10374 (to F.E.D.) and DE-11443 (to B.J.P.).
We are grateful to John F. Stolz for providing S. barnesii DNA. N.Z. is a recipient of a Charles and Johanna Busch Postdoctoral Fellowship.
ADDENDUM IN PROOF
Recent phylogenetic analysis of rpoB and rpoC genes of Aquifex pyrophilus confirms the placement of the genus Aquifex within or close to the ε group of proteobacteria (H.-P. Klenk, T.-D. Meier, P. Durovic, V. Schwass, F. Lottspeich, D. P. Dennis, and W. Zillig, J. Mol. Evol. 48:528–541, 1999).
REFERENCES
- 1. Anderson K F, Kiehlbauch J A, Anderson D C, McClure H M, Wachsmuth I K. Arcobacter (Campylobacter) butzleri-associated diarrheal illness in a nonhuman primate population. Infect Immun. 1993;61:2220–2223. doi: 10.1128/iai.61.5.2220-2223.1993.
- 2. Deckert G, Warren P V, Gaasterland T, Young W G, Lenox A L, Graham D E, Overbeek R, Snead M A, Keller M, Aujay M, Huber R, Feldman R A, Short J M, Olsen G J, Swanson R V. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature. 1998;392:353–358. doi: 10.1038/32831.
- 3. Gruber T M, Bryant D A. Characterization of the group 1 and group 2 sigma factors of the green sulfur bacterium Chlorobium tepidum and the green non-sulfur bacterium Chloroflexus aurantiacus. Arch Microbiol. 1998;170:285–296. doi: 10.1007/s002030050644.
- 4. Parsonnet J, Harris R A, Hack H M, Owens D K. Modeling cost-effectiveness of Helicobacter pylori screening to prevent gastric cancer: a mandate for clinical trials. Lancet. 1996;348:150–154. doi: 10.1016/s0140-6736(96)01501-2.
- 5. Paster B J, Dewhirst F E. Phylogeny of campylobacters, wolinellas, Bacteroides gracilis, and Bacteroides ureolyticus by 16S rRNA sequencing. Int J Syst Bacteriol. 1988;38:56–62.
- 6. Raudonikiene A, Zakharova N, Su W W, Jeong J-Y, Bryden L, Hoffman P S, Berg D E, Severinov K. Helicobacter pylori with separate β and β′ subunits of RNA polymerase is viable and can colonize conventional mice. Mol Microbiol. 1999;32:131–138. doi: 10.1046/j.1365-2958.1999.01336.x.
- 7. Tomb J F, White O, Kerlavage A R, Clayton R A, Sutton G G, Fleischmann R D, Ketchum K A, Klenk H P, Gill S, Dougherty B A, Nelson K, Quackenbush J, Zhou L, Kirkness E F, Peterson S, Loftus B, Richardson B D, Dodson R, Khalak H G, Glodek A, McKenney K, Fitzgerald L M, Lee N, Adams M D, Venter J C, et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997;388:539–547. doi: 10.1038/41483.
- 8. Zakharova N, Hoffman P S, Berg D E, Severinov K. The largest subunits of RNA polymerase from gastric helicobacters are tethered. J Biol Chem. 1998;273:19371–19374. doi: 10.1074/jbc.273.31.19371.