Abstract
We analyzed several features of five currently available δ-proteobacterial genomes, including two aerobic bacteria exhibiting predatory behavior and three anaerobic sulfate-reducing bacteria. The δ genomes are distinguished from other bacteria by several properties: (i) the δ genomes contain two “giant” S1 ribosomal protein genes in contrast to all other bacterial types, which encode a single or no S1; (ii) in most δ-proteobacterial genomes the major ribosomal protein (RP) gene cluster is near the replication terminus whereas most bacterial genomes place the major RP cluster near the origin of replication; (iii) the δ genomes possess the rare combination of discriminating asparaginyl and glutaminyl tRNA synthetase (AARS) together with the amido-transferase complex (GatCAB) genes that modify Asp-tRNAAsn into Asn-tRNAAsn and Glu-tRNAGln into Gln-tRNAGln; (iv) the TonB receptors and ferric siderophore receptors that facilitate uptake and removal of complex metals are common among δ genomes; (v) the anaerobic δ genomes encode multiple copies of the anaerobic detoxification protein rubrerythrin that can neutralize hydrogen peroxide; (vi) σ54 activators play a more important role in the δ genomes than in other bacteria, and δ genomes have a plethora of enhancer binding proteins that respond to environmental and intracellular cues, often as part of two-component systems; (vii) δ genomes encode multiple copies of metallo-β-lactamase enzymes; (viii) a host of secretion proteins emphasizing SecA, SecB, and SecY may be especially useful in the predatory activities of Myxococcus xanthus; and (ix) δ proteobacteria drive many multiprotein machines in their periplasms and outer membrane, including chaperone-feeding machines, jets for slime secretion, and type IV pili. Bdellovibrio replicates in the periplasm of prey cells. The sulfate-reducing δ proteobacteria metabolize hydrogen and generate a proton gradient by electron transport.
The predicted highly expressed genes from δ genomes reflect their different ecologies, metabolic strategies, and adaptations.
Keywords: δ proteobacteria, Myxococcus xanthus, sulfate-reducing bacteria, predatory bacteria, σ54 activators
The δ proteobacteria (δ genomes) are defined by their 16S rRNA sequence (1). Completely sequenced δ genomes include the multicellular predator Myxococcus xanthus (MYXXA) (D.K., W. C. Nierman, B. S. Goldman, S. Slater, A. S. Durkini, J. Eisen, C. M. Ronning, W. B. Barbazuk, M. Blanchard, C. Field, et al., unpublished results), the unicellular predator Bdellovibrio bacteriovorus (BDEBA) (2), and the three anaerobic sulfate-reducing bacteria Desulfovibrio vulgaris (DESVU) (3), Geobacter sulfurreducens (GEOSU) (4), and Desulfotalea psychrophila (DESPS) (5). Whereas MYXXA, at 9.14 Mb, is among the largest bacterial genomes sequenced, the other four δ genomes are all 3.5 to 4.0 Mb in size.
MYXXA lives in cultivated topsoil, where it is often exposed to solar radiation and is well aerated. It has two life stages, growth and development, both of which involve remarkable cellular cooperation and much gliding movement (6–8). Myxobacteria are proficient predators of whole colonies of other soil microbes. MYXXA encodes many duplicated proteins expressed during the different life stages, e.g., Lon (9) and serine-threonine protein kinases (10). BDEBA is ubiquitous in terrestrial and aquatic habitats. It preys on individual Gram-negative bacterial cells by invading their periplasm and transforming them into nearly spherical structures called bdelloplasts (2). A detailed scenario for the adhesion and colonization of prey bacteria by BDEBA is reviewed in refs 2 and 11. DESVU is a strictly anaerobic sulfate-reducing bacterium (3) with a substantial capacity to oxidize metal ions, many of which are toxic. It has an extensive network of periplasmic hydrogenases and cytochromes for electron transport (3). GEOSU implements bioremediation by precipitating various soluble heavy metals (4). GEOSU encodes many predicted highly expressed (PHX) cytochrome c and ferredoxin proteins that are used for periplasmic and outer membrane electron transport (4). DESPS is a psychrophilic sulfate-reducing bacterium that uses sulfate as the main electron acceptor and lactate and alcohols as major carbon and electron sources (5). Its optimal doubling time is 27 h during growth on lactate at 10°C, but it can also grow successfully at a temperature <0°C (5).
This article has two objectives. First, PHX genes and related properties of the five δ genomes are analyzed (Table 1). Qualitatively, a gene can be defined PHX if its codon frequencies are similar to highly expressed genes such as those for ribosomal proteins (RP) or for major transcription/translation factors (TF) or for the principal chaperone/degradation (CH) proteins, but deviate strongly in codon frequencies from the average gene of the genome (see Methods for precise criteria). Second, various properties that distinguish the δ genomes from other bacteria are described.
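The qualitative PHX definition above can be illustrated with a minimal sketch. It assumes a codon-bias measure of the form B(g|S) = Σ_a p_a(g) Σ_c |g(c) − s(c)|, where codon frequencies are normalized within each amino acid family and p_a(g) is the amino acid frequency in gene g, and an expression ratio E(g) = B(g|C) / (½B(g|RP) + ¼B(g|CH) + ¼B(g|TF)). The weights, the function names, and the toy sequences are assumptions made for illustration; the precise criteria are those given in Methods.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def codon_counts(seqs):
    """Count sense codons over a list of in-frame coding sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":  # skip stops and unknowns
                counts[codon] += 1
    return counts


def family_freqs(counts):
    """Return (codon frequency within its amino acid family, amino acid frequency)."""
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n
    total = sum(aa_totals.values())
    codon_freq = {c: n / aa_totals[CODON_TABLE[c]] for c, n in counts.items()}
    aa_freq = {aa: n / total for aa, n in aa_totals.items()}
    return codon_freq, aa_freq


def codon_bias(gene_counts, ref_counts):
    """Assumed bias measure B(g|S) of a gene relative to a reference gene set."""
    g, p = family_freqs(gene_counts)
    s, _ = family_freqs(ref_counts)
    return sum(
        p[aa] * sum(abs(g.get(c, 0.0) - s.get(c, 0.0))
                    for c in CODONS if CODON_TABLE[c] == aa)
        for aa in p
    )


def expression_level(gene, all_genes, rp, ch, tf):
    """Assumed E(g): bias from the average gene over weighted bias from RP, CH, TF."""
    gc = codon_counts([gene])
    denom = (0.5 * codon_bias(gc, codon_counts(rp))
             + 0.25 * codon_bias(gc, codon_counts(ch))
             + 0.25 * codon_bias(gc, codon_counts(tf)))
    return codon_bias(gc, codon_counts(all_genes)) / denom


# Toy example (hypothetical sequences): the gene resembles the RP/CH/TF sets
# and differs from the genome average, so E(g) comes out well above 1.
toy_gene = "GCTGCTGCTGCA"
toy_reference = ["GCTGCTGCTGCT"]
toy_genome = ["GCAGCAGCAGCA"]
toy_e = expression_level(toy_gene, toy_genome, toy_reference, toy_reference, toy_reference)
```

A gene whose codon usage sits close to the RP, CH, and TF sets but far from the average gene yields a small denominator and a large numerator, which is the pattern the qualitative definition describes.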
Table 1.
δ-Proteobacterial complete genomes
| Genome | G + C genome frequency, % | PHX genes, % | Max E(g) |
|---|---|---|---|
| MYXXA | 68.9 | 19.2 | 2.02 |
| BDEBA | 50.7 | 7.3 | 2.87 |
| DESVU | 63.2 | 15.1 | 1.90 |
| GEOSU | 60.9 | 5.2 | 1.33 |
| DESPS | 46.8 | 8.6 | 1.63 |
Results and Discussion
Distinctive PHX Genes of δ Genomes.
MYXXA shows the highest percentage of PHX genes (19.2%) among all bacterial genomes sequenced to date (12–16). The top PHX gene in MYXXA encodes the preprotein SecA translocase subunit [E(g) = 2.02] (see Table 6, which is published as supporting information on the PNAS web site). The high E(g) value suggests that secretion plays a major role in the MYXXA lifestyle. Of almost equal predicted expression level are the RNA polymerase subunits RpoC [E(g) = 2.02] and RpoB [E(g) = 1.84] and the ATP-dependent protease Lon [E(g) = 1.95]. A highly expressed protein [E(g) = 2.01] of unknown function that could be an attractive candidate for experimental analyses is encoded at genome positions 4160152–4162320. The PHX genes of BDEBA reach the high E(g) level of 2.87, suggesting that BDEBA should be considered a fast-growing organism (16, 17). In fact, once inside the bdelloplast, BDEBA does multiply rapidly, producing several descendants from a single Escherichia coli host cell (11). BDEBA also encodes a variety of periplasmic PHX electron transporters that adapt it to the microaerophilic conditions likely found within the host’s periplasm (11). Of the 46 RP genes ≥80 aa in length in the BDEBA genome, 45 are PHX, a high proportion consistent with the proposition that BDEBA is fast growing (17). DESVU contains several PHX anaerobic detoxification genes, including two rubrerythrin genes and two rubredoxin oxidoreductase genes that can protect the organism against oxidative stress or other reactive toxins (18, 19). The gene of highest PHX level [E(g) = 1.90] is the large ribosomal protein gene S1. The PHX genes of GEOSU encompass only 5% of the proteome and have a maximum E(g) = 1.33 for the RNA processing/degradation gene pnp (polynucleotide phosphorylase). The low expression levels of its PHX genes suggest that GEOSU is prone to grow slowly (16, 17). This genome also has few PHX chaperone genes.
Energy Metabolism in δ Genomes.
MYXXA and BDEBA feature many PHX genes for aerobic metabolism. These genes include the NADH dehydrogenase (Nuo) complex and associated enzymes of respiration, most tricarboxylic acid (TCA) cycle enzymes (see below), and the cytochrome c oxidase operon comprising four subunits arranged in the order II-I-III-IV, all PHX (see Table 6). GEOSU and DESVU contain the same operon, but none of the subunits are PHX. DESPS is missing the whole operon. These qualities correlate with the anaerobic lifestyle. It appears that MYXXA has evolved to efficiently produce ATP over a wide range of oxygen concentrations in accord with its habit of sporulating within the environment of a fruiting body covered with slime.
The two basic metabolic pathways, glycolysis and the TCA cycle, involve 10 and 15 genes, respectively (see Table 7, which is published as supporting information on the PNAS web site). Counts of PHX genes in these pathways of the δ genomes are reported in Table 2. Microbes with a preference for aerobic growth usually feature many PHX genes functioning in the TCA cycle and few PHX genes in glycolysis (16, 17). By contrast, microbes adapted for fermentation of carbohydrates generally involve more glycolysis PHX genes. Microbes with many PHX genes in both respiratory and fermentative pathways are predicted to be facultative aerobes. In symbiotic or parasitic organisms, few of the glycolytic enzymes and few of the TCA enzymes tend to be PHX (17). An aerobic environment is also predicted when the genome encodes several PHX oxygen detoxifying genes (17).
Table 2.
PHX genes of important pathways
| Genomes | Glycolysis* | TCA-cycle* | Detoxification* |
|---|---|---|---|
| MYXXA | 8 (8) | 11 (13) | 10 (14) |
| BDEBA | 3 (3) | 7 (8) | 5 (5) |
| DESVU | 4 (4) | 1 (1) | 1 (1) |
| GEOSU | 1 (1) | 5 (5) | 3 (3) |
| DESPS | 1 (1) | 1 (1) | 0 (0) |
DESVU has three anaerobic detoxification genes (rubrerythrin, rubredoxin-oxygen oxidoreductase, and nigerythrin).
*Number of distinct PHX genes (number of PHX genes with repeats).
TCA Cycle Gene Cluster.
In MYXXA, the PHX genes for the 2-oxoglutarate dehydrogenase E1 (sucA) and E2 (sucB) components, which overlap by 10 bp, are encoded in the same operon. Moreover, the six PHX genes for succinyl-CoA synthase α subunit (sucD), β subunit (sucC), succinate dehydrogenase flavoprotein subunit (sdhA), succinate dehydrogenase iron-sulfur protein (sdhB), malate dehydrogenase (mdh), and isocitrate dehydrogenase (icd) of the TCA cycle form a gene cluster, possibly a single operon (displayed in Scheme 1). Overall, MYXXA has 11 distinct PHX TCA cycle genes, or 13 when two duplications are included (Table 2 and Scheme 1). The successive gaps between genes in Scheme 1 are 32, 219, 44, 286, and 29 bp. The successive genes are of sizes 431, 313, 625, 268, 385, and 298 aa. Gene orientation is indicated by arrows. The coordinates below the grid indicate the starting position of each gene.
Scheme 1.
Cluster of six TCA cycle genes.
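As a rough worked example, the gene sizes and intergenic gaps quoted for this cluster give an approximate total span. The sketch assumes that each quoted size excludes the stop codon and that each gap is measured from the end of one gene to the start of the next; both are assumptions, not statements from the scheme.

```python
# Sizes (aa) and intergenic gaps (bp) as quoted in the text for the six-gene cluster.
gene_sizes_aa = [431, 313, 625, 268, 385, 298]
gaps_bp = [32, 219, 44, 286, 29]

# Assumption: add one stop codon per gene when converting aa to bp.
coding_bp = sum(3 * (aa + 1) for aa in gene_sizes_aa)
span_bp = coding_bp + sum(gaps_bp)

print(f"coding: {coding_bp} bp, gaps: {sum(gaps_bp)} bp, span: {span_bp} bp")
```

Under these assumptions the cluster spans roughly 7.6 kb, of which about 0.6 kb is intergenic.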
TCA cycle genes of the δ genomes BDEBA, GEOSU, and DESPS are organized similarly in that sucA is adjacent to sucB, sucC is adjacent to sucD, and the succinate dehydrogenase flavoprotein subunit (sdhA) and the succinate dehydrogenase iron-sulfur subunit (sdhB) are encoded as part of a single operon. BDEBA and DESPS encode complete sets of TCA cycle enzymes (8 and 1, respectively, PHX). DESVU features a fusion of sucC and sucD (sucCD) but, apart from icd (PHX), mdh, and fumC, lacks the other genes of the TCA cycle. Five TCA cycle enzymes are PHX in GEOSU.
Glycolysis.
Genes encoding glycolytic enzymes are broadly distributed among the δ genomes, with a single cluster in each genome. Explicitly, MYXXA, BDEBA, and GEOSU each cluster the genes gap, pgk, and tpi, probably in a single operon; DESVU clusters the genes fba and gap; and DESPS clusters tpi and pgk. MYXXA contains two copies of pyk and of pfk and DESVU contains two copies of gap. DESVU expresses few PHX TCA cycle genes but many PHX fermentation genes, as expected for an organism that grows anaerobically on sugars.
PHX Genes Contributing to MYXXA Social Behavior.
MYXXA is remarkable for the ability of its cells to cooperate in coordinated cell movements and in building fruiting bodies when starved. Its repertoire of PHX genes reflects its capacity for sensing its environment and for cell–cell interactions. For example, there is a large family of signal responsive σ54 activators, also called enhancer-binding proteins. This action contrasts with most other Gram-negative bacteria where σ70 factors predominate and the σ54-dependent RNA polymerase complex is accessory (20). In particular, σ54-dependent transcription plays an important role in fruiting body development (17).
MYXXA has 49 PHX chaperone/degradation genes, including 8 PHX DnaK genes, the most of any bacterial genome sequenced to date, and an additional 7 DnaK genes that are not PHX (see Table 8, which is published as supporting information on the PNAS web site). Among the other ≈200 prokaryotic genomes currently available, there is usually a single DnaK per genome, and at most five (data not shown). In sharp contrast to MYXXA, each of the other four δ genomes encodes a single DnaK gene (Table 8). MYXXA also has multiple PHX peptidyl-prolyl cis-trans isomerase (PPI) genes embracing all three known types (FKBP, cyclophilin, and parvulin). Many PHX degradation genes (clpX, clpA/B, clpP, hslU, hslV) and two htpX (HSP90) are conspicuous (see the list of PHX in Tables 9–15, which are published as supporting information on the PNAS web site). The chaperone/degradation genes tig, clpP, clpX, pep, and lon occur as a gene cluster as depicted in Scheme 2. Two PHX duplicates of chaperonin groEL share ≥75% identity.
Scheme 2.
We conjecture that many of the chaperone/degradation genes were acquired concomitant with predation in MYXXA. Several strongly PHX groups of genes may facilitate predation. These groups include genes involved in secretion of digestive enzymes, nine omp (porin) genes, and a melange of >70 protease/peptidase genes including thermolysin, pitrilysin, fungalysin, etc. Numerous ABC transporter genes stand out as PHX (see supporting information in Table 11). At least 10 PHX TonB-dependent receptors are encoded. For comparison, Caulobacter crescentus shows at least 25 PHX TonB receptors, which are involved in absorption and convey iron or other nonsoluble substances into and out of the bacterial cell (15). Serine/threonine (S/T) kinases and phosphatases, histidine kinases, and cell cycle stress/heat shock proteins contribute to regulating the developmental-sporulation phase of MYXXA (6, 7, 21).
MYXXA possesses two genetically distinct motility systems designated adventurous (A) and social (S) (22–24). MYXXA features in excess of 12 PHX A-motile gliding proteins (see Table 9), associated with its abundant slime secretion (24). The S motility system involves type IV pili and fibrils in cell movement. The layers of slime excretions of polysaccharides provide a solid surface for cell movements (24). In this context, MYXXA also possesses >35 PHX lipoproteins, some of which may be involved in slime secretion.
Distinguishing Features of δ-Proteobacterial Genomes.
Two giant ribosomal protein S1 genes are present in all δ genomes.
Other bacterial genomes sequenced to date have at most a single S1 gene. Archaea and eukaryotes have no S1 gene. Strikingly, there are two copies of the S1 ribosomal protein gene in each of the five δ genomes analyzed. One copy in each genome is highly conserved (500–600 aa in size, with 50–60% sequence identity among different species) and is usually PHX (Table 3).
Table 3.
Ribosomal protein S1
| Genome | E(g)* | Location | Size, aa |
|---|---|---|---|
| MYXXA | 1.19 | 4143609 | 720 |
| MYXXA | 1.64 | 4561591C | 569 |
| BDEBA | 2.47 | 1011942 | 594 |
| BDEBA | 0.82 | 1138486 | 397 |
| DESVU | 1.27 | 1551153C | 486 |
| DESVU | 1.90 | 3303332C | 576 |
| GEOSU | 0.92 | 1303580 | 401 |
| GEOSU | 0.92 | 2872001 | 573 |
| DESPS | 1.36 | 1012514 | 569 |
| DESPS | 0.84 | 3213437 | 395 |
*Predicted highly expressed genes are indicated in bold. C, complementary strand
The RP gene S1, commonly exceeding 500 aa in length, is essential in Gram-negative bacteria for initiating translation and is encoded separately from the main RP cluster (25). S1 is overall acidic, binds weakly and reversibly to the small ribosomal complex, and interacts with mRNA chains, whereas most other RPs bind strongly to the complex (25). S1 can facilitate binding of mRNA that lacks a strong Shine-Dalgarno sequence, allowing their translation by the δ proteobacteria. S1 is also encoded in the deeply branching Gram-negative hyperthermophiles Aquifex aeolicus and Thermotoga maritima. The 820-aa S1 protein of T. maritima can be recognized as a fusion of cytidylate kinase, involved in nucleotide biosynthesis, with a standard S1 sequence. The S1 protein of Bacillus genomes and those of low G + C Gram-positive bacteria (Firmicutes) are of reduced size in the range of 380–410 aa. Acidic RPs are rarely present in bacterial genomes, except for S1 and L7/L12. L7/L12, as with the eukaryotic ribosomal proteins P0, P1, and P2, feature a hyperacidic carboxyl residue run that is thought to act in adapting mRNA chains to the ribosome.
Location and organization of the major RP gene cluster.
Most bacterial genomes carry a cluster, accounting for 15–40% of all RP genes, positioned proximal to the origin of replication, thus permitting an early expression of these RPs in the cell cycle. Several PHX genes fundamental in protein synthesis, including tuf, fus, rpoA, rpoB, and rpoC and several chaperones (e.g., groEL, groES, and tig), are encoded within or proximal to the major RP cluster in many bacterial genomes. Archaeal genomes, often lacking a unique origin of replication, delimit a less extended RP cluster compared with bacterial genomes. By contrast, the RP genes of yeast and of higher eukaryotes are randomly dispersed over their genome. However, three δ genomes (MYXXA, DESVU, and DESPS) have their primary RP gene cluster located at or near the Ter region of the genome whereas BDEBA and GEOSU locate their RP cluster significantly closer (less than 1 Mb) to the origin of replication (oriC) similar to the organization of most bacterial genomes. As with S1 the RPs L25 and S2 are often isolated (but not always) in the genome. Intermeshed with the major RP cluster of GEOSU are the genes secY, adk, tuf, fus, rpoB, rpoC, nus, and secE, ≈50 kbp from the origin of replication (oriC). The major RP cluster of MYXXA incorporates the proteins Fus, Tuf, SecY, and RpoA. The principal RP cluster of BDEBA includes the proteins RpoA, SecY, RpoC, RpoB, NusG, SecE, and Tuf located ≈92 kbp from oriC. The prime RP cluster of DESVU includes the proteins EF-G, SecY, RpoA, Tig, and ClpX. Here, RpoB and RpoC are encoded in a separate cluster with four RP genes. The principal RP cluster of DESPS includes the protein translation processing genes nusG, rpoB, rpoC, map, fus, tuf, secY, rpoA, EF-Ts, and rrf (ribosome recycling factor). Note that secY is encoded as part of the major RP cluster in every δ genome, as in many other bacterial genomes, emphasizing a role in translation.
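The origin-proximal versus terminus-proximal distinction drawn above can be expressed as a distance on a circular chromosome. The following is a minimal sketch, assuming that the terminus lies diametrically opposite oriC and that positions are given in base pairs from an arbitrary zero; the function names and the 3.8-Mb genome length used in the example are hypothetical (the text gives only a 3.5 to 4.0 Mb range for the smaller genomes).

```python
def circular_distance(pos, ori, genome_len):
    """Shortest distance in bp between two positions on a circular genome."""
    d = abs(pos - ori) % genome_len
    return min(d, genome_len - d)


def relative_offset(pos, ori, genome_len):
    """Distance from oriC as a fraction of the maximum: 0 = at oriC, 1 = opposite oriC."""
    return circular_distance(pos, ori, genome_len) / (genome_len / 2)


# Hypothetical example: a cluster ~92 kbp from oriC on an assumed 3.8-Mb genome.
offset = relative_offset(92_000, 0, 3_800_000)
print(f"relative offset from oriC: {offset:.3f}")
```

An offset near 0 corresponds to the usual bacterial arrangement, whereas an offset near 1 corresponds to a cluster at or near the Ter region under the stated assumption.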
The complement of asparaginyl and glutaminyl tRNA synthetase (AARS) genes in δ genomes.
The accuracy of the AARS enzymes is essential for correct translation of the genetic code. Most bacteria differ from the E. coli model of tRNA aminoacylation for asparagine, glutamine, cysteine, proline, and lysine (26, 27). The sequencing of numerous genomes verified the absence of the regular glutaminyl and asparaginyl AARSs (Table 4 and supporting information in Table 10). Other anomalies for lysyl, cysteinyl, and prolyl AARS were also revealed for Gram-positive bacteria, for α-, β-, and ε-proteobacterial genomes, for most obligate intracellular pathogens, and for archaeal genomes (27). AARS representations in γ-proteobacterial genomes are variable, with most in possession of the cognate AARS for every amino acid but not in Pseudomonas genomes (ref. 26 and Table 4). The five δ genomes possess the regular AARS for each amino acid (Table 10), including two gene copies for GluRS in MYXXA and DESVU; two gene copies for LysRS in MYXXA, BDEBA, GEOSU, and DESPS; two gene copies for ThrRS in MYXXA and BDEBA; and two gene copies for LeuRS in MYXXA.
Table 4.
tRNA synthetases (asnS and glnS) and glutamyl/aspartyl amidotransferases (gat) in Bacteria
| Group* | asnS | glnS | gat |
|---|---|---|---|
| δ Proteobacteria (5) | + | + | + |
| MYXXA | + | + | + |
| BDEBA | + | + | + |
| GEOSU | + | + | + |
| DESVU | + | + | + |
| DESPS | + | + | + |
| α Proteobacteria (18) | − | − | + |
| β Proteobacteria (11) | − | + | + |
| γ Proteobacteria (27)† | + | + | − |
| γ Proteobacteria (6)‡ | − | + | + |
| γ Proteobacteria (2)§ | − | − | + |
| γ Proteobacteria (1)¶ | + | + | + |
| ε Proteobacteria (4) | − | − | + |
| Firmicutes (35) | + | − | + |
| Actinobacteriales (13) | − | − | + |
| Spirochaetales (5) | + | − | + |
| Chlamydiales (5) | − | − | + |
| Cyanobacteria (8) | + | − | + |
| Chloroflexi (1) | − | − | + |
| Bacteroidales (3) | + | + | − |
| Deinococcus-Thermus (2) | + | + | + |
| Chlorobiales (1) | − | − | + |
| Aquificales (1) | − | − | + |
| Thermotogales (1) | − | − | + |
| Fusobacteria (1) | + | − | + |
| Planctomycetales (1) | + | + | + |
*Numbers in parentheses indicate the number of species.
†Enterobacteriales (12), Pasteurellales (4), Alteromonadales (2), Vibrionales (5) and Xanthomonadales (4).
‡Pseudomonadales (5) and Francisella tularensis (Thiotricales).
§Coxiella burnetii (Legionellales) and Methylococcus capsulatus (Methylococcales).
¶L. pneumophila (Legionellales).
Generally, when the regular glutamine and/or asparagine AARS genes (glnS and asnS) are lacking, an amido-transferase pretranslation modification mechanism (GatCAB, of three subunits) is available that converts Glu-tRNAGln to Gln-tRNAGln and/or Asp-tRNAAsn to Asn-tRNAAsn. Subsequently, the correct charging of both Asn-tRNAAsn and Gln-tRNAGln occurs by comparable transamidation reactions compensating for the lack of AsnRS and GlnRS function, respectively (26, 27). Strikingly, the five δ-proteobacterial genomes include asnS and glnS as well as the genes for the GatCAB amidotransferase complex (Table 4). The δ-proteobacterial pattern is rare among 180 bacterial genomes analyzed. It occurs only in Legionella pneumophila (γ proteobacteria), and in the non-proteobacteria Deinococcus radiodurans, Thermus thermophilus (Deinococcus-Thermus group), and Pirellula sp. (Planctomycetales). Among these species, it has been shown that the genome of D. radiodurans does not encode the regular enzymes for asparagine biosynthesis (asnA or asnB) and that Asp-tRNAAsn transamidation is the only pathway by which D. radiodurans can synthesize asparagine (28). In the nine organisms where GatCAB coexists with AsnRS and GlnRS, this coexistence might then be explained by a role of GatCAB in the biosynthesis of asparagine or glutamine. We found that one or more genes for glutamine biosynthesis (glnA) are present in these organisms. Five genomes are missing the enzymes for asparagine biosynthesis (asnA and asnB), i.e., D. radiodurans, T. thermophilus, L. pneumophila, and the two predator δ genomes MYXXA and BDEBA. However, a gene for asparagine biosynthesis (asnB) is present in the genomes of the sulfate-reducing δ proteobacteria GEOSU, DESVU, and DESPS and in Pirellula sp. In the latter organisms, the coexistence of GatCAB with AsnRS and GlnRS might be explained by a role of GatCAB in asparagine or glutamine biosynthesis. MYXXA and BDEBA may have been assured of receiving asparagine from their prey, secondarily acquiring GatCAB.
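The presence/absence pattern of Table 4 can be checked mechanically. The sketch below transcribes the group rows of Table 4 (the encoding as three-character strings and the variable names are illustrative choices) and lists the groups that carry asnS, glnS, and gat together.

```python
# Group rows of Table 4: (group, number of species, asnS/glnS/gat as '+' or '-').
TABLE4 = [
    ("δ Proteobacteria", 5, "+++"),
    ("α Proteobacteria", 18, "--+"),
    ("β Proteobacteria", 11, "-++"),
    ("γ Proteobacteria (Enterobacteriales etc.)", 27, "++-"),
    ("γ Proteobacteria (Pseudomonadales, F. tularensis)", 6, "-++"),
    ("γ Proteobacteria (C. burnetii, M. capsulatus)", 2, "--+"),
    ("γ Proteobacteria (L. pneumophila)", 1, "+++"),
    ("ε Proteobacteria", 4, "--+"),
    ("Firmicutes", 35, "+-+"),
    ("Actinobacteriales", 13, "--+"),
    ("Spirochaetales", 5, "+-+"),
    ("Chlamydiales", 5, "--+"),
    ("Cyanobacteria", 8, "+-+"),
    ("Chloroflexi", 1, "--+"),
    ("Bacteroidales", 3, "++-"),
    ("Deinococcus-Thermus", 2, "+++"),
    ("Chlorobiales", 1, "--+"),
    ("Aquificales", 1, "--+"),
    ("Thermotogales", 1, "--+"),
    ("Fusobacteria", 1, "+-+"),
    ("Planctomycetales", 1, "+++"),
]

# Groups with both synthetases and the amidotransferase (the δ pattern).
full_complement = [(group, n) for group, n, pattern in TABLE4 if pattern == "+++"]
species_with_pattern = sum(n for _, n in full_complement)

# Groups lacking asnS and/or glnS; each should carry gat according to the text.
lacking_a_synthetase = [(g, p) for g, _, p in TABLE4 if "-" in p[:2]]
gat_always_present = all(p[2] == "+" for _, p in lacking_a_synthetase)

for group, n in full_complement:
    print(f"{group}: {n} species")
```

In this transcription the full complement is found in nine species across four groups, and every group lacking asnS or glnS carries gat, in agreement with the statements in the text.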
Anaerobic detoxification genes.
Rubrerythrin (Rbr), with a di-iron active site, is often PHX in anaerobic and microaerophilic bacteria and in archaeal organisms. The presence of Rbr is interpreted as an oxidative stress protection system in air-sensitive bacteria and archaea (18, 19). DESVU and DESPS are obligate anaerobic bacteria but appear to possess oxygen-reducing systems. They have been shown not to grow aerobically and appear to be inhibited by the presence of molecular oxygen. Superoxide dismutase (Sod) and catalase (Kat) are moderately expressed in both genomes, apparently allowing some direct oxygen detoxification (refs. 3 and 5; see also supporting information in Tables 12 and 15 with δ-genome lists of PHX genes). Rubrerythrin (Rbr), nigerythrin, and rubredoxin oxidoreductase were recently described as alternative oxidative stress protection proteins functioning in an anaerobic environment (18, 19). Rbr is widespread in archaea and some bacteria, particularly in organisms that die in the presence of oxygen. The function of Rbr has been debated, and recent evidence both in vivo and in vitro strongly supports a role for Rbr as a novel oxidative stress protection system (18, 19). A strictly anaerobic organism needs to eliminate oxygen radicals that arise during reduction of SO4= to SO−. Genomic analyses have shown that Rbr-like proteins are ubiquitous among archaea and bacteria and that the genomes of many anaerobes encode multiple Rbr homologues. The main function of Rbr seems to be reduction of hydrogen peroxide (19).
TonB-dependent receptors.
All of the δ genomes encode multiple TonB-dependent receptors that channel various complex metals into and out of bacterial cells. In particular, MYXXA features at least 10 PHX TonB-dependent receptors that it may use for predation. The other δ genomes also encode multiple TonB-dependent receptors but, apart from one in GEOSU, none are PHX (Table 5). The TonB-dependent receptor proteins interact with outer membrane proteins and energize uptake of specific substrates (e.g., iron). These substrates either are poorly permeable through the porin channels or are encountered at very low concentrations. DESVU and BDEBA encode 6 and 10 copies, respectively, of the TonB receptor. GEOSU shows 7 copies (1 PHX) and DESPS has 2 copies.
Table 5.
Representation of selected gene families in δ proteobacteria
| Gene | MYXXA* | BDEBA* | GEOSU* | DESVU* | DESPS* |
|---|---|---|---|---|---|
| Histidine kinase | 133 (10) | 54 (−) | 87 (1) | 53 + 3† (−) | 17 (−) |
| Response regulators | 139 (31) | 41 (−) | 95 (3) | 75 + 4† (−) | 32 (1) |
| His-kinase and response regulator | 36 (4) | 3 (−) | 22 (−) | 18 (−) | − |
| Ser/Thr protein kinase | 102 (8) | 6 (−) | 1‡ (1‡) | 1 (1) | 3 (−) |
| Ser/Thr phosphatase | 18 (3) | 5 (−) | 3‡ (1‡) | 3 (−) | 2 (−) |
| σ54-Dependent DNA binding | 20 (7) | − | 22 (1) | 4 (−) | − |
| σ54-Dependent transcriptional reg. | 22 (8) | 6 (1) | 4 (−) | 24 + 2† (−) | 3 (−) |
| σ54 Others | 2 (1) | 2 (1) | 2 (−) | 1 + 1† (−) | 20§ (−) |
| All σ54-dependent | 44 (16) | 8 + 9¶ (2) | 28 (1) | 29 + 3† (−) | 23 (−) |
| RNA pol σ54 factor | 1 (−) | 2 (−) | 1 (−) | 1 (−) | 1 (−) |
| RNA pol σ70 factor | 35 (3) | 2‖ (−) | 1 (−) | 2 (−) | 2 (1) |
| RNA pol σ-32 | 1 (−) | 2‖ (−) | 1 (1) | − | 1** (−) |
| σ Factor for flagellar operon (FliA) | − | 1 (−) | 1 (−) | 1 (−) | 1 + 1†† (−) |
| (Metallo)-β-lactamase | 30 (3) | 11 (−) | 13 (−) | 8 (−) | −‡‡ |
| GGDEF domain protein | 16 (4) | 4 (−) | 23 (1) | 25 (−) | − |
| TonB protein | 16 (3) | 6 (−) | − | 4 (−) | 1 (−) |
| TonB-dependent receptor | 12§§ (4) | 4 (−) | 7 (1) | 2 (−) | 1 (−) |
| OmpA | 10 (4) | 2 (−) | 5 (1) | 2 (−) | − |
| Peptidyl-prolyl cis-trans isomerase | 13 (5) | 10 (4) | 7 (−) | 6 (−) | 4 (1) |
*Number of highly expressed genes are in parentheses.
†In megaplasmid.
‡One identified as “HPr(Ser) kinase/phosphatase.”
§Similar to two-component system response regulators (Ntr family). This family includes regulatory proteins that activate the expression of genes from promoters recognized by core RNA polymerase associated with the alternative σ54 factor. They have a conserved domain of ≈230 residues involved in the ATP-dependent interaction with σ54.
¶Nine other genes with similarity 20–30% to σ54-dependent transcription regulators, named transcriptional regulator NifA (2), response regulator containing CheY-like receiver AAA-type ATPase and DNA-binding domains (5), transcriptional regulatory protein zraR (1), and flagellar transcriptional activator protein flbD (1).
‖Two genes identified as σ70/σ32.
**Identified as “RNA polymerase σ-B factor.”
††Very low similarity (0–8%) to other flagellar σ factors or to σ70 (5–12%).
‡‡β-Lactamase sequences from DESVU have similarity (23–25%) to three sequences from DESPS identified as flavoproteins.
§§Two of these are identified as “tonB system transport proteins” of the ExbD/TolR and ExbB/TolQ families, respectively.
σ54-Activator proteins, histidine kinase (HK), and response regulator (RR) types.
There are >16 PHX genes in MYXXA that encode σ54 activators and at least an additional 28 genes encoding σ54-activator proteins not PHX. These proteins play important roles in fruiting body development (6, 7, 21). MYXXA has many σ70 factors most of which are ECF (extracytoplasmic function) sigmas (Table 9; see also ref. 29). Many of the σ54-activator proteins are PHX, but only three among the σ70 factors are PHX (Table 5). Sensory and signal histidine kinase genes in excess of 133 copies are widely distributed in the MYXXA genome, and 139 genes are separately characterized as RRs. In addition, at least 36 genes constitute hybrid two-component systems consisting of an HK domain coupled to an RR domain. Serine/threonine kinase and phosphatase (STPK and STPh, respectively) may function with the Forkhead σ54-enhancer binding proteins. An unprecedented number of at least 102 STPKs balanced by at least 18 STPh are distributed around the genome (Table 5).
The HK genes represent the most abundant collection of regulatory genes in GEOSU. Additionally, 95 genes are characterized as RRs (Table 5). There are 22 genes containing together the HK and RR domains and 8 gene pairs that locate an HK gene consecutive to a RR gene. The GEOSU genome is impressive, with a total of 28 σ54-activator proteins, of which 22 copies are described as σ54-dependent DNA-binding genes and 4 copies are described as σ54-dependent transcriptional regulators, but only one representative of σ70 occurs. In 14 occurrences, the σ54 activators are encoded contiguously with an HK domain. Three examples of serine/threonine phosphatase genes occur, but no STPK is found in the GEOSU genome.
DESVU features 56 HK genes, 79 RRs, and 18 genes combining the HK and RR domains in a common transcript. Many contiguous gene pairs encode the HK and RR domains. Three serine/threonine (S/T) phosphatase genes and apparently a single STPK gene are detected in DESVU. Remarkably, 32 σ54-dependent transcriptional regulators occur in DESVU, and 4 σ54-activator genes joined with HK genes are recognized. There are only two sequences of σ70 factors.
DESPS encodes 23 σ54 activators, one σ54 factor, and two σ70 factors. The most frequent gene types are HK (17 copies) and RR (32 copies) and many genes whose domains are united into the same operon. There are three STPK and two STPh genes. The most abundant gene types in BDEBA are the two-component systems HK (54 copies) and RR (41 copies).
Secretion proteins.
SecA is impressive, with the highest E(g) score relative to all PHX genes of MYXXA. The gene also qualifies as PHX in most of the other δ genomes.
Bacteria have developed complex mechanisms to deal with membrane translocation, secretion of polypeptides, and correct folding. A dimeric SecA, essential and unique to bacteria (not found in archaea), is fundamental for protein translocation to the periplasm (30, 31). Apart from SecA, secretion-specific chaperones include SecB (32) and the signal recognition particle. In these activities, the major chaperones GroEL, DnaK, and the trigger factor are also involved (12). In addition to structural and ancillary subunits, such as SecY, SecE, and SecG, the translocase complex has a mechanical motor device, the SecA ATPase, that binds to SecYEG to establish the functional translocase core (30). Mycobacterium tuberculosis possesses two SecA paralogs with distinct substrate specificities. SecY is prominent in the major RP cluster of all of the δ genomes, possibly indicating its relevance to protein synthesis. The SecA gene is also PHX in Vibrio cholerae, E. coli, Synechocystis, Mycoplasma pneumoniae, Treponema pallidum, Borrelia burgdorferi, Aquifex aeolicus, and other bacteria. The secretion pathway is used by many protein substrates. The cellular destination of all secretory polypeptides is governed by a 20- to 30-residue amino-terminal sequence, the leader peptide, which also helps guide SecA binding to the substrate. SecA, SecB, and SecG are all involved in protein export and chaperone activity. Gram-negative bacteria also secrete a variety of proteins into the extracellular and periplasmic milieu mediated by the secretion apparatus of types I to IV. These proteins can also influence bacterium–host interactions.
Other abundant gene classes.
Multiple copies of the GGDEF domain proteins, the metallo-β-lactamase enzymes, and adventurous motility proteins are conspicuous in δ genomes (Table 5).
GGDEF proteins.
GGDEF proteins are involved in cyclic diguanylate metabolism and in regulating the transition from sessile to motile forms (33). MYXXA contains 16 copies (4 PHX), BDEBA contains 4 copies, GEOSU contains 23 copies, and DESVU may encode 25 copies.
β-lactamase enzymes.
β-Lactamase catalyzes the opening and hydrolysis of the β-lactam ring of β-lactam antibiotics such as penicillins and cephalosporins (34). Metallo-β-lactamase (30 copies in MYXXA) is abundant in all five δ genomes. These enzymes may protect the organism from antibiotics made by other microbes or provide resistance against its own antibiotics. (Myxobacteria have substantial capacity for polyketide biosynthesis and production of antibiotics.) The δ proteobacteria are mainly soil inhabitants, and, because antibiotics are prodigiously manufactured by the Streptomyces soil bacteria and by many fungi, the metallo-β-lactamase enzymes presumably provide a defense against antibiotic molecules. The multiplicity of these β-lactamase genes may reflect a gene dosage effect. Because the β-lactam ring is a crucial component of a large group of therapeutically useful antibiotics, comprising the penicillin, cephalosporin, and carbapenem families, some bacteria express β-lactamase enzymes to escape the action of these antibiotics. In soil habitats, Gram-negative bacteria are generally more sensitive to β-lactam antibiotics than Gram-positive bacteria. This difference in sensitivity may relate to the hard cell wall structure of Gram-positive bacteria. The sporulation capabilities of many soil Gram-positive bacteria may also protect them from adverse effects, including antibiotics. Among the other δ genomes, GEOSU encodes 13 copies of metallo-β-lactamase, DESVU 8 copies, and BDEBA 11 copies, whereas no version was found in DESPS.
Motility genes.
Gliding motility genes (29 genes, mostly PHX, in MYXXA; see also Table 9) contribute importantly to swarming movements. Also relevant are genes of twitching motility and the frizzy genes FrzA and FrzB, which, in combination with the (MglA, MglB) operon, regulate reversals of the swarming motions in the fruiting body ensemble (24). The main genes governing movements of MYXXA are the adventurous gliding motility genes and many pilus genes associated with social motility (22, 23). BDEBA also carries gliding and twitching motility genes of the pilR, pilS, pilT, pilU, and pilV types, each with multiple occurrences. These genes presumably contribute to BDEBA’s ability to attach to its Gram-negative bacterial prey before entering the periplasm.
Methods
Let G be a group of genes with average codon frequency g(x, y, z) for the codon (x, y, z) such that Σg(x, y, z) = 1 for each amino acid family. Similarly, let {ƒ(x, y, z)} indicate the average codon frequencies for the gene group F (F can be a single gene). The codon usage difference of F with respect to G is calculated by the formula
$$B(F|G) = \sum_{a} p_a(F) \left[ \sum_{(x,y,z)=a} \left| f(x,y,z) - g(x,y,z) \right| \right]$$
where {pa(F)} are the average amino acid frequencies of the genes of F (12, 13). Predicted expression levels with respect to individual standards can be based on the ratios ERP(g) = B(g|C)/B(g|RP), ECH(g) = B(g|C)/B(g|CH), and ETF(g) = B(g|C)/B(g|TF), where C is the totality of all of the genes of the genome (RP, ribosomal protein genes; CH, major chaperone/degradation proteins; TF, major protein synthesis factors). We introduce the overall expression measure E = E(g) = B(g|C)/{(1/3)[B(g|RP) + B(g|CH) + B(g|TF)]}, that is, B(g|C) divided by the mean of the three class-specific codon usage differences.
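The codon usage difference B(F|G) and the measure E(g) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation used in refs. 12 and 13: the three-family codon table `CODON_TO_AA` is a hypothetical subset of the genetic code chosen for brevity, and all function names are invented here.

```python
from collections import Counter

# Hypothetical three-family subset of the genetic code, kept small for
# illustration; a real analysis would use the full codon table.
CODON_TO_AA = {
    "AAA": "K", "AAG": "K",   # lysine
    "GAA": "E", "GAG": "E",   # glutamate
    "TTT": "F", "TTC": "F",   # phenylalanine
}


def codon_counts(sequences):
    """Count in-frame codons (restricted to CODON_TO_AA) over a gene group."""
    counts = Counter()
    for seq in sequences:
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            codon = seq[i:i + 3].upper()
            if codon in CODON_TO_AA:
                counts[codon] += 1
    return counts


def family_normalized(counts):
    """Codon frequencies normalized to sum to 1 within each amino acid family."""
    family_totals = Counter()
    for codon, n in counts.items():
        family_totals[CODON_TO_AA[codon]] += n
    return {
        codon: (counts.get(codon, 0) / family_totals[aa]) if family_totals[aa] else 0.0
        for codon, aa in CODON_TO_AA.items()
    }


def amino_acid_freqs(counts):
    """Amino acid frequencies p_a of a gene group, derived from its codon counts."""
    total = sum(counts.values())
    freqs = Counter()
    if total == 0:
        return freqs
    for codon, n in counts.items():
        freqs[CODON_TO_AA[codon]] += n / total
    return freqs


def codon_usage_difference(f_counts, g_counts):
    """B(F|G): sum over amino acids a of p_a(F) * sum over codons of a of |f - g|."""
    f = family_normalized(f_counts)
    g = family_normalized(g_counts)
    p = amino_acid_freqs(f_counts)
    return sum(p[aa] * abs(f[c] - g[c]) for c, aa in CODON_TO_AA.items())


def expression_measure(b_c, b_rp, b_ch, b_tf):
    """E(g) = B(g|C) divided by the mean of B(g|RP), B(g|CH), and B(g|TF)."""
    return b_c / ((b_rp + b_ch + b_tf) / 3.0)
```

In use, `codon_usage_difference(codon_counts([gene]), codon_counts(all_genes))` would give B(g|C) for one gene, and the same call against the RP, CH, and TF gene groups would supply the three denominators for `expression_measure`.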
The gene classes (RP, CH, and TF) serve as representatives of highly expressed genes. Our method specifies genes with similar codon usages to at least one of these classes as PHX. These assignments are reasonable under fast growing conditions, where there is a need for many ribosomes, for proficient translation, and for many chaperone proteins to ensure properly folded and translocated protein products. E(g) is an estimate of the expression level of the gene g. The criterion E(g) > 1 in conjunction with at least two of the values ERP(g), ETF(g), or ECH(g) exceeding 1.05 generally reflects high protein molar abundance (12, 13).
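The PHX criterion stated above can be written as a short decision rule. The following Python sketch assumes that the four codon usage differences of a gene are already available and positive; the function name `phx_call` and its parameter names are hypothetical and do not come from refs. 12 and 13.

```python
def phx_call(b_c, b_rp, b_ch, b_tf, ratio_cutoff=1.05):
    """Return (is_phx, E) for one gene from its four B values (all assumed > 0).

    b_c  : B(g|C), difference from the average gene of the genome
    b_rp : B(g|RP), difference from ribosomal protein genes
    b_ch : B(g|CH), difference from chaperone/degradation genes
    b_tf : B(g|TF), difference from protein synthesis factor genes
    """
    e_rp = b_c / b_rp
    e_ch = b_c / b_ch
    e_tf = b_c / b_tf
    # Overall measure: B(g|C) over the mean of the three class-specific values.
    e = b_c / ((b_rp + b_ch + b_tf) / 3.0)
    # Criterion from the text: E(g) > 1 and at least two ratios above 1.05.
    n_above = sum(r > ratio_cutoff for r in (e_rp, e_ch, e_tf))
    return (e > 1.0 and n_above >= 2), e
```

A gene whose codon usage is close to that of the RP and CH classes but far from the genome average, for example `phx_call(0.6, 0.3, 0.4, 0.7)`, satisfies both conditions, whereas a gene with E(g) only marginally above 1 and no ratio above the cutoff does not.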
Examples of PHX gene classes in most bacteria include: (i) most RPs but generally not all; (ii) global protein synthesis genes (like rpoB, rpoC, tuf, and fus); (iii) major chaperone/degradation proteins [like GroEL, DnaK, Tig, FtsH, Clp(A/B), and PPI (peptidyl-prolyl cis-trans isomerase)]; (iv) Pnp (mRNA processing and degradation); (v) essential energy metabolic genes, including glycolysis genes mainly under anaerobic conditions and TCA cycle genes generally under aerobic conditions, photosynthesis genes prominent in cyanobacteria, and methanogenesis genes in methanogens. Examples of protein classes that are PHX in particular genomes include: (i) fatty acid metabolism in M. tuberculosis (12); (ii) urease primarily in Helicobacter pylori and Ureaplasma urealyticum (12); (iii) flagellar proteins in some α proteobacteria (Mesorhizobium loti, C. crescentus) and in the spirochetes T. pallidum and B. burgdorferi (12, 14); and (iv) a proliferation of PHX detoxification genes in D. radiodurans (15). Our results on PHX genes are consistent with assessments of protein levels in two-dimensional gel electrophoresis (12, 13).
Multifunctional proteins may be expected to attain high E(g) values. For example, Pnp is fundamental to both mRNA processing and degradation and achieves the highest E(g) value (2.66) among E. coli genes (12, 13). Aconitase not only interconverts citrate and isocitrate for the TCA cycle but also serves as a sensor that detects changes in the redox state and assays iron content within the cell; it attains the highest E(g) value (2.56) in D. radiodurans. Another multifunctional PHX protein of many genomes is GAPDH (gap), which catalyzes the oxidation of glyceraldehyde-3-P in glycolysis and also possesses uracil DNA glycosylase activity, senses oxidative stress, binds to RNA and DNA, and serves as a source of reducing equivalents. In contrast, proteins that are required in only a few molecules per cell cycle are not expected to be highly expressed. Thus, the following gene groups are seldom highly expressed: (i) specialized transcription factors, (ii) strict replication proteins, (iii) most repair proteins, and (iv) vitamin biosynthesis enzymes (13). Overall, there is support for the proposition that each bacterial genome has evolved a codon usage pattern reflecting “optimal” gene expression levels for its typical lifestyle, habitat, and metabolic propensities (12, 13). We provide in supporting information (Tables 11–15) complete lists of PHX genes for each δ genome. Many of these genes offer attractive candidates for experimental study.
Abbreviations
- PHX, predicted highly expressed
- RP, ribosomal protein
- AARS, asparaginyl and glutaminyl tRNA synthetase
- RR, response regulator
- HK, histidine kinase
- TCA, tricarboxylic acid
Note.
After the foregoing analysis was completed, two new δ-proteobacterial genomes were released: Desulfovibrio desulfuricans (DESDE; 3.73 Mb; G + C content, 57.8%; GenBank accession no. NC_007519) and Geobacter metallireducens (GEOME; 3.97 Mb; G + C content, 59.5%; GenBank accession no. NC_007517). The distinctive properties of δ-proteobacterial genomes set forth in the abstract also apply to these genomes, including (i) presence of two genes for the giant ribosomal protein S1; (ii) the major RP gene cluster situated either in the ter region (DESDE) or close to oriC (GEOME), in both cases including the gene for SecY; (iii) presence of AARS genes for all amino acids (including asnS and glnS) and also the GatCAB amidotransferase complex; (iv) multiple copies of anaerobic detoxification proteins (rubrerythrin and variants); and (v) a proliferation of σ54-activator proteins and of histidine kinases and response regulators, either encoded separately or fused, but few (at most three) PHX σ70 factors. Other frequent protein classes include metallo-β-lactamase, GGDEF-domain proteins, TonB receptors, and secretion proteins.
References
- 1. Woese C. R., Kandler O., Wheeler M. L. Proc. Natl. Acad. Sci. USA. 1990;87:4576–4579. doi: 10.1073/pnas.87.12.4576.
- 2. Rendulic S., Jagtap P., Rosinus A., Eppinger M., Baar C., Lanz C., Keller H., Lambert C., Evans K. J., Goesmann A., et al. Science. 2004;303:689–692. doi: 10.1126/science.1093027.
- 3. Heidelberg J. F., Seshadri R., Haveman S. A., Hemme C. L., Paulsen I. T., Kolonay J. F., Eisen J. A., Ward N., Methe B., Brinkac L. M., et al. Nat. Biotechnol. 2004;22:554–559. doi: 10.1038/nbt959.
- 4. Methe B. A., Nelson K. E., Eisen J. A., Paulsen I. T., Nelson W., Heidelberg J. F., Wu D., Wu M., Ward N., Beanan M. J., et al. Science. 2003;302:1967–1969. doi: 10.1126/science.1088727.
- 5. Rabus R., Ruepp A., Frickey T., Rattei T., Fartmann B., Stark M., Bauer M., Zibat A., Lombardot T., Becker I., et al. Environ. Microbiol. 2004;6:887–902. doi: 10.1111/j.1462-2920.2004.00665.x.
- 6. Jakobsen J. S., Jelsbak L., Welch R. D., Cummings C., Goldman B., Stark E., Slater S., Kaiser D. J. Bacteriol. 2004;186:4361–4368. doi: 10.1128/JB.186.13.4361-4368.2004.
- 7. Jelsbak L., Givskov M., Kaiser D. Proc. Natl. Acad. Sci. USA. 2005;102:3010–3015. doi: 10.1073/pnas.0409371102.
- 8. Shimkets L. J. Annu. Rev. Microbiol. 1999;53:525–549. doi: 10.1146/annurev.micro.53.1.525.
- 9. Tojo N., Inouye S., Komano T. J. Bacteriol. 1993;175:4545–4549. doi: 10.1128/jb.175.14.4545-4549.1993.
- 10. Munoz-Dorado J., Inouye S., Inouye M. Cell. 1991;67:995–1006. doi: 10.1016/0092-8674(91)90372-6.
- 11. Sockett R. E., Lambert C. Nat. Rev. Microbiol. 2004;2:669–675. doi: 10.1038/nrmicro959.
- 12. Karlin S., Mrázek J. J. Bacteriol. 2000;182:5238–5250. doi: 10.1128/jb.182.18.5238-5250.2000.
- 13. Karlin S., Mrázek J., Campbell A., Kaiser D. J. Bacteriol. 2001;183:5025–5040. doi: 10.1128/JB.183.17.5025-5040.2001.
- 14. Karlin S., Mrázek J. Proc. Natl. Acad. Sci. USA. 2001;98:5240–5245. doi: 10.1073/pnas.081077598.
- 15. Karlin S., Barnett M. J., Campbell A. M., Fisher R. F., Mrázek J. Proc. Natl. Acad. Sci. USA. 2003;100:7313–7318. doi: 10.1073/pnas.1232298100.
- 16. Mrázek J., Spormann A. M., Karlin S. Environ. Microbiol. 2006;8:273–288. doi: 10.1111/j.1462-2920.2005.00894.x.
- 17. Karlin S., Brocchieri L., Campbell A., Cyert M., Mrázek J. Proc. Natl. Acad. Sci. USA. 2005;102:7309–7314. doi: 10.1073/pnas.0502314102.
- 18. Lumppio H. L., Shenvi N. V., Summers A. O., Voordouw G., Kurtz D. M., Jr. J. Bacteriol. 2001;183:101–108. doi: 10.1128/JB.183.1.101-108.2001.
- 19. Weinberg M. V., Jenney F. E., Jr., Cui X., Adams M. W. J. Bacteriol. 2004;186:7888–7895. doi: 10.1128/JB.186.23.7888-7895.2004.
- 20. Buck M., Gallegos M. T., Studholme D. J., Guo Y., Gralla J. D. J. Bacteriol. 2000;182:4129–4136. doi: 10.1128/jb.182.15.4129-4136.2000.
- 21. Kroos L. Proc. Natl. Acad. Sci. USA. 2005;102:2681–2682. doi: 10.1073/pnas.0500157102.
- 22. Spormann A. M. Microbiol. Mol. Biol. Rev. 1999;63:621–641. doi: 10.1128/mmbr.63.3.621-641.1999.
- 23. Ward M. J., Lew H., Zusman D. R. Mol. Microbiol. 2000;37:1357–1371. doi: 10.1046/j.1365-2958.2000.02079.x.
- 24. Kaiser D., Yu R. Curr. Opin. Microbiol. 2005;8:216–221. doi: 10.1016/j.mib.2005.02.002.
- 25. Sengupta J., Agrawal R. K., Frank J. Proc. Natl. Acad. Sci. USA. 2001;98:11991–11996. doi: 10.1073/pnas.211266898.
- 26. Stathopoulos C., Ahel I., Ali K., Ambrogelly A., Becker H., Bunjun S., Feng L., Herring S., Jacquin-Becker C., Kobayashi H., et al. Cold Spring Harbor Symposia on Quantitative Biology. Vol. LXVI. Woodbury, NY: Cold Spring Harbor Lab. Press; 2001. pp. 175–183.
- 27. Ibba M., Soll D. Annu. Rev. Biochem. 2000;69:617–650. doi: 10.1146/annurev.biochem.69.1.617.
- 28. Min B., Pelaschier J. T., Graham D. E., Tumbula-Hansen D., Soll D. Proc. Natl. Acad. Sci. USA. 2002;99:2678–2683. doi: 10.1073/pnas.012027399.
- 29. Helmann J. D. Adv. Microb. Physiol. 2002;46:47–110. doi: 10.1016/s0065-2911(02)46002-x.
- 30. Economou A. Trends Microbiol. 1999;7:315–320. doi: 10.1016/s0966-842x(99)01555-3.
- 31. Jilaveanu L. B., Zito C. R., Oliver D. Proc. Natl. Acad. Sci. USA. 2005;102:7511–7516. doi: 10.1073/pnas.0502774102.
- 32. Ullers R. S., Luirink J., Harms N., Schwager F., Georgopoulos C., Genevaux P. Proc. Natl. Acad. Sci. USA. 2004;101:7583–7588. doi: 10.1073/pnas.0402398101.
- 33. Simm R., Morr M., Kader A., Nimtz M., Romling U. Mol. Microbiol. 2004;53:1123–1134. doi: 10.1111/j.1365-2958.2004.04206.x.
- 34. Frere J. M. Mol. Microbiol. 1995;16:385–395. doi: 10.1111/j.1365-2958.1995.tb02404.x.