Journal of Bacteriology. 2011 Aug;193(15):3757–3764. doi: 10.1128/JB.00404-11

Genomes of Three Methylotrophs from a Single Niche Reveal the Genetic and Metabolic Divergence of the Methylophilaceae

Alla Lapidus 1,#, Alicia Clum 1, Kurt LaButti 1, Marina G Kaluzhnaya 2, Sujung Lim 2, David A C Beck 3,4, Tijana Glavina del Rio 1, Matt Nolan 1, Konstantinos Mavromatis 1, Marcel Huntemann 1, Susan Lucas 1, Mary E Lidstrom 2,4, Natalia Ivanova 1, Ludmila Chistoserdova 4,*
PMCID: PMC3147524  PMID: 21622745

Abstract

The genomes of three representatives of the family Methylophilaceae, Methylotenera mobilis JLW8, Methylotenera versatilis 301, and Methylovorus glucosetrophus SIP3-4, all isolated from a single study site, Lake Washington in Seattle, WA, were completely sequenced. These were compared to each other and to the previously published genomes of Methylobacillus flagellatus KT and an unclassified Methylophilales strain, HTCC2181. Comparative analysis revealed that the core genome of Methylophilaceae may be as small as approximately 600 genes, while the pangenome may be as large as approximately 6,000 genes. Significant divergence between the genomes in terms of both gene content and gene and protein conservation was uncovered, including the varied presence of certain genes involved in methylotrophy. Overall, our data demonstrate that metabolic potentials can vary significantly between different species of Methylophilaceae, including organisms inhabiting the very same environment. These data suggest that genetic divergence among the members of this family may be responsible for their specialized and nonredundant functions in C1 cycling, which in turn suggests means for their successful coexistence in their specific ecological niches.

INTRODUCTION

The family Methylophilaceae includes four formally described genera (Methylophilus, Methylovorus, Methylobacillus, and Methylotenera), all representing obligate or restricted facultative methylotrophic isolates from terrestrial or freshwater environments (19, 22, 26). A single marine isolate of Methylophilaceae (strain HTCC2181) has been described; so far it does not have an official genus and species name (14). In addition to the data from pure-culture isolates, data from numerous culture-independent surveys show that representatives of Methylophilaceae are ubiquitous, thriving in a variety of natural as well as man-made environments (11–13, 30, 35). These surveys uncovered many sequences that are quite distant from the sequences of the cultured Methylophilaceae (with the top hit below 96% identity for the 16S rRNA gene) (12, 35), indicating that the diversity of Methylophilaceae may be much greater than the diversity covered by the cultured species. Genomes for two representatives of Methylophilaceae have been recently sequenced and subjected to comparative analysis: Methylobacillus flagellatus (Mb. flagellatus) KT, an industrial activated sludge isolate (8), and Methylophilales strain HTCC2181, a marine isolate distantly related to terrestrial species (92.9 to 94.6% 16S rRNA gene identity) (14). The two genomes differed remarkably in size (2.9 versus 1.3 Mb) and thus in gene content. However, the conserved portions of the genomes were largely syntenic, suggesting a common evolution, and many of the genes known for their role in methylotrophy were conserved (14). More recently, a composite genomic sequence of a few closely related strains of Methylotenera has been extracted from a metagenomic data set, serving as a proxy for the genomic makeup of another representative genus of Methylophilaceae (20). While it was more closely related to strain HTCC2181 based on 16S rRNA gene phylogeny, the genome of Methylotenera appeared to be more similar to the genome of Mb. flagellatus KT in terms of size and the presence of signature functional methylotrophy modules. However, genome-wide comparisons revealed little conservation beyond housekeeping functions and methylotrophy, suggesting a significant potential for species-specific functionality (20).

As part of the ongoing Microbial Observatory project in Lake Washington, funded by the National Science Foundation, we have detected a variety of species belonging to Methylophilaceae constituting part of a broader methylotrophic community present in the sediment of the lake (17, 18, 29). A number of species were isolated in pure culture, including the recently formally described Methylotenera mobilis (Mt. mobilis) JLW8 (19), Methylotenera versatilis 301 (22), and Methylovorus glucosetrophus (Mv. glucosetrophus) SIP3-4 (22). In order to expand the genome-based knowledge on Methylophilaceae and to obtain further insights into their diversity and evolution, we sequenced and analyzed their genomes. All three strains utilize methylamine (Mt. mobilis JLW8 grows with a doubling time of 7 h, and Mv. glucosetrophus SIP3-4 grows with a doubling time of 17 h; Mt. versatilis 301, while growing well on solid media, is reluctant to grow in liquid media) (22). Mv. glucosetrophus SIP3-4 also grows robustly on methanol. However, Mt. mobilis JLW8 grows very poorly on methanol and appears to require nitrate (methanol-dependent denitrification) (21), while Mt. versatilis 301 grows (poorly) on methanol in a nitrate-independent fashion (22).

MATERIALS AND METHODS

Cultivation, DNA isolation, whole-genome sequencing, assembly, and genome annotation.

Mt. mobilis JLW8, Mt. versatilis 301, and Mv. glucosetrophus SIP3-4 were cultivated as previously described (19, 22). DNA was isolated as previously described (19). The genomes were sequenced using a combination of Illumina and 454 sequencing platforms. Three genomic libraries, one 454 pyrosequence standard library, one 454 PE library (22-kb, 9-kb, and 16-kb insert sizes, respectively), and one Illumina standard library were created for each genome. All general aspects of library construction and sequencing can be found at the Joint Genome Institute (JGI) website (http://www.jgi.doe.gov/). Pyrosequencing reads were assembled using the Newbler assembler version 2.0.00.20-PostRelease-11-05-2008-gcc-3.4.6 (Roche). Illumina GAii sequencing data produced for the three genomes were assembled with Velvet (http://www.ebi.ac.uk/∼zerbino/velvet/), and the consensus sequences were shredded into 1.0-kb overlapped fake reads. Illumina data produced for strain JLW8 were not used to assemble the genome, but these were used for assembly improvement at the final stages of the project. The initial Newbler assemblies for Mv. glucosetrophus SIP3-4 and Mt. versatilis 301, consisting of 31 contigs in 1 scaffold and of 26 contigs in 1 scaffold, respectively, were coassembled with Velvet fake reads using Newbler assembler. The majority of gaps in all three assemblies were closed using automated gap resolution software developed at the JGI (http://www.jgi.doe.gov/). This software subprojects data associated with each gap in a scaffold and reassembles the data using Newbler (Roche) or Paracel Genome Assembler 2.6.2 (Paracel, Pasadena, CA). The fakes of those subprojected assemblies were added to the ace file to close the gaps. Some gaps were closed by PCR primer walks. Totals of 58, 2, and 31 additional Sanger reactions were necessary to close gaps and to raise the quality of the finished sequences of the three genomes. 
Illumina reads were also used to correct potential base errors and increase consensus quality using Polisher, software developed at the JGI (K. LaButti and A. Lapidus, unpublished). The error rates of the completed genome sequences are less than 1 in 100,000. The final assembly of the genome of Mv. glucosetrophus SIP3-4 consists of 405,086 pyrosequence and 11,868,826 Illumina reads, the assembly of the genome of Mt. versatilis 301 consists of 1,090,401 pyrosequence and 4,563,862 Illumina reads, and the assembly of the genome of Mt. mobilis JLW8 consists of 375,648 pyrosequence reads. Together, the combination of the sequencing platforms used provided 182.3-fold, 169.0-fold, and 51.4-fold coverage of the genomes of Mv. glucosetrophus SIP3-4, Mt. versatilis 301, and Mt. mobilis JLW8, respectively.

Analysis of inferred proteome conservation.

For each inferred proteome, protein translations were extracted from the respective GenBank files and placed into FASTA files. These were then compiled into a BLAST protein database. In a pairwise fashion, the FASTA file from each query proteome was BLASTed against the database from every other target proteome. Default BLASTp parameters were used. Two proteins were considered shared if they aligned over 70% or more of the length of each protein. The fraction of identical residues was recorded.
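The shared-protein criterion described above can be illustrated with a minimal sketch. It assumes BLASTp hits have already been parsed into records carrying percent identity and alignment coordinates (fields similar to the pident, qstart, qend, sstart, and send columns of BLAST tabular output), and that protein lengths are supplied separately. The names Hit, covers_both, shared_proteins, and percent_shared are hypothetical and are not taken from the authors' scripts.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One BLASTp hit (hypothetical record layout, for illustration only)."""
    query: str
    subject: str
    pct_identity: float  # percent identical residues in the alignment
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def covers_both(hit: Hit, query_len: int, subject_len: int,
                min_cov: float = 0.70) -> bool:
    """True if the alignment spans at least min_cov of BOTH proteins."""
    q_cov = (abs(hit.q_end - hit.q_start) + 1) / query_len
    s_cov = (abs(hit.s_end - hit.s_start) + 1) / subject_len
    return q_cov >= min_cov and s_cov >= min_cov


def shared_proteins(hits: Iterable[Hit], query_lens: Dict[str, int],
                    subject_lens: Dict[str, int],
                    min_identity: float = 0.0) -> Dict[str, float]:
    """Map each shared query protein to its best qualifying percent identity."""
    best: Dict[str, float] = {}
    for h in hits:
        if not covers_both(h, query_lens[h.query], subject_lens[h.subject]):
            continue  # alignment too short relative to one of the proteins
        if h.pct_identity >= min_identity and h.pct_identity > best.get(h.query, -1.0):
            best[h.query] = h.pct_identity
    return best


def percent_shared(hits: Iterable[Hit], query_lens: Dict[str, int],
                   subject_lens: Dict[str, int], min_identity: float) -> float:
    """Percentage of query proteins shared with the target at an identity cutoff."""
    shared = shared_proteins(hits, query_lens, subject_lens, min_identity)
    return 100.0 * len(shared) / len(query_lens)
```

Calling percent_shared at several identity cutoffs for each ordered pair of proteomes would give values comparable in kind to the pairwise percentages reported in this study, although the exact handling of multiple hits per protein is an assumption here.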

Core and pangenome analysis.

The core genome was estimated using the Phylogenetic profiler tool that is part of the IMG system (http://img.jgi.doe.gov/cgi-bin/m/main.cgi) at different similarity cutoffs (30 to 90%). The pangenome of Methylophilaceae was generated using the pangenome pipeline developed for IMG (K. Mavromatis et al., unpublished). In the first step, the pipeline predicts orthologous genes between the target set of genomes by analyzing bit scores of all-against-all BLASTp hits with e values of 1.0e−02 or lower (27; see also IMG documentation at http://img.jgi.doe.gov/w/doc/userGuide.pdf). The genes from the family of the target group of the five genomes were considered the in-group, while all other genomes in IMG were considered an outgroup. The genes were first clustered into orthologous groups based on the normalized bit scores of their BLASTp hits. Bit scores were then divided by the bit score of the previous best hit, and then the genes with better hits to the genes from the target group than to the genes from the outgroup were assigned to the same orthologous group. Some of the orthologous groups contained multiple members from one or more genomes due to recent duplication or gene loss events. For such orthologous groups an attempt was made to split them based on the conservation of the chromosomal neighborhoods (28). For this purpose the BLAST similarity scores of all neighboring genes were analyzed, and genes with the highest number of similar neighboring genes were removed from the original group to form new orthologous groups. This final set of orthologous groups represented the pangenome of Methylophilaceae analyzed in this study; the orthologous groups with members from all genomes were considered the “core genome” of this group, those with members from one genome only were considered “unique,” and the groups with members in more than one but not all genomes were considered “dispensable” parts of the pangenome. 
The pangenome of Methylophilaceae is available in IMG-ER (http://img.jgi.doe.gov/er).
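The final classification step can be sketched as follows. This is a minimal illustration, not the IMG pipeline itself: it assumes orthologous groups have already been computed and are given as a mapping from a group identifier to the set of genomes contributing at least one member. It reads "dispensable" as present in more than one but not all genomes. The function name and the group identifiers in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def classify_orthologous_groups(groups: Dict[str, Set[str]],
                                genomes: Iterable[str]) -> Dict[str, List[str]]:
    """Split orthologous groups into core, dispensable, and unique sets.

    groups maps a group identifier to the genomes that have a member in it.
    """
    genome_set = set(genomes)
    classes: Dict[str, List[str]] = {"core": [], "dispensable": [], "unique": []}
    for group_id, members in groups.items():
        present = set(members) & genome_set
        if not present:
            continue  # group has no member in the genomes being compared
        if present == genome_set:
            classes["core"].append(group_id)         # found in every genome
        elif len(present) == 1:
            classes["unique"].append(group_id)       # found in exactly one genome
        else:
            classes["dispensable"].append(group_id)  # more than one, not all
    return classes
```

For example, with the five strains compared here as the genome list, a hypothetical group present in all five would be assigned to the core genome, a group found only in strains SIP3-4 and KT would be dispensable, and a group found only in strain 301 would be unique. The total number of groups corresponds to the pangenome size.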

Identification of genomic islands.

Genomic islands were detected manually based on genome-genome synteny breaks, deviation in GC content compared to the average GC content of a given genome, and the presence of potential integration sites. First, we employed the Phylogenetic Profiler tool available as part of the IMG interface, recording genes that were unique to each of the three genomes investigated, i.e., genes not found in other Methylophilaceae genomes. These were then evaluated in terms of being parts of gene clusters (single unique genes were ignored for the purposes of this study, but these may be valuable for future, more detailed analyses of gene transfer in these species), and if they were, then disturbances in gene synteny between the three genomes were recorded using the Gene Ortholog Neighborhoods tool in the IMG system. For regions showing disturbed synteny, GC content deviations from the genome average were recorded using the Genome Viewers tools in the IMG system. The same Genome Viewers tools were used to denote the presence of tRNA or pqqA genes in the regions of disturbed synteny/GC content. Insertion sequence elements were detected using ISsaga software (33), and these were manually curated for potential false positives.
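In this study the islands were identified manually with IMG tools. Purely as an illustration of one of the criteria, deviation of local GC content from the genome average, the following sketch flags sliding windows whose GC fraction differs from the whole-sequence value by more than a threshold. The window size, step, and threshold are arbitrary assumptions, and the function names are hypothetical.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_deviant_windows(seq: str, window: int = 5000, step: int = 1000,
                       threshold: float = 0.05) -> List[Tuple[int, int, float]]:
    """Return (start, end, deviation) for windows whose GC fraction deviates
    from the whole-sequence GC fraction by at least the threshold.

    Coordinates are 0-based with an exclusive end.
    """
    genome_gc = gc_fraction(seq)
    flagged = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        win = seq[start:start + window]
        deviation = gc_fraction(win) - genome_gc
        if abs(deviation) >= threshold:
            flagged.append((start, start + len(win), deviation))
    return flagged
```

Windows flagged this way would be only candidates. As described above, the authors also required synteny breaks and checked for potential integration sites such as tRNA or pqqA genes.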

Reconstruction of methylotrophy and other major metabolic pathways.

Automated gene annotations created using the IMG pipeline (27) were curated manually for genes involved in key metabolic pathways. Reconstruction of methylotrophy pathways was modeled after prior analysis of the genomes of Mb. flagellatus KT (8) and strain HTCC2181 (14) and the composite genome of Methylotenera (20). Homologs of the previously described genes were identified in the newly sequenced genomes using comparative genomics tools that are parts of the IMG system or using BLAST against the nonredundant NCBI database.

RESULTS AND DISCUSSION

Genome structure.

Maps of the three newly sequenced genomes are shown in Fig. 1. Relevant characteristics of the genomes are shown in Table 1, where the previously published Methylophilaceae genomes are used for comparison. We opted not to include the newly available genome of Methylovorus strain MP688 in the analyses as it is extremely similar to the genome of Mv. glucosetrophus SIP3-4 (over 98% genome-genome sequence identity) (39) (see below). The sizes of the newly sequenced genomes ranged between 2.5 and 3.1 Mb, with the genome of strain HTCC2181 remaining the smallest genome among known Methylophilaceae. The GC content ranged between 37.93 and 55.72%, with strain HTCC2181 having the lowest and strain KT having the highest GC content. Only one organism, Mv. glucosetrophus SIP3-4, revealed the presence of extrachromosomal replicons, plasmids of 76,680 and 9,816 bp, respectively. The smaller plasmid encoded a total of 13 proteins, of which most were hypothetical proteins. Functions for only three proteins could be predicted; these were an integrase, a toxin-like protein, and a periplasmic serine protease. The larger plasmid encoded a total of 79 proteins, of which a large number were hypothetical proteins. Proteins with functional predictions belonged to the following categories: DNA replication and modification (7 proteins), chromosome partitioning (2 proteins), pilus formation and secretion functions (10 proteins), conjugative transfer functions (9 proteins), and regulation and sensing functions (5 proteins).

Fig. 1.


Circular representation of the replicons sequenced in this study. (A) Chromosome of Methylotenera mobilis JLW8; (B) chromosome of Methylotenera versatilis 301; (C, D, and E) chromosome and plasmids of Methylovorus glucosetrophus SIP3-4. Maps are not drawn to scale. From the outside to the center: genes on the forward strand and genes on the reverse strand colored by COG categories, RNA genes (tRNAs in green, rRNAs in red, and other RNAs in black), GC content, and GC skew. Color correspondence to COG functions can be found via the IMG interface.

Table 1.

Genome statistics and general features

Strain | Genome size (bp) | % GC | No. of proteins encoded | No. of rRNA operons | No. of tRNAs | No. of replicons | Mean coding sequence length (bp) | % Coding regions | Reference
Mt. mobilis JLW8 | 2,547,570 | 45.51 | 2,348 | 2 | 46 | 1 | 975.63 | 89.96 | This work
Mt. versatilis 301 | 3,059,871 | 42.64 | 2,800 | 3 | 47 | 1 | 993.59 | 90.26 | This work
Mv. glucosetrophus SIP3-4 | 3,082,007 | 54.61 | 2,922 | 2 | 48 | 3 | 966.10 | 91.51 | This work
Mb. flagellatus KT | 2,971,517 | 55.72 | 2,759 | 2 | 46 | 1 | 973.73 | 90.61 | 8
Methylophilales strain HTCC2181 | 1,304,428 | 37.93 | 1,338 | 1 | 36 | 1 | 923.45 | 95.00 | 14

Genome conservation.

We previously carried out virtual hybridization between the available genomes of Methylophilaceae, including the two Methylotenera strains that are most closely related to each other in terms of 16S rRNA gene similarity, and demonstrated that they show relatively low DNA-DNA homology (2.0 to 41.3%) (22). In a pairwise fashion, we also compared protein complements in the available genomes, at different identity cutoff values, demonstrating significant variability in both the number of shared proteins and protein conservation (see Fig. S1 in the supplemental material). The two Methylotenera strains shared only 51.6% proteins with 70% identity, the value proposed for an average similarity cutoff for genus-level relatedness (15) (Table 2), and comparisons between other species revealed even less similarity. The protein complement of the marine strain HTCC2181 showed the least similarity with the rest of the strains, suggesting that it likely represents a new genus within this family. Despite significant divergence at the protein and DNA levels, each of the pairwise genome-genome alignments revealed significant gene order synteny (see Fig. S2 in the supplemental material) (data not shown). However, when gene colinearity was compared for all five genomes, only 15 conserved gene clusters of 10 or more genes were identified. Most of these encoded major housekeeping functions such as ribosomal proteins, cell division proteins, ATP synthesis, and amino acid and vitamin biosynthesis. Only a few genes important for methylotrophy were parts of these conserved gene clusters, i.e., the pyrroloquinoline quinone (PQQ) biosynthesis genes (with the exception of pqqA) and the Entner-Doudoroff pathway genes (edd and eda) (not shown). All these data point to significant genetic divergence among Methylophilaceae.

Table 2.

Relationships between the five Methylophilaceae strains compared in this work based on 16S rRNA gene identity and on percentage of common proteins

Gene or protein identity (a) with:

Strain | Mt. mobilis JLW8 | Mt. versatilis 301 | Mv. glucosetrophus SIP3-4 | Mb. flagellatus KT | Methylophilales strain HTCC2181
Mt. mobilis JLW8 | - | 96.6 | 94.3 | 93.8 | 94.6
Mt. versatilis 301 | 51.6 | - | 93.5 | 93.6 | 94.3
Mv. glucosetrophus SIP3-4 | 33.6 | 29.6 | - | 96.5 | 93.9
Mb. flagellatus KT | 29.6 | 25.9 | 38.7 | - | 92.9
Methylophilales strain HTCC2181 | 17.0 | 16.9 | 15.7 | 15.2 | -

(a) 16S rRNA identity is shown in the upper right and percentage of common proteins (at 70% identity) in the lower left.
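Because Table 2 packs two measures into one matrix, reading it correctly depends on which triangle a cell falls in. The sketch below encodes the table values as given and returns both measures for any pair of strains. The short strain labels and the function name are hypothetical conveniences.

```python
from typing import Dict

STRAINS = ["JLW8", "301", "SIP3-4", "KT", "HTCC2181"]

# Table 2 values: upper-right triangle is 16S rRNA gene identity (%),
# lower-left triangle is percentage of common proteins at 70% identity.
TABLE2 = [
    [None, 96.6, 94.3, 93.8, 94.6],
    [51.6, None, 93.5, 93.6, 94.3],
    [33.6, 29.6, None, 96.5, 93.9],
    [29.6, 25.9, 38.7, None, 92.9],
    [17.0, 16.9, 15.7, 15.2, None],
]


def pair_relationship(strain_a: str, strain_b: str) -> Dict[str, float]:
    """Return both Table 2 measures for a pair of distinct strains."""
    i, j = sorted((STRAINS.index(strain_a), STRAINS.index(strain_b)))
    if i == j:
        raise ValueError("strains must be different")
    return {
        "rrna_16s_identity": TABLE2[i][j],    # row index < column index
        "shared_proteins_pct": TABLE2[j][i],  # row index > column index
    }
```

For the two Methylotenera strains this lookup returns 96.6% 16S rRNA gene identity alongside 51.6% shared proteins, the contrast discussed in the text.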

It is interesting to point out that the proportions of “orphan” proteins in each genome, i.e., proteins with no homologs in the nonredundant database, were very small: 2.21% for Mt. mobilis JLW8, 1.39% for Mt. versatilis 301, and 2.91% for Mv. glucosetrophus SIP3-4.

Methylotrophy.

The biochemistry of methylotrophy in Methylophilaceae is well understood; it was originally described based on enzyme activity measurements and mutant analysis with model species of Methylophilus and Methylobacillus (1, 5, 23, 24). In this way, methanol dehydrogenase (MDH) and methylamine dehydrogenase (MADH) have been established as the hallmark enzymes for primary substrate oxidation in these species (3, 10). The ribulose monophosphate (RuMP) cycle was concluded to be the pathway for formaldehyde assimilation, also serving as a major pathway for formaldehyde oxidation (1, 23). “Linear” formaldehyde oxidation having formate as an intermediate was found to be of minor significance in these organisms (5). The tricarboxylic acid (TCA) cycle was incomplete because of the lack of alpha-ketoglutarate dehydrogenase activity, presenting one explanation for the obligately methylotrophic lifestyle (23). The publication of the genome of Mb. flagellatus KT provided the first comprehensive genetic blueprint for methylotrophy in Methylophilaceae, and it was in complete agreement with the predictions from the physiological studies (8). However, when the next Methylophilaceae genome became available, that of the marine isolate HTCC2181, genes for neither MDH nor MADH were found (14). Analysis of the composite genomes of Methylotenera strains enriched with either methanol or methylamine also indicated the apparent lack of genes for true MDH (mxaFI), while the presence of genes for MADH was sample dependent (20). These data questioned the assumption of Methylophilaceae being a metabolically homogeneous group and suggested a potential for species-specific functionality. This study approaches these questions by further expanding the genome-based knowledge of methylotrophy in Methylophilaceae.

(i) Primary substrate oxidation.

Methylamine dehydrogenase (MADH) was encoded only in the genome of Mt. mobilis JLW8, by a gene cluster structurally identical to the one previously identified in the composite genome of Methylotenera (20) and highly similar to the cluster in the chromosome of Mb. flagellatus KT with one exception: the gene for azurin (azu) present in Mb. flagellatus KT was replaced by a gene for a novel cytochrome in Methylotenera (20). These and the gene clusters mentioned below are best viewed using the JGI's IMG/M interface (http://img.jgi.doe.gov/cgi-bin/m/main.cgi) and the respective gene identifiers provided in Table S1 in the supplemental material.

While not encoding MADH, Mt. versatilis 301 and Mv. glucosetrophus SIP3-4 contained the recently characterized genes for the N-methylglutamate pathway for methylamine oxidation, all in one cluster (mgdABCD gma mgsABC), which are structurally identical to those previously described for other betaproteobacteria, including Mb. flagellatus KT (9, 25). The encoded proteins in each case were most closely related to those of Mb. flagellatus KT, as expected. Less expectedly, the proteins of Mt. versatilis 301 were more closely related to the proteins of Mb. flagellatus KT (47% to 90% amino acid identity, with MgdA being most conserved and MgdD being least conserved) than the proteins of Mv. glucosetrophus SIP3-4 (40 to 87% identity). This observation indicates that gene clusters responsible for this pathway might have been subjects of lateral transfers, as has been previously suggested for other methylotrophy gene clusters (4).

Genes for the typical methanol dehydrogenase (MDH), mxaFI, were recognized only in the genome of Mv. glucosetrophus SIP3-4. The two subunits showed high sequence similarity with the subunits of MDH of Mb. flagellatus KT (78% for MxaF and 70% for MxaI), while MxaJ (an accessory protein) and MxaG (a dedicated cytochrome) were less conserved (47% and 59%, respectively) with the orthologs in Mb. flagellatus KT. The presence of these genes correlated with robust growth on methanol observed for both strains (5, 22).

While MxaFI (GI) were encoded only in the genome of Mv. glucosetrophus SIP3-4, all three genomes encoded homologs of the large subunit of MDH known as XoxF (two to four divergent copies) (Table 3). The exact function of these proteins remains unknown. However, they have been previously implicated in having a role in C1 metabolism (38). Moreover, an XoxF from Methylobacterium extorquens AM1 has recently been demonstrated to possess MDH activity (31). MDH requires PQQ as a cofactor (2). XoxF enzymes that show approximately 50% amino acid identity with MDH enzymes are also predicted to be PQQ dependent. All the PQQ biosynthesis genes were detected in all three genomes, in each case forming two gene clusters, pqqABCDE and pqqFG, with the exception of the strain HTCC2181 genome, in which pqqA genes were not clustered with other pqq genes (also see discussion below).

Table 3.

Major pathways and enzymes for carbon and nitrogen metabolism predicted from genomes

Enzyme or pathway (a) | Mt. mobilis JLW8 | Mt. versatilis 301 | Mv. glucosetrophus SIP3-4 | Mb. flagellatus KT | Methylophilales strain HTCC2181
RuMP cycle | + | + | + | + | +
Gnd enzymes | GndB | GndB | GndA | GndA, GndB | GndA
MADH | + | | | + |
NMG pathway | | + | + | + |
H4MPT pathway | + | + | + | + |
H4F pathway | + | + | + | + | +
Fae homologs | Fae2 | Fae2, Fae3 | Fae2, Fae3 | Fae2, Fae3 |
FDH2 | + | + | + | + | +
FDH4 | + | | + | + |
MDH (MxaFJGI) | | | + | + |
PQQ synthesis | + | + | + | + | +
pqqA (gene copies) | 5 | 4 | 5 | 3 | 3
MxaRSACKL copies | 2 | 2 | 3 | 3 | 2
XoxF (copies) | 2 | 3 | 4 | 4 | 1
NapA/NirBD | + | + | + | + (b) |
AniA/Nor | + | | | |
Urea metabolism | | + | + | + |
Choline degradation | | + | | |
MCA cycle | + | + | | | +

(a) RuMP, ribulose monophosphate; Gnd, 6-phosphogluconate dehydrogenase; H4MPT, tetrahydromethanopterin; FDH, formate dehydrogenase; MDH, methanol dehydrogenase; MADH, methylamine dehydrogenase; NMG, N-methylglutamate; XoxF, homolog of the large subunit of methanol dehydrogenase; NapA/NirBD, assimilatory nitrate reduction pathway; AniA/Nor, denitrification pathway; MCA, methylcitric acid.

(b) Nonorthologous.

Other genes are involved in MDH function, i.e., the genes responsible for Ca2+ insertion (mxaACKL) and genes of yet-unknown function (mxaRS and mxaD), which are typically colocalized with the MDH structural genes to form the operon mxaFJGIRSACKLD. Such a typical operon was identified only in Mv. glucosetrophus SIP3-4. Similarly to that of Mb. flagellatus KT (8), the genome of Mv. glucosetrophus SIP3-4 contained two additional gene clusters, mxaRSACKL, with genes forming the three clusters significantly diverging in sequence (36 to 42% amino acid divergence for the most conserved gene, mxaR, and 71 to 73% divergence for the least conserved gene, mxaA). The two latter gene clusters were conserved across all five members of the Methylophilaceae compared in this study (Table 3).

Of the genomes compared, only the genome of Mt. versatilis 301 encoded the complete pathway for choline degradation (m301_1286 to m301_1307) (see Table S2 in the supplemental material), and the strain was previously shown to be able to grow on betaine, which is an intermediate in this pathway (22).

(ii) Formaldehyde oxidation.

Two pathways for formaldehyde oxidation are recognized in most Methylophilaceae: the linear pathway involving C1 transfer reactions linked to tetrahydromethanopterin (H4MPT), along with formate dehydrogenases (FDHs), and cyclic oxidation involving initial reactions of the RuMP cycle along with the 6-phosphogluconate dehydrogenase (Gnd) reaction (4, 16). As expected, all the genes for the RuMP cycle and for the H4MPT-linked reactions were detected in the newly sequenced genomes. In terms of both gene clustering and sequence conservation, genomes of the two Methylotenera strains were more closely related to each other than to that of Mv. glucosetrophus SIP3-4, while the genome of the latter showed a high degree of synteny and sequence conservation with the genome of Mb. flagellatus KT, with few exceptions. While the genomes of Mt. mobilis JLW8 and Mt. versatilis 301 encoded two highly similar hexulose phosphate synthases each (Hps1 and Hps2), the genome of Mv. glucosetrophus SIP3-4 encoded only one (Hps1). As previously described for the Mb. flagellatus KT hps genes, the hps1 genes were parts of the histidine biosynthesis gene clusters, while the hps2 genes were parts of the clusters encoding the reactions of the H4MPT-linked formaldehyde oxidation pathway. Interestingly, other than the lack of hps1, the respective gene cluster of Mv. glucosetrophus SIP3-4 was almost identical to the cluster of Mb. flagellatus KT. However, while Mb. flagellatus KT contains one copy of fae, which is thought to encode formaldehyde-activating enzyme (36) in this cluster, Mv. glucosetrophus SIP3-4 contained two dissimilar (74% amino acid identity) copies. It is noteworthy that in the Methylotenera genomes, none of the true fae genes were found clustered with other C1 genes, as previously noted for the composite Mt. mobilis genome, prompting speculation on a potential secondary function of Fae as a sensor/regulator (20). As in the composite genome, one copy of fae in Mt. mobilis JLW8 (mmol_1253) was part of the chemotaxis gene cluster. However, the gene neighborhood was only partially conserved in the genome of Mt. versatilis 301 (m301_0896), where only a part of the chemotaxis cluster was present. Gene neighborhoods surrounding the second copy of fae (mmol_2056 and m301_1343) were not conserved between the genomes and between the metagenomic fragments annotated as Methylotenera. These, along with our previous observations (22), highlight the tendency of the fae genes to diverge in sequence, genomic location, and copy number. In addition to the genes encoding true Fae (over 50% identity with the functionally characterized enzyme) (36), all three genomes encoded Fae2, a Fae homolog of yet-unknown function. The genomes of Mt. versatilis 301 and Mv. glucosetrophus SIP3-4 also encoded Fae3, another Fae homolog of unknown function (4). In the genome of Mt. versatilis 301, fae3 is located near the gene cluster encoding the N-methylglutamate pathway.

While in Methylophilaceae enzyme activities are found at high levels for both linear H4MPT-linked and cyclic RuMP cycle formaldehyde oxidation, the latter pathway, involving 6-phosphogluconate dehydrogenase (Gnd) in addition to the early enzymes of the RuMP cycle (Hps, Hpi, phosphoglucoisomerase, and glucose 6-phosphate dehydrogenase), has been assumed to be the major pathway. Two forms of Gnd have been previously recognized, GndA and GndB; the former is active with NAD as a cofactor, and the latter is active with NADP (7). Mutagenesis in Mb. flagellatus KT revealed that while either of the Gnd enzymes can operate in the oxidative RuMP cycle, the NAD-linked enzyme appeared to be more important for the organism's fitness (16). Of these two enzymes, Mv. glucosetrophus SIP3-4 was predicted to possess only GndA, while the Methylotenera strains were predicted to possess only GndB.

(iii) Formate oxidation.

For formate oxidation, an enzyme previously described as FDH2, predicted to be a molybdenum-containing enzyme, was encoded by all three genomes, while homologs of the previously characterized FDH4 were encoded only by Mt. mobilis JLW8 and Mv. glucosetrophus SIP3-4. While the role of this enzyme is not well understood, its importance in the fitness of both M. extorquens AM1 and Mb. flagellatus KT has been previously demonstrated (6, 16).

(iv) C1 assimilation.

The enzymes involved in the remaining reactions of the RuMP cycle (i.e., the cleavage of 6-phosphogluconate and regeneration of RuMP) were found to be highly conserved among all the genomes compared here, with edd and eda clustered together in all cases, zwf, pgl, and gndA clustered in the chromosome of Mv. glucosetrophus SIP3-4, and zwf and pgl clustered in the chromosomes of Mt. mobilis JLW8 and Mt. versatilis 301 (which do not contain gndA). The remaining genes lacked any specific clustering on the chromosomes and often were not parts of syntenic islands.

Multicarbon substrate metabolism.

It has been discussed before that the obligately methylotrophic phenotype of Mb. flagellatus KT may result from the lack of three enzymes of the tricarboxylic acid cycle: alpha-ketoglutarate dehydrogenase, malate dehydrogenase, and succinate dehydrogenase (8). However, the composite genome of Methylotenera, like the genome of strain HTCC2181, while still lacking the genes for alpha-ketoglutarate dehydrogenase, encoded malate dehydrogenase and succinate dehydrogenase, and the respective genes were clustered in a strictly conserved way with the remaining genes for the methylcitric acid (MCA) cycle on the chromosomes of the two organisms (20). This study revealed that both Methylotenera strains encode the complete MCA cycle, with gene clustering identical to that uncovered for the composite Methylotenera genome (20), while Mv. glucosetrophus SIP3-4 does not. However, Mv. glucosetrophus SIP3-4, unlike Mb. flagellatus KT, encodes a malate dehydrogenase.

Of all the strains described, only Mt. versatilis 301 has been demonstrated to be a true facultative methylotroph, capable of growing on, among other substrates, fructose and pyruvate (22). Genome-genome comparisons revealed that only strain 301 encoded a fructose-specific transport system (phosphotransferase system [PTS]), along with phosphofructokinase (m301_1125 and m301_1126) (see Table S2 in the supplemental material), and these are likely responsible for fructose metabolism. Pyruvate phosphate dikinase (m301_0371) was also unique to Mt. versatilis 301, suggesting its importance in pyruvate metabolism.

Nitrogen metabolism.

We previously noted that the composite genome of Methylotenera encoded an incomplete denitrification pathway (20) and later demonstrated that Mt. mobilis JLW8 is capable of denitrification with N2O as the final product (21). The denitrification genes uncovered in the genome of Mt. mobilis JLW8 were highly conserved in sequence with those in the composite genome of Methylotenera, and their clustering was also conserved. However, no genes for dissimilatory nitrite reduction or nitric oxide reduction were present in the genomes of Mt. versatilis 301 and Mv. glucosetrophus SIP3-4. Conversely, genes for assimilatory nitrate and nitrite reduction that are highly similar to the respective genes in Mt. mobilis JLW8 were found in the genomes of Mt. versatilis 301 and Mv. glucosetrophus SIP3-4, while the genome of Mb. flagellatus KT contained putative genes for assimilatory nitrate and nitrite reduction not related to the genes in other strains (<30% amino acid identity), and strain HTCC2181 had none of these genes (Table 3).

Highly conserved gene clusters encoding enzymes for urea metabolism were uncovered in the genomes of Mt. versatilis 301 and Mv. glucosetrophus SIP3-4 (m301_1356 to m301_1372 and msip34_0695 to msip34_0711, respectively), and these were highly similar to the urea degradation gene cluster in Mb. flagellatus KT. Only Mt. versatilis 301 encoded enzymes for a putrescine degradation pathway (m301_1037 to m301_1071) (see Table S2 in the supplemental material).

The core genome and the pangenome of Methylophilaceae.

The availability of five divergent Methylophilaceae genomes allowed for an initial estimate of the core genome for this family, i.e., the fraction of genes present in each genome. The minimalist genome of strain HTCC2181 is key in defining the gene set essential for methylotrophy in this group. At cutoffs of 50% and 40% (at the protein level), only 579 and 798 genes, respectively, were conserved among all five genomes. These encoded mainly the important housekeeping functions (such as ribosomal proteins, DNA and RNA synthesis machinery, and amino acid biosynthesis) as well as some of the methylotrophy functions. From these comparative analyses, neither MDH nor MADH genes were parts of the core genome. While it is widespread in methylotrophs of various groups (4), the C1 transfer pathway linked to H4MPT was also not part of the core genome. Conversely, genes encoding C1 transfer reactions linked to H4F (involving metF, folD, and purU) were parts of the core genome. Potentially, this pathway is utilized by Methylophilaceae as an additional means for oxidizing formaldehyde (see the discussion in reference 4) or for oxidizing methyl-H4F originating from demethylation reactions. However, experimental validation is needed to draw conclusions about the importance of this pathway for methylotrophy. Genes encoding XoxF and associated proteins (xoxFJG) were parts of the core genome, with all XoxF proteins being highly conserved (>70% identity, with one exception). Genes for PQQ biosynthesis and MDH accessory functions (mxaRSACKL, possibly more precisely designated PQQ-dependent dehydrogenase accessory functions) were parts of the core genome. All the genes involved in the reactions of the RuMP cycle, with the exception of gnd, were also parts of the core genome and were highly conserved. While a Gnd enzyme was encoded by each of the genomes and thus the gene was part of the functional core genome, neither of the two nonhomologous genes, gndA or gndB, was present in all organisms. Genes encoding only one of the formate dehydrogenases (FDH2) were parts of the core genome.
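The dependence of the core genome size on the identity cutoff (579 genes at 50% versus 798 genes at 40%) can be illustrated with a minimal sketch. It assumes that each gene of a reference genome has already been assigned the percent identity of its best protein-level hit in each of the other genomes; the function name, the gene labels, and all identity values below are hypothetical and are not taken from the analysis described here.

```python
def core_genes(best_hits, other_genomes, cutoff):
    """Return reference genes whose best hit meets the cutoff in every other genome.

    best_hits maps a gene ID to {genome name: percent identity of the best hit};
    a genome missing from the inner dict is treated as having no hit (0% identity).
    """
    return sorted(
        gene
        for gene, hits in best_hits.items()
        if all(hits.get(genome, 0.0) >= cutoff for genome in other_genomes)
    )


# Hypothetical identities for three genes of one reference genome.
others = ["JLW8", "301", "SIP3-4", "KT"]
toy_hits = {
    "gene_a": {"JLW8": 82.0, "301": 80.0, "SIP3-4": 77.0, "KT": 75.0},
    "gene_b": {"JLW8": 55.0, "301": 47.0, "SIP3-4": 52.0, "KT": 60.0},
    "gene_c": {"JLW8": 65.0, "301": 61.0},  # no hit in two of the genomes
}

core_at_50 = core_genes(toy_hits, others, 50.0)
core_at_40 = core_genes(toy_hits, others, 40.0)
print(core_at_50)
print(core_at_40)
```

In this toy case, relaxing the cutoff from 50% to 40% admits gene_b into the core set, which mirrors the direction of the change reported above; a gene lacking a hit in any genome is excluded at every cutoff.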

The pangenome of Methylophilaceae, as deduced from the five genomes, was found to consist of 5,803 orthologous groups of genes, encoded by approximately 6 Mbp of DNA. Members of 827 orthologous groups were found in all genomes, and 1,709 orthologous groups included members from 2 to 4 genomes. The remaining 3,267 orthologous groups were found in only 1 genome. These observations support the notion that the members of Methylophilaceae included in this analysis are diverse in regard to their gene content.
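The partition of the pangenome by the number of genomes represented in each orthologous group can be sketched as follows. This is an illustration only, assuming that every orthologous group is given as the set of genomes contributing at least one member; the function name and the toy group identifiers are hypothetical, while the three reported counts are taken from the text above.

```python
def partition_pangenome(groups, n_genomes):
    """Count orthologous groups found in all genomes, in several, or in only one.

    groups maps an orthologous-group ID to the set of genomes that contain
    at least one member of that group.
    """
    counts = {"core": 0, "shared": 0, "unique": 0}
    for group_id, members in groups.items():
        k = len(members)
        if k == 0 or k > n_genomes:
            raise ValueError(f"group {group_id} has an invalid genome count: {k}")
        if k == n_genomes:
            counts["core"] += 1
        elif k == 1:
            counts["unique"] += 1
        else:
            counts["shared"] += 1
    return counts


GENOMES = ["JLW8", "301", "SIP3-4", "KT", "HTCC2181"]

# Hypothetical orthologous groups, for illustration only.
toy_groups = {
    "og_1": set(GENOMES),
    "og_2": set(GENOMES),
    "og_3": {"JLW8", "301", "KT"},
    "og_4": {"JLW8"},
    "og_5": {"301"},
}
toy_counts = partition_pangenome(toy_groups, len(GENOMES))
print(toy_counts)

# The three classes reported in the text add up to the stated pangenome size.
reported = {"core": 827, "shared": 1709, "unique": 3267}
print(sum(reported.values()))
```

The same bookkeeping shows that the three reported classes (827, 1,709, and 3,267 groups) account for all 5,803 orthologous groups.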

A genome announcement for Methylovorus strain MP688 was published while this paper was in preparation (39). We did not use this genome in core and pangenome calculations because it is extremely similar to the genome of Mv. glucosetrophus (39), possessing only 61 nonconserved genes, none of which are parts of the core genome. Methylovorus strain MP688 does not possess any of the genes found on plasmids of Mv. glucosetrophus SIP3-4.

Genomic islands and means for genomic evolution.

From the analyses above, the core genome of Methylophilaceae appears to be about 1/10 of the pangenome, suggesting significant metabolic flexibility for the members of this family. Indeed, each genome contained a number of unique genes (11 to 19% of the total genes per genome) not present in any other genomes compared here. Typically, the unique genes were found to form gene islands, and these could be classified as metabolic islands (such as the denitrification gene island in Mt. mobilis JLW8 and fructose metabolism and choline degradation gene islands in Mt. versatilis 301) and (pro)phage-like gene islands (such as the ones previously described for the Mb. flagellatus KT genome [8] and the prominent phage-like islands in the genomes described here [see Table S2 in the supplemental material]).

Genomic islands that are results of horizontal gene acquisition have been previously recognized as sources of genomic diversity in other closely related species (32). One of the major mechanisms proposed for such horizontal gene acquisition is the one involving homologous recombination at the tRNA sites. Indeed, many of the gene islands identified in the genomes of Methylophilaceae were flanked by tRNA genes (see Table S2 in the supplemental material), suggesting that this mechanism plays an important role in genomic diversity among representatives of this family. Insertion elements were also identified in each of the genomes (see Table S3 in the supplemental material), and many of these were parts of the genomic islands (see Table S2 in the supplemental material). Mv. glucosetrophus SIP3-4 in addition carries two circular plasmids, the larger of which appears to be a conjugative plasmid and the smaller of which contains at least one integrase gene, suggesting that conjugation followed by chromosomal integration may contribute to the evolution of this species.

We also noted that multiple copies of pqqA were present in all Methylophilaceae genomes (Table 3). The function originally proposed for pqqA, in supplying a peptide that serves as a precursor in the biosynthetic pathway for PQQ (34), has been questioned based on the lack of detection of the respective peptides in comprehensive proteomics studies (16). We previously speculated that pqqA may instead code for a small RNA (16). Comparative analysis of the new genomes of Methylophilaceae suggests another potential function, as hot spots for recombination, using a mechanism similar to the one based on highly conserved sequences of tRNA (37). This proposal is based on an observation that the locations of pqqA genes tend to coincide with the breaks in synteny between the chromosomes of the Methylophilaceae species compared. This observation is illustrated in Fig. 2, which compares the locations of pqqA in the genomes of the two Methylotenera species. In some instances, pqqA genes were found on the flanks of genomic islands, alone or together with tRNA genes (see Table S2 in the supplemental material).
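The observation that pqqA locations tend to coincide with breaks in synteny could be checked along the following lines. This is a minimal sketch, assuming linearized chromosome coordinates and a user-chosen distance window; the function name, the window size, and all coordinates are hypothetical and do not correspond to positions in the genomes compared here.

```python
def loci_near_breaks(loci, breaks, window):
    """Return the loci that lie within `window` bp of at least one synteny break."""
    return [pos for pos in loci if any(abs(pos - b) <= window for b in breaks)]


# Hypothetical coordinates (bp) on a linearized chromosome.
pqqA_positions = [120_000, 850_000, 1_900_000]
synteny_breaks = [118_500, 860_000, 1_400_000]

near = loci_near_breaks(pqqA_positions, synteny_breaks, window=5_000)
print(near)
```

With these toy values, only the first locus falls within 5 kb of a break. A real test of the proposal would also need a null expectation, for example the fraction of randomly placed loci falling near breaks, which is not attempted here.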

Fig. 2.

Synteny plot between the chromosomes of Methylotenera mobilis JLW8 (horizontal axis) and Methylotenera versatilis 301, with locations of each pqqA gene denoted by arrows. The genomes were linearized at the origin of replication. Red, leading strand; blue, lagging strand.

Conclusions.

The complete sequences of the genomes of Mt. mobilis JLW8, Mt. versatilis 301, and Mv. glucosetrophus SIP3-4 expand the genomic databases for representatives of the family Methylophilaceae to include two new genera, Methylotenera and Methylovorus. Comparative analysis of the five available genomes demonstrated significant genetic divergence among Methylophilaceae, at both the genus and species levels, including genes and gene islands encoding methylotrophy functions. The core genome of Methylophilaceae is relatively small and does not include some of the bona fide methylotrophy functions, such as methanol dehydrogenase, methylamine dehydrogenase, and the H4MPT-linked formaldehyde oxidation pathway. In contrast, the pangenome is large and includes genes for multiple auxiliary metabolic pathways such as denitrification and choline and putrescine degradation. A wealth of genomic islands that appear to be phage- or plasmid-like elements was uncovered, and these carry a potential for genomic plasticity and means for the evolution of the Methylophilaceae genomes.

Supplementary Material

[Supplemental material]

ACKNOWLEDGMENTS

This work was funded by grants from the National Science Foundation (MCB-0604269 and MCB-0950183). The work conducted by the U.S. Department of Energy Joint Genome Institute was supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-05CH11231.

Footnotes

Supplemental material for this article may be found at http://jb.asm.org/.

Published ahead of print on 27 May 2011.

REFERENCES

  • 1. Anthony C. 1982. The biochemistry of methylotrophs. Academic Press, London, United Kingdom.
  • 2. Anthony C. 2004. The quinoprotein dehydrogenases for methanol and glucose. Arch. Biochem. Biophys. 428:2–9.
  • 3. Chistoserdov A. Y., McIntire W. S., Mathews F. S., Lidstrom M. E. 1994. Organization of the methylamine utilization (mau) genes in Methylophilus methylotrophus W3A1-NS. J. Bacteriol. 176:4073–4080.
  • 4. Chistoserdova L. 28 March 2011. Modularity of methylotrophy, revisited. Environ. Microbiol. doi:10.1111/j.1462-2920.2011.02464.x.
  • 5. Chistoserdova L., Chistoserdov A. Y., Schklyar N. L., Baev M. V., Tsygankov Y. D. 1991. Oxidative and assimilative enzyme activities in continuous cultures of the obligate methylotroph Methylobacillus flagellatum. Antonie Van Leeuwenhoek 60:101–107.
  • 6. Chistoserdova L., et al. 2007. Identification of a fourth formate dehydrogenase in Methylobacterium extorquens AM1 and confirmation of the essential role of formate oxidation in methylotrophy. J. Bacteriol. 189:9076–9081.
  • 7. Chistoserdova L., et al. 2000. Analysis of two formaldehyde oxidation pathways in Methylobacillus flagellatus KT, a ribulose monophosphate cycle methylotroph. Microbiology 146:233–238.
  • 8. Chistoserdova L., et al. 2007. The genome of Methylobacillus flagellatus, the molecular basis for obligate methylotrophy, and the polyphyletic origin of methylotrophy. J. Bacteriol. 189:4020–4027.
  • 9. Chistoserdova L., Kalyuzhnaya M. G., Lidstrom M. E. 2009. The expanding world of methylotrophic metabolism. Annu. Rev. Microbiol. 63:477–499.
  • 10. Cox J. M., Day D. J., Anthony C. 1992. The interaction of methanol dehydrogenase and its electron acceptor, cytochrome cL in methylotrophic bacteria. Biochim. Biophys. Acta 1119:97–106.
  • 11. Crump B. C., Koch E. W. 2008. Attached bacterial populations shared by four species of aquatic angiosperms. Appl. Environ. Microbiol. 74:5948–5957.
  • 12. Dick G. J., Tebo B. M. 2010. Microbial diversity and biogeochemistry of the Guaymas Basin deep-sea hydrothermal plume. Environ. Microbiol. 12:1334–1347.
  • 13. Ginige M. P., et al. 2004. Use of stable-isotope probing, full-cycle rRNA analysis, and fluorescence in situ hybridization-microautoradiography to study a methanol-fed denitrifying microbial community. Appl. Environ. Microbiol. 70:588–596.
  • 14. Giovannoni S. J., et al. 2008. The small genome of an abundant coastal ocean methylotroph. Environ. Microbiol. 10:1771–1782.
  • 15. Goris J., et al. 2007. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int. J. Syst. Evol. Microbiol. 57:81–91.
  • 16. Hendrickson E. L., et al. 2010. The expressed genome of Methylobacillus flagellatus defined through comprehensive proteomics and new insights into methylotrophy. J. Bacteriol. 192:4859–4867.
  • 17. Kalyuzhnaya M. G., Lidstrom M. E., Chistoserdova L. 2004. Utility of environmental probes targeting ancient enzymes: methylotroph detection in Lake Washington. Microb. Ecol. 48:436–472.
  • 18. Kalyuzhnaya M. G., Nercessian O., Lidstrom M. E., Chistoserdova L. 2005. Development and application of polymerase chain reaction primers based on fhcD for environmental detection of methanopterin-linked C1-metabolism in bacteria. Environ. Microbiol. 7:1269–1274.
  • 19. Kalyuzhnaya M. G., Bowerman S., Lara J. C., Lidstrom M. E., Chistoserdova L. 2006. Methylotenera mobilis gen. nov., sp. nov., an obligately methylamine-utilizing bacterium within the family Methylophilaceae. Int. J. Syst. Evol. Microbiol. 56:2819–2823.
  • 20. Kalyuzhnaya M. G., et al. 2008. High resolution metagenomics targets major functional types in complex microbial communities. Nat. Biotechnol. 26:1029–1034.
  • 21. Kalyuzhnaya M. G., et al. 2009. Methylophilaceae link methanol oxidation to denitrification in freshwater lake sediment as suggested by stable isotope probing and pure culture analysis. Environ. Microbiol. Rep. 1:385–392.
  • 22. Kalyuzhnaya M. G., et al. 2011. Novel methylotrophic isolates from Lake Washington sediment and description of a new species in the genus Methylotenera, Methylotenera versatilis sp. nov. Int. J. Syst. Evol. Microbiol. doi:10.1099/ijs.0.029165-0.
  • 23. Kletsova L. V., Govorukhina N. I., Tsygankov Y. D., Trotsenko Y. A. 1987. Metabolism of the obligate methylotroph Methylobacillus flagellatum. Microbiologiya 56:901–906.
  • 24. Kletsova L. V., Chibisova E. S., Tsygankov Y. D. 1988. Mutants of the obligate methylotroph Methylobacillus flagellatum KT defective in genes of the ribulose monophosphate cycle of formaldehyde fixation. Arch. Microbiol. 149:441–446.
  • 25. Latypova E., et al. 2010. Genetics of the glutamate-mediated methylamine utilization pathway in the facultative methylotrophic beta-proteobacterium Methyloversatilis universalis FAM5. Mol. Microbiol. 75:426–439.
  • 26. Lidstrom M. E. 2006. Aerobic methylotrophic procaryotes, p. 618–634. In Balows A., Truper H. G., Dworkin M., Harder W., Schleifer K.-H. (ed.), The prokaryotes. Springer-Verlag, New York, NY.
  • 27. Markowitz V. M., et al. 2006. The integrated microbial genomes (IMG) system. Nucleic Acids Res. 34:D344–D348.
  • 28. Mavromatis K., et al. 2009. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system. PLoS One 4:e7979.
  • 29. Nercessian O., Noyes E., Kalyuzhnaya M. G., Lidstrom M. E., Chistoserdova L. L. 2005. Bacterial populations active in metabolism of C1 compounds in the sediment of Lake Washington, a freshwater lake. Appl. Environ. Microbiol. 71:6885–6899.
  • 30. Redmond M. C., Valentine D. L., Sessions A. L. 2010. Identification of novel methane-, ethane-, and propane-oxidizing bacteria at marine hydrocarbon seeps by stable isotope probing. Appl. Environ. Microbiol. 76:6412–6422.
  • 31. Schmidt S., Christen P., Kiefer P., Vorholt J. A. 2010. Functional investigation of methanol dehydrogenase-like protein XoxF in Methylobacterium extorquens AM1. Microbiology 156:2575–2586.
  • 32. Tuanyok A., et al. 2008. Genomic islands from five strains of Burkholderia pseudomallei. BMC Genomics 9:566.
  • 33. Varani A. M., Siguier P., Gourbeyre E., Charneau V., Chandler M. 2011. ISsaga is an ensemble of web-based methods for high throughput identification and semi-automatic annotation of insertion sequences in prokaryotic genomes. Genome Biol. 12:R30.
  • 34. Velterop J. S., et al. 1995. Synthesis of pyrroloquinoline quinone in vivo and in vitro and detection of an intermediate in the biosynthetic pathway. J. Bacteriol. 177:5088–5098.
  • 35. Vishnivetskaya T. A., et al. 2010. Microbial community changes in response to ethanol or methanol amendments for U(VI) reduction. Appl. Environ. Microbiol. 76:5728–5735.
  • 36. Vorholt J. A., Marx C. J., Lidstrom M. E., Thauer R. K. 2000. Novel formaldehyde-activating enzyme in Methylobacterium extorquens AM1 required for growth on methanol. J. Bacteriol. 182:6645–6650.
  • 37. Williams K. P. 2002. Integration sites for genetic elements in prokaryotic tRNA and tmRNA genes: sublocation preference of integrase subfamilies. Nucleic Acids Res. 30:866–875.
  • 38. Wilson S. M., Gleisten M. P., Donohue T. J. 2008. Identification of proteins involved in formaldehyde metabolism by Rhodobacter sphaeroides. Microbiology 154:296–305.
  • 39. Xiong X. H., et al. 2011. Complete genome sequence of the bacterium Methylovorus sp. strain MP688, a high-level producer of pyrroloquinolone quinone. J. Bacteriol. 193:1012–1013.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)
