Abstract
Closely related bacterial genomes usually differ in gene content, suggesting that nearly every strain in nature may be ecologically unique. We have tested this hypothesis by sequencing the genomes of extremely close relatives within a recognized taxon and analyzing the genomes for evidence of ecological distinctness. We compared the genomes of four Death Valley isolates plus the laboratory strain W23, all previously classified as Bacillus subtilis subsp. spizizenii and hypothesized through multilocus analysis to be members of the same ecotype (an ecologically homogeneous population), named putative ecotype 15 (PE15). These strains showed a history of positive selection on amino acid sequences in 38 genes. Each of the strains was under a different regimen of positive selection, suggesting that each strain is ecologically unique and represents a distinct ecological speciation event. The rate of speciation appears to be much faster than can be resolved with multilocus sequencing. Each PE15 strain contained unique genes known to confer a function for bacteria. Remarkably, no unique gene conferred a metabolic system or subsystem function that was not already present in all the PE15 strains sampled. Thus, the origin of ecotypes within this clade shows no evidence of qualitative divergence in the set of resources utilized. Ecotype formation within this clade is consistent with the nanoniche model of bacterial speciation, in which ecotypes use the same set of resources but in different proportions, and genetic cohesion extends beyond a single ecotype to the set of ecotypes utilizing the same resources.
INTRODUCTION
Microbial ecology is challenged to explain the extreme disparity among bacteria in physiology, cell structure, and ecology (1–4), as well as the huge numerical diversity of bacterial taxa, which run over 70 phyla (5, 6) and possibly billions of species (7–9). Ultimately, the origins of all bacterial groups, from the profoundly disparate divisions to the most closely related taxa, trace back to the origin of species, whereby one lineage splits into two ecologically distinct lineages that are able to coexist. In this study, we aimed to estimate the tempo of bacterial speciation and to characterize the processes of divergence that underlie the origin of species.
How easy is bacterial speciation?
Over the last several decades, evolutionary biology has seen the emergence of models of sympatric speciation (10, 11). One such model is ecological speciation (10), in which local populations living in different habitat types or utilizing different resources can successfully and indefinitely diverge, despite interbreeding.
Ecological speciation is likely to occur frequently in the bacterial world (12–20), owing to several aspects of bacterial population dynamics. First, the low rate of genetic exchange in bacteria, even among the closest relatives, is not sufficient to hinder adaptive divergence, so evolution of sexual isolation is not a necessary milestone of bacterial speciation (14, 21–23). Second, genetic exchange can occur between distantly related bacteria (22, 24, 25), allowing a single gene acquisition event to dramatically change the ecological niche of the recipient population. Third, recombining segments are often quite short (26–29), so niche-transcending adaptations can be transferred without the cotransfer of niche-specifying genes (22, 30, 31). Finally, the astronomical sizes of many bacterial populations make adaptive mutations and recombinations that are rare on a per-capita basis accessible on a per-population basis (32).
Empirical evidence for rapid speciation has emerged from genome comparisons, in which close relatives within a named species have been shown to differ in the contents of their genomes (19, 33–35). Because acquisition of novel genes can effect change in ecological niche (24, 35–37), Doolittle and Zhaxybayeva have inferred that even the closest relatives may be ecologically distinct (19).
Nevertheless, this conclusion is not yet well supported. First, the species recognized by bacterial systematics are well known to hold an enormous level of diversity in physiology and ecology (15, 20, 33, 35, 38–46), so finding that members of a species are also diverse in their genome content is not surprising. What is needed is to compare the genome contents of the closest relatives, not just members of the same species taxon. In addition, genome content differences between the closest relatives must be shown to have ecological importance. Genome content differences among members of the same species are dominated by genes that are functional for phage or transposons and by the so-called “hypothetical” genes, with no known function for bacteria (33, 47). An ecologically motivated genome comparison should do more than track the comings and goings of genetic elements that parasitize bacteria.
We have aimed to characterize the origins of ecological diversity in bacteria by comparing the genomes of close relatives that represent the most newly divergent “ecotypes.” We define ecotype here as a phylogenetic group of close relatives that are ecologically very similar, in that the members of an ecotype share genetic adaptations to a particular set of habitats, resources, and conditions (48). More specifically, different ecotypes are predicted to coexist indefinitely as a result of their ecological differences, while lineages within one ecotype are ecologically too homogeneous to allow indefinite coexistence (49, 50). This definition contrasts with earlier ecotype concepts (51), in that the present definition of ecotype implies no other species-like characteristics beyond ecological distinctness. Our goals are to discover empirically which species-like properties follow from ecological distinctness and to quantify the rate at which ecotypes originate. We use the terms ecotype and species interchangeably, to mean the most newly divergent, ecologically distinct populations, and we refer to the origin of bacterial ecotypes as “speciation.”
Models of bacterial speciation.
de Queiroz has enumerated species-like properties shared among zoological models of speciation, of which we consider three: (i) a species is cohesive, in that diversity within the species is recurrently constrained; (ii) different species are ecologically distinct; and (iii) different species are irreversibly separate (52); additionally, species may be discovered as distinct sequence clusters (51, 53). Newly divergent species of bacteria may hold all or some of these species properties, depending on the model of bacterial speciation (Table 1). These models differ profoundly in the rate at which new ecological diversity is invented, as well as the extent to which newly invented diversity persists into the future.
TABLE 1.
Properties of bacterial species under different models of speciation
Species-like properties of ecotypes are shown in columns 2 to 5.
| Model | Ecological distinctness | Cohesion | Irreversible separateness | Distinguishable as sequence clusters in multilocus sequence analysis |
|---|---|---|---|---|
| Stable ecotype | ✓ | ✓ | ✓ | ✓ |
| Speedy speciation | ✓ | ✓ | ✓ | |
| Nanoniche | ✓ | ✓ (across ecotypes sharing the same resources) | | |
| Speciesless | ✓ | | | |
| Recurrent niche invasion | ✓ | ✓ | | |
In the stable-ecotype model, bacterial speciation is infrequent and ecologically distinct species coexist long enough to be distinguished as multilocus sequence clusters (Fig. 1A; Table 1) (51). The long-term coexistence of different ecotypes may be fostered by a qualitative ecological divergence, where each ecotype utilizes some unique resource not shared with others (22). In this model, an ecotype is subject to cohesion through many recurrent bouts of diversity-purging events, caused by periodic selection and/or genetic drift, during the long lifetime of the ecotype. Different ecotypes have the species property of being irreversibly separate because, owing to their ecological differences, neither periodic selection nor genetic drift can prevent further divergence among newly split ecotypes; moreover, ecological divergence between the ecotypes is not prevented by recurrent genetic exchange (12, 21–23). In the stable-ecotype model, most ecotypes are discernible as sequence clusters because longstanding ecotypes have had opportunity to accumulate neutral sequence divergence at every locus, while diversity within ecotypes is recurrently purged (51).
FIG 1.
Models of bacterial speciation. Ecotypes are represented by different colors, periodic selection events are indicated by asterisks, and extinct lineages are represented by dashed lines. The letters at the top represent the resources that each group of organisms can utilize. In cases where ecotypes utilize the same set of resources but in different proportions, the predominant resource of each ecotype is noted by a capital letter. (A) Stable-ecotype model. In this model, each ecotype endures many periodic selection events during its long lifetime. The stable-ecotype model generally yields a one-to-one correspondence between ecotypes and sequence clusters. The ecotypes are able to coexist indefinitely because each has a resource not shared with the others (22). (Reprinted from reference 22 [copyright 2011 Federation of European Microbiological Societies; published by Blackwell Publishing Ltd., all rights reserved].) (B) Speedy-speciation model. This model is much like the stable-ecotype model, except that speciation occurs so rapidly that most newly divergent ecotypes cannot be detected as sequence clusters in multilocus analyses (51). (Adapted from reference 51 with permission of Elsevier.) (C) Nanoniche model. Three nanoniche ecotypes use the same set of resources but in different proportions (noted by Abc, aBc, and abC). Each nanoniche ecotype can coexist with the other two because they have partitioned their resources, at least quantitatively. However, because the ecotypes share all their resources, each is vulnerable to a possible speciation-quashing mutation that may arise in the other ecotypes. This could be a mutation that increases efficiency in utilization of all resources. These speciation-quashing mutations are indicated by a large asterisk; each of these extinguishes the other nanoniche ecotypes. 
Thus, in the nanoniche model, cohesion can cut across ecologically distinct populations, provided that they are only quantitatively different in their resource utilization (22). (Reprinted from reference 22 [copyright 2011 Federation of European Microbiological Societies; published by Blackwell Publishing Ltd., all rights reserved].) (D) Speciesless model. Here the diversity within an ecotype is limited not by periodic selection but instead by the short time from the ecotype's invention as a single mutant until its extinction. The origination and extinction of each ecotype i are indicated by si and ei, respectively. In the absence of periodic selection, each extant ecotype that has given rise to another ecotype is a paraphyletic group, and each recent ecotype that has not yet given rise to another ecotype is monophyletic (50). (Adapted from reference 50.) (E) Recurrent niche invasion model. Here a lineage may move, frequently and recurrently, from one ecotype to another, usually by acquisition and loss of niche-determining plasmids. Red lines indicate the times in which a lineage is in the plasmid-containing ecotype; blue lines indicate the times when the lineage is in the plasmid-absent ecotype. Periodic selection events within one ecotype extinguish only the lineages of the same ecotype. For example, in the most ancient periodic selection event shown, which is in the plasmid-absent (blue) ecotype, only the lineages missing the plasmid at the time of periodic selection are extinguished, while the plasmid-containing lineages (red) persist. Ecotypes determined by a plasmid are not likely to be discoverable as sequence clusters (22). (Reprinted from reference 22 [copyright 2011 Federation of European Microbiological Societies; published by Blackwell Publishing Ltd., all rights reserved].)
In contrast, several models of bacterial speciation accommodate rapid ecological diversification. The speedy-speciation model is much like the stable-ecotype model, except that the pace of speciation is faster, so closely related ecotypes are not distinguishable as multilocus sequence clusters (Fig. 1B; Table 1) (51).
In the nanoniche model (Fig. 1C; Table 1), speciation occurs rapidly, but the most newly divergent, “nanoniche” ecotypes are only quantitatively different. Here each ecotype has no unique resources but utilizes the same set of resources in different proportions (22, 51). This pattern of ecological divergence is common in animal and plant speciation (54) and may extend to bacteria, as closely related bacterial ecotypes frequently overlap in the habitats they occupy (15, 42, 55–58). Nanoniche ecotypes are predicted to be ephemeral because each is vulnerable to a speciation-quashing periodic selection event emanating from another ecotype (Fig. 1C) (59). The nanoniche ecotypes are not irreversibly separate because cohesion can occur at the level of a set of ecotypes that utilize the same set of resources (Table 1).
Another possibility is that there is both rapid formation and extinction of ecotypes, as in the speciesless model (Fig. 1D) (50). Here each ecotype lives only briefly before going extinct, so the diversity within an ecotype is constrained by the short time of its existence rather than recurrent cohesive forces like periodic selection. Related to the speciesless model is the opportunitroph model, in which various generalist lineages adapt convergently to similar types of ephemeral particles (60).
Finally, the recurrent niche invasion model allows for the possibility that the ecological distinctness of ecotypes is encoded by plasmids or other transferable elements (22, 51, 61). Ecotypes in this model are not irreversibly separate (Fig. 1E; Table 1).
Testing the models of speciation requires identification of the ecotypes representing the most recent products of speciation, as we have set out to do.
The model system.
We have aimed to identify the models of speciation that apply to Bacillus subtilis relatives. We previously isolated Bacillus from various soil microhabitats within Radio Facility Wash (RFW), a canyon on the floor of Death Valley (55), and demarcated putative ecotypes using two sequence-based algorithms, ecotype simulation (ES) (13) and AdaptML (15). The membership of each putative ecotype contains a group of extremely close relatives that have been hypothesized to be ecologically interchangeable with one another but ecologically distinct from other ecotypes (62).
Our previous sequence-based analyses were conducted on a 1,776-bp concatenation of partial sequences of three genes, and they identified 32 putative ecotypes among the 11 recognized species and subspecies within the B. subtilis-Bacillus licheniformis clade (55, 63). Each putative ecotype hypothesized by ES and AdaptML analyses was confirmed to be ecologically distinct from other ecotypes through differences in their associations with microhabitats of different solar exposures and soil textures, as well as physiological differences (13, 55, 64). Nevertheless, we have not yet confirmed whether the putative ecotypes are each homogeneous in their ecological adaptations.
In this study, we have focused on discovering and characterizing ecological heterogeneity within a single putative ecotype (PE) we previously identified as PE15 and classified within Bacillus subtilis subsp. spizizenii (55). We tested for ecological heterogeneity within PE15 by comparing the genomes of four PE15 isolates from RFW, plus the genomes of reference strain W23 of B. subtilis subsp. spizizenii (65) and strain 168 of Bacillus subtilis subsp. subtilis (66). We demonstrated a high rate of invention of new ecotypes within the putative ecotype PE15. By characterizing the genetic basis of ecological differences among these ecotypes, we show that the genome comparisons most strongly support the nanoniche model, in which the most newly divergent ecotypes utilize only resources that are shared with their closest relatives and ecotypes are not irreversibly separate.
MATERIALS AND METHODS
Strains.
We chose PE15 as the focus of this study because it was the best sampled of all putative ecotypes from RFW and because the fully sequenced laboratory strain W23 was previously shown to be a member of PE15 (55). We selected four RFW isolates from PE15 to represent strains most likely to be ecologically distinct, based on their habitats of isolation and physiological properties (55). Two strains (RFWG1A3 and RFWG1A4) were isolated from the warmer and sunnier south-facing slope and appeared to be among the best adapted to high temperatures, based on heat adaptation index (HAI), a measure of the proportion of warm-adapting fatty acids versus cool-adapting fatty acids; two strains (RFWG4C10 and RFWG5B15) were isolated from the cooler and shadier north-facing slope and appeared to be among the best adapted to cooler temperatures based on HAI (55). We abbreviate the strain names here as G1A3, G1A4, G4C10, and G5B15.
We also included the strains B. subtilis subsp. spizizenii W23 and B. subtilis subsp. subtilis 168, whose genomes were previously sequenced (65, 66). Strain W23 was previously classified to the ecotype of interest (PE15), and strain 168 had been classified to PE10; both assignments were based on analysis of three genes (13, 55).
Genome sequencing and annotation.
Genomic DNA was extracted from liquid growth cultures, using the Gentra Puregene Yeast/Bact. kit, and was sequenced with a Roche 454 GS FLX pyrosequencer. Genomes were assembled using MIRA (Mimicking Intelligent Read Assembly) (67, 68) and Newbler (69). Contigs were ordered with Mauve (70) using the fully sequenced 168 (66) and W23 (65) genomes for reference. Gaps were manually closed using Consed (71). The draft genomes were annotated using the RAST automated server (72), which uses the SEED framework (73). Genes that were not phage or transposon related or hypothetical were classified to one of the functional systems and subsystems of RAST. Some genes were annotated as part of more than one functional subsystem or system. For example, the gene for beta-phosphoglucomutase was classified to two subsystems within the carbohydrate system: maltose and maltodextrin utilization and trehalose uptake and utilization.
Core genome phylogeny.
We identified orthologous genes shared by all six strains using Mauve (70). These orthologs were aligned and concatenated to establish a “core genome.” Applying Treefinder (74), we created a maximum likelihood core genome phylogeny for the five PE15 strains, rooted by strain 168.
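The concatenation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `concatenate_core`, the nested-dictionary input layout, and the choice to join genes in sorted name order are all hypothetical.

```python
def concatenate_core(alignments: dict, strains: list) -> dict:
    """Concatenate per-gene alignments into one sequence per strain.

    alignments maps gene name -> {strain name -> aligned sequence}.
    Genes are joined in sorted name order (an assumption made here so
    that every strain uses the same gene order).
    """
    core = {s: [] for s in strains}
    for gene in sorted(alignments):
        seqs = alignments[gene]
        if not all(s in seqs for s in strains):
            continue  # not shared by all strains, so not part of the core
        lengths = {len(seqs[s]) for s in strains}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        for s in strains:
            core[s].append(seqs[s])
    return {s: "".join(parts) for s, parts in core.items()}
```

Genes missing from any strain are skipped rather than padded, which mirrors the restriction of the core genome to orthologs shared by all six strains.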
Genome content comparisons.
Regions unique to individual genomes were discovered using the Novel Regions Finder of Panseq (75). Genes shared by each subset of strains were identified using Mauve (70). Unshared genes were categorized as hypothetical, phage or transposon related, or functional (i.e., for the bacterium), according to the RAST annotation.
Positive selection.
Two tests in the PAML package (76, 77) were implemented to identify orthologous genes shared by all six strains that had a history of positive selection on amino acid sequence. We eliminated from analyses any orthologs with stop codons in frame and those whose sequence length was not a multiple of three. We thus tested for positive selection in 2,892 of the 3,121 genes shared among the six strains.
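The ortholog filter described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the stop codons are those of the standard bacterial genetic code, and permitting a terminal stop codon is an assumption, since the text specifies only that in-frame stop codons and lengths that are not multiples of three were excluded.

```python
# Assumption: stop codons of the standard bacterial genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def passes_codon_filter(seq: str, allow_terminal_stop: bool = True) -> bool:
    """Return True if a coding sequence is usable for codon-based tests."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # length is not a multiple of three
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Optionally ignore the final codon, where a stop is expected.
    checked = codons[:-1] if allow_terminal_stop else codons
    return not any(codon in STOP_CODONS for codon in checked)

def filter_orthologs(orthologs: dict) -> dict:
    """Keep ortholog sets in which every strain's sequence passes the filter.

    orthologs maps gene name -> {strain name -> coding sequence}.
    """
    return {
        gene: seqs
        for gene, seqs in orthologs.items()
        if all(passes_codon_filter(s) for s in seqs.values())
    }
```

An ortholog set is dropped if any one strain's copy fails, because the downstream tests need the gene in every strain of the phylogeny.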
We first tested for positive selection occurring anywhere in the six-strain phylogeny for each gene by comparing a “nearly neutral” model (M1A) to a positive-selection model (M2A). We then identified genes under positive selection in each internode of the phylogeny using Branch Test Two. All the genes hypothesized to be under positive selection were checked for evidence of recombination on the basis of noncongruence of phylogenies and RDP3 (78).
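Comparisons of M1A and M2A are conventionally evaluated with a likelihood ratio test, in which twice the difference in log-likelihood is compared to a chi-square distribution with 2 degrees of freedom (M2A adds two free parameters). The sketch below assumes that convention; the function name and the log-likelihood values in the usage note are hypothetical and are not results from this study.

```python
import math

def lrt_pvalue_df2(lnl_null: float, lnl_alt: float) -> float:
    """P value for a likelihood ratio test with 2 degrees of freedom.

    lnl_null: log-likelihood under the nearly neutral model (M1A).
    lnl_alt:  log-likelihood under the positive-selection model (M2A).
    """
    # Test statistic; clamp at zero in case of small optimizer noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For df = 2 the chi-square survival function is exactly exp(-x / 2).
    return math.exp(-stat / 2.0)
```

For example, hypothetical log-likelihoods of -1000.0 (M1A) and -995.0 (M2A) give a statistic of 10 and a P value of exp(-5), about 0.0067. The closed form applies only to 2 degrees of freedom; other tests would need a general chi-square survival function.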
Growth in monoculture.
We employed monoculture and competition experiments to test whether the strains differ in utilization of maltose, maltodextrin, and inositol, as suggested from gene content comparisons. For monoculture growth, the four Death Valley isolates of PE15 were grown separately in liquid LB medium, and at log phase they were diluted 1:2,000 into minimal salts (79) (with no carbon source added) and vortexed. The diluted culture was added to minimal salts supplemented with 1% (final concentration) of glucose, maltose, maltodextrin, or inositol (with no additional carbon source) and incubated at 28°C, with shaking, in a sterile microtiter plate. Growth was monitored in a SpectraMax M5 microplate reader as percent absorbance at 405 nm at 15-min intervals for 24 h.
For a given substrate, experiments were conducted in three replicates on different days. Within each day, there were two subreplicate growth cultures, stemming from the same LB culture; control growth tests on glucose were based on the same LB culture. The unit of replication in our analyses was the mean of the two subreplicates from a given LB culture. Experiments were also set up with minimal medium and no carbon source added, as a control for the effects of possible residual LB.
Competition experiments.
For competition experiments, two strains (G1A4 and G4C10) were grown overnight in brain heart infusion (Bacto). Approximately equal numbers of cells from the two strains' overnight cultures (following previous estimates of cell densities), in a total volume of 25 μl, were inoculated into 10 ml of minimal medium with glucose, maltose, or maltodextrin (plus citrate, which does not allow significant growth of either strain [data not shown]). Cultures were incubated at 30°C with shaking (225 rpm) for 22 h. At 0, 4, 13, and 17 h, 1-ml aliquots were sampled and then stored at −20°C. These time points were selected to represent the time of inoculation, the beginning of exponential growth, the end of exponential growth, and stationary phase, respectively. The whole set of competition experiments was replicated twice, on different days.
The abundance of each strain was analyzed by quantitative PCR (qPCR). Bacterial genomic DNA was extracted by incubating 50 μl of the culture at 95°C for 20 min. Primers for each strain were designed from a unique, single-copy gene of its genome using Primer3 (80) (G1A4 gene identifier [ID], fig|932005.3.peg.207, beta-phosphoglucomutase; G1A4 forward, TGCCTCACAATCAGATCAGC; G1A4 reverse, AACACGTTTGGGGATTATGG; G4C10 gene ID, fig|932007.3.peg.1742, hypothetical protein; G4C10 forward, CCGCTTTCAAGGTATTGAGC; G4C10 reverse, AGACCAAAGAAAAGGCATGG; gene IDs assigned by RAST annotation [72]).
The qPCR was performed with Fast SYBR green master mix (Applied Biosystems) in the 7500/7500 Fast real-time PCR system, with the holding stage at 95°C for 20 s and the cycling stage at 95°C for 3 s and 60°C for 30 s (40 cycles). Three replicates were conducted for each reaction. The reproducibility of the qPCR assay was evaluated by the standard deviation of the cycle numbers required for fluorescence to reach a set threshold, over three replicates (<0.3). The average cycle number of the three replicates was used to calculate the copy number. The primer specificity of the qPCR assays was verified by a melt curve analysis. Standard curves for each strain were obtained using its pure genomic DNA for absolute quantification. The consistency of amplification efficiency was validated by the R² of the standard curves (all greater than 0.99).
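Absolute quantification from a standard curve can be sketched as follows, assuming the usual linear relationship between threshold cycle and log10 copy number. The function names and all numbers in the usage note are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics

def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0

def ct_sd(ct_replicates):
    """Sample standard deviation of replicate threshold cycles."""
    return statistics.stdev(ct_replicates)

def copies_from_ct(ct_replicates, slope, intercept):
    """Copy number from the mean Ct of the replicates."""
    mean_ct = sum(ct_replicates) / len(ct_replicates)
    return 10 ** ((mean_ct - intercept) / slope)
```

With a hypothetical dilution series of 10^3 to 10^7 copies and Ct values of 30.1, 26.8, 23.5, 20.2, and 16.9, the fitted slope is -3.3 and the intercept is 40, so replicate Ct values averaging 23.5 correspond to about 10^5 copies.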
Analysis of stationary-phase density.
Absorbance values at 24 h (monoculture) or qPCR copy number values at 22 h (competition) were log10 transformed. To account for G1A4's superior performance in minimal medium (as measured by growth in minimal medium with glucose), the values for maltose, maltodextrin, and inositol for each replicate of each strain were divided by the value for glucose. Strains were compared for performance in media with each carbon source (corrected and not corrected by glucose) in one-way analyses of variance (ANOVAs).
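The analysis above can be sketched as follows. This is one reading of the procedure, not a reproduction of the authors' script: it assumes that values are log10 transformed first and that each transformed value is then divided by the transformed glucose value from the same replicate. The function names are hypothetical, and only the F statistic and its degrees of freedom are computed.

```python
import math

def glucose_corrected(values, glucose_values):
    """Divide each log10-transformed value by its paired log10 glucose value.

    Note: a glucose value of exactly 1 has a log10 of 0 and would raise
    ZeroDivisionError; this sketch does not handle that case.
    """
    return [math.log10(v) / math.log10(g)
            for v, g in zip(values, glucose_values)]

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

Each group passed to `one_way_anova` would hold the replicate values for one strain on one carbon source. Converting the F statistic to a P value requires the F distribution, which is not in the Python standard library.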
RESULTS
Genome sequencing and annotation.
All four genomes were sequenced using 454 pyrosequencing, yielding a total sequence per genome of 59.22 to 109.92 Mb (see Table S1 in the supplemental material). The genomes were assembled as 11 to 20 contigs, with the N50 statistic ranging from 285 kbp to 848 kbp. The sizes of gaps were estimated by comparison to the complete genome sequence of B. subtilis subsp. spizizenii strain W23 (65). Gaps between contigs were usually shorter than 50 bp, except for the repetitive RNA genes (see Table S2 in the supplemental material). Thus, the genome assemblies all have an extremely high percent coverage and are unlikely to have missed any protein-coding genes. The genome sizes and GC contents of the four RFW strains of the putative ecotype PE15 (Table 2) were similar to those of B. subtilis subsp. spizizenii strain W23 (65). The assemblies contained no separate circular contigs, indicating an absence of plasmids.
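For readers unfamiliar with the N50 statistic reported above, a minimal sketch of its usual definition follows: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The function name is hypothetical, and the contig lengths in the usage note are illustrative rather than taken from these assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")
```

For contigs of 40, 30, 20, and 10 kbp, the running total reaches half of the 100-kbp assembly at the second contig, so the N50 is 30 kbp.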
TABLE 2.
Genome sequencing data
Core genome phylogeny.
The mean pairwise average nucleotide identity (ANI) value between the PE15 strains and the 168 strain of PE10 was 92.9% (Table 3). Within PE15, all strain pairs showed an ANI value of ≥99.4%. The average number of shared genes between the PE15 strains and the PE10 strain was 3,566; all pairs within PE15 shared at least 3,869 genes (Table 3).
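The summary values in this paragraph can be checked against Table 3 with a short calculation. The numbers below are transcribed from Table 3; the variable names are hypothetical.

```python
# Comparisons of each PE15 strain with strain 168 (PE10), from Table 3.
ani_vs_168 = {"W23": 92.907, "G1A3": 92.919, "G1A4": 92.899,
              "G4C10": 92.923, "G5B15": 92.933}
shared_vs_168 = {"W23": 3530, "G1A3": 3619, "G1A4": 3533,
                 "G4C10": 3554, "G5B15": 3592}

# Pairwise comparisons within PE15, from Table 3.
ani_within = [99.482, 99.556, 99.522, 99.505, 99.443,
              99.592, 99.498, 99.540, 99.497, 99.627]
shared_within = [3869, 3919, 3911, 3915, 3970,
                 3923, 3897, 4017, 3959, 3972]

mean_ani = sum(ani_vs_168.values()) / len(ani_vs_168)
mean_shared = sum(shared_vs_168.values()) / len(shared_vs_168)

print(f"mean ANI, PE15 vs. 168: {mean_ani:.1f}%")
print(f"mean genes shared, PE15 vs. 168: {mean_shared:.0f}")
print(f"minimum ANI within PE15: {min(ani_within):.1f}%")
print(f"minimum genes shared within PE15: {min(shared_within)}")
```

The four printed values are 92.9%, 3566, 99.4%, and 3869, matching the figures quoted in the text.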
TABLE 3.
Average nucleotide identity and number of genes sharedᵃ
| Strain | 168 (PE10) | W23 (PE15) | G1A3 (PE15) | G1A4 (PE15) | G4C10 (PE15) | G5B15 (PE15) |
|---|---|---|---|---|---|---|
| 168 | | 92.907 | 92.919 | 92.899 | 92.923 | 92.933 |
| W23 | 3,530 | | 99.482 | 99.556 | 99.522 | 99.505 |
| G1A3 | 3,619 | 3,869 | | 99.443 | 99.592 | 99.498 |
| G1A4 | 3,533 | 3,919 | 3,911 | | 99.540 | 99.497 |
| G4C10 | 3,554 | 3,915 | 3,970 | 3,923 | | 99.627 |
| G5B15 | 3,592 | 3,897 | 4,017 | 3,959 | 3,972 | |
ᵃ The average nucleotide identity (percent) is shown above the diagonal, and the number of genes shared is shown below the diagonal for strains of PE15 (within B. subtilis subsp. spizizenii) and PE10 (within B. subtilis subsp. subtilis).
The core genome alignment confirmed earlier results (55) showing that strain W23 clusters within PE15 (Fig. 2). The PE15 clade contains two sister subclades of two strains each, plus a more basal strain.
FIG 2.
Maximum likelihood core genome phylogeny of PE15 strains, rooted by strain 168 of PE10, with genome content comparisons (A) and positive selection analyses (B). In each internode, the unique genes are classified as follows: total genes/genes with known bacterial function/genes present in characterized functional subsystems. Each node was supported by 100% of 1,000 bootstrap replicates. In panel A, the left pie chart for each internode indicates the proportion of unique genes that are hypothetical, are phage or transposon related, or have a known bacterial function; the right pie chart indicates the proportion of unique genes in each of the RAST major functional systems. In the case of strain 168, the fraction of genes in five systems was too small to be visible, with each of the following at 1%: cell division and cell cycle, nitrogen metabolism, potassium metabolism, respiration, and membrane transport. In panel B, the pie charts indicate the functional classification of genes under positive selection for amino acid sequence.
Functional capabilities.
The members of PE15 were extremely similar in their functional capabilities (Fig. 3; see also Table S3 in the supplemental material). Comparisons across the putative ecotypes PE15 and PE10 showed much greater variation in gene content across functional systems. For each of three system categories (cell wall and capsule, protein metabolism, and carbohydrates), each putative ecotype contained at least 24 genes not present in the other (see Tables S4 and S5 in the supplemental material).
FIG 3.
Functional classification of gene content for strain G1A3 of PE15, by RAST functional systems (A) and subsystems (B) within the carbohydrate system. The number of genes in each system or subsystem is indicated in parentheses. The percentages of genes in each system category were very similar across the PE15 strains (with an average standard deviation of 0.20% across all system categories) (see Table S3 in the supplemental material). Likewise, the PE15 genomes were similar in the number of genes at the subsystem level. For example, in the carbohydrate subsystem, the average standard deviation of gene content was 0.66%.
Unique chromosomal regions.
Panseq identified 561 genes (in 116 regions) unique to strain 168 of PE10 compared to PE15, 197 of which (35.1%) were hypothetical and 68 of which (12.1%) were phage related (Fig. 2A; see also Table S3 in the supplemental material). Mauve identified 186 genes shared among the five PE15 strains but not found in PE10, 91 of which (48.9%) were hypothetical and 1 of which (0.5%) was phage related (Fig. 2A).
The individual strains of PE15 had an average of 67 unique genes. The majority of these (mean, 62.6%) were hypothetical (Fig. 2A). Chromosomal regions unique to subclades (or individual strains) within the PE15 phylogeny indicated the acquisition of that region by horizontal genetic transfer, as inferred by maximum parsimony, since each unique gene was most closely related to a gene in another species (see Table S4 in the supplemental material).
In a very few cases, the most parsimonious interpretation of comparative genome content analyses indicated gene loss in a PE15 lineage (see Fig. S1 in the supplemental material). In total, only 30 genes were lost along all the lineages.
Surprisingly, no unique gene conferred a metabolic system or subsystem function that was not already present in the PE15 core genome. That is, each unique metabolic gene was either an additional, paralogous copy of a gene already present in all the PE15 genomes or an additional gene within one or more functional subsystems present in all the PE15 genomes (Table 4; see also Table S4 in the supplemental material). The only unique genes to confer a subsystem function not present in all PE15 strains were members of the nonmetabolic restriction-modification subsystem.
TABLE 4.
Classification of unique genes to functional subsystemsᵃ
| PE15 strain | Functional subsystem(s) (no. of unique genes) | Product(s) of unique genes that are paralogs of genes present in all PE15 strains | Product(s) of nonparalogous unique genes that are members of subsystems present in all or some PE15 strains |
|---|---|---|---|
| G1A3 | Branched-chain amino acid biosynthesis (1) | 3-Isopropylmalate dehydrogenase | |
| | Leucine biosynthesis (1) | 3-Isopropylmalate dehydrogenase | |
| | CBSS-262719.3.peg.410 (1) | Replicative DNA helicase | |
| | DNA replication (1) | Replicative DNA helicase | |
| | DNA repair, bacterial (1) | | DNA-cytosine methyltransferaseᵇ |
| | Glutamine, glutamate, aspartate, and asparagine biosynthesis; Iojap; threonine and homoserine biosynthesis (1) | Transcriptional regulator, GntR family domain/aspartate aminotransferase | |
| | Inositol catabolism; inositol utilization (3) | Major myo-inositol transporter IolT; inosose isomerase; myo-inositol 2-dehydrogenase | |
| | Nitric oxide synthase (1) | Putative cytochrome P450 hydroxylase | |
| | Phosphate metabolism (1) | Alkaline phosphatase-like protein | |
| G1A4 | Alpha-amylase locus in Streptococcus; bacterial chemotaxis; maltose and maltodextrin utilization (1) | Maltose/maltodextrin ABC transporter, substrate binding periplasmic protein MalE | |
| | DNA replication (1) | DNA replication protein DnaC | |
| | Maltose and maltodextrin utilization (7) | Maltose/maltodextrin ABC transporter, permease protein MalG; maltose/maltodextrin ABC transporter, permease protein MalF | Neopullulanaseᵇ (2); maltose phosphorylaseᵇ; maltodextrose utilization protein MalAᵇ; maltose operon transcriptional repressor MalR, LacI familyᵇ |
| | Maltose and maltodextrin utilization; trehalose uptake and utilization (7) | | Beta-phosphoglucomutaseᵇ |
| | Teichoic and lipoteichoic acid biosynthesis (1) | CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase (2) | |
| G4C10 | At3g21300 (1) | | RNA methyltransferase, TrmA familyᵇ |
| | Biotin biosynthesis; biotin biosynthesis experimental; biotin synthesis cluster; YhgI, YhgH (1) | Adenosylmethionine-8-amino-7-oxononanoate aminotransferase | |
| | Protein chaperones; proteolysis in bacteria, ATP dependent (1) | ClpB protein | |
| | ESAT-6 protein secretion system in Firmicutes; ESAT-6 protein secretion system in Firmicutes (1) | Putative toxin component near putative ESAT-related proteins, repetitive/repetitive hypothetical protein near ESAT cluster, SA0282 homolog (5) | |
| G5B15 | Restriction-modification system; type I restriction-modification (3) | | Type I restriction-modification system, specificity subunit Sᶜ; type I restriction-modification system, DNA-methyltransferase subunit Mᶜ; type I restriction-modification system, restriction subunit Rᶜ |
ᵃ In some cases, a unique gene was a member of more than one subsystem, indicated with subsystems separated by semicolons. All unique genes either were paralogs of genes present in all PE15 strains or were nonparalogous additional genes in functional subsystems already present in some or all sampled strains within PE15. In cases where a unique gene with a given specific function was present in more than one copy, the number of copies is indicated in parentheses (e.g., 2 copies of neopullulanase in G1A4).
ᵇ Present in all PE15 strains.
ᶜ Present in some PE15 strains.
In the case of the carbohydrate system, for example, G1A3 was shown to have three unique genes involved in inositol utilization. All of these appear to have been acquired from other Bacillus taxa and are additional, paralogous copies of genes found in all the PE15 genomes sampled. Similarly, G1A4 has nine unique genes involved in maltose or maltodextrin utilization, which is a function encoded by all the PE15 genomes: three genes are paralogs of genes shared by all PE15 strains, and six are additional genes involved in maltose or maltodextrin utilization, particularly uptake and metabolism of maltodextrin (Table 4).
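The split of the nine G1A4 genes into paralogs and additional genes can be checked with a short tally. The sketch below is only an illustration: the gene list is transcribed from the maltose and maltodextrin rows of Table 4, and the shortened product labels, the tuple layout, and the `tally` helper are assumptions made for this example, not part of the original analysis.

```python
# Unique maltose/maltodextrin genes of strain G1A4, transcribed from Table 4.
# Each entry: (shortened product label, copies, is a paralog of a shared gene).
# The labels and the tuple layout are illustrative assumptions.
G1A4_MALTOSE_GENES = [
    ("MalE substrate-binding protein", 1, True),
    ("MalG permease", 1, True),
    ("MalF permease", 1, True),
    ("Neopullulanase", 2, False),
    ("Maltose phosphorylase", 1, False),
    ("MalA utilization protein", 1, False),
    ("MalR transcriptional repressor", 1, False),
    ("Beta-phosphoglucomutase", 1, False),
]


def tally(genes):
    """Return (paralogous copies, nonparalogous additional copies)."""
    paralogs = sum(copies for _, copies, is_paralog in genes if is_paralog)
    additional = sum(copies for _, copies, is_paralog in genes if not is_paralog)
    return paralogs, additional


paralogs, additional = tally(G1A4_MALTOSE_GENES)
print(f"paralogs: {paralogs}, additional: {additional}, total: {paralogs + additional}")
```

Counting copies rather than distinct products matters here, because neopullulanase is listed with two copies.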
Growth experiments.
In monoculture, the four PE15 strains were significantly heterogeneous in their stationary-phase densities in minimal medium with maltose, maltodextrin, and inositol (F = 57.13, 143.04, and 4.73; P < 0.0001, P < 0.0001, and P = 0.012, respectively; degrees of freedom [df] = 3, 8), as well as in the control medium with glucose (F = 6.06; P = 0.0026; df = 3, 8) (Fig. 4A; see also Fig. S2 in the supplemental material). (Strains did not grow and were not significantly different from one another in control medium with no carbon source added, as shown in Fig. S3 in the supplemental material.) Moreover, the strain expected to hold an advantage in maltose and maltodextrin (G1A4), owing to its nine unique genes, had a significantly higher stationary-phase density than each of the other strains with those resources (Tukey test, P < 0.01). Because G1A4 also had the highest density in the glucose control (although not significantly), we tested whether this strain benefited the most from maltose and maltodextrin by comparing the ratio of densities in maltose and maltodextrin to densities in glucose [log (density in maltose or maltodextrin/density in glucose)]. The heterogeneity among the four strains in their maltose and maltodextrin density ratios was marginally significant (P = 0.080 and 0.075, respectively). As predicted, strain G1A4 had the highest average ratio of densities for both maltose and maltodextrin (Fig. 4B) and, indeed, had the greatest ratio in all three independent replicates, an outcome expected by chance with a probability of (1/4)^3, i.e., 0.016 (for each resource).
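The glucose correction and the chance probability quoted above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the logarithm base is not given in the text (base 10 is assumed here, and the ranking of strains does not depend on it), the function names are invented for the example, and the absorbance values are hypothetical placeholders, not measurements from this study.

```python
import math


def chance_probability(n_strains, n_replicates):
    """Probability that one prespecified strain ranks first in every
    independent replicate if all strains are equally likely to rank first."""
    return (1.0 / n_strains) ** n_replicates


def log_density_ratio(density_resource, density_glucose):
    """Glucose-corrected density; base 10 is an assumption of this sketch."""
    return math.log10(density_resource / density_glucose)


def top_strain(ratios):
    """Name of the strain with the largest corrected density."""
    return max(ratios, key=ratios.get)


# Four strains, three replicates, as in the monoculture experiment.
print(round(chance_probability(4, 3), 3))

# HYPOTHETICAL absorbances (resource, glucose) for a single replicate,
# used only to show how the ratio is formed and ranked.
replicate = {
    "G1A3": (0.40, 0.50),
    "G1A4": (0.90, 0.60),
    "G4C10": (0.35, 0.45),
    "G5B15": (0.30, 0.40),
}
ratios = {s: log_density_ratio(r, g) for s, (r, g) in replicate.items()}
print(top_strain(ratios))
```

The chance calculation treats the three replicates as independent and the four strains as exchangeable under the null hypothesis, which is the assumption behind (1/4)^3.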
FIG 4.
Stationary-phase densities in monoculture with maltose, maltodextrin, inositol, or glucose. (A) Stationary-phase density (K) as estimated by absorbance; (B) stationary-phase density corrected by density in glucose.
In the case of growth in inositol, where G1A3 was expected to be superior owing to its three additional genes, this strain did not reach the highest stationary-phase density, whether corrected or uncorrected for growth in glucose (Fig. 4; see also Fig. S4 in the supplemental material).
Competition experiments.
qPCR analysis of strains G1A4 and G4C10 grown in coculture on maltose and maltodextrin showed that G1A4 grew to a higher density than G4C10 on both of these resources (maltose, t = 4.05, P = 0.028, 2 df, one-tailed test; maltodextrin, t = 12.0, P = 0.0034, 2 df, one-tailed test; see Fig. S5 in the supplemental material). When corrected for growth on glucose, G1A4 still showed superior growth on maltose (although not significantly) and on maltodextrin (t = 3.21, P = 0.043, 2 df, one-tailed test).
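For readers who want to relate the reported t statistics to their one-tailed P values, the Student t distribution with 2 df has a simple closed-form cumulative distribution, F(t) = 1/2 + t / (2 sqrt(2 + t^2)). The sketch below applies that standard formula to the t values quoted above. The function name is an assumption of this example, and nothing here reproduces how the statistics were originally computed.

```python
import math


def one_tailed_p_2df(t):
    """Upper-tail probability of Student's t with 2 degrees of freedom,
    from the closed form F(t) = 1/2 + t / (2 * sqrt(2 + t^2))."""
    return 0.5 - t / (2.0 * math.sqrt(2.0 + t * t))


reported = [
    ("maltose", 4.05),
    ("maltodextrin", 12.0),
    ("maltodextrin, glucose corrected", 3.21),
]
for label, t in reported:
    print(f"{label}: t = {t}, one-tailed P = {one_tailed_p_2df(t):.4f}")
```

Because the published t values are rounded, the recomputed probabilities should be expected to match the reported P values only approximately.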
Positive selection.
Phylogeny-wide testing for positive selection identified 14 genes under positive selection in one or more PE15 strains (see Table S6 in the supplemental material), and all but one of these genes have a known function in B. subtilis subsp. subtilis 168 (Fig. 2B).
Branch-specific testing indicated positive selection in 38 genes on six internodes within PE15; all these genes had a known bacterial function (Fig. 2B). The individual genomes had relatively few genes under positive selection—ranging from 0 to 2 genes per genome—but each strain was under a unique regimen of positive selection. Strain G1A4, which had acquired several genes coding for maltose or maltodextrin utilization, also showed evidence for positive selection in the maltose or maltodextrin utilization gene coding for glycogen phosphorylase (see Table S6 in the supplemental material).
DISCUSSION
We have investigated the rate at which bacterial lineages split to form ecologically distinct populations and have aimed to identify the species-like properties that follow the invention of new ecotypes. Our approach was to compare the genomes of very close relatives that had previously been hypothesized to belong to a single ecotype, to identify any ecological heterogeneity within the putative ecotype, and to discover the genetic and physiological bases of ecological differences.
The previous ecotype simulation and AdaptML analyses of Bacillus isolates from Radio Facility Wash of Death Valley, based on three genes, suggested the clade labeled PE15 to be one ecotype whose members are ecologically interchangeable with one another but ecologically distinct from other ecotypes (55). This putative ecotype was indeed confirmed to be ecologically distinct from its close relatives by differences in its habitat associations and physiology (55). However, like previous studies (13, 15, 42, 56–58, 81), our ecological confirmation of ecotypes did not test for ecological interchangeability among the members of an ecotype identified through analysis of sequence diversity.
Evidence of multiple ecotypes within PE15.
The genomic analysis of the five PE15 strains confirmed the earlier conclusion (55) that PE15 is a clade of close relatives. The core genome phylogeny showed the five PE15 strains to be extremely closely related (Fig. 2), with all pairwise ANI values of ≥99.4% (Table 3).
The core genome of the PE15 members showed a phylogenetic structure of two sister clades plus a more basal strain, with all nodes having highly significant support (Fig. 2). This structure would not be expected under a model in which PE15 is a single ecotype whose diversity is recurrently purged by periodic selection, as previously predicted (55). When periodic selection is the dominant force of cohesion, a star phylogeny is expected to result (with all strains equally related) (51). One explanation for the phylogenetic structure within PE15 would be that genetic drift is the primary force of cohesion within the ecotype, but the enormous population sizes of taxa within the B. subtilis-B. licheniformis clade in nature make this explanation unlikely (13, 55). The most plausible interpretation is that the phylogenetic structure reflects the origination of multiple ecotypes within PE15.
Further evidence of ecological heterogeneity within PE15 emerges from tests of positive selection on amino acid sequences of shared genes. Three of the five PE15 strains showed at least one gene under positive selection, and three of the nonterminal internodes within the PE15 clade also showed positive selection, indicating a unique regimen of positive selection on each of the five PE15 isolates (Fig. 2B). Finding a clade to be under a unique regimen of positive selection provides evidence that the clade is a distinct ecotype (23), and so we conclude that each of the five PE15 strains sampled represents a separate ecotype. Our genome-based analyses have thus indicated five ecotypes within one clade (PE15) that our previous analyses, based on three genes, had demarcated as a single ecotype (55).
It then became interesting to compare the rates of ecotype formation based on the present full-genome analysis versus the previous three-gene analysis (55). Our first step was to estimate the rate of ecotype formation within the full B. subtilis-B. licheniformis clade (excluding the outgroup strain from Bacillus halodurans), based on the three genes. A maximum likelihood analysis, using TreeFinder (74), yielded a Newick tree from which we estimated a total branch length of 0.872 substitution per site in the entire B. subtilis-B. licheniformis clade. The 31 putative ecotypes previously identified (55) correspond to 30 ecotype formation events emanating from an ancestral ecotype. Thus, the resolution of the earlier three-gene analysis yields a rate of 34.4 (= 30/0.872) ecotype formation events per nucleotide substitution per site (over the three-gene concatenation).
We then estimated the rate of ecotype formation within PE15, based on the present analysis of genomes, where the tree of five PE15 strains contains five ecotypes, or four ecotype formation events. A maximum likelihood phylogenetic analysis of the five PE15 strains, based on the three-gene concatenation, yielded a total branch length of 0.000605 substitution per site. The origination of four ecotypes within this clade yields 6,610 (= 4/0.000605) ecotype formation events per nucleotide substitution per site (in the three-gene concatenation). The full-genome analysis thus indicates a far higher rate of ecotype formation events than the original three-gene analysis, by a factor of 192.
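The two rate estimates and their ratio follow directly from the numbers given above, as the short sketch below shows. The function name `formation_rate` is an assumption of this example. The only inputs are the ecotype counts and total branch lengths quoted in the text, with the number of formation events taken as one less than the number of ecotypes.

```python
def formation_rate(n_ecotypes, total_branch_length):
    """Ecotype formation events per nucleotide substitution per site.

    n_ecotypes - 1 formation events are needed to derive n_ecotypes
    from a single ancestral ecotype.
    """
    events = n_ecotypes - 1
    return events / total_branch_length


# Three-gene analysis of the full B. subtilis-B. licheniformis clade.
three_gene_rate = formation_rate(31, 0.872)

# Genome-based count of ecotypes within PE15, scaled by the three-gene
# branch length of the same five strains.
genome_rate = formation_rate(5, 0.000605)

print(f"three-gene resolution: {three_gene_rate:.1f}")
print(f"genome resolution:     {genome_rate:.0f}")
print(f"fold difference:       {genome_rate / three_gene_rate:.0f}")
```

Both rates are expressed on the same scale (substitutions per site in the three-gene concatenation), which is what makes the fold difference meaningful.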
The ecological uniqueness of every strain in our small sample strongly suggests an enormous number of ecotypes among extremely close relatives, at least within our focus group of B. subtilis. It appears that ecotypes are formed at too high a rate to be discovered by analyzing the sequences of just a few genes. We predict that a high-resolution analysis by either ecotype simulation or AdaptML that is based on the entire core genome would demarcate multiple ecotypes within PE15.
Additional evidence for ecological heterogeneity within PE15 comes from comparisons of gene content. The number of genes unique to a single strain ranged from 22 to 117, and various subclades within PE15 also had unique genes. As expected (33, 47), many of the most newly acquired unique genes reflected the comings and goings of phage and transposons (Fig. 2A). However, every strain and every clade within PE15 had acquired a set of unique genes with known bacterial functions, some possibly with niche-specifying properties. In particular, genes in the carbohydrate system category are among the most likely of the functional, unique genes to have an effect on the ecological niche. Additional evidence for the ecological significance of the unique genes comes from the positive selection analyses on shared genes—the maltose-maltodextrin subsystem, which had some unique genes, also showed positive selection in one gene shared by all strains.
The nature of ecological divergence among ecotypes within PE15.
It would be difficult to pinpoint a single ecological or physiological dimension along which the PE15 ecotypes have diverged from one another. This is because the 38 genes under positive selection in the PE15 clade comprise a great variety of functions divided among multiple system categories (see Table S6 in the supplemental material). Nevertheless, the horizontally acquired genes point to a role for carbohydrate metabolism (see Table S4 in the supplemental material). Two possibilities for ecological divergence based on carbohydrate metabolism involve gene acquisitions in the maltose-maltodextrin and inositol subsystems (Table 4).
Remarkably, in no case has a functional gene unique to a PE15 genome conferred a novel metabolic subsystem function upon that strain. This is well exemplified by the nine additional genes for maltose or maltodextrin usage in G1A4. Because all the PE15 genomes studied already have the minimum set of genes for maltose or maltodextrin usage, we may conclude that the additional genes do not provide a novel resource; indeed, we found that all strains had the ability to use maltose and maltodextrin as their sole carbon source when cultured. The nine additional genes in the G1A4 genome appear to allow for an expanded usage of maltose and maltodextrin. For example, the operon of neopullulanase, maltose phosphorylase, and beta-phosphoglucomutase genes in G1A4 confers two alternative pathways from maltodextrin to glucose (82). Growth tests confirmed that when maltose or maltodextrin was the only carbon source, G1A4 grew significantly better in monoculture than the other strains, and this advantage was corroborated in coculture with one other strain. We note that maltose and maltodextrin are likely resources for soil heterotrophs such as Bacillus, as these are breakdown products of starch, which can be contributed to soil from a great diversity of plant sources.
Unique genes in 14 functional subsystems were similarly found to belong to subsystems present in all other strains (Table 4). Additional genes in the inositol catabolism-utilization pathway did not give G1A3 a growth advantage in our experiments. Therefore, we conclude that recently acquired genes, even if functional, may not always be adaptive and so adaptations must be experimentally confirmed.
In only one case did unique genes confer a subsystem not shared by all members of PE15; this was the case of unique genes in the restriction-modification subsystem, which confers protection against phage but does not directly affect resource usage.
The limited differences among strains in genome content suggest that the ecological distinctness within PE15 is only quantitative in nature, with the various ecotypes appearing to share the same metabolic functions. The sequence differences in shared proteins, as well as the addition of unique genes in shared pathways, may contribute to ecotypes utilizing the same chemical resources but in different microhabitats or in different proportions in the same microhabitat.
Which model of ecological speciation?
The nanoniche model appears to best explain the origin of newly divergent ecotypes within PE15, given the available data, as no members of PE15 show evidence of utilizing any unique resources. This kind of quantitative divergence runs counter to the intuition built up for decades by systematic microbiology, where ecological differences between recognized taxa are usually scored as positive versus negative ability to utilize a given resource (83, 84), and where HGT acquisition of new capabilities is thought to drive speciation (24, 37). However, closely related populations have been shown to be quantitatively divergent in rates of utilization of shared substrates (85) and in expression of shared genes responsible for metabolism of environmental substrates (42). Moreover, the paradigm of quantitative ecological divergence is likely the dominant mode of specialization in animals and plants (54).
The consequences of the nanoniche model are quite different from those of the stable-ecotype model, in which ecotypes have all the qualities attributed to species, including ecological distinctness, cohesiveness, irreversible separateness, and recognition through multilocus sequence analysis. In the nanoniche model, each ecotype is cohesive, in that there may be periodic selection events that purge the diversity within one ecotype. However, the unit of cohesion would extend beyond the individual nanoniche ecotypes to the set of all ecotypes using the same resources, because an adaptive mutation that improved the use of those shared resources could allow one ecotype to outcompete and extinguish the others (Table 1; Fig. 1C). The possibility of such speciation-quashing adaptive mutations means that the individual nanoniche ecotypes are not irreversibly separate (22, 51, 59).
While the nanoniche model is consistent with the set of genomic data at hand, this interpretation must be tentative. One issue is that a larger sample might reveal strains that differ qualitatively in resource utilization (with some resources not shared); however, the available data suggest that the nanoniche model is at least the predominant mode of speciation. A second issue is that some of the unique hypothetical genes in PE15 strains may later be shown to be niche specifying and to provide unshared resources (86).
In that case, there are other possible rapid-speciation models that may explain the high rate of speciation seen within PE15. One possibility is that this taxon is in a moment of adaptive radiation, in which ecologically distinct, cohesive (with recurrent periodic selection events), and irreversibly separate ecotypes (each with unique resources) are being invented at a high rate, as in the speedy-speciation model (Table 1; Fig. 1B). Another possibility is the speciesless model (Fig. 1D), in which a rapid speciation rate is matched by an equally high rate of extinction. This model requires that most ecological niches are ephemeral, such that the demise of a habitat type brings about the extinction of a specialized ecotype, perhaps as seen in the pathogen Neisseria meningitidis (87) or in marine heterotrophs adapting to an ephemeral particle of marine snow (60). We have previously laid out a protocol for distinguishing the speedy-speciation and speciesless models, but this requires a larger sample of fully sequenced genomes (50). Finally, one other model of rapid ecological change is the recurrent niche invasion model, in which a lineage moves in and out of different ecotypes with the acquisition or loss of various niche-specifying plasmids (Fig. 1E). This model may be ruled out in the present case, as the genomes showed no plasmids, and more generally the plasmids of B. subtilis relatives are too small to code for host adaptations (88).
In summary, our focus clade has shown a recent history marked by the rapid formation of new, ecologically distinct populations. This result is consistent with the hypothesis of Doolittle and Zhaxybayeva that nearly every strain among close relatives is ecologically distinct (19). However, there is no evidence that these newly divergent ecotypes are sufficiently distinct to diverge indefinitely; it appears that their divergence may be constrained by competition with one another. The most newly divergent ecotypes have the species-like properties of being ecologically distinct and cohesive, but they appear to lack the property of being irreversibly separate.
Divergence between putative ecotypes.
The genomic comparison of the two putative ecotypes previously identified through three-gene analyses, PE15 and PE10, reveals substantial genetic and ecological divergence. First, the average nucleotide identity between these groups, at 92.9%, is much lower than that within PE15. Also, positive selection on amino acid sequences has accelerated the divergence between PE15 and PE10 in 38 genes across 12 functional systems, much more than was the case for divergence within PE15. Finally, gene content comparisons show 186 and 561 unique genes for PE15 and PE10, respectively, and many of these constitute functional subsystems not present at all in the other ecotype. That is, each putative ecotype appears to utilize resources not available to the other, and they may thus be irreversibly separate. While the previous three-gene analyses missed the ecological heterogeneity within PE15, it appears that the putative ecotypes identified by those analyses represent the most closely related clades that are both ecologically distinct and irreversibly separate.
ACKNOWLEDGMENTS
This work was supported by funds from Connecticut Space Grant fellowships to S.K. and J.W., an NSF FIBR award (EF-0328698) to F.M.C., and research funds from Wesleyan University and the University of Virginia.
Footnotes
Published ahead of print 6 June 2014
Supplemental material for this article may be found at http://dx.doi.org/10.1128/AEM.00576-14.
REFERENCES
- 1.Madigan MT, Martinko JM, Dunlap PV, Clark DP. 2009. Brock biology of microorganisms, 12th ed. Pearson Benjamin Cummings, San Francisco, CA [Google Scholar]
- 2.Schimel J, Balser TC, Wallenstein M. 2007. Microbial stress-response physiology and its implications for ecosystem function. Ecology 88:1386–1394. 10.1890/06-0219 [DOI] [PubMed] [Google Scholar]
- 3.Lawler ML, Brun YV. 2007. Advantages and mechanisms of polarity and cell shape determination in Caulobacter crescentus. Curr. Opin. Microbiol. 10:630–637. 10.1016/j.mib.2007.09.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Boucher Y, Douady CJ, Papke RT, Walsh DA, Boudreau ME, Nesbo CL, Case RJ, Doolittle WF. 2003. Lateral gene transfer and the origins of prokaryotic groups. Annu. Rev. Genet. 37:283–328. 10.1146/annurev.genet.37.050503.084247 [DOI] [PubMed] [Google Scholar]
- 5.McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, Andersen GL, Knight R, Hugenholtz P. 2012. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6:610–618. 10.1038/ismej.2011.139 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Sutcliffe IC. 2010. A phylum level perspective on bacterial cell envelope architecture. Trends Microbiol. 18:464–470. 10.1016/j.tim.2010.06.005 [DOI] [PubMed] [Google Scholar]
- 7.Curtis TP, Heas IM, Lunn M, Sloan WT, Schloss PD, Woodcock S. 2006. What is the extent of prokaryotic diversity? Philos. Trans. R. Soc. Lond. B Biol. Sci. 361:2023–2037. 10.1098/rstb.2006.1921 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Gans J, Wolinsky M, Dunbar J. 2005. Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science 309:1387–1390. 10.1126/science.1112665 [DOI] [PubMed] [Google Scholar]
- 9.Dykhuizen DE. 1998. Santa Rosalia revisited: why are there so many species of bacteria? Antonie Van Leeuwenhoek 73:25–33. 10.1023/A:1000665216662 [DOI] [PubMed] [Google Scholar]
- 10.Schluter D. 2009. Evidence for ecological speciation and its alternative. Science 323:737–741. 10.1126/science.1160006 [DOI] [PubMed] [Google Scholar]
- 11.Doebeli M. 2011. Adaptive diversification. Princeton University Press, Princeton, NJ [Google Scholar]
- 12.Cohan FM, Koeppel AF. 2008. The origins of ecological diversity in prokaryotes. Curr. Biol. 18:R1024–R1034. 10.1016/j.cub.2008.09.014 [DOI] [PubMed] [Google Scholar]
- 13.Koeppel A, Perry EB, Sikorski J, Krizanc D, Warner WA, Ward DM, Rooney AP, Brambilla E, Connor N, Ratcliff RM, Nevo E, Cohan FM. 2008. Identifying the fundamental units of bacterial diversity: a paradigm shift to incorporate ecology into bacterial systematics. Proc. Natl. Acad. Sci. U. S. A. 105:2504–2509. 10.1073/pnas.0712205105 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Cohan FM. 1994. The effects of rare but promiscuous genetic exchange on evolutionary divergence in prokaryotes. Am. Nat. 143:965–986. 10.1086/285644 [DOI] [PubMed] [Google Scholar]
- 15.Hunt DE, David LA, Gevers D, Preheim SP, Alm EJ, Polz MF. 2008. Resource partitioning and sympatric differentiation among closely related bacterioplankton. Science 320:1081–1085. 10.1126/science.1157890 [DOI] [PubMed] [Google Scholar]
- 16.Rainey PB, Travisano M. 1998. Adaptive radiation in a heterogeneous environment. Nature 394:69–72. 10.1038/27900 [DOI] [PubMed] [Google Scholar]
- 17.Treves DS, Manning S, Adams J. 1998. Repeated evolution of an acetate-crossfeeding polymorphism in long-term populations of Escherichia coli. Mol. Biol. Evol. 15:789–797. 10.1093/oxfordjournals.molbev.a025984 [DOI] [PubMed] [Google Scholar]
- 18.Rozen DE, Lenski RE. 2000. Long-term experimental evolution in Escherichia coli. VIII. Dynamics of a balanced polymorphism. Am. Nat. 155:24–35 [DOI] [PubMed] [Google Scholar]
- 19.Doolittle WF, Zhaxybayeva O. 2009. On the origin of prokaryotic species. Genome Res. 19:744–756. 10.1101/gr.086645.108 [DOI] [PubMed] [Google Scholar]
- 20.Shapiro BJ, Friedman J, Cordero OX, Preheim SP, Timberlake SC, Szabo G, Polz MF, Alm EJ. 2012. Population genomics of early events in the ecological differentiation of bacteria. Science 336:48–51. 10.1126/science.1218198 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Haldane JBS. 1932. The causes of evolution. Longmans, Green, and Co, London, United Kingdom [Google Scholar]
- 22.Wiedenbeck J, Cohan FM. 2011. Origins of bacterial diversity through horizontal gene transfer and adaptation to new ecological niches. FEMS Microbiol. Rev. 35:957–976. 10.1111/j.1574-6976.2011.00292.x [DOI] [PubMed] [Google Scholar]
- 23.Vos M. 2011. A species concept for bacteria based on adaptive divergence. Trends Microbiol. 19:1–7. 10.1016/j.tim.2010.10.003 [DOI] [PubMed] [Google Scholar]
- 24.Gogarten JP, Doolittle WF, Lawrence JG. 2002. Prokaryotic evolution in light of gene transfer. Mol. Biol. Evol. 19:2226–2238. 10.1093/oxfordjournals.molbev.a004046 [DOI] [PubMed] [Google Scholar]
- 25.Popa O, Hazkani-Covo E, Landan G, Martin W, Dagan T. 2011. Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. Genome Res. 21:599–609. 10.1101/gr.115592.110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Choi SC, Rasmussen MD, Hubisz MJ, Gronau I, Stanhope MJ, Siepel A. 2012. Replacing and additive horizontal gene transfer in Streptococcus. Mol. Biol. Evol. 29:3309–3320. 10.1093/molbev/mss138 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Croucher NJ, Harris SR, Barquist L, Parkhill J, Bentley SD. 2012. A high-resolution view of genome-wide pneumococcal transformation. PLoS Pathog. 8:e1002745. 10.1371/journal.ppat.1002745 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Feil EJ, Maynard Smith J, Enright MC, Spratt BG. 2000. Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data. Genetics 154:1439–1450 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Hiller NL, Ahmed A, Powell E, Martin DP, Eutsey R, Earl J, Janto B, Boissy RJ, Hogg J, Barbadora K, Sampath R, Lonergan S, Post JC, Hu FZ, Ehrlich GD. 2010. Generation of genic diversity among Streptococcus pneumoniae strains via horizontal gene transfer during a chronic polyclonal pediatric infection. PLoS Pathog. 6:e1001108. 10.1371/journal.ppat.1001108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Zawadzki P, Cohan FM. 1995. The size and continuity of DNA segments integrated in Bacillus transformation. Genetics 141:1231–1243 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Cohan FM. 2001. Bacterial species and speciation. Syst. Biol. 50:513–524. 10.1080/10635150118398 [DOI] [PubMed] [Google Scholar]
- 32.Levin BR, Bergstrom CT. 2000. Bacteria are different: observations, interpretations, speculations, and opinions about the mechanisms of adaptive evolution in prokaryotes. Proc. Natl. Acad. Sci. U. S. A. 97:6981–6985. 10.1073/pnas.97.13.6981 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Touchon M, Hoede C, Tenaillon O, Barbe V, Baeriswyl S, Bidet P, Bingen E, Bonacorsi S, Bouchier C, Bouvet O, Calteau A, Chiapello H, Clermont O, Cruveiller S, Danchin A, Diard M, Dossat C, Karoui ME, Frapy E, Garry L, Ghigo JM, Gilles AM, Johnson J, Le Bouguenec C, Lescat M, Mangenot S, Martinez-Jéhanne Matic VI, Nassif X, Oztas S, Petit MA, Pichon C, Rouy Z, Ruf CS, Schneider D, Tourret J, Vacherie B, Vallenet D, Médigue C, Rocha EP, Denamur E. 2009. Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet. 5:e1000344. 10.1371/journal.pgen.1000344 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Thompson JR, Pacocha S, Pharino C, Klepac-Ceraj V, Hunt DE, Benoit J, Sarma-Rupavtarm R, Distel DL, Polz MF. 2005. Genotypic diversity within a natural coastal bacterioplankton population. Science 307:1311–1313. 10.1126/science.1106028 [DOI] [PubMed] [Google Scholar]
- 35.Walk ST, Alm EW, Gordon DM, Ram JL, Toranzos GA, Tiedje JM, Whittam TS. 2009. Cryptic lineages of the genus Escherichia. Appl. Environ. Microbiol. 75:6534–6544. 10.1128/AEM.01262-09 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Bhaya D, Grossman AR, Steunou AS, Khuri N, Cohan FM, Hamamura N, Melendrez MC, Bateson MM, Ward DM, Heidelberg JF. 2007. Population level functional diversity in a microbial community revealed by comparative genomic and metagenomic analyses. ISME J. 1:703–713. 10.1038/ismej.2007.46 [DOI] [PubMed] [Google Scholar]
- 37.Ochman H, Lawrence JG, Groisman EA. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405:299–304. 10.1038/35012500 [DOI] [PubMed] [Google Scholar]
- 38. Welch RA, Burland V, Plunkett G, III, Redford P, Roesch P, Rasko D, Buckles EL, Liou SR, Boutin A, Hackett J, Stroud D, Mayhew GF, Rose DJ, Zhou S, Schwartz DC, Perna NT, Mobley HL, Donnenberg MS, Blattner FR. 2002. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 99:17020–17024. doi:10.1073/pnas.252529799
- 39. Whittam TS, Bumbaugh AC. 2002. Inferences from whole-genome sequences of bacterial pathogens. Curr. Opin. Genet. Dev. 12:719–725. doi:10.1016/S0959-437X(02)00361-1
- 40. Lefébure T, Stanhope MJ. 2007. Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition. Genome Biol. 8:R71. doi:10.1186/gb-2007-8-5-r71
- 41. Rasko DA, Rosovitz MJ, Myers GS, Mongodin EF, Fricke WF, Gajer P, Crabtree J, Sebaihia M, Thomson NR, Chaudhuri R, Henderson IR, Sperandio V, Ravel J. 2008. The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates. J. Bacteriol. 190:6881–6893. doi:10.1128/JB.00619-08
- 42. Denef VJ, Kalnejais LH, Mueller RS, Wilmes P, Baker BJ, Thomas BC, VerBerkmoes NC, Hettich RL, Banfield JF. 2010. Proteogenomic basis for ecological divergence of closely related bacteria in natural acidophilic microbial communities. Proc. Natl. Acad. Sci. U. S. A. 107:2383–2390. doi:10.1073/pnas.0907041107
- 43. Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R. 2005. The microbial pan-genome. Curr. Opin. Genet. Dev. 15:589–594. doi:10.1016/j.gde.2005.09.006
- 44. Luo C, Walk ST, Gordon DM, Feldgarden M, Tiedje JM, Konstantinidis KT. 2011. Genome sequencing of environmental Escherichia coli expands understanding of the ecology and speciation of the model bacterial species. Proc. Natl. Acad. Sci. U. S. A. 108:7200–7205. doi:10.1073/pnas.1015622108
- 45. Smith NH, Gordon SV, de la Rua-Domenech R, Clifton-Hadley RS, Hewinson RG. 2006. Bottlenecks and broomsticks: the molecular evolution of Mycobacterium bovis. Nat. Rev. Microbiol. 4:670–681. doi:10.1038/nrmicro1472
- 46. Earl AM, Losick R, Kolter R. 2007. Bacillus subtilis genome diversity. J. Bacteriol. 189:1163–1170. doi:10.1128/JB.01343-06
- 47. Deng X, Phillippy AM, Li Z, Salzberg SL, Zhang W. 2010. Probing the pan-genome of Listeria monocytogenes: new insights into intraspecific niche expansion and genomic diversification. BMC Genomics 11:500. doi:10.1186/1471-2164-11-500
- 48. Ward DM. 1998. A natural species concept for prokaryotes. Curr. Opin. Microbiol. 1:271–277. doi:10.1016/S1369-5274(98)80029-5
- 49. Koeppel AF, Wertheim JO, Barone L, Gentile N, Krizanc D, Cohan FM. 2013. Speedy speciation in a bacterial microcosm: new species can arise as frequently as adaptations within a species. ISME J. 7:1080–1091. doi:10.1038/ismej.2013.3
- 50. Cohan FM. 2011. Are species cohesive? A view from bacteriology, p 43–65 In Walk S, Feng P. (ed), Bacterial population genetics: a tribute to Thomas S. Whittam. ASM Press, Washington, DC
- 51. Cohan FM, Perry EB. 2007. A systematics for discovering the fundamental units of bacterial diversity. Curr. Biol. 17:R373–R386. doi:10.1016/j.cub.2007.03.032
- 52. de Queiroz K. 2005. Ernst Mayr and the modern concept of species. Proc. Natl. Acad. Sci. U. S. A. 102(Suppl 1):S6600–S6607. doi:10.1073/pnas.0502030102
- 53. Mallet J. 1995. A species definition for the modern synthesis. Trends Ecol. Evol. 10:294–299. doi:10.1016/0169-5347(95)90031-4
- 54. Schluter D. 2000. The ecology of adaptive radiation. Oxford University Press, Oxford, United Kingdom
- 55. Connor N, Sikorski J, Rooney AP, Kopac S, Koeppel AF, Burger A, Cole SG, Perry EB, Krizanc D, Field NC, Slaton M, Cohan FM. 2010. The ecology of speciation in Bacillus. Appl. Environ. Microbiol. 76:1349–1358. doi:10.1128/AEM.01988-09
- 56. Becraft E, Cohan FM, Kühl M, Jensen S, Ward DM. 2011. Fine-scale distribution patterns of Synechococcus ecological diversity in the microbial mat of Mushroom Spring, Yellowstone National Park. Appl. Environ. Microbiol. 77:7689–7697. doi:10.1128/AEM.05927-11
- 57. Melendrez MC, Lange RK, Cohan FM, Ward DM. 2011. Influence of molecular resolution on sequence-based discovery of ecological diversity among Synechococcus populations in an alkaline siliceous hot spring microbial mat. Appl. Environ. Microbiol. 77:1359–1367. doi:10.1128/AEM.02032-10
- 58. Martiny AC, Tai AP, Veneziano D, Primeau F, Chisholm SW. 2009. Taxonomic resolution, ecotypes and the biogeography of Prochlorococcus. Environ. Microbiol. 11:823–832. doi:10.1111/j.1462-2920.2008.01803.x
- 59. Cohan FM. 2005. Periodic selection and ecological diversity in bacteria, p 78–93 In Nurminsky D. (ed), Selective sweep. Landes Bioscience, Georgetown, TX
- 60. Polz MF, Hunt DE, Preheim SP, Weinreich DM. 2006. Patterns and mechanisms of genetic and phenotypic differentiation in marine microbes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 361:2009–2021. doi:10.1098/rstb.2006.1928
- 61. Fondi M, Fani R. 2010. The horizontal flow of the plasmid resistome: clues from inter-generic similarity networks. Environ. Microbiol. 12:3228–3242. doi:10.1111/j.1462-2920.2010.02295.x
- 62. Kopac S, Cohan FM. 2011. A theory-based pragmatism for discovering and classifying newly divergent bacterial species, p 21–41 In Tibayrenc M. (ed), Genetics and evolution of infectious diseases. Elsevier, London, United Kingdom
- 63. Stefanic P, Decorosi F, Viti C, Petito J, Cohan FM, Mandic-Mulec I. 2012. The quorum sensing diversity within and between ecotypes of Bacillus subtilis. Environ. Microbiol. 14:1378–1389. doi:10.1111/j.1462-2920.2012.02717.x
- 64. Sikorski J, Brambilla E, Kroppenstedt RM, Tindall BJ. 2008. The temperature adaptive fatty acid content in Bacillus simplex strains from “Evolution Canyon,” Israel. Microbiology 154:2416–2426. doi:10.1099/mic.0.2007/016105-0
- 65. Zeigler DR. 2011. The genome sequence of Bacillus subtilis subsp. spizizenii W23: insights into speciation within the B. subtilis complex and into the history of B. subtilis genetics. Microbiology 157:2033–2041. doi:10.1099/mic.0.048520-0
- 66. Kunst F, Ogasawara N, Moszer I, Albertini AM, Alloni G, Azevedo V, Bertero MG, Bessieres P, Bolotin A, Borchert S, Borriss R, Boursier L, Brans A, Braun M, Brignell SC, Bron S, Brouillet S, Bruschi CV, Caldwell B, Capuano V, Carter NM, Choi SK, Codani JJ, Connerton IF, Cummings NJ, Daniel RA, Denizot F, Devine KM, Dusterhoft A, Ehrlich SD, Emmerson PT, Entian KD, Errington J, Fabret C, Ferrari E, Foulger D, Fritz C, Fujita M, Fujita Y, Fuma S, Galizzi A, Galleron N, Ghim SY, Glaser P, Goffeau A, Golightly EJ, Grandi G, Guiseppi G, Guy BJ, Haga K, et al. 1997. The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature 390:249–256. doi:10.1038/36786
- 67. Chevreux BW, Suhai S. 1999. Genome sequence assembly using trace signals and additional sequence information. Proc. German Conf. Bioinformatics 99:45–56
- 68. Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Muller WE, Wetter T, Suhai S. 2004. Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res. 14:1147–1159. doi:10.1101/gr.1917404
- 69. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376–380. doi:10.1038/nature03959
- 70. Darling AC, Mau B, Blattner FR, Perna NT. 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 14:1394–1403. doi:10.1101/gr.2289704
- 71. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195–202. doi:10.1101/gr.8.3.195
- 72. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST server: rapid annotations using subsystems technology. BMC Genomics 9:75. doi:10.1186/1471-2164-9-75
- 73. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY, Cohoon M, de Crecy-Lagard V, Diaz N, Disz T, Edwards R, Fonstein M, Frank ED, Gerdes S, Glass EM, Goesmann A, Hanson A, Iwata-Reuyl D, Jensen R, Jamshidi N, Krause L, Kubal M, Larsen N, Linke B, McHardy AC, Meyer F, Neuweger H, Olsen G, Olson R, Osterman A, Portnoy V, Pusch GD, Rodionov DA, Ruckert C, Steiner J, Stevens R, Thiele I, Vassieva O, Ye Y, Zagnitko O, Vonstein V. 2005. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 33:5691–5702. doi:10.1093/nar/gki866
- 74. Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol. Biol. 4:18. doi:10.1186/1471-2148-4-18 [Retracted]
- 75. Laing C, Buchanan C, Taboada EN, Zhang Y, Kropinski A, Villegas A, Thomas JE, Gannon VP. 2010. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions. BMC Bioinformatics 11:461. doi:10.1186/1471-2105-11-461
- 76. Yang Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555–556
- 77. Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24:1586–1591. doi:10.1093/molbev/msm088
- 78. Martin DP. 2009. Recombination detection and analysis using RDP3. Methods Mol. Biol. 537:185–205. doi:10.1007/978-1-59745-251-9_9
- 79. Anagnostopoulos C, Spizizen J. 1961. Requirements for transformation in Bacillus subtilis. J. Bacteriol. 81:741–746
- 80. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG. 2012. Primer3—new capabilities and interfaces. Nucleic Acids Res. 40:e115. doi:10.1093/nar/gks596
- 81. Papke RT, Zhaxybayeva O, Feil EJ, Sommerfeld K, Muise D, Doolittle WF. 2007. Searching for species in haloarchaea. Proc. Natl. Acad. Sci. U. S. A. 104:14092–14097. doi:10.1073/pnas.0706358104
- 82. Schönert S, Seitz S, Krafft H, Feuerbaum EA, Andernach I, Witz G, Dahl MK. 2006. Maltose and maltodextrin utilization by Bacillus subtilis. J. Bacteriol. 188:3911–3922. doi:10.1128/JB.00213-06
- 83. Vandamme P, Pot B, Gillis M, de Vos P, Kersters K, Swings J. 1996. Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol. Rev. 60:407–438
- 84. Gordon RE, Haynes WC, Pang CH. 1973. The genus Bacillus. Agricultural handbook no. 427. Agricultural Research Service, US Department of Agriculture, Washington, DC
- 85. Feldgarden M, Byrd N, Cohan FM. 2003. Gradual evolution in bacteria: evidence from Bacillus systematics. Microbiology 149:3565–3573. doi:10.1099/mic.0.26457-0
- 86. Sommer MO, Dantas G, Church GM. 2009. Functional characterization of the antibiotic resistance reservoir in the human microflora. Science 325:1128–1131. doi:10.1126/science.1176950
- 87. Achtman M, Wagner M. 2008. Microbial diversity and the genetic nature of microbial species. Nat. Rev. Microbiol. 6:431–440
- 88. Zawadzki P, Riley MA, Cohan FM. 1996. Homology among nearly all plasmids infecting three Bacillus species. J. Bacteriol. 178:191–198