Abstract
Spontaneous mutations are ultimately essential for evolutionary change and are also the root cause of many diseases. However, until recently, both biological and technical barriers have prevented detailed analyses of mutation profiles, constraining our understanding of the mutation process to a few model organisms and leaving major gaps in our understanding of the role of genome content and structure on mutation. Here, we present a genome-wide view of the molecular mutation spectrum in Burkholderia cenocepacia, a clinically relevant pathogen with high %GC content and multiple chromosomes. We find that B. cenocepacia has low genome-wide mutation rates with insertion–deletion mutations biased toward deletions, consistent with the idea that deletion pressure reduces prokaryotic genome sizes. Unlike prior studies of other organisms, mutations in B. cenocepacia are not AT biased, which suggests that at least some genomes with high %GC content experience unusual base-substitution mutation pressure. Importantly, we also observe variation in both the rates and spectra of mutations among chromosomes and elevated G:C > T:A transversions in late-replicating regions. Thus, although some patterns of mutation appear to be highly conserved across cellular life, others vary between species and even between chromosomes of the same species, potentially influencing the evolution of nucleotide composition and genome architecture.
Keywords: mutation rate, genome evolution, replication timing, mutation spectra
As the ultimate source of genetic variation, mutation is implicit in every aspect of genetics and evolution. However, as a result of the genetic burden imposed by deleterious mutations, remarkably low mutation rates have evolved across all of life, making detection of these rare events technologically challenging and accurate measures of mutation rates and spectra exceedingly difficult (Kibota and Lynch 1996; Lynch and Walsh 1998; Sniegowski et al. 2000; Lynch 2011; Fijalkowska et al. 2012; Zhu et al. 2014). Until recently, most estimates of mutational properties have been derived indirectly using comparative genomics at putatively neutral sites (Graur and Li 2000; Wielgoss et al. 2011) or by extrapolation from small reporter-construct studies (Drake 1991). Both of these methods are subject to potentially significant biases, as many putatively neutral sites are subject to selection and mutation rates can vary substantially among different genomic regions (Lynch 2007).
To avoid the potential biases of these earlier methods, pairing classic mutation accumulation (MA) with whole-genome sequencing (WGS) has become the preferred method for obtaining direct measures of mutation rates and spectra (Lynch et al. 2008; Denver et al. 2009; Ossowski et al. 2010; Lee et al. 2012; Sung et al. 2012a,b; Heilbron et al. 2014). Using this strategy, a single clonal ancestor is used to initiate several replicate lineages that are subsequently passaged through repeated single-cell bottlenecks for several thousand generations. The complete genomes of each evolved lineage are then sequenced and compared with the other lines to identify de novo mutations occurring over the course of the experiment. The bottlenecking regime minimizes the ability of natural selection to eliminate deleterious mutations, and the parallel sequencing provides a large enough body of information to yield a nearly unbiased picture of the natural mutation spectrum of the study organism (Lynch et al. 2008).
The MA–WGS method has now been used to examine mutational processes in several model eukaryotic and prokaryotic species, yielding a number of apparently generalizable conclusions about mutation rates and spectra. For example, a negative scaling between base-substitution mutation rates and both effective population size (Ne) and the amount of coding DNA supports the hypothesis that the refinement of replication fidelity that can be achieved by selection is determined by the power of random genetic drift among phylogenetic lineages (Lynch 2011; Sung et al. 2012a). This “drift-barrier hypothesis” therefore predicts that organisms with very large population sizes such as some bacteria should have evolved very low mutation rates (Lee et al. 2012; Sung et al. 2012a; Foster et al. 2013). Universal transition and G:C > A:T biases have also been observed in all MA studies to date (Lind and Andersson 2008; Lynch et al. 2008; Denver et al. 2009; Ossowski et al. 2010; Lee et al. 2012; Sung et al. 2012a,b), corroborating previous findings using indirect methods (Hershberg and Petrov 2010; Hildebrand et al. 2010). However, several additional characteristics of mutation spectra vary among species (Lynch et al. 2008; Denver et al. 2009; Ossowski et al. 2010; Lee et al. 2012; Sung et al. 2012a,b), and examining the role of genome architecture, size, and lifestyle in producing these idiosyncrasies will require a considerably larger number of detailed MA studies. Among bacterial species that have been subjected to mutational studies, genomes with high %GC content are particularly sparse and no studies have been conducted on bacteria with multiple chromosomes, a genome architecture of many important bacterial species (e.g., Vibrio, Brucella, and Burkholderia).
Burkholderia cenocepacia is a member of the B. cepacia complex, a diverse group of bacteria with important clinical implications for patients with cystic fibrosis (CF), where they can form persistent lung infections and highly resistant biofilms (Coenye et al. 2004; Mahenthiralingam et al. 2005; Traverse et al. 2013). The core genome of B. cenocepacia HI2424 has a high %GC content (66.8%) and harbors three chromosomes, each containing rDNA operons (LiPuma et al. 2002), although the third chromosome can be eliminated under certain conditions (Agnoli et al. 2012). The primary chromosome (Chr1) is ∼3.48 Mb and contains 3253 genes; the secondary chromosome (Chr2) is ∼3.00 Mb and contains 2709 genes; and the tertiary chromosome (Chr3) is ∼1.06 Mb and contains 929 genes. In addition, B. cenocepacia HI2424 contains a 0.164-Mb plasmid, which contains 159 genes and lower %GC content than the core genome (62.0%). Although the %GC content is consistent across the three core chromosomes, the proportion of coding DNA declines from Chr1 to Chr3, while the synonymous and nonsynonymous substitution rates increase from Chr1 to Chr3 (Cooper et al. 2010; Morrow and Cooper 2012). Whether this variation in evolutionary rate is driven by variation in nonadaptive processes like mutation bias or variation in the relative strength of purifying selection remains a largely unanswered question in the evolution of bacteria with multiple chromosomes.
Here, we applied whole-genome sequencing to 47 MA lineages derived from B. cenocepacia HI2424 that were evolved in the near absence of natural selection for >5550 generations each. We identified a total of 282 mutations spanning all three replicons and the plasmid, enabling a unique perspective on interchromosomal variation in both mutation rate and spectra, in a bacterium with the highest %GC content studied with MA–WGS to date.
Materials and Methods
Mutation accumulation
Seventy-five independent lineages were founded by single cells derived from a single colony of B. cenocepacia HI2424, a soil isolate that had only previously been passaged in the laboratory during isolation (Coenye and LiPuma 2003). Independent lineages were then serially propagated every 24 hr onto fresh high nutrient tryptic soy agar (TSA) plates (30 g/liter tryptic soy broth (TSB) powder, 15 g/liter agar). Two lineages were maintained on each plate at 37°, and the isolated colony closest to the base of each plate half was chosen for daily restreaking. Following 217 days of MA, frozen stocks of all lineages were prepared by growing a final colony per isolate in 5 ml TSB (30 g/liter TSB) overnight at 37°, and freezing in 8% DMSO at −80°.
Daily generation times were estimated each month by placing a single representative colony from each line in 2 ml of phosphate-buffered saline (80 g/liter NaCl, 2 g/liter KCl, 14.4 g/liter Na2HPO4·2H2O, 2.4 g/liter KH2PO4), serially diluting to $10^{-3}$, and spread plating 100 μl on TSA. By counting the colonies on the resultant TSA plate, we calculated the number of viable cells in a single colony and thus the number of generations between each transfer. The average generation time across all lines was then calculated and used as the daily generation time for that month. These generation-time measurements were used to evaluate potential effects of declining colony size over the course of the MA experiment as a result of mutational load, a phenotype that was observed (see Supporting Information, Figure S1). Final generation numbers per line were estimated as the sum of monthly generation estimates, which were derived by multiplying the number of generations per day in that month by the number of days between measurements (see Figure S1).
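The generation estimate described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: it assumes that each colony grows by binary fission from a single founding cell, so that the number of generations is the base-2 logarithm of the viable cell count. The function names and the example plate count are hypothetical.

```python
import math

def viable_cells_per_colony(plate_count, dilution=1e-3,
                            resuspension_ml=2.0, plated_ml=0.1):
    """Estimate viable cells in one colony from a spread-plate count.

    Defaults follow the protocol in the text: colony resuspended in 2 ml,
    diluted to 10^-3, and 100 ul (0.1 ml) plated.
    """
    cells_per_ml = plate_count / (dilution * plated_ml)
    return cells_per_ml * resuspension_ml

def generations_per_transfer(cells_in_colony):
    """Doublings needed to reach the colony size from one cell (assumption)."""
    return math.log2(cells_in_colony)

def total_generations(gens_per_day_by_month, days_by_month):
    """Sum of (generations per day x days) over measurement intervals."""
    return sum(g * d for g, d in zip(gens_per_day_by_month, days_by_month))

# Hypothetical example: 150 colonies counted on the dilution plate.
cells = viable_cells_per_colony(150)
gens = generations_per_transfer(cells)
```

Because the estimate depends on the logarithm of the cell count, even a twofold counting error changes the result by only one generation per transfer.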
DNA extraction and sequencing
Genomic DNA was extracted from 1 ml of overnight culture inoculated from 47 frozen derivatives of MA lines and the ancestor of the MA experiments using the Wizard Genomic DNA Purification kit (Promega). Following library preparation, sequencing was performed using the 151-bp paired-end Illumina HiSeq platform at the University of New Hampshire Hubbard Center for Genomic Studies with an average fragment size between paired-end reads of ∼386 bp. All of our raw fastQ files were analyzed using fastQC, but filtering was not performed as all files passed each of the fastQC modules. The average Phred score was 34 for both forward and reverse reads, with moderate declines in base quality from the beginning to the end of reads. All forward and reverse reads for each isolate and the ancestor were individually mapped to the reference genome of B. cenocepacia HI2424 (LiPuma et al. 2002), with both the Burrows–Wheeler aligner (BWA) (Li and Durbin 2009) and Novoalign (www.novocraft.com), producing an average sequence depth of ∼50 times.
Base-substitution mutation identification
To identify base-substitution mutations, the SAM alignment files that were produced by each reference aligner were first converted to mpileup format using SAMtools (Li et al. 2009). Forward and reverse read alignments were then produced for each position in each line using in-house perl scripts. Next, a three-step process was used to detect polymorphisms. First, an ancestral consensus base was called at each site in the reference genome as the base with the highest support among pooled reads across all lines, as long as there were at least three lines with sufficient coverage to identify a base. Importantly, this allows us to correct any differences between the published B. cenocepacia HI2424 reference and our ancestor, leveraging the enormous power of having sequenced 48 nearly isogenic isolates. Second, lineage specific consensus bases were called using the reads for each individually sequenced isolate, as long as the site was covered by at least two forward and two reverse reads, and at least 80% of those reads identified the same base. Lastly, these informative sites shared by MA lines and the ancestor were compared to identify putative base-substitution mutations. These mutations were considered genuine if both aligners independently identified the mutation and they were only identified in a single lineage. Base substitutions shared by more than one lineage were considered to have occurred in the founding colony and were therefore only counted once, though this was only observed in two cases. Sites that did not meet the criteria for being informative in both an individual line and the ancestor were excluded from our analyses and do not contribute to the value of sites analyzed (n), described below for mutation rate calculations.
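The lineage-specific consensus rule and the comparison with the ancestor can be illustrated as follows. This is a sketch under stated assumptions rather than the in-house perl scripts mentioned above: reads are represented as simple strings of base calls per strand, and the function names are hypothetical. The thresholds (two forward reads, two reverse reads, 80% agreement) are those given in the text.

```python
from collections import Counter

def lineage_consensus(forward_bases, reverse_bases,
                      min_per_strand=2, min_fraction=0.8):
    """Return the consensus base at one site in one line, or None.

    A site is informative only if it is covered by at least two forward
    and two reverse reads and at least 80% of reads agree on the base.
    """
    if len(forward_bases) < min_per_strand or len(reverse_bases) < min_per_strand:
        return None
    reads = list(forward_bases) + list(reverse_bases)
    base, support = Counter(reads).most_common(1)[0]
    return base if support / len(reads) >= min_fraction else None

def putative_substitutions(ancestral_base, line_bases):
    """List the lines whose consensus differs from the ancestral consensus.

    line_bases maps a line identifier to its consensus base (or None when
    the site is uninformative in that line).
    """
    if ancestral_base is None:
        return []
    return sorted(line for line, base in line_bases.items()
                  if base is not None and base != ancestral_base)
```

In this sketch, a site flagged in exactly one line would be a candidate mutation, which would then still need to be confirmed by the second aligner as described above.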
Although the above criteria for identifying individual line bases and overall consensus bases are relatively lenient, given our coverage of ∼50 times for individual lines, both the coverage and support for all substitutions that were called dramatically exceeded these criteria, demonstrating that we were not simply obtaining false positives in regions of lower coverage (see File S1). Furthermore, our ancestral strain was sequenced at the same depth as the derived strains and included as a 48th isolate in our analysis. As expected, no mutations were identified in the ancestral strain. These same methods have been used to identify base-substitution mutations in both Escherichia coli and Bacillus subtilis MA lines, where 19 of 19 and 69 of 69 base-substitution mutations called were confirmed by conventional sequencing, respectively (Lee et al. 2012; Sung et al. 2012a). Thus, these criteria are unlikely to result in false positives, while allowing us to cover the majority of the B. cenocepacia genome and reduce false negatives.
Insertion–deletion mutation identification
For insertion–deletion mutations (indels), inherent difficulties with gaps and repeat elements can reduce agreement in the alignment of single reads using short-read alignment algorithms, even in the case of true indels. Thus, putative indels were first extracted from both BWA and Novoalign at all sites where at least two forward and two reverse reads covered an indel, and at least 30% of those reads identified the exact same indel (size and motif). Next, the alignment output was additionally passed through the pattern-growth algorithm PINDEL to verify putative indels from the alignment and identify larger indels using paired-end information (Ye et al. 2009). Here, a total of 20 reads, including at least 6 forward and 6 reverse reads, were required to extract a putative indel. Putative indels were only kept as true indels for further analysis if (a) they were independently identified by both alignment algorithms and PINDEL, and at least 50% of the full-coverage reads (>25 bases on both sides of the indel) from the initial alignment identified the mutation; (b) they were identified only by BWA and Novoalign, and at least 80% of the good-coverage reads from the initial alignment identified the mutation; or (c) they were larger indels that were only identified by the stricter requirements of PINDEL. Moreover, if an indel was identified in more than half of the lineages, we considered it to be an ancestral indel and excluded it from further analyses.
Unlike base-substitution mutations, many reads that cover an indel mutation may fail to identify the mutation because they lack sufficient coverage on both sides of the mutation to anchor the read to the reference genome, particularly when they occur in simple sequence repeats. Therefore, applying the initially lenient filter to extract putative indels is justified to identify all potential indels. By then focusing only on the good-coverage reads and applying an independent paired-end indel identifier (PINDEL), we can filter out indels that are more likely to be false positives, while keeping only the high-concordance indels supported by multiple algorithms. Although there remains more uncertainty with indel calls than with base-substitution mutations, we are confident that we have obtained an accurate picture of the naturally occurring indels from this study because of the high concordance across algorithms and reads (see File S1; Figure S3), and the fact that no indels were called independently by more than two lines (see Figure S4). A complete list of the indels identified in this study, along with the algorithms that identified them, their coverage, and concordance across well-covered reads can be found in File S1.
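The three acceptance rules (a) to (c) and the ancestral-indel filter can be expressed as a small decision function. The sketch below is illustrative only: the boolean flags are assumed to already reflect each tool's own read-count requirements, and all names are hypothetical rather than taken from the authors' pipeline.

```python
def keep_indel(called_by_bwa, called_by_novoalign, called_by_pindel,
               well_covered_support, is_large=False):
    """Apply rules (a)-(c) to one putative indel.

    well_covered_support is the fraction of reads with >25 bases on both
    sides of the indel that identify the mutation.
    """
    both_aligners = called_by_bwa and called_by_novoalign
    if both_aligners and called_by_pindel:
        return well_covered_support >= 0.5   # rule (a)
    if both_aligners:
        return well_covered_support >= 0.8   # rule (b)
    if called_by_pindel and is_large:
        return True                          # rule (c): larger, PINDEL-only indels
    return False

def is_ancestral(lines_with_indel, total_lines=47):
    """Indels found in more than half of the lineages are treated as ancestral."""
    return lines_with_indel > total_lines / 2
```

The support threshold is lower when all three tools agree and stricter when only the two aligners do, which mirrors the trade-off described in the text between independent confirmation and read-level concordance.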
Mutation-rate analysis
Once a complete set of mutations had been identified in each lineage, we calculated the substitution and indel mutation rates for each line using the equation $\mu = m/(nT)$, where μ represents the mutation rate (μbs for base-substitution mutations (bps), μindel for insertion–deletion mutations (indels)), m represents the number of mutations observed, n represents the number of sites that had sufficient depth and consensus to analyze, and T represents the total generations over the course of the MA study for an individual line. The standard error of the mutation rate for each line was measured as described previously with the equation $SE = \sqrt{\mu/(nT)}$ (Denver et al. 2004, 2009).
The final μbs and μindel for B. cenocepacia were calculated by taking the average μ of all sequenced lineages, and the total standard error was calculated as the standard deviation of the mutation rates across all lines (s) divided by the square root of the number of lines analyzed (N): $SE_{pooled} = s/\sqrt{N}$. Specific base-substitution mutation rates were further divided into conditional rates for each substitution type, again using the equation $\mu_{bs} = m/(nT)$, where m is the number of substitutions of a particular type, and n is the number of ancestral bases that can lead to each substitution with sufficient depth and consensus to analyze. The conditional substitution rates at seven multi-locus sequence typing (MLST) loci (atpD, gltB, gyrB, lepA, phaC, recA, and trpB) were calculated under the assumption that the most common nucleotide was the ancestral state and any deviation from that ancestral state occurred only once and spread through the population (Jolley and Maiden 2010). We then estimated conditional substitution rates as $\mu = m/n$, as described above.
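The rate calculations in this section reduce to a few arithmetic steps, sketched below. The sketch assumes the per-line rate is the number of mutations divided by the product of sites analyzed and generations, and that the per-line standard error takes the square-root form commonly used in MA studies. The use of the sample standard deviation for the pooled error is also an assumption, since the text does not specify it. Function names and example values are hypothetical.

```python
import math
import statistics

def mutation_rate(m, n, T):
    """Per-site, per-generation rate: m mutations over n sites and T generations."""
    return m / (n * T)

def line_standard_error(mu, n, T):
    """Assumed per-line standard error, sqrt(mu / (n * T))."""
    return math.sqrt(mu / (n * T))

def pooled_rate_and_sem(line_rates):
    """Mean rate across lines and its standard error, s / sqrt(N).

    Uses the sample standard deviation (an assumption).
    """
    mean_rate = statistics.mean(line_rates)
    sem = statistics.stdev(line_rates) / math.sqrt(len(line_rates))
    return mean_rate, sem

# Hypothetical line: 5 substitutions, 7.5e6 analyzable sites, 5554 generations.
mu_line = mutation_rate(5, 7.5e6, 5554)
se_line = line_standard_error(mu_line, 7.5e6, 5554)
```

Conditional rates for a given substitution type follow from the same function by passing only the mutations of that type as `m` and only the ancestral G:C or A:T sites as `n`.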
Calculation of GE, πs, and Ne
Effective genome size (GE) was determined as the total coding bases in the B. cenocepacia genome. Silent site diversity (πs) was derived using the MLST loci described above, which were concatenated and aligned using BIGSdb (Jolley and Maiden 2010) and analyzed using DNAsp (Librado and Rozas 2009). Using the value of μbs obtained in this study, Ne was estimated by dividing the value of πs by 2μbs ($\pi_s = 2N_e\mu_{bs}$) (Kimura 1983).
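As a worked illustration of the last step, the haploid relationship between silent-site diversity, effective population size, and the base-substitution rate can be rearranged to solve for Ne. The value of πs used below is purely hypothetical and is chosen only to show the arithmetic; it is not a result from this study.

```python
def effective_population_size(pi_s, mu_bs):
    """Haploid estimate: Ne = pi_s / (2 * mu_bs)."""
    return pi_s / (2 * mu_bs)

# Hypothetical silent-site diversity with a rate of 1.33e-10 per site per generation.
ne_example = effective_population_size(0.0266, 1.33e-10)
```

Because the mutation rate sits in the denominator, a lower estimate of μbs yields a proportionally larger estimate of Ne for the same level of diversity.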
Results
A classic mutation-accumulation experiment was carried out for 217 days with 75 independent lineages all derived from the same ancestral colony of B. cenocepacia HI2424 (LiPuma et al. 2002). This method founds a new population each day from a single cell, which limits the efficiency with which natural selection can purge deleterious and enrich beneficial mutations. Measurements of generations of growth per day were taken monthly and varied from 26.2 ± 0.12 to 24.9 ± 0.14 (mean ± 95% C.I. of highest and lowest measurements, respectively) (see Figure S1), resulting in an average of 5554 generations per line over the course of the MA experiment. Thus, across the 47 lines whose complete genomes were sequenced, we were able to visualize the natural mutation spectrum of B. cenocepacia HI2424 over 261,038 generations of MA.
From the comparative sequence data, we identified 245 bps, 33 indels (with sizes in the range of 1 to 145 base pairs), and four plasmid-loss events spanning the entire genome (Figure 1). With means of 5.21 bps and 0.70 indel mutations per line, the distribution of bps and indels across individual lines did not differ significantly from a Poisson distribution (bps: χ2 = 1.81, P = 0.99; indels: χ2 = 0.48, P = 0.92), indicating that mutation rates did not vary over the course of the MA experiment (see Figure S2).
Mutation-accumulation experiments rely on the basic principle that when the effective population size (Ne) is sufficiently reduced, the efficiency of selection is minimized to the point at which nearly all mutations become fixed by genetic drift with equal probability (Kibota and Lynch 1996). Ne in this mutation-accumulation study was calculated to be ∼12.86, using the harmonic mean of the population size over 24 hr of colony growth (Hall et al. 2008). The threshold selective coefficient below which genetic drift will overpower natural selection is Ne × s = 1 for haploid organisms (Lynch 2007). Thus, only mutations conferring adaptive or deleterious effects of s > 0.078 would be subject to the biases of natural selection in this study, which is expected to be a very small fraction of mutations (Kimura 1983; Elena et al. 1998; Zeyl and Devisser 2001; Hall et al. 2008).
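The reasoning in this paragraph can be made concrete with a small calculation. The sketch below is a simplified model, not the exact procedure of Hall et al. (2008): it assumes a population that starts from one cell and doubles synchronously for a whole number of generations each day. Under that assumption the harmonic mean comes out close to 13, in the neighborhood of the reported value of ∼12.86. The threshold selection coefficient then follows from Ne × s = 1.

```python
def harmonic_mean_population(generations):
    """Harmonic mean of N_t = 2**t for t = 0..generations.

    Models one day of colony growth after a single-cell bottleneck,
    assuming synchronous doubling (a simplification).
    """
    sizes = [2 ** t for t in range(generations + 1)]
    return len(sizes) / sum(1 / n for n in sizes)

def drift_threshold(ne):
    """Selection coefficient at which Ne * s = 1 (haploid)."""
    return 1 / ne

ne_model = harmonic_mean_population(25)   # roughly 25 generations per day
s_threshold = drift_threshold(12.86)      # using the Ne reported in the text
```

The harmonic mean is dominated by the smallest population sizes, which is why a colony that reaches tens of millions of cells still has an effective size of only about a dozen.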
Given the codon usage and %GC content of synonymous and nonsynonymous sites in B. cenocepacia HI2424, 27.8% of coding substitutions are expected to be synonymous in the absence of natural selection. The observed percentage of synonymous substitutions (25.5%) did not differ significantly from this null expectation (χ2 = 0.54, d.f. = 1, P = 0.46). Further, we find limited evidence of positive selection since parallel evolution among base-substitution mutations is rare in this study; no gene is hit more than twice across any of the 47 independently derived lineages (see File S1). Although both base substitutions (χ2 = 4.20, d.f. = 1, P = 0.04) and indels (χ2 = 21.3, d.f. = 1, P < 0.0001) were biased to noncoding DNA, evidence exists that mismatch repair preferentially repairs damage in coding regions, which can create artificial signatures of selection in MA experiments (Lee et al. 2012). Thus, our overall observations are consistent with our MA experiment inducing limited selection on the mutation spectra, at least as far as base substitutions are concerned.
Low base-substitution and indel mutation rates
The preceding results imply that base-substitution and indel mutation rates for B. cenocepacia are 1.33 (0.008) × $10^{-10}$/bp/generation and 1.68 (0.003) × $10^{-11}$/bp/generation (SEM), respectively. Based on the 7.70 Mb genome size, these per-base mutation rates correspond to a genome-wide base-substitution mutation rate of only 0.001/genome/generation, and an indel mutation rate of only 0.0001/genome/generation. Although the ∼1:3 ratio of synonymous to nonsynonymous substitutions is consistent with negligible influence of selection on base-substitution mutations in this study, too few indels occurred to evaluate a signature of selection, but their scarcity could reflect some selective loss of genotypes with loss-of-function mutations (Heilbron et al. 2014; Zhu et al. 2014).
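The conversion from per-base to genome-wide rates is a single multiplication, shown here as a quick check. The replicon sizes are the approximate values given in the Introduction; the variable and function names are hypothetical.

```python
# Approximate replicon sizes in Mb (Chr1, Chr2, Chr3, plasmid), as stated in the text.
REPLICON_SIZES_MB = [3.48, 3.00, 1.06, 0.164]
GENOME_SIZE_BP = 7.70e6

def per_genome_rate(per_bp_rate, genome_size_bp=GENOME_SIZE_BP):
    """Expected mutations per genome per generation."""
    return per_bp_rate * genome_size_bp

genome_bs_rate = per_genome_rate(1.33e-10)     # base substitutions
genome_indel_rate = per_genome_rate(1.68e-11)  # indels
```

At roughly one base substitution per thousand genome replications, a line passaged for about 5554 generations would be expected to accumulate on the order of five substitutions, in line with the observed mean of 5.21 per line.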
Base-substitution mutations are not AT biased
One of the central motivations for studying the molecular mutation spectrum of B. cenocepacia was its high %GC content (66.8%). A universal mutation bias in the direction of AT has been observed in all other wild-type species studied by MA (Table 1) and has also been inferred in comparative analyses of several bacterial species, including B. pseudomallei (Lynch et al. 2008; Denver et al. 2009, 2012; Keightley et al. 2009; Hershberg and Petrov 2010; Hildebrand et al. 2010; Lynch 2010a; Ossowski et al. 2010; Sung et al. 2012a,b; Lee et al. 2012; Schrider et al. 2013; Zhu et al. 2014). Thus, biased gene conversion and selection have been invoked to explain the high %GC content realized in many genomes (Lynch et al. 2008; Duret and Galtier 2009; Raghavan et al. 2012; Zhu et al. 2014; Lassalle et al. 2015). Our data for B. cenocepacia are inconsistent with prior published studies showing a mutation bias in the direction of AT (Table 1), but also suggest that biased gene conversion and/or selection must have mostly generated the realized %GC content of B. cenocepacia, which is substantially higher than expected based on mutation pressure alone.
Table 1. Comparison of mutation bias in B. cenocepacia with the mutation bias of seven other species with large mutation datasets and intact mismatch repair (two prokaryotes, five eukaryotes).
Organism (%GC) | A:T > G:C (transition) | G:C > A:T (transition) | A:T > T:A (transversion) | G:C > T:A (transversion) | A:T > C:G (transversion) | G:C > C:G (transversion) | AT bias | Reference |
---|---|---|---|---|---|---|---|---|
B. cenocepacia (0.67) | 6.88 | 7.51 | 2.67 | 2.44 | 5.35 | 2.38 | 0.81 | This study |
E. coli (0.51)a | 8.74 | 13.71 | 2.80 | 5.08 | 6.64 | 2.88 | 1.22 | Lee et al. (2012) |
M. florum (0.27)a | 50.93 | 640.83 | 15.67 | 360.14 | 11.75 | 185.36 | 15.97 | Sung et al. (2012a) |
H. sapiens (0.45) | 429.00 | 961.00 | 129.00 | 258.00 | 152.00 | 295.00 | 2.10 | Lynch (2010) |
D. melanogaster (0.42)a | 101.13 | 513.19 | 98.06 | 130.76 | 48.01 | 74.52 | 4.32 | Schrider et al. (2013) |
S. cerevisiae (0.38)a | 7.13 | 17.86 | 3.03 | 9.69 | 5.30 | 7.82 | 2.22 | Zhu et al. (2014) |
A. thaliana (0.36)a | 104.54 | 896.30 | 43.56 | 139.08 | 60.98 | 123.63 | 6.26 | Ossowski et al. (2010) |
C. elegans (0.35)a | 16.47 | 57.23 | 17.50 | 44.09 | 7.72 | 16.89 | 4.19 | Denver et al. (2012) |
The strength of the AT-mutation bias, calculated as the ratio of the conditional mutation rates in the G:C > A:T direction to the conditional mutation rates in the A:T > G:C direction, is substantially higher in all other species than in B. cenocepacia.
Conditional mutation rates (× $10^{11}$) are calculated as the number of each mutation type, divided by the product of the number of generations and the total A:T or G:C sites in each respective reference genome if the raw data are not directly available in the cited reference (Wei et al. 2014).
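The AT-bias column of Table 1 can be recomputed from the conditional rates in the same row. The sketch below assumes that the "G:C > A:T direction" combines G:C > A:T transitions with G:C > T:A transversions, and that the "A:T > G:C direction" combines A:T > G:C transitions with A:T > C:G transversions. This reading reproduces the tabulated values for the rows checked here, but it is an interpretation of the table note rather than a stated formula.

```python
# Conditional rates (x 10^11) copied from Table 1, in column order:
# A:T>G:C, G:C>A:T, A:T>T:A, G:C>T:A, A:T>C:G, G:C>C:G
TABLE1_RATES = {
    "B. cenocepacia": (6.88, 7.51, 2.67, 2.44, 5.35, 2.38),
    "E. coli": (8.74, 13.71, 2.80, 5.08, 6.64, 2.88),
    "S. cerevisiae": (7.13, 17.86, 3.03, 9.69, 5.30, 7.82),
}

def at_bias(at_gc, gc_at, at_ta, gc_ta, at_cg, gc_cg):
    """Ratio of AT-ward to GC-ward conditional rates (assumed definition)."""
    toward_at = gc_at + gc_ta   # changes that replace a G:C pair with an A:T pair
    toward_gc = at_gc + at_cg   # changes that replace an A:T pair with a G:C pair
    return toward_at / toward_gc

bias_by_species = {name: round(at_bias(*rates), 2)
                   for name, rates in TABLE1_RATES.items()}
```

A:T > T:A and G:C > C:G transversions do not change base composition, so they do not enter the ratio.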
In comparing the relative rates of G:C > A:T transition and G:C > T:A transversion mutations with those of A:T > G:C transitions and A:T > C:G transversions, corrected for the ratio of G:C to A:T sites analyzed in this study, we found that substitutions in the G:C direction were 17% more frequent than mutations in the A:T direction per base pair, although the conditional rates were not significantly different (χ2 = 0.91, d.f. = 1, P = 0.33). The lack of mutational bias in the A:T direction can largely be attributed to A:T > C:G transversions occurring at significantly higher rates than any other transversion type, most notably the G:C > T:A transversions (χ2 = 8.68, d.f. = 1, P = 0.0032). However, A:T > G:C transitions also occurred at nearly the same rate as G:C > A:T transitions, the latter of which have been the most commonly observed substitution in other studies, putatively due to deamination of cytosine or 5-methyl-cytosine (Figure 2) (Lee et al. 2012; Sung et al. 2012b; Zhu et al. 2014).
Using the ratio of the conditional rate of mutation in the G:C direction to that in the A:T direction (x), the expected %GC content under mutation-drift equilibrium is x/(1 + x) = 0.539 ± 0.043 (SEM). Therefore, although mutation pressure in B. cenocepacia does not favor AT bases, it is clear that the observed mutation bias is not sufficient to elicit the realized %GC content of 66.8%. Thus, either the B. cenocepacia genome is still moving toward mutation-drift equilibrium, or GC-biased gene conversion and/or natural selection are responsible for the observed %GC content (Lynch et al. 2008; Duret and Galtier 2009; Raghavan et al. 2012; Zhu et al. 2014; Lassalle et al. 2015).
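The equilibrium expression x/(1 + x) and its inverse are simple enough to check numerically. The sketch below uses x = 1.17, corresponding to the 17% excess of GC-ward substitutions reported above; the function names are hypothetical.

```python
def equilibrium_gc(x):
    """Expected GC fraction at mutation-drift equilibrium, x / (1 + x).

    x is the ratio of the conditional rate in the G:C direction to that
    in the A:T direction.
    """
    return x / (1 + x)

def ratio_for_gc(gc_fraction):
    """Inverse relation: the ratio x that would give a target GC fraction."""
    return gc_fraction / (1 - gc_fraction)

gc_expected = equilibrium_gc(1.17)
x_needed = ratio_for_gc(0.668)
```

Inverting the relation shows that mutation alone would need GC-ward substitutions to be about twice as frequent as AT-ward substitutions to sustain a GC content of 66.8%, well above the observed ratio.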
Deletion bias favors genome-size reduction and AT composition
Although our lower bound estimates of the insertion and deletion mutation rates are both ∼15-fold lower than the base-substitution mutation rate, many indels affect more than one base. Specifically, the 17 deletions observed in this study result in the deletion of a total of 376 bases, while the 16 insertions result in a gain of 121 bases. Therefore, the number of bases that are impacted by indels in this study is more than twice the number impacted by bps, indicating that indels may still play a central role in the genome evolution of B. cenocepacia if they are not purged by natural selection.
Although the ratio of deletions to insertions observed in this study was nearly 1, the per-base-pair deletion rate (1.97 (0.86) × $10^{-10}$/bp/generation) was substantially higher than the insertion rate (6.11 (1.90) × $10^{-11}$/bp/generation), since the average size of deletions was greater than the average size of insertions. Thus, there is a net deletion rate of 1.36 (5.95) × $10^{-10}$/bp/generation (Table 2). Although no indels >150 bp were observed in this study, examining the depth of coverage of the B. cenocepacia HI2424 plasmid relative to the rest of the genome revealed that the plasmid was lost at a rate of 1.53 × $10^{-5}$ per cell division, while gains in plasmid copy number were not observed (Table 2).
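The net deletion rate and the plasmid-loss rate can be checked with two short calculations. The sketch assumes that the plasmid-loss rate is the number of loss events divided by the total number of line-generations (47 lines × 5554 generations), which appears to reproduce the reported value; the variable names are hypothetical.

```python
DELETION_RATE = 1.97e-10    # bases deleted per bp per generation (from the text)
INSERTION_RATE = 6.11e-11   # bases inserted per bp per generation (from the text)

def net_deletion_rate(deletion_rate, insertion_rate):
    """Net bases lost per bp per generation."""
    return deletion_rate - insertion_rate

def event_rate(events, lines, generations_per_line):
    """Events per line-generation (assumed denominator)."""
    return events / (lines * generations_per_line)

net_rate = net_deletion_rate(DELETION_RATE, INSERTION_RATE)
plasmid_loss_rate = event_rate(4, 47, 5554)
```

The same event-rate function applies to any rare whole-replicon event counted across the MA lines, provided the number of line-generations is the appropriate denominator.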
Table 2. Parameters of insertion and deletion mutations during 261,038 generations of spontaneous mutation accumulation in B. cenocepacia.
Parameter | Deletions | Insertions |
---|---|---|
Events observed | 17 | 16 |
Total nucleotides affected | 376 | 121 |
Total GC bases affected | 302 | 87 |
Total AT bases affected | 74 | 34 |
Proportion of GC bases affected | 0.80 | 0.72 |
Plasmid copy number loss/gain | 4 | 0 |
The base composition of deletions was also biased, with GC bases being deleted significantly more than expected based on the genome content (χ2 = 30.4, d.f. = 1, P < 0.0001). In contrast, no detectable bias was observed toward insertions of GC over AT bases (χ2 = 1.20, d.f. = 1, P = 0.27) (Table 2). Thus, indels in B. cenocepacia are expected to reduce genome wide %GC content, further supporting the need for other population-genetic processes to account for the composition of high-GC genomes (Lynch et al. 2008; Duret and Galtier 2009; Raghavan et al. 2012; Zhu et al. 2014; Lassalle et al. 2015). Overall, the observed mutation spectra in this study suggest that the natural indel spectrum of B. cenocepacia causes both genome-size reduction and increased %AT content.
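The composition tests in this paragraph are one-degree-of-freedom chi-square goodness-of-fit tests, which can be sketched directly from the counts in Table 2. The expected GC fraction used here (0.668, the genome-wide %GC) is an assumption; the statistics reported above were presumably computed against the composition of the sites actually analyzed, so this sketch yields similar but not identical values.

```python
CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value for d.f. = 1, alpha = 0.05

def chi_square_gc(gc_observed, at_observed, gc_fraction=0.668):
    """Goodness-of-fit statistic for GC vs AT bases against an expected GC fraction."""
    total = gc_observed + at_observed
    gc_expected = total * gc_fraction
    at_expected = total - gc_expected
    return ((gc_observed - gc_expected) ** 2 / gc_expected
            + (at_observed - at_expected) ** 2 / at_expected)

# Counts from Table 2.
chi2_deletions = chi_square_gc(302, 74)
chi2_insertions = chi_square_gc(87, 34)
```

Under this assumption the deletion statistic falls far above the critical value while the insertion statistic falls below it, which matches the qualitative conclusion in the text.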
Nonuniform chromosomal distribution of mutations
Another major goal of this study was to investigate whether mutation rates and spectra vary among chromosomes and chromosomal regions. The three core chromosomes of B. cenocepacia vary in size and content but are sufficiently large to have each accumulated a considerable number of mutations in this study (Morrow and Cooper 2012). Chromosome 1 (Chr1) is the largest chromosome (both in size and in gene count), with more essential and highly expressed genes than either chromosome 2 (Chr2) or 3 (Chr3) (see Figure S5). Expression and number of essential genes are second highest on Chr2 and lowest on Chr3 (Cooper et al. 2010; Morrow and Cooper 2012). In contrast, average nonsynonymous and synonymous variation among orthologs shared by multiple strains of B. cenocepacia, as well as fixed variation among Burkholderia species (dN and dS), are highest on Chr3 and lowest on Chr1 (see Figure S5) (Cooper et al. 2010; Morrow and Cooper 2012).
The overall base-substitution mutation rates of the three core chromosomes differ significantly based on a chi-squared proportions test, where the null expectation was that the number of substitutions would be proportional to the number of sites covered on each chromosome (χ2 = 6.77, d.f. = 2, P = 0.034) (Figure 3A). Specifically, base-substitution mutation rates are highest on Chr1, and lowest on Chr2, which is the opposite of observed evolutionary rates on these chromosomes (see Figure S5) (Cooper et al. 2010). In addition, a second chi-squared test was performed to test whether the observed base-substitution mutation rates differed from the conditional mutation rates expected on each chromosome given their respective nucleotide contents, which are similar (%GC: Chr1, 66.8%; Chr2, 66.9%; and Chr3, 67.3%). Here, the null expectation for the total number of base-substitution mutations on each chromosome was calculated as the product of the number of GC bases covered, the total number of generations across lines, and the overall GC substitution rate across the genome, added to the product of the same calculation for AT substitutions. The differences in the base-substitution mutation rates of the three core chromosomes remained significant when this test was performed (χ2 = 6.88, d.f. = 2, P = 0.032), indicating that the interchromosomal heterogeneity in base-substitution mutation rates cannot be explained by variation in nucleotide content.
The conditional base-substitution mutation spectra were also significantly different in all pairwise chi-squared proportions tests between chromosomes (Chr1/Chr2: χ2 = 14.3, d.f. = 5, P = 0.014; Chr1/Chr3: χ2 = 17.0, d.f. = 5, P = 0.004; and Chr2/Chr3: χ2 = 13.4, d.f. = 5, P = 0.020) (Figure 3C). These comparisons further illustrate that the significant variation in conditional base-substitution mutation rates is mostly driven by a few types of substitutions that occur at higher rates on particular chromosomes. Specifically, although their individual differences were not quite statistically significant, G:C > T:A transversions seem to occur at the highest rate on Chr3 (χ2 = 5.94, d.f. = 2, P = 0.051) and A:T > C:G transversions occur at the highest rate on Chr1 (χ2 = 5.67, d.f. = 2, P = 0.059) (Figure 3B; Figure 4A).
Studies in Vibrio cholerae have suggested that in bacteria with multiple chromosomes, smaller secondary chromosomes delay their replication until there remains approximately the same number of bases to be replicated on larger chromosomes (Rasmussen et al. 2007; Cooper et al. 2010). This ensures synchrony of replication termination between chromosomes of different sizes, despite the fact that their replication proceeds at the same rate. To test whether this replication timing gradient is partially responsible for the patterns we observe in base-substitution mutation spectra between chromosomes, we binned Chr1 and Chr2 into late and early replicating regions, where the early replicating regions represent bases presumed to replicate prior to Chr3 initiation, and the late replicated regions represent bases presumed to replicate following Chr3 initiation (the last 1.06 Mb replicated).
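The binning rule can be sketched as follows, under assumptions that go beyond what is stated above: each chromosome is treated as circular, replication is bidirectional from a single origin at equal fork speeds, the terminus lies opposite the origin, and the last 1.06 Mb replicated is split evenly between the two arms. The function name, its parameters, and the chromosome length in the example are hypothetical.

```python
def is_late_replicating(pos, chrom_len, origin=0, late_window=1_060_000):
    """Return True if a position falls in the last `late_window` bp replicated.

    Assumes a circular chromosome replicated bidirectionally from `origin`,
    with the terminus diametrically opposite the origin.
    """
    offset = abs(pos - origin) % chrom_len
    # Shortest distance from the origin along either replication arm.
    dist_from_origin = min(offset, chrom_len - offset)
    dist_to_terminus = chrom_len / 2 - dist_from_origin
    # Each arm contributes half of the late window.
    return dist_to_terminus <= late_window / 2

# Hypothetical 3-Mb chromosome with the origin at coordinate 0.
chrom_len = 3_000_000
for pos in (100, 969_999, 970_000, 1_500_000, 2_030_000, 2_900_000):
    label = "late" if is_late_replicating(pos, chrom_len) else "early"
    print(pos, label)
```

Under this rule, any replicon no longer than the late window (as for Chr3 here) is classified as late replicating in its entirety.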
In support of this model, G:C > T:A transversions also occur at a slightly higher rate in late replicated regions of Chr1 and Chr2 than they do in early replicated regions of Chr1 and Chr2 (see Figure 5A). However, even when mutations are binned by overall replication timing (combining late replicating regions on Chr1 and Chr2 with Chr3 and comparing them to early replicating regions on Chr1 and Chr2), the rate of G:C > T:A transversions is not significantly higher than it is in early replication-timing regions, likely due to small sample sizes (χ2 = 2.52, d.f. = 1, P = 0.113). A:T > C:G transversions occur at slightly higher rates in early replicated regions of Chr1 and Chr2 than they do in late replicated regions (see Figure 5B), but again the difference is not statistically significant (χ2 = 1.26, d.f. = 1, P = 0.262). Together, these findings suggest that late replicating DNA is predisposed to incur more G:C > T:A transversions and early replicating DNA is predisposed to incur more A:T > C:G transversions, but a larger collection of mutations will be necessary to fully address this question.
Unlike base-substitution mutation rates, neither the deletion nor the insertion mutation rate varied significantly among chromosomes (deletions: χ2 = 3.81, d.f. = 2, P = 0.15; insertions: χ2 = 0.64, d.f. = 2, P = 0.73) (Figure 3B; Figure 4B). No indels were observed on the 0.16-Mb plasmid, but as noted above, four plasmid loss events were observed. The latter events involve the loss of 157 genes and are expected to have phenotypic consequences. The relative rarity of indels observed in this study limits our ability to analyze their intrachromosomal biases in great detail, but the repeated occurrence of indels in short 5-kb regions, and particularly within microsatellites (57.6% of all indels), suggests that replication slippage is a common cause of indels in the B. cenocepacia genome (Figure 4B).
Discussion
Despite their relevance to both evolutionary theory and human health, the extent to which generalizations about mutation rates and spectra are conserved across organisms remains unclear. Bacterial genomes are particularly amenable to studying these issues because of their diverse genome content (Lynch 2007). In measuring the rate and molecular spectrum of mutations in the high-GC, multireplicon genome of B. cenocepacia, we have corroborated some prior findings of MA studies in model organisms, but also demonstrated idiosyncrasies in the B. cenocepacia spectrum that may extend to other organisms with high %GC content and/or with multiple chromosomes. Specifically, B. cenocepacia has a relatively low mutation rate and a mutation spectrum consistent with a universal deletion bias in prokaryotes (Mira et al. 2001). However, the lack of AT-mutation bias is inconsistent with all previous findings in mismatch-repair-proficient organisms (Lynch et al. 2008; Denver et al. 2009; Hershberg and Petrov 2010; Hildebrand et al. 2010; Ossowski et al. 2010; Lee et al. 2012; Sung et al. 2012b). Further, both mutation rates and spectra differed significantly among chromosomes in a manner suggesting greater oxidative damage or more inefficient repair in late replicated regions.
As a member of a species complex with broad ecological and clinical significance, B. cenocepacia is a taxon with rich genomic resources that enable comparisons between the de novo mutations reported here and extant sequence diversity. With 7050 genes, B. cenocepacia HI2424 has a large amount of coding DNA (GE) (6.8 × 10⁶ bp) and a high average nucleotide heterozygosity at silent sites (πs) (6.57 × 10⁻²) relative to other strains (Watterson 1975; Mahenthiralingam et al. 2005). By combining this πs measurement and the base-substitution rate from this study, we estimate that the Ne of B. cenocepacia is ∼2.47 × 10⁸, which is in the upper echelon among species whose Ne has been derived in this manner (see Figure S6). Under the drift-barrier hypothesis, high target size for functional DNA and high Ne increase the ability of natural selection to reduce mutation rates (Lynch 2010b, 2011; Sung et al. 2012a). Thus, given the large proteome and Ne of B. cenocepacia, it is unsurprising that B. cenocepacia has relatively low base-substitution and indel mutation rates when compared to other organisms (Sung et al. 2012a). However, these low mutation rates need not imply limited genetic diversity among species of the B. cepacia complex. Rather, because of their high Ne and evidently frequent lateral genetic transfer, species of the B. cepacia complex are remarkably diverse (Baldwin et al. 2005; Pearson et al. 2009).
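The relationship between these quantities can be sketched as follows, assuming the standard neutral expectation for a haploid organism, πs = 2Neμ. The formula is not written out in the text above, so it is an assumption of this sketch, and the function names are ours. Because the per-site base-substitution rate is not restated in this section, the example back-calculates the rate implied by the reported πs and Ne rather than supplying one.

```python
def effective_population_size(pi_s, mu):
    """Ne under the haploid neutral expectation pi_s = 2 * Ne * mu."""
    return pi_s / (2 * mu)

def implied_mutation_rate(pi_s, n_e):
    """Per-site, per-generation rate implied by pi_s and Ne under the same model."""
    return pi_s / (2 * n_e)

pi_s = 6.57e-2  # silent-site heterozygosity reported in the text
n_e = 2.47e8    # effective population size reported in the text

mu = implied_mutation_rate(pi_s, n_e)
print(f"{mu:.2e}")  # 1.33e-10 per site per generation under this model
print(f"{effective_population_size(pi_s, mu):.2e}")
```

Under this model, Ne scales inversely with the mutation rate for a fixed πs, so a low base-substitution rate combined with high silent-site diversity necessarily implies a large Ne.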
Burkholderia genomes also tend to be large in comparison to other Proteobacteria, but this is evidently not the product of more frequent insertions. Rather, insertions and deletions occurred at similar rates but deletions were larger than insertions, and plasmids were lost relatively frequently, which together add to the general model that bacterial genomes are subject to a deletion bias (Mira et al. 2001; Kuo and Ochman 2009). Ultimately, this dynamic has the potential to drive the irreversible loss of previously essential genes during prolonged colonization of a host and may enable host dependence to form more rapidly in prokaryotic organisms than in eukaryotes, which do not have a strong deletion bias (Denver et al. 2004; Kuo and Ochman 2009; Dyall et al. 2014). Consistent with this dynamic, host-restricted Burkholderia genomes evolving at lower Ne are indeed substantially smaller than free-living genomes (Mahenthiralingam et al. 2005; Carlier and Eberl 2012).
The lack of mutational bias toward AT bases observed in B. cenocepacia has not been seen previously in nonmutator MA lineages of any kind (Lind and Andersson 2008; Lynch et al. 2008; Denver et al. 2009; Keightley et al. 2009; Ossowski et al. 2010; Lee et al. 2012; Sung et al. 2012a,b). However, selection and/or biased gene conversion must still be invoked to explain the high %GC content in B. cenocepacia (Hershberg and Petrov 2010; Hildebrand et al. 2010). Of these two explanations, selection favoring GC content may be the more influential force, given that there is no evidence for increased %GC content in recombinant genes of Burkholderia, despite its prevalence in other bacteria (Lassalle et al. 2015). It is also notable that similar substitution biases can be observed at polymorphic sites of several MLST loci shared across B. cenocepacia isolates (Jolley and Maiden 2010). Specifically, A:T > C:G transversions are more common than G:C > T:A transversions, and the rates of G:C > A:T and A:T > G:C transitions are nearly indistinguishable at six of the seven loci (see Figure S7). However, the evolutionary mechanisms underlying these substitution biases are uncertain, given the potential for ongoing recombination and/or natural selection to influence polymorphisms at these sites in conserved housekeeping genes (Lynch et al. 2008; Duret and Galtier 2009; Raghavan et al. 2012; Zhu et al. 2014).
In principle, a decreased rate of G:C > A:T transition mutation relative to other bacteria could be achieved by an increased abundance of uracil–DNA–glycosylases (UDGs), which remove uracils from DNA following cytosine deamination (Pearl 2000) or by a lack of cytosine methyltransferases, which methylate the C-5 carbon of cytosines and expose them to increased rates of cytosine deamination (Kahramanoglou et al. 2012). However, B. cenocepacia HI2424 does not appear to have an exceptionally high number of UDGs, and it does contain an obvious cytosine methyltransferase homolog, suggesting that active methylation of cytosines does occur in B. cenocepacia. Extending these methods to more genomes with high %GC content will be required to determine whether a lack of AT-mutation bias is a common feature of GC-rich genomes.
Perhaps the most important finding from this study is that both mutation rates and spectra vary significantly among the three autonomously replicating chromosomes that make up the B. cenocepacia genome (Figure 3). Our data demonstrate that base-substitution mutation rates vary significantly among chromosomes, but not in the direction predicted by comparative studies on sequence divergence (Mira and Ochman 2002; Cooper et al. 2010; Lang and Murray 2011; Agier and Fischer 2012; Morrow and Cooper 2012). Specifically, we find that base-substitution mutation rates are highest on the primary chromosome (Figure 3, A and B), where evolutionary rates are lowest. Thus, purifying selection must be substantially stronger on the primary chromosome to offset the effect of an elevated mutation rate.
The spectra of base substitutions also differed significantly among chromosomes. Specifically, A:T > C:G transversions are more than twice as likely to occur on Chr1 as elsewhere, and G:C > T:A transversions are more than twice as likely to occur on Chr3 (Figure 3C). One possible explanation for the increased rate of G:C > T:A transversions on Chr3 is that they can arise through oxidative damage (Michaels et al. 1992; Lee et al. 2012) and may be elevated late in the cell cycle when intracellular levels of reactive oxygen species are high (Mira and Ochman 2002; Stamatoyannopoulos et al. 2009; Chen et al. 2010). Thus, because tertiary chromosomes are expected to be replicated late in the cell cycle (Rasmussen et al. 2007), we would expect these elevated rates of G:C > T:A transversions on Chr3. Of course, if this explanation were accurate, we would also observe an increased rate of G:C > T:A transversions in late-replicated regions of Chr1 and Chr2. Although the low number of total G:C > T:A transversions observed in this study prevents us from statistically distinguishing G:C > T:A transversion rates between late and early replicated regions of Chr1 and Chr2, the rate of G:C > T:A transversions is higher in late replicated regions of Chr1 and Chr2 (see Figure 5A), a remarkable finding considering that early replicated genes on Chr1 and Chr2 are expressed more, which has been shown to induce G:C > T:A transversions independently of replication (Klapacz and Bhagwat 2002; Kim and Jinks-Robertson 2012; Alexander et al. 2013). Thus, we suggest that late replicating DNA is inherently predisposed to increased rates of G:C > T:A transversions, possibly due to increased exposure to oxidative damage (Michaels et al. 1992), variation in nucleotide-pool composition (Kunkel 1992; Zhang and Mathews 1995), or variation in DNA-repair mechanisms (Hawk et al. 2005; Courcelle 2009).
The mechanism underlying the increased A:T > C:G transversion mutation rate on the primary chromosome is less clear, but a decreased rate of A:T > C:G transversions in a late replicating reporter relative to that in an intermediate replicating reporter has been demonstrated previously in Salmonella enterica (Hudson et al. 2002). Thus, it is possible that the rate of this form of transversion is increased in early replicating DNA, or that it is primarily caused by other forms of mutagenesis (Klapacz and Bhagwat 2002). A:T > C:G transversion rates in early replicating regions of Chr1 and Chr2 support the former hypothesis, as early replicated regions of Chr1 and Chr2 experience the highest rates of A:T > C:G transversions (see Figure 5B). The alternative mechanism of transcriptional mutagenesis seems less likely, as A:T > C:G transversions occurred frequently in noncoding DNA relative to other substitution types (see Figure S8).
In summary, this study has demonstrated that the GC-rich genome of B. cenocepacia has a relatively low mutation rate, with a mutation spectrum that lacks an AT bias and is biased toward deletion. Moreover, both the rate and types of base-substitution mutations that occur most frequently vary by chromosome, likely related to replication dynamics, the cell cycle, and transcription (Klapacz and Bhagwat 2002; Cooper et al. 2010; Merrikh et al. 2012). Although this study has broadened our understanding of mutation rates and spectra beyond that of model organisms, determining whether the observed mutational traits are common to all GC-rich genomes with multiple replicons or are merely species-specific idiosyncrasies will require a more thorough investigation across a more diverse collection of GC-rich and multireplicon bacterial genomes. Ultimately, by better understanding the core mutational processes that generate the variation on which evolution acts, we can aspire to develop true species-specific null hypotheses for molecular evolution, and by extension, enable more accurate analyses of the role of all evolutionary forces in driving genome evolution.
Acknowledgments
We thank Kenny Flynn for helpful discussion and Brian VanDam for technical support. This work was supported by the Multidisciplinary University Research Initiative Award from the US Army Research Office (W911NF-09-1-0444 to M.L., P. Foster, H. Tang, and S. Finkel) and a National Science Foundation career award (DEB-0845851 to V.S.C.).
Author Contributions: M.D., W.S., M.L., and V.C. designed the research; M.D. and W.S. performed the research; M.D. and W.S. analyzed the data; and M.D., W.S., M.L., and V.C. wrote the paper.
Footnotes
Communicating editor: J. G. Lawrence
Supporting information is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.115.176834/-/DC1.
Literature Cited
- Agier N., Fischer G., 2012. The mutational profile of the yeast genome is shaped by replication. Mol. Biol. Evol. 29: 905–913.
- Agnoli K., Schwager S., Uehlinger S., Vergunst A., Viteri D. F., et al., 2012. Exposing the third chromosome of Burkholderia cepacia complex strains as a virulence plasmid. Mol. Microbiol. 83: 362–378.
- Alexander M. P., Begins K. J., Crall W. C., Holmes M. P., Lippert M. J., 2013. High levels of transcription stimulate transversions at GC base pairs in yeast. Environ. Mol. Mutagen. 54: 44–53.
- Baldwin A., Mahenthiralingam E., Thickett K. M., Honeybourne D., Maiden M. C. J., et al., 2005. Multilocus sequence typing scheme that provides both species and strain differentiation for the Burkholderia cepacia complex. J. Clin. Microbiol. 43: 4665–4673.
- Carlier A. L., Eberl L., 2012. The eroded genome of a Psychotria leaf symbiont: hypotheses about lifestyle and interactions with its plant host. Environ. Microbiol. 14: 2757–2769.
- Chen C.-L., Rappailles A., Duquenne L., Huvet M., Guilbaud G., et al., 2010. Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes. Genome Res. 20: 447–457.
- Coenye T., LiPuma J. J., 2003. Population structure analysis of Burkholderia cepacia genomovar III: varying degrees of genetic recombination characterize major clonal complexes. Microbiology 149: 77–88.
- Coenye T., Spilker T., Van Schoor A., LiPuma J. J., Vandamme P., 2004. Recovery of Burkholderia cenocepacia strain PHDC from cystic fibrosis patients in Europe. Thorax 59: 952–954.
- Cooper V. S., Vohr S. H., Wrocklage S. C., Hatcher P. J., 2010. Why genes evolve faster on secondary chromosomes in bacteria. PLOS Comput. Biol. 6: e1000732.
- Courcelle J., 2009. Shifting replication between IInd, IIIrd, and IVth gears. Proc. Natl. Acad. Sci. USA 106: 6027–6028.
- Denver D. R., Morris K., Lynch M., Thomas W. K., 2004. High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. Nature 430: 679–682.
- Denver D. R., Dolan P. C., Wilhelm L. J., Sung W., Lucas-Lledo J. I., et al., 2009. A genome-wide view of Caenorhabditis elegans base-substitution mutation processes. Proc. Natl. Acad. Sci. USA 106: 16310–16314.
- Denver D. R., Wilhelm L. J., Howe D. K., Gafner K., Dolan P. C., et al., 2012. Variation in base-substitution mutation in experimental and natural lineages of Caenorhabditis nematodes. Genome Biol. Evol. 4: 513–522.
- Drake J. W., 1991. A constant rate of spontaneous mutation in DNA-based microbes. Proc. Natl. Acad. Sci. USA 88: 7160–7164.
- Duret L., Galtier N., 2009. Biased gene conversion and the evolution of mammalian genomic landscapes. Annu. Rev. Genomics Hum. Genet. 10: 285–311.
- Dyall S. D., Brown M. T., Johnson P. J., 2014. Ancient invasions: from endosymbionts to organelles. Science 304: 253–257.
- Elena S. F., Ekunwe L., Hajela N., Oden S. A., Lenski R. E., 1998. Distribution of fitness effects caused by random insertion mutations in Escherichia coli. Genetica 102: 349–358.
- Fijalkowska I. J., Schaaper R. M., Jonczyk P., 2012. DNA replication fidelity in Escherichia coli: a multi-DNA polymerase affair. FEMS Microbiol. Rev. 36: 1105–1121.
- Foster P. L., Hanson A. J., Lee H., Popodi E. M., Tang H. X., 2013. On the mutational topology of the bacterial genome. G3 (Bethesda) 3: 399–407.
- Graur D., Li W.-H., 2000. Fundamentals of Molecular Evolution, Sinauer Associates, Sunderland, MA.
- Hall D. W., Mahmoudizad R., Hurd A. W., Joseph S. B., 2008. Spontaneous mutations in diploid Saccharomyces cerevisiae: another thousand cell generations. Genet. Res. 90: 229–241.
- Hawk J. D., Stefanovic L., Boyer J. C., Petes T. D., Farber R. A., 2005. Variation in efficiency of DNA mismatch repair at different sites in the yeast genome. Proc. Natl. Acad. Sci. USA 102: 8639–8643.
- Heilbron K., Toll-Riera M., Kojadinovic M., Maclean R. C., 2014. Fitness is strongly influenced by rare mutations of large effect in a microbial mutation accumulation experiment. Genetics 197: 981–990.
- Hershberg R., Petrov D. A., 2010. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet. 6: e1001115.
- Hildebrand F., Meyer A., Eyre-Walker A., 2010. Evidence of selection upon genomic GC-content in bacteria. PLoS Genet. 6: e1001107.
- Hudson R. E., Bergthorsson U., Roth J. R., Ochman H., 2002. Effect of chromosome location on bacterial mutation rates. Mol. Biol. Evol. 19: 85–92.
- Jolley K. A., Maiden M. C. J., 2010. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11: 595–606.
- Kahramanoglou C., Prieto A. I., Khedkar S., Haase B., Gupta A., et al., 2012. Genomics of DNA cytosine methylation in Escherichia coli reveals its role in stationary phase transcription. Nat. Commun. 3: 886.
- Keightley P. D., Trivedi U., Thomson M., Oliver F., Kumar S., et al., 2009. Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines. Genome Res. 19: 1195–1201.
- Kibota T. T., Lynch M., 1996. Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature 381: 694–696.
- Kim N., Jinks-Robertson S., 2012. Transcription as a source of genome instability. Nat. Rev. Genet. 13: 204–214.
- Kimura M., 1983. The Neutral Theory of Molecular Evolution, Cambridge University Press, Cambridge, UK.
- Klapacz J., Bhagwat A. S., 2002. Transcription-dependent increase in multiple classes of base substitution mutations in Escherichia coli. J. Bacteriol. 184: 6866–6872.
- Kunkel T. A., 1992. Biological asymmetries and the fidelity of eukaryotic DNA replication. BioEssays 14: 303–308.
- Kuo C.-H., Ochman H., 2009. Deletional bias across the three domains of life. Genome Biol. Evol. 1: 145–152.
- Lang G. I., Murray A. W., 2011. Mutation rates across budding yeast chromosome VI are correlated with replication timing. Genome Biol. Evol. 3: 799–811.
- Lassalle F., Périan S., Bataillon T., Nesme X., Duret L., et al., 2015. GC-content evolution in bacterial genomes: the biased gene conversion hypothesis expands. PLoS Genet. 11: e1004941.
- Lee H., Popodi E., Tang H. X., Foster P. L., 2012. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl. Acad. Sci. USA 109: E2774–E2783.
- Li H., Durbin R., 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760.
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., et al., 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079.
- Librado P., Rozas J., 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25: 1451–1452.
- Lind P. A., Andersson D. I., 2008. Whole-genome mutational biases in bacteria. Proc. Natl. Acad. Sci. USA 105: 17878–17883.
- LiPuma J. J., Spilker T., Coenye T., Gonzalez C. F., 2002. An epidemic Burkholderia cepacia complex strain identified in soil. Lancet 359: 2002–2003.
- Lynch M., 2007. The Origins of Genome Architecture, Sinauer Associates, Sunderland, MA.
- Lynch M., Walsh B., 1998. Genetics and Analysis of Quantitative Traits, Sinauer Associates, Sunderland, MA.
- Lynch M., 2010a. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. USA 107: 961–968.
- Lynch M., 2010b. Evolution of the mutation rate. Trends Genet. 26: 345–352.
- Lynch M., 2011. The lower bound to the evolution of mutation rates. Genome Biol. Evol. 3: 1107–1118.
- Lynch M., Sung W., Morris K., Coffey N., Landry C. R., et al., 2008. A genome-wide view of the spectrum of spontaneous mutations in yeast. Proc. Natl. Acad. Sci. USA 105: 9272–9277.
- Mahenthiralingam E., Urban T. A., Goldberg J. B., 2005. The multifarious, multireplicon Burkholderia cepacia complex. Nat. Rev. Microbiol. 3: 144–156.
- Merrikh H., Zhang Y., Grossman A. D., Wang J. D., 2012. Replication-transcription conflicts in bacteria. Nat. Rev. Microbiol. 10: 449–458.
- Michaels M. L., Cruz C., Grollman A. P., Miller J. H., 1992. Evidence that mutY and mutM combine to prevent mutations by an oxidatively damaged form of guanine in DNA. Proc. Natl. Acad. Sci. USA 89: 7022–7025.
- Mira A., Ochman H., 2002. Gene location and bacterial sequence divergence. Mol. Biol. Evol. 19: 1350–1358.
- Mira A., Ochman H., Moran N. A., 2001. Deletional bias and the evolution of bacterial genomes. Trends Genet. 17: 589–596.
- Morrow J. D., Cooper V. S., 2012. Evolutionary effects of translocations in bacterial genomes. Genome Biol. Evol. 4: 1256–1262.
- Ossowski S., Schneeberger K., Lucas-Lledo J. I., Warthmann N., Clark R. M., et al., 2010. The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana. Science 327: 92–94.
- Pearl L. H., 2000. Structure and function in the uracil-DNA glycosylase superfamily. Mutat. Res. DNA Repair 460: 165–181.
- Pearson T., Giffard P., Beckstrom-Sternberg S., Auerbach R., Hornstra H., et al., 2009. Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer. BMC Biol. 7: 78–92.
- Raghavan R., Kelkar Y. D., Ochman H., 2012. A selective force favoring increased G plus C content in bacterial genes. Proc. Natl. Acad. Sci. USA 109: 14504–14507.
- Rasmussen T., Jensen R. B., Skovgaard O., 2007. The two chromosomes of Vibrio cholerae are initiated at different time points in the cell cycle. EMBO J. 26: 3124–3131.
- Schrider D. R., Houle D., Lynch M., Hahn M. W., 2013. Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster. Genetics 194: 937–954.
- Sniegowski P. D., Gerrish P. J., Johnson T., Shaver A., 2000. The evolution of mutation rates: separating causes from consequences. BioEssays 22: 1057–1066.
- Stamatoyannopoulos J. A., Adzhubei I., Thurman R. E., Kryukov G. V., Mirkin S. M., et al., 2009. Human mutation rate associated with DNA replication timing. Nat. Genet. 41: 393–395.
- Sung W., Ackerman M. S., Miller S. F., Doak T. G., Lynch M., 2012a. Drift-barrier hypothesis and mutation-rate evolution. Proc. Natl. Acad. Sci. USA 109: 18488–18492.
- Sung W., Tucker A. E., Doak T. G., Choi E., Thomas W. K., et al., 2012b. Extraordinary genome stability in the ciliate Paramecium tetraurelia. Proc. Natl. Acad. Sci. USA 109: 19339–19344.
- Traverse C. C., Mayo-Smith L. M., Poltak S. R., Cooper V. S., 2013. Tangled bank of experimentally evolved Burkholderia biofilms reflects selection during chronic infections. Proc. Natl. Acad. Sci. USA 110: E250–E259.
- Watterson G. A., 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276.
- Wei W., Ning L.-W., Ye Y.-N., Li S.-J., Zhou H.-Q., et al., 2014. SMAL: a resource of spontaneous mutation accumulation lines. Mol. Biol. Evol. 31: 1302–1308.
- Wielgoss S., Barrick J. E., Tenaillon O., Cruveiller S., Chane-Woon-Ming B., et al., 2011. Mutation rate inferred from synonymous substitutions in a long-term evolution experiment with Escherichia coli. G3 (Bethesda) 1: 183–186.
- Ye K., Schulz M. H., Long Q., Apweiler R., Ning Z. M., 2009. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25: 2865–2871.
- Zeyl C., DeVisser J. A., 2001. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157: 53–61.
- Zhang X. L., Mathews C. K., 1995. Natural DNA precursor pool asymmetry and base sequence context as determinants of replication fidelity. J. Biol. Chem. 270: 8401–8404.
- Zhu Y. O., Siegal M. L., Hall D. W., Petrov D. A., 2014. Precise estimates of mutation rate and spectrum in yeast. Proc. Natl. Acad. Sci. USA 111: E2310–E2318.