Significance
Because genetic variation underlies evolution, a complete understanding of evolutionary processes requires identifying and characterizing the forces determining the stability of the genome. Using mutation accumulation and whole-genome sequencing, we found that spontaneous mutation rates in three widely diverged Escherichia coli strains are nearly identical. To determine the importance of DNA damage in driving mutation rates, we investigated 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to repair or prevent oxidative DNA damage significantly impacted mutation rates and spectra. These results suggest that, with the exception of those that defend against oxidative damage, DNA repair pathways may exist primarily to defend against DNA damage induced by exogenous agents.
Keywords: mutation rate, mutation accumulation, evolution, DNA repair, oxidative DNA damage
Abstract
A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1–2 × 10⁻³ mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3′ base can affect the mutability of a purine by oxidative damage by as much as eightfold.
A complete understanding of the evolution and stability of the genome requires that the determinants of spontaneous mutation be identified and characterized. Among the variety of mistakes that can occur during DNA transactions, four sources of sequence variation appear to dominate in prokaryotes: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage induced by exogenous agents, and the activities of error-prone polymerases. This conclusion is based on changes in the rates and spectra of mutations that occur when genes affecting these processes are deleted or amplified. In particular, loss of a DNA repair pathway often gives a mutator phenotype, indicating that the pathway of interest exerts an important limitation on spontaneous mutation (1). However, investigations of the mutagenic impact of various DNA repair pathways have relied almost exclusively on reporter genes, leaving open the possibility that the results are biased by the particular features of the selected loci. This concern can be avoided by allowing mutations to accumulate nonselectively in DNA repair-defective strains and identifying the resulting sequence changes by whole-genome sequencing (WGS). Although this approach may miss rare but interesting mutational processes, it can reveal the overall threats to genomic stability and identify features, such as local sequence context, that influence mutational frequencies. Surprisingly, this technique has been used with the eukaryote Caenorhabditis elegans (2) but has not been extensively applied to prokaryotes.
The mutation accumulation (MA) protocol involves establishing multiple clonal populations from a single founder and then repeatedly passing the lines through single-individual bottlenecks (3, 4), which in bacteria is achieved easily by streaking for single colonies on agar medium. This procedure allows mutations to accumulate in an unbiased manner with a minimum of selective pressure. After a sufficient number of generations have occurred, the genomes are sequenced, and mutations are identified. Using this technique, we recently determined the intrinsic mutation rates and mutational spectra of repair-proficient strains of Escherichia coli and Bacillus subtilis and documented the mutational impact of the loss of the major error-correcting system, mismatch repair (MMR) (5–7). In the studies reported here we concentrate on E. coli, first asking if other commensal strains of E. coli have the same mutation rate and spectrum as our K12 strain and whether changing the growth medium influences mutation. Then we determined the mutational effects of the loss of several important DNA repair pathways. Our major conclusion is that, under the conditions of our experiments, mutation rates and spectra are nearly impervious to the loss of DNA repair functions except for those that deal with oxidative DNA damage. We also show that the mutagenicity of a major oxidative lesion, 7,8-dihydro-8-oxoguanine (8-oxoG), is highly dependent on the local sequence context.
Results and Discussion
Mutation Rates, Spectra, and Biases of Wild-Type E. coli Strains.
Although E. coli has been a model organism for decades, estimates of its spontaneous mutation rate vary by more than an order of magnitude (ref. 8 and the references therein). In a previous study using MA/WGS, we found that the rate of spontaneous base-pair substitutions (BPSs) of E. coli K12 was 2 × 10⁻¹⁰ mutations per nucleotide per generation (5), about a third of the rate expected based on results with reporter loci (9). One goal of the current study was to determine if this mutation rate pertains only to K12 or is characteristic of other E. coli strains. Thus, we performed MA/WGS with two relatively recently isolated human commensal E. coli strains, ED1a and IAI1, that are diverged from E. coli K12 (10).
The results for BPSs from all the MA/WGS experiments reported here are summarized in Table 1. The BPS rates in ED1a and IAI1 did not differ significantly from each other but were each ∼50% higher than that of our wild-type K12 strain, PFM2. Although these differences are significant (t = 4.6, 4.4, df = 22, 20, P = 1 × 10⁻⁴ and 3 × 10⁻⁴ for ED1a and IAI1, respectively), the rates nonetheless are within the same order of magnitude. We also determined that the mutation rate of PFM2 when cultured on minimal glucose medium was nearly the same as when it was cultured on LB medium, which supplies mainly peptides as carbon and nitrogen sources (Table 1).
Table 1.
Parameters of the MA/WGS experiments
| Strain | Description | No. of BPSs | No. of lines | Generations per line | BPS rate per genome (× 10³) | 95% CL* | BPS rate per nucleotide† (× 10¹⁰) | 95% CL* | Rate compared with PFM2 |
| PFM2 | WT‡ | 246 | 61 | 4,230 | 0.95 | 0.17 | 2.05 | 0.36 | 1.00 |
| PFM2m | WT on minimal medium | 277 | 50 | 6,166 | 0.90 | 0.13 | 1.94 | 0.29 | 0.94 |
| ED1a | WT | 407 | 41 | 6,114 | 1.62 | 0.19 | 3.12 | 0.36 | 1.52 |
| IAI1 | WT | 422 | 47 | 6,342 | 1.42 | 0.15 | 3.01 | 0.32 | 1.47 |
| PFM101 | umuDC dinB | 269 | 43 | 6,078 | 1.03 | 0.15 | 2.22 | 0.33 | 1.08 |
| PFM133 | umuDC dinB polB | 252 | 45 | 6,204 | 0.90 | 0.15 | 1.95 | 0.32 | 0.95 |
| PFM35 | uvrA | 316 | 47 | 6,350 | 1.06 | 0.18 | 2.28 | 0.39 | 1.11 |
| PFM40 | alkA tagA | 265 | 47 | 6,225 | 0.91 | 0.16 | 1.95 | 0.35 | 0.95 |
| PFM88 | ada ogt | 250 | 49 | 6,269 | 0.81 | 0.12 | 1.75 | 0.25 | 0.85 |
| PFM22 | nth nei | 209 | 50 | 920 | 4.54 | 0.79 | 9.79 | 1.71 | 4.77 |
| PFM180 | xthA nfo | 339 | 44 | 3,155 | 2.44 | 0.38 | 5.26 | 0.81 | 2.56 |
| PFM91 | nfi | 335 | 47 | 6,308 | 1.13 | 0.16 | 2.44 | 0.35 | 1.19 |
| PFM61 | mutT | 2,234 | 25 | 599 | 149 | 9 | 321 | 19 | 156 |
| PFM6 | mutY | 485 | 24 | 1,972 | 10.2 | 1.2 | 22.1 | 2.6 | 11 |
| PFM94 | mutM mutY | 4,282 | 24 | 1,916 | 93 | 8 | 201 | 18 | 98 |
*The SEM of the BPSs per line for each strain was multiplied by the critical value of the t distribution to give the 95% CL.
†Nucleotides per genome: K12 = 4,639,675; ED1a = 5,209,548; IAI1 = 4,700,560; the plasmids in ED1a and IAI1 are not included.
‡Data for PFM2 on LB are from ref. 5; two lines and 13 mutations are included that were not included in ref. 5 because they were shared among MA lines (see SI Materials and Methods). The mutation rate is the average of two PFM2 datasets; the number of generations per line is a weighted average of the two datasets; the 95% CL was derived from the sum of the variances of the BPS per line for the two datasets.
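The relationship between the raw counts and the rate columns of Table 1 can be illustrated with a minimal sketch. It assumes that the rate per genome is simply the number of BPSs divided by the total number of generations summed over all lines, and that the rate per nucleotide is that value divided by the genome size given in the table footnote. The function name `bps_rates` is ours and is not taken from the study; the confidence limits are not reproduced because they require the per-line counts.

```python
# Illustrative sketch only (not the authors' code): recompute Table 1 rates.
K12_GENOME_BP = 4_639_675  # nucleotides per K12 genome, from the table footnote


def bps_rates(n_bps, n_lines, gens_per_line, genome_bp):
    """Return (rate per genome, rate per nucleotide), both per generation."""
    total_generations = n_lines * gens_per_line
    per_genome = n_bps / total_generations
    per_nucleotide = per_genome / genome_bp
    return per_genome, per_nucleotide


# PFM2 (wild type on LB): 246 BPSs, 61 lines, 4,230 generations per line
per_genome, per_nt = bps_rates(246, 61, 4230, K12_GENOME_BP)
print(f"{per_genome * 1e3:.2f} x 10^-3 per genome per generation")
print(f"{per_nt * 1e10:.2f} x 10^-10 per nucleotide per generation")
```

Under these assumptions the PFM2 counts give values that round to the 0.95 and 2.05 listed in the table.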
The spectra of the types of BPSs were comparable among the three strains (Tables 2 and 3). Fig. 1A shows the conditional mutation rates, i.e., the rates of each type of BPS relative to the numbers of A:T or G:C base pairs in each genome. Because the mutation rates differ among the strains, it is more informative to look at the fractions of each type of BPS, again normalized to the number of A:T or G:C base pairs (Fig. 1B). The BPS spectra of all three strains were dominated by transitions, particularly G:C to A:T transitions, which is the expected spectrum based upon previous locus-specific and genome-wide experiments (8).
Table 2.
BPS spectra in the experimental strains
| Mutational changes | PFM2 WT* | PFM2m WT on min | ED1a WT | IAI1 WT | PFM101 umuDC dinB | PFM133 umuDC dinB polB | PFM35 uvrA | PFM40 alkA tagA | PFM88 ada ogt | PFM22 nth nei | PFM180 xthA nfo | PFM91 nfi | PFM61 mutT | PFM6 mutY | PFM94 mutM mutY |
| Types of substitutions | |||||||||||||||
| Total | 246 | 277 | 407 | 422 | 269 | 252 | 316 | 265 | 250 | 209 | 339 | 335 | 2,234 | 485 | 4,282 |
| Transitions | 136 | 167 | 256 | 218 | 136 | 142 | 167 | 147 | 142 | 186 | 170 | 182 | 6 | 25 | 24 |
| A:T > G:C | 50 | 39 | 62 | 56 | 45 | 46 | 74 | 57 | 42 | 5 | 76 | 74 | 3 | 9 | 8 |
| G:C > A:T | 86 | 128 | 194 | 162 | 91 | 96 | 93 | 90 | 100 | 181 | 94 | 108 | 3 | 16 | 16 |
| Transversions | 110 | 110 | 151 | 204 | 133 | 110 | 149 | 118 | 108 | 23 | 169 | 153 | 2,228 | 460 | 4,258 |
| A:T > T:A | 19 | 24 | 17 | 26 | 18 | 26 | 22 | 27 | 19 | 3 | 33 | 31 | 1 | 1 | 4 |
| G:C > T:A | 31 | 44 | 58 | 87 | 58 | 34 | 60 | 35 | 39 | 10 | 47 | 45 | 2 | 443 | 4,223 |
| A:T > C:G | 43 | 30 | 54 | 61 | 41 | 37 | 47 | 40 | 39 | 2 | 56 | 60 | 2,224 | 12 | 10 |
| G:C > C:G | 17 | 12 | 22 | 30 | 16 | 13 | 20 | 16 | 11 | 8 | 33 | 17 | 1 | 4 | 21 |
| A:T sites | 112 | 93 | 133 | 143 | 104 | 109 | 143 | 124 | 100 | 10 | 165 | 165 | 2,228 | 22 | 22 |
| G:C sites | 134 | 184 | 274 | 279 | 165 | 143 | 173 | 141 | 150 | 199 | 174 | 170 | 6 | 463 | 4,260 |
| Consequences of substitutions | |||||||||||||||
| Position | |||||||||||||||
| N-Cd | 57 | 80 | 92 | 80 | 73 | 50 | 73 | 66 | 48 | 24 | 72 | 81 | 350 | 80 | 550 |
| Cd | 189 | 197 | 315 | 342 | 196 | 202 | 243 | 199 | 202 | 185 | 267 | 254 | 1,884 | 405 | 3,732 |
| Within coding sequences | |||||||||||||||
| Syn | 57 | 56 | 89 | 101 | 60 | 48 | 51 | 47 | 54 | 58 | 77 | 78 | 245 | 107 | 948 |
| N-Syn | 132 | 141 | 226 | 241 | 136 | 154 | 192 | 152 | 148 | 127 | 190 | 176 | 1,639 | 298 | 2,784 |
| Amino acid changes | |||||||||||||||
| Csv | 59 | 78 | 99 | 113 | 71 | 71 | 92 | 75 | 75 | 61 | 93 | 92 | 716 | 112 | 1,039 |
| N-Csv | 73 | 63 | 127 | 128 | 65 | 83 | 100 | 77 | 73 | 66 | 97 | 84 | 923 | 186 | 1,745 |
Table 3.
Conditional BPS rates × 10¹⁰
| Mutational changes | PFM2 WT* | PFM2m WT | ED1a WT | IAI1 WT | PFM101 umuDC dinB | PFM133 umuDC dinB polB | PFM35 uvrA | PFM40 alkA tagA | PFM88 ada ogt | PFM22 nth nei | PFM180 xthA nfo | PFM91 nfi | PFM61 mutT | PFM6 mutY | PFM94 mutM mutY |
| Types of substitutions | |||||||||||||||
| Total | 2.05 | 1.94 | 3.12 | 3.01 | 2.22 | 1.95 | 2.28 | 1.84 | 1.75 | 9.79 | 5.26 | 2.44 | 321 | 23.4 | 207 |
| Transitions | 1.14 | 1.17 | 1.96 | 1.56 | 1.12 | 1.10 | 1.21 | 1.02 | 1.00 | 8.72 | 2.64 | 1.32 | 0.86 | 2.49 | 3.46 |
| A:T > G:C | 0.85 | 0.55 | 0.96 | 0.81 | 0.75 | 0.72 | 1.09 | 0.80 | 0.60 | 0.48 | 2.40 | 1.09 | 0.88 | 2.93 | 4.57 |
| G:C > A:T | 1.41 | 1.76 | 2.93 | 2.28 | 1.48 | 1.46 | 1.32 | 1.23 | 1.38 | 16.7 | 2.87 | 1.55 | 0.85 | 2.07 | 2.39 |
| Transversions | 0.92 | 0.77 | 1.16 | 1.46 | 1.10 | 0.85 | 1.08 | 0.82 | 0.76 | 1.08 | 2.62 | 1.11 | 321 | 20.9 | 203 |
| A:T > T:A | 0.32 | 0.34 | 0.26 | 0.38 | 0.30 | 0.41 | 0.32 | 0.38 | 0.27 | 0.29 | 1.04 | 0.46 | 0.29 | 0.18 | 0.46 |
| G:C > T:A | 0.51 | 0.61 | 0.88 | 1.22 | 0.94 | 0.52 | 0.85 | 0.48 | 0.54 | 0.92 | 1.44 | 0.64 | 0.57 | 39.5 | 397 |
| A:T > C:G | 0.73 | 0.43 | 0.84 | 0.88 | 0.69 | 0.58 | 0.69 | 0.56 | 0.56 | 0.19 | 1.77 | 0.89 | 650 | 1.15 | 1.01 |
| G:C > C:G | 0.28 | 0.17 | 0.33 | 0.42 | 0.26 | 0.20 | 0.28 | 0.22 | 0.15 | 0.74 | 1.01 | 0.24 | 0.28 | 0.34 | 1.95 |
| A:T sites | 1.90 | 1.32 | 2.07 | 2.07 | 1.74 | 1.71 | 2.10 | 1.74 | 1.43 | 0.95 | 5.21 | 2.44 | 651 | 4.26 | 6.03 |
| G:C sites | 2.20 | 2.53 | 4.14 | 3.92 | 2.68 | 2.17 | 2.46 | 1.92 | 2.07 | 18.4 | 5.32 | 2.43 | 1.70 | 41.9 | 401 |
| Consequences of substitutions | |||||||||||||||
| Position | |||||||||||||||
| N-Cd | 3.21 | 3.77 | 5.12 | 4.68 | 4.06 | 2.60 | 3.55 | 3.08 | 2.27 | 7.58 | 7.54 | 3.97 | 339 | 26.5 | 181 |
| Cd | 1.85 | 1.62 | 2.80 | 2.78 | 1.90 | 1.83 | 2.06 | 1.62 | 1.66 | 10.2 | 4.87 | 2.17 | 318 | 22.8 | 211 |
| Within coding sequences | |||||||||||||||
| Syn | 0.79 | 0.65 | 1.12 | 1.16 | 0.82 | 0.62 | 0.61 | 0.54 | 0.63 | 4.52 | 1.99 | 0.94 | 58.6 | 9.30 | 75.8 |
| N-Syn | 0.56 | 0.50 | 0.87 | 0.85 | 0.57 | 0.61 | 0.71 | 0.54 | 0.53 | 3.05 | 1.51 | 0.66 | 121 | 7.09 | 68.7 |
| Amino acid changes | |||||||||||||||
| Csv | 0.54 | 0.59 | 0.82 | 0.85 | 0.64 | 0.60 | 0.72 | 0.56 | 0.57 | 3.10 | 1.57 | 0.73 | 112 | 5.79 | 55.5 |
| N-Csv | 0.59 | 0.43 | 0.93 | 0.86 | 0.52 | 0.62 | 0.70 | 0.52 | 0.50 | 3.00 | 1.46 | 0.59 | 129 | 8.26 | 80.6 |
Conditional mutation rates are the numbers of mutations per generation divided by the relevant parameter for each genome. Cd, coding; Csv, conservative; N-Cd, noncoding; N-Csv, nonconservative; N-Syn, nonsynonymous; PFM2m WT, PFM2 WT on minimal glucose medium (all other experiments were done on LB medium); Syn, synonymous; WT, wild type.
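As an illustration of the normalization described in this footnote, the sketch below computes conditional rates for the two transition types in PFM2. The numbers of G:C and A:T base pairs are not stated in the text, so the sketch assumes an approximate G+C content of 50.8% for the K12 genome; that value, and the function name `conditional_rate`, are our assumptions.

```python
# Illustrative sketch; GC_FRACTION_K12 is an assumed value, not taken from the text.
GENOME_BP = 4_639_675     # K12 genome size, from the Table 1 footnote
GC_FRACTION_K12 = 0.508   # assumption: approximate G+C content of E. coli K12


def conditional_rate(n_mut, n_lines, gens_per_line, n_sites):
    """Mutations per generation per site at which the change can occur."""
    return n_mut / (n_lines * gens_per_line * n_sites)


gc_sites = GC_FRACTION_K12 * GENOME_BP
at_sites = (1 - GC_FRACTION_K12) * GENOME_BP

# PFM2 counts from Table 2: 86 G:C > A:T and 50 A:T > G:C transitions
gc_to_at = conditional_rate(86, 61, 4230, gc_sites)
at_to_gc = conditional_rate(50, 61, 4230, at_sites)
print(f"G:C > A:T: {gc_to_at * 1e10:.2f} x 10^-10")
print(f"A:T > G:C: {at_to_gc * 1e10:.2f} x 10^-10")
```

With the assumed base composition, these come out close to the 1.41 and 0.85 listed for PFM2 in Table 3.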
Fig. 1.
The mutation rate and spectra of BPSs in wild-type E. coli strains. (A) The conditional mutation rates of each of the six BPSs. The bars for the totals show the mutation per generation per nucleotide; error bars show 95% CLs calculated from the number of mutations per MA line. The bars for specific BPSs show the mutation per generation of each type of BPS divided by the number of A:T or G:C base pairs in the genome; error bars show 95% CLs calculated from 1,000 Monte Carlo simulations of a random distribution with the mutational spectra observed for each dataset (SI Materials and Methods). (B) Normalized fraction of each of the six BPSs. The bars show the number of each of the six BPSs divided by the total number of BPSs, normalized to the fraction of A:T or G:C base pairs in the genome. The error bars show 95% CLs from 1,000 Monte Carlo simulations of a random distribution with the mutational spectrum observed for each dataset (SI Materials and Methods). PFM2m, PFM2 on minimal medium (all other experiments were on rich medium).
In all three strains the ratio of BPSs in coding versus noncoding DNA was almost half of that expected from the ratio of coding versus noncoding DNA in the genomes (Table S1). Indeed, 1,000 Monte Carlo simulations for each strain did not yield a ratio of BPSs in coding versus noncoding DNA as small as the observed ratio. This result does not reflect selection against mutations in coding DNA; for each strain the ratio of nonsynonymous to synonymous mutations did not differ significantly from the ratio obtained from simulations (Table S1). In our previous study we showed that the bias against coding DNA disappeared when MMR was defective, indicating that MMR preferentially repairs coding sequences (5).
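The logic of the simulation test can be sketched in simplified form. The version below is not the procedure described in SI Materials and Methods: it ignores the mutational spectrum and base composition and simply places each BPS in coding DNA with a probability equal to the coding fraction of the genome (85% for PFM2, per the Table S1 footnote). The function name, the seed, and the use of the PFM2 counts are illustrative choices.

```python
import random


def simulate_cd_ncd_ratios(n_bps, coding_fraction, n_sims=1000, seed=0):
    """Place n_bps mutations at random; return the Cd/N-Cd ratio of each run."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sims):
        coding = sum(rng.random() < coding_fraction for _ in range(n_bps))
        noncoding = n_bps - coding
        # Guard against division by zero in the (very unlikely) all-coding case
        ratios.append(coding / noncoding if noncoding else float("inf"))
    return ratios


# PFM2: 189 coding and 57 noncoding BPSs observed (Table 2)
observed = 189 / 57
ratios = simulate_cd_ncd_ratios(n_bps=246, coding_fraction=0.85)
n_as_small = sum(r <= observed for r in ratios)
print(f"observed ratio {observed:.2f}; "
      f"simulations at or below it: {n_as_small} of {len(ratios)}")
```

A small count of simulated ratios at or below the observed value corresponds to the statement that the simulations rarely or never produced a ratio as small as the one observed.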
Table S1.
Observed BPSs and amino acid changes in wild-type strains compared with expected
| Mutational changes | PFM2 | PFM2m | ED1a | IAI1 | ||||||||
| Obs | Exp | P | Obs | Exp | P | Obs | Exp | P | Obs | Exp | P | |
| Using fractions of all BPSs | ||||||||||||
| G:Cs | 134 | 125 | 184 | 141 | 274 | 206 | 279 | 214 | ||||
| A:Ts | 112 | 121 | 93 | 136 | 133 | 201 | 143 | 208 | ||||
| G:C/A:T | 1.20 | 1.03 | 0.41 | 1.98 | 1.03 | 2 × 10⁻⁴ | 2.06 | 1.03 | 2 × 10⁻⁶ | 1.95 | 1.03 | 6 × 10⁻⁶ |
| G:C Ts | 86 | 69 | 128 | 85 | 194 | 130 | 162 | 111 | ||||
| A:T Ts | 50 | 67 | 39 | 82 | 62 | 126 | 56 | 107 | ||||
| G:C Ts/A:T Ts | 1.72 | 1.03 | 0.04 | 3.28 | 1.03 | 9 × 10⁻⁷ | 3.13 | 1.03 | 4 × 10⁻⁹ | 2.89 | 1.03 | 4 × 10⁻⁷ |
| Cd | 189 | 210 | 197 | 236 | 315 | 351 | 342 | 371 | ||||
| N-Cd | 57 | 36 | 80 | 41 | 92 | 56 | 80 | 51 | ||||
| Cd/N-Cd | 3.32 | 5.74 | 0.02 | 2.46 | 5.74 | 6 × 10⁻⁵ | 3.42 | 6.26 | 1 × 10⁻³ | 4.28 | 7.20 | 0.007 |
| Using fractions of BPSs in coding DNA | ||||||||||||
| N-Syn | 132 | 144 | 141 | 151 | 226 | 241 | 241 | 262 | ||||
| Syn | 57 | 45 | 56 | 46 | 89 | 74 | 101 | 80 | ||||
| N-Syn/Syn | 2.32 | 3.25 | 0.15 | 2.52 | 3.25 | 0.27 | 2.54 | 3.25 | 0.18 | 2.39 | 3.25 | 0.07 |
| Using fractions of N-Syn changes | ||||||||||||
| N-Csv | 73 | 70 | 63 | 75 | 127 | 120 | 128 | 127 | ||||
| Csv | 59 | 62 | 78 | 66 | 99 | 106 | 113 | 114 | ||||
| N-Csv/Csv | 1.24 | 1.12 | 0.69 | 0.81 | 1.12 | 0.17 | 1.28 | 1.13 | 0.49 | 1.13 | 1.12 | 0.96 |
| Using simulations | ||||||||||||
| Cd | 189 | 209 | 197 | 238 | 315 | 352 | 342 | 372 | ||||
| N-Cd | 57 | 37 | 80 | 39 | 92 | 55 | 80 | 50 | ||||
| Cd/N-Cd | 3.32 | 5.71 | 0.02 | 2.46 | 6.03 | 3 × 10⁻⁵ | 3.42 | 6.34 | 1 × 10⁻³ | 4.28 | 7.41 | 4 × 10⁻³ |
| N-Syn | 132 | 138 | 141 | 140 | 226 | 223 | 241 | 248 | ||||
| Syn | 57 | 51 | 56 | 57 | 89 | 92 | 101 | 94 | ||||
| N-Syn/Syn | 2.32 | 2.67 | 0.53 | 2.52 | 2.46 | 0.92 | 2.54 | 2.42 | 0.79 | 2.39 | 2.63 | 0.57 |
| N-Csv | 73 | 66 | 63 | 70 | 127 | 112 | 128 | 123 | ||||
| Csv | 59 | 66 | 78 | 71 | 99 | 114 | 113 | 118 | ||||
| N-Csv/Csv | 1.24 | 1.02 | 0.42 | 0.81 | 0.99 | 0.40 | 1.28 | 0.98 | 0.15 | 1.13 | 1.03 | 0.62 |
For comparisons using fractions, the expected values were calculated using the fraction of possible BPS or amino acid changes of each type based on the genome. For comparisons using simulations, the expected values are the means of 1,000 Monte Carlo simulations using the actual spectrum of BPS for each strain. Coding DNA = 85, 86, and 88% of the genome for PFM2, ED1a, and IAI1, respectively. Cd, coding DNA; Csv, conservative amino acid change; Exp, expected number of mutations; N-Cd, noncoding DNA; N-Csv, nonconservative amino acid change; N-Syn, nonsynonymous bp change; Obs, observed number of mutations; P, probability calculated for the χ² test of the two observed values versus the two expected values; PFM2m, PFM2 on minimal glucose medium (all other experiments were on LB); Syn, synonymous base-pair change; Ts, transitions.
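The footnote describes the P values as coming from a χ² test of the two observed values versus the two expected values. One reading of that description, which appears to reproduce the tabulated values for PFM2, is a Pearson χ² test on a 2 × 2 table with the observed pair in one row and the expected pair in the other, without a continuity correction. The sketch below implements that reading with the standard library only; the function name `chi2_2x2` and the choice of test are our assumptions rather than a statement of the authors' method.

```python
import math


def chi2_2x2(obs, exp):
    """Pearson chi-square (no continuity correction) for a 2x2 table whose
    rows are the observed pair and the expected pair; returns (chi2, P)."""
    table = [list(obs), list(exp)]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_sums)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            fitted = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - fitted) ** 2 / fitted
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# PFM2, coding vs. noncoding BPSs: observed 189 and 57, expected 210 and 36
chi2, p = chi2_2x2((189, 57), (210, 36))
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

For the PFM2 coding versus noncoding comparison this gives a P value of roughly 0.02, in line with the entry in Table S1.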
The frequency of mutations in PFM2 grown on LB is biased by neighboring nucleotides; in particular, transitions at A:T base pairs in the context 5′ApC3′/3′TpG5′ occur twice as often as would be expected from the frequency of this dimer in the genome (5). (Here and throughout this paper the mutated base is underlined and in bold-face type.) Both ED1a and IAI1 also showed this bias, but for IAI1 the bias was not significant at the P ≤ 0.05 level (Table S2). All three strains growing on LB also showed a strong DNA strand bias: G:C to A:T transitions were twice as likely to occur with C templating the lagging strand and G templating the leading strand rather than vice-versa (Table S3) (5). Interestingly, culturing PFM2 on minimal glucose medium eliminated the sequence bias and diminished the strand bias (Tables S2 and S3).
Table S2.
Local sequence biases of transition mutations in wild-type strains
| 5′NMN3′/3′NMN5′ | PFM2 | PFM2m | ED1a | IAI1 | ||||||||||||
| Obs | Exp | Obs/ Exp | P | Obs | Exp | Obs/ Exp | P | Obs | Exp | Obs/ Exp | P | Obs | Exp | Obs/ Exp | P | |
| NAA/NTT | 8 | 15 | 0.54 | 0.10 | 10 | 12 | 0.86 | 0.69 | 12 | 18 | 0.66 | 0.19 | 12 | 17 | 0.72 | 0.32 |
| NAC/NTG | 26 | 11 | 2.32 | 0.002 | 9 | 9 | 1.03 | 0.95 | 26 | 14 | 1.86 | 0.02 | 19 | 13 | 1.51 | 0.18 |
| NAG/NTC | 5 | 10 | 0.48 | 0.14 | 14 | 8 | 1.73 | 0.14 | 6 | 13 | 0.46 | 0.08 | 9 | 12 | 0.77 | 0.52 |
| NAT/NTA | 11 | 14 | 0.81 | 0.55 | 6 | 11 | 0.57 | 0.20 | 18 | 17 | 1.07 | 0.81 | 16 | 15 | 1.05 | 0.87 |
| 50 | 50 | 1 × 10⁻⁵ | 39 | 39 | 0.09 | 62 | 62 | 0.001 | 56 | 56 | 0.16 | |||||
| AAN/TTN | 8 | 15 | 0.54 | 0.10 | 11 | 12 | 0.95 | 0.89 | 19 | 18 | 1.04 | 0.89 | 14 | 17 | 0.84 | 0.58 |
| CAN/GTN | 6 | 14 | 0.42 | 0.04 | 6 | 11 | 0.54 | 0.17 | 8 | 18 | 0.45 | 0.03 | 11 | 16 | 0.69 | 0.28 |
| GAN/CTN | 20 | 12 | 1.71 | 0.07 | 10 | 9 | 1.10 | 0.82 | 23 | 15 | 1.58 | 0.10 | 15 | 13 | 1.14 | 0.68 |
| TAN/ATN | 16 | 9 | 1.72 | 0.12 | 12 | 7 | 1.66 | 0.21 | 12 | 11 | 1.05 | 0.90 | 16 | 10 | 1.53 | 0.22 |
| 50 | 50 | 3 × 10⁻⁴ | 39 | 39 | 0.13 | 62 | 62 | 0.02 | 56 | 56 | 0.16 | |||||
| NGA/NCT | 10 | 20 | 0.51 | 0.05 | 29 | 29 | 1.00 | 1.0 | 35 | 44 | 0.79 | 0.24 | 29 | 37 | 0.79 | 0.29 |
| NGC/NCG | 36 | 28 | 1.28 | 0.21 | 40 | 42 | 0.96 | 0.82 | 61 | 62 | 0.98 | 0.90 | 66 | 53 | 1.25 | 0.13 |
| NGG/NCC | 25 | 20 | 1.26 | 0.36 | 38 | 29 | 1.29 | 0.22 | 49 | 45 | 1.09 | 0.65 | 43 | 37 | 1.15 | 0.47 |
| NGT/NCA | 15 | 19 | 0.80 | 0.48 | 21 | 28 | 0.75 | 0.28 | 49 | 42 | 1.15 | 0.44 | 24 | 35 | 0.68 | 0.11 |
| 86 | 86 | 0.03 | 128 | 128 | 0.24 | 194 | 194 | 0.35 | 162 | 162 | 0.02 | |||||
| AGN/TCN | 19 | 17 | 1.10 | 0.75 | 28 | 26 | 1.09 | 0.73 | 34 | 39 | 0.87 | 0.49 | 29 | 33 | 0.89 | 0.61 |
| CGN/GCN | 28 | 25 | 1.11 | 0.62 | 33 | 38 | 0.88 | 0.51 | 62 | 56 | 1.11 | 0.49 | 47 | 48 | 0.99 | 0.94 |
| GGN/CCN | 13 | 20 | 0.66 | 0.18 | 24 | 29 | 0.82 | 0.40 | 35 | 45 | 0.78 | 0.20 | 42 | 37 | 1.13 | 0.55 |
| TGN/ACN | 26 | 24 | 1.10 | 0.69 | 43 | 35 | 1.22 | 0.29 | 63 | 54 | 1.17 | 0.31 | 44 | 44 | 0.99 | 0.96 |
| 86 | 86 | 0.50 | 128 | 128 | 0.32 | 194 | 194 | 0.15 | 162 | 162 | 0.80 | |||||
The observed (Obs) and expected (Exp) values are for the “+” or reference strand; the values for a dimer and its reverse complement are summed to account for the orientation on each strand. The expected values are the number of mutations of each base multiplied by the fraction of each dimer involving that base in each genome. To allow for multiple comparisons, the Benjamini–Hochberg procedure (88) was applied with the false-discovery rate set at 1% for the four comparisons per dataset; significant values by this criterion are in bold. Exp, the expected number of mutations; M, mutated base (mutated bases are shown in bold and are underlined); Obs, the observed number of mutations; P, probability for the χ² value for each comparison (the last P value in each set is for all four comparisons); PFM2m, PFM2 on minimal medium (all other experiments were done on LB medium).
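The multiple-comparison step named in this footnote can be sketched as follows. The code is a generic implementation of the Benjamini–Hochberg step-up procedure, applied here to the four P values listed for PFM2 transitions at A:T base pairs with the 3′ neighbor varied; the function name and the way the inputs are arranged are our own choices, and the false-discovery rate is passed in as stated in the footnote.

```python
def benjamini_hochberg(p_values, fdr):
    """Return a list of booleans marking which P values are significant
    under the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every P value ranked at or below the cutoff is declared significant
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# PFM2, transitions at A:T base pairs; rows in table order (NAA, NAC, NAG, NAT)
pfm2_p = [0.10, 0.002, 0.14, 0.55]
print(benjamini_hochberg(pfm2_p, fdr=0.01))  # [False, True, False, False]
```

Only the NAC/NTG comparison passes the threshold in this example, which matches the context singled out in the text for PFM2.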
Table S3.
DNA strand biases of BPSs in the experimental strains
| Strain | Description | Lagging strand template | Lagging strand template | ||||||||||
| CObs | CExp | GObs | GExp | C/G Obs | *P | AObs | AExp | TObs | TExp | A/TObs | *P | ||
| G:C > A:T transitions | A:T > G:C transitions | ||||||||||||
| PFM2 | WT | 61 | 42 | 25 | 44 | 2.44 | 0.003 | 30 | 25 | 20 | 25 | 1.50 | 0.31 |
| PFM2m | WT on min | 74 | 62 | 54 | 66 | 1.37 | 0.13 | 23 | 19 | 16 | 20 | 1.44 | 0.42 |
| ED1a | WT | 130 | 97 | 64 | 97 | 2.03 | 8 × 10⁻⁴ | 34 | 31 | 28 | 31 | 1.21 | 0.59 |
| IAI1 | WT | 109 | 78 | 53 | 84 | 2.06 | 5 × 10⁻⁴ | 35 | 28 | 21 | 28 | 1.67 | 0.18 |
| PFM101 | umuDC dinB | 58 | 44 | 33 | 47 | 1.76 | 0.04 | 27 | 22 | 18 | 23 | 1.50 | 0.33 |
| PFM133 | umuDC dinB polB | 63 | 46 | 33 | 50 | 1.91 | 0.02 | 23 | 23 | 23 | 23 | 1.00 | 0.99 |
| PFM35 | uvrA | 63 | 45 | 30 | 48 | 2.10 | 0.01 | 47 | 37 | 27 | 37 | 1.74 | 0.09 |
| PFM40 | alkA tagA | 66 | 44 | 24 | 46 | 2.75 | 0.001 | 30 | 28 | 27 | 29 | 1.11 | 0.76 |
| PFM88 | ada ogt | 67 | 48 | 33 | 52 | 2.03 | 0.01 | 25 | 21 | 17 | 21 | 1.47 | 0.37 |
| PFM22 | nth nei | 101 | 96 | 98 | 103 | 1.03 | 0.64 | 6 | 5 | 4 | 5 | 1.50 | 0.65 |
| PFM180 | xthA nfo | 101 | 84 | 73 | 90 | 1.38 | 0.07 | 80 | 82 | 85 | 83 | 0.94 | 0.81 |
| PFM91 | nfi | 110 | 83 | 61 | 88 | 1.80 | 0.003 | 77 | 82 | 87 | 82 | 0.89 | 0.61 |
| G:C > T:A transversions | A:T > C:G transversions | ||||||||||||
| PFM61 | mutT | N/A | N/A | N/A | N/A | N/A | N/A | 1,062 | 1,107 | 1,162 | 1,117 | 0.91 | 0.17 |
| PFM6 | mutY | 233 | 212 | 210 | 229 | 1.11 | 0.21 | N/A | N/A | N/A | N/A | N/A | N/A |
| PFM94 | mutM mutY | 2,120 | 2,045 | 2,103 | 2,178 | 1.01 | 0.10 | N/A | N/A | N/A | N/A | N/A | N/A |
Exp, the expected number of mutations; Obs, the observed number of mutations; N/A, not applicable because there were too few mutations to analyze.
*P is the probability calculated for the χ² test of the two observed values versus the two expected values. The expected ratios for C/G on the lagging-strand template (= the leading strand) are 0.94 for PFM2 and derivatives, 1.01 for ED1a, and 0.93 for IAI1. The expected ratios for A/T on the lagging-strand template are 0.99 for PFM2 and derivatives, 1.00 for ED1a, and 0.99 for IAI1.
DNA cytosine methyltransferase (Dcm) transfers a methyl group to the 5 position of the internal Cs in CCWGG sequences; because 5-meC is prone to deamination, creating T, CCWGG sequences are hotspots for G:C to A:T transitions (11). Our results reproduce this finding for PFM2 (cultured in LB and in minimal medium) and ED1a but not IAI1. In all three genomes 2% of the total G:C base pairs occur at the internal position of CCWGG sites, but these base pairs accounted for 14% (12/86) of the G:C transitions when PFM2 was grown on LB, 6% (8/128) when PFM2 was grown on minimal medium, and 6% (12/194) in ED1a; these values are significantly greater than expected (χ² = 53, P ∼ 0; χ² = 9, P = 0.003; and χ² = 15, P = 0.0001, respectively). However, in IAI1 only 4% (6/162) of the G:C transitions occurred at CCWGG sites, a value not significantly different from that expected (χ² = 1.4, P = 0.24).
DNA adenine methyltransferase (Dam) transfers a methyl group to the 6 position of As in GATC sequences; GATC sequences are hotspots for A:T transversions (5), presumably because 6-meA depurinates, and depurinated sites produce transversions (12). In all three genomes 3% of the total A:T base pairs occur in GATC sites, but these base pairs accounted for 19% (12/62) of the A:T transversions when PFM2 was grown on LB, 10% (5/53) when PFM2 was cultured on minimal medium, and 16% (14/87) in IAI1; these values are significantly greater than expected (χ² = 43, P ∼ 0; χ² = 4, P = 0.05; and χ² = 39, P ∼ 0, respectively). However, there was no preference for A:T transversions to occur at GATC sites in ED1a (3/71 observed, 2/71 expected, χ² = 0.02, P = 0.88).
That G:C base pairs in CCWGG sites were not significant hotspots in IAI1, and A:T base pairs in GATC sites were not significant hotspots in ED1a, but both were hotspots in PFM2, suggests that methylase activities among these strains differ even though both Dcm and Dam are encoded in all three genomes (www.kegg.jp/kegg/).
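To make the site definitions concrete, the sketch below locates the base pairs subject to Dcm or Dam methylation in a DNA sequence. It is a hypothetical helper, not part of the study's pipeline: the function name, the toy sequence, and the 0-based coordinate convention are ours. It relies on the fact that both CCWGG (W = A or T) and GATC are their own reverse complements, so the methylated base on the opposite strand pairs with a fixed offset in the same match.

```python
import re


def methylation_site_positions(seq, motif_regex, offsets):
    """Return sorted 0-based positions of the base pairs at the given offsets
    within every (possibly overlapping) match of motif_regex in seq."""
    positions = set()
    # A lookahead lets the regex engine report overlapping matches
    for match in re.finditer(f"(?=({motif_regex}))", seq):
        for offset in offsets:
            positions.add(match.start() + offset)
    return sorted(positions)


# Toy sequence (hypothetical, for illustration only)
seq = "AACCAGGTTGATCCTGG"

# CCWGG: the internal C of the top strand is at offset 1, and the internal C
# of the bottom strand pairs with the G at offset 3.
dcm_gc_pairs = methylation_site_positions(seq, "CC[AT]GG", offsets=(1, 3))

# GATC: the top-strand A is at offset 1, and the bottom-strand A pairs with
# the T at offset 2.
dam_at_pairs = methylation_site_positions(seq, "GATC", offsets=(1, 2))

print(dcm_gc_pairs)  # [3, 5, 13, 15]
print(dam_at_pairs)  # [10, 11]
```

Counting how many observed mutations fall at such positions, relative to the fraction of all G:C or A:T base pairs they represent, gives the observed and expected values compared in the text.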
In almost all the experimental strains in this report, small indels (≤4 bp) occurred at about one-tenth the rate of BPSs and were more often losses than gains of base pairs (Tables 4 and 5). As discovered many years ago (14), indels occurred predominantly in mononucleotide runs. Growth on minimal glucose medium appeared to decrease the indel mutation rate, particularly of −1-bp events; however, because there were so few indels, few statistically significant conclusions can be drawn.
Table 4.
Spectra of indels in the experimental strains
| Mutational changes | PFM2 WT* | PFM2m WT | ED1a WT | IAI1 WT | PFM101 umuDC dinB | PFM133 umuDC dinB polB | PFM35 uvrA | PFM40 alkA tagA | PFM88 ada ogt | PFM22 nth nei | PFM180 xthA nfo | PFM91 nfi | PFM61 mutT | PFM6 mutY | PFM94 mutM mutY |
| Total | 24 | 13 | 37 | 31 | 23 | 27 | 40 | 30 | 24 | 6 | 36 | 25 | 1 | 6 | 6 |
| +1 bp | 6 | 6 | 6 | 5 | 6 | 11 | 9 | 5 | 4 | 2 | 14 | 8 | 0 | 2 | 1 |
| −1 bp | 16 | 7 | 27 | 25 | 16 | 15 | 29 | 22 | 18 | 4 | 20 | 16 | 1 | 3 | 5 |
| + >1 bp | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| − >1 bp | 2 | 0 | 4 | 1 | 0 | 1 | 2 | 3 | 1 | 0 | 2 | 1 | 0 | 1 | 0 |
| +1 A:T | 5 | 2 | 5 | 2 | 4 | 8 | 6 | 3 | 3 | 1 | 7 | 5 | 0 | 1 | 0 |
| −1 A:T | 5 | 4 | 15 | 19 | 12 | 8 | 18 | 16 | 17 | 4 | 12 | 12 | 0 | 1 | 3 |
| +1 G:C | 1 | 4 | 1 | 3 | 2 | 3 | 3 | 2 | 1 | 1 | 7 | 3 | 0 | 1 | 1 |
| −1 G:C | 12 | 3 | 12 | 6 | 4 | 7 | 11 | 6 | 1 | 0 | 8 | 4 | 1 | 2 | 2 |
| In run | 22 | 13 | 30 | 31 | 19 | 26 | 37 | 24 | 20 | 6 | 32 | 22 | 1 | 5 | 6 |
| Not in run | 2 | 0 | 7 | 0 | 4 | 1 | 3 | 6 | 4 | 0 | 5 | 3 | 0 | 1 | 0 |
bp, base pair; PFM2m, PFM2 on minimal glucose medium (all other experiments were done on LB medium). A run is two or more of the same base pair or repeats of a short base-pair sequence.
*Data are from ref. 5; two indels identified in ref. 13 and one mistaken call have been added (see SI Materials and Methods).
Table 5.
Conditional mutation rates of indels × 10¹¹
| Mutational changes | PFM2 WT* | PFM2m WT on min | ED1a WT | IAI1 WT | PFM101 umuDC dinB | PFM133 umuDC dinB polB | PFM35 uvrA | PFM40 alkA tagA | PFM88 ada ogt | PFM22 nth nei | PFM180 xthA nfo | PFM91 nfi | PFM61 mutT | PFM6 mutY | PFM94 mutM mutY |
| Total | 2.02 | 0.91 | 2.83 | 2.21 | 1.90 | 2.08 | 2.89 | 2.08 | 1.68 | 2.81 | 5.59 | 1.82 | 1.44 | 2.73 | 2.81 |
| +1 bp | 0.50 | 0.42 | 0.46 | 0.36 | 0.49 | 0.85 | 0.65 | 0.35 | 0.28 | 0.94 | 2.17 | 0.58 | <1.4 | 0.91 | 0.47 |
| −1 bp | 1.35 | 0.49 | 2.07 | 1.78 | 1.32 | 1.16 | 2.09 | 1.52 | 1.26 | 1.87 | 3.11 | 1.16 | 1.44 | 1.37 | 2.34 |
| + >1 bp | <0.08 | <0.07 | <0.08 | <0.07 | 0.08 | <0.08 | <0.07 | <0.07 | 0.07 | <0.5 | <0.16 | <0.07 | <1.4 | <0.5 | <0.5 |
| − >1 bp | 0.17 | <0.07 | 0.31 | 0.07 | <0.08 | 0.08 | 0.14 | 0.21 | 0.07 | <0.5 | 0.31 | 0.07 | <1.4 | 0.5 | <0.5 |
| +1 A:T | 0.85 | 0.28 | 0.78 | 0.29 | 0.67 | 1.26 | 0.88 | 0.42 | 0.43 | 0.95 | 2.21 | 0.74 | <2.9 | 0.93 | <1 |
| −1 A:T | 0.85 | 0.57 | 2.33 | 2.76 | 2.01 | 1.26 | 2.64 | 2.25 | 2.42 | 3.81 | 3.79 | 1.77 | <2.9 | 0.93 | 2.86 |
| +1 G:C | 0.17 | 0.55 | 0.15 | 0.42 | 0.32 | 1.06 | 0.43 | 0.27 | 0.14 | 0.92 | 2.14 | 0.43 | <2.8 | 0.90 | 0.92 |
| −1 G:C | 1.82 | 0.41 | 1.81 | 0.84 | 0.65 | 1.85 | 1.56 | 0.82 | 0.14 | <0.92 | 2.45 | 0.57 | 2.83 | 1.79 | 1.85 |
| In run | 8.62 | 4.66 | 11.0 | 11.0 | 7.61 | 9.50 | 11.5 | 7.45 | 6.47 | 14.4 | 21.5 | 7.83 | 7.38 | 2.28 | 2.81 |
| Not in run | 0.17 | <0.07 | 0.54 | <0.07 | 0.33 | 0.08 | 0.22 | 0.42 | 0.28 | <0.5 | 0.78 | 0.22 | <1.4 | 0.46 | <0.5 |
Conditional mutation rates are the numbers of mutations per generation divided by the relevant parameter for each genome (not corrected for the few indels that did not occur at sites of the same base pair). The conditional mutation rates in runs are the number of indels that occurred in runs divided by the number of bases in runs in the genome (not corrected for the few indels that occurred in runs of more than one base pair). A run is two or more of the same base pair or repeats of a short base-pair sequence. bp, base pair; PFM2m, PFM2 on minimal glucose medium (all other experiments were done on LB medium).
*Data are from ref. 5; two indels identified in ref. 13 and one mistaken call have been added (see SI Materials and Methods).
Transcription can impact mutation rates in several ways. Because ssDNA is exposed during transcription and is more vulnerable to damage than dsDNA, high rates of transcription can be mutagenic (15). On the other hand, transcription-coupled repair (TCR), which is targeted to the transcribed DNA strand, should lower mutation rates in transcribed genes (16). Head-on collisions between replication and transcription destabilize the replication machinery and can lead to mutations (17). In our previous study of E. coli K12 strain PFM2, we found no evidence that transcription affected the frequency of mutations (5). This negative result was reproduced here with IAI1, ED1a, and PFM2 growing in minimal medium and with all the experimental strains. There was no statistically significant bias for mutations to occur in highly transcribed genes or in genes oriented so that transcription is in the direction opposite the direction of replication (even among highly transcribed genes), nor was there a bias for any specific base change to occur on the transcribed versus the nontranscribed strand (Tables S4–S6). Using different methods to analyze our published data, Chen et al. (18) found a small (17–30%) increase in mutation frequencies in highly expressed genes, but this increase was significant only in an MMR-defective strain.
Table S4.
Ratios of base changes on the transcribed versus the nontranscribed DNA strand
| Strain | Description | A > G, TS/NTS | A > G, P | G > A, TS/NTS | G > A, P | A > T, TS/NTS | A > T, P | G > T, TS/NTS | G > T, P | A > C, TS/NTS | A > C, P | G > C, TS/NTS | G > C, P | As, TS/NTS | As, P | Gs, TS/NTS | Gs, P |
| Expected | |||||||||||||||||
| PFM2 | WT | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | |
| ED1a | WT | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | |
| IAI1 | WT | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | | 1.00 | | 1.01 | |
| Observed | |||||||||||||||||
| PFM2 | WT | 1.13 | 0.81 | 0.87 | 0.66 | 0.88 | 0.85 | 0.83 | 0.76 | 1.06 | 0.90 | 0.56 | 0.44 | 1.05 | 0.88 | 0.81 | 0.44 |
| PFM2m | WT on min | 0.50 | 0.31 | 0.85 | 0.55 | 0.75 | 0.64 | 0.73 | 0.57 | 0.79 | 0.67 | 0.57 | 0.82 | 0.68 | 0.29 | 0.80 | 0.35 |
| ED1a | WT | 1.56 | 0.32 | 1.06 | 0.82 | 1.00 | 0.65 | 1.14 | 0.77 | 1.11 | 0.82 | 1.22 | 0.61 | 1.28 | 0.39 | 1.09 | 0.68 |
| IAI1 | WT | 0.94 | 0.91 | 0.93 | 0.74 | 1.33 | 0.64 | 1.16 | 0.69 | 1.08 | 0.84 | 1.17 | 0.63 | 1.08 | 0.78 | 1.02 | 0.96 |
| PFM101 | umuDC dinB | 1.60 | 0.40 | 0.94 | 0.85 | 0.89 | 0.86 | 0.86 | 0.73 | 1.55 | 0.42 | 0.75 | 0.81 | 1.37 | 0.36 | 0.89 | 0.64 |
| PFM133 | umuDC dinB polB | 0.94 | 0.90 | 1.13 | 0.71 | 1.56 | 0.46 | 0.85 | 0.77 | 2.10 | 0.16 | 0.67 | 0.90 | 1.42 | 0.26 | 1.02 | 0.96 |
| PFM35 | uvrA | 0.91 | 0.82 | 1.26 | 0.48 | 1.25 | 0.74 | 0.66 | 0.30 | 0.54 | 0.17 | 1.57 | 0.41 | 0.79 | 0.39 | 1.04 | 0.88 |
| PFM35* | uvrA | | 0.63 | | 0.25 | | 0.59 | | 0.56 | | 0.14 | | 0.13 | | 0.31 | | 0.29 |
| PFM40 | alkA tagA | 0.50 | 0.19 | 1.11 | 0.76 | 1.11 | 0.87 | 0.59 | 0.80 | 0.67 | 0.43 | 0.88 | 0.97 | 0.68 | 0.23 | 0.94 | 0.63 |
| PFM88 | ada ogt | 1.00 | 1.00 | 1.13 | 0.71 | 1.29 | 0.72 | 1.14 | 0.46 | 0.62 | 0.33 | 1.20 | 0.93 | 0.85 | 0.62 | 1.14 | 0.41 |
| PFM22 | nth nei | 3/0 | 0.64 | 1.24 | 0.35 | 2/1 | 0.68 | 7/2 | 0.90 | 0/1 | 0.41 | 1/7 | 0.93 | 5/2 | 0.62 | 1.20 | 0.80 |
| PFM180 | xthA nfo | 1.22 | 0.62 | 1.03 | 0.95 | 1.60 | 0.40 | 0.95 | 0.16 | 0.92 | 0.83 | 0.76 | 0.76 | 1.16 | 0.57 | 0.95 | 0.21 |
| PFM91 | nfi | 1.31 | 0.56 | 1.47 | 0.24 | 1.10 | 0.88 | 2.10 | 0.42 | 0.78 | 0.58 | 0.33 | 0.48 | 1.02 | 0.95 | 1.39 | 0.75 |
| PFM61 | mutT | 0/1 | 0.41 | 1/2 | 0.68 | 0/0 | NA | 0/1 | 0.25 | 1.02 | 0.79 | 1/0 | 0.45 | 1.02 | 0.80 | 1/3 | 0.27 |
| PFM6 | mutY | 0/6 | 0.18 | 1.33 | 0.71 | 0/0 | NA | 1.19 | 0.05 | 0.50 | 0.81 | 1/3 | 0.94 | 0.25 | 0.18 | 1.18 | 0.06 |
| PFM94 | mutM mutY | 3/2 | 0.75 | 1.80 | 0.71 | 2/2 | 0.48 | 0.92 | 0.80 | 0.33 | 0.60 | 1.00 | 0.87 | 0.70 | 0.60 | 0.92 | 0.63 |
Expected is the expected ratio calculated from the number of each purine on each strand. P is the probability that the observed ratio does not differ from the expected ratio based on a χ2 test; if any number was ≤5, the Yates correction was applied (70). TS/NTS is the ratio of the number of base changes in the transcribed strand (TS) to the number in the nontranscribed strand (NTS); if the numbers are small, they are given instead of the ratio. Only the purine base changes are shown, because the TS/NTS ratios for the pyrimidines on the other strand are the inverse of the ratios shown.
*This row gives the probabilities that the observed ratio for PFM35 does not differ from that of PFM2 based on a χ2 test.
Table S6.
Orientation of genes with BPSs in the experimental strains
| Strain | Description | LdObs | LdExp | LgObs | LgExp | Ld/LgObs | *P |
| PFM2 | WT | 109 | 103 | 80 | 86 | 1.36 | 0.51 |
| PFM2m | WT on min | 110 | 107 | 87 | 90 | 1.26 | 0.77 |
| ED1a | WT | 186 | 177 | 129 | 138 | 1.44 | 0.47 |
| IAI1 | WT | 205 | 177 | 137 | 138 | 1.50 | 0.15 |
| PFM101 | umuDC dinB | 93 | 107 | 103 | 89 | 0.90 | 0.17 |
| PFM133 | umuDC dinB polB | 118 | 110 | 84 | 92 | 1.40 | 0.41 |
| PFM35 | uvrA | 131 | 132 | 112 | 111 | 1.17 | 0.91 |
| PFM40 | alkA tagA | 104 | 108 | 95 | 91 | 1.09 | 0.68 |
| PFM88 | ada ogt | 110 | 110 | 92 | 92 | 1.20 | 0.98 |
| PFM22 | nth nei | 104 | 101 | 81 | 84 | 1.28 | 0.72 |
| PFM180 | xthA nfo | 149 | 145 | 118 | 122 | 1.26 | 0.73 |
| PFM91 | nfi | 131 | 128 | 123 | 116 | 1.07 | 0.53 |
| PFM61 | mutT | 1,014 | 1,024 | 870 | 860 | 1.17 | 0.75 |
| PFM6 | mutY | 216 | 220 | 189 | 185 | 1.14 | 0.77 |
| PFM94 | mutM mutY | 2,058 | 2,028 | 1,674 | 1,704 | 1.23 | 0.49 |
Genes whose coding strand is the leading DNA strand (= the lagging strand template) are transcribed in the same direction as DNA replication; genes whose coding strand is the lagging DNA strand (= the leading strand template) are transcribed in the direction opposite DNA replication. Expected values are the total number of BPSs that occurred in all genes multiplied by the fraction of the nucleotides in the genes in each orientation. The expected values of Ld/Lg are: PFM2 and derivatives, 1.19; ED1a, 1.28; IAI1, 1.20. Exp, expected number of mutations; Ld, leading strand; Lg, lagging strand; Obs, observed number of mutations.
*P is the probability calculated for the χ2 test of the two observed values versus the two expected values.
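The P values in Table S6 can be approximated with a short calculation. The sketch below assumes that the "χ2 test of the two observed values versus the two expected values" is a 2 × 2 contingency test without the Yates correction, with the rounded expected values treated as counts; that reading is an assumption of this sketch, not something the footnote states explicitly. The function name `chi2_contingency_2x2` is hypothetical, and the input numbers are the ED1a row of Table S6.

```python
import math


def chi2_contingency_2x2(observed, expected):
    """Chi-square test on a 2 x 2 table whose rows are the observed and
    the expected (Ld, Lg) counts. Returns (chi2, P) for 1 degree of freedom.

    Assumption: no Yates correction; rounded expected values used as counts.
    """
    table = (tuple(observed), tuple(expected))
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            cell_expectation = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - cell_expectation) ** 2 / cell_expectation

    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# ED1a row of Table S6: LdObs = 186, LgObs = 129; LdExp = 177, LgExp = 138.
chi2_ed1a, p_ed1a = chi2_contingency_2x2((186, 129), (177, 138))
print(f"ED1a: chi2 = {chi2_ed1a:.2f}, P = {p_ed1a:.2f}")
```

Under these assumptions the calculation gives values close to the tabulated P for several rows; small differences are expected because the table reports rounded expected counts.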
Impact of DNA Repair Pathways on Spontaneous Mutations.
A second goal of this study was to determine if, in normally growing cells, loss of a given DNA repair pathway would change the mutation rate or the mutational spectrum. If so, it could be concluded that the repair pathway in question is important in preventing the mutations that arise from DNA damage induced by endogenous activities or by ubiquitous exogenous agents. Our choice of which DNA repair pathways to investigate was based on previous reports that their loss had mutational consequences (discussed below).
Note that in this report we have not included two major mutation-avoidance pathways, proofreading during DNA replication and removal of uracil from the DNA. We previously reported the effects of inactivating the DNA MMR pathway (5, 6).
DNA polymerases.
E. coli possesses three specialized DNA polymerases that are induced by DNA damage and other stresses. Two of these, DNA polymerase V (Pol V, encoded by the umuDC genes) and DNA polymerase IV (Pol IV, encoded by the dinB gene), are intrinsically error prone, whereas the third, DNA polymerase II (Pol II, encoded by the polB gene), is accurate (reviewed in ref. 19). Although there are few (≤15) molecules of Pol V in uninduced cells, transient induction of Pol V during normal growth could be mutagenic (20). Indeed, some reports have credited Pol V with a substantial role in spontaneous mutation (21–23), especially in stationary-phase cells (24). In contrast to Pol V, Pol IV is abundant even in normally growing cells (∼250 molecules) and can be induced 10- to 30-fold by DNA damage and other stresses (25, 26). Pol IV is required for stationary-phase or adaptive mutation (27, 28) and has a substantial role in producing mutations on the F′ episome (29, 30). However, in normally growing cells Pol IV contributes only 10%, at most, to spontaneous mutations on the chromosome (20, 29, 31). Loss of Pol II results in a mutator phenotype, but this phenotype is entirely dependent on Pol IV (27, 32, 33), probably because Pol IV is overexpressed if Pol II is deleted (26, 34). The three specialized polymerases also have distinct mutational biases (20, 31, 35). Thus, we expected that loss of these polymerases would result in a decrease in mutation rates and/or a shift in mutational spectra.
As shown in Fig. 2 and Table 3, the loss of Pol IV and Pol V or of all three specialized polymerases did not significantly affect the rate or the spectra of BPSs with the exception of an 85% increase in G:C to T:A transversions in the umuDC dinB mutant strain (P = 0.03 based on simulations). Loss of the polymerases also did not significantly affect the rate or spectra of small indels (Table 5). We conclude that in normally growing cells the specialized DNA polymerases do not contribute detectably to genomic mutations.
Fig. 2.
The spectra of BPSs of DNA repair-defective E. coli strains. See the legend of Fig. 1A for the explanations of bars and error bars. ada ogt, PFM88; alkA tagA, PFM40; umuDC dinB, PFM101; umuDC dinB polB, PFM133; uvrA, PFM35; wild-type, PFM2.
Nucleotide excision repair and TCR.
The nucleotide excision repair (NER) proteins recognize and initiate the repair of a variety of DNA lesions produced by exogenous and endogenous agents (36) as well as misincorporated ribonucleotides (37). NER also can act gratuitously on undamaged DNA (38). TCR preferentially targets NER to the transcribed strand during transcription (16). Loss of NER has been reported to increase mutation rates by as much as sixfold, presumably because of error-prone translesion synthesis (22, 39), and also to decrease mutation rates by about the same amount, presumably by eliminating errors made by DNA polymerase I (Pol I) during gap repair (40).
The UvrA protein initiates the repair process by recognizing DNA damage and recruiting the other required Uvr proteins; thus, a deletion of the uvrA gene inactivates both NER and TCR. As shown in Fig. 2 and Table 3, the rate of BPSs in the uvrA mutant strain did not differ significantly from that in the wild-type strain. The spectrum of BPSs in the uvrA mutant strain also was nearly the same as that in the wild-type strain, differing only by a 67% increase in G:C to T:A transversions and a 30% increase in A:T to G:C transitions (P = 0.03 and 0.05, respectively, based on simulations). As in the wild-type strain, the ratio of BPSs in coding versus noncoding DNA was nearly half that expected (243/73 observed, 269/47 expected; χ2 = 7, P = 0.008) (Table 2); thus, TCR does not contribute to the bias against mutations in coding DNA. Loss of UvrA did not result in a significant change either in the frequency at which any type of base change occurred on the transcribed strand (Table S4) or in the fraction of mutations in highly transcribed genes (Table S5). Thus, we conclude that neither NER nor TCR contributes to the spontaneous BPS rate in normally growing cells.
Table S5.
Number of BPSs in highly expressed genes
| Strain | Description | Total BPSs, obs | BPSs in genes, obs | BPSs in HEGs, obs | Genome*: BPSs in HEGs, exp | Genome*: BPSs in HEGs, P | CDS†: BPSs in HEGs, exp | CDS†: BPSs in HEGs, P |
| All HEGs‡ | ||||||||
| PFM2 | WT | 246 | 189 | 18 | 14 | 0.37 | 13 | 0.18 |
| PFM2m | WT on min | 277 | 197 | 17 | 16 | 0.88 | 13 | 0.38 |
| ED1a | WT | 407 | 315 | 17 | 21 | 0.47 | 19 | 0.77 |
| IAI1 | WT | 422 | 342 | 21 | 24 | 0.63 | 22 | 0.89 |
| PFM101 | umuDC dinB | 269 | 196 | 11 | 15 | 0.32 | 13 | 0.64 |
| PFM133 | umuDC dinB polB | 252 | 202 | 11 | 14 | 0.44 | 14 | 0.57 |
| PFM35 | uvrA | 316 | 243 | 16 | 18 | 0.70 | 16 | 0.98 |
| PFM35§ | uvrA | | | | 23 | 0.24 | 23 | 0.23 |
| PFM40 | alkA tagA | 265 | 199 | 6 | 15 | 0.03 | 13 | 0.06 |
| PFM88 | ada ogt | 250 | 202 | 11 | 14 | 0.45 | 14 | 0.57 |
| PFM22 | nth nei | 209 | 185 | 14 | 12 | 0.66 | 12 | 0.77 |
| PFM180 | xthA nfo | 339 | 267 | 15 | 19 | 0.37 | 18 | 0.55 |
| PFM91 | nfi | 335 | 254 | 14 | 19 | 0.28 | 17 | 0.53 |
| PFM61 | mutT | 2,234 | 1,884 | 112 | 128 | 0.17 | 127 | 0.20 |
| PFM6 | mutY | 485 | 405 | 25 | 28 | 0.66 | 27 | 0.73 |
| PFM94 | mutM mutY | 4,282 | 3,732 | 222 | 246 | 0.14 | 252 | 0.07 |
| HEGs on the leading strand‡ | ||||||||
| PFM2 | WT | 246 | 189 | 11 | 10 | 0.74 | 9 | 0.58 |
| PFM2m | WT on min | 277 | 197 | 11 | 11 | 0.95 | 9 | 0.64 |
| ED1a | WT | 407 | 315 | 10 | 16 | 0.26 | 14 | 0.43 |
| IAI1 | WT | 422 | 342 | 16 | 15 | 0.79 | 14 | 0.67 |
| PFM101 | umuDC dinB | 269 | 196 | 10 | 10 | 0.92 | 9 | 0.06 |
| PFM133 | umuDC dinB polB | 252 | 202 | 8 | 10 | 0.67 | 9 | 0.77 |
| PFM35 | uvrA | 316 | 243 | 10 | 12 | 0.63 | 11 | 0.81 |
| PFM35§ | uvrA | | | | 14 | 0.39 | 14 | 0.39 |
| PFM40 | alkA tagA | 265 | 199 | 3 | 10 | 0.08 | 9 | 0.14 |
| PFM88 | ada ogt | 250 | 202 | 8 | 10 | 0.68 | 9 | 0.77 |
| PFM22 | nth nei | 209 | 185 | 13 | 8 | 0.27 | 8 | 1.03 |
| PFM180 | xthA nfo | 339 | 267 | 12 | 13 | 0.82 | 12 | 0.97 |
| PFM91 | nfi | 335 | 254 | 9 | 13 | 0.39 | 12 | 0.56 |
| PFM61 | mutT | 2,234 | 1,884 | 73 | 87 | 0.27 | 86 | 0.30 |
| PFM6 | mutY | 485 | 405 | 20 | 19 | 0.84 | 18 | 0.80 |
| PFM94 | mutM mutY | 4,282 | 3,732 | 149 | 166 | 0.33 | 170 | 0.23 |
P is the probability that the number of observed and expected BPSs do not differ based on a χ2 test; if any number was ≤5, the Yates correction was applied (70). The names, lengths, and orientations of 255 HEGs in E. coli K12 were obtained from genomes.urv.cat/HEG-DB/. The number of BPSs on the lagging strand can be obtained by subtraction; although there were no statistically significant differences between the observed and expected BPSs on the lagging strand, most of these numbers are too small for meaningful statistical evaluation. CDS, coding sequence; exp, expected; HEG, highly expressed genes; obs, observed.
*The number of BPSs expected in HEGs = (nucleotides in HEGs/nucleotides in genome) × (total number of BPSs observed).
†The number of BPSs expected in HEGs = (nucleotides in HEGs/nucleotides in CDS) × (number of BPSs observed in CDS).
‡For PFM2 the HEGs represent 5.8% of the total genome and 6.7% of the CDS; HEGs on the leading strand represent 3.9% of the genome and 4.6% of the CDS; HEGs on the lagging strand represent 1.9% of the genome and 2.2% of the CDS. Assuming that the same genes are highly expressed in ED1a and IAI1 as in K12, the comparable numbers for ED1a are 5.1%, 6.0%, 3.5%, 4.0%, 1.7%, and 1.9%; for IAI1 the comparable numbers are 5.7%, 6.5%, 3.8%, 4.4%, 1.8%, and 2.1%.
§This row gives the expected values and probabilities that the observed ratio for PFM35 does not differ from that of PFM2.
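As a worked example of the two expectation formulas in the footnotes to Table S5, the sketch below recomputes the expected numbers of BPSs in HEGs for PFM2 from the percentages quoted in those footnotes. The function name `expected_bps_in_hegs` is hypothetical, and using the "BPSs in genes, obs" column as the number of BPSs observed in the CDS is an assumption.

```python
def expected_bps_in_hegs(heg_fraction, bps_observed):
    """Expected BPSs in HEGs = (fraction of nucleotides that lie in HEGs)
    x (number of BPSs observed in the same reference region)."""
    return heg_fraction * bps_observed


# PFM2, all HEGs: 5.8% of the genome and 6.7% of the CDS.
# PFM2 row of Table S5: 246 BPSs in total, 189 BPSs in genes.
genome_based = expected_bps_in_hegs(0.058, 246)
cds_based = expected_bps_in_hegs(0.067, 189)

# PFM2, HEGs on the leading strand: 3.9% of the genome, 4.6% of the CDS.
leading_genome_based = expected_bps_in_hegs(0.039, 246)
leading_cds_based = expected_bps_in_hegs(0.046, 189)

print(round(genome_based), round(cds_based))
print(round(leading_genome_based), round(leading_cds_based))
```

Rounded to whole mutations, these values agree with the expected counts listed for PFM2 in the table.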
Alkylation damage repair.
Two major pathways prevent the lethal and mutagenic consequences of the DNA damage produced by DNA alkylating agents (41). The first pathway is initiated by E. coli’s two 3-alkyladenine glycosylases, AlkA and Tag, which remove 3-alkylpurines from the DNA; AlkA also removes several additional alkylated bases. The resulting abasic sites are repaired by enzymes of the base-excision repair (BER) pathway (see below). The second pathway involves two alkyltransferases, Ada and Ogt, each of which removes the alkyl groups from O6-alkylguanine and O4-alkylthymine, restoring the bases to normality. Ada also is the transcriptional activator of the genes that encode itself, AlkA, and AlkB; AlkB repairs alkylated bases in ssDNA. E. coli strains defective for Ada and Ogt have increased spontaneous mutation rates under starvation conditions (42–44), mostly because of DNA alkylating agents produced by nitrosation of amino acids, peptides, and/or polyamines (42).
In our MA/WGS experiment, deleting both alkA and tagA had little effect on the spontaneous mutation rate (Table 1). Deleting both ada and ogt reduced the mutation rate by a nonsignificant 15% relative to the wild-type strain (t = 1.5, df = 18, P = 0.14) (Table 1). The BPS spectra of both double-mutant strains were the same as that of the wild-type strain (Fig. 2 and Table 3). Thus, under the conditions of our experiment during which cells do not experience prolonged starvation, endogenous DNA alkylation is, at most, a minor contributor to spontaneous mutation. However, DNA Pol IV may bypass endogenous alkyl lesions accurately, possibly mitigating the mutagenic consequence of loss of the alkylation repair enzymes (45).
BER.
BER is a major defense against the lethal and mutagenic consequences of a variety of DNA lesions. It typically is initiated by a glycosylase that removes a specific type of damaged base, leaving an apurinic/apyrimidinic (AP) site. The activities of endo- and/or exonucleases then create a small gap that can be filled in by DNA Pol I. The mutagenic consequences of loss of BER have been documented by numerous previous reports; we chose to investigate the following three pathways that appear prominently in such reports.
Endonuclease III and endonuclease VIII.
Endonuclease III (Nth) and endonuclease VIII (Nei) are glycosylases whose main targets are oxidatively damaged pyrimidines (46, 47), although Nei can remove some oxidatively damaged purines, including 8-oxoG (48). Both Nth and Nei possess lyase activity, cleaving the sugar-phosphate backbone 3′ to the AP site; the sugar then must be removed by other enzymes to create a substrate for DNA Pol I (see below). Previous studies found that loss of either enzyme alone had only modest mutagenic effects, but loss of both was synergistic, particularly for G:C to A:T mutations, which were enhanced 40-fold in an nth nei double-mutant strain. These results established damaged cytosines as the major mutagenic lesions repaired by this pathway (49).
The results of our MA/WGS experiment with a strain deleted for both nth and nei confirmed these results, although the mutational effects were more modest. Relative to the wild-type strain the overall BPS rate in the double-mutant strain was elevated fivefold (t = 10, df = 16, P = 2 × 10−8), and the rate of G:C to A:T transitions was elevated 12-fold (t = 11, df = 12, P = 1 × 10−7) (Fig. 3 and Table 3). The rate of G:C transversions also was increased twofold (P = 0.04, based on simulations), probably because of the occasional insertion of Gs and Ts opposite damaged cytosines by DNA polymerases, as previously observed (50). G:C transitions were significantly biased by the local sequence context, occurring in the context 5′CpG3′/3′GpC5′ nearly twice as frequently as expected from the number of such dimer pairs in the genome (98 observed versus 53 expected; χ2 = 23, P = 2 × 10−6). In contrast, the context 5′CpT3′/3′GpA5′ was strongly disfavored (5 observed versus 36 expected; χ2 = 26, P = 3 × 10−7). These biases suggest that a 3′ G potentiates a C to oxidative damage, whereas a 3′ T is protective. Alternatively, DNA polymerase may be more likely to insert an A opposite a damaged C or to extend from the A:C mispair if it has just made a strong G:C base pair rather than a weak T:A base pair. However, base-stacking must play a role, because 5′CpC3′/3′GpG5′ sequences were not hotspots, and 5′CpA3′/3′GpT5′ sequences were not coldspots. As with other strains with high levels of oxidation-induced mutations (see below), mutations in the nth nei double-mutant strain were not DNA-strand biased (Table S3).
Fig. 3.
The spectra of the six BPSs of DNA repair-defective E. coli strains. See the legend of Fig. 1A for the explanations of bars and error bars. nfi, PFM91; nth nei, PFM22; wild-type, PFM2; xthA nfo, PFM180.
Exonuclease III and endonuclease IV.
AP sites arise from the action of glycosylases, from spontaneous breakage of the glycosylic bond, and by the direct action of some DNA-damaging agents. AP endonucleases cleave the phosphodiester bond 5′ to an AP site or to the damaged sugar left by the lyase activities of DNA glycosylases, leaving a 3′ OH group that is the substrate for DNA Pol I. Exonuclease III (XthA) and endonuclease IV (Nfo) comprise >90% of the AP endonuclease activity in E. coli (reviewed in ref. 51). Based on previous reports (39, 52, 53), we expected a two- to 10-fold increase in the mutation rate in the xthA nfo double-mutant strain. Our results were on the lower end of this range: the BPS rate in the xthA nfo double-mutant strain was elevated 2.6-fold relative to the wild-type strain (t = 7.6, df = 22, P = 1 × 10−7) (Fig. 3 and Table 1). As might be expected from the nonspecific activities of these enzymes, all classes of BPS were equally elevated by the loss of XthA and Nfo (Fig. 3 and Table 3). The rate of indels also was increased threefold (Table 5), but because the actual number of indels was small (36 in the xthA nfo double-mutant strain and 24 in the wild-type strain), this result may not be significant (t = 3.4, df = 2, P = 0.075). Overall, these results indicate that a low level of endogenous DNA damage resulting in base loss either spontaneously or from the action of glycosylases is a constant threat to genomic integrity.
Endonuclease V.
Endogenous and exogenous nitrosating agents oxidatively deaminate A, C, and G, yielding the potentially mutagenic bases hypoxanthine, uracil, xanthine, and oxanine. Repair of these lesions can be initiated by enzymes of the uracil DNA glycosylase family, but a major pathway for the repair of oxidatively deaminated purine bases is initiated by endonuclease V (Nfi). Nfi recognizes the base and nicks the damage-bearing DNA strand; the damaged base then is removed via the exonuclease activity of Nfi or of DNA Pol I (reviewed in ref. 54). In previous studies, loss of Nfi resulted in a twofold increase in the spontaneous mutation rate (55). However, in our MA/WGS experiment the loss of Nfi increased the mutation rate by only 19% relative to the wild-type strain (Table 1), an increase that may not be significant (t = 1.7, df = 19, P = 0.11), and had no effect on the spectrum of BPSs (Fig. 3 and Table 3). These results suggest that under the conditions of our experiments endogenous nitrosating agents are not major contributors to spontaneous mutations.
8-oxoG mutagenesis.
One of the most mutagenic DNA lesions produced by oxidation is 8-oxoG, which base pairs efficiently with A when in the syn instead of the normal anti orientation (56). E. coli has three enzymes that prevent 8-oxoG mutagenesis: MutT, which hydrolyzes 8-oxodGTP, preventing its incorporation into the DNA; MutY, which removes As that are mispaired with 8-oxoG or G; and MutM, which removes 8-oxoG and formamidopyrimidine (Fapy) lesions from the DNA (reviewed in ref. 57). Loss of each of these three enzymes has distinct mutagenic consequences.
MutT.
In the absence of MutT, 8-oxoG is incorporated efficiently into the DNA opposite template A; thus, the loss of MutT yields a high frequency of A:T to C:G transversions (58, 59). The results of our MA/WGS experiment reproduce this old finding; A:T to C:G transversions occurred in the mutT mutant strain at a rate 900-fold greater than in the wild-type strain (t = 33, df = 23, P ∼ 0) (Fig. 4 and Table 3).
Fig. 4.
The spectra of the six BPSs of E. coli strains defective in the prevention of mutations caused by 8-oxoG. See the legend of Fig. 1A for the explanations of bars and error bars. mutM mutY, PFM94; mutT, PFM61; mutY, PFM6; wild-type, PFM2.
Because 8-oxoG inserted opposite template C could pair with A during the next round of replication, loss of MutT might be expected to increase the rate of G:C to T:A transversions. However, in confirmation of previous results (60), we found that the rate of G:C to T:A transversions was not increased in the mutT mutant strain (Table 3), suggesting either that 8-oxoG is rarely inserted opposite template C or that the combined activities of MutY, MutM, and MMR are sufficient to prevent such mutations.
MutY.
Loss of MutY results in G:C to T:A transversions because of the persistence of 8-oxoG:A and G:A mispairs (57). As expected, in our MA/WGS experiment the mutY mutant strain had an 80-fold increase in G:C to T:A transversions relative to the wild-type strain (t = 19, df = 13, P ∼ 0) (Fig. 4 and Table 3). MutY has a fairly wide range of substrates: in addition to removing A mispaired with 8-oxoG and G, MutY can remove A mispaired with 8-oxoA or C (61). If A is the correct base in these mispairs, MutY activity will produce mutations instead of preventing them. Although previous studies have found that loss of MutY reduced the frequency of both A:T to C:G transversions (60) and A:T to G:C transitions (62), our data did not confirm these results (Table 3).
MutY also can remove Gs paired with 8-oxoG, preventing G:C to C:G transversions (63). However, in the MA/WGS experiment the rate of G:C to C:G transversions was elevated in the mutY mutant strain by only a nonsignificant 30% (t = 0.35, df = 3, P = 0.75) (but see below). This difference from the 100-fold increase in the frequency of G:C to C:G transversions found by Zhang et al. (63) probably is explained by the fact that their results were obtained in cells after prolonged incubation on selective medium.
MutM MutY.
Although loss of MutM has been shown to increase mutation frequencies as much as 10-fold (64), in preliminary experiments we failed to detect increases in mutation rates to resistance to rifampicin or to nalidixic acid in a mutM-deletion strain. Because loss of both MutM and MutY is synergistic for G:C to T:A transversions (57), we used a mutM mutY double-deletion strain in our MA/WGS experiment. In confirmation of previous results, the rate of G:C to T:A transversions in the mutM mutY double-mutant strain was 800-fold higher than in the wild-type strain (t = 23, df = 22, P ∼ 0) and 10-fold higher than in the mutY mutant strain (t = 21, df = 31, P ∼ 0) (Fig. 4 and Table 3).
The rate of G:C to C:G transversions in the mutM mutY double-mutant strain was increased sevenfold relative to the wild-type strain (t = 4.2, df = 3, P = 0.02) and fivefold relative to the mutY mutant strain (t = 3.6, df = 2, P = 0.07) (Table 3), suggesting that G:C to C:G mutations are caused by a substrate of MutM that persists in the absence of MutY. An obvious candidate is the 8-oxoG:G mispair, a substrate of both MutM (65) and MutY (66), that, if uncorrected, could create a G:C to C:G transversion in the next round of replication. Alternatively, further oxidation products of 8-oxoG as well as other modified guanines are also MutM substrates (67, 68) and may be responsible for these mutations.
Sequence Context of 8-OxoG Mutagenesis.
Our experiments revealed that the local sequence context strongly influences the frequency of mutations caused by 8-oxoG (Table S7). In the mutT mutant strain, 50% of the A:T to C:G mutations occurred in the sequence 5′ApA3′/3′TpT5′, although, based on the frequency of that dimer pair in the genome, the expected percentage was only 30%. In contrast, only 10% of the A:T to C:G mutations occurred in the context 5′ApC3′/3′TpG5′, although the expected percentage was 22%. There was an eightfold difference in mutagenic potential between the triplet that was most permissive, 5′GpApA3′/3′TpTpC5′, and the triplet that was most restrictive, 5′CpApC3′/3′GpTpG5′ (Table S7). However, A:T to C:G transversions in the mutT mutant strain were not DNA-strand biased, occurring equally often with A templating the leading strand as with A templating the lagging strand (Table S3).
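The expected percentages quoted above depend on how often each dimer pair occurs in the genome. A minimal sketch of that bookkeeping follows, assuming a linear sequence (the circular junction is ignored) and a toy input string; the function names are hypothetical and this is not the authors' pipeline. Scanning both the given strand and its reverse complement counts each dimer pair, such as 5′ApA3′/3′TpT5′, regardless of which strand carries the purine.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_neighbor_fractions(plus_strand, base):
    """Fraction of occurrences of `base` followed (3') by A, C, G, or T,
    counted on both strands of a linear sequence."""
    counts = Counter()
    for strand in (plus_strand, reverse_complement(plus_strand)):
        for i in range(len(strand) - 1):
            if strand[i] == base:
                counts[strand[i + 1]] += 1
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def expected_counts(fractions, n_mutations):
    """Expected mutations per 3' context if context had no effect."""
    return {b: f * n_mutations for b, f in fractions.items()}


# Toy sequence for illustration only; a real analysis would read the genome.
toy_fractions = three_prime_neighbor_fractions("AACAG", "A")
print(toy_fractions)
print(expected_counts(toy_fractions, 30))
```

Comparing such expected counts with the observed counts in each context is what yields the observed/expected ratios discussed in the text.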
Table S7.
Sequence biases of BPSs in strains defective for the prevention of 8-oxoG mutagenesis
| 5′NMN3′/3′NMN5′ | PFM61 = mutT | | | | 5′NMN3′/3′NMN5′ | PFM6 = mutY | | | | PFM94 = mutM mutY | | | |
| | A:T > C:G | | | | | G:C > T:A | | | | G:C > T:A | | | |
| | Obs | Exp | Obs/Exp | P | | Obs | Exp | Obs/Exp | P | Obs | Exp | Obs/Exp | P |
| AAA/TTT | 296 | 213 | 1.39 | 9 × 10−5 | AGA/TCT | 15 | 21 | 0.71 | 0.30 | 213 | 201 | 1.06 | 0.54 |
| CAA/GTT | 152 | 150 | 1.02 | 0.89 | CGA/GCT | 40 | 27 | 1.49 | 0.09 | 273 | 256 | 1.07 | 0.44 |
| GAA/CTT | 378 | 163 | 2.32 | ∼0 | GGA/CCT | 19 | 21 | 0.90 | 0.73 | 270 | 201 | 1.34 | 0.001 |
| TAA/ATT | 291 | 134 | 2.17 | ∼0 | TGA/ACT | 27 | 31 | 0.86 | 0.54 | 305 | 300 | 1.02 | 0.84 |
| | 1,117 | 660 | 1.69 | ∼0 | | 101 | 100 | 1.01 | 0.97 | 1,061 | 958 | 1.11 | 0.009 |
| AAC/TTG | 78 | 161 | 0.48 | 5 × 10−8 | AGC/TCG | 55 | 30 | 1.82 | 0.005 | 461 | 289 | 1.60 | ∼0 |
| CAC/GTG | 36 | 129 | 0.28 | ∼0 | CGC/GCG | 57 | 43 | 1.32 | 0.15 | 522 | 413 | 1.26 | 2 × 10−4 |
| GAC/CTG | 66 | 106 | 0.62 | 0.002 | GGC/CCG | 45 | 35 | 1.29 | 0.23 | 405 | 332 | 1.22 | 0.005 |
| TAC/ATG | 53 | 103 | 0.52 | 5 × 10−5 | TGC/ACG | 42 | 36 | 1.17 | 0.47 | 522 | 343 | 1.52 | 1 × 10−10 |
| | 233 | 499 | 0.47 | ∼0 | | 199 | 144 | 1.38 | 2 × 10−4 | 1,910 | 1,376 | 1.39 | ∼0 |
| AAG/TTC | 113 | 124 | 0.91 | 0.52 | AGG/TCC | 17 | 19 | 0.89 | 0.73 | 206 | 181 | 1.14 | 0.19 |
| CAG/GTC | 138 | 202 | 0.68 | 3 × 10−4 | CGG/GCC | 57 | 33 | 1.74 | 0.007 | 350 | 312 | 1.12 | 0.12 |
| GAG/CTC | 70 | 83 | 0.84 | 0.29 | GGG/CCC | 11 | 18 | 0.61 | 0.19 | 190 | 171 | 1.11 | 0.30 |
| TAG/ATC | 70 | 53 | 1.33 | 0.11 | TGG/ACC | 19 | 32 | 0.59 | 0.06 | 280 | 307 | 0.91 | 0.24 |
| | 391 | 462 | 0.85 | 0.007 | | 104 | 102 | 1.02 | 0.86 | 1,026 | 971 | 1.06 | 0.16 |
| AAT/TTA | 110 | 162 | 0.68 | 0.001 | AGT/TCA | 10 | 19 | 0.53 | 0.10 | 34 | 179 | 0.19 | ∼0 |
| CAT/GTA | 90 | 149 | 0.60 | 8 × 10−5 | CGT/GCA | 13 | 28 | 0.47 | 0.02 | 78 | 262 | 0.30 | ∼0 |
| GAT/CTA | 152 | 169 | 0.90 | 0.37 | GGT/CCA | 10 | 28 | 0.36 | 0.003 | 53 | 267 | 0.20 | ∼0 |
| TAT/ATA | 131 | 124 | 1.06 | 0.64 | TGT/ACA | 6 | 22 | 0.27 | 0.002 | 61 | 210 | 0.29 | ∼0 |
| | 483 | 604 | 0.80 | 2 × 10−5 | | 39 | 96 | 0.40 | 9 × 10−8 | 226 | 918 | 0.25 | ∼0 |
The observed and expected values are for the “+” or reference strand; the values for a trimer and its reverse complement are summed to account for the orientation on each strand. The expected values are the number of mutations of each base multiplied by the fraction of each trimer involving that base in the genome + strand. P is the probability that observed and expected values do not differ based on χ2 calculations; the Yates correction for small numbers was applied where relevant (70). The last P value in each set is for all four comparisons. P ∼ 0 is used to indicate probabilities <10−10. To allow multiple comparisons, the Benjamini–Hochberg procedure (88) was applied with the false-discovery rate set at 0.5%. By this criterion, for the mutT data all P values ≤ 0.002 are significant for the 16 comparisons, and all the P values are significant for the four comparisons of the sums. For the mutY data all P values ≤ 0.007 are significant for the 16 comparisons, and all P values ≤ 2 × 10−4 are significant for the four comparisons of the sums. For the mutM mutY data all P values ≤ 0.005 are significant for the 16 comparisons, and all P values ≤ 0.009 are significant for the four comparisons of the sums. Exp, the expected number of mutations; M, mutated base (mutated bases are in boldface and underlined in the columns); N, any base; Obs, the observed number of mutations.
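The Benjamini–Hochberg step described in this footnote can be sketched in a few lines. The code below is an illustrative implementation of the standard step-up procedure, not the authors' script; the function name is hypothetical, and the inputs are the 16 rounded P values from the mutY columns of Table S7. With these rounded inputs, the cutoff quoted for the mutY data (P ≤ 0.007) is reproduced when the false-discovery rate argument is 0.05, so that value is used here as an assumption.

```python
def benjamini_hochberg_cutoff(p_values, fdr):
    """Standard Benjamini-Hochberg step-up procedure.

    Returns the largest P value declared significant (every P value at or
    below it is significant), or None if nothing passes.
    """
    ranked = sorted(p_values)
    m = len(ranked)
    cutoff = None
    for rank, p in enumerate(ranked, start=1):
        # A P value passes if it is no larger than rank * fdr / m;
        # the largest passing P value sets the cutoff.
        if p <= rank * fdr / m:
            cutoff = p
    return cutoff


# Rounded P values for the 16 trimer comparisons in the mutY strain.
mut_y_p = [0.30, 0.09, 0.73, 0.54,
           0.005, 0.15, 0.23, 0.47,
           0.73, 0.007, 0.19, 0.06,
           0.10, 0.02, 0.003, 0.002]

print(benjamini_hochberg_cutoff(mut_y_p, fdr=0.05))
```

Because the tabulated P values are rounded, a calculation on the unrounded values could differ at the margin.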
In both the mutY and mutM mutY mutant strains 45% of the G:C to T:A mutations occurred in the context 5′GpC3′/3′CpG5′, although the expected percentage was only 33% (Table S7). The context 5′GpT3′/3′CpA5′ was inhibitory; only 9% (mutY) and 5% (mutM mutY) of the G:C to T:A transversions occurred in this context, although the expected percentage was 22% (Table S7). The most permissive triplet was 5′ApGpC3′/3′TpCpG5′, and the most stringent triplets were 5′TpGpT3′/3′ApCpA5′ (mutY) and 5′ApGpT3′/3′TpCpA5′ (mutM mutY), again accounting for an eightfold difference in mutagenic potential in the mutM mutY strain. In contrast to G:C to T:A transversions, G:C to C:G transversions in the mutM mutY mutant strain were biased by the 5′ base: of the 21 G:C to C:G transversions, 16 occurred in the context 5′GpG3′/3′CpC5′, although only five were expected (χ2 = 12, P = 7 × 10−4). Interestingly, 5′GpG3′/3′CpC5′ was identified in a mutM mutant strain as a hotspot for G:C to C:G transversions induced by the oxidizing agent menadione (68). G:C to T:A transversions in mutY and mutM mutY mutant strains were not strand biased (Table S3), but G:C to C:G transversions in the mutM mutY mutant strain were ∼30% more likely to occur with G templating the leading strand than with G templating the lagging strand (16/21 observed vs. 10/21 expected; χ2 = 3.4, P = 0.06).
The local sequence context could affect the probability of a base being damaged or repaired, of errors occurring during replication, of mismatches being corrected, and/or of polymerase extending DNA synthesis from a mismatched base pair. We briefly discuss the influence of sequence context on replication errors below; a more extensive discussion that includes other factors is given in SI Discussion.
In the mutT mutant strain A:T to C:G mutations occur when 8-oxoG is inserted opposite A in the template. Our results suggest that DNA polymerase makes this insertion most readily if it has just inserted a T opposite a template A and least readily if it has just inserted a G opposite a template C. A simple explanation for this bias is that the weaker T:A base pair permits the incoming 8-oxoG to approach in the syn orientation, allowing it to base pair with A (56), whereas the stronger G:C base pair inhibits this orientation. In the mutY and mutM mutY mutant strains, G:C to T:A transversions occur when A is inserted opposite 8-oxoG in the template. The mutational bias observed suggests that DNA polymerase is more likely to insert an A opposite 8-oxoG if it has just inserted a G opposite a template C than if it has just inserted an A opposite a template T. A simple explanation is that template 8-oxoG may be more likely to swing into the syn orientation, or to be stably maintained in that orientation, if the 3′ base pair is the strong C:G rather than the weak T:A. However, our data also indicate that base-stacking must play a role in potentiating or preventing mutations, because the 3′ base had little influence when it was the complement of those indicated above (Table S7).
Comparison of the Results from MA/WGS Analyses and Previous Studies.
As mentioned above, the DNA repair pathways investigated in this report were chosen from published reports that their loss resulted in changes in spontaneous mutation rates or spectra. In general, with the exception of DNA repair activities dealing with oxidative damage, our results did not reproduce these previous findings. In retrospect, this result is not surprising. Most investigations of mutational processes have relied on reporter genes for mutational readout, and these reporter genes may not be representative of the genome as a whole. Indeed, reporter genes may enter into use because they are highly mutable or responsive to particular types of DNA damage. Although special sequences can reveal relatively rare but important mutational processes, such as hotspots, amplifications, complex multiple mutations, and aberrant recombination, such events at these loci may not contribute detectably to the mutational load when the whole genome is the target. Thus, although the MA/WGS method may be blind to some mutational events, our results confirm previous conclusions (1) that the most significant threat to genomic integrity overall is DNA damage by reactive oxygen species.
Conclusions
Our studies of mutation accumulation in E. coli have resulted in the following four conclusions about the determinants of spontaneous mutation:
i) The rates and spectra of spontaneous mutations in three diverged commensal E. coli strains are nearly the same. The rate of BPSs is 2–3 × 10−10 mutations per nucleotide per generation, and the rate of small indels (≤4 nt) is 1/10th of that. BPSs are dominated by G:C to A:T transitions, and the majority of indels occur in mononucleotide runs.
ii) Several major pathways for repairing or tolerating DNA damage have little impact on spontaneous mutation under the conditions of our experiments. These pathways include translesion DNA synthesis, NER, TCR, and repair of alkylation damage. We conclude either that these pathways have not evolved to repair spontaneous DNA damage or that there are sufficient backup or redundant repair activities to compensate for the loss of any one pathway. However, the loss of the major AP endonucleases Xth and Nfo causes a modest two- to threefold increase in all types of mutations, possibly including small indels, suggesting that there is a low level of endogenous DNA damage that results in base loss either spontaneously or from the action of glycosylases.
iii) Oxidative DNA damage is pervasive and highly mutagenic. Loss of Nth and Nei, the glycosylases that remove oxidatively damaged pyrimidines, results in a 12-fold increase in G:C transitions. Loss of the enzymes MutT, MutY, and MutM, all involved in preventing mutations caused by oxidatively damaged purines, increases transversion mutations by nearly 1,000-fold.
iv) Mutations are strongly biased by the local sequence context. This bias was evident in the wild-type bacteria, in which A:T transitions occurred in the context 5′ApC3′/3′TpG5′ twice as frequently as expected, in the nth nei mutant strain, in which G:C transitions occurred in the context 5′CpG3′/3′GpC5′ twice as frequently as expected, and in strains defective in preventing mutagenesis caused by 8-oxoG, in which 5′ApA3′/3′TpT5′ and 5′GpC3′/3′CpG5′ were favored for A:T to C:G and G:C to T:A transversions, respectively.
Materials and Methods
Bacterial Strains.
The bacterial strains used in this study and their methods of construction are given in Table S8. The oligonucleotides used to perform and confirm genetic manipulations are listed in Table S9. Further details are given in SI Materials and Methods.
Table S8.
E. coli strains used in this study
| Strain | Relevant genotype | Donor | Recipient | Target gene | Method | Source |
| PFM2 | MG1655, rph+ | | | | | (5) |
| ED1a | Wild type | | | | | (10) |
| IAI1 | Wild type | | | | | (10) |
| PFM6 | ∆mutY | JW2928 | PFM2 | ∆mutY::KnR | A | This study |
| PFM15 | ∆nth | JW1625 | PFM2 | ∆nth::KnR | A | This study |
| PFM21 | ∆nfo | JW2146 | PFM2 | ∆nfo::KnR | A | This study |
| PFM22 | ∆nth ∆nei | JW4354 | PFM15 | ∆nei::KnR | A | This study |
| PFM32 | ∆alkA | JW2053 | PFM2 | ∆alkA::KnR | A | This study |
| PFM34 | ∆ogt | JW1329 | PFM2 | ∆ogt::KnR | A | This study |
| PFM35 | ∆uvrA | JW4019 | PFM2 | ∆uvrA::KnR | A | This study |
| PFM40 | ∆alkA ∆tagA | JW3518 | PFM32 | ∆tagA::KnR | A | This study |
| PFM50 | ∆umuDC | | PFM2 | umuDC | B | This study |
| PFM61 | ∆mutT | JW097 | PFM2 | ∆mutT::KnR | A | This study |
| PFM88 | ∆ada ∆ogt | | PFM34 | ada | C | This study |
| PFM91 | ∆nfi | JW5547 | PFM2 | ∆nfi::KnR | A | This study |
| PFM94 | ∆mutM ∆mutY | | PFM6 | mutM | C | This study |
| PFM101 | ∆umuDC ∆dinB | | PFM50 | dinB | D | This study |
| PFM133 | ∆umuDC ∆dinB ∆polB | | PFM101 | polB | D | This study |
| PFM180 | ∆xthA ∆nfo | | PFM21 | xthA | D | This study |
Methods: A: P1vir transduction (72) from donor (79), selecting for KnR, followed by FLP recombination to remove the KnR element (80). B: gene replacement using a KnR cassette, followed by FLP recombination to remove the KnR element (80). C: scarless gene deletion using a cat-sacB cassette (82). D: scarless gene deletion using a cat-I-SceI site cassette (81).
Table S9.
Oligonucleotides used in this study
| Name | Sequence |
| umuCD left | 5′–GAACAGACTACTGTATATAAAAACAGTATAACTTCAGGCAGATTATTATGATTCCGGGGATCCGTCGACC–3′ |
| umuCD right | 5′–GCGGGAGCGCTTTTTTCCTGCCGCTATATTTATTTGACCCTCAGTAAATCTGTAGGCTGGAGCTGCTTCG–3′ |
| umuD forward | 5′–GCCGGGTGATATTTTTC–3′ |
| umuC internal | 5′–GCTCTCCCCGTGGATGAC–3′ |
| umuC reverse | 5′–CGATGGCGCCGAGCGTC–3′ |
| dinB left | 5′–CAAACCCTGAAATCACTGTATACTTTACCAGTGTTGAGAGGTGAGCAATGCGCCTTACGCCCCGCCCTGC–3′ |
| dinB right | 5′–CATAATAATGCACACCAGAATATACATAATAGTATACATCATAATCCCAGCTAGACTATATTACCCTGTT–3′ |
| dinB del | 5′–CCCTGAAATCACTGTATACTTTACCAGTGTTGAGAGGTGAGCAATGCGTAAACTGGGATTATGATGTATACTATTATGTATATTCTGGTGTGCATTATTATGAG–3′ |
| dinB del 201–3 F | 5′–GGGCAACCGGCGCATTGAGA–3′ |
| dinB del 201–3 R | 5′–GGAAGCGCGCCTTTATGGGCT–3′ |
| polB left | 5′–AATGGTATCTGGCGAACTCTTTTTTTTGCTCAAAATAGCCCAAGTTGCCCATTCCGGGGATCC–3′ |
| polB right | 5′–ACGAAACCAGGCTATACTCAAGCCTGGTTTTTTGATGGATTTTCAGCGTGTGTAGGCTGGAGCTGCTTCG–3′ |
| polB del | 5′–GCCTGGTTTTTTGATGGATTTTCAGCGTGGCGCAGGCAGGTTTTATCTAAACCCGACACTAGGAAGAAAACCAAAAGCGTGGTCGCCCCTTGCAATATCAGAATCGCGGCACC–3′ |
| polBseq forward | 5′–TGCGTAAGCATGGCGCGAAGG–3′ |
| polBseq reverse | 5′–GCAGGCAATAACCACCCGCTACC–3′ |
| alkA forward | 5′–GCCGTCGCGACAACCGGAAT–3′ |
| alkA reverse | 5′–CATAGCGAACGGGCGGCGAT–3′ |
| tag forward | 5′–TCCCAGCGCCCTTTGGCTACT–3′ |
| tag reverse | 5′–TCGCGGGGAGTTCTGAACGT–3′ |
| ada left | 5′–CGGTTCAGCATCGGCAAACAGATCCAACATTACCTCTCCTCATTTTCAGCTGTGACGGAAGATCACTTCG–3′ |
| ada right | 5′–ATGGTGACCGGGCAGCCTAAAGGCTATCCTTAACCAGGGAGCTGATTATGATCAAAGGGAAAACTGTCCATAT–3′ |
| ada del | 5′–GCGTGATGGTGACCGGGCAGCCTAAAGGCTATCCTTAACCAGGGAGCTGATTATGAAAAAAGAGAGGTAATGTTGGATCTGTTTGCCGATGCTGAACCGTGGCAAGAGCCA–3′ |
| ada forward | 5′–TTGCGTGATGGTGACCGGGC–3′ |
| ada reverse | 5′–CAATTGGCGCGCGCAGATCC–3′ |
| ogt forward | 5′–GGCTGCTGATGTTGCTGGCG–3′ |
| ogt reverse | 5′–CGCCAGACTGAATGCGCCGT–3′ |
| nth forward | 5′–GGTGCAGATGCGCTGTTAGG–3′ |
| nth reverse | 5′–CATTGCGACAGCATCGTGATC–3′ |
| nei forward | 5′–GGAGATGGTGCAACACGG–3′ |
| nei reverse | 5′–ATGACTGCCTACCTCGCC–3′ |
| xth left | 5′–CGAAATTCTGCTACCATCCACGCACTCTTTATCTGAATAAATGGCAGCGACTATGCGCCTTACGCCCCGCCCTGC–3′ |
| xth right | 5′–GGGGGCGTGATCGGACGGTTTTTCCATGCTGCGGATTTCATAGTCGATGCCGGTCTAGACTATATTACCCTGTT–3′ |
| xth del | 5′–TTCTGCTACCATCCACGCACTCTTTATCTGAATAAATGGCAGCGACTATGACCGGCATCGACTATGAAATCCGCAGCATGGAAAAACCGTCCGATCACGCCCCC–3′ |
| nfo forward | 5′–GCGCGCTGAAAAAAGCTGG–3′ |
| nfo reverse | 5′–CATTAAATCCATAATATGCGCCGC–3′ |
| xth forward | 5′–AACAACAGGCGGTAAGCAACGC–3′ |
| xth reverse | 5′–ACAAAGGACGGCAGGCAACAAATC–3′ |
| nfi forward | 5′–AACCTTGGTCACGGCATTCATCAG–3′ |
| nfi reverse | 5′–TACATGCGTTCGCATAAGCAAGCC–3′ |
| mutY forward | 5′–GCAAGCATGATAAGGCCGTG–3′ |
| mutY reverse | 5′–TCAGCATGGTTTGCTTGTGC–3′ |
| mutM left | 5′–TCTGCTTGCCCCCATATTGACTGCATCTGTTCATTCCTGGAGATGCTATGTGTGACGGAAGATCACTTCG–3′ |
| mutM right | 5′–CGGATGGTATGCCATCCGGCGCGCATGAATTACTTCTGGCACTGCCGACAATCAAAGGGAAAACTGTCCATAT–3′ |
| mutM del | 5′–TCTGCTTGCCCCCATATTGACTGCATCTGTTCATTCCTGGAGATGCTATGTGTCGGCAGTGCCAGAAGTAATTCATGCGCGCCGGATGGCATAC–3′ |
| mutM forward | 5′–CACTACGAAGAACAAACG–3′ |
| mutM reverse | 5′–CGCCACTTAATGCCGGAAC–3′ |
| aslB forward | 5′–TATTTCCGCCATCTACCGCC–3′ |
| hemY reverse | 5′–TGAAGCACGGAGAATGGCA–3′ |
| parC forward | 5′–AATTTGGCGCTGGCATTCAG–3′ |
| vgiB reverse | 5′–CGGAAGCGTTTCGCTTTGAT–3′ |
| yeiI forward | 5′–CGCAAAGGCCGGATTAAAGG–3′ |
| rihB reverse | 5′–GGAGAGCTGTTCAGCGACAT–3′ |
| I-SceI forward | 5′–ACGTTAGTTACGCTAGGGATAACAGGGTAATATAG–3′ |
| I-SceI reverse | 5′–GATCCTATATTACCCTGTTATCCCTAGCGTAACTA–3′ |
MA Procedure.
MA lines originated from single colonies of a founder and were propagated through single-colony bottlenecks daily as described (5). Further details are given in SI Materials and Methods.
Estimation of Generations.
The number of generations that each MA line experienced was estimated from colony size as previously described (5). The mean of these values for each strain is reported in Table 1 and was used to calculate mutation rates. Further details are given in SI Materials and Methods.
Genomic DNA Purification, Library Construction, and WGS.
Genomic DNA was purified using the PureLink Genomic DNA purification kit (Invitrogen Corp.). Libraries were constructed either by the Beijing Genome Institute (BGI) or by the Indiana University Center for Genomics and Bioinformatics (CGB). Sequencing was performed either by BGI using the Illumina HiSeq 2000 platform or by the University of New Hampshire Hubbard Center for Genome Studies (HCGS) using the Illumina HiSeq 2500 platform. Further details are given in SI Materials and Methods.
SNP and Short Indel Calling.
Procedures for SNP and indel calling were as described in ref. 5. E. coli K12 strain MG1655 [National Center for Biotechnology Information (NCBI) reference sequence NC_000913.2] was used as the reference genome sequence. Further details are given in SI Materials and Methods. The sequences and mutations reported in this paper have been deposited in the NCBI Sequence Read Archive (BioProject accession no. SRP013707) and in the IUScholarWorks Repository (URI hdl.handle.net/2022/20340).
Mutation Annotation.
Variants were annotated using custom scripts. Protein-coding gene coordinates were obtained from the GenBank page of reference sequence NC_000913.2. BPSs in coding sequences were determined to be synonymous or nonsynonymous based on the genetic code. Nonsynonymous BPSs were designated conservative or nonconservative based on the BLOSUM62 matrix (69), with a score ≥0 considered conservative.
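The classification rule just described can be sketched as follows. This is a minimal illustration and not the custom annotation script itself: the function name is hypothetical, the codon table is the standard genetic code, substitution scores are supplied by the caller, and the handling of changes to or from a stop codon (reported here as a separate class) is an assumption because the text does not specify it.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_bps(codon, position, new_base, scores):
    """Classify a base-pair substitution at `position` (0, 1, or 2) of `codon`.

    `scores` maps amino acid pairs to substitution scores (e.g., BLOSUM62);
    a score >= 0 is treated as conservative, as in the text.
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
    if ref_aa == alt_aa:
        return "synonymous"
    if "*" in (ref_aa, alt_aa):
        # Assumption: stop gains and losses are reported separately.
        return "stop gained or lost"
    score = scores.get((ref_aa, alt_aa), scores.get((alt_aa, ref_aa)))
    if score is None:
        raise KeyError(f"no score for {ref_aa}/{alt_aa}")
    return "conservative" if score >= 0 else "nonconservative"

# Two BLOSUM62 entries included for illustration only; a real analysis
# would load the full matrix.
example_scores = {("D", "E"): 2, ("G", "W"): -2}
print(classify_bps("GAA", 2, "T", example_scores))  # Glu -> Asp
print(classify_bps("GGG", 0, "T", example_scores))  # Gly -> Trp
```

Passing the score table as an argument keeps the sketch independent of any particular library; any source of BLOSUM62 values can be substituted.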
Monte Carlo Simulations.
For each strain a random distribution of BPSs corresponding to the observed mutational spectra was simulated using a custom script; the number of mutations was fixed at the observed numbers, and 1,000 trials were simulated (5).
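The custom simulation script is not described in detail here, so the following is only one plausible reading of the procedure: a fixed number of mutations is placed at random on sites of the appropriate base pair, a statistic is recorded, and the process is repeated for 1,000 trials. The function name, the split into G:C and A:T classes, sampling without replacement, and the `in_region` callback are all assumptions made for illustration.

```python
import random
from statistics import mean, variance

def simulate_bps_placement(genome, n_gc, n_at, in_region, trials=1000, seed=0):
    """Place n_gc mutations on random G/C sites and n_at on random A/T sites.

    Returns the mean and variance, across trials, of the number of simulated
    mutations for which in_region(position) is true. Hypothetical sketch only.
    """
    rng = random.Random(seed)
    gc_sites = [i for i, b in enumerate(genome) if b in "GC"]
    at_sites = [i for i, b in enumerate(genome) if b in "AT"]
    hits = []
    for _ in range(trials):
        # The number of mutations in each class is fixed at the observed count.
        chosen = rng.sample(gc_sites, n_gc) + rng.sample(at_sites, n_at)
        hits.append(sum(1 for i in chosen if in_region(i)))
    return mean(hits), variance(hits)

# Toy genome of 1,000 bp; the "region" is the first half of the sequence.
toy_genome = "GCGCATATAT" * 100
avg, var = simulate_bps_placement(toy_genome, n_gc=20, n_at=10,
                                  in_region=lambda i: i < 500)
print(avg, var)
```

The mean and variance returned by such a simulation are the kind of quantities from which expected values and confidence limits for a class of mutations could be derived.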
Statistical Analyses.
Standard statistical analyses were used (70, 71). Confidence limits (CLs) on the overall rates of BPS and indels were calculated from the mean and variance of the mutations per MA line for each strain. CLs for the different types of mutations were calculated from the mean and variance of 1,000 Monte Carlo simulations for each strain; P values based on these calculations are indicated in the text. Further details are given in SI Materials and Methods.
SI Materials and Methods
Bacterial Strains and Media.
Culture medium for rich conditions was Miller Luria Broth (LB) (Difco; BD), liquid or agar plates, and for minimal conditions was Vogel–Bonner minimal (VB min) 0.2% glucose (72) liquid or agar plates. When required, antibiotics were added at the following concentrations: carbenicillin, 100 µg/mL; kanamycin (Kn), 50 µg/mL; chloramphenicol, 10 µg/mL; nalidixic acid (Nal), 40 µg/mL; rifampicin (Rif), 100 µg/mL; and anhydrotetracycline, 500 µg/mL.
Wild-type E. coli strains ED1a and IAI1 were obtained from E. Denamur (University of Paris Diderot, Paris). ED1a was originally isolated in the 2000s and is in the B2 phylogenetic group of E. coli; IAI1 was originally isolated in the 1980s and is in the B1 phylogenetic group of E. coli (10). Wild-type E. coli K12 strain PFM2 (5) is a prototrophic derivative of the reference strain MG1655 (73, 74), which is a descendant of an E. coli K12 strain isolated in 1922 (75) and is in the A phylogenetic group of E. coli. Our laboratory strain of MG1655, which was obtained from R. Maurer (76), has the same sequence differences from MG1655 (Seq) as the MG1655 isolate deposited in the American Type Culture Collection as ATCC47076 (77). Thus, PFM2 and its derivatives do not have the point mutations in gatC and glp, or the insertion sequence (IS) element insertions in crl and oppA-ychE that recently were found in MG1655 (Seq) (77) and were incorporated into the reference sequence NCBI NC_000913.3. For this reason we continue to use NCBI NC_000913.2 as our reference genome. In addition, our strains do not have an IS element insertion at the flhDC promoter found in many MG1655 isolates (78) but do have an extra insertion of IS186 approximately at nucleotide 1877858.
The mutant E. coli K12 strains used in this study are derivatives of PFM2 and are described in Table S8. Gene deletions were designed to have little or no effect on downstream genes. Many of the gene deletions were moved into PFM2 and derivatives from the Keio collection of deletion mutants (79) by P1vir phage transduction (72); the KnR element was then removed by flippase (FLP) recombination (80), leaving an in-frame scar sequence that encodes a 34-aa peptide. To remove an entire operon (e.g., umuDC), recombineering was used to generate a similar nonpolar deletion. Several strains with multiple deletions were generated. In two strains, PFM22 and PFM40, both deletions were derived from the Keio collection and thus have the same scar sequence; before sequencing, all MA lines derived from these strains were analyzed by PCR to ensure that recombination had not occurred between the scars. All other second or third deletions were generated by scarless recombineering using either I-SceI endonuclease activity (81) or sucrose toxicity (82) for counterselection. To make a KnR–I-SceI site cassette, we linearized pKD13 (80) with AfeI and ligated it to a 39-bp double-stranded oligo containing the I-SceI endonuclease recognition site (5′-GATCCTATATTACCCTGTTATCCCTAGCGTAACTAAGCT-3′). The method used to construct each strain is given in Table S8. Gene deletions were confirmed either by PCR and sequencing or by phenotype (see below). The sequences of the deletion oligos and PCR primers are given in Table S9.
The scarless deletions of the various genes are as follows: ada retains the first three and the last three codons; mutM retains the first and last seven codons; dinB retains the first two and the last seven codons; xthA retains the first and the last 26 codons, thus retaining the ydjXp5 promoter. The polB deletion was designed to retain the hepAp2 promoter in the polB coding sequence. First the entire polB coding sequence in PFM101 was replaced with a KnR–I-SceI cassette amplified from an in-house–modified pKD13 (see above). This cassette was replaced with a 508-bp fragment of polB consisting of the sequence for the first 12 N-terminal codons with engineered stop codons at the eighth and 12th positions followed by the 75 C-terminal codons; the intervening codons are deleted. This fragment was generated by overlap extension PCR (83) with primers polB del and polBseq reverse using PFM2 gDNA as template. The initial fragment was extended upstream of the polB gene with primer polBseq forward, generating the deletion fragment that replaced the KnR–I-SceI cassette.
Phenotypic Tests.
To test for UV sensitivity, cells from overnight liquid cultures were streaked onto an LB agar plate and were exposed to UV light in 10-s intervals; PFM35 (∆uvrA) was seven times more sensitive than PFM2 to UV exposure. Sensitivity to methylating, alkylating, and oxidizing agents was tested using LB gradient plates or by radially streaking cells on an LB agar plate with a central filter disk on which the agent was spotted. For gradient plates, a maximum concentration of 8.8 mM methyl methanesulfonate (MMS) or 0.0013% tert-butyl hydroperoxide (TBHP) was used; for streak tests, 70 μL of 300 mM N-methyl-N′-nitro-N-nitrosoguanidine (MNNG) or 70 μL of 1.17 M (10%) MMS was used (chemicals from Sigma-Aldrich Co.). PFM40 (∆alkA ∆tag) was more sensitive to MMS than was either single-deletion mutant or PFM2; PFM88 (∆ada ∆ogt) was more sensitive to MMS and to MNNG than was either single-deletion mutant or PFM2; PFM180 (∆xth ∆nfo) was more sensitive to TBHP and to MMS than was either single-deletion mutant or PFM2.
Estimation of Mutation Rates by Fluctuation Tests.
Mutation rates to NalR or RifR were estimated using fluctuation tests as described (84). The mutation rate was calculated using the Ma–Sandri–Sarkar maximum likelihood method (85) implemented by the FALCOR web tool found at www.mitochondria.org/protocols/FALCOR.html (86).
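For readers unfamiliar with the Ma–Sandri–Sarkar approach, the sketch below illustrates its core: a recursion for the Luria–Delbrück distribution of mutants per culture, and a search for the expected number of mutations per culture, m, that maximizes the likelihood of the observed mutant counts. It is not the FALCOR implementation; the function names, the golden-section search, and the search bounds are assumptions chosen to keep the example short.

```python
import math

def ld_probabilities(m, n_max):
    """Probabilities of 0..n_max mutants per culture for a given m."""
    p = [math.exp(-m)]
    for r in range(1, n_max + 1):
        # Recursion: p_r = (m / r) * sum_{i<r} p_i / (r - i + 1)
        p.append((m / r) * sum(p[i] / (r - i + 1) for i in range(r)))
    return p

def log_likelihood(m, counts):
    """Log-likelihood of the observed mutant counts for a given m."""
    p = ld_probabilities(m, max(counts))
    return sum(math.log(p[c]) for c in counts)

def mss_mle(counts, lo=1e-6, hi=50.0, tol=1e-6):
    """Golden-section search for the m that maximizes the likelihood.

    Assumes the likelihood has a single maximum between lo and hi.
    """
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if log_likelihood(c, counts) < log_likelihood(d, counts):
            a = c  # the maximum lies to the right of c
        else:
            b = d  # the maximum lies to the left of d
    return (a + b) / 2

# Hypothetical mutant counts from ten parallel cultures.
mutant_counts = [0, 1, 0, 2, 5, 0, 1, 3, 0, 12]
print(mss_mle(mutant_counts))
```

Dividing the estimated m by the final number of cells per culture then gives a mutation rate per cell per generation.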
MA Procedure.
MA lines originated from single colonies isolated on agar plates from a founder (5). For early experiments (with PFM61, PFM94, PFM6, PFM91, PFM35, PFM40, PFM88, and PFM22), founders were generated by inoculating an LB broth culture from freezer stocks, growing overnight, and plating an appropriate dilution on LB agar plates (5). For the experiment on minimal medium (PFM2m), VB min was used instead of LB. For later experiments (with IAI1, ED1a, PFM166, PFM180, PFM101, and PFM133), an inoculum from the freezer stock was streaked onto an LB agar plate and incubated overnight; one well-isolated colony was excised from the agar plate, soaked for 30 min in 0.85% NaCl + 0.01% gelatin, and vortexed for 60 s. Appropriate dilutions then were plated onto LB agar plates to obtain well-isolated colonies to start the MA lines.
Subsequently each MA line was streaked for single colonies each day on an LB agar plate at 37 °C as described (5). For the PFM2m experiment, cells were streaked on VB min agar plates; because cells grow more slowly on minimal medium, these plates were incubated at 37 °C for 2 d. After passage, plates were stored at 4 °C for 2 d. If a well-isolated colony was not available for streaking on a particular day, a new streak was made from the same colony on the stored plate. The number of daily passages required for each MA experiment was determined from the mutation rate estimated by fluctuation analysis. For two strains, PFM35 and PFM40, we underestimated the true mutation rate and stopped the experiment too soon; these lines then were streaked from the frozen stocks made after the last passage, and the experiment continued (see ref. 5 for a further discussion of this procedure).
Estimation of Generations.
The number of generations between passages was estimated from the diameter of the colonies. The number of cells in colonies of different diameters was determined for each of the MA experiments independently using the method previously described (5). The number of generations that gave rise to a colony of a given diameter is the log2 of the number of cells in the colony, which usually was approximately 28 generations. At the end of the experiment daily colony diameters were converted to generations for each line; the mean of these values is reported in Table 1 and was used to calculate mutation rates.
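The arithmetic in this section can be illustrated with a short sketch. The function names and the example numbers are hypothetical; only the log2 relationship and the general form of a per-nucleotide rate (mutations divided by lines × generations × genome size) follow from the text.

```python
import math

def generations_per_passage(cells_in_colony):
    """Generations needed for one cell to give rise to a colony of this size."""
    return math.log2(cells_in_colony)

def mutation_rate(n_mutations, n_lines, generations_per_line, genome_size):
    """Mutations per nucleotide per generation, pooled across MA lines."""
    return n_mutations / (n_lines * generations_per_line * genome_size)

# A colony of 2**28 (about 2.7e8) cells corresponds to 28 generations.
print(generations_per_passage(2 ** 28))

# Hypothetical experiment: 230 BPSs in 50 lines of 5,000 generations each,
# with a genome of 4,639,675 bp (the length of NC_000913.2).
print(mutation_rate(230, 50, 5000, 4_639_675))
```

With these made-up inputs the rate is about 2 × 10−10 per nucleotide per generation, the same order of magnitude as the wild-type rates reported above.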
Genomic DNA Preparation, Library Construction, and WGS.
Genomic DNA was purified using the PureLink Genomic DNA purification kit (Invitrogen) from 0.75 mL of overnight LB cultures inoculated from the freezer stocks made after the last passage of each MA line. DNA concentration and purity were assessed with a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific, Inc.) or an Epoch Microplate Spectrophotometer (BioTek Instruments, Inc.).
When MA experiments were run concurrently, we confirmed the identity of the lines with diagnostic PCR on the gDNA before library construction. For gene-deletion strains the presence of the deletion was checked. The wild-type strains ED1a, IAI1, and PFM2 were confirmed by PCR of three regions that had different lengths in the three backgrounds: the primer pair aslB forward and hemY reverse gave products of 2.7 kb in PFM2, 1.2 kb in IAI1, and 1.7 kb in ED1a; the primer pair parC forward and vgiB reverse gave products of 3.2 kb in PFM2, 1 kb in IAI1, and 2.3 kb in ED1a; and the primer pair yeiI forward and rihB reverse gave products of 2.7 kb in PFM2, 2.6 kb in IAI1, and 1.5 kb in ED1a.
Libraries for PFM6, PFM22, PFM35, PFM40, PFM61, PFM88, PFM91, and PFM94 were made by the BGI and were sequenced using the Illumina HiSeq 2000 platform. Libraries for PFM2m were made by the Indiana University CGB and were sequenced by BGI. Libraries for ED1a, IAI1, PFM101, PFM133, and PFM180 were made by the Indiana University CGB and were sequenced at the University of New Hampshire HCGS using the Illumina HiSeq 2500 platform.
For quality-control purposes, reads with any one of the following characteristics were discarded: (i) ≥10% unreadable bases; (ii) ≥20% low-quality (≤Q20) bases; (iii) adapter contamination (≥15-bp overlap allowing up to 3-bp mismatch); or (iv) duplicate read pairs. After such filtering, ∼100–130× coverage (460–598 Mbp) of paired-end reads (2 × 90 bp) and 55–78× coverage (253–359 Mbp) of paired-end reads (2 × 101 bp) per MA line were retained for lines sequenced by BGI and HCGS, respectively.
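Criteria (i) and (ii) and the coverage figures can be expressed compactly in code. The sketch below is illustrative only and is not the filtering pipeline used by the sequencing centers: the function names are hypothetical, quality strings are assumed to use Phred+33 encoding, and adapter and duplicate screening (criteria iii and iv) are omitted.

```python
def passes_basic_qc(seq, qual, max_n_frac=0.10, max_lowq_frac=0.20,
                    q_cutoff=20, phred_offset=33):
    """Return True if a read passes criteria (i) and (ii) from the text.

    Assumes Phred+33 quality encoding; "N" marks an unreadable base.
    """
    n = len(seq)
    if n == 0 or len(qual) != n:
        return False
    n_frac = seq.upper().count("N") / n
    lowq_frac = sum(1 for ch in qual if ord(ch) - phred_offset <= q_cutoff) / n
    # Reads at or above either threshold are discarded.
    return n_frac < max_n_frac and lowq_frac < max_lowq_frac

def fold_coverage(total_bases, genome_size=4_639_675):
    """Average depth of coverage; the default is the length of NC_000913.2."""
    return total_bases / genome_size

print(passes_basic_qc("ACGT" * 5, "I" * 20))  # all bases at Q40
print(round(fold_coverage(460e6)))            # 460 Mbp of retained reads
```

The second call shows that 460 Mbp of reads corresponds to roughly 99-fold coverage of a 4.64-Mbp genome, in line with the ∼100× lower bound quoted for the BGI libraries.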
SNP and Short Indel Calling.
Procedures for SNP and indel calling were as described (5). E. coli K12 strain MG1655 (NCBI reference sequence NC_000913.2) was used as the reference genome sequence. Illumina reads were aligned to the reference with the Burrows–Wheeler short-read alignment tool, BWA version 0.7.9 (87). Short indels (≤4 bp) were called based on the read mapping of the SNP calling procedures. The sequences and mutations reported in this paper have been deposited in the NCBI Sequence Read Archive (BioProject accession no. SRP013707) and in the IUScholarWorks Repository (URI: hdl.handle.net/2022/20340).
Some MA lines were eliminated because of poor sequence coverage (<30×). Some MA lines carried shared mutations, which arise from mutations that occur during the initial growth of the founder or from cross-contamination during streaking. In our previous analysis of the PFM2 dataset (5), we eliminated any MA line that had shared mutations, but for the analyses reported here we retained such lines and assigned the shared mutations to one of them according to the deduced lineage, if possible, or randomly. Only MA lines with irreconcilable conflicts or with no unique BPS were eliminated. Reanalyzing the PFM2 data by these criteria resulted in two additional MA lines, six BPSs, and three indels (Tables 1, 4, and 5). The net result compared with the previously reported rates (5) is a 5% decrease in the rate for BPSs and, because there are few indels, a 62% increase in the rate for indels.
In a recent paper (13) the PFM2 sequence reads reported in ref. 5 were reanalyzed, and an additional 15 BPSs and seven indels (≤4 nt) were identified. We had eliminated seven of the BPSs because they did not meet our sequence coverage criterion. The remaining eight BPSs identified by Barrick et al. (13) were each shared among two MA lines and had been eliminated in ref. 5; these four unique BPSs are now included in the six additional BPSs in the reanalyzed PFM2 dataset (Table 1). The seven indels identified in ref. 13 also were shared; five MA lines shared one indel, and two MA lines shared the other. These two unique indels have been added to the PFM2 dataset in Tables 4 and 5, as has one indel that had been mistakenly eliminated as shared but was unique: one line had a +C mutation, whereas the other lines had a −C mutation at the same site.
Statistical Analyses.
Standard statistical analyses were used (70, 71). CLs on the overall rates of BPS and indels were calculated from the mean and variance of the mutations per MA line for each strain. Because each strain had a different number of generations and lines, plus different genome sizes for ED1a and IAI1, we used these final mutation rates and variances multiplied by 10^10 in t tests to evaluate differences in mutation rates statistically. For each strain the df was calculated as the number of unique values in the mutations per MA line; the df for comparing two strains was the sum of the two dfs minus 2. For strains with high rates of specific mutations (PFM22, PFM61, PFM6, and PFM94), statistical comparisons could be made using the mean and variance of the mutations per MA line for each strain. For strains with low numbers of the different types of mutations, CLs were calculated from the variance of the 1,000 Monte Carlo simulations described in Materials and Methods; P values based on these calculations are indicated in the text. Where applicable the Benjamini–Hochberg procedure (88) with the false-discovery rate set at 0.5% or 1.0% was applied to evaluate multiple comparisons.
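The Benjamini–Hochberg step-up procedure mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not the authors' analysis code; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k (1-based, P values sorted in
    ascending order) with p_(k) <= (k / m) * fdr, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

# Hypothetical P values from five comparisons, tested at a 1% FDR.
print(benjamini_hochberg([0.0004, 0.003, 0.02, 0.3, 0.9], fdr=0.01))
```

Because the rule is step-up, a P value that misses its own threshold is still rejected when a larger P value at a higher rank meets its threshold.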
SI Discussion
Our experiments revealed that mutations caused by 8-oxoG are strongly sequence dependent (Table S7). In the mutT mutant strain, A:T to C:G transversions were three- to fourfold more likely to occur if an A rather than a C was 3′ to the mutated A, whereas in the mutY and mutM mutY mutant strains, G:C to T:A transversions were three- to eightfold more likely to occur if a C rather than a T was 3′ to the mutated G (Table S7). The local sequence could affect the probability of initial damage, of subsequent repair, of errors occurring during replication, and/or of polymerase extending from the mismatched base pair. We discuss some of these possibilities below.
Because MutT prevents the insertion of 8-oxoG into the DNA, the sequence bias seen in the mutT mutant strain cannot be caused by different susceptibilities to damage. In contrast, because MutM and MutY act on DNA containing 8-oxoG, sequence-specific susceptibility of guanines to oxidation could be important in determining their mutagenic potential. Local sequence contexts have been shown to influence the extent of oxidation of guanines in vitro, but the results are not consistent and do not correlate well with our mutation results. Although NpGpT triplets had low oxidation potential in vitro and also were poorly mutated in our experiments, our most mutated triplets, NpGpC, were reported to be both relatively unreactive (89, 90) and highly reactive (91, 92), depending on the oxidizing agent.
Sequence bias could reflect differential targeting by DNA repair enzymes. In mutT mutant strains MutM activity decreases the frequency of A:T to C:G transversions (60). If the observed sequence biases in the mutT mutant strain were caused by the activity of MutM, the sequence 5′Tp(8-oxoG)3′ should be a poor substrate, and 5′Gp(8-oxoG)3′ should be a good substrate for MutM. However, in an in vitro study, 5′Gp(8-oxoG)3′ was a poor substrate for MutM (93); in contrast, a more recent study concluded that MutM efficiently repairs 8-oxoG with a 5′ C, T, or G (94). Thus, these in vitro results do not support the hypothesis that the sequence specificity we found in the mutT mutant strain was caused by MutM activity. The influence of sequence context on MutY activity, which should increase the frequency of A:T to C:G transversions (57), appears not to be well studied.
If the sequence biases reflect the potential for replication errors, the results from the mutT mutant strain suggest that, in general, DNA polymerase inserts 8-oxoG opposite template A three to four times more frequently if it has just inserted a T opposite a template A than if it has just inserted a G opposite a template C. The simplest explanation for this bias is that the weak T:A base pair allows the incoming 8-oxoG to approach in the syn orientation, allowing it to base pair with A (56), whereas the strong G:C base pair inhibits this orientation. However, stacking interactions with the 8-oxoG must play a role, because T or G 3′ to the mutated A did not have such strong effects. The stacking interactions that would be most informative are the incoming configurations 5′Tp(8-oxoG)3′ and 5′Gp(8-oxoG)3′. In a structural study of dinucleotides, 5′ purines were shown to be well stacked and 5′ pyrimidines poorly stacked with 8-oxoG, suggesting that a 5′ pyrimidine provides the 8-oxoG with flexibility (94); however, in this study there was little difference between the two purines or the two pyrimidines.
In mutY and mutM mutY mutant strains, G:C to T:A transversions occur with 8-oxoG in the template, so if the mutations are caused by replication errors, the mutational bias observed suggests that DNA polymerase is, in general, three to five times more likely to insert an A opposite 8-oxoG if it has just inserted a G opposite template C than if it has just inserted an A opposite template T. A simple explanation is that template 8-oxoG may be more likely to swing into the syn orientation or to be stably maintained in that orientation if the 3′ base pair is the strong C:G rather than the weak T:A. In confirmation, the crystal structure of DNA Pol I with template 8-oxoG suggests that a rigid template favors the syn orientation of 8-oxoG (95). Again, however, our data indicate that base stacking must play a role, because G or A 3′ to the mutated G did not have strong effects.
However, a third determinant of mutagenic potential is the ability of the mutagenic mispair to escape proofreading and be extended. A recent study provided strong evidence that E. coli’s replicase, DNA polymerase III (Pol III), is responsible for inserting 8-oxoG opposite template A in a mutT mutant strain and that Pol III is poor both in correcting the resulting 8-oxoG:A mismatch and in extending from it. Thus, the alternative polymerases, particularly Pol I and Pol IV, are likely to be responsible for extension from the mismatch (96). To explain the sequence bias of the A:T to C:G transversions in the mutT mutant strain, 5′Tp(8-oxoG)3′ paired with template 3′ApA5′ should be well preserved and extended; in contrast, to explain the sequence bias of G:C to T:A transversions in the mutY and mutM mutY mutant strains, the same dimer in the opposite orientation, 5′ApA3′ paired with template 3′Tp(8-oxoG)5′, should be poorly preserved and extended. Resolution of this conundrum awaits further studies of how the alternative polymerases interact with and extend from these sequences.
Acknowledgments
We thank the following current and past members of the P.L.F. laboratory for technical assistance: H. Bedwell, C. Coplen, J. Eagan, J. Ferlman, N. Gruenhagen, J. Healy, N. Ivers, R. Newlon, D. Osiecki, I. Rameses, S. Riffert, D. Simon, K. Smith, B. Souders, K. Storvik, L. Whitson, B. Wojcik, N. Yahaya, and A. Ying Yi Tan; members of the P.L.F., Lynch, and Finkel laboratories for useful discussions; the anonymous reviewers of this paper for valuable suggestions; E. Denamur, R. Gerlach, B. Wanner, and the National BioResource Project-E. coli at the National Institute of Genetics for providing bacterial strains and plasmids. This research was supported by US Army Research Office Multidisciplinary University Research Initiative Award W911NF-09-1-0444 (to P.L.F., H.T., M. Lynch, and S. E. Finkel).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The sequences reported in this paper have been deposited in the National Center for Biotechnology Information Sequence Read Archive (BioProject accession no. SRP013707) and in IUScholarWorks Repository (URI hdl.handle.net/2022/20340).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1512136112/-/DCSupplemental.
References
- 1. Miller JH. Spontaneous mutators in bacteria: Insights into pathways of mutagenesis and repair. Annu Rev Microbiol. 1996;50:625–643. doi: 10.1146/annurev.micro.50.1.625.
- 2. Meier B, et al. C. elegans whole-genome sequencing reveals mutational signatures related to carcinogens and DNA repair deficiency. Genome Res. 2014;24(10):1624–1636. doi: 10.1101/gr.175547.114.
- 3. Mukai T. The genetic structure of natural populations of Drosophila melanogaster. I. Spontaneous mutation rate of polygenes controlling viability. Genetics. 1964;50:1–19. doi: 10.1093/genetics/50.1.1.
- 4. Halligan DL, Keightley PD. Spontaneous mutation accumulation studies in evolutionary genetics. Annu Rev Ecol Evol Syst. 2009;40:151–172.
- 5. Lee H, Popodi E, Tang H, Foster PL. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc Natl Acad Sci USA. 2012;109(41):E2774–E2783. doi: 10.1073/pnas.1210309109.
- 6. Foster PL, Hanson AJ, Lee H, Popodi EM, Tang H. On the mutational topology of the bacterial genome. G3 (Bethesda). 2013;3(3):399–407. doi: 10.1534/g3.112.005355.
- 7. Sung W, et al. Asymmetric context-dependent mutation patterns revealed through mutation-accumulation experiments. Mol Biol Evol. 2015;32(7):1672–1683. doi: 10.1093/molbev/msv055.
- 8. Williams AB. Spontaneous mutation rates come into focus in Escherichia coli. DNA Repair (Amst). 2014;24:73–79. doi: 10.1016/j.dnarep.2014.09.009.
- 9. Drake JW. A constant rate of spontaneous mutation in DNA-based microbes. Proc Natl Acad Sci USA. 1991;88(16):7160–7164. doi: 10.1073/pnas.88.16.7160.
- 10. Touchon M, et al. Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet. 2009;5(1):e1000344. doi: 10.1371/journal.pgen.1000344.
- 11. Coulondre C, Miller JH, Farabaugh PJ, Gilbert W. Molecular basis of base substitution hotspots in Escherichia coli. Nature. 1978;274(5673):775–780. doi: 10.1038/274775a0.
- 12. Loeb LA, Preston BD. Mutagenesis by apurinic/apyrimidinic sites. Annu Rev Genet. 1986;20:201–230. doi: 10.1146/annurev.ge.20.120186.001221.
- 13. Barrick JE, et al. Identifying structural variation in haploid microbial genomes from short-read resequencing data using breseq. BMC Genomics. 2014;15:1039. doi: 10.1186/1471-2164-15-1039.
- 14. Streisinger G, Owen J. Mechanisms of spontaneous and induced frameshift mutation in bacteriophage T4. Genetics. 1985;109(4):633–659. doi: 10.1093/genetics/109.4.633.
- 15. Jinks-Robertson S, Bhagwat AS. Transcription-associated mutagenesis. Annu Rev Genet. 2014;48:341–359. doi: 10.1146/annurev-genet-120213-092015.
- 16. Ganesan A, Spivak G, Hanawalt PC. Transcription-coupled DNA repair in prokaryotes. Prog Mol Biol Transl Sci. 2012;110:25–40. doi: 10.1016/B978-0-12-387665-2.00002-X.
- 17. Merrikh H, Zhang Y, Grossman AD, Wang JD. Replication-transcription conflicts in bacteria. Nat Rev Microbiol. 2012;10(7):449–458. doi: 10.1038/nrmicro2800.
- 18. Chen X, Zhang J. No gene-specific optimization of mutation rate in Escherichia coli. Mol Biol Evol. 2013;30(7):1559–1562. doi: 10.1093/molbev/mst060.
- 19. Fuchs RP, Fujii S. Translesion DNA synthesis and mutagenesis in prokaryotes. Cold Spring Harb Perspect Biol. 2013;5(12):a012682. doi: 10.1101/cshperspect.a012682.
- 20. Curti E, McDonald JP, Mead S, Woodgate R. DNA polymerase switching: Effects on spontaneous mutagenesis in Escherichia coli. Mol Microbiol. 2009;71(2):315–331. doi: 10.1111/j.1365-2958.2008.06526.x.
- 21. Kato T, Shinoura Y. Isolation and characterization of mutants of Escherichia coli deficient in induction of mutations by ultraviolet light. Mol Gen Genet. 1977;156(2):121–131. doi: 10.1007/BF00283484.
- 22. Sargentini NJ, Smith KC. Much of spontaneous mutagenesis in Escherichia coli is due to error-prone DNA repair: Implications for spontaneous carcinogenesis. Carcinogenesis. 1981;2(9):863–872. doi: 10.1093/carcin/2.9.863.
- 23. Bhamre S, Gadea BB, Koyama CA, White SJ, Fowler RG. An aerobic recA-, umuC-dependent pathway of spontaneous base-pair substitution mutagenesis in Escherichia coli. Mutat Res. 2001;473(2):229–247. doi: 10.1016/s0027-5107(00)00155-x.
- 24. Timms AR, Muriel W, Bridges BA. A UmuD,C-dependent pathway for spontaneous G:C to C:G transversions in stationary phase Escherichia coli mut Y. Mutat Res. 1999;435(1):77–80. doi: 10.1016/s0921-8777(99)00035-x.
- 25. Kim SR, Matsui K, Yamada M, Gruz P, Nohmi T. Roles of chromosomal and episomal dinB genes encoding DNA pol IV in targeted and untargeted mutagenesis in Escherichia coli. Mol Genet Genomics. 2001;266(2):207–215. doi: 10.1007/s004380100541.
- 26. Layton JC, Foster PL. Error-prone DNA polymerase IV is controlled by the stress-response sigma factor, RpoS, in Escherichia coli. Mol Microbiol. 2003;50(2):549–561. doi: 10.1046/j.1365-2958.2003.03704.x.
- 27. Foster PL. Adaptive mutation in Escherichia coli. Cold Spring Harb Symp Quant Biol. 2000;65:21–29. doi: 10.1101/sqb.2000.65.21.
- 28. McKenzie GJ, Lee PL, Lombardo M-J, Hastings PJ, Rosenberg SM. SOS mutator DNA polymerase IV functions in adaptive mutation and not adaptive amplification. Mol Cell. 2001;7(3):571–579. doi: 10.1016/s1097-2765(01)00204-0.
- 29. Kuban W, et al. Role of Escherichia coli DNA polymerase IV in in vivo replication fidelity. J Bacteriol. 2004;186(14):4802–4807. doi: 10.1128/JB.186.14.4802-4807.2004.
- 30. Strauss BS, Roberts R, Francis L, Pouryazdanparast P. Role of the dinB gene product in spontaneous mutation in Escherichia coli with an impaired replicative polymerase. J Bacteriol. 2000;182(23):6742–6750. doi: 10.1128/jb.182.23.6742-6750.2000.
- 31. Wolff E, Kim M, Hu K, Yang H, Miller JH. Polymerases leave fingerprints: Analysis of the mutational spectrum in Escherichia coli rpoB to assess the role of polymerase IV in spontaneous mutation. J Bacteriol. 2004;186(9):2900–2905. doi: 10.1128/JB.186.9.2900-2905.2004.
- 32. Foster PL. Are adaptive mutations due to a decline in mismatch repair? The evidence is lacking. Mutat Res. 1999;436(2):179–184. doi: 10.1016/s1383-5742(98)00023-4.
- 33. Banach-Orlowska M, Fijalkowska IJ, Schaaper RM, Jonczyk P. DNA polymerase II as a fidelity factor in chromosomal DNA synthesis in Escherichia coli. Mol Microbiol. 2005;58(1):61–70. doi: 10.1111/j.1365-2958.2005.04805.x.
- 34. Foster PL. Stress-induced mutagenesis in bacteria. Crit Rev Biochem Mol Biol. 2007;42(5):373–397. doi: 10.1080/10409230701648494.
- 35. Nowosielska A, Janion C, Grzesiuk E. Effect of deletion of SOS-induced polymerases, pol II, IV, and V, on spontaneous mutagenesis in Escherichia coli mutD5. Environ Mol Mutagen. 2004;43(4):226–234. doi: 10.1002/em.20019.
- 36. Truglio JJ, Croteau DL, Van Houten B, Kisker C. Prokaryotic nucleotide excision repair: The UvrABC system. Chem Rev. 2006;106(2):233–252. doi: 10.1021/cr040471u.
- 37. Vaisman A, et al. Investigating the mechanisms of ribonucleotide excision repair in Escherichia coli. Mutat Res. 2014;761:21–33. doi: 10.1016/j.mrfmmm.2014.01.005.
- 38. Branum ME, Reardon JT, Sancar A. DNA repair excision nuclease attacks undamaged DNA. A potential source of spontaneous mutations. J Biol Chem. 2001;276(27):25421–25426. doi: 10.1074/jbc.M101032200.
- 39. Foster PL. Escherichia coli strains with multiple DNA repair defects are hyperinduced for the SOS response. J Bacteriol. 1990;172(8):4719–4720. doi: 10.1128/jb.172.8.4719-4720.1990.
- 40. Hasegawa K, Yoshiyama K, Maki H. Spontaneous mutagenesis associated with nucleotide excision repair in Escherichia coli. Genes Cells. 2008;13(5):459–469. doi: 10.1111/j.1365-2443.2008.01185.x.
- 41. Sedgwick B. Repairing DNA-methylation damage. Nat Rev Mol Cell Biol. 2004;5(2):148–157. doi: 10.1038/nrm1312.
- 42. Taverna P, Sedgwick B. Generation of an endogenous DNA-methylating agent by nitrosation in Escherichia coli. J Bacteriol. 1996;178(17):5105–5111. doi: 10.1128/jb.178.17.5105-5111.1996.
- 43. Mackay WJ, Han S, Samson LD. DNA alkylation repair limits spontaneous base substitution mutations in Escherichia coli. J Bacteriol. 1994;176(11):3224–3230. doi: 10.1128/jb.176.11.3224-3230.1994.
- 44. Foster PL, Cairns J. Mechanisms of directed mutation. Genetics. 1992;131(4):783–789. doi: 10.1093/genetics/131.4.783.
- 45. Bjedov I, et al. Stress-induced mutagenesis in bacteria. Science. 2003;300(5624):1404–1409. doi: 10.1126/science.1082240.
- 46. Jiang D, Hatahet Z, Blaisdell JO, Melamede RJ, Wallace SS. Escherichia coli endonuclease VIII: Cloning, sequencing, and overexpression of the nei structural gene and characterization of nei and nei nth mutants. J Bacteriol. 1997;179(11):3773–3782. doi: 10.1128/jb.179.11.3773-3782.1997.
- 47. Saito Y, et al. Characterization of endonuclease III (nth) and endonuclease VIII (nei) mutants of Escherichia coli K-12. J Bacteriol. 1997;179(11):3783–3785. doi: 10.1128/jb.179.11.3783-3785.1997.
- 48. Hazra TK, et al. Characterization of a novel 8-oxoguanine-DNA glycosylase activity in Escherichia coli and identification of the enzyme as endonuclease VIII. J Biol Chem. 2000;275(36):27762–27767. doi: 10.1074/jbc.M004052200.
- 49. Blaisdell JO, Hatahet Z, Wallace SS. A novel role for Escherichia coli endonuclease VIII in prevention of spontaneous G-->T transversions. J Bacteriol. 1999;181(20):6396–6402. doi: 10.1128/jb.181.20.6396-6402.1999.
- 50. Wang D, Kreutzer DA, Essigmann JM. Mutagenicity and repair of oxidative DNA damage: Insights from studies using defined lesions. Mutat Res. 1998;400(1-2):99–115. doi: 10.1016/s0027-5107(98)00066-9.
- 51. Daley JM, Zakaria C, Ramotar D. The endonuclease IV family of apurinic/apyrimidinic endonucleases. Mutat Res. 2010;705(3):217–227. doi: 10.1016/j.mrrev.2010.07.003.
- 52. Cunningham RP, Saporito SM, Spitzer SG, Weiss B. Endonuclease IV (nfo) mutant of Escherichia coli. J Bacteriol. 1986;168(3):1120–1127. doi: 10.1128/jb.168.3.1120-1127.1986.
- 53. Wallace SS. Enzymatic processing of radiation-induced free radical damage in DNA. Radiat Res. 1998;150(5) Suppl:S60–S79.
- 54. Cao W. Endonuclease V: An unusual enzyme for repair of DNA deamination. Cell Mol Life Sci. 2013;70(17):3145–3156. doi: 10.1007/s00018-012-1222-z.
- 55. Guo G, Weiss B. Endonuclease V (nfi) mutant of Escherichia coli K-12. J Bacteriol. 1998;180(1):46–51. doi: 10.1128/jb.180.1.46-51.1998.
- 56. Kouchakdjian M, et al. NMR structural studies of the ionizing radiation adduct 7-hydro-8-oxodeoxyguanosine (8-oxo-7H-dG) opposite deoxyadenosine in a DNA duplex. 8-Oxo-7H-dG(syn).dA(anti) alignment at lesion site. Biochemistry. 1991;30(5):1403–1412. doi: 10.1021/bi00219a034.
- 57. Michaels ML, Miller JH. The GO system protects organisms from the mutagenic effect of the spontaneous lesion 8-hydroxyguanine (7,8-dihydro-8-oxoguanine). J Bacteriol. 1992;174(20):6321–6325. doi: 10.1128/jb.174.20.6321-6325.1992.
- 58. Yanofsky C, Cox EC, Horn V. The unusual mutagenic specificity of an E. coli mutator gene. Proc Natl Acad Sci USA. 1966;55(2):274–281. doi: 10.1073/pnas.55.2.274.
- 59. Maki H, Sekiguchi M. MutT protein specifically hydrolyses a potent mutagenic substrate for DNA synthesis. Nature. 1992;355(6357):273–275. doi: 10.1038/355273a0.
- 60. Fowler RG, et al. Interactions among the Escherichia coli mutT, mutM, and mutY damage prevention pathways. DNA Repair (Amst). 2003;2(2):159–173. doi: 10.1016/s1568-7864(02)00193-3.
- 61. de Oliveira AH, da Silva AE, de Oliveira IM, Henriques JA, Agnez-Lima LF. MutY-glycosylase: An overview on mutagenesis and activities beyond the GO system. Mutat Res. 2014;769:119–131. doi: 10.1016/j.mrfmmm.2014.08.002.
- 62. Kim M, Huang T, Miller JH. Competition between MutY and mismatch repair at A·C mispairs In vivo. J Bacteriol. 2003;185(15):4626–4629. doi: 10.1128/JB.185.15.4626-4629.2003.
- 63. Zhang QM, Ishikawa N, Nakahara T, Yonei S. Escherichia coli MutY protein has a guanine-DNA glycosylase that acts on 7,8-dihydro-8-oxoguanine:guanine mispair to prevent spontaneous G:C-->C:G transversions. Nucleic Acids Res. 1998;26(20):4669–4675. doi: 10.1093/nar/26.20.4669.
- 64. Michaels ML, Cruz C, Grollman AP, Miller JH. Evidence that MutY and MutM combine to prevent mutations by an oxidatively damaged form of guanine in DNA. Proc Natl Acad Sci USA. 1992;89(15):7022–7025. doi: 10.1073/pnas.89.15.7022.
- 65. Tchou J, et al. 8-oxoguanine (8-hydroxyguanine) DNA glycosylase and its substrate specificity. Proc Natl Acad Sci USA. 1991;88(11):4690–4694. doi: 10.1073/pnas.88.11.4690.
- 66. Michaels ML, Tchou J, Grollman AP, Miller JH. A repair system for 8-oxo-7,8-dihydrodeoxyguanine. Biochemistry. 1992;31(45):10964–10968. doi: 10.1021/bi00160a004.
- 67. Prakash A, Doublié S, Wallace SS. The Fpg/Nei family of DNA glycosylases: Substrates, structures, and search for damage. Prog Mol Biol Transl Sci. 2012;110:71–91. doi: 10.1016/B978-0-12-387665-2.00004-3.
- 68. Ono T, Negishi K, Hayatsu H. Spectra of superoxide-induced mutations in the lacI gene of a wild-type and a mutM strain of Escherichia coli K-12. Mutat Res. 1995;326(2):175–183. doi: 10.1016/0027-5107(94)00167-4.
- 69. Henikoff S, Henikoff JG. Protein family classification based on searching a database of blocks. Genomics. 1994;19(1):97–107. doi: 10.1006/geno.1994.1018.
- 70. Zar JH. 1984. Biostatistical Analysis (Prentice Hall, Englewood Cliffs, NJ) 2nd Ed.
- 71. Rice JA. 1995. Mathematical Statistics and Data Analysis (Wadsworth Publishing Company, Belmont, CA) 2nd Ed.
- 72. Miller JH. A Short Course in Bacterial Genetics: A Laboratory Manual and Handbook for Escherichia coli and Related Bacteria. Cold Spring Harbor Lab Press; Cold Spring Harbor, NY: 1992.
- 73. Blattner FR, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277(5331):1453–1462. doi: 10.1126/science.277.5331.1453.
- 74. Riley M, et al. Escherichia coli K-12: A cooperatively developed annotation snapshot--2005. Nucleic Acids Res. 2006;34(1):1–9. doi: 10.1093/nar/gkj405.
- 75. Bachmann BJ. 1996. Derivations and genotypes of some mutant derivatives of Escherichia coli K-12. Escherichia coli and Salmonella Cellular and Molecular Biology, eds Neidhardt FC, et al. (American Society of Microbiology, Washington, D.C.), 2nd Ed.
- 76. Slater SC, Lifsics MR, O’Donnell M, Maurer R. holE, the gene coding for the theta subunit of DNA polymerase III of Escherichia coli: Characterization of a holE mutant and comparison with a dnaQ (epsilon-subunit) mutant. J Bacteriol. 1994;176(3):815–821. doi: 10.1128/jb.176.3.815-821.1994.
- 77. Freddolino PL, Amini S, Tavazoie S. Newly identified genetic variations in common Escherichia coli MG1655 stock cultures. J Bacteriol. 2012;194(2):303–306. doi: 10.1128/jb.06087-11.
- 78. Barker CS, Prüss BM, Matsumura P. Increased motility of Escherichia coli by insertion sequence element integration into the regulatory region of the flhD operon. J Bacteriol. 2004;186(22):7529–7537. doi: 10.1128/JB.186.22.7529-7537.2004.
- 79. Baba T, et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: The Keio collection. Mol Syst Biol. 2006;2:0008. doi: 10.1038/msb4100050.
- 80. Datsenko KA, Wanner BL. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci USA. 2000;97(12):6640–6645. doi: 10.1073/pnas.120163297.
- 81. Blank K, Hensel M, Gerlach RG. Rapid and highly efficient method for scarless mutagenesis within the Salmonella enterica chromosome. PLoS One. 2011;6(1):e15763. doi: 10.1371/journal.pone.0015763.
- 82. Sawitzke JA, et al. Recombineering: In vivo genetic engineering in E. coli, S. enterica, and beyond. Methods Enzymol. 2007;421:171–199. doi: 10.1016/S0076-6879(06)21015-2.
- 83. Heckman KL, Pease LR. Gene splicing and mutagenesis by PCR-driven overlap extension. Nat Protoc. 2007;2(4):924–932. doi: 10.1038/nprot.2007.132.
- 84. Foster PL. Methods for determining spontaneous mutation rates. Methods Enzymol. 2006;409:195–213. doi: 10.1016/S0076-6879(05)09012-9.
- 85. Sarkar S, Ma WT, Sandri GH. On fluctuation analysis: A new, simple and efficient method for computing the expected number of mutants. Genetica. 1992;85(2):173–179. doi: 10.1007/BF00120324.
- 86. Hall BM, Ma CX, Liang P, Singh KK. Fluctuation analysis CalculatOR: A web tool for the determination of mutation rate using Luria-Delbruck fluctuation analysis. Bioinformatics. 2009;25(12):1564–1565. doi: 10.1093/bioinformatics/btp253.
- 87. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–1760. doi: 10.1093/bioinformatics/btp324.
- 88. Benjamini Y, Hochberg Y. Controlling the false discovery rate - a practical and powerful approach to multiple testing. J R Statist Soc B. 1995;57(1):289–300.
- 89. Ming X, et al. Mapping structurally defined guanine oxidation products along DNA duplexes: Influence of local sequence context and endogenous cytosine methylation. J Am Chem Soc. 2014;136(11):4223–4235. doi: 10.1021/ja411636j.
- 90. Margolin Y, Shafirovich V, Geacintov NE, DeMott MS, Dedon PC. DNA sequence context as a determinant of the quantity and chemistry of guanine oxidation produced by hydroxyl radicals and one-electron oxidants. J Biol Chem. 2008;283(51):35569–35578. doi: 10.1074/jbc.M806809200.
- 91. Rodriguez H, Valentine MR, Holmquist GP, Akman SA, Termini J. Mapping of peroxyl radical induced damage on genomic DNA. Biochemistry. 1999;38(50):16578–16588. doi: 10.1021/bi9918994.
- 92. Margolin Y, Cloutier JF, Shafirovich V, Geacintov NE, Dedon PC. Paradoxical hotspots for guanine oxidation by a chemical mediator of inflammation. Nat Chem Biol. 2006;2(7):365–366. doi: 10.1038/nchembio796.
- 93. Hatahet Z, Zhou M, Reha-Krantz LJ, Morrical SW, Wallace SS. In search of a mutational hotspot. Proc Natl Acad Sci USA. 1998;95(15):8556–8561. doi: 10.1073/pnas.95.15.8556.
- 94. Sung RJ, Zhang M, Qi Y, Verdine GL. Sequence-dependent structural variation in DNA undergoing intrahelical inspection by the DNA glycosylase MutM. J Biol Chem. 2012;287(22):18044–18054. doi: 10.1074/jbc.M111.313635.
- 95. Krahn JM, Beard WA, Miller H, Grollman AP, Wilson SH. Structure of DNA polymerase beta with the mutagenic DNA lesion 8-oxodeoxyguanine reveals structural insights into its coding potential. Structure. 2003;11(1):121–127. doi: 10.1016/s0969-2126(02)00930-9.
- 96. Yamada M, et al. Escherichia coli DNA polymerase III is responsible for the high level of spontaneous mutations in mutT strains. Mol Microbiol. 2012;86(6):1364–1375. doi: 10.1111/mmi.12061.




