Significance
Organisms rely on accurate transcription for proper cellular function. Whereas errors incurred during replication are transmitted to subsequent generations, those that occur during transcription are transient and affect only a subset of the encoded proteins. Although transcription errors may increase survival in stressful conditions, the majority of these errors are harmful, and their rates must be minimized. By assessing the transcription errors genome-wide in Escherichia coli and in two bacterial endosymbionts, we discovered that all species had remarkably similar transcription error rates. This conservation is unexpected given that both endosymbiotic species lack orthologs of several E. coli RNA fidelity factors and that lifestyle differences among these species have led to vast differences in their mutation and substitution rates.
Keywords: transcription errors, RNA polymerase fidelity, base substitutions
Abstract
Errors that occur during transcription have received much less attention than the mutations that occur in DNA because transcription errors are not heritable and usually result in a very limited number of altered proteins. However, transcription error rates are typically several orders of magnitude higher than the mutation rate. Also, individual transcripts can be translated multiple times, so a single error can have substantial effects on the pool of proteins. Transcription errors can also contribute to cellular noise, thereby influencing cell survival under stressful conditions, such as starvation or antibiotic stress. Implementing a method that captures transcription errors genome-wide, we measured the rates and spectra of transcription errors in Escherichia coli and in endosymbionts for which mutation and/or substitution rates are greatly elevated over those of E. coli. Under all tested conditions, across all species, and even for different categories of RNA sequences (mRNA and rRNAs), there were no significant differences in rates of transcription errors, which ranged from 2.3 × 10−5 per nucleotide in mRNA of the endosymbiont Buchnera aphidicola to 5.2 × 10−5 per nucleotide in rRNA of the endosymbiont Carsonella ruddii. The similarity of transcription error rates in these bacterial endosymbionts to that in E. coli (4.63 × 10−5 per nucleotide) is all the more surprising given that genomic erosion has resulted in the loss of transcription fidelity factors in both Buchnera and Carsonella.
Among the multiple types of information processing errors, the majority of research has focused on mutations that occur during DNA replication because such errors are heritable and form the basis of evolutionary change. However, errors that occur during transcription and translation can also have substantial effects on gene function by producing misfolded and malfunctioning proteins. The rate of translation errors is typically an order of magnitude higher than the rate of transcription errors (1–6). However, errors occurring during transcription often elicit more dire consequences than those occurring during translation because individual mRNAs can be translated up to 40 times (7, 8), resulting in a burst of flawed proteins. Therefore, a single transcription error can result in many flawed proteins, whereas a translation error will disrupt only a single protein.
Because deleterious transcription errors are not transmitted to subsequent generations, they can occur more frequently than mutations to DNA but still infrequently enough to ensure the cell is not overburdened with faulty proteins. Estimates of the rate of transcription errors in Escherichia coli have been determined in vitro by measuring the misincorporation of radiolabeled nucleotides into repeating dinucleotide tracts (1, 9) and in vivo by quantifying the reversion frequencies of nonsense mutations in lacZ (2, 3). These assays yielded variable estimates of transcription error rates of 10−4–10−5 per nucleotide, several orders of magnitude higher than the mutation rate (10–12). Studies that assay individual loci are often not representative of the genome as a whole because sequence- or genome-specific features, such as base composition (12, 13) or sequence motifs (14), affect the incidence of information processing errors. Moreover, transcription error rate reversion assays based on the recovery of functional proteins might also include translation errors, if these occur at a sufficiently high rate.
RNAseq offers an approach to both disentangle transcription errors from translation errors and provide an error rate for every transcribed gene in a genome. Unfortunately, the high error rates both of cDNA synthesis (3–6 × 10−5 per nucleotide) (15–17) and of high-throughput sequencing technologies (possibly as high as 10−2–10−3 per nucleotide) (18, 19) render the transcription errors obtained by conventional RNAseq indistinguishable from sequencing artifacts. Two recently developed methods offer ways to circumvent these problems by allowing transcription errors to be distinguished from sequencing and cDNA synthesis errors. Through the use of altered library preparation protocols, these methods reduce the overall error rate of RNAseq to less than 10−8 (20) and 10−12 (21, 22) per nucleotide, making it possible to measure error rates across the entire transcriptomes of viruses and other organisms.
In this study, we implement both of these RNAseq-based methods in E. coli to examine whether transcription error rates vary according to growth state and physiological condition, as has been reported for translation error rates (23–26) and for the combined transcription and translation error rate (27). Moreover, we ask whether transcription error rates are increased in the endosymbiotic bacteria Buchnera aphidicola and Carsonella ruddii, species that have lost known transcription fidelity factors and whose mutation rates, substitution rates, and rates of protein sequence evolution are all amplified as a result of genetic drift and the loss of repair enzymes (Fig. S1). We show that transcription error rates are remarkably similar across organisms and across broad categories of RNA, including rRNA, for which the cell is known to selectively degrade malfunctioning molecules (28).
Fig. S1.
Nucleic acid information processing genes that are present in E. coli compared with their retention or loss in B. aphidicola and C. ruddii. Colored circles indicate retention of the corresponding gene; white circles indicate loss of the corresponding gene from the specified genome.
Results
Resource Limitation and Growth Phase Do Not Alter Transcription Error Rates.
We tested the effects of different growth conditions, all of which have been associated with altered mutation rates and/or translation error rates, on rates of transcription errors. Using a deep-sequencing approach to identify errors, we measured the transcription error rate in E. coli when grown under four growth conditions [tryptic soy broth (TSB) complex media or M9 glucose minimal media, each sampled at midlog and at stationary phase]. Note that these errors include both base substitutions during the process of transcription and any damage to the mRNA after transcription. Each of the four conditions was assayed in duplicate, and in total, we detected 2,621 transcription errors, with the number of errors per sample ranging from 156 to 681. In neither of the nutrient sources were there significant differences in transcription error rates between cells harvested at midlog phase and cells harvested 8 h after entering stationary phase (Fig. 1; paired Wilcoxon test, P > 0.30). Similarly, transcription error rates do not differ significantly between nutrient-rich (TSB) and nutrient-poor (M9) growth media (Fig. 1; paired Wilcoxon test, P = 0.3429). Furthermore, there are no significant differences in overall transcription error rates between any pair of individual conditions tested [Fig. 1; two-tailed t tests, t(2) < 2.3, P > 0.14], and the average transcription error rate over all conditions is 4.63 ± 0.34 (SEM) × 10−5 for E. coli mRNA.
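As a minimal sketch of how a per-sample error rate and the reported mean ± SEM could be computed, the following assumes that each sample is summarized by its number of detected errors and the number of consensus bases examined. The function names and the sample values are hypothetical illustrations, not data or code from this study.

```python
from math import sqrt
from statistics import mean, stdev


def error_rate(n_errors, n_bases):
    """Errors per sequenced nucleotide for a single sample."""
    return n_errors / n_bases


def mean_and_sem(rates):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return mean(rates), stdev(rates) / sqrt(len(rates))


# Hypothetical (errors, consensus bases) pairs for two replicates;
# these numbers are for illustration only and are not from the study.
samples = [(200, 5_000_000), (300, 6_000_000)]
rates = [error_rate(errors, bases) for errors, bases in samples]
avg, sem = mean_and_sem(rates)
print(f"{avg:.2e} +/- {sem:.2e} (SEM) errors per nucleotide")
```

Dividing by the number of bases examined, rather than reporting raw error counts, is what makes samples with different sequencing depths comparable.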
Fig. 1.
Frequency of transcription errors in E. coli. Points are color-coded according to growth condition (n = 2 for each condition); horizontal bars represent means of each column. No significant differences in transcription error frequencies were detected between any of the tested parameters.
Distribution of Transcription Errors.
The use of a high-throughput sequencing method to detect transcription errors (as opposed to a reporter-gene method) enables analysis of transcription errors genome-wide as well as the localization of errors to individual sites in each transcript. Starting at the scale of whole genomes, we analyzed the fluctuation in transcription error rates and found that 95% of the measurements made for nonoverlapping 50-kb windows across the entire E. coli genome fall within a roughly threefold range, from 2.3 to 7.2 × 10−5 (Fig. S2). Regions containing highly expressed genes had an increased number of transcription errors (Fig. S3) because higher coverage enables the discovery of more errors than in areas of the genome with low coverage.
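The windowed analysis can be sketched as follows, assuming that error positions are given as 0-based genome coordinates and that the total number of bases examined in each window is already known. Both input structures and the function name are hypothetical; the sketch only illustrates why counts are normalized by coverage before windows are compared.

```python
from collections import defaultdict

WINDOW = 50_000  # window size in bp, matching the 50-kb windows in the text


def windowed_error_rates(error_positions, bases_by_window, window=WINDOW):
    """Return {window index: errors / bases examined} for nonoverlapping windows.

    error_positions: 0-based genome coordinates of detected errors (assumed input)
    bases_by_window: {window index: total bases examined in that window} (assumed input)
    """
    counts = defaultdict(int)
    for pos in error_positions:
        counts[pos // window] += 1
    # Windows without coverage are skipped to avoid dividing by zero.
    return {w: counts[w] / bases for w, bases in bases_by_window.items() if bases > 0}


# Toy input for illustration only.
errors = [10, 49_999, 50_000, 120_000]
coverage = {0: 100_000, 1: 50_000, 2: 40_000}
window_rates = windowed_error_rates(errors, coverage)
```

In this toy input, window 0 has twice as many errors as window 1 but also twice the coverage, so the two windows have the same rate.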
Fig. S2.
Frequency of transcription errors along the E. coli genome. Shaded rectangles represent transcription error rates of all errors over the eight E. coli samples in nonoverlapping 50-kb windows. Horizontal lines represent the genome-wide mean transcription error rate (black) and 2 SDs from the mean (red). Positions of replication origin and terminus are shown.
Fig. S3.
Association between numbers of transcription errors and sequence coverage. Error numbers computed for nonoverlapping 50-kb windows across the E. coli genome in all eight samples.
Transcription proceeds in the direction of DNA replication on the leading strand and in the opposite direction on the lagging strand, in which case there can be collisions between the replication and transcription machineries. Despite an increased likelihood of collision-induced errors on the lagging strand, there is no significant difference in the transcription error rates between genes encoded on the two strands (Wilcoxon test, P > 0.90; Fig. S4). Next, we tested whether adjacent nucleotides affected the occurrence of transcription errors and found that neither a particular preceding nor succeeding nucleotide induced transcription errors. Only when both the preceding and succeeding nucleotides are guanine residues do we observe a significant increase in transcription error frequency (Fisher’s exact test, P < 0.02). Taken together, these results indicate that transcription errors occur without regard for genome location, direction of transcription, or the identity of the vast majority of neighboring nucleotides.
Fig. S4.
Transcription error frequencies in E. coli genes transcribed on the leading or lagging strands. Points are color-coded according to growth condition, and horizontal bars represent means of each column. There is no significant difference between the transcription error frequencies for genes encoded on the two strands (Wilcoxon test, P > 0.90; n = 8).
Biases in E. coli Transcription Errors.
Measuring transcription errors using a sequencing-based approach provides information about the absolute frequencies of each of the possible base substitutions. C→U errors were most common, occurring at a significantly higher frequency than all other transcription errors (Fig. 2A), presumably attributable to high rates of cytosine deamination after the RNA is transcribed. It has previously been reported that transcription errors incur a higher rate of transitions than transversions (20, 29), the same overall pattern that we observe in E. coli (Wilcoxon test, P < 0.05). This trend, however, is driven solely by the high incidence of C→U changes and no longer reaches significance after removing these transitions from the analysis (Wilcoxon test, P > 0.50). Next, we tested the effect of individual nucleotides on the frequency of transcription errors in E. coli and found that G/C→N errors occur at higher frequencies than do A/U→N errors (Wilcoxon test, P < 0.02; Fig. 2B). Additionally, N→A/U errors occur at a significantly higher rate than do N→G/C errors (Fig. 2C; Wilcoxon test, P < 0.02). These effects are not due solely to the high frequency of C→U errors: even after the removal of C→U errors (Methods), G/C→N errors remain significantly more frequent than A/U→N errors (Fig. 2B), and N→A/U errors remain significantly more frequent than N→G/C errors (Fig. 2C).
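The substitution classes compared above can be illustrated with a short sketch that labels a single RNA base substitution as a transition or transversion and records the G/C or A/U class of the original and resulting bases. The function names are hypothetical and are not taken from the analysis code used in this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def classify(ref, alt):
    """Classify one RNA substitution (ref -> alt) into the categories used in the text."""
    if ref == alt or ref not in "ACGU" or alt not in "ACGU":
        raise ValueError("expected two different RNA bases")
    return {
        "type": "transition" if is_transition(ref, alt) else "transversion",
        "from": "G/C" if ref in "GC" else "A/U",  # class of the original base
        "to": "G/C" if alt in "GC" else "A/U",    # class of the resulting base
    }


print(classify("C", "U"))
```

A C→U error falls into the transition, G/C→N, and N→A/U classes at the same time, which is why the comparisons are repeated after C→U errors are removed.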
Fig. 2.
Transcription error frequencies by substitution type in E. coli. Points are color-coded according to growth condition, and horizontal bars represent mean values for each class of base substitution. (A) Transcription error frequencies for individual substitutions. C→U is the most common transcription error, displaying a significantly higher error rate than each of the other substitutions. (B) Effect of base composition (G/C or A/U) on transcription error frequencies. Errors occur at significantly higher frequencies when the original nucleotide is a G or C. Removal of C→U errors from the analysis (right column) demonstrates that the significant effect does not depend on the most abundant type of error. (C) Transcription error frequencies grouped according to base composition G/C or A/U of resulting substitutions. Transcription errors resulting in A or U occur at significantly higher levels than those resulting in G or C. Removal of C→U errors from the analysis (right column) demonstrates that the significant effect does not depend on the most abundant type of error. Comparisons were made by pairwise Wilcoxon tests (n = 8 for each test), subjected to Bonferroni correction: *P < 0.05; **P < 0.01.
Transcription Error Rates in Host-Restricted Bacteria with Reduced Genomes.
The bacterial endosymbionts, B. aphidicola and C. ruddii, harbor small genomes (641 and 160 kb, respectively) and have very high substitution rates, as a consequence of both their lack of several repair mechanisms (Fig. S1) and the reduced efficacy of selection due to their small effective population sizes. These features are also expected to augment rates of transcription errors, so we assayed the transcription error rates in these endosymbionts using methods identical to those used for E. coli. For the replicate samples of B. aphidicola, we detected a total of 169 transcription errors in total mRNA, yielding a transcription error rate of 2.69 ± 0.73 (SEM) × 10−5, which is not significantly different from the rate that we obtained for E. coli mRNA [two-tailed t test, t(8) = 2.527, P > 0.05; Fig. 3A].
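The test statistic for this comparison can be approximately recovered from the reported summary values. The sketch below assumes an equal-variance (pooled) two-sample t test, which the text does not state explicitly, and uses the means, SEMs, and sample sizes reported for E. coli mRNA (n = 8) and B. aphidicola mRNA (n = 2), all in units of 10−5 per nucleotide. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sem1, n1, mean2, sem2, n2):
    """Two-sample t statistic and degrees of freedom from means, SEMs, and sample sizes.

    Assumes a pooled-variance (Student's) t test; this is an assumption,
    not a detail confirmed by the text.
    """
    var1 = sem1 ** 2 * n1  # SEM = SD / sqrt(n), so variance = SEM^2 * n
    var2 = sem2 ** 2 * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Values reported in the text, in units of 10^-5 errors per nucleotide.
t_stat, df = pooled_t(4.63, 0.34, 8, 2.69, 0.73, 2)
print(f"t({df}) = {t_stat:.2f}")
```

Under this assumption, the result agrees with the reported statistic to within the rounding of the published means and SEMs.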
Fig. 3.
Transcription error frequencies in divergent bacterial taxa. Points are color-coded according to growth condition or source of RNA (see Key), and horizontal bars represent means of each column. (A) Transcription error frequencies in E. coli mRNA (n = 8), B. aphidicola mRNA (n = 2), B. aphidicola rRNA (n = 2), and C. ruddii rRNA (n = 1). (B) Transcription error frequencies by substitution type in bacterial endosymbionts. No significant differences were detected for any pairwise comparisons.
Transcription errors in C. ruddii mRNA could not be assigned unequivocally because the C. ruddii RNA was extracted from a natural population of individuals, rendering it difficult to distinguish between transcription errors and the polymorphisms that might be present in the population. Instead, we quantified transcription error rates for 16S and 23S ribosomal RNA in both C. ruddii and B. aphidicola because these operons are present in single copy, have high read-coverage (despite the rRNA removal step), and are not polymorphic within a species. Unlike C. ruddii and B. aphidicola, the E. coli genome possesses multiple polymorphic rRNA operons, making it unfeasible to estimate rRNA transcription error rates in E. coli. We detected a total of 1,014 errors in C. ruddii rRNAs and 4,377 errors in B. aphidicola rRNAs, yielding rRNA transcription error rates of 5.13 × 10−5 for C. ruddii and 3.37 × 10−5 for B. aphidicola (Fig. 3A). Our estimates of bacterial transcription error rates are, in descending order, 5.13 × 10−5 for C. ruddii rRNA, 4.63 × 10−5 for E. coli mRNA, 3.37 × 10−5 for Buchnera rRNA, and 2.69 × 10−5 for Buchnera mRNA. The transcription error rates for B. aphidicola mRNA and rRNA do not differ significantly from one another.
Biases in Endosymbiont Transcription Error Rates.
Assessing the transcription errors occurring in both Buchnera mRNA and rRNA allowed us to determine whether there are any observable differences in the error rates for two RNA substrates, as might be caused by base compositional biases or selection. All possible nucleotide substitutions, as attributable to transcription errors, were detected in both the mRNA and rRNA samples (although one of the B. aphidicola mRNA replicates lacked any A→C changes). There were no significant differences for any of the individual substitution classes between mRNA and rRNA or among any of the individual substitution classes (Fig. 3B).
Effects of Transcription Errors on Protein Sequences.
Given that each transcript can be translated, perhaps multiple times, into protein, we determined which transcription errors result in an amino acid substitution. On average, 68 ± 1.46% (SEM) of transcription errors cause an amino acid substitution in E. coli, whereas 80% of the transcription errors in Buchnera result in amino acid substitutions. If errors were to occur at random over the E. coli transcriptome, the probability of changing an amino acid would be significantly higher than that actually incurred by transcription errors (76% vs. 68%; pairwise Wilcoxon test, P < 0.008).
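The random expectation can be illustrated with a sketch that enumerates every single-base substitution in a coding sequence and counts those that change the encoded amino acid. It assumes the standard genetic code, equal probability for all substitutions at all sites, and that changes to or from a stop codon count as changes. The function name and the test sequences are hypothetical, and the sketch is not the procedure used to obtain the 76% figure.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, in UCAG order with the first codon position varying slowest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def fraction_nonsynonymous(mrna):
    """Fraction of all single-base substitutions that change the encoded residue.

    Only complete codons are considered; the reading frame is assumed to start at index 0.
    """
    changed = total = 0
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        codon = mrna[i:i + 3]
        for pos, alt in product(range(3), BASES):
            if alt == codon[pos]:
                continue
            mutant = codon[:pos] + alt + codon[pos + 1:]
            total += 1
            changed += CODON_TABLE[mutant] != CODON_TABLE[codon]
    if total == 0:
        raise ValueError("sequence contains no complete codon")
    return changed / total


# Glycine codon GGG: the three third-position changes are synonymous.
print(fraction_nonsynonymous("GGG"))
```

For a real transcriptome, such a calculation would also need to weight each site by expression level and each substitution by its observed frequency, which this sketch does not attempt.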
Discussion
Considering the range of variation in replication error rates and in translation error rates both within and among bacterial species, our finding that transcription error rates are similar for different species and for different classes of RNA sequences and under different physiological conditions within a species is bewildering. The mutation (i.e., DNA replication error) rates for bacteria span several orders of magnitude (10); for the specific organisms that we consider, spontaneous mutation rates vary nearly 50-fold, from 8.9 × 10−11 per site per generation in E. coli (10) to 4.0 × 10−9 for Buchnera aphidicola (29). In contrast, based on our genome-wide deep-sequencing approach, the transcription error rates of these two species differ by less than twofold (2.7 × 10−5 vs. 4.6 × 10−5), with E. coli having the slightly higher rate. Our initial prediction was that endosymbionts would have higher transcription error rates because they are subject to high levels of genetic drift and would therefore sustain more deleterious mutations; however, neither of the studied endosymbionts had elevated transcription error rates.
We reasoned that differential regulation of transcription fidelity factors, such as greA (30, 31), greB (31), or dksA (31, 32), operating during transcription, translation, or protein degradation, could provide a mechanism for E. coli to modulate its transcription error rate under various conditions and growth phases. The conservation of transcription error rates among species is all the more surprising given that these endosymbionts lack homologs for several of these transcription fidelity factors (Fig. S1). Endosymbionts possess the most highly reduced bacterial genomes (33), and the genome sizes of Buchnera and Carsonella are only 641 and 160 kb, respectively (34, 35), in contrast to the 4,640-kb genome of E. coli MG1655. Genome reduction in endosymbionts results from elimination of genes that are no longer necessary in the host environment but also involves the loss of apparently beneficial genes, such as those that enhance the efficiency of universal cellular processes, such as DNA repair, translation, and transcription (Fig. S1). The lack of certain DNA repair enzymes in endosymbionts has been implicated in their extreme base compositions and increased mutation rates (36–38); however, loss of multiple RNA fidelity factors, such as greB in Buchnera (Fig. S1) and greA, greB (Fig. S1) and dksA in Carsonella, seems not to affect transcription error rates.
These bacterial endosymbionts are missing transcription fidelity factors, but their transcription error rates are unchanged, suggesting that there may be mutations within RNAP that increase the fidelity of transcription. If there is indeed an optimal transcription error rate across bacteria, selection may have improved the intrinsic error rate in the endosymbiont RNAPs after they lost the transcription fidelity factors. However, neither of the RNAPs of the endosymbionts possesses a mutation known to increase transcription fidelity in E. coli (39). It is possible that endosymbionts do not require rapid transcription and can tolerate slow but accurate transcription (39). The presence of these fidelity factors in E. coli could allow its RNAP to make more errors (which are then corrected), as a result of selection for increased transcription speeds and increased growth rates (40).
Not only were transcription error rates similar in proteobacterial taxa of vastly different lifestyles, population structures, genome sizes, and mutation rates, but the error rates were comparable across organisms for different broad categories of RNAs. Because structural RNAs (16S and 23S rRNAs) persist longer than mRNAs, they can incur more damage (due to oxidative stress or deamination), thereby leading to an increase in our estimates of the error rate for ribosomal RNAs. On the other hand, one might anticipate rRNAs to have lower error rates than mRNAs, because subfunctional molecules would be preferentially targeted for degradation (28), leaving only those rRNAs that do not contain errors. It should be noted that under both scenarios, the error rate during transcription does not change, but rather the variation in the estimated error rates is caused by differences in the fate of rRNAs after transcription. We were only able to measure transcription error rates for both mRNA and rRNA in Buchnera. The average error rate for Buchnera mRNA was slightly lower than for rRNA, but this estimate was based on the detection of many fewer errors, and there is no significant difference between the two categories of RNAs (Fig. 3). It is not possible to measure transcription errors in rRNAs of E. coli or in mRNAs of Carsonella because, in both cases, DNA polymorphisms inherent to the sample prevent recognition of transcription errors; however, the error rates of E. coli mRNA and Carsonella rRNAs differ by less than 10%.
Unlike what we observe for transcription error rates, the mutation rate of an individual strain can vary depending on its growth conditions. E. coli mutation rates have been shown to increase by an order of magnitude during stationary phase and under nutrient-limited conditions (41). Much of the variation in the mutation rate within a species has been attributed to expression of error-prone polymerases during stationary phase (42–44) and to increased chemical damage occurring during the switch from exponential growth to stationary growth (45–47). Such chemical damage to DNA is usually corrected through DNA repair pathways, but because analogous pathways do not exist for RNA, it is potentially more susceptible to this source of damage. That there is no increase in either the rate or spectrum of errors to RNA during stationary phase suggests that other mechanisms compensate for stationary-phase stresses (e.g., dps protein and catalases) (48–50) or that RNA is too short-lived to be significantly affected.
The relative frequencies of each type of transcription error were similar across organisms and across growth conditions (Fig. 2) and correspond to what is observed for spontaneous mutations in these organisms (i.e., that C→U substitutions constitute the most common class of errors, and that A/T→T/A and G/C→C/G transversions occur at some of the lowest frequencies) (12, 33, 51–53). Cytosine is the most unstable nucleobase and has an even higher rate of deamination to uracil when nucleic acids are in a single-stranded state (54), so the pronounced bias toward this error is expected. Therefore, some of the observed transcription errors appear to be due to damage to RNA, although current methods simply enumerate errors and do not discriminate between those caused by base misincorporations occurring during transcription and by damage to the RNA after transcription. Nonetheless, chemical damage occurring after transcription is biologically relevant because ribosomes can still translate the damaged base.
Many of the initial measurements of transcription errors in bacteria were restricted to single reporter genes and assayed the combined effects of transcription and translation errors by assessing how frequently functional proteins were produced from a mutant gene (2, 3, 14). These assays considered errors in translation to be relatively rare because, in this system, it was thought that only the first ribosome on a transcript would be capable of mistranslation and that most errors could be ascribed to the process of transcription (2, 3). However, translation errors in E. coli can occur at rates between 10−3 and 10−4 per codon (4–6), suggesting that many of the original measurements of transcription errors are confounded by the inclusion of translation errors. Furthermore, the error rates varied by up to an order of magnitude for different stop codons (3), indicating that these fluctuations may be attributed to different translation error rates for different codons (55); therefore, the rates derived from these studies require validation by methods that exclusively consider transcription errors.
Previous studies reported that the combined transcription/translation error rate, as inferred from the frequency of errors in protein sequences, increases both in stationary phase and under starvation conditions (4, 6, 27). Because we detected no differences in transcription error rates between these different growth conditions, we reason that this variation manifests during translation and is most likely caused by tRNA scarcity during stationary phase (6, 55, 56). Although decreases in ribonucleotide concentration also occur during stationary phase (57), this has little effect on the overall fidelity of gene expression. Decreases in ribonucleotide concentration have been shown to increase the frequency of transcriptional pausing (58), which is closely associated with base misincorporations during transcription (39, 59, 60), so it seems that either (i) ribonucleotide concentration does not decrease enough under our experimental conditions to significantly alter the transcription error rate or (ii) ribonucleotide concentration-induced pausing does not result from transcription errors. Nonetheless, it is curious that cellular growth conditions modify both the rate of DNA mutations and the rate of protein translation errors but not the transcription error rate.
Rates of translation errors have been estimated as being at least an order of magnitude higher than rates of transcription errors, but because most transcripts are translated multiple times, the realized number of modified proteins originating from transcription errors will equal or exceed the number caused by translation errors. This amplification of individual transcription errors into multiple proteins is likely to account for the reduction of transcription vs. translation error rates (10−5 vs. 10−4).
It has been suggested that errors in proteins, as caused by transcription and translation errors, contribute to survivability in the face of external stresses by the production of novel proteins or metabolites (27, 61, 62) or by inducing the general stress response (63). Such effects could not be accomplished through genomic mutations because such mutations can incur permanent decrements to fitness after the stress is removed. Although transcription errors can increase cellular noise and confer a benefit under certain temporary conditions, most variation introduced by errors will not be advantageous. Thus, the predominant direction of selection is to lower error rates because too many errors will overload the proteome with deleterious proteins. Whether or not the above argument is tenable, our findings, which show a remarkable consistency of transcription error rates across ecologically diverse bacterial species, across different RNA categories, and under a variety of stress and nonstress growth conditions, indicate that transcription errors would contribute very little to such transient protein errors. Transcription is a much less accurate process than DNA replication, and because transcription errors are not heritable (and the vast majority of RNAs are transcribed faithfully under any set of conditions), there appears to be little selection to modulate the overall transcription error rate.
Methods
Strains and Growth Conditions.
Transcription errors were enumerated for E. coli MG1655 grown at 37 °C in (i) 15 g/L TSB or (ii) M9 minimal media supplemented with 0.4% glucose. Bacterial cultures were preconditioned in either TSB or M9 minimal media for 24 h before inoculation for sampling. Overnight cultures were diluted to OD600 = 0.05 into fresh media and sampled at midlog phase (4 h for TSB; 6 h for M9) and stationary phase (18 h for TSB; 24 h for M9).
Transcription errors were enumerated for B. aphidicola, an insect endosymbiont recovered directly from its aphid host, Acyrthosiphon pisum. B. aphidicola were isolated from 5 g adult aphids by a membrane filtration method (64) as follows: aphids were crushed by mortar and pestle in 15 mL buffer A (25 mM KCl, 35 mM Tris⋅HCl, 100 mM EDTA, and 250 mM sucrose, pH 8.0) at 4 °C, and the homogenate was centrifuged at 1,500 × g for 15 min. Pellets were resuspended in 15 mL buffer A and passed serially through 100-, 20-, 8-, and 5-µm filters. B. aphidicola cells were recovered from the filtrate by centrifugation. Transcription errors occurring in C. ruddii, another insect endosymbiont, were determined from a pooled sample of bacteriocytes from 200 dissected larvae of the psyllid Pachypsylla venusta collected locally from galls present on a hackberry tree. Bacteriocytes were stored in buffer A at –20 °C before RNA extraction.
RNA Extractions.
RNA was extracted from E. coli following the RNAsnap protocol for gram-negative bacteria (65). Roughly 10^8 bacterial cells were harvested by centrifugation at 16,000 × g for 30 s, the supernatant was removed by aspiration, and pelleted cells were immediately transferred to liquid nitrogen to halt transcription. Samples were transferred to ice, mixed with 100 µL RNAsnap solution [18 mM EDTA, 0.025% SDS, 1% 2-mercaptoethanol, and 95% (vol/vol) formamide], briefly vortexed, and incubated for 7 min at 95 °C. Following incubation, samples were centrifuged at 16,000 × g for 5 min. Supernatants were mixed with an equal volume of PCI (phenol/chloroform/isoamyl alcohol, 25:24:1), the aqueous phase was removed and treated with an equal volume of chloroform, and RNA was precipitated by addition of 1/10 volume 3 M sodium acetate, 1/50 volume 50 mg/mL glycogen, and 3 volumes of 100% ethanol. DNA contamination was tested using a Qubit high sensitivity DNA assay (Life Technologies), and RNA quality was assessed on an Agilent Bioanalyzer. Ribosomal RNAs were removed from total RNA preparation using the MICROBExpress kit (Life Technologies).
RNA was extracted from B. aphidicola and C. ruddii by the addition of 0.75 mL TRIzol reagent (Life Technologies) to 0.25 mL harvested cells (or bacteriocytes in the case of C. ruddii). Samples were mixed with 0.5 mL sterile zirconium beads, vortexed for 2 min to disrupt cells, and incubated for 5 min at 20 °C. Following a chloroform extraction, nucleic acids were precipitated from the aqueous phase by the addition of 1/10 volume 3 M sodium acetate, 1/50 volume 50 mg/mL glycogen, and an equal volume of 100% isopropyl alcohol. Precipitated nucleic acids were washed twice with 70% (vol/vol) ethanol, suspended in 50 µL RNase-free dH2O, and treated with DNase, according to the supplier’s specifications (Promega). Reactions were terminated by the addition of an equal volume of PCI, and total RNA was precipitated, quantified, extracted, tested for purity, and cleared of ribosomal RNAs as described above.
Library Preparation and Sequencing.
We applied two library preparation procedures that have been reported to differentiate errors that occur during transcription from those that arise during sequencing (20–22). Both methods aim to produce multiple cDNA copies of each mRNA and identify consensus errors, which represent those that are actually present in the corresponding mRNA template. The first method involves successive rounds of sequencing streptavidin-captured mRNAs (20) to generate the multiple cDNA copies of each mRNA, and the second method (termed CircSeq) (21, 22) is based on the sequencing of short, circularized fragments of mRNA that are copied multiple times by rolling-circle amplification before sequencing. Attempts at the original streptavidin-capture method of Gout et al. (20) failed to generate multiple copies of cDNA from each mRNA, and even after consulting with the authors and applying several suggested additions and modifications to the published protocol, we concluded that this method, as currently described, cannot be used to estimate transcription error rates.
For the CircSeq procedure, we followed the protocol of Acevedo and Andino (22) with the following modifications that reduced the total number of steps. Starting with 1 μg purified mRNA, samples were fragmented with the NEB Magnesium Fragmentation module at 94 °C for 5 min and then assayed by denaturing PAGE. Regions of the gel containing RNA fragments in the 80- to 100-nt size range were excised, and RNA was eluted from crushed gel slices by overnight incubation in a solution containing 600 mM sodium acetate, 0.017% (wt/vol) SDS, and 1.67 mM EDTA at 4 °C. RNA was recovered from the eluent by ethanol precipitation, washed in 70% (vol/vol) EtOH, resuspended in 14 µL ddH2O, and analyzed for quality on an Agilent Bioanalyzer RNA chip. RNA fragments were circularized by incubating the entire sample volume with 1 µL T4 polynucleotide kinase (NEB), 1 µL T4 RNA ligase I (NEB), 2 µL T4 RNA ligase buffer (NEB), and 2 µL 10 mM ATP for 30 min at 37 °C. Samples were purified by PCI extraction and ethanol precipitation, and libraries were prepared for Illumina sequencing by following the protocol accompanying the NEBNext Ultra RNA Library Prep Kit through completion of the second strand synthesis step. After this step, samples were repurified by PCI extraction and ethanol precipitation and analyzed with an Agilent Bioanalyzer RNA chip to determine the extent of rolling circle amplification, which occurred during the cDNA synthesis step of the NEB protocol. After confirming amplification status, ddH2O was added to a final volume of 200 µL, and samples were subjected to 12 min of pulsed sonication (15 s on, 15 s off, amplitude 20%) in a Qsonica sonicator to obtain fragments for sequencing. After harvesting nucleic acids by EtOH precipitation, we resumed the NEBNext Ultra RNA Library Prep Kit protocol for a target insert size of 300 bp.
Samples were barcoded using NEBNext Multiplex Oligos (Index Primers Set 1), and the resulting libraries were sequenced on an Illumina MiSeq using 300-nt reads. Sequencing files were discriminated based on their identifying barcodes and analyzed using the CirSeq_v2 pipeline (21).
Data Processing and Analysis.
After the sequences were processed by the CirSeq_v2 pipeline with an average quality score cutoff of 20 (Fig. S5 and SI Methods), we removed those duplicate and multicopy genes that are polymorphic within the E. coli genome (e.g., structural RNA genes, ompF and ompC, and tufA and tufB) because the source of variation cannot be unequivocally assigned. Transcription error rates were adjusted for base composition of the sample using the weighted average of the occurrence of each nucleotide in the particular individual transcriptome being considered.
Fig. S5.
Effect of sequencing errors and data quality on the estimation of transcription error frequencies. Transcription error frequencies for the combined E. coli replicates were calculated at increasing average base quality scores between 10 and 40 to demonstrate the effect of sequencing errors and low-quality bases on error frequencies. Overall transcription error frequency (A) and the transcription error frequency for each nucleotide substitution (B) level off in the quality-score range of 18–20, indicating that use of data in this range and beyond excludes sequencing artifacts from estimates of transcription error rates. There were insufficient bases in the transcriptome that attained average quality scores >38 for inclusion in this analysis.
We developed custom Python scripts to determine the following: (i) transcription errors, calculated by tabulating the total number of errors identified by the CirSeq_v2 pipeline within the protein coding regions of the genome (SI Methods); (ii) nucleotide coverage, calculated by adding the overall coverage of each base within the protein coding regions of the genome; (iii) error rates, calculated by tabulating the total number of errors and base coverage of all coding regions within 50-kb nonoverlapping windows across an entire genome and dividing the number of errors by the coverage, yielding an error rate (SI Methods); (iv) leading/lagging strand error rates, calculated by tabulating the errors and coverage of all genes situated on either the leading or lagging strands and calculating the error rate as above; and (v) the number of errors that would result in an amino acid replacement by chance, calculated by randomly generating simulated transcription errors from each sequenced transcriptome and determining their effects on the amino acid sequence. All statistical analyses were performed in GraphPad Prism or R.
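The windowed calculation in item (iii) can be illustrated with a minimal sketch. This is not the authors' script: the function name `windowed_error_rates`, the input layout (a list of error positions and a position-to-depth dictionary), and the 0-based coordinates are assumptions made for illustration only.

```python
from collections import defaultdict

WINDOW = 50_000  # 50-kb nonoverlapping windows, as stated in the text


def windowed_error_rates(error_positions, coverage, window=WINDOW):
    """Return {window_index: errors / coverage} for each covered window.

    error_positions: iterable of 0-based genome positions, one entry per
        observed error (hypothetical input format).
    coverage: dict mapping 0-based genome position -> number of times
        that base was sequenced (hypothetical input format).
    """
    errors_per_window = defaultdict(int)
    coverage_per_window = defaultdict(int)

    # Tabulate errors by the window that contains them.
    for pos in error_positions:
        errors_per_window[pos // window] += 1

    # Sum the per-base coverage within each window.
    for pos, depth in coverage.items():
        coverage_per_window[pos // window] += depth

    # Error rate = number of errors divided by total base coverage.
    return {
        w: errors_per_window.get(w, 0) / cov
        for w, cov in coverage_per_window.items()
        if cov > 0
    }
```

The same function could be reused for the leading/lagging comparison in item (iv) by passing only the positions and coverage of genes on one strand and using a window larger than the genome, although that reuse is also an assumption rather than the published procedure.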
The list of nucleic acid information processing genes and the associated functions were curated using EcoCyc (66). Orthologs of these genes in the endosymbionts were determined using BLASTP from the National Center for Biotechnology Information (NCBI) (blast.ncbi.nlm.nih.gov/Blast.cgi) with an E-value cutoff of ≤1 and an amino acid-positive score cutoff of ≥40%. The genomes used in this study, accessed through NCBI, have the accession numbers NZ_ACFK01000001 for B. aphidicola LSR1 and NC_008512 for C. ruddii PV.
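The two ortholog cutoffs can be applied programmatically, as in the sketch below. It assumes the BLASTP results are available as tab-separated lines with the columns query ID, subject ID, E-value, and percentage of positive-scoring matches (the layout produced by the BLAST+ option `-outfmt "6 qseqid sseqid evalue ppos"`). The study itself used the NCBI web interface, so this file layout, the function name `filter_hits`, and the example identifiers are illustrative assumptions.

```python
def filter_hits(lines, max_evalue=1.0, min_positive_pct=40.0):
    """Keep hits with E-value <= max_evalue and positives >= min_positive_pct.

    lines: iterable of tab-separated strings with the assumed columns
        qseqid, sseqid, evalue, ppos.
    Returns a list of (query, subject) pairs that pass both cutoffs.
    """
    hits = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        query, subject, evalue, ppos = line.rstrip("\n").split("\t")[:4]
        if float(evalue) <= max_evalue and float(ppos) >= min_positive_pct:
            hits.append((query, subject))
    return hits


# Hypothetical example rows (identifiers and values are invented).
example = [
    "geneA\tsubject1\t1e-30\t72.5",   # passes both cutoffs
    "geneB\tsubject2\t0.5\t35.0",     # fails the positives cutoff
    "geneC\tsubject3\t4.2\t55.0",     # fails the E-value cutoff
]
passing = filter_hits(example)
```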
SI Methods
Transcription error rates were calculated by recovering all errors in the output file processed by CirSeq_v2. This pipeline is described in detail elsewhere (22), but briefly, repeats within each read were identified by CirSeq_v2 and aligned to obtain a consensus sequence if the read contained at least three full repeats of 100 bp or less. Any read that failed to meet this criterion was discarded. Because each base within each repeat is assigned a different quality score, a single quality score representative of the consensus sequence at each base was calculated as the average quality score of the three bases, one from each repeat, at that location. Reads were then mapped to their respective reference genome using bowtie2, and errors were identified as those bases within reads that did not match the reference genome. Only bases that had an average quality score of 20 or higher (Fig. S4 and below) were used. Overall per base coverage was calculated as the sum of the total coverage of each base, and overall error rates were calculated by dividing the number of errors by the overall per base coverage. The error rate for each type of nucleotide substitution, with A→C as an example, was calculated as above except that the error rate was adjusted for the base composition of the sequenced RNA: the base-adjusted error rate was obtained from the error rate for A→C errors and an adjustment coefficient for base composition.
The adjustment coefficient was calculated from A, the fraction of overall adenosine nucleotides sequenced in the transcriptome. This calculation normalizes the error rate of A→C errors by any base compositional biases in the transcriptome. This error rate is presented in the context of the entire transcriptome (i.e., not within the context of all sequenced adenosine locations).
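The consensus and quality-averaging steps described above can be sketched as follows. This is a simplified illustration and not the CirSeq_v2 implementation: the majority-vote consensus, the masking of low-quality positions with `N`, and the function names are assumptions. Base-composition adjustment is not included.

```python
from collections import Counter


def consensus(repeats, quals, min_repeats=3, min_avg_q=20):
    """Build a consensus from aligned repeats of one read.

    repeats: list of equal-length strings, one per repeat.
    quals: list of equal-length lists of Phred scores, one per repeat.
    Returns None if the read has fewer than min_repeats repeats.
    Positions whose average quality is below min_avg_q become 'N'.
    """
    if len(repeats) < min_repeats:
        return None  # reads without enough repeats are discarded
    out = []
    for bases, scores in zip(zip(*repeats), zip(*quals)):
        base, _ = Counter(bases).most_common(1)[0]  # majority vote (assumption)
        avg_q = sum(scores) / len(scores)           # average quality per position
        out.append(base if avg_q >= min_avg_q else "N")
    return "".join(out)


def count_errors(consensus_seq, reference):
    """Return (errors, covered_bases), ignoring masked 'N' positions."""
    covered = [(c, r) for c, r in zip(consensus_seq, reference) if c != "N"]
    errors = sum(c != r for c, r in covered)
    return errors, len(covered)


# Toy read with three repeats; the last position differs from the reference.
cons = consensus(
    ["ACGA", "ACGA", "ACTA"],
    [[30, 30, 30, 30], [30, 30, 30, 30], [30, 30, 30, 12]],
)
errors, covered = count_errors(cons, "ACGT")
```

Summing `errors` and `covered` over all reads and dividing the first by the second gives the overall error rate as defined in the text.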
To ensure that sequencing errors did not influence our results, we first analyzed the original sequence data including all bases with an average quality score of 10 or higher and then sequentially increased the stringency by raising the quality score cutoff (Fig. S5), which allowed us to determine the influence of sequencing errors at each quality score. Transcription error rates asymptote in the quality score range of 18–20 (Fig. S5), reflecting the point at which sequencing errors are removed from the analysis. We therefore selected a quality score cutoff of 20 for all analyses, a value that maximizes the number of actual errors retained and provides accurate measures of transcription error rates.
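The cutoff sweep can be sketched in a few lines. The input format (one `(average_quality, is_error)` pair per consensus base), the function name, and the toy values are assumptions for illustration and do not reproduce the data behind Fig. S5.

```python
def error_rate_by_cutoff(calls, cutoffs=range(10, 41, 2)):
    """Recompute the error rate at each minimum average-quality cutoff.

    calls: list of (average_quality, is_error) pairs, one per consensus base.
    Returns {cutoff: error_rate} for cutoffs that retain at least one base.
    """
    rates = {}
    for q in cutoffs:
        kept = [is_error for avg_q, is_error in calls if avg_q >= q]
        if kept:
            rates[q] = sum(kept) / len(kept)
    return rates


# Toy data: low-quality bases carry extra (sequencing) errors.
toy_calls = [(12, True), (15, True), (25, False),
             (30, False), (35, True), (38, False)]
toy_rates = error_rate_by_cutoff(toy_calls, cutoffs=[10, 20])
```

In real data, the cutoff at which the returned rates stop decreasing marks the point where further filtering no longer removes sequencing artifacts.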
Acknowledgments
We thank Dianne Lou and Sandie Shan for technical assistance, Ashley Acevedo for methodological clarifications, Alejandro Caro-Quintero for isolation of C. ruddii, and Kim Hammond for the preparation of figures. This research was supported by National Institutes of Health Award R01GM108657 (to H.O.).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
See Commentary on page 3136.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1525329113/-/DCSupplemental.
References
- 1. Springgate CF, Loeb LA. On the fidelity of transcription by Escherichia coli ribonucleic acid polymerase. J Mol Biol. 1975;97(4):577–591. doi: 10.1016/s0022-2836(75)80060-x.
- 2. Rosenberger RF, Foskett G. An estimate of the frequency of in vivo transcriptional errors at a nonsense codon in Escherichia coli. Mol Gen Genet. 1981;183(3):561–563. doi: 10.1007/BF00268784.
- 3. Rosenberger RF, Hilton J. The frequency of transcriptional and translational errors at nonsense codons in the lacZ gene of Escherichia coli. Mol Gen Genet. 1983;191(2):207–212. doi: 10.1007/BF00334815.
- 4. O’Farrell PH. The suppression of defective translation by ppGpp and its role in the stringent response. Cell. 1978;14(3):545–557. doi: 10.1016/0092-8674(78)90241-6.
- 5. Bouadloun F, Donner D, Kurland CG. Codon-specific missense errors in vivo. EMBO J. 1983;2(8):1351–1356. doi: 10.1002/j.1460-2075.1983.tb01591.x.
- 6. Ballesteros M, Fredriksson A, Henriksson J, Nyström T. Bacterial senescence: Protein oxidation in non-proliferating cells is dictated by the accuracy of the ribosomes. EMBO J. 2001;20(18):5280–5289. doi: 10.1093/emboj/20.18.5280.
- 7. Kennell D, Riezman H. Transcription and translation initiation frequencies of the Escherichia coli lac operon. J Mol Biol. 1977;114(1):1–21. doi: 10.1016/0022-2836(77)90279-0.
- 8. Golding I, Paulsson J, Zawilski SM, Cox EC. Real-time kinetics of gene activity in individual bacteria. Cell. 2005;123(6):1025–1036. doi: 10.1016/j.cell.2005.09.031.
- 9. Blank A, Gallant JA, Burgess RR, Loeb LA. An RNA polymerase mutant with reduced accuracy of chain elongation. Biochemistry. 1986;25(20):5920–5928. doi: 10.1021/bi00368a013.
- 10. Drake JW. A constant rate of spontaneous mutation in DNA-based microbes. Proc Natl Acad Sci USA. 1991;88(16):7160–7164. doi: 10.1073/pnas.88.16.7160.
- 11. Drake JW, Charlesworth B, Charlesworth D, Crow JF. Rates of spontaneous mutation. Genetics. 1998;148(4):1667–1686. doi: 10.1093/genetics/148.4.1667.
- 12. Lee H, Popodi E, Tang H, Foster PL. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc Natl Acad Sci USA. 2012;109(41):E2774–E2783. doi: 10.1073/pnas.1210309109.
- 13. Wielgoss S, et al. Mutation rate inferred from synonymous substitutions in a long-term evolution experiment with Escherichia coli. G3 (Bethesda). 2011;1(3):183–186. doi: 10.1534/g3.111.000406.
- 14. Zhou YN, et al. Isolation and characterization of RNA polymerase rpoB mutations that alter transcription slippage during elongation in Escherichia coli. J Biol Chem. 2013;288(4):2700–2710. doi: 10.1074/jbc.M112.429464.
- 15. Arezi B, Hogrefe HH. Escherichia coli DNA polymerase III ε subunit increases Moloney murine leukemia virus reverse transcriptase fidelity and accuracy of RT-PCR procedures. Anal Biochem. 2007;360(1):84–91. doi: 10.1016/j.ab.2006.10.009.
- 16. Skasko M, et al. Mechanistic differences in RNA-dependent DNA polymerization and fidelity between murine leukemia virus and HIV-1 reverse transcriptases. J Biol Chem. 2005;280(13):12190–12200. doi: 10.1074/jbc.M412859200.
- 17. Baranauskas A, et al. Generation and characterization of new highly thermostable and processive M-MuLV reverse transcriptase variants. Protein Eng Des Sel. 2012;25(10):657–668. doi: 10.1093/protein/gzs034.
- 18. Loman NJ, et al. Performance comparison of benchtop high-throughput sequencing platforms. Nat Biotechnol. 2012;30(5):434–439. doi: 10.1038/nbt.2198.
- 19. Meacham F, et al. Identification and correction of systematic error in high-throughput sequence data. BMC Bioinformatics. 2011;12:451. doi: 10.1186/1471-2105-12-451.
- 20. Gout JF, Thomas WK, Smith Z, Okamoto K, Lynch M. Large-scale detection of in vivo transcription errors. Proc Natl Acad Sci USA. 2013;110(46):18584–18589. doi: 10.1073/pnas.1309843110.
- 21. Acevedo A, Brodsky L, Andino R. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature. 2014;505(7485):686–690. doi: 10.1038/nature12861.
- 22. Acevedo A, Andino R. Library preparation for highly accurate population sequencing of RNA viruses. Nat Protoc. 2014;9(7):1760–1769. doi: 10.1038/nprot.2014.118.
- 23. Fredriksson A, et al. Decline in ribosomal fidelity contributes to the accumulation and stabilization of the master stress response regulator sigmaS upon carbon starvation. Genes Dev. 2007;21(7):862–874. doi: 10.1101/gad.409407.
- 24. Barak Z, Gallant J, Lindsley D, Kwieciszewki B, Heidel D. Enhanced ribosome frameshifting in stationary phase cells. J Mol Biol. 1996;263(2):140–148. doi: 10.1006/jmbi.1996.0565.
- 25. Fu C, Parker J. A ribosomal frameshifting error during translation of the argI mRNA of Escherichia coli. Mol Gen Genet. 1994;243(4):434–441. doi: 10.1007/BF00280474.
- 26. Wenthzel AM, Stancek M, Isaksson LA. Growth phase dependent stop codon readthrough and shift of translation reading frame in Escherichia coli. FEBS Lett. 1998;421(3):237–242. doi: 10.1016/s0014-5793(97)01570-6.
- 27. Meyerovich M, Mamou G, Ben-Yehuda S. Visualizing high error levels during gene expression in living bacterial cells. Proc Natl Acad Sci USA. 2010;107(25):11543–11548. doi: 10.1073/pnas.0912989107.
- 28. Cheng ZF, Deutscher MP. Quality control of ribosomal RNA mediated by polynucleotide phosphorylase and RNase R. Proc Natl Acad Sci USA. 2003;100(11):6388–6393. doi: 10.1073/pnas.1231041100.
- 29. Imashimizu M, Oshima T, Lubkowska L, Kashlev M. Direct assessment of transcription fidelity by high-resolution RNA sequencing. Nucleic Acids Res. 2013;41(19):9090–9104. doi: 10.1093/nar/gkt698.
- 30. Erie DA, Hajiseyedjavadi O, Young MC, von Hippel PH. Multiple RNA polymerase conformations and GreA: Control of the fidelity of transcription. Science. 1993;262(5135):867–873. doi: 10.1126/science.8235608.
- 31. Zenkin N, Yuzenkova Y. New insights into the functions of transcription factors that bind the RNA polymerase secondary channel. Biomolecules. 2015;5(3):1195–1209. doi: 10.3390/biom5031195.
- 32. Roghanian M, Zenkin N, Yuzenkova Y. Bacterial global regulators DksA/ppGpp increase fidelity of transcription. Nucleic Acids Res. 2015;43(3):1529–1536. doi: 10.1093/nar/gkv003.
- 33. Moran NA, Bennett GM. The tiniest tiny genomes. Annu Rev Microbiol. 2014;68:195–215. doi: 10.1146/annurev-micro-091213-112901.
- 34. Shigenobu S, Watanabe H, Hattori M, Sakaki Y, Ishikawa H. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature. 2000;407(6800):81–86. doi: 10.1038/35024074.
- 35. Nakabachi A, et al. The 160-kilobase genome of the bacterial endosymbiont Carsonella. Science. 2006;314(5797):267. doi: 10.1126/science.1134196.
- 36. Moran NA, McLaughlin HJ, Sorek R. The dynamics and time scale of ongoing genomic erosion in symbiotic bacteria. Science. 2009;323(5912):379–382. doi: 10.1126/science.1167140.
- 37. Lind PA, Andersson DI. Whole-genome mutational biases in bacteria. Proc Natl Acad Sci USA. 2008;105(46):17878–17883. doi: 10.1073/pnas.0804445105.
- 38. McCutcheon JP, Moran NA. Extreme genome reduction in symbiotic bacteria. Nat Rev Microbiol. 2012;10(1):13–26. doi: 10.1038/nrmicro2670.
- 39. Bar-Nahum G, et al. A ratchet mechanism of transcription elongation and its control. Cell. 2005;120(2):183–193. doi: 10.1016/j.cell.2004.11.045.
- 40. Vogel U, Jensen KF. The RNA chain elongation rate in Escherichia coli depends on growth rate. J Bacteriol. 1994;176(10):2807–2813.
- 41. Loewe L, Textor V, Scherer S. High deleterious genomic mutation rate in stationary phase of Escherichia coli. Science. 2003;302(5650):1558–1560. doi: 10.1126/science.1087911.
- 42. Bull HJ, Lombardo MJ, Rosenberg SM. Stationary-phase mutation in the bacterial chromosome: Recombination protein and DNA polymerase IV dependence. Proc Natl Acad Sci USA. 2001;98(15):8334–8341. doi: 10.1073/pnas.151009798.
- 43. McKenzie GJ, Lombardo MJ, Rosenberg SM. Recombination-dependent mutation in Escherichia coli occurs in stationary phase. Genetics. 1998;149(2):1163–1165. doi: 10.1093/genetics/149.2.1163.
- 44. Bhamre S, Gadea BB, Koyama CA, White SJ, Fowler RG. An aerobic recA-, umuC-dependent pathway of spontaneous base-pair substitution mutagenesis in Escherichia coli. Mutat Res. 2001;473(2):229–247. doi: 10.1016/s0027-5107(00)00155-x.
- 45. Bridges BA. Spontaneous mutation in stationary-phase Escherichia coli WP2 carrying various DNA repair alleles. Mutat Res. 1993;302(3):173–176. doi: 10.1016/0165-7992(93)90045-w.
- 46. Bridges BA. Mutation in resting cells: The role of endogenous DNA damage. Cancer Surv. 1996;28:155–167.
- 47. Foster PL. Stress responses and genetic variation in bacteria. Mutat Res. 2005;569(1-2):3–11. doi: 10.1016/j.mrfmmm.2004.07.017.
- 48. Nair S, Finkel SE. Dps protects cells against multiple stresses during stationary phase. J Bacteriol. 2004;186(13):4192–4198. doi: 10.1128/JB.186.13.4192-4198.2004.
- 49. Schellhorn HE, Hassan HM. Transcriptional regulation of katE in Escherichia coli K-12. J Bacteriol. 1988;170(9):4286–4292. doi: 10.1128/jb.170.9.4286-4292.1988.
- 50. Lombardo MJ, Aponyi I, Rosenberg SM. General stress response regulator RpoS in adaptive mutation and amplification in Escherichia coli. Genetics. 2004;166(2):669–680. doi: 10.1534/genetics.166.2.669.
- 51. Coulondre C, Miller JH. Genetic studies of the lac repressor. IV. Mutagenic specificity in the lacI gene of Escherichia coli. J Mol Biol. 1977;117(3):577–606. doi: 10.1016/0022-2836(77)90059-6.
- 52. Leong PM, Hsia HC, Miller JH. Analysis of spontaneous base substitutions generated in mismatch-repair-deficient strains of Escherichia coli. J Bacteriol. 1986;168(1):412–416. doi: 10.1128/jb.168.1.412-416.1986.
- 53. Cupples CG, Miller JH. A set of lacZ mutations in Escherichia coli that allow rapid detection of each of the six base substitutions. Proc Natl Acad Sci USA. 1989;86(14):5345–5349. doi: 10.1073/pnas.86.14.5345.
- 54. Frederico LA, Kunkel TA, Shaw BR. A sensitive genetic assay for the detection of cytosine deamination: Determination of rate constants and the activation energy. Biochemistry. 1990;29(10):2532–2537. doi: 10.1021/bi00462a015.
- 55. Kramer EB, Farabaugh PJ. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA. 2007;13(1):87–96. doi: 10.1261/rna.294907.
- 56. McNulty DE, et al. Mistranslational errors associated with the rare arginine codon CGG in Escherichia coli. Protein Expr Purif. 2003;27(2):365–374. doi: 10.1016/s1046-5928(02)00610-1.
- 57. Buckstein MH, He J, Rubin H. Characterization of nucleotide pools as a function of physiological state in Escherichia coli. J Bacteriol. 2008;190(2):718–726. doi: 10.1128/JB.01020-07.
- 58. Abbondanzieri EA, Greenleaf WJ, Shaevitz JW, Landick R, Block SM. Direct observation of base-pair stepping by RNA polymerase. Nature. 2005;438(7067):460–465. doi: 10.1038/nature04268.
- 59. Larson MH, Landick R, Block SM. Single-molecule studies of RNA polymerase: One singular sensation, every little step it takes. Mol Cell. 2011;41(3):249–262. doi: 10.1016/j.molcel.2011.01.008.
- 60. Nudler E. RNA polymerase backtracking in gene regulation and genome instability. Cell. 2012;149(7):1438–1445. doi: 10.1016/j.cell.2012.06.003.
- 61. Gordon AJ, Satory D, Halliday JA, Herman C. Lost in transcription: Transient errors in information transfer. Curr Opin Microbiol. 2015;24:80–87. doi: 10.1016/j.mib.2015.01.010.
- 62. D’Ari R, Casadesús J. Underground metabolism. BioEssays. 1998;20(2):181–186. doi: 10.1002/(SICI)1521-1878(199802)20:2<181::AID-BIES10>3.0.CO;2-0.
- 63. Fan Y, et al. Protein mistranslation protects bacteria against oxidative stress. Nucleic Acids Res. 2015;43(3):1740–1748. doi: 10.1093/nar/gku1404.
- 64. Jiang Z, et al. Comparative analysis of genome sequences from four strains of the Buchnera aphidicola Mp endosymbion of the green peach aphid, Myzus persicae. BMC Genomics. 2013;14:917. doi: 10.1186/1471-2164-14-917.
- 65. Stead MB, et al. RNAsnap™: A rapid, quantitative and inexpensive, method for isolating total RNA from bacteria. Nucleic Acids Res. 2012;40(20):e156. doi: 10.1093/nar/gks680.
- 66. Keseler IM, et al. EcoCyc: fusing model organism databases with systems biology. Nucleic Acids Res. 2013;41(Database issue):D605–D612. doi: 10.1093/nar/gks1027.








