Abstract
Background
Codon pair usage (codon context) is a species-specific feature of gene primary structure whose evolutionary and functional roles are poorly understood. The available data show that codon context has a direct impact on both translation accuracy and efficiency, but it is not yet understood how it affects these two translation variables or whether context biases shape gene evolution.
Methodologies/Principal Findings
Here we study codon-context biases using a set of 72 orthologous highly conserved genes from bacteria, archaea, fungi and high eukaryotes to identify 7 distinct groups of codon context rules. We show that synonymous mutations, i.e., neutral mutations that occur in synonymous codons of codon-pairs, are selected to maintain context biases and that non-synonymous mutations, i.e., non-neutral mutations that alter protein amino acid sequences, are also under selective pressure to preserve codon-context biases.
Conclusions
Since in vivo studies provide evidence for a role of codon context on decoding fidelity in E. coli and for decoding efficiency in mammalian cells, our data support the hypothesis that, like codon usage, codon context modulates the evolution of gene primary structure and fine tunes the structure of open reading frames for high genome translational fidelity and efficiency in the 3 domains of life.
Introduction
The degenerate nature of the genetic code introduces flexibility in gene evolution, allowing for selection of coding sequences with high stability and translational efficiency. Various studies uncovered biases in codon usage associated with translational selection in practically all living beings (reviewed in [1], [2]). Thus, single codons are not chosen randomly; they are under the influence of a number of factors that modulate both speed and accuracy of protein synthesis. The distribution of codon-pairs is also not random, but it is independent of codon-usage biases [3], [4], indicating that these two gene primary structure variables evolve independently.
Codon-context biases have been studied in Eubacteria, Archaea and Eukaryota [5], [6] and both general and kingdom-specific trends which can be attributed to translational efficiency and accuracy have been discovered [7]–[11]. However, mutational pressure and epigenetic regulatory features also play a role in codon-context evolution [12]–[15]. Codon-context features associated with mRNA translation are only found in protein coding genes, while nucleotide context features linked to other selective forces, namely G+C pressure, are found in both coding and non-coding regions.
In order to clarify the role of codon context in gene translation and gene evolution and to simplify de novo gene design algorithms, we have studied codon context in conserved and highly expressed genes where translational biases are stronger and more easily identifiable. Our approach is based on the following hypothesis: “if codon context modulates mRNA decoding efficiency then the features that modulate translation efficiency should be visible in highly expressed genes”. Similar approaches have been successfully used to identify functional rare codons that play roles in protein folding [16]. With this goal in mind, we have carried out multiple alignments of orthologous genes from bacteria, archaea and eukaryotic species and developed software tools to highlight codon-context features in these alignments. This allowed us to identify subsets of conserved codon contexts that shape the evolution of coding sequences. We demonstrate here that some of these conservation patterns are so strong that they can divide codon contexts into well defined sub-groups. The overall study shows that codon context is a punctual modulator of coding sequence evolution and that context conservation alone is sufficient to explain changes in mRNA and polypeptide sequences. We also show that codon context imposes positive pressure on synonymous codons and that apparently neutral mutations are in fact constrained by the need to maintain codon-context patterns.
Results
Codon-context bias in conserved genes
In order to determine whether codon context is conserved in coding sequences, a group of 72 highly expressed genes from S. cerevisiae (Table 1) was used to retrieve orthologs from the 3 domains of life (Figure S1). The codon context biases were then determined as previously described by Moura and colleagues [4], [17]. Multiple alignments of the orthologous genes allowed us to highlight codon contexts in red, black and green, according to the bias detected ([4], [17], Figure 1).
Table 1. Gene set.
ADH1
ASC1
CCW12
CDC19
ENO1
FBA1
GPM1
ILV5
PGK1
TDH1
TPI1
RP(L,P,S) – 61 genes
S. cerevisiae genes that were used to build the orthologous gene list analyzed in this study. Individual ribosomal proteins (RP in the table) are included as supporting information (Figure S7).
Codon-pairs used in the orthologs of each group of organisms (bacteria, archaea, fungi and high eukaryotes) were then compared with those of E. coli, M. jannaschii, S. cerevisiae and H. sapiens, respectively, to determine codon context conservation (Methods and Figure 2). Determination of the conservation of single codons of codon-pairs showed a significant number of scenarios where the first codon changed but the context color was conserved (Figure 2). In order to determine if those differences were related to the nature of the two codons of codon-pairs, data from bacteria, archaea, fungi and high eukaryotes were re-analyzed by separating the codon-pairs by their first codon and searching again for context conservation. As shown in Figure 3A for fungal species and in Figure 3B and Figure S2 for the 4 phylogenetic groups, the pattern of codon context conservation among the 4 groups was rather different, with more codon-context pairs being conserved in bacteria (44%), archaea (46%) and fungi (48%) than in high eukaryotes (30%) (Figure 3B, Figure S2). Also, the number of unbiased pairs of codons decreased from bacteria (33%) to high eukaryotes (11%) (Figure 3B, Figure S2).
In order to identify the codon mutational dynamics responsible for this codon-context conservation, the original data set was split again into 3 different groups depending on the variation of the first codon, namely: i) codons that changed to synonymous ones; ii) codons that changed to codons belonging to conserved amino acids; or iii) codons that changed to codons belonging to non-conserved amino acids. In fungi, codon-contexts were mainly conserved when the amino acids of the pair were not altered (Figure 4A,C) or when the codons changed to non-synonymous codons of conserved amino acid families (Figure 4B). Significant differences were not detected when codons changed to codons encoding chemically distinct amino acids (non-conserved amino acids). Conversely, the context was not conserved mainly when codons changed to synonymous codons in the first position of the pair (Figure 4D). This result can be explained by the fact that synonymous codons usually share the first two nucleotides and differ in the third one only. Indeed, previous studies have shown that the major codon-context preferences are associated with the X3-Y1 di-nucleotides of codon-pairs X1X2X3-Y1Y2Y3 (see [4], [5]) and, therefore, a change in X3 may reverse the context color (see Figure 5A). This also explains why changes from synonymous to non-synonymous codons belonging to similar amino acids may maintain the context; X3 nucleotides may remain unchanged (Figure 5B).
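The X3-Y1 argument can be illustrated with a short Python sketch. The helper name `x3y1` and the example codon pairs are hypothetical and were chosen only for illustration; they are not taken from the alignments analyzed here.

```python
def x3y1(first_codon, second_codon):
    """Return the X3-Y1 dinucleotide of the codon pair X1X2X3-Y1Y2Y3."""
    return first_codon[2] + second_codon[0]

# Synonymous change in the first codon (GCU-Ala -> GCC-Ala): X3 changes,
# so the X3-Y1 dinucleotide changes and the context bias may reverse.
before_synonymous = x3y1("GCU", "GAA")
after_synonymous = x3y1("GCC", "GAA")

# Non-synonymous change to a chemically similar amino acid
# (AUU-Ile -> GUU-Val): X3 is unchanged, so X3-Y1 is preserved.
before_similar = x3y1("AUU", "GAA")
after_similar = x3y1("GUU", "GAA")

print(before_synonymous, after_synonymous)  # UG CG
print(before_similar, after_similar)        # UG UG
```

In this toy case the synonymous substitution alters the X3-Y1 dinucleotide, whereas the Ile to Val substitution leaves it intact, which is the situation depicted schematically in Figure 5.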
Codon-pairs can be grouped by context patterns
In an attempt to elucidate the maintenance of codon-context patterns in cases where neutral synonymous codons changed and in cases involving chemically similar amino acids, the codon-pairs plots of the four phylogenetic groups were arranged according to the pattern of conservation of each codon pair, i.e. each plot was compared to those shown in Figure 4 and was classified using a color code to highlight each group, as in Figure 3A. This allowed us to classify the codon-pairs into 7 major patterns; the 7th group included codon-pairs that did not produce significant biases (Figure 6, Figure S3). In fungi (Figure 6A), codon-pairs starting with the codons AAA-Lys, ACA-Thr, CAG-Gln, CCC-Pro, CUA-Leu, CUG-Leu, GAA-Glu, GCC-Ala, UAU-Tyr and UGC-Cys normally altered the first codon to a synonymous codon to maintain the context (plot A in Figure 4; red squares in Figure 6). Codon pairs starting with the codons AAU-Asn, ACU-Thr, AUC-Ile, AUU-Ile, CAU-His, CUU-Leu, GAU-Asp, GCA-Ala, GCU-Ala, GGG-Gly, GUU-Val, UCU-Ser, UUA-Leu and UUU-Phe changed the first codon mainly to a synonymous codon and altered the context (plot D in Figure 4; blue squares in Figure 6). This was the case for the majority of codon contexts in high eukaryotes (Figure 6C, Figure S3) and explained the low codon-context conservation of this group of organisms. In order to further clarify the genome representability of each of these context patterns, the number of codon-pair types of each pattern (Figure 6C) was displayed in a pie chart (Figure 6B). Almost one third of the contexts were conserved (red, orange and yellow) while the other two thirds were either not conserved (blue and grey) or were undetermined (white) (Figure 6B).
The contribution of each phylogenetic group to those scores was diverse. For example, high eukaryotes contributed the highest number of non-conserved contexts (18%), against 5%, 3% and 10% for bacteria, archaea and fungi, respectively. Fungi and bacteria contributed the highest number of conserved contexts, 10% and 9%, respectively, while archaea showed the highest proportion of unbiased results (16%). In other words, the majority of codon contexts in high eukaryotes were non-conserved while most codon-pairs in bacteria and fungi were conserved and most codon-pairs in archaea showed no bias. The latter was probably related to small sample size and poorer quality of the alignments due to lower similarity of orthologues. This is supported by the fact that archaeal genes showed the highest frequency of complete context changes, i.e. the number of times both codons of the codon-pairs changed (Figure S4).
Codon-context biases modulate evolution of coding sequences
In order to determine the relevance of codon-context conservation for the evolution of coding sequences, the codon-pairs associated with codon-context conservation in each group of organisms were further studied (i.e. red, orange and yellow colored codon-pairs in Figure 6 and Figure S3). We investigated first whether codon-pairs that altered their context, i.e., identical codon pairs (e.g. AGA-ACC) that had different colors in the reference and test organisms, were more prone to mutate in order to recover the original color/bias (to achieve context conservation). For this, we have calculated the frequency of unchanged or changed codon pairs, depending on whether the color of the original pair was maintained or reversed, relative to the reference genome (Figure 7A). The frequency of codon conservation in cases where the codon-context was altered (arrow in Figure 7A) was significantly lower than that of the other 3 possibilities, suggesting that context created positive selective pressure on the codon-pair. We have then isolated codon-pairs where we could detect alteration of the first codon of the pair to a synonymous codon to determine whether synonymous codons appeared randomly or whether positive mutational pressure selected codons that maintained the context bias. Whenever a synonymous alternative maintained the context bias it was selected (Figure 7B). Indeed, 68% of the first codon alterations maintained the color and only 16% showed color alteration (p = 5.39E-10). The percentage of synonymous codon alterations at the first codon position which resulted in alteration of context color was identical in cases where an alternative synonymous codon maintained the context (16%) and when such alternative did not exist (16%), suggesting that random codon alterations represent 16% of possible mutations only.
Codon pairs where the first codon mutated to another codon belonging to a conserved amino acid family were also analyzed in the same way (Figure 7C). This allowed us to test whether the choice of a different amino acid could be explained by the need to maintain codon context. Again, 63% (26%+37%) of the mutations involving conserved amino acids maintained the color of the codon-pair, while a color change was detected in only 38% (20%+18%) of the cases (p = 8.5E-05). Moreover, of those 63% of conserved contexts involving conserved amino acids, more than half (37%) could be explained by the absence of a synonymous codon that could maintain the context pattern. Therefore, in cases of codon-pairs where synonymous codon alternatives did not exist, amino acids rather than context color were altered, a result that further supported the relevance of codon-pair context bias in the evolution of coding sequences.
Discussion
Gene translation accuracy is a conserved feature of life [18]. Analyses of variables which are commonly used to quantify evolutionary change and gene expression variation show conserved patterns of covariation which are amenable to computational simulation. Such studies have shown that protein misfolding associated to translational misreading explains most of the codon usage biases observed in highly expressed genes [18]. Indeed, genes apparently evolve to avoid mRNA mistranslations and protein misfolding, which is mainly achieved through selection of optimally translated codons, in particular in conserved protein domains. This trend is dependent on gene dispensability (i.e. fitness effect associated to gene deletion) and on the sensitivity of cells or tissues towards protein misfolding [18].
Like codon usage, the context of codons is implicated in translational speed and accuracy, but in ways that are apparently stronger than codon usage [9]. Indeed, specific codon contexts are implicated in missense [7], [19], nonsense [20]–[23] and frameshifting errors [24]–[26] which in turn are influenced by environmental cues, such as the amino acid supply [7], [27]. Certain contexts are also repressed because of ribosome slippage during ribosome decoding [28]. Recently, the relevance of codon-pair contexts for translational efficiency has been highlighted in a study where Synthetic Attenuated Virus Engineering (SAVE) was used for production of live attenuated viral vaccines [10], [11]. In this study viral genomes were redesigned by replacing frequently used codon-pairs (codon contexts) with infrequently used ones, without changing the codon usage bias of the gene or the amino acid composition of proteins. These recombinant viruses did not show major changes in their in vitro growth rates, but produced less protein than wild type viruses and their virulence was attenuated in mice infection models. Indeed, these codon context engineered viruses were still able to replicate inside the host, did not cause significant symptoms during infection, but were effective in mice immunization [11]. Therefore, much like codon usage, codon context modulates the efficiency of protein synthesis, although the exact molecular mechanism behind this phenomenon is still unclear. Furthermore, it is not yet understood whether context effects are restricted to specific domains of coding sequences or are felt along the entire length of mRNA. That almost 1/3 of the codon-pair types are conserved (Figure 6B) supports the hypothesis that context effects influence translation along the entire length of mRNAs.
Our data show that codon-context conservation, from bacteria to vertebrates, is mainly achieved through the preferential utilization of synonymous codons or amino acids with similar chemical properties.
Important implications of defined codon context patterns are the imposition of specific constraints on the evolution of coding sequences and the non-neutrality of most mutations. Indeed, synonymous mutations that alter codon-context patterns are highly likely to affect translation efficiency. We have demonstrated that codon pairs that altered their context bias tend to accumulate additional mutations in order to restore sequence-specific codon context biases. Therefore, codon context and codon usage biases [29], local variation in gene expression [30] and fitness penalties associated with mistranslation [31] should all be included in the calculation of the number of synonymous nucleotide substitutions per synonymous site (dS) in order to estimate more accurately the rate of neutral evolution.
Various groups have addressed the problem of the origin of biased genome G+C content in bacterial genomes [32], [33], and have shown that strong biases favoring G+C rich mutations are counterbalanced in a second mutational step by purifying selection [34]. One possible implication of our data is that codon context could also affect G+C content of coding sequences; however, we were unable to detect specific trends that could support this hypothesis (Figure S5).
A significant number of codon contexts (2/3 of the total) were not conserved in our analysis, either because there was not enough data to allow for bias determination (1/3 of the cases), or because they were affected by specific mutational pressures or epigenetic constraints [12]–[15]. Interestingly, 100% of the NNU-NNN contexts of high eukaryotes, 63% of fungi, 25% of archaea and 19% of bacteria, were not conserved, reinforcing the discriminatory power of the U3N1 dinucleotides in codon-context biases [4]–[6].
Also interesting was the low codon-context conservation observed in high eukaryotes relative to the other phylogenetic groups (Figures S2, S3). This apparent softening of evolutionary pressure is not observed in codon usage [18] and suggests that multi-cellular organisms may use somewhat different mRNA decoding rules. It is possible that codon contexts become less important for translational accuracy in high eukaryotes because their higher number of tRNA isoacceptor genes may increase cognate codon decoding (see Figure S6 and [35]). This is supported by the observation that error-prone codon contexts are often associated with codons that are read by rare and/or near-cognate tRNAs (e.g. [36], [37]). If so, codon context can be further distinguished from codon usage, as the latter is mainly dependent on tRNA abundance rather than cognate codon-anticodon interactions.
In conclusion, almost one third of all codon-pairs from bacteria, archaea and eukarya have a significant tendency to conserve context biases in essential genes even if such conservation requires mutations that alter amino acid sequences. Therefore, codon context modulates gene primary structure evolution and, more importantly, neutral mutations that alter codon context create strong negative translational pressure on the codon pair, forcing the introduction of additional mutations that restore species-specific codon context biases.
Methods
Retrieval of orthologous genes
ORFeome sequences were retrieved from NCBI Genbank (ftp.ncbi.nih.gov/genomes/), the Broad Institute (www.broad.mit.edu/annotation/), the Candida Genome database (www.candidagenome.org), www.nature.com/nature/journal and the Ensembl ftp site (ftp.ensembl.org/pub/current_fasta/) (see Figure S1 for details and links). For retrieval of orthologous gene sets, the sequences of 72 highly conserved S. cerevisiae genes (Table 1 and Figure S7) were downloaded from SGD (http://www.yeastgenome.org/) and were aligned using Anaconda (species listed in Figure S1). All best matches were then aligned against the entire ORFeome of S. cerevisiae and only reciprocal best matches were considered. The K. waltii ORFeome had a reduced number of valid ORFs, and in order to increase this number an additional BLAST search was performed using K. lactis orthologues against K. waltii GenBank sequences (using the tool for genomic blast against fungi). These putative K. waltii orthologues were then aligned against the K. lactis ORFeome and only reciprocal best matches were considered. All alignments and subsequent sequence handling took into consideration the alternative nuclear genetic code of C. albicans, C. tropicalis, D. hansenii, C. guilliermondii and C. lusitaniae (leucine CUG codons are decoded as serine in these species). ORF sequences that did not start with an ATG codon, did not end with one of the 3 stop codons (TAA, TGA, TAG), had internal stop codons or undefined bases (N), were discarded from the dataset using Anaconda tools for sequence quality control [17].
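The ORF quality-control criteria listed above can be sketched in Python as follows. This is only an illustration of the stated rules, not the Anaconda implementation; the function name `passes_quality_control` is hypothetical, and the requirement that the ORF length be a multiple of three is an added assumption that the text does not state explicitly.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}

def passes_quality_control(orf):
    """Return True if an ORF (DNA string) satisfies the filters described in the text."""
    seq = orf.upper()
    # Assumption (not stated in the text): a usable ORF has at least a start
    # and a stop codon, and its length is a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    # Undefined bases (N) are not allowed.
    if "N" in seq:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Must start with ATG and end with one of the 3 stop codons.
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # No internal stop codons.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])

# Toy sequences, for illustration only.
candidates = ["ATGAAAGGGTAA", "ATGTAAGGGTAA", "ATGAANGGGTAA", "CTGAAAGGGTAA"]
kept = [orf for orf in candidates if passes_quality_control(orf)]
print(kept)  # ['ATGAAAGGGTAA']
```

Only the first toy sequence is kept; the others fail because of an internal stop codon, an undefined base and a missing ATG start codon, respectively.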
Conservation measurements
Each group of orthologous genes was uploaded into Anaconda and was mapped according to the codon-context biases identified for each organism (Figure 1), as described previously [4], [5], [17]. Briefly, Anaconda counts all codon pairs of each complete set of coding sequences (ORFeome) and classifies them as preferred or repressed relative to what would be expected if both codons were associated independently. This statistical discrimination is shown by the adjusted residual, which is positive for preferred and negative for repressed codon pairs. This methodology allows for mapping codon-pair context biases at the ORFeome level. Figure 1D shows an example of a mapped ORF, highlighting preferred codon contexts in green and repressed contexts in red. Since adjusted residuals are calculated by analysing the frequency of two consecutive codons, color overlapping was eliminated by coloring only the first codon of each pair; the second codon is the first codon of the next pair.
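A minimal Python sketch of this kind of calculation is given below. It assumes the standard adjusted residual of a two-way contingency table (first codon by second codon); the exact statistic used by Anaconda is described in [4], [17], and the function names here are hypothetical. For simplicity, the sketch also counts the final pair ending in the stop codon and reports residuals only for codon pairs that were observed at least once.

```python
from collections import Counter
from math import sqrt

def codon_pairs(orf):
    """Return the consecutive in-frame codon pairs (first, second) of one ORF."""
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return zip(codons, codons[1:])

def adjusted_residuals(orfs):
    """Map each observed codon pair to its adjusted residual over a set of ORFs."""
    pair_counts = Counter()
    for orf in orfs:
        pair_counts.update(codon_pairs(orf))
    total = sum(pair_counts.values())

    # Marginal totals: how often each codon is first or second in a pair.
    first_totals, second_totals = Counter(), Counter()
    for (first, second), n in pair_counts.items():
        first_totals[first] += n
        second_totals[second] += n

    residuals = {}
    for (first, second), observed in pair_counts.items():
        # Expected count if the two codons were associated independently.
        expected = first_totals[first] * second_totals[second] / total
        variance = (expected
                    * (1 - first_totals[first] / total)
                    * (1 - second_totals[second] / total))
        if variance > 0:
            residuals[(first, second)] = (observed - expected) / sqrt(variance)
    return residuals

# Toy ORFs, for illustration only.
toy = ["ATGAAAGGGTAA", "ATGAAAGGGTAA", "ATGCCCAAATAA"]
res = adjusted_residuals(toy)
print(res[("AAA", "GGG")] > 0)  # True: pair seen more often than expected
```

A positive value marks a pair that occurs more often than expected under independence (preferred), and a negative value marks a pair that occurs less often (repressed).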
The mapped ORFs were then aligned using the BLASTP multiple-alignment tool which was implemented in Anaconda [38] with the following parameters: maximum E-value = 0.5; GOP = 11; GEP = 1; matrix = BLOSUM62; identity = more than 10%; ORF aligned = more than 0%. Alignments such as those shown in Figure 1E were used to count the number of times the first codon of a pair and/or the context biases changed, i.e. from red to green or vice-versa, when compared to the reference sequence. In order to simplify the analysis, all bacterial sequences were compared to those of E. coli, and M. jannaschii, S. cerevisiae and H. sapiens sequences were used as references for the archaeal, fungal and high eukaryotic genes, respectively. Changes of the first codon of a pair were considered according to the BLASTP output, i.e., codons that remained unchanged (*), codons that changed to synonymous ones (:), codons that changed to conserved amino acids (.), or codons that changed to non-conserved amino acid families ( ). Codon-context changes were considered whenever a significantly high bias (adjusted residual above 5.00) changed to a significantly low one (adjusted residual below −5.00), or vice versa (see [4], [17] for statistical details). We tested whether the high codon usage bias of the orthologous genes used influenced the representation of codon-pairs. For this, the number of codons that were absent in the dataset was counted. The results show an averaged effective number of codons close to 61 and a global codon coverage close to 100% (Figure S8).
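The counting rules above can be sketched in Python as follows. The names `context_color`, `context_changed` and `tally` are hypothetical, the input format (one tuple per aligned codon pair) is an assumption made for illustration, and treating residuals between the two thresholds as "black" (no significant bias) is our reading of the three-color scheme rather than a documented Anaconda rule.

```python
from collections import Counter

THRESHOLD = 5.0  # |adjusted residual| cutoff quoted in the text

# Alignment symbols for the first codon of a pair, as listed in the text.
FIRST_CODON_CHANGE = {
    "*": "unchanged",
    ":": "synonymous",
    ".": "conserved amino acid",
    " ": "non-conserved amino acid",
}

def context_color(residual, threshold=THRESHOLD):
    """Green = preferred, red = repressed, black = no significant bias (assumed)."""
    if residual > threshold:
        return "green"
    if residual < -threshold:
        return "red"
    return "black"

def context_changed(ref_residual, test_residual, threshold=THRESHOLD):
    """True only for a switch between significantly high and significantly low bias."""
    colors = {context_color(ref_residual, threshold),
              context_color(test_residual, threshold)}
    return colors == {"green", "red"}

def tally(aligned_pairs):
    """Count (first-codon change, context changed?) over aligned codon pairs.

    aligned_pairs: iterable of (symbol, reference residual, test residual).
    """
    counts = Counter()
    for symbol, ref_res, test_res in aligned_pairs:
        key = (FIRST_CODON_CHANGE[symbol], context_changed(ref_res, test_res))
        counts[key] += 1
    return counts

# Toy records, for illustration only.
records = [("*", 6.0, 7.5), (":", 6.0, -8.0), (":", 6.0, 5.5)]
print(tally(records)[("synonymous", True)])  # 1
```

Each tally key pairs the type of first-codon change with whether the context bias was reversed, which corresponds to the categories compared between test and reference sequences.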
Statistics
Significant differences in single codon and codon-pair conservation between test and reference samples were determined using Microsoft Excel spreadsheets and SPSS software tools. The statistical tests used were either two-tailed Student's t-tests for paired samples (as in Figure 2) or ANOVA analyses for two factors without replication, followed by post-hoc tests (as in Figure 7). Whenever the number of species per group was low (e.g. n = 5), non-parametric (Wilcoxon) tests were used in parallel with ANOVA analyses, with similar outcomes. The adjustment of proportions to the normal distribution was tested through Kolmogorov-Smirnov (KS) tests prior to further analyses. Values were considered significantly different if p<0.05, except for simultaneous Student's t-tests, where α was divided by the total number of tests performed, according to the Bonferroni correction, to avoid artificial inflation of the type I error rate.
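As a worked illustration of the Bonferroni correction described above, the following Python sketch divides α by the number of simultaneous tests and compares each p-value with the corrected threshold. The function name and the p-values are hypothetical and are not results from this study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value as significant after dividing alpha by the number of tests."""
    corrected_alpha = alpha / len(p_values)
    return [p < corrected_alpha for p in p_values]

# Hypothetical p-values from three simultaneous t-tests.
# Corrected threshold: 0.05 / 3, i.e. about 0.0167.
flags = bonferroni_significant([0.001, 0.02, 0.04])
print(flags)  # [True, False, False]
```

With three simultaneous tests, a p-value of 0.02 would be significant at the uncorrected level of 0.05 but not at the corrected threshold.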
Supporting Information
Footnotes
Competing Interests: The authors have declared that no competing interests exist.
Funding: This study was funded by the EU-FP7 MEPHITIS project and by the Portuguese Foundation for Science and Technology (FCT) projects PTDC/BIA-BCM/72251/2006, PTDC/BIA-BCM/64745/2006 and PTDC/BIA-GEN/110383/2009. The Institute of Electronics and Telematics Engineering of Aveiro (IEETA) supported the development of the Anaconda software package. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Hershberg R, Petrov DA. Selection on codon bias. Annu Rev Genet. 2008;42:287–299. doi: 10.1146/annurev.genet.42.110807.091442.
- 2. Hershberg R, Petrov DA. General rules for optimal codon choice. PLoS Genet. 2009;5:e1000556. doi: 10.1371/journal.pgen.1000556.
- 3. Gutman GA, Hatfield GW. Nonrandom utilization of codon pairs in Escherichia coli. Proc Natl Acad Sci U S A. 1989;86:3699–3703. doi: 10.1073/pnas.86.10.3699.
- 4. Moura G, Pinheiro M, Silva R, Miranda I, Afreixo V, et al. Comparative context analysis of codon pairs on an ORFeome scale. Genome Biol. 2005;6:R28. doi: 10.1186/gb-2005-6-3-r28.
- 5. Moura G, Pinheiro M, Arrais J, Gomes AC, Carreto L, et al. Large scale comparative codon-pair context analysis unveils general rules that fine-tune evolution of mRNA primary structure. PLoS ONE. 2007;2:e847. doi: 10.1371/journal.pone.0000847.
- 6. Tats A, Tenson T, Remm M. Preferred and avoided codon pairs in three domains of life. BMC Genomics. 2008;9:463. doi: 10.1186/1471-2164-9-463.
- 7. Precup J, Parker J. Missense misreading of asparagine codons as a function of codon identity and context. J Biol Chem. 1987;262:11351–11355.
- 8. Parker J. Errors and alternatives in reading the universal genetic code. Microbiol Rev. 1989;53:273–298. doi: 10.1128/mr.53.3.273-298.1989.
- 9. Irwin B, Heck JD, Hatfield GW. Codon pair utilization biases influence translational elongation step times. J Biol Chem. 1995;270:22801–22806. doi: 10.1074/jbc.270.39.22801.
- 10. Coleman JR, Papamichail D, Skiena S, Futcher B, Wimmer E, et al. Virus attenuation by genome-scale changes in codon pair bias. Science. 2008;320:1784–1787. doi: 10.1126/science.1155761.
- 11. Mueller S, Coleman JR, Papamichail D, Ward CB, Nimnual A, et al. Live attenuated influenza virus vaccines by computer-aided rational design. Nat Biotechnol. 2010;28:723–726. doi: 10.1038/nbt.1636.
- 12. Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH. Codon usage between genomes is constrained by genome-wide mutational processes. Proc Natl Acad Sci U S A. 2004;101:3480–3485. doi: 10.1073/pnas.0307827100.
- 13. Chan SW, Henderson IR, Jacobsen SE. Gardening the genome: DNA methylation in Arabidopsis thaliana. Nat Rev Genet. 2005;6:351–360. doi: 10.1038/nrg1601.
- 14. Robertson KD. DNA methylation and human disease. Nat Rev Genet. 2005;6:597–610. doi: 10.1038/nrg1655.
- 15. Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. J Mol Evol. 2003;57:694–701. doi: 10.1007/s00239-003-2519-1.
- 16. Widmann M, Clairo M, Dippon J, Pleiss J. Analysis of the distribution of functionally relevant rare codons. BMC Genomics. 2008;9:207. doi: 10.1186/1471-2164-9-207.
- 17. Moura G, Pinheiro M, Freitas AV, Oliveira JL, Santos MA. Computational and statistical methodologies for ORFeome primary structure analysis. Methods Mol Biol. 2007;395:449–462. doi: 10.1007/978-1-59745-514-5_28.
- 18. Drummond DA, Wilke CO. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell. 2008;134:341–352. doi: 10.1016/j.cell.2008.05.042.
- 19. Forman MD, Stack RF, Masters PS, Hauer CR, Baxter SM. High level, context dependent misincorporation of lysine for arginine in Saccharomyces cerevisiae a1 homeodomain expressed in Escherichia coli. Protein Sci. 1998;7:500–503. doi: 10.1002/pro.5560070231.
- 20. Murgola EJ, Pagel FT, Hijazi KA. Codon context effects in missense suppression. J Mol Biol. 1984;175:19–27. doi: 10.1016/0022-2836(84)90442-x.
- 21. Bossi L, Ruth JR. The influence of codon context on genetic code translation. Nature. 1980;286:123–127. doi: 10.1038/286123a0.
- 22. Kopelowitz J, Hampe C, Goldman R, Reches M, Engelberg-Kulka H. Influence of codon context on UGA suppression and readthrough. J Mol Biol. 1992;225:261–269. doi: 10.1016/0022-2836(92)90920-f.
- 23. Curran JF, Poole ES, Tate WP, Gross BL. Selection of aminoacyl-tRNAs at sense codons: the size of the tRNA variable loop determines whether the immediate 3' nucleotide to the codon has a context effect. Nucleic Acids Res. 1995;23:4104–4108. doi: 10.1093/nar/23.20.4104.
- 24. Gurvich OL, Baranov PV, Zhou J, Hammer AW, Gesteland RF, et al. Sequences that direct significant levels of frameshifting are frequent in coding regions of Escherichia coli. EMBO J. 2003;22:5941–5950. doi: 10.1093/emboj/cdg561.
- 25. Plant EP, Wang P, Jacobs JL, Dinman JD. A programmed -1 ribosomal frameshift signal can function as a cis-acting mRNA destabilizing element. Nucleic Acids Res. 2004;32:784–790. doi: 10.1093/nar/gkh256.
- 26. Licznar P, Mejlhede N, Prere MF, Wills N, Gesteland RF, et al. Programmed translational -1 frameshifting on hexanucleotide motifs and the wobble properties of tRNAs. EMBO J. 2003;22:4770–4778. doi: 10.1093/emboj/cdg465.
- 27. Sorensen MA. Charging levels of four tRNA species in Escherichia coli Rel(+) and Rel(-) strains during amino acid starvation: a simple model for the effect of ppGpp on translational accuracy. J Mol Biol. 2001;307:785–798. doi: 10.1006/jmbi.2001.4525.
- 28. Shah AA, Giddings MC, Gesteland RF, Atkins JF, Ivanov IP. Computational identification of putative programmed translational frameshift sites. Bioinformatics. 2002;18:1046–1053. doi: 10.1093/bioinformatics/18.8.1046.
- 29. Hirsh AE, Fraser HB, Wall DP. Adjusting for selection on synonymous sites in estimates of evolutionary distance. Mol Biol Evol. 2005;22:174–177. doi: 10.1093/molbev/msh265.
- 30. Akashi H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994;136:927–935. doi: 10.1093/genetics/136.3.927.
- 31. Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. 2000;16:1131–1145. doi: 10.1002/1097-0061(20000915)16:12<1131::AID-YEA609>3.0.CO;2-F.
- 32. Hildebrand F, Meyer A, Eyre-Walker A. Evidence of Selection upon Genomic GC-Content in Bacteria. PLoS Genet. 2010;6:e1001107. doi: 10.1371/journal.pgen.1001107.
- 33. Hershberg R, Petrov DA. Evidence That Mutation Is Universally Biased towards AT in Bacteria. PLoS Genet. 2010;6. doi: 10.1371/journal.pgen.1001115.
- 34. Rocha EP, Feil EJ. Mutational patterns cannot explain genome composition: are there any neutral sites in the genomes of bacteria? PLoS Genet. 2010;6:e1001104. doi: 10.1371/journal.pgen.1001104.
- 35. Marck C, Grosjean H. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA. 2002;8:1189–1232. doi: 10.1017/s1355838202022021.
- 36. Kramer EB, Farabaugh PJ. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA. 2007;13:87–96. doi: 10.1261/rna.294907.
- 37. Curran JF. Decoding with the A:I wobble pair is inefficient. Nucleic Acids Res. 1995;23:683–688. doi: 10.1093/nar/23.4.683.
- 38. Pinheiro M, Afreixo V, Moura G, Freitas A, Santos MA, et al. Statistical, computational and visualization methodologies to unveil gene primary structure features. Methods Inf Med. 2006;45:163–168.
- 39. Fitzpatrick DA, Logue ME, Stajich JE, Butler G. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis. BMC Evol Biol. 2006;6:99. doi: 10.1186/1471-2148-6-99.