ABSTRACT
Henri Grosjean and Eric Westhof recently presented an information-rich, alternative view of the genetic code, which takes into account current knowledge of the decoding process, including the complex nature of interactions between mRNA, tRNA and rRNA that take place during protein synthesis on the ribosome, and it also better reflects the evolution of the code. The new asymmetrical circular genetic code has a number of advantages over the traditional codon table and the previous circular diagrams (with a symmetrical/clockwise arrangement of the U, C, A, G bases). Most importantly, all sequence co-variances can be visualized and explained based on the internal logic of the thermodynamics of codon-anticodon interactions.
KEYWORDS: aminoacyl-tRNAs, circular genetic code diagram, decoding, energetics of codon-anticodon interactions, genetic code, genetic code evolution
The genetic code describes the correspondence between the sequence of a given nucleotide triplet in an mRNA molecule, called a codon, and the amino acid that it directs to be added to the growing polypeptide chain during protein synthesis. Decoding, or translation, of mRNAs is performed by ribosomes; the addition of each new amino acid to the growing chain involves a cycle of complex reactions consisting of several major steps [1-3]. Placement of the initiator tRNA in the ribosomal P-site (directed by the initiation/AUG codon) sets the reading frame for all subsequent incoming aminoacyl-tRNAs (aa-tRNAs) required for decoding of the message. The next aa-tRNA binds to the ribosomal A-site by forming base pairs with the next codon in the mRNA. The specificity is such that perfect Watson–Crick base pairs are usually observed between the first 2 nucleotides of the codon and the corresponding nucleotides of the anticodon, but altered base pairing is possible at the third, so-called “wobble,” position. Wobbling occurs because the conformation of the tRNA anticodon loop permits flexibility at the first base of the anticodon [1-3]. In the next step, if acceptable codon–anticodon base pairing has been established between the mRNA and the incoming aa-tRNA, decoding is accompanied by formation of a peptide bond between the incoming amino acid and the previous amino acid [1-3]. Note that this is a simplified overview of the process; the many additional/intermediate steps involved will not be considered here.
The genetic code is nearly universal, meaning that in almost all living organisms, the identity of the amino acid encoded by a given triplet codon is the same [4,5]. With 4 bases (A, G, U, and C), there are 64 possible triplet codons: 61 sense (encoding amino acids) and 3 nonsense (UAA, UAG, and UGA, the so-called stop codons that direct termination of translation). In most organisms, there are 20 common amino acids used in protein synthesis; thus, the genetic code is redundant, with most amino acids being encoded by more than one codon. Only two amino acids (Met and Trp) are encoded by just a single codon in most organisms (although exceptions to this rule do exist [4,5]). Phe, Tyr, His, Gln, Asn, Lys, Asp, Glu, and Cys are each encoded by 2 distinct codons, which in each case are identical at positions 1 and 2 but differ at position 3 (for example, Phe is encoded by UUU and UUC) (Fig. 1A). Ile is encoded by a group of 3 codons, and Val, Pro, Thr, Ala, and Gly are each specified by a group of 4 codons; these also differ only at position 3 within a group. The greatest degree of redundancy exists for Leu, Ser and Arg, which are each encoded by 6 codons (4 in one group and 2 in another group, again with groups defined as being identical at positions 1 and 2) (Fig. 1A). The origin of this pattern of redundancy is not entirely clear, but it is thought to be related to co-evolution of the genetic code and amino acids, with the modern group of 20 (+2) amino acids (the additional 2 being selenocysteine and pyrrolysine, which are decoded via the UGA and UAG stop codons, respectively) evolving from a relatively small number of early/prebiotic amino acids (such as Gly, Ala, Asp and Val), which can also be synthesized via pathways with only a few steps (for a review see ref. 6). New amino acids added in evolution to the initial group of early amino acids may in some cases have taken over codons previously assigned to their precursors. Thus, in the conventional/standard codon table [7-9] (Fig. 1A), with blocks of 4 codons identical in positions 1 and 2 but containing either U, C, A or G in position 3, the appearance of new amino acids led to subdivision of larger codon blocks into smaller ones (for example, the GAx block being subdivided/split to encode Asp with GAU and GAC and Glu with GAA and GAG). While fundamentally correct and accepted as quasi-universal, this standard codon table does not fully represent the genetic code and its evolution, as it does not take into account all aspects of the decoding process such as, for example, the thermodynamics of codon-anticodon interactions and the influence of tRNA modifications on such interactions.
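The codon counts and the pattern of redundancy described above can be checked with a few lines of code. The following is a minimal Python sketch and is not part of the original commentary: the 64-letter string is the standard code written with codons enumerated in U, C, A, G order (position 1 varying slowest, position 3 fastest, `*` marking stop codons), and the names `CODON_TABLE`, `degeneracy` and `by_degeneracy` are illustrative choices.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"

# Standard genetic code, one letter per codon; '*' marks a stop codon.
# Codons are enumerated UUU, UUC, UUA, UUG, UCU, ... (position 3 fastest).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 triplets to its one-letter amino acid (or '*').
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Number of codons assigned to each amino acid.
degeneracy = Counter(CODON_TABLE[c] for c in sense_codons)

# Group amino acids by how many codons encode them.
by_degeneracy = defaultdict(list)
for aa, n in degeneracy.items():
    by_degeneracy[n].append(aa)

print(len(CODON_TABLE), "codons,", len(sense_codons), "sense; stops:", stop_codons)
for n in sorted(by_degeneracy):
    print(n, "codon(s):", "".join(sorted(by_degeneracy[n])))
```

Run as written, the sketch groups Met and Trp at one codon each, nine amino acids at two codons, Ile at three, five amino acids at four, and Leu, Ser and Arg at six, matching the breakdown given in the text.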
Figure 1. Genetic code diagrams. (A) Schematic representation of the conventional (rectangular) genetic code table. (B) Asymmetrical circular genetic code diagram developed by Grosjean and Westhof [10] (adapted with permission from the authors and Oxford University Press). The C, G, U, A bases are arranged clockwise on the right side of the circular diagram and anti-clockwise on the left side. The most thermodynamically stable G/C-rich codons are placed at the top of the circle, while “weaker” A/U-rich codons are at the bottom and mixed codons appear in the mid-sections on the left and the right sides of the circle. This asymmetrical representation of the genetic code illustrates the role of chemical energetics in decoding, with clear segregation between all GC-rich 4-codon families (unsplit; all four codons identical at positions 1 and 2 encoding the same amino acid) and all AU-rich smaller codon families (split 2:2 or 3:1). The new representation also highlights the significance of tRNA anticodon hairpin modifications (especially U34) aimed at fine-tuning codon-anticodon base-pairing binding capacity for optimal and uniform translation. Arrows on the left and the right side of the diagram highlight characteristic changes associated with the evolution of the code. The strengths of the codon-anticodon base-pairing interactions are color coded in both (A) and (B): strong GC-rich codon-anticodon triplets are highlighted in cyan, weaker UA pairs are shown in pink, and the mixed codon-anticodon triplets are shown on a white background.
In a recent issue of Nucleic Acids Research, Henri Grosjean and Eric Westhof presented [10] a more information-rich, alternative view of the genetic code table (Fig. 1B). This representation takes into account current knowledge of the decoding process, including the complex nature of interactions between mRNA, tRNA and rRNA that take place during protein synthesis on the ribosome, and it also better reflects the evolution of the code [10]. Recent progress in deciphering the structure and function of the ribosome, as well as in identifying the functional significance of modified nucleotides in tRNAs, has revealed the intricate complexity of the decoding process [10]. In particular, it was found that third position “wobbling” can occur in several non-canonical ways depending on specific tRNA modifications and that these modifications (especially in the anticodon hairpin) serve to maintain optimal stability of complementary codon–anticodon pairs [10]. Incorporating and capitalizing on many previous observations and representations of the code (including previous circular genetic code diagrams [10-12]), the work of Grosjean and Westhof highlights the importance of numerous “hidden” aspects of the decoding process and presents a visually appealing decoding table that takes into account multiple structural aspects of translation and the chemical interactions that govern the process. This improvement on the standard codon table(s) is reminiscent of the “evolution” of the periodic table of elements. In 1789, Antoine Lavoisier published the first list/table of 33 chemical elements, grouping them according to their basic properties into gases, metals, nonmetals, and earths [13]. This first revolutionary description gave a comprehensive overview of basic chemical elements, but did not possess predictive power. The periodic table of chemical elements published by Dmitri Mendeleev in 1869 [14] not only better illustrated periodic trends in the properties of the then-known chemical elements, but also allowed prediction of the properties of elements yet to be discovered.
In their new representation of the genetic code table, Grosjean and Westhof arranged the codons corresponding to the 20 canonical amino acids on a circle based on the sequence of the codon/anticodon triplet (Fig. 1B). Quadrants of the circle are assigned to codons with G, C, A or U in the first codon position and are then further subdivided as a function of the base in the second and third positions. The most thermodynamically stable G/C-rich codons are placed at the top of the circle, while “weaker” A/U-rich codons are placed at the bottom and mixed codons appear in the mid-sections on the left and the right sides of the circle (Fig. 1B). This asymmetrical representation of the genetic code illustrates the role of chemical energetics in decoding, with clear segregation between all GC-rich 4-codon families (unsplit; all four codons identical at positions 1 and 2 encoding the same amino acid) and all AU-rich smaller codon families (split 2:2 or 3:1). The new representation also highlights the significance of tRNA anticodon hairpin modifications (especially U34) aimed at fine-tuning codon-anticodon base-pairing binding capacity for optimal and uniform translation. This led the authors to the important conclusion that, during genetic code expansion, optimal stability of complementary codon–anticodon pairs likely served as the main force driving its evolution. It also provides an explanation for the observation that codon reassignments are most often found within split AU-rich codon families. Thus, starting with GC-rich triplets coding for simple/early amino acids (like Gly, Ala, Pro), the code evolved to include AU-rich codons specifying the amino acid products of new, more complex biosynthetic machineries. This was accompanied by co-evolution of aminoacyl-tRNA synthetases and tRNA modification enzymes.
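The segregation between unsplit GC-rich families and split AU-rich families can be illustrated with a short script. The following is a minimal Python sketch under a simplifying assumption that is not stated in the text: a four-codon box is labelled "GC-rich" when both of its first two positions are G or C, "AU-rich" when both are A or U, and "mixed" otherwise. The actual diagram is based on the energetics of codon-anticodon pairing, which a simple base count only approximates; the function names `box_class` and `is_unsplit` are illustrative.

```python
from itertools import product

BASES = "UCAG"

# Standard genetic code in UCAG order (position 3 varying fastest); '*' = stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def box_class(prefix):
    """Label a box by how many of its first two bases are G or C (assumption)."""
    gc_count = sum(base in "GC" for base in prefix)
    return {2: "GC-rich", 1: "mixed", 0: "AU-rich"}[gc_count]


def is_unsplit(prefix):
    """True if all four codons sharing positions 1 and 2 have the same meaning."""
    return len({CODON_TABLE[prefix + base] for base in BASES}) == 1


# Collect (prefix, unsplit?) pairs for the 16 boxes, grouped by class.
summary = {}
for prefix in ("".join(p) for p in product(BASES, repeat=2)):
    summary.setdefault(box_class(prefix), []).append((prefix, is_unsplit(prefix)))

for label in ("GC-rich", "mixed", "AU-rich"):
    boxes = summary[label]
    unsplit = [p for p, u in boxes if u]
    split = [p for p, u in boxes if not u]
    print(f"{label}: unsplit={unsplit} split={split}")
```

Under this assumption, the four boxes with G or C at both positions (CC, CG, GC, GG) are all unsplit, the four boxes with A or U at both positions (UU, UA, AU, AA) are all split, and the eight mixed boxes divide evenly between the two.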
The new asymmetrical circular genetic code diagram developed by Grosjean and Westhof has a number of advantages over the traditional codon table and the previous circular diagrams (with a symmetrical/clockwise arrangement of the U, C, A, G bases [11,12]). Perhaps most importantly, all sequence co-variances can be visualized and explained based on the internal logic of the thermodynamics of codon-anticodon interactions [10]. This circular code diagram clearly indicates that the code is not a “frozen accident,” but rather a dynamic/developing paradigm that most likely evolved from a “4-column code” in which Gly, Ala, Asp, and Val were the earliest encoded amino acids. In addition to providing a “retrospective” understanding of the evolution and current status of the decoding process, the new circular genetic code diagram, like the periodic table of chemical elements, has predictive power. This has the potential to aid genetic code engineering efforts such as the identification of optimal codon reassignments for biosynthetic incorporation of non-canonical/non-natural amino acids, which is of special interest for the biotechnology industry and for structural studies of proteins [15,16].
Disclosure of potential conflicts of interest
No potential conflicts of interest were disclosed.
Acknowledgments
I thank Drs. Henri Grosjean and Eric Westhof for useful insights and Patricia Stanhope Baker for help with manuscript preparation.
Funding
This work was supported by grants 13GRNT17070025 (AHA) and HL121779-01A1 (NIH) (to A.A.K.).
References
- [1] Schmeing TM, Ramakrishnan V. What recent ribosome structures have revealed about the mechanism of translation. Nature 2009; 461:1234-42; PMID:19838167; http://dx.doi.org/10.1038/nature08403
- [2] Voorhees RM, Ramakrishnan V. Structural basis of the translational elongation cycle. Annu Rev Biochem 2013; 82:203-36; PMID:23746255; http://dx.doi.org/10.1146/annurev-biochem-113009-092313
- [3] Melnikov S, Ben-Shem A, Garreau de Loubresse N, Jenner L, Yusupova G, Yusupov M. One core, two shells: bacterial and eukaryotic ribosomes. Nat Struct Mol Biol 2012; 19(6):560-7; PMID:22664983; http://dx.doi.org/10.1038/nsmb.2313
- [4] Ling J, O'Donoghue P, Söll D. Genetic code flexibility in microorganisms: novel mechanisms and impact on physiology. Nat Rev Microbiol 2015; 13(11):707-21; PMID:26411296; http://dx.doi.org/10.1038/nrmicro3568
- [5] Bezerra AR, Guimarães AR, Santos MA. Non-Standard Genetic Codes Define New Concepts for Protein Engineering. Life (Basel) 2015; 5(4):1610-28; PMID:26569314; http://dx.doi.org/10.3390/life5041610
- [6] Sengupta S, Higgs PG. Pathways of Genetic Code Evolution in ancient and modern organisms. J Mol Evol 2015; 80(5–6):229-43; PMID:26054480; http://dx.doi.org/10.1007/s00239-015-9686-8
- [7] Nirenberg M, Leder P, Bernfield M, Brimacombe R, Trupin J, Rottman F, O'Neal C. RNA codewords and protein synthesis, VII. On the general nature of the RNA code. Proc Natl Acad Sci USA 1965; 53:1161-8; PMID:5330357
- [8] Crick FH. The origin of the genetic code. J Mol Biol 1968; 38:367-79; PMID:4887876
- [9] Woese CR, Dugre DH, Saxinger WC, Dugre SA. The molecular basis for the genetic code. Proc Natl Acad Sci USA 1966; 55:966-74; PMID:5219702
- [10] Grosjean H, Westhof E. An integrated, structure- and energy-based view of the genetic code. Nucleic Acids Res 2016; Jul 22. pii: gkw608. [Epub ahead of print]; PMID:27448410; http://dx.doi.org/10.1093/nar/gkw608
- [11] Lobanov AV, Turanov AA, Hatfield DL, Gladyshev VN. Dual functions of codons in the genetic code. Crit Rev Biochem Mol Biol 2010; 45(4):257-65; PMID:20446809; http://dx.doi.org/10.3109/10409231003786094
- [12] Castro-Chavez F. A tetrahedral representation of the genetic code emphasizing aspects of symmetry. BIO-Complexity 2012; 2:1-6; PMID:22997604; http://dx.doi.org/10.5048/BIO-C.2012.2
- [13] Lavoisier A. Traité Élémentaire de Chimie, présenté dans un ordre nouveau, et d'après des découvertes modernes (1 ed.); 1789; A Paris: Cuchet: Libraire, rue & hotel Serpente
- [14] Mendelejew D. Über die Beziehungen der Eigenschaften zu den Atomgewichten der Elemente. Zeitschrift für Chemie 1869; 12:405-6
- [15] Nikić I, Lemke EA. Genetic code expansion enabled site-specific dual-color protein labeling: superresolution microscopy and beyond. Curr Opin Chem Biol 2015; 28:164-73; PMID:26302384; http://dx.doi.org/10.1016/j.cbpa.2015.07.021
- [16] Neumann-Staubitz P, Neumann H. The use of unnatural amino acids to study and engineer protein function. Curr Opin Struct Biol 2016; 38:119-28; PMID:27318816; http://dx.doi.org/10.1016/j.sbi.2016.06.006
