Abstract
The near-universal genetic code defines the correspondence between codons in genes and amino acids in proteins. We refactored the structure of the genetic code in Escherichia coli and created orthogonal genetic codes that restrict the escape of synthetic genetic information into natural life. We developed orthogonal and mutually orthogonal horizontal gene transfer systems, which permit the transfer of genetic information between organisms that use the same genetic code but restrict the transfer of genetic information between organisms that use different genetic codes. Moreover, we showed that locking refactored codes into synthetic organisms completely blocks invasion by mobile genetic elements, which carry their own translation factors and successfully invade organisms with canonical and compressed genetic codes.
The near-universal genetic code defines the correspondence between codons in genes and amino acids in proteins (1, 2). Because all forms of life use essentially the same genetic code, evolutionary innovation can be shared through horizontal gene transfer (HGT) between organisms (3, 4), and this is a major driver of evolution (5).
However, the near-universal genetic code is also a liability for organisms; mobile genetic elements (or selfish genetic elements) including transposons, viruses and plasmids exploit the universality of the code and co-opt the host cell’s machinery to read their genes and propagate at the expense of host organisms. There is a clear tension between maintaining a common genetic code, to allow the acquisition of beneficial innovation through HGT, and excluding selfish genetic elements that exploit the common code for their own ends (3, 6).
Several deviations from the standard genetic code have been documented in mitochondria and chloroplasts, and most characterized code reassignments involve stop codons (7–9). Known sense codon reassignments in the nuclear genome are rare. The ‘CTG yeast’ decodes the CUG codon (which encodes leucine in the standard code) primarily as serine (97%, with the remaining 3% still assigned to leucine) (10). Viruses for the CTG yeasts are essentially unknown, which suggests that sense codon reassignment may protect against viruses (11). There are no experimentally validated examples of sense codon reassignment in bacteria, although computational evidence supports arginine codon reassignment in bacilli (12).
Genome synthesis (13, 14) and editing provide the opportunity to rewrite the genetic code, and create organisms with new properties (14–19). We synthesized a 4 Mb Escherichia coli genome in which we replaced all annotated occurrences of the TCG and TCA serine codons with the synonymous AGC and AGT codons using defined recoding rules (20); we also replaced the TAG stop codon with TAA. This created Syn61, an organism with a compressed genetic code (14). We further evolved the strain and deleted the genes for the tRNAs that decode TCG and TCA codons (serU, tRNACGASer; serT, tRNAUGASer) and the gene for RF-1 (prfA) that terminates protein synthesis at the TAG stop codon. The resulting organism, Syn61Δ3, cannot read all the codons in the near-universal genetic code and therefore cannot read horizontally transferred genes that contain the codons deleted from its genome, as exemplified by resistance to a range of bacteriophage (18).
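The codon compression described above can be illustrated with a minimal Python sketch. It assumes the pairing TCG to AGC and TCA to AGT (the text names the codons but not which replaces which), and the function name `compress` and the toy open reading frame are hypothetical; it is not the recoding pipeline used to build Syn61.

```python
# Assumed pairing for illustration: TCG->AGC and TCA->AGT (serine), TAG->TAA (stop).
SYN61_RULES = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def compress(orf: str) -> str:
    """Apply the synonymous swaps codon by codon, in frame."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(SYN61_RULES.get(codon, codon) for codon in codons)


toy_orf = "ATGTCGGCATCATAG"  # hypothetical ORF: Met-Ser-Ala-Ser-stop
compressed = compress(toy_orf)
print(compressed)  # ATGAGCGCAAGTTAA
```

Working codon by codon, rather than with a plain string replacement, keeps the swaps in frame, so a TCA that happens to span two adjacent codons is left untouched.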
It has been widely hypothesized that refactoring the structure of the genetic code, through the reassignment of sense codons to distinct canonical amino acids, would create organisms with new properties, and could create a genetic firewall to limit the escape of genetic information from synthetic organisms to natural organisms (4, 6, 21–23). In this work, we tested these hypotheses.
Compressed codes are non-orthogonal
A spectinomycin resistance gene written in the canonical genetic code [SpecR wild type (WT)] was correctly read in, and conferred spectinomycin resistance to, cells that contain the full complement of tRNAs to read the canonical code. However, consistent with previous observations (18), SpecR WT did not confer spectinomycin resistance to Syn61Δ3 cells (Fig. 1).
Fig. 1. Compressed genetic codes are non-orthogonal.
(A) The relationship between the TCG and TCA codons in genes, the decoders for these codons in cells with canonical (WT) decoding and Syn61Δ3 decoding (Δ), and the corresponding protein sequence synthesized. The anticodon of the tRNAs that read TCG or TCA codons is indicated (decoder). The amino acid (aa) used by the tRNA is indicated. Gray for a decoder indicates that the tRNA is loaded with serine. Gray for a codon indicates that the codon is within a non codon-compressed gene, and its decoding as serine will make the correct protein sequence. Pink for a decoder-amino acid pair indicates that the tRNA is deleted. Pink for a codon indicates the codon is absent from the gene because the gene has been designed with codon compression.
(B) Functional assessment of SpecR WT and codon-compressed spectinomycin resistance [recSpecR (ΔTCG, TCA)] genes in (left) cells that use the full complement of tRNAs to decode all the codons in the reading frame (Syn61 WT) and (right) cells in which the tRNAs that decode TCG and TCA codons have been deleted. Cells were spotted on agar plates in the presence or absence of spectinomycin and incubated overnight. The growth of cells in the presence of spectinomycin indicates that the indicated SpecR gene is functional in the indicated strain.
(C and D) Predicted protein synthesis and HGT outcomes from mobile genetic elements, and recipient cells with the indicated decoders and codons in essential genes. (C) A mobile genetic element encoding its genes according to the canonical genetic code, in which TCG and TCA encode serine, cannot be horizontally transferred to Syn61Δ3 cells that have no decoders for TCG and TCA codons. Translation will stall at TCG and TCA codons, and no full-length protein will be synthesized from the essential genes within the mobile genetic element that contain TCG and TCA codons. (D) A mobile genetic element encoding its genes according to the canonical genetic code that also carries a gene for a tRNA decoding TCG and TCA codons can be horizontally transferred to Syn61Δ3 cells. The tRNA encoded on the mobile genetic element can rescue decoding of TCG and TCA codons within essential genes in the mobile genetic element to make the correct protein.
(E) Transfer of mobile genetic elements through conjugation; colony count indicates successful transconjugants received from ~10^6 cells. A WT mobile genetic element (F WT) can be transferred into cells that read the canonical code (Syn61 WT) but not into cells that lack tRNAs decoding TCG and TCA (Syn61Δ3). A WT mobile genetic element that encodes a tRNA decoding TCG and TCA codons as serine [F (WT +serT)] can be transferred to both Syn61 WT and Syn61Δ3.
We created a recoded spectinomycin resistance gene [recSpecR (ΔTCG, TCA)], with the compressed genetic code used in the Syn61 genome. recSpecR (ΔTCG, TCA) conferred spectinomycin resistance to Syn61Δ3 cells (Fig. 1). The recSpecR (ΔTCG, TCA) gene also conferred spectinomycin resistance to cells that read the canonical genetic code; this was expected because the compressed genetic code uses a subset of the codons used in the canonical genetic code. We made similar observations with hygromycin resistance genes written in canonical and compressed codes (Fig. S1).
These experiments demonstrated that genetic information written in the canonical code can be read in cells that decode the canonical code, but not in cells with genome-wide code compression and cognate tRNA deletion. However, code compressed genes can be read in both cells with cognate tRNA deletion and in cells that decode the canonical code. Therefore, there is no barrier limiting the flow of genetic information from engineered organisms, with compressed genetic codes, to natural forms of life. Creating orthogonal genetic codes, which actively restrict the transfer of genetic information from engineered biological systems to natural systems, is an important and unaddressed challenge.
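The one-way barrier summarized above can be sketched as a toy decoding model in Python. The decoder dictionaries, the `translate` helper, and both toy genes are illustrative assumptions; treating a codon with no decoder as a stall (returning `None`) is a simplification of what happens in Syn61Δ3.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))

# Simplified Syn61-delta-3 decoder: no reader for TCG, TCA or TAG.
SYN61_D3 = {c: a for c, a in CANONICAL.items() if c not in {"TCG", "TCA", "TAG"}}


def translate(orf, decoder):
    """Return the protein, or None if a codon has no decoder (stall)."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = decoder.get(orf[i:i + 3])
        if aa is None:
            return None
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


wt_gene = "ATGTCGGCATCATAA"   # hypothetical gene using TCG and TCA for serine
rec_gene = "ATGAGCGCAAGTTAA"  # the same protein written in the compressed code

print(translate(wt_gene, CANONICAL))   # MSAS
print(translate(wt_gene, SYN61_D3))    # None (stalls at TCG)
print(translate(rec_gene, CANONICAL))  # MSAS
print(translate(rec_gene, SYN61_D3))   # MSAS
```

The compressed gene yields the same protein under both decoders, which is the sense in which a compressed code is not orthogonal to the canonical code.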
tRNAs enable invasion of codon compressed organisms
A WT F plasmid [F (WT)], written in the canonical genetic code, was efficiently transferred to cells that read the canonical code. By contrast, F (WT) was not transferred to Syn61Δ3 (Fig. 1, Data File S1), as expected. However, upon selecting for the conjugation of F (WT) from cells that read the canonical code into Syn61Δ2 cells (Syn61 cells deleted for serU and serT but containing prfA), we obtained two viable colonies in which recipient cells had received F (WT) (Fig. S2). These colonies corresponded to rare events, which appeared at a frequency 10^6-fold lower than the colonies resulting from conjugation of F (WT) into cells that read the canonical code. Sequencing the two clones revealed that they had acquired sequences that contained serT from the donor cell. This provided direct experimental evidence that selection for transfer of a mobile genetic element that uses the canonical code to recipients that cannot read the entire canonical code can enable selection for recipients that acquire the tRNA genes necessary to read the canonical genetic code.
To follow the effects of introducing serT into recipient cells in a reproducible system, we created the mobile genetic element F (WT +serT), a variant of F (WT) that contains serT. We demonstrated that F (WT +serT) can be transferred to Syn61Δ3 cells and that this transfer is dependent on serT (Fig. 1). We conclude that acquisition of serT is sufficient to circumvent the genetic isolation that is provided by code compression and cognate tRNA deletion in Syn61Δ3. These experiments highlight that creating systems that actively obstruct invasion by mobile genetic elements that carry their own decoders is an important challenge.
Refactoring code structure
We found that chimeric tRNAs for alanine (tRNACGAAla, tRNAUGAAla), histidine (tRNACGAHis, tRNAUGAHis), leucine (tRNACGALeu, tRNAUGALeu) and proline (tRNACGAPro, tRNAUGAPro) specifically direct the incorporation of the amino acid defined by the parent isoacceptor tRNA in response to their cognate codon (TCG or TCA) at position 3 in superfolder green fluorescent protein (sfGFP) or position 11 in ubiquitin in Syn61Δ3 (Fig. 2, Figs. S3–S13, Data Files S2–S3), and produce good yields of protein (Fig. S3). The fidelity of tRNAUGALeu was lower than that of other tRNAs (Data File S3, Fig. S5). We investigated alanyl and leucyl-tRNAs because their anticodons are not identity elements for their cognate aminoacyl-tRNA synthetases and they were therefore expected to be permissive to anticodon mutation; the other tRNAs were identified through a screen (Fig. S14). We found that, unlike tRNACGASer and tRNAUGASer, our chimeric tRNAs specifically decode the Watson-Crick complement of their anticodon sequence; for example, tRNACGAAla decodes TCG codons in preference to TCA codons and tRNAUGAAla decodes TCA codons in preference to TCG codons (Fig. S3, Fig. S5). These tRNAs are also specific with respect to other TCN codons (Fig. S6, Fig. S7), and most reassigned strains grew comparably to parental strains (Fig. S15). Our data show that we can independently re-assign the TCA and TCG codons to alanine, histidine, leucine or proline in Syn61Δ3 and thereby create 16 new genetic codes (Fig. S3, Data File S2, Data File S3, Fig. 2B, Fig. S5). In each new genetic code we changed the identity of the canonical amino acids encoded at specific sense codons with respect to both the canonical code and the other 15 codes we have created (Fig. 2C). Our reassignment strategy is analogous to models proposed for codon capture in natural evolution (24).
Fig. 2. Sense codon reassignment generates genetic codes.
(A) Total synthesis of a codon compressed genome followed by tRNA and release factor deletion yielded Syn61Δ3. The discovery of tRNAs that direct the incorporation of distinct amino acids in response to TCG or TCA codons enables sense codon reassignment to create new genetic codes.
(B) Isoacceptor tRNAs for the indicated amino acids with anticodons altered to the Watson-Crick complement of TCG or TCA codons were introduced into Syn61Δ3 in the indicated pair-wise combinations. We read out the identity of the amino acid incorporated into each codon using GFP genes with TCG or TCA codons at position 3 and electrospray ionization mass spectrometry. When pairs of isoacceptors for distinct amino acids were used, each codon led to the specific incorporation of the amino acid attached to the Watson-Crick-paired isoacceptor. The secondary peak measured in the proline incorporations results from incomplete methionine cleavage at the N-terminus. A complete list of found and expected masses is provided in Data File S2.
(C) Sixteen new genetic codes in which TCG and TCA codons are reassigned to alanine, histidine, leucine and proline.
Overall, we have refactored the structure of the genetic code. Our new genetic codes expand the number of codons used to encode alanine and proline (from 4 to 6), double the number of codons used to encode histidine (from 2 to 4), and increase the number of codons used to encode leucine from 6 to 8; this is more codons than are used to encode any amino acid in the canonical code. These experiments also show that the UCN codon box, which encodes serine in the canonical code, can be split to encode additional canonical amino acids.
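The codon counts quoted above follow from simple bookkeeping, which the Python sketch below reproduces. It builds the 16 refactored code tables by assigning TCG and TCA independently to alanine, histidine, leucine or proline; the table layout and variable names are illustrative assumptions, and TAG is left as a stop codon for simplicity even though RF-1 is deleted in Syn61Δ3.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))

# One table per (TCG assignment, TCA assignment) pair drawn from Ala, His, Leu, Pro.
codes = {}
for tcg, tca in product("AHLP", repeat=2):
    code = dict(CANONICAL)
    code["TCG"], code["TCA"] = tcg, tca
    codes[(tcg, tca)] = code

print(len(codes))  # 16

# Codon count for an amino acid when both codons are reassigned to it.
for aa in "AHLP":
    before = Counter(CANONICAL.values())[aa]
    after = Counter(codes[(aa, aa)].values())[aa]
    print(aa, before, "->", after)
# A 4 -> 6
# H 2 -> 4
# L 6 -> 8
# P 4 -> 6
```

In every one of these tables serine keeps only four codons (TCT, TCC, AGT and AGC), which is the split of the UCN box noted in the text.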
Orthogonal code orthogonal decoder pairs
Genes that are written using the canonical genetic code, in which TCG and TCA codons encode serine, will make the correct protein product in natural cells that read these codons as serine. However, these genes will yield the incorrect, likely non-functional protein product in cells that decode these codons to incorporate amino acids other than serine.
Similarly, synthetic genes in which we compress the genetic code using the Syn61 recoding scheme and replace codons for specific natural amino acids with TCG and TCA codons will make the correct protein product in cells that decode the TCG and TCA codons to incorporate the correct amino acid. However, these synthetic genes will yield an incorrect, likely non-functional protein product in cells that read the canonical genetic code (Fig. 3).
Fig. 3. Orthogonal genetic systems.
(A) Relationship between the TCG and TCA codons in genes; the decoders for these codons in cells with canonical (WT) decoding and decoding by tRNACGAAla, tRNAUGAHis in Syn61Δ3; and the corresponding protein sequence synthesized. The anticodon of the tRNAs that read TCG or TCA codons is indicated (decoder). The amino acid (aa) used by the tRNA is indicated. Gray for a codon indicates that its decoding as serine will make the correct protein sequence. Yellow for a codon indicates that its decoding as alanine will make the correct protein sequence. Green for a codon indicates that its decoding as histidine will make the correct protein sequence.
(B) Functional assessment of SpecR WT (written in the canonical genetic code) and O-SpecR (TCG-Ala, TCA-His), which is codon-compressed according to the Syn61 recoding scheme and has alanine codons replaced with TCG and histidine codons replaced with TCA. The genes are read in cells that read the canonical code (Syn61 WT), and cells where TCG is decoded as alanine and TCA is decoded as histidine [Syn61Δ3 (tRNACGAAla, tRNAUGAHis)]. Cells were spotted on agar plates in the presence or absence of spectinomycin and incubated overnight. The growth of cells in the presence of spectinomycin indicates that the indicated SpecR gene is functional in the indicated strain.
We converted all 27 GCN codons (which encode alanine in the canonical code) to TCG codons, and all 6 CAT/C codons (which encode histidine in the canonical code) to TCA codons in recSpecR (ΔTCG, TCA). This created the orthogonal resistance gene O-SpecR (TCG-Ala, TCA-His). We demonstrated that O-SpecR (TCG-Ala, TCA-His) can be decoded in, and confer spectinomycin resistance to, Syn61Δ3 (tRNACGAAla, tRNAUGAHis) cells, in which TCG is read as alanine and TCA is read as histidine. We further demonstrated that O-SpecR (TCG-Ala, TCA-His) did not confer spectinomycin resistance to cells that read the canonical genetic code. Last, we demonstrated that SpecR WT, in which serine is encoded using TCG and TCA codons, cannot confer spectinomycin resistance to Syn61Δ3 (tRNACGAAla, tRNAUGAHis) cells (Fig. 3). We extended this approach to five other reassignment schemes, as well as to other genes (Fig. S16).
These experiments demonstrated that we can create a genetic code-decoder pair for synthetic genes that is functionally orthogonal with respect to the canonical genetic code-decoder pair for natural genes. The orthogonal code, written in synthetic genes, is correctly read by the cognate orthogonal decoder, but not by the canonical decoder. The canonical code, written in natural genes, is correctly read by the canonical decoder, but not by the orthogonal decoder.
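The construction of an orthogonal gene and the two-way misreading described above can be sketched in Python under simple assumptions. The helper names (`orthogonalize`, `translate`), the decoder tables and the toy genes are hypothetical; the sketch only tracks which amino acid each codon would specify and says nothing about protein function or decoding efficiency.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))

# Orthogonal decoder: TCG read as alanine, TCA read as histidine.
ORTHOGONAL = dict(CANONICAL, TCG="A", TCA="H")


def orthogonalize(orf):
    """Rewrite GCN (Ala) as TCG and CAT/CAC (His) as TCA, in frame."""
    out = []
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3]
        if codon.startswith("GC"):
            codon = "TCG"
        elif codon in ("CAT", "CAC"):
            codon = "TCA"
        out.append(codon)
    return "".join(out)


def translate(orf, code):
    """Read an ORF codon by codon until the first stop."""
    protein = ""
    for i in range(0, len(orf) - 2, 3):
        aa = code[orf[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein


rec_gene = "ATGGCTCATAGCGCACACTAA"  # hypothetical compressed gene: M A H S A H
o_gene = orthogonalize(rec_gene)    # ATGTCGTCAAGCTCGTCATAA

print(translate(o_gene, ORTHOGONAL))  # MAHSAH (correct protein)
print(translate(o_gene, CANONICAL))   # MSSSSS (Ala and His misread as Ser)

wt_gene = "ATGTCGTAA"                 # hypothetical canonical gene: M S
print(translate(wt_gene, CANONICAL))   # MS
print(translate(wt_gene, ORTHOGONAL))  # MA (Ser misread as Ala)
```

Each gene is read correctly only by its cognate decoder, which is the two-way relationship the text calls functional orthogonality.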
The functional orthogonality of genes in cells with altered decoders will depend on the frequency of reassigned codons and the functional consequences of codon reassignments. The consequences of amino acid substitutions that result from codon reassignment may globally, and crudely, correlate with differences in amino acid polarity and hydrophobicity (6, 25). The consequences of amino acid substitutions at particular sites in proteins may be predicted using computational approaches that leverage evolutionary sequence- and/or structural-information (26–29). Although the composition of natural genes is fixed, the codon usage in synthetic genes written in the standard code or any orthogonal code can be simply designed to maximize the number of codons that are subject to reassignment, which may maximize the functional orthogonality of synthetic genes.
Orthogonal HGT
Next, we created orthogonal HGT (O-HGT) systems composed of an orthogonal decoder and a mobile genetic element that uses an orthogonal genetic code. Cells that read the canonical code can transfer a WT mobile genetic element between themselves, but cannot transfer the WT mobile genetic element to cells that contain orthogonal decoders. Cells that contain O-HGT systems can transfer their mobile genetic element to cells that contain a compatible orthogonal decoder, but cannot transfer their mobile genetic element to cells that contain an incompatible orthogonal decoder or to cells that read the canonical code.
A mobile genetic element [F (WT)], written in the canonical code was transferred to cells that read the canonical code, as expected. We also showed that F (WT) could not be transferred to Syn61Δ3 (tRNACGAAla, tRNAUGAHis) cells, in which TCG codons are read as alanine and TCA codons are read as histidine (Fig. 4).
Fig. 4. Orthogonal and mutually orthogonal HGT systems.
(A) HGT between organisms that use distinct, orthogonal genetic codes is prohibited (dashed gray arrows), whereas HGT can occur between cells that share a common genetic code (solid arrows).
(B) Orthogonal horizontal transfer of mobile genetic elements. Colony count indicates the number of transconjugants received from ~10^6 donor cells that bear the indicated mobile genetic element. A WT mobile genetic element [F (WT)] was transferred into cells that read the canonical genetic code (Syn61 WT) but not into cells where TCG is reassigned to alanine and TCA is reassigned to histidine [Syn61Δ3 (tRNACGAAla, tRNAUGAHis)]. An orthogonal mobile genetic element (O-F1) was transferred into Syn61Δ3 (tRNACGAAla, tRNAUGAHis) but not into Syn61 WT.
(C) Mutually orthogonal HGT systems. Colony count indicates successful transconjugants received from ~10^6 donor cells bearing the indicated mobile genetic element.
Next, we investigated HGT for mobile genetic elements written in altered genetic codes. We synthesized the mobile genetic element O-F1 (TCG-Ala, TCA-His). The genetic code in all annotated open reading frames of this F plasmid was compressed using the Syn61 scheme, and GCN codons (which encode alanine in the canonical code) and CAT/C codons (which encode histidine in the canonical code) were converted to TCG and TCA codons, respectively, within the trfA gene; this gene is essential for the replication of the mobile genetic element (30).
O-F1 (TCG-Ala, TCA-His) was horizontally transferred to Syn61Δ3 (tRNACGAAla, tRNAUGAHis) cells. We further demonstrated that O-F1 (TCG-Ala, TCA-His) was not horizontally transferred to cells that read the canonical genetic code (Fig. 4). These experiments demonstrated that we can create O-HGT systems. We created additional HGT systems that are orthogonal to the natural genetic system and mutually orthogonal to each other (Fig. 4, Fig. S17). Overall, we demonstrated the scalability of our approach through the creation of five mutually orthogonal HGT systems.
Blocking invading codes
We hypothesized that reassigning TCA and TCG codons to specific natural amino acids and replacing codons for specific natural amino acids in essential genes with TCA and TCG codons in Syn61Δ3 (Fig. 5) would obstruct the serT-mediated HGT we observed in this strain (Fig. 1).
Fig. 5. Orthogonal code-locking blocks invading codes.
(A and B) Predicted protein synthesis and HGT outcomes from mobile genetic elements and recipient cells with the indicated decoders, and codons in essential genes.
(A) Transfer of a WT mobile genetic element that encodes for a tRNA decoding TCG and TCA codons as serine into a cell where TCG is reassigned to alanine and TCA is reassigned to histidine. Essential genes in the WT mobile genetic element, which contain TCG and TCA codons, will be missynthesized, with each TCG and TCA codon in the gene being stochastically decoded as serine or histidine/alanine. This is predicted to attenuate HGT.
(B) Transfer of a WT mobile genetic element that encodes for a tRNA decoding TCG and TCA codons as serine into a cell where TCG is reassigned to alanine and TCA is reassigned to histidine. Essential genes in the WT mobile genetic element, which contain TCG and TCA codons, will be missynthesized, with each TCG and TCA codon in the gene being stochastically decoded as serine or histidine/alanine. In addition, essential genes in the host cell, in which TCG is used to encode alanine and TCA is used to encode histidine, will be missynthesized. This is predicted to ablate HGT.
(C) HGT of a WT mobile genetic element [F (WT +serT)] is ablated in cells that use a refactored genetic code in essential genes. Colony count indicates successful transconjugants received from ~10^6 cells. Recipient cells and the spectinomycin resistance gene variant (SpecR gene) in the recipient cell are indicated. Correctly reading the indicated SpecR gene in the recipient cell is made essential by addition of spectinomycin.
(D) T4-like phage encoding a seryl-tRNAUGA infect Syn61Δ3 but not cells that bear orthogonal genetic codes. Plaque count indicates the number of successfully replicating phage obtained from infection with 1.1 × 10^10 plaque-forming units (PFU)/mL (phage 12) and 7.5 × 10^9 PFU/mL (phage 6). Cells contain cognate spectinomycin resistance genes, as in C; all experiments were performed in the presence of spectinomycin.
Transfer of F (WT +serT) to Syn61Δ3 (tRNACGAAla, tRNAUGAHis, O-SpecR[TCG-Ala, TCA-His]) was obstructed (10^4-fold) in the absence of spectinomycin, because tRNACGAAla and tRNAUGAHis compete with tRNAUGASer in the recipient cell to decrease the production of functional proteins from the mobile genetic element. However, this obstruction was not sufficient to completely ablate transfer of F (WT +serT). Upon the addition of spectinomycin, which makes O-SpecR(TCG-Ala, TCA-His) an essential gene in the cell, the decoding of TCG codons as alanine and the decoding of TCA codons as histidine become essential and ‘locked in’. Under these conditions, the transfer of F (WT +serT) was completely ablated (Fig. 5). Similar results were obtained with other refactored codes and other essential genes (Fig. S18, Fig. S19).
To extend our approach to viral infection, we identified pools of phage from the River Cam that can infect Syn61Δ3 (Data File S4 and supplementary materials, materials and methods). From these pools, we isolated two individual phage (phage 12 and phage 6, both T4-like), which carry an identical tRNAUGASer gene and infect Syn61Δ3 (Fig. S20); some viruses are known to carry their own tRNAs and other translation factors to augment the cellular pool of translation factors and assist in the translation of codons within their own genes (9, 31, 32). As expected, expression of this tRNA in Syn61Δ3 is sufficient to confer susceptibility to infection by (otherwise non-infectious) T4 phage (Fig. S21). We demonstrated that, unlike Syn61Δ3, several refactored, code-locked strains were completely resistant to infection with phage 6 and phage 12 (Fig. 5, Fig. S22).
Our results demonstrate that cells with refactored genetic codes resist invasion by mobile genetic elements that use competing codes. Locking the refactored codes into essential genes in the cell enhances this resistance and can ablate transfer of mobile genetic elements with competing codes.
Discussion
Previous work has shown that the choice of synonymous codons in individual genes and viruses can alter their robustness and evolvability (33–35), but such approaches are limited to exploring subsets of the canonical code. Although a large body of theoretical work and a limited number of in vitro experiments have considered the relationship between the structure of the genetic code and its robustness and evolvability (21, 22), it has been impossible to investigate the resulting hypotheses through experiments in living cells. Refactoring the structure of the genetic code directly alters the number, and types, of amino acids that can be accessed by point mutations in live cells (Figs. S23, S24) and provides opportunities to experimentally test how altered codes affect the robustness and evolvability of protein and cellular function. In future work we will aim to leverage genetic code refactoring to accelerate directed evolution.
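The effect of refactoring on mutational access can be made concrete with a small Python sketch. It counts, for one amino acid, which other amino acids are reachable by a single-nucleotide substitution from any of its codons, under the canonical code and under a code with TCG read as alanine and TCA as histidine. This is an illustrative calculation under those assumptions, with a hypothetical helper `accessible`; it is not the analysis behind Figs. S23 and S24.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))

# Example refactored code: TCG read as alanine, TCA read as histidine.
REFACTORED = dict(CANONICAL, TCG="A", TCA="H")


def accessible(code, aa):
    """Amino acids one point mutation away from any codon encoding `aa`."""
    reachable = set()
    for codon, encoded in code.items():
        if encoded != aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    neighbour = codon[:pos] + base + codon[pos + 1:]
                    reachable.add(code[neighbour])
    return reachable - {aa, "*"}  # ignore synonymous changes and stops


before = accessible(CANONICAL, "A")
after = accessible(REFACTORED, "A")
print(sorted(before))          # ['D', 'E', 'G', 'P', 'S', 'T', 'V']
print(sorted(after - before))  # ['H', 'L', 'W']
```

In this example the extra TCG codon places alanine one substitution away from leucine and tryptophan, and the reassigned TCA codon adds histidine, none of which are single-step neighbours of alanine in the canonical code.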
Competition between pools of genotypes written in different genetic codes may have led to a universal code being fixed in extant life. Future work may exploit organisms with refactored and mutually orthogonal codes to experimentally investigate competition between codes, and the role of HGT, in fixing and maintaining a universal genetic code (4, 36).
Shielding synthetic organisms from environmental genetic elements may be valuable for biotechnological applications on an industrial scale, for which contamination with mobile genetic elements, including viruses, can cause financial losses and disrupt vital supply chains (37). Synthetic organisms with decoders for an orthogonal code and essential genes that lock the cognate orthogonal code into the organism exhibit resistance to mobile genetic elements that are written in the canonical code. This resistance extends to mobile genetic elements that carry tRNA genes that otherwise allow correct reading of the canonical code in their genes. Our work defines a paradigm for creating organisms that actively resist invasion by foreign codes.
Our refactored genetic codes limit the transfer of genetic information from synthetic organisms to natural organisms, and may form the basis of genetic firewalls that isolate synthetic genetic systems from the environment. Such genetic firewalls complement strategies for controlling the survival and growth of synthetic organisms, to realize biocontainment (16, 38, 39). This is especially important when considering applications of engineered organisms outside the laboratory.
The strategies that we have described should be generally applicable to any gene or genetic system added to the synthetic organism. Because the canonical genetic code is near universally conserved, we anticipate that the principles we have established may be applied to a broad range of organisms.
Supplementary Material
Acknowledgments
We thank J. Fredens (National University of Singapore) for advice on conjugation assays; S. Grazioli and A. Kleefeldt for help with de novo assembly of phage genomes; and the LMB mass spectrometry facility, especially S.-Y. Peak-Chew, for performing mass spectrometry experiments.
Funding
This work was supported by the Medical Research Council (MRC), UK (MC_U105181009, MC_UP_A024_1008) to J.W.C.; G.P. was supported by a Marie Skłodowska-Curie European Postdoctoral Fellowship (897663). Work in the G.P.C.S. laboratory was supported by award BB/W000105/1 from the Biotechnology and Biological Sciences Research Council, UK (UKRI).
Footnotes
Author contributions: J.F.Z. and J.W.C. conceptualized and planned the project with input from W.E.R.; J.F.Z. demonstrated unidirectional genetic isolation and loss of isolation through tRNA import. J.F.Z., T.S.E., G.P., and W.E.R. designed and characterized tRNAs for genetic code refactoring. J.F.Z. demonstrated bidirectional genetic isolation and (mutually) orthogonal HGT systems. J.F.Z. demonstrated improved genetic isolation through code refactoring and genetic code locking. J.F.Z. and T.K. performed phage experiments under supervision of G.P.C.S. J.W.C. supervised the project. J.W.C. and J.F.Z. wrote the manuscript, with input from all authors.
Competing interests: The MRC has filed a provisional patent application related to this work on which J.F.Z., W.E.R., and J.W.C. are listed as inventors. J.W.C. has a commercial interest in Constructive Bio. Ltd.
Data and materials availability
The GenBank accession numbers for all the plasmids described in the text are provided in Data File S5, and the authors agree to provide any materials and strains used in this study upon request. All data are available in the main text or supplementary materials.
References and Notes
- 1. Crick FH, Barnett L, Brenner S, Watts-Tobin RJ. General nature of the genetic code for proteins. Nature. 1961;192:1227–1232. doi: 10.1038/1921227a0.
- 2. Nirenberg MW, Matthaei JH. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc Natl Acad Sci U S A. 1961;47:1588–1602. doi: 10.1073/pnas.47.10.1588.
- 3. Hall RJ, Whelan FJ, McInerney JO, Ou Y, Domingo-Sananes MR. Horizontal Gene Transfer as a Source of Conflict and Cooperation in Prokaryotes. Front Microbiol. 2020;11:1569. doi: 10.3389/fmicb.2020.01569.
- 4. Vetsigian K, Woese C, Goldenfeld N. Collective evolution and the genetic code. Proc Natl Acad Sci U S A. 2006;103:10696–10701. doi: 10.1073/pnas.0603780103.
- 5. Soucy SM, Huang JL, Gogarten JP. Horizontal gene transfer: building the web of life. Nature Reviews Genetics. 2015;16:472–482. doi: 10.1038/nrg3962.
- 6. Koonin EV, Novozhilov AS. Origin and evolution of the genetic code: the universal enigma. IUBMB Life. 2009;61:99–111. doi: 10.1002/iub.146.
- 7. Kollmar M, Muhlhausen S. Nuclear codon reassignments in the genomics era and mechanisms behind their evolution. Bioessays. 2017;39. doi: 10.1002/bies.201600221.
- 8. Ling J, et al. Natural reassignment of CUU and CUA sense codons to alanine in Ashbya mitochondria. Nucleic Acids Res. 2014;42:499–508. doi: 10.1093/nar/gkt842.
- 9. Borges AL, et al. Widespread stop-codon recoding in bacteriophages may regulate translation of lytic genes. Nat Microbiol. 2022;7:918–927. doi: 10.1038/s41564-022-01128-6.
- 10. Santos MA, Gomes AC, Santos MC, Carreto LC, Moura GR. The genetic code of the fungal CTG clade. C R Biol. 2011;334:607–611. doi: 10.1016/j.crvi.2011.05.008.
- 11. Taylor DJ, Ballinger MJ, Bowman SM, Bruenn JA. Virus-host co-evolution under a modified nuclear genetic code. PeerJ. 2013;1:e50. doi: 10.7717/peerj.50.
- 12. Shulgina Y, Eddy SR. A computational screen for alternative genetic codes in over 250,000 genomes. Elife. 2021;10. doi: 10.7554/eLife.71402.
- 13. Gibson DG, et al. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science. 2008;319:1215–1220. doi: 10.1126/science.1151721.
- 14. Fredens J, et al. Total synthesis of Escherichia coli with a recoded genome. Nature. 2019;569:514. doi: 10.1038/s41586-019-1192-5.
- 15. Lajoie MJ, et al. Genomically recoded organisms expand biological functions. Science. 2013;342:357–360. doi: 10.1126/science.1241459.
- 16. Mandell DJ, et al. Biocontainment of genetically modified organisms by synthetic protein design (vol 518, pg 55, 2015). Nature. 2015;527:264. doi: 10.1038/nature14121.
- 17. Rovner AJ, et al. Recoded organisms engineered to depend on synthetic amino acids (vol 518, pg 89, 2015). Nature. 2015;527. doi: 10.1038/nature14095.
- 18. Robertson WE, et al. Sense codon reassignment enables viral resistance and encoded polymer synthesis. Science. 2021;372:1057–1062. doi: 10.1126/science.abg3029.
- 19. de la Torre D, Chin JW. Reprogramming the genetic code. Nature Reviews Genetics. 2021;22:169–184. doi: 10.1038/s41576-020-00307-7.
- 20. Wang KH, et al. Defining synonymous codon compression schemes by genome recoding. Nature. 2016;539:59. doi: 10.1038/nature20124.
- 21. Pines G, Winkler JD, Pines A, Gill RT. Refactoring the Genetic Code for Increased Evolvability. mBio. 2017;8. doi: 10.1128/mBio.01654-17.
- 22.Calles J, Justice I, Brinkley D, Garcia A, Endy D. Fail-safe genetic codes designed to intrinsically contain engineered organisms. Nucleic Acids Res. 2019;47:10439–10451. doi: 10.1093/nar/gkz745. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Marliere P. The farther, the safer: a manifesto for securely navigating synthetic species away from the old living world. Syst Synth Biol. 2009;3:77–84. doi: 10.1007/s11693-009-9040-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Osawa S, Jukes TH. Codon Reassignment (Codon Capture) in Evolution. Journal of Molecular Evolution. 1989;28:271–278. doi: 10.1007/BF02103422. [DOI] [PubMed] [Google Scholar]
- 25.Schmidt M, Kubyshkin V. How To Quantify a Genetic Firewall? A Polarity-Based Metric for Genetic Code Engineering. Chembiochem. 2021;22:1268–1284. doi: 10.1002/cbic.202000758. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Frazer J, et al. Disease variant prediction with deep generative models of evolutionary data. Nature. 2021;599:91. doi: 10.1038/s41586-021-04043-8. [DOI] [PubMed] [Google Scholar]
- 27. Teng S, Srivastava AK, Schwartz CE, Alexov E, Wang L. Structural assessment of the effects of amino acid substitutions on protein stability and protein-protein interaction. Int J Comput Biol Drug Des. 2010;3:334–349. doi: 10.1504/IJCBDD.2010.038396.
- 28. Parthiban V, Gromiha MM, Schomburg D. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 2006;34:W239–W242. doi: 10.1093/nar/gkl190.
- 29. Ng PC, Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet. 2006;7:61–80. doi: 10.1146/annurev.genom.7.080505.115630.
- 30. Konieczny I, Doran KS, Helinski DR, Blasina A. Role of TrfA and DnaA proteins in origin opening during initiation of DNA replication of the broad host range plasmid RK2. Journal of Biological Chemistry. 1997;272:20173–20178. doi: 10.1074/jbc.272.32.20173.
- 31. Delesalle VA, Tanke NT, Vill AC, Krukonis GP. Testing hypotheses for the presence of tRNA genes in mycobacteriophage genomes. Bacteriophage. 2016;6:e1219441. doi: 10.1080/21597081.2016.1219441.
- 32. Bailly-Bechet M, Vergassola M, Rocha E. Causes for the intriguing presence of tRNAs in phages. Genome Research. 2007;17:1486–1495. doi: 10.1101/gr.6649807.
- 33. Renda BA, Hammerling MJ, Barrick JE. Engineering reduced evolutionary potential for synthetic biology. Mol Biosyst. 2014;10:1668–1678. doi: 10.1039/c3mb70606k.
- 34. Moratorio G, et al. Attenuation of RNA viruses by redirecting their evolution in sequence space. Nat Microbiol. 2017;2:17088. doi: 10.1038/nmicrobiol.2017.88.
- 35. Coleman JR, et al. Virus attenuation by genome-scale changes in codon pair bias. Science. 2008;320:1784–1787. doi: 10.1126/science.1155761.
- 36. Kubyshkin V, Acevedo-Rocha CG, Budisa N. On universal coding events in protein biogenesis. Biosystems. 2018;164:16–25. doi: 10.1016/j.biosystems.2017.10.004.
- 37. Barone PW, et al. Viral contamination in biologic manufacture and implications for emerging therapies. Nat Biotechnol. 2020;38:563–572. doi: 10.1038/s41587-020-0507-2.
- 38. Marliere P, et al. Chemical Evolution of a Bacterium’s Genome. Angewandte Chemie-International Edition. 2011;50:7109–7114. doi: 10.1002/anie.201100535.
- 39. Lee JW, Chan CTY, Slomovic S, Collins JJ. Next-generation biocontainment systems for engineered organisms. Nature Chemical Biology. 2018;14:530–537. doi: 10.1038/s41589-018-0056-x.
Data Availability Statement
The GenBank accession numbers for all plasmids described in the text are provided in Data file S5, and the authors agree to provide any materials and strains used in this study upon request. All data are available in the main text or supplementary materials.





