One of the long-standing aims of biomimetic chemistry has been to develop molecules that function as much as possible as the natural ones do.[1] Among the biopolymers, the nucleic acids have long served as a test bed for bio-inspired design. The earliest focus of altered structures for DNA was in the redesign of the phosphodiester backbone,[2] and numerous studies found backbone variants that assembled well into helices. In addition, in recent years, some altered sugars have also been shown to be substrates for certain polymerase enzymes.[3] Such work suggests the possibility of future biological activities associated with altered DNA backbones.
More recently, a number of laboratories have focused on design of replacements for the DNA bases themselves, which encode the chemical information of the cell.[4] This is a challenging goal because biological enzymes have evolved the ability to manipulate these bases and base pairs with extraordinarily high selectivity. The polymerase replication of designed DNA pairs has been studied in a number of laboratories. Significant successes have been reported using varied strategies including base pairs with altered hydrogen bonding patterns,[4a] pairs that lack hydrogen bonds altogether,[4b–e,i,j,m–p] and pairs that have larger-than-natural dimensions.[4q–s,5] However, to date such designed base pairs have generally not been tested in living cells. One recent exception is a single isosteric replacement for a natural base that was replicated efficiently.[6]
Here we describe the first tests of a nonnatural DNA base pair geometry in a living cell. We have inserted single size-expanded DNA (xDNA) bases into phage genomes, and measured their replication in Escherichia coli cells. Surprisingly, although xDNA base pairs are considerably larger than the natural ones, we find that they are bypassed by the cellular replication machinery remarkably well. Indeed, two of the designed pairs possess biological function that is nearly indistinguishable from that of natural base pairs.
The four size-expanded DNA base pairs are shown in Fig. 1. The hydrogen-bonded pairs involve benzo-homologated bases opposite complementary natural partners, yielding base pairs 2.4 Å larger than Watson-Crick pairs.[6] The concept of a benzo-expanded base was first introduced by Leonard, who developed lin-benzo-adenine and -guanine and studied them as ribonucleotide analogues three decades ago.[7] We developed analogous benzopyrimidine C-nucleosides and studied the ability of these four deoxynucleoside compounds (xA, xC, xG, xT) to form helices composed of expanded-size pairs. Work to date has shown that fully substituted duplexes of xDNA pairs are highly stable and form right-handed helices with a backbone conformation resembling B-DNA.[5a, d–f] However, single xDNA pairs substituted within natural DNA are destabilizing, presumably because of the mismatch in size between the large pair and the naturally sized backbone surrounding it.[5d] This could well present a challenge for a natural DNA polymerase, since such enzymes (especially replicative enzymes) can be highly sensitive to the sizes of base pairs.[6,8] However, living cells possess several different specialized classes of polymerases, some of which function to bypass damaged bases that are often larger than those of normal DNA.[9] Before completing extensive studies of in vitro replication of xDNA bases by natural and modified polymerases, we decided to test whether the cell already possesses machinery that can process these expanded bases.
We measured the information-encoding capability of the single xDNA bases by incorporating them separately into oligonucleotides using standard phosphoramidite solid-phase synthesis.[5a, c] The intact incorporation of expanded bases into the DNAs was confirmed by MALDI-TOF mass spectrometry in all cases (see Supporting Data). The oligonucleotides were then ligated into an M13mp7(L2) single-stranded viral genome.[10] These modified genomes were then passaged through E. coli to quantify the biological responses. The two responses measured are replication bypass efficiency (scored by the amount of daughter phage produced relative to an unmodified competitor genome) and replication fidelity (measured by isolation and composition analysis of the daughter sequence, see below). A feature of this system is that there is no complementary strand opposite the unnatural bases in this single-stranded bacteriophage. This ensures that outcomes from replication and mutagenesis are derived solely from the initial replicative bypass of the modified bases, rather than from preferential replication of an opposing unmodified strand.
The ability of each of the four xDNA bases to support DNA replication in vivo was addressed using a competitive replication assay in which genomes bearing a nonstandard base are mixed with a normal internal standard prior to electroporation into cells.[11] The concentration of each xDNA base construct was determined in triplicate, followed by normalization and transfection using a 2:1 ratio of xDNA:competitor. The results showed that two of the four size-expanded bases were bypassed highly efficiently (Fig. 2). In E. coli that were not induced for the damage (SOS) response, which enables translesion DNA replication, the xA and xT bases were bypassed with efficiencies that were, respectively, 80% and 73% that of the natural guanine control. The xC base was moderately well bypassed, with an efficiency of 29%, while the signal dropped to 11% for the xG base. When UV light was used to induce the SOS response polymerases, bypass improved substantially in all cases, with xA and xT reaching the level of the unmodified guanine control. The xC base increased to 53%, while the base that was the strongest block to replication, xG, had the greatest response to the UV-induced SOS DNA polymerases, increasing to 45% relative to the phage genome containing a natural guanine base at the test site. This suggests that flexible damage-response enzymes may assist in the synthesis or extension of a large xDNA pair.
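The normalization behind such percentages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_bypass`, its arguments, and the example signal values are hypothetical, and the published assay protocol[10b,11] should be consulted for the actual quantification procedure. The sketch assumes that each genome's output is first expressed relative to the competitor recovered from the same transfection, and that the test ratio is then divided by the ratio obtained for the guanine control.

```python
def relative_bypass(test_signal, test_competitor, control_signal, control_competitor):
    """Return bypass efficiency of a test genome as a percentage of the control.

    Assumed scheme (illustrative only): each genome's output signal is divided
    by the competitor signal from the same transfection, and the test ratio is
    then normalized to the control ratio.
    """
    if min(test_competitor, control_signal, control_competitor) <= 0:
        raise ValueError("competitor and control signals must be positive")
    test_ratio = test_signal / test_competitor
    control_ratio = control_signal / control_competitor
    return 100.0 * test_ratio / control_ratio


# Hypothetical signal intensities (arbitrary units), NOT data from this study.
example = relative_bypass(test_signal=40.0, test_competitor=50.0,
                          control_signal=100.0, control_competitor=100.0)
print(f"bypass relative to control: {example:.0f}%")
```

Dividing by the competitor signal first means that differences in transfection efficiency between samples cancel out before the test genome is compared with the control.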
Having established that single xDNA base pairs can be processed by the E. coli replication machinery, we proceeded to evaluate which natural bases replaced the xDNA analogs in the daughter phage that were recovered, using a published restriction endonuclease/postlabeling assay.[10] This serves as a quantitative measure of the ability of the bacterial polymerases to accurately read the chemical information encoded in the large-size bases. In the first round of DNA synthesis to produce the (−) strand of the phage DNA, a natural base would be incorporated opposite the xDNA base, thus making a size-expanded pair. In the next round, the all-natural DNA (−) strand would be replicated normally, producing new, all-naturally-substituted (+)-strand daughter phage genomes that carry the sequence information encoded by the xDNA base at the original test site.
We found that two of the four xDNA bases, xA and xC, encoded their analogous replacements correctly. The results are shown graphically in Figure 3. The fidelity of replacement of xA by A in the daughter phage is particularly striking, with 99% of the daughter phage containing adenine at the test site in the genome. This establishes that the xDNA-replicating polymerase correctly incorporates T opposite xA despite the large size of this pair. The xC analog was also read correctly the majority of the time, with replacement by C in 88% of the cases and by A (implicating T-xC mispairing) in 10%. The other two xDNA bases were read incorrectly, with T incorporation opposite both xT and xG being dominant over the “correct” xDNA pairing. Despite the misreading of xG, its coding ambiguity was low, since it coded as A 95% of the time. We also carried out the same experiments with phage that had been passaged through E. coli in which the SOS response had been induced by UV light. The results showed (see Supporting Data) that this did not change the bases encoded by the large-sized bases significantly.
Although the xA and xC expanded bases are read correctly by the bacterial replication machinery, xT and xG are not, at least in this context. We speculate that the mispairings that are observed with these latter two bases arise from an alternative pairing geometry and from an alternative protonation state (Fig. S5). It is possible to pair T opposite xT with a geometry analogous to the T-G wobble, which may be closer to a Watson-Crick geometry. This might explain the observation of “correct” A-xT pairing only 26% of the time as compared with 73% T-xT mispairing. As for the xG base, in some contexts it can be deprotonated at pH values neutral and higher.[5f] If this occurred during replication it would present a structure that is more complementary to T than C. We note that although two xDNA bases are incorrectly processed in this context, this still leaves a substantial information-encoding capability. The correct coding of two xDNA pairs involves four different bases, which is, in principle, the same amount of information content as the natural genetic system.
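The information-content argument can be made concrete with a short calculation. The sketch below is an idealization, assuming that every letter of an alphabet is equally likely and read without error; the function name `bits_per_position` is hypothetical. Under that assumption the capacity per sequence position is the base-2 logarithm of the alphabet size, so the four bases involved in the xA-T and xC-G pairs carry the same nominal capacity as the four natural bases.

```python
import math


def bits_per_position(alphabet_size):
    """Idealized capacity per position, assuming equiprobable, error-free letters."""
    return math.log2(alphabet_size)


natural_alphabet = {"A", "C", "G", "T"}
# Bases involved in the two correctly read expanded pairs (xA-T and xC-G).
expanded_alphabet = {"xA", "T", "xC", "G"}

natural_bits = bits_per_position(len(natural_alphabet))
expanded_bits = bits_per_position(len(expanded_alphabet))
print(natural_bits, expanded_bits)
```

In practice the measured fidelities (99% for xA, 88% for xC) would lower the usable capacity somewhat, which this idealized count ignores.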
The finding of efficient replication for two of the large-sized pairs is surprising, given that replicative polymerases can be highly sensitive to nucleobase size.[8] To examine this in more detail we carried out preliminary in vitro experiments with DNA Polymerase I (Klenow fragment (Kf)), the most extensively studied of the E. coli polymerases. We used 28mer DNA templates containing a single xDNA base immediately downstream of a primer. To evaluate enzymatic efficiency and selectivity, we performed steady-state kinetics measurements of the addition of single natural nucleotides opposite each of these large bases. The results are given in Table 1.
Table 1.
| template base | nucleoside triphosphate | Vmax (% min⁻¹)[b] | Km (μM) | efficiency (Vmax/Km) | relative efficiency |
|---|---|---|---|---|---|
| xA | dATP | 0.0041 (0.0012) | 23 (10) | 1.9 × 10² | 2.4 × 10⁻² |
| xA | dCTP | 0.0035 (0.0005) | 9.9 (2.3) | 3.9 × 10² | 4.9 × 10⁻² |
| xA | dGTP | 0.0046 (0.0019) | 43 (20) | 1.1 × 10² | 1.4 × 10⁻² |
| xA | dTTP | 0.15 (0.01) | 18 (3) | 8.0 × 10³ | 1 |
| | | | | | |
| xC | dATP | 0.40 (0.04) | 4.6 (1.5) | 8.7 × 10⁴ | 6.7 × 10⁻¹ |
| xC | dCTP | 0.065 (0.004) | 4.1 (1.5) | 2.0 × 10⁴ | 1.5 × 10⁻¹ |
| xC | dGTP | 3.5 (0.2) | 28 (3) | 1.3 × 10⁵ | 1 |
| xC | dTTP | 0.19 (0.01) | 17 (5) | 1.2 × 10⁴ | 9.2 × 10⁻² |
| | | | | | |
| xG | dATP | 0.0034 (0.0004) | 13 (10) | 3.8 × 10² | 5.1 × 10⁻³ |
| xG | dCTP | 0.26 (0.06) | 8 (10) | 7.5 × 10⁴ | 1 |
| xG | dGTP | 0.0046 (0.0011) | 20 (9) | 2.7 × 10² | 3.6 × 10⁻³ |
| xG | dTTP | 0.0035 (0.0022) | 15 (17) | 1.3 × 10² | 1.7 × 10⁻³ |
| | | | | | |
| xT | dATP | 9.0 (0.3) | 50 (6) | 1.8 × 10⁵ | 1 |
| xT | dCTP | 0.19 (0.03) | 160 (30) | 1.2 × 10³ | 6.7 × 10⁻³ |
| xT | dGTP | 0.064 (0.008) | 880 (10) | 7.7 × 10¹ | 4.3 × 10⁻⁴ |
| xT | dTTP | 0.59 (0.03) | 17 (2) | 3.6 × 10⁴ | 2.0 × 10⁻¹ |
| | | | | | |
| T | dATP | 36 (4) | 3.4 (1.4) | 1.1 × 10⁷ | -- |

Conditions: 200 nM 23mer/28mer primer-template duplex and varied polymerase concentrations in a buffer containing 50 mM Tris•HCl (pH 7.5), 10 mM MgCl2, 50 μg/mL BSA and 1 mM dithiothreitol, incubated at 37°C in a reaction volume of 10 μL. Standard deviations (n=3–5) are given in parentheses.

[b] Normalized for the lowest enzyme concentration used.
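The efficiency and relative-efficiency columns of Table 1 can be approximately recomputed from the Vmax and Km columns. The Python sketch below does this for the xA template, using values copied from the table. It assumes that Km is converted from μM to M before dividing, an assumption that reproduces the order of magnitude of the printed efficiencies. The helper names `efficiency` and `relative_efficiencies` are illustrative and not taken from the original work.

```python
# (Vmax in % per min, Km in micromolar) for the xA template, copied from Table 1.
XA_KINETICS = {
    "dATP": (0.0041, 23.0),
    "dCTP": (0.0035, 9.9),
    "dGTP": (0.0046, 43.0),
    "dTTP": (0.15, 18.0),
}


def efficiency(vmax, km_micromolar):
    """Insertion efficiency Vmax/Km, with Km converted to molar (assumed units)."""
    return vmax / (km_micromolar * 1e-6)


def relative_efficiencies(kinetics):
    """Scale each nucleotide's efficiency to the most efficient nucleotide."""
    eff = {ntp: efficiency(vmax, km) for ntp, (vmax, km) in kinetics.items()}
    best = max(eff.values())
    return {ntp: value / best for ntp, value in eff.items()}


for ntp, rel in relative_efficiencies(XA_KINETICS).items():
    print(f"{ntp}: efficiency {efficiency(*XA_KINETICS[ntp]):.2e}, relative {rel:.2e}")
```

Because the table entries are rounded, values recomputed this way need not match the printed columns exactly, but the ranking (dTTP strongly preferred opposite xA) is the same.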
The kinetics data confirm that at least one natural enzyme can correctly read sequence information stored in size-expanded bases. Not surprisingly, the Kf enzyme is inefficient in constructing these large base pairs, with Vmax/Km values ca. 100–1000-fold below those for a natural base pair. Interestingly, this polymerase selectively chose the correct pairing partner with each of the four xDNA bases. Moreover, preliminary experiments on extension of an xDNA pair by this enzyme also show selective bypass of a correct pair over mismatched ones, again with very low efficiency (see Supporting Data). It appears that enzyme(s) other than Pol I are responsible for the efficient replication of xG and xT in vivo since these bases exhibited different cellular coding efficiencies (Fig. 3). It will be of interest in the future to explore the other classes of polymerases, including types that are functionally more flexible,[12] to evaluate which are able to efficiently replicate such large pairs. It will also be important to study the replication in new sequence contexts, and to evaluate exonuclease proofreading of such pairs.
Taken together, the results show (a) that a DNA polymerase is able to read the chemical information stored in the size-expanded bases, and (b) that the full replication machinery of E. coli is able to recognize the sequence encoded by two of the xDNA bases correctly and efficiently. This intriguing outcome suggests that it may be possible in the future to incorporate multiple xA or xC bases into phage genomes, or to incorporate xDNA pairs into plasmids that encode protein expression. In addition, it would be of interest to explore whether other organisms might also possess the ability to read xDNA pairs. The findings may ultimately lead to new strategies for modifying biological systems in useful ways.
Experimental Section
Synthesis of modified nucleosides and oligonucleotides
The deoxynucleoside phosphoramidite derivatives of xA, xT, xC, and xG were prepared as described previously.[5a–c] They were incorporated into oligodeoxynucleotides using the published methods, and were purified by HPLC and characterized by MALDI-TOF mass spectrometry (see Supporting Data).
Methods for enzyme kinetics
28-nt oligonucleotides containing single xA, xC, xG, xT residues were prepared along with a 23-nt complementary primer, which was labeled at its 5′ end with 32P. Polymerase reaction conditions were as follows: DNA concentration 5 μM in a 37 °C buffer containing 100 mM Tris•HCl (pH 7.5), 20 mM MgCl2, 2 mM dithiothreitol, and 0.1 mg/mL acetylated BSA. Enzyme and nucleotide concentrations were varied. Details of methods are given in the Supporting Data file.
Methods for cellular assays
The Competitive Replication of Adduct Bypass (CRAB) assay[10b] was used to determine the replication blocking (if any) by the size-expanded bases in M13 phage. Figure S1 shows an outline of the assay. Quantification of the modified and control phage sequences is performed on the daughter phage population as described. The Restriction Endonuclease And Postlabeling determination of mutation frequency (REAP) assay[10b] quantifies the type and amount of mutagenesis at the modified base site by obtaining the base composition at that position after cellular replication (Fig. S1). After PCR amplification, products are cleaved, labeled and enzymatically digested, then analyzed by TLC and quantified by phosphorimagery. Experiments were performed in triplicate.
Footnotes
We thank the U.S. National Institutes of Health (CA80024 to JME and GM63587 to ETK) for support. JG and HL acknowledge Stanford Graduate Fellowships.
Supporting information for this article is available on the WWW under http://www.angewandte.org or from the author.
Contributor Information
Dr. James C. Delaney, Departments of Chemistry and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
Dr. Jianmin Gao, Department of Chemistry, Stanford University, Stanford, CA 94305 USA
Dr. Haibo Liu, Department of Chemistry, Stanford University, Stanford, CA 94305 USA
Nidhi Shrivastav, Departments of Chemistry and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
Prof. Dr. John M. Essigmann, Email: jessig@mit.edu, Departments of Chemistry and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
Prof. Dr. Eric T. Kool, Email: kool@stanford.edu, Department of Chemistry, Stanford University, Stanford, CA 94305 USA
References
- 1. a) Breslow R. Pure Appl Chem. 1998;70:267–270; b) Kool ET, Waters ML. Nat Chem Biol. 2007;3:70–73. doi: 10.1038/nchembio0207-70.
- 2. a) Murakami A, Blake KR, Miller PS. Biochemistry. 1985;24:4041–4046. doi: 10.1021/bi00336a036; b) Nielsen PE, Egholm M, Berg RH, Buchardt O. Science. 1991;254:1497–1500. doi: 10.1126/science.1962210; c) Nielsen P, Pfundheller HM, Wengel J. Chem Commun. 1997:825–826; d) Lescrinier E, Esnouf R, Schraml J, Busson R, Heus H, Hilbers C, Herdewijn P. Chem Biol. 2000;7:719–731. doi: 10.1016/s1074-5521(00)00017-x; e) Egli M, Pallan PS, Pattanayek R, Wilds CJ, Lubini P, Minasov G, Dobler M, Leumann CJ, Eschenmoser A. J Am Chem Soc. 2006;128:10847–10856. doi: 10.1021/ja062548x; f) Schöning K, Scholz P, Guntha S, Wu X, Krishnamurthy R, Eschenmoser A. Science. 2000;290:1347–1351. doi: 10.1126/science.290.5495.1347.
- 3. a) Chaput JC, Szostak JW. J Am Chem Soc. 2003;125:9274–9275. doi: 10.1021/ja035917n; b) Shaw BR, Dobrikov M, Wang X, Wan J, He K, Lin JL, Li P, Rait V, Sergueeva Z, Sergueev D. Ann N Y Acad Sci. 2003;1002:12–29. doi: 10.1196/annals.1281.004; c) Pochet S, Kaminski PA, Van Aerschot A, Herdewijn P, Marlière P. C R Biol. 2003;326:1175–1184. doi: 10.1016/j.crvi.2003.10.004; d) Sinha S, Kim PH, Switzer C. J Am Chem Soc. 2004;126:40–41. doi: 10.1021/ja034986z; e) Jung KH, Marx A. Cell Mol Life Sci. 2005;62:2080–2091. doi: 10.1007/s00018-005-5117-0; f) Veedu RN, Vester B, Wengel J. Nucleosides Nucleotides Nucl Acids. 2007;26:1207–1210. doi: 10.1080/15257770701527844.
- 4. a) Piccirilli JA, Krauch T, Moroney SE, Benner SA. Nature. 1990;343:33–37. doi: 10.1038/343033a0; b) Moran S, Ren RXF, Rumney S, Kool ET. J Am Chem Soc. 1997;119:2056–2057. doi: 10.1021/ja963718g; c) Matray TJ, Kool ET. Nature. 1999;399:704–708. doi: 10.1038/21453; d) Tae EL, Wu Y, Xia G, Schultz PG, Romesberg FE. J Am Chem Soc. 2001;123:7439–7440. doi: 10.1021/ja010731e; e) Parsch J, Engels JW. Nucl Acids Res. 2001;20:815–818. doi: 10.1081/NCN-100002436; f) Weizman H, Tor Y. J Am Chem Soc. 2001;123:3375–3376. doi: 10.1021/ja005785n; g) Beuck C, Singh I, Bhattacharya A, Hecker W, Parmar VS, Seitz O, Weinhold E. Angew Chem Int Ed. 2003;42:3958–3960. doi: 10.1002/anie.200219972; h) Paul N, Nashine VC, Hoops G, Zhang P, Zhou J, Bergstrom DE, Davisson VJ. Chem Biol. 2003;10:815–825. doi: 10.1016/j.chembiol.2003.08.008; i) Henry AA, Romesberg FE. Curr Opin Chem Biol. 2003;7:727–733. doi: 10.1016/j.cbpa.2003.10.011; j) Hirao I, Harada Y, Kimoto M, Mitsui T, Fujiwara T, Yokoyama S. J Am Chem Soc. 2004;126:13298–13305. doi: 10.1021/ja047201d; k) Zhang X, Lee I, Zhou X, Berdis AJ. J Am Chem Soc. 2006;128:143–149. doi: 10.1021/ja0546830; l) Moore CL, Zivkovic A, Engels JW, Kuchta RD. Biochemistry. 2004;43:12367–12374. doi: 10.1021/bi0490791; m) Hirao I. Curr Opin Chem Biol. 2006;10:622–627. doi: 10.1016/j.cbpa.2006.09.021; n) Zahn A, Leumann CJ. Bioorg Med Chem. 2006;14:6174–6188. doi: 10.1016/j.bmc.2006.05.072; o) Hwang GT, Romesberg FE. J Am Chem Soc. 2008;130:14872–14882. doi: 10.1021/ja803833h; p) Hirao I, Mitsui T, Kimoto M, Yokoyama S. J Am Chem Soc. 2007;129:15549–15555. doi: 10.1021/ja073830m; q) Hikishima S, Minakawa N, Kuramoto K, Fujisawa Y, Ogawa M, Matsuda A. Angew Chem Int Ed. 2005;44:596–598. doi: 10.1002/anie.200461857; r) Battersby TR, Albalos M, Friesenhahn MJ. Chem Biol. 2007;14:525–531. doi: 10.1016/j.chembiol.2007.03.012; s) Doi Y, Chiba J, Morikawa T, Inouye MJ. J Am Chem Soc. 2008;130:8762–8768. doi: 10.1021/ja801058h.
- 5. a) Liu H, Gao J, Lynch SR, Maynard L, Saito D, Kool ET. Science. 2003;302:868–871. doi: 10.1126/science.1088334; b) Liu H, Gao J, Maynard L, Saito YD, Kool ET. J Am Chem Soc. 2004;126:1102–1109. doi: 10.1021/ja038384r; c) Gao J, Liu H, Kool ET. Angew Chem Int Ed. 2005;44:3118–3122. doi: 10.1002/anie.200500069; d) Gao J, Liu H, Kool ET. J Am Chem Soc. 2004;126:11826–11831. doi: 10.1021/ja048499a; e) Liu H, Lynch SR, Kool ET. J Am Chem Soc. 2004;126:6900–6905. doi: 10.1021/ja0497835; f) Lynch SR, Liu H, Gao J, Kool ET. J Am Chem Soc. 2006;128:14704–14711. doi: 10.1021/ja065606n.
- 6. Kim TW, Delaney JC, Essigmann JM, Kool ET. Proc Natl Acad Sci USA. 2005;102:15803–15808. doi: 10.1073/pnas.0505113102.
- 7. a) Scopes DI, Barrio JR, Leonard NJ. Science. 1977;195:296–298. doi: 10.1126/science.188137; b) Leonard NJ. Acc Chem Res. 1982;15:128–135.
- 8. Kim TW, Brieba LG, Ellenberger T, Kool ET. J Biol Chem. 2006;281:2289–2295. doi: 10.1074/jbc.M510744200.
- 9. a) Nohmi T. Ann Rev Microbiol. 2006;60:231–253. doi: 10.1146/annurev.micro.60.080805.142238; b) Jarosz DF, Godoy VG, Walker GC. Cell Cycle. 2007;6:817–822. doi: 10.4161/cc.6.7.4065.
- 10. a) Delaney JC, Essigmann JM. Chem Biol. 1999;6:743–753. doi: 10.1016/s1074-5521(00)80021-6; b) Delaney JC, Essigmann JM. Methods Enzymol. 2006;408:1–15. doi: 10.1016/S0076-6879(06)08001-3.
- 11. Delaney JC, Essigmann JM. Proc Natl Acad Sci USA. 2004;101:14051–14056. doi: 10.1073/pnas.0403489101.
- 12. Mizukami S, Kim TW, Helquist SA, Kool ET. Biochemistry. 2006;45:2772–2778. doi: 10.1021/bi051961z.