Author manuscript; available in PMC: 2013 Jun 24.
Published in final edited form as: Nat Chem Biol. 2012 Jun 3;8(7):612–614. doi: 10.1038/nchembio.966

KlenTaq polymerase replicates unnatural base pairs by inducing a Watson-Crick geometry

Karin Betz 1,5, Denis A Malyshev 2,5, Thomas Lavergne 2, Wolfram Welte 1, Kay Diederichs 1, Tammy J Dwyer 3, Phillip Ordoukhanian 4, Floyd E Romesberg 2,*, Andreas Marx 1,*
PMCID: PMC3690913  NIHMSID: NIHMS397943  PMID: 22660438

Abstract

Many candidate unnatural DNA base pairs have been developed, but surprisingly, some of the best replicated adopt intercalated structures in free DNA that are difficult to reconcile with known mechanisms of polymerase recognition. Here we present crystal structures of KlenTaq DNA polymerase at different stages of replicating one of the more promising pairs, dNaM-d5SICS, and show that efficient replication results from the polymerase itself inducing the required natural-like structure.


The development of a third, unnatural DNA base pair, and an expanded genetic alphabet, is a central goal of synthetic and chemical biology and would increase the functional diversity of nucleic acids, provide tools for their site-specific labeling1,2, increase the information potential of DNA3, and lay the foundation of a semi-synthetic organism4. DNA replication is a complex process during which DNA polymerases undergo large substrate-induced conformational changes from an “open” complex to a catalytically competent “closed” complex with the cognate triphosphate forming complementary Watson-Crick hydrogen-bonds (H-bonds) with the templating nucleotide that positions it for incorporation into the growing primer strand5. In contrast to these conformational changes in the polymerase, the structures of the two natural base pairs, both before and after covalent incorporation of the triphosphate, are virtually identical to those formed in duplex DNA, in the absence of a polymerase (Fig. 1a). Moreover, the structures of the two natural base pairs are virtually identical to each other, and the rigorous selection of this conserved structure by DNA polymerases is thought to be essential for high fidelity replication6-9. Given this exquisite structure-based substrate selectivity, efficient replication of an unnatural base pair would appear to require that it adopt a structure that closely mimics that of a natural base pair.

Figure 1. KlenTaq polymerase induces the dNaM-d5SICS unnatural base pair to adopt a natural, Watson-Crick-like structure.

Structure of (a) a natural dG-dC base pair and (b) dNaM-d5SICS. Chemical structures are shown at the top of each panel, with a comparison of the structure formed between two nucleotides in duplex DNA (left) and between the templating nucleotide and the incoming triphosphate in the active site of KlenTaq polymerase (right). C1′-C1′ distances are indicated, and CPK renderings are viewed from above and from the minor groove. Only nucleobases are shown with sugar and phosphates omitted for clarity.

Surprisingly, several of the best replicated candidate unnatural base pairs bear little or no resemblance to their natural counterparts and rely not on complementary H-bonding for their pairing, but rather on complementary hydrophobic and packing forces10,11, a strategy for unnatural base pair design first pursued in 199912. In particular, some of the most promising candidates belong to a family of analogues exemplified by dNaM-d5SICS (Fig. 1b, top), for which the efficiency of every step of unnatural base pair synthesis is within an order-of-magnitude of that for a natural base pair10,13. Moreover, DNA containing dNaM-d5SICS may be amplified by PCR or transcribed into RNA with efficiencies and fidelities that approach those of fully natural DNA14,15. However, based on the solution structure of duplex DNA with d5SICS paired opposite a dNaM analog16, and confirmed here for dNaM-d5SICS itself via solution state NOEs (Supplementary Results, Supplementary Fig. 1), this family of unnatural base pairs forms via an intercalative mode of pairing (Fig. 1b, left). Indeed, intercalation appears to be a general feature of predominantly hydrophobic base pairs17-19, which lack the H-bonds that favor the Watson-Crick-like edge-to-edge mode of pairing. While intercalation maximizes the packing interactions between the predominantly hydrophobic nucleobase analogues, it results in a structure that is very different from that of a natural base pair, and in fact, its structure is more similar to that of a mispair. It is thus difficult to reconcile the replication of these unnatural base pairs with the accepted mechanism of polymerase recognition5-9,20, especially in the case of dNaM-d5SICS, which is replicated with such high efficiency and fidelity. To address this apparent contradiction, we report the 1.9 to 2.2 Å resolution X-ray crystal structures (Supplementary Table 1 and Supplementary Fig. 2) of three binary complexes of the large fragment of Taq DNA polymerase I (KlenTaq) bound to templates with a natural nucleotide or dNaM at the first templating position, as well as two ternary complexes with cognate natural or unnatural triphosphates.

We first solved the structure of the binary complex of KlenTaq bound to a primer-template with dNaM at the first templating position (KTQdNaM) (Supplementary Methods, Fig. 2a). For comparison we also solved the structure of the binary complex between KlenTaq and a fully natural primer-template containing dG or dT at the same position (KTQdG and KTQdT) (Supplementary Fig. 3). In KTQdNaM, the polymerase adopts an overall structure that is similar to that observed in KTQdG and KTQdT (rmsd for Cα atoms: 0.83 Å and 0.35 Å, respectively). Moreover, the bound template of KTQdNaM is virtually superimposable with that of KTQdT (Supplementary Fig. 4), with the templating nucleotides flipped away from the developing duplex, and the two downstream nucleotides, dAT3 and dAT2, stacked on the exposed nascent base pair (Supplementary Fig. 3b,c). The hydrophobic nucleobase of dNaM is positioned in the same pocket as the templating dT of KTQdT, where it engages in hydrophobic packing interactions with O helix residues Met673, Tyr671, Phe667, Tyr664, and the template nucleotides dAT3 and dAT2. The methyl group of dNaM does not appear to make any specific contacts with the polymerase and the relatively weak signal in the electron density map indicates that it is nearly freely rotating. In contrast, the bound template of KTQdG adopts a conformation similar to that observed in the previously reported open structure of KlenTaq bound to a natural primer-template (PDB ID: 4KTQ)21. In these structures, the single-stranded template again kinks at its junction with the duplex portion of the primer-template, but instead of being packed by downstream nucleotides, the nascent base pair is packed by Tyr671 (Supplementary Fig. 5). The different binary complexes reveal that their structures are sequence-dependent, and the relatively large B factors (especially in the fingers domain and the region proximal to the primer terminus, including Tyr671) (Supplementary Fig. 6) suggest that they are relatively dynamic. Nonetheless, the structural similarity of KTQdNaM and KTQdT suggests that the unnatural nucleotide is not abnormally perturbative.
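
The Cα rmsd values quoted above summarize how closely two polymerase structures overlay. As an illustration only, the following Python sketch computes such a value for two coordinate sets that are assumed to be already superposed and listed in the same residue order; the function name `ca_rmsd` and the input format are hypothetical, and the text does not state which software or superposition protocol produced the reported numbers.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) positions.

    Assumption: both lists hold C-alpha coordinates in Angstrom, in the same
    residue order, and the structures have already been superposed. No
    fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from any deposited structure.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    shifted = [(x, y + 0.3, z) for (x, y, z) in reference]
    print(f"rmsd = {ca_rmsd(reference, shifted):.2f} A")
```

Because the sketch skips the superposition step, it would overestimate the rmsd for structures that sit in different crystal frames; a least-squares fit would normally precede this calculation.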

Fig. 2. Unnatural base pair formation induces conformational transitions of KlenTaq and the formation of a natural-like ternary complex.

Structure of complexes showing helices O and O1, primer-template, and d5SICSTP (if present) in: (a) KTQdNaM binary complex (yellow); (b) KTQdNaM-d5SICSTP ternary complex (purple); and (c) their superposition highlighting the structural transition induced by d5SICSTP binding. Schematic illustration of conformational transition induced by d5SICSTP binding (d) and superposition of binary and ternary complexes (e). Superposition of KTQdNaM-d5SICSTP (purple) and KTQdG-dCTP (grey) illustrating the similarities of: (f) helices O and O1, and primer-template; (g) active site; and (h) catalytically critical network of side chains, water molecules, and Mg2+ ions (water molecules and magnesium ions are shown as light pink and purple spheres, respectively, for KTQdNaM-d5SICSTP and as dark grey and light grey spheres, respectively, for KTQdG-dCTP; incoming triphosphate is labeled dNTP, templating nucleotide is labeled with dN and dideoxynucleotide at the primer terminus is labeled with ddN, as appropriate).

To investigate whether the formation of dNaM-d5SICS is able to induce conformational changes similar to those induced by the formation of a natural base pair5,21,22, we next solved the structure of the corresponding ternary complex (KTQdNaM-d5SICSTP) (Fig. 2b). The structure of KTQdNaM-d5SICSTP reveals that d5SICSTP is bound in the active site, and as with natural substrates, its binding does indeed induce the closure of the fingers domain over the active site and a dramatic conformational change in the single-stranded portion of the template, with the phosphate backbone moved significantly and dNaM flipped back along the axis of the developing duplex where the two unnatural nucleotides pair (Fig. 2c-e). Interestingly, similar stabilization of the catalytically active complex is apparently not afforded by mispairing between dNaM and a natural triphosphate, as repeated attempts to soak crystals of KTQdNaM with natural triphosphates failed to produce a stable ternary complex. For a more detailed comparison of the conformational changes induced by correct natural or unnatural triphosphate binding, we also solved the structure of the analogous fully natural complex (KTQdG-dCTP). The structures of KTQdG-dCTP and KTQdNaM-d5SICSTP are similar to each other (Fig. 2f; rmsd for Cα atoms: 0.43 Å), and to the fully natural ternary complex of KlenTaq reported earlier (PDB ID: 3KTQ)21 (rmsd for Cα atoms between KTQdG-dCTP and 3KTQ: 0.30 Å). Relative to KTQdG-dCTP, the active site of KTQdNaM-d5SICSTP is slightly enlarged to accommodate the unnatural base pair, and the relatively larger B-factors suggest that the fingers domain is somewhat more flexible (Supplementary Fig. 6).

A more detailed comparison of the active sites of KTQdG-dCTP and KTQdNaM-d5SICSTP further reveals their similarity (Fig. 2g and h). Just as in the fully natural complex, the orientation of the unnatural triphosphate is stabilized by interactions between its phosphate groups and the side chains of His639, Arg659, Lys663, and the amide backbone of Gln613. Also as with the natural triphosphates, the sugar rings of dNaM and d5SICSTP adopt the C3′-endo conformation, and the phosphate groups of the incoming triphosphate coordinate the two catalytically essential magnesium ions, which also coordinate polymerase residues Asp785, Asp610 and Tyr611 (Fig. 2h). The ortho substituents of both unnatural nucleobases, which structure-activity-relationship data reveal are essential for replication23, are oriented into the developing minor groove, in a fashion analogous to that of the H-bond acceptors of the natural nucleobases24. The d5SICSTP sulfur atom participates in a water-mediated H-bonding network with Glu615, Gln754 and Asn750, and the dNaM methoxy group, unlike in the binary structure, is well ordered and packed with the sulfur atom of d5SICSTP, the backbone of Gly668, and the carbonyl group of Phe667 on one side, and the guanine of the 3′ template nucleotide on the other. Lastly, the distance between the sugar C3′ of d5SICSTP and the α-phosphorus atom in KTQdNaM-d5SICSTP is virtually identical to that observed for the natural triphosphate in KTQdG-dCTP (3.9 and 3.8 Å, respectively).

Most remarkably, unlike the intercalated structure formed in a free duplex, the nucleobases of dNaM and d5SICSTP adopt a co-planar structure with nearly optimal edge-to-edge packing (average distance of 4.2 Å between the hydrophobic edges of the nucleobases), and a C1′-C1′ internucleotide distance that is roughly the same as that of a natural base pair (11.0 Å versus 10.6 Å, respectively, compared to 9.1 Å for dNaM-d5SICS in a free duplex). Thus, despite the absence of Watson-Crick-like H-bonds, and unlike in duplex DNA, the structure of dNaM-d5SICS in the polymerase active site is similar to that of a natural base pair (Fig. 1). The reduced level of intercalation is likely due in part to the A-form structure of the primer terminus, which is wider than the B-form structure of the free duplex. However, given that the triphosphate is only constrained by non-covalent interactions, greater intercalation should be possible, and the fact that it is not observed suggests that the sum of the interactions between the developing base pair and the polymerase active site favors a planar, Watson-Crick-like geometry.
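
The C1′-C1′ comparison above reduces to a Euclidean distance between two atoms. As a minimal sketch, assuming standard fixed-column PDB ATOM/HETATM records, the following Python code extracts the C1′ atoms and measures the distance between two chosen residues. The helper names, the placeholder residue names `XXA` and `XXB`, the chain letters, the residue numbers, and the toy coordinates are all hypothetical and are not taken from the deposited structures.

```python
import math


def parse_c1_atoms(pdb_text):
    """Map (chain ID, residue number) to the (x, y, z) of that residue's C1' atom.

    Assumes fixed-column PDB records: atom name in columns 13-16, chain ID in
    column 22, residue number in columns 23-26, coordinates in columns 31-54.
    """
    atoms = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        if name not in ("C1'", "C1*"):  # "C1*" is the older naming convention
            continue
        key = (line[21], int(line[22:26]))
        atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms


def c1_c1_distance(pdb_text, res_a, res_b):
    """Distance in Angstrom between the C1' atoms of two (chain, number) residues."""
    atoms = parse_c1_atoms(pdb_text)
    return math.dist(atoms[res_a], atoms[res_b])


def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a fixed-column HETATM record for the toy example below."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}".format(
        "HETATM", serial, name, res_name, chain, res_seq, x, y, z
    )


if __name__ == "__main__":
    # Toy records with made-up coordinates (a 6-8-10 right triangle).
    toy = "\n".join([
        atom_line(1, "C1'", "XXA", "T", 204, 0.0, 0.0, 0.0),
        atom_line(2, "C1'", "XXB", "A", 1, 6.0, 8.0, 0.0),
    ])
    print(f"{c1_c1_distance(toy, ('T', 204), ('A', 1)):.1f} A")  # prints 10.0 A
```

Applied to real coordinate files, the same measurement on the templating nucleotide and the incoming triphosphate would be the quantity compared in the paragraph above, provided the correct chain and residue identifiers are supplied.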

The data provide an explanation for the empirical observation that complementary H-bond formation is not required for the efficient and selective replication of DNA10-12,25. dNaM-d5SICS is efficiently and selectively replicated because its formation provides a suitably strong driving force to induce the required structural transitions in the polymerase, and because it also possesses sufficient plasticity to adapt to the structure it induces in the polymerase. Correspondingly, the data reveal that the polymerase active site is not only capable of selecting for a correct structure among the pairing nucleotides, but also at least in some cases it is capable of enforcing it. Moreover, the efficient replication of dNaM-d5SICS by a variety of other polymerases, including polymerases from different families13,14, suggests that these observations with KlenTaq may be general. It is interesting to speculate that polymerases may have evolved to favor a co-planar geometry to prevent natural nucleotide mispairing via cross-strand intercalation and instead allow only the more specific, edge-to-edge H-bonding interactions. Finally, the data reveal that structural mimicry of a natural base pair is not required for unnatural base pair design and that, as is the case with protein structure and folding, the strong but relatively plastic nature of hydrophobic and packing forces makes them particularly well suited to underlie an expanded genetic alphabet. Further studies aimed at elucidating the factors underlying the efficient continued DNA synthesis after dNaM-d5SICS synthesis are currently underway.

Accession codes. Protein Data Bank: The atomic coordinates and structure factors for the reported crystal structures are deposited under accession 3SZ2 (KTQdG), 3SV4 (KTQdT), 3SYZ (KTQdNaM), 3RTV (KTQdG-dCTP), 3SV3 (KTQdNaM-d5SICSTP).

Acknowledgements

We thank the beamline staff of the Swiss Light Source at the Paul Scherrer Institute for their assistance during data collection. This work was supported by the Konstanz Research School Chemical Biology (to K.B.) and the National Institutes of Health (GM060005 to F.E.R.).

Footnotes

Author contributions K.B., D.A.M., T.L., F.E.R., and A.M. conceived the project, designed the experiments and analyzed the data. K.B., D.A.M., T.L., and P.O. performed chemical synthesis. K.B., W.W., and K.D. performed crystallography studies. T.J.D. performed the NMR experiments and D.M. and T.J.D. performed modeling studies. K.B., D.A.M., A.M., and F.E.R. wrote the manuscript.

Competing financial interests The authors declare no competing financial interests.

Additional information Supplementary information is available online at http://www.nature.com/naturechemicalbology/.

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.
