Proceedings of the National Academy of Sciences of the United States of America. 2020 Aug 17;117(35):21274–21280. doi: 10.1073/pnas.2009415117

How a B family DNA polymerase has been evolved to copy RNA

Woo Suk Choi a, Peng He a,1, Arti Pothukuchy b, Jimmy Gollihar b, Andrew D Ellington b, Wei Yang a,2
PMCID: PMC7474658  PMID: 32817521

Significance

RTX is a reverse transcriptase evolved in vitro from the B family DNA polymerase KOD. Structural analyses of RTX in complex with either a DNA duplex or an RNA–DNA hybrid and comparison with the apo, binary, and ternary complex structures of the original KOD polymerase shed light on how to engineer and alter substrate specificity of enzymes. Among the 16 substitutions that result in the reverse transcriptase activity, only six occur at the substrate-binding surface, and the others change domain–domain interfaces in the polymerase to enable RNA–DNA hybrid binding and reverse transcription. The intrinsically flexible Thumb domain seems to play a major role in accommodating the RNA–DNA hybrid product distal to the active site.

Keywords: reverse transcription, proofreading, 3′ to 5′ exonuclease, Thumb domain

Abstract

We report here crystal structures of a reverse transcriptase RTX, which was evolved in vitro from the B family polymerase KOD, in complex with either a DNA duplex or an RNA–DNA hybrid. Compared with the apo, binary, and ternary complex structures of the original KOD polymerase, the 16 substitutions that result in the function of copying RNA to DNA do not change the overall protein structure. Only six substitutions occur at the substrate-binding surface, and the others change domain–domain interfaces in the polymerase to enable RNA–DNA hybrid binding and reverse transcription. Most notably, F587L at the Palm and Thumb interface stabilizes the open and apo conformation of the Thumb. The intrinsically flexible Thumb domain seems to play a major role in accommodating the RNA–DNA hybrid product distal to the active site. This is reminiscent of naturally occurring RNA-dependent DNA polymerases, including telomerase, which have a dramatically augmented Thumb domain, and of reverse transcriptase, which extends its Thumb with the RNase H domain.


Genetic information flows from DNA to RNA to protein, but reverse transcription, which copies RNA to DNA, occurs in retroviruses, retrotransposons, and telomere maintenance (1, 2). Naturally occurring reverse transcriptase (RT) is heat labile and lacks the proofreading 3′ to 5′ exonuclease activity that removes misincorporated dNMP and improves the replicative polymerase fidelity by two orders of magnitude (3, 4). In the absence of proofreading, the error rate of DNA synthesis by RT is 10⁻⁴ (1 misincorporation in 10,000 nucleotides incorporated) (5). Reverse transcription is widely used to recover cDNA from RNA in the laboratory and in diagnostic tests, including COVID-19 detection (6–8). Therefore, improved efficiency and fidelity of RT are highly desirable.
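As a rough illustration of these numbers, the following Python sketch estimates how many misincorporations to expect in a cDNA of a given length, assuming independent errors at a fixed per-nucleotide rate. The function names, the 1,000 nt example length, and the assumption that proofreading lowers the error rate by exactly 100-fold are illustrative choices, not values measured in this work.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected misincorporations, assuming independent errors per nucleotide."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that a product of length_nt carries no misincorporation."""
    return (1.0 - error_rate) ** length_nt


RT_ERROR_RATE = 1e-4       # from the text: 1 misincorporation in 10,000 nt
PROOFREADING_GAIN = 100    # "two orders of magnitude"; assumed exact here
PROOFREAD_ERROR_RATE = RT_ERROR_RATE / PROOFREADING_GAIN

cdna_length = 1000         # hypothetical cDNA length in nucleotides
for label, rate in [("no proofreading", RT_ERROR_RATE),
                    ("with proofreading", PROOFREAD_ERROR_RATE)]:
    print(f"{label}: {expected_errors(cdna_length, rate):.3f} expected errors, "
          f"{prob_error_free(cdna_length, rate):.4f} error-free fraction")
```

Under these assumptions the benefit of proofreading grows with product length, because the error-free fraction decays exponentially with the number of nucleotides copied.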

Thirty years ago, A family DNA polymerases were found to have weak intrinsic reverse transcriptase activity (9). Improvements in the RT activity of A family polymerases have been made by isolating new polymerases (10) and introducing random as well as targeted mutations to existing thermostable enzymes (11–13). Thermophilic polymerases are desirable for melting RNA secondary structures and also essential for RT-PCR reactions carried out by a single polymerase (11).

The most accurate and thermostable DNA polymerases for PCR, however, belong to the B family (14, 15). RTX was directly evolved in vitro from a thermostable B family DNA polymerase isolated from Thermococcus kodakaraensis, which is known as KOD and widely used for high-efficiency and high-accuracy PCR (16–19). Because commercially available RT-PCR kits use a thermostable DNA polymerase and heat-labile RT without the proofreading function, the goal was to create a comparably thermostable RNA-dependent DNA polymerase (or RT). With a total of 16 amino acid substitutions in the 774-residue KOD (17) (SI Appendix, Table S1), RTX has both RNA- and DNA-dependent DNA polymerase activities like a native RT (16). RTX binds an RNA–DNA hybrid less well than a DNA duplex substrate, and its RNA-templated synthesis is less efficient than that templated by DNA (16). The catalytic rate (kcat) of RTX on DNA (49.1 s⁻¹) is lower than that of KOD (160.4 s⁻¹) (16); the RNA-dependent DNA synthesis rate of RTX (47.8 s⁻¹) is also slightly lower than that of HIV-1 RT (61 s⁻¹) (20). Unlike natural RT, however, RTX is heat resistant, retains the proofreading 3′-5′ exonuclease activity of KOD, and is able to self-correct and remove wrongly incorporated nucleotides copied from an RNA template (16).
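To put the catalytic rates quoted above side by side, the short Python sketch below computes their ratios. The dictionary layout and the helper name `fold_change` are illustrative assumptions. Only the four kcat values come from the text, and kcat alone does not capture substrate binding or overall catalytic efficiency.

```python
# Catalytic rates (kcat, in s^-1) quoted in the text; keys are (enzyme, template).
KCAT = {
    ("KOD", "DNA"): 160.4,
    ("RTX", "DNA"): 49.1,
    ("RTX", "RNA"): 47.8,
    ("HIV-1 RT", "RNA"): 61.0,
}


def fold_change(numerator, denominator):
    """Ratio of two kcat values looked up by (enzyme, template) key."""
    return KCAT[numerator] / KCAT[denominator]


rtx_vs_kod_dna = fold_change(("RTX", "DNA"), ("KOD", "DNA"))
rtx_rna_vs_dna = fold_change(("RTX", "RNA"), ("RTX", "DNA"))
rtx_vs_hiv_rna = fold_change(("RTX", "RNA"), ("HIV-1 RT", "RNA"))

print(f"RTX/KOD on DNA:       {rtx_vs_kod_dna:.2f}")
print(f"RTX on RNA vs DNA:    {rtx_rna_vs_dna:.2f}")
print(f"RTX/HIV-1 RT on RNA:  {rtx_vs_hiv_rna:.2f}")
```

By these ratios, RTX turns over at roughly one-third the rate of KOD on DNA, while its rates on RNA and DNA templates are nearly equal.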

The B family DNA polymerases carry an intrinsic proofreading exonuclease (Exo) domain N terminal to the polymerase module in the same polypeptide chain (5). All DNA polymerase modules adopt a right-hand-like structure, composed of the Finger, Thumb, and Palm domains. As with all high-fidelity DNA polymerases, KOD undergoes an open to closed transition of the Finger domain upon binding the correct incoming nucleotide prior to the chemical reaction (5). In the tertiary structure of KOD, Exo forms a large extension opposite to the Palm domain, which contains the catalytic residues of the polymerase (21). The Exo and Palm domains flank and support the flexible Finger domain, which in the B family is composed of a pair of long antiparallel α-helices (Fig. 1) (22–27). At the very N terminus of the B family polymerases is the N domain, which contacts the Exo, Finger, and Palm domains and appears to coordinate the catalytic activities of Pol and Exo (22, 24). With the C-terminal Thumb domain, B family DNA polymerases often resemble a closed ring that encircles the DNA and dNTP substrate (Fig. 1A). Within this ring, DNA substrate can toggle between the polymerase and exonuclease active sites (28).

Fig. 1.

Structures of RTX complexed with DNA and an RNA–DNA hybrid. (A) Structural changes in KOD from binding only DNA (PDB ID code 4K8Z) to binding both DNA and an incoming dNTP (PDB ID code 5OMF). The binary complex is shown in gray, and the ternary complex is in pink. Domain movements are indicated by red arrowheads. (B) Diagram of the DNA and RNA–DNA hybrid cocrystallized with RTX. (C) Cartoon diagram of the RTX–DNA–dNTP structure. Each domain is color-coded, and the template and primer strands are shown in orange and yellow. The dNTP is shown as sticks. (D) Cartoon diagram of the RTX–RNA–DNA structure. Domains are colored as in C. The RNA template is shown in dark brown. The 15 substitutions in KOD that convert it to RTX are shown as sticks and labeled.

Recently, the crystal structure of a KOD ternary complex with DNA and an incoming dATP analog (2′,3′-dideoxy ATP) was reported, making KOD one of the few B family DNA polymerases whose apo, binary (DNA pol with DNA), and ternary complex structures are known (29–31). Comparison of the KOD binary and ternary complex structures reveals that upon binding of the correct incoming dNTP, conformational changes occur in the N, Exo, and Thumb domains in addition to the well-known closing of the Finger domain. In the ternary complex (polymerization mode) the ring-shaped protein becomes slightly open between the Exo and Thumb domains (31) (Movie S1) (Fig. 1A). Unique to the B family DNA polymerases, the most flexible part of the KOD polymerase is the Thumb domain, which is composed of two disjointed parts, Thumb-1 and Thumb-2 (29). Between the apo and substrate-bound KOD structures, Thumb-1 is reconfigured and rotated by 32°, while the partially disordered Thumb-2 becomes ordered and moves 17 Å to contact the DNA product (SI Appendix, Fig. S1). The Thumb-2 movement is significantly larger than that of the Finger (6 Å) or Thumb-1 regions (5 Å) (30).

To understand how 16 amino acid substitutions in KOD result in the function of RNA-dependent DNA synthesis by RTX and eventually to make RTX a better RT with proofreading activity, we have determined crystal structures of RTX complexed with either a DNA-duplex or RNA–DNA-hybrid substrate and compared these structures with the KOD apo and binary- and ternary-complex structures (29–31). In addition to a few mutations lining the substrate-binding surface and directly altering the preference for DNA versus RNA, the majority of mutations appear to weaken interdomain interactions (N, Exo, Finger, Palm, and Thumb) or stabilize individual domain structures and thus indirectly affect substrate preference. The analysis reported here deepens our understanding of how B family DNA polymerases and RT function and provides a platform for future improvement of engineered RTs.

Results

Structure Determination.

RTX was crystallized with a 23/20 nt (nucleotide) DNA duplex or a 16/13 nt RNA–DNA hybrid, each with a 3 nt 5′ overhang on the template strand (Fig. 1B), and the crystals diffracted X-rays to 2.4 and 2.5 Å resolution, respectively (SI Appendix, Table S2). These structures were solved by molecular replacement using the KOD–DNA binary complex (30) as a search model and refined (Fig. 1 C and D and SI Appendix, Table S2 and Methods). As in all KOD structures, the last 18 residues at the C terminus (757 to 774) are disordered and were not modeled. In the 2.4 Å crystal structure of the RTX–DNA complex, although an incoming nucleotide (dAMPNPP, a nonhydrolyzable analog of dATP) and two Mg2+ ions were bound in the active site, the Finger domain was not fully closed, as it would be in a reaction-competent ternary complex (31). This RTX–DNA complex is superimposable with the KOD–DNA binary complex structure with an rmsd of 0.74 Å over 725 pairs of Cα atoms. The largest difference is in the Finger domain of RTX, with its tip shifting 4.6 Å toward the closed conformation (Fig. 2A). Additional structural changes are also observed in the N, Exo, and Thumb domains, which may be correlated with the mutations of K118I, M137L, and F587L among the 16 changes that result in the reverse transcription function of RTX (Fig. 2 B–D and SI Appendix, Table S1). The implications of these three substitutions will be discussed later.
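For readers unfamiliar with the rmsd values quoted here, the following Python sketch shows how a root-mean-square deviation over paired Cα atoms is computed once two structures have been superimposed. It is a minimal illustration, not the procedure used in this work: the optimal superposition step (for example, a Kabsch-style least-squares fit) is omitted, and the coordinates are invented toy values rather than atoms from any deposited structure.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """RMSD (in the input units) over paired atoms that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy Cα coordinates in Å (hypothetical, for illustration only).
structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_2 = [(0.0, 1.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 1.0)]

print(f"rmsd = {rmsd(structure_1, structure_2):.2f} Å")
```

Because the deviations are squared before averaging, a few atoms that move a long way (such as a Finger tip shifting several ångströms) can coexist with a sub-ångström rmsd over hundreds of well-aligned pairs.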

Fig. 2.

Structural details of the RTX–DNA complex. (A) Superposition of RTX–DNA–dNTP with the KOD–DNA binary complex (PDB ID code 4K8Z). RTX is colored as in Fig. 1C; KOD is shown in light gray. The finger movement is circled in blue. A zoom-in view of the active site is shown. Mg2+ ions are shown as purple spheres. (B–D) Zoom-in views of K118I (N domain, B), M137L (Exo domain, C), and F587L (Palm domain, D), showing their effects on the structural change in RTX compared with KOD.

In the 2.5 Å RTX–RNA–DNA complex structure, the 3′ end of the primer strand and the templating RNA are shifted away from the reactive position, and the incoming nucleotide binding site is empty (Fig. 1D). Not surprisingly, the Finger is in the open conformation. Distal from the catalytic center, the Thumb domain, which normally wraps around the upstream product duplex (Fig. 3A), is significantly different from that of KOD and RTX in complex with DNA (Fig. 3 A–C). Thumb-1 (aa 615 to 660 and 728 to 756, interfacing with the Palm domain), however, is surprisingly more similar to apo than to DNA-bound KOD (Fig. 3D). Thumb-2 (aa 662 to 720), adjacent to the Exo domain, is only half closed compared to the DNA-bound structure (Fig. 3 A and B) and has weak electron density, indicating that it is relatively mobile. Concurrently, the region of the Exo domain that normally contacts the Thumb is partially disordered (Fig. 3E). When the Exo and Palm domains of the two RTX structures are superimposed, every domain appears to undergo a rigid-body movement (Fig. 3B), and the rmsd over 623 pairs of Cα atoms (excluding Thumb-2) is 1.34 Å. The ring-shaped RTX is more open when complexed with the RNA–DNA hybrid than with the DNA duplex (Movie S2).

Fig. 3.

Structural details of the RTX–RNA–DNA complex. (A) Thumb movement of RTX when complexed with the RNA–DNA hybrid relative to the KOD–DNA binary complex (PDB ID code 4K8Z). The RTX-R structure is shown in light olive except for the red active site residues, and the KOD–DNA complex is shown in light green gray. The Palm domain and active site of the two are superimposed, but Thumb-1 and Thumb-2 are dramatically different. (B) Superposition of RTX-D and RTX-R confirms the large movement of Thumb-1 and Thumb-2. RTX-D is shown in multiple colors as in Fig. 1C. (C) A zoom-in view of 3B focusing on restructured Thumb-1 in RTX. (D) Comparison of Thumb-1 between RTX-R and apo KOD (PDB ID code 1WN7) after superimposing the Exo and Palm domains of RTX and KOD. (E) A zoom-in view of 3B focusing on the Exo and Thumb-2 interface, where it is disordered in the RTX-R structure.

The structural differences between RTX in the DNA and RNA–DNA complexes could be due to different crystal lattices (SI Appendix, Table S2). However, the largest difference occurs in the Thumb domain, correlating with binding of an RNA–DNA hybrid versus a DNA duplex (Fig. 3C). Because the RNA–DNA hybrid adopts a mixed A and B hybrid form instead of a B form, the minor groove becomes wider and shallower in the hybrid form. The mobility of the Thumb and part of the Exo domain appears to play a role in accommodating the RNA template. One of the 16 substitutions, F587L, occurs at the Palm and Thumb interface, and the branched side chain of Leu appears to stabilize the Thumb-1 in the apo-like conformation (Figs. 2D and 3D). The flexible Thumb-2 and reduced interactions with the DNA duplex and RNA–DNA hybrid may underlie the reduced efficiency of DNA synthesis by RTX.

In the following sections, 15 substitutions in RTX relative to KOD (exclusive of W768R due to its disorder in all KOD and RTX structures) are analyzed in four groups according to their locations relative to the substrate (SI Appendix, Table S1) and potential influence on RNA-templated DNA synthesis.

Substrate-Binding Interface.

R381H, Y384H, and V389I, which are located along a loop extended from the Exo to the Palm domain, interact with the template backbone at the −1 to −4 positions (upstream of the replicating base pair) in the RTX–DNA and KOD ternary complex structures (Fig. 4A). Unfortunately, in the RTX–RNA–DNA complex structure, the RNA template is disordered at the −1 position and is shifted away from the polymerase at the −2 to −4 positions (SI Appendix, Fig. S2). R381H and Y384H are also poorly defined in the electron density map. The RNA strand has an additional hydroxyl group on C2′ and adopts A form and C3′-endo conformations. Despite the absence of detailed protein–RNA interactions, these three substitutions likely facilitate accommodation of the RNA template. R381H and Y384H substitutions reduce the protein volume at the interface with RNA, while the V389I substitution is correlated with an increased distance to the RNA template compared to the DNA duplex (Fig. 4B). We suspect that these three substitutions have an overall effect of reducing binding to either RNA or DNA templates.

Fig. 4.

Substitutions at the substrate-binding interface. (A) Stereo diagram of six substitutions in the RTX–DNA–dNTP complex in the linker between Exo and Palm (R381H, Y384H, and V389I) and in the Thumb domain (E664K, G711V, and N735K). The KOD binary complex (PDB ID code 4K8Z) is superimposed in light gray for comparison. RTX is color-coded as in Fig. 1C. (B) A stereoview from the back (relative to A) of the six substitutions in the RTX–RNA–DNA complex. The KOD ternary complex (PDB ID code 5OMF) is superimposed in light pink for comparison. The substitutions are shown as balls and sticks.

Residues E664K, G711V, and N735K occur in the Thumb domain, near the primer strand at the −5 position (E664K) and in contact with the template strand at the −8 and −9 positions across the minor groove (Fig. 4A). The increased positive charge and size of the side chains likely enhance Thumb domain interactions with the substrate distal to the active site. In the DNA binary complex structures of KOD and RTX, E664 and E664K are within van der Waals contact with the DNA primer, but in the KOD ternary complex E664 is more distant from the DNA (Fig. 4A). Substitution of E664 with Lys may aid RTX attachment to the primer strand and thus processivity in compensation for reducing the interactions with the template proximal to the active site. Similarly, the Nε of N735K forms a salt bridge with the template phosphate group at the −8 position in the RTX–DNA complex (Fig. 4A).

Changes in the RTX substrate interface relative to KOD can be summarized as decreasing interactions with the template strand near the replicating base pair in the active site by introducing smaller and less positively charged side chains but enhancing binding to both primer and template strands farther upstream by increasing protein size and positive charge in the Thumb domain.

The Active Site.

Two substitutions in the Palm domain, T514I and I521L, occur in the vicinity of the active site. T514 is immediately adjacent to the steric gate (Y409) that prevents incorporation of ribonucleotides (rNTPs) (22, 32). Interestingly, the Ile substitution maintains the interaction that T514 makes with Y409: Cδ1 of Ile and Cγ2 of Thr are both within van der Waals contact distance (3.3 and 3.6 Å, respectively) of the hydroxyl group of Y409 in the DNA complex structures (Fig. 5A). The increased bulk of the Ile substitution at 514 instead induces a side chain rotamer change in the nearby R518. In RTX, the orientation change in the R518 side chain shifts it toward the template backbone in both DNA and RNA–DNA complexes at the −3 position. One α-helical turn away, with the Leu of I521L in RTX, the side chain branch point is altered to avoid the close contacts with the carbonyl oxygen of R518 (Fig. 5B). The I521L substitution retains its support for the β-turn structure of the catalytic motif DxD but reduces the number of clashes with T541, which is sandwiched by the two most conserved Asp residues (33) (Fig. 5B). As a result, even without occupancy of a templating base or incoming dNTP, the active site of the RTX–RNA–DNA complex is indistinguishable from the fully occupied active site observed in the KOD–DNA–dNTP ternary complex. In contrast, the active site in the apo KOD structure adopts a different conformation (Protein Data Bank [PDB] ID code 1WN7; SI Appendix, Fig. S3). It will be interesting to check whether the T514I and I521L substitutions stabilize the catalytic center and make KOD a more efficient DNA polymerase.

Fig. 5.

Substitutions near the active site or in the Finger domain. (A) Substitution of T514I in RTX is correlated with the rotamer change in R518 nearby. The active site is marked by the incoming dNTP (dAMPNPP) and two Mg2+ ions (shown as purple spheres). (B) I521L in RTX eliminates the clash with R518 and reduces the number of close contacts with T541 in the DxD catalytic motif (D540 and D542) in KOD structures (marked by red double arrowheads and distance). The substitution also causes a main chain shift of the DxD motif in RTX (marked by the curved red arrow). (C) K466R in Finger forms a salt bridge with E475 and may stabilize the solvent-exposed Finger in RTX (colored blue) compared with the KOD binary (light gray) and ternary (light pink) complexes with K466. (D) Y493L in RTX relieves the clashing contact of Y493 with F356 in KOD. Side chains of interest are shown as sticks and balls and colored according to the domain colors of Figs. 1 C and D and 2A.

The Finger Stability.

K466R and Y493L are in the Finger domain and appear to stabilize the two-helix structure and avoid the close contacts observed in the KOD structure (Fig. 5C). K466R is located on the first α-helix near the tip of the Finger and entirely exposed to solvent. In the apo and binary-complex structures of KOD, K466 is in the vicinity of E475 on the second α-helix of the Finger domain (Fig. 5C); in the ternary-complex structure, K466 moves toward E475 to form a charge–charge interaction. In RTX, the Arg substitution of K466 strengthens the interaction with E475 by forming double salt bridges in the RTX–RNA–DNA complex and thus stabilizes the pair of helices even when the Finger domain is open. The second substitution, Y493L, is in the hydrophobic core formed by the Finger, Exo, and Palm domains when packed against the replicating base pair (Fig. 5D). In all KOD structures Y493 is extremely close to F356, with its hydroxyl group within 3.3 Å of the edge of the benzene ring of F356. Substitution by Leu relieves the close contact with F356 and increases flexibility in the hydrophobic core shared by the three domains (Fig. 5D). These two substitutions in the Finger domain may stabilize the RTX while making it more flexible.

Global Flexibility and Domain Interfaces.

The remaining five substitutions, F38L, R97M, and K118I in the N domain, M137L in the Exo domain, and F587L in the Palm domain, are distal from the active sites of Pol and Exo (Figs. 2 and 6). F38L and R97M are adjacent to one another in the tertiary structure. The two smaller side chains at these positions result in an enlarged internal cavity in the RTX N domain, which likely increases its flexibility (Fig. 6A). Moreover, R97 is on a positively charged surface that mediates template strand binding, which previously was identified as recognizing uracil (deaminated cytosine) in single-stranded templates before they enter the active site (34, 35). Although 16 Å away from the +1 template (1 nt downstream from the templating base) in the crystal structures (Fig. 6B), R97 in KOD would contact the template strand, which with natural substrates is much longer than the 14 or 23 nt templates in the crystal structures. F38L and R97M together may thus increase the flexibility of the N domain and reduce template interactions and, consequently, result in accommodation of an RNA template.

Fig. 6.

Domain interface and overall flexibility of RTX. (A) A zoom-in view of F38L, R97M, and K118I (N domain) in the RTX-D structure. The KOD–DNA complex structure (PDB ID code 4K8Z) is superimposed and shown in semitransparent light gray. (B) The molecular surface of KOD is shown with its charge potential, where blue represents positive and red represents negative charge potential. R97 is part of the positively charged surface that may bind the template strand in KOD (see the zoom-in view with a dark background). R97M and K118I reduce the positive charge of the surface in RTX.

As previously mentioned when describing the overall structure of the RTX–DNA complex (Fig. 2 B–D), the K118I, M137L, and F587L substitutions cause subtle but discernible changes between otherwise superimposable KOD and RTX binary complexes with DNA. Because these three substitutions consist of branched hydrophobic side chains with increased stiffness (K118I and M137L), hydrophobic core packing and domain–domain interactions surrounding the substrate-binding interface (Fig. 2 B–D) are affected. K118I belongs to the N domain, and the mutation increases the hydrophobicity and flexibility at the convergence of the N, Exo, and Finger domains that contacts the template strand proximal to the active site (Fig. 2A). The K118I substitution leads to a rotamer change in W355 (Fig. 2B) and is possibly responsible for the 1 Å shift of the N domain (Fig. 2A). In addition, the terminal amine of K118 forms a salt bridge with the Exo domain (D343), which likely rigidifies the domain–domain interaction in KOD. K118I compensates for the loss of this salt bridge by filling the hydrophobic core. On the other hand, M137L is correlated with the opening of an adjacent loop, aa 129 to 135, which may ease the sliding of the Finger domain relative to Exo during the transition between binary and ternary complexes. These substitutions in RTX appear to “lubricate” the N–Exo and Exo–Finger interfaces.

Of all the single amino acid substitutions in RTX, F587L causes the most dramatic structural changes. F587L occurs in the Palm domain at its interface with Thumb-1. When both RTX and KOD are complexed with DNA, the structural differences between them are local. F587L disrupts van der Waals contact between F587 (Palm) and F748 (Thumb) and causes repacking of the Palm–Thumb interface (Fig. 2D). However, the rest of the Thumb domain is superimposable between KOD and RTX, and RTX maintains a mode of DNA binding similar to KOD (Fig. 2A). In the RTX–RNA–DNA complex, by contrast, Thumb-1 rotates 24° and is completely repacked (Fig. 3 A–C). Interestingly, the Palm–Thumb interface in the RTX–RNA–DNA complex is similar to that in the KOD apo structure (Fig. 3D), suggesting that the F587L substitution stabilizes the “relaxed” Thumb conformation (Movie S3). Concurrently, Thumb-2 (the second half of Thumb and distal to the Palm domain) moves significantly to accommodate the RNA–DNA hybrid in the RTX–RNA–DNA complex (Figs. 1D and 3A). The large changes observed in the Thumb domain are probably the primary means for accommodating an RNA–DNA hybrid at the cost of reduced contact and stability of polymerase–substrate interactions.
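The four groups of substitutions analyzed above can be tabulated in a small data structure. The Python sketch below is one hypothetical way to do so: the group names follow the section headings, the labels are those given in the text, and the `Substitution` type and `parse` helper are assumptions introduced only for illustration.

```python
import re
from typing import Dict, List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the KOD residue
    position: int    # residue number in KOD
    mutant: str      # one-letter code of the RTX residue


def parse(label: str) -> Substitution:
    """Parse a label such as 'F587L' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


# Grouping as described in the Results subsections.
GROUPS: Dict[str, List[str]] = {
    "substrate-binding interface": ["R381H", "Y384H", "V389I",
                                    "E664K", "G711V", "N735K"],
    "active site": ["T514I", "I521L"],
    "Finger stability": ["K466R", "Y493L"],
    "global flexibility and domain interfaces": ["F38L", "R97M", "K118I",
                                                 "M137L", "F587L"],
}
NOT_MODELED = ["W768R"]  # disordered in all KOD and RTX structures

analyzed = [parse(label) for labels in GROUPS.values() for label in labels]
total = len(analyzed) + len(NOT_MODELED)

for name, labels in GROUPS.items():
    print(f"{name}: {len(labels)} substitutions")
print(f"analyzed: {len(analyzed)}, total in RTX: {total}")
```

The tally makes the bookkeeping explicit: six substitutions lie at the substrate-binding interface, nine more fall in the other three groups, and W768R accounts for the difference between the 15 analyzed and the 16 present in RTX.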

Discussion

RTX Acquires Changes Both Proximal and Distal to Substrate Binding.

Amino acid substitutions designed to engineer reverse transcription activity in a DNA polymerase often focus on residues in the vicinity of the replicating base pair and DNA-binding surface. Interestingly, many of the 16 substitutions in RTX, which were selected by directed evolution using compartmentalized self-replication to copy increasing lengths of RNA (16, 36), are far from substrate and dNTP binding. It is well known that the active sites of RNA and DNA polymerases can be highly similar to the point of being nearly interchangeable (37, 38), and the nascent base pair bound to DNA polymerase is often in A form (3′-endo) and predisposed to be RNA-like (22, 39, 40). DNA polymerases in the A family often possess reverse transcriptase activities (9, 10, 41), and the RT activity can be enhanced by increasing pH and Mn2+ concentrations in the reaction buffer (42), which reduces the polymerase fidelity. RTX, however, demonstrates that conversion from a stringent DNA-dependent to a permissive RNA-dependent polymerase in standard Mg2+ buffer is aided by nonintuitive changes well outside the polymerase active site.

The distal substitutions, for example, F38L, R97M, K118I, M137L, and F587L, alter domain–domain interactions and result in the opening of the substrate-binding surface. We tend to think the major differences between RNA and DNA are the extra hydroxyl (2′-OH) groups in RNA and sugar pucker change from 2′-endo to 3′-endo. However, for a DNA polymerase, which usually binds 10 bp (base pair) of a double helix, to become a reverse transcriptase, it must accommodate a wider RNA–DNA hybrid (22 Å in diameter vs. 20 Å for a DNA duplex) before sensing any details of RNA. When we initially compared RTX with the DNA-bound KOD structures (Figs. 2 and 3), the F587L substitution appeared to deform the Thumb domain. However, we find that when compared to the KOD apo structure, F587L does not create a new conformational state but merely changes the equilibrium of the existing states (Movie S3). By stabilizing the apo state of KOD, the F587L substitution allows RTX to keep the Thumb open to accommodate the wider RNA–DNA product during reverse transcription.

Creating space for RNA binding is also a feature of substitutions along the substrate binding interface. Two substitutions proximal to the active site, R381H and Y384H, result in reduced protein size while maintaining hydrogen bond potential to accommodate the RNA–DNA hybrid (Fig. 4). These substitutions would reduce polymerase–DNA interactions and may thus reduce the DNA polymerase efficiency. Perhaps to compensate for this reduction, three substitutions in the Thumb domain distal to the active site, E664K, G711V, and N735K, gain positive charges and size and appear to stabilize enzyme–substrate interactions. To maintain the stability of the overall structure while the domain interactions become more flexible in RTX, certain substitutions appear to form an additional salt bridge (K466R), increase hydrophobicity and size (K118I and T514I), or alleviate close contacts (I521L and Y493L).

Features Shared by RTX, Reverse Transcriptase, and Telomerase.

Compared with the five established families of DNA polymerases (A, B, C, X, and Y), naturally occurring reverse transcriptase (including the catalytic subunit of telomerase) is most closely related to the B family polymerases. In both RT and B family polymerases the Thumb domain occurs at the very C terminus and is directly linked to the Palm domain via a 3-stranded β-sheet (5, 43). In contrast, in A family polymerases, such as Taq Pol, the Thumb domain precedes both the Finger and Palm domains. As a result, the Thumb domain in the A family is less flexible than in the B family (SI Appendix, Fig. S1) (44, 45) but interacts with the upstream template more fluidly with a single α-helical dipole (SI Appendix, Fig. S4A) (46). In contrast, Thumb-2 of the B family interacts with the template strand more substantially, with two α-helical dipoles and positively charged side chains affixed by a β-sheet (SI Appendix, Fig. S4B). As a result, the A family polymerases can naturally accommodate an RNA–DNA duplex and exhibit detectable reverse transcriptase activity (9, 11). Because of the specific structure of Thumb-1 and Thumb-2, B family DNA polymerases are more specific for DNA recognition than the A family enzymes.

Among B family polymerases, eukaryotic Pol α is a primase and extends an RNA primer opposite a DNA template, while Pol δ is a strict DNA polymerase (25, 47, 48). We find that both yeast and human Pol α contain a three-residue insertion relative to Pol δ or KOD in the Thumb domain at the interface with the Palm domain (SI Appendix, Fig. S4C). The insertion is adjacent to the F587L mutation in RTX, and the Thumb domain of Pol α contacts only the primer strand (SI Appendix, Fig. S4D), unlike KOD and Pol δ, which bind both template and primer.

Being at the C terminus, the Thumb domains in reverse transcriptase and telomerase exhibit significant flexibility and are able to accommodate insertions. For example, the Thumb domain in telomerase, known as C-terminal extension, is much larger than any Thumb found in DNA polymerases and forms a large interface with the RNA–DNA hybrid (49, 50) (SI Appendix, Fig. S5A). Although the Thumb domain in reverse transcriptase is not very large, the interface with the RNA–DNA hybrid is extended by the addition of an RNase H domain, which specifically recognizes RNA–DNA hybrids (1, 51) (SI Appendix, Fig. S5B). Telomerase and naturally occurring reverse transcriptase demonstrate the same trend as we observe with RTX, that contacts of the upstream RNA–DNA hybrid could greatly enhance the RT activity. As the active site does not strongly discriminate against an RNA template, the polymerase activity and processivity largely depend on how well the enzyme binds the upstream RNA–DNA hybrid. If an RNA–DNA hybrid binding domain (52) were appended to the C terminus of RTX, we suspect that the RT activity would be further enhanced.

Materials and Methods

The RTX protein was prepared as described previously (16). Crystallographic and structural analyses were carried out according to established protocols. Details are given in SI Appendix.

Acknowledgments

We thank Drs. R. Craigie, M. Gellert, and D. J. Leahy for critical reading of the manuscript. This research was supported by National Institute of Diabetes and Digestive and Kidney Disease Intramural Grants DK036144 and DK036146 (to W.Y.), the Welch Foundation (Grant F-1654), and NIH Grant 1R01EB027202-01A0 (to A.D.E.).

Footnotes

The authors declare no competing interest.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2009415117/-/DCSupplemental.

Data Availability.

The structure coordinates and structure factors (SI Appendix, Table S2) have been deposited in the Protein Data Bank (PDB ID codes 6WYA and 6WYB).
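
For readers who want to inspect the deposited structures, the following is a minimal sketch of how the two entries might be retrieved. It assumes the public RCSB file server URL pattern (`https://files.rcsb.org/download/<ID>.cif`), which is not specified in this article; the names `coordinate_url` and `fetch_all` are hypothetical helpers introduced only for illustration.

```python
from urllib.request import urlretrieve

# PDB ID codes taken from the Data Availability statement.
PDB_IDS = ("6WYA", "6WYB")

# Assumption: public RCSB file server URL pattern (not given in the article).
RCSB_URL = "https://files.rcsb.org/download/{pdb_id}.cif"


def coordinate_url(pdb_id):
    """Return the assumed RCSB download URL (mmCIF format) for one entry."""
    return RCSB_URL.format(pdb_id=pdb_id.upper())


def fetch_all(dest_dir="."):
    """Download every entry into dest_dir; requires network access."""
    paths = []
    for pdb_id in PDB_IDS:
        path = f"{dest_dir}/{pdb_id}.cif"
        urlretrieve(coordinate_url(pdb_id), path)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Only print the URLs here; call fetch_all() to actually download.
    for pdb_id in PDB_IDS:
        print(coordinate_url(pdb_id))
```

Calling `fetch_all()` would write one mmCIF file per entry to the chosen directory, provided the assumed URL pattern is reachable.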

References

  1. Telesnitsky A., Goff S. P., “Reverse transcriptase and the generation of retroviral DNA” in Retroviruses, Coffin J. M., Hughes S. H., Varmus H. E., Eds. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1997).
  2. Nandakumar J., Cech T. R., Finding the end: Recruitment of telomerase to telomeres. Nat. Rev. Mol. Cell Biol. 14, 69–82 (2013).
  3. Menéndez-Arias L., Sebastián-Martín A., Álvarez M., Viral reverse transcriptases. Virus Res. 234, 153–176 (2017).
  4. Reha-Krantz L. J., DNA polymerase proofreading: Multiple roles maintain genome stability. Biochim. Biophys. Acta 1804, 1049–1063 (2010).
  5. Yang W., Gao Y., Translesion and repair DNA polymerases: Diverse structure and mechanism. Annu. Rev. Biochem. 87, 239–261 (2018).
  6. Peano C., Severgnini M., Cifola I., De Bellis G., Battaglia C., Transcriptome amplification methods in gene expression profiling. Expert Rev. Mol. Diagn. 6, 465–480 (2006).
  7. Sasagawa Y., Hayashi T., Nikaido I., Strategies for converting RNA to amplifiable cDNA for single-cell RNA sequencing methods. Adv. Exp. Med. Biol. 1129, 1–17 (2019).
  8. Zhu H., Fohlerová Z., Pekárek J., Basova E., Neužil P., Recent advances in lab-on-a-chip technologies for viral diagnosis. Biosens. Bioelectron. 153, 112041 (2020).
  9. Jones M. D., Foulkes N. S., Reverse transcription of mRNA by Thermus aquaticus DNA polymerase. Nucleic Acids Res. 17, 8387–8388 (1989).
  10. Moser M. J. et al., Thermostable DNA polymerase from a viral metagenome is a potent RT-PCR enzyme. PLoS One 7, e38371 (2012).
  11. Sauter K. B., Marx A., Evolving thermostable reverse transcriptase activity in a DNA polymerase scaffold. Angew. Chem. Int. Ed. Engl. 45, 7633–7635 (2006).
  12. Kranaster R. et al., One-step RNA pathogen detection with reverse transcriptase activity of a mutated thermostable Thermus aquaticus DNA polymerase. Biotechnol. J. 5, 224–231 (2010).
  13. Raghunathan G., Marx A., Identification of Thermus aquaticus DNA polymerase variants with increased mismatch discrimination and reverse transcriptase activity from a smart enzyme mutant library. Sci. Rep. 9, 590 (2019).
  14. Ishino S., Ishino Y., DNA polymerases as useful reagents for biotechnology–The history of developmental research in the field. Front. Microbiol. 5, 465 (2014).
  15. Aschenbrenner J., Marx A., DNA polymerases and biotechnological applications. Curr. Opin. Biotechnol. 48, 187–195 (2017).
  16. Ellefson J. W. et al., Synthetic evolutionary origin of a proofreading reverse transcriptase. Science 352, 1590–1593 (2016).
  17. Takagi M. et al., Characterization of DNA polymerase from Pyrococcus sp. strain KOD1 and its application to PCR. Appl. Environ. Microbiol. 63, 4504–4510 (1997).
  18. Atomi H., Fukui T., Kanai T., Morikawa M., Imanaka T., Description of Thermococcus kodakaraensis sp. nov., a well studied hyperthermophilic archaeon previously reported as Pyrococcus sp. KOD1. Archaea 1, 263–267 (2004).
  19. Terpe K., Overview of thermostable DNA polymerases for classical PCR applications: From molecular and biochemical fundamentals to commercial systems. Appl. Microbiol. Biotechnol. 97, 10243–10254 (2013).
  20. Kerr S. G., Anderson K. S., Pre-steady-state kinetic characterization of wild type and 3′-azido-3′-deoxythymidine (AZT) resistant human immunodeficiency virus type 1 reverse transcriptase: Implication of RNA directed DNA polymerization in the mechanism of AZT resistance. Biochemistry 36, 14064–14070 (1997).
  21. Wang J. et al., Crystal structure of a pol alpha family replication DNA polymerase from bacteriophage RB69. Cell 89, 1087–1099 (1997).
  22. Franklin M. C., Wang J., Steitz T. A., Structure of the replicating complex of a pol alpha family DNA polymerase. Cell 105, 657–667 (2001).
  23. Berman A. J. et al., Structures of phi29 DNA polymerase complexed with substrate: The mechanism of translocation in B-family polymerases. EMBO J. 26, 3494–3505 (2007).
  24. Wang F., Yang W., Structural insight into translesion synthesis by DNA Pol II. Cell 139, 1279–1289 (2009).
  25. Perera R. L. et al., Mechanism for priming DNA synthesis by yeast DNA polymerase α. eLife 2, e00482 (2013).
  26. Hogg M. et al., Structural basis for processive DNA synthesis by yeast DNA polymerase ε. Nat. Struct. Mol. Biol. 21, 49–55 (2014).
  27. Jain R. et al., Cryo-EM structure and dynamics of eukaryotic DNA polymerase δ holoenzyme. Nat. Struct. Mol. Biol. 26, 955–962 (2019).
  28. Hogg M., Aller P., Konigsberg W., Wallace S. S., Doublié S., Structural and biochemical investigation of the role in proofreading of a beta hairpin loop found in the exonuclease domain of a replicative DNA polymerase of the B family. J. Biol. Chem. 282, 1432–1444 (2007).
  29. Hashimoto H. et al., Crystal structure of DNA polymerase from hyperthermophilic archaeon Pyrococcus kodakaraensis KOD1. J. Mol. Biol. 306, 469–477 (2001).
  30. Bergen K., Betz K., Welte W., Diederichs K., Marx A., Structures of KOD and 9°N DNA polymerases complexed with primer template duplex. ChemBioChem 14, 1058–1062 (2013).
  31. Kropp H. M., Betz K., Wirth J., Diederichs K., Marx A., Crystal structures of ternary complexes of archaeal B-family DNA polymerases. PLoS One 12, e0188005 (2017).
  32. Astatke M., Ng K., Grindley N. D., Joyce C. M., A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc. Natl. Acad. Sci. U.S.A. 95, 3402–3407 (1998).
  33. Copeland W. C., Wang T. S., Mutational analysis of the human DNA polymerase alpha. The most conserved region in alpha-like DNA polymerases is involved in metal-specific catalysis. J. Biol. Chem. 268, 11028–11040 (1993).
  34. Greagg M. A. et al., A read-ahead function in archaeal DNA polymerases detects promutagenic template-strand uracil. Proc. Natl. Acad. Sci. U.S.A. 96, 9045–9050 (1999).
  35. Fogg M. J., Pearl L. H., Connolly B. A., Structural basis for uracil recognition by archaeal family B DNA polymerases. Nat. Struct. Biol. 9, 922–927 (2002).
  36. Ghadessy F. J., Ong J. L., Holliger P., Directed evolution of polymerase function by compartmentalized self-replication. Proc. Natl. Acad. Sci. U.S.A. 98, 4552–4557 (2001).
  37. Doublié S., Tabor S., Long A. M., Richardson C. C., Ellenberger T., Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391, 251–258 (1998).
  38. Jeruzalmi D., Steitz T. A., Structure of T7 RNA polymerase complexed to the transcriptional inhibitor T7 lysozyme. EMBO J. 17, 4101–4113 (1998).
  39. Doublié S., Sawaya M. R., Ellenberger T., An open and closed case for all polymerases. Structure 7, R31–R35 (1999).
  40. Ling H., Boudsocq F., Woodgate R., Yang W., Crystal structure of a Y-family DNA polymerase in action: A mechanism for error-prone and lesion-bypass replication. Cell 107, 91–102 (2001).
  41. Myers T. W., Gelfand D. H., Reverse transcription and DNA amplification by a Thermus thermophilus DNA polymerase. Biochemistry 30, 7661–7666 (1991).
  42. Grabko V. I., Chistyakova L. G., Lyapustin V. N., Korobko V. G., Miroshnikov A. I., Reverse transcription, amplification and sequencing of poliovirus RNA by Taq DNA polymerase. FEBS Lett. 387, 189–192 (1996).
  43. Yang W., Lee Y. S., A DNA-hairpin model for repeat-addition processivity in telomere synthesis. Nat. Struct. Mol. Biol. 22, 844–847 (2015).
  44. Kim Y. et al., Crystal structure of Thermus aquaticus DNA polymerase. Nature 376, 612–616 (1995).
  45. Li Y., Mitaxov V., Waksman G., Structure-based design of Taq DNA polymerases with improved properties of dideoxynucleotide incorporation. Proc. Natl. Acad. Sci. U.S.A. 96, 9491–9496 (1999).
  46. Wang W., Hellinga H. W., Beese L. S., Structural evidence for the rare tautomer hypothesis of spontaneous mutagenesis. Proc. Natl. Acad. Sci. U.S.A. 108, 17644–17648 (2011).
  47. Swan M. K., Johnson R. E., Prakash L., Prakash S., Aggarwal A. K., Structural basis of high-fidelity DNA synthesis by yeast DNA polymerase delta. Nat. Struct. Mol. Biol. 16, 979–986 (2009).
  48. Baranovskiy A. G. et al., Mechanism of concerted RNA-DNA primer synthesis by the human primosome. J. Biol. Chem. 291, 10006–10020 (2016).
  49. Lingner J. et al., Reverse transcriptase motifs in the catalytic subunit of telomerase. Science 276, 561–567 (1997).
  50. Mitchell M., Gillis A., Futahashi M., Fujiwara H., Skordalakes E., Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nat. Struct. Mol. Biol. 17, 513–518 (2010).
  51. Tian L., Kim M. S., Li H., Wang J., Yang W., Structure of HIV-1 reverse transcriptase cleaving RNA in an RNA/DNA hybrid. Proc. Natl. Acad. Sci. U.S.A. 115, 507–512 (2018).
  52. Nowotny M. et al., Specific recognition of RNA/DNA hybrid and enhancement of human RNase H1 activity by HBD. EMBO J. 27, 1172–1181 (2008).


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
