Proceedings of the National Academy of Sciences of the United States of America. 2002 Jul 15;99(15):10090–10095. doi: 10.1073/pnas.152186199

Y586F mutation in murine leukemia virus reverse transcriptase decreases fidelity of DNA synthesis in regions associated with adenine–thymine tracts

Wen-Hui Zhang, Evguenia S Svarovskaia, Rebekah Barr, Vinay K Pathak
PMCID: PMC126629  PMID: 12119402

Abstract

Using in vivo fidelity assays in which bacterial β-galactosidase or green fluorescent protein genes served as reporters of mutations, we have identified a murine leukemia virus (MLV) RNase H mutant (Y586F) that exhibited an increase in the retroviral mutation rate ≈5-fold in a single replication cycle. DNA-sequencing analysis indicated that the Y586F mutation increased the frequency of substitution mutations 17-fold within 18 nt of adenine–thymine tracts (AAAA, TTTT, or AATT), which are known to induce DNA bending. Sequence alignments indicate that MLV Y586 is equivalent to HIV-1 Y501, a component of the recently described RNase H primer grip domain, which contacts and positions the DNA primer strand near the RNase H active site. The results suggest that wild-type reverse transcriptase (RT) facilitates a specific conformation of the template–primer duplex at the polymerase active site that is important for accuracy of DNA synthesis; when an adenine–thymine tract is within 18 nt of the polymerase active site, the Y586F mutant RT cannot facilitate this specific template–primer conformation, leading to an increase in the frequency of substitution mutations. These findings indicate that the RNase H primer grip can affect the template–primer conformation at the polymerase active site and that the MLV Y586 residue and template–primer conformation are important determinants of RT fidelity.


Reverse transcriptase (RT), which copies retroviral genomic RNA into double-stranded DNA during reverse transcription, contains two major enzymatic activities: a DNA polymerase activity that uses either RNA or DNA as a template and an RNase H activity that can degrade RNA in an RNA⋅DNA hybrid (1). The crystal structure of HIV-1 RT in complex with a template–primer and a dNTP substrate has shown that the polymerase active site and the RNase H cleavage site are separated by 18 nt of the template–primer (2). In the case of murine leukemia virus (MLV) RT, the DNA polymerase and RNase H activities reside in physically separable domains of a single monomeric protein (3).

The RNase H activity of all retroviral RTs performs several functions that are essential for the viral life cycle (4), which include degradation of the RNA template after synthesis of the minus-strand DNA, generation of a specific polypurine primer from which plus-strand DNA synthesis initiates, subsequent removal of the polypurine primer to complete synthesis of the viral double-stranded DNA, and removal of the minus-strand tRNA primer (5). During minus-strand DNA synthesis, RT positions the RNase H active site and cleaves the RNA ≈18 nt 3′ of the site of DNA synthesis (6).

All retroviral RTs carry out error-prone DNA synthesis, in part because they lack a proofreading activity and possess a low affinity for the template (7). The low fidelity of retroviral replication is a major force for generation of high levels of variation in retroviral populations (8), which allows them to adapt quickly to changes in their environment. One important outcome of this high adaptability is the rapid development of resistance to antiretroviral drugs and escape from host immune responses (9). Mutations that occur in retroviruses during replication include substitutions, frameshifts, deletions, duplications, and hypermutations (10, 11). These mutations can occur through misincorporation, dislocation mutagenesis, or RT template switching during reverse transcription (8, 12–15).

Structural determinants of MLV and HIV-1 RTs that are important for replication fidelity have been studied both in vitro and in vivo (16–22). Single amino acid substitutions at the catalytic site YXDD motif, dNTP substrate-binding site, and RNase H domain can increase the in vivo retroviral mutation rate up to 2.8-fold (21, 22). We previously showed that three substitution mutations in the MLV RNase H domain (S526A, Y598V, and R657S) modestly increased the retroviral mutation rate and decreased RT template switching (22, 23). In recent studies we found that the Y586F mutation in MLV RNase H also substantially decreased the frequency of RT template switching (W.-H.Z. and V.K.P., unpublished results). In an ongoing effort to identify structural determinants of RT that play an important role in fidelity, we now have determined the effect of the Y586F mutation in the RNase H primer grip domain of MLV RT on the fidelity of retroviral replication.

The RNase H primer grip domain was identified recently as a structural element of HIV-1 RT that is involved in the control of RNase H cleavage specificity (24). It consists of several amino acids in the RNase H domain, including HIV-1 Y501, that interact with the DNA primer and was proposed to position the DNA primer strand near the RNase H active site and help to determine the trajectory of the template–primer in relation to the RNase H active site. An alignment of MLV, HIV-1, and several other retroviral RNase H domains as well as the Escherichia coli RNase H domain indicates that MLV Y586 is equivalent to the HIV-1 Y501 residue (Fig. 1). The MLV Y586 and HIV-1 Y501 residues are part of a DSXY motif that is conserved among all RNase H domains except for Rous sarcoma virus, which has phenylalanine (F) at this position (25). The aspartate residue of the DSXY motif has been shown to be one of the conserved residues that are necessary for MLV and HIV-1 RNase H catalytic activity (25, 26).
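As a small illustration of what locating a DSXY motif involves computationally, the following sketch scans a protein sequence for Asp-Ser-X-Tyr and can optionally accept Phe at the last position, as described above for Rous sarcoma virus. The function name `find_dsxy` and the peptide used are hypothetical; the peptide is not one of the sequences aligned in Fig. 1.

```python
import re

def find_dsxy(protein, allow_phe=False):
    """Return (0-based index, matched residues) for each D-S-X-Y occurrence.

    With allow_phe=True, F is also accepted at the fourth position (D-S-X-F).
    """
    last = "[YF]" if allow_phe else "Y"
    # A lookahead with a capture group reports overlapping matches as well.
    pattern = re.compile(f"(?=(DS.{last}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein.upper())]

# Hypothetical peptide, for illustration only.
toy = "MKADSRYTGLDSAFQ"
strict_hits = find_dsxy(toy)
relaxed_hits = find_dsxy(toy, allow_phe=True)
print(strict_hits)
print(relaxed_hits)
```

A real analysis would start from a multiple sequence alignment rather than a pattern search, because equivalent positions are defined by the alignment and not by the motif alone.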

Figure 1.

Amino acid sequence alignment of the DSXY motif in the RNase H domain of RT and in E. coli RNase H. The DSXY motif is boxed. Residues identical to MLV are shaded in gray, and the Y586 residue of MLV and equivalent tyrosines (Y) are shaded in black. The aligned amino acids correspond to α-helix A, β-sheet 4, and part of α-helix B of HIV-1. SRV-I, simian retrovirus type I; SNV, spleen necrosis virus; EIAV, equine infectious anemia virus; RSV, Rous sarcoma virus; HTLV-I and HTLV-II, human T cell leukemia virus types I and II; BLV, bovine leukemia virus; HAP, hamster A particle; MPMV, Mason–Pfizer monkey virus; VL, visna lentivirus; HFV, human foamy virus.

The results of this study show that the Y586F substitution in MLV RT increased the in vivo mutation rate by ≈5-fold. A large proportion of the substitution mutations were near adenine–thymine tracts (AAAA, TTTT, or AATT sequences), known as A tracts, which are associated with alterations in nucleic acid conformation that include bends and a narrowed minor groove (27–30).

Materials and Methods

Plasmids and Retroviral Vectors.

The MLV-based retroviral vectors pGA-1 and pMP1 were described previously (31, 32). Both vectors contain an internal ribosomal entry site of encephalomyocarditis virus that drives the expression of the neomycin-resistance gene (neo); in addition, pGA-1 expresses the β-galactosidase gene (lacZ), and pMP1 expresses the green fluorescent protein gene (GFP). Plasmid pLGPS expresses MLV gag-pol (33). The Y586F substitution mutation in the MLV RNase H domain of pLGPS was generated by use of the QuikChange site-directed mutagenesis kit (Stratagene). Presence of the desired mutation and absence of other mutations were verified by DNA sequencing and restriction-enzyme digestions. Plasmid pSV-hygro expresses the hygromycin phosphotransferase B gene (hygro; ref. 34). pSV-A-MLV-env expresses the amphotropic MLV envelope gene (35).

Cells, Transfections, and Infections.

D17 dog osteosarcoma cells and D17-based cell lines A3, ANGIE P, and A3GFP11 were maintained, transfected, and infected as described previously (36). The ANGIE P cells stably express amphotropic MLV envelope from pSV-A-MLV-env and contain a single GA-1 provirus (22). The A3GFP11 cells were constructed by infection of A3 cells (an amphotropic MLV envelope-expressing cell line) with MP-1 virus produced from PG13 cells. Southern analysis of A3GFP11 genomic DNA indicated the presence of a single provirus (data not shown). The MLV helper cell line PG13 was maintained and transfected by using a calcium-phosphate precipitation method (37).

Assays for Determining in Vivo Fidelity.

ANGIE P cells were cotransfected with wild-type or Y586F mutant pLGPS and pSV-hygro. The resulting hygromycin-resistant colonies were pooled (>2,000 per experiment), and the GA-1 virus produced was used to infect D17 target cells. Infected D17 cells were selected for resistance to G418 (400 μg/ml) and stained with 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside as described previously (22).

Similarly, A3GFP11 cells were cotransfected with wild-type or Y586F mutant pLGPS and pSV-hygro. Viruses produced from the transfected A3GFP11 cells then were used to infect D17 cells as described above. The frequency of GFP inactivation was obtained by examining individual drug-resistant colonies by fluorescence microscopy (Axiovert inverted fluorescence microscope, Zeiss) to determine the GFP-positive (fluorescent) or -negative (nonfluorescent) phenotype.

Isolation of Single Nonfluorescent Cell Clones by Fluorescence-Activated Cell Sorting (FACS).

G418-resistant colonies were pooled from the dishes infected with MP1 viruses and subjected to FACS (Becton Dickinson). The individual nonfluorescent cells, which did not express functional GFP, were separated and sorted into 24-well plates by FACS. The nonfluorescent phenotype of the cell clones was verified by fluorescence microscopy.

Genomic DNA Isolation, PCR, and Sequence Analysis.

Single nonfluorescent cells were grown in 24-well plates and expanded into 60-mm-diameter dishes. The cell clones then were harvested and lysed according to the manufacturer's instructions (QIAamp DNA Blood mini kit, Qiagen, Valencia, CA). GFP fragments were amplified by PCR using primers MP1623F (5′-tcactccttctctaggcgccggaattgg) and MP2390R (5′-ggaattggccgctcacttgtacagctcg) and Takara ExTaq polymerase (PanVera, Madison, WI). DNA sequencing was performed to identify the mutations present in the inactivated GFP gene (Laboratory of Molecular Technology, Sequencing Core, Science Applications International Corporation, Frederick, MD).

Statistical Analysis.

The χ² test was used to determine whether the frequency of substitutions near A tracts was as expected by random distribution for the wild-type or Y586F RTs. The χ² test was also used to determine whether the substitution frequencies for the wild-type and Y586F RTs were significantly different from each other. For each comparison, the P value for statistical significance was set at 0.05. Logistic regression analyses were used to determine the probability of gene inactivation for the wild-type and Y586F RTs and to compare the relative increases in lacZ and GFP gene inactivation.
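The comparison between the two RTs can be sketched as a 2 × 2 contingency test. The example below is a minimal sketch, assuming a Pearson χ² statistic without continuity correction and one degree of freedom; the paper does not state which variant was used. The counts are the substitutions near and not near A tracts reported in Results (34 of 42 for Y586F, 7 of 26 for wild type), and the function name `chi2_2x2` is illustrative.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: RT variant; columns: substitutions near / not near an A tract.
stat, p = chi2_2x2(34, 8,   # Y586F: 34 of 42 substitutions near A tracts
                   7, 19)   # wild type: 7 of 26 substitutions near A tracts
print(f"chi2 = {stat:.2f}, P = {p:.2g}")
```

Under these assumptions the P value falls well below the 0.05 threshold, in line with the P < 0.001 reported for this comparison.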

Results

MLV Y586F RT Increases the in Vivo Retroviral Mutation Rate.

To identify structural determinants in RNase H important for fidelity of reverse transcription, we generated an MLV Y586F RNase H mutant. Substitution of Y586 with phenylalanine in MLV RNase H results in the loss of a hydroxyl group, but the aromatic phenyl ring is preserved. To assess the effect of this change on the RT mutation rate, we compared the in vivo fidelity of the wild-type RT with that of the Y586F mutant by using a previously described assay in which the lacZ gene serves as a mutation reporter (Fig. 2). Briefly, an MLV-based retroviral vector (pGA-1) encoding lacZ was permitted to undergo one cycle of replication, and the frequency of lacZ inactivation was determined by staining with 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside and quantitation of white and blue colonies.

Figure 2.

In vivo fidelity assays. (A) Structures of MLV-based vectors pGA-1 and pMP1. Both vectors contain cis-acting elements including long terminal repeats (LTRs) and packaging signal (ψ). The trans-acting genes lacZ, GFP, and neo are transcribed from LTR promoters, and an internal ribosomal entry site (IRES) is used to express neo. The pLGPS construct expresses the MLV gag and pol from a truncated viral LTR. (B) Experimental protocols. Wild-type MLV gag-pol construct pLGPS and mutant Y586F were cotransfected separately (Tf) with pSV-hygro into the ANGIE P or A3GFP11 cell line, which stably expresses amphotropic MLV env and an integrated pGA-1 or pMP1, respectively. The GA-1 or MP1 virus was used to infect (Inf) D17 target cells. The infected cell clones resistant to G418 (G418R) with a wild-type or mutant phenotype were quantified to determine the gene inactivation frequencies of lacZ and GFP. The G418R colonies infected by MP1 also were pooled, and individual nonfluorescent cells were isolated by FACS. The inactivated GFP genes from nonfluorescent cell clones were amplified by PCR and analyzed by sequencing to characterize the nature of GFP-inactivating mutations.

The frequencies of lacZ inactivation obtained from three independent experiments are summarized in Table 1. The wild-type MLV gag-pol exhibited an average lacZ inactivation frequency of 4.9% in one replication cycle, which agreed closely with previously observed frequencies of 5.2–5.4% (21, 22). In contrast, the RNase H mutant Y586F exhibited a 26.4% frequency of lacZ inactivation, which was 5.4-fold higher than the wild-type frequency. In previous studies, mutational analysis of MLV RT and HIV-1 revealed mutations that increased the lacZα inactivation frequencies 2.8- and 4.3-fold, respectively (21, 38). Thus, the 5.4-fold increase in the frequency of lacZ inactivation observed with the Y586F mutant is the largest increase reported to date. We also observed a similar increase in the lacZ inactivation frequency for a spleen necrosis virus RT mutant containing an analogous substitution (data not shown).

Table 1.

LacZ mutant frequencies for wild-type and Y586F mutant RTs

| Phenotype | Viral titer, colony-forming units/ml (×10²)* | Experiment no. | No. of mutant colonies/total colonies, per experiment† | No. of mutant colonies/total colonies, total† | Mutant frequency, % ± SEM‡ | Relative change in mutant frequency§ |
|---|---|---|---|---|---|---|
| Wild type | 99 ± 16 | 1 | 13/243 | 72/1483 | 4.9 ± 0.2 | 1.0 |
| | | 2 | 37/797 | | | |
| | | 3 | 22/443 | | | |
| Y586F | 3.7 ± 0.6 | 1 | 200/814 | 582/2203 | 26.4 ± 3 | 5.4 |
| | | 2 | 296/1130 | | | |
| | | 3 | 86/259 | | | |

* The mean viral titers ± SEM were determined by serial dilutions and infections followed by counting G418R colonies.

† The number of colonies that displayed a mutant colony phenotype/total number of colonies that were observed in three independent experiments.

‡ The mutant frequency was calculated as follows: (number of mutant colonies ÷ total number of colonies) × 100.

§ The relative change in the mutant frequency was calculated as follows: frequency of gene inactivation observed with Y586F ÷ frequency of gene inactivation observed with wild-type MLV RT. Statistical analysis showed that the Y586F mutant displayed a mutant frequency significantly higher than that for the wild type (logistic regression analyses, P < 0.001).
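The two formulas in the Table 1 footnotes can be checked directly against the pooled colony counts. The following is a minimal sketch; the function name `mutant_frequency` is illustrative and the counts are the totals listed in Table 1.

```python
def mutant_frequency(mutant, total):
    """Mutant frequency in percent: (mutant colonies / total colonies) x 100."""
    return mutant / total * 100

wt = mutant_frequency(72, 1483)      # pooled wild-type counts from Table 1
y586f = mutant_frequency(582, 2203)  # pooled Y586F counts from Table 1
relative_change = y586f / wt         # Y586F frequency / wild-type frequency

print(f"wild type {wt:.1f}%, Y586F {y586f:.1f}%, relative change {relative_change:.1f}")
```

Rounded to one decimal place, these values correspond to the 4.9%, 26.4%, and 5.4-fold figures in the table.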

Mutations Generated by Wild-Type and Y586F RTs.

To characterize the nature of mutations introduced by the wild-type and Y586F RTs, we used vector MP1 (Fig. 2A), which expressed GFP as a mutation reporter. Because GFP is significantly smaller than lacZ (717 bp vs. 3.5 kb), it is more amenable to PCR amplification and DNA sequencing. In addition, infected cells containing an inactivated GFP can be easily isolated and cloned by FACS to facilitate their characterization. The A3GFP11 cells, which express the amphotropic MLV envelope and contain a single MP-1 provirus (Fig. 2A), were cotransfected with wild-type or the Y586F mutant gag-pol expression construct and pSV-hygro; the virus produced was used to infect D17 target cells. The titer for the Y586F mutant virus was ≈27–38-fold lower than wild type (Tables 1 and 2), which resulted from a defect in RNase H activity; the Y586F mutant was shown previously to possess only 5% of the wild-type RNase H activity in vitro (39). GFP inactivation frequency was determined by examining G418-resistant colonies for GFP expression with fluorescence microscopy. When the G418-resistant colonies obtained by using wild-type RT were examined, 29 of 2,748 colonies from three independent experiments did not express GFP, providing a GFP inactivation frequency of 1.05% (Table 2). In contrast, 92 of 2,023 G418-resistant colonies obtained by using the Y586F mutant did not express GFP, providing a GFP inactivation frequency of 4.55%. Therefore, the frequency of GFP inactivation was increased 4.3-fold when the Y586F mutation was present in the MLV RNase H domain. The increase in the frequency of GFP inactivation was not statistically different from the 5.4-fold increase in the lacZ inactivation frequency (logistic regression analysis, P = 0.071).

Table 2.

GFP mutant frequencies for wild-type and Y586F mutant RTs

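| Phenotype | Viral titer, colony-forming units/ml (×10³)* | Experiment no. | No. of mutant colonies/total colonies, per experiment | No. of mutant colonies/total colonies, total | Mutant frequency, % ± SEM | Relative change in mutant frequency |
|---|---|---|---|---|---|---|
| Wild type | 98 ± 8 | 1 | 8/666 | 29/2748 | 1.05 ± 0.09 | 1.0 |
| | | 2 | 6/679 | | | |
| | | 3 | 15/1403 | | | |
| Y586F | 2.6 ± 0.5 | 1 | 37/697 | 92/2023 | 4.55 ± 0.42 | 4.3 |
| | | 2 | 28/629 | | | |
| | | 3 | 27/697 | | | |

* See Table 1 for descriptions of the headers.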
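The ± SEM values in Table 2 can be approximated from the per-experiment counts. The sketch below assumes that the SEM is the sample standard deviation of the three per-experiment frequencies divided by √3; the paper does not state how the SEM was computed, so this is an assumption that happens to agree with the tabulated values. The function name `mean_and_sem` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_sem(counts):
    """Mean and SEM (both in percent) of per-experiment mutant frequencies.

    counts is a list of (mutant colonies, total colonies) pairs, one per experiment.
    """
    freqs = [mutant / total * 100 for mutant, total in counts]
    return mean(freqs), stdev(freqs) / sqrt(len(freqs))

wild_type = [(8, 666), (6, 679), (15, 1403)]   # Table 2, experiments 1-3
y586f = [(37, 697), (28, 629), (27, 697)]      # Table 2, experiments 1-3

wt_mean, wt_sem = mean_and_sem(wild_type)
mut_mean, mut_sem = mean_and_sem(y586f)
print(f"wild type: {wt_mean:.2f} ± {wt_sem:.2f} %")
print(f"Y586F:     {mut_mean:.2f} ± {mut_sem:.2f} %")
```

The mean of the per-experiment frequencies differs slightly from the pooled frequency (for example, 92 ÷ 2,023 for Y586F), so small differences in the last digit are expected.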

Analysis of GFP-Inactivating Mutations.

Southern blotting analysis verified the presence of expected proviral structures in most of the nonfluorescent cell clones (see Fig. 5, which is published as supporting information on the PNAS web site, www.pnas.org). To determine the nature of mutations introduced into the GFP gene by wild-type or mutant RNase H, the mutated GFP sequences from individual G418-resistant nonfluorescent cell clones were amplified by PCR, and DNA sequencing was performed (Fig. 2B). To rule out the possibility of disproportionate clonal expansion of some cells containing inactivated GFP genes, clones that were derived from the same infection and possessed the same inactivating mutations were counted only once. Although this conservative approach ensured that a mutation that occurred during reverse transcription was not counted multiple times, it underestimated the contribution of potential mutational hotspots, because the number of mutations at each hotspot was limited by the number of independent experiments.

The results showed that the spectrum of mutations induced by the wild-type or Y586F mutant RTs included substitutions, frameshift mutations, simple deletions, deletions, and insertions (Fig. 3). A higher proportion of the mutations induced by the Y586F RT were substitutions (42 of 60, or 70%), compared with the mutations induced by wild-type RT (26 of 51, or 51%; Table 3). In addition, the proportions of mutations associated with the RT-slippage mechanism (frameshifts) or RT template switching (simple deletions, deletions with insertions, and insertions) were lower for the Y586F RT (13 and 17%, respectively) than for the wild-type RT (20 and 29%, respectively).

Figure 3.

Spectrum of GFP-inactivating mutations associated with the Y586F mutant RT (shown above the sequence) and wild-type RT (shown below the sequence). The GFP gene is 717 nt, and the start and stop codons are marked with a thick rightward arrow and asterisks, respectively. Shaded nucleotides (278 total) are sequences for which there is an A tract (AAAA, TTTT, and AATT) within 18 nt. Substitutions of these nucleotides are shown as white letters with a black background. Frameshifts at underlined sequences are shown as + (insertion) or − (deletion). The italic letters C, A, and G above the sequence indicate a single mutant with a triple substitution. D, simple deletions involving short direct repeats (underlined nucleotides) at the deletion junctions; d, deletions without direct repeats. Numbers following D or d indicate different mutants; the left and right arrows indicate the deletion junctions. I indicates insertions larger than 1 nt, and vertical arrows indicate insertion positions; the number after I indicates the length of the inserted sequence.
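The shading rule in this legend (a nucleotide is shaded when an A tract lies within 18 nt) can be sketched as a simple scan. This is a minimal sketch under stated assumptions: distance is measured to the nearest base of the tract, positions inside a tract count as distance 0, and all function names are illustrative. The sequence used is a toy string, not the GFP reporter sequence.

```python
import re

A_TRACTS = ("AAAA", "TTTT", "AATT")

def a_tract_spans(seq):
    """Half-open (start, end) spans of every A-tract occurrence, overlaps included."""
    seq = seq.upper()
    spans = []
    for motif in A_TRACTS:
        # A zero-width lookahead reports overlapping occurrences (e.g. in AAAAA).
        for m in re.finditer(f"(?={motif})", seq):
            spans.append((m.start(), m.start() + len(motif)))
    return sorted(spans)

def distance_to_nearest_tract(pos, spans):
    """Distance in nt from pos to the nearest tract base; None if there are no tracts."""
    best = None
    for start, end in spans:
        if pos < start:
            d = start - pos
        elif pos >= end:
            d = pos - (end - 1)
        else:
            d = 0
        if best is None or d < best:
            best = d
    return best

def near_tract_positions(seq, max_dist=18):
    """0-based positions that have an A tract within max_dist nt."""
    spans = a_tract_spans(seq)
    near = []
    for pos in range(len(seq)):
        d = distance_to_nearest_tract(pos, spans)
        if d is not None and d <= max_dist:
            near.append(pos)
    return near

# Toy sequence with a single AAAA tract at positions 30-33 (0-based).
toy = "G" * 30 + "AAAA" + "C" * 30
near = near_tract_positions(toy)
print(len(near), near[0], near[-1])
```

An isolated tract therefore flags at most 4 + 2 × 18 = 40 positions; tracts that lie close together or near the ends of the sequence flag fewer.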

Table 3.

The effect of wild-type and Y586F mutant RTs on the frequency of GFP-inactivating mutations

| Mutation type | Wild-type RT: no. of mutants sequenced* | Wild-type RT: proportion of mutants sequenced, %† | Wild-type RT: mutant freq., %‡ | Y586F mutant RT: no. of mutants sequenced* | Y586F mutant RT: proportion of mutants sequenced, %† | Y586F mutant RT: mutant freq., %‡ | Relative increase in mutant freq.§ |
|---|---|---|---|---|---|---|---|
| Subst. near A tracts¶ | 7 | 14 | 0.15 | 34 | 57 | 2.59 | 17.2 |
| Other substitutions‖ | 19 | 37 | 0.39 | 8 | 13 | 0.61 | 1.6 |
| All substitutions | 26 | 51 | 0.54 | 42 | 70 | 3.19 | 5.9 |
| Frameshifts** | 10 | 20 | 0.21 | 8 | 13 | 0.59 | 2.8 |
| Temp. switch. mtns.‡‡ | 15 | 29 | 0.30 | 10 | 17 | 0.77 | 2.6 |
| All mutations | 51 | 100 | 1.05 | 60 | 100 | 4.55 | 4.3 |

* Numbers of mutants containing GFP-inactivating mutations identified by DNA sequencing.

† The proportion of mutants identified by DNA sequencing that contain a specific type of GFP-inactivating mutation.

‡ Mutant frequencies determined by multiplying the proportion of mutants sequenced by the overall mutant frequency (e.g., the mutant frequency for substitutions near A tracts for the wild-type RT is 14% × 1.05% = 0.15%).

§ Fold increase in mutant frequency for the Y586F mutant relative to wild-type RT (e.g., the relative increase in the frequency of substitutions near A tracts is 2.59% ÷ 0.15% = 17.2).

¶ Substitution for which there was an A tract within 18 nt of the mutation site.

‖ Substitution for which an A tract was not present within 18 nt of the mutation site.

** Nonfluorescent clones containing a 1-bp insertion (+1) or deletion (−1).

‡‡ Template-switching mutations: nonfluorescent clones containing direct-repeat deletions, deletions, or insertions.

The proportions of specific types of mutations among the mutant clones that were characterized by DNA sequencing and the overall mutant frequencies (shown in Table 2) were used to determine the frequencies for specific classes of mutations (Table 3). The substitution frequency was 0.54% for wild-type RT and 3.19% for the Y586F mutant RT, indicating a 5.9-fold increase. Similarly, the frequencies for frameshifts and template-switching mutations increased ≈2.8- and 2.6-fold, respectively (Table 3). Overall, the spectrum of the mutations and their relative proportions suggested that the increase in the in vivo mutant frequency was largely but not completely caused by an increase in the frequency of substitutions.
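The derivation of class-specific frequencies can be sketched as follows. This is a minimal sketch that works from the unrounded mutant counts in Table 3 and the overall GFP mutant frequencies in Table 2; the dictionary keys and the function name `class_frequencies` are illustrative. Because Table 3 works from proportions rounded to whole percentages, the unrounded results differ slightly from the tabulated ones (for example, about 17.9-fold rather than 17.2-fold for substitutions near A tracts).

```python
# Numbers of sequenced mutants per mutation class (Table 3).
COUNTS = {
    "wild type": {"subst_near_A_tract": 7, "other_subst": 19,
                  "frameshift": 10, "template_switch": 15},
    "Y586F": {"subst_near_A_tract": 34, "other_subst": 8,
              "frameshift": 8, "template_switch": 10},
}
# Overall GFP mutant frequencies in percent (Table 2).
OVERALL_FREQ_PCT = {"wild type": 1.05, "Y586F": 4.55}

def class_frequencies(counts, overall_pct):
    """Per-class mutant frequency (%) = proportion of sequenced mutants x overall frequency."""
    total = sum(counts.values())
    return {name: n / total * overall_pct for name, n in counts.items()}

wt = class_frequencies(COUNTS["wild type"], OVERALL_FREQ_PCT["wild type"])
mut = class_frequencies(COUNTS["Y586F"], OVERALL_FREQ_PCT["Y586F"])
fold = {name: mut[name] / wt[name] for name in wt}

for name in wt:
    print(f"{name:20s} wt {wt[name]:.2f}%  Y586F {mut[name]:.2f}%  fold {fold[name]:.1f}")
```

The ranking of the classes is unaffected by the rounding: substitutions near A tracts show by far the largest fold increase.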

To determine the mechanism by which the Y586F mutation increased the frequency of substitutions, we compared the relative frequencies of different classes of substitutions. The relative proportions of the 12 possible substitutions were not significantly different between the wild-type and Y586F mutant RTs. For example, the proportions of G-to-A substitutions, the most frequent substitution observed, were 15 and 17% for the wild-type RT and Y586F mutant RT, respectively. We also did not observe any differences between the relative proportions of transitions and transversions or the proportions of template nucleotides that were substituted (data not shown). Therefore, the increase in the frequency of substitution mutations was not caused by a disproportionate increase in one or more classes of substitutions.

Substitutions in GFP Induced by the Y586F Mutant RT Are in Regions Containing A Tracts.

We hypothesized that sequences adjacent to the site of substitution played a role in the fidelity of reverse transcription when the Y586F RNase H mutant was used. To test this hypothesis, we analyzed 18 nt of sequence on both sides of the substitution sites, because 18 nt of sequence behind the site of polymerization are bound by RT and may come in contact with the Y586 residue. The analysis revealed that an A-tract sequence was present within 18 nt of a large proportion of the substitutions induced by the Y586F mutant RT but not by the wild-type RT. As shown in Fig. 3 and Table 3, an A-tract sequence was present within 18 nt of the mutation site for 81% of substitutions (34 of 42) generated by the Y586F mutant RT. In contrast, an A tract was present within 18 nt of 27% of substitutions (7 of 26) generated by the wild-type RT. There were eight A tracts within the 717 nt of the GFP ORF; an A tract was located within 18 nt of 278 of the 717 nt of GFP (shaded nt in Fig. 3). We expected 39% of the substitution mutations to be near the A tracts by random distribution (278 ÷ 717 × 100%), which was in good agreement with the 27% frequency observed for the wild-type RT (χ² test, P > 0.05). In contrast, the observation that 81% of the substitutions induced by the Y586F mutant RT were near A tracts was significantly different from the 39% predicted by random distribution and the 27% observed for the wild-type RT (χ² test, P < 0.001).
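The comparison with a random distribution can be sketched as a χ² goodness-of-fit test with two categories (near or not near an A tract). This is a minimal sketch, assuming one degree of freedom and no continuity correction; the paper does not specify these details, and the function name `chi2_goodness_of_fit` is illustrative.

```python
import math

def chi2_goodness_of_fit(near, total, expected_fraction):
    """Two-category chi-square test of observed near-tract counts against an expected fraction."""
    far = total - near
    exp_near = total * expected_fraction
    exp_far = total - exp_near
    stat = (near - exp_near) ** 2 / exp_near + (far - exp_far) ** 2 / exp_far
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

expected = 278 / 717  # fraction of GFP nucleotides with an A tract within 18 nt

results = {
    "wild type": chi2_goodness_of_fit(7, 26, expected),
    "Y586F": chi2_goodness_of_fit(34, 42, expected),
}
for label, (stat, p) in results.items():
    print(f"{label}: chi2 = {stat:.2f}, P = {p:.2g}")
```

Under these assumptions the wild-type distribution is compatible with random placement (P > 0.05), whereas the Y586F distribution is not (P < 0.001), matching the conclusions stated above.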

The frequency of substitutions near A tracts for the Y586F mutant RT was 2.59%, whereas the frequency of substitutions near the A tracts for the wild-type RT was 0.15% (Table 3). Therefore, the frequency of substitutions in the vicinity of the A tracts was increased 17.2-fold (2.59% ÷ 0.15%). In sharp contrast to the substitutions near A tracts, the frequency of other substitutions not associated with A tracts was increased only 1.6-fold. These results clearly indicated that the presence of the Y586F mutation substantially increased the frequency of substitutions near A tracts.

We compared the nucleotide distances between the substitutions induced by wild-type and Y586F mutant RTs and the nearest A tract (Fig. 4 A and B). No association was apparent between substitutions induced by the wild-type RT and A tracts. Furthermore, substitutions not associated with A tracts were dispersed evenly over a wide range of distances from the A tracts, suggesting that the clustering near A tracts was demarcated sharply at the 18-nt distance from the A tracts. We also analyzed the distribution of substitutions near A tracts in more detail by comparing the frequencies of substitutions for nucleotide positions 0–15 from the A tract (Fig. 4B). Nucleotides that were 0–1, 8–9, or 14–15 bp from the A tract exhibited higher frequencies of substitution (8, 9, and 7 substitutions of 34 total, respectively). In contrast, nucleotides that were 4–5 or 12–13 bp from the A tract were least likely to be substituted (1 and 0 substitutions of 34 total, respectively). The analysis indicated that the distance from the A tracts had a significant effect on the substitution frequency.

Figure 4.

Substitution mutations in GFP induced by the Y586F mutant RT are in regions containing A tracts. (A) Nucleotide distances between the nearest A tract and substitutions induced by Y586F mutant RT (black bars) and wild-type RT (white bars) are shown. (B) Nucleotide distances between A tracts and substituted bases for substitutions near A tracts induced by Y586F mutant RT (black bars) and wild-type RT (white bars). (C) Substitutions hypothesized to occur during minus- and plus-strand DNA synthesis. The substituted nucleotides are boxed and in lowercase. The A-tract sequences are shown in bold, and the nucleotides from the nearest base of each A tract to the substitution site are underlined.

A comparison of distances from the sites of mutation to the nearest 5′ A tract and the nearest 3′ A tract indicated that the distances were similar, suggesting that the frequency of substitutions 5′ and 3′ of the A tracts was similar (data not shown). We therefore further analyzed the locations of the mutation sites and their distance from the A tracts (Fig. 4C). Eighteen of the 34 substitutions were 5′ of the A tracts, and the remaining 16 were 3′ of the A tracts. Assuming that the A tracts must be in contact with RT to affect the frequency of substitutions, mutations 5′ of the A tracts occurred during minus-strand DNA synthesis, and those 3′ of the A tracts must have occurred during plus-strand DNA synthesis. For mutations that occurred during minus-strand synthesis, the A tracts were in an RNA⋅DNA hybrid; for those that occurred during plus-strand DNA synthesis, the A tracts were in a DNA⋅DNA hybrid. Because the frequency of the substitutions was similar for sequences 5′ and 3′ of the A tracts, we concluded that their effect on the substitution frequency was independent of whether they were present in an RNA⋅DNA or a DNA⋅DNA hybrid.

Discussion

The results of these studies indicate that the Y586F mutant of MLV RT is a mutator polymerase that exhibits 5.4- and 4.3-fold increases in the frequencies of lacZ and GFP inactivation, respectively. The decreased fidelity is not simply an indirect effect of reduced RNase H activity of the Y586F mutant, because in previous studies other MLV RNase H mutants with reduced RNase H activity (39) and RT template switching (23) exhibited a less than 1.6-fold increase in the frequency of lacZ inactivation (22).

The large increases in the in vivo mutation rate induced by the Y586F mutant RT are caused by a 17-fold increase in the frequency of substitution mutations in regions of the template that contain an A tract within 18 nt of the mutation site. Because A-tract sequences themselves are inflexible but associated with bends at the junction with a G/C base pair and narrowed minor groove (27), their effect on the overall fidelity is related most likely to their effect on the conformation of the template–primer hybrid. We hypothesize that the conformation of the RNA⋅DNA or DNA⋅DNA hybrids containing A tracts decreases the fidelity of reverse transcription by the Y586F mutant RT but not by the wild-type RT.

The association between substitutions by Y586F mutant and A tracts strongly suggests that the conformation of the template–primer duplex that is bound to RT plays a significant role in the accuracy of reverse transcription. The interactions of nucleic acids with wild-type RT likely facilitate proper conformation of the template–primer duplex such that correct nucleotides can be selected and incorporated even when sequences that can affect the conformation of the template–primer such as A tracts are encountered. We postulate that the Y586 residue of MLV RT is an important determinant for facilitating proper conformation of the nascent template–primer duplex. The Y586F mutant RT presumably could not induce the appropriate conformation of the template–primer duplex efficiently when A tracts were present in the template, which resulted in an increase in the frequency of substitutions near A tracts.

Crystal structures of HIV-1 RT in complex with DNA⋅DNA and RNA⋅DNA duplexes indicate that the nucleic acid duplexes are in a specific conformation (26, 40). The first 5 bp near the active site are in an A-form conformation; the next 4 bp undergo a 41° bend associated with a transition from A-form to B-form geometry, and the 9 bp near the RNase H active site are in a B-form conformation. The A-form conformation of template–primer has been identified in the vicinity of several other DNA polymerase active sites and was suggested to contribute to the fidelity of DNA synthesis by attenuating sequence-dependent structural alterations (41). In contrast, a model of rat DNA polymerase β bound to B-form DNA suggests that a B-form conformation of nascent template–primer near the polymerase active site can result in mutational hotspots (41). In addition, a recently solved ternary complex structure of Sulfolobus solfataricus P2 DNA polymerase (Dpo4), which exhibits a very low fidelity, revealed that the nascent template–primer DNA is in the B-form (42, 43). Based on structural and biochemical studies, a common fidelity mechanism involving hydrogen-bonding interactions between the polymerase and the minor groove of the template–primer has been proposed (44, 45). An altered conformation of the nascent template–primer can interfere with correct nucleotide incorporation by affecting the geometry of the active site, which interferes with error discrimination by enzymes that scan for correct geometry of the minor groove (45–48). It has been shown also that the minor groove of the A-form conformation of nascent template–primer is wider, allowing easy access to the protein side chains and facilitating the interactions of the polymerase and template–primer at the polymerase active site (24, 49).
Taken together, our results suggest that one important function of wild-type RT is to induce and stabilize a proper conformation of the template–primer duplex such that the accuracy of DNA synthesis can be maintained regardless of sequence-dependent structural differences.

It has been proposed that RNase H primer grip residues make contact with the DNA primer strand and facilitate positioning of the DNA primer strand near the RNase H active site (24). It was postulated that these interactions with the DNA primer strand affect the positioning of the template–primer at the RNase H active site and affect cleavage specificity. The results of our studies indicate that the Y586 residue, a component of the MLV RNase H primer grip, plays a role in determining the conformation of the nucleic acid at the polymerase active site and affects the fidelity of reverse transcription.

The substitution mutations induced by the Y586F mutant RT occurred at similar frequencies on both sides of the A tracts, suggesting that the effects on fidelity were similar during minus- and plus-strand DNA synthesis. Although the contacts between RT and template RNA are different from the contacts between RT and template DNA, RT has been shown to have similar interactions with the DNA primer strand in both RNA⋅DNA and DNA⋅DNA duplexes (24). Because the Y586 residue makes contact with the DNA primer strand, it is not surprising that its effects on fidelity of minus- and plus-strand DNA synthesis apparently are similar. The distribution of the substitutions generated by the Y586F mutant near A tracts was strikingly nonrandom, indicating the presence of substitution hotspots at nucleotide positions 0–1, 8–9, and 14–15 bp relative to the A tracts. The A tract-induced structural alterations that cause these substitution hotspots are unknown. Because the substitution hotspots were identified from analysis of eight separate A-tract sequences, the structural alterations that lead to the generation of hotspots are likely to be common to most, if not all, A-tract sequences. Furthermore, two of the substitution hotspots are 8–9 and 14–15 nt from the A tracts, indicating that the structural alterations induced by the A tract can have distal effects. One attractive hypothesis is that the A-tract sequences affect the helical curvature of the template–primer duplex, thus altering the conformation of the nascent base pair at the polymerase active site, which in turn affects the frequency of substitution errors.
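The positional bookkeeping behind statements such as "8–9 nt from the A tracts" can be illustrated with a minimal Python sketch. It is not the analysis procedure used in this study. The three motifs are those named in the abstract (AAAA, TTTT, and AATT), whereas the function names, the merging of overlapping motifs into a single tract, and the distance convention (0 inside a tract, 1 for the adjacent nucleotide) are assumptions made only for illustration.

```python
import re

# Motifs treated as A tracts, as listed in the abstract (AAAA, TTTT, AATT).
A_TRACT_MOTIFS = ("AAAA", "TTTT", "AATT")


def find_a_tracts(seq):
    """Return sorted, merged (start, end) half-open intervals covering motif matches.

    Hypothetical helper: overlapping or touching motif matches are merged
    into one tract, which is an assumption of this sketch.
    """
    seq = seq.upper()
    hits = []
    for motif in A_TRACT_MOTIFS:
        # A lookahead pattern reports overlapping matches (e.g., within AAAAA).
        for m in re.finditer(f"(?={motif})", seq):
            hits.append((m.start(), m.start() + len(motif)))
    hits.sort()
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def distance_to_nearest_tract(pos, tracts):
    """Distance in nt from a 0-based position to the nearest tract.

    Assumed convention: 0 if the position lies inside a tract, 1 if it is
    immediately adjacent; returns None when the sequence has no tract.
    """
    best = None
    for start, end in tracts:
        if start <= pos < end:
            return 0
        d = start - pos if pos < start else pos - (end - 1)
        if best is None or d < best:
            best = d
    return best


# Toy sequence (not from the reporter genes used in this study).
toy = "GCGCAAAAAGCGCGCGCGTTTTGC"
tracts = find_a_tracts(toy)
distances = [distance_to_nearest_tract(p, tracts) for p in (0, 6, 13, 23)]
```

With such a convention, a substitution would be scored as associated with an A tract when its distance is at most 18 nt, and a histogram of the distances would show whether particular positions are overrepresented.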

In summary, a Y586F mutation in MLV RNase H that alters the ability of RT to contact and position the DNA primer affects viral-replication fidelity. Our findings indicate that the RNase H primer grip and the conformation of the template–primer duplex are two important determinants of the accuracy of reverse transcription.

Supplementary Material

Supporting Figure

Acknowledgments

We especially thank John Coffin, Wei-Shau Hu, and Yegor Voronin for intellectual input, Douglas Powell for assistance and expertise in statistical analysis, and Anne Arthur for editorial expertise and revisions.

Abbreviations

RT, reverse transcriptase

MLV, murine leukemia virus

GFP, green fluorescent protein

FACS, fluorescence-activated cell sorting

Footnotes

This paper was submitted directly (Track II) to the PNAS office.

References

  1. Coffin J M, Hughes S H, Varmus H E. Retroviruses. Plainview, NY: Cold Spring Harbor Lab. Press; 1997.
  2. Huang H, Chopra R, Verdine G L, Harrison S C. Science. 1998;282:1669–1675. doi: 10.1126/science.282.5394.1669.
  3. Tanese N, Goff S P. Proc Natl Acad Sci USA. 1988;85:1777–1781. doi: 10.1073/pnas.85.6.1777.
  4. Tanese N, Telesnitsky A, Goff S P. J Virol. 1991;65:4387–4397. doi: 10.1128/jvi.65.8.4387-4397.1991.
  5. Champoux J J. In: Reverse Transcriptase. Skalka A-M G, Goff S P, editors. Plainview, NY: Cold Spring Harbor Lab. Press; 1993. pp. 103–118.
  6. Wisniewski M, Balakrishnan M, Palaniappan C, Fay P J, Bambara R A. J Biol Chem. 2000;275:37664–37671. doi: 10.1074/jbc.M007381200.
  7. Temin H M. Proc Natl Acad Sci USA. 1993;90:6900–6903. doi: 10.1073/pnas.90.15.6900.
  8. Kim T, Mudry R A, Rexrode C A, Pathak V K. J Virol. 1996;70:7594–7602. doi: 10.1128/jvi.70.11.7594-7602.1996.
  9. Mellors J W, Bazmi H Z, Schinazi R F, Roy B M, Hsiou Y, Arnold E, Weir J, Mayers D L. Antimicrob Agents Chemother. 1995;39:1087–1092. doi: 10.1128/aac.39.5.1087.
  10. Pathak V K, Temin H M. Proc Natl Acad Sci USA. 1990;87:6019–6023. doi: 10.1073/pnas.87.16.6019.
  11. Pathak V K, Temin H M. Proc Natl Acad Sci USA. 1990;87:6024–6028. doi: 10.1073/pnas.87.16.6024.
  12. Pathak V K, Hu W S. Semin Virol. 1997;8:141–150.
  13. Mansky L M, Temin H M. J Virol. 1995;69:5087–5094. doi: 10.1128/jvi.69.8.5087-5094.1995.
  14. Varela-Echavarria A, Prorock C M, Ron Y, Dougherty J P. J Virol. 1993;67:6357–6364. doi: 10.1128/jvi.67.11.6357-6364.1993.
  15. Parthasarathi S, Varela-Echavarria A, Ron Y, Preston B D, Dougherty J P. J Virol. 1995;69:7991–8000. doi: 10.1128/jvi.69.12.7991-8000.1995.
  16. Bakhanashvili M, Avidan O, Hizi A. FEBS Lett. 1996;391:257–262. doi: 10.1016/0014-5793(96)00747-8.
  17. Drosopoulos W C, Prasad V R. J Virol. 1998;72:4224–4230. doi: 10.1128/jvi.72.5.4224-4230.1998.
  18. Harris D, Kaushik N, Pandey P K, Yadav P N, Pandey V N. J Biol Chem. 1998;273:33624–33634. doi: 10.1074/jbc.273.50.33624.
  19. Kaushik N, Singh K, Alluru I, Modak M J. Biochemistry. 1999;38:2617–2627. doi: 10.1021/bi9824285.
  20. Kim B, Hathaway T R, Loeb L A. Biochemistry. 1998;37:5831–5839. doi: 10.1021/bi972672g.
  21. Halvas E K, Svarovskaia E S, Pathak V K. J Virol. 2000;74:10349–10358. doi: 10.1128/jvi.74.22.10349-10358.2000.
  22. Halvas E K, Svarovskaia E S, Pathak V K. J Virol. 2000;74:312–319.
  23. Svarovskaia E S, Delviks K A, Hwang C K, Pathak V K. J Virol. 2000;74:7171–7178. doi: 10.1128/jvi.74.15.7171-7178.2000.
  24. Sarafianos S G, Das K, Tantillo C, Clark A D, Jr, Ding J, Whitcomb J M, Boyer P L, Hughes S H, Arnold E. EMBO J. 2001;20:1449–1461. doi: 10.1093/emboj/20.6.1449.
  25. Kanaya S, Kohara A, Miura Y, Sekiguchi A, Iwai S, Inoue H, Ohtsuka E, Ikehara M. J Biol Chem. 1990;265:4615–4621.
  26. Ding J, Hughes S H, Arnold E. Biopolymers. 1997;44:125–138. doi: 10.1002/(SICI)1097-0282(1997)44:2<125::AID-BIP2>3.0.CO;2-X.
  27. Crothers D M, Haran T E, Nadeau J G. J Biol Chem. 1990;265:7093–7096.
  28. Shatzky-Schwartz M, Arbuckle N D, Eisenstein M, Rabinovich D, Bareket-Samish A, Haran T E, Luisi B F, Shakked Z. J Mol Biol. 1997;267:595–623. doi: 10.1006/jmbi.1996.0878.
  29. Mack D R, Chiu T K, Dickerson R E. J Mol Biol. 2001;312:1037–1049. doi: 10.1006/jmbi.2001.4994.
  30. Strahs D, Schlick T. J Mol Biol. 2000;301:643–663. doi: 10.1006/jmbi.2000.3863.
  31. Cheslock S R, Anderson J A, Hwang C K, Pathak V K, Hu W S. J Virol. 2000;74:9571–9579. doi: 10.1128/jvi.74.20.9571-9579.2000.
  32. Julias J G, Kim T, Arnold G, Pathak V K. J Virol. 1997;71:4254–4263. doi: 10.1128/jvi.71.6.4254-4263.1997.
  33. Miller A D, Buttimore C. Mol Cell Biol. 1986;6:2895–2902. doi: 10.1128/mcb.6.8.2895.
  34. Gritz L, Davies J. Gene. 1983;25:179–188. doi: 10.1016/0378-1119(83)90223-8.
  35. Landau N R, Page K A, Littman D R. J Virol. 1991;65:162–169. doi: 10.1128/jvi.65.1.162-169.1991.
  36. Delviks K A, Hu W S, Pathak V K. J Virol. 1997;71:6218–6224. doi: 10.1128/jvi.71.8.6218-6224.1997.
  37. Sambrook J, Fritsch E F, Maniatis T. Molecular Cloning: A Laboratory Manual. 2nd ed. Plainview, NY: Cold Spring Harbor Lab. Press; 1989.
  38. Mansky L M, Bernard L C. J Virol. 2000;74:9532–9539. doi: 10.1128/jvi.74.20.9532-9539.2000.
  39. Blain S W, Goff S P. J Virol. 1995;69:4440–4452. doi: 10.1128/jvi.69.7.4440-4452.1995.
  40. Jacobo-Molina A, Ding J, Nanni R G, Clark A D, Jr, Lu X, Tantillo C, Williams R L, Kamer G, Ferris A L, Clark P, et al. Proc Natl Acad Sci USA. 1993;90:6320–6324. doi: 10.1073/pnas.90.13.6320.
  41. Timsit Y. J Mol Biol. 1999;293:835–853. doi: 10.1006/jmbi.1999.3199.
  42. Ling H, Boudsocq F, Woodgate R, Yang W. Cell. 2001;107:91–102. doi: 10.1016/s0092-8674(01)00515-3.
  43. Friedberg E C, Fischhaber P L, Kisker C. Cell. 2001;107:9–12. doi: 10.1016/s0092-8674(01)00509-8.
  44. Steitz T A. Nature (London). 1998;391:231–232. doi: 10.1038/34542.
  45. Osheroff W P, Beard W A, Wilson S H, Kunkel T A. J Biol Chem. 1999;274:20749–20752. doi: 10.1074/jbc.274.30.20749.
  46. Franklin M C, Wang J, Steitz T A. Cell. 2001;105:657–667. doi: 10.1016/s0092-8674(01)00367-1.
  47. Pelletier H, Sawaya M R, Kumar A, Wilson S H, Kraut J. Science. 1994;264:1891–1903.
  48. Bebenek K, Beard W A, Darden T A, Li L, Prasad R, Luton B A, Gorenstein D G, Wilson S H, Kunkel T A. Nat Struct Biol. 1997;4:194–197. doi: 10.1038/nsb0397-194.
  49. Doublie S, Ellenberger T. Curr Opin Struct Biol. 1998;8:704–712. doi: 10.1016/s0959-440x(98)80089-4.
