Distinct conformational states underlie pausing during initiation of HIV-1 reverse transcription

Kevin P Larsen; Junhong Choi; Lynnette N Jackson; Kalli Kappel; Jingji Zhang; Betty Ha; Dong-Hua Chen; Elisabetta Viani Puglisi

doi:10.1016/j.jmb.2020.06.003

Author manuscript; available in PMC: 2021 Jul 24.

Published in final edited form as: J Mol Biol. 2020 Jun 6;432(16):4499–4522. doi: 10.1016/j.jmb.2020.06.003

PMCID: PMC7387199 NIHMSID: NIHMS1604038 PMID: 32512005

Abstract

A hallmark of the initiation step of HIV-1 reverse transcription, in which the viral RNA genome is converted into double-stranded DNA, is that it is slow and non-processive. Biochemical studies have identified specific sites along the viral RNA genomic template at which reverse transcriptase (RT) stalls. These stalling points, which occur after the addition of 3 and 5 template dNTPs, may serve as checkpoints to regulate the precise timing of HIV-1 reverse transcription following viral entry. Structural studies of reverse transcriptase initiation complexes (RTICs) have revealed unique conformations that may explain the slow rate of incorporation; however, questions remain about the temporal evolution of the complex and the features that contribute to strong pausing during initiation. Here we present cryo-electron microscopy (cryo-EM) and single-molecule characterization of an RTIC after three rounds of dNTP incorporation (+3), the first major pausing point during reverse transcription initiation. Cryo-EM structures of a +3 extended RTIC reveal conformational heterogeneity within the RTIC core. Three distinct conformations were identified, two of which represent unique, likely off-pathway, intermediates in the canonical polymerization cycle. Single-molecule Förster resonance energy transfer (smFRET) experiments confirm that the +3 RTIC is more structurally dynamic than earlier stage RTICs. These alternative conformations were selectively disrupted through structure-guided point mutations to shift smFRET populations back towards the on-pathway conformation. Our results support the hypothesis that conformational heterogeneity within the HIV-1 reverse transcriptase initiation complex during pausing serves as an additional means of regulating HIV-1 replication.

Introduction

Reverse transcription of the HIV-1 single-stranded RNA genome to double-stranded DNA is the first step of viral replication upon fusion of a virion with a host cell[1–3]. This event is initiated within the viral capsid at an RNA-protein complex formed between a packaged tRNA^Lys₃ primer, a region of viral genomic RNA (vRNA) in the 5’ untranslated region (UTR), and the viral enzyme reverse transcriptase (RT)[4]. Initiation of reverse transcription is kinetically slow: DNA polymerization rates are 100–500-fold slower than those during later steps in reverse transcription[5, 6]. As such, initiation appears to serve as a regulatory step that delays the completion of the entire reverse transcription process, but the molecular and structural mechanisms that govern this regulation remain poorly understood[7, 8].

Decades of biochemical work have revealed the critical sequence elements for reverse transcription initiation[7, 9–18]. The highly conserved HIV-1 5’ UTR contains an 18 nucleotide sequence complementary to the tRNA^Lys₃ primer, dubbed the primer binding sequence (PBS), which forms the intermolecular helix required for RT binding and initiation from the primer 3’ hydroxyl[19–22]. These studies also determined potential sequence elements leading to intermolecular base pairing between tRNA^Lys₃ and vRNA that extend outside of the PBS sequence (Fig. 1a)[7]. Mutations in these vRNA regions, contained within the conserved 5’ UTR, are deleterious for viral replication and helped identify the regulatory core of the initiation complex[9, 10, 13, 14, 16–18, 23–25]. RNA-template constructs that span this region recapitulate the biochemical properties of full-length genomic RNA[9, 16]. Detailed kinetic investigations have highlighted the extremely slow rate of incorporation for the first six deoxynucleotides during HIV-1 reverse transcription initiation, with rates of 0.22 s⁻¹ versus 50–100 s⁻¹ for elongation (Supplementary Fig. 1a)[6, 13, 18, 26, 27]. Strikingly, incorporation of the 4th nucleotide is 10-fold slower than other stages of initiation, and leads to abortive reverse transcription products, evidenced by pause sites that arise during HIV-1 RT extension[18, 28, 29].
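As a rough check on the scale of these numbers, the fold difference between initiation and elongation can be worked out directly from the rates quoted above. The short Python sketch below is illustrative only: the function name `fold_slowdown` is hypothetical, and the only inputs are the rates stated in the text.

```python
def fold_slowdown(slow_rate: float, fast_rate: float) -> float:
    """Return how many times slower slow_rate is than fast_rate (both in s^-1)."""
    if slow_rate <= 0 or fast_rate <= 0:
        raise ValueError("rates must be positive")
    return fast_rate / slow_rate


# Rates quoted in the text (s^-1).
INITIATION_RATE = 0.22
ELONGATION_RATE_LOW = 50.0
ELONGATION_RATE_HIGH = 100.0

fold_low = fold_slowdown(INITIATION_RATE, ELONGATION_RATE_LOW)
fold_high = fold_slowdown(INITIATION_RATE, ELONGATION_RATE_HIGH)

print(f"Initiation is {fold_low:.0f}- to {fold_high:.0f}-fold slower than elongation")
```

With these inputs the ratio falls between about 227- and 455-fold, which sits within the 100–500-fold range cited in the Introduction.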

Fig. 1. — a, Secondary structures of the HIV-1 viral RNA and tRNA^Lys₃ sequences used for the structural studies. The locations of the modified nucleotides used for crosslinking on the tRNA^Lys₃ are boxed. Regions of vRNA–tRNA^Lys₃ intermolecular base pairing are colored. b, Secondary structure of the vRNA–tRNA^Lys₃ complex as determined in Larsen et al. 2018. Nucleotides of interest are boxed and colored in an identical fashion to panel a. c, Crosslinking scheme for the +1 RTIC. The tRNA^Lys₃ primer contains a modified deoxyguanosine at position 71, which forms a disulfide bond with RT p66 thumb αH, and is extended by a dC at the 3’ end (gold). Nucleotide identities contained within the RT nucleotide (N) and primer (P) binding sites are indicated. d, Crosslinking scheme for the +3 RTIC. The tRNA^Lys₃ primer contains a modified deoxyguanosine at position 73, which forms a disulfide bond with RT p66 thumb αH, and is extended by a dC, dT, and dG at the 3’ end (gold). Nucleotide identities contained within the RT nucleotide (N) and primer (P) binding sites are indicated.

To understand why RT initiation proceeds at such a slow rate, we previously determined the structure of the reverse transcriptase initiation complex extended by 1 deoxynucleotide (+1 RTIC) using cryo-electron microscopy (cryo-EM)[30]. The complex, which contained a 101-nucleotide fragment of viral RNA, the tRNA^Lys₃ primer, and RT, was stabilized by the specific crosslinking of a cysteine residue at position 258 of the p66 thumb and a modified nucleotide (‘convertible G’) embedded within the tRNA^Lys₃ primer sequence at position 71 (Fig. 1a). In the structure, determined at 8.0 Å global resolution and 4.5 Å core resolution, we observed the 18 base pair PBS helix within the long nucleic acid binding cleft of RT; this helix is extended by four additional tRNA-vRNA base pairs, and the 5’ end of the tRNA^Lys₃ is refolded into a long irregular helix that coaxially stacks on the extended PBS helix. The viral RNA forms two helical structures, viral RNA helices 1 and 2 (H1 and H2), that are positioned adjacent to the polymerase active site, and an unstructured, single-stranded viral RNA connecting loop bridges H1 to the end of the extended PBS helix near the RNase H domain (Fig. 1b).

Although this structure contained several features that could explain the general slowness of reverse transcription initiation, open questions remain. The 3’ end of the tRNA^Lys₃ primer within the +1 RTIC was displaced from the RT polymerase active site and not poised for dNTP incorporation. This displacement was coupled with an open conformation of the RT thumb, reminiscent of RT structures containing bound non-nucleoside reverse transcriptase inhibitors (NNRTIs)[31]. In addition, the +1 RTIC was captured in a pre-translocation state with semi-open fingers. These RTIC core structural features were also observed in a recent x-ray crystallography study of RT bound to a simplified, un-extended dsRNA duplex, suggesting that the origins of the displaced conformation may be a combination of the inherent rigidity and topology of A-form RNA duplexes, which prevents the distinct bend observed in DNA duplex substrates[32]. It is also likely that vRNA H2, in the context of the full HIV-1 RTIC, must be partially melted to allow full engagement of the PBS helix within the polymerase active site. While these structural results have illuminated the features of the RTIC that dictate general slowness during initiation, they fail to explain why specific initiation states have strikingly decreased polymerization rates and stalling[18, 28, 29]. Additional structural views of these later RT initiation states are needed.

To explore how the RTIC evolves temporally and to determine why continuation of polymerization by RT from the +3-extension state is particularly inefficient, we have applied a combined biochemical, structural, and single-molecule dynamics approach. The structure of the RTIC extended by three deoxynucleotides (+3) was determined by cryo-EM and exhibits a similar global fold for the vRNA–tRNA^Lys₃ complex as observed for the +1 state but with a shift of H2 downward into the RT polymerase active site. Additionally, we observed three distinct structural classes with altered PBS helix positions within the RT nucleic acid binding cleft. One class adopts a conformation that is similar to the +1 initiation complex, whereas in the other two classes, there is a large shift of the PBS helix further from the active site. Interaction of the PBS helix with the RNase H domain of RT appears to stabilize these shifted states. We confirmed the existence of dynamic conformational exchange within the +3 initiation complex using single-molecule FRET experiments and biochemical assays. The combined results provide additional structural insight into why extension from the +3 state is extremely slow and leads to abortive products, supporting a model whereby H2 acts as a steric barrier for efficient, rapid polymerization.

Results

Reagent preparation and validation for structural studies

RNAs and HIV-1 RT for cryo-EM samples were prepared using previously described procedures[30]. The HIV-1 genomic viral RNA (vRNA) used for structural studies spans residues 123–223 of the NL4.3 sequence and encompasses all of the conserved vRNA sequence features required for efficient initiation (Fig. 1a,b)[9, 16, 30]. To allow specific RNA-RT crosslinking, full-length tRNA^Lys₃ was chemically synthesized (TriLink Biotechnologies)[30]. These RNAs contain an N2-cystamine-2’-deoxyguanosine nucleotide at either the 71 or 73 position for forming the +1 and +3 extended crosslinked RTICs, respectively (Fig. 1c,d). The tRNA^Lys₃ used for +3 RTIC formation contained an additional three deoxynucleotides on the 3’ end (79 nts vs 76 nts), added during synthesis to mimic extension. This strategy was employed to avoid crosslinking of RT to partially extended vRNA–tRNA^Lys₃ complexes and to generate a more homogeneous complex; attempts to extend the vRNA–tRNA^Lys₃ biochemically by three dNTPs prior to crosslinking resulted in samples unsuitable for even medium resolution structural work (data not shown), most likely due to sample heterogeneity.

vRNA–tRNA^Lys₃ complexes were formed by heat annealing and purified by size-exclusion chromatography as described previously[30, 33–36]. The homogeneity of the complexes was confirmed by size-exclusion chromatography and native PAGE prior to RTIC formation (Supplementary Fig. 1b-f). Interestingly, we found that the +3 extended vRNA–tRNA^Lys₃ complex formed a higher order species that dissipated when incubated at temperatures higher than 4°C (Supplementary Fig. 1d-f). HIV-1 reverse transcriptase was purified as previously described and tested for activity by primer extension assays using vRNA–tRNA^Lys₃ complexes containing a chemically-attached dye at the 5’ end of the primer[30, 35]. HIV-1 RT was crosslinked to the vRNA–tRNA complexes over a 24-hour incubation, purified by anion-exchange and size-exclusion chromatography, and prepared for cryo-EM following previously established procedures (Fig. 2a and Supplementary Fig. 2a-d)[30]. Homogeneity and purity of the crosslinked RTICs were confirmed by native and SDS PAGE prior to cryo-EM grid preparation (Fig. 2b and Supplementary Fig. 2e).

Fig. 2. — a, Size-exclusion chromatography purification of the +3 RTIC. The sample exhibited a single, main peak with a minimal shoulder. b, SDS-PAGE analysis of the +3 RTIC components and purified complex. The p66 subunit of RT is shifted after crosslinking with the vRNA–tRNA^Lys₃ complex. c, Selected cryo-EM micrograph of the +3 RTIC. The sample exhibited monodisperse particles with minimal aggregation. d, 2D class averages of the +3 RTIC. Secondary structure features are visible and helical RNA density can be observed within and emerging from the RT core.

The activity of the purified, crosslinked +3 RTIC was tested and compared to the uncrosslinked complex by monitoring the incorporation of the next templated dNTP (Supplementary Fig. 3a,b). The rate of incorporation for the uncrosslinked +3 initiation complex was comparable to that of previously published results[28, 29]. The crosslinked +3 RTIC was capable of incorporating the next template dNTP at an increased rate compared to the uncrosslinked complex, suggesting that an on-pathway, active complex was captured and purified for structural work (Supplementary Fig. 3b).

Cryo-EM of +1 and +3 extended RTICs

Grids were prepared for cryo-EM using a protocol modified from that used previously[30]. Sample buffer conditions were altered to contain decreased NaCl (75 mM) and increased MgCl₂ (6 mM) concentrations, which produced data with more well-defined RNA structural features. Because the previously published cryo-EM data on the +1 RTIC was unable to resolve key characteristics, namely interactions within the RNase H domain, we collected a new cryo-EM dataset for the +1 extended RTIC in conjunction with the +3 extended RTIC to allow more direct comparison of the two initiation states.

We collected 7,790 movies of the +1 RTIC sample, which were motion-corrected and summed, and picked approximately 3.1 million particles by semi-automated particle picking in Relion (Supplementary Fig. 4)[37, 38]. Two-dimensional (2D) classification was used to eliminate damaged particles and noise. The final, remaining 2D classes exhibited secondary-structure features and peripheral RNA structures emerging from the RTIC core (Supplementary Fig. 4b). Standard classification and refinement procedures were performed, which resulted in a global map at 6.6 Å and a core map at 4.1 Å (Fig. 3a,b and Supplementary Fig. 4c-f). The global map featured helical density for vRNA helix 2 (H2) and the extended tRNA helix and weaker, less ordered density for vRNA helix 1 (H1) and the vRNA connecting loop (CL) (Fig. 3a). Unsurprisingly, the locations of these features closely matched those found in the previously published structure and were amenable to low-resolution modeling using the Rosetta DRRAFTER software package[39]. The near-atomic resolution core map (4.1 Å) reconstruction featured increased side chain density compared to our prior 4.5 Å +1 RTIC map, making it possible to build and refine more accurate atomic models using an iterative combination of DRRAFTER, Coot, and Phenix real-space-refine (Fig. 3c and Supplementary Fig. 5)[39–42]. Most notably, the cryo-EM density near the junction between extended-PBS and the tRNA was no longer distorted. This allowed for modeling of the RNA backbone in this region, which was previously not possible (Fig. 3b,c). We used this improved +1 RTIC model as a reference for subsequent comparison to +3 RTIC states.

Fig. 3. — a, Global cryo-EM map and model of the +1 RTIC (RT p66: purple, RT p51: gray, vRNA: yellow, tRNA^Lys₃: red). RNA structures of interest are labeled. Nucleotides that are not modeled within the +1 RTIC cryo-EM core map are faded. b, 4.1 Å cryo-EM map of the +1 RTIC core colored by chain. c, Near-atomic model of the +1 RTIC core colored by chain. Regions of the vRNA–tRNA^Lys₃ complex within the RNase H domain that were previously unsuited for near-atomic modeling are boxed (vRNA nt 200–203 and tRNA^Lys₃ nt 1–4, 51–58).

For the +3 RTIC sample, we collected an additional 7,279 movies, which were motion-corrected and summed, and picked approximately 2.9 million particles by semi-automated particle picking in Relion (Fig. 2c, Supplementary Fig. 6a)[37, 38]. Again, two-dimensional (2D) classification was used to eliminate damaged and noise particles. The remaining 2D classes exhibited secondary-structure features and peripheral RNA structures emerging from the RTIC core (Fig. 2d). After one round of three-dimensional (3D) classification, we were able to identify conformational heterogeneity within +3 RTIC that initially separated into two distinct classes (Supplementary Fig. 6a).

All of the identified classes exhibited RT bound to the vRNA–tRNA^Lys₃ complex, which featured a global fold and apparent secondary structure resembling that previously observed in the +1 RTIC (Fig. 4a, b-d). The first class most closely resembled the +1 RTIC reconstruction, and standard refinement procedures resulted in a global map at 4.9 Å and a core map at 4.2 Å (Fig. 4b,e and Supplementary Fig. 6a and 7). Additional global subclasses were identified, but these subclasses only exhibited differences in the strength of density and orientation of the peripheral regions of the tRNA and H2 (Supplementary Fig. 6b). The second class exhibited conformational heterogeneity within the RT polymerase domain and PBS helix. An additional round of 3D classification revealed two distinct subclasses exhibiting distinct amounts of domain movement (Supplementary Fig. 6a). The first of these subclasses, which exhibited the largest domain displacements, was refined to produce a global map at 5.3 Å and a core map at 4.5 Å (Fig. 4c,f and Supplementary Fig. 7). The second subclass, which exhibited a lesser degree of domain movement, was similarly refined to produce a global map at 5.3 Å and a core map at 4.7 Å (Fig. 4d,g and Supplementary Fig. 7). As above, additional global subclasses were identified, which exhibited differences in the strength of density and orientation of the peripheral regions of the tRNA and H2 (Supplementary Fig. 6b). As with the +1 RTIC, atomic models for the +3 RTIC cryo-EM density core maps were built using a combination of DRRAFTER, Coot, and Phenix real-space-refine[39–42].

Fig. 4. — a, Secondary structure of the viral RNA and tRNA^Lys₃ within the HIV-1 +3 RTIC. The shaded box highlights the extended PBS (tRNA^Lys₃ nt 55–79, vRNA nt 181–203) and the boxed nucleotide highlights the position of the crosslinkable guanosine, G73. b-d, Global cryo-EM maps of the +3 RTIC conformers. Maps are colored by component: viral RNA (yellow), tRNA^Lys₃ (red), RT p66 (purple), RT p51 (gray). Observed secondary structure features from a are labeled in panel b. A low-threshold, colored map has been overlaid into a transparent, high-threshold map to illustrate the helical nature of the vRNA–tRNA^Lys₃ complex. e-g, Core cryo-EM maps and models of the +3 RTIC conformers. The base of vRNA H2 (nt 134–137, 175–179), extended PBS, and a portion of the tRNA^Lys₃ helix (nt 1–8, 46–54) were modeled.

Global conformation of the +3 extended RTIC

Despite conformational heterogeneity within the core (described below), the RNA in the +3 RTIC retains, in all classes, the secondary structure that was observed in the +1 extension state (Fig. 4a,b). In all classes, the PBS helix within the RT polymerase domain has been extended by three dNTPs off the 3’ end of the tRNA^Lys₃, which form an initial DNA-RNA hybrid of 3 base pairs with vRNA nts 181–179. As with the +1 state, the PBS helix has been extended by four additional base pairs near the RNase H domain between tRNA^Lys₃ nts 55–58 and vRNA nts 200–203. For all three conformers, additional rounds of 3D classification revealed the presence of the previously observed extended tRNA^Lys₃ helix. However, the strength and orientation of this density with respect to the core appear highly variable when comparing these additional subclasses (Supplementary Fig. 6b). The tRNA^Lys₃ helix is connected by the single-stranded connecting loop to the 7-bp viral RNA helix 1 (H1). As previously observed, density for H1 is poorly resolved in all three +3 conformers and the new +1 data. Helical density for viral RNA helix 2 (H2) is located immediately above the RT fingers and is strikingly well defined for all +3 conformers (Fig. 4b-d). Compared to the +1 RTIC, H2 has lowered towards the RT polymerase active site and engages more closely with the RT fingers domain (Fig. 5b). Downward movement of H2 is due to RT extension off the 3’ end of the tRNA^Lys₃, which has rotated the PBS helix along the binding cleft towards the RNase H domain. This is consistent with the hypothesis that the stable structure of H2 may represent a significant barrier to the continuation of reverse transcription past the +3 state[5, 13, 18, 29, 30]. In all +3 conformers, the apical region of H2 is poorly resolved.

Fig. 5. — a, Structural overlay of +1pre RTIC (yellow/red/purple/gray) and +3pre RTIC (dark gray) states. The alignment is rotated by 90° to highlight that the extended PBS helix follows a similar track that deviates upon reaching the RNase H domain. Structures are aligned by the p51 subunit. b, Viral RNA H2 moves downward towards the polymerase active site upon extending from the +1 (yellow) to +3 (gray) state. c, Polymerase active site region of the +1pre and +3pre RTIC states. Structures are aligned by the p51 subunit. d, RNase H domain of the +1pre and +3pre RTIC states. The extended-PBS helix of the +3pre state (gray) shifts away from the RNase H domain of RT by ~3 Å compared to the +1pre state. Secondary structures of the RNase H domain are labeled for reference.

While the +3 extension state favors a common global architecture of the RTIC, the distinct classes identified via cryo-EM make clear that the +3 RTIC adopts an ensemble of conformations that differ from each other significantly within the RT polymerase domain, binding cleft, and RNase H domain. These differences are more evident when inspecting the location of the tRNA^Lys₃ 3’ end within the RT polymerase domain and are described in the following sections.

+3 RTIC can adopt a pre-translocation state conformation similar to the +1 RTIC

Detailed analysis of the cryo-EM structures reveals one state that clearly adopts a conformation trapped in a pre-translocation state reminiscent of that observed in the +1 RTIC (Fig. 5a,c). This +3 pre-translocation state (+3pre) features a semi-open conformation of the RT fingers, like that observed in the +1 RTIC, which is capable of accommodating the downward-shifted H2 (Fig. 5b). Density for H2 is well defined and higher in resolution compared to its density in the +1pre state, suggesting that additional vRNA-RT contacts may be serving to stabilize the complex. Unfortunately, at this modest resolution (4.2/4.9 Å), we could not reliably identify any unique RT finger–RNA contacts that could account for this stabilization. The PBS helix within the RT binding cleft follows a similar track as it does in the +1 state until the RNase H domain (Fig. 5a). At this point, the +3pre PBS curves slightly away from the RNase H domain, shifting the RNA backbone approximately 3 Å away from RT compared to its position in the +1pre state (Fig. 5a,d). This shift likely results from RT extension, which moves the PBS–tRNA^Lys₃ junction further outside of the RNase H domain. Movement of this flexible junction outside of the RT binding cleft likely destabilizes potential RT–RNA contacts that would normally stabilize the PBS–tRNA^Lys₃ junction. This resulting shift may reduce the overall stability of the +3pre state and lead to the observed alternative conformations. However, due to the limited resolution of our cryo-EM data, such changes in RT-nucleic acid contacts can only be inferred by the distance between the RT and RNA backbones.

+3 RTIC can adopt additional conformations featuring large domain movements

As discussed above, cryo-EM of the +3 RTIC revealed two other conformational states, which feature an altered arrangement of the RT and vRNA–tRNA^Lys₃ domains. The first of these states exhibits a dramatic displacement of the PBS helix along the RT binding cleft (+3 displaced state, +3dis) (Fig. 6a). Remarkably, this lateral shifting movement of ~9 Å, with respect to the +3pre state, is accompanied by only a slight rotation of the PBS helix near the PBS–tRNA^Lys₃ junction. To accommodate this shift, the RT thumb domain swings towards the fingers domain (Fig. 6c,e). This results in a loss of the canonical RT thumb–minor groove interaction commonly found in RT-nucleic acid structures. Alignment of the thumb domain with previously solved RT structures does not reveal any major distortions, suggesting that this movement is well within the thumb domain range of motion (Supplementary Fig. 8). This is supported by past structural work, which has indicated that the RT thumb domain is capable of adopting a wide range of conformations[31, 43–45]. In addition to the thumb movement, the RT fingers domain has opened by ~5.5 Å, with respect to the +3pre state, in order to accommodate the displacement of the PBS helix (Fig. 6c). The difference in total displacement between the PBS helix and fingers domain results in the close packing of the RNA against the fingers.

Fig. 6. — a, Structural overlay of the +3pre (gray) and +3dis (green) RTIC states. The extended-PBS helix shifts ~9 Å along the RT binding cleft towards the fingers subdomain. The tRNA^Lys₃ helix swings back into the RNase H domain, displacing it by 8–9 Å. b, Structural overlay of the +3pre (gray) and +3int (brown) RTIC states. The extended-PBS helix shifts ~4.5 Å along the RT binding cleft towards the fingers subdomain. The tRNA^Lys₃ helix makes a similar movement to that observed in the +3dis state. c, Polymerase active site of the +3pre, +3int, and +3dis states aligned by the RT p51 subunit, with catalytic residues highlighted in red. The fingers, thumb, and PBS helix are all shifted further from the RNase H domain, representing a continuum of motion. The fingers open ~5.5 Å in order to accommodate the full displacement of the PBS from the +3pre to the +3dis state. d, Polymerase active site of the +3pre (gray), +3int (brown), and +3dis (green) states aligned by the RT fingers and connection subdomain backbones. All three states exhibit similar positions of the vRNA H2, fingers, polymerase active site, and 3’ end of the primer terminus. This suggests that the RT–RNA contact landscape in this area is minimally impacted by the domain movements. e, Thumb domain of the +3 RTIC states. This domain exhibits a continuum of motion that parallels the PBS movement along the RT binding cleft.

We also observed an intermediate state (+3int), in which the PBS helix, thumb, and fingers domains adopt conformations in-between those observed in the +3pre and the +3dis states (Fig. 6b). For the +3int state, the PBS helix has shifted a more moderate distance of ~4.5 Å along the cleft towards the fingers with respect to the +3pre state. This shift is accompanied by a corresponding movement within the RT thumb and fingers domains (Fig. 6c,e). Taken together, this suggests that the +3int state represents a metastable state en route between the +3pre and +3dis states.
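One simple way to quantify a shift of this kind, once two models have been placed in a common frame (for example by aligning on the p51 subunit), is to compare the centroids of matched backbone atoms. The sketch below is a minimal illustration under that assumption. It is not necessarily how the displacements reported here were measured, the function names are hypothetical, and the coordinates are made-up placeholders rather than values from the deposited models.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean position of a set of 3D coordinates (in Å)."""
    if not points:
        raise ValueError("need at least one coordinate")
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def centroid_shift(reference: Sequence[Point], moved: Sequence[Point]) -> float:
    """Distance (Å) between the centroids of two pre-aligned atom selections."""
    return math.dist(centroid(reference), centroid(moved))


# Placeholder backbone positions for a helix segment in a reference state.
helix_reference = [(0.0, 0.0, 0.0), (2.8, 1.0, 3.1), (5.6, -1.0, 6.2)]
# The same atoms in a second state, here translated 4.5 Å along x for illustration.
helix_shifted = [(x + 4.5, y, z) for (x, y, z) in helix_reference]

shift = centroid_shift(helix_reference, helix_shifted)
print(f"Centroid displacement: {shift:.1f} Å")
```

A centroid comparison reports only net translation; a rotation about the helix axis would require a per-atom or axis-based measure instead.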

While we observe large, coupled domain movements within the polymerase domain, the RNA–protein contacts within this domain appear to be similar among the three states (Fig. 6d). Thus, we speculated that the three distinct conformers might have resulted from a dramatically altered protein–RNA contact landscape elsewhere in the complex. We thus focused on the RT RNase H domain, where large conformational changes are readily apparent.

RNA backbone interactions with the RT RNase H domain in the +1 and +3 RTIC

Using the higher resolution +1 RTIC cryo-EM data, we were able to assess the conformation near the RNase H domain of RT, allowing us to better define potential RT–RNA contacts. As observed in the previously determined map, a major point of interaction between the RNA and RT appears at RNase H αA-B (residues 473–488 and 499–509) and β1–3 (residues 438–469), which are positioned along the minor groove of the +1pre PBS helix (Fig. 5d and Fig. 7a). However, analysis of the improved dataset reveals previously unobserved density that potentially corresponds to RT–nucleic acid contacts. For example, RNase H αE (residues 544–558) features several positively charged residues at the C-terminal end and is positioned at the extended PBS-tRNA coaxially stacked junction (Fig. 5d and Fig. 7a). Alpha helix I (αI) (residues 274–284), within the p51 thumb domain, features a positively charged surface (K275, R277, K281) that faces the minor groove of the tRNA helix (Fig. 7a). These two helices could provide additional stabilization for the coaxially stacked junction between the extended PBS helix and refolded tRNA primer. We anticipated that extension from the +1 to the +3 state would push the extended PBS-tRNA junction outside of the binding cleft, weakening these stabilizing electrostatic interactions and possibly disrupting the extended tRNA helix.

Fig. 7. — a, View of the extended PBS (vRNA: yellow, tRNA^Lys₃: red) and coaxially stacked tRNA^Lys₃ helices within the RT RNase H domain. The two tRNA nucleotides on either side of the coaxially stacked junction are colored pink for reference. RT RNase H domain secondary structure elements are labeled. The positively charged sidechains on RT p51 αI are labeled (K281, R277, & K275: rust). b-d, Identical views of the PBS and tRNA^Lys₃ helices within the RT RNase H domain for the +3pre (b), +3int (c), and +3dis (d) states. The +3int and +3dis states exhibit a shifting of the extended PBS–tRNA^Lys₃ junction that repositions it within the RNase H domain into close proximity of αI.

In contrast to the +1 RTIC, the +3 RTIC adopts several conformations near the RNase H domain, supporting the hypothesis that interactions in this region might explain the global differences among those classes (Fig. 4 and Fig. 7b-d). While the +3pre state resembles the +1pre state, its PBS helix angles away from the RT binding cleft upon entering the RNase H domain. This angling shifts the PBS backbone approximately 3 Å away from the RT binding cleft with respect to the +1pre state (Fig. 5a,d). As expected, extension from the +1pre state to the +3pre state also rotates and pushes the PBS helix away from the polymerase domain, effectively moving the extended PBS–tRNA^Lys₃ coaxially stacked junction further outside of the RT binding cleft (Fig. 5d).

The +3dis and +3int states differ markedly from the +3pre state in the region of the RNase H domain, suggesting an altered RT–RNA contact landscape (Fig. 7b-d). The extended PBS helix is shifted back into the binding cleft of RT. This ~8–9 Å displacement compared to the +3pre state moves the extended PBS–tRNA^Lys₃ junction into the RNase H domain and shifts the RNA back into close contact with RT (Fig. 6a,b and Fig. 7c,d). This movement potentially restores RT–RNA contacts in the RNase H domain that were lost in the transition from the +1pre to the +3pre state (Fig. 5d, Fig. 7). Stabilized by this more extensive RT–nucleic acid contact landscape, the coaxially stacked PBS–tRNA helix in the +3dis and +3int states adopts a more linear conformation. The shifting of the PBS helix back into the RT binding cleft correlates with the large-scale conformational changes observed at the polymerase domain of the complex (Fig. 6c). The RT–RNA contact landscape also appears altered in the +3dis and +3int states compared to the +3pre and +1pre states. In the shifted +3dis and +3int states, the RNase H αA-B and β1–2 fall along the PBS major groove while RNase H αE and its C-terminus ride along the minor groove (Fig. 7c,d). In contrast, the +3pre state features minor groove interactions with RNase H αA-B and β1–2, and a major groove interaction with RNase H αE (Fig. 6 and Fig. 7a,b), as observed in the +1pre state. The shifted tRNA helix backbone in the +3dis and +3int states also moves into close proximity with αI, an RNA–RT contact that had previously been broken due to extension from the +1pre to +3pre state (Fig. 7c,d). The restoration of this contact provided us with a potential region to selectively target, through site-directed mutagenesis, to modulate the stability of the +3dis and +3int states.

In summary, the +3 RTIC adopts multiple conformations involving large-scale shifts of the PBS helix within the RT binding cleft. These movements likely involve altered RNA–protein contacts within the RNase H domain and may arise from the barrier presented by HIV-1 vRNA H2 in the RT polymerase domain. To strengthen the structural model developed here, we first wanted to probe the possibility that the alternative conformations observed here were byproducts of the covalent crosslinking used to stabilize the complex for structural work. We therefore tested whether these static states are sampled dynamically in solution using a non-crosslinked system.

Single-molecule FRET investigation of the +3 RTIC dynamics

We probed the conformational dynamics of the +3 RTIC using single-molecule Förster resonance energy transfer (smFRET). Previous studies had effectively used single-molecule FRET to monitor the intrinsic dynamics of the HIV-1 initiation complex through the clever placement of fluorescent FRET probes[29, 35]. Most previous labeling schemes placed fluorescent dyes within the RT fingers domain or the C-terminus of the RNase H domain, resulting in a strong FRET signal when combined with a 5’-labeled tRNA^Lys₃ or vRNA labeled within the connecting loop. This scheme is ideal for monitoring RT binding to, or RT flipping on, a nucleic acid substrate, both of which feature substantial FRET value changes. According to our structural models, these labeling schemes would not distinguish between conformers that differ by a ~9 Å movement within several select domains. Using our cryo-EM structures as a guide, we designed a FRET system to probe the conformational changes observed in the +3 RTIC. A surface-exposed alanine at position 355 on the p51 subunit was mutated to cysteine with no appreciable difference in polymerase activity (Supplementary Fig. 3c). This A355C RT mutant was labeled with a Cyanine5 (Cy5) fluorophore using maleimide chemistry. For single-molecule assays, the 101-nt vRNA template was extended on the 5’ end with an unstructured linker and primed with 5’-Biotin-GMP during T7 transcription for surface immobilization (Supplementary Fig. 9b). Full-length and +3 extended synthetic tRNA^Lys₃ primers were labeled at the C5 position of nucleotide C4 (TriLink Biotechnologies) with a Cyanine3 (Cy3) fluorophore to serve as the FRET pair donor (Fig. 8a and Supplementary Fig. 9a).
Based on our cryo-EM models, we predicted that the donor and acceptor fluorophores would be ~55 Å apart in the +1pre state, and that this distance would increase to ~60 Å in the +3pre state and decrease to ~50 Å in the +3dis and +3int states, owing to the shifting of the PBS–tRNA^Lys₃ helix (Fig. 8a and Supplementary Fig. 9a). Since the Förster radius (R₀) for the Cy3 and Cy5 FRET pair is around 60 Å[46], these distances should allow for monitoring of the exchange between the +3pre and +3dis/int states using FRET.
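
As a rough check on these predictions, the standard Förster relation E = 1 / (1 + (r/R₀)⁶) can be evaluated at the modeled distances. The sketch below assumes R₀ = 60 Å, as quoted above, and treats the dyes as point probes at the modeled positions, ignoring dye linkers and orientation effects. The function name is illustrative and is not part of the original analysis.

```python
def fret_efficiency(distance_angstrom: float, r0_angstrom: float = 60.0) -> float:
    """Ideal FRET efficiency from the Förster relation E = 1 / (1 + (r/R0)^6)."""
    if distance_angstrom <= 0 or r0_angstrom <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_angstrom / r0_angstrom) ** 6)


# Donor-acceptor distances predicted from the cryo-EM models (in Å).
predicted_distances = {"+1pre": 55.0, "+3pre": 60.0, "+3dis/int": 50.0}

# Idealized efficiencies; real dyes on flexible linkers will deviate from these.
predicted_efficiencies = {
    state: fret_efficiency(r) for state, r in predicted_distances.items()
}

for state, efficiency in predicted_efficiencies.items():
    print(f"{state}: r = {predicted_distances[state]:.0f} Å, ideal E = {efficiency:.2f}")
```

Because the modeled distances lie at or just below R₀, where the efficiency curve is steepest, a 10 Å change between the +3pre and +3dis/int states corresponds to a difference of roughly 0.25 in ideal efficiency under these assumptions.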

Fig. 8. — a, Labeling positions of the +3 RTIC for single-molecule FRET experiments. Cytidine 4 of the tRNA^Lys₃ is labeled with Cyanine3 (green dot) and residue 355 of RT p51 is labeled with Cyanine5 (red dot). Distances between the C6 (cytidine 4) and the CA of residue 355 are indicated for the +3pre (gray) and +3dis (dark green) states. b, Representative single-molecule traces for the HIV-1 RTIC. Upper trace depicts the fluorescence signal of the Cy3 and Cy5 channels (Cy3 excitation). Lower trace depicts FRET efficiency, calculated as the ratio of the Cy5 fluorescence to the sum of the Cy3 and Cy5 fluorescence for each frame. c, FRET efficiency histogram of the +0 and +3 RTICs with wild-type or mutant p51 RT. d, FRET efficiency mean and variance for the +0 and +3 RTICs. Notably, the mean FRET efficiency decreases more dramatically for the +3 RTIC p51 mutant than for the +0 RTIC p51 mutant.

Using this single-molecule FRET system, we first probed the dynamics of the uncrosslinked and unextended RTIC. We annealed biotinylated vRNA and Cy3-labeled tRNA^Lys₃ and tethered the vRNA/tRNA complex to a Neutravidin-coated microscope slide surface (Supplementary Fig. 9b). Therefore, Cy3 fluorescence observed on the surface should correspond to a vRNA/tRNA complex. Upon injecting Cy5-labeled RT, we observed FRET events between Cy3 and Cy5, indicative of specific RT binding to the surface-tethered vRNA–tRNA^Lys₃ complex in proximity to the PBS (Fig. 8b). At a final Cy5-labeled RT concentration of 40 nM, we also observed non-specific adsorption of Cy5-labeled RT to the surface (Supplementary Fig. 9c). We collected FRET events with a FRET efficiency greater than a set threshold (0.3) to calculate the FRET distribution. By selecting only for FRET events, we expect to filter out the fluorescence signal from non-specific surface binding of Cy5-labeled RT. This FRET distribution was fit to a single Gaussian with a mean of 0.57 and a variance of 0.32 (Fig. 8c,d). Next, we performed the same assay using an uncrosslinked +3 RTIC, in which the tRNA in the vRNA–tRNA^Lys₃ complex was synthetically extended by three deoxynucleotides. The single-Gaussian fit of the FRET distribution had a mean of 0.57 and a variance of 0.36 (Fig. 8c,d). A comparison of the two datasets indicated that the broader FRET distribution of the +3 RTIC was statistically different from that of the +0 RTIC (p-value of 3.8 × 10⁻³; two-sample F-test of equal variance). These data support a model in which the +3 RTIC has increased underlying conformational heterogeneity. However, the weak binding affinity between RT and the vRNA–tRNA^Lys₃ duplex resulted in transient FRET events in which a clear distinction between FRET states for the +3pre and +3dis/int conformational states was not observed, so the data provide no details on the underlying populations.
To investigate potential differences in RT binding kinetics between the two complexes, we analyzed lifetimes of the FRET-on state (RT-bound state) for the +0 and +3 RTICs. Calculated FRET-on state lifetimes for both complexes did not differ substantially (+0: 1.49 ± 0.06s, +3: 1.60 ± 0.12s), indicating that RT remains bound to each vRNA/tRNA substrate for a similar amount of time (Supplementary Fig. 9d). The presence of unlabeled RT (labeling efficiency estimated to be 60%; Methods) precluded calculating accurate FRET-off state lifetimes, which would be overestimated due to missed unlabeled RT binding events.
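
The event selection described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this study: it computes a per-frame apparent FRET efficiency as acceptor signal over total signal, groups consecutive frames above the 0.3 threshold into FRET-on events, and converts event lengths into dwell times. The intensity values are invented for illustration, and the frame time (`FRAME_TIME_S`) is a hypothetical placeholder.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FretEvent:
    start_frame: int
    n_frames: int
    mean_efficiency: float


def apparent_fret(donor: Sequence[float], acceptor: Sequence[float]) -> List[float]:
    """Per-frame apparent FRET efficiency, E = I_Cy5 / (I_Cy3 + I_Cy5)."""
    efficiencies = []
    for d, a in zip(donor, acceptor):
        total = d + a
        efficiencies.append(a / total if total > 0 else 0.0)
    return efficiencies


def find_fret_on_events(efficiency: Sequence[float], threshold: float = 0.3) -> List[FretEvent]:
    """Group consecutive frames with efficiency above threshold into events."""
    events: List[FretEvent] = []
    start = None
    # A sentinel below any threshold closes an event that runs to the last frame.
    padded = list(efficiency) + [float("-inf")]
    for i, e in enumerate(padded):
        if e > threshold and start is None:
            start = i
        elif e <= threshold and start is not None:
            segment = padded[start:i]
            events.append(FretEvent(start, i - start, sum(segment) / len(segment)))
            start = None
    return events


# Toy intensities (arbitrary units), invented for illustration only.
donor = [900, 880, 400, 380, 420, 910, 905, 300, 890]
acceptor = [20, 30, 520, 540, 500, 25, 30, 600, 20]

FRAME_TIME_S = 0.1  # hypothetical camera frame time, not taken from the text

events = find_fret_on_events(apparent_fret(donor, acceptor))
lifetimes = [event.n_frames * FRAME_TIME_S for event in events]
```

Pooling the per-event mean efficiencies across many molecules gives the kind of distribution that is fit with a Gaussian, and the dwell times give the FRET-on lifetimes.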

Mutations within the p51 thumb significantly alter the +3 RTIC conformational landscape

To clarify the observed conformational dynamics of the +3 RTIC, we perturbed the population distribution using structure-guided mutations. Ideally, specific alterations to the RT–RNA contact landscape would selectively disrupt the displaced PBS binding mode found in the +3dis/int states while maintaining the +1pre and +3pre conformations. The p51 thumb domain αI was a clear candidate, because it closely associates with the tRNA^Lys₃ backbone due to PBS helix displacement in the +3dis/int states (Fig. 7). We hypothesized that mutating the charged face of this αI would dramatically disrupt the +3dis/int conformations, while leaving +3pre and +1pre conformations relatively unaltered (Fig. 8a). Such a disruption should shift the mean FRET efficiency, as the lower FRET state of the +3pre would be favored. This p51 triple mutant (K275A, R277A, K281A) was generated and labeled using the same Cy5 maleimide chemistry described earlier. Bulk activity assays of the triple p51 mutant RT revealed similar global reverse transcriptase activity and pausing patterns (Supplementary Fig. 3c-f). Notably, the triple mutant RT appears to slightly increase the propensity of +3 pausing, likely due to decreased overall affinity of RT for the vRNA–tRNA^Lys₃ complex.

We performed similar single-molecule FRET experiments on the p51 triple mutant RT–RNA complexes as described above for wild-type complexes. The data on the p51 triple mutant revealed a significant alteration of the +3 RTIC conformational landscape (Fig. 8c,d). The mean FRET efficiency of the +3 RTIC shifted from 0.57 to 0.46 (p-value of 2 × 10⁻¹²³; two-sample Student’s t-test) with the variance changing from 0.36 to 0.29 (p-value of 3 × 10⁻¹³; two-sample F-test of equal variance), as shown by a tighter FRET distribution around the lower mean. This decrease in mean FRET efficiency was substantially greater than that of the p51 triple mutant bound to the +0 RTIC, which moved from 0.57 to 0.53. The dramatic lowering of the +3 RTIC FRET efficiency could be explained by selective disruption of the +3dis/int state binding mode, which has a higher FRET efficiency than the +3pre state, although there is no direct evidence from the smFRET data that both conformers are actually populated in solution. We also repeated our FRET-on state lifetime analysis for the p51 triple-mutant complexes. The +0 RTIC exhibited a slight lifetime decrease relative to the WT +0 RTIC (1.38 ± 0.09 s, 7.4% decrease), indicating that the +0 unextended RT-bound complex was relatively unperturbed by the mutations. However, the +3 RTIC exhibited a far more dramatic change in the FRET-on state lifetime (1.06 ± 0.05 s, 34% decrease), indicating that the mutations had a far greater effect on the stability of RT bound to the +3 extended complex (Supplementary Fig. 9d; see Discussion). We also observed that short FRET-on lifetimes correlate with a low mean FRET efficiency (Fig. 8c). One possible explanation is that, because the first and last frames of a FRET-on event are typically averaged with the FRET-off state, including these frames could lower the mean FRET efficiency of very short FRET events.
Indeed, we observed an increase in the mean FRET efficiencies when the first and last frames are removed from the analysis, but the overall trend remained unchanged, arguing against a short FRET-on lifetime causing a low mean FRET efficiency (Supplementary Fig. 9e). Taken together, these results suggest that the +3 RTIC conformations observed in the cryo-EM data are also present in an uncrosslinked system in solution.
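
The lifetime comparison and the frame-trimming control can be illustrated with a short sketch. The lifetimes are the values reported above; the helper names and the example event are hypothetical and are included only to make the arithmetic explicit.

```python
from typing import Optional, Sequence


def percent_decrease(reference: float, value: float) -> float:
    """Percent decrease of `value` relative to `reference`."""
    return 100.0 * (reference - value) / reference


# Reported FRET-on lifetimes in seconds (wild-type vs. p51 triple mutant).
wild_type = {"+0": 1.49, "+3": 1.60}
triple_mutant = {"+0": 1.38, "+3": 1.06}

decreases = {
    state: percent_decrease(wild_type[state], triple_mutant[state])
    for state in wild_type
}


def trimmed_mean_efficiency(event: Sequence[float]) -> Optional[float]:
    """Mean efficiency of one FRET-on event after dropping its first and last frames.

    Returns None when the event is too short to have interior frames.
    """
    interior = event[1:-1]
    if not interior:
        return None
    return sum(interior) / len(interior)


# Hypothetical four-frame event whose edge frames are partly averaged with FRET-off.
example_event = [0.35, 0.55, 0.60, 0.40]
full_mean = sum(example_event) / len(example_event)
trimmed_mean = trimmed_mean_efficiency(example_event)
```

With the reported values, the +0 lifetime decreases by about 7.4% and the +3 lifetime by about 34%, matching the percentages quoted above. In the toy event, trimming raises the mean because the edge frames are diluted by the FRET-off state.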

Discussion

In this manuscript, we sought to investigate the structural basis for RT pausing during initiation using cryo-EM. Previous biochemical and biophysical studies have identified the initiation step of reverse transcription as a highly dynamic process with kinetic properties distinct from those of later elongation steps[7]. Moreover, the presence of RNA structure has previously been correlated with slow rates of polymerization and specific stalling points along the viral RNA sequence, and recent cryo-EM and x-ray crystallography studies of the HIV-1 RTIC have revealed general structural features that likely contribute to slow polymerization rates[7, 18, 29, 30, 32]. However, the structural principles that dictate when stalling occurs, and, more generally, how RT transitions from initiation into elongation, have remained unknown. Here, we show that the +3 extended RTIC can adopt multiple, distinct conformational states. Below, we explore how these distinct states relate to RT pausing and discuss their implications for regulation of HIV-1 reverse transcription initiation.

We first repeated our previous cryo-EM studies of the +1 extended RTIC to characterize more thoroughly the previously unresolved regions of the RTIC structure (Fig. 3b,c)[30, 32]. Minor alterations in the sample buffer conditions and the use of an improved cryo-electron microscope setup allowed for the determination of a 4.1 Å cryo-EM map of the +1 RTIC core. This map allowed us to model the coaxially stacked PBS–tRNA^Lys₃ junction, which lies at the edge of the RT binding cleft within the RNase H domain. Global improvements in the +1 RTIC cryo-EM data also resolved helical features for vRNA H2, which had previously exhibited weak and distorted density. However, likely due to intrinsic flexibility, vRNA H1 remained poorly resolved, with 3D classification revealing varying degrees of density for the extended tRNA^Lys₃ helix. Both of these observations are consistent with our previously published results[30], and the improved resolution of this dataset provides a basis for comparisons with the additional +3 complexes presented in this study.

Most critically, we captured and characterized an active +3 RTIC using cryo-EM to elucidate how RNA structure is employed by HIV-1 to alter reverse transcription rates during the initiation step. The +3 reverse transcription initiation extension intermediate is characterized by a strong, kinetic pause, which features a 10-fold decrease in RT polymerization compared to other initiation intermediates[28]. Pausing at the +3 extension state has been previously correlated with the presence of HIV-1 viral RNA secondary structure, which in turn has been shown to increase the propensity of RT to adopt alternative binding modes[18, 29]. However, the exact features of the +3 RTIC that make it uniquely susceptible to pausing have remained unknown. Processing of the +3 RTIC cryo-EM data revealed a surprising amount of conformational heterogeneity within the RTIC core, which had previously adopted a single conformation in the +1 complex (Fig. 4b-d)[30]. Despite these conformational differences, we found that the +3 RTIC exhibited strong, ordered density for HIV-1 vRNA helix 2. Within all +3 RTIC conformers, we found that vRNA H2 had lowered further into the RT polymerase active site compared to the +1 complex, an expected consequence of adding two additional dNTPs onto the 3’ terminus of the primer strand. Lowering of H2 further into the RT polymerase active site and into closer contact with the fingers domain may serve to reduce the conformational heterogeneity of H2, which had previously hampered our ability to resolve it at the +1 extension state. Therefore, as predicted, the stable nature of H2 presents a substantial barrier to the continuation of reverse transcription in this state[18, 29, 30, 36, 47].

The cryo-EM data revealed the presence of three distinct +3 RTIC conformational states. These states exhibit a continuum of motion in which the PBS helix moves along the binding cleft of RT towards the fingers domain. The first of these conformers, +3pre, closely resembles the +1 RTIC, with both states adopting a pre-translocation conformation within the RT polymerase active site (Fig. 5c). The main difference between these two pre-translocation states lies in the PBS helix path along the RT binding cleft. The PBS helix of the +3pre state is shifted slightly away from the RNase H domain, possibly owing to a reduced ability of RT to bind the long, rigid A-form RNA helix compared to the more malleable B-form DNA. As a result, the PBS–tRNA^Lys₃ coaxially stacked junction is positioned further away from the RT binding cleft. We hypothesize that these slight movements would result in a weaker RT–PBS interaction at the +3pre state compared to the +1 state. The inability to retain an on-pathway interaction with the RNA substrate could result in the adoption of alternative conformers that partially restore RT–nucleic acid contacts and may explain the presence of two additional conformations, +3int and +3dis, that are not observed in the +1 RTIC cryo-EM data.

The +3int and +3dis conformations provide further explanations for pausing of reverse transcription at this state. For these two additional conformations, the shifted motion of the PBS along the RT binding cleft presses the 3’ primer terminus against the fingers domain, effectively moving it away from the polymerase active site (Fig. 6). This is accompanied by movement of the RT thumb domain, which retains contact with the minor groove of the PBS helix, towards the fingers domain. While we see a continuum of motion within the polymerase domain across the +3 RTIC states, our data do not readily reveal any unique interactions within this domain that would selectively stabilize the +3dis/int states over the +3pre state. Nonetheless, these conformational changes within the polymerase domain shift the 3’ end of the tRNA primer further away from the polymerase active site residues. This further disengagement from an active conformation suggests an additional mechanism by which RT polymerization from the +3 extension state is slowed during HIV-1 reverse transcription initiation. However, this analysis is limited by the low resolution of the data and the lack of distinct side chain density in this region. As such, our dataset may simply not be able to reveal unique interactions potentially stabilizing the movements observed in the +3dis and +3int states.

The distinct conformational states of the +3 RTIC are distinguished by different interactions in the RNase H domain. Shifting of the PBS helix along the RT binding cleft is accompanied by a dramatic swinging of the coaxially stacked PBS–tRNA^Lys₃ helix into the RNase H domain (Fig. 6 and Fig. 7). This movement repositions the RNA backbone to be in close contact with RT, potentially restoring any stabilizing contacts lost in the transition from the +1pre to +3pre state. The formation of additional RT–nucleic acid contacts explains the shifted movement of the PBS helix along the binding cleft and would effectively counteract any strain induced by the altered RT finger and thumb domain conformations. We sought to probe these altered conformations in an uncrosslinked system using single-molecule FRET; the results were consistent with rapid sampling of different states in the uncrosslinked +3 RTIC, although there is no direct evidence of multiple FRET states or conformational sampling. By using mutagenesis to target RT–nucleic acid contacts that preferentially stabilize the +3dis/int states, we observed a greater reduction in the residency time of RT on the +3 RTIC than on the +0 RTIC (Supplementary Fig. 9d). This is consistent with our model, based on the cryo-EM data, that these mutations would selectively perturb the +3dis/int states and alter the +3 RTIC conformational landscape more than that of a +0 RTIC. Taken together, the cryo-EM and single-molecule results suggest an additional layer of conformational dynamics that governs the regulation of HIV-1 reverse transcription initiation.

These structural and single-molecule data, along with previous biochemical and biophysical studies, suggest a model in which reverse transcription initiation continues slowly, yet unimpeded, until the +3 extension state (Fig. 9). At this point, RT must wait until the template nucleotides sequestered within vRNA H2 are exposed. Due to a weakened RT–nucleic acid contact landscape of the +3pre state, RT can readily adopt energetically accessible, alternative binding modes of the +3int and +3dis states. Such conformational heterogeneity, which may also be accompanied by previously observed “enzyme-flipping” states, would decrease RT’s propensity to sample “on-pathway” binding modes that lead to polymerization[8, 29]. Alleviation of the +3 pause is accomplished by fraying of the terminal base pairs of vRNA helix 2, to release the template nucleotides, and concurrent repositioning of the 3’ primer terminus into the RT polymerase active site. Such a model would explain past biochemical and biophysical data suggesting that while HIV-1 vRNA H2 remains at least partially folded until the +6 extension state, only the +3 extension state features dramatically reduced polymerization rates[6, 18, 28, 29].

Fig 9. — HIV-1 reverse transcription initiation begins with RT binding to the vRNA–tRNA^Lys₃ complex and extending off the tRNA^Lys₃ 3′ end. Polymerization of the first three dNTPs by RT proceeds slowly, with frequent dissociation and rebinding of RT, until the +3 extension state is reached. At this point, polymerization stalls due to the sequestering of the template nucleotides within vRNA H2 (represented by the +3pre state). The weakened RT–nucleic acid contact landscape of the +3pre state allows RT to sample and then readily adopt alternative binding modes of the +3int and +3dis states. This weakened “on-pathway” +3pre state likely also contributes to “enzyme-flipping” (not pictured), which further stalls initiation. Fraying of the terminal base pairs of vRNA H2 exposes the template strand, allowing for polymerization to proceed and HIV-1 reverse transcription initiation to continue. In this model, nothing prevents vRNA H2 from reforming in the absence of RT polymerization past the +3 state, thereby allowing RT to continue sampling on and off-pathway binding modes.

Materials and Methods

Sample Preparation

HIV-1 vRNA constructs for RTIC formation were prepared by in vitro transcription with T7 RNA polymerase as previously described[34, 48, 49]. Transcripts were denatured in 8 M urea and purified on a 12% TBE-Urea sequencing PAGE gel. Gel extraction was performed using 0.3 M ammonium acetate. Following ethanol precipitation, the RNA was dissolved in 10 mM Bis-Tris propane, pH 7.0, 10 mM NaCl and stored at −20°C. The crosslinkable tRNA^Lys₃ primer constructs were purchased from TriLink Biotechnologies. The crosslinkable RNA primer was chemically synthesized, PAGE purified, and analyzed by denaturing PAGE and mass spectrometry. During synthesis, an N2-Cystamine-2’-deoxyguanosine was placed at the 71 or 73 position (cys-71 and cys-73) for crosslinking purposes. The cys-73 construct was also extended by three deoxynucleotides on the 3’ end to mimic the +3 extension state.

HIV-1 vRNA constructs for single-molecule experiments were modified by extending the 5’ end with an unstructured linker (5’-GGGCACGUCUGUUGUGUGACUCUGGUA-3’). In vitro transcription was performed with the addition of 1.2 mM biotin-GMP[34]. Purification was performed as described above for unmodified HIV-1 vRNA constructs.

Both vRNA–tRNA^Lys₃ complexes were formed by mixing the vRNA and tRNA^Lys₃ in a 1:1 molar ratio at 1 μM each in 10 mM Bis-Tris propane, pH 7.0, 10 mM NaCl. The mixture was warmed to 90°C and slow cooled to room temperature. The vRNA-tRNA complex was purified away from higher order and unannealed monomer species using a Superdex 200 (26/60) gel filtration column with 10 mM Bis-Tris propane, pH 7.0, 100 mM NaCl[30, 34, 35]. Notably, the +3 cys-73 tRNA^Lys₃–vRNA complex exhibited a larger fraction of higher order species that dissipated at room temperature (Supplementary Fig. 1d-f). The presence of a single species was confirmed with native PAGE and samples were concentrated on a MilliporeSigma 10,000 MWCO concentrator (Supplementary Fig. 1e). Samples were crosslinked to HIV-1 RT or used for activity assays within 24 hours of purification[30].

HIV-1 RT was expressed in Escherichia coli strain BL21 (DE3). Two expression vectors, one containing p66 and ampicillin resistance and the other containing p51 and kanamycin resistance, were constructed. The COOH-terminus of p66 contains an unstructured linker and a six-histidine tag[30, 35]. A cysteine mutation for crosslinking was introduced into helix H of p66 (Q258C) for structural studies[50, 51]. The protein used in this study had an E478Q mutation, introduced to eliminate RNase H activity, as RT has been shown to cleave dsRNA when stalled for long periods of time[50, 52]. Cell pellets were lysed by sonication and the enzyme was purified by gravity Ni-nitrilotriacetic acid (Ni-NTA) affinity chromatography, followed by size-exclusion chromatography using a Superdex 200 (26/600). The his-tag was cleaved by thrombin digestion overnight. The cleaved protein was re-applied to a Ni-NTA column to remove protein with uncleaved his-tag. This was followed by a final size-exclusion chromatography step. The protein was stored at 4°C in 300 mM NaCl, 50 mM Tris, pH 8.0, 5 mM β-met for up to one month.

The RTICs were prepared by mixing RT and vRNA–tRNA^Lys₃ complexes at 2 and 1 μM, respectively, in a buffer containing 25 mM NaCl, 25 mM KCl, 5 mM MgCl₂, 50 mM Tris, pH 7.5, and 100 μM ddCTP if generating the +1 RTIC[30]. The mixture was allowed to crosslink overnight at room temperature. The complex was purified by anion-exchange chromatography with a linear gradient. This was followed by a size-exclusion chromatography step to remove any higher-molecular-weight aggregates. The purity and homogeneity of the final complex were assessed by SDS-PAGE (under non-reducing conditions) and size-exclusion chromatography (Supplementary Fig. 2).

For smFRET experiments, cysteine residues within the p66 and p51 subunits were mutated to serine, and a cysteine residue was introduced at position 355 of p51 for dye-labeling using maleimide chemistry. An additional RT construct, designed to alter the +3 RTIC conformational landscape, was constructed by also mutating K281, R277, and K275 on the p51 subunit to alanine. Initial purification of all single-cysteine RT constructs was carried out as described above. Dye-labeling was carried out using 16-fold molar excess dye (5 μM RT and 80 μM Cy5-maleimide) in a 300 mM NaCl, 50 mM Tris-HCl (pH 7.2) buffer at room temperature for 15 minutes according to previously described methods. The labeling reaction was quenched by the addition of 5 mM β-met and excess dye was removed by two passages on a 10DG desalting column. Dye-labeled RT was further purified by SEC on a Superdex 200 (26/200) column (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 5 mM β-met). Specific p51 labeling was assessed by SDS-PAGE and imaging with a Storm 860 (Molecular Dynamics). Labeling efficiency was calculated by measuring the absorbance values of the labeled species at both 280 nm and 647 nm (Cy5 absorbance)[35]. These absorbance values were used to calculate the concentrations of RT and the Cy5 dye. Using the ratio between these two values, we estimate that our labeling efficiency is approximately 60%.
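
The absorbance-based estimate of labeling efficiency can be sketched with the Beer–Lambert law. Everything numerical below is an assumption made for illustration: the Cy5 extinction coefficient (250,000 M⁻¹ cm⁻¹) and the 280 nm correction factor (0.05) are commonly quoted nominal values, the RT extinction coefficient is a placeholder that should be replaced with the value appropriate for the construct, and the absorbance readings are invented rather than measured.

```python
def labeling_efficiency(
    a280: float,
    a647: float,
    eps_protein: float,
    eps_dye: float = 250_000.0,  # assumed nominal Cy5 extinction coefficient (M^-1 cm^-1)
    cf280: float = 0.05,         # assumed fraction of the dye's peak absorbance seen at 280 nm
    path_cm: float = 1.0,
) -> float:
    """Dye-to-protein molar ratio from absorbance at 280 nm and at the dye peak."""
    dye_conc = a647 / (eps_dye * path_cm)
    # Subtract the dye's own contribution at 280 nm before computing protein concentration.
    protein_conc = (a280 - cf280 * a647) / (eps_protein * path_cm)
    if protein_conc <= 0:
        raise ValueError("corrected A280 must be positive")
    return dye_conc / protein_conc


# Illustrative readings and a placeholder protein extinction coefficient.
EPS_RT_PLACEHOLDER = 260_000.0  # hypothetical value for the RT heterodimer (M^-1 cm^-1)
example_ratio = labeling_efficiency(a280=0.55, a647=0.30, eps_protein=EPS_RT_PLACEHOLDER)
print(f"estimated labeling efficiency: {example_ratio:.0%}")
```

A ratio below 1 means that a fraction of RT molecules carries no dye. Unlabeled RT binds without producing a FRET signal, which is why FRET-off lifetimes cannot be measured accurately.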

Cryo-EM of the +1 extended RTIC

The RTIC in low monovalent salt buffer with Mg²⁺ (75 mM NaCl, 6 mM MgCl₂, 10 mM Tris-HCl pH 8.0) containing 0.2% (w/v) OG (beta-octyl glucoside) was applied (3.0 μL at a concentration of 20 μM) to glow-discharged (25 s) holey carbon grids (Quantifoil Au, R2/1, 200 mesh) and subsequently vitrified using a Leica EM GP (2 s pre-blot, 2.5 s blot at 95% humidity). Automated data collection with SerialEM[53, 54] was performed on a Titan Krios (Thermo Fisher Scientific) at 300 kV with a Gatan K2 Summit direct detection camera and energy filter in counting mode with 200 ms exposure per frame. Data were collected at 50 frames per micrograph at a magnification of 130,000, which corresponds to 1.06 Å/pixel at the specimen plane. In total, 7,904 micrographs were collected at defocus values varying from −1.0 to −3.0 μm. The movie frames were motion-corrected and dose-weighted by MotionCor2 and CTF parameters were estimated by GCTF[55, 56]. See Supplementary Table 1 for data collection and processing statistics.

Cryo-EM data were processed using Relion (Supplementary Fig. 4 and Supplementary Table 1)[37, 38, 57, 58]. 3,052,652 particle projections were semi-automatically picked from the motion-corrected micrographs and sorted through four rounds of reference-free 2D classification. 930,060 particles belonging to classes with well-defined RT and RNA features were selected for further processing. An initial 3D model was obtained using Relion, based on the selected 2D classes, and used for 3D classification in Relion. An initial round of 3D classification was performed and particles with well-defined RNA and RT density were combined. These 676,661 particles were subjected to two parallel rounds of 3D classification. The first classification was performed without a mask and a single class exhibiting well-defined features for the vRNA and tRNA^Lys₃ was selected. This class was used to generate a loose mask for a final round of 3D refinement, which resulted in a map with a final resolution of 6.6 Å (global). The second classification was performed after refining the grouped particles and employed a mask that focused refinement on RT, the PBS helix, and a portion of the tRNA^Lys₃ helix. Two classes exhibiting well-defined features for RT and the PBS helix were selected for refinement. The selected particles were refined to a resolution of 4.1 Å (core). Additional rounds of classification and refinement (including CTF refinement and Bayesian polishing as implemented in Relion) were attempted for each of these processing pathways, but no improvement in resolution was observed. All resolution estimates were made according to the 0.143 “gold-standard” FSC criterion (Supplementary Fig. 4e,f). Maps were corrected for the modulation transfer function (MTF) of the K2 direct detection camera at 300 kV and then sharpened using B-factors of −200 Å² (core) and −184 Å² (global) during the post-processing step.
Local resolution estimate and angular distribution calculations on the core map were performed using Relion (Supplementary Fig. 4c,e). Cryo-EM figures were all generated using Chimera[59].
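
To make the 0.143 criterion concrete, the sketch below reads a resolution off a Fourier shell correlation curve by linear interpolation at the first crossing of the threshold. It is an illustration only: the FSC values are a toy curve invented for the example, not data from this study, and the resolutions reported here come from Relion's post-processing rather than from this function. The pixel size of 1.06 Å is taken from the data collection description and sets the Nyquist limit.

```python
from typing import Optional, Sequence


def resolution_at_threshold(
    frequencies: Sequence[float],  # spatial frequencies in 1/Å, increasing
    fsc: Sequence[float],          # FSC value at each frequency
    threshold: float = 0.143,
) -> Optional[float]:
    """Resolution (Å) at the first point where the FSC drops below the threshold."""
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells that bracket the crossing.
            fraction = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = frequencies[i - 1] + fraction * (frequencies[i] - frequencies[i - 1])
            return 1.0 / crossing
    return None  # the curve never falls below the threshold


PIXEL_SIZE_ANGSTROM = 1.06
nyquist_limit = 2.0 * PIXEL_SIZE_ANGSTROM  # finest resolution the sampling can support

# Toy FSC curve, invented for illustration only.
toy_frequencies = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
toy_fsc = [1.0, 0.98, 0.90, 0.60, 0.243, 0.043]

toy_resolution = resolution_at_threshold(toy_frequencies, toy_fsc)
print(f"toy curve crosses 0.143 at {toy_resolution:.2f} Å; Nyquist limit {nyquist_limit:.2f} Å")
```

Any resolution estimated this way should be coarser than the Nyquist limit of 2.12 Å, as is the case for all maps reported in this study.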

Cryo-EM of the +3 extended RTIC

The RTIC in low monovalent salt buffer with Mg²⁺ (75 mM NaCl, 6 mM MgCl₂, 10 mM Tris-HCl pH 8.0) containing 0.2% (w/v) OG was applied (3.0 μL at a concentration of 30 μM) to glow-discharged (25 s) holey carbon grids (Quantifoil Au, R2/1, 200 mesh) and subsequently vitrified using a Leica EM GP (2 s pre-blot, 1.5 s blot at 95% humidity). Cryo-EM imaging of the +3 RTIC mirrored that of the +1 RTIC, with 40 frames per micrograph at a magnification of 130,000, which corresponds to 1.06 Å/pixel at the specimen plane. In total, 7,336 micrographs were collected at defocus values varying from −1.5 to −3.5 μm. The movie frames were motion-corrected and dose-weighted by MotionCor2 and CTF parameters were estimated by GCTF[55, 56]. See Supplementary Table 1 for data collection and processing statistics.

Cryo-EM data were processed using Relion (Supplementary Fig. 6 and Supplementary Table 1)[37, 38, 57, 58]. 2,911,135 particles were semi-automatically picked from the motion-corrected micrographs and sorted through four rounds of reference-free 2D classification. 902,164 particles belonging to classes with well-defined RT and RNA features were selected for further processing. An initial 3D model was obtained using Relion, based on the selected 2D classes, and used for 3D classification in Relion. An initial round of 3D classification was performed and three classes with well-defined RNA and RT density were identified. Classes with similar RT and RNA conformations were combined. The first set of particles, which adopted a pre-translocation conformation, was refined to resolutions of 4.2 Å (core, masked) and 4.9 Å (global, unmasked). Additional rounds of classification and refinement were attempted, but no improvement in resolution or additional heterogeneity was observed. The second set of classes was refined and subjected to an additional round of 3D classification using a mask and fine angular sampling. Two distinct conformations of the RT–PBS core were identified and the corresponding classes were combined and refined separately. The first class (intermediate) was refined to resolutions of 4.7 Å (core, masked) and 5.3 Å (global, unmasked). The second class (displaced) was refined to resolutions of 4.5 Å (core, masked) and 5.3 Å (global, unmasked). Again, attempts to further classify and refine these classes led to no improvements in resolution or identification of additional conformational states. For all three sets of particles (pre-translocation, intermediate, and displaced), CTF refinement and Bayesian polishing showed no improvement in the resolution or density. All resolution estimates were made according to the 0.143 “gold-standard” FSC criterion (Supplementary Fig. 7a and Supplementary Table 1).
Maps were corrected for the modulation transfer function (MTF) of the K2 direct detection camera at 300 kV and then sharpened using B-factors of −250 Å² (pre-translocation core), −200 Å² (pre-translocation global), −275 Å² (intermediate core), −200 Å² (intermediate global), −250 Å² (displaced core) and −200 Å² (displaced global) during the post-processing step. Local resolution estimates and angular distribution calculations on the core maps were performed using Relion (Supplementary Fig. 7a-c). Cryo-EM figures were all generated using Chimera or ChimeraX.

Modeling and refinement

For all modeling, the crystal structure of RT bound to a DNA/DNA duplex (PDB: 3V81), with the nucleic acid substrate and ligands removed, was used as the starting model for RT[31]. Modeling of the RNA for all core maps followed a similar workflow. For the +3dis and +3int states, an initial round of Phenix real-space-refinement was performed on RT using secondary structure restraints and morphing in order to fit the RT polymerase domains into the shifted density[41, 42]. Initial RNA models for the +1 core (extended PBS: tRNA^Lys₃ nt 55–77, vRNA nt 203–181; coaxially stacked tRNA^Lys₃: nt 1–4, 54–51) and +3 core (vRNA H2: nt 134–137, 178–175; extended PBS: tRNA^Lys₃ nt 55–79, vRNA nt 203–179; coaxially stacked tRNA^Lys₃: nt 1–8, 46–54) were docked into the unsharpened density using Chimera. Due to high convergence, 100 initial models of the RNA were built using DRRAFTER[39]. The RNA was allowed to move as a rigid body relative to the protein and all RNA residues were allowed to rebuild during both the initial low-resolution and final high-resolution stages of modeling. For the low-resolution stage, 1000 Monte Carlo cycles were used (flag: -cycles 1000). Two rounds of minimization were performed (flag: -minimize_rounds 2). To preserve the disulfide crosslink between the PBS and RT, the models were further restrained using a distance constraint of 6.8 Å between the CA of C258 (RT p66) and the N2 of G73 (tRNA). The top 10 scoring DRRAFTER models for each map were visually inspected in Chimera and Coot[40, 59]. Models exhibiting the best fit to the density were chosen for Phenix (1.17) real-space-refinement with secondary structure restraints in place for the RNA and protein. To further restrain the models during refinement, a loose bond constraint between C258 (CA) and G73 (N2) was put into place as previously described[30]. The model was visually inspected and manually adjusted in Coot[40]. 
Due to insufficient resolution, regions of RT did not exhibit reasonable density for sidechains. Therefore, portions of each RT model were truncated to a main-chain backbone during final inspection. Model quality was assessed using the comprehensive model validation tool from Phenix (1.17) and MolProbity (Supplementary Table 2) as well as by comparing the map-vs-model FSC with the FSC of the cryo-EM reconstruction (Supplementary Fig. 5a)[41, 60]. A cross-validation test was performed to assess overfitting. Atoms of the final model were perturbed by an average of 0.4 Å and one round of Phenix real-space-refinement was performed against one half map of the gold-standard refinement. The resulting map-vs-model FSC curves of this refined model against the same half map (FSCwork) and the second half map (FSCtest) do not diverge significantly, indicating the absence of overfitting (Supplementary Fig. 5b).
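
The coordinate perturbation step of the cross-validation test can be pictured with the short Python sketch below. The text does not state how the 0.4 Å perturbation was generated, so this is a generic illustration rather than the Phenix procedure: random displacements are rescaled so that their mean magnitude equals the target value. The function name `perturb_coordinates` and the random seed are assumptions for this example.

```python
import numpy as np

def perturb_coordinates(coords, mean_shift=0.4, seed=0):
    """Randomly displace atoms so the mean displacement equals `mean_shift` (Å).

    coords: (N, 3) array of Cartesian coordinates in Å.
    Returns a new (N, 3) array; the input is left unchanged.
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    # Isotropic random displacement vectors, one per atom.
    shifts = rng.normal(size=coords.shape)
    magnitudes = np.linalg.norm(shifts, axis=1)
    # Rescale so the average displacement length matches the target.
    shifts *= mean_shift / magnitudes.mean()
    return coords + shifts

# Illustrative use on placeholder coordinates (not a real model).
model = np.zeros((1000, 3))
shaken = perturb_coordinates(model)
print(np.linalg.norm(shaken - model, axis=1).mean())
```

The perturbed model would then be refined against one half map and compared with both half maps, as described above.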

Single-molecule FRET experiments

The overall single-molecule FRET experimental protocol was adapted from Coey et al. 2017. Quartz microscope slides were prepared for each experiment, where the surfaces were initially derivatized with Biotin-PEG molecules and incubated with 1 μM Neutravidin for 5 minutes, yielding a PEG-Neutravidin-derivatized surface. Unbound Neutravidin was washed away with cryo-EM imaging buffer (75 mM NaCl, 6 mM MgCl₂, 10 mM Tris-HCl pH 8.0). Biotinylated vRNA-tRNA complexes in the same EM imaging buffer were immobilized to the surface via the biotin-Neutravidin interaction. (Biotin)-vRNA–(Cy3)-tRNA^Lys₃ complexes (tRNA^Lys₃ either synthetically extended with three deoxyribose nucleotides or unextended) were formed before the experiment and purified over a Superdex 200 (26/60) gel filtration column as in the cryo-EM sample preparation, and their concentration was determined via RNA absorbance (A260). Following immobilization of the (Biotin)-vRNA–(Cy3)-tRNA^Lys₃ complexes to the slide surface, excess complexes were washed away with cryo-EM imaging buffer containing oxygen-scavenging and photostabilizing reagents (2.5 mM 3,4-dihydroxybenzoic acid, 250 nM protocatechuate dioxygenase, 2.5 mM TSY (Trolox-like triplet state quencher supplied by Pacific Biosciences)) without OG (beta-octyl glucoside). The density of single-complex immobilization was checked on a prism-based total internal reflection microscope for each condition at different concentrations of (Biotin)-vRNA–(Cy3)-tRNA^Lys₃ complexes, ranging from 100 to 500 pM (concentration calculated from A260 absorbance on a NanoDrop instrument, adjusted for the estimated 20% biotin-labeling efficiency). Single-molecule experiments were carried out by delivering cryo-EM imaging buffer containing oxygen-scavenging and photostabilizing reagents and 40 nM Cy5-labeled RT (wild-type or p51 mutant) to each slide chamber containing immobilized RTIC molecules. 
Fluorescence data for the Cy3 and Cy5 channels were recorded as a 5-minute movie with a 200-millisecond exposure time per frame on a microscope equipped with a QV2 (Photometrics) and an EMCCD camera (Andor Technology), with excitation by the 532-nm laser only.

The resulting movies were analyzed using in-house MATLAB (MathWorks) scripts. First, the two fluorescence channels (Cy3 and Cy5) in each movie were aligned and molecules were selected based on high intensity in both channels. Individual Cy3 and Cy5 traces for each molecule were manually inspected for anti-correlation, a characteristic of single-molecule FRET. FRET on and off states were automatically assigned using a threshold of 0.3 FRET efficiency (calculated as the ratio of the Cy5 fluorescence to the sum of the Cy3 and Cy5 fluorescence for each frame), and manually inspected for anti-correlation at the beginning and end of each event. Collected FRET-on frames were further corrected for possible differences in the detection efficiency of the Cy3 and Cy5 channels (gamma factor correction). The resulting FRET efficiency distributions were fitted with a single Gaussian distribution and FRET lifetimes were fitted with a single-exponential distribution. Statistical tests such as the two-sample Student’s t-test and F-test were employed as indicated in the text.
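
The thresholding step can be sketched in a few lines of Python. This is not the in-house MATLAB code; it assumes the conventional definition of apparent FRET efficiency (acceptor intensity over total intensity, with an optional gamma factor weighting the donor), and the function names and example trace are hypothetical. For a single-exponential dwell-time distribution, the mean event duration is the maximum-likelihood estimate of the lifetime, which is what `mean_lifetime` returns (ignoring frame discretization and truncated events).

```python
import numpy as np

def fret_efficiency(cy3, cy5, gamma=1.0):
    """Apparent FRET efficiency per frame: Cy5 / (Cy5 + gamma * Cy3)."""
    cy3 = np.asarray(cy3, dtype=float)
    cy5 = np.asarray(cy5, dtype=float)
    return cy5 / (cy5 + gamma * cy3)

def fret_on_events(efficiency, threshold=0.3):
    """Return (start, stop) frame indices of FRET-on runs; stop is exclusive."""
    on = np.asarray(efficiency) >= threshold
    # Pad with False so events touching either end of the trace are closed.
    padded = np.concatenate(([False], on, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(edges[0::2], edges[1::2])]

def mean_lifetime(events, frame_time=0.2):
    """Mean FRET-on duration in seconds (200 ms per frame by default)."""
    durations = np.array([stop - start for start, stop in events]) * frame_time
    return float(durations.mean())

# Hypothetical background-corrected intensities for one molecule.
cy3 = [100, 100, 20, 20, 20, 100, 30, 100]
cy5 = [0, 0, 80, 80, 80, 0, 70, 0]
events = fret_on_events(fret_efficiency(cy3, cy5))
print(events, mean_lifetime(events))
```

Manual inspection for anti-correlation at the start and end of each event, as described above, is not captured by this sketch.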

Activity Assays

For all activity assays, the RTIC (+0 and +3 extended), RT (WT and mutants), and vRNA–tRNA^Lys₃ (+0 and +3 extended) were all purified as described above.

Time course Assay

Time course incorporation assays were performed as previously described[30]. In brief, +3 RTIC (200 nM) was preincubated for 20 min at 37°C in 50 mM Tris-HCl, pH 8.0, 50 mM KCl, 2.5 mM MgCl₂. Free vRNA–(+3)-tRNA^Lys₃ (200 nM) and RT (2 μM) were also preincubated for 20 min under the same conditions. Incorporation reactions were started by the addition of α−³²P-dCTP (50 nM) and dCTP (50 μM). Reactions were quenched at a range of times, run on a 4–20% SDS-PAGE gel, dried, exposed for 18 hours on a phosphorimager screen (Molecular Dynamics) and imaged with a Storm 860 (Molecular Dynamics). Bands were quantified using ImageQuant. After background subtraction, intensity was normalized to the highest band intensity (set to 1) within each individual time course assay. All time course assays required no special equipment[6, 30]. Plotting and curve fitting were performed in Prism.
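
The normalization and the two-process relationship given in the SI Fig. 3b legend can be written out as a short Python sketch. Curve fitting in this work was done in Prism; the code below only evaluates the model with the best-fit values reported in that legend for the crosslinked +3 RTIC. The function names and the example band intensities are assumptions for illustration.

```python
import numpy as np

def normalize_intensities(intensities, background):
    """Subtract background, then scale so the strongest band equals 1."""
    corrected = np.asarray(intensities, dtype=float) - background
    return corrected / corrected.max()

def two_process(t, A, k_pol, B, k_slow):
    """Intensity = A(1 - exp(-k_pol t)) + B(1 - exp(-k_slow t)), t in seconds."""
    t = np.asarray(t, dtype=float)
    return A * (1.0 - np.exp(-k_pol * t)) + B * (1.0 - np.exp(-k_slow * t))

# Best-fit values quoted in the SI Fig. 3b legend (crosslinked +3 RTIC).
crosslinked = dict(A=0.2571, k_pol=0.09998, B=0.7439, k_slow=0.001632)

# Hypothetical raw band intensities, only to show the normalization step.
print(normalize_intensities([120.0, 340.0, 560.0], background=100.0))

# Model curve at a few illustrative time points (seconds).
times = np.array([0.0, 15.0, 60.0, 600.0, 3600.0])
print(two_process(times, **crosslinked))
```

Setting `A=0` reduces the model to the single slow process used for the uncrosslinked complex.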

Reverse transcriptase activity assay

Reverse transcription activity assays were performed as described in Coey et al. (2018) and used a similarly described tRNA^Lys₃ primer labeled on the 5’ end with Cyanine3 (TriLink)[35]. For reactions beginning from the +3 extension state, the tRNA primer was chemically synthesized with the appropriate 3’ end dNTPs. Mixtures containing 50 mM Tris-HCl (pH 8.0), 50 mM KCl, 6 mM MgCl₂, 5 mM βME, 3 μM RT, and 200 nM vRNA–tRNA^Lys₃ duplex were pre-incubated at 37 °C for 5 minutes. For time course assays used to assess pausing patterns, stoichiometric amounts of RT and vRNA–tRNA^Lys₃ duplex were used in order to enhance band intensities (200 nM RT and vRNA–tRNA^Lys₃). The addition of 100 μM dNTP mix initiated each reaction, which was performed at 37 °C and quenched with an equal volume of 90% formamide (50 mM EDTA) at the appropriate time. Ladders for time-course assays were generated using the same setup with 25 μM of the ddNTP of interest added to the dNTP mix. Single time-point extension reactions were quenched after 1 hour or 2 minutes (in triplicate), ladder reactions were quenched after 2 hours, and time-course reactions were quenched at their respective time-points. Each activity assay was visualized on an 8.5% urea-polyacrylamide gel that was pre-run at 100 W for 2 hours and run at 120 W for 3 hours. Samples were heated at 95°C for 5 minutes and 85°C running buffer was added to the top chamber immediately before loading. Gels were scanned on a Typhoon Trio scanner (Amersham Biosciences) and band intensities were quantified using the ImageQuant software (Molecular Dynamics) or ImageJ[61]. For single time-point extensions, each condition was run in triplicate. Values were plotted as the mean ± standard deviation, with individual points shown (Supplementary Fig. 3c,f).
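
The percent-pausing calculation stated in the SI Fig. 3f legend, (+3 extended primer)/(total unextended + extended primer)*100, together with the mean ± standard deviation summary, can be sketched as follows. The band intensities are hypothetical, the function names are illustrative, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which form was used.

```python
import statistics

def percent_pausing(plus3_band, all_bands):
    """Percent of primer paused at +3.

    plus3_band: intensity of the +3 extended primer band.
    all_bands: intensities of every primer band in the lane
               (unextended plus all extended products, including +3).
    """
    return 100.0 * plus3_band / sum(all_bands)

def mean_and_sd(values):
    """Mean and sample standard deviation of replicate measurements."""
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical band intensities for three replicate lanes.
lanes = [
    {"plus3": 30.0, "bands": [50.0, 30.0, 20.0]},
    {"plus3": 31.0, "bands": [49.0, 31.0, 20.0]},
    {"plus3": 29.0, "bands": [51.0, 29.0, 20.0]},
]
pausing = [percent_pausing(lane["plus3"], lane["bands"]) for lane in lanes]
mean, sd = mean_and_sd(pausing)
print(f"+3 pausing: {mean:.2f} ± {sd:.2f}%")
```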

Supplementary Material

SI Fig 1. Purification of the +0 and +3 extended vRNA–tRNA^Lys₃ complexes. a, The initiation phase (orange) of HIV-1 reverse transcription, which encompasses the addition of the first six nucleotides, is slow (~0.2 nt/s) and non-processive. Addition of the third nucleotide (red) marks the first major RT stalling point during initiation. After addition of the sixth nucleotide, the rapid (50-100 nt/s) elongation phase (green) begins. b, Purification of the vRNA/tRNA^Lys₃ complex by size exclusion chromatography (SEC). c, Native PAGE analysis of the free vRNA, tRNA^Lys₃, and purified vRNA/tRNA^Lys₃. d, Purification of the vRNA/+3 tRNA^Lys₃ complex by size exclusion chromatography. The complex elutes as two peaks, indicative of a higher order species. e, Native PAGE analysis of the free vRNA, tRNA^Lys₃, and two vRNA–(+3)tRNA^Lys₃ peaks. Gel was run at room temperature and the early eluting SEC species (peak 2) has a similar mobility to peak 1. f, Native PAGE analysis performed at 4°C of the two SEC peaks, indicating the presence of a higher order species in peak 2.

SI Fig 2. Purification of the +1 and +3 RTICs. a, Anion-exchange chromatogram illustrating the purification of the +1 RTIC away from free RT and vRNA–tRNA^Lys₃. b, Size exclusion chromatogram of the +1 RTIC. c, Anion-exchange chromatogram illustrating the purification of the +3 RTIC away from free RT and vRNA–(+3)tRNA^Lys₃. The vRNA–(+3)tRNA^Lys₃ complex separates poorly from the +3 RTIC. d, Size exclusion chromatogram of the +3 RTIC. e, Native PAGE analysis of the free (+3)tRNA, vRNA, vRNA–(+3)tRNA^Lys₃, and +3 RTIC. All species run as a single band.

SI Fig. 3. Activity of the +3 RTICs and RT constructs. a, Example autoradiograph images of the free and crosslinked +3 RTICs incorporating an incoming α−³²P-dCTP nucleotide after 37°C incubation across a range of times. Samples appear as a doublet due to partial RNA denaturation during the SDS PAGE step. b, Results of crosslinked +3 RTIC (squares) and uncrosslinked +3 RTIC (circles) α−³²P-dCTP incorporation assays were quantified and fit using the relationship $\mathrm{Intensity} = A\,(1 - e^{-k_{pol} t}) + B\,(1 - e^{-k_{slow} t})$ for the crosslinked +3 RTIC and the relationship $\mathrm{Intensity} = B\,(1 - e^{-k_{slow} t})$ for the uncrosslinked +3 RTIC, where A and B represent the amplitudes of the fast and slow processes respectively, k_pol is the apparent extension rate constant, k_slow is the rate of the slow process, and t is time. The second relationship was used for the uncrosslinked data, as the slow process appears to dominate incorporation. The best fits were obtained with: A = 0.2571 AU, k_pol = 0.09998 s⁻¹, B = 0.7439, k_slow = 0.001632 s⁻¹ for the crosslinked +3 RTIC; B = 0.9904, k_slow = 0.003338 s⁻¹ for the uncrosslinked +3 RTIC. Assays were independently repeated three times to ensure reproducibility. c, Relative activities, judged by primer usage, of wild-type, wild-type/single cysteine (A355C), triple-mutant (K281A, R277A, K275A), and triple-mutant/single cysteine (K281A, R277A, K275A, A355C). Values are mean ± s.d. (n=3 independent experiments). d, Time course reverse transcription extension assays performed using wild-type and triple mutant (K281A, R277A, K275A) reverse transcriptase under stoichiometric conditions in order to better visualize RT pausing patterns. Time points are: 0s, 15s, 30s, 1m, 2m, 4m, 6m, 10m, 30m, 45m, 60m, and 150m (s=seconds, m=minutes). Wild-type RT gel contains a ddNTP sequencing ladder. e, Triplicate (2m) assay performed on wild-type and triple mutant RT to assess the amount of +3 pausing. f, Percentage of +3 pausing from panel e (wild-type RT=30.45±0.16%, triple mutant=24.99±0.43%). Percent pausing was calculated using band intensities as (+3 extended primer)/(total unextended + extended primer)*100. Values are mean ± s.d. (n=3 independent experiments).

SI Fig. 4. Cryo-EM of the +1 RTIC. a, Representative cryo-EM image of the +1 RTIC. All 7,790 micrographs used for processing have a similar appearance. b, Representative 2D class averages of the +1 RTIC. c, Angular distribution of particles in the +1 RTIC 4.1 Å core map. The length of each projection is proportional to the number of assigned particles. d, Gold standard FSC curves of the +1 RTIC core cryo-EM map. e, The final 4.1 Å core map colored according to local resolution as estimated by Relion. f, Gold standard FSC curves of the +1 RTIC global 6.6 Å cryo-EM map. g, Data processing workflow for the +1 RTIC core and global cryo-EM maps.

SI Fig. 5. Model validation and fit to density. a, Map-vs-model FSC curves for all four core model/map pairs (+1pre, +3pre, +3int, +3dis). The 0.5 cutoff is indicated. b, Cross-validation test FSC curves to assess overfitting for all four core model/map pairs. The map-vs-model FSC curves for the FSCwork and FSCtest do not diverge significantly, indicating an absence of overfitting. c-g, Representative regions of the four core maps fitted with their respective models prior to any side chain trimming. The maps display a variable amount of side chain density, as expected at this resolution range. A view for the extended PBS and co-axially stacked tRNA^Lys₃ helices fit into the corresponding maps are also shown (tRNA=red, vRNA=yellow). Phosphates of the RNA backbone are partially resolved.

SI Fig. 6. Data processing workflow for the +3 RTIC complexes. a, Data processing workflow for the +3 RTIC core and global cryo-EM maps. The intermediate and displaced maps required an additional round of 3D classification in order to separate them. b, Results of an additional round of 3D classification on the +3 RTIC global maps. The apical regions of viral RNA H2 and the extended tRNA^Lys₃ helix adopt varying orientations with regard to the RTIC core.

SI Fig. 7. +3 RTIC cryo-EM resolution estimates and validation. a, Gold standard FSC curves of the +3 RTIC core cryo-EM maps (labeled). b, Gold standard FSC curves of the +3 RTIC global cryo-EM maps (labeled). c, Angular distribution of particles in the +3 RTIC core maps. The length of each projection is proportional to the number of assigned particles. d, The final +3 RTIC core maps colored according to local resolution as estimated by Relion. Each map has its own resolution scale.

SI Fig. 8. +3 RTIC thumb alignments. Alignment of the +3pre (gray), +3int (brown), and +3dis (green) thumb domain (237-381) with previously solved RT–DNA or RT–RNA structures does not reveal substantial distortions due to the non-canonical movements exhibited in the cryo-EM data. Alignment was performed on the protein backbone and the backbone RMSD was calculated relative to the respective +3 RTIC models.

SI Fig. 9. Single-molecule FRET of the RTIC. a, Labeling positions of the +3 RTIC and +0 RTIC for single-molecule FRET experiments. Position 4 (C) of the tRNA^Lys₃ is labeled with cyanine3 (green) and residue 355 of RT p51 is labeled with cyanine5 (red). Distances between the C6 (cytidine 4) and the CA of 355 are indicated for the +3pre (gray) and +1pre (red/yellow) states. b, Cartoon of the single-molecule RT setup. Biotinylated and Cy3 (donor dye) labeled vRNA–tRNA^Lys₃ complexes are immobilized on a quartz slide with a Neutravidin-PEG surface. RT labeled with Cy5 (acceptor) is added after immobilization and remains uncrosslinked. c, A representative field of view from the TIRFM setup. Single-molecule fluorescence data are taken as a continuous movie for 300 seconds, with a 200-millisecond exposure time for each frame. The first (top) and last (bottom) frames of a typical movie are shown here, where Cy3 (left) and Cy5 (right) fluorescence signals are collected, split, and recorded simultaneously. A relatively high concentration of Cy5-labeled RT (40 nM) induces higher background in the Cy5 channel, as well as its non-specific adsorption to the surface. d, Calculated FRET-on state times (+1pre: n=2446, +3pre: n=755, +3int: n=1053, +3dis: n=1734). Error bars represent 95% confidence intervals after fitting a single-exponential distribution. e, FRET efficiency mean and variance for the +0 and +3 RTICs after filtering out the first and last frames for each FRET event. The same trends are observed in the unfiltered data displayed in Fig. 8d.

NIHMS1604038-supplement-1.doc (27.3 MB, doc)

HIV-1 reverse transcription initiation is slow and non-processive, with discrete stalling points.
Cryo-EM and single-molecule FRET were used to structurally characterize the HIV-1 reverse transcriptase initiation complex (RTIC) trapped at the +1 and +3 extension states.
The +3 extended RTIC exhibits conformational heterogeneity that underpins stalling and provides new insight into reverse transcription regulation.

Acknowledgements

We thank G. Skiniotis and E. Montabana for electron microscopy advice and support, N.R. Latorraca for comments on the manuscript and computational support, and S. Fromm for cryo-EM processing advice. Supported by National Institutes of Health grant GM082545 to E.V.P., T32-GM008294 (Molecular Biophysics Training Program) to K.P.L. and K.K., National Science Foundation Graduate Research Fellowship Program (DGE-114747 and DGE-1656518) to K.K. and L.N.J., Gabilan Stanford Graduate Fellowship to K.K., the Stanford Interdisciplinary Graduate Fellowship (Bio-X) to J.C., and the Knut and Alice Wallenberg Foundation postdoctoral scholarship to J.Z. (no. 2015.0406). We thank Stanford University and the Stanford Research Computing Center for providing the Sherlock cluster resources. Additional calculations were performed on the Stanford BioX3 cluster, supported by NIH Shared Instrumentation Grant 1S10RR02664701.

Footnotes

Accession numbers: The coordinates and cryo-EM maps have been deposited in the Protein Data Bank and Electron Microscopy Data Bank under accession codes 6WAZ and EMDB-21582 (+1pre), EMDB-22002 (+1pre, global), 6WB0 and EMDB-21583 (+3pre), EMDB-22003 (+3pre, global/unmasked), 6WB1 and EMDB-21584 (+3int), EMDB-22004 (+3int, global/unmasked), 6WB2 and EMDB-21585 (+3dis), and EMDB-22005 (+3dis, global/unmasked).

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Bibliography

[1]. Baltimore D. RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature. 1970;226:1209–11.
[2]. Gilboa E, Mitra SW, Goff S, Baltimore D. A detailed model of reverse transcription and tests of crucial aspects. Cell. 1979;18:93–100.
[3]. Temin HM, Mizutani S. RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature. 1970;226:1211–3.
[4]. Hu WS, Hughes SH. HIV-1 reverse transcription. Cold Spring Harb Perspect Med. 2012;2.
[5]. Isel C, Lanchy JM, Le Grice SF, Ehresmann C, Ehresmann B, Marquet R. Specific initiation and switch to elongation of human immunodeficiency virus type 1 reverse transcription require the post-transcriptional modifications of primer tRNA3Lys. EMBO J. 1996;15:917–24.
[6]. Lanchy JM, Ehresmann C, Le Grice SF, Ehresmann B, Marquet R. Binding and kinetic properties of HIV-1 reverse transcriptase markedly differ during initiation and elongation of reverse transcription. EMBO J. 1996;15:7178–87.
[7]. Isel C, Ehresmann C, Marquet R. Initiation of HIV Reverse Transcription. Viruses. 2010;2:213–43.
[8]. Larsen KP, Choi J, Prabhakar A, Puglisi EV, Puglisi JD. Relating Structure and Dynamics in RNA Biology. Cold Spring Harb Perspect Biol. 2019;11.
[9]. Goldschmidt V, Paillart JC, Rigourd M, Ehresmann B, Aubertin AM, Ehresmann C, et al. Structural variability of the initiation complex of HIV-1 reverse transcription. J Biol Chem. 2004;279:35923–31.
[10]. Beerens N, Berkhout B. The tRNA primer activation signal in the human immunodeficiency virus type 1 genome is important for initiation and processive elongation of reverse transcription. J Virol. 2002;76:2329–39.
[11]. Beerens N, Groot F, Berkhout B. Initiation of HIV-1 reverse transcription is regulated by a primer activation signal. J Biol Chem. 2001;276:31247–56.
[12]. Beerens N, Jepsen MD, Nechyporuk-Zloy V, Kruger AC, Darlix JL, Kjems J, et al. Role of the primer activation signal in tRNA annealing onto the HIV-1 genome studied by single-molecule FRET microscopy. RNA. 2013;19:517–26.
[13]. Goldschmidt V, Rigourd M, Ehresmann C, Le Grice SF, Ehresmann B, Marquet R. Direct and indirect contributions of RNA secondary structure elements to the initiation of HIV-1 reverse transcription. J Biol Chem. 2002;277:43233–42.
[14]. Isel C, Keith G, Ehresmann B, Ehresmann C, Marquet R. Mutational analysis of the tRNA3Lys/HIV-1 RNA (primer/template) complex. Nucleic Acids Res. 1998;26:1198–204.
[15]. Isel C, Marquet R, Keith G, Ehresmann C, Ehresmann B. Modified nucleotides of tRNA(3Lys) modulate primer/template loop-loop interaction in the initiation complex of HIV-1 reverse transcription. J Biol Chem. 1993;268:25269–72.
[16]. Iwatani Y, Rosen AE, Guo J, Musier-Forsyth K, Levin JG. Efficient initiation of HIV-1 reverse transcription in vitro. Requirement for RNA sequences downstream of the primer binding site abrogated by nucleocapsid protein-dependent primer-template interactions. J Biol Chem. 2003;278:14185–95.
[17]. Liang C, Li X, Rong L, Inouye P, Quan Y, Kleiman L, et al. The importance of the A-rich loop in human immunodeficiency virus type 1 reverse transcription and infectivity. J Virol. 1997;71:5750–7.
[18]. Liang C, Rong L, Gotte M, Li X, Quan Y, Kleiman L, et al. Mechanistic studies of early pausing events during initiation of HIV-1 reverse transcription. J Biol Chem. 1998;273:21309–15.
[19]. Ratner L, Haseltine W, Patarca R, Livak KJ, Starcich B, Josephs SF, et al. Complete nucleotide sequence of the AIDS virus, HTLV-III. Nature. 1985;313:277–84.
[20]. Wilkinson KA, Gorelick RJ, Vasa SM, Guex N, Rein A, Mathews DH, et al. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states. PLoS Biol. 2008;6:e96.
[21]. Watts JM, Dang KK, Gorelick RJ, Leonard CW, Bess JW Jr, Swanstrom R, et al. Architecture and secondary structure of an entire HIV-1 RNA genome. Nature. 2009;460:711–6.
[22]. Sukosd Z, Andersen ES, Seemann SE, Jensen MK, Hansen M, Gorodkin J, et al. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain. Nucleic Acids Res. 2015;43:10168–79.
[23]. Beerens N, Berkhout B. Switching the in vitro tRNA usage of HIV-1 by simultaneous adaptation of the PBS and PAS. RNA. 2002;8:357–69.
[24]. Goldschmidt V, Ehresmann C, Ehresmann B, Marquet R. Does the HIV-1 primer activation signal interact with tRNA3(Lys) during the initiation of reverse transcription? Nucleic Acids Res. 2003;31:850–9.
[25]. Ooms M, Cupac D, Abbink TE, Huthoff H, Berkhout B. The availability of the primer activation signal (PAS) affects the efficiency of HIV-1 reverse transcription initiation. Nucleic Acids Res. 2007;35:1649–59.
[26]. Suo Z, Johnson KA. RNA secondary structure switching during DNA synthesis catalyzed by HIV-1 reverse transcriptase. Biochemistry. 1997;36:14778–85.
[27]. Suo Z, Johnson KA. Effect of RNA secondary structure on the kinetics of DNA synthesis catalyzed by HIV-1 reverse transcriptase. Biochemistry. 1997;36:12459–67.
[28]. Lanchy JM, Keith G, Le Grice SF, Ehresmann B, Ehresmann C, Marquet R. Contacts between reverse transcriptase and the primer strand govern the transition from initiation to elongation of HIV-1 reverse transcription. J Biol Chem. 1998;273:24425–32.
[29]. Liu S, Harada BT, Miller JT, Le Grice SF, Zhuang X. Initiation complex dynamics direct the transitions between distinct phases of early HIV reverse transcription. Nat Struct Mol Biol. 2010;17:1453–60.
[30]. Larsen KP, Mathiharan YK, Kappel K, Coey AT, Chen DH, Barrero D, et al. Architecture of an HIV-1 reverse transcriptase initiation complex. Nature. 2018;557:118–22.
[31]. Das K, Martinez SE, Bauman JD, Arnold E. HIV-1 reverse transcriptase complex with DNA and nevirapine reveals non-nucleoside inhibition mechanism. Nat Struct Mol Biol. 2012;19:253–9.
[32]. Das K, Martinez SE, DeStefano JJ, Arnold E. Structure of HIV-1 RT/dsRNA initiation complex prior to nucleotide incorporation. Proc Natl Acad Sci U S A. 2019;116:7308–13.
[33]. Brule F, Marquet R, Rong L, Wainberg MA, Roques BP, Le Grice SF, et al. Structural and functional properties of the HIV-1 RNA-tRNA(Lys)3 primer complex annealed by the nucleocapsid protein: comparison with the heat-annealed complex. RNA. 2002;8:8–15.
[34]. Coey A, Larsen K, Puglisi JD, Viani Puglisi E. Heterogeneous structures formed by conserved RNA sequences within the HIV reverse transcription initiation site. RNA. 2016;22:1689–98.
[35]. Coey AT, Larsen KP, Choi J, Barrero DJ, Puglisi JD, Puglisi EV. Dynamic Interplay of RNA and Protein in the Human Immunodeficiency Virus-1 Reverse Transcription Initiation Complex. J Mol Biol. 2018;430:5137–50.
[36]. Puglisi EV, Puglisi JD. Secondary structure of the HIV reverse transcription initiation complex by NMR. J Mol Biol. 2011;410:863–74.
[37]. Scheres SH. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol. 2012;180:519–30.
[38]. Scheres SH. Semi-automated selection of cryo-EM particles in RELION-1.3. J Struct Biol. 2015;189:114–22.
[39]. Kappel K, Liu S, Larsen KP, Skiniotis G, Puglisi EV, Puglisi JD, et al. De novo computational RNA modeling into cryo-EM maps of large ribonucleoprotein complexes. Nat Methods. 2018;15:947–54.
[40]. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126–32.
[41]. Adams PD, Afonine PV, Bunkoczi G, Chen VB, Davis IW, Echols N, et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr. 2010;66:213–21.
[42]. Afonine PV, Poon BK, Read RJ, Sobolev OV, Terwilliger TC, Urzhumtsev A, et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D Struct Biol. 2018;74:531–44.
[43]. Jaeger J, Restle T, Steitz TA. The structure of HIV-1 reverse transcriptase complexed with an RNA pseudoknot inhibitor. EMBO J. 1998;17:4535–42.
[44]. Rodgers DW, Gamblin SJ, Harris BA, Ray S, Culp JS, Hellmig B, et al. The structure of unliganded reverse transcriptase from the human immunodeficiency virus type 1. Proc Natl Acad Sci U S A. 1995;92:1222–6.
[45]. Schauer GD, Huber KD, Leuba SH, Sluis-Cremer N. Mechanism of allosteric inhibition of HIV-1 reverse transcriptase revealed by single-molecule and ensemble fluorescence. Nucleic Acids Res. 2014;42:11687–96.
[46]. Murphy MC, Rasnik I, Cheng W, Lohman TM, Ha T. Probing single-stranded DNA conformational flexibility using fluorescence spectroscopy. Biophys J. 2004;86:2530–7.
[47]. Lanchy JM, Isel C, Keith G, Le Grice SF, Ehresmann C, Ehresmann B, et al. Dynamics of the HIV-1 reverse transcription complex during initiation of DNA synthesis. J Biol Chem. 2000;275:12306–12.
[48]. Petrov A, Wu T, Puglisi EV, Puglisi JD. RNA purification by preparative polyacrylamide gel electrophoresis. Methods Enzymol. 2013;530:315–30.
[49]. Larsen KP, Mathiharan YK, Kappel K, Coey AT, Chen D-H, Barrero D, et al. Architecture of an HIV-1 reverse transcriptase initiation complex. Nature. 2018.
[50]. Huang H, Chopra R, Verdine GL, Harrison SC. Structure of a covalently trapped catalytic complex of HIV-1 reverse transcriptase: implications for drug resistance. Science. 1998;282:1669–75.
[51]. Huang H, Harrison SC, Verdine GL. Trapping of a catalytic HIV reverse transcriptase*template:primer complex through a disulfide bond. Chem Biol. 2000;7:355–64.
[52]. Gotte M, Fackler S, Hermann T, Perola E, Cellai L, Gross HJ, et al. HIV-1 reverse transcriptase-associated RNase H cleaves RNA/RNA in arrested complexes: implications for the mechanism by which RNase H discriminates between RNA/RNA and RNA/DNA. EMBO J. 1995;14:833–41.
[53]. Mastronarde DN. Automated electron microscope tomography using robust prediction of specimen movements. J Struct Biol. 2005;152:36–51.
[54]. Schorb M, Haberbosch I, Hagen WJH, Schwab Y, Mastronarde DN. Software tools for automated transmission electron microscopy. Nat Methods. 2019;16:471–7.
[55]. Zhang K. Gctf: Real-time CTF determination and correction. J Struct Biol. 2016;193:1–12.
[56]. Zheng SQ, Palovcak E, Armache JP, Verba KA, Cheng Y, Agard DA. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat Methods. 2017;14:331–2.
[57]. Scheres SH. Processing of Structurally Heterogeneous Cryo-EM Data in RELION. Methods Enzymol. 2016;579:125–57.
[58]. Zivanov J, Nakane T, Forsberg BO, Kimanius D, Hagen WJ, Lindahl E, et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. Elife. 2018;7.
[59]. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–12.
[60]. Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, Wang X, et al. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res. 2007;35:W375–83.
[61]. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nat Methods. 2012;9:671–5.

Associated Data

Supplementary Materials

NIHMS1604038-supplement-1.doc (27.3 MB, doc)
