Abstract
The low solubility and aggregation properties of HIV-1 integrase are major obstacles for biochemical and structural studies. Lens epithelium-derived growth factor (LEDGF) is a cellular factor that binds integrase and tethers preintegration complexes to chromatin prior to integration. LEDGF also stimulates HIV-1 integrase DNA strand transfer activity and improves its solubility in vitro. We show that these properties are conferred by a short peptide spanning residues 178 to 197 of LEDGF that encompasses its AT-hook DNA binding elements. The peptide stimulates HIV-1 integrase activity both in trans and in cis. Fusion of the peptide to either the N- or C-terminus of integrase results in maximal stimulation of concerted integration activity and greatly improves the solubility of the protein as well as nucleoprotein complexes of integrase with viral DNA ends (intasomes). High-resolution structures of HIV-1 intasomes are required to understand the mechanism of integrase strand transfer inhibitors (INSTIs), which are front line drugs for the treatment of HIV-1, and how the virus can develop resistance to INSTIs. We have previously determined the structure of the HIV-1 Strand Transfer Complex (STC) intasome. The improved biophysical properties of intasomes assembled with LEDGF peptide fusion integrase have enabled us to determine the structure of the Cleaved Synaptic Complex (CSC) intasome, which is the direct target of INSTIs.
Keywords: site-specific recombination, retrovirus, integrase, integration, nucleoprotein complex, intasome, integrase strand transfer inhibitor (INSTI)
Graphical Abstract
Fusion of a peptide derived from lens epithelium-derived growth factor to the N-terminus of HIV-1 integrase facilitates determination of the structure of the HIV-1 CSC intasome, the target of integrase strand transfer inhibitors (INSTIs)

Introduction
Integration of a DNA copy of the viral genome into chromosomal DNA is an essential step in the replication of HIV-1 and other retroviruses (reviewed in [1, 2]). The integrated DNA serves as the template for transcription of viral RNAs and is replicated along with cellular DNA during each cycle of cell division. The chemical steps of DNA integration involve cleavage of two nucleotides from the 3’ ends of the initially blunt ended viral DNA (3’ end processing) and a transesterification reaction that covalently joins the processed 3’ ends to target DNA (DNA strand transfer) [3]. These reactions take place in stable nucleoprotein complexes between the virally encoded integrase (IN) protein and DNAs that are termed intasomes. The first intasome on the reaction pathway is a complex between a pair of viral DNA ends and integrase, the Stable Synaptic Complex (SSC). 3’ end processing occurs within the SSC to form the Cleaved Synaptic Complex (CSC). The CSC captures a target DNA to form the Target Capture Complex (TCC). DNA strand transfer then occurs within the TCC to generate the Strand Transfer Complex (STC) [2, 4]. Currently approved integrase strand transfer inhibitors (INSTIs) recognize the CSC rather than free integrase protein. High-resolution structures of CSCs with and without bound inhibitors are required to understand mechanistic details of INSTI action and how HIV-1 can evolve resistance.
Structural studies of HIV-1 intasomes have been impeded by their poor solubility, low efficiency of assembly, and heterogeneity. The structures of all three domains of HIV-1 integrase (IN) were determined either by X-ray crystallography or NMR more than two decades ago [5–9], yet it is only recently that the first HIV-1 intasome structure, the STC, was determined [10]. Determination of the STC structure was facilitated by exploiting a hyperactive integrase with a fusion of the DNA binding protein Sso7d to the N-terminus [11] and assembly of the STC with a DNA substrate that mimics the product of DNA strand transfer [12]. The assembled STCs were found to exist as multiple species, including tetramers and dodecamers, with the same set of positionally conserved domains around the DNA [10]. In the absence of intasome structures with wild-type integrase, the possibility that the HIV-1 intasome structures were perturbed by the presence of the Sso7d domain cannot be excluded. We therefore searched for alternative tools for structural studies of HIV-1 intasomes in the absence of the Sso7d domain. We find that fusion of a peptide derived from lens epithelium-derived growth factor (LEDGF) to either the N or C terminus of HIV-1 IN confers a remarkably similar phenotype to that of the much larger Sso7d domain fusion and provides a tool for structural studies of HIV-1 intasomes in the absence of the Sso7d domain. To demonstrate the utility of the new construct we have exploited it to determine the structure of the HIV-1 CSC intasome, which is the target of INSTIs, by single particle cryo-EM.
Results and Discussion
Peptides derived from LEDGF stimulate concerted DNA integration
LEDGF is expressed as two predominant forms, p75 and p52 [13]; only the larger isoform binds HIV-1 integrase [14]. The term “LEDGF” herein refers to LEDGF/p75.
LEDGF (Figure 1A) stimulates the activity of HIV-1 integrase in vitro [15–18]; in the absence of LEDGF concerted integration is extremely inefficient with wild type HIV-1 integrase and oligo DNA substrate [19] as shown in Figure 1B. HIV-1 integrase protomers dynamically exchange with one another in solution and stabilization of oligomers upon binding of LEDGF has been implicated in the mechanism of stimulation [19, 20]. To systematically determine the regions of LEDGF that might contribute to stimulation of integrase activity, we constructed a library of overlapping peptides spanning LEDGF and tested each peptide for activity in stimulating DNA integration (data not shown). Remarkably, two peptides spanning residues 181 to 210, which lie upstream of the IBD (Figure 1A), stimulated DNA integration. To more precisely define the sequence responsible for stimulation, a second set of overlapping peptides spanning this region was synthesized (Figure 2A) and tested in the integration assay (Figure 2B). A peptide spanning residues P178 to P197 (peptide 5 or P5) was the most active for stimulating DNA integration. This peptide includes the two AT-hooks in LEDGF, which underlie its ability to bind DNA [18].
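The overlapping-peptide scan described above can be sketched in a few lines of Python. This is an illustration only: the peptide length, the step between peptides, the helper name `tile_peptides` and the placeholder sequence are all assumptions, since the text does not state how the library was designed.

```python
def tile_peptides(sequence, length, step, first_residue=1):
    """Return (start, end, peptide) tuples for overlapping windows.

    Numbering is 1-based and inclusive, offset by first_residue.
    length and step are assumed design parameters, not values from the text.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = max(len(sequence) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the C-terminal end is covered
    windows = []
    for s in starts:
        peptide = sequence[s:s + length]
        windows.append((first_residue + s, first_residue + s + len(peptide) - 1, peptide))
    return windows


# Placeholder 30-residue sequence; this is NOT the LEDGF sequence.
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
windows = tile_peptides(toy_sequence, length=20, step=5, first_residue=171)
for start, end, peptide in windows:
    print(f"{start}-{end}: {peptide}")
```

Adjacent windows share `length - step` residues, so an activity that maps to a short motif should appear in more than one neighboring peptide.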
Figure 1.

A. Schematic of LEDGF (isoform p75) protein domains. IBD, integrase-binding domain. B. LEDGF/p75 stimulates HIV-1 DNA integration in vitro. Reactions contained 0.6 μM of HIV-1 integrase (IN) and the indicated concentrations of LEDGF (lanes 3 to 6), together with the 32 bp viral DNA substrate and supercoiled pGEM-9zf plasmid DNA as the target. Deproteinized integration products were separated in 1.5% agarose gels, stained with ethidium bromide, and detected by UV. Lane 1, DNA ladder; lane 2, protein was omitted from the reaction; lane 3, LEDGF was omitted from the reaction. The migration positions of the unreacted viral DNA and the concerted and half-site integration products are indicated. The gel is representative of the results from multiple independent experiments.
Figure 2.

Stimulation of DNA integration by peptides spanning residues 171 to 217 of LEDGF. A. Sequences of the peptides and position relative to the two AT-hooks AT1 and AT2. Residues that are evolutionarily conserved in AT1 and AT2 of LEDGF [18, 22] are colored in blue. The most active peptide for stimulating integration is highlighted. B. Integration assays were carried out in the absence of peptide (lane 1) or in the presence of 50 μM of the indicated peptide (lanes 2–18). Integration products were separated in a 1.5% agarose gel and detected by a fluorescence scanner. The migration positions of the unreacted viral DNA and the concerted and half-site integration products are indicated. The gel is representative of the results of multiple independent experiments.
Peptide 5, like the AT-hook motifs from which it is derived, is highly positively charged. We therefore tested the effects of amino acid substitutions that alter charge on stimulation of integration (Figure 3). In general, changes that increased positive charge maintained or enhanced activity, whereas changes that decreased positive charge diminished activity. Mutations that abolished stimulation of concerted DNA integration also greatly diminished the DNA binding activity of the peptide. However, the correlation of DNA binding with stimulatory activity was not absolute; peptide M1, which showed the greatest stimulatory activity, bound DNA with lower affinity than either peptide M2 or M3 (compare panels B and C in Figure 3). It is unclear whether DNA binding is directly involved in the mechanism of stimulation.
Figure 3.

Positively charged residues are critical for stimulation of DNA integration activity. A. Peptides and their amino acid sequences. Residues that were changed relative to peptide P5 are indicated in red for positively charged amino acids and in green for negatively charged amino acids. In M7, the altered AT1 sequence is shown in blue. B. 1.0 μM HIV-1 IN was incubated with 5’ FAM-labeled 32 bp viral DNA substrate and supercoiled pGEM-9zf plasmid DNA as the target in the absence (lane 1) or in the presence of 100 μM peptide (lanes 2–9). Integration products were separated in a 1.5% agarose gel and detected by a fluorescence scanner. C. Polarization of FAM fluorescence upon peptide binding (average and variance (error bars) of a triplicate experiment). 10 nM FAM-labeled 32 bp viral DNA substrate and peptides (concentration from 0 to 225 μM) were mixed and incubated at room temperature (RT) for 1 h in a 384-well plate, and then endpoint fluorescence polarization was measured with a BMG CLARIOstar microplate reader.
LEDGF AT-hooks and IBD are determinants for stimulating HIV-1 integrase activity
A concentration of at least 100 μM of peptide was required for optimal stimulation, whereas 1 μM of full-length LEDGF protein is sufficient. This suggested that the IBD could enhance the activity of the AT-hooks in the context of full-length LEDGF. We accordingly constructed a set of deletion mutants from the N-terminus, the C-terminus or both termini of LEDGF (Figure 4A) and tested their ability to stimulate DNA integration (Figure 4B). The Del-92 protein, which spans the AT-hook region and the IBD, was sufficient for maximal stimulation. We conclude that binding of the IBD to integrase raises the effective local concentration of the AT-hook region, resulting in maximal stimulation at a lower molar concentration compared to the peptide alone.
Figure 4.

LEDGF AT-hooks and IBD are determinants for stimulation of HIV-1 integrase activity. A. Schematic of LEDGF truncation and mutant constructs. B. 0.6 μM HIV-1 integrase was incubated with 5’ FAM-labeled 32 bp viral DNA substrate and supercoiled pGEM-9zf plasmid DNA in the absence (lane 1) or in the presence of 1 μM of LEDGF variant proteins (lanes 2–12). Integration products were separated in a 1.5% agarose gel and detected by using a fluorescence scanner. IBD429, the minimum IBD, residues 347–429; IBD460, IBD residues 307–460; IBD530, IBD residues 307–530; Del-197, N-terminal deletion, residues 198–530; Del-92, PWWP domain deletion, residues 93–530; N-346, C-terminal deletion, residues 1–346; D366N, Asp to Asn substitution at position 366, which is critical for integrase binding [16, 48]; ATmut, AT-hook mutant containing K179L/R180A/G181A/R182E and R194S/G195A/R196E substitutions in the AT1 and AT2 regions, respectively. Relative degrees of stimulation activity (panel A): +++, similar to WT LEDGF; +, 20–50% of WT LEDGF; −, < 5% of WT LEDGF.
Fusion of the AT-hooks to integrase is sufficient for maximal stimulation of DNA integration
Peptide P5 was fused to the N-terminus of HIV-1 integrase with a non-structured linker of 2, 14, 25 or 50 amino acids (aa). The linkers were derived from LAP2 alpha residues 51–100, which assume a random coil as assessed by NMR [21]. We first analyzed the abilities of the P5 peptide fusion proteins to catalyze concerted DNA integration (Figure 5A). Concerted DNA integration was readily detected with all fusion proteins, with the 50 aa linker protein (P5-IN) conferring the highest level of activity (lane 6), on par with the stimulatory activity of full-length LEDGF (lane 7). The linker alone does not significantly stimulate DNA integration (Figure S1), confirming that the stimulatory effect is conferred by the peptide.
Figure 5.

Fusion of AT-hook peptide to HIV-1 integrase is sufficient to mimic properties of LEDGF in trans. A. 0.6 μM HIV-1 IN (lanes 1 and 2) or HIV-1 IN with peptide P5 linked to the N-terminus (lanes 3 to 6) were incubated with 5’ FAM-labeled 32 bp viral DNA substrate and supercoiled pGEM-9zf plasmid DNA as the target. Lane 1, HIV-1 IN only; lane 2, HIV-1 IN plus 100 μM P5 peptide; lane 3, P5-IN (2 aa linker); lane 4, P5-IN (14 aa linker); lane 5, P5-IN (25 aa linker); lane 6, P5-IN (50 aa linker); lane 7, HIV-1 IN plus 1 μM LEDGF; lane 8, Sso7d-IN. Integration products were separated in a 1.5% agarose gel and then detected using a fluorescence scanner. B. Comparison of the solubilities of HIV-1 wild-type IN and P5-IN with a 50 aa linker. Proteins were incubated at the indicated NaCl concentrations in 20 mM HEPES pH 7.5, 10% glycerol, 5 mM DTT and 1 mM EDTA, centrifuged, and the supernatants were analyzed by SDS-PAGE. C. Comparison of the activities of different fusion INs. Lane 1, Sso7d-IN; lane 2, P5-IN; lane 3, M1-IN; lane 4, AT-hook peptide (residues 297–318) of HDGFL2 fused to the N-terminus of HIV-1 IN with a 50 aa linker; lane 5, peptide P5 fused to the C-terminus of HIV-1 IN with a 50 aa linker.
A major limitation with wild type HIV-1 integrase is its poor solubility under low salt conditions that are required for intasome assembly and catalysis. We therefore compared the solubility profiles of wild type and P5-IN under various NaCl conditions. Comparing the integrase levels remaining in supernatant fractions after centrifugation revealed that P5-IN was considerably more soluble than wild type integrase at low ionic strength (Figure 5B).
We next tested if the stimulatory activity of peptide P5 exhibited positional dependence. A similar degree of stimulation of DNA integration was observed when peptide P5 was fused to the C-terminus of integrase with a 50 aa linker (Figure 5C). We also tested an N-terminal fusion of the AT-hook region (residues 297–318) of hepatoma-derived growth factor-like protein 2 (HDGFL2), which can stimulate HIV-1 integrase activity in vitro [22] and plays a subsidiary role to LEDGF in directing viral preintegration complexes to integrate into genes [23, 24]. This peptide fusion protein stimulated concerted integration to a similar extent as P5-IN.
P5-IN is slightly more efficient than Sso7d-IN in the same buffer conditions. Similar to results with Sso7d-IN [11], DMSO and PEG, which are required for maximal wild-type integrase activity, are not required for maximal stimulation (Figure 5C). The reason that the Sso7d domain and the LEDGF-derived peptide confer such similar properties is unclear, though masking of determinants within wild-type integrase that cause protein aggregation under conditions of limited ionic strength may be one effect [11]. Further mirroring the similarity with Sso7d-IN, P5-IN when tested as a Vpr fusion protein trans-complemented the infectivity defect of D64N/D116N integrase active site mutant virus (Figure S2); we suspect that the reduced activity compared to wild-type integrase in this assay may be due to an assembly defect, as is common with numerous HIV-1 integrase mutant viruses [25].
P5-IN efficiently assembles intasomes with oligo DNA substrate
P5-IN efficiently assembles CSC intasomes with 32 bp oligo DNA substrate, and the complexes are readily analyzed by size-exclusion chromatography and agarose gel electrophoresis (Figure 6). The broad and asymmetric elution peak indicates that intasomes assembled with P5-IN are not homogeneous, as is the case for intasomes assembled with wild-type integrase (not shown) and Sso7d-IN [10]. The degree of heterogeneity is similar to that observed with intasomes assembled with Sso7d-IN.
Figure 6.

Size-exclusion chromatography of intasomes assembled with P5-IN and 5’ FAM-labeled 32 bp viral DNA substrate. A. Elution profile of assembled intasomes on Superdex 200 2.3/30. Prior to chromatography, 500 mM NaCl was added to the assembly mixture. After incubation at RT for 15 min, the mixture was centrifuged at 15,000 g for 15 min and the supernatant was concentrated to 50 μl using a microconcentrator (Sartorius Stedim Biotech). The column was equilibrated with 20 mM HEPES pH 7.5, 20% glycerol, 1 mM TCEP and 500 mM NaCl. The flow rate was 40 μl/min, and the fraction size was 50 μl. B. Fractions 1 to 15 (F1-F15), corresponding to 22 min to 41 min elution time, were analyzed by native 3% agarose gel electrophoresis and detected by a fluorescence scanner. Multiple species of intasomes are indicated by arrows; the large intasome species are not well separated due to the limited resolution of the Superdex 200 column. The presence of trace amounts of intasomes in late elution fractions is due to non-specific interaction between intasomes and the Superdex 200 matrix.
Structure of the HIV-1 CSC intasome assembled with P5-IN
Although structures of many retroviral intasomes have been determined [26–30], the only HIV-1 intasome structure currently in the literature is that of the product STC intasome assembled with Sso7d-IN [10]. Due to the heterogeneity of complex assembly, these intasome structures could be resolved as multiple species, including tetramers and dodecamers of integrase with DNA substrate. Importantly, both tetrameric and dodecameric Sso7d-IN intasomes harbored the conserved intasome core (CIC) that is common to all intasome structures determined to date [4]. In contrast to HIV-1 Sso7d-IN intasomes, intasomes of other retroviral species appear to be much more homogeneous. This raised the question of whether the extra Sso7d domain, which was required to solve the prior HIV-1 intasome structure, contributes to the observed heterogeneity.
We show here that HIV-1 CSC intasomes assembled with P5-IN are similarly heterogeneous, suggesting that the Sso7d domain does not significantly contribute to the heterogeneity. Despite the presence of multiple species, we were able to determine the structure of a major intasome species by cryo-EM (Figure 7) at a final map resolution of 4.7 Å. The CIC has the highest resolution; however, regions outside of the CIC, including distal and flanking IN dimers, are compromised by compositional and conformational heterogeneity, resulting in a lower resolution map (Figure 7 and Figure S3). Statistics, a flow-chart of data processing, and local resolution estimation are shown in Table S1 and Figures S3 and S4. The cryo-EM reconstruction of the CSC intasome is consistent with a dodecameric assembly of P5-IN surrounding the ends of the viral DNA, as was a major class of HIV-1 STC intasomes [10]. The architectures of the P5-IN CSC and Sso7d-IN STC structures are very similar; however, the distal subunits within STC_IBD are slightly more compact (Figure S5). 3D classification showed the presence of other oligomeric species, which did not refine to yield a high-resolution map. All P5 peptides are disordered in the maps.
Figure 7.

Structure of the HIV-1 CSC intasome assembled with P5-IN. A and C, Cryo-EM reconstruction of the intasome and Conserved Intasome Core (CIC). Density maps are colored as transparent purple and the atomic models derived from the densities are shown in green and orange for integrase protomers and viral DNA, respectively. B and D, Atomic models of the P5-IN intasome and CIC colored by integrase subunits. E and G, Segmented density at a higher contour level to highlight the CIC composed of six different integrase subunits, in which the CTDs of both flanking subunits are colored magenta. F and H, Atomic model of the CIC derived from the cryo-EM density, colored as in (E). Distal subunits and flanking subunits, except for the flanking subunit C terminus, are omitted for clarity.
CSC intasomes are the target of INSTIs and high-resolution structures of these intasomes will be required to understand their mechanism of action and how HIV-1 can evolve resistance. Current modeling of INSTIs with HIV-1 intasomes [31] relies on structures of interactions of these drugs with intasomes of related retroviruses such as prototype foamy virus PFV [26, 32–35]. While highly informative in the immediate vicinity of the active site, sequence divergence makes modeling difficult in many regions where resistance mutations occur. The similarity of the P5-IN CSC intasome structure with the Sso7d STC intasome not only validates using Sso7d intasomes for drug interaction studies, but also provides an additional tool in this endeavor.
Materials and Methods
DNA substrates and peptides
Oligonucleotides were purchased from Integrated DNA Technologies (Coralville, Iowa). The 32 bp DNA substrate corresponding to the U5 end of HIV-1 DNA was prepared by annealing the U5 32 bp precut strand, 5’-CCTTTTAGTCAGTGTGGAAAATCTCTAGCA, with the U5 complementary strand, 5’-ACTGCTAGAGATTTTCCACACTGACTAAAAGG. Fluorescent DNA substrates were prepared by attaching a 6-FAM fluorophore at the 5’ end of the precut oligonucleotide. Peptides (purity >90%) were purchased from GenScript (Piscataway, NJ).
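As a quick sanity check on the substrate design, the short Python sketch below confirms that the two U5 strands listed above are fully complementary and reports the unpaired 5’ overhang left on the longer strand. The helper names are illustrative and not part of any published pipeline.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(short_strand, long_strand):
    """Return the unpaired 5' bases of long_strand after annealing.

    Assumes the short strand pairs with the 3' portion of the long strand.
    """
    paired = reverse_complement(short_strand)
    if not long_strand.endswith(paired):
        raise ValueError("strands are not fully complementary")
    return long_strand[: len(long_strand) - len(paired)]


# Sequences as given in the text, written 5' to 3'.
precut = "CCTTTTAGTCAGTGTGGAAAATCTCTAGCA"
complementary = "ACTGCTAGAGATTTTCCACACTGACTAAAAGG"

overhang = five_prime_overhang(precut, complementary)
print(len(precut), len(complementary), overhang)  # prints: 30 32 AC
```

The 30-nucleotide precut strand pairs with all but the first two bases of the 32-nucleotide complementary strand, which is consistent with a viral DNA end from which two nucleotides have been removed at the 3’ end.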
Recombinant DNA construction
DNAs encoding N-terminal and C-terminal peptide fusions to HIV-1 integrase with 2, 14, 25, or 50-amino acid linker sequences, and a fusion of the linker only, were synthesized and cloned into pET-15b by GenScript (Piscataway, NJ). A non-structured linker of 2, 14, 25 or 50 amino acids was introduced between the peptide and IN domains. The linker sequences, which have been demonstrated by NMR to be a non-structured random coil [21], were derived from LAP2 alpha starting from residue 51. P5-IN, with the 50-amino acid linker between the peptide and the integrase N-terminus, was used for this work, except as otherwise noted. DNAs for full-length LEDGF and its mutants were cloned between the NdeI and XhoI sites of pET-28a (Novagen).
Protein expression and purification
His-tagged wild type and fusion integrases were expressed and purified essentially as described [36]. Briefly, integrase was expressed in E. coli BL21(DE3) and the cells were lysed in buffer containing 20 mM HEPES pH 7.5, 10% glycerol, 2 mM 2-mercaptoethanol, 20 mM imidazole and 1 M NaCl. The protein was purified by nickel-affinity chromatography and the His-tag was removed with thrombin. Aggregated protein was removed by gel filtration on a HiLoad 26/60 Superdex-200 column (GE Healthcare) equilibrated with 20 mM HEPES pH 7.5, 10% glycerol, 5 mM DTT, 1 mM EDTA and 1 M NaCl. The protein was concentrated using an Amicon centrifugal concentrator (EMD Millipore) as necessary, flash-frozen in liquid nitrogen and stored at −80°C. His-tagged LEDGF and its mutants were expressed and purified as described [18]. All protein preparations were at least 95% pure as estimated by quantitation of Coomassie-stained gels.
Integration assay and intasome assembly
Integrase (1.0 μM, unless otherwise noted) and 0.5 μM viral DNA substrate were preincubated on ice in 20 mM HEPES pH 7.5, 25% glycerol, 50 mM 3-(benzyldimethylammonio)propanesulfonate (NDSB-256), 10 mM DTT, 5 mM MgCl2, 4 μM ZnCl2, 100 mM NaCl, and 300 ng of target plasmid DNA (pGEM-9zf) in a 20 μl reaction volume with or without peptide. The reaction was initiated by transfer to 37°C and incubation was continued for 90 min. For integration product analysis, the reactions were stopped by addition of SDS and EDTA to 0.2% and 10 mM, respectively, together with 5 μg of proteinase K. Incubation was continued at 37°C for 1 h. The DNA was then recovered by ethanol precipitation and subjected to electrophoresis in a 1.5% agarose gel in 1x TBE buffer. DNA was visualized either by ethidium bromide staining or by fluorescence scanning using a Typhoon 8600 fluorescence scanner (GE Healthcare). Intasome assembly was carried out in the same way except that target DNA was omitted and CaCl2 was substituted for MgCl2. For electrophoretic mobility shift assays (EMSAs) of intasomes, the reactions were stopped after 1 h incubation at 37°C by chilling on ice and addition of 10 μg/ml heparin. A 2.5 μl aliquot was subjected to electrophoresis on a 3.0% low melting 1x TBE agarose gel (SeaKem LE agarose) containing 10 μg/ml heparin. All images in the figures are representative of three or more separate experiments.
FAM-DNA fluorescence polarization measurement
10 nM of 5’ end 6-FAM labelled DNA substrate was mixed with a serial dilution of peptide starting from 225 μM in binding buffer (20 mM HEPES, pH 7.5; 100 mM NaCl; 1 mM TCEP; 0.1 mg/ml BSA; 0.05% Tween 20). Samples (15 μl) in triplicate were transferred to a 384-well black polystyrene microplate (MSD 42-000-0118). The microplate was sealed and briefly spun at 220 g for 1 min. The plate was incubated at room temperature for 60 min before reading. Fluorescence polarization was measured with a microplate reader (BMG CLARIOstar) in endpoint mode (Ex filter 482–16, Em filter 530–40, dichroic filter).
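The quantities involved in this measurement can be sketched in Python as follows. The two-fold dilution factor, the G-factor of 1.0 and the intensity readings are illustrative assumptions, since only the top peptide concentration (225 μM) and the triplicate design are stated above; the polarization formula is the standard definition and is not taken from the instrument software.

```python
import statistics


def dilution_series(top_conc, factor, n_points):
    """Concentrations of a serial dilution, starting at top_conc."""
    return [top_conc / factor ** i for i in range(n_points)]


def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), scaled by 1000.
    """
    numerator = i_parallel - g_factor * i_perpendicular
    denominator = i_parallel + g_factor * i_perpendicular
    if denominator == 0:
        raise ValueError("total intensity is zero")
    return 1000.0 * numerator / denominator


# Assumed two-fold dilution from the stated top concentration (in micromolar).
concentrations = dilution_series(225.0, 2.0, 4)

# Made-up triplicate readings (parallel, perpendicular) for a single well set.
triplicate = [(150.0, 100.0), (152.0, 100.0), (148.0, 100.0)]
values = [polarization_mp(par, perp) for par, perp in triplicate]
mean_mp = statistics.mean(values)
variance_mp = statistics.variance(values)  # sample variance across replicates
print(concentrations, round(mean_mp, 1), round(variance_mp, 2))
```

Plotting the mean polarization against peptide concentration gives a binding curve of the kind used to compare peptides; fitting a binding model to it would be a further step that is not described in the text.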
CSC intasome preparation for cryo-EM
Scaled-up CSC intasome preparations were assembled by mixing 3.0 μM integrase with 1.0 μM DNA substrate (made by annealing 5’-AGCGTGGGCGGGAAAATCTCTAGCA with 5’-ACTGCTAGAGATTTTCCCGCCCACGCT) in buffer containing 20 mM HEPES pH 7.5, 5 mM MgCl2, 5 mM 2-mercaptoethanol, 4 μM ZnCl2, 100 mM NaCl, 25% (w/v) glycerol and 50 mM NDSB-256 in the presence of 50 μM dolutegravir (DTG) to kinetically trap CSC intasomes, followed by incubation at 30°C for 60 min. The reaction was stopped by incubating on ice for 15 min. NaCl was then added to 500 mM, and after incubation at room temperature for 15 min the mixture was centrifuged at 15,000 g for 15 min to remove insoluble aggregates. CSC intasomes were purified by Ni-affinity chromatography and two steps of size exclusion chromatography. Briefly, the mixture was first loaded onto a HisTrap HP column (GE Healthcare) equilibrated with 20 mM Tris-Cl pH 8.0, 5 mM 2-mercaptoethanol, 5 mM MgCl2, 0.5 M NaCl, and 20% (w/v) glycerol. HIV-1 intasomes were eluted with a linear gradient of 0 mM to 500 mM imidazole in the same buffer. Intasome-containing fractions were combined and concentrated with a centrifugal filter (Amicon Ultra-15 Ultracel-100k). CSC aggregates and free IN were then removed by gel filtration on a Superose 6 Increase 10/300 GL column (GE Healthcare) equilibrated with 20 mM Tris pH 6.2, 0.5 mM TCEP, 600 mM NaCl, 5.0 mM MgCl2, and 10% (w/v) glycerol. Pooled intasomes were concentrated prior to the second size exclusion chromatography step on a TSKgel UltraSW Aggregate column (Tosoh Bioscience) in 20 mM Tris pH 6.2, 0.5 mM TCEP, 550 mM NaCl, 5 mM MgCl2, and 6% (w/v) glycerol. Intasomes corresponding to single CSCs were pooled and concentrated to 0.4 mg/ml for cryo-EM study.
Cryo-EM grid preparation, data acquisition and analysis
Purified CSC intasomes were applied onto freshly plasma cleaned (30 s, Pelco easiGlow™ plasma cleaner) holey carbon grids (QUANTIFOIL® R 1.2/1.3 300 mesh, EMS), adsorbed for 10 s, blotted for 4.5 s and then plunged into liquid ethane using a Leica EM GP plunge freezer in an ambient environment of 18°C. Images were acquired using SerialEM software installed on a FEI Titan Krios electron microscope operating at 300 kV [37]. Data were collected in super-resolution mode with a pixel size of 0.54 Å. The individual frames were aligned and summed using MotionCor2 [38] according to the nominal dose rate. An initial 3D model was generated directly from the class averages using CryoSPARC 2 [39]; 3D classification and refinement were performed using Relion v3 [40]. A total of 275,960 particles were selected after 2D classification. Classes were selected based on the size of the stable core; some classes appeared to lack one or more protomers within the stable core relative to the class with the largest stable core. Six 3D classes were divided into two groups based on the completeness of assembly in the stable core region, and the two groups were refined separately. The 134,763 particles from classes 1, 2 and 4, which showed the fully assembled octameric core, were selected for 3D refinement with C1 or C2 symmetry imposed, resulting in maps at 6.0 Å and 4.7 Å, respectively. Approximately 140,000 particles from classes 3, 5 and 6, which had a smaller intasome core, were refined to a low-quality map at 9.5 Å. Focused refinement with a mask around the intasome core region resulted in slightly better maps at 4.6 Å. Post-processing with a CIC mask resulted in a 4.5 Å map (Figure S3). Resolution was estimated by applying a soft mask around the protein density with the Fourier shell correlation (FSC) 0.143 criterion [41, 42]. Local resolution was estimated using ResMap [42]. UCSF Chimera, PyMOL, Coot and Phenix were used for model building, refinement, and validation [43, 44].
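For readers unfamiliar with the FSC 0.143 criterion, the following Python sketch shows how a resolution estimate can be read off an FSC curve by linear interpolation at the threshold. The function name and the example curve are illustrative assumptions; the numbers are not data from this study, and refinement packages such as Relion report this value directly.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Resolution (in Angstrom) where the FSC first falls below threshold.

    spatial_freqs: ascending spatial frequencies in 1/Angstrom.
    fsc_values: FSC at each frequency.
    Returns None if no downward crossing of the threshold is found,
    including the case where the curve starts below the threshold.
    """
    if len(spatial_freqs) != len(fsc_values):
        raise ValueError("inputs must have the same length")
    for i in range(1, len(fsc_values)):
        c0, c1 = fsc_values[i - 1], fsc_values[i]
        if c0 >= threshold > c1:
            f0, f1 = spatial_freqs[i - 1], spatial_freqs[i]
            fraction = (c0 - threshold) / (c0 - c1)  # linear interpolation
            crossing = f0 + fraction * (f1 - f0)
            return 1.0 / crossing
    return None


# Made-up FSC curve for illustration only.
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
fsc = [1.00, 0.98, 0.90, 0.50, 0.10, 0.02]
estimate = resolution_at_threshold(freqs, fsc)
print(f"{estimate:.2f} A")
```

Because the estimate depends on the mask applied before the correlation is computed, values obtained with a whole-particle mask and with a core-only mask are not directly interchangeable.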
Previously determined X-ray or NMR structures of individual domains (3L3U, 1K6Y and 1IHV) and of the strand transfer complex (5U1C) were used as starting models to build the CIC model. They were fitted into the cryo-EM map by rigid body docking with UCSF Chimera [45]. Catalytic core domain dimers and partial CCD-CTD linkers from the IN X-ray crystal structure 1EX4 were rigid body fitted into the density map of distal subunits and flanking subunits. The missing linker regions were built manually with Coot [46]. Distal subunits in the stable core were resolved at lower resolution, but the main helices could still be traced. Models were refined in phenix.real_space_refine with secondary structure restraints. The geometry of the final model was validated with MolProbity [47]. The relevant refinement statistics are summarized in Table S1. The CSC and CIC (conserved intasome core) atomic models and analysis are shown in Figure 7 and supplementary Figures S3–S5. The EMDB and PDB codes for the P5-IN intasome are EMD-20689 and 6U8Q. The corresponding codes for the CIC are EMD-21151 and 6VDK.
Supplementary Material
Highlights.
Structural studies of the nucleoprotein complexes (intasomes) that mediate HIV-1 DNA integration have been frustrated by low efficiency of in vitro assembly and aggregation. Intasomes are the target of integrase strand transfer inhibitors and high-resolution structures are required to understand their detailed mechanism of action and how HIV-1 can evolve resistance.
Fusion of a peptide derived from the AT-hook region of lens epithelium-derived growth factor to the N-terminus of HIV-1 integrase stimulates catalytic activity and markedly reduces the propensity of intasomes to aggregate.
We have used the peptide fusion integrase to determine the structure of the HIV-1 cleaved synaptic complex (CSC) intasome, which is the target of integrase strand transfer inhibitors (INSTIs), by cryo-EM.
Acknowledgements
This work was supported by the Intramural Program of NIDDK, NIH and by the AIDS Targeted Antiviral Program of the Office of the Director of the NIH, and NIH grant R01 AI070042 (to ANE). This work utilized the NIH Multi-Institute Cryo-EM Facility (MICEF).
Footnotes
The only changes are substitution of high-resolution non-compressed files for the figures (in place of the compressed files submitted for review) and inclusion of EMDB and PDB codes in Figure S1 and the main text.
REFERENCES
- [1]. Lesbats P, Engelman AN, Cherepanov P. Retroviral DNA integration. Chemical Reviews. 2016;116:12730–57.
- [2]. Craigie R. Nucleoprotein intermediates in HIV-1 DNA integration: structure and function of HIV-1 intasomes. In: Harris JR, Bhella D, editors. Virus Protein and Nucleoprotein Complexes. 2018. p. 189–210.
- [3]. Engelman A, Mizuuchi K, Craigie R. HIV-1 DNA integration: mechanism of viral DNA cleavage and DNA strand transfer. Cell. 1991;67:1211–21.
- [4]. Engelman AN, Cherepanov P. Retroviral intasomes arising. Curr Opin Struct Biol. 2017;47:23–9.
- [5]. Dyda F, Hickman AB, Jenkins TM, Engelman A, Craigie R, Davies DR. Crystal structure of the catalytic domain of HIV-1 integrase: similarity to other polynucleotidyl transferases. Science. 1994;266:1981–6.
- [6]. Eijkelenboom AP, Lutzke RA, Boelens R, Plasterk RH, Kaptein R, Hard K. The DNA-binding domain of HIV-1 integrase has an SH3-like fold. Nat Struct Biol. 1995;2:807–10.
- [7]. Cai M, Zheng R, Caffrey M, Craigie R, Clore GM, Gronenborn AM. Solution structure of the N-terminal zinc binding domain of HIV-1 integrase. Nat Struct Biol. 1997;4:567–77.
- [8]. Wang JY, Ling H, Yang W, Craigie R. Structure of a two-domain fragment of HIV-1 integrase: implications for domain organization in the intact protein. EMBO J. 2001;20:7333–43.
- [9]. Chen JCH, Krucinski J, Miercke LJW, Finer-Moore JS, Tang AH, Leavitt AD, et al. Crystal structure of the HIV-1 integrase catalytic core and C-terminal domains: a model for viral DNA binding. Proc Natl Acad Sci USA. 2000;97:8233–8.
- [10]. Passos DO, Li M, Yang RB, Rebensburg SV, Ghirlando R, Jeon Y, et al. Cryo-EM structures and atomic model of the HIV-1 strand transfer complex intasome. Science. 2017;355:89–92.
- [11]. Li M, Jurado KA, Lin S, Engelman A, Craigie R. Engineered hyperactive integrase for concerted HIV-1 DNA integration. PLoS One. 2014;9.
- [12]. Yin Z, Lapkouski M, Yang W, Craigie R. Assembly of prototype foamy virus strand transfer complexes on product DNA bypassing catalysis of integration. Protein Science. 2012;21:1849–57.
- [13]. Ge H, Si YZ, Roeder RG. Isolation of cDNAs encoding novel transcription coactivators p52 and p75 reveals an alternate regulatory mechanism of transcriptional activation. EMBO J. 1998;17:6723–9.
- [14]. Maertens G, Cherepanov P, Pluymers W, Busschots K, De Clercq E, Debyser Z, et al. LEDGF/p75 is essential for nuclear and chromosomal targeting of HIV-1 integrase in human cells. J Biol Chem. 2003;278:33528–39.
- [15]. Cherepanov P, Maertens G, Proost P, Devreese B, Van Beeumen J, Engelborghs Y, et al. HIV-1 integrase forms stable tetramers and associates with LEDGF/p75 protein in human cells. J Biol Chem. 2003;278:372–81.
- [16]. Cherepanov P. LEDGF/p75 interacts with divergent lentiviral integrases and modulates their enzymatic activity in vitro. Nucleic Acids Res. 2007;35:113–24.
- [17]. Pandey KK, Sinha S, Grandgenett DP. Transcriptional coactivator LEDGF/p75 modulates human immunodeficiency virus type 1 integrase-mediated concerted integration. J Virol. 2007;81:3969–79.
- [18]. Turlure F, Maertens G, Rahman S, Cherepanov P, Engelman A. A tripartite DNA-binding element, comprised of the nuclear localization signal and two AT-hook motifs, mediates the association of LEDGF/p75 with chromatin in vivo. Nucleic Acids Res. 2006;34:1653–65.
- [19]. Hare S, Shun MC, Gupta SS, Valkov E, Engelman A, Cherepanov P. A novel co-crystal structure affords the design of gain-of-function lentiviral integrase mutants in the presence of modified PSIP1/LEDGF/p75. PLoS Pathogens. 2009;5.
- [20]. McKee CJ, Kessl JJ, Shkriabai N, Dar MJ, Engelman A, Kvaratskhelia M. Dynamic modulation of HIV-1 integrase structure and function by cellular lens epithelium-derived growth factor (LEDGF) protein. J Biol Chem. 2008;283:31802–12.
- [21]. Cai ML, Huang Y, Ghirlando R, Wilson KL, Craigie R, Clore GM. Solution structure of the constant region of nuclear envelope protein LAP2 reveals two LEM-domain structures: one binds BAF and the other binds DNA. EMBO J. 2001;20:4399–407.
- [22]. Cherepanov P, Devroe E, Silver PA, Engelman A. Identification of an evolutionarily conserved domain in human lens epithelium-derived growth factor/transcriptional coactivator p75 (LEDGF/p75) that binds HIV-1 integrase. J Biol Chem. 2004;279:48883–92.
- [23]. Wang H, Jurado KA, Wu XL, Shun MC, Li X, Ferris AL, et al. HRP2 determines the efficiency and specificity of HIV-1 integration in LEDGF/p75 knockout cells but does not contribute to the antiviral activity of a potent LEDGF/p75-binding site integrase inhibitor. Nucleic Acids Res. 2012;40:11518–30.
- [24]. Schrijvers R, Vets S, De Rijck J, Malani N, Bushman FD, Debyser Z, et al. HRP-2 determines HIV-1 integration site selection in LEDGF/p75 depleted cells. Retrovirology. 2012;9.
- [25]. Engelman A. Pleiotropic nature of HIV-1 integrase mutations. In: Neamati N, editor. HIV-1 Integrase: Mechanism and Inhibitor Design. Hoboken, NJ: John Wiley & Sons, Inc; 2011. p. 67–81.
- [26]. Hare S, Gupta SS, Valkov E, Engelman A, Cherepanov P. Retroviral intasome assembly and inhibition of DNA strand transfer. Nature. 2010;464:232–6.
- [27]. Maertens GN, Hare S, Cherepanov P. The mechanism of retroviral integration from X-ray structures of its key intermediates. Nature. 2010;468:326–9.
- [28]. Ballandras-Colas A, Browne M, Cook NJ, Dewdney TG, Domeler B, Cherepanov P, et al. Cryo-EM reveals a novel octameric integrase structure for betaretroviral intasome function. Nature. 2016;530:358–61.
- [29]. Yin Z, Shi K, Banerjee S, Pandey KK, Bera S, Grandgenett DP, et al. Crystal structure of the Rous sarcoma virus intasome. Nature. 2016;530:362–6.
- [30]. Ballandras-Colas A, Maskell DP, Serrao E, Locke J, Swuec P, Jonsson SR, et al. A supramolecular assembly mediates lentiviral DNA integration. Science. 2017;355:93–5.
- [31]. Krishnan L, Li XA, Naraharisetty HL, Hare S, Cherepanov P, Engelman A. Structure-based modeling of the functional HIV-1 intasome and its inhibition. Proc Natl Acad Sci USA. 2010;107:15910–5.
- [32]. Hare S, Vos AM, Clayton RF, Thuring JW, Cummings MD, Cherepanov P. Molecular mechanisms of retroviral integrase inhibition and the evolution of viral resistance. Proc Natl Acad Sci USA. 2010;107:20057–62.
- [33]. Hare S, Smith SJ, Metifiot M, Jaxa-Chamiec A, Pommier Y, Hughes SH, et al. Structural and functional analyses of the second-generation integrase strand transfer inhibitor dolutegravir (S/GSK1349572). Molecular Pharmacology. 2011;80:565–72.
- [34]. Zhao XZ, Smith SJ, Maskell DP, Metifiot M, Pye VE, Fesen K, et al. HIV-1 integrase strand transfer inhibitors with reduced susceptibility to drug resistant mutant integrases. ACS Chemical Biology. 2016;11:1074–81.
- [35]. Zhao XZ, Smith SJ, Maskell DP, Metifiot M, Pye VE, Fesen K, et al. Structure-guided optimization of HIV integrase strand transfer inhibitors. Journal of Medicinal Chemistry. 2017;60:7315–32.
- [36]. Li M, Craigie R. Nucleoprotein complex intermediates in HIV-1 integration. Methods. 2009;47:237–42.
- [37]. Mastronarde DN. Automated electron microscope tomography using robust prediction of specimen movements. Journal of Structural Biology. 2005;152:36–51.
- [38]. Zheng SQ, Palovcak E, Armache J-P, Verba KA, Cheng Y, Agard DA. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nature Methods. 2017;14:331–2.
- [39]. Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nature Methods. 2017;14:290–6.
- [40]. Fernandez-Leiro R, Scheres SHW. A pipeline approach to single-particle processing in RELION. Acta Crystallographica Section D-Structural Biology. 2017;73:496–502.
- [41]. Swint-Kruse L, Brown CS. Resmap: automated representation of macromolecular interfaces as two-dimensional networks. Bioinformatics. 2005;21:3327–8.
- [42]. Kucukelbir A, Sigworth FJ, Tagare HD. Quantifying the local resolution of cryo-EM density maps. Nature Methods. 2014;11:63–5.
- [43]. Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallographica Section D-Biological Crystallography. 2010;66:486–501.
- [44]. Barad BA, Echols N, Wang RYR, Cheng Y, DiMaio F, Adams PD, et al. EMRinger: side chain directed model and map validation for 3D cryo-electron microscopy. Nature Methods. 2015;12:943–6.
- [45]. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera: a visualization system for exploratory research and analysis. Journal of Computational Chemistry. 2004;25:1605–12.
- [46]. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallographica Section D-Biological Crystallography. 2004;60:2126–32.
- [47]. Chen VB, Arendall WB III, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica Section D-Structural Biology. 2010;66:12–21.
- [48]. Cherepanov P, Sun ZYJ, Rahman S, Maertens G, Wagner G, Engelman A. Solution structure of the HIV-1 integrase-binding domain in LEDGF/p75. Nat Struct Mol Biol. 2005;12:526–32.
