Journal of Virology. 2022 Apr 18;96(9):e02164-21. doi: 10.1128/jvi.02164-21

High-Resolution Structure of the Nuclease Domain of the Human Parvovirus B19 Main Replication Protein NS1

Jonathan L Sanchez b,c,*, Niloofar Ghadirian c, Nancy C Horton a
Editor: Lori Frappier d
PMCID: PMC9093113  PMID: 35435730

ABSTRACT

Two new structures of the N-terminal domain of the main replication protein, NS1, of human parvovirus B19 (B19V) are presented here. This domain (NS1-nuc) plays an important role in the “rolling hairpin” replication of the single-stranded B19V DNA genome, recognizing origin of replication sequences in double-stranded DNA, and cleaving (i.e., nicking) single-stranded DNA at a nearby site known as the terminal resolution site (trs). The three-dimensional structure of NS1-nuc is well conserved between the two forms, as well as with a previously solved structure of a sequence variant of the same domain; however, it is shown here at a significantly higher resolution (2.4 Å). Using structures of NS1-nuc homologues bound to single- and double-stranded DNA, models for DNA recognition and nicking by B19V NS1-nuc are presented that predict residues important for DNA cleavage and for sequence-specific recognition at the viral origin of replication.

IMPORTANCE The high-resolution structure of the DNA binding and cleavage domain of the main replicative protein, NS1, from the human-pathogenic virus human parvovirus B19 is presented here. Included also are predictions of how the protein recognizes important sequences in the viral DNA which are required for viral replication. These predictions can be used to further investigate the function of this protein, as well as to predict the effects on viral viability due to mutations in the viral protein and viral DNA sequences. Finally, the high-resolution structure facilitates structure-guided drug design efforts to develop antiviral compounds against this important human pathogen.

KEYWORDS: viral origin of replication, DNA nicking, nuclease, single-stranded DNA binding, double-stranded DNA binding, parvovirus, protein structure, enzyme, DNA cleavage, human parvovirus B19, endonuclease, protein structure-function

INTRODUCTION

Human parvovirus B19 (B19V) is a ubiquitous virus infecting the majority of the human population (1, 2). B19V, a member of the Parvoviridae family and the genus Erythrovirus, has been associated with myriad different illnesses; B19V was first discovered as the cause of aplastic crisis in patients with chronic hemolytic anemia (3), then as the causative agent of erythema infectiosum (“fifth disease”) in 1983 (4), which results in mild fever and a distinctive rash in children, and fever, often with hepatitis and arthralgia, in adults. B19V infection is also associated with pure red-cell aplasia from persistent infection in immunocompromised patients, and hydrops fetalis (a serious condition of the fetus) in pregnant women (1). In addition, B19V infection has been associated with other serious conditions such as inflammatory cardiomyopathy and the induction of autoimmune or autoimmune-like disease (short or long term) (5–8).

B19V is a single-stranded nonenveloped DNA virus of 5,596 nucleotides, with an internal coding region flanked by palindromic sequences capable of forming terminal hairpin structures. B19V replicates in erythroid progenitors using a rolling hairpin mechanism (9). The viral genome encodes six protein products, as follows: VP1 and VP2, which compose the viral capsid; NS1, the main replication protein; and three smaller nonstructural proteins (10–12). Viral replication utilizes cellular factors, proposed to be coordinated by the viral NS1 protein (13–15), and is thought to make use of the terminal hairpins (13, 15) (Fig. 1A). Extension of the 3′ end of one terminal hairpin by a cellular polymerase results in replication of the majority of the viral genome (Fig. 1A, step 1), and NS1 is thought to bind to repeat sequences (Fig. 1B, NS1 binding element 1 [NSBE1] to NSBE4) and nick or cleave in one strand at the terminal resolution site (trs) (Fig. 1A, trs), producing a new 3′ end that can be used to prime synthesis of the remaining viral DNA (Fig. 1A, step 3) (14). Based on amino acid sequence and homology to other parvoviral replication proteins, B19V NS1 (NS1) is predicted to contain both nuclease and helicase domains involved in B19V replication. NS1 also contains a C-terminal domain involved in the protein’s promoter transactivation activity (16), which acts upon its own viral promoter, p6, as well as on several host promoters (17–21). Furthermore, the genome of B19V is known to insert into host DNA, a reaction likely involving double-stranded DNA (dsDNA) recognition and strand nicking by NS1, followed by host DNA repair (22).

FIG 1.


Overview of NS1-nuc function in replication and three-dimensional structure. (A) Replication using the cellular replication machinery is primed by the 3′ folded over hairpin (or inverted terminal repeat [ITR]). Upon double-stranded formation, transcription proceeds, producing new copies of NS1. NS1 binds at the GC-rich sequences denoted NSBE1 to NSBE4 and cleaves at the nearby terminal resolution site or trs, leaving NS1 covalently attached to the 5′ end at the site of cleavage. Replication of the remaining segment of the genome is completed using the newly generated 3′ OH and following unfolding of the terminal hairpin. (B) DNA sequences at the viral origin of replication, including the trs and NSBE sequences. (C) Silver-stained SDS-PAGE analysis of a drop containing form II crystals (lane 2). The full-length construct is predicted to be 25 kDa; however, a smaller species, indicated by the asterisk (*), may be the form found in the crystals. (D) Overview of form I NS1-nuc with the positions of putative DNA cleavage active site residues in red and predicted double-stranded DNA (dsDNA) binding residues shown in magenta. Mg2+ identified in form II is shown as a green sphere. (E) Sequence, secondary structure, and notable features of form I NS1-nuc. H1 to H9, alpha helices; β1 to β5, β-strands; β, β-turn; γ, γ-turn; red hairpin, β-hairpin. White letters in red boxes indicate putative DNA cleavage active site residues, white letters in magenta boxes indicate predicted dsDNA binding residues, and cyan boxes indicate predicted single-stranded DNA (ssDNA) binding residues. (F) Topology diagram of form I NS1-nuc with α-helices as red cylinders, β-strands as yellow arrows, N and C termini in green circles, and residue number of secondary structural elements in black.

Prior studies show that the isolated N-terminal nuclease domain of B19V NS1 (NS1-nuc) binds sequence-specifically to the NSBE sequences in dsDNA, and it is also responsible for sequence-specific DNA cleavage (or nicking) in single-stranded, but not double-stranded, DNA at the trs site (23). Since the DNA encountered by NS1 is double-stranded, it is assumed that binding of NS1 to its target sequences in dsDNA induces strand separation nearby, allowing for the endonuclease activity of NS1 to cleave at the trs site. However, how NS1 recognizes its target sequences in dsDNA and at the nicking site is currently unknown, as is the mechanism by which NS1 binding to DNA at the NSBE induces strand separation. To begin to answer these questions, we determined the high-resolution crystal structure of NS1-nuc. Two crystal forms were solved, one at 2.4-Å resolution (form I) and another at 3.5-Å resolution (form II), the latter showing binding of a single Mg2+ in the DNA nicking active site. Using homology modeling, three-dimensional models for (i) trs sequence recognition in single-stranded DNA (ssDNA), (ii) DNA cleavage in ssDNA, and (iii) dsDNA binding at the NSBE sequences are presented.

RESULTS AND DISCUSSION

Structure solution, refinement, and overall analysis.

Crystallographic data and structure refinement statistics are shown in Table 1. X-ray diffraction data were collected, scaled, and truncated to resolutions based on scaling, signal to noise, cross-correlation analysis, and final structure and map quality after structure refinement. These resolutions are 2.4 Å for form I, where the highest-resolution shell shows I/σ of 2.4, CC1/2 of 38.5%, and final Rwork and Rfree of 19.9% and 25.7%, respectively. In the case of form II, data were truncated at 3.5 Å, where the highest-resolution shell shows I/σ of 1.67, CC1/2 of 89%, and final Rwork and Rfree of 26.1% and 29.6%, respectively. Residues corresponding to amino acids 2 to 173 of the NS1 sequence were included in the final refined model of form I; however, the electron density for DNA was not evident despite its addition to the protein solution used in the form I crystallization experiments. The additional residues on the C terminus of the NS1-nuc construct used for form II crystallization (residues 175 to 209) were also not evident in the electron density maps, possibly due to protein degradation during crystallization (Fig. 1C), and therefore the final model contains residues 2 to 174 of the NS1 sequence. Figure 1D shows two views of the NS1-nuc ribbon diagram of form I, and Fig. 1E and F show secondary structural elements and other notable structural features. The largest differences between form I and II NS1-nuc structures include a variation in the trace at residues 24 to 30 (Fig. 2A and B), as well as the absence of residues 127 to 128 in form II (Fig. 2A), which are residues implicated in dsDNA binding (discussed further below). These structural differences likely originate from crystal packing interactions, since close packing with neighboring NS1-nuc copies occurs at these residues in all cases except at residues 127 and 128 of form II, and it is distinct in each crystal form. 
These distinct interfaces likely stabilize different conformations of otherwise more mobile segments of NS1-nuc. The form I map contains additional electron density between neighboring copies of NS1-nuc (Fig. 2C). Of the possible solvent molecules present in the crystallization solution, citrate was found to fit best, based on refined Rfree, refined temperature factors, and fit to map (Fig. 2C). The citrate molecule is within hydrogen bonding (Fig. 2C, dashed lines) and/or salt bridging distance (note that the crystallization conditions include pH 4.5) of appropriate groups in the two neighboring NS1-nuc copies, including the active-site residues (His81, His83, and Tyr141) of one copy, and Lys119, Tyr130, and Thr97 in another. Both citrate locations overlap sites of nucleotide binding in the model of ssDNA binding (see below). In the form II crystal structure, a significant (i.e., 3σ) positive difference density was also seen in the position occupied by Zn2+ in a previously determined structure of NS1-nuc (PDB accession code 6USM [24]), suggesting divalent cation binding (Fig. 2D, 2Fo-Fc at 1σ map in blue and 1Fo-Fc map in magenta at 3σ when Mg2+ is omitted from the model). Mg2+ was modeled into this position (Fig. 2D, yellow sphere), since crystallization conditions contained 200 mM Mg2+.

TABLE 1.

Data collection and refinement statistics

| Statistic | Form I^a | Form II^a |
|---|---|---|
| PDB accession code | 7SZY | 7SZX |
| Site of data collection | SSRL BL9-2 | SSRL BL9-2 |
| Wavelength (Å) | 0.97946 | 0.97946 |
| Resolution range (Å) | 45.06–2.4 (2.486–2.4) | 46.21–3.5 (3.625–3.5) |
| Space group | P22121 | P3221 |
| Unit cell edge lengths (Å) | 45.0599, 49.0999, 74.9899 | 106.71, 106.71, 59.5904 |
| Unit cell internal angles (°) | 90, 90, 90 | 90, 90, 120 |
| Total no. of reflections | 13,532 (1,322) | 10,152 (976) |
| No. of unique reflections | 6,799 (666) | 5,099 (488) |
| Multiplicity | 2.0 (2.0) | 2.0 (2.0) |
| Completeness (%) | 98.42 (98.66) | 98.70 (97.41) |
| Mean I/σI | 7.21 (2.43) | 5.37 (1.67) |
| Wilson B-factor | 28.31 | 91.66 |
| R merge | 0.1022 (0.5331) | 0.07577 (0.4226) |
| R meas | 0.1446 (0.7539) | 0.1072 (0.5977) |
| CC1/2 | 0.981 (0.385) | 0.993 (0.89) |
| CC | 0.995 (0.745) | 0.998 (0.971) |
| No. of reflections used in refinement | 6,796 (665) | 5,074 (488) |
| No. of reflections used for Rfree | 333 (40) | 508 (50) |
| R work | 0.1986 (0.2629) | 0.2605 (0.3421) |
| R free | 0.2566 (0.2891) | 0.2956 (0.4067) |
| CCwork | 0.951 (0.778) | 0.908 (0.759) |
| CCfree | 0.898 (0.601) | 0.825 (0.685) |
| No. of nonhydrogen atoms, total | 1,439 | 1,294 |
| No. of nonhydrogen atoms, macromolecules | 1,337 | 1,289 |
| No. of nonhydrogen atoms, ligands | 13 | 1 |
| No. of nonhydrogen atoms, solvent | 89 | 4 |
| No. of protein residues | 174 | 171 |
| RMS, bonds | 0.003 | 0.004 |
| RMS, angles | 0.61 | 0.87 |
| Ramachandran favored (%) | 95.93 | 94.01 |
| Ramachandran allowed (%) | 4.07 | 5.39 |
| Ramachandran outliers (%) | 0.00 | 0.60 |
| Rotamer outliers (%) | 1.43 | 0.00 |
| Clashscore | 2.28 | 6.09 |
| B-factor, avg | 30.11 | 94.33 |
| B-factor, macromolecules | 29.83 | 94.51 |
| B-factor, ligands | 39.45 | 90.01 |
| B-factor, solvents | 32.96 | 37.16 |

^a Statistics for the highest-resolution shell are shown in parentheses.
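
As a reading aid for Table 1, the following minimal Python sketch recomputes three of the overall statistics from other entries in the table. It is not part of the authors' data-processing pipeline. It assumes that (i) multiplicity is the ratio of total to unique reflections, (ii) Rmeas can be approximated from Rmerge by the factor sqrt(n/(n − 1)) when every reflection is measured n times (here n = 2.0, the tabulated multiplicity), and (iii) the row labeled "CC" corresponds to CC*, defined as sqrt(2·CC1/2/(1 + CC1/2)). The third assumption is an inference from the numerical agreement and is not stated in the table. All function and variable names are illustrative.

```python
import math

# Overall values copied from Table 1 (highest-resolution shell values omitted).
TABLE1 = {
    "form I": {"total": 13532, "unique": 6799, "r_merge": 0.1022,
               "r_meas": 0.1446, "cc_half": 0.981, "cc": 0.995},
    "form II": {"total": 10152, "unique": 5099, "r_merge": 0.07577,
                "r_meas": 0.1072, "cc_half": 0.993, "cc": 0.998},
}


def multiplicity(total, unique):
    """Average number of observations per unique reflection."""
    return total / unique


def r_meas_from_r_merge(r_merge, n):
    """Approximate Rmeas, assuming every reflection was measured n times."""
    return r_merge * math.sqrt(n / (n - 1))


def cc_star(cc_half):
    """CC* estimated from the half-data-set correlation CC1/2."""
    return math.sqrt(2 * cc_half / (1 + cc_half))


for form, s in TABLE1.items():
    print(form,
          round(multiplicity(s["total"], s["unique"]), 2),
          round(r_meas_from_r_merge(s["r_merge"], 2.0), 4),
          round(cc_star(s["cc_half"]), 3))
```

Under these assumptions the recomputed values agree with the tabulated multiplicity, Rmeas, and "CC" entries to within rounding, which is a quick way to confirm that the reconstructed table rows are aligned with the correct columns.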

FIG 2.


Comparison of the two new crystal forms of NS1-nuc and binding sites of solvent molecules and Mg2+. (A) Two views of a cartoon representation of form I (blue) and form II (green). Putative active site residues are shown in red, and predicted dsDNA binding sites in form I are shown in magenta. Arrows indicate regions disordered in form II (left), and a region (residues 24 to 30) of deviation in the main chain position between the two forms (right). (B) Left, form I 2Fo-Fc map (at 1σ) with residues 24 to 30 of the form I model (blue, red, yellow). The main chain trace of form II is shown in green. Right, form II 2Fo-Fc map (at 1σ) with residues 24 to 30 of the form II model (green, blue, red, yellow). The main chain trace of form I is shown in blue. (C) A citrate molecule modeled into the electron density map of form I is found at the interface between two copies of NS1-nuc (red/white and green). The 2Fo-Fc map in the refined model is shown at 1σ in gray, the positive (3σ) 1Fo-Fc map is shown in green, and the negative (−3σ) 1Fo-Fc map is shown in red. Dashed lines indicate hydrogen bonding distances (2.7 to 3.2 Å) between appropriate hydrogen bonding donating and accepting groups. (D) A 3σ peak is found in the form II 1Fo-Fc map (magenta) near five residues of the active site and is modeled as Mg2+ (yellow sphere). The 2Fo-Fc map is shown at 1σ in blue.

Comparison to other parvoviral Rep protein nuclease domains.

Form I NS1-nuc was solved by molecular replacement using the NS1-nuc domain found in the PDB accession code 6USM (24). After refinement, these two structures of NS1-nuc were found to be very similar, with a root mean square deviation (RMSD) of 0.8 Å over 164 residue alpha carbon (Cα) atoms (Fig. 3A), and the largest differences in structure were found in loop segments at the exterior of the protein (Fig. 3B, thick red ribbons). The loop segment containing residues 147 and 148 is not present in the 6USM coordinates but is well ordered in form I NS1-nuc and contains a cis-proline (Fig. 3C). In addition, residues at the C terminus of the domain are truncated to 171 in 6USM but extend to 175 in form I NS1-nuc. A single Zn2+ is located in the active site of 6USM in a nearly identical position to that of the Mg2+ ion modeled in form II NS1-nuc (Fig. 3D). Differences in the amino acid sequence of the NS1-nuc construct used in our studies and that of 6USM occur in 13 positions (Fig. 3E, red text). These originate from differences in biologically relevant sequences present in the NCBI database (6USM follows the sequence of a laboratory isolate, GenBank accession number AAG00943; form I and II NS1-nuc follow the sequence of an isolate from a blood bank, GenBank accession number ABN45789.1) and represent the two major variants present in the NCBI database (Fig. 3F and G). These substitutions are largely conservative in nature, and no large perturbations to the two structures are found at these positions.
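
To make the RMSD comparison concrete, the sketch below shows how an overall Cα RMSD and the per-residue deviations (the quantity mapped by color and thickness in Fig. 3B) could be computed once two structures have already been superimposed. It is a minimal illustration, not the software used in this study: the function names are hypothetical, the superposition step itself is not shown, and the three-atom coordinate lists are toy values chosen for easy arithmetic rather than coordinates from 7SZY or 6USM.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (Å) between two equal-length lists of (x, y, z) Cα positions.

    Assumes the two coordinate sets are already superimposed and that
    entry i in each list refers to the same residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def per_residue_deviation(coords_a, coords_b):
    """Distance (Å) between matched Cα atoms, one value per residue."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


# Toy, hypothetical coordinates (not taken from any PDB entry).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]

print(per_residue_deviation(model_a, model_b))
print(round(ca_rmsd(model_a, model_b), 3))
```

Because the RMSD averages squared distances, a few poorly matching loop residues can dominate the overall value, which is why the per-residue view is useful alongside the single summary number.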

FIG 3.


Comparison of NS1-nuc structures with sequence variations. (A) Orthogonal views of a superposition of the two structures (form I structure in slate blue and that from PDB accession code 6USM [24] in orange). Positions of active site residues are shown in red, and predicted dsDNA binding residues in magenta (form I only). Regions of greater deviation are identified with arrows. (B) Orthogonal views of form I NS1-nuc showing root mean square deviation (RMSD) of alpha carbon (Cα) atoms between NS1-nuc and 6USM by color (see legend in Å). Residue numbers in segments with highest RMSD are also identified. The active site residues (His81, His83, and Tyr141) are shown in spheres colored by Cα RMSD. (C) 2Fo-Fc (1σ) map of form I (in blue and red sticks) around the cis-proline at residue number 147. This region is not present in the model from 6USM (residues Ile146 and Asn149 shown in orange, blue, and red sticks). (D) Alignment of DNA nicking active sites from form II (blue) and 6USM (light red). The modeled Mg2+ in form II is shown in green, and the assigned Zn2+ of 6USM is shown in gray. (E) Alignment of amino acid sequence of 6USM and form I and II crystal structures of NS1-nuc. Differences are highlighted in red text. Secondary structural elements are shown above the sequences, γ indicates γ-turn, β indicates β-turn, and red hairpin indicates β-hairpin. Active site residues are shown with red boxes above the sequence, and predicted single- and double-stranded DNA binding residues are shown by a blue dot or a magenta line, respectively, above the sequences. (F) LOGO (62) representation of sequence variations of the 195 sequences of NS1-nuc (residues 1 to 176 of B19V NS1) from NCBI database. Significant variations are boxed in yellow. (G) Two views of form I NS1-nuc with positions of significant sequence variants shown in yellow, and active-site residues shown in red stick.

Figure 4 compares the structure of NS1-nuc (Fig. 4A to C, form I, blue) to homologous parvoviral structures present in the PDB originating from Adeno-associated virus 2 (AAV2; PDB accession code 5DCX [25]) (Fig. 4A, magenta), Minute virus of mice (MVM; PDB accession code 3WRN [26]) (Fig. 4B, yellow), and Human bocavirus (HBoV; PDB accession code 4KW3 [27]) (Fig. 4C, green). Ribbon diagrams of pairwise superpositions (using Cα atoms) are shown in Fig. 4A to C, with root mean square deviation values (using Cα atoms) mapped onto the structure of NS1-nuc (form I) in Fig. 4D to F (thicker, redder lines indicate greater RMSD; see legend in Å below each ribbon diagram). Most differences in structure occur in the positioning of loops and α-helices behind (as shown in the figure) the central β-sheet containing the active site histidine residues (Fig. 4A to C, marked in red). The structure of the AAV2 homolog contains an additional segment located between the N-terminal nuclease domain and the central helicase domain of AAV2 Rep (the homolog of B19V NS1), which due to sequence truncation is not present in the other domain structures (Fig. 4A, arrow). The overall RMSD values between NS1-nuc (form I) and the other three structures are very similar, with 3.2 Å, 3.2 Å, and 3.0 Å over 163 Cα atoms of B19V NS1-nuc and the nuclease domains of AAV2 Rep, MVM NS1, and HBoV NS1, respectively.

FIG 4.


Comparison of NS1-nuc with parvoviral homologues. (A) Comparison of form I NS1-nuc (blue) with the nuclease domain of adeno-associated virus (AAV2) Rep (magenta; PDB accession code 5DCX [25]). Positions of active-site residues in NS1-nuc (Tyr141, His81, and His83) shown in red. (B) As in panel A, but comparing NS1-nuc (blue) and the nuclease domain of minute virus of mice (MVM) NS1 (yellow; PDB accession code 3WRN [26]). (C) As in panel A, but comparing NS1-nuc (blue) and the nuclease domain of human bocavirus (HBoV) NS1 (green; PDB accession code 4KW3 [27]). (D) Ribbon diagram showing RMSD following superposition between Cα of NS1-nuc and AAV2 Rep-nuc. Ribbon color and thickness indicate local RMSD, with scale shown below, in Å. (E) As in panel D but with NS1-nuc and MVM NS1-nuc. (F) As in panel D but with NS1-nuc and HBoV NS1-nuc.

Prediction of nick site ssDNA recognition and cleavage by NS1-nuc.

A structure of a more distantly related viral replication enzyme (PDB accession code 6WE1 [28]) from Wheat dwarf virus (WDV) (29) provides the basis for a model of sequence-specific ssDNA recognition and nicking by NS1-nuc. The Cα atoms of the active site residues (His81, His83, and Tyr141 in form I NS1-nuc and His59, His61, and Phe106 in WDV Rep-nuc, which contains the active site mutation Y106F) were used to superimpose the two structures, and the position of the ssDNA bound to WDV Rep-nuc was identified (Fig. 5A, yellow). A feature of the WDV Rep-nuc structure, the “ssDNA bridging motif” (29) (Fig. 5A, orange), is directly involved in ssDNA binding, making numerous interactions with the bases of several nucleotides and bridging the 5′ and 3′ ends of the bound ssDNA. In NS1-nuc, an insert is found in this position (Fig. 5A, magenta), which is predicted to interact with dsDNA (discussed further below), but it also likely interacts with ssDNA. Figure 5B shows the electrostatic potential maps of NS1-nuc (mapped onto the protein surface on the left and as a field on the right), where blue indicates a high positive charge and red indicates high negative charge. The predicted ssDNA binding face (shown by the position of ssDNA taken from the alignment with the WDV Rep-nuc structure; magenta in left, yellow in right) shows a high degree of positive charge consistent with binding to negatively charged DNA.

FIG 5.


Nicking site recognition in ssDNA by WDV Rep-nuc and NS1-nuc. (A) Orthogonal views of form I NS1-nuc (slate blue) overlaid with a structure of wheat dwarf virus (WDV) Rep-nuc (teal; PDB accession code 6WE1 [28]) bound to ssDNA (yellow). Mg2+ from form II NS1-nuc is shown as a green sphere, and Mn2+ from WDV Rep-nuc is shown as a dark purple sphere. Active site residues of NS1-nuc and WDV Rep-nuc are shown in red and brown, respectively. The “ssDNA bridging” segment of WDV Rep-nuc is shown in orange, and residues 123 to 132 of NS1-nuc (a putative dsDNA binding element) are shown in magenta. (B) Left, electrostatic potential of form I NS1-nuc calculated using Adaptive Poisson-Boltzmann Solver (APBS) (60) in PyMol mapped onto the surface of form I NS1-nuc. ssDNA from PDB accession code 6WE1 (28) after superposition of NS1-nuc and WDV Rep-nuc is shown in magenta to mark the predicted ssDNA binding cleft. Red indicates a negative charge, and blue indicates a positive charge. Orientation of NS1-nuc as in left side of panel A. Right, electrostatic potential field of form I NS1-nuc calculated using APBS (60) in PyMol, with ssDNA from WDV Rep-nuc/ssDNA of PDB accession code 6WE1 (28) shown in yellow (charge ranges from −5 in red to +5 in blue). Orientation of NS1-nuc as in the left side of panel A. (C) As in panel A, but with residues implicated in ssDNA binding shown as spheres. Red box indicates the residue H9, which appears to be in a similar position as H91 in WDV Rep-nuc. (D) Orthogonal views of the nuclease domain of WDV Rep-nuc bound to nick site ssDNA (PDB accession code 6WE1 [28]) with residues within hydrogen bonding, van der Waals, or salt bridging distance shown as spheres. Red box indicates the residue H91, which appears to be in a similar position as H9 in B19V NS1-nuc.

The model of NS1-nuc bound to ssDNA created from the superposition with the WDV Rep-nuc structure allows for the prediction of residues likely to be involved in ssDNA recognition (Fig. 5C, light blue, red, and dark blue spheres). Only His9 of NS1-nuc (Fig. 5C, H9, red boxes) and His91 of WDV Rep-nuc (Fig. 5D, H91) appear to be conserved between the two structures. The model of NS1-nuc bound to nick site ssDNA also provides the opportunity to model atoms in the active site (Fig. 6A and B). Both WDV Rep-nuc and NS1-nuc are members of the HUH nuclease superfamily, whose members require a divalent cation for DNA cleavage activity (30). Prior work with NS1-nuc showed that Mg2+, Co2+, Ni2+, and Mn2+, but not Zn2+, Ca2+, or Cu2+, confer DNA cleavage activity. Both the Mg2+ (from form II NS1-nuc) and Zn2+ (from NS1-nuc in PDB accession code 6USM [24]) are bound in the same position in the active site, and therefore the difference in activity with these two ions may derive from different chemical properties, such as ligation geometries and the ability to polarize ligated atoms (31–33). In the current model of NS1-nuc bound to ssDNA, the Mg2+ (Fig. 6A, green sphere) is positioned near a nonesterified oxygen of the scissile phosphate (SP) (the bond to be cleaved in the nicking/nuclease reaction). The distance between the modeled phosphate oxygen and Mg2+ is 1.4 Å, somewhat closer than the typical ligation distance for Mg2+ to oxygen ligands (1.9 to 2.1 Å) (34), but small adjustments in the position of the bound DNA could easily bring this distance to a more optimal value. The Mg2+ is also within ligation distance of the side chains of the active site residues His81 (2.5 Å) and His83 (2.5 Å) (Fig. 6A), as well as that of Glu72 (2.3 Å) (the relatively longer ligation distances may be due to coordinate error in the relatively low resolution of the form II structure). WDV Rep-nuc also possesses a nearby glutamic acid residue in the active site that is capable of ligation to the Mg2+ (Fig. 6B, Glu110). Divalent cation-dependent nucleases catalyze DNA cleavage by any or all of the following mechanisms: (i) polarization of the nucleophile (in this case, the phenolic oxygen of Tyr141) to increase its nucleophilicity (this often occurs via ligation to the divalent cation and may result in deprotonation of the nucleophile), (ii) stabilization of the transition state after nucleophilic attack (often via divalent cation ligation to a nonesterified oxygen of the scissile phosphate), and (iii) stabilization of the leaving group (O3′) following bond breakage (often via direct ligation to the divalent cation or protonation from a divalent cation-ligated water molecule) (31, 35–37). In our NS1-nuc structure with modeled ssDNA, we found that the Mg2+ is positioned well to stabilize the transition state after nucleophilic attack via its predicted ligation to a nonesterified oxygen of the scissile phosphate (Fig. 6A). In the case of stabilization of the leaving group, the ssDNA-bound model does not predict a direct ligation of the O3′ leaving group to the Mg2+, but protonation by a Mg2+-ligated water molecule could be possible. The model also does not predict direct ligation of the nucleophile (the oxygen of the side chain of Tyr141) to Mg2+. However, divalent cation-dependent nucleases also accelerate DNA cleavage by organizing reactive groups in the active site into a geometry favorable for nucleophilic attack and bond breakage (31, 35–37). This geometry includes (i) positioning the nucleophile within van der Waals radii of the phosphorus atom (the atom to be attacked by the nucleophile) and (ii) arranging the three atoms of the bond-making and bond-breaking reaction (the attacking group, the phosphorus atom, and the leaving group) in an “in-line” configuration such that the angle between them is 180° (37). We find in our model that the Tyr141 hydroxyl oxygen atom (the nucleophile of the DNA nicking reaction) is 3.6 Å from the phosphorus atom of the phosphodiester to be cleaved (the sum of the estimated van der Waals radii of the two atoms is 3.3 Å: 1.5 Å for oxygen and 1.8 Å for phosphorus [38]), and the angle between the nucleophile, phosphorus atom, and leaving group (O3′ of the 5′ nucleotide) is 147°. Hence, the active site moieties are poised in an appropriate position for the catalytic reaction to occur. Finally, the terminal amine of the Lys145 side chain is within hydrogen bonding distance to the Tyr141 OH nucleophile (2.9 Å), suggesting a contribution to the catalytic reaction by acting as a general base to accept a proton from the nucleophile and/or to stabilize a negative charge on the nucleophile following proton loss.
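
The two geometric criteria discussed above (nucleophile-to-phosphorus distance relative to the van der Waals contact distance, and the in-line attack angle) can be evaluated from three atomic positions. The following Python sketch illustrates the arithmetic only; it is not the authors' analysis code. The van der Waals radii are those quoted in the text, whereas the function names and the coordinates are hypothetical: the three points are constructed to reproduce the reported 3.6 Å distance and 147° angle, with an assumed P–O3′ bond length of 1.6 Å, and are not taken from the model.

```python
import math

# van der Waals radii (Å) as quoted in the text.
VDW_RADIUS = {"O": 1.5, "P": 1.8}


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at the vertex b."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    dot = sum(u * v for u, v in zip(ba, bc))
    cos_angle = dot / (math.hypot(*ba) * math.hypot(*bc))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def attack_geometry(nucleophile, phosphorus, leaving_group):
    """Summarize how close a modeled active site is to in-line attack."""
    distance = math.dist(nucleophile, phosphorus)
    contact = VDW_RADIUS["O"] + VDW_RADIUS["P"]
    angle = angle_deg(nucleophile, phosphorus, leaving_group)
    return {
        "distance": distance,                    # nucleophile O to P
        "vdw_contact": contact,                  # sum of the two radii
        "excess_over_contact": distance - contact,
        "angle": angle,                          # O(nuc)-P-O3'
        "deviation_from_inline": 180.0 - angle,
    }


# Hypothetical coordinates (Å) built to match the reported values.
P = (0.0, 0.0, 0.0)
O3_PRIME = (1.6, 0.0, 0.0)  # assumed P-O3' bond length
theta = math.radians(147.0)
TYR_OH = (3.6 * math.cos(theta), 3.6 * math.sin(theta), 0.0)

for key, value in attack_geometry(TYR_OH, P, O3_PRIME).items():
    print(f"{key}: {value:.2f}")
```

With the reported values, the nucleophile sits about 0.3 Å beyond van der Waals contact and about 33° from a perfectly in-line arrangement, which is the sense in which the modeled active site is close to, but not exactly at, the ideal attack geometry.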

FIG 6.


Interactions with bound Mg2+ and citrate, and comparison to WDV Rep-nuc. (A) Form I NS1-nuc shown in cartoon, with active site residues in stick form. Mg2+ from form II is shown as a green sphere. ssDNA (shown in yellow, orange, red, and blue, with numbering relative to nick site) from WDV Rep-nuc/ssDNA (PDB accession code 6WE1 [28]) after superposition onto form I NS1-nuc using the Cα of the two His and Tyr active site residues. Distances between Mg2+ and atoms of His81, His83, Tyr141, and Glu72 are shown in Å. The distance between the Mg2+ ion and a nonesterified oxygen of the scissile phosphodiester of the modeled ssDNA is 1.4 Å. (B) As in panel A, with selected side chains of WDV Rep-nuc (PDB accession code 6WE1 [28]) after superposition onto NS1-nuc shown in teal stick form. (C) As in panel A, showing the location of a bound citrate molecule (cyan) near the active site and the backbone of modeled ssDNA between nucleotides +1 and −1. (D) As in panel A, but with a surface rendering of NS1-nuc and showing a second position of citrate bound near nucleotide −6 of the modeled ssDNA.

The ssDNA in the model also overlaps with the citrate molecules bound to NS1-nuc in the form I structure. In this form, each asymmetric unit contains one NS1-nuc and one bound citrate molecule. However, because the citrate molecule binds between two copies of NS1-nuc, it has two distinct binding sites in a single copy of NS1-nuc. One location is very near the active site (Fig. 6C). Two carboxylate groups of citrate bind near the phosphate positions of nucleotides +1 and −1 in the modeled ssDNA, consistent with the affinity for negatively charged moieties in these locations. The second citrate binding site is found near the nucleotide at the −6 position of the modeled ssDNA (Fig. 6D). The citrate molecule is bound in a pocket formed on the surface of NS1-nuc, ∼4 Å closer to NS1-nuc than to the −6 nucleotide, possibly predicting the true path of the bound ssDNA in this region and thus implicating Tyr130 and Thr97 (Fig. 2C) in ssDNA binding as well.

The structure of WDV Rep-nuc bound to nick site DNA (Fig. 7A) suggests that sequence-specific recognition occurs through a combination of direct readout, consisting of hydrogen bonds and van der Waals interactions to the chemically distinct portions of the DNA bases, as well as indirect readout, derived from the sequence-specific energetics of DNA structure and base stacking (29). Direct readout of the nick site DNA sequence is accomplished via hydrogen bonds between residues of WDV Rep-nuc and bases of nucleotides at the 2, −5, and −6 positions (see nucleotide numbering in Fig. 7B) (29). Indirect readout of the nick site DNA sequence is suggested by the distorted U-shape of the bound ssDNA, as well as the base pairs between Ade1 and Thy-4 (a Watson-Crick base pair) and Thy-1 and Ade-3 (a non-Watson-Crick base pair). To understand how NS1-nuc recognizes its nick site in ssDNA, we substituted the DNA sequence of the ssDNA in the NS1-nuc/ssDNA model derived from the superposition with the WDV Rep-nuc/ssDNA structure (Fig. 7C and D). First, in terms of direct readout, the low amino acid sequence conservation in the DNA binding site residues makes direct readout contacts between NS1-nuc and the bound ssDNA difficult to predict, with the possible exception of the 2 position of the nick site DNA (which is Cyt in both viral nick sites; see Fig. 7B and D). WDV Rep-nuc recognizes this base with hydrogen bonds from protein backbone atoms to the base-pairing atoms of Cyt2 (Fig. 7E). In the model of NS1-nuc bound to nick site DNA, hydrogen bonds between the side chain of Asp133 and the O2 and N3 of Cyt2 would be possible with small adjustments (distances are 2.0 Å and 3.0 Å from the Asp133 carboxylate atoms to the O2 and N3 Cyt2 atoms, respectively; see Fig. 7F), predicting a role for Asp133 in sequence-specific recognition at the 2 position of the nick site DNA. In addition, Arg5 approaches Cyt2 from behind the base, and Phe131 may form a π-hydrogen bond (39) to the NH2 at the 4 position of Cyt2 in this model, suggesting that these residues may also be important in DNA nick site sequence discrimination (Fig. 7F). In terms of indirect readout, the Watson-Crick base pairing between nucleotides at the 1 and −4 positions (Ade1 and Thy-4 in the WDV sequence; see Fig. 7G) may be conserved, as these nucleotides are Cyt1 and Gua-4 in the B19V nick site sequence (see model in Fig. 7H). The non-Watson-Crick base pair between Thy-1 and Ade-3 in the WDV structure occurs with a single hydrogen bond between the N3 of Thy-1 and N3 of Ade-3 (Fig. 7I). In the B19V nick site sequence, these bases are Ade-1 and Ade-3, and the structural model predicts that a single hydrogen bond is possible (after some adjustment due to the larger size of the Ade base at the −1 position) between the N6 of Ade-1 and N3 of Ade-3 (Fig. 7J, green and dark blue sticks). Finally, in both models, a pyrimidine base is found at the turn of the bound DNA (Fig. 7A and B, Thy-2 in WDV; Fig. 7C and D, Cyt-2 in B19V), which may be an important factor in indirect readout of the DNA sequence due to the formation of stacking interactions with the nucleotide at position −3 (an Ade in both cases).

FIG 7.

Nick site DNA binding by WDV Rep-nuc and NS1-nuc. (A) Wheat dwarf virus (WDV) Rep-nuc bound to ssDNA containing the nick site (PDB accession code 6WE1 [28]). (B) Sequence of DNA in the WDV origin of replication around the nick site. Dashed lines indicate hydrogen bonding between bases. Stacking of bases is indicated by vertical alignment of bases. (C) Model of NS1-nuc bound to nick (trs) site DNA based on the WDV Rep-nuc/DNA structure shown in panel A. (D) B19V nick site DNA sequence and predicted interactions between bases. (E) Interactions between Cyt2 and WDV Rep-nuc (PDB accession code 6WE1 [28]). (F) Model for interactions between Cyt2 and NS1-nuc. (G) Watson-Crick base pair between Ade1 and Thy-4 in DNA bound by WDV Rep-nuc. Active-site residues are shown as red and blue sticks, and bound Mn2+ is shown as a purple sphere. (H) Predicted Watson-Crick base pair formed by Cyt1 and Gua-4 in the model of B19V nick (trs) site DNA bound to NS1-nuc. (I) Non-Watson-Crick base pair in WDV Rep-nuc bound DNA between Thy-1 and Ade-3. (J) B19V NS1 nick (trs) site contains Ade at −1 and Ade at −3. Green shows interactions between −1 and −3 predicted in the model of NS1-nuc bound to nick site ssDNA, and yellow shows the position of the Thy-1 in the structure with WDV Rep-nuc bound to DNA (coordinates for Ade-3 of the WDV structure overlap the B19V model). The dashed line indicates close approach of N6 of Ade-1 and N3 of Ade-3 (2.3 Å). Adjustments in positioning of bases would be necessary to bring this to a reasonable distance, such as 2.8 Å.

A study of sequence preferences at the B19V nick site was performed with the NS1-nuc domain (23), which found the greatest preferences at nucleotides Cyt2, Cyt1, and Ade-1, with Cyt2 being most important, followed by Cyt1 and Ade-1. The nucleotides at these positions were also found to be the most important for nicking by WDV Rep-nuc using a sequence specificity selection approach (HUH-seq) (29), and in the same order of importance. In the case of B19V, exhaustive substitutions of the nick site sequence were not tested; instead, only mutation to the Watson-Crick base pairing partner was examined for cleavage by NS1-nuc. However, the large decrease in the DNA cleavage activity of NS1-nuc found as a result of changing Cyt2 to a Gua (23) may be explained by the predicted tight binding pocket surrounding Cyt2 (Fig. 7F), since a Gua base would be too large to fit into this pocket and could not form the same interactions with Asp133 and Phe131. The next most significant decrease in cleavage activity occurred with the substitution of Cyt1 to Gua. Since the model predicts a base pairing interaction with Gua-4 (Fig. 7H), it is clear that substitution of the Cyt with a Gua would disrupt this pairing. Strangely, substitution of Gua-4 with Cyt resulted in an increase in DNA cleavage by NS1-nuc. The Cyt-Cyt interaction (between Cyt1 and Cyt-4), while not predicted to be favorable, may result in less disruption to the structure of the bound DNA due to the relatively small size of the Cyt bases (compared to two Gua bases as in Gua1–Gua-4). Finally, substitution of Ade-1 with Thy diminished cleavage by approximately 50% (23). This substitution would disrupt the predicted hydrogen bonding with Ade-3, between the N6 of Ade-1 and the N3 of Ade-3 (Fig. 7J, green and blue sticks). A Thy at position −1 would not offer a hydrogen bond donor to take the place of the N6 of Ade-1, but, interestingly, it is the sequence seen in the WDV structure (Fig. 7J, Thy-1, yellow). 
The fact that substitution of Ade-1 with Thy is detrimental to DNA cleavage by NS1-nuc suggests a different configuration of these bases in the two structures, which would be necessary to accommodate the wild-type Ade-1–Ade-3 interaction predicted in the bound B19V ssDNA. Conversely, substitution of Ade-3 to Thy had little effect, possibly due to the availability of the O2 of a Thy to take the place of the N3 of Ade-3 in the predicted hydrogen bonding interaction (Fig. 7J). It may also be that substitutions near the nicking site (in −1, 1, and 2 positions) are more sensitive due to the requirement for precise positioning of atoms in the active site in order to achieve optimal cleavage activity.

Prediction of dsDNA binding by NS1-nuc.

In addition to binding to ssDNA, B19V NS1 must also bind to target sequences in dsDNA known as NS1 binding elements (NSBE). These sequences are located near the nick site (i.e., the trs) (Fig. 1B). It is the N-terminal nuclease domain of NS1 (i.e., NS1-nuc) which is also responsible for recognition of these sequences (23). The structure of the nuclease domain from AAV5 Rep (AAV5 Rep-nuc) bound to dsDNA containing Rep binding element (RBE) repeats (PDB accession code 1RZ9 [40]), which are sequences analogous to the NSBE of B19V, provides a framework for modeling the interactions of NS1-nuc with the NSBE sequences in dsDNA. Figure 8A shows a map of the AAV5 Rep-nuc domains bound to the RBE sites in dsDNA in this crystal structure. Five copies of AAV5 Rep-nuc (Fig. 8A, colored ovals) bind the five four-base (imperfect) RBE repeats (Fig. 8A, boxed base pairs), and follow the helical twist of the DNA, since each copy of AAV5 Rep-nuc interacts with bases in both the major and minor grooves in an equivalent manner (Fig. 8B). Interactions with the DNA by each copy of AAV5 Rep-nuc are not confined to a single quartet, but instead overlap. For example, one copy (Fig. 8A, yellow) interacts with base pairs of the first RBE quartet (Fig. 8A, boxed sequences, counting from right to left) in the major groove, as well as base pairs in the minor groove in the second RBE quartet. The next AAV5 Rep-nuc copy (Fig. 8A, green) interacts with the second RBE quartet bases via the major groove, and with third quartet bases via the minor groove. Each AAV5 Rep-nuc copy therefore interacts with two quartets, and each quartet interacts with two copies of AAV5 Rep-nuc. Two segments of AAV5 Rep-nuc interact with dsDNA, and these same segments are largely conserved in the structure (but not sequence) of NS1-nuc (Fig. 8C). AAV5 Rep-nuc interacts with the minor groove of dsDNA via residues 101 to 111 (corresponding to 93 to 103 in NS1-nuc; see Fig. 8D) and with the major groove using residues 135 to 142 (corresponding to 124 to 130 in NS1-nuc; see Fig. 8D).

FIG 8.

Predicted recognition of NSBE-containing dsDNA by NS1-nuc. (A) AAV5 viral origin of replication sequence with RBE (boxed base pairs), trs (a.k.a. nicking site; arrow), and positions of each AAV5 Rep-nuc (from PDB accession code 1RZ9 [40]). The position of each copy of AAV5 Rep-nuc is colored differently and corresponds to AAV5 Rep-nuc domain colors from the crystal structure shown in panel B. (B) Ribbon drawing of five copies of the AAV5 Rep-nuc bound to RBE sequences in the AAV5 origin of replication (from PDB accession code 1RZ9 [40]). DNA is shown in cartoon form and colored in magenta. (C) Overlay of NS1-nuc (blue) on AAV5 Rep-nuc (yellow) bound to RBE-containing dsDNA (white). Putative DNA binding residues of NS1-nuc are shown in magenta, and active-site residues are highlighted in red. (D) Residues of the segments of AAV5 Rep-nuc closely approaching or interacting with the dsDNA, and corresponding residues of NS1-nuc after least-squares alignment of the two protein structures. Residues shown in red indicate sequence-specific contacts from AAV5 Rep-nuc to the DNA (PDB accession code 1RZ9 [40]). (E) B19V origin of replication sequences with NSBE (four leftmost boxes) and nicking site/trs shown with arrow (boxed nucleotides around the trs indicate those in the model shown in panel F). (F) Comparison of models of dsDNA (shown in white) and ssDNA (containing the trs sequence, shown in green) bound to NS1-nuc showing some overlap in binding sites. The active site residues His81, His83, and Tyr141 are shown as dark blue spheres, and the predicted dsDNA binding residues are shown in dark purple. The location of the ssDNA binding site on NS1-nuc occurs at the interface between adjacent copies of NS1-nuc in the dsDNA binding model. Note that the nick site (trs) is located ∼11 nucleotides to the right (as oriented in the figure) of the leftmost NSBE for NS1-nuc (the putative dsDNA binding sites, leftmost four boxed sequences in panel E).
The actual configuration of the Rep and NS1 nuclease domains when bound to both dsDNA sites and the trs in ssDNA is currently unknown.

Although identification of the likely dsDNA binding residues in NS1-nuc is relatively straightforward, prediction of exactly where NS1-nuc binds to the B19V DNA sequences is not. Figure 8E shows the NSBE and trs sequences of the B19V origin of replication. Rather than five quartet repeats as in AAV5, the sequence appears to have four octet repeats separated by 2-bp spacings. The octet sequences are very GC rich and were identified as NS1 binding sites in a prior study (41). Additional DNA binding studies showed that as many as 5 to 7 copies of NS1-nuc can bind to dsDNA containing all four NSBE sequences (23). However, the differences in repeat size, spacing, and DNA and protein sequences preclude exact positioning and modeling of the protein-DNA interactions between NS1-nuc and the DNA. NS1-nuc may bind the NSBE sequences in a manner similar to that of AAV5 Rep-nuc, using overlapping quartets rather than four separate octets, but this has yet to be shown. Prior binding investigations indicated that the three octets closest to the trs are most important for NS1-nuc binding (23, 42). These sequences (three octets plus the two 2-bp spacers between them) total 28 bp, or 7 quartets, consistent with as many as 7 copies of NS1-nuc binding in the AAV5 Rep-nuc pattern. However, more information on NS1 interactions with dsDNA containing the NSBE sequences will be necessary to assign the exact positioning and amino acid-nucleotide interactions between NS1-nuc and the DNA sequence.
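The repeat arithmetic in the preceding paragraph can be checked with a short sketch. It uses only the repeat sizes stated in the text (8-bp octets, 2-bp spacings, 4-bp quartets); the function names are illustrative and not part of any published analysis.

```python
# Footprint arithmetic for the NSBE region, using values stated in the text.
OCTET_BP = 8     # each NSBE repeat is an octet
SPACER_BP = 2    # octets are separated by 2-bp spacings
QUARTET_BP = 4   # size of an AAV5 RBE repeat


def span_bp(n_repeats: int, repeat_bp: int = OCTET_BP, spacer_bp: int = SPACER_BP) -> int:
    """Length in bp of n contiguous repeats plus the spacers between them."""
    if n_repeats < 1:
        raise ValueError("need at least one repeat")
    return n_repeats * repeat_bp + (n_repeats - 1) * spacer_bp


def quartets_in(span: int, quartet_bp: int = QUARTET_BP) -> int:
    """Number of whole quartet-sized windows that tile the span end to end."""
    return span // quartet_bp


three_octets = span_bp(3)
print(three_octets, quartets_in(three_octets))  # 28 7
```

The same function gives 38 bp for all four octets, which is not a whole number of quartets, one illustration of why the quartet-based AAV5 pattern cannot simply be transferred to the full B19V NSBE region.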

Prior binding studies indicated that NS1-nuc binds DNA cooperatively, as evidenced by the shape of binding isotherms (23). Cooperativity indicates that the binding of DNA by one copy of NS1-nuc increases the affinity of subsequent copies of NS1-nuc to the same DNA. This effect can occur when favorable protein-protein interactions occur between copies of NS1-nuc on the DNA and/or when distortions made to the DNA upon binding of one copy facilitate the binding of subsequent copies by negating the requirement for the DNA distortion (which costs energy) by those subsequent protein copies. Cooperative DNA binding by AAV5 Rep-nuc is also suggested by the shape of the binding isotherm shown in Fig. 4B (lowest panel) of Hickman et al. (43). However, no protein-protein interactions are found between copies of AAV5 Rep-nuc bound to dsDNA in the crystal structure, and similarly, no protein-protein interactions are predicted when the NS1-nuc structure is superimposed onto each of the five copies of AAV5 Rep-nuc bound to dsDNA (the closest approach between different copies of NS1-nuc is 8 Å). It is possible that residues beyond the C terminus of AAV5 Rep-nuc or NS1-nuc extend enough to allow contacts between neighboring copies of AAV5 Rep-nuc or NS1-nuc when bound to DNA. Indeed, when a structure of AAV2 Rep-nuc containing 13 additional residues at its C terminus (PDB accession code 5DCX) (25, 44) is superimposed onto each of the five AAV5 Rep-nuc copies of the AAV5 Rep-nuc/dsDNA structure, the additional residues extend far enough to form interactions with neighboring AAV5 Rep-nuc copies. However, these residues were not present in the prior binding studies which showed strongly cooperative DNA binding by both AAV5 Rep and B19V NS1 nuclease domains (23, 43), suggesting the existence of some other mechanism of cooperativity beyond protein-protein interactions.
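To illustrate what is meant by the shape of a cooperative binding isotherm, the sketch below compares a noncooperative curve with a positively cooperative one using the Hill equation, a common phenomenological description. The choice of the Hill model and all parameter values are assumptions made for illustration only; they are not the fitting model or the fitted values of the cited binding studies.

```python
def hill_fraction_bound(protein_conc: float, k_half: float, n_hill: float) -> float:
    """Fraction of DNA bound under the Hill model.

    protein_conc and k_half share the same (arbitrary) concentration unit;
    n_hill > 1 gives a sigmoidal, positively cooperative isotherm.
    """
    if protein_conc < 0 or k_half <= 0 or n_hill <= 0:
        raise ValueError("protein_conc must be >= 0; k_half and n_hill must be > 0")
    x = (protein_conc / k_half) ** n_hill
    return x / (1.0 + x)


# Hypothetical parameters chosen only to show the difference in curve shape.
K_HALF = 1.0
for conc in (0.25, 0.5, 1.0, 2.0, 4.0):
    non_coop = hill_fraction_bound(conc, K_HALF, n_hill=1.0)
    coop = hill_fraction_bound(conc, K_HALF, n_hill=3.0)
    print(f"{conc:>5.2f}  n=1: {non_coop:.3f}  n=3: {coop:.3f}")
```

Both curves pass through half saturation at the same concentration, but the cooperative curve lies below the noncooperative one at low protein concentration and above it at high concentration, which is the sigmoidal signature referred to above.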

The other likely origin of cooperative DNA binding involves distortions in the bound DNA, where binding of each copy of a DNA binding protein distorts the DNA in a way which results in an increase in binding affinity of subsequent copies. The DNA conformation in the AAV5 Rep-nuc/dsDNA structure indeed shows distortions from ideal B-form DNA that follow the pattern of the quartet repeat (43). Since binding sites of NS1-nuc to NSBE likely overlap as they do in RBE binding by AAV5 Rep-nuc, distortions induced by one copy of NS1-nuc could facilitate binding of neighboring copies by presenting DNA predistorted in a manner which complements the protein-DNA binding interface and eliminates the cost of DNA distortion to the binding of that subsequent copy to the DNA. Hence, we predict based on this analysis that DNA distortions play a large role in the cooperativity seen in NSBE binding by NS1-nuc.

Finally, we show the relative positions of the two types of DNA binding interfaces in Fig. 8F. Regions of the red copy of NS1-nuc which are predicted to interact with dsDNA are shown in dark purple, and residues of the DNA nicking active site are shown in dark blue. Although the ssDNA (green) and dsDNA (white) overlap minimally, it appears unlikely that NS1-nuc could bind to both simultaneously. The ssDNA interface on one copy of NS1-nuc is also located between adjacent copies of NS1-nuc on the dsDNA (modeled based on binding of AAV5 Rep-nuc to dsDNA). Note that the trs is located distal to the dsDNA binding sites (Fig. 8E). In Fig. 8F, with the NS1-nuc domains bound to the NSBE sequences, the trs would be ∼1 turn of the DNA (if B-form) to the right of the red copy of NS1-nuc. Since NS1-nuc binds and cleaves the trs only when single stranded, some mechanism of strand opening, presumably following NS1 binding to the NSBE, must occur prior to trs cleavage. The current best model for NS1 action at the B19V origin of replication derives from studies of AAV2 Rep (45). A recent structural investigation showed that Rep68, which contains an SF3 helicase and an ATPase domain C-terminal to the nuclease domain (corresponding to residues 210 to 481 in B19V NS1), forms rings which encircle bound ssDNA and dsDNA. The authors propose a model (45) similar to one proposed previously (43), in which the nuclease domains bring multiple copies of Rep to the origin via interaction with the RBE sequences, leading to formation of a heptameric ring structure that induces strand separation (possibly concurrent with loss of one Rep to form a hexameric ring). The strand separation then allows the trs to be cleaved by an available nuclease domain. Further studies will be required to confirm such a mechanistic model for B19V NS1.
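The statement that the trs lies roughly one helical turn from the NSBE can be put in numbers with a small sketch. The ∼11-nucleotide separation is taken from the legend to Fig. 8; the helical repeat (10.5 bp per turn) and axial rise (3.4 Å per bp) are nominal textbook values for B-form DNA and are assumptions here, as is the function name.

```python
BP_PER_TURN = 10.5    # nominal B-form helical repeat (assumed textbook value)
RISE_PER_BP_A = 3.4   # nominal B-form axial rise per bp in angstroms (assumed)


def helical_offset(n_bp: float) -> tuple[float, float, float]:
    """Return (turns, rotation past whole turns in degrees, axial distance in angstroms)."""
    turns = n_bp / BP_PER_TURN
    rotation_deg = (turns * 360.0) % 360.0
    distance_a = n_bp * RISE_PER_BP_A
    return turns, rotation_deg, distance_a


turns, rotation_deg, distance_a = helical_offset(11)
print(f"{turns:.2f} turns, {rotation_deg:.0f} deg past a full turn, "
      f"{distance_a:.1f} A along the axis")
```

Under these assumptions an 11-bp separation is just over one turn, so the trs would sit on nearly the same face of an ideal B-form helix as the reference position, displaced by a few tens of angstroms along the axis. Any distortion of the DNA by bound NS1-nuc would change these values.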

MATERIALS AND METHODS

Protein expression and purification.

Purification of form I NS1-nuc (residues 2 to 176 of B19V NS1) free of purification tags was performed as previously described using an N-terminal 6×His- and maltose binding protein (MBP)-tagged fusion protein with a tobacco etch virus (TEV) protease cleavage site between the tags and the NS1-nuc sequences (23). Following cell lysis using an Avestin Emulsiflex-C3 instrument, cell debris was pelleted and the cell-free lysate incubated with pre-equilibrated Talon resin (Clontech, Inc.). The partially pure eluted protein was then incubated with a 1:1 molar ratio of TEV protease (46) overnight at 4°C. NS1-nuc free of MBP and His tags was then further purified with DEAE and heparin fast protein liquid chromatography (FPLC; GE, Inc.). A longer construct of NS1-nuc (residues 2 to 209 of B19V NS1) was used in the form II crystals and was prepared with an N-terminal 6×His tag and TEV cleavage site and expressed in Tuner (DE3) cells overnight at 17°C following induction. Cells were lysed using an Avestin Emulsiflex-C3, then centrifuged to pellet cell debris. Purification proceeded with Talon resin (Clontech, Inc.) chromatography followed by DEAE FPLC (GE). Purified protein was dialyzed into 0.1 M bis-Tris-propane (pH 9.5), 150 mM NaCl, 1 mM 2-mercaptoethanol, and 50% glycerol, aliquoted, flash frozen in liquid nitrogen, then stored at −80°C until needed.

Crystallization, X-ray diffraction data collection, structure solution, and refinement.

Crystallization proceeded using the hanging drop vapor diffusion method. NS1-nuc protein (form I is free of the MBP tag, while form II retains the His tag) was dialyzed extensively against 0.1 M bis-Tris-propane (pH 9.5), 150 mM NaCl, and 1 mM 2-mercaptoethanol and concentrated to 5 to 10 mg/mL. Form I crystals appeared after several weeks at 17°C in the crystallization solution containing 15% polyethylene glycol (PEG) 3350, 0.1 M sodium citrate (pH 4.5), 0.1 M NaCl, 0.1 M LiCl, and a 7:1 molar ratio of NS1-nuc to NSBE-containing DNA (5′-TCGCCGCCGGTAGGCGGGACTT, 5′-AAGTCCCGCCTACCGGCGGCGA). For data collection, these crystals were exchanged into cryoprotectant (15% PEG 3350, 0.1 M sodium citrate [pH 4.5], 0.1 M NaCl, 0.1 M LiCl, and 30% glycerol) prior to flash freezing in liquid nitrogen. Form II crystals appeared with the crystallization solution consisting of 2.5 M NaCl, 0.1 M Tris-HCl (pH 7.0), and 200 mM MgCl2 after several weeks at 4°C. For X-ray diffraction data collection, form II crystals were harvested and exchanged into 2.5 M NaCl, 0.1 M Tris-HCl (pH 7.0), and 30% glycerol and then flash frozen and stored in liquid nitrogen. X-ray diffraction data collection was performed at SSRL BL 9-2 at 100 K using Blu-Ice software (47). Data processing, including integration, scaling, and merging, was performed with iMOSFLM (48) and SCALA (49, 50). Structure solution of form I NS1-nuc was performed using molecular replacement in PHASER (51) within the PHENIX software suite (52) and by searching for one copy of the nuclease domain using coordinates from PDB accession code 6USM (24). Model building and refinement proceeded iteratively, using COOT (53, 54) for building and PHENIX (52, 55–57) for refinement. Solution of form II proceeded using the same procedure, but with the form I structural coordinates as the search model.
Refinement of form II made use of form I as a reference structure, as well as secondary structure geometry restraints in the PHENIX software suite (52). RMSD calculations were performed with UCSF Chimera (58), PyMol (Schrodinger), and DALI (59). Images of structural models and electron density were prepared with UCSF Chimera (58) and PyMol (Schrodinger). Electrostatic calculations were performed with the software APBS (60) in PyMol (Schrodinger). Topology diagrams were made with PDBsum (61).
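As a consistency check on the NSBE-containing DNA listed in the crystallization conditions, the following sketch confirms that the two oligonucleotides are exact reverse complements and would therefore anneal into a blunt-ended 22-bp duplex. The sequences are copied from the text above; the helper function is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3' (A, C, G, T only)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("unexpected character in DNA sequence")
    return seq.translate(COMPLEMENT)[::-1]


# The two oligonucleotides listed in the form I crystallization conditions.
top = "TCGCCGCCGGTAGGCGGGACTT"
bottom = "AAGTCCCGCCTACCGGCGGCGA"

print(len(top), reverse_complement(top) == bottom)  # 22 True
```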

Sequence variation analysis of NS1-nuc.

Sequences (195 total) of human parvovirus B19 NS1 (residues 1 to 176) were extracted from NCBI using BLASTp and the sequence of form I NS1-nuc as the search sequence. WebLOGO (62) was used to create a figure to display amino acid sequence variations.
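For readers unfamiliar with sequence logos, the sketch below shows the per-position quantity that a logo typically displays: the information content of an alignment column, computed as the maximum entropy for the alphabet minus the observed Shannon entropy. This is a simplified illustration that omits the small-sample correction logo programs may apply, and the toy alignment is hypothetical; it is not the set of 195 NS1 sequences analyzed here.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column, without
    small-sample correction: log2(alphabet_size) minus Shannon entropy."""
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy


# Hypothetical toy alignment (NOT the NS1 sequences used in the study).
toy_alignment = ["MELF", "MELF", "MDLF", "MELY"]

for position, residues in enumerate(zip(*toy_alignment), start=1):
    column = "".join(residues)
    print(position, column, f"{column_information(column):.2f} bits")
```

Fully conserved columns reach the maximum of log2(20), about 4.32 bits for amino acids, and the value drops as a position becomes more variable.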

Data availability.

Coordinates and structure factor amplitudes for form I and II NS1-nuc structures have been deposited in the PDB under accession codes 7SZY and 7SZX.

ACKNOWLEDGMENTS

Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health via grant T32GM008659 (to J.L.S.), and by the University of Arizona Technology Research Initiative Fund (TRIF). Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515. The SSRL Structural Molecular Biology Program is supported by the DOE Office of Biological and Environmental Research, and by the National Institutes of Health, National Institute of General Medical Sciences (P30GM133894). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of NIGMS or NIH.

We declare that we have no conflicts of interest with the contents of this article.

Contributor Information

Nancy C. Horton, Email: nhorton@u.arizona.edu.

Lori Frappier, University of Toronto.

REFERENCES

  • 1.Young NS, Brown KE. 2004. Parvovirus B19. N Engl J Med 350:586–597. 10.1056/NEJMra030840. [DOI] [PubMed] [Google Scholar]
  • 2.Cossart YE, Field AM, Cant B, Widdows D. 1975. Parvovirus-like particles in human sera. Lancet 1:72–73. 10.1016/S0140-6736(75)91074-0. [DOI] [PubMed] [Google Scholar]
  • 3.Pattison JR, Jones SE, Hodgson J, Davis LR, White JM, Stroud CE, Murtaza L. 1981. Parvovirus infections and hypoplastic crisis in sickle-cell anaemia. Lancet 1:664–665. 10.1016/S0140-6736(81)91579-8. [DOI] [PubMed] [Google Scholar]
  • 4.Anderson MJ, Jones SE, Fisher-Hoch SP, Lewis E, Hall SM, Bartlett CL, Cohen BJ, Mortimer PP, Pereira MS. 1983. Human parvovirus, the cause of erythema infectiosum (fifth disease)? Lancet 1:1378. 10.1016/S0140-6736(83)92152-9. [DOI] [PubMed] [Google Scholar]
  • 5.Tsay GJ, Zouali M. 2006. Unscrambling the role of human parvovirus B19 signaling in systemic autoimmunity. Biochem Pharmacol 72:1453–1459. 10.1016/j.bcp.2006.04.023. [DOI] [PubMed] [Google Scholar]
  • 6.Colmegna I, Alberts-Grill N. 2009. Parvovirus B19: its role in chronic arthritis. Rheum Dis Clin North Am 35:95–110. 10.1016/j.rdc.2009.03.004. [DOI] [PubMed] [Google Scholar]
  • 7.Franssila R, Hedman K. 2006. Infection and musculoskeletal conditions: viral causes of arthritis. Best Pract Res Clin Rheumatol 20:1139–1157. 10.1016/j.berh.2006.08.007. [DOI] [PubMed] [Google Scholar]
  • 8.Kerr JR. 2016. The role of parvovirus B19 in the pathogenesis of autoimmunity and autoimmune disease. J Clin Pathol 69:279–291. 10.1136/jclinpath-2015-203455. [DOI] [PubMed] [Google Scholar]
  • 9.Tattersall P, Ward DC. 1976. Rolling hairpin model for replication of parvovirus and linear chromosomal DNA. Nature 263:106–109. 10.1038/263106a0. [DOI] [PubMed] [Google Scholar]
  • 10.Cotmore SF, McKie VC, Anderson LJ, Astell CR, Tattersall P. 1986. Identification of the major structural and nonstructural proteins encoded by human parvovirus B19 and mapping of their genes by procaryotic expression of isolated genomic fragments. J Virol 60:548–557. 10.1128/JVI.60.2.548-557.1986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Ozawa K, Ayub J, Hao YS, Kurtzman G, Shimada T, Young N. 1987. Novel transcription map for the B19 (human) pathogenic parvovirus. J Virol 61:2395–2406. 10.1128/JVI.61.8.2395-2406.1987. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.St Amand J, Beard C, Humphries K, Astell CR. 1991. Analysis of splice junctions and in vitro and in vivo translation potential of the small, abundant B19 parvovirus RNAs. Virology 183:133–142. 10.1016/0042-6822(91)90126-V. [DOI] [PubMed] [Google Scholar]
  • 13.Zhi N, Mills IP, Lu J, Wong S, Filippone C, Brown KE. 2006. Molecular and functional analyses of a human parvovirus B19 infectious clone demonstrates essential roles for NS1, VP1, and the 11-kilodalton protein in virus replication and infectivity. J Virol 80:5941–5950. 10.1128/JVI.02430-05. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Ozawa K, Kurtzman G, Young N. 1986. Replication of the B19 parvovirus in human bone marrow cell cultures. Science 233:883–886. 10.1126/science.3738514. [DOI] [PubMed] [Google Scholar]
  • 15.Berns KI. 1990. Parvovirus replication. Microbiol Rev 54:316–329. 10.1128/mr.54.3.316-329.1990. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Xu P, Zhou Z, Xiong M, Zou W, Deng X, Ganaie SS, Kleiboeker S, Peng J, Liu K, Wang S, Ye SQ, Qiu J. 2017. Parvovirus B19 NS1 protein induces cell cycle arrest at G2-phase by activating the ATR-CDC25C-CDK1 pathway. PLoS Pathog 13:e1006266. 10.1371/journal.ppat.1006266. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Nakashima A, Morita E, Saito S, Sugamura K. 2004. Human parvovirus B19 nonstructural protein transactivates the p21/WAF1 through Sp1. Virology 329:493–504. 10.1016/j.virol.2004.09.008. [DOI] [PubMed] [Google Scholar]
  • 18.Moffatt S, Tanaka N, Tada K, Nose M, Nakamura M, Muraoka O, Hirano T, Sugamura K. 1996. A cytotoxic nonstructural protein, NS1, of human parvovirus B19 induces activation of interleukin-6 gene expression. J Virol 70:8485–8491. 10.1128/JVI.70.12.8485-8491.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Duechting A, Tschöpe C, Kaiser H, Lamkemeyer T, Tanaka N, Aberle S, Lang F, Torresi J, Kandolf R, Bock C-T. 2008. Human parvovirus B19 NS1 protein modulates inflammatory signaling by activation of STAT3/PIAS3 in human endothelial cells. J Virol 82:7942–7952. 10.1128/JVI.00891-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Wan Z, Zhi N, Wong S, Keyvanfar K, Liu D, Raghavachari N, Munson PJ, Su S, Malide D, Kajigaya S, Young NS. 2010. Human parvovirus B19 causes cell cycle arrest of human erythroid progenitors via deregulation of the E2F family of transcription factors. J Clin Invest 120:3530–3544. 10.1172/JCI41805. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Fu Y, Ishii KK, Munakata Y, Saitoh T, Kaku M, Sasaki T. 2002. Regulation of tumor necrosis factor alpha promoter by human parvovirus B19 NS1 through activation of AP-1 and AP-2. J Virol 76:5395–5403. 10.1128/jvi.76.11.5395-5403.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Janovitz T, Wong S, Young NS, Oliveira T, Falck-Pedersen E. 2017. Parvovirus B19 integration into human CD36+ erythroid progenitor cells. Virology 511:40–48. 10.1016/j.virol.2017.08.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Sanchez JL, Romero Z, Quinones A, Torgeson KR, Horton NC. 2016. DNA binding and cleavage by the human parvovirus B19 NS1 nuclease domain. Biochemistry 55:6577–6593. 10.1021/acs.biochem.6b00534. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Tewary SK. 2020. Data from “Structure of nuclease domain of human parvovirus B19 non-structural protein 1 in complex with zinc.” Protein Data Bank 10.2210/pdb6USM/pdb. [DOI] [Google Scholar]
  • 25.Musayev FN, Zarate-Perez F. 2015. Data from “Structural studies of AAV2 Rep68 reveal a partially structured linker and compact domain conformation.” Protein Data Bank 10.2210/pdb5DCX/pdb. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Tewary SK, Zhao H, Tang L. 2014. Data from “Minute virus of mice non-structural protein-1N-terminal nuclease domain reveals a unique Zn2+ coordination in the active site pocket and shows a novel mode of DNA recognition at the origin of replication.” Protein Data Bank 10.2210/pdb3WRR/pdb. [DOI] [Google Scholar]
  • 27.Tewary SK, Zhao H, Tang L. 2013. Data from “Crystal structure of the non-structural protein 1 N-terminal origin-recognition/nickase domain from the emerging human bocavirus.” Protein Data Bank 10.2210/pdb4KW3/pdb. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Tompkins K, Litzau LA, Pornschloegl L, Nelson AT, Evans RL, III, Gordon WR. 2020. Data from “Wheat dwarf virus Rep domain complexed with a single-stranded DNA 8-mer comprising the cleavage site.” Protein Data Bank 10.2210/pdb6WE1/pdb. [DOI] [Google Scholar]
  • 29.Tompkins KJ, Houtti M, Litzau LA, Aird EJ, Everett BA, Nelson AT, Pornschloegl L, Limon-Swanson LK, Evans RL, Evans K, Shi K, Aihara H, Gordon WR. 2021. Molecular underpinnings of ssDNA specificity by Rep HUH-endonucleases and implications for HUH-tag multiplexing and engineering. Nucleic Acids Res 49:1046–1064. 10.1093/nar/gkaa1248. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Chandler M, de la Cruz F, Dyda F, Hickman AB, Moncalian G, Ton-Hoang B. 2013. Breaking and joining single-stranded DNA: the HUH endonuclease superfamily. Nat Rev Microbiol 11:525–538. 10.1038/nrmicro3067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Yang W. 2011. Nucleases: diversity of structure, function and mechanism. Q Rev Biophys 44:1–93. 10.1017/S0033583510000181. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Kragten J. 1978. Atlas of metal-ligand equilibria in aqueous solution. Halsted Press, Chichester, England. [Google Scholar]
  • 33.Pontius BW, Lott WB, von Hippel PH. 1997. Observations on catalysis by hammerhead ribozymes are consistent with a two-divalent-metal-ion mechanism. Proc Natl Acad Sci USA 94:2290–2294. 10.1073/pnas.94.6.2290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Harding MM. 1999. The geometry of metal-ligand interactions relevant to proteins. Acta Crystallogr D Biol Crystallogr 55:1432–1443. 10.1107/S0907444999007374. [DOI] [PubMed] [Google Scholar]
  • 35.Horton NC. 2008. DNA nucleases, p 333–363. In Rice PA, Correll CC (ed), Protein-nucleic acid interactions: structural biology. The Royal Society of Chemistry, Cambridge, United Kingdom. [Google Scholar]
  • 36.Horton NC, Dorner LF, Perona JJ. 2002. Making the most of metal ions. Nat Struct Biol 9:42–47. 10.1038/nsb741. [DOI] [PubMed] [Google Scholar]
  • 37.Gerlt JA. 1993. Mechanistic principles of enzyme-catalyzed cleavage of phosphodiester bonds, vol. 25 p 1–34. In Lloyd S, Linn S, Roberts R (ed), Nucleases, 2nd ed. Cold Spring Harbor Press, Cold Spring Harbor, NY. [Google Scholar]
  • 38.Mantina M, Chamberlin AC, Valero R, Cramer CJ, Truhlar DG. 2009. Consistent van der Waals radii for the whole main group. J Phys Chem A 113:5806–5812. 10.1021/jp8111556. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Trachsel MA, Ottiger P, Frey HM, Pfaffen C, Bihlmeier A, Klopper W, Leutwyler S. 2015. Modeling the histidine-phenylalanine interaction: the NH···π hydrogen bond of imidazole·benzene. J Phys Chem B 119:7778–7790. 10.1021/jp512766r. [DOI] [PubMed] [Google Scholar]
  • 40.Hickman AB, Ronning DR, Perez ZN, Kotin RM, Dyda F. 2003. Data from “Crystal structure of AAV Rep complexed with the Rep-binding sequence.” Protein Data Bank 10.2210/pdb1RZ9/pdb. [DOI] [Google Scholar]
  • 41.Raab U, Beckenlehner K, Lowin T, Niller HH, Doyle S, Modrow S. 2002. NS1 protein of parvovirus B19 interacts directly with DNA sequences of the p6 promoter and with the cellular transcription factors Sp1/Sp3. Virology 293:86–93. 10.1006/viro.2001.1285. [DOI] [PubMed] [Google Scholar]
  • 42.Tewary SK, Zhao H, Deng X, Qiu J, Tang L. 2014. The human parvovirus B19 non-structural protein 1 N-terminal domain specifically binds to the origin of replication in the viral DNA. Virology 449:297–303. 10.1016/j.virol.2013.11.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Hickman AB, Ronning DR, Perez ZN, Kotin RM, Dyda F. 2004. The nuclease domain of adeno-associated virus Rep coordinates replication initiation using two distinct DNA recognition interfaces. Mol Cell 13:403–414. 10.1016/s1097-2765(04)00023-1. [DOI] [PubMed] [Google Scholar]
  • 44.Musayev FN, Zarate-Perez F, Bardelli M, Bishop C, Saniev EF, Linden RM, Henckaerts E, Escalante CR. 2015. Structural studies of AAV2 Rep68 reveal a partially structured linker and compact domain conformation. Biochemistry 54:5907–5919. 10.1021/acs.biochem.5b00610. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Santosh V, Musayev FN, Jaiswal R, Zarate-Perez F, Vanderwinkel B, Dierckx C, Endicott M, Sharifi K, Dryden K, Henkaerts E, Escalante CR. 2020. The Cryo-EM structure of AAV2 Rep68 in complex with ssDNA reveals a malleable AAA+ machine that can switch between oligomeric states. Nucleic Acids Res 48:12983–12999. 10.1093/nar/gkaa1133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Tropea JE, Cherry S, Waugh DS. 2009. Expression and purification of soluble His6-tagged TEV protease. Methods Mol Biol 498:297–307. 10.1007/978-1-59745-196-3_19. [DOI] [PubMed] [Google Scholar]
  • 47.McPhillips TM, McPhillips SE, Chiu HJ, Cohen AE, Deacon AM, Ellis PJ, Garman E, Gonzalez A, Sauter NK, Phizackerley RP, Soltis SM, Kuhn P. 2002. Blu-Ice and the Distributed Control System: software for data acquisition and instrument control at macromolecular crystallography beamlines. J Synchrotron Radiat 9:401–406. 10.1107/s0909049502015170. [DOI] [PubMed] [Google Scholar]
  • 48.Geoff T, Battye G, Kontogiannis L, Johnson O, Powella HR, Leslie AGW. 2011. iMOSFLM: a new graphical interface for diffraction-image processing with MOSFLM. Acta Cryst D67:271–281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Evans P. 2006. Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr 62:72–82. 10.1107/S0907444905036693.
  • 50. Evans PR. 1993. Data reduction. Proc CCP4 Study Weekend Data Collection Processing 1993:114–122.
  • 51. McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. 2007. Phaser crystallographic software. J Appl Crystallogr 40:658–674. 10.1107/S0021889807021206.
  • 52. Adams PD, Grosse-Kunstleve RW, Hung LW, Ioerger TR, McCoy AJ, Moriarty NW, Read RJ, Sacchettini JC, Sauter NK, Terwilliger TC. 2002. PHENIX: building new software for automated crystallographic structure determination. Acta Crystallogr D Biol Crystallogr 58:1948–1954. 10.1107/s0907444902016657.
  • 53. Emsley P, Cowtan K. 2004. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60:2126–2132. 10.1107/S0907444904019158.
  • 54. Emsley P, Lohkamp B, Scott WG, Cowtan K. 2010. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66:486–501. 10.1107/S0907444910007493.
  • 55. Grosse-Kunstleve RW, Terwilliger TC, Sauter NK, Adams PD. 2012. Automatic Fortran to C++ conversion with FABLE. Source Code Biol Med 7:5. 10.1186/1751-0473-7-5.
  • 56. Headd JJ, Echols N, Afonine PV, Grosse-Kunstleve RW, Chen VB, Moriarty NW, Richardson DC, Richardson JS, Adams PD. 2012. Use of knowledge-based restraints in phenix.refine to improve macromolecular refinement at low resolution. Acta Crystallogr D Biol Crystallogr 68:381–390. 10.1107/S0907444911047834.
  • 57. Afonine PV, Grosse-Kunstleve RW, Adams PD, Urzhumtsev A. 2013. Bulk-solvent and overall scaling revisited: faster calculations, improved results. Acta Crystallogr D Biol Crystallogr 69:625–634. 10.1107/S0907444913000462.
  • 58. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera—a visualization system for exploratory research and analysis. J Comput Chem 25:1605–1612. 10.1002/jcc.20084.
  • 59. Holm L. 2020. Using Dali for protein structure comparison. Methods Mol Biol 2112:29–42. 10.1007/978-1-0716-0270-6_3.
  • 60. Jurrus E, Engel D, Star K, Monson K, Brandi J, Felberg LE, Brookes DH, Wilson L, Chen J, Liles K, Chun M, Li P, Gohara DW, Dolinsky T, Konecny R, Koes DR, Nielsen JE, Head-Gordon T, Geng W, Krasny R, Wei GW, Holst MJ, McCammon JA, Baker NA. 2018. Improvements to the APBS biomolecular solvation software suite. Protein Sci 27:112–128. 10.1002/pro.3280.
  • 61. Laskowski RA, Jabłońska J, Pravda L, Vařeková RS, Thornton JM. 2018. PDBsum: structural summaries of PDB entries. Protein Sci 27:129–134. 10.1002/pro.3289.
  • 62. Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188–1190. 10.1101/gr.849004.

Data Availability Statement

Coordinates and structure factor amplitudes for form I and II NS1-nuc structures have been deposited in the PDB under accession codes 7SZY and 7SZX.


Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
