Solution Structure of the 2A Protease from a Common Cold Agent, Human Rhinovirus C2, Strain W12

Woonghee Lee; Kelly E Watters; Andrew T Troupis; Nichole M Reinen; Fabian P Suchy; Kylie L Moyer; Ronnie O Frederick; Marco Tonelli; David J Aceti; Ann C Palmenberg; John L Markley

doi:10.1371/journal.pone.0097198

PLoS One. 2014 Jun 17;9(6):e97198. doi: 10.1371/journal.pone.0097198

Editor: Mark J van Raaij⁴

PMCID: PMC4061012 PMID: 24937088

Abstract

Human rhinovirus strains differ greatly in their virulence, and this has been correlated with the differing substrate specificity of the respective 2A protease (2A^pro). Rhinoviruses use their 2A^pro to cleave a spectrum of cellular proteins important to virus replication and anti-host activities. These enzymes share a chymotrypsin-like fold stabilized by a tetra-coordinated zinc ion. The catalytic triad consists of conserved Cys (C105), His (H18), and Asp (D34) residues. We used a semi-automated NMR protocol developed at NMRFAM to determine the solution structure of 2A^pro (C₁₀₅A variant) from an isolate of the clinically important rhinovirus C species (RV-C). The backbone of C2 2A^pro superimposed closely (1.41–1.81 Å rmsd) with those of orthologs from RV-A2, coxsackie B4 (CB4), and enterovirus 71 (EV71) having sequence identities between 40% and 60%. Comparison of the structures suggests that the differential functional properties of C2 2A^pro stem from its unique surface charge, high proportion of surface aromatics, and sequence surrounding the di-tyrosine flap.

Introduction

Human rhinoviruses (RVs) are single-stranded, positive-sense RNA Enteroviruses in the Picornaviridae family and the most ubiquitous agents of the common cold. Originally catalogued by serotyping relative to an historical repository of clinical strains, thousands of isolates representing more than 110 different RV genotypes are now binned within the RV-A and RV-B species, according to overt similarities in their VP1 capsid sequences. For taxonomic clarity, the species letter (e.g. A or B) precedes the assigned type number (e.g. B14, A2) when referring to individual clades. Like other enterovirus genomes, the RVs encode a polyprotein that is co- and post-translationally processed by proteases that form part of the polyprotein (Figure 1). The first cleavage is by 2A^pro. It occurs autocatalytically within the nascent polyprotein to form the amino terminus of the protease. The downstream 3C^pro subsequently undergoes two self-release reactions and then completes the excision of 2A^pro.

Figure 1. The polyprotein is cleaved co- and post-translationally to release mature viral proteins. During infection, 2A^pro is excised at the N-terminus by self-catalysis and at the C-terminus by 3C^pro. The released protease cleaves cellular substrates including eIF4G and nucleoporins.

During infection, both enzymes contribute to host cell shut-off activities, helping the virus evade host defense mechanisms and promote its replication. Among known reactions, 3C^pro and/or its precursors cleave nuclear transcription factors, preventing most pol2 mRNA synthesis [1], [2]. In parallel, 2A^pro targets translation pathways by cleaving initiation factors eIF4G-I and -II, required proteins for cap-dependent mRNA recognition by ribosomes [3], [4]. Additionally, 2A^pro reacts with the nuclear pore complex, cleaving multiple central core nucleoporin proteins (Nups). Since the movement of cellular proteins and RNA in and out of the nucleus is at the core of all gene activation schemes, including those required for nearly every innate immunity trigger, the 2A^pro alteration of Nups results in a comprehensive failure of nucleocytoplasmic transport and dependent processes of intracellular signaling [5], [6]. Interestingly though, few of the homologous enterovirus 2A^pro behave exactly the same with regard to these activities [7]. Among RV genotypes, the pairwise 2A^pro sequence identities range from 33% to 98% [8], a variation much greater than for the respective 3C^pro (<20%), or even some regions of the capsid proteins [8]. The variation confers to each 2A^pro subtle differences in substrate preference and rate kinetics toward particular Nups and eIF4G cohorts [9]. The observed turnover rates varied in the order: HRV-A > HRV-C >> HRV-B. The individual proclivities are not well understood, but they are proposed to be linked mechanistically to diverse infection outcomes unique to each sequence clade, perhaps through the regulation of preferential cytokine induction [9].

The enterovirus 2A^pro are small (142–150 amino acids) chymotrypsin-like enzymes that use Cys as the active nucleophile [10], [11]. The crystal structures of RV-A2 [11] and EV-71 (enterovirus 71) [12], [13] and the NMR structure of EV-CB4 (enterovirus coxsackie B4) [14] enzymes have been determined. When combined with biochemical studies on RV-B14, the structures show these enzymes are able to choose their preferred substrates from among a variety of related sequences because their highly variable binding surfaces sense and discriminate residues P8 to P2′ relative to the scission position [15]. The discernment influences the cleavage rates and pattern selection of many cellular substrates as well as the precise location of the polyprotein self-processing sites [16], [17]. From an antiviral standpoint, it is important to understand how this selectivity works at the structural level for different 2A^pro, because putative therapies aimed at the plethora of RV types need to define and target commonalities among the crucial viral enzymes.

In 2006, multiple rhinoviruses representing a new species, the RV-C, were discovered in patients suffering influenza-like illnesses with severe respiratory compromise [18]. The RV-C have special clinical relevance, because it is now recognized that these new isolates (51 types) can grow in both the upper and lower airways and are responsible for up to half of RV infections in children, especially those with a propensity for asthma. Unlike the RV-A or RV-B, the RV-C cannot be grown in established tissue culture, a limitation that has hindered investigations into interventions directed against the virus capsid or viral enzymes. Nonetheless, multiple RV-C genomes have been sequenced in their entirety, and key isolates have been rendered into cDNA [19]. These reagents have allowed essential non-structural proteins to be expressed and compared at the enzymatic level, including the 2A^pro from types C2 and C15 [9]. We report here the first 3D structure of an RV-C protein, the 2A^pro from C2, strain W12, whose functional properties have been studied extensively [9]. Stable isotope-labeled protein was prepared at the Center for Eukaryotic Structural Genomics (CESG), and the solution structure was determined at the National Magnetic Resonance Facility at Madison (NMRFAM). In addition to providing biological insights into the intrinsic enzyme variability, the extensive NMR data collected served as test sets for NMRFAM software designed for high-throughput structure determination, including PINE-SPARKY [20] and PONDEROSA [21].

Materials and Methods

Plasmid Design and Construction

The protease cDNA was from RV-C2, strain W12 [9]. The sequence of the 2A gene was identical to GenBank JN837695, although the parental genome has not been sequenced entirely [22]. An amplicon for the gene encoding the RV-C2 2A^pro (strain W12) was isolated by PCR methods from the pET-11a plasmid previously described as Cw12 [9]. The reaction used AccuPrime Supermix (Invitrogen) and DNA primers 5' 2A^pro-Bsa1 and 3' 2A^pro-Xho1 (UW-Madison Biotechnology Center) shown in Table 1. The PCR product and DNA for the expression vector, pE-SUMO Kan (Lifesensors), were digested with BsaI (New England Biolabs) and XhoI (Promega), then ligated by T4 DNA ligase under a temperature cycling reaction at 10°C for 30 s and 30°C for 30 s, repeated 800 times. Competent E. coli cells (Lucigen 10G) were transformed with a heat-inactivated ligation sample (65°C for 25 min) then plated onto YT agar plates containing kanamycin (50 µg/mL). After overnight incubation (37°C), individual colonies were picked, suspended and stored in 20% sterile glycerol. The cell suspensions (3 μL glycerol stocks) were screened by PCR, positive recombinant plasmids were isolated, and the inserted DNA was sequenced (UW-Madison Biotechnology Center) to identify clones with intact 2A^pro genes. Site-directed mutagenesis to convert the active-site Cys₁₀₅ codon to Ala₁₀₅ used primers PI 5' 2A^pro-C₁₀₅A and PI 3' 2A^pro-C₁₀₅A (Table 1), with polymerase incomplete primer extension (PIPE) methods and either AccuPrime Supermix or Stratagene Pfu Turbo Ultra [23]. In preliminary extraction trials, this modification (pC2-2A-C₁₀₅A) gave larger, more stable yields of 2A^pro for structure studies.

Table 1. DNA Primers used for Cloning and Mutating RV-C2 2A^pro.

	DNA primer name	Primer DNA sequences^*
1	5' 2A^pro-Bsa1	`5′ACTAGTGGTACCGGTCTCAAGGT GGACCTAGTGACCTATTTGTTCAC`
2	3' 2A^pro-Xho1	`5′GGGCCCGCTCGAGGGATCCTCATTA TTGAGAGGTTGCTTTGATATTATAAG`
3	PI 5' 2A^pro-C₁₀₅A	`CCA GGT GAC gcg GGA GGT AAA TTA CTG TGC AGA CAT GGG GTT`
4	PI 3' 2A^pro-C₁₀₅A	`TTT ACC TCC cgc GTC ACC TGG GAC ACA TGG TCC TTC TCC AAT`

*Restriction sites are in bold; primer regions that anneal to 2A^pro gene are underlined; and lowercase letters show DNA bases at the sites of directed mutagenesis.
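The design of the two mutagenic primers in Table 1 can be checked with a few lines of code. The following is a minimal sketch, not part of the published protocol: the function names are illustrative, the standard genetic code is assumed, and the first triplet of the forward primer is assumed to be in frame (as its triplet spacing suggests). It translates the forward primer and tests whether the 5′ ends of the two primers are reverse complements of each other, as expected for overlapping PIPE primers.

```python
# Illustrative check of the C105A mutagenic primers listed in Table 1.
# Function names are hypothetical; the standard genetic code is assumed.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons are generated in TCAG order with the third base varying fastest.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Remove spacing and ignore the lowercase marking of mutated bases."""
    return seq.replace(" ", "").upper()


def translate(seq):
    """Translate a DNA sequence from its first base, dropping a partial codon."""
    s = clean(seq)
    usable = len(s) - len(s) % 3
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(clean(seq)))


FWD = "CCA GGT GAC gcg GGA GGT AAA TTA CTG TGC AGA CAT GGG GTT"
REV = "TTT ACC TCC cgc GTC ACC TGG GAC ACA TGG TCC TTC TCC AAT"

peptide = translate(FWD)
# The first 21 bases of each primer should pair with each other.
overlap_ok = reverse_complement(clean(REV)[:21]) == clean(FWD)[:21]

print(peptide)     # PGDAGGKLLCRHGV
print(overlap_ok)  # True
```

Under these assumptions the forward primer encodes PGDAGG, that is, the PGDCGG active-site motif with Ala in place of Cys₁₀₅, and the lowercase gcg triplet is the Ala codon.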

Optimal Expression Parameters

Host selection for optimal 2A^pro production used small-scale screening techniques developed by the CESG [24]. A series of competent E. coli strains (Rosetta2(DE3), Rosetta2(DE3)-pLysS from Novagen, and BL21-DE3 CodonPlus RILP from Stratagene) were transformed with pE-SUMO C2 2A^pro then grown on plates containing chloramphenicol and kanamycin (either YT agar plus 1% glucose or MDAG solid medium). The plates were incubated (37°C) overnight, before colonies were picked into MDAG liquid medium [25] (0.5 mL, supplemented with the appropriate antibiotics) in a 96-well format growth block. The composition of MDAG solid medium and MDAG liquid medium can be found in Protocol ID: LP.4813 at http://sbkb.org/tt/protocol?ttid=MPP-GO.111408&lab=MPP&trialid=3&protocolid=LP.4813.

The cultures were grown overnight at 25°C with shaking at 250 rpm. 10–20 μL of each culture was used to inoculate 0.5 mL of Terrific Broth with glycerol (TB+g) auto-induction medium prepared in a series of 96-well format growth blocks. The blocks were shaken and incubated at varying temperatures (30, 25, 15 and 10°C) to identify the best combinations of host strain, growth temperature and induction methods for soluble protein overproduction, as assayed by SDS-PAGE analysis of the soluble fractions and spin IMAC (immobilized metal affinity chromatography) captured protein.

Large-Scale Protein Production

For large-scale production of 2A^pro, cell cultures were amplified from fresh transformations of BL21(DE3) with the pE-SUMO C2 2A^pro plasmid. Colonies were inoculated into starter cultures (1 mL YT, plus 1% glucose, kanamycin and chloramphenicol). After initial growth with shaking (1 to 3 h, 37°C, 250–320 rpm), the starters were transferred into MDAG (50–100 mL plus antibiotics) then further grown overnight (25°C, rotary shaker, 250–320 rpm). These starter cultures (10–12 mL) were then amplified in 2 L PET bottles (500 mL YT medium in a rotary shaker) for 2–5 h, until the OD₆₀₀ was between 1.0 and 1.4 AU. Growth temperature was reduced to 25–30°C, ZnCl₂ was added (to 50 µM), followed 15–30 min later by IPTG (to 0.1–0.2 mM). The cells were grown overnight with shaking (250–320 rpm), harvested by centrifugation (4,000 g, 30 min) and stored at −80°C. In tests to optimize protein yields, unlabeled 2A^pro was also prepared using 500 mL of TB+g based auto-induction medium [26]. Essentially, this is a basic medium (12 g/L tryptone, 24 g/L yeast extract, 9.4 g/L KH₂PO₄, 2.2 g/L K₂HPO₄ and 10 g glycerol, and 100 μL/L antifoam) with supplements (3.75% aspartic acid, 2 mM MgSO₄, 0.825 mM glucose, 87 mM glycerol, 4.6 mM α-lactose). The TB+g auto-induction medium was used in place of YT and required no induction with IPTG.

Preparation of Uniformly ¹⁵N and ¹³C/¹⁵N-Labeled Protein on a Large-Scale

Isotopically-labeled protein was prepared as described above, except that an M9 based medium was used in place of YT (per L: 100 mL of 10x M9 salts, 70 g Na₂HPO₄, 30 g KH₂PO₄, 5 g NaCl, 1 mL of 1000x metal mix, 1 mL of B12 vitamin mixture [25], [26], 30 mg thiamine, 100 μL antifoam, 35 µg/mL chloramphenicol and 50 µg/mL kanamycin [26] and, as appropriate, 1 g ¹⁵NH₄Cl and/or 4 g U-¹³C-glucose). The medium also contained 0.1 mM CaCl₂, 50 µM ZnCl₂, and 2 mM MgSO₄.

Protein Purification

Cell pastes (5–10 g) were thawed and resuspended in lysis buffer (60–70 mL, 20 mM Tris pH 7.2, 500 mM NaCl, 10% ethylene glycol, 5 mM imidazole, 1 mM PMSF, 0.1% NP-40, Sigma) containing lysozyme (5 μL, Novagen), RNase (10 μL, Qiagen), Benzonase (5 μL, Novagen, 25 U/µl), or OmniCleave nuclease (Epicenter, 10 KU). The lysates were sonicated in a Misonix 3000 at 4°C with pulsing on (∼80 Watt) for 2 s and off for 4 s over 15 min and then clarified by centrifugation (30 min, 70,000 g). Polyethylene imine (to 0.1% w/v, Fluka) was added, and the samples were clarified again by centrifugation (30 min, 70,000 g) before the addition of (NH₄)₂SO₄ (to 70% w/v) and DTT (to 2 mM). The collected pellets were resuspended in IMAC buffer 1 (30–40 mL, 20 mM Tris, pH 7.2, 10% glycerol, 35 mM imidazole, 1 mM PMSF), clarified (70,000 g, 30 min) then filtered (0.8 micron, Millipore) before loading onto IMAC resin (Qiagen Superflow FF) at a rate of 1–2 mL/min. The column (∼10 mL) was washed (10 volumes) with IMAC buffer 2 (buffer 1 plus 500 mM NaCl) then with IMAC buffer 3 (buffer 2 plus 65 mM imidazole), before protein elution with IMAC buffer 4 (buffer 2 plus 250 mM imidazole). Usually, 90% of the target was eluted in the first 15–30 mL as assayed by SDS-PAGE. Appropriate fractions were dialyzed overnight into buffer (Tris 20 mM pH 8.0, 150 mM NaCl and 2 mM DTT or β-mercaptoethanol), before the SUMO domain was removed from the N-terminus of 2A^pro by incubation with 0.5 mg SUMO protease (prepared in house) for 3–4 h at 30°C. The sample was loaded onto an IMAC column freshly equilibrated with IMAC buffer 1, which bound the His-tagged SUMO domain. The 2A^pro target was retrieved in the flow-through (4–5 fractions of 5–10 mL) and pooled. The final fractionation was by gel filtration (GE Healthcare HiPrep 16/60 Sephacryl S-200, 20 mM Tris, pH 8.0, 150 mM NaCl, 2 mM DTT). 
The purified protein was spin concentrated (Sartorius Vivaspin 20 10 kDa PES concentrator, 5,000 g) and then drop frozen in liquid nitrogen. The final yield was 27.5 mg of purified protein from 0.5 L double-labeled Martek (rich) media. The purity of protein samples was determined by SDS-PAGE (Figure 2). The C₁₀₅A variant protein aggregated less during purification and produced a higher yield of protein.

Figure 2. The recombinant methods described above were used to prepare ¹³C/¹⁵N-labeled C2 2A^pro (C₁₀₅A) for NMR studies. Representative samples from the procedure were fractionated by SDS-PAGE then visualized with Bio-Rad Stain-Free. Lane 1, Bio-Rad Precision Plus protein standards; lane 2, protein pellet after (NH₄)₂SO₄ precipitation; lane 3, SUMO-2A^pro after IMAC elution; lane 4, 2A^pro after SUMO cleavage and IMAC elution; lanes 5–6, final protein fractions after gel filtration.

NMR Data Collection

The samples for NMR spectroscopy contained 3.4 mg [U-¹³C,U-¹⁵N]-2A^pro dissolved in buffer (0.4 mL, 10 mM MES, 20 mM NaCl, 10 mM DTT, 10% ²H₂O, 90% H₂O, pH 6.5). The solutions (∼0.5 mM) were placed in 5 mm Shigemi tubes (Allison Park, PA). NMR data were collected at NMRFAM on Agilent VNMRS spectrometers operating at 600 MHz, 800 MHz, and 900 MHz. The temperature was regulated at 313 K, the temperature at which the protein exhibited the best quality 2D ¹H-¹⁵N HSQC spectrum. The 600 MHz spectrometer, equipped with a triple-resonance cryogenic probe, was used to record 3D HNCO, HN(CA)CO, HNCA, HN(CO)CA, CBCA(CO)NH, HBHA(CO)NH, C(CO)NH, H(CCO)NH, H(C)CH-TOCSY, and ¹⁵N-edited NOESY data sets. The 800 MHz spectrometer, with a conventional triple-resonance probe, was used to record 2D ¹H-¹⁵N HSQC, 3D ¹⁵N-edited TOCSY, (H)CCH-TOCSY, and ¹³C-edited NOESY data sets. The 900 MHz instrument, with a triple-resonance cryogenic probe, was used to record 2D ¹H-¹³C HSQC and 3D HNCACB spectra. All time-domain data were processed with NMRPipe [27] to generate frequency-domain sets, which were converted to SPARKY (UCSF) file format [28] for further analysis.
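The quoted sample concentration follows directly from the protein mass and the buffer volume. The short sketch below illustrates the arithmetic; the molar mass of about 17 kDa for the double-labeled protein is an assumption made for this example and is not stated in the text.

```python
def molar_concentration_mM(mass_mg, volume_mL, molar_mass_g_per_mol):
    """Return the concentration in mM for a given mass, volume and molar mass."""
    grams_per_litre = mass_mg / volume_mL  # mg/mL is numerically equal to g/L
    return grams_per_litre / molar_mass_g_per_mol * 1000.0


# Assumption for illustration only: approximate molar mass of labeled 2A^pro.
ASSUMED_MOLAR_MASS = 17000.0

conc_mM = molar_concentration_mM(3.4, 0.4, ASSUMED_MOLAR_MASS)
print(f"{conc_mM:.2f} mM")
```

With the assumed molar mass, 3.4 mg in 0.4 mL corresponds to roughly 0.5 mM, consistent with the stated value.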

NMR Spectral Analysis and Structure Calculation

Resonances for backbone atoms in the ¹H-¹⁵N HSQC, HNCACB, and CBCA(CO)NH spectra were initially identified with the APES program [29]. The restricted peak picking feature in SPARKY identified signals from additional backbone and side chain atoms. All peaks identified by automation were carefully validated by visual inspection. Peak lists for each spectrum were exported to the PINE-NMR server [30], which yielded automated resonance assignments for all but four of the backbone spin systems. The assignment probabilities were high for all but one residue, which was at 50%. We used the PINE-SPARKY [20] package to validate these assignments and complete the missing assignments. Validated chemical shift assignments were then imported into PONDEROSA [21] for the automated assignment of NOE cross-peaks in ¹⁵N-edited NOESY and ¹³C-edited NOESY data sets. SPARKY was again used to manually validate and refine NOE peak identification and assignments. Curated lists of NOE assignments and distance and torsion angle restraints were used to further refine the structure, through manual operation of CYANA (version 3.0) [31] followed by fine-tuned structure calculation. Hydrogen bond restraints for regions with regular secondary structure (d(N–O) = 2.7 to 3.5 Å; d(H^N–O) = 1.8 to 2.5 Å) were then added. The torsion angle constraints, generated by a TALOS+ [32] module and executed within PONDEROSA, were validated individually, by reference to SPARKY and PyMOL [33] visualizations, to remove any constraints that were too tight. Once an acceptable structure was obtained, as validated by the PSVS suite server [34], the metal-coordinating side chains were identified (C₅₁, C₅₃, C₁₁₁, H₁₁₃), and a zinc ion was added to the model. Subsequent CYANA calculations provided covalent distance restraints for the zinc coordination side chains (Cys S^γ−Zn = 2.40 Å and His N^ε2−Zn = 2.20 Å).
The 15 best models from a total of 200 models annealed from random structures were chosen, on the basis of lowest energy with fewest violations, to represent the structure of C2 2A^pro. With reference to the A2 (2hrv), CB4 (1z8r) and EV71 (4fvd) orthologs, MOLMOL [35] was used to superimpose the files, then calculate the root mean square deviation (rmsd) for each pair. PyMOL (version 1.2r3pre, Schrödinger, LLC) was used for graphical display. Electrostatic potential surfaces were calculated with the APBS plug-in [36] for PyMOL according to PQR files generated from Poisson-Boltzmann electrostatics calculated by the PDB2PQR package [37]. Secondary structure features in the lowest-energy model were identified by STRIDE [38]. MolProbity [39], PROCHECK [40], and the PSVS suite server [34] were used to assess the quality of the final ensemble of structures. The coordinates and related data are deposited in Protein Data Bank with the assignment code, 2M5T. The chemical shift data are deposited in the Biological Magnetic Resonance Bank, as 19079.

Dynamics

¹H-¹⁵N NOE and ¹⁵N relaxation (T₁, T₂) data were recorded on the Agilent VNMRS 800 MHz spectrometer equipped with a conventional triple-resonance probe. Multi-interleaved NMR spectra were collected with relaxation delays of 0, 50, 100, 200, 300, 400, 600, 1200, and 1600 ms for the ¹⁵N T₁ measurements, and with relaxation delays of 10, 30, 50, 70, 90, and 110 ms for the ¹⁵N T₂ measurements. The relaxation rate constants were extracted in SPARKY by fitting the decay of peak height as a function of the relaxation delay to a single exponential function. Interleaved 2D ¹H-¹⁵N HSQC spectra, with and without 5-s proton saturation, were collected for the ¹H-¹⁵N NOE measurements. The ¹H-¹⁵N heteronuclear NOE values were obtained from the ratios of peak heights between two spectra calculated with SPARKY and LibreOffice spreadsheet programs.
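The single-exponential fit and the peak-height ratio described above can be sketched as follows. This is an illustration only: it uses a log-linear least-squares fit, which may differ from the routine implemented in SPARKY, and the peak heights are synthetic values generated for the example rather than measured data. Only the list of T₂ delays is taken from the text.

```python
import math


def fit_single_exponential(delays_s, heights):
    """Fit heights = I0 * exp(-delay / T) by linear regression on ln(height).

    Returns (I0, T). All heights must be positive.
    """
    n = len(delays_s)
    log_heights = [math.log(h) for h in heights]
    mean_x = sum(delays_s) / n
    mean_y = sum(log_heights) / n
    sxx = sum((x - mean_x) ** 2 for x in delays_s)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delays_s, log_heights))
    slope = sxy / sxx                      # slope equals -1/T
    intercept = mean_y - slope * mean_x    # intercept equals ln(I0)
    return math.exp(intercept), -1.0 / slope


def heteronuclear_noe(height_saturated, height_reference):
    """Ratio of peak heights with and without proton saturation."""
    return height_saturated / height_reference


# T2 delays from the text, converted to seconds.
t2_delays_s = [0.010, 0.030, 0.050, 0.070, 0.090, 0.110]

# Hypothetical noise-free peak heights for a residue with T2 = 60 ms.
TRUE_T2_S = 0.060
synthetic_heights = [1000.0 * math.exp(-t / TRUE_T2_S) for t in t2_delays_s]

i0, t2 = fit_single_exponential(t2_delays_s, synthetic_heights)
noe = heteronuclear_noe(820.0, 1000.0)  # hypothetical peak heights

print(f"I0 = {i0:.1f}, T2 = {t2 * 1000:.1f} ms, NOE = {noe:.2f}")
```

On real data with noise, a weighted nonlinear fit is usually preferable, because taking logarithms overweights the weak peaks at long delays.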

Exposure of Aromatics

The surface accessibility of aromatic side chains (His, Phe, Trp, Tyr) was evaluated for the lowest energy structure using STRIDE [38]. The observed accessible surface areas were divided by values representing the fully exposed residue accessible surface areas in the corresponding tripeptides, Gly-His-Gly (194 Å²), Gly-Phe-Gly (218 Å²), Gly-Trp-Gly (259 Å²), and Gly-Tyr-Gly (229 Å²), according to described procedures [41]. The residues were binned accordingly into “exposed” (30–100%), “partially exposed” (10–30%) and “buried” (0–10%) categories. Similar procedures were used in the analysis of the three other structures: A2, CB4, EV71.
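The normalization and binning described above can be sketched in a few lines. In this illustration the reference areas are entered as 194, 218, 259 and 229 Å², the customary scale for Gly-X-Gly tripeptide values; that scale is an assumption of the example. The observed areas are hypothetical, and the treatment of values falling exactly on a bin boundary (10% or 30%) is also an assumption, since the stated ranges share their endpoints.

```python
# Reference accessible surface areas (in Å²) of the central residue in
# Gly-X-Gly tripeptides. The scale used here is an assumption of this sketch.
REFERENCE_ASA = {"HIS": 194.0, "PHE": 218.0, "TRP": 259.0, "TYR": 229.0}


def relative_exposure(residue, observed_asa):
    """Percent exposure of a side chain relative to the tripeptide reference."""
    return 100.0 * observed_asa / REFERENCE_ASA[residue.upper()]


def exposure_bin(percent):
    """Assign a category; boundary values go to the more exposed bin."""
    if percent < 10.0:
        return "buried"
    if percent < 30.0:
        return "partially exposed"
    return "exposed"


# Hypothetical observed areas (Å²), for illustration only.
examples = [("TYR", 5.0), ("HIS", 40.0), ("TRP", 150.0)]
bins = [exposure_bin(relative_exposure(res, asa)) for res, asa in examples]

for (res, asa), label in zip(examples, bins):
    print(f"{res}: {relative_exposure(res, asa):.1f}% -> {label}")
```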

Results

Protein Characterization

The wild-type protein was highly active [9], and the ¹H-¹⁵N HSQC spectrum of ¹⁵N-labeled wild-type 2A^pro (Figure 3) was well dispersed, indicating that the protein was well folded. However, the wild-type protein aggregated over time, which prevented the collection of the valid series of three-dimensional data sets required for a structure determination. The inactive C₁₀₅A variant, which yielded a very similar ¹H-¹⁵N HSQC spectrum (Figure 3), was better behaved. Analytical gel filtration using a Shimadzu Prominence HPLC system identified conditions under which the C₁₀₅A protein was monomeric (100 mM succinate buffer, pH 5.5, 100 mM NaCl, 2 mM TCEP), and these conditions, when evaluated by differential scanning fluorimetry (DSF), indicated that C2 2A^pro (C₁₀₅A) was of sufficient stability for structure determination.

Figure 3. The two spectra are very similar; however, that of the wild-type protease exhibits small signals attributed to self-cleavage products.

Structure Description

The final structure was based on a total of 1440 constraints (1239 distance constraints, 142 angle constraints, and 59 hydrogen bond constraints). STRIDE [38] analysis of the structures determined that the protein consists mostly of β-strands, as also reported for the ortholog, A2 2A^pro [11]. The assigned secondary structural elements are indicated in Figure 4A. The nomenclature follows that for A2 2A^pro. The NOE restraints per residue used in the structure calculation are summarized in Figure 4B. The lack of NOE assignments for the N-terminus, C-terminus, and for residues 82–86 facing the catalytic triad region (H₁₈, D₃₄, A₁₀₅) led to slightly higher rmsd values and lower structural compactness of the models in these regions (Figure 4C).
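The constraint totals quoted above can be cross-checked against the per-category counts in Table 2. The following sketch simply sums those counts and derives an average number of constraints per residue for the 142-residue protein; the variable names are illustrative.

```python
# Per-category counts as listed in Table 2.
distance_constraints = {
    "intraresidue": 274,
    "sequential": 181,
    "medium_range": 148,
    "long_range": 636,
}
dihedral_constraints = {"phi": 70, "psi": 72}
hydrogen_bond_constraints = 59
n_residues = 142

total_distance = sum(distance_constraints.values())
total_dihedral = sum(dihedral_constraints.values())
total = total_distance + total_dihedral + hydrogen_bond_constraints
per_residue = total / n_residues

print(total_distance, total_dihedral, total)   # 1239 142 1440
print(f"{per_residue:.1f} constraints per residue")
```

The sums reproduce the 1239 distance, 142 angle and 1440 total constraints given in the text, or about 10 constraints per residue on average.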

Figure 4. (A) Secondary structural features from the NMR solution structure: β-strands (*arrows*) and 3₁₀ helices (*boxes*). (B) The total number of constraints used for the structure calculation plotted as a function of residue number. (C) Rmsd values for backbone atoms (N, C^α, and C′) of the best 15 models relative to the average structure. Structurally compact regions have rmsd values below 2 Å.

The 15 best models (Figure 5A) were chosen to represent the solution structure of the full enzyme (142 amino acids). For the regions with regular secondary structure, the rmsd was 0.6 Å for backbone heavy atoms and 0.8 Å for all heavy atoms. When tested by MolProbity [39], 93.6% of the backbone angles were in “most favored” regions, 6.4% in “allowed” regions, and none in “disallowed” regions of the Ramachandran plot. The PROCHECK [40] Z-scores were −2.95 for the backbone (φ and ψ) dihedral angles and −5.62 for all dihedral angles, while the MolProbity [39] mean score and Z-score were 24.03 and −2.60, respectively (Table 2).

Figure 5. (A) The backbone atoms (N, C^α, C′) for the best 15 models as superimposed by MOLMOL [35] for the regions of regular secondary structure. (B) Ribbon diagram of the lowest energy model indicating the N-terminal domain (*orange*), C-terminal domain (*gray*), and the connecting loop (*green*). Stick representations (*magenta*) show the side chains (C₅₁, C₅₃, C₁₁₁, H₁₁₃) ligating the zinc ion (*gray sphere*), and side chains of the residues (*cyan*) forming the catalytic triad (H₁₈, D₃₄, C₁₀₅A). The di-tyrosine flap (Y₈₄, Y₈₅, P₈₆) lies near this triad. The two structures are rotated by 180°.

Table 2. Statistics for the NMR Structure of C2 2A^pro.

Conformationally restricting distance constraints
Intraresidue [i = j]	274
Sequential [(i–j) = 1]	181
Medium Range [1<(i–j)≤5]	148
Long Range [(i–j)>5]	636
Total	1239
Dihedral angle constraints
φ	70
ψ	72
Hydrogen-bond constraints	59
CYANA target function [Å]	3.49
Average rmsd to the mean CYANA coordinates [Å]
Regular secondary structure elements, backbone heavy^a	0.6
Regular secondary structure elements, all heavy atoms^a	0.8
Backbone heavy atoms N, Cα, C′ (1–142)	1.5
All heavy atoms (1–142)	1.7
PROCHECK Z-scores (φ and ψ/all dihedral angles)	−2.95/−5.62
MolProbity Mean score/Z-score	24.03/−2.60
Ramachandran plot summary for selected residue ranges from PROCHECK [%]^a
Most favored regions	85.0
Additionally allowed regions	13.2
Generously allowed regions	1.8
Disallowed regions	0.0
Ramachandran plot summary for selected residue ranges from MolProbity [%]^a
Most favored regions	93.6
Allowed regions	6.4
Disallowed regions	0.0
Average number of distance constraint violations per CYANA conformer
0.2–0.5 Å	11
>0.5 Å	0
Average number of angle constraint violations per CYANA conformer
>10°	0

^a Stretches of regular secondary structure: 7–9, 12–16, 28–30, 35–39, 55–60, 65–74, 78–79, 88–96, 108–110, 115–122, 127–131.

C2 2A^pro has N- and C-terminal domains connected by a central loop. The N-terminal domain (Figure 5B orange) has four strands that constitute an antiparallel β-sheet (β-strands V₇–T₉ [bI2], A₁₂–N₁₆ [cI], L₂₈–A₃₀ [eI2], L₃₅–G₃₉ [fI]). The C-terminal domain (Figure 5B gray) has six strands that constitute an antiparallel β-barrel (β-strands S₅₅–S₆₀ [aII], R₆₅–V₇₉ [bII], H₈₈–E₉₇ [cII], G₁₀₇–L₁₁₀ [dII], V₁₁₅–G₁₂₃ [eII], H₁₂₆–D₁₃₁ [fII]). The connecting loop (Figure 5B green) includes C₄₀–T₅₄. The di-tyrosine flap (Y₈₄, Y₈₅, P₈₆), conserved structurally in all such proteases, configures here as a β-hairpin loop (Figure 2C block arrow), as it does in A2 2A^pro (Y₈₅, Y₈₆, P₈₇), CB4 2A^pro (Y₈₉, Y₉₀, P₉₁), and EV71 2A^pro (Y₈₉, Y₉₀, P₉₁). Three short 3₁₀-helices seen in A2 2A^pro were also identified in the C2 2A^pro structure, each consisting of three residues that come after β-strands (cI, eI2, and aII); the third 3₁₀-helix seen in these two proteins is missing in CB4 2A^pro, while the second helix is categorized as an α-helix in EV71 2A^pro.

Protein Dynamics

Longitudinal (T₁) and transverse (T₂) ¹⁵N relaxation data as well as ¹H-¹⁵N heteronuclear NOE data (Figure 6) were collected to explore the dynamic behavior of C2 2A^pro. We used Eq. 1 to estimate the overall correlation time (τ_c) from the T₁/T₂ ratios of residues involved in elements of secondary structure.

Figure 6. (A) Longitudinal (T₁) relaxation times, (B) transverse (T₂) relaxation times, and (C) ¹H-¹⁵N heteronuclear NOE data for the backbone nitrogen atoms of C2 2A^pro plotted as a function of the amino acid sequence. The standard errors for all measurements were within the size of the data points shown.

(1)

The resulting τ_c value was 10.5 ns. Inspection of the T₁/T₂ ratios and ¹H-¹⁵N heteronuclear NOE data showed, apart from the five mobile C-terminal residues, very little internal motion over the whole sequence, including the loop regions. This appears to be a common feature of picornaviral proteases [12]. However, despite little evidence for internal motion, the non-uniform intensity of peaks in ¹H-¹⁵N HSQC spectra suggests the existence of localized structural heterogeneity. CB4 2A^pro exhibited similar phenomena in previous NMR studies [14].
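An estimate of this kind can be illustrated with the approximation commonly used for rigid, slowly tumbling proteins, τ_c ≈ sqrt(6·T₁/T₂ − 7) / (4π·ν_N), where ν_N is the ¹⁵N resonance frequency. Whether this is the exact form used for the value reported here is an assumption of the sketch below. The T₁ and T₂ values are hypothetical, and ν_N is approximated from the 800 MHz ¹H frequency.

```python
import math

# Approximate ratio of 15N to 1H resonance frequencies.
N15_TO_H1_FREQUENCY_RATIO = 0.101329


def tau_c_from_t1_t2(t1_s, t2_s, proton_frequency_hz):
    """Estimate the rotational correlation time (s) from a 15N T1/T2 ratio.

    Uses tau_c ~ sqrt(6*T1/T2 - 7) / (4*pi*nu_N), a common approximation
    that assumes slow tumbling and negligible internal motion.
    """
    nu_n_hz = proton_frequency_hz * N15_TO_H1_FREQUENCY_RATIO
    ratio = t1_s / t2_s
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


# Hypothetical average relaxation times, for illustration only.
example_t1_s = 1.20
example_t2_s = 0.060

tau_c = tau_c_from_t1_t2(example_t1_s, example_t2_s, 800.0e6)
print(f"tau_c = {tau_c * 1e9:.1f} ns")
```

With these hypothetical inputs (T₁/T₂ = 20) the estimate falls near 10 ns, the same order as the value reported above. The approximation is normally applied only to residues in regular secondary structure, where internal motion is limited.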

Discussion

NMR Methods

The methods used in this study represent a collaborative effort by CESG and NMRFAM to develop generalized, rapid-throughput techniques for protein purification and structure determination. This charged, self-cleaving protease with a tendency to aggregate presented particular challenges. The problems were solved here by stepwise, judicious selection of cloning vector (pE-SUMO), host strain, isolation and purification protocols, the C₁₀₅A mutation, and solution conditions. Linkage of the output from PINE-NMR [30] to PINE-SPARKY validations [20] facilitated and virtually automated the spectral peak assignments. The final structure was of high quality and well supported by the extensive datasets.

2A^pro Structure Comparisons

The C2 2A^pro is the first protein from an RV-C to be examined at the structural level. Among enteroviruses, the only viral genus to have such enzymes, structures were previously reported for 2A^pro from RV-A2 [11] and EV-71 [13] determined by crystallography and EV-CB4 [14] determined by NMR. The sequence identities are 57% between A2 and C2, 41% between CB4 and C2, and 40% between EV71 and C2. Structure alignments show that the only relative indels are confined to a short stretch in the first domain (before eI2) and to length discontinuities at the N- and C-terminal cleavage sites (Figure 7). For comparison, important structural and functional elements are highlighted on this map. The substrate-binding di-tyrosine flap (YYP) is marked by an ellipse. The one His (H₁₁₃) and three Cys residues (C₅₁, C₅₃, C₁₁₁ dashed boxes) responsible for coordinating the structural zinc ion (Figure 5B gray sphere) converge on the back side of the molecule, basically holding the main domains together. Sequencing studies have highlighted a number of RV isolates that are apparent recombinants within the 2A^pro region [42]. When this occurs, invariably, within or between RV-A and RV-C strains, the identified breakpoints cluster in the central linker region and at the C-terminus, swapping the intact N- and C-terminal domains. That these recombinants are apparently fully functional suggests that the two main domains fold independently, with each domain contributing zinc coordination elements that stabilize the full enzyme.

Figure 7. Residues are color-coded by type. Residues in the catalytic triad (C2: H₁₈, D₃₄, and C₁₀₅A) are boxed with solid lines. Residues whose side chains ligate the zinc ion (C2: C₅₁, C₅₃, C₁₁₁, H₁₁₃) are boxed with dashed lines. The ellipse highlights the conserved YYP sequence in the di-tyrosine flap. Symbols above the sequences indicate secondary structural features as per Figure 4.

The catalytic triads (H₁₈, D₃₄, C₁₀₅) in all four structurally determined enzymes are identical (Figure 7 solid boxes) and located within a pronounced substrate-binding groove opposite to the zinc. The C₁₀₅ nucleophile is in a conserved PGDCGG motif, between two β-strands within the C-terminal domain (cII and dII). In the C2, as well as the CB4 and EV71 structures, this reactive Cys was mutated to Ala to obtain protein sufficiently stable for structure determination. The sequences indicated (Figure 7) reflect those mutations.

Superimposition of the 3D structures of C2 and CB4 2A^pro (Figure 8A; NMR model 1) gave a lower pairwise backbone rmsd (1.809 Å) than might have been expected from the 41% sequence identity. Superimposition of C2 and EV71 2A^pro models (40% sequence identity) yielded the lowest pairwise rmsd (1.4 Å). When electrostatic potential surfaces were generated with the contouring value set to ±10 kT/e (Figure 8 B,C,D,E), all four enzymes exhibited similar negative charge surface distributions (red) despite the overall sequence differences. However, the C2 enzyme (Figure 8B) lacks several intensely basic surface patches (blue) displayed by A2 (Figure 8C), CB4 (Figure 8D) and EV71 (Figure 8E). Examples of sequence differences at aligned positions that result in a more acidic pI for the C2 sequence overall (4.62) than for A2 (5.43), CB4 (5.20), or EV71 (6.04) include C2 G₃₉/A2 R₄₀ and C2 L₆₃/A2 K₆₄. Actually, the C2 enzyme has the most acidic pI of known 2A^pro sequences [8], [9].

Figure 8. Cross-eyed stereoscopic representations of 2A^pro structures. (A) Superimposition of the backbones of the four proteases showing their structural similarity. The pairwise rmsd values for C2 relative to the A2 and CB4 proteases are both 1.809 Å, and that relative to the EV71 protease is 1.4 Å. Poisson-Boltzmann electrostatic potential surfaces, rendered with PyMOL [33], are shown for (B) C2, (C) A2, (D) CB4 and (E) EV71 2A^pro. Each structure is shown in the same orientation. (F) Comparison of the positions of the bII−cII and cII−dII loops in the structures of C2 (*blue*), A2 (*red*), CB4 (*green*), and EV71 (*orange*) 2A^pro.

Other differences between the four structures are observed in the distance between the two loops (bII-cII and cII-dII) that constitute the binding cleft (Figure 8F). The two loops are closest together in the structure of CB4 2A^pro (green) followed by A2 2A^pro (red), and the binding sites of these two proteases can be characterized as closed. By contrast, EV71 2A^pro (orange) and C2 2A^pro (blue) exhibit open binding sites with their two loops about the same distance apart.

Instead of positive charges, the C2 2A^pro structure exposes an unusual level of aromatics on its surface. In most other proteins, aromatics normally contribute to the hydrophobic core that stabilizes the protein structure [43]. The degree of exposure for each residue of C2 2A^pro was determined by comparing the observed solvent accessible surface area (SAS), obtained from STRIDE [38], to theoretical SAS values for a fully exposed residue. By this metric, 12 of 18 (67%) aromatic residues in C2 2A^pro were found to be exposed to solvent (6 Tyr, 4 His, 1 Phe, 1 Trp). Four more are only partially buried (2 Tyr, 2 His), and only two are fully (>90%) buried (Y₅₈, F₁₂₉). Similar analysis of the other structures showed the exposure of 12 of 26 (46%) aromatics in A2 2A^pro (5 Tyr, 6 His, 1 Trp), 12 of 22 (55%) aromatics in CB4 2A^pro (4 Tyr, 5 His, 1 Phe, 2 Trp), and 11 of 20 (55%) aromatics in EV71 2A^pro (5 Tyr, 4 His, 2 Trp). Rather than aromatics, the hydrophobic core of C2 2A^pro consists mostly of Val, Leu and Ile residues, an unusual selection for this purpose. Similar characteristics were noted for CB4 2A^pro [14]. Of the four proteins, C2 2A^pro has the highest ratio of exposed aromatics and also the surface with the lowest positive charge.
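The exposure metric described above is a ratio of observed to theoretical solvent-accessible surface area. The Python sketch below illustrates the idea under stated assumptions: the text specifies only that "fully buried" means more than 90% buried, so the 50% cutoff for "exposed", the function names, and all surface-area values in the example are hypothetical.

```python
def relative_exposure(observed_sas: float, max_sas: float) -> float:
    """Fraction of a residue's theoretical maximum SAS that is exposed."""
    if max_sas <= 0:
        raise ValueError("max_sas must be positive")
    return observed_sas / max_sas


def classify(rel: float, exposed_cutoff: float = 0.5,
             buried_cutoff: float = 0.1) -> str:
    """Bin a relative exposure value.

    buried_cutoff=0.1 mirrors the '>90% buried' criterion in the text;
    exposed_cutoff=0.5 is an assumed value, not taken from the paper.
    """
    if rel < buried_cutoff:
        return "buried"
    if rel >= exposed_cutoff:
        return "exposed"
    return "partially buried"


# Made-up (observed SAS, theoretical SAS) pairs in square angstroms.
examples = {
    "aromatic_1": (5.0, 250.0),
    "aromatic_2": (180.0, 280.0),
    "aromatic_3": (60.0, 220.0),
}
for name, (obs, theo) in examples.items():
    rel = relative_exposure(obs, theo)
    print(f"{name}: {rel:.2f} -> {classify(rel)}")

# Tally reported in the text for C2 2Apro: 12 exposed, 4 partial, 2 buried.
exposed, partial, buried = 12, 4, 2
print(f"{100 * exposed / (exposed + partial + buried):.0f}% of aromatics exposed")
```

Moving the assumed cutoff for "exposed" would shift residues between the first two bins, so the percentages are meaningful mainly for comparing proteins analyzed with the same thresholds.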

RV 2A^pro Sequence and Structural Variability

Comparison of the four structures now available supports the idea that the hallmark sequence variability among enterovirus 2A^pro translates mostly into surface charge variability, rather than alterations in the essential core configuration, the loop lengths, or internal dynamics that might affect the catalytic residues [14]. These are relatively rigid proteases, and yet in infected cells, different RV isolates are quite selective about their substrate preferences and rates of cleavage [7], [17]. To date, the preferences of only six RV enzymes (A16, A89, B4, B14, C2, C6) have been compared head-to-head [9], although seven more (A1, A2, A45, A95, B17, B52, C15) were recently cloned and are undergoing similar tests (K. Watters and A. C. Palmenberg, unpublished). Polyclonal antibodies raised against the A16 enzyme cross-react with C15 but not C2 (Watters and Palmenberg, 2011), verifying differences at the surface level, but also suggesting the general 2A^pro proclivities may eventually cluster into a limited series of reactive clades, along sequence (e.g. A16 and C15) or species (A or B or C) lines. Because many of the preferred, natural Nup substrates for 2A^pro lie buried in the hydrophobic cores of the nuclear pores, perhaps the surface groupings influence physical accessibility, contributing at least in part to the observed cleavage patterns. Surface differences between the A2 and CB4 enzymes have been shown to directly affect the relative rates of eIF4G cleavage [44].

Another possibility is that the substrate binding pocket, sensitive to the P8−P2′ sequence of the substrate, is the key to specificity [15]. Created in part by the variable di-tyrosine flap, the binding groove is responsive, even during the autocatalytic self-cleaving event, to the sequence and shape of the substrate that fills it. When nine amino acids flanking the NH₂-terminus of B14 2A^pro were substituted into an A1 or A2 context, the chimeras were unable to cleave themselves from their polyproteins [45]. The same was true when the A2 enzyme was tested in trans against peptides encoding other RV processing sites, even those from closely related viruses [16]. It required at least three substitutions within this length to re-establish activity. The protease reacted to mutated residues in the P2, P1 and P2′ locations during cis reactions [45], but was apparently tolerant of certain changes in the P1, P2′, and P3′ locations during trans reactions [16]. Clearly, all these enzymes are sensing both the shape and sequence of their targets [14]. A WebLogo depiction [46] summarizing all known RV sequences within the self-cleavage sites (Figure 9) highlights the variability encoded here. Not only are the RV-B enzymes extended by two amino acids (cleavage is between positions “−1” and “1”), but there is almost no consensus within or between species. The di-tyrosine flap, both upstream and downstream of the few conserved residues (YYP), is another region with pronounced variability. The flap forms one side of the binding cleft (Figure 5B) where substrate acceptance is a prerequisite to the conformational changes that occur during catalysis. In contrast, the zinc-binding residues, the catalytic triad, and the C-terminal dipeptide (Q/G) recognized by 3C^pro are absolutely conserved in all species, types, and isolates (n = 348). The 3C^pro enzymes as a rule have more limited selectivity, and for all RVs, the carboxyl terminus of 2A^pro is released at an identical Gln/Gly pair.

Figure 9. RV sequences by species. WebLogo depictions [46] summarize full species alignment information for key 2A^pro residues. RV polyprotein alignments have been described [8]. This dataset compared RV-A (79 types, 208 seqs), RV-B (30 types, 74 seqs), and RV-C (32 types, 67 seqs). The residue height indicates the relative amino acid frequency. The A2, B14 and C2 numbering system is for the native, ungapped proteins.
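The per-position frequencies that underlie such a logo can be tallied directly from an alignment. The Python sketch below is an illustration only: the function name and the toy alignment are hypothetical, and WebLogo itself may scale letter heights differently depending on its settings.

```python
from collections import Counter
from typing import Dict, List


def column_frequencies(aligned_seqs: List[str]) -> List[Dict[str, float]]:
    """Relative residue frequency at each column of an alignment."""
    if not aligned_seqs:
        raise ValueError("alignment is empty")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have equal length")
    freqs = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        total = sum(counts.values())
        freqs.append({res: n / total for res, n in counts.items()})
    return freqs


# Toy alignment (made up) around a conserved three-residue motif.
toy_alignment = ["YYP", "YYP", "YFP", "YYA"]
for position, column in enumerate(column_frequencies(toy_alignment), start=1):
    ranked = sorted(column.items(), key=lambda item: -item[1])
    print(position, ranked)
```

A fully conserved column yields a single residue with frequency 1.0, which corresponds to the absolutely conserved zinc-binding and catalytic positions noted in the text.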

The current determination of the structure of C2 2A^pro is only the start of further investigations that compare and contrast this important cohort of enzymes. It has been proposed that the particular avidities with which individual 2A^pro attack their Nups (or eIF4G) profoundly affect relative viral replication levels, intracellular signaling, or extracellular signaling, all of which are underlying triggers for different host immune responses [9]. It is important to define these mechanisms, embedded in the structures, in order to understand the consequent variability among virus phenotypes.

Associated Content

Accession Codes

The atomic coordinates, assigned chemical shifts, and structural constraints were deposited in the PDB with ID code 2M5T. NMR data were deposited in the BMRB with ID code 19079.

Acknowledgments

The authors thank CESG staff members Lai Bergeman, Soyoon Hwang, Jaclyn Saunders, Darius Chow, Brian Fox, John Primm, and Donna Troestler for their contributions to this project.

Data Availability

The authors confirm that all data underlying the findings are fully available without restriction. Worldwide Protein Data Bank (wwpdb.org): 2M5T; BioMagResBank (bmrb.wisc.edu): 19079.

Funding Statement

This work was supported by National Institutes of Health grant U19 AI104317 to ACP, NIH training grant T32 AI078985 to KW, and NIH grants U01 GM094622 and P41GM103399 to JLM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Clark ME, Hämmerle T, Wimmer E, Dasgupta A (1991) Poliovirus proteinase 3C converts an active form of transcription factor IIIC to an inactive form: a mechanism for inhibition of host cell polymerase III transcription by poliovirus. EMBO J 10: 2941–2947.
2. Yalamanchili P, Datta U, Dasgupta A (1997) Inhibition of host cell transcription by poliovirus: cleavage of transcription factor CREB by poliovirus-encoded protease 3Cpro. J Virol 71: 1220–1226.
3. Lamphear BJ, Yan R, Yang F, Waters D, Liebig HD, et al. (1993) Mapping the cleavage site in protein synthesis initiation factor eIF-4 gamma of the 2A proteases from human Coxsackievirus and rhinovirus. J Biol Chem 268: 19200–19203.
4. Liebig HD, Seipelt J, Vassilieva E, Gradi A, Kuechler E (2002) A thermosensitive mutant of HRV2 2A proteinase: evidence for direct cleavage of eIF4GI and eIF4GII. FEBS Lett 523: 53–57.
5. Castelló A, Izquierdo JM, Welnowska E, Carrasco L (2009) RNA nuclear export is blocked by poliovirus 2A protease and is concomitant with nucleoporin cleavage. J Cell Sci 122: 3799–3809.
6. Gustin KE, Sarnow P (2002) Inhibition of nuclear import and alteration of nuclear pore complex composition by rhinovirus. J Virol 76: 8787–8796.
7. Skern T, Sommergruber W, Auer H, Volkmann P, Zorn M, et al. (1991) Substrate requirements of a human rhinoviral 2A proteinase. Virology 181: 46–54.
8. Palmenberg AC, Rathe JA, Liggett SB (2010) Analysis of the complete genome sequences of human rhinovirus. J Allergy Clin Immunol 125: 1190–1199; quiz 1200–1201.
9. Watters K, Palmenberg AC (2011) Differential processing of nuclear pore complex proteins by rhinovirus 2A proteases from different species and serotypes. J Virol 85: 10874–10883.
10. Bazan JF, Fletterick RJ (1988) Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: structural and functional implications. Proc Natl Acad Sci USA 85: 7872–7876.
11. Petersen JF, Cherney MM, Liebig HD, Skern T, Kuechler E, et al. (1999) The structure of the 2A proteinase from a common cold virus: a proteinase responsible for the shut-off of host-cell protein synthesis. EMBO J 18: 5463–5475.
12. Cai Q, Yameen M, Liu W, Gao Z, Li Y, et al. (2013) Conformational Plasticity of the 2A Proteinase from Enterovirus 71. Journal of Virology 87: 7348–7356.
13. Mu Z, Wang B, Zhang X, Gao X, Bo Q, et al. (2013) Crystal Structure of 2A Proteinase from Hand, Foot and Mouth Disease Virus. Journal of Molecular Biology 425: 4530–4543.
14. Baxter NJ, Roetzer A, Liebig H-D, Sedelnikova SE, Hounslow AM, et al. (2006) Structure and dynamics of coxsackievirus B4 2A proteinase, an enyzme involved in the etiology of heart disease. J Virol 80: 1451–1462.
15. Wang QM, Johnson RB, Sommergruber W, Shepherd TA (1998) Development of in vitro peptide substrates for human rhinovirus-14 2A protease. Arch Biochem Biophys 356: 12–18.
16. Sommergruber W, Ahorn H, Zöphel A, Maurer-Fogy I, Fessl F, et al. (1992) Cleavage specificity on synthetic peptide substrates of human rhinovirus 2 proteinase 2A. J Biol Chem 267: 22639–22644.
17. Sousa C, Schmid EM, Skern T (2006) Defining residues involved in human rhinovirus 2A proteinase substrate recognition. FEBS Lett 580: 5713–5717.
18. Dominguez SR, Briese T, Palacios G, Hui J, Villari J, et al. (2008) Multiplex MassTag-PCR for respiratory pathogens in pediatric nasopharyngeal washes negative by conventional diagnostic testing shows a high prevalence of viruses belonging to a newly recognized rhinovirus clade. J Clin Virol 43: 219–222.
19. Bochkov YA, Palmenberg AC, Lee W-M, Rathe JA, Amineva SP, et al. (2011) Molecular modeling, organ culture and reverse genetics for a newly identified human rhinovirus C. Nat Med 17: 627–632.
20. Lee W, Westler WM, Bahrami A, Eghbalnia HR, Markley JL (2009) PINE-SPARKY: graphical interface for evaluating automated probabilistic peak assignments in protein NMR spectroscopy. Bioinformatics 25: 2085–2087.
21. Lee W, Kim JH, Westler WM, Markley JL (2011) PONDEROSA, an automated 3D-NOESY peak picking program, enables automated protein structure determination. Bioinformatics 27: 1727–1728.
22. Lee W-M, Kiesner C, Pappas T, Lee I, Grindle K, et al. (2007) A diverse group of previously unrecognized human rhinoviruses are common causes of respiratory illnesses in infants. PLoS ONE 2: e966.
23. Klock HE, Lesley SA (2009) The Polymerase Incomplete Primer Extension (PIPE) method applied to high-throughput cloning and site-directed mutagenesis. Methods Mol Biol 498: 91–103.
24. Frederick RO, Bergeman L, Blommel PG, Bailey LJ, McCoy JG, et al. (2007) Small-scale, semi-automated purification of eukaryotic proteins for structure determination. J Struct Funct Genomics 8: 153–166.
25. Studier FW (2005) Protein production by auto-induction in high density shaking cultures. Protein Expr Purif 41: 207–234.
26. Blommel PG, Becker KJ, Duvnjak P, Fox BG (2007) Enhanced bacterial protein expression during auto-induction obtained by alteration of lac repressor dosage and medium composition. Biotechnol Prog 23: 585–598.
27. Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, et al. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 6: 277–293.
28. Goddard TD, Kneller DG (2008) SPARKY 3. University of California, San Francisco.
29. Shin J, Lee W, Lee W (2008) Structural proteomics by NMR spectroscopy. Expert Rev Proteomics 5: 589–601.
30. Bahrami A, Assadi AH, Markley JL, Eghbalnia HR (2009) Probabilistic interaction network of evidence algorithm and its application to complete labeling of peak lists from protein NMR spectroscopy. PLoS Comput Biol 5: e1000307.
31. Güntert P (2004) Automated NMR structure calculation with CYANA. Methods Mol Biol 278: 353–378.
32. Shen Y, Delaglio F, Cornilescu G, Bax A (2009) TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J Biomol NMR 44: 213–223.
33. DeLano WL, Lam JW (2005) PyMOL: A communications tool for computational models. Abstr Pap Am Chem S 230: U1371–U1372.
34. Bhattacharya A, Tejero R, Montelione GT (2007) Evaluating protein structures determined by structural genomics consortia. Proteins 66: 778–795.
35. Koradi R, Billeter M, Wüthrich K (1996) MOLMOL: a program for display and analysis of macromolecular structures. J Mol Graph 14: 51–55, 29–32.
36. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA (2001) Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci USA 98: 10037–10041.
37. Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA (2004) PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Res 32: W665–667.
38. Frishman D, Argos P (1995) Knowledge-based protein secondary structure assignment. Proteins 23: 566–579.
39. Chen VB, Arendall WB 3rd, Headd JJ, Keedy DA, Immormino RM, et al. (2010) MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 66: 12–21.
40. Laskowski RA, Rullmannn JA, MacArthur MW, Kaptein R, Thornton JM (1996) AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR 8: 477–486.
41. Eisenhaber F, Argos P (1993) Improved strategy in analytic surface calculation for molecular systems: Handling of singularities and computational efficiency. Journal of Computational Chemistry 14: 1272–1280.
42. McIntyre CL, McWilliam Leitch EC, Savolainen-Kopra C, Hovi T, Simmonds P (2010) Analysis of genetic diversity and sites of recombination in human rhinovirus species C. J Virol 84: 10297–10310.
43. Cox JD, Hunt JA, Compher KM, Fierke CA, Christianson DW (2000) Structural influence of hydrophobic core residues on metal binding and specificity in carbonic anhydrase II. Biochemistry 39: 13687–13694.
44. Foeger N, Schmid EM, Skern T (2003) Human rhinovirus 2 2Apro recognition of eukaryotic initiation factor 4GI. Involvement of an exosite. J Biol Chem 278: 33200–33207.
45. Neubauer D, Aumayr M, Gösler I, Skern T (2013) Specificity of human rhinovirus 2A(pro) is determined by combined spatial properties of four cleavage site residues. J Gen Virol 94: 1535–1546.
46. Crooks GE, Hon G, Chandonia J-M, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188–1190.

