Skip to main content
Science Advances logoLink to Science Advances
. 2022 Dec 9;8(49):eadd2191. doi: 10.1126/sciadv.add2191

Biochemical and structural insights into SARS-CoV-2 polyprotein processing by Mpro

Ruchi Yadav 1,2,, Valentine V Courouble 3,4,, Sanjay K Dey 1,2,, Jerry Joe E K Harrison 5, Jennifer Timm 1, Jesse B Hopkins 6, Ryan L Slack 7, Stefan G Sarafianos 7,8, Francesc X Ruiz 1,2,*, Patrick R Griffin 3,4,9,10,*, Eddy Arnold 1,2,*
PMCID: PMC9733933  PMID: 36490335

Abstract

SARS-CoV-2, a human coronavirus, is the causative agent of the COVID-19 pandemic. Its genome is translated into two large polyproteins subsequently cleaved by viral papain-like protease and main protease (Mpro). Polyprotein processing is essential yet incompletely understood. We studied Mpro-mediated processing of the nsp7-11 polyprotein, whose mature products include cofactors of the viral replicase, and identified the order of cleavages. Integrative modeling based on mass spectrometry (including hydrogen-deuterium exchange and cross-linking) and x-ray scattering yielded a nsp7-11 structural ensemble, demonstrating shared secondary structural elements with individual nsps. The pattern of cross-links and HDX footprint of the C145A Mpro and nsp7-11 complex demonstrate preferential binding of the enzyme active site to the polyprotein junction sites and additional transient contacts to help orient the enzyme on its substrate for cleavage. Last, proteolysis assays were used to characterize the effect of inhibitors/binders on Mpro processing/inhibition using the nsp7-11 polyprotein as substrate.


Elucidation of the structural basis of cleavage order of SARS-CoV-2 nsp7-11 polyprotein with implications for Mpro inhibition.

INTRODUCTION

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; CoV-2), a member of the family Coronaviridae, is responsible for the ongoing coronavirus disease 2019 (COVID-19) global pandemic (1). The toll of CoV-2 is extraordinary in terms of worldwide repercussions in the number of infected people, deaths, and pace of infection spread (https://covid19.who.int/). SARS-CoV-2 has a ~30-kb (+)-sense RNA genome, one of the largest known of any RNA virus, that encodes 13 open reading frames (ORFs), including replicase (ORF1a/ORF1b), spike (S), envelope (E), membrane (M), nucleocapsid (N), and 7 other ORFs that encode accessory proteins (2). ORF1a and ORF1b are translated to produce two large polyproteins, pp1a and pp1ab. These polyproteins are subsequently cleaved into 16 nonstructural proteins (nsps) by virally encoded proteases: the papain-like protease (PLpro; a domain of nsp3), which cleaves junctions from nsp1 to the nsp4 N terminus, and the main protease (Mpro; nsp5, 3C-like protease), which cleaves junctions from the nsp4 C terminus to nsp16 (3). The “polyprotein strategy”—used by most RNA viruses and retroviruses—allows for (i) a more compact genome, (ii) regulation of activity through a precise temporal (i.e., stage of viral cycle) and spatial (i.e., subcellular location) cleavage pattern, and (iii) cleavage intermediates having distinct and critical roles from those of the cleaved products, as shown for alphaviruses, picornaviruses, and noroviruses (46). Hence, coordinated processing of polyproteins is vital for regulating the viral life cycle.

Different polyprotein intermediates derived from Mpro-mediated pp1a/1b processing have been detected in other CoVs, including mouse hepatitis virus (MHV) (7, 8) and alphacoronavirus human CoV 229E (HCoV-229E) (9). CoV-2 and MHV belong to the betacoronavirus genus with the latter being a good surrogate mouse model for studying CoV-2 infection and biology (1012). Notably, mutations in the junction sites within the MHV nsp7-10 polyprotein were found to be lethal for viral replication, with the exception of the nsp9-10 site, where mutations led to a crippled mutant virus (8). In addition, a polyprotein intermediate of ~150 kDa corresponding to nsp4-10/11 has been detected in pulse-chase experiments (13, 14). Reverse genetic studies with temperature-sensitive mutants in MHV suggest that nsp4-11 could serve as a scaffold where replicative enzymes (nsp12, nsp14, and nsp16) may dock to perform their activities on the viral RNA (15). Alternatively, they may indicate that mutation of this intermediate perturbs Mpro processing (16). Thus, the functional roles of nsp4-10/11 in virus replication remain unclear. In addition, the subcellular localization of nsp7 to nsp10 has been studied for several CoVs using immunofluorescence microscopy and cryo–electron microscopy/tomography. pp1a/1ab is anchored to the endoplasmic reticulum membranes by flanking transmembrane domains of nsp4 and nsp6, along with membrane-spanning nsp3. This topology results in the membrane-anchored Mpro being exposed to the cytosol along with, most likely, the nsp7-10/7-11 region (7, 1720). Moreover, data from CoVs and other RNA viruses suggest that “convoluted membranes” (the precursors of the coronavirus replication organelles formed by double-membrane vesicles) may be the main site of viral gene expression and polyprotein processing. However, it should be noted that these labeling techniques cannot distinguish between mature nsps and polyprotein intermediates.

More recently in CoV-2–infected cells, the identification of viral cleavage sites at nsp4, nsp8-9, and nsp10-12 junctions at different postinfection time points has validated the presence of polyprotein intermediates and thus garnered support for further investigation into their functional relevance and structures (3). Krichel and coauthors (21, 22) have applied a structure-function approach to investigate the processing of the SARS-CoV nsp7-10 and Middle East respiratory syndrome coronavirus (MERS-CoV) nsp7-11 polyproteins in vitro using native mass spectrometry (MS). Their results emphasized the critical role of the polyprotein conformation and the structural environment of the cleavage junctions in determining cleavage order, as the order of processing was previously inferred by determining the specific activity of Mpro cleavage on short oligopeptide sequences comprising the cleavage junctions (23).

One of the most investigated CoV-2 targets has been Mpro with ~500 Protein Data Bank (PDB) structural depositions (https://rcsb.org/covid19). These structures include Mpro in both immature forms (24) and its mature apo form [https://rcsb.org/covid19; (25, 26)]. Furthermore, there are multiple structures of Mpro bound with inhibitors (27, 28), including the recently Food and Drug Administration (FDA)–approved Pfizer inhibitor [PF-07321332, nirmatrelvir (NMTV)] (29), small molecules and fragment binders (3032), and several structures with peptide substrates and products (3335). Even with these efforts and with >2000 SARS-CoV-2 PDB depositions, no CoV polyprotein structures have been reported to date (https://rcsb.org/covid19). Despite their importance in the viral life cycle, polyprotein structural knowledge is very underrepresented in comparison to the multitude of solved structures of mature, postcleavage proteins (6).

Here, we have used a multipronged approach to study the structural basis of processing of the CoV-2 nsp7-11 polyprotein by Mpro in vitro, given its highly dynamic nature and multidomain organization. We have characterized the processing kinetics through gel-based and pulse labeling MS techniques, as well as the structural footprint of the polyprotein on Mpro and vice versa. We also developed integrative structural models of the nsp7-11 and nsp7-8 polyproteins [by MS, small-angle x-ray scattering (SAXS), and molecular modeling]. These experiments allowed us to rationalize how the tertiary structure of the polyprotein influences the order of processing of the polyprotein by Mpro and provided insights into the binding of the polyprotein substrate to Mpro. Lastly, taking advantage of the vast number of Mpro-ligand structures, we identified a set of binders (with some displaying antiviral activity) overlapping with regions of Mpro relevant for polyprotein binding outside of its active site and probed them in proteolytic inhibition assays including the full-length polyprotein as substrate. Taken together, the information gathered from this study improves our understanding of the role of polyproteins in SARS-CoV-2 viral replication.

RESULTS

SDS-PAGE analysis reveals cleavage order for Mpro-mediated processing of the nsp7-11 polyprotein

Polyprotein processing in coronaviruses is a precise and tightly regulated process (8, 16, 36). We first expressed and purified the nsp7-11 and nsp7-8 polyproteins and wild-type (WT) Mpro (fig. S1) to assess the proteolytic cleavage order of SARS-CoV-2 polyproteins. Next, we conducted a semiquantitative proteolysis assay of the nsp7-11 polyprotein with Mpro using SDS–polyacrylamide gel electrophoresis (SDS-PAGE) as a readout (Fig. 1A and fig. S2). Analysis of the nsp7-11 polyprotein processing at room temperature revealed that the nsp9-10 junction was cleaved initially (starting ~30 min), followed by simultaneous cleavage of the nsp8-9 and nsp10-11 junctions (starting ~2 hours), and lastly the nsp7-8 junction (starting ~4 hours) (Fig. 1, A and B). This order of cleavage is identical to the polyprotein processing order reported for SARS-CoV (CoV-1), which was expected given their high amino acid sequence conservation (22). Identification of the protein bands in the SDS-PAGE gel was done by performing in-gel tryptic digestion followed by liquid chromatography–tandem MS (LC-MS/MS) analysis (fig. S2). Authentic N- and C-terminal peptides were observed for some of the proteins, suggesting that the Mpro cleavage resembles the authentic processing in our in vitro system. Moreover, performing the proteolysis at 4°C slowed the cleavage process but did not change the cleavage order. Altering ratios of Mpro to polyprotein also had no effect on the cleavage order (1:6 and 1:12 molar ratios) (fig. S3), further supporting the specificity of Mpro and the lack of a temperature/concentration-dependent effect on cleavage order. After 24 hours of exposure to Mpro, the nsp7-8 junction was not completely cleaved. The proteolysis assay with the nsp7-8 intermediate polyprotein showed that nsp7-8 was also not fully cleaved after 24 hours (1:10 molar ratio of Mpro:nsp7-8) (fig. S4), suggesting that the structural environment around the nsp7-8 junction impedes efficient Mpro cleavage with respect to the other junction sites.

Fig. 1. In vitro analysis of SARS-CoV-2 nsp7-11 polyprotein processing by WT Mpro.

Fig. 1.

(A) SDS–polyacrylamide gel electrophoresis (PAGE) showing the limited proteolysis of nsp7-11 polyprotein by Mpro over a time course of 24 hours at RT (room temperature). +/− shows the presence or absence of the respective proteins. The lane labeled as M is the protein marker. Black arrows on the right indicate the proteins generated from the cleavage of nsp7-11 polyprotein by Mpro. (B) Schematic representation of the cleavage order of the nsp7-11 polyprotein by Mpro. (C) Pulsed HDX-MS analysis of nsp7-11 with Mpro. Color scale represents changes in deuterium uptake over the course of the cleavage reaction, with gray representing no significant change in deuterium uptake, white denoting no sequence coverage, and black representing residues within peptides that are no longer identifiable.

Pulse labeling HDX-MS validates the cleavage order of polyprotein processing observed by SDS-PAGE analysis

To gain further structural insight into the polyprotein processing by Mpro, we conducted the same proteolysis assay followed by pulse labeling hydrogen-deuterium exchange MS (HDX-MS). HDX-MS is a useful tool for probing protein conformation and dynamics by measuring protein backbone amide hydrogen exchange. During conventional continuous labeling HDX-MS, proteins are incubated in deuterated buffer for increasing lengths of time such that changes in solvent exchange, as measured by deuterium uptake, represent changes in equilibrium conformational dynamics as a result of alterations in the backbone amide hydrogen chemical environment, such as changes in hydrogen bonding and/or solvent accessibility. However, with pulsed HDX-MS, protein interactions are sampled at different incubation intervals and then subjected to a single, short deuterium labeling step. Thus, observed changes in solvent exchange are representative of distinct protein populations present in solution. Briefly, we incubated nsp7-11 with Mpro at an equimolar ratio, and the cleavage reaction was allowed to proceed over 24 hours on ice. Aliquots of the reaction were taken at various time intervals and incubated in deuterated buffer for 30 s on ice before being quenched, flash-frozen, and stored until ready for MS analysis. All time points were compared to nsp7-11 without Mpro to observe changes in solvent exchange occurring in the polyprotein over the course of the proteolytic process (Fig. 1C, fig. S5, and table S1).

We observed increased solvent exchange in the nsp9 C-terminal region at 30 min as compared to the protein in the absence of enzyme (Fig. 1C). Solvent exchange continued to increase in this region during the proteolysis time course and was followed by increased solvent exchange in the nsp10 N-terminal region. These increases in solvent exchange can be explained by cleavage of the nsp9-10 junction releasing the nsp7-9 N termini and nsp10-11 C termini from the intact polyprotein junction. These newly formed termini have increased conformational mobility compared to the intact polyprotein junctions. Concomitantly, we observed decreased solvent exchange at the cleavage junction residues, which is a result of reduced solvent accessibility due to Mpro binding at the junction sites during the cleavage reaction. As the proteolysis time course progressed, we also observed a decrease in signal intensity of the peptides spanning the nsp9-10 junction site, providing further evidence of productive cleavage of the nsp9-10 junction. At 24 hours, we were no longer able to detect these peptides in the mass spectrometer, indicating full cleavage of the nsp9-10 site (fig. S5). These observations suggest that the nsp9-10 junction is being cleaved first, consistent with our SDS-PAGE analysis (Fig. 1A).

The nsp8-9 and nsp10-11 junctions appeared to be simultaneously cleaved next as both junctions showed changes in solvent exchange starting at 4 hours. At the nsp9 N-terminal region, we observed increased solvent exchange compared to the protein in the absence of the enzyme. As we were unable to detect peptides specifically spanning the nsp8-9 junction, we could not determine the exact timing of full cleavage. Meanwhile, at the nsp10-11 junction site, we observed decreased solvent exchange and reduced signal intensity in the peptides spanning the junction site until 24 hours when they were no longer identified in the mass spectrometer, suggesting full cleavage. While it appears that nsp11 is no longer observed at 24 hours, this is due to our inability to detect any nsp11-only peptides, as it is only 13 amino acids in length, such that all the peptides covering nsp11 also include the junction residues (fig. S5). Nevertheless, it was clear that nsp8-9 and nsp10-11 are cleaved simultaneously, following cleavage at the nsp9-10 site.

The nsp7-8 site is cleaved last, as we did not observe any changes in deuterium uptake near the nsp7-8 junction until 8 hours. While we did not observe decreased solvent exchange in the peptides spanning the nsp7-8 junction to indicate Mpro binding at this site, we no longer detected these peptides in the mass spectrometer at 24 hours, suggesting that the nsp7-8 junction is cleaved by Mpro.

We also observed changes in solvent exchange away from the junction sites in nsp7 and nsp8. The protection from solvent exchange within both nsp7 and nsp8 suggests that nsp7 and nsp8 associate into a heterodimer after their release from the polyprotein. The pattern of protection observed in nsp7 agreed well with the protection we observed in our prior continuous labeling HDX-MS analysis of the nsp7:nsp8 complex (wherein we confirmed agreement of our HDX-MS data with the multiple x-ray crystal structures of “linear” heterotetrameric nsp7:nsp8 complexes) (37). Unexpectedly, increased exchange within nsp8 was also observed starting at 2 hours. These peptides demonstrated EX1 exchange kinetics as revealed by the detection of two distinct deuterated ion distributions (bimodal) for the same peptide (38). Under native conditions, this behavior has been shown to be a result of multiple intermediate conformational protein states (3942). The observed EX1 behavior in nsp8 could be explained by the increased flexibility of the nsp8 N terminus, adopting multiple conformations as previously documented (21, 37).

In addition, nsp10 also showed decreased deuterium uptake away from the junction site. Comparing the intrinsic exchange profile of nsp7-11, nsp7-10, and individual nsp10 (see the Section ‘HDX-MS profile of polyprotein revealed similar secondary structure elements to individual nsps’), we observed that nsp10 has the greatest deuterium uptake in nsp7-11, while nsp7-10 and nsp10 showed similar intrinsic exchange profiles, suggesting that the presence of nsp11 reduces solvent exchange within nsp10. Specifically, the regions of protection from solvent exchange observed in nsp10 during the pulsed HDX-MS experiment align with the residues showing decreased deuterium uptake in mature nsp10 and nsp7-10. Moreover, the pulsed HDX-MS following nsp7-10 proteolysis (fig. S6) did not show any changes in deuterium exchange in nsp10, as expected from the comparison of the intrinsic exchange profiles of nsp7-10 and mature nsp10. This confirms that released nsp10 does not interact with other liberated proteins in solution, and the observed decreased solvent exchange in nsp10 is due to the release of nsp11 from nsp10.

Overall, the results of pulsed HDX-MS of nsp7-11 proteolysis were consistent with the SDS-PAGE proteolytic results, showing the processing order to be: (i) nsp9-10, (ii) nsp8-9 and nsp10-11, and, lastly, (iii) nsp7-8 (Fig. 1B). Moreover, pulsed HDX-MS with nsp7-10 displayed similar results with the same processing order as well as interaction of mature nsp7 and nsp8 after their release (fig. S6).

Continuous labeling HDX-MS demonstrates localized sites of interaction of C145A Mpro to polyprotein junction sites

Next, we used continuous labeling HDX-MS and cross-linking MS (XL-MS) as complementary techniques to better understand the solution phase dynamics of the complex. To facilitate these studies, we used a catalytically inactive Mpro mutant, C145A Mpro. Mutation of the active site cysteine to alanine inhibits the cleavage of the substrate, allowing us to capture the bound Mpro:polyprotein complex. Using continuous labeling HDX-MS, we compared nsp7-11 versus nsp7-11 in complex with C145A Mpro at an equimolar ratio (Fig. 2A, fig. S7A, and table S1). Increased protection from solvent exchange was observed at all junction sites except the nsp10-11 junction. The nsp9-10 junction had the largest magnitude in protection from solvent exchange, which suggests that it is the primary binding site on the polyprotein, as the presence of Mpro at the junction site would reduce solvent accessibility and solvent exchange. This observation is consistent with the SDS-PAGE (Fig. 1A) and pulsed HDX-MS results (Fig. 1C) that also indicate the nsp9-10 junction to be the initial target of Mpro. Minimal alteration of solvent exchange was observed within the nsp subdomains outside of the junction sites, suggesting that Mpro interaction with the polyprotein is favored at the junction site sequences and that binding of Mpro does not induce significant long-range conformational changes in the polyprotein. Only nsp8 showed additional regions of protection from solvent exchange away from the junctions, specifically residues T120 to M140 and K182 to L213. These regions of the polyprotein are inherently more dynamic, as determined by higher intrinsic deuterium exchange (Fig. 3A), and thus, the observed protection suggests that interaction with C145A Mpro is stabilizing the nsp8 N-terminal region.

Fig. 2. HDX-MS and XL-MS reveal the in-solution dynamics of the Mpro polyprotein complex.

Fig. 2.

Consolidated differential HDX-MS results of (A) nsp7-11 versus nsp7-11 in complex with C145A Mpro, (B) nsp7-10 versus nsp7-10 in complex with C145A Mpro, (C) C145A Mpro versus C145A Mpro with nsp7-11 over time course up to 12 hours, and (D) C145A Mpro versus C145A Mpro with nsp7-11 at 12 hours only. All consolidated differential HDX-MS results are colored on the basis of the change in percent deuterium as described in the scale bar, with regions showing no significant change in deuterium in gray and regions with no sequence coverage in white. (E) Overlay of HDX-MS and XL-MS results on nsp7-10 and C145A Mpro sequences. Observed intra-Mpro and intra-nsp7-10 cross-links colored in black and inter-C145A Mpro to nsp7-10 cross-links colored in green. Consolidated changes in percent deuterium uptake are taken from (B) and (C). (F) Overlay of HDX-MS results from (D) on C145A Mpro (modeled on the basis of PDB 7DVY), with residues forming interprotein cross-links with nsp7-10 shown as sticks and labeled. Consolidated changes in percent deuterium uptake are colored according to the scale bar, with regions showing no significant change in deuterium in gray and regions with no sequence coverage in black.

Fig. 3. Integrative structural modeling generates an ensemble of nsp7-11 models.

Fig. 3.

(A) Plots of observed percent deuterium per residue for nsp7, nsp8, nsp9, nsp7-8, nsp7-10, and nsp7-11. Secondary structures from PDB 6YHU for nsp7 and nsp8, PDB 6WXD for nsp9, and PDB 6ZPE for nsp10 are drawn on plots with α helices shown as barrels, β strands shown as arrows, and coils shown as rectangles. (B) Scheme of integrative structural modeling workflow for the nsp7-11 polyprotein. One model is shown to represent all 10 generated models. (C) Top 10 nsp7-11 models grouped into four representative tertiary structures. Models are colored by nsp, with nsp7 in magenta, nsp8 in purple, nsp9 in teal, nsp10 in cyan, and nsp11 in tan.

Similar results were observed with the nsp7-10 polyprotein (Fig. 2B, fig. S7B, and table S1). All the cleavage sites in nsp7-10 were protected from solvent exchange upon interaction with C145A Mpro. In addition, the nsp9-10 junction showed the greatest magnitude in protection from solvent exchange, while only nsp8 showed additional regions of protection outside the junctions. These results indicate that nsp11 does not alter the polyprotein interactions with C145A Mpro.

Continuous labeling HDX-MS validates polyprotein binding to C145A Mpro beyond the active site pocket

Next, we profiled changes in solvent exchange of C145A Mpro bound to nsp7-11 at an equimolar ratio but did not observe any significant protection from solvent exchange within C145A Mpro (fig. S7C), despite evidence from the above differential HDX-MS experiments that C145A Mpro is interacting with the polyprotein. We noted that C145A Mpro deuterium uptake was relatively low over an experimental time course of 1 hour, suggesting that the Mpro dimer is stable. Thus, we incubated C145A Mpro and the C145A Mpro:nsp7-11 complex in deuterium buffer for 12 hours to increase the observable experimental window for detecting protection from solvent exchange. Overall, deuterium uptake was increased over this longer time course, and significant protection from solvent exchange was observed in the active site peptides in the region containing the C145A mutation only when in the presence of the polyprotein (Fig. 2C, fig. S7D, and table S1). Additional regions of the enzyme were shown to be protected from solvent exchange at 12 hours of incubation in deuterated buffer only, including peptides spanning residues T45 to S46, V77 to L89, I106 to F150, Y161 to L205, and D289 to Q306 (Fig. 2D, fig. S7E, and table S1). These regions mapped from the back side of the C145 Mpro active site pocket, including the other catalytic dyad residue (H41), to the vicinity of the dimerization interface, hinting at a putative binding track for the rest of the nsp7-11 polyprotein outside of the nsp9-10 junction (see below for more details).

XL-MS demonstrates additional contact sites between C145A Mpro and polyproteins

While the continuous labeling HDX-MS analysis of the C145A Mpro polyprotein complex reported on changes in protein backbone dynamics, we also analyzed the complex using XL-MS to probe protein side-chain reactivity and spatial proximity. A total of nine interprotein cross-links were identified between C145A Mpro and nsp7-10 (Fig. 2E). The three C145A Mpro residues (K61, S62, and K102) that form interprotein cross-links with nsp7-10 were mapped to the catalytic domain (Fig. 2F). When these cross-links were mapped alongside the HDX-MS data, we observed that the interprotein cross-links map to residues outside of the regions showing protection from solvent exchange (Fig. 2E). Accordingly, these cross-links represent additional contact sites between the polyprotein and Mpro that might help orient the junction sites into the active site and stabilize the complex. For example, K162 within nsp8 is located between the two regions of nsp8 showing protection from solvent exchange (Fig. 2B) and forms an interprotein cross-link with S62 and K102 in Mpro. This further supports that interaction with C145A Mpro stabilizes nsp8 conformation. No cross-links were observed between the Mpro active site and polyprotein junction sites as we used the MS-cleavable cross-linker disuccinimidyl sulfoxide (DSSO) that primarily reacts with lysine as well as serine, threonine, and tyrosine albeit with lower efficiency. Because the Mpro active site does not contain lysine residues, only S46 and T190 may be capable of reacting with DSSO; however, our results suggest that these residues are not reactive toward the cross-linking reagent under the current experimental conditions.

HDX-MS profile of polyprotein revealed similar secondary structure elements to individual nsps

The HDX-MS intrinsic exchange profiles of nsp7-11, nsp7-10, and nsp7-8 polyproteins all revealed similar solvent exchange behavior (Fig. 3A), which suggests that all polyproteins share similar secondary structure elements and overall conformation. In addition, the HDX-MS intrinsic exchange profiles of the polyproteins largely resembled the intrinsic exchange profiles of individual nsp7, nsp8, and nsp9 (Fig. 3A). This suggests that the secondary structures within the polyproteins remain largely unchanged in these mature nsps. The only exception is mature nsp10, which, by itself or within nsp7-10, shows reduced deuterium uptake compared to nsp10 within the nsp7-11 polyprotein. This suggests that nsp10 shows an increase in dynamics when bound to nsp11.

Integrative structural modeling of the nsp7-11 polyprotein structure predicts an ensemble of different conformations

Next, we turned to an integrative structural modeling approach using multiple experimental techniques to account for biases inherent to each technique. We used the ab initio–based Iterative Threading Assembly Refinement (I-TASSER) algorithm that allows incorporation of experimental restraints (43). We decided to focus the integrative modeling efforts on nsp7-11 as translation of ORF1a includes nsp11.

We first used analytical size exclusion chromatography (SEC) coupled to multiangle light scattering (MALS) and SAXS detection (SEC-MALS-SAXS) to analyze the assembly state and structural features of the polyprotein in solution. The SEC of the nsp7-11 polyprotein showed two peaks, suggesting the presence of two different states: monomer and dimer, with the monomeric form being predominant (fig. S8A). The MALS analysis was used to calculate the molecular weights (MW) of the two identified peaks for nsp7-11 (∼60 and ~110 kDa). SAXS analysis was conducted for both oligomeric states to understand the arrangement of these polyproteins in solution. Concretely, evolving factor analysis was used for separating the scattering of the monomer and dimer components (fig. S8B) in a model-independent way (44). Both states yielded a linear Guinier plot, indicating the presence of stable protein sample with no aggregation (fig. S8C). The bell-shaped (Gaussian) curve at lower q values in the Kratky plot showed that the sample contains folded domains with no significant disorder (fig. S8D). The pair-distance distribution function, P(r), which is related to the shape of the sample, indicated a globular-shaped protein for both the monomeric and dimeric forms of nsp7-11 (fig. S8E). The Rg and Dmax values calculated from the P(r) are 48.2 and 191 Å for the dimer and 35.8 and 156 Å for the monomeric nsp7-11 (table S2).

Subsequently, we applied the integrative structural modeling approach to predict the structure(s) of the monomeric state of the nsp7-11 polyprotein (Fig. 3B), which may likely be the form binding to Mpro (based on the HDX-MS data; see Fig. 2A and fig. S7A). Models were generated on the basis of the amino acid sequence and the following experimental parameters: (i) distance constraints from XL-MS, (ii) secondary structure restraints from solved x-ray crystal structures of the mature nsp7 to nsp10, and (iii) various nsp7-8 polyprotein models (fig. S10, A to F; more details in the Supplementary Materials). A final ensemble of 10 nsp7-11 models were binned into four representative conformational groups, which were all assessed using our gamut of techniques (Fig. 3, B and C).

When we compared the four model groups, the nsp7 helical bundle stood as the most conserved structural element in all models, except for group D. The other nsp domains presented more diversity in structural conformations and orientations. Group A is defined by an extended helical N terminus, with a “golf-club” conformation, observed in the CoV-1 nsp7:nsp8 complex (45) and in CoV-2 structures of nsp8 interacting with nsp7 and nsp12 (4648). Groups B and C exhibit a more compact organization of nsp8, with group B having nsp7, nsp8, nsp9, and nsp10 arranged linearly, while group C has all domains arranged in a packed “sphere,” and group D presents an “open” conformation.

Despite these conformational differences, all the models satisfied most of the cross-links, with distances equal to or less than 30 Å (upper limit distance for DSSO cross-links; Fig. 4A, fig. S11, and table S3). Specifically, group B satisfied the greatest percentage of cross-links, while groups A and D had the highest number of violations. These violations mostly stemmed from the extended nsp8 N-terminal helix in group A and the less ordered conformation in group D, which suggests that the nsp7-11 polyprotein, and especially the nsp8 segment, samples multiple conformations in solution. This conclusion is also supported by the HDX-MS data, showing that the central region of the nsp8 N-terminal subdomain exhibited greater solvent exchange (higher percent deuterium), suggesting increased inherent dynamics (fig. S12D).

Fig. 4. Assessment of representative nsp7-11 models based on experimental data.

Fig. 4.

(A) Mapping of nsp7-11 intraprotein cross-links onto representative nsp7-11 models. Satisfied cross-links equal to or less than 30 Å are shown in blue, and violated cross-links greater than 30 Å are shown in red. Percent of cross-links satisfied is reported under the structure. (B) Representative nsp7-11 models are colored on the basis of 10 s of percent deuterium value. Black indicates no sequence coverage in the HDX-MS experiment. Agreement of model with experimental data as calculated by HDXer is reported as the RMSE under the model. (C) Fitting of representative nsp7-11 models into the SAXS envelope and Rg values are reported under the model.

Next, we evaluated the agreement of the models with the experimental HDX-MS data using HDXer software (Fig. 4B; fig. S12, A and B; and table S3) (49, 50). HDXer calculates deuterated fractions for peptide segments corresponding to the experimental data as a function of the experimental deuterium exposure times. We then plotted the computationally derived percent deuterium value at a 10-s incubation in deuterated buffer for each model versus the experimentally determined percent deuterium value at a 10-s incubation in deuterated buffer and calculated their root mean square error (RMSE). Group A demonstrated the lowest RMSE (best agreement), compared to other models. Group C and D models had the highest RMSE, suggesting these conformations to be representative of models with too much rigidity or too much flexibility, respectively. Moreover, it was noted that nsp11 had the worst agreement with experimental HDX-MS data, while nsp7 had the best agreement (fig. S12C), which is likely due to the lack of nsp11 structures and abundance of x-ray crystal structures of nsp7 in complex with nsp8.

Next, a three-dimensional (3D) shape (bead model) was reconstructed for the monomeric nsp7-11 from the scattering profile using DAMMIF/N from the ATSAS package (51, 52). As the SAXS scattering profile represents averaged scattering from all different possible orientations, it may be possible that many different shapes/orientations can generate the same scattering profile, and therefore, for certain shapes, it can be difficult to generate a bead model that correctly represents the solution shape. To assess whether a bead model uniquely fits the scattering data or whether multiple models can fit the data, certain criteria are checked including ambiguity score, normalized spatial discrepancy (NSD) value, number of clusters, and parameters such as Rg, Dmax, and MW values. Ambiguity score or “a-score” is the initial screening that informs about the number of possible shapes representing the same scattering profile. An a-score below 1.5 is usually indicative of a unique ab initio shape determination. In our case, 0.85 a-score suggested a unique 3D reconstruction. The Rg and Dmax values obtained from the reconstructed model were close to those calculated from the P(r) function. The MW value of the refined model was also comparable to the expected MW value (table S2). Another important criterion to consider is NSD, which is used to evaluate the stability of the reconstruction. An NSD value less than 1.0 suggests fair stability of the reconstructions. DAMAVER reported 0.95 NSD for our reconstruction, which is on the borderline of a stable reconstruction. DAMCLUST created nine different clusters, suggesting that several different shapes in solution could have generated the same scattering profile. While the ambiguity score and comparable Rg, Dmax, and MW values favored the bead model reconstruction, other criteria such as NSD and the number of clusters suggested heterogeneity in the reconstruction. The higher NSD value and multiple clusters are likely due to nsp7-11 adopting multiple conformations. As stated earlier, the central segment of nsp8 is highly flexible and dynamic, as suggested by greater solvent exchange in HDX, which could lead to heterogeneity in the conformations. Of the four representative conformational groups, group A showed the best fit with the reconstructed SAXS envelope, as the extended nsp8 helix fit into the elongated extension of the envelope. The less ordered and open conformation of the group D model appeared to fit better in the SAXS envelope compared to the more ordered and compact structure of group C models. We also compared the calculated scattering profile for the models to the experimental scattering profile. The χ2 and Rg values for group A showed the greatest agreement with experimental data (Fig. 4C, fig. S13, and table S3).

In summary, the assessment of the 10 models using HDX-MS, XL-MS, and SAXS highlighted that nsp7-11 can sample four major conformations (table S3). To note, our integrative structure modeling approach cannot ascertain the abundance of the different conformers within the ensemble. Group A conformers adopted an extended nsp8 helix with good agreement with HDX-MS and SAXS data but poor agreement with XL-MS data. Group B conformers showed linear nsp organization with good XL-MS agreement but average agreement with HDX-MS and poor SAXS agreement. Group C conformers were arranged as a packed spherical structure with poor HDX-MS and SAXS agreement but good XL-MS agreement. Lastly, group D conformers had the most dynamic conformations (i.e., fewer ordered secondary structural elements) and showed good agreement to SAXS data but average agreement to HDX-MS and poor XL-MS agreement (table S3).

The ensemble of nsp7-11 models unveils the interplay between cleavage junction conformation and accessibility to determine preference and order of cleavage

Next, we evaluated the structural environment of the cleavage junctions in the ensemble of nsp7-11 models to understand the influence of polyprotein substrate conformation and accessibility in processing (Fig. 5, fig. S14, and table S3). In general, all the junctions (except for nsp8-9, which was just partially covered by HDX-MS; Fig. 2A) showed high levels of solvent exchange (high percent deuterium values), consistent with the fact that the cleavage regions should be accessible for proteolysis to occur. The combination of secondary structure and accessible surface area for groups B and C was most consistent with the processing order that we determined by limited proteolysis and pulsed HDX-MS (Fig. 5, fig. S14, and table S3). Comparing all the junctions, the nsp9-10 junction, which was the first to be cleaved, was the most exposed junction in all the models and adopted a random coil in all but one model, which may best facilitate interaction with Mpro. On the other hand, the nsp7-8 junction, which was the last to be cleaved, was more hindered and mostly adopted an α-helical conformation, which may entail a slow cleavage event. For group A models, nsp7-8 junction appeared to be the most accessible junction, which ultimately lends to our conclusion that the polyprotein is likely sampling multiple conformations with some being more amenable to proteolytic processing than others.

Fig. 5. Assessment of junction site in representative nsp7-11 models.

Fig. 5.

Analysis of the secondary structure elements of junction sites in representative nsp7-11 models. Models are overlaid with 10 s of deuterium uptake values from Fig. 4B.

Probing nsp7-11 binding to Mpro with small-molecule binders

To further understand the implications of polyprotein binding to Mpro outside its active site—studied first via HDX-MS and XL-MS—regarding proteolytic processing, we leveraged the proteolysis assay using the nsp7-11 polyprotein as Mpro substrate to measure inhibition by active site and non–active site binders of Mpro identified through crystallography (30, 32). Specifically, we selected small-molecule binders overlapping with the Mpro regions showing protection from solvent exchange in the differential HDX-MS of C145A Mpro with nsp7-11 at 12 hours (Fig. 6A and table S4). Some of them presented antiviral activity, but most of them were not tested in enzymatic assays (32). We used the FDA-approved drug NMTV as a positive control.

Fig. 6. Inhibition of polyprotein processing by small molecules.

Fig. 6.

(A) Mapping of small molecules or fragments onto the structure of Mpro (PDB 7DVY) that is colored according to the consolidated differential percent deuterium uptake values shown in Fig. 2D. Binders are shown as stick models and color-coded by the region of Mpro that they interact with. See table S4 for more information regarding binders. (B) SDS-PAGE gel showing proteolytic processing of nsp7-11 by Mpro in the presence of increasing NMTV concentrations (0 to 10 μM) for 24 hours. (C) Inhibition of NMTV is shown by plotting the normalized band intensities of the nsp7-11 substrate versus NMTV concentrations. (D) Dose-response curve of NMTV inhibition of Mpro. The median inhibitory concentration (IC50) value was calculated from three independent replicates. (E) Differential HDX-MS results for Mpro in the presence and absence of NMTV overlaid onto the structure of Mpro with NMTV (PDB 7RFW).

NMTV, as expected, strongly inhibited Mpro processing [median inhibitory concentration (IC50) = 255 nM] of the full nsp7-11 polyprotein substrate in vitro (Fig. 6, B to D) (29). None of the non–active site binders displayed significant inhibition of the enzymatic activity of Mpro (fig. S15A). On the contrary, climbazole and pelitinib showed activation of Mpro activity in our assay conditions, despite the latter showing inhibition of SARS-CoV-2 replication with median effective concentration = 1.25 μM with moderate cytotoxicity (32).

Next, we analyzed by differential HDX-MS the effect of ligand binding on Mpro. NMTV was the only compound that showed significant change in Mpro solvent exchange behavior (Fig. 6E, fig. S15B, and table S1). The lack of observed change in solvent exchange may be due to experimental limitations in studying the interactions of weak binders by HDX-MS (53, 54). The NMTV interaction footprint on Mpro demonstrates strong protection from solvent exchange in the active site, in agreement with the mechanism of action of NMTV forming a reversible covalent thioimidate adduct with the catalytic C145 (29). These results closely resemble the nsp7-10/11 interaction footprint (Figs. 2 and 6A), as we observed protection in the active site of Mpro upon interaction with the polyprotein. In addition, the nsp7-10/11 footprint showed protection from solvent exchange in residues V77 to L89 not found in the presence of NMTV. These residues are located on the back of the catalytic domain, near residues K61 and S62, which form inter–Mpro-nsp7-11 cross-links and are thus likely stemming from a more transient interaction of Mpro with the polyprotein away from the active site.

DISCUSSION

In this work, we have studied the processing of CoV-2 nsp7-11 polyproteins by Mpro. As expected by the high degree of amino acid conservation, we have seen that CoV-2 polyprotein processing is almost, if not, identical to that observed for CoV-1 (22). The cleavage order deduced from the gel analysis is also supported by results from the pulsed HDX-MS experiment. The increased solvent exchange observed in the pulsed HDX-MS in the nsp9 C-terminal region at the first time point (30 min) suggests cleavage and release of nsp7-9 to increase solvent exposure of the nsp9 C-terminal region. This is consistent with the fact that SDS-PAGE gel shows an intermediate nsp7-9 polyprotein observed at 30 min to 1 hour, suggesting that the nsp9-10 junction is the first cleavage site. As shown in the literature for CoV-1 (22), the order of processing cannot be directly inferred from the substrate specificity of Mpro with peptides mimicking the cleavage junctions as the conformation and accessibility of the substrate polyprotein(s) are critical to regulating the process.

Whether the same in vitro order of cleavages occurs during viral replication is unknown. However, several lines of evidence support this concept. Several studies have detected the nsp4-nsp10/11 polyprotein intermediate in MHV-infected cells (13, 16, 55). Recently, in CoV-2–infected cells, the identification of viral cleavage sites at nsp4, nsp8-9, and nsp10-12 junctions at different postinfection time points is also consistent with such a polyprotein intermediate (3). Reverse genetics studies with MHV-infected cells (8, 16, 55) also provide support for their essential role in the viral replication cycle. As shown in MHV, the processing order of the nsp7-10 region is crucial for viral replication: Either domain deletions or switching and cleavage site mutations were lethal to the virus replication, with the exception being the inactivation of the nsp9-10 cleavage site, which yielded an attenuated mutant virus (8).

In addition, the nsp7-11 and nsp7-8 processing results indicate the presence of the nsp7-8 intermediate even after 24 hours of exposure to Mpro. It is not known whether this longer-lived intermediate could have some functional or essential role in the viral cycle; further suppression of nsp7-8 maturation could represent a unique drug target. The existence of potent maturation inhibitors in HIV has validated this concept as a plausible strategy; bevirimat, the lead for this class, binds to the CA/SP1 junction of the Gag polyprotein and hinders its cleavage: This junction (similar to the nsp7-8 junction) is in a dynamic helix-to-coil equilibrium, and binding of bevirimat stabilizes the helical conformation (5658). Regardless, it should be noted that, as labeling techniques used for microscopy cannot distinguish between mature nsps and polyprotein intermediates, chemical probes specifically targeting the nsp7-8 junction could help in further elucidation of the role of polyproteins during the CoV cycle.

Despite the efforts to understand the role of polyprotein processing, structural characterization of the polyproteins is still incomplete. Our HDX-MS results revealed the critical observation that the studied polyproteins have similar intrinsic exchange profiles as the individual proteins (and thus share similar structural elements) that led us to conclude that the individual nsps do not undergo large structural rearrangements following cleavage by Mpro. This observation is also supported by our pulsed HDX-MS analysis of polyprotein proteolysis. If the intact polyprotein, cleaved polyprotein intermediates, and mature nsps underwent significant conformational changes, large changes in the intrinsic exchange profiles for each population of protein would have been expected.

Accordingly, this permitted the use of an integrative structural biology approach combining modeling and experimental methodologies to construct 3D models of nsp7-11 polyprotein, which were used to elucidate the structural basis for the order of CoV-2 polyprotein processing. The structural predictions of the nsp7-11 polyprotein using the I-TASSER software provided us with an ensemble of 10 models with four representative conformations. Overall, none of the four groups satisfy all the experimental HDX-MS, XL-MS, and SAXS data, suggesting that the nsp7-11 polyprotein is dynamic and samples multiple conformations. While SAXS and HDX-MS capture the extended nsp8 helix conformation represented by group A, XL-MS data are more consistent with the more globular protein conformations seen in groups B and C. The surface-accessible areas of the cleavage junctions and secondary structure element analysis of the nsp7-11 polyprotein suggested that groups B and C (comprising 6 of the 10 models of the ensemble) might represent the polyprotein conformations in better agreement with the processing order that we determined experimentally (e.g., more accessible and disordered nsp9-10 junction in comparison with a more structured and hindered nsp7-8 junction). On the other hand, the four models comprising groups A and D showcase the conformational adaptability of the polyprotein. In these models, the nsp7-8 junction is more exposed and unstructured, thus more accessible for cleavage. Overall, the nsp7-11 model ensemble recapitulates the need for viral polyproteins to adopt different conformations during the replication cycle, i.e., metamorphic proteins (59, 60), given the strict genetic economy of RNA viruses.

The HDX-MS footprint and XL-MS of the Mpro:nsp7-11 complex revealed the importance of the “incognito” part of the polyprotein, the part of the polyprotein excluding the junctions captured in Mpro:substrate peptidic structures (3335), in processing. The pattern of cross-links and HDX footprint demonstrated that there are multiple transient contacts between Mpro and the nsp7-11 polyprotein, which help to orient the enzyme on its substrate for cleavage. We propose that the positioning of the polyprotein may be such that either the polyprotein binds to the active site of one Mpro protomer and wraps around to make contact with the back side of the catalytic domain of that same protomer, or the polyprotein binds to the active site of one protomer and sits on top of the back side of the catalytic domain of the other protomer (Fig. 7). These interactions, away from the Mpro active site and the nsp7-11 polyprotein junction sites, may reflect transient interactions that occur while Mpro scans multiple polyprotein intermediates and their different conformations as candidate substrates during proteolytic processing. The lack of Mpro inhibition by the non-active site surface binders also hints at the transient nature of these interactions (fig. S15). Nevertheless, these interactions may be important in setting the conformation of the junctions for cleavage, as aforementioned.

Fig. 7. Schematic representation of the nsp7-11 polyprotein substrate binding to Mpro.

Fig. 7.

Polyprotein binding to the active site of one Mpro protomer and wrapping around to contact the back side of the catalytic domain of that same protomer (top). Polyprotein binding to the active site of one protomer and sitting on top of the back side of the catalytic domain of the other protomer (bottom). The two protomers of Mpro are shown in blue and cyan. Protein residues shown in gray are the active site residues, and residues in red are sites of interprotein cross-linking with the polyprotein.

The HDX-MS footprint of Mpro:nsp7-11 also reveals significant protection in the Mpro dimerization interface area, especially near the Mpro C terminus (Figs. 2D and 6), suggesting that nsp7-11 binding stabilizes the Mpro dimer. In this sense, El-Baba and colleagues (61) identified that fragment JGY, found through crystallographic fragment screening and binding in the dimer interface (30), destabilized the Mpro dimer and showed ~35% inhibition of the rate of processing at 100 μM. Along the same lines, Sun and colleagues (62) found a nanobody, NB2B4, which binds the C-terminal domain of monomeric Mpro (PDB 7VFB) and inhibits activity with an IC50 value of ~150 nM. Thus, destabilization of the Mpro catalytic dimer may contribute to the mechanism of inhibition. Combined with the lack of Mpro inhibition by the non–active site surface binders (fig. S15), allosteric inhibition of Mpro may only be efficiently achieved by interface binders destabilizing the Mpro dimer. Binding in areas on the surface may not distort the active site of Mpro, which is, by nature, very malleable (accommodating 11 different junctions in virio) (25, 34).

While the impressive crystallographic small-molecule repurposing campaign (32) has provided valuable hits and probes along with antiviral activity testing, enzymatic inhibition was not reported by this study. As reviewed for remdesivir (63), the value of mechanistic and enzymatic inhibition studies (alongside antiviral studies) is paramount because it provides a logical path for developing direct-acting antivirals. The best example in the current case is pelitinib, which, given its strong antiviral activity, was portrayed as an allosteric inhibitor. Our studies show that it might be an allosteric activator. We hypothesize that this activation might be due to the stabilization of the Mpro dimer (fig. S15A). To understand whether its antiviral activity is due to an off-target effect [pelitinib has low inhibition of PLpro in enzymatic inhibition assays (64)] or due to dysregulation of viral maturation [as seen for efavirenz acceleration of Gag-Pol processing in HIV (65)], more experiments are required.

In summary, this study describes the structural basis of the order of Mpro processing of the essential nsp7-11 segment and the importance of the more transient interactions of the substrate to Mpro for proper positioning and catalysis and provides a mechanistic validation of allosteric inhibition. In conclusion, our results give structural insights into CoV-2 polyproteins, which will help us in understanding the structure-function relationships, drug design, and the fundamental biology of polyprotein activity and processing in CoV-2.

MATERIALS AND METHODS

Reagents and plasmids

Unless otherwise specified, all chemicals and reagents were purchased from Sigma-Aldrich (St. Louis, MO). Formic acid, trifluoroacetic acid (TFA), and ultrahigh-performance liquid chromatography (UHPLC)–grade solvents were purchased from Thermo Fisher Scientific. The active site inhibitor NMTV and non–active site binder RS102895 were purchased from MedChemExpress. AT7519 and climbazole were purchased from Selleck Chemicals. PD168568 was purchased from Tocris Bioscience. Pelitinib was purchased from BioVision. The pGEX-6P-1-nsp5 (or Mpro) plasmid was a gift from M. Walsh, Diamond Light Source. pGBWm4046979 (coding for full-length nsp7, National Center for Biotechnology Information (NCBI) reference sequence: YP_009725303.1, codon-optimized, with an initial Met and a cleavable C-terminal by tobacco etch virus (TEV) protease 6x-His tag] was a gift from Ginkgo Bioworks (Addgene plasmid 145611; http://n2t.net/; addgene:145611; RRID: Addgene_145611). pGBWm4046852 (coding for full-length nsp8, NCBI reference sequence: YP_009725304.1, codon-optimized, with an initial Met and a cleavable C-terminal TEV protease 6x-His tag) was a gift from Ginkgo Bioworks (Addgene plasmid 145584; http://n2t.net/; addgene:145584; RRID: Addgene_145584). The pET-28a-nsp9 gene was obtained from BEI Resources (NR-53501). The gene encoding SARS-CoV-2 nsp10 was cloned into the pGEX-6P-1 vector to generate an expression construct containing an N-terminal glutathione S-transferase (GST) tag and a human rhinovirus (HRV) 3C protease cleavage site (GST3C-nsp10). Plasmids for codon-optimized pET-28a-His6-nsp7-8 and pET-28a-His6-nsp7-11 (with an HRV 3C protease cleavage site between the 6x-His tag and the coding sequence) were obtained from GenScript (Piscataway, NJ). Primers used for cloning and mutagenesis, as well as plasmid sequences, are available upon request. HRV 3C and TEV proteases were recombinantly expressed using in-house plasmids.

Protein expression and purification

WT Mpro was produced with native N and C termini, as described in (66). The pGEX-6P-1-nsp5 expression plasmid was transformed into Escherichia coli Rosetta gami competent cells and cultured in LB medium at 37°C with ampicillin (100 μg/ml). Next day, the culture was diluted 1:100 into 1 liter of LB medium supplemented with ampicillin (100 μg/ml). The cells were grown to OD600 (optical density at 600 nm) = 0.8 before being induced with 1 mM isopropyl-β-d-thiogalactopyranoside (IPTG) at 16°C. After 10 hours of induction, the cells were collected by centrifugation at 7200g for 10 min and stored at −80°C. The cell pellet was resuspended in 50 mM tris (pH 8.0), 300 mM NaCl, 5 mM imidazole, and 1 mM tris(2-carboxyethyl)phosphine (TCEP) followed by sonication and centrifugation at 30,000g for 60 min. The cleared lysate was loaded on a Ni-NTA (nitrilotriacetic acid) affinity column (Qiagen). The bound proteins were first washed with lysis buffer and then with the lysis buffer supplemented with 20 mM imidazole to remove nonspecific proteins. Mpro was eluted with 300 mM imidazole in the lysis buffer and then purified by SEC using a prepacked Superose 6 Increase 10/300 GL column (GE Healthcare Life Sciences) equilibrated in 50 mM tris (pH 8.0), 300 mM NaCl, and 1 mM TCEP. The fractions containing the pure protein were pooled, concentrated, and stored at −80°C.

nsp7 and nsp8 were produced as described in our earlier work (37). nsp9 was purified using the protocol described in (67). Overall, the plasmid was first transformed into E. coli BL21-CodonPlus (DE3)-RIL cells and then grown in LB medium with kanamycin (50 μg/ml) at 37°C. The cells were grown to an OD600 of 1.0 before being induced with 0.5 mM IPTG. After 4 hours of induction, the cells were collected by centrifugation at 7200g. The cells were resuspended in a lysis buffer [20 mM Hepes (pH 7.0), 150 mM NaCl, 20 mM imidazole, 2 mM MgCl2, and 0.5 mM TCEP]. The cells were lysed by sonication in the presence of 1 mg of lysozyme and then centrifuged at 10,000g for 20 min. The cleared lysate was loaded on a Ni-NTA affinity column (Qiagen), and the column was then washed with the lysis buffer and 50 mM imidazole buffer. The His-tagged protein was eluted with 400 mM imidazole in the lysis buffer. The His-tag was cleaved by incubating the protein with HRV 3C protease overnight at 4°C. After digestion, the protein was passed through a second Ni-NTA column to remove the 3C protease and the residual uncleaved protein. The protein sample was then purified by SEC using a prepacked Superose 6 Increase 10/300 GL (GE Healthcare Life Sciences) equilibrated in 20 mM Hepes (pH 7.0), 150 mM NaCl, 2 mM MgCl2, and 0.5 mM TCEP. Pure protein–containing fractions were concentrated and stored at −80°C after snap-freezing.

A single colony of E. coli BL-CodonPlus (DE3)-RIL (Agilent Technologies) carrying the GST3C-nsp10 was used to inoculate 50 ml of LB medium containing the appropriate antibiotics [carbenicillin (100 μg/ml) and chloramphenicol (25 μg/ml)]. This seeding culture was grown overnight in a shaking incubator at 37°C. The seeding cultures were then used to inoculate 1 liter of expression cultures containing the appropriate antibiotics to an initial OD600 of 0.2 and grown in a shaking incubator at 37°C to an OD600 of 0.6. The temperature was reduced to 16°C, and protein expression was induced at an OD600 of 0.9 with the addition of 0.1 mM IPTG. The expression cultures were harvested after 16 hours by centrifugation for 30 min at 2555g, followed by flash-freezing and storage at −80°C. All centrifugation steps were performed at 4°C. The cell pellet from 1 liter of expression culture was resuspended in lysis buffer [50 mM tris, 300 mM NaCl, 5 mM β-mercaptoethanol, 4 mM MgSO4, 10% (v/v) glycerol (pH 8.0)] at a ratio of 5 ml of lysis buffer to 1 g of cell paste and thawed on ice. The cells were lysed by sonication on ice for 8 min, and the cellular debris was separated from the soluble lysate by centrifugation for 30 min at 48,000g. The volume of the soluble lysate was measured, and an equal volume of saturated ammonium sulfate was added to achieve 50% saturation, followed by overnight incubation at 4°C. The soluble fraction was separated by centrifugation for 30 min at 24,000g and discarded. The pellet was resuspended in 10 ml of lysis buffer, and 100 μl of polyethyleneimine (5%, w/v) was added in a dropwise fashion. The insoluble material was removed by centrifugation for 30 min at 24,000g. The supernatant was decanted and added to 2 ml of Glutathione Sepharose 4 FF (Cytiva) affinity medium, which had been preequilibrated with lysis buffer. Batch binding was performed on an orbital rotator at 4°C for 4 hours, and the unbound protein was removed using gravity-flow chromatography and washed with 20 ml of lysis buffer. GST3C-nsp10 was cleaved on-column with the addition of 8 ml of lysis buffer containing HRV 3C protease (0.2 mg/ml) and incubation on an orbital rotator at 4°C overnight. The cleaved nsp10 was collected in the flow-through and wash fractions, concentrated to 198 μM using an Amicon Ultra centrifugal filter (MilliporeSigma), and stored at −80°C.

The nsp7-8 and nsp7-11 polyprotein genes were transformed into E. coli BL21-CodonPlus (DE3)-RIL cells and grown overnight on an LB-agar plate containing kanamycin (50 μg/ml). A single colony was picked from the plate and inoculated into LB medium with kanamycin (50 μg/ml). The culture was grown overnight at 37°C. Next morning, the starter culture was diluted 1:500 into the LB medium. The cells were grown at 37°C until OD600 of ~1.6 was reached. The culture was then allowed to cool for an hour at 20°C with continuous shaking after which it was induced with 1 mM IPTG. After overnight incubation, the cells were collected by centrifugation at 7200g. The cell pellet was resuspended in lysis buffer [50 mM tris (pH 8.0), 500 mM NaCl, 20 mM imidazole, 5% glycerol, 10 mM CHAPS, and 1 mM TCEP] supplemented with 1 μM leupeptin, 1 μM pepstatin, and 1 mM phenylmethylsulfonyl fluoride. The cell suspension was lysed by sonication and clarified by centrifugation at 30,000g at 4°C for an hour. The supernatant was loaded on a Ni-NTA affinity column (Qiagen), preequilibrated with the lysis buffer. The column was first washed with lysis buffer and then with 50 mM imidazole in lysis buffer. Homemade HRV 3C protease in the buffer containing 50 mM tris (pH 8.0), 500 mM NaCl, 20 mM imidazole, 5% glycerol, and 1 mM TCEP was added to perform on-column cleavage of the 6x-His tag at 4°C. The digested protein was eluted from the column and passed through a second Ni-NTA column. This reverse Ni-NTA step is performed to remove residual 3C protease and uncleaved protein. The protein was further purified by ion-exchange chromatography using the HiTrap Heparin HP column (GE Healthcare Life Sciences) and a Mono Q anion exchange column (16/10; GE Healthcare Life Sciences) using gradient elution from 150 mM to 2 M NaCl. The protein sample was then purified by SEC using a prepacked Superose 6 Increase 10/300 GL (GE Healthcare Life Sciences) equilibrated in 50 mM tris (pH 8.0), 500 mM NaCl, 5% glycerol, and 1 mM TCEP. Pure protein–containing fractions were pooled together, concentrated, and stored at −80°C.

Proteolysis assays with nsp7-8 and nsp7-11 polyprotein substrates

WT Mpro was used to carry out cleavage assays with the nsp7-8 and nsp7-11 polyprotein substrates. The in vitro cleavage reaction was performed by incubating the polyproteins with Mpro WT (nsp7-11:Mpro molar ratio was 6 μM:0.5 μM; nsp7-8:Mpro molar ratio was 5 μM:0.5 μM) at room temperature in the assay buffer: 50 mM tris (pH 7.5), 150 mM NaCl, and 1 mM dithiothreitol (DTT). The reaction was stopped at various time points by the addition of 4× stop buffer [277.8 mM tris-HCl (pH 6.8), 44.4% glycerol, 4.4% SDS, and 0.02% bromophenol blue]. The samples were then denatured at 95°C for 5 min and assessed on a gradient SDS-PAGE gel. The bands for the full-length substrates, intermediate products, and the final cleavage products were cut and confirmed by MS. In-gel trypsin digestion was performed on the gel bands, and LC-MS/MS was carried out on them (see the Supplementary Materials for MS experimental details). The in vitro cleavage reaction was also performed in Hepes buffer [50 mM Hepes (pH 8.0), 500 mM NaCl, and 1 mM TCEP] at 4°C to match the assay conditions for processing for pulse labeling HDX (fig. S3).

In vitro assessment of the effects of small molecules on Mpro activity

The stock solutions of all the Mpro binders were made in dimethyl sulfoxide. They were diluted in the assay buffer and preincubated for 30 min at room temperature with Mpro WT before starting the reaction. The nsp7-11 polyprotein substrate was then added to the reaction at 6 μM. The reaction was stopped after 24 hours. After denaturing, the samples were then run on the SDS-PAGE gel. The effect of small molecules on Mpro activity was assessed by observing the amount of substrate (nsp7-11) present after 24 hours. The gel band intensity for nsp7-11 was calculated using ImageJ software (https://imagej.nih.gov/ij/index.html) and plotted against the concentration of binders using the GraphPad Prism version 9.3.1 (GraphPad Software, La Jolla, CA, USA; www.graphpad.com). The IC50 value calculation for NMTV was also done using the GraphPad Prism version 9.3.1.

Cross-linking mass spectrometry

Sample preparation

For DSSO (Thermo Fisher Scientific) cross-linking reactions, individual protein and protein-protein complexes were diluted to 10 μM in cross-linking buffer [50 mM Hepes (pH 8.0), 500 mM NaCl, and 1 mM TCEP] and incubated for 30 min at room temperature before initiating the cross-linking reaction. DSSO cross-linker was freshly dissolved in cross-linking buffer to a final concentration of 75 mM before being added to the protein solution at a final concentration of 1.5 mM. The reaction was incubated at 25°C for 45 or 90 min and then quenched by adding 1 μl of 1.0 M tris (pH 8.0) and incubating for an additional 10 min at 25°C. Control reactions were performed in parallel without adding the DSSO cross-linker. All cross-linking reactions were carried out in three replicates. The presence of cross-linked proteins was confirmed by comparing to the no–cross-link negative control samples using SDS-PAGE and Coomassie staining. The remaining cross-linked and non–cross-linked samples were separately pooled and then precipitated using methanol and chloroform. Dried protein pellets were resuspended in 12.5 μl of resuspension buffer [50 mM ammonium bicarbonate and 8 M urea (pH 8.0)]. ProteaseMAX (Promega, V5111) was added to 0.02%, and the solutions were mixed on an orbital shaker operating at 400 rpm for 5 min. After resuspension, 87.5 μl of digestion buffer [50 mM ammonium bicarbonate (pH 8.0)] was added. Protein samples were reduced by adding 1 μl of 500 mM DTT followed by incubation of the protein solutions on an orbital shaker operating at 400 rpm at 56°C for 20 min. After reduction, 2.7 μl of 550 mM iodoacetamide was added, and the solutions were incubated at room temperature in the dark for 15 min. Reduced and alkylated protein solutions were digested overnight using trypsin at a ratio of 1:150 (w/w) (trypsin:protein) at 37°C. Peptides were acidified with 1% TFA and then desalted using C18 ZipTip (Millipore, catalog no. ZTC18 5096). Dried peptides were resuspended in 10 μl of 0.1% TFA in water. Samples were then frozen and stored at −20°C until LC-MS analysis.

Liquid chromatography and mass spectrometry

A total of 500 ng of sample was injected (triplicate injections for cross-linked samples and duplicate injections for control samples) onto an UltiMate 3000 UHPLC system (Dionex, Thermo Fisher Scientific). Peptides were trapped using a μPAC C18 trapping column (PharmaFluidics) using a load pump operating at 20 μl/min. Peptides were separated on a 200-cm μPAC C18 column (PharmaFluidics) using a linear gradient (1% solvent B for 4 min, 1 to 30% solvent B from 4 to 70 min, 30 to 55% solvent B from 70 to 90 min, 55 to 97% solvent B from 90 to 112 min, and isocratic at 97% solvent B from 112 to 120 min) at a flow rate of 800 nl/min. Gradient solvent A contained 0.1% formic acid, and solvent B contained 80% acetonitrile and 0.1% formic acid. LC eluate was interfaced to an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific) with a Nanospray Flex ion source (Thermo Fisher Scientific). The source voltage was set to 2.5 kV, and the S-Lens RF level was set to 30%. Cross-links were identified using a previously described MS2-MS3 method (68) with slight modifications. Full scans were recorded from mass/charge ratio (m/z) 150 to 1500 at a resolution of 60,000 in the Orbitrap mass analyzer. The automatic gain control (AGC) target value was set to 4 × 105, and the maximum injection time was set to 50 ms in the Orbitrap. MS2 scans were recorded at a resolution of 30,000 in the Orbitrap mass analyzer. Only precursors with charge state between 4 and 8 were selected for MS2 scans. The AGC target was set to 5 × 104, a maximum injection time of 150 ms, and an isolation width of 1.6 m/z. Collision-induced dissociation fragmentation energy was set to 25%. The two most abundant reporter doublets from the MS2 scans with a charge state of 2 to 6, a 31.9721-Da mass difference, and a mass tolerance of ±10 parts per million (ppm) were selected for MS3. The MS3 scans were recorded in the ion trap in rapid mode using higher-energy C-trap dissociation (HCD) fragmentation with 35% collision energy. The AGC target was set to 2 × 104, the maximum injection time was set to 200 ms, and the isolation width was set to 2.0 m/z.

Data analysis

To identify cross-linked peptides, Thermo.Raw files were imported into Proteome Discoverer 2.5 (Thermo Fisher Scientific) and analyzed via XlinkX algorithm (69) using the MS2_MS3 workflow with the following parameters: MS1 mass tolerance, 10 ppm; MS2 mass tolerance, 20 ppm; MS3 mass tolerance, 0.5 Da; digestion, trypsin with four missed cleavages allowed; minimum peptide length of four amino acids; fixed modification, carbamidomethylation (C); variable modification, oxidation (M), and DSSO (K, S, T, and Y). The XlinkX/PD Validator node was used for cross-linked peptide validation with a 1% false discovery rate. Identified cross-links were further validated and quantified using Skyline (version 19.1) (70) using a previously described protocol (71). Cross-link spectral matches found in Proteome Discoverer were exported and converted to sequence spectrum list format using Excel (Microsoft). Cross-link peak areas were assessed using the MS1 full-scan filtering protocol for peaks within 8 min of the cross-link spectral match identification. Peak areas were assigned to the specified cross-linked peptide identification if the mass error was within 10 ppm of the theoretical mass, the isotope dot product was greater than 0.95, and the peak was not found in the non–cross-linked negative control samples. The isotope dot product compares the distribution of the measured MS1 signals against the theoretical isotope abundance distribution calculated on the basis of the peptide sequence. Its value ranges between 0 and 1, where 1 indicates a perfect match (72). Pairwise comparisons were made using the “MSstats” package (73) implemented in Skyline to calculate relative fold changes and significance. Significant change thresholds were defined as a log2 fold change less than −2 or greater than 2 and −log10 P value greater than 1.3 (P value less than 0.05). Visualization of proteins and cross-links was generated using xiNET (74). The data have been deposited to the ProteomeXchange Consortium via the PRIDE (75) partner repository with the dataset identifier PXD033748.

Hydrogen-deuterium exchange mass spectrometry

Peptide identification

Peptides were identified using MS/MS experiments performed on a QExactive (Thermo Fisher Scientific, San Jose, CA) over a 70-min gradient. Product ion spectra were acquired in a data-dependent mode, and the five most abundant ions were selected for the product ion analysis per scan event. The MS/MS *.raw data files were converted to *.mgf files and then submitted to MASCOT (version 2.3, Matrix Science, London, UK) for peptide identification. The maximum number of missed cleavages was set to four with a mass tolerance of ±0.6 Da for precursor ions and of ±8 ppm for fragment ions. Oxidation to methionine was selected for variable modification. Pepsin was used for digestion, and no specific enzyme was selected in MASCOT during the search. Peptides included in the peptide set used for HDX detection had a MASCOT score of 20 or greater. The MS/MS MASCOT search was also performed against a decoy (reverse) sequence, and false positives were ruled out if they did not pass a 1% false discovery rate.

Pulse labeling

The nsp7-10 or nsp7-11 polyprotein at 10 μM concentration was incubated with WT Mpro at 1:1 molar ratio, and 5-μl aliquots of the cleavage reaction were removed at 600, 1800, 3600, 14,400, and 86,400 s as well as 7200 and 28,800 s for nsp7-11 only. Aliquots were mixed with 20 μl of deuterated (D2O-containing) buffer [50 mM Hepes, 500 mM NaCl, and 1 mM TCEP (pD 8.4)] and incubated on ice for 30 s. Deuterated samples were quenched with 25 μl of quench solution [5 M urea and 1% TFA (pH 2)] and immediately flash-frozen and stored until ready for direct inject MS analysis.

Continuous labeling

Experiments with continuous labeling were carried out on a fully automated system (CTC HTS PAL, LEAP Technologies, Carrboro, NC; housed inside a 4°C cabinet) as previously described (76) with the following modifications. For differential HDX, protein-protein complexes were preformed and allowed to incubate for 30 min at room temperature before analysis. The reactions (5 μl) were mixed with 20 μl of deuterated (D2O-containing) buffer [50 mM Hepes, 500 mM NaCl, and 1 mM TCEP (pD 8.4)] and incubated at 4°C for 0, 10, 30, 60, 900, or 3600 s. Following on-exchange, unwanted forward- or back-exchange was minimized, and the protein was denatured by the addition of 25 μl of a quench solution [5 M urea and 1% TFA (pH 2.0)] before being immediately passed along for online digestion.

HDX-MS analysis

Samples were digested through an immobilized pepsin column (prepared in-house) at 50 μl/min [0.1% (v/v) TFA at 4°C], and the resulting peptides were trapped and desalted on a 2 mm-by-10 mm C8 trap column (Hypersil Gold, Thermo Fisher Scientific). The bound peptides were then gradient-eluted [4 to 40% (v/v) CH3CN and 0.3% (v/v) formic acid] on a 2.1 mm-by-50 mm C18 separation column (Hypersil Gold, Thermo Fisher Scientific) for 5 min. Sample handling and peptide separation were conducted at 4°C. The eluted peptides were then subjected to electrospray ionization directly coupled to a high-resolution Orbitrap mass spectrometer (QExactive, Thermo Fisher Scientific).

Data rendering

The intensity-weighted mean m/z centroid value of each peptide envelope was calculated and subsequently converted into a percentage of deuterium incorporation. This is accomplished by determining the observed averages of the undeuterated and fully deuterated spectra using the conventional formula described elsewhere (77). The fully deuterated control, 100% deuterium incorporation, was calculated theoretically, and corrections for back-exchange were made on the basis of an estimated 70% deuterium recovery and accounting for 80% final deuterium concentration in the sample (1:5 dilution in deuterated buffer). Statistical significance for the differential HDX data is determined by an unpaired t test for each time point, a procedure that is integrated into the HDX Workbench software (78).

The HDX data from all overlapping peptides were consolidated to individual amino acid values using a residue averaging approach. Briefly, for each residue, the deuterium incorporation values and peptide lengths from all overlapping peptides were assembled. A weighting function was applied in which shorter peptides were weighted more heavily and longer peptides were weighted less. Each of the weighted deuterium incorporation values was then averaged by incorporating this weighting function to produce a single value for each amino acid. The initial two residues of each peptide, as well as prolines, were omitted from the calculations. This approach is similar to that previously described (79).

Deuterium uptake for each peptide is calculated as the average of %D for all on-exchange time points, and the difference in average %D values between the unbound and bound samples is presented as a heatmap with a color code given at the bottom of the figure (warm colors for deprotection and cool colors for protection). Peptides are colored by the software automatically to display significant differences, determined either by a >5% difference (less or more protection) in average deuterium uptake between the two states or by using the results of unpaired t tests at each time point (P < 0.05 for any two time points or P < 0.01 for any single time point). Peptides with nonsignificant changes between the two states are colored gray. The exchange at the first two residues for any given peptide is not colored. Each peptide bar in the heatmap view displays the average Δ %D values, associated SD, and the charge state. In addition, overlapping peptides with a similar protection trend covering the same region are used to rule out data ambiguity. The data have been deposited to the ProteomeXchange Consortium via the PRIDE (75) partner repository with the dataset identifier PXD033702 for the pulse labeling HDX-MS experiment and PXD033698 for continuous labeling HDX-MS experiments [project name: Biochemical and structural insights into SARS-CoV-2 polyprotein processing by Mpro (HDX-MS continuous labeling), project accession: PXD033698, and reviewer account details: username: reviewer_pxd033698@ebi.ac.uk and password: QgiXipcs; project name: Biochemical and structural insights into SARS-CoV-2 polyprotein processing by Mpro (HDX-MS pulse labeling), project accession: PXD033702, and reviewer account details: username: reviewer_pxd033702@ebi.ac.uk and password: wHh4zEPB].

SEC coupled to MALS and SAXS detection

Purified and concentrated nsp7-8 (8 mg/ml) and nsp7-11 (4 mg/ml) were used for data collection. SAXS was performed at BioCAT (beamline 18ID at the Advanced Photon Source, Chicago) with in-line SEC to separate the sample from aggregates and other contaminants, thus ensuring optimal sample quality and MALS, dynamic light scattering (DLS), and refractive index (RI) measurement for additional biophysical characterization (SEC-MALS-SAXS). The samples were loaded on a Superdex 200 Increase 10/300 GL column (Cytiva) run by a 1260 Infinity II HPLC (Agilent Technologies) at 0.6 ml/min. The flow passed through (in order) the Agilent ultraviolet detector, a MALS detector, a DLS detector (DAWN Helios II, Wyatt Technologies), and an RI detector (Optilab T-rEX, Wyatt). The flow then went through the SAXS flow cell. The flow cell consists of a 1.0-mm–inside diameter quartz capillary with ~20-μm walls. A coflowing buffer sheath is used to separate the samples from the capillary walls, helping prevent radiation damage (80). Scattering intensity was recorded using a Pilatus3 X 1M (Dectris) detector, which was placed 3.69 m from the nsp7-11 sample, giving us access to a q range of 0.003 to 0.35 Å−1 and 3.631 m from the nsp7-8 sample, giving us access to a qrange of 0.0047 to 0.35 Å−1. The data were reduced using BioXTAS RAW 2.0.3 (81). Buffer blanks were created by averaging regions flanking the elution peak and subtracted from exposures selected from the elution peak to create the I(q)-versus-q curves used for subsequent analyses. MWs and hydrodynamic radii were calculated from the MALS and DLS data, respectively, using the ASTRA 7 software (Wyatt). Data analysis was carried out using the RAW software package for the determination of radius of gyration (Rg), P(r) distribution, and particle maximum dimension (Dmax) parameters and for qualitative flexibility analysis (through generation of Rg-normalized Kratky and Guinier plots). Volumetric bead modeling was performed using the DAMMIN software package (52). The resulting bead models were averaged and filtered using the DAMAVER package (82), generating the final bead model reconstruction. The SAXS data are deposited in the SAXS database under the accession codes SASDPY2, SASDPZ2, SASDP23, and SASDP33.

Structural integrative modeling using I-TASSER

For the structural predictions of the nsp7-11 polyprotein, an integrative modeling approach was used. The I-TASSER server (43), which is an online source for automated protein structure prediction, was used to generate models of the polyproteins. A two-run approach was used to model the nsp7-11 polyprotein. Run 1 included the following inputs: (i) amino acid sequence, (ii) distance constraints from XL-MS, (iii) nsp7-8 model as a template, and (iv) secondary structure constraints for nsp7, nsp8, nsp9, and nsp10 as advised by HDX-MS to generate models A1, A2, D, C1, and C2 (Fig. 3, B and C). Run 2 included the following: (i) amino acid sequence, (ii) distance constraints from XL-MS, (iii) nsp7-8 as a template, and (iv) secondary structure constraints for nsp8, nsp9, and nsp10 as advised by HDX-MS to generate models A3, B1, B2, B3, and B4 (Fig. 3, B and C). Initial observation of the polyprotein by HDX-MS showed a similar pattern of deuterium uptake compared to the individual proteins (Fig. 3A), suggesting that secondary structures within the polyprotein are likely to largely resemble the secondary structures of the mature nsps. Accordingly, this allowed us to delineate secondary structural constraints based on solved x-ray crystal structures of nsp7, nsp8, nsp9, and nsp10. We also used two nsp7-8 models that we previously generated using a similar integrative modeling workflow to serve as additional structural templates because the HDX-MS footprint of the nsp7-8 polyprotein resembles the footprint of nsp7-8 in nsp7-11 (Fig. 3A). The two nsp7-8 models were chosen on the basis of their varying agreement with the XL-MS and HDX-MS data to limit bias from a particular experimental approach and to sample the conformational landscape as thoroughly as possible (see the Supplementary Materials for nsp7-8 integrative modeling).

The 10 nsp7-11 output models were assessed against the experimental (i) HDX-MS, (ii) XL-MS, and (iii) SAXS data: (i) Agreement of models to HDX-MS data was completed using HDXer, which generated theoretical deuterium uptake values for the models to compare to experimental values (49, 50). Smaller RMSE indicates better agreement of models to experimental data. (ii) Cross-links were mapped on the models using xiVIEW (DOI: 10.1101/561829) to calculate distances and determine the percentage of cross-links satisfied, i.e., distances less than 30 Å. (iii) A theoretical scattering profile was generated for each model using the CRYSOL web interface (83). The theoretical scattering profile of each model was then fitted against the experimental scattering profile. Last, the secondary structural elements and the solvent-accessible surface area of the junction sites for both polyproteins were also analyzed, and the results were compared with the limited proteolysis results to evaluate the physiological relevance of the structure in the context of polyprotein processing. The junction-accessible area was calculated by the summation of the accessible area of four residues (P1, P2, P1’, and P2’) at the junction site. The accessible surface area for each residue was calculated using VADAR (84). The integrative structures of nsp7-11 polyprotein have been deposited in the PDB-Dev databank under accession code PDBDEV_00000120. They are also provided in the Supplementary Materials as PyMOL sessions.

Acknowledgments

We thank M. Narwal and K. Murakami (Pennsylvania State University) for providing protein samples of the C145A Mpro mutant and the nsp7-10 polyprotein, numerous helpful discussions, and also critical reading of the manuscript. We thank B. Pascal and R. Vera Alvarez (Omics Informatics LLC) for help in running the nsp7-11 and nsp7-8 models through the HDXer software. We thank H. Zheng and P. Lobel (Biological Mass Spectrometry Facility of Robert Wood Johnson Medical School and Rutgers University) for performing the in-gel LC-MS/MS for the limited proteolysis assay of nsp7-11 polyprotein by Mpro. Molecular graphics and analyses were performed with (i) UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco; and (ii) PyMOL [DeLano WL (2002), The PyMOL Molecular Graphics System, San Carlos, CA; DeLano Scientific].

Funding: We are grateful for support from National Institutes of Health (NIH) grants U54 AI150472 (to E.A. and P.R.G.), AI 027690 (to E.A.), F31 DK126394 (to V.V.C.), R01 AI167356 (to S.G.S.), and T32 AI157855 (to R.L.S.) and a research award from the Rutgers Center for COVID-19 Response and Pandemic Preparedness (to E.A.). We are grateful to V. Nanda and P. Falkowski for providing financial support for J.T.’s efforts. This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under contract no. DE-AC02-06CH11357. This project was supported by grant P30 GM138395 from the National Institute of General Medical Sciences of the NIH. Use of the Pilatus 3 1M detector was provided by grant 1S10OD018090-01 from NIGMS. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the National Institute of General Medical Sciences or the NIH.

Author contributions: Each author’s contributions to the paper are as follows: Conceptualization: E.A., P.R.G., F.X.R., R.Y., V.V.C., and S.K.D. Methodology: R.Y., V.V.C., S.K.D., J.J.E.K.H., J.T., J.B.H., F.X.R., and R.L.S. Investigation: R.Y., V.V.C., S.K.D., R.L.S., S.G.S., E.A., P.R.G., and F.X.R. Visualization: R.Y., V.V.C., S.K.D., and F.X.R. Supervision: E.A., P.R.G., and F.X.R. Writing—original draft: R.Y., V.V.C., F.X.R., E.A., and P.R.G. Writing—review and editing: All authors.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: The XL-MS data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD033748. The HDX-MS data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD033702 for the pulse labeling HDX-MS experiment and PXD033698 for continuous labeling HDX-MS experiment. The SAXS data have been deposited in the SAXS database under the accession codes SASDPY2, SASDPZ2, SASDP23, and SASDP33. The integrative structures of nsp7-11 and nsp7-8 polyprotein have been deposited in the PDB-Dev databank under accession codes PDBDEV_00000120 and PDBDEV_00000119, respectively. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.

Supplementary Materials

This PDF file includes:

Supplementary Text

Figs. S1 to S15

Tables S1 to S4

Other Supplementary Material for this : manuscript includes the following:

Data S1 and S2

View/request a protocol for this paper from Bio-protocol.

REFERENCES AND NOTES

  • 1.N. Zhu, D. Zhang, W. Wang, X. Li, B. Yang, J. Song, X. Zhao, B. Huang, W. Shi, R. Lu, P. Niu, F. Zhan, X. Ma, D. Wang, W. Xu, G. Wu, G. F. Gao, W. Tan; China Novel Coronavirus Investigating and Research Team ,A novel coronavirus from patients with pneumonia in China, 2019. N. Engl. J. Med. 382,727–733 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.D. E. Gordon, G. M. Jang, M. Bouhaddou, J. Xu, K. Obernier, K. M. White, M. J. O’Meara, V. V. Rezelj, J. Z. Guo, D. L. Swaney, T. A. Tummino, R. Hüttenhain, R. M. Kaake, A. L. Richards, B. Tutuncuoglu, H. Foussard, J. Batra, K. Haas, M. Modak, M. Kim, P. Haas, B. J. Polacco, H. Braberg, J. M. Fabius, M. Eckhardt, M. Soucheray, M. J. Bennett, M. Cakir, M. J. McGregor, Q. Li, B. Meyer, F. Roesch, T. Vallet, A. M. Kain, L. Miorin, E. Moreno, Z. Z. C. Naing, Y. Zhou, S. Peng, Y. Shi, Z. Zhang, W. Shen, I. T. Kirby, J. E. Melnyk, J. S. Chorba, K. Lou, S. A. Dai, I. Barrio-Hernandez, D. Memon, C. Hernandez-Armenta, J. Lyu, C. J. P. Mathy, T. Perica, K. B. Pilla, S. J. Ganesan, D. J. Saltzberg, R. Rakesh, X. Liu, S. B. Rosenthal, L. Calviello, S. Venkataramanan, J. Liboy-Lugo, Y. Lin, X. P. Huang, Y. Liu, S. A. Wankowicz, M. Bohn, M. Safari, F. S. Ugur, C. Koh, N. S. Savar, Q. D. Tran, D. Shengjuler, S. J. Fletcher, M. C. O’Neal, Y. Cai, J. C. J. Chang, D. J. Broadhurst, S. Klippsten, P. P. Sharp, N. A. Wenzell, D. Kuzuoglu-Ozturk, H. Y. Wang, R. Trenker, J. M. Young, D. A. Cavero, J. Hiatt, T. L. Roth, U. Rathore, A. Subramanian, J. Noack, M. Hubert, R. M. Stroud, A. D. Frankel, O. S. Rosenberg, K. A. Verba, D. A. Agard, M. Ott, M. Emerman, N. Jura, M. von Zastrow, E. Verdin, A. Ashworth, O. Schwartz, C. d’Enfert, S. Mukherjee, M. Jacobson, H. S. Malik, D. G. Fujimori, T. Ideker, C. S. Craik, S. N. Floor, J. S. Fraser, J. D. Gross, A. Sali, B. L. Roth, D. Ruggero, J. Taunton, T. Kortemme, P. Beltrao, M. Vignuzzi, A. Garcia-Sastre, K. M. Shokat, B. K. Shoichet, N. J. Krogan,A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature 583,459–468 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.B. Meyer, J. Chiaravalli, S. Gellenoncourt, P. Brownridge, D. P. Bryne, L. A. Daly, A. Grauslys, M. Walter, F. Agou, L. A. Chakrabarti, C. S. Craik, C. E. Eyers, P. A. Eyers, Y. Gambin, A. R. Jones, E. Sierecki, E. Verdin, M. Vignuzzi, E. Emmott,Characterising proteolysis during SARS-CoV-2 infection identifies viral cleavage sites and cellular targets with therapeutic potential. Nat. Commun. 12,5553 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.E. Emmott, A. de Rougemont, M. Hosmillo, J. Lu, T. Fitzmaurice, J. Haas, I. Goodfellow,Polyprotein processing and intermolecular interactions within the viral replication complex spatially and temporally control norovirus protease activity. J. Biol. Chem. 294,4259–4271 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.G. Shin, S. A. Yost, M. T. Miller, E. J. Elrod, A. Grakoui, J. Marcotrigiano,Structural and functional insights into alphavirus polyprotein processing and pathogenesis. Proc. Natl. Acad. Sci. U.S.A. 109,16534–16539 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.S. A. Yost, J. Marcotrigiano,Viral precursor polyproteins: Keys of regulation from replication to maturation. Curr. Opin. Virol. 3,137–142 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.A. G. Bost, R. H. Carnahan, X. T. Lu, M. R. Denison,Four proteins processed from the replicase gene polyprotein of mouse hepatitis virus colocalize in the cell periphery and adjacent to sites of virion assembly. J. Virol. 74,3379–3387 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.D. J. Deming, R. L. Graham, M. R. Denison, R. S. Baric,Processing of open reading frame 1a replicase proteins nsp7 to nsp10 in murine hepatitis virus strain A59 replication. J. Virol. 81,10280–10291 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.J. Ziebuhr, S. G. Siddell,Processing of the human coronavirus 229E replicase polyproteins by the virus-encoded 3C-like proteinase: Identification of proteolytic products and cleavage sites common to pp1a and pp1ab. J. Virol. 73,177–185 (1999). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.W. Cui, S. Cui, C. Chen, X. Chen, Z. Wang, H. Yang, L. Zhang,The crystal structure of main protease from mouse hepatitis virus A59 in complex with an inhibitor. Biochem. Biophys. Res. Commun. 511,794–799 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.R. W. Körner, M. Majjouti, M. A. A. Alcazar, E. Mahabir,Of mice and men: The coronavirus MHV and mouse models as a translational approach to understand SARS-CoV-2. Viruses 12,880 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.T. P. Sheahan, A. C. Sims, S. T. Zhou, R. L. Graham, A. J. Pruijssers, M. L. Agostini, S. R. Leist, A. Schafer, K. H. Dinnon, L. J. Stevens, J. D. Chappell, X. T. Lu, T. M. Hughes, A. S. George, C. S. Hill, S. A. Montgomery, A. J. Brown, G. R. Bluemling, M. G. Natchus, M. Saindane, A. A. Kolykhalov, G. Painter, J. Harcourt, A. Tamin, N. J. Thornburg, R. Swanstrom, M. R. Denison, R. S. Baric,An orally bioavailable broad-spectrum antiviral inhibits SARS-CoV-2 in human airway epithelial cell cultures and multiple coronaviruses in mice. Sci. Transl. Med. 12,eabb5883 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.A. Kanjanahaluethai, S. C. Baker,Identification of mouse hepatitis virus papain-like proteinase 2 activity. J. Virol. 74,7911–7921 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.J. J. Schiller, A. Kanjanahaluethai, S. C. Baker,Processing of the coronavirus MHV-JHM polymerase polyprotein: Identification of precursors and proteolytic products spanning 400 kilodaltons of ORF1a. Virology 242,288–302 (1998). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.S. G. Sawicki, D. L. Sawicki, D. Younker, Y. Meyer, V. Thiel, H. Stokes, S. G. Siddell,Functional and genetic analysis of coronavirus replicase-transcriptase proteins. PLOS Pathog. 1,e39 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.E. F. Donaldson, R. L. Graham, A. C. Sims, M. R. Denison, R. S. Baric,Analysis of murine hepatitis virus strain A59 temperature-sensitive mutant TS-LA6 suggests that nsp10 plays a critical role in polyprotein processing. J. Virol. 81,7086–7098 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.R. Gosert, A. Kanjanahaluethai, D. Egger, K. Bienz, S. C. Baker,RNA replication of mouse hepatitis virus takes place at double-membrane vesicles. J. Virol. 76,3697–3708 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.K. Knoops, M. Kikkert, S. H. E. van den Worm, J. C. Zevenhoven-Dobbe, Y. van der Meer, A. J. Koster, A. M. Mommaas, E. J. Snijder,SARS-coronavirus replication is supported by a reticulovesicular network of modified endoplasmic reticulum. PLOS Biol. 6,1957–1974 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.D. Oudshoorn, K. Rijs, R. W. A. L. Limpens, K. Groen, A. J. Koster, E. J. Snijder, M. Kikkert, M. Barcena,Expression and cleavage of middle east respiratory syndrome coronavirus nsp3-4 polyprotein induce the formation of double-membrane vesicles that mimic those associated with coronaviral RNA replication. mBio 8,e01658-17 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.G. Wolff, R. W. A. L. Limpens, J. C. Zevenhoven-Dobbe, U. Laugks, S. Zheng, A. W. M. de Jong, R. I. Koning, D. A. Agard, K. Grunewald, A. J. Koster, E. J. Snijder, M. Barcena,A molecular pore spans the double membrane of the coronavirus replication organelle. Science 369,1395–1398 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.B. Krichel, G. Bylapudi, C. Schmidt, C. Blanchet, R. Schubert, L. Brings, M. Koehler, R. Zenobi, D. Svergun, K. Lorenzen, R. Madhugiri, J. Ziebuhr, C. Uetrecht,Hallmarks of Alpha- and Betacoronavirus non-structural protein 7+8 complexes. Sci. Adv. 7,eabf1004 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.B. Krichel, S. Falke, R. Hilgenfeld, L. Redecke, C. Uetrecht,Processing of the SARS-CoV pp1a/ab nsp7-10 region. Biochem. J. 477,1009–1019 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.S. Gildenhuys,Expanding our understanding of the role polyprotein conformation plays in the coronavirus life cycle. Biochem. J. 477,1479–1482 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.G. D. Noske, A. M. Nakamura, V. O. Gawriljuk, R. S. Fernandes, G. M. A. Lima, H. V. D. Rosa, H. D. Pereira, A. C. M. Zeri, A. F. Z. Nascimento, M. C. L. C. Freire, D. Fearon, A. Douangamath, F. von Delft, G. Oliva, A. S. Godoy,A crystallographic snapshot of SARS-CoV-2 main protease maturation process. J. Mol. Biol. 433,167118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.D. W. Kneller, G. Phillips, H. M. O’Neill, R. Jedrzejczak, L. Stols, P. Langan, A. Joachimiak, L. Coates, A. Kovalevsky,Structural plasticity of SARS-CoV-2 3CL Mpro active site cavity revealed by room temperature x-ray crystallography. Nat. Commun. 11,3202 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.D. W. Kneller, G. Phillips, H. M. O’Neill, K. Tan, A. Joachimiak, L. Coates, A. Kovalevsky,Room-temperature x-ray crystallography reveals the oxidation and reactivity of cysteine residues in SARS-CoV-2 3CL Mpro: Insights into enzyme mechanism and drug design. IUCrJ 7,1028–1035 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Z. M. Jin, X. Y. Du, Y. C. Xu, Y. Q. Deng, M. Q. Liu, Y. Zhao, B. Zhang, X. F. Li, L. K. Zhang, C. Peng, Y. K. Duan, J. Yu, L. Wang, K. L. Yang, F. J. Liu, R. D. Jiang, X. L. Yang, T. You, X. C. Liu, X. N. Yang, F. Bai, H. Liu, X. Liu, L. W. Guddat, W. Q. Xu, G. F. Xiao, C. F. Qin, Z. L. Shi, H. L. Jiang, Z. H. Rao, H. T. Yang,Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 582,289–293 (2020). [DOI] [PubMed] [Google Scholar]
  • 28.L. L. Zhang, D. Z. Lin, X. Y. Y. Sun, U. Curth, C. Drosten, L. Sauerhering, S. Becker, K. Rox, R. Hilgenfeld,Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science 368,409–412 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.D. R. Owen, C. M. N. Allerton, A. S. Anderson, L. Aschenbrenner, M. Avery, S. Berritt, B. Boras, R. D. Cardin, A. Carlo, K. J. Coffman, A. Dantonio, L. Di, H. Eng, R. Ferre, K. S. Gajiwala, S. A. Gibson, S. E. Greasley, B. L. Hurst, E. P. Kadar, A. S. Kalgutkar, J. C. Lee, J. Lee, W. Liu, S. W. Mason, S. Noell, J. J. Novak, R. S. Obach, K. Ogilvie, N. C. Patel, M. Pettersson, D. K. Rai, M. R. Reese, M. F. Sammons, J. G. Sathish, R. S. P. Singh, C. M. Steppan, A. Stewart, J. B. Tuttle, L. Updyke, P. R. Verhoest, L. Q. Wei, Q. Y. Yang, Y. A. Zhu,An oral SARS-CoV-2 M-pro inhibitor clinical candidate for the treatment of COVID-19. Science 374,1586–1593 (2021). [DOI] [PubMed] [Google Scholar]
  • 30.A. Douangamath, D. Fearon, P. Gehrtz, T. Krojer, P. Lukacik, C. D. Owen, E. Resnick, C. Strain-Damerell, A. Aimon, P. Abranyi-Balogh, J. Brandao-Neto, A. Carbery, G. Davison, A. Dias, T. D. Downes, L. Dunnett, M. Fairhead, J. D. Firth, S. P. Jones, A. Keeley, G. M. Keseru, H. F. Klein, M. P. Martin, M. E. M. Noble, P. O’Brien, A. Powell, R. N. Reddi, R. Skyner, M. Snee, M. J. Waring, C. Wild, N. London, F. von Delft, M. A. Walsh,Crystallographic and electrophilic fragment screening of the SARS-CoV-2 main protease. Nat. Commun. 11,5047 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.N. Drayman, J. K. DeMarco, K. A. Jones, S. A. Azizi, H. M. Froggatt, K. Tan, N. I. Maltseva, S. Chen, V. Nicolaescu, S. Dvorkin, K. Furlong, R. S. Kathayat, M. R. Firpo, V. Mastrodomenico, E. A. Bruce, M. M. Schmidt, R. Jedrzejczak, M. A. Munoz-Alia, B. Schuster, V. Nair, K. Y. Han, A. O’Brien, A. Tomatsidou, B. Meyer, M. Vignuzzi, D. Missiakas, J. W. Botten, C. B. Brooke, H. Lee, S. C. Baker, B. C. Mounce, N. S. Heaton, W. E. Severson, K. E. Palmer, B. C. Dickinson, A. Joachimiak, G. Randall, S. Tay,Masitinib is a broad coronavirus 3CL inhibitor that blocks replication of SARS-CoV-2. Science 373,931–936 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.S. Gunther, P. Y. A. Reinke, Y. Fernandez-Garcia, J. Lieske, T. J. Lane, H. M. Ginn, F. H. M. Koua, C. Ehrt, W. Ewert, D. Oberthuer, O. Yefanov, S. Meier, K. Lorenzen, B. Krichel, J. D. Kopicki, L. Gelisio, W. Brehm, I. Dunkel, B. Seychell, H. Gieseler, B. Norton-Baker, B. Escudero-Perez, M. Domaracky, S. Saouane, A. Tolstikova, T. A. White, A. Hanle, M. Groessler, H. Fleckenstein, F. Trost, M. Galchenkova, Y. Gevorkov, C. F. Li, S. Awel, A. Peck, M. Barthelmess, F. Schlunzen, P. L. Xavier, N. Werner, H. Andaleeb, N. Ullah, S. Falke, V. Srinivasan, B. A. Franca, M. Schwinzer, H. Brognaro, C. Rogers, D. Melo, J. J. Zaitseva-Doyle, J. Knoska, G. E. Pena-Murillo, A. R. Mashhour, V. Hennicke, P. Fischer, J. Hakanpaa, J. Meyer, P. Gribbon, B. Ellinger, M. Kuzikov, M. Wolf, A. R. Beccari, G. Bourenkov, D. von Stetten, G. Pompidor, I. Bento, S. Panneerselvam, I. Karpics, T. R. Schneider, M. M. Garcia-Alai, S. Niebling, C. Gunther, C. Schmidt, R. Schubert, H. J. Han, J. Boger, D. C. F. Monteiro, L. L. Zhang, X. Y. Y. Sun, J. Pletzer-Zelgert, J. Wollenhaupt, C. G. Feiler, M. S. Weiss, E. C. Schulz, P. Mehrabi, K. Karnicar, A. Usenik, J. Loboda, H. Tidow, A. Chari, R. Hilgenfeld, C. Uetrecht, R. Cox, A. Zaliani, T. Beck, M. Rarey, S. Gunther, D. Turk, W. Hinrichs, H. N. Chapman, A. R. Pearson, C. Betzel, A. Meents,X-ray screening identifies active site and allosteric inhibitors of SARS-CoV-2 main protease. Science 372,642–646 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.E. A. MacDonald, G. Frey, M. N. Namchuk, S. C. Harrison, S. M. Hinshaw, I. W. Windsor,Recognition of divergent viral substrates by the SARS-CoV-2 main protease. ACS Infect. Dis. 7,2591–2595 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.A. M. Shaqra, S. Zvornicanin, Q. Y. Huang, G. J. Lockbaum, M. Knapp, L. Tandeske, D. T. Barkan, J. Flynn, D. N. A. Bolon, S. Moquin, D. Dovala, N. Kurt Yilmaz, C. A. Schiffer,Defining the substrate envelope of SARS-CoV-2 main protease to predict and avoid drug resistance. Nat. Commun. 13,3556 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Y. Zhao, Y. Zhu, X. Liu, Z. Jin, Y. Duan, Q. Zhang, C. Wu, L. Feng, X. Du, J. Zhao, M. Shao, B. Zhang, X. Yang, L. Wu, X. Ji, L. W. Guddat, K. Yang, Z. Rao, H. Yang,Structural basis for replicase polyprotein cleavage and substrate specificity of main protease from SARS-CoV-2. Proc. Natl. Acad. Sci. U.S.A. 119,e2117142119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.S. G. Sawicki, D. L. Sawicki, S. G. Siddell,A contemporary view of coronavirus transcription. J. Virol. 81,20–29 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.V. V. Courouble, S. K. Dey, R. Yadav, J. Timm, J. Harrison, F. X. Ruiz, E. Arnold, P. R. Griffin,Revealing the structural plasticity of SARS-CoV-2 nsp7 and nsp8 using structural proteomics. J. Am. Soc. Mass Spectrom. 32,1618–1630 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.D. M. Ferraro, N. Lazo, A. D. Robertson,EX1 hydrogen exchange and protein folding. Biochemistry 43,587–594 (2004). [DOI] [PubMed] [Google Scholar]
  • 39.C. B. Arrington, A. D. Robertson,Correlated motions in native proteins from MS analysis of NH exchange: Evidence for a manifold of unfolding reactions in ovomucoid third domain. J. Mol. Biol. 300,221–232 (2000). [DOI] [PubMed] [Google Scholar]
  • 40.J. Zheng, M. R. Chang, R. E. Stites, Y. Wang, J. B. Bruning, B. D. Pascal, S. J. Novick, R. D. Garcia-Ordonez, K. R. Stayrook, M. J. Chalmers, J. A. Dodge, P. R. Griffin,HDX reveals the conformational dynamics of DNA sequence specific VDR co-activator interactions. Nat. Commun. 8,923 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.J. Zheng, C. Wang, M. R. Chang, S. C. Devarkar, B. Schweibenz, G. C. Crynen, R. D. Garcia-Ordonez, B. D. Pascal, S. J. Novick, S. S. Patel, J. Marcotrigiano, P. R. Griffin,HDX-MS reveals dysregulated checkpoints that compromise discrimination against self RNA during RIG-I mediated autoimmunity. Nat. Commun. 9,5366 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.J. Y. Zhou, L. Y. Yang, A. DeColli, C. F. Meyers, N. S. Nemeria, F. Jordan,Conformational dynamics of 1-deoxy-D-xylulose 5-phosphate synthase on ligand binding revealed by H/D exchange MS. Proc. Natl. Acad. Sci. U.S.A. 114,9355–9360 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.J. Y. Yang, Y. Zhang,I-TASSER server: New development for protein structure and function predictions. Nucleic Acids Res. 43,W174–W181 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.S. P. Meisburger, A. B. Taylor, C. A. Khan, S. N. Zhang, P. F. Fitzpatrick, N. Ando,Domain movements upon activation of phenylalanine hydroxylase characterized by crystallography and chromatography-coupled small-angle x-ray scattering. J. Am. Chem. Soc. 138,6506–6516 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Y. J. Zhai, F. Sun, X. M. Li, H. Pang, X. L. Xu, M. Bartlam, Z. H. Rao,Insights into SARS-CoV transcription and replication from the structure of the nsp7-nsp8 hexadecamer. Nat. Struct. Mol. Biol. 12,980–986 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.H. S. Hillen,Structure and function of SARS-CoV-2 polymerase. Curr. Opin. Virol. 48,82–90 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.H. S. Hillen, G. Kokic, L. Farnung, C. Dienemann, D. Tegunov, P. Cramer,Structure of replicating SARS-CoV-2 polymerase. Nature 584,154–156 (2020). [DOI] [PubMed] [Google Scholar]
  • 48.B. Malone, E. A. Campbell, S. A. Darst,CoV-er all the bases: Structural perspectives of SARS-CoV-2 RNA synthesis. Enzyme 49,1–37 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.R. T. Bradshaw, F. Marinelli, J. D. Faraldo-Gomez, L. R. Forrest,Interpretation of HDX data by maximum-entropy reweighting of simulated structural ensembles. Biophys. J. 118,1649–1664 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.K. C. Kihn, T. Wilson, A. K. Smith, R. T. Bradshaw, P. L. Wintrode, L. R. Forrest, A. Wilks, D. J. Deredge,Modeling the native ensemble of PhuS using enhanced sampling MD and HDX-ensemble reweighting. Biophys. J. 120,5141–5157 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.D. Franke, D. I. Svergun,DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J. Appl. Crystallogr. 42,342–346 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.D. I. Svergun,Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing. Biophys. J. 76,2879–2886 (1999). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Y. Hamuro, S. J. Coales,Hydrogen/deuterium exchange mass spectrometry for weak binders. J. Am. Soc. Mass Spectrom. 33,735–739 (2022). [DOI] [PubMed] [Google Scholar]
  • 54.A. J. Percy, M. Rey, K. M. Burns, D. C. Schriemer,Probing protein interactions with hydrogen/deuterium exchange and mass spectrometry-a review. Anal. Chim. Acta 721,7–21 (2012). [DOI] [PubMed] [Google Scholar]
  • 55.S. G. Sawicki, D. L. Sawicki,Coronavirus transcription: A perspective. Curr. Top. Microbiol. Immunol. 287,31–55 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.A. T. Nguyen, C. L. Feasley, K. W. Jackson, T. J. Nitz, K. Salzwedel, G. M. Air, M. Sakalian,The prototype HIV-1 maturation inhibitor, bevirimat, binds to the CA-SP1 cleavage site in immature Gag particles. Retrovirology 8,101 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.M. D. Purdy, D. Shi, J. Chrustowicz, J. Hattne, T. Gonen, M. Yeager,MicroED structures of HIV-1 Gag CTD-SP1 reveal binding interactions with the maturation inhibitor bevirimat. Proc. Natl. Acad. Sci. U.S.A. 115,13258–13263 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.M. Z. Wang, C. M. Quinn, J. R. Perilla, H. L. Zhang, R. Shirra, G. J. Hou, I. J. Byeon, C. L. Suiter, S. Ablan, E. Urano, T. J. Nitz, C. Aiken, E. O. Freed, P. J. Zhang, K. Schulten, A. M. Gronenborn, T. Polenova,Quenching protein dynamics interferes with HIV capsid maturation. Nat. Commun. 8,1779 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.R. E. London,HIV-1 reverse transcriptase: A metamorphic protein with three stable states. Structure 27,420–426 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.A. F. Dishman, B. F. Volkman,Unfolding the mysteries of protein metamorphosis. ACS Chem. Biol. 13,1438–1446 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.T. J. El-Baba, C. A. Lutomski, A. L. Kantsadi, T. R. Malla, T. John, V. Mikhailov, J. R. Bolla, C. J. Schofield, N. Zitzmann, I. Vakonakis, C. V. Robinson,Allosteric inhibition of the SARS-CoV-2 main protease: Insights from mass spectrometry based assays. Angew. Chem. Int. Ed. 59,23544–23548 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Z. C. Sun, L. Wang, X. Y. Li, C. P. Fan, J. F. Xu, Z. Z. Shi, H. R. Qiao, Z. Y. Lan, X. Zhang, L. Y. Li, X. Zhou, Y. Geng,An extended conformation of SARS-CoV-2 main protease reveals allosteric targets. Proc. Natl. Acad. Sci. U.S.A. 119,e2120913119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.M. Gotte,Remdesivir for the treatment of Covid-19: The value of biochemical studies. Curr. Opin. Virol. 49,81–85 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.M. A. Redhead, C. D. Owen, L. Brewitz, A. H. Collette, P. Lukacik, C. Strain-Damerell, S. W. Robinson, P. M. Collins, P. Schafer, M. Swindells, C. J. Radoux, I. N. Hopkins, D. Fearon, A. Douangamath, F. von Delft, T. R. Malla, L. Vangeel, T. Vercruysse, J. Thibaut, P. Leyssen, T. T. Nguyen, M. Hull, A. Tumber, D. J. Hallett, C. J. Schofield, D. I. Stuart, A. L. Hopkins, M. A. Walsh,Bispecific repurposed medicines targeting the viral and immunological arms of COVID-19. Sci. Rep. 11,13208 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.N. Sluis-Cremer, G. Tachedjian,Mechanisms of inhibition of HIV replication by non-nucleoside reverse transcriptase inhibitors. Virus Res. 134,147–156 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.X. Y. Xue, H. T. Yang, W. Shen, Q. Zhao, J. Li, K. L. Yang, C. Chen, Y. H. Jin, M. Bartlam, Z. H. Rao,Production of authentic SARS-CoV M-pro with enhanced activity: Application as a novel tag-cleavage endopeptidase for protein overproduction. J. Mol. Biol. 366,965–975 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.D. R. Littler, B. S. Gully, R. N. Colson, J. Rossjohn,Crystal structure of the SARS-CoV-2 non-structural protein 9, nsp9. iScience 23,101258 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.F. Liu, P. Lossl, R. Scheltema, R. Viner, A. J. R. Heck,Optimized fragmentation schemes and data analysis strategies for proteome-wide cross-link identification. Nat. Commun. 8,15473 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.F. Liu, D. T. S. Rijkers, H. Post, A. J. R. Heck,Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat. Methods 12,1179–1184 (2015). [DOI] [PubMed] [Google Scholar]
  • 70.B. MacLean, D. M. Tomazela, N. Shulman, M. Chambers, G. L. Finney, B. Frewen, R. Kern, D. L. Tabb, D. C. Liebler, M. J. MacCoss,Skyline: An open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26,966–968 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Z. Chen, J. Rappsilber,A generic solution for quantifying cross-linked peptides using software Skyline. Protoc. Exch. 10.1038/protex.2018.001 , (2018). [Google Scholar]
  • 72.Z. A. Chen, J. Rappsilber,Quantitative cross-linking/mass spectrometry to elucidate structural changes in proteins and their complexes. Nat. Protoc. 14,171–201 (2019). [DOI] [PubMed] [Google Scholar]
  • 73.M. Choi, C. Y. Chang, T. Clough, D. Broudy, T. Killeen, B. MacLean, O. Vitek,MSstats: An R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments. Bioinformatics 30,2524–2526 (2014). [DOI] [PubMed] [Google Scholar]
  • 74.C. W. Combe, L. Fischer, J. Rappsilber,xiNET: Cross-link network maps with residue resolution. Mol. Cell. Proteomics 14,1137–1147 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Y. Perez-Riverol, A. Csordas, J. W. Bai, M. Bernal-Llinares, S. Hewapathirana, D. J. Kundu, A. Inuganti, J. Griss, G. Mayer, M. Eisenacher, E. Perez, J. Uszkoreit, J. Pfeuffer, T. Sachsenberg, S. Yilmaz, S. Tiwary, J. Cox, E. Audain, M. Walzer, A. F. Jarnuczak, T. Ternent, A. Brazma, J. A. Vizcaino,The PRIDE database and related tools and resources in 2019: Improving support for quantification data. Nucleic Acids Res. 47,D442–D450 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.M. J. Chalmers, S. A. Busby, B. D. Pascal, Y. J. He, C. L. Hendrickson, A. G. Marshall, P. R. Griffin,Probing protein ligand interactions by automated hydrogen/deuterium exchange mass spectrometry. Anal. Chem. 78,1005–1014 (2006). [DOI] [PubMed] [Google Scholar]
  • 77.Z. Zhang, D. L. Smith,Determination of amide hydrogen exchange by mass spectrometry: A new tool for protein structure elucidation. Protein Sci. 2,522–531 (1993). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.B. D. Pascal, S. Willis, J. L. Lauer, R. R. Landgraf, G. M. West, D. Marciano, S. Novick, D. Goswami, M. J. Chalmers, P. R. Griffin,HDX workbench: Software for the analysis of H/D exchange MS data. J. Am. Soc. Mass Spectrom. 23,1512–1521 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.T. R. Keppel, D. D. Weis,Mapping residual structure in intrinsically disordered proteins at residue resolution using millisecond hydrogen/deuterium exchange and residue averaging. J. Am. Soc. Mass Spectrom. 26,547–554 (2015). [DOI] [PubMed] [Google Scholar]
  • 80.N. Kirby, N. Cowieson, A. M. Hawley, S. T. Mudie, D. J. McGillivray, M. Kusel, V. Samardzic-Boban, T. M. Ryan,Improved radiation dose efficiency in solution SAXS using a sheath flow sample environment. Acta Crystallogr. D Struct. Biol. 72,1254–1266 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.J. B. Hopkins, R. E. Gillilan, S. Skou,BioXTAS RAW: Improvements to a free open-source program for small-angle x-ray scattering data reduction and analysis. J. Appl. Cryst. 50,1545–1553 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.V. V. Volkov, D. I. Svergun,Uniqueness of ab initio shape determination in small-angle scattering. J. Appl. Cryst. 36,860–864 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.D. Franke, M. V. Petoukhov, P. V. Konarev, A. Panjkovich, A. Tuukkanen, H. D. T. Mertens, A. G. Kikhney, N. R. Hajizadeh, J. M. Franklin, C. M. Jeffries, D. I. Svergun,ATSAS 2.8: A comprehensive data analysis suite for small-angle scattering from macromolecular solutions. J. Appl. Cryst. 50,1212–1225 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.L. Willard, A. Ranjan, H. Zhang, H. Monzavi, R. F. Boyko, B. D. Sykes, D. S. Wishart,VADAR: A web server for quantitative evaluation of protein structure quality. Nucleic Acids Res. 31,3316–3319 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.A. Shevchenko, M. Wilm, O. Vorm, M. Mann,Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels. Anal. Chem. 68,850–858 (1996). [DOI] [PubMed] [Google Scholar]
  • 86.D. E. Sleat, M. C. Della Valle, H. Zheng, D. F. Moore, P. Lobel,The mannose 6-phosphate glycoprotein proteome. J. Proteome Res. 7,3010–3021 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.R. Craig, R. C. Beavis,TANDEM: Matching proteins with tandem mass spectra. Bioinformatics 20,1466–1467 (2004). [DOI] [PubMed] [Google Scholar]
  • 88.D. Franke, C. M. Jeffries, D. I. Svergun,Correlation map, a goodness-of-fit test for one-dimensional x-ray scattering spectra. Nat. Methods 12,419–422 (2015). [DOI] [PubMed] [Google Scholar]
  • 89.D. I. Svergun,Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J. Appl. Cryst. 25,495–503 (1992). [Google Scholar]
  • 90.R. P. Rambo, J. A. Tainer,Accurate assessment of mass, models and resolution by small-angle scattering. Nature 496,477–481 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.V. Piiadov, E. Ares de Araujo, M. Oliveira Neto, A. F. Craievich, I. Polikarpov,SAXSMoW 2.0: Online calculator of the molecular weight of proteins in dilute solution from experimental SAXS data measured on a relative scale. Protein Sci. 28,454–463 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Text

Figs. S1 to S15

Tables S1 to S4

Data S1 and S2


Articles from Science Advances are provided here courtesy of American Association for the Advancement of Science

RESOURCES