Abstract
Coronavirus non‐structural protein 3 (nsp3) forms hexameric crowns of pores in the double membrane vesicle that houses the replication–transcription complex. Nsp3 in SARS‐like viruses has three unique domains absent in other coronavirus nsp3 proteins. Two of these, SUD‐N (Macrodomain 2) and SUD‐M (Macrodomain 3), form two lobes connected by a peptide linker and an interdomain disulfide bridge. We resolve the first complete x‐ray structure of SARS‐CoV SUD‐N/M as well as a mutant variant of SARS‐CoV‐2 SUD‐N/M modified to restore cysteines for interdomain disulfide bond naturally lost by evolution. Comparative analysis of all structures revealed SUD‐N and SUD‐M are not rigidly associated but rather have significant rotational flexibility. Phylogenetic analysis supports that the potential to form the disulfide bond is common across betacoronavirus isolates from many bat species and civets, but also one or both of the cysteines that form the disulfide bond are absent across isolates from bats and pangolins. The absence of these cysteines does not impact viral replication or protein translation.
Keywords: Mac2, Mac3, macrodomains, SARS‐CoV, SARS‐CoV‐2, structure biology, SUD, viral replication
1. INTRODUCTION
The coronaviruses SARS‐CoV and SARS‐CoV‐2 are the causative agents of Severe Acute Respiratory Syndrome (SARS) and Coronavirus disease 2019 (COVID‐19), respectively (Fouchier et al., 2003; Wu et al., 2020; Zhou et al., 2020). Both viruses caused global outbreaks in humans in the early twenty‐first century, with SARS accounting for more than 8000 probable cases and 774 deaths from 2002 to 2004 in Asia and North America and COVID‐19 accounting for over 7 million deaths worldwide from 2019 to 2024 (CDC, 2003; Dong et al., 2020). Coronaviruses are enveloped, positive‐strand, single‐stranded RNA (ssRNA+) viruses. Upon delivery of the ssRNA into infected cells, non‐structural proteins (nsp) translated by host ribosomes from the incoming RNA strand are produced, leading to modification of the endoplasmic reticulum to create double membrane vesicles (DMVs) that ultimately contain the viral‐encoded replication‐transcription complex (RTC). Within the DMVs, the RTC generates genomic ssRNA+ copies for packaging into new viral capsids along with subgenomic messenger RNA (mRNA) strands that will be translated by host ribosomes to produce the viral structural S, N, E, and M proteins (Malone et al., 2022). Although the RNA molecules are produced within the DMV, both the packaging of genomic RNA into capsids and the translation of subgenomic RNAs require the RNA molecules to be exported from the DMV to the cytoplasm (Klein et al., 2020; Malone et al., 2022; Roingeard et al., 2022). Cryo‐electron tomography of DMV structures has revealed that the DMV membranes contain hexameric transmembrane pores formed by non‐structural protein 3 (nsp3) in complex with non‐structural protein 4 (nsp4) and non‐structural protein 6 (nsp6), and it is postulated that viral RNAs pass through these pores to access the cytoplasm for translation or viral capsid packaging (Wolff et al., 2020; Zimmermann et al., 2023).
Nsp3 is the largest protein produced by coronaviruses (Lei et al., 2018; Tan et al., 2007). The hexamer of nsp3 extends from the DMV into the cell cytoplasm with each nsp3 protein tethered to the DMV by two transmembrane domains, TM1 and TM2. The small nsp3 ectodomain (3Ecto) between TM1 and TM2 is displayed on the luminal side of the DMV and interacts with nsp4 as part of the transmembrane pore complex (Lei et al., 2018; Tan et al., 2007; Wolff et al., 2020; Zimmermann et al., 2023). The C‐terminal Y1 and CoV‐Y domains are also exposed to the cytoplasmic side of the DMV (Figure 1a).
FIGURE 1.

The nsp3 SARS Unique domains from SARS‐CoV and SARS‐CoV‐2. (a) Diagram of the domain structure of nsp3 from SARS‐CoV, SARS‐CoV‐2, and related viruses. Domains in the long N‐terminal extension into the cytosol are the ubiquitin‐like domain 1 (Ubl1), the hypervariable region (HVR), macrodomain 1 (Mac1), macrodomain 2 (Mac2/SUD‐N), macrodomain 3 (Mac3/SUD‐M), domain preceding Ubl2 and PLpro (DPUP/SUD‐C), ubiquitin‐like domain 2 (Ubl2), papain‐like protease (PLpro), nucleic acid binding (NAB) and betacoronavirus‐specific marker (SM). The transmembrane region 1 (TM1) and region 2 (TM2) cross the DMV double membrane with the nsp3 ectodomain (3Ecto) facing the luminal side of the membrane, where it binds nsp4. The C‐terminal domains Y1 and coronavirus Y (CoV‐Y) are exposed to the cytosolic face of the DMV. The SARS Unique domain (SUD) is shown in yellow with N‐terminal (N), middle (M), and C‐terminal (C) domains marked. (b) Amino acid alignment of SUD‐N/M from SARS‐CoV and SARS‐CoV‐2 with important regions that bind PAIP1M and G4 motifs indicated with red dashed boxes. Cysteine residues that participate in the disulfide bond in SARS1‐N/M and the corresponding substitutions are marked with arrows. (c) Overlay of all six chains of previously determined structures of SARS‐CoV SUD‐N/M (SARS1‐N/M, PDB ID 2w2g and 2wct). Backbone in beige, helices in orange, and beta‐strands in red. Structures have a conserved fold with r.m.s.d. of ~0.7 Å. All show two macrodomains with six beta‐strands wrapped by five helices. The linker is disordered in all structures. The disulfide bond between Cys492 and Cys623 (shown in yellow in center of structures) is seen in all six chains.
The organization of N‐terminal domains of nsp3 is variable across different coronaviruses but is conserved between SARS‐CoV and SARS‐CoV‐2 (Figure 1a). Three of these domains, known as the SARS unique domains (SUD), are found within nsp3 only in SARS‐CoV, SARS‐CoV‐2, and other closely related SARS‐like coronaviruses (Lei et al., 2018; Neuman, 2016; Snijder et al., 2003; Xu et al., 2020) and are notably absent in other human coronaviruses including MERS‐CoV, HCoV‐OC43, and HCoV‐229E (Lei et al., 2018; Snijder et al., 2003; Tan et al., 2007). These unique domains include macrodomain 2 (Mac2, also known as SUD‐N), macrodomain 3 (Mac3, also known as SUD‐M), and the domain‐preceding Ubl2 and PLpro (DPUP, also known as SUD‐C). The combination of Mac2 and Mac3 (also known as SUD2Core) will be referred to here as SUD‐N/M and as SARS1‐N/M for SARS‐CoV and SARS2‐N/M for SARS‐CoV‐2.
The SARS1‐N and SARS1‐M domains of SARS‐CoV each have a macrodomain fold similar to the ADP‐ribosylhydrolase domain Mac1 (Chatterjee et al., 2009; Johnson et al., 2010; Tan et al., 2009). However, unlike Mac1, these domains do not bind ADP‐ribose (Tan et al., 2009). Rather, the SARS1‐N/M proteins bind to RNA and DNA oligonucleotides, especially those folded to form guanine‐quadruplex (G4) structures (Tan et al., 2007; Tan et al., 2009). These three‐dimensional structures are formed in guanine‐rich regions of nucleotides in which four guanine bases organize into a guanine tetrad plane and then two or more tetrad planes stack to form the G4 (Lipps & Rhodes, 2009). Modeling and site‐directed mutagenesis studies of SARS1‐N/M suggest both RNA and DNA G4 nucleotide structures bind via lysines in a flexible surface‐exposed loop of the SARS1‐M domain with minor contributions from the SARS1‐N domain (Kusov et al., 2015).
The SUD‐N/M domains from SARS‐CoV are 74% identical and 85% similar in amino acid sequence with the SARS‐CoV‐2 domains (Figure 1b), and likewise, SARS2‐N/M domains bind to G4 nucleotide structures (Lavigne et al., 2021; Qin et al., 2023). SARS2‐N/M binds preferentially to a G4 motif found in the 5′UTR of TRF2 mRNA folded to form a G4 structure, and this correlates with inhibition of the unfolded protein response that detects endoplasmic reticulum stress (Lavigne et al., 2021). SARS2‐N/M also binds to G4 motifs formed by the folding of single‐stranded DNA, including a G4 motif of BCL2 promoter, and this interaction is suggested to promote apoptosis (Lavigne et al., 2021). Further, SARS2‐N/M has been shown to bind the 5′UTR for SARS‐CoV‐2 RNAs and to possess global RNA‐binding activity (Lemak et al., 2024).
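A percent identity figure such as the one quoted above is a column count over a pairwise alignment. The following is a minimal sketch of that count, assuming identity is computed over ungapped columns only (other conventions divide by the full alignment length). The function name and the short aligned fragments are hypothetical placeholders for illustration, not the SUD‐N/M sequences. Percent similarity would additionally require a substitution matrix or residue grouping, which is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
fragment_a = "MKTC-LVAY"
fragment_b = "MKSCGLV-Y"
print(round(percent_identity(fragment_a, fragment_b), 1))
```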
Interactions of SARS2‐N/M domains with other proteins may also have functionally significant relevance to viral infection. The binding of SARS2‐N/M domains to G4 structures is enhanced by its interaction with viral non‐structural protein 5 (nsp5), and this interaction is also associated with enhanced apoptosis (Li, Yu, et al., 2023). In addition, both SARS1‐N and SARS2‐N bind the middle domain of polyadenylate binding protein‐interacting protein 1 (PAIP1M), a cytoplasmic co‐factor that binds to polyadenylate binding protein 1 (PAB1), which is an enhancer of ribosome translation (Lei et al., 2021; Qin et al., 2023). The interaction with PAIP1M as well as the co‐sedimentation of nsp3 with ribosomal fractions and with the ribosomal inhibitor non‐structural protein 1 (nsp1) suggests that SUD‐N domains may function in part to promote translation of viral mRNAs over host mRNAs (Lei et al., 2021).
X‐ray structures of SARS1‐N/M have been determined and reveal these two domains form a bi‐lobal protein connected by both a flexible linker peptide (underlined in Figure 1b) and by an interdomain disulfide bridge (PDB ID 2w2g and 2wct) (Tan et al., 2009). In this study, we determine additional structures of the SARS1‐N/M domains as well as a variant of SARS2‐N/M that forms the interdomain disulfide bond. Analysis of all available structures reveals these two domains are more flexible than suggested by earlier structural studies and the domains have torsional twists with respect to each other, although each of the macrodomains is structurally conserved. The presence or absence of cysteines has a minimal contribution to binding to nucleic acids and did not impact viral replication. Across all the SARS‐like viruses, the capacity to form the disulfide bond is shared among viruses closely related to SARS‐CoV while viruses related to Pangolin‐SARS and bat isolates closely related to SARS‐CoV‐2 lack one or both cysteines.
2. RESULTS
2.1. A complete structure of SARS‐CoV SUD‐N/M
The x‐ray structure of the 259 a.a. joined SARS1‐N/M domains was previously solved at 2.22 Å with two chains in the asymmetric unit (PDB ID 2w2g) (Tan et al., 2009). A second lower‐resolution structure has four chains in the asymmetric unit (PDB ID 2wct) (Tan et al., 2009). All six chains from both structures have nearly identical backbone folds (Figure 1c). Since all the prior SARS1‐N/M structures are nearly identical (root‐mean‐square deviation, r.m.s.d. = 0.6–0.8 Å), only chain A of PDB ID 2w2g (2w2g/A) will be analyzed in this study.
As part of a structural genomics project to address the COVID‐19 pandemic, we attempted to crystallize the recombinant SARS2‐N/M domains of the comparable 259 a.a. protein as used for prior structures of protein from SARS1‐N/M (Figure 1c). After multiple failed attempts to crystallize SARS2‐N/M, we cloned and purified SARS1‐N/M to test the compatibility of our approach for protein purification and crystallization using this protein known to crystallize. This SARS‐CoV‐derived protein rapidly crystallized, data were collected, and the structure was solved by molecular replacement using PDB ID 2w2g as a model.
This new structure (PDB ID 8ufl) was determined at 2.51 Å and showed two chains in the asymmetric unit, designated Chain 8ufl/A and 8ufl/B. Similar to the prior structures, the SARS1‐N has a macrodomain fold comprised of a six‐stranded β sheet (βN1–βN6–βN5–βN2–βN4–βN3) surrounded by six helices (αN1–αN6) (Figure 2a). Unique to this structure, chain 8ufl/A was complete across all residues and included resolution of residues 515–523 in the peptide linker between the N and M domains, which was disordered in chain B and in all prior structures. In this new structure, the initial segment of the SARS1‐M domain resolved as a helix, missing from all previous structures and here designated αM1, followed by βM1. The M domain therefore has a macrodomain fold similar to that of the SARS1‐N domain, composed of a six‐stranded β sheet (βM1–βM6–βM5–βM2–βM4–βM3) surrounded by six helices (αM1–αM6). Similar to all prior structures, this new structure resolved with an interdomain disulfide bond linking Cys492 in the large LoopN8, between βN5 and αN6, with Cys623 in αM5 of SARS1‐M (Figure 2b).
FIGURE 2.

Complete structure of SARS1‐N/M. (a) Cylinder and slabs diagram of SARS1‐N/M (PDB ID 8ufl/A) with backbone in light gray, helices in coral, and beta‐strands in red. Domains are labeled according to this complete structure. Disulfide bond (C–C) between SARS1‐N Cys492 and SARS1‐M Cys623 is shown. (b) Close‐up view of the disulfide bond in the SARS1‐N/M structure (8ufl) with (Fo‐Fc) omit density map shown as a gray mesh contoured at 3.0 sigma level, with Cys492, Cys623, and close neighbors marked. (c) Ribbon diagram of SARS1‐N/M structure 8ufl (light gray) overlaid with structure 2w2g/A (yellow) by alignment of SUD‐N domains. Distance of shift of C–C bond is marked. Helices αM5 are shown in yellow (2w2g) and coral (8ufl) to emphasize degree of shift of this C–C linked helix. Rotational twist is indicated with a black arrow, as measured from cysteine bond SG at center to surface exposed Lys565 CB that participates in binding to G4 quadruplex.
The other significant difference in this structure compared to prior SARS‐CoV structures (Figure 1c) is that the relative rotational orientation of SUD‐N and SUD‐M was shifted compared to the alignment of all other structures. Although SUD‐N and SUD‐M each independently aligned between 2w2g and 8ufl, when SUD‐N was aligned, SUD‐M was rotated 21.7° from the reorientation of the disulfide bond and flexibility in the linker (Figure 2c). This shift moved Lys565 that participates in binding to G4 quadruplex structures by 12.9 Å.
These data indicated that in contrast to prior studies, there is rotational flexibility in the disulfide‐bound structures such that the cleft between SUD‐N and SUD‐M is not a fixed interface and flexibility is introduced by both the disulfide bond and the interdomain peptide linker.
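The rotational twist and displacement values above are described as measurements from the disulfide SG atom at the center to the CB atom of the G4‐binding lysine after superposition on SUD‐N. As a minimal sketch of that kind of measurement, the twist can be expressed as the angle between the two pivot‐to‐marker vectors and the shift as the distance between equivalent atoms. The sketch assumes the two structures have already been superposed on SUD‐N by separate software; the function names and the coordinates are hypothetical and are not taken from 8ufl or 2w2g.

```python
import math


def twist_angle_deg(pivot_a, marker_a, pivot_b, marker_b):
    """Angle (degrees) between the pivot->marker vectors of two superposed structures."""
    vec_a = [m - p for p, m in zip(pivot_a, marker_a)]
    vec_b = [m - p for p, m in zip(pivot_b, marker_b)]
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(x * x for x in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("pivot and marker must be distinct points")
    cosine = sum(x * y for x, y in zip(vec_a, vec_b)) / (norm_a * norm_b)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


def atom_shift(atom_a, atom_b):
    """Distance between equivalent atoms in two superposed structures (same units as input)."""
    return math.dist(atom_a, atom_b)


# Hypothetical coordinates in Å, standing in for SG (pivot) and Lys CB (marker).
sg_ref, cb_ref = (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)
sg_new = (0.0, 0.0, 0.0)
cb_new = (10.0 * math.cos(math.radians(30.0)), 10.0 * math.sin(math.radians(30.0)), 0.0)

print(round(twist_angle_deg(sg_ref, cb_ref, sg_new, cb_new), 1))
print(round(atom_shift(cb_ref, cb_new), 2))
```

An angle defined this way depends on the choice of pivot and marker atoms, so values are only comparable between structure pairs when the same atoms are used.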
2.2. Structure of SARS‐CoV‐2 SUD‐N/M variant with reintroduced disulfide bond
The successful new structure of SARS1‐N/M (Figure 2a), and our high‐throughput crystallization approach, supported that the failure of SARS2‐N/M to crystallize was most likely linked to the 25% difference in amino acid sequence between the two proteins. Analysis of the aligned sequences of these domains revealed that SARS2‐N/M lacks the two cysteines that form the interdomain disulfide bond in SARS1‐N/M; thus, the joined domains could not form this potentially stabilizing disulfide bond (Figure 1b). A new synthetic clone of SARS2‐N/M was generated that exchanged codons for residues Leu516 in SARS2‐N and Tyr647 in SARS2‐M to cysteines to introduce the potential for disulfide bond formation at the same location as found in SARS1‐N/M. The protein variant with L516C and Y647C, here referred to as SARS2‐NC/MC, was purified in the absence of reducing agent and set up for crystallization. The protein rapidly crystallized, data were collected, and the structure determined at 1.65 Å resolution. The structure was comprised of one chain in the asymmetric unit.
The SARS2‐NC domain resolved with a macrodomain fold similar to that of SARS1‐N (r.m.s.d. = 0.857 Å), except that the large loop with αN4 in SARS1‐N was disordered (Figure 3a). In the SARS2‐NC/MC structure, the interdomain linker (residues 541–548) was not resolved and the first strand of the SARS2‐MC domain did not contain either the αM1 or βM1 that were determined in the complete SARS1‐N/M structure. The SARS2‐M domain is otherwise structurally conserved with SARS1‐M (r.m.s.d. = 0.617 Å).
FIGURE 3.

Structure of SARS2‐NC/MC and AF2 model reveal increased torsional twist in SARS‐CoV‐2 compared to SARS‐CoV. (a) Cylinder and slabs diagram of SARS2‐N/M (PDB ID 8ufm/A, this study) with backbone in light blue, helices in teal, and beta‐strands in dark blue. Domains are labeled according to the complete structure of SARS1‐N/M presented in Figure 2. Disulfide bond (C–C) between SARS2‐N/M L516C‐Y647C is indicated with the bond shown in yellow. Disordered amino acids in the structure are shown as dashed lines. (b) Close‐up view of the disulfide bond in SARS2‐NC/MC structure (8ufm) with (Fo‐Fc) omit density map shown as a gray mesh contoured at 3.0 sigma level, with Cys‐516, Cys‐647, and close neighbors marked. (c) Overlay of SARS2‐NC/MC from 8ufm depicted as cartoons (light blue) and 8hbl (pink). Ellipse indicates structure ηN1 present in 8hbl. (d–f) Alignment of SUD‐N domains of SARS2‐NC/MC (light blue) with (d) SARS1‐N/M 2w2g (pale yellow), (e) 8ufl (gray), and (f) AF2 best model of native SARS2‐N/M (tan). Distances between disulfide bonds are indicated with arrows. Rotational twist is indicated with a black arrow, as measured from cysteine bond SG at center to surface exposed lysine that participates in binding to G4 quadruplex (Lys587 CB for SARS2‐N/M or Lys565 CB for SARS1‐N/M). Range of angles for panel (f) represents variation from five generated AF2 models. In panels (c–f), helix αM5 for each model is shown as tube to emphasize the relative orientation of the SUD‐M domains (8ufm, teal; 2w2g, yellow; 8ufl, coral; AF2, brown) and rotation of panel images relative to each other. (g–i) Structural focus on linked LoopN8 with αM5 via cysteine bond or model with Leu and Tyr showing sidechain clash modeled by independent alignment of N and M domains.
The new SARS2‐NC/MC structure contained an interdomain disulfide bond formed between modified residues L516C from the ηN1 structure within LoopN8 and Y647C from αM5, just as found in SARS1‐N/M (Figure 3b). In the course of completing this work, similar structures of SARS2‐NC/MC were published with resolutions of 1.35 Å (PDB ID 8gqc) and 1.58 Å (PDB ID 8hbl) (Qin et al., 2023). These structures are nearly identical except 8hbl resolved the ηN1 structure within LoopN8 (Figure 3c).
Similar to the recent report, the relative orientation of the SUD‐N and SUD‐M domains in SARS2‐NC/MC is rotated compared to SARS1‐N/M structure 2w2g. Although the individual domains align closely, the overall r.m.s.d. is 9.3 Å. The high variance in structural alignment is due to the rotation of SARS2‐M by 36° relative to the orientation of these two domains in 2w2g (Figure 3d). The rotation occurs due to the movement of LoopN8 such that the disulfide bond is shifted by 6.4 Å compared to SARS1‐N/M structures. When compared to our new complete structure 8ufl for SARS1‐N/M, the r.m.s.d. is even greater at 12.8 Å, with the rotation measured as 49°, the shift of the disulfide bond as 7.7 Å (Figure 3e), and a flex of LoopN8 to an alternative orientation. Further, the G4 quadruplex binding strand including Lys565 (SARS1‐M) or Lys589 (SARS2‐M) is significantly rotated away by 22.5 Å compared to structure 2w2g and 31.4 Å compared to structure 8ufl. Similarly, when the SUD‐M domains are aligned, the PAIP1M binding region that includes Cys417 in SARS2‐N and Cys393 in SARS1‐N is shifted by 15 Å compared to 2w2g and 23.6 Å compared to 8ufl.
An AlphaFold2 (AF2) computational model was generated using ColabFold (Mirdita et al., 2022) to predict the structure of SARS2‐N/M in the absence of the disulfide bond (Figure 3f). This AF2‐predicted structure is nearly identical to the x‐ray‐determined structure for each domain when aligned independently (r.m.s.d. = 0.667 Å for SARS2‐N and 0.673 Å for SARS2‐M). However, the AF2 model predicts the domains are dramatically reoriented by an 88°–110° shift in five distinct AF2 generated models, as measured from the aligned position of Lys589 to L516C in solved structures 8ufm and 8hbl to the position of Lys589 in the AF2 model. Residue Tyr647 in the AF2 model shifted 24.2 Å away from the position of Y647C in the solved structures of SARS2‐NC/MC. When the LoopN8 and αM5 structures were independently overlaid on the 8ufm structure (Figure 3g–i), it is revealed that in the absence of the two cysteines in SARS2‐N/M, the Tyr647 would clash at any given rotamer position within the SUD‐M domain, except for the orientation shown in Figure 3i when it would clash with the C‐beta of the residue of Leu516 from SUD‐N domain. Thus, SUD‐M was rotated away in the computational model (Figure 3f). These data support that the C–C bond introduction provided a single conformation for crystallization in an otherwise highly flexible protein, and the natural protein likely adopts a structure without the SUD‐N/SUD‐M interface due to clashes of the interface residues.
2.3. Phylogenetic analysis of SUD‐N/M reveals cysteines also absent in pangolin‐SARS and related coronaviruses
We next considered the question of whether the disulfide bond was a sequence variation distinct between SARS‐CoV and SARS‐CoV‐2 and whether the potential disulfide bond is present in other SARS‐like coronaviruses. A phylogenetic tree of protein sequences was generated using 120 diverse amino acid sequences of betacoronaviruses available from the National Center for Biotechnology Information (NCBI). We found that the sequences sorted into four clades (Figure 4a,b). Of significance, in the largest clade (black/red in Figure 4b), all of the sequences have both cysteines, including many isolates from Rhinolophus sp. (horseshoe bats). Isolates from Paradoxurus hermaphroditus (palm civet) were 100% identical in the regions adjacent to the cysteines, but many sequences including from R. sinicus, R. stheno, and R. affinis were also identical in this region.
FIGURE 4.

Phylogenetic relatedness of SUD‐N/M protein sequences. (a) CLUSTALW alignment and (b) phylogenetic tree of 120 unique full‐length SUD‐N/M protein sequences as indicated by the NCBI locus tag. Sequence alignment in panel (a) is limited to 21 residues surrounding LoopN8 or αM5 as indicated, with selected sequences representing all unique sequences in this region indicated by gray highlight in the phylogenetic tree. Cysteines that form the disulfide bond in SARS‐CoV and substitutions at this position are indicated by color as detailed in the legend. Source species is indicated by either the common name (panel a) or the species name colored as indicated in legend (panel b). Four clades are designated by the color of the branches.
A second clade along the phylogenetic tree includes SARS‐CoV‐2 (blue in Figure 4b). Several isolates from horseshoe bats in Laos in 2020 (notably BANAL‐20‐236 isolated from R. marshalli (NCBI locus tag UAY1352), BANAL‐20‐103 isolated from R. pusillus (NCBI locus tag UAY13228), and BANAL‐20‐52 isolated from R. malayanus (NCBI locus tag UAY13216)) were most closely related in protein sequence with SARS2‐N/M and included the first cysteine exchanged to Leu and the second to tyrosine as found in SARS‐CoV‐2. Strains in this same clade isolated from Manis javanica (pangolins) and other varieties of horseshoe bats (R. stheno, R. blythi, and R. cornutus) also lacked the ability to form the cysteine bond due to amino acid differences at one or both cysteines. While the cysteine in αM5 is changed to Tyr in most of these isolates, the cysteine in LoopN8 showed greater variation including the substitution of Ser or Asn (Figure 4a).
The third and fourth clades are composed of SUD‐N/M protein sequences more distantly related to the SARS‐CoV and SARS‐CoV‐2 sequences (lavender and green in Figure 4b). The third clade was comprised of sequences from horseshoe bats and had the bulkier Tyr extending from αM5 and a variety of residues with bulky side chains (Asn, Val, and Lys) extending from LoopN8 (Figure 4a). By contrast, the distantly related strains from roundleaf bats (Hipposideros sp.) showed only 24%–25% identity to SARS‐CoV and SARS‐CoV‐2 and had a variety of residues substituted at both cysteine positions (Figure 4b). These data support that the C–C bond may be limited to the SARS‐CoV subclade, whereas bulky residues that would push apart SUD‐N and SUD‐M are common in other betacoronavirus lineages.
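The survey described in this section amounts to reading two alignment columns, one in LoopN8 and one in αM5, and asking whether each sequence carries a cysteine at both, one, or neither position. A minimal sketch of that tally is shown below. The isolate names, the toy aligned sequences, and the column indices are hypothetical placeholders, not data from Figure 4, and the helper names are ours.

```python
from collections import Counter


def disulfide_potential(loop_residue: str, helix_residue: str) -> str:
    """Classify a sequence by how many of the two bond-forming positions are Cys."""
    n_cys = int(loop_residue.upper() == "C") + int(helix_residue.upper() == "C")
    return {2: "both cysteines", 1: "one cysteine", 0: "no cysteines"}[n_cys]


def tally_disulfide_potential(aligned: dict, loop_col: int, helix_col: int) -> Counter:
    """Count sequences per class, reading two 0-based columns of an alignment."""
    counts = Counter()
    for sequence in aligned.values():
        counts[disulfide_potential(sequence[loop_col], sequence[helix_col])] += 1
    return counts


# Hypothetical aligned fragments; columns 2 and 7 stand in for the
# LoopN8 and alpha-M5 positions, respectively.
toy_alignment = {
    "isolate_A": "GACTTAKCLE",
    "isolate_B": "GALTTAKYLE",
    "isolate_C": "GASTTAKCLE",
    "isolate_D": "GANTTAKYLE",
}
LOOP_COL, HELIX_COL = 2, 7

for label, count in sorted(tally_disulfide_potential(toy_alignment, LOOP_COL, HELIX_COL).items()):
    print(label, count)
```

Only sequences in the "both cysteines" class retain the capacity to form the interdomain bond; the sketch says nothing about whether the bond actually forms.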
2.4. The loss of the cysteine interaction does not impact the function of SARS‐N/M
A functional significance of the reorientation of SUD‐N and SUD‐M and the clash between residues Leu and Tyr that would occur if SARS2‐N/M adopted the same structure as crystallized is that any contribution of SUD‐N to G4 quadruplex binding would be lost. We confirmed by zone interference gel electrophoresis that the SARS2‐NC/MC variant does bind BCL2 promoter G4 quadruplexes, with a Kd of 1.9 μM (Figure 5a). Although the method has limitations due to protein smear along the migration lanes, we were able to determine that the binding efficiency was slightly improved compared to unmodified SARS2‐N/M at 4.1 μM and essentially identical to the Kd for SARS1‐N/M at 1.6 μM. Further, this protein bound to SARS‐CoV‐2 5′‐UTR at Kd < 20 μM, which is lower than the reported value for the unmodified protein at 41.4 μM in an experiment that was conducted simultaneously but split for purposes of publication (Lemak et al., 2024). These data support that the reintroduction of the cysteines may have slightly improved the binding affinity of the protein for both G4 and 5′‐UTR nucleic acids, consistent with the previously reported minor contribution of SUD‐N.
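To give a sense of what the difference between a Kd of 1.9 μM and 4.1 μM means at the concentrations used here, the sketch below evaluates a simple 1:1 binding model with ligand depletion. This model is an assumption made for illustration; the fitting procedure actually used to estimate Kd is not described in this section. The 10 μM protein and the 0–40 μM oligonucleotide range follow the Figure 5 legend, and the function names are ours.

```python
import math


def complex_concentration(kd: float, protein_total: float, ligand_total: float) -> float:
    """[PL] for 1:1 binding P + L <-> PL, solving the quadratic mass-action equation.

    All concentrations must be in the same units (for example, micromolar).
    """
    if kd <= 0 or protein_total < 0 or ligand_total < 0:
        raise ValueError("kd must be positive and totals non-negative")
    total = protein_total + ligand_total + kd
    discriminant = total * total - 4.0 * protein_total * ligand_total
    return (total - math.sqrt(discriminant)) / 2.0


def fraction_protein_bound(kd: float, protein_total: float, ligand_total: float) -> float:
    """Fraction of total protein present in the complex."""
    if protein_total == 0:
        return 0.0
    return complex_concentration(kd, protein_total, ligand_total) / protein_total


PROTEIN_UM = 10.0  # protein concentration from the Figure 5 legend
for ligand_um in (0.0, 5.0, 10.0, 20.0, 40.0):
    bound_variant = fraction_protein_bound(1.9, PROTEIN_UM, ligand_um)
    bound_unmodified = fraction_protein_bound(4.1, PROTEIN_UM, ligand_um)
    print(ligand_um, round(bound_variant, 3), round(bound_unmodified, 3))
```

Because the protein concentration exceeds both Kd values, the simpler hyperbolic form [L]/(Kd + [L]) with total ligand would overestimate binding at low ligand concentrations, which is why the depletion‐corrected form is used in this sketch.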
FIGURE 5.

Introduction of Cys residues does not impact G4 quadruplex binding. (a–c) Zone‐interference assay with protein (10 μM) mixed with Bcl‐2 (0–40 μM) oligonucleotide folded into a G4 quadruplex in the presence of 100 mM KCl: (a) SARS2‐N/M, (b) SARS2‐NC/MC, or (c) SARS1‐N/M. (d) EMSA for quantification of binding of SARS2‐NC/MC to 5′‐[32P]‐labeled ssRNA of the SARS‐CoV‐2 5′‐UTR (245 nt) region. For all panels, binding percentage (shown as a mean ± SD) and estimated Kd value were determined using three independent experiments. Representative gel is shown at left and percentage of binding was plotted from triplicate assays at right.
As this domain is thought to also impact translational efficiency, we considered if the slight loss of nucleic acid affinity would impact viral replication. To test this, we used a SARS‐CoV‐2 replicon assay that uses luciferase as a readout of viral RNA replication in cells infected with single‐round particles (Taha, Chen, et al., 2023; Taha, Suryawanshi, et al., 2023). The rationale for this experiment is that if translation was negatively impacted, protein complex for RNA replication would be reduced such that the overall copies of viral mRNAs would also be reduced. To test the impact of cysteines, the residues for both L516C and/or Y647C were introduced to the sequence of the SARS‐CoV‐2 WA1 isolate. Although the mutations slightly reduced luciferase levels indicative of reduced viral replication when the cysteines were present, the reduction in luciferase signal was not significant in all three independently tested cell lines (Table 1). We further tested another mutation that could have introduced flexibility in the linker region. SARS‐CoV has a proline at the position of SARS‐CoV‐2 Ser543 (Table 1). However, a S543P mutation, either alone or in combination with L516C and Y647C, also did not impact viral RNA replication in all three cell lines.
TABLE 1.
Impact of nsp3 SUD‐N/M mutations on SARS‐CoV‐2 viral RNA replication.
Values are relative light units (×10⁶).

| WA1 nsp3 | VeroE6+AT a | HEK293T+AT a | BHK‐21+AT a |
|---|---|---|---|
| Wild‐type | 9.41 ± 0.23 | 0.737 ± 0.117 | 1.49 ± 0.07 |
| L516C | 8.30 ± 0.36 | 0.651 ± 0.080 | 1.46 ± 0.03 |
| Y647C | 8.13 ± 0.48 | 0.630 ± 0.045 | 1.43 ± 0.10 |
| L516C Y647C | 8.06 ± 0.39 | 0.579 ± 0.079 | 1.61 ± 0.06 |
| S543P | 7.92 ± 0.27 | 0.558 ± 0.036 | 1.36 ± 0.08 |
| L516C S543P Y647C | 7.36 ± 0.34 | 0.487 ± 0. 42 | 1.44 ± 0.05 |
| S676T b | 3.75 ± 0.31 | 0.229 ± 0.034 | 0.453 ± 0.022 |
| No spike b | 0.00215 ± 0.00045 | 0.00167 ± 0.00016 | 0.00121 ± 0.00006 |
a: +AT indicates the cell line is stably transduced to express the entry factors ACE2 and TMPRSS2.
b: Values statistically lower than wild‐type in all three tested cell lines (p < 0.001) by one‐way ANOVA followed by Dunnett's multiple comparisons test. All other values were not significantly different than wild‐type in at least one cell line.
As a control in the replicon assay, we tested a mutation of the last SUD‐M residue Ser676 to threonine at the junction between SUD‐M and DPUP (also known as SUD‐C) that was previously reported to attenuate viral RNA replication (Li, Xue, et al., 2023). We confirmed that the S676T mutation reduced viral replication by 60%–70%. A Ser at this position is found in all sarbecoviruses shown in Figure 4 and this residue is under negative selection in SARS‐CoV‐2 (Jaroszewski et al., 2021; Sedova et al., 2020), suggesting the residue is important for efficient coronavirus replication.
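The 60%–70% figure can be checked directly against the mean values in Table 1. The short sketch below does that arithmetic using the wild‐type and S676T means; it ignores the reported standard deviations, and the variable and function names are ours.

```python
# Mean relative light units (x10^6) from Table 1.
wild_type = {"VeroE6+AT": 9.41, "HEK293T+AT": 0.737, "BHK-21+AT": 1.49}
s676t = {"VeroE6+AT": 3.75, "HEK293T+AT": 0.229, "BHK-21+AT": 0.453}


def percent_reduction(reference: float, value: float) -> float:
    """Percent decrease of value relative to reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * (1.0 - value / reference)


reductions = {
    cell_line: percent_reduction(wild_type[cell_line], s676t[cell_line])
    for cell_line in wild_type
}
for cell_line, reduction in reductions.items():
    print(cell_line, round(reduction, 1))
```

The reduction falls between 60% and 70% in each of the three cell lines.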
Altogether, these data support that SUD‐N and SUD‐M function as independent domains during viral mRNA replication. The increased flexibility may slightly reduce the known nucleic acid binding activities of the unique domains and thus, based on these known activities, did not confer a positive selective advantage in the emergence of SARS‐CoV‐2 as a human pathogen.
3. DISCUSSION
Extensive studies of the crystallization of SARS‐CoV SUD‐N/M have driven the development of hypotheses and conclusions that these two domains are interlinked by a disulfide bond that introduces rigidity. Docking studies originally supported that the interface between the two domains may even jointly contribute to binding to G4 quadruplexes, although mutagenesis studies support that binding is related to a large loop adjacent to αM3 of only SUD‐M (Kusov et al., 2015; Tan et al., 2009). Our structures determined here, particularly a SARS‐CoV structure in an altered conformation compared to all prior structures, reveal that there is torsional twist in the orientation of the two domains driven by the flexibility of LoopN8 and the interdomain linker. The implication of these studies is that SUD‐N/M were previously considered a “core” of the three unique domains, but our studies support the hypothesis that SUD‐N and SUD‐M are independent as is SUD‐C.
Indeed, it is questionable whether, even in SARS‐CoV and related SARS‐like coronaviruses that also have both cysteines, disulfide bridges would occur within nsp3 in virus‐infected cells. Disulfide bonds do not form in the reducing environment of the cell cytosol into which nsp3 extends when it forms the crown of the pore in the DMV. This leads to speculation of whether this disulfide bond is an in vitro artifact of overexpression and purification from Escherichia coli. However, nsp3 does insert into endoplasmic reticulum (ER) membranes that ultimately form the DMV and thus these residues may access ER enzymes that introduce disulfide bonds at some point during DMV maturation (Malone et al., 2022). Indeed, recent data indicated the presence of antioxidant proteins in viral replication vesicles isolated from the Zika virus (Denolly et al., 2023). Since these vesicles likewise have an ER origin, a more reduced environment might modify nsp3 within the cell. Hence, for SARS‐CoV, this may be a functional stabilization during early establishment of the DMV but the bond may be broken in the cytoplasm‐exposed crown. This stabilization, however, would not occur for SARS‐CoV‐2 nsp3.
A practical application of the introduction of the disulfide bond into SARS‐CoV‐2 SUD‐N/M is that it may prove advantageous for advanced studies of SARS‐CoV‐2 nsp3 structure. Compilation of structural biology efforts has resulted in near full coverage of SARS‐CoV‐2 nsp3 with x‐ray structures of the independent domains. Recent work has described using cryo‐electron tomography for visualization of the DMV crown with a resolution of 20 Å into which structures of Mac2 and Mac3 can be fitted (Zimmermann et al., 2023). Notably, the structures fit to a bend in the crown structure. The lack of a C–C bond could introduce flexibility into this junction that could result in the motion of the knob of each crown point, which is comprised of Ubl2 and Mac1 domains. The introduction of a C–C bond into nsp3 might be sufficient to stabilize nsp3 for the purpose of improving subtomogram averaging to improve resolution, presenting an advantage for more advanced structural studies. Indeed, the introduction of the disulfide bond into an extended nsp3 comprising the entire N‐terminal domain from Ubl1 to the first transmembrane domain did result in a more stable recombinant protein that was more easily purified than the unmodified protein (Imhoff, 2023). This preliminary finding supports that the flexibility of nsp3 may hamper advanced structural and biophysical studies of the large protein that could be improved by the reintroduction of this bond found in some but not all SARS‐like coronaviruses.
4. METHODS
4.1. Protein expression and purification
The primary amino acid sequences of nsp3 corresponding to SUD2‐N/M and SUD2‐NC/MC were taken from the original WA1 sequence, residues 1231–1494 of polyprotein 1ab, which is equivalent to residues 413–678 of nsp3 from SARS‐CoV‐2. These residues (1231–1494 of polyprotein 1ab) are equivalent to SUD‐N/M residues 389–652 of nsp3 and 1207–1470 of the polyprotein from SARS‐CoV.
Nucleotide sequences encoding the proteins were codon‐optimized using GenSmart™ free software (GenScript), synthesized and cloned into the pMCSG53 vector (Twist Bioscience), and transformed into E. coli BL‐21(DE3) Magic cells. The transformed bacteria were cultured in Terrific Broth (SUD‐N/M) or Se‐Met medium (SUD2‐NC/MC), and protein expression was induced by the addition of 0.5 mM isopropyl‐β‐D‐1‐thiogalactopyranoside when cultures reached an optical density of ~1.6–1.8, determined from the absorbance at 600 nm. The cells were collected by centrifugation at 5500g for 10 min, and the pellets were resuspended in lysis buffer (50 mM Tris–HCl pH 8.3, 500 mM NaCl, 10% glycerol, and 0.1% IGEPAL) and frozen at −30°C until purification.
For protein purification, the frozen cells were thawed and sonicated at 45% intensity for 20 min, then treated with benzonase for 1 h at 4°C. The lysate was clarified by centrifugation at 30,000g for 40 min and loaded onto a His‐Trap FF [Ni–nitrilotriacetic acid (NTA)] column on a GE Healthcare ÅKTA Pure system in loading buffer (10 mM Tris–HCl (pH 8.3), 500 mM NaCl). The column was washed with loading buffer, followed by loading buffer with 25 mM imidazole. The protein was eluted with 10 mM Tris (pH 8.3), 500 mM NaCl, and 500 mM imidazole, loaded onto a Superdex 200 26/600 column run with loading buffer, collected, and incubated with tobacco etch virus (TEV) protease overnight. The cleaved tag and TEV protease were separated from the protein by Ni‐NTA affinity chromatography using loading buffer, and the protein was collected in the flow‐through. The protein was concentrated to 6.4–6.5 or 8.5 mg/mL and set up for crystallization immediately.
4.2. Crystallization
The proteins in 0.3 M NaCl, 0.1 M Tris pH 8.3 (6.4 mg/mL for SARS1‐N/M and 6.45 mg/mL for SARS2‐Nc/Mc) were set up for crystallization as 2 μL drops (1 μL protein:1 μL reservoir solution) in 96‐well plates (Corning) using the commercially available Classics II, PEGs II, AmSO4, Anions, and ComPAS Suites (Qiagen). Diffraction‐quality crystals were obtained for SARS1‐N/M in the Classics II screen (F6; 0.2 M ammonium sulfate, 0.1 M Bis‐Tris pH 5.5, 25% PEG 3350) and for SARS2‐Nc/Mc in the AmSO4 screen (A2; 0.2 M ammonium acetate, 2.2 M ammonium sulfate). The crystals were cryoprotected in 2.0 M lithium sulfate and flash‐frozen in liquid nitrogen for data collection.
4.3. Data collection and refinement
Diffraction data were collected at the Life Science Collaborative Access Team (LS‐CAT) at the Advanced Photon Source, Argonne National Laboratory. Data collection and refinement statistics are reported in Table 2. The data sets were processed and scaled with the HKL‐3000 suite (Minor et al., 2006). The structure was solved by molecular replacement with Phaser (McCoy, 2007) from the CCP4 suite using the crystal structure for SARS‐CoV (PDB accession code 2w2g) as a search model. The residues of the linker region were removed and the model was split into two rigid bodies (domains) to use the multiple‐body search algorithm in Phaser. The initial solution went through several rounds of refinement in REFMAC v5.8.0258 (Murshudov et al., 2011) and manual model correction using Coot (Emsley et al., 2010). Well‐defined residues of the linker region were rebuilt and the models were further refined in REFMAC. The water molecules were generated using ARP/wARP (Cohen et al., 2008), and ligands were added to the model manually during visual inspection in Coot. Translation‐Libration‐Screw (TLS) groups were created by the TLSMD server (Painter & Merritt, 2006) (http://skuld.bmsc.washington.edu/~tlsmd/), and TLS corrections were applied during the final stages of refinement. MolProbity (Chen et al., 2010) (http://molprobity.biochem.duke.edu/) was used for monitoring the quality of the model during refinement and for the final validation of the structure. The final models and diffraction data were deposited in the Protein Data Bank (https://www.rcsb.org/) with the assigned PDB accession codes 8ufl and 8ufm.
TABLE 2.
Structure determination data refinement.
| Data processing | PDB ID 8ufl | PDB ID 8ufm |
|---|---|---|
| Structure | SARS1‐N/M | SARS2‐Nc/Mc |
| Beamline | APS 21‐ID‐F | APS 21‐ID‐D |
| Wavelength (Å) | 0.97872 | 1.12704 |
| Resolution range (Å) | 30.00–2.50 | 30.00–1.65 |
| Space group | P2₁2₁2₁ | P3₁21 |
| Cell parameters (Å, °) | a = 67.70, b = 84.86, c = 93.60; α = 90.00, β = 90.00, γ = 90.00 | a = 86.27, b = 86.27, c = 76.67; α = 90.00, β = 90.00, γ = 120.00 |
| Unique reflections | 19,083 (908) | 40,007 (1971) |
| Multiplicity | 5.6 (5.6) | 18.8 (14.4) |
| Completeness (%) | 100.0 (100.0) | 99.9 (99.7) |
| Mean I/sigma(I) | 15.1 (2.2) | 30.1 (2.0) |
| Wilson B‐factor (Å²) | 41.1 | 26.8 |
| R‐merge | 0.112 (0.800) | 0.100 (1.625) |
| Rpim | 0.052 (0.370) | 0.024 (0.437) |
| CC1/2 | 0.97 (0.74) | 1.00 (0.77) |
| Refinement | | |
| Resolution range (Å) | 29.80–2.51 (2.57–2.51) | 28.26–1.65 (1.69–1.65) |
| Reflections work/test | 18,109 (1262)/928 (70) | 37,981 (2743)/1997 (153) |
| R work/R free | 0.214 (0.307)/0.267 (0.314) | 0.185 (0.324)/0.208 (0.342) |
| Total number of atoms | 4268 | 2209 |
| Macromolecule atoms | 4044 | 1921 |
| Ligand/solvent (H2O) | 66/125 | 42/208 |
| RMSD (bonds) (Å) | 0.004 | 0.006 |
| RMSD (angles) (°) | 1.369 | 1.454 |
| Ramachandran favored (%) | 95.0 | 98.0 |
| Ramachandran allowed (%) | 5.0 | 2.0 |
| Ramachandran outliers (%) | 0.0 | 0.0 |
| Rotamer outliers (%) | 0.2 | 0.0 |
| Clashscore | 6 | 1 |
| Average B‐factor (Å²) | 48.2 | 37.2 |
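As a quick consistency check on the cell parameters listed in Table 2, the unit‐cell volume can be computed from the general triclinic formula, which reduces to a·b·c for an orthorhombic cell and to a²·c·sin 120° for a trigonal cell. The following is a minimal sketch only; the function name `unit_cell_volume` is ours and is not part of any crystallographic package used in this work.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume (Å^3) from edge lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General triclinic expression; valid for any crystal system.
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

# Cell parameters as listed in Table 2 for the two deposited structures.
v_8ufl = unit_cell_volume(67.70, 84.86, 93.60, 90.0, 90.0, 90.0)
v_8ufm = unit_cell_volume(86.27, 86.27, 76.67, 90.0, 90.0, 120.0)

print(f"8ufl: {v_8ufl:.0f} Å^3")
print(f"8ufm: {v_8ufm:.0f} Å^3")
```

Using the general formula rather than a per‐system shortcut keeps one function valid for both cells.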
4.4. Structure analysis
All structure analysis was conducted using UCSF ChimeraX software, including the Matchmaker, Distances, and Structure Alignment tools (Pettersen et al., 2004), and PyMOL v. 3.03. AF2 models were generated in ChimeraX using its interface to ColabFold (Mirdita et al., 2022).
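The relative orientation of two domains after superposition is often summarized as a single rotation angle. As an illustration only (this is not necessarily how the comparisons in this study were made), the sketch below recovers that angle from a 3 × 3 rotation matrix, such as one reported by a superposition tool, using θ = arccos((tr R − 1)/2). The helper names and the 30° example matrix are hypothetical.

```python
import math

def rotation_angle_deg(R):
    """Rotation angle (degrees) encoded by a proper 3x3 rotation matrix R."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(deg):
    """Hypothetical example matrix: rotation by `deg` degrees about the z axis."""
    t = math.radians(deg)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t),  math.cos(t), 0.0],
            [0.0,          0.0,         1.0]]

angle = rotation_angle_deg(rotation_about_z(30.0))
print(f"rotation angle: {angle:.1f} degrees")
```

The angle is independent of the choice of rotation axis, so it can be compared across structure pairs.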
4.5. Amino acid sequence analysis
The NCBI database (ncbi.nlm.nih.gov) was queried using BLASTP with the SARS‐CoV‐2 SUD‐N/M protein sequence against the non‐redundant protein database, excluding SARS‐CoV‐2 and SARS‐CoV. SARS‐CoV‐2, SARS‐CoV, and Civet‐CoV sequences were manually added. A total of 242 retrieved sequences were annotated for isolate and species of origin based on the NCBI genome record. Duplicate sequences from the same isolate or 100% identical sequences isolated from the same species were removed from the analysis. A total of 120 non‐redundant sequences were identified, with NCBI locus tag, strain isolate name, isolate source, and references detailed in Supplemental File 1. The protein amino acid sequences were aligned using Clustal Omega Multiple Sequence Alignment, and a phylogenetic tree was generated using the default settings at the EMBL‐EBI Job Dispatcher (www.ebi.ac.uk; Madeira et al., 2024). The tree was annotated in iTOL v.6 (itol.embl.de; Letunic & Bork, 2024). Visualization of the alignment was generated from the Clustal Omega alignment using MacVector v. 18.2.5.
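The redundancy filter described above can be sketched as follows. This is an illustrative reading of the stated rule, not the procedure actually used: the `Record` fields, the function name, and the toy records are all hypothetical, and the example sequences are placeholders rather than real data.

```python
from typing import NamedTuple

class Record(NamedTuple):
    isolate: str
    species: str
    sequence: str

def deduplicate(records):
    """Drop duplicate sequences from the same isolate and
    100% identical sequences from the same species."""
    seen_isolate_seq = set()
    seen_species_seq = set()
    kept = []
    for rec in records:
        by_isolate = (rec.isolate, rec.sequence)
        by_species = (rec.species, rec.sequence)
        if by_isolate in seen_isolate_seq or by_species in seen_species_seq:
            continue  # redundant under one of the two rules
        seen_isolate_seq.add(by_isolate)
        seen_species_seq.add(by_species)
        kept.append(rec)
    return kept

# Hypothetical toy records; not real isolates or sequences.
records = [
    Record("isolate-A", "bat sp. 1", "MKCLV"),
    Record("isolate-A", "bat sp. 1", "MKCLV"),  # same isolate, same sequence
    Record("isolate-B", "bat sp. 1", "MKCLV"),  # identical within a species
    Record("isolate-C", "pangolin", "MKCLV"),   # identical, different species
    Record("isolate-D", "bat sp. 1", "MKSLV"),  # distinct sequence
]
kept = deduplicate(records)
print([r.isolate for r in kept])
```

Identical sequences from different host species are retained, because the rule as stated removes identity only within a species.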
4.6. Zone‐interference gel electrophoresis (ZIGE)
Zone‐interference gel electrophoresis is used to measure weak protein–ligand interactions (Abrahams et al., 1988) by generating a system in which dissociation is permanently counteracted by association under rapid dynamic equilibrium conditions and in which maximum binding is observed as greater migration. This method was used previously to measure binding of SUD‐N/M from SARS‐CoV to G‐quadruplexes of diverse sequences, including the BCL‐2 promoter (5′‐GGGCGCGGGAGGAATTGGGCGGG‐3′) (Tan et al., 2007). Here we adapted the method reported by Tan et al. (2007) to measure the binding of SUD‐N/M from SARS‐CoV and SARS‐CoV‐2 using a custom 3D‐printed comb. Briefly, 10 μM of each protein was incubated with 0–40 μM of the folded BCL‐2 promoter in the presence of 100 mM KCl at room temperature for 30 min. The 1% agarose gel was loaded in the long slots with 100 μL of a mixture containing 0–40 μM of the ligand in TBE (20 mM Tris, 50 mM boric acid, 0.1 mM ethylenediaminetetraacetic acid (EDTA), pH 8.3), 1% dimethylsulfoxide (DMSO), and 0.01% bromophenol blue (BPB), and run at 100 mA for 1 min or until the dye front reached the small slot. Then, 10 μL of a mixture containing 10 μM of protein with increasing concentrations of DNA, as indicated above, was mixed with 1% DMSO and 0.01% BPB and loaded in the small slots, and electrophoresis was continued in TBE for 1 h. The gel was fixed in 40% acetic acid and 30% ethanol for 2 h, stained with 0.2% Coomassie blue for 1 h, destained with 10% acetic acid and 50% methanol until the background washed out, and then imaged. The percentage of binding was calculated from the distance of maximum protein migration (pm) from the small well toward the front of the agarose gel, taking the migration distance at the maximum ligand concentration (40 μM; mpm) as 100%, using the equation: % binding = (pm/mpm) × 100. The Kd was determined by non‐linear fit, one‐site binding, using Prism (GraphPad V10).
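To make the fitting step concrete, the sketch below converts migration distances to percent binding (taken here as pm/mpm × 100, which is our reading of the definitions above) and fits the standard one‐site binding model, binding = Bmax·[L]/(Kd + [L]). The fits in this study were done in Prism; this pure‐Python least‐squares routine is a stand‐in for illustration. The ligand concentrations within the 0–40 μM range, the migration distances, and the simulated Kd of 8 μM are hypothetical, noise‐free values chosen so the fit can be checked against a known answer.

```python
import math

def percent_binding(pm, mpm):
    """Percent binding from migration distance pm, with mpm defined as 100%."""
    return 100.0 * pm / mpm

def fit_one_site(ligand, binding, kd_min=0.01, kd_max=1000.0):
    """Least-squares fit of binding = Bmax * L / (Kd + L); returns (Bmax, Kd)."""
    def profile(kd):
        # For a fixed Kd, the optimal Bmax has a closed form (linear least squares).
        x = [l / (kd + l) for l in ligand]
        bmax = sum(xi * yi for xi, yi in zip(x, binding)) / sum(xi * xi for xi in x)
        sse = sum((yi - bmax * xi) ** 2 for xi, yi in zip(x, binding))
        return sse, bmax

    # Coarse scan on a logarithmic Kd grid to bracket the minimum.
    n = 400
    grid = [kd_min * (kd_max / kd_min) ** (i / (n - 1)) for i in range(n)]
    best = min(range(n), key=lambda i: profile(grid[i])[0])
    lo = math.log(grid[max(best - 1, 0)])
    hi = math.log(grid[min(best + 1, n - 1)])

    # Golden-section refinement of log(Kd) inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if profile(math.exp(m1))[0] < profile(math.exp(m2))[0]:
            hi = m2
        else:
            lo = m1
    kd = math.exp((lo + hi) / 2.0)
    return profile(kd)[1], kd

# Hypothetical, noise-free migration distances simulated with Kd = 8 µM.
ligand_uM = [0.0, 2.5, 5.0, 10.0, 20.0, 40.0]
true_kd = 8.0
pm = [30.0 * l / (true_kd + l) for l in ligand_uM]  # arbitrary distance units
mpm = pm[-1]                                         # migration at 40 µM = 100%
binding = [percent_binding(p, mpm) for p in pm]

bmax_fit, kd_fit = fit_one_site(ligand_uM, binding)
print(f"Kd = {kd_fit:.2f} µM, Bmax = {bmax_fit:.1f}%")
```

Because binding is normalized to the 40 μM point rather than to true saturation, the fitted Bmax can exceed 100%, while the Kd is unaffected by the normalization.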
4.7. Electrophoretic mobility shift assays (EMSA)
The cDNA of SARS‐CoV‐2 was generated using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Waltham, MA, USA) from the MN908947.3 synthetic SARS‐CoV‐2 RNA (Twist Bioscience, South San Francisco, CA, USA). The DNA of the 5′‐UTR region (1–245 bp) was amplified by PCR to include the T7 promoter with primers 5′‐TAATACGACTCACTATAGGGATTAAAGGTTTATACCTTCC‐3′ (forward) and 5′‐GGACGAAACCTAGATGTGCTGATGATCG‐3′ (reverse). The DNA of the region downstream of the 5′‐UTR (301–545 bp) was amplified by PCR to include the T7 promoter with primers 5′‐TAATACGACTCACTATAGGGACACGTCCAACTCAGTTTG‐3′ (forward) and 5′‐CTTCGAGTTCTGCTACCAGCTCAACCATAACATGAC‐3′ (reverse). Substrate ssRNA was transcribed using the HiScribe T7 High Yield RNA Synthesis Kit (New England BioLabs, Ipswich, MA, USA), [32P]‐labeled at the 5′ end using T4 polynucleotide kinase (New England BioLabs), and purified as previously described (Beloglazova et al., 2011). The RNA binding assays were performed using 50 mM Tris–HCl (pH 8.0), 150 mM NaCl, 5 mM CaCl2, 1 mM dithiothreitol (DTT), 20 U RNaseOUT (Invitrogen), and 8 nM 5′‐[32P]‐labeled RNA substrate. Reactions were incubated for 1 h at 37°C, quenched by the addition of glycerol loading dye, and separated on 6% native polyacrylamide gels. The gels were visualized using a Phosphoimager, and the percentage of bound substrate was quantified using ImageLab software (Bio‐Rad). To determine the Kd, the percentage of bound substrate was plotted against total protein concentration, and the curves were generated using non‐linear regression fit in Prism software (GraphPad).
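The primer design described above can be checked programmatically. The sketch below confirms that each forward primer begins with the T7 promoter sequence, separates the template‐specific portion, and reverse‐complements the reverse primers to give the sense‐strand sequence they anneal to. The primer sequences and coordinates are those given in the text; the dictionary layout, function names, and the choice to treat the first 20 nt (promoter plus GGG) as the T7 element are our own assumptions for illustration.

```python
# T7 promoter with the GGG start, as it appears at the 5' end of both forward primers.
T7_PROMOTER = "TAATACGACTCACTATAGGG"

PRIMERS = {
    "5'-UTR (1-245)": {
        "forward": "TAATACGACTCACTATAGGGATTAAAGGTTTATACCTTCC",
        "reverse": "GGACGAAACCTAGATGTGCTGATGATCG",
        "start": 1, "end": 245,
    },
    "downstream (301-545)": {
        "forward": "TAATACGACTCACTATAGGGACACGTCCAACTCAGTTTG",
        "reverse": "CTTCGAGTTCTGCTACCAGCTCAACCATAACATGAC",
        "start": 301, "end": 545,
    },
}

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))

def summarize(entry):
    """Split a forward primer into T7 and template-specific parts."""
    fwd = entry["forward"]
    has_t7 = fwd.startswith(T7_PROMOTER)
    return {
        "has_t7": has_t7,
        "template_specific": fwd[len(T7_PROMOTER):] if has_t7 else fwd,
        "reverse_target_sense": reverse_complement(entry["reverse"]),
        # Length of the genomic region from the stated coordinates (T7 excluded).
        "region_length": entry["end"] - entry["start"] + 1,
    }

summary = {name: summarize(entry) for name, entry in PRIMERS.items()}
for name, info in summary.items():
    print(name, info)
```

Both regions span 245 nt by the stated coordinates, so the two RNA substrates are matched in length before the T7‐derived nucleotides are counted.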
4.8. SARS‐CoV‐2 replicon assay
The SARS‐CoV‐2 replicon assay was conducted as described previously (Taha, Chen, et al., 2023). Briefly, the pBAC SARS‐CoV‐2 ΔSpike WT or nsp3 amino acid‐modified plasmids (1 μg) were transfected into BHK‐21 cells along with N R203M and S Delta variant expression vectors (0.5 μg each) in a 24‐well tissue culture dish. The culture medium was replaced with fresh growth medium 12–16 h post‐transfection. The medium containing single‐round infectious particles was collected and 0.45 μm‐filtered 72 h post‐transfection. The filtered medium containing single‐round infectious particles was added to 2 × 10⁴ VeroE6 cells stably expressing ACE2 and TMPRSS2 (Vero+AT), HEK293T cells stably expressing ACE2 and TMPRSS2 (293T+AT), and BHK‐21 cells stably expressing ACE2 and TMPRSS2 (BHK21+AT) in a 96‐well plate. The cells were washed with 200 μL culture medium, and 100 μL culture medium was added 12–24 h post‐infection. To measure luciferase activity, 50 μL of supernatant from infected cells was mixed with an equal volume of Nano‐Glo luciferase assay buffer and substrate and analyzed on an Infinite M Plex plate reader (Tecan).
AUTHOR CONTRIBUTIONS
Monica Rosas‐Lemus: Conceptualization; investigation; writing – original draft; writing – review and editing; formal analysis. George Minasov: Investigation; visualization; writing – review and editing; formal analysis; data curation. Joseph S. Brunzelle: Investigation; visualization; formal analysis; data curation. Taha Y. Taha: Investigation; writing – review and editing; formal analysis; supervision. Sofia Lemak: Investigation. Shaohui Yin: Investigation. Ludmilla Shuvalova: Investigation. Julia Rosecrans: Investigation. Kanika Khanna: Investigation. H. Steven Seifert: Resources; conceptualization. Alexei Savchenko: Resources; conceptualization; funding acquisition. Peter J. Stogios: Investigation; resources; funding acquisition; supervision. Melanie Ott: Funding acquisition; supervision. Karla J. F. Satchell: Conceptualization; funding acquisition; writing – original draft; writing – review and editing; visualization; supervision; formal analysis; project administration.
CONFLICT OF INTEREST STATEMENT
K. J. F. S. has a significant financial interest in Situ Biosciences, a contract research organization that conducts research unrelated to this work. T. Y. T. and M. O. are listed as inventors on a patent application filed by the Gladstone Institutes that covers the use of pGLUE to generate SARS‐CoV‐2 infectious clones and replicons. All other authors declare no conflicts of interest.
Supporting information
Supplemental File 1. XLS of 120 sequences used for phylogenetic analysis including NCBI Locus Tag, Strain isolate, Isolate source, and study references.
ACKNOWLEDGMENTS
The authors thank A. Creanga and B. Graham for the Vero cells overexpressing human ACE2 and TMPRSS2. This project was supported by HHS/NIH/NIAID contract 75N93022C00035 (to K. J. F. S. and A. S.) and NIH grants R37 AI033493 and R01 AI146073 (to H. S.) and U19 AI135990 (to M. O.). The project was further supported by funding from the University of Toronto COVID‐19 Action Initiative (to P. S.), the Roddenberry Foundation, P. and E. Taft, and the Pendleton Foundation (to M. O.). M. O. is a Chan Zuckerberg Biohub – San Francisco Investigator. M. R. L. is a mentored Principal Investigator at the Autophagy, Inflammation and Metabolism Center of Biomedical Research Excellence (NIH: P20GM121176). The research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE‐AC02‐06CH11357. Use of the LS‐CAT Sector 21 was supported by the Michigan Economic Development Corporation and the Michigan Technology Tri‐Corridor (Grant 085P1000817).
Rosas‐Lemus M, Minasov G, Brunzelle JS, Taha TY, Lemak S, Yin S, et al. Torsional twist of the SARS‐CoV and SARS‐CoV‐2 SUD‐N and SUD‐M domains. Protein Science. 2025;34(3):e70050. 10.1002/pro.70050
Review Editor: John Kuriyan
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in the Protein Data Bank at https://www.rcsb.org, reference numbers 8ufl and 8ufm.
REFERENCES
- Abrahams JP, Kraal B, Bosch L. Zone‐interference gel electrophoresis: a new method for studying weak protein‐nucleic acid complexes under native equilibrium conditions. Nucleic Acids Res. 1988;16:10099–10108.
- Beloglazova N, Petit P, Flick R, Brown G, Savchenko A, Yakunin AF. Structure and activity of the Cas3 HD nuclease MJ0384, an effector enzyme of the CRISPR interference. EMBO J. 2011;30:4616–4627.
- CDC. Revised U.S. surveillance case definition for severe acute respiratory syndrome (SARS) and update on SARS cases‐‐United States and worldwide, December 2003. MMWR Morb Mortal Wkly Rep. 2003;52:1202–1206.
- Chatterjee A, Johnson MA, Serrano P, Pedrini B, Joseph JS, Neuman BW, et al. Nuclear magnetic resonance structure shows that the severe acute respiratory syndrome coronavirus‐unique domain contains a macrodomain fold. J Virol. 2009;83:1823–1836.
- Chen VB, Arendall WB 3rd, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, et al. Molprobity: all‐atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr. 2010;66:12–21.
- Cohen SX, Ben Jelloul M, Long F, Vagin A, Knipscheer P, Lebbink J, et al. Arp/warp and molecular replacement: the next generation. Acta Crystallogr D Biol Crystallogr. 2008;64:49–60.
- Denolly S, Stukalov A, Barayeu U, Rosinski AN, Kritsiligkou P, Joecks S, et al. Zika virus remodelled er membranes contain proviral factors involved in redox and methylation pathways. Nat Commun. 2023;14:8045.
- Dong E, Du H, Gardner L. An interactive web‐based dashboard to track covid‐19 in real time. Lancet Infect Dis. 2020;20:533–534.
- Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of coot. Acta Crystallogr D Biol Crystallogr. 2010;66:486–501.
- Fouchier RA, Kuiken T, Schutten M, van Amerongen G, van Doornum GJ, van den Hoogen BG, et al. Aetiology: koch's postulates fulfilled for SARS virus. Nature. 2003;423:240.
- Imhoff MEC. Functional and structural studies of the papain‐like protease encoded in coronavirus non‐structural protein 3. Thesis. West Lafayette, IN: Purdue University Graduate School; 2023. 10.25394/PGS.22688422.v1
- Jaroszewski L, Iyer M, Alisoltani A, Sedova M, Godzik A. The interplay of SARS‐CoV‐2 evolution and constraints imposed by the structure and functionality of its proteins. PLoS Comput Biol. 2021;17:e1009147.
- Johnson MA, Chatterjee A, Neuman BW, Wuthrich K. SARS coronavirus unique domain: three‐domain molecular architecture in solution and RNA binding. J Mol Biol. 2010;400:724–742.
- Klein S, Cortese M, Winter SL, Wachsmuth‐Melm M, Neufeldt CJ, Cerikan B, et al. SARS‐CoV‐2 structure and replication characterized by in situ cryo‐electron tomography. Nat Commun. 2020;11:5885.
- Kusov Y, Tan J, Alvarez E, Enjuanes L, Hilgenfeld R. A g‐quadruplex‐binding macrodomain within the "SARS‐unique domain" is essential for the activity of the SARS‐coronavirus replication‐transcription complex. Virology. 2015;484:313–322.
- Lavigne M, Helynck O, Rigolet P, Boudria‐Souilah R, Nowakowski M, Baron B, et al. SARS‐CoV‐2 nsp3 unique domain sud interacts with guanine quadruplexes and g4‐ligands inhibit this interaction. Nucleic Acids Res. 2021;49:7695–7712.
- Lei J, Kusov Y, Hilgenfeld R. Nsp3 of coronaviruses: structures and functions of a large multi‐domain protein. Antivir Res. 2018;149:58–74.
- Lei J, Ma‐Lauer Y, Han Y, Thoms M, Buschauer R, Jores J, et al. The SARS‐unique domain (sud) of SARS‐CoV and SARS‐CoV‐2 interacts with human paip1 to enhance viral rna translation. EMBO J. 2021;40:e102277.
- Lemak S, Skarina T, Flick R, Patel DT, Stogios PJ, Savchenko A. Structural and functional analyses of SARS‐CoV‐2 nsp3 and its specific interactions with the 5′ UTR of the viral genome. BioRxiv. 2024. 10.1101/2024.05.09.593331
- Letunic I, Bork P. Interactive tree of life (iTOL) v6: recent updates to the phylogenetic tree display and annotation tool. Nucleic Acids Res. 2024;52:W78–W82.
- Li P, Xue B, Schnicker NJ, Wong LR, Meyerholz DK, Perlman S. Nsp3‐n interactions are critical for SARS‐CoV‐2 fitness and virulence. Proc Natl Acad Sci U S A. 2023;120:e2305674120.
- Li Y, Yu Q, Huang R, Chen H, Ren H, Ma L, et al. SARS‐CoV‐2 sud2 and nsp5 conspire to boost apoptosis of respiratory epithelial cells via an augmented interaction with the G‐quadruplex of BclII. MBio. 2023;14:e0335922.
- Lipps HJ, Rhodes D. G‐quadruplex structures: in vivo evidence and function. Trends Cell Biol. 2009;19:414–422.
- Madeira F, Madhusoodanan N, Lee J, Eusebi A, Niewielska A, Tivey ARN, et al. The EMBL‐EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res. 2024;52:W521–W525.
- Malone B, Urakova N, Snijder EJ, Campbell EA. Structures and functions of coronavirus replication‐transcription complexes and their relevance for SARS‐CoV‐2 drug design. Nat Rev Mol Cell Biol. 2022;23:21–39.
- McCoy AJ. Solving structures of protein complexes by molecular replacement with phaser. Acta Crystallogr D Biol Crystallogr. 2007;63:32–41.
- Minor W, Cymborowski M, Otwinowski Z, Chruszcz M. Hkl‐3000: the integration of data reduction and structure solution‐‐from diffraction images to an initial model in minutes. Acta Crystallogr D Biol Crystallogr. 2006;62:859–866.
- Mirdita M, Schutze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. Colabfold: making protein folding accessible to all. Nat Methods. 2022;19:679–682.
- Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, et al. Refmac5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr. 2011;67:355–367.
- Neuman BW. Bioinformatics and functional analyses of coronavirus nonstructural proteins involved in the formation of replicative organelles. Antivir Res. 2016;135:97–107.
- Painter J, Merritt EA. Tlsmd web server for the generation of multi‐group tls models. J Appl Cryst. 2006;39:109–111.
- Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. Ucsf chimera—a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–1612.
- Qin B, Li Z, Tang K, Wang T, Xie Y, Aumonier S, et al. Identification of the SARS‐unique domain of SARS‐CoV‐2 as an antiviral target. Nat Commun. 2023;14:3999.
- Roingeard P, Eymieux S, Burlaud‐Gaillard J, Hourioux C, Patient R, Blanchard E. The double‐membrane vesicle (DMV): a virus‐induced organelle dedicated to the replication of SARS‐CoV‐2 and other positive‐sense single‐stranded RNA viruses. Cell Mol Life Sci. 2022;79:425.
- Sedova M, Jaroszewski L, Alisoltani A, Godzik A. Coronavirus3d: 3d structural visualization of covid‐19 genomic divergence. Bioinformatics. 2020;36:4360–4362.
- Snijder EJ, Bredenbeek PJ, Dobbe JC, Thiel V, Ziebuhr J, Poon LL, et al. Unique and conserved features of genome and proteome of SARS‐coronavirus, an early split‐off from the coronavirus group 2 lineage. J Mol Biol. 2003;331:991–1004.
- Taha TY, Chen IP, Hayashi JM, Tabata T, Walcott K, Kimmerly GR, et al. Rapid assembly of SARS‐CoV‐2 genomes reveals attenuation of the omicron ba.1 variant through nsp6. Nat Commun. 2023;14:2308.
- Taha TY, Suryawanshi RK, Chen IP, Correy GJ, McCavitt‐Malvido M, O'Leary PC, et al. A single inactivating amino acid change in the SARS‐CoV‐2 nsp3 MAC1 domain attenuates viral replication in vivo. PLoS Pathog. 2023;19:e1011614.
- Tan J, Kusov Y, Mutschall D, Tech S, Nagarajan K, Hilgenfeld R, et al. The "SARS‐unique domain" (sud) of SARS coronavirus is an oligo(g)‐binding protein. Biochem Biophys Res Commun. 2007;364:877–882.
- Tan J, Vonrhein C, Smart OS, Bricogne G, Bollati M, Kusov Y, et al. The SARS‐unique domain (sud) of SARS coronavirus contains two macrodomains that bind g‐quadruplexes. PLoS Pathog. 2009;5:e1000428.
- Wolff G, Limpens R, Zevenhoven‐Dobbe JC, Laugks U, Zheng S, de Jong AWM, et al. A molecular pore spans the double membrane of the coronavirus replication organelle. Science. 2020;369:1395–1398.
- Wu F, Zhao S, Yu B, Chen Y, Wang W, Song Z, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–269.
- Xu J, Zhao S, Teng T, Abdalla AE, Zhu W, Xie L, et al. Systematic comparison of two animal‐to‐human transmitted human coronaviruses: SARS‐CoV‐2 and SARS‐CoV. Viruses. 2020;12:244.
- Zhou P, Yang X, Wang X, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273.
- Zimmermann L, Zhao X, Makroczyova J, Wachsmuth‐Melm M, Prasad V, Hensel Z, et al. SARS‐CoV‐2 nsp3 and nsp4 are minimal constituents of a pore spanning replication organelle. Nat Commun. 2023;14:7894.
