The Journal of Biological Chemistry. 2024 May 7;300(6):107354. doi: 10.1016/j.jbc.2024.107354

Phosphorylation in the Ser/Arg-rich region of the nucleocapsid of SARS-CoV-2 regulates phase separation by inhibiting self-association of a distant helix

Hannah Stuwe 1, Patrick N Reardon 2, Zhen Yu 1, Sahana Shah 1, Kaitlyn Hughes 1, Elisar J Barbar 1
PMCID: PMC11180338  PMID: 38718862

Abstract

The nucleocapsid protein (N) of SARS-CoV-2 is essential for virus replication, genome packaging, evading host immunity, and virus maturation. N is a multidomain protein composed of an independently folded monomeric N-terminal domain that is the primary site for RNA binding and a dimeric C-terminal domain that is essential for efficient phase separation and condensate formation with RNA. The domains are separated by a disordered Ser/Arg-rich region preceding a self-associating Leu-rich helix. Phosphorylation in the Ser/Arg region in infected cells decreases the viscosity of N:RNA condensates promoting viral replication and host immune evasion. The molecular level effect of phosphorylation, however, is missing from our current understanding. Using NMR spectroscopy and analytical ultracentrifugation, we show that phosphorylation destabilizes the self-associating Leu-rich helix 30 amino-acids distant from the phosphorylation site. NMR and gel shift assays demonstrate that RNA binding by the linker is dampened by phosphorylation, whereas RNA binding to the full-length protein is not significantly affected presumably due to retained strong interactions with the primary RNA-binding domain. Introducing a switchable self-associating domain to replace the Leu-rich helix confirms the importance of linker self-association to droplet formation and suggests that phosphorylation not only increases solubility of the positively charged elongated Ser/Arg region as observed in other RNA-binding proteins but can also inhibit self-association of the Leu-rich helix. These data highlight the effect of phosphorylation both at local sites and at a distant self-associating hydrophobic helix in regulating liquid–liquid phase separation of the entire protein.

Keywords: SARS-CoV-2, phosphorylation, AUC, NMR, LLPS, protein RNA interactions


SARS-CoV-2 is an enveloped, single-stranded, positive-sense RNA virus with a large 30 kb genome (1). The SARS-CoV-2 virion is composed of four structural proteins: spike (S), membrane (M), envelope (E), and nucleocapsid (N). The viral membrane, with S, M, and E, surrounds the helical nucleocapsid containing the viral RNA genome encapsulated by N (1, 2, 3). N has several functions within the viral life cycle but is primarily involved in protecting the viral RNA genome by binding, condensing, and packaging it within the virion (2, 4). N also functions in the replicase transcriptase complexes where it mediates the synthesis of genomic RNA (gRNA) within the host cell (5, 6, 7). Additionally, N contributes to innate immune evasion via sequestering stress granule protein G3BP1 (8, 9). N’s ability to phase separate, particularly with gRNA, has been extensively investigated (3, 4, 10, 11, 12, 13, 14, 15, 16, 17, 18), and this process is what allows N to participate in a vast and varied set of functions (12, 19). These N-containing condensates are also a promising target for drug development (20).

N is a 419-amino-acid protein with two independently folded domains, the N-terminal domain (NTD) and the C-terminal domain (CTD) (Fig. 1) (21). These domains are flanked by two disordered tails and separated by a central disordered linker that contains a serine/arginine rich (SR-rich) region, followed by a short Leu-rich helix (LRH) (Fig. 1). The CTD forms a domain-swapped dimer, facilitating the dimerization of N (21, 22, 23), and is essential for phase separation with RNA (24), while the NTD is the primary RNA-binding domain (25, 26, 27, 28) and its binding to RNA is strengthened in constructs that contain the intrinsically disordered N-terminal and central regions (29, 30). Upon binding to RNA, the N protein packs the NTD and the CTD closer together (31). Multiple studies using hydrogen-deuterium exchange mass spectrometry and analytical ultracentrifugation have shown that the central linker self-associates (32, 33), and molecular dynamics modeling has predicted the helical LRH region as the most likely site of self-association (33, 34, 35).

Figure 1.

Domain maps of SARS-CoV-2 N and constructs used in this work. A, amino acid sequence of the SARS-CoV-2 N construct spanning the linker region from residues 175 to 245. Residual TEV protease cleavage sites are in orange. The SR-rich region is in bold black, with phosphorylation sites investigated here in bold red. The Leu-rich helix (LRH) spanning residues 216 to 232 is in bold blue. B, domain maps of full-length (FL-N), mutant full-length (muFL-N), shorter constructs N175–365, N175–245, and GFP-tagged N175–245 from top to bottom, respectively. Intrinsically disordered regions are represented by black lines, the structured RNA-binding domain NTD is represented by a dark green rectangle, and the dimerization domain CTD by a dark green oval. The LRH is represented by a small blue rectangle. GSK priming site pSer188 is pointed to by an arrow, and the subsequent phosphorylation sites are indicated by red circles. The muFL-N has a TQT recognition motif for LC8 binding incorporated in place of the LRH, shown as a gray rectangle. C, a cartoon depiction of dimeric FL-N showing self-association of the LRH and the RNA-binding site of the NTD. CTD, C-terminal domain; GSK, glycogen synthase kinase; LC8, dynein light chain 8; LRH, Leu-rich helix; NTD, N-terminal domain; SR-rich, serine/arginine-rich.

The dual roles of N in the assembly with gRNA into viral RNA-protein complexes and in localization to the replicase transcriptase complexes to enhance viral replication and transcription are regulated by phosphorylation of the SR-rich region of N (36, 37, 38, 39). Phosphorylation of the homologous SARS-CoV-1 SR-rich region did not significantly affect RNA binding of the nucleocapsid protein but impaired its ability to form oligomers (40). In SARS-CoV-2, phosphorylation of N altered liquid-liquid phase separation (LLPS), with unmodified protein forming gel-like condensates and phosphorylated protein forming more dynamic liquid-like droplets (19). In a follow-up study, the Morgan group showed that assembly into viral particles requires the LRH and, using a phosphomimetic SR region, demonstrated that phosphorylation inhibits viral ribonucleoprotein (vRNP) assembly (41). The phosphorylation-mediated differences in LLPS characteristics are hypothesized to contribute to N switching between functions, with liquid-like phosphorylated N droplets potentially promoting viral replication and host immune evasion and gel-like unphosphorylated droplets promoting viral RNA packaging (10, 11). Overall, these observations demonstrate the critical role of phosphorylation in regulating the function of N during the virus life cycle. It is notable that the SR-rich region in the linker can have up to 15 phosphorylation sites depending on cell type (42), which will presumably alter the structure and interactions associated with the highly positive charge pattern of the unphosphorylated protein.

Here, we systematically probe the effect of a single phosphorylation event at serine 188, a known priming site of phosphorylation (42), and of hyperphosphorylation via glycogen synthase kinase 3β (GSK-3), on the structure of the linker domain, the self-association of the LRH, and the interactions of the full-length protein with gRNA, including its ability to phase separate. Our structural and functional characterizations demonstrate the importance of phosphorylation in affecting not only the local environment but also in inhibiting the self-association of the LRH distant from the site of phosphorylation. We expect that phosphorylation at all potential sites will cause an even more profound change. We present a model that explains how phosphorylation could alter the structure of N in condensates and force unpackaging of viral RNA.

Results

Association of the central linker is localized to helical residues 216 to 232

A suite of BEST triple resonance NMR experiments helped assign the backbone resonances of N175–245 at 10 °C (Fig. 2), including the 225 to 230 region missing in the assignments deposited in the BMRB (43). The 15N HSQC spectrum is consistent with a primarily disordered polypeptide, with generally narrow chemical shift dispersion. Fast time scale dynamics determined by measuring nuclear spin relaxation 15N R1, 15N R2, and {1H}-15N Nuclear Overhauser Effect (NOE) identified residues 178 to 215 and 236 to 248 to be fully disordered, with NOEs below 0.5 and negative NOEs at the termini (Fig. 2). 15N R2 and R1 relaxation rates in these regions were relatively uniform, while 15N R2 and {1H}-15N NOE were elevated for residues 216 to 232. Together with the carbon chemical shifts (Fig. S1), these data confirm that residues 216 to 232 adopt a helical structure and that, except for this short helix, N175–245 does not adopt any regular secondary structure under our experimental conditions.

Figure 2.

NMR structural characterization of WT N175–245 and pSer188 N175–245. A, 15N-HSQC spectrum of 150 μM WT N175–245 at 10 °C with backbone resonance assignments. B, overlay of 15N-HSQC spectrum of WT N175–245 (black) and pSer188 N175–245 (red), with shifted peaks labeled. C, 15N nuclear spin relaxation of WT N175–245 (black) compared to pSer188 N175–245 (red). The Leu-rich helix is indicated.

As stated above, missing peaks for the deposited assignments were attributed to self-association of the central linker (43). Consistent with this interpretation, the peaks assigned at lower concentrations of 15N-labeled N175–245 (Fig. 3) correspond to the helical region and thus identify self-association to be localized to the helical residues. At 300 μM, these peaks disappeared from the spectrum (Fig. 3).

Figure 3.

Concentration dependence of self-association of WT and pSer188 N175–245. A, 15N-HSQC spectra of WT N175–245 (left) at 300 μM (red) and 100 μM (black) concentration. Resonances that disappear at higher concentration are labeled. 15N-HSQC spectra of pSer188 N175–245 (right) at 300 μM (red) and 200 μM (black) concentration. The same resonances are labeled. Missing peaks for residues 233 to 235 are due to ordered structure in proximity of the self-associating helix. B, sedimentation velocity analytical ultracentrifugation (SV-AUC) of WT-N175–245 GFP (left) at 50 to 200 μM concentration and SV-AUC of pSer188 N175–245 GFP (right) at 100 and 200 μM concentration. C, representative sedimentation equilibrium data at three rotor speeds for WT-N175–245 GFP (left) and for pSer188 N175–245 GFP (right). Monomer-dimer equilibrium fits are shown as solid lines. All AUC experiments with N175–245 were performed with GFP fusion.

To confirm that peak attenuation is in fact due to self-association and to evaluate the strength of self-association, we performed sedimentation velocity analytical ultracentrifugation (SV-AUC) experiments on GFP-tagged N175–245 (WT-N175–245 GFP). The GFP tag was used to provide a stronger UV absorbance. The SV-AUC profiles clearly show two peaks in the WT-N175–245 GFP at 200 μM, corresponding to ∼3.1S and 4.8S, suggesting the presence of monomeric and dimeric species (Fig. 3). The GFP control at the same concentration only gave rise to a single peak, indicating no significant dimerization under these conditions (Fig. S2). Subsequent sedimentation equilibrium analysis (SE-AUC) gave a mass average molecular weight for N175–245 GFP of ∼66 kDa, near the expected molecular weight of 72 kDa for a dimer (Fig. 3), and a fit to a monomer-dimer equilibrium model yielded a dissociation constant of 77 μM. Together, these data indicate that the LRH can self-associate but only weakly in the linker construct.
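
To make the reported dissociation constant more concrete, the following is a minimal sketch of how a simple monomer-dimer equilibrium partitions protein chains at a given total concentration. It assumes an ideal two-state model (2 M ⇌ D, Kd = [M]²/[D]) with concentrations expressed per chain; the function name and the concentrations looped over are illustrative choices, not part of the analysis reported here, and the sketch does not reproduce the SE-AUC fit itself.

```python
import math


def monomer_dimer_fractions(total_uM: float, kd_uM: float) -> tuple[float, float]:
    """Fractions of protein chains present as monomer and as dimer.

    Model (an idealized assumption): 2 M <-> D,
    with Kd = [M]^2 / [D] and total chains = [M] + 2[D].
    """
    if total_uM <= 0 or kd_uM <= 0:
        raise ValueError("concentration and Kd must be positive")
    # Substituting [D] = [M]^2 / Kd into the mass balance gives
    # 2[M]^2 / Kd + [M] - total = 0; take the positive root.
    monomer_uM = kd_uM * (math.sqrt(1.0 + 8.0 * total_uM / kd_uM) - 1.0) / 4.0
    frac_monomer = monomer_uM / total_uM
    return frac_monomer, 1.0 - frac_monomer


# Kd = 77 uM is the value quoted in the text; the total concentrations
# are illustrative and span the 50 to 200 uM SV-AUC range.
for total in (50.0, 100.0, 200.0):
    f_mono, f_dimer = monomer_dimer_fractions(total, 77.0)
    print(f"{total:5.0f} uM total: {f_mono:.2f} monomer, {f_dimer:.2f} dimer")
```

Under these assumptions both species remain populated across the whole range, which is qualitatively in line with the two peaks seen by SV-AUC at 200 μM.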

Self-association of LRH is strengthened in context of dimeric CTD

To probe the behavior of the LRH in the context of the full-length protein (FL-N), we collected a series of 15N HSQC experiments at decreasing concentrations of a 15N-labeled construct of N including the central linker and the CTD, spanning residues 175 to 365 (N175–365), as a more tractable model of dimeric FL-N. For concentrations of N175–365 as low as 10 μM, which is still considerably above the 1 μM dimer dissociation constant of the CTD without the LRH (22) but significantly lower than the 77 μM dimer dissociation constant of the LRH alone, resonances corresponding to residues 216 to 232 were still not detectable, indicating they are involved in self-association at those concentrations (Fig. 4). For comparison, these residues were observed at 200 μM in the linker construct. Additionally, SV-AUC experiments at 350 μM, the highest concentration assessed, showed two distinct peaks corresponding to ∼2.7S and 4.2S, suggesting the presence of dimeric and tetrameric species (Fig. 4). At 200 μM, only a primary peak at ∼2.7S and a shoulder in the ∼3.3 to 4.5S range were observed. This broad shoulder suggests exchange between dimeric and tetrameric states. At 100 μM, a single broad peak centering at ∼2.7S was observed, indicative of a dimeric N175–365 as the primary species in solution, but still undergoing exchange with higher order species. SE-AUC data collected to confirm that the higher order species is indeed a tetramer were fit to a dimer-tetramer model, yielding an equilibrium dissociation constant of about 300 μM. Together, these data demonstrate that the self-association of the LRH is much stronger in the context of the dimeric CTD and could form tetramers bridging two dimeric N molecules.

Figure 4.

Concentration dependence of self-association of WT and phosphorylated N175–365. A, 15N-TROSY-HSQC spectra of 50 μM WT N175–365 (red) compared to WT N175–245 (black). Resonances of N175–245 that are not present in N175–365 are labeled. Resonances that correspond to residual TEV cleavage sites in N175–245 are indicated by an asterisk. B, 15N-TROSY-HSQC spectra of 50 μM GSK-hyperphosphorylated N175–365 (+GSK N175–365) (red) compared to WT N175–245 (black). Down-field resonances corresponding to phosphorylated serine are circled. Resonance labeling scheme is the same as in (A). C, SV-AUC of WT-N175–365 at 100 to 340 μM concentration. D, SV-AUC of +GSK N175–365 at 200 and 340 μM concentration. E, representative model of proposed tetramerization of N175–365 due to LRH self-association (dark blue). Tetrameric N175–365 is in exchange with dimeric N175–365 as indicated on the right. F, representative model of the effect of hyperphosphorylation of N175–365 showing the CTD as an intact dimer (green spheres) but with dissociation of the LRH from tetramers and dimers to monomers. CTD, C-terminal domain; GSK, glycogen synthase kinase 3β; LRH, Leu-rich helix.

Phosphorylation in the SR-rich region inhibits self-association of the LRH

N protein phosphorylation is a complex process involving several host kinases acting sequentially to phosphorylate the SR-rich region (36, 42, 44). We simplify this process by focusing only on Ser188 as an activating primer for GSK-3β, which phosphorylates the SR-rich region in an n-4 pattern. Based on the expected pattern of GSK-3 phosphorylation, with only Ser188 primed, we can expect a total of four phosphates incorporated after GSK-3 activation at residues 188, 184, 180, and 176.
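
The n-4 priming rule described above can be sketched as a short enumeration. This is only an illustration of the position arithmetic: the function name and the lower bound (the first residue of the N175–245 construct) are assumptions made for the example, and the sketch does not check that each position is actually a serine or threonine, since the sequence is not reproduced in the text.

```python
def gsk3_candidate_sites(priming_site: int, first_residue: int, step: int = 4) -> list[int]:
    """List positions reached by stepping `step` residues toward the N terminus.

    Starts at the primed residue and stops before passing `first_residue`.
    Hypothetical helper: it applies only the n-4 spacing rule and does not
    verify residue identity (Ser/Thr) at each position.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    sites = []
    position = priming_site
    while position >= first_residue:
        sites.append(position)
        position -= step
    return sites


# Priming at Ser188 in a construct that begins at residue 175.
print(gsk3_candidate_sites(188, 175))
```

With Ser188 as the primer and residue 175 as the construct boundary, the enumeration yields the four positions named in the text (188, 184, 180, and 176).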

Phosphorylation of N in the SR-rich region changes the consistency of phase-separated droplets with RNA from dense gel-like droplets to dynamic liquid-like droplets (19). To determine the molecular basis for this process, we sought to identify the effect of phosphorylation on the structure of the linker alone and in the context of the CTD. The SR-rich region preceding the self-associating LRH is a stretch of 40 residues containing 7 arginine and 14 serine residues that are targets for phosphorylation (45). We used genetic code expansion to incorporate a single phosphoserine at known priming position 188 (pSer188) (46). The 15N-HSQC spectra of pSer188 N175–245 and WT N175–245 are essentially identical, except for the resonances corresponding to residues proximal to the phosphorylation site (Fig. 2). Comparison of 15N R1, 15N R2, and {1H}-15N NOE revealed similar dynamic properties, with a modest increase in {1H}-15N NOE near the phosphorylation site and a modest decrease in R2 in the alpha helical region. Since phosphorylation typically forms a hydrogen bond between the phosphate and the backbone amide, causing the strong downfield shift in the amide proton resonance for pSer188, and is expected to dampen fast motion, it is not surprising to see a modest impact on the {1H}-15N NOE in the region proximal to the site of phosphorylation. The chemical shift changes for residues 186, 187, 189, and 190 suggest some minor structural changes localized to the vicinity of S188.

Even though only a modest change in chemical shifts is observed in pSer188 N175–245 spectra, a major increase is observed in the peak intensities of the self-associating LRH that is also reflected in the decrease in R2. NMR spectra collected at decreasing concentrations like those of the WT N175–245 showed higher intensities for the helical resonances implicated in self-association, compared to WT N175–245 at similar concentration (Fig. 3). SV-AUC analysis of pSer188-N175–245 GFP also indicated weakened self-association (Fig. 3). Concentrations of 200 μM and 100 μM of pSer188-N175–245 GFP showed a single peak in the c(s) plots, with sedimentation coefficients of ∼3.3 and ∼3.1S, respectively, indicating a weaker self-association when compared to WT N175–245 which showed two peaks. SE-AUC of pSer188 N175–245 GFP determined a Kd for dimer dissociation of 235 μM, significantly weaker than the 77 μM of WT N175–245 (Fig. 3).
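
The two dissociation constants can be put on an energy scale with the standard relation ΔΔG = RT ln(Kd,pSer188 / Kd,WT). The snippet below is a minimal sketch of that arithmetic; the temperature of 298.15 K is an assumption made for illustration (the AUC temperature is not stated in this section), and the function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def ddg_from_kd(kd_ref_uM: float, kd_variant_uM: float, temperature_K: float = 298.15) -> float:
    """Return ddG = RT ln(Kd_variant / Kd_ref) in kJ/mol.

    A positive value means the variant associates more weakly than the reference.
    The default temperature is an assumed value, not one taken from the study.
    """
    if kd_ref_uM <= 0 or kd_variant_uM <= 0 or temperature_K <= 0:
        raise ValueError("Kd values and temperature must be positive")
    return R_KJ_PER_MOL_K * temperature_K * math.log(kd_variant_uM / kd_ref_uM)


# Kd values quoted in the text: 77 uM (WT) and 235 uM (pSer188).
fold = 235.0 / 77.0
ddg = ddg_from_kd(77.0, 235.0)
print(f"fold change in Kd: {fold:.2f}")
print(f"ddG at 298.15 K (assumed): {ddg:.2f} kJ/mol")
```

The roughly threefold change in Kd corresponds to a destabilization of only a few kJ/mol under this assumption, which fits the picture of a weak association that is modestly shifted by a single phosphate.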

pSer188 N175–365 displayed similar behavior to WT in 15N HSQC spectra at decreasing concentrations (Fig. S3). As with WT, resonances corresponding to residues 216 to 232 were not detectable at concentrations as low as 10 μM (Fig. S4), indicating that a single phosphoserine incorporation is not sufficient to dissociate the LRH in the context of the CTD. We then tested whether hyperphosphorylation would dissociate the LRH by adding GSK-3 (+GSK N175–365), which phosphorylates serine or threonine residues N-terminal to an initial site of phosphorylation (42). While GSK-3 will canonically result in four phosphoserines, recent MS analysis of the SR-rich linker primed at S188 and subsequently reacted with GSK-3 showed partial phosphorylation up to six sites (47). These results are consistent with our observed NMR spectrum of GSK-3–treated N175–365, which showed three strong phosphoserine resonances and additional weaker resonances corresponding to lower abundance phosphorylation of additional serines. 15N HSQC experiments at decreasing concentrations revealed that resonances corresponding to residues 216 to 232 reappeared at concentrations as high as 100 μM, indicating that hyperphosphorylation prevents LRH self-association in the context of the CTD (Fig. 4). The dissociation of the tetramer is confirmed by SV-AUC experiments, in which there is a single broad peak centering around ∼2.7S (which corresponds to dimeric +GSK N175–365) at 340 μM and 200 μM.

Phosphorylation decreases RNA binding in the linker region

RNA binding to N is primarily driven by interactions with the NTD and, to a lesser extent, the CTD (24, 28). The central linker region shows direct interactions with RNA in the presence of the NTD (30). Here, we confirm using EMSA that N175–245, corresponding to the central linker alone, can directly interact with the first 1000 nts of the 5’ end of SARS-CoV-2 viral RNA (g(1–1000)) (Fig. 5). Increasing concentrations of N175–245 caused an increasing shift of the g(1–1000) to a higher molecular weight, indicating that the protein is binding the RNA and reducing its mobility in the gel in a concentration-dependent manner.

Figure 5.

Phosphorylation modulation of N-RNA interactions. A, peak intensity ratios based on NMR titrations of g(1–1000) RNA into 15N-labeled N175–245 (top) and pSer188 N175–245 (bottom). The protein:RNA ratio varied from 10000:1 to 1000:1. Data are plotted as intensity ratios for peaks with and without RNA. B, EMSA gels for WT N175–245 (0–30 μM) and pSer188 N175–245 (0–75 μM) with g(1–1000) RNA (0.5 μM). GSK-3 was used to hyperphosphorylate (+GSK) the pSer188 N175–245 (−GSK). C, EMSA gels for WT FL-N (0–20 μM) and pSer188 FL-N (0–40 μM) with g(1–1000) RNA (0.5 μM), along with GSK-3 hyperphosphorylation. For (B) and (C), the g(1–1000) RNA migrates as multiple bands because the RNA is not denatured and can adopt secondary structures. D, WT FL-N (left), pSer188 FL-N (middle), and hyperphosphorylated FL-N (+GSK FL-N) (right) liquid-liquid phase separation (LLPS) using fluorescence imaging (top) and bright field (bottom). The g(1–1000) RNA was labeled with Cy3 for fluorescence imaging. Images were taken at 40× magnification, and the scale bar represents 50 μm. All LLPS experiments were collected at 37 °C after 2 h of incubation. E, LLPS utilizing the same conditions as (D) for muFL-N, muFL-N with LC8 added, and pSer188 muFL-N with LC8 added (from left to right, respectively). The scale bar represents 100 μm. FL-N, full length N protein; GSK, glycogen synthase kinase 3β; LC8, dynein light chain 8.

We utilized NMR to probe the RNA interaction sites within N175–245. Addition of increasing concentrations of g(1–1000) RNA to 15N-labeled WT N175–245 caused most of the resonances to disappear, consistent with the formation of high molecular weight complexes. Resonances that are still observed at a 1000:1 protein:RNA molar ratio are localized to the C-terminal end of the WT N175–245 containing the residual tobacco etch virus (TEV) cleavage site, indicating these residues are not significantly interacting with g(1–1000) (Fig. 5). The significant line broadening of the remaining resonances suggests that the SR-rich region and the LRH are interacting with RNA, with only modest differences between the regions. This observation is consistent with a previous report that shows peak attenuation in this region when using a small piece of RNA (48).

We next determined if phosphorylation altered RNA binding using pSer188 N175–245. We found that the pSer188 N175–245 also bound RNA, based on EMSA analysis, but with weaker affinity (Fig. 5). Weaker binding was also demonstrated by NMR titrations of 15N-labeled pSer188 N175–245, with g(1–1000) RNA showing less peak attenuation at similar concentrations of RNA compared to the WT N175–245 (Fig. S5). Residues 175 to 200 also show less peak attenuation, suggesting that the double negative charge of pSer188 is sufficient to counter the favorable electrostatic interactions between the RNA and the arginine-rich linker. In contrast to the WT, the LRH resonances are more attenuated in the pSer188 N175–245 spectra with RNA, indicating that the RNA could still weakly interact with the helical region or that the RNA is altering the monomer-dimer exchange leading to additional broadening in the helical region.

Having observed a modest impact on RNA binding from a single phosphorylation, we next examined the impact of hyperphosphorylation on RNA binding using GSK-3. Our EMSA results on GSK-3–hyperphosphorylated pSer188 N175–245 show full disruption of the interaction between N175–245 and g(1–1000) RNA even at 75 μM protein.

Phosphorylation of FL-N attenuates LLPS but not RNA binding

Having found that phosphorylation attenuated binding of N175–245 to RNA, we performed similar experiments on the FL-N. In contrast to what we observed with N175–245, EMSA analysis showed that RNA-binding affinity to the FL-N was not affected by phosphorylation or hyperphosphorylation (Fig. 5), indicating that linker phosphorylation at these four sites does not significantly alter g(1–1000) RNA binding by FL-N.

Phosphorylation has also been consistently shown to alter the phase-separation properties of FL-N (11, 19). We thus tested the effect of phosphorylation on LLPS behavior in the context of the full-length protein by comparing the ability of pSer188 and WT FL-N proteins to form droplets. WT FL-N phase separated at 4 μM protein with 50 nM g(1–1000) RNA at 37 °C (Fig. 5 and Table S1). The droplets formed with pSer188 FL-N protein were on average similar in size to those of WT FL-N but far fewer in number. The difference became even more pronounced in the GSK-hyperphosphorylated protein, which showed almost no phase separation under these conditions (Fig. 5), with only a handful of droplets formed, clearly demonstrating that phosphorylation in the SR-rich region inhibits LLPS. The number of droplets formed by hyperphosphorylated FL-N was ∼70× lower than for the singly phosphorylated pSer188 FL-N and 180× lower than for WT FL-N (Table S1).

A switchable self-association domain to dissect effect of phosphorylation on LLPS

To test the contribution of the self-associating LRH to LLPS, we engineered a variant of FL-N with a switchable self-association capability by replacing the LRH with a disordered 10 amino acid segment containing a TQT recognition motif (KAIDAATQTE) specific for the dimeric hub protein LC8. LC8 is a dimerizing hub as it binds two disordered chains of partner proteins at the TQT motif (49). By replacing the LRH with a TQT motif of similar length, we have created a construct of FL-N (muFL-N) that will only self-associate in the linker in the presence of LC8 but is quite disordered without it. We confirmed that LC8 indeed binds muFL-N and forms a tight complex (Fig. S6). LLPS experiments on muFL-N showed less droplet formation with g(1–1000) RNA when compared to WT FL-N at similar conditions (Fig. 5), but upon addition of LC8, muFL-N showed extensive phase separation with g(1–1000) RNA (Fig. 5 and Table S2). Addition of LC8 to phosphorylated muFL-N (pSer188 muFL-N) increased LLPS but not to the same extent as in the absence of phosphorylation (Table S2 and Fig. 5), indicating that disrupting self-association in the linker is important but is not the only determinant of reduced LLPS. These data implicate the LRH as a primary modulator of LLPS, as in both phosphorylated and unphosphorylated muFL-N LLPS experiments, LC8-induced dimerization promotes phase-separation behavior.

Discussion

The structure of N is generally depicted with two folded domains, NTD and CTD, that are dimerized by the CTD and linked by a central disordered linker containing an α-helical region, the LRH (Fig. 1). At the N terminus of the disordered linker is a positively charged SR-rich region that can undergo hyperphosphorylation. Given that the SR region is hyperphosphorylated in infected cells but unphosphorylated during viral assembly in infectious virions, it is important to understand the molecular processes that drive this switch. Here, we show that the LRH, spanning residues 216 to 232, forms self-associated dimers and higher order aggregates in the unphosphorylated form, but its self-association is significantly weakened by phosphorylation and hyperphosphorylation at sites within the SR-rich region about 30 amino acids distant from the LRH. Phosphorylation also attenuates RNA binding to the SR region, but while it does not significantly affect RNA binding to the full-length protein, it drastically alters its LLPS.

Higher order association in FL-N is driven by the LRH forming a dimer of dimers

Since the crystal structure of the CTD was solved for a dimer (21, 23), it has been recognized that the CTD alone is the dimerization domain of N. Importantly, the CTD is a stable dimer and does not form higher order oligomers at NMR concentrations (22, 24). Work in the literature based on AUC and molecular dynamics reported a higher order trimeric coiled-coil mediated by intermolecular interactions in the LRH (34). Also reported is a mutation in N at the beginning of the LRH, G215C, exhibiting substantially stronger self-association and shifting self-association to a tetrameric oligomeric state (33). Using sedimentation equilibrium AUC on a construct containing the LRH and the CTD, we indeed identified a higher order species in addition to the dimer at concentrations above 100 μM (Fig. 4). We assign this oligomerization state to a tetramer rather than a trimer for the following reasons: the LRH self-association is coupled to CTD dimerization and therefore the dimer is always the more stable species, and at higher concentration, it will populate a dimer of dimers or a tetramer 4-helix bundle that brings two dimers of N together. In this model, the LRH cannot dissociate in the presence of the CTD dimer but forms an intradimer helix–helix interaction at low protein concentration, while the 4-helix bundle tetramer forms at high concentration. Multivalent binding of RNA to N will increase the effective local concentration, shifting the equilibrium towards a tetramer or higher order species.

To deconvolute the contribution of the LRH from that of the CTD to formation of the higher order structure, we used NMR and AUC on constructs of the linker alone and the linker plus CTD and compared these results to the CTD alone. The LRH in the linker construct self-associates weakly, with a dissociation constant of 77 μM, which is significantly enhanced to about 1 μM in the context of the CTD, and can also form weak tetramers. A construct corresponding to the CTD alone is a stable dimer at all concentrations. Therefore, these studies show clearly that the LRH is required to form higher order assemblies but forms only multiples of dimers.

Phosphorylation decreases LLPS formation by inhibiting LRH self-association

MD simulations suggested that phosphorylation of the SR region increases inter- and intra-peptide interactions through phosphate-arginine salt bridges (50). A phosphorylated synthetic peptide of the SR region also exhibited reduced binding to polyU RNA and formed more liquid-like droplets (50). More recently, hyperphosphorylation was reported to promote direct binding of the phosphorylated SR region to the RNA-binding domain of the NTD, thus providing a mechanism for inhibiting RNA binding (51). The reported work, however, did not address the effect of phosphorylation on the LRH, as we do below.

Our NMR results show that phosphorylation at serine 188 did not dramatically alter the monomeric structure of the linker. Instead, phosphorylation resulted in a lower propensity of the linker to self-associate, with a Kd threefold weaker than WT. To place these results in the context of the full-length protein, we determined the effect of phosphorylation in the presence of the dimeric CTD, which significantly enhances dimerization of the LRH. A single phosphorylation event did not affect LRH self-association, but hyperphosphorylation with GSK-3, which places a pSer at 3 to 6 sites, caused dissociation of the tetrameric form, suggesting that the level of phosphorylation can be tuned in response to the level of the protein concentration and self-association needed.

How phosphorylation in the SR region, about 30 amino-acid residues distant from the beginning of the LRH, promotes its dissociation is not entirely obvious. Comparison to other RNA-binding proteins with SR-rich regions gives some clues and suggests that phosphorylation could simply be increasing the solubility of the protein, preventing oligomerization (45). Phosphorylation significantly alters the net charge of a protein, adding a −2 charge with each phosphorylated residue at physiological pH. In TDP-43, for example, phosphorylation was proposed as a preventative cellular mechanism against aggregation (52). TDP-43 hyperphosphorylation was also shown to reduce phase separation and aggregation. One attractive explanation is that serine residues are more prone to interact with other protein residues than with solvent, while phosphoserine would interact more with water molecules than be involved in protein–protein interactions (45). Increased interactions with water would still not explain our observation of phosphorylation promoting dissociation 30 residues away, however. It is likely that solvation would change the structure of the SR region from a more extended conformation to a flexible one with local structures, a change supported by chemical shift differences and a slight increase in the heteronuclear NOEs in the phosphorylated form. Local structures could interfere with stacking of the LRH in a helical bundle, thus opposing higher order formation.
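
The charge argument can be illustrated with a deliberately crude tally. The sketch below counts only the 7 arginines reported for the SR-rich region (+1 each) and a −2 charge per phosphate; it ignores every other ionizable group (lysine, aspartate, glutamate, termini) and any pKa shifts, so it is a toy bookkeeping exercise under stated assumptions rather than an estimate of the true net charge. The function name is hypothetical.

```python
def toy_net_charge(n_arg: int, n_phospho: int, charge_per_phosphate: float = -2.0) -> float:
    """Toy net charge: +1 per arginine plus a fixed charge per phosphate.

    Assumption: all other charged residues and the termini are ignored.
    """
    if n_arg < 0 or n_phospho < 0:
        raise ValueError("residue counts must be non-negative")
    return n_arg * 1.0 + n_phospho * charge_per_phosphate


# 7 Arg is the count given for the SR-rich region. The phosphate counts
# correspond to no phosphorylation, pSer188 alone, the canonical GSK-3
# product (four sites), and the upper value of six sites mentioned in the text.
for n_phos in (0, 1, 4, 6):
    print(f"{n_phos} phosphates: toy net charge {toy_net_charge(7, n_phos):+.0f}")
```

Even in this simplified tally, four phosphates are enough to cancel the arginine charge, which is in line with the much larger effect of hyperphosphorylation than of pSer188 alone on RNA binding by the linker.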

Phase separation and RNA binding

While SR-rich regions of proteins are often involved in RNA interactions (53, 54), in SARS-CoV-2, the NTD is the primary RNA-binding domain, but the CTD and the disordered domains are all capable of interacting with RNA with varying affinity and specificity. We show here a direct interaction between the SR region and RNA; phosphorylation at Ser188 reduced RNA binding, while hyperphosphorylation by GSK-3, primed with pSer188, abolished it. Interestingly, in contrast to the isolated linker domain, WT, pSer188, and hyperphosphorylated FL-N all interact strongly with RNA as shown by EMSA (Fig. 5). These results demonstrate that while phosphorylation of the linker inhibits RNA binding at the SR region, it does not inhibit overall RNA binding by FL-N, presumably because phosphorylation leaves the high affinity NTD-binding site unaltered. A recent preprint reported a direct interaction of the phosphorylated region with the RNA site on the NTD, potentially blocking RNA binding and abolishing binding to FL-N (51). We see no effect on RNA binding with the GSK-phosphorylated FL-N. The discrepancy between that study and ours could be explained by their use of a short RNA sequence, whereas we used a 1000 nt viral RNA. The 1000 nt RNA may have some allosteric effects that prevent blocking of the NTD. Further, a larger RNA with multivalent sites will compete with any weak interactions between the SR-rich region and the RNA-binding region of the NTD.

An exciting aspect of our findings is the effect of phosphorylation on phase separation of FL-N. Phosphorylation by a combination of CDK-1 and GSK-3 was reported to alter phase-separation behavior, inducing a transition from gel-like to liquid-like droplets (19). Our GSK-hyperphosphorylated FL-N shows a significant reduction in both the size and number of droplets formed. Notably, while RNA interaction is certainly required for phase separation, phase-separation behavior can be inhibited by phosphorylation in the disordered linker region even while overall RNA binding remains intact.

A model for how hyperphosphorylation in the SR region attenuates phase separation

The model in Figure 6 summarizes our data and proposes how phosphorylation switches the role of N from viral packaging to viral replication. The N protein is in a dimer-tetramer equilibrium that shifts towards tetramer at high protein concentration, as shown here, or forms higher order oligomers when bound multivalently to RNA (55). It is well established that viral packaging involves genome compaction through multivalent protein:RNA and protein:protein interactions (56). In the absence of phosphorylation, LRH self-association organizes the SR regions in an elongated manner primed for higher order aggregation when bound to RNA. A single phosphorylation introduces repulsive charges that disrupt the charge patterning normally associated with LLPS (57), preventing stacking of the SR regions and improving solubility, thus reducing LLPS while preserving self-association of the LRH. The local effect of phosphorylation is confirmed by the switchable dimer, which shows that self-association is not enough to restore full LLPS. Hyperphosphorylation is required to amplify the disruption in the SR regions and to introduce sufficient bound water molecules to cause dissociation of the LRH about 30 amino acids downstream, disrupting LLPS.

Figure 6.

Proposed model of how phosphorylation acts as a molecular switch for the role of N from viral packaging to viral replication. FL-N is depicted here using an illustration similar to that in Figure 1C. (Left) Unphosphorylated FL-N is in a dimer-tetramer equilibrium which shifts to tetramer when bound to RNA. The positive charges in the SR-rich region cause elongation due to charge–charge repulsion. RNA intercalates by binding to the SR region of the linker, resulting in a compacted RNA expected to be most populated in LLPS and in viral packaging. (Middle) A single phosphorylation event (red) introduces negative charges in the SR region, causing some structural change. RNA remains bound to the NTD but does not bind the linker, resulting in reduced compaction. (Right) Four phosphorylation events with GSK-3 cause significant structural changes in the SR region of GSK-hyperphosphorylated FL-N and dissociation of the 4-helix bundle tetramers. The resulting structure of FL-N is a dimer with significant flexibility in the linker after dissociation of the LRH, causing significant dispersal of the RNA, consistent with the model expected in viral replication. Two N proteins are shown to illustrate multivalent binding. Phosphates are indicated by red circles, and arginines by circled pluses. RNA is depicted by the dark blue line. FL-N, full length N protein; GSK-3, glycogen synthase kinase 3β; NTD, N-terminal domain; SR-rich, serine/arginine-rich.

The model of Figure 6 also explains how abolishing binding of the SR region to RNA reduces compaction while keeping the RNA bound in the full-length protein. Binding to the linker intercalates the RNA tightly in the nucleocapsid during packaging and this is reversed upon phosphorylation. Dissociation of the LRH will increase its accessibility to other binding partners during viral maturation such as the viral protein nsp3a (58) and the host protein 14-3-3 (59).

Concluding remarks

Phosphorylation in SR-rich proteins has been observed to increase solubility or disrupt alpha helical regions, inhibiting their ability to self-assemble in fibrils (60). In TDP-43, phosphorylation of a few key residues disrupts the helical structure and regulates condensate formation (61). It is interesting to point out, however, that in these examples the phosphorylation effect is local. The effect of phosphorylation of the N protein and its binding to RNA at multivalent sites is significantly more complicated. Our model presents an explanation of how phosphorylation in a disordered linker, in addition to causing changes in the local environment, can also act on the self-association of a distant helix that is readily responsive to long range site-specific changes, contributing to switching from genome replication to virus maturation.

Experimental procedures

Plasmid construction

The N175–245 plasmid construct was prepared by inserting DNA encoding amino acid residues 175 to 245 into a pRBC SUMO vector with an N-terminal bdSUMO tag, a C-terminal TEV protease cleavage site linked to GFP, and a hexahistidine tag (62). The N175–365 plasmid construct was prepared by inserting DNA encoding amino acid residues 175 to 365 into a pRBC vector with a C-terminal TEV protease cleavage site linked to a hexahistidine tag. The pSer188 N175–245 plasmid was generated as previously described (46). pSer188 N175–365 was prepared from the N175–365 plasmid by mutating S188 to an amber stop codon (TAG). The pSer188-FL-N plasmid was generated by cloning the FL-N sequence into the pRBC vector, then mutating S188 to a TAG for expression with genetic code expansion. muFL-N and pSer188 muFL-N plasmids were prepared from the FL-N and pSer188-FL-N plasmids, respectively, using site-directed mutagenesis to remove the DNA encoding residues 216 to 232 and replace it with DNA encoding the sequence KAIDAATQTE, a sequence known to bind LC8 strongly (63).

Protein expression and purification

To prepare WT N175–245 or WT N175–365 proteins, BL21(DE3) E. coli transformed with the appropriate plasmid were grown in LB-rich media to an OD of 0.6 to 0.8, then induced with 1 mM IPTG at 37 °C for 6 h. Stable isotope-labeled samples of WT N175–245 or WT N175–365 were expressed and induced using the same procedure as for natural abundance samples, except that the cells were grown in MJ9 minimal media supplemented with 15N ammonium chloride and 13C glucose as the sole nitrogen or carbon source as appropriate. Following induction, cells were harvested by centrifugation and either used immediately or stored at −80 °C.

WT N175–245 and WT N175–365 were purified using the TALON His-tag purification protocol (Clontech Laboratories). Cell pellets were lysed in 50 mM Tris, 500 mM NaCl, 5 mM imidazole, 1 mM NaN3, pH 7.5 (high salt buffer) by sonication and centrifuged at 27,200 relative centrifugal force to remove cell debris. The supernatant was mixed with resin for 1 h and washed with 20 column volumes of high salt buffer followed by four column volumes of 50 mM Tris, 150 mM NaCl, 5 mM imidazole, 1 mM NaN3, pH 7.5 (low salt buffer). For WT N175–245, the SUMO solubility tag was cleaved with 200 nM SENP1 protease (46, 64, 65) (∼1:250 protease:protein) for 1 h at 4 °C on the resin. Proteins were eluted off the resin with high salt buffer supplemented with 300 mM imidazole. Proteins were exchanged into buffer containing 50 mM Tris, 300 mM NaCl, 5 mM imidazole, pH 7.5 using a PD-10 desalting column (Cytiva). The tag was cleaved by incubating overnight with TEV protease (66) (∼1:20 protease:protein) at 4 °C and then removed by reverse affinity purification with TALON His-tag resin. Proteins were concentrated and exchanged into 50 mM sodium phosphate, 150 mM NaCl, pH 6.5 (NMR buffer).

FL-N and muFL-N were prepared using a modified procedure based on one previously described for FL-N (22). The plasmid was transformed into E. coli Rosetta (DE3) cells, cultured in 2xYT media to an OD of 0.6, and induced with 1 mM IPTG at 18 °C overnight. Cells were harvested by centrifugation, then resuspended in 50 mM sodium phosphate, 1 M NaCl, 5 mM imidazole, 1 mM NaN3, 1 mg/ml lysozyme, pH 8.0 and incubated for 1 h at 4 °C. Cells were sonicated 3× 2 min and centrifuged at 27,200 relative centrifugal force for 45 min. The clarified lysate was mixed with TALON His-tag resin, incubated for 1 h at 4 °C, and then washed with 20 column volumes of 50 mM sodium phosphate, 3 M NaCl, 10 mM imidazole, 1 mM NaN3, pH 8.0 to remove nonspecifically bound RNA. The protein was eluted with 50 mM sodium phosphate, 300 mM NaCl, 350 mM imidazole, 1 mM NaN3, pH 8.0, then concentrated and further purified on a Superdex 200 column in 50 mM sodium phosphate, 150 mM NaCl, pH 7.5 buffer. The protein was concentrated and either used immediately or flash frozen and stored at −80 °C.

To prepare pSer188 N175–245 and pSer188 N175–365, BL21(DE3) ΔSerB E. coli cells were transformed simultaneously with the appropriate expression plasmid and pKW2-EFsep. For natural abundance samples, cells were grown in 2xYT to an OD of 0.6 to 0.8 and induced with 1 mM IPTG for 48 h at 18 °C. Stable isotope-labeled pSer188 N175–245 and pSer188 N175–365 were expressed and purified as previously described (46). Briefly, cells were grown to an OD of 0.6 to 0.8 in MJ9 minimal media supplemented with 15N Celtone (0.2%) and then induced with 1 mM IPTG for 48 h at 18 °C. Cells were harvested by centrifugation and stored at −80 °C. Protein was purified using the same method as used for the WT samples above, except that buffers were supplemented with phosphatase inhibitors (10 mM NaF, 2.5 mM sodium pyrophosphate, and 1 mM orthovanadate).

To prepare pSer188 FL-N and pSer188 muFL-N proteins, the appropriate expression plasmid was transformed simultaneously with pKW2-EFsep into BL21(DE3) ΔSerB E. coli cells. Cells were grown in rich-Auto-inducing Media (37) at 37 °C until OD 600 reached 1.3 and then at 18 °C for 48 h. Cells were harvested and the protein was purified using the same methods as FL-N, except that the buffers were supplemented with phosphatase inhibitors (10 mM NaF, 2.5 mM sodium pyrophosphate, and 1 mM orthovanadate).

For hyperphosphorylated protein, 80 μM of target protein (pSer188 linker, pSer188 N175–365, FL-N pSer188, or muFL-N pSer188) was incubated with 80 nM of GSK-3 in a buffer containing 20 mM Tris pH 7.4, 150 mM NaCl, 10 mM MgCl2, and 1 mM ATP, at 37 °C for 20 h. Hyperphosphorylated proteins were further purified by size-exclusion chromatography using a Superdex S75 column. Phos-Tag SDS-PAGE gels were used to confirm the phosphorylation status of all phosphorylated samples.
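The kinase reaction above corresponds to a 1:1000 molar ratio of GSK-3 to substrate (80 nM to 80 μM). A minimal sketch of how the pipetting volumes for such a reaction could be worked out is given below. The stock concentrations and the 100 μl reaction volume are hypothetical, the helper `mix_volumes` is our own, and the calculation assumes every stock is already in the reaction buffer so that the remaining volume can simply be made up with buffer.

```python
def mix_volumes(final_volume_ul: float, components: dict) -> dict:
    """Volumes (ul) of each stock needed to reach its final concentration.

    components maps a name to (stock_conc, final_conc) in the same unit.
    The remainder of the reaction is filled with buffer.
    """
    volumes = {
        name: final_volume_ul * final / stock
        for name, (stock, final) in components.items()
    }
    buffer_ul = final_volume_ul - sum(volumes.values())
    if buffer_ul < 0:
        raise ValueError("stocks are too dilute for the requested volume")
    volumes["buffer"] = buffer_ul
    return volumes


# Final concentrations are from the text; stock concentrations are hypothetical.
reaction = {
    "pSer188 protein": (400.0, 80.0),    # uM stock, uM final
    "GSK-3": (8.0, 0.080),               # uM stock, 80 nM final
    "ATP": (100_000.0, 1_000.0),         # 100 mM stock, 1 mM final
}

for name, volume in mix_volumes(100.0, reaction).items():
    print(f"{name:16s} {volume:6.2f} ul")
```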

LC8 expression and purification was performed as previously described (67). The GSK-3 protein was expressed and purified as previously described (46). Table S3 lists the final buffer conditions for each protein and experiment.

NMR spectroscopy

NMR experiments were performed using an 800 MHz Bruker Avance III HD NMR spectrometer equipped with a triple resonance (TCI) cryogenic probe. Backbone resonance assignments were made using a suite of BEST triple resonance experiments, including HNCO, HNCACB, HNCACO, and HNCOCACB (68). All NMR data were processed (apodized, zero filled, Fourier transformed, and phased) using nmrPipe (69) and analyzed in nmrviewJ (70). Three dimensional experiments were collected using nonuniform sampling, and nonuniform sampling artifacts were suppressed using SMILE (71). Resonance assignments were deposited in the BMRB under accession number 51904. 15N nuclear spin relaxation parameters were measured using Bruker temperature compensated pulse sequences, with eight unique delay times. The 60 ms delay was collected in triplicate to aid with error estimation. Peak intensities were fit in nmrviewJ to an exponential decay model, with Monte Carlo–based error estimation. For the {1H}-15N NOE, the D1 delay was increased to 8 s to ensure complete relaxation between scans. NMR titrations of N175–245 with RNA were performed using 2D 15N-BEST-TROSY experiments. Peak intensities were measured in nmrviewJ and normalized to the corresponding peak in the spectrum without RNA. Secondary chemical shifts were calculated using sequence, temperature, and pH-corrected chemical shifts (72). NMR concentration titrations on N175–365 were performed using 2D 15N TROSY HSQC experiments. All experiments were conducted at 10 °C.
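The relaxation analysis described above (exponential fit with Monte Carlo error estimation) was carried out in nmrviewJ. The sketch below is not that implementation; it is a simplified stand-in that illustrates the idea, fitting a two-parameter decay I(t) = I0·exp(−R·t) by linear regression on ln(I) and estimating the uncertainty in R by refitting synthetic data with added Gaussian noise. The delay list, intensities, and noise level are hypothetical; in practice the noise level would come from the replicate delay.

```python
import math
import random
import statistics


def fit_rate(delays_s, intensities):
    """Least-squares line through ln(I) versus t; returns (I0, R)."""
    logs = [math.log(value) for value in intensities]
    n = len(delays_s)
    t_mean = sum(delays_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in delays_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    return math.exp(y_mean - slope * t_mean), -slope


def monte_carlo_error(delays_s, intensities, noise_sd, n_trials=500, seed=0):
    """Best-fit rate and its Monte Carlo standard deviation."""
    rng = random.Random(seed)
    i0, rate = fit_rate(delays_s, intensities)
    model = [i0 * math.exp(-rate * t) for t in delays_s]
    trial_rates = []
    for _ in range(n_trials):
        # Clip at a small positive value so the logarithm stays defined.
        noisy = [max(m + rng.gauss(0.0, noise_sd), 1e-9) for m in model]
        trial_rates.append(fit_rate(delays_s, noisy)[1])
    return rate, statistics.stdev(trial_rates)


# Hypothetical data: eight delays (s) and a noise-free decay with R = 5 s^-1.
delays = [0.02, 0.06, 0.10, 0.16, 0.24, 0.36, 0.50, 0.70]
peaks = [1000.0 * math.exp(-5.0 * t) for t in delays]
rate, rate_sd = monte_carlo_error(delays, peaks, noise_sd=5.0)
print(f"R = {rate:.2f} +/- {rate_sd:.2f} s^-1")
```

A log-linear fit weights late, low-intensity points more heavily than a direct nonlinear fit would, so it is best read as an illustration of the error-propagation logic rather than a replacement for the fitting software.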

Electrophoretic mobility shift assay

The first 1000 nucleotides from the viral genome (g(1–1000) RNA) at a final concentration of 0.5 μM was incubated with increasing concentrations of protein (range of 0–75 μM, in 20 mM Tris, 150 mM NaCl, 1 mM DTT, pH 7.5) at room temperature for 20 min. The total reaction volume was 10 μl. After incubation, 2 μl of 6× loading dye was added to the reaction before loading on a 1% agarose gel. The gel was run at 150 V for 1 h. RNA bands were stained with Midori Green Nucleic Acid staining solution (Bulldog Bio. Inc) and visualized using a Bio-Rad Gel Doc Image system.

Analytical ultracentrifugation

Analytical ultracentrifugation was performed using a Beckman Coulter Optima XL-A analytical ultracentrifuge equipped with absorbance optics. SV-AUC experiments used either GFP-tagged N175–245 (because N175–245 by itself does not have strong UV absorbance) or the N175–365 construct. All SV-AUC samples were run in standard 2-channel sectored cells using an An60-Ti rotor at 20 °C. The concentration of each protein was varied from 50 to 200 μM. Samples were spun at 42,000 rpm, with 300 scans collected per sample. Data were fit to the continuous c(s) model using SEDFIT (73). Buffer density and viscosity were calculated using SEDNTERP (74). For sedimentation equilibrium analysis, three concentrations of each sample were loaded into 6-well cells. Samples were centrifuged at 10,000, 13,000, and 18,000 rpm in an An60-Ti rotor for 30 to 36 h at each rotor speed. For GFP N175–245, data were fit to either a single ideal species or a monomer-dimer equilibrium using the software Heteroanalysis (https://colelab.uconn.edu/) (75). For N175–365, data were fit to a dimer-tetramer model.
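For orientation, the single ideal species model used in sedimentation equilibrium analysis predicts a concentration gradient of the form c(r) = c(r0)·exp[M(1 − v̄ρ)ω²(r² − r0²)/(2RT)]. The sketch below evaluates this textbook expression in CGS units; it is not the Heteroanalysis fitting routine. The molar mass, partial specific volume, buffer density, temperature, and radial positions are all hypothetical illustration values.

```python
import math

GAS_CONSTANT_ERG = 8.314e7  # erg / (mol K)


def equilibrium_ratio(mass_g_mol, rpm, r_cm, r0_cm,
                      vbar_ml_g=0.73, rho_g_ml=1.005, temp_k=293.15):
    """c(r) / c(r0) for a single ideal species at sedimentation equilibrium."""
    omega = rpm * 2.0 * math.pi / 60.0              # angular velocity, rad/s
    buoyant_mass = mass_g_mol * (1.0 - vbar_ml_g * rho_g_ml)
    exponent = (buoyant_mass * omega ** 2 * (r_cm ** 2 - r0_cm ** 2)
                / (2.0 * GAS_CONSTANT_ERG * temp_k))
    return math.exp(exponent)


# Rotor speeds are from the text; the 35 kDa mass and radii are hypothetical.
for speed in (10_000, 13_000, 18_000):
    monomer = equilibrium_ratio(35_000.0, speed, r_cm=7.1, r0_cm=6.9)
    dimer = equilibrium_ratio(70_000.0, speed, r_cm=7.1, r0_cm=6.9)
    print(f"{speed:6d} rpm   monomer {monomer:6.2f}   dimer {dimer:6.2f}")
```

Because the exponent is proportional to molar mass, a dimer gives a steeper gradient than a monomer at the same speed, which is what allows the single-species and self-association models to be distinguished across several rotor speeds.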

LLPS and microscopy

Fluorescence microscopy images were taken on a Keyence BZ-X700/BZ-X710 microscope with a 40× objective lens and a 384-well plate (Cellvis P384-1.5H-N); images were processed using BZ-x viewer and BZ-x analyzer software (https://www.keyence.com/landing/microscope/lp_fluorescence.jsp). For this experiment, Cy3-labeled RNA was diluted into nuclease-free water to reach a final concentration of 50 nM when added to the protein sample. Stocks of unlabeled protein were prepared by diluting into 20 mM Tris, 150 mM NaCl, 1 mM DTT, pH 7.5 droplet buffer. For FL-N constructs, samples were prepared at a total concentration of 4 μM. For linker constructs, samples were prepared at a total concentration of 20 μM. Protein samples were prepared by combining 27 μl of protein stock of the appropriate concentration with 3 μl of Cy3-labeled g(1–1000) RNA for a total sample volume of 30 μl. For muFL-N samples with LC8 added, 3.6 μM LC8 was combined with muFL-N and Cy3-labeled g(1–1000). The samples were incubated at 37 °C for 2 h and then imaged. Foci were counted using ImageJ software (https://imagej.net/ij/). Average foci diameter was calculated in ImageJ from a random sampling of 10 foci from each image.
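The sample preparation above implies a simple dilution relationship between stock and final concentrations. As a minimal worked example, and assuming that the stated 4 μM, 20 μM, and 50 nM values refer to final concentrations in the 30 μl sample (our reading of the text), the required stock concentrations can be computed as follows. The helper name `stock_needed` is hypothetical.

```python
def stock_needed(final_conc: float, added_volume_ul: float,
                 total_volume_ul: float) -> float:
    """Stock concentration that gives final_conc after dilution (C1*V1 = C2*V2)."""
    return final_conc * total_volume_ul / added_volume_ul


TOTAL_UL = 30.0  # 27 ul protein + 3 ul RNA, as described in the text

rna_stock_nM = stock_needed(50.0, added_volume_ul=3.0, total_volume_ul=TOTAL_UL)
fl_stock_uM = stock_needed(4.0, added_volume_ul=27.0, total_volume_ul=TOTAL_UL)
linker_stock_uM = stock_needed(20.0, added_volume_ul=27.0, total_volume_ul=TOTAL_UL)

print(f"RNA stock    {rna_stock_nM:.0f} nM")
print(f"FL-N stock   {fl_stock_uM:.2f} uM")
print(f"linker stock {linker_stock_uM:.2f} uM")
```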

Data availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the supporting information.

Supporting information

This article contains supporting information.

Conflict of interest

The authors declare that they have no conflicts of interest with the contents of this article.

Acknowledgments

We acknowledge helpful discussions from Richard Cooley and David Hendrix. We also thank Alyssa Garcia and Gabriel El Youssef for assistance with preparing protein samples.

Author contributions

H. S. writing–original draft; H. S., P. N. R., Z. Y., S. S., K. H., and E. J. B. investigation; H. S. and P. N. R. formal analysis; H. S., P. N. R., and E. J. B. conceptualization; P. N. R. and E. J. B. writing–review and editing; P. N. R., Z. Y., and E. J. B. supervision; P. N. R. and Z. Y. methodology; Z. Y., S. S., and K. H. data curation; K. H. visualization; E. J. B. resources; E. J. B. project administration; E. J. B. funding acquisition.

Funding and additional information

This work was supported by the U.S. National Science Foundation EAGER grant MCB 2034446 to E.J.B. The Oregon State University NMR Facility was supported by the National Institutes of Health, HEI Grant 1S10OD018518, and by the M. J. Murdock Charitable Trust grant #2014162. This work was also aided by the GCE4All Biomedical Technology Optimization and Dissemination Center supported by National Institute of General Medical Science grant RM1-GM144227.

Reviewed by members of the JBC Editorial Board. Edited by Wolfgang Peti

Supporting information

Supplemental Figures S1–S6 and Tables S1–S3

References

  • 1.Zhou P., Yang X.L., Wang X.G., Hu B., Zhang L., Chen H.D., et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–273. doi: 10.1038/s41586-020-2012-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.McBride R., van Zyl M., Fielding B.C. The coronavirus nucleocapsid is a multifunctional protein. Viruses. 2014;6:2991–3018. doi: 10.3390/v6082991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Yao H., Song Y., Chen Y., Wu N., Xu J., Sun C., et al. Molecular architecture of the SARS-CoV-2 virus. Cell. 2020;183:730–738.e13. doi: 10.1016/j.cell.2020.09.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Cao C., Cai Z., Xiao X., Rao J., Chen J., Hu N., et al. The architecture of the SARS-CoV-2 RNA genome inside virion. Nat. Commun. 2021;12:3917. doi: 10.1038/s41467-021-22785-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Cai T., Yu Z., Wang Z., Liang C., Richard S. Arginine methylation of SARS-Cov-2 nucleocapsid protein regulates RNA binding, its ability to suppress stress granule formation, and viral replication. J. Biol. Chem. 2021;297 doi: 10.1016/j.jbc.2021.100821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Xu W., Pei G., Liu H., Ju X., Wang J., Ding Q., et al. Compartmentalization-aided interaction screening reveals extensive high-order complexes within the SARS-CoV-2 proteome. Cell Rep. 2021;36 doi: 10.1016/j.celrep.2021.109482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Cong Y., Ulasli M., Schepers H., Mauthe M., V’kovski P., Kriegenburg F., et al. Nucleocapsid protein recruitment to replication-transcription complexes plays a crucial role in coronaviral life cycle. J. Virol. 2020;94 doi: 10.1128/JVI.01925-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Luo L., Li Z., Zhao T., Ju X., Ma P., Jin B., et al. SARS-CoV-2 nucleocapsid protein phase separates with G3BPs to disassemble stress granules and facilitate viral production. Sci. Bull. 2021;66:1194–1204. doi: 10.1016/j.scib.2021.01.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Yang Z., Johnson B.A., Meliopoulos V.A., Ju X., Zhang P., Hughes M.P., et al. Interaction between host G3BP and viral nucleocapsid protein regulates SARS-CoV-2 replication and pathogenicity. Cell Rep. 2024;43 doi: 10.1016/j.celrep.2024.113965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Chen H., Cui Y., Han X., Hu W., Sun M., Zhang Y., et al. Liquid-liquid phase separation by SARS-CoV-2 nucleocapsid protein and RNA. Cell Res. 2020;30:1143–1145. doi: 10.1038/s41422-020-00408-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Lu S., Ye Q., Singh D., Cao Y., Diedrich J.K., Yates J.R., 3rd, et al. The SARS-CoV-2 nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein. Nat. Commun. 2021;12:502. doi: 10.1038/s41467-020-20768-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Jack A., Ferro L.S., Trnka M.J., Wehri E., Nadgir A., Nguyenla X., et al. SARS-CoV-2 nucleocapsid protein forms condensates with viral genomic RNA. PLoS Biol. 2021;19 doi: 10.1371/journal.pbio.3001425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Zachrdla M., Savastano A., Ibáñez de Opakua A., Cima-Omori M.S., Zweckstetter M. Contributions of the N-terminal intrinsically disordered region of the severe acute respiratory syndrome coronavirus 2 nucleocapsid protein to RNA-induced phase separation. Protein Sci. 2022;31 doi: 10.1002/pro.4409. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Roden C.A., Dai Y., Giannetti C.A., Seim I., Lee M., Sealfon R., et al. Double-stranded RNA drives SARS-CoV-2 nucleocapsid protein to undergo phase separation at specific temperatures. Nucleic Acids Res. 2022;50:8168–8192. doi: 10.1093/nar/gkac596. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Perdikari T.M., Murthy A.C., Ryan V.H., Watters S., Naik M.T., Fawzi N.L. SARS-CoV-2 nucleocapsid protein phase-separates with RNA and with human hnRNPs. EMBO J. 2020;39 doi: 10.15252/embj.2020106478. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Dong H., Zhang H., Jalin J., He Z., Wang R., Huang L., et al. Nucleocapsid proteins from human coronaviruses possess phase separation capabilities and promote FUS pathological aggregation. Protein Sci. 2023;32:e4826. doi: 10.1002/pro.4826. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Tayeb-Fligelman E., Bowler J.T., Tai C.E., Sawaya M.R., Jiang Y.X., Garcia G., Jr., et al. Low complexity domains of the nucleocapsid protein of SARS-CoV-2 form amyloid fibrils. Nat. Commun. 2023;14:2379. doi: 10.1038/s41467-023-37865-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Cascarina S.M., Ross E.D. Phase separation by the SARS-CoV-2 nucleocapsid protein: consensus and open questions. J. Biol. Chem. 2022;298 doi: 10.1016/j.jbc.2022.101677. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Carlson C.R., Asfaha J.B., Ghent C.M., Howard C.J., Hartooni N., Safari M., et al. Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions. Mol. Cell. 2020;80:1092–1103.e4. doi: 10.1016/j.molcel.2020.11.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Chen Y., Lei X., Jiang Z., Humphries F., Parsi K.M., Mustone N.J., et al. Cellular nucleic acid-binding protein restricts SARS-CoV-2 by regulating interferon and disrupting RNA-protein condensates. Proc. Natl. Acad. Sci. U. S. A. 2023;120 doi: 10.1073/pnas.2308355120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Peng Y., Du N., Lei Y., Dorje S., Qi J., Luo T., et al. Structures of the SARS-CoV-2 nucleocapsid and their perspectives for drug design. EMBO J. 2020;39 doi: 10.15252/embj.2020105938. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Forsythe H.M., Rodriguez Galvan J., Yu Z., Pinckney S., Reardon P., Cooley R.B., et al. Multivalent binding of the partially disordered SARS-CoV-2 nucleocapsid phosphoprotein dimer to RNA. Biophys. J. 2021;120:2890–2901. doi: 10.1016/j.bpj.2021.03.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Zhou R., Zeng R., von Brunn A., Lei J. Structural characterization of the C-terminal domain of SARS-CoV-2 nucleocapsid protein. Mol. Biomed. 2020;1:2. doi: 10.1186/s43556-020-00001-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Estelle A.B., Forsythe H.M., Yu Z., Hughes K., Lasher B., Allen P., et al. RNA structure and multiple weak interactions balance the interplay between RNA binding and phase separation of SARS-CoV-2 nucleocapsid. PNAS Nexus. 2023;2:pgad333. doi: 10.1093/pnasnexus/pgad333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Korn S.M., Dhamotharan K., Jeffries C.M., Schlundt A. The preference signature of the SARS-CoV-2 Nucleocapsid NTD for its 5'-genomic RNA elements. Nat. Commun. 2023;14:3331. doi: 10.1038/s41467-023-38882-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Caruso I.P., Dos Santos Almeida V., do Amaral M.J., de Andrade G.C., de Araujo G.R., de Araújo T.S., et al. Insights into the specificity for the interaction of the promiscuous SARS-CoV-2 nucleocapsid protein N-terminal domain with deoxyribonucleic acids. Int. J. Biol. Macromol. 2022;203:466–480. doi: 10.1016/j.ijbiomac.2022.01.121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Redzic J.S., Lee E., Born A., Issaian A., Henen M.A., Nichols P.J., et al. The inherent dynamics and interaction sites of the SARS-CoV-2 nucleocapsid N-terminal region. J. Mol. Biol. 2021;433 doi: 10.1016/j.jmb.2021.167108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Dinesh D.C., Chalupska D., Silhan J., Koutna E., Nencka R., Veverka V., et al. Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein. PLoS Pathog. 2020;16 doi: 10.1371/journal.ppat.1009100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Cubuk J., Alston J.J., Incicco J.J., Holehouse A.S., Hall K.B., Stuchell-Brereton M.D., et al. The disordered N-terminal tail of SARS-CoV-2 Nucleocapsid protein forms a dynamic complex with RNA. Nucleic Acids Res. 2024;52:2609–2624. doi: 10.1093/nar/gkad1215. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Pontoriero L., Schiavina M., Korn S.M., Schlundt A., Pierattelli R., Felli I.C. NMR reveals specific tracts within the intrinsically disordered regions of the SARS-CoV-2 nucleocapsid protein involved in RNA encountering. Biomolecules. 2022;12:929. doi: 10.3390/biom12070929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Ribeiro-Filho H.V., Jara G.E., Batista F.A.H., Schleder G.R., Costa Tonoli C.C., Soprano A.S., et al. Structural dynamics of SARS-CoV-2 nucleocapsid protein induced by RNA binding. PLoS Comput. Biol. 2022;18 doi: 10.1371/journal.pcbi.1010121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Ye Q., West A.M.V., Silletti S., Corbett K.D. Architecture and self-assembly of the SARS-CoV-2 nucleocapsid protein. Protein Sci. 2020;29:1890–1901. doi: 10.1002/pro.3909. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Zhao H., Nguyen A., Wu D., Li Y., Hassan S.A., Chen J., et al. Plasticity in structure and assembly of SARS-CoV-2 nucleocapsid protein. PNAS Nexus. 2022;1:pgac049. doi: 10.1093/pnasnexus/pgac049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Zhao H., Wu D., Hassan S.A., Nguyen A., Chen J., Piszczek G., et al. A conserved oligomerization domain in the disordered linker of coronavirus nucleocapsid proteins. Sci. Adv. 2023;9 doi: 10.1126/sciadv.adg6473. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Zhao H., Wu D., Nguyen A., Li Y., Adao R.C., Valkov E., et al. Energetic and structural features of SARS-CoV-2 N-protein co-assemblies with nucleic acids. iScience. 2021;24 doi: 10.1016/j.isci.2021.102523. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Bouhaddou M., Memon D., Meyer B., White K.M., Rezelj V.V., Correa Marrero M., et al. The global phosphorylation landscape of SARS-CoV-2 infection. Cell. 2020;182:685–712.e19. doi: 10.1016/j.cell.2020.06.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Cheng N., Liu M., Li W., Sun B., Liu D., Wang G., et al. Protein post-translational modification in SARS-CoV-2 and host interaction. Front. Immunol. 2022;13 doi: 10.3389/fimmu.2022.1068449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Carlson C.R., Adly A.N., Bi M., Howard C.J., Frost A., Cheng Y., et al. Reconstitution of the SARS-CoV-2 ribonucleosome provides insights into genomic RNA packaging and regulation by phosphorylation. J. Biol. Chem. 2022;298 doi: 10.1016/j.jbc.2022.102560. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Liu X., Verma A., Garcia G., Jr., Ramage H., Lucas A., Myers R.L., et al. Targeting the coronavirus nucleocapsid protein through GSK-3 inhibition. Proc. Natl. Acad. Sci. U. S. A. 2021;118 doi: 10.1073/pnas.2113401118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Peng T.Y., Lee K.R., Tarn W.Y. Phosphorylation of the arginine/serine dipeptide-rich motif of the severe acute respiratory syndrome coronavirus nucleocapsid protein modulates its multimerization, translation inhibitory activity and cellular localization. FEBS J. 2008;275:4152–4163. doi: 10.1111/j.1742-4658.2008.06564.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Adly A.N., Bi M., Carlson C.R., Syed A.M., Ciling A., Doudna J.A., et al. Assembly of SARS-CoV-2 ribonucleosomes by truncated N(∗) variant of the nucleocapsid protein. J. Biol. Chem. 2023;299 doi: 10.1016/j.jbc.2023.105362. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Yaron T.M., Heaton B.E., Levy T.M., Johnson J.L., Jordan T.X., Cohen B.M., et al. Host protein kinases required for SARS-CoV-2 nucleocapsid phosphorylation and viral replication. Sci. Signal. 2022;15 doi: 10.1126/scisignal.abm0808. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Guseva S., Perez L.M., Camacho-Zarco A., Bessa L.M., Salvi N., Malki A., et al. (1)H, (13)C and (15)N Backbone chemical shift assignments of the n-terminal and central intrinsically disordered domains of SARS-CoV-2 nucleoprotein. Biomol. NMR Assign. 2021;15:255–260. doi: 10.1007/s12104-021-10014-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Wu C.H., Yeh S.H., Tsay Y.G., Shieh Y.H., Kao C.L., Chen Y.S., et al. Glycogen synthase kinase-3 regulates the phosphorylation of severe acute respiratory syndrome coronavirus nucleocapsid protein and viral replication. J. Biol. Chem. 2009;284:5229–5239. doi: 10.1074/jbc.M805747200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Kundinger S.R., Dammer E.B., Yin L., Hurst C., Shapley S., Ping L., et al. Phosphorylation regulates arginine-rich RNA-binding protein solubility and oligomerization. J. Biol. Chem. 2021;297 doi: 10.1016/j.jbc.2021.101306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Vesely C.H., Reardon P.N., Yu Z., Barbar E., Mehl R.A., Cooley R.B. Accessing isotopically labeled proteins containing genetically encoded phosphoserine for NMR with optimized expression conditions. J. Biol. Chem. 2022;298 doi: 10.1016/j.jbc.2022.102613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Franklin R. Oregon State University; Corvallis, OR: 2023. Revealing sequences and modifications of intact proteins using electron fragmentation. Doctor of Philosophy (Ph.D.) Doctoral. [Google Scholar]
  • 48.Schiavina M., Pontoriero L., Uversky V.N., Felli I.C., Pierattelli R. The highly flexible disordered regions of the SARS-CoV-2 nucleocapsid N protein within the 1-248 residue construct: sequence-specific resonance assignments through NMR. Biomol. NMR Assign. 2021;15:219–227. doi: 10.1007/s12104-021-10009-8.
  • 49.Barbar E. Dynein light chain LC8 is a dimerization hub essential in diverse protein networks. Biochemistry. 2008;47:503–508. doi: 10.1021/bi701995m.
  • 50.Savastano A., Ibáñez de Opakua A., Rankovic M., Zweckstetter M. Nucleocapsid protein of SARS-CoV-2 phase separates into RNA-rich polymerase-containing condensates. Nat. Commun. 2020;11:6041. doi: 10.1038/s41467-020-19843-1.
  • 51.Botova M., Camacho-Zarco A.R., Tognetti J., Bessa L.M., Guseva S., Mikkola E., et al. A specific phosphorylation-dependent conformational switch of SARS-CoV-2 nucleoprotein inhibits RNA binding. bioRxiv. 2024 doi: 10.1101/2024.02.22.579423. [preprint]
  • 52.Chiang W.-C., Fang Y.-S., Lye Y.S., Weng T.-Y., Ganesan K., Huang S.H., et al. Hyperphosphorylation-mimetic TDP-43 drives amyloid formation and possesses neuronal toxicity at the oligomeric stage. ACS Chem. Neurosci. 2022;13:2599–2612. doi: 10.1021/acschemneuro.1c00873.
  • 53.Luo H., Ye F., Chen K., Shen X., Jiang H. SR-rich motif plays a pivotal role in recombinant SARS coronavirus nucleocapsid protein multimerization. Biochemistry. 2005;44:15351–15358. doi: 10.1021/bi051122c.
  • 54.Nikolakaki E., Giannakouros T. SR/RS motifs as critical determinants of coronavirus life cycle. Front. Mol. Biosci. 2020;7:219. doi: 10.3389/fmolb.2020.00219.
  • 55.Morse M., Sefcikova J., Rouzina I., Beuning P.J., Williams M.C. Structural domains of SARS-CoV-2 nucleocapsid protein coordinate to compact long nucleic acid substrates. Nucleic Acids Res. 2023;51:290–303. doi: 10.1093/nar/gkac1179.
  • 56.Cubuk J., Alston J.J., Incicco J.J., Singh S., Stuchell-Brereton M.D., Ward M.D., et al. The SARS-CoV-2 nucleocapsid protein is dynamic, disordered, and phase separates with RNA. Nat. Commun. 2021;12:1936. doi: 10.1038/s41467-021-21953-3.
  • 57.Prasad A., Bharathi V., Sivalingam V., Girdhar A., Patel B.K. Molecular mechanisms of TDP-43 misfolding and pathology in amyotrophic lateral sclerosis. Front. Mol. Neurosci. 2019;12:25. doi: 10.3389/fnmol.2019.00025.
  • 58.Bessa L.M., Guseva S., Camacho-Zarco A.R., Salvi N., Maurin D., Perez L.M., et al. The intrinsically disordered SARS-CoV-2 nucleoprotein in dynamic complex with its viral partner nsp3a. Sci. Adv. 2022;8 doi: 10.1126/sciadv.abm4034.
  • 59.Tugaeva K.V., Hawkins D., Smith J.L.R., Bayfield O.W., Ker D.S., Sysoev A.A., et al. The mechanism of SARS-CoV-2 nucleocapsid protein recognition by the human 14-3-3 proteins. J. Mol. Biol. 2021;433 doi: 10.1016/j.jmb.2021.166875.
  • 60.Haider R., Penumutchu S., Boyko S., Surewicz W.K. Phosphomimetic substitutions in TDP-43’s transiently α-helical region suppress phase separation. Biophys. J. 2024;123:361–373. doi: 10.1016/j.bpj.2024.01.001.
  • 61.Li H.-R., Chiang W.-C., Chou P.-C., Wang W.-J., Huang J.-R. TAR DNA-binding protein 43 (TDP-43) liquid-liquid phase separation is mediated by just a few aromatic residues. J. Biol. Chem. 2018;293:6090–6098. doi: 10.1074/jbc.AC117.001037.
  • 62.Zhu P., Stanisheuski S., Franklin R., Vogel A., Vesely C.H., Reardon P., et al. Autonomous synthesis of functional, permanently phosphorylated proteins for defining the interactome of monomeric 14-3-3ζ. ACS Cent. Sci. 2023;9:816–835. doi: 10.1021/acscentsci.3c00191.
  • 63.Jespersen N., Estelle A., Waugh N., Davey N.E., Blikstad C., Ammon Y.C., et al. Systematic identification of recognition motifs for the hub protein LC8. Life Sci. Alliance. 2019;2 doi: 10.26508/lsa.201900366.
  • 64.Zhu P., Gafken P.R., Mehl R.A., Cooley R.B. A highly versatile expression system for the production of multiply phosphorylated proteins. ACS Chem. Biol. 2019;14:1564–1572. doi: 10.1021/acschembio.9b00307.
  • 65.Frey S., Görlich D. A new set of highly efficient, tag-cleaving proteases for purifying recombinant proteins. J. Chromatogr. A. 2014;1337:95–105. doi: 10.1016/j.chroma.2014.02.029.
  • 66.Young C.L., Britton Z.T., Robinson A.S. Recombinant protein expression and purification: a comprehensive review of affinity tags and microbial applications. Biotechnol. J. 2012;7:620–634. doi: 10.1002/biot.201100155.
  • 67.Jespersen N.E., Leyrat C., Gerard F.C., Bourhis J.M., Blondel D., Jamin M., et al. The LC8-RavP ensemble structure evinces a role for LC8 in regulating lyssavirus polymerase functionality. J. Mol. Biol. 2019;431:4959–4977. doi: 10.1016/j.jmb.2019.10.011.
  • 68.Schanda P., Van Melckebeke H., Brutscher B. Speeding up three-dimensional protein NMR experiments to a few minutes. J. Am. Chem. Soc. 2006;128:9042–9043. doi: 10.1021/ja062025p.
  • 69.Delaglio F., Grzesiek S., Vuister G.W., Zhu G., Pfeifer J., Bax A. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR. 1995;6:277–293. doi: 10.1007/BF00197809.
  • 70.Johnson B.A. Using NMRView to visualize and analyze the NMR spectra of macromolecules. Methods Mol. Biol. 2004;278:313–352. doi: 10.1385/1-59259-809-9:313.
  • 71.Ying J., Delaglio F., Torchia D.A., Bax A. Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data. J. Biomol. NMR. 2017;68:101–118. doi: 10.1007/s10858-016-0072-7.
  • 72.Kjaergaard M., Poulsen F.M. Sequence correction of random coil chemical shifts: correlation between neighbor correction factors and changes in the Ramachandran distribution. J. Biomol. NMR. 2011;50:157–165. doi: 10.1007/s10858-011-9508-2.
  • 73.Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys. J. 2000;78:1606–1619. doi: 10.1016/S0006-3495(00)76713-0.
  • 74.Philo J.S. SEDNTERP: a calculation and database utility to aid interpretation of analytical ultracentrifugation and light scattering data. Eur. Biophys. J. 2023;52:233–266. doi: 10.1007/s00249-023-01629-0.
  • 75.Cole J.L. Analysis of heterogeneous interactions. Methods Enzymol. 2004;384:212–232. doi: 10.1016/S0076-6879(04)84013-8.

Associated Data

Supplementary Materials

Supplemental Figures S1–S6 and Tables S1–S3

Data Availability Statement

All data needed to evaluate the conclusions in the paper are present in the paper and/or the supporting information.


Articles from The Journal of Biological Chemistry are provided here courtesy of American Society for Biochemistry and Molecular Biology