Abstract
The SARS-CoV-2 nucleocapsid (N) protein is an abundant RNA-binding protein critical for viral genome packaging, yet the molecular details that underlie this process are poorly understood. Here we combine single-molecule spectroscopy with all-atom simulations to uncover the molecular details that contribute to N protein function. N protein contains three dynamic disordered regions that house putative transiently-helical binding motifs. The two folded domains interact minimally such that full-length N protein is a flexible and multivalent RNA-binding protein. N protein also undergoes liquid-liquid phase separation when mixed with RNA, and polymer theory predicts that the same multivalent interactions that drive phase separation also engender RNA compaction. We offer a simple symmetry-breaking model that provides a plausible route through which single-genome condensation preferentially occurs over phase separation, suggesting that phase separation offers a convenient macroscopic readout of a key nanoscopic interaction.
Subject terms: Intrinsically disordered proteins, Single-molecule biophysics, Computational models
SARS-CoV-2 nucleocapsid (N) protein is responsible for viral genome packaging. Here the authors employ single-molecule spectroscopy with all-atom simulations to provide the molecular details of N protein and show that it undergoes phase separation with RNA.
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an enveloped, positive-strand RNA virus that causes the disease COVID-19 (Coronavirus Disease-2019)1. While coronaviruses typically cause relatively mild respiratory diseases, as of February 2021 COVID-19 is on course to kill 2.5 million people since its emergence in late 2019 1–3. While recent progress in vaccine development has been remarkable, the emergence of novel coronaviruses in human populations represents a continuing threat4. As a result, therapeutic approaches that address fundamental and general viral mechanisms will offer a key route for first-line intervention against future pandemics.
A challenge in identifying candidate drugs is our relatively sparse understanding of the molecular details that underlie the function of SARS-CoV-2 proteins. As a result, there is a surge of biochemical and biophysical exploration of these proteins, with the ultimate goal of identifying proteins that are suitable targets for disruption, ideally with insight into the molecular details of how disruption could be achieved5,6.
While much attention has been focused on the Spike (S) protein, many other SARS-CoV-2 proteins play equally critical roles in viral physiology, yet we know relatively little about their structural or biophysical properties7–10. Here we performed a high-resolution structural and biophysical characterization of the SARS-CoV-2 nucleocapsid (N) protein, the protein responsible for genome packaging11,12. A large fraction of N protein is predicted to be intrinsically disordered, which constitutes a major barrier to conventional structural characterization13. To overcome these limitations, we combined single-molecule spectroscopy with all-atom simulations to build a residue-by-residue description of all three disordered regions in the context of their folded domains. The combination of single-molecule spectroscopy and simulations to reconstruct structural ensembles has been applied extensively to uncover key molecular details underlying disordered protein regions14–19. Our goal here is to provide biophysical and structural insights into the physical basis of N protein function.
In exploring the molecular properties of N protein, we discovered that it undergoes phase separation with RNA, as was also reported recently20–27. Given that N protein underlies viral packaging, we reasoned that phase separation may in fact be an unavoidable epiphenomenon that reflects physical properties necessary to drive the compaction of long genomic RNA molecules. To explore this principle further, we developed a simple physical model, which suggested that symmetry breaking through a small number of high-affinity binding sites can organize anisotropic multivalent interactions to drive single-polymer compaction, as opposed to multi-polymer phase separation. Irrespective of its physiological role, our results suggest that phase separation provides a macroscopic readout (visible droplets) of a nanoscopic process (protein:RNA and protein:protein interaction). In the context of SARS-CoV-2, those interactions are expected to be key for viral packaging, such that assays that monitor phase separation of N protein with RNA may offer a convenient route to identify compounds that will also attenuate viral assembly.
Results
Coronavirus nucleocapsid proteins are multi-domain RNA-binding proteins that play a critical role in many aspects of the viral life cycle12,28. The SARS-CoV-2 N protein shares substantial sequence conservation with other coronavirus nucleocapsid proteins (Figs. S1–5). Work on N protein from a range of model coronaviruses has shown that N protein undergoes self-association, interaction with other proteins, and interaction with RNA, all in a highly multivalent manner.
The SARS-CoV-2 N protein can be divided into five domains: a predicted intrinsically disordered N-terminal domain (NTD), an RNA-binding domain (RBD), a predicted disordered central linker (LINK), a dimerization domain, and a predicted disordered C-terminal domain (CTD) (Fig. 1A–C). While SARS-CoV-2 is a novel coronavirus, decades of work on model coronaviruses (including SARS coronavirus) have revealed a number of features expected to hold true in the SARS-CoV-2 N protein. Notably, all five domains are predicted to bind RNA29–35, and while the dimerization domain facilitates the formation of well-defined stoichiometric dimers, RNA-independent higher-order oligomerization is also expected to occur34,36–38. Importantly, protein–protein and protein–RNA interaction sites have been mapped to all three disordered regions.
Despite recent structures of the RBD (Fig. 1B) and dimerization domains (Fig. 1C) from SARS-CoV-2, the solution-state conformational behavior of the full-length protein remains elusive39–41. Understanding N protein function necessitates a mechanistic understanding of the flexible predicted disordered regions and their interplay with the folded domains. A recent small-angle X-ray study shows good agreement with previous work on SARS, suggesting the LINK is relatively extended, but neither the structural basis for this extension nor the underlying dynamics are known29,42.
Here, we address these questions by probing three full-length constructs of the N protein with fluorescent labels (Alexa 488 and 594) flanking the NTD, the LINK, and the CTD (see Fig. 1A and Table S1). These constructs allow us to quantify conformations and dynamics of the disordered regions in the context of the full-length protein using single-molecule Förster resonance energy transfer (FRET) and fluorescence correlation spectroscopy (FCS) (see SI for details). We also investigated the stability of the RBD and truncated variants of the protein to test the role of long range interactions on the disordered regions (see SI and Table S2). In parallel to the experiments, we performed all-atom Monte Carlo simulations of each of the three IDRs in isolation and in context with their adjacent folded domains.
The NTD is disordered, flexible, and transiently interacts with the RBD
We started our analysis by investigating the NTD conformations. Under native conditions, single-molecule FRET measurements revealed the occurrence of a single population with a mean transfer efficiency of 0.65 ± 0.03 (Figs. 2A and S6). To assess whether this transfer efficiency reports on a rigid distance (e.g., structure formation or persistent interaction with the RBD) or is a dynamic average across multiple conformations, we first compared the lifetime of the fluorophores with transfer efficiency. Under native conditions, the donor and acceptor lifetimes for the NTD construct lie on the line that represents fast conformational dynamics (Fig. S7A). To properly quantify the timescale associated with these fast structural rearrangements, we leveraged nanosecond FCS. As expected for a dynamic population43,44, the cross-correlation of acceptor–donor photons for the NTD is anticorrelated (Figs. 2B and S12). A global fit of the donor–donor, acceptor–acceptor, and acceptor–donor correlations yields a reconfiguration time τr = 170 ± 30 ns. This is longer than reconfiguration times observed for other proteins with a similar persistence length and charge content44–47, hinting at a large contribution from internal friction due to rapid intramolecular contacts (formed either within the NTD or with the RBD) or transient formation of short structural motifs48. A conversion from transfer efficiency to chain dimensions can be obtained by assuming the distribution of distances computed from polymer models. Assuming a Gaussian chain distribution yields a root-mean-square distance between the fluorophores r1–68 of 48 ± 2 Å. When using the recently proposed self-avoiding walk (SAW) model49 (see Supplementary Information), we compute a value of r1–68 of 47 ± 2 Å.
This corresponds to values of persistence length (see SI) equal to 4.5 ± 0.4 and 4.3 ± 0.4 Å for the Gaussian and SAW distribution, respectively, which are similar to values reported for other unfolded proteins under native conditions44–46,50. Overall, these results confirm the NTD is disordered, as predicted by sequence analysis.
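The conversion from a mean transfer efficiency to a root-mean-square distance under a Gaussian chain distribution can be illustrated with a minimal sketch. It averages the Förster relation E(r) = 1/(1 + (r/R0)^6) over the Gaussian chain distance distribution and solves numerically for the rms distance that reproduces the measured mean. The Förster radius used in the analysis is given in the SI and is not reproduced here; the value of 54 Å below is an assumed, typical figure for the Alexa 488/594 pair, and the function names are hypothetical. The result therefore need not coincide with the reported 48 ± 2 Å, which also depends on corrections described in the SI.

```python
import math

def mean_efficiency_gaussian(rms, r0, n=4000):
    """Mean FRET efficiency for a Gaussian chain with rms end-to-end distance `rms`.

    P(r) is proportional to r^2 * exp(-3 r^2 / (2 rms^2)); E(r) = 1 / (1 + (r/r0)^6).
    Integration uses a midpoint rule on [0, 6 * rms].
    """
    r_max = 6.0 * rms
    dr = r_max / n
    weighted = 0.0
    norm = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        p = r * r * math.exp(-1.5 * r * r / (rms * rms))
        weighted += p / (1.0 + (r / r0) ** 6)
        norm += p
    return weighted / norm

def rms_from_efficiency(e_mean, r0, lo=1.0, hi=500.0, tol=1e-6):
    """Bisection for the rms distance; the mean efficiency decreases as rms grows."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_efficiency_gaussian(mid, r0) > e_mean:
            lo = mid  # chain too compact, efficiency too high
        else:
            hi = mid
    return 0.5 * (lo + hi)

# ASSUMPTION: R0 = 54 Angstrom is illustrative, not the value used in the paper.
R0_ASSUMED = 54.0
rms_ntd = rms_from_efficiency(0.65, R0_ASSUMED)
print(f"rms distance for <E> = 0.65: {rms_ntd:.1f} Angstrom")
```

The same routine applies to the LINK and CTD constructs by substituting their mean transfer efficiencies; a SAW distribution would require replacing the distance distribution inside `mean_efficiency_gaussian`.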
We next examined the interaction of the NTD with other domains in the protein. We studied a truncated N protein variant that contains only the NTD and RBD domains (NTD–RBD) and samples identical labeling positions. The root-mean-square distance r1–68 is 46 ± 2 Å for both the Gaussian and SAW models, within error of the NTD-FL values, suggesting no or limited interaction between the NTD and the LINK, DIMER, and CTD domains (see Fig. S8 and Table S2). We then assessed the role of the folded RBD and its influence on the conformations of the NTD by studying the effect of a chemical denaturant on the protein. The titration with guanidinium chloride (GdmCl) reveals a decrease of transfer efficiencies when moving from native buffer conditions to 1 M GdmCl, followed by a plateau of the transfer efficiencies at concentrations between 1 and 2 M and a subsequent further decrease at higher concentrations (Figs. S6 and S8). This behavior can be understood assuming that the plateau between 1 and 2 M GdmCl represents the average of transfer efficiencies between two populations in equilibrium that have very close transfer efficiency and are not completely resolved because of shot noise. We interpret these two populations as the contribution of the folded and unfolded fractions of the RBD domain to the distances probed by the NTD-FL construct, which includes a labeling position within the folded RBD. Indeed, this interpretation is supported by a broadening in the transfer efficiency peak between 1 and 2 M GdmCl. Besides the effect of the unfolding of the RBD, the dimensions of the NTD-FL are also modulated by a change in the solvent quality when adding denaturant (Figs. 2C, S6 and S8) and this contribution to the expansion of the chain can be described using an empirical binding model51–55. A fit of the interdye root-mean-square distances to this model and the inferred stability of the RBD domain (midpoint: 1.3 ± 0.2 M; ΔG0 = (5 ± 1) kcal mol−1) are presented in Fig. 2C.
A comparative fit of the histograms assuming two overlapping populations yields a consistent result in terms of RBD stability and protein conformations (Fig. S9). To confirm the inferred RBD stability results, we directly interrogated the RBD domain by measuring a full-length construct with labels in position 68 and 172, which flanks the folded RBD structure (see section “RBD folding” in SI). Though the denaturation of the RBD reveals coexistence of up to three populations, which we identify as an unfolded, an intermediate, and a folded state, the range of the folding transition is compatible with the estimates made using the NTD constructs (midpoint: 1.68 ± 0.02 M, see Fig. S9 and Table S6).
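To make the reported RBD stability concrete, the sketch below evaluates the unfolded fraction as a function of GdmCl concentration for a two-state transition. It assumes the standard linear extrapolation model, ΔG(c) = ΔG0 − m·c with m = ΔG0 / midpoint, and a temperature of 298.15 K; the fitting model actually used is described in the SI, so both the functional form and the temperature are assumptions here, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def fraction_unfolded(c_den, dg0, midpoint, temp_k=298.15):
    """Two-state unfolded fraction at denaturant concentration c_den (M).

    ASSUMPTION: linear extrapolation model, dG(c) = dg0 - m * c,
    with the m-value fixed by m = dg0 / midpoint.
    """
    m_value = dg0 / midpoint            # kcal mol^-1 M^-1
    dg = dg0 - m_value * c_den          # unfolding free energy at c_den
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))

# Values reported for the RBD from the NTD-FL construct: dG0 = 5 kcal/mol, midpoint 1.3 M
for c in (0.0, 1.0, 1.3, 2.0, 3.0):
    print(f"{c:.1f} M GdmCl: unfolded fraction = {fraction_unfolded(c, 5.0, 1.3):.4f}")
```

Under these assumptions the RBD is almost entirely folded in native buffer and the folded and unfolded populations coexist between roughly 1 and 2 M GdmCl, the range in which the transfer efficiency plateau and peak broadening are observed.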
To better understand the sequence-dependent conformational behavior of the NTD, we turned to all-atom simulations of an NTD–RBD construct. We used a novel sequential sampling approach that integrates long timescale MD simulations performed using the Folding@home distributed computing platform with all-atom Monte Carlo simulation performed with the ABSINTH forcefield to generate an ensemble of almost 400,000 distinct conformations (see “Methods”)56–58. We also performed simulations of the NTD in isolation.
We observed good agreement between simulation and experiment for the equivalent inter-residue distance (Fig. 2D). The peaks on the left side of the histogram reflect specific simulations where the NTD engages more extensively with the RBD through a fuzzy interaction, leading to local kinetic traps59. We also identified several regions in the NTD where transient helices form, and using normalized distance maps found regions of transient attractive and repulsive interaction between the NTD and the RBD (Fig. 2E). In particular, the basic beta-strand extension from the RBD (Fig. 1B) repels the arginine-rich C-terminal region of the NTD, while a phenylalanine residue (F17) in the NTD engages with a hydrophobic face on the RBD (Fig. 2G). Finally, we noticed the arginine-rich C-terminal residues (residues 31–38) form a transient alpha helix projecting three of the four arginines in the same direction (Figs. 2F, H). These features provide molecular insight into previously reported functional observations (see “Discussion”).
The linker is highly dynamic and there is minimal interaction between the RBD and the dimerization domain
We next turned to the linker (LINK FL) construct to investigate how the disordered region modulates the interaction and dynamics between the two folded domains. Under aqueous buffer conditions, single-molecule FRET reveals the coexistence of two overlapping populations with mean transfer efficiencies of 0.55 ± 0.03 and 0.75 ± 0.03, respectively (Fig. 3A). A small change in ionic strength of the solution is sufficient to alter the equilibrium between these two populations and favor the low transfer efficiency state (see inset in Fig. 3C). Comparison of the fluorescence lifetimes and transfer efficiencies indicates that, like the NTD, the transfer efficiencies represent dynamic conformational ensembles sampled by the LINK (Fig. S7A). ns-FCS confirms fast dynamics across the measured distribution of transfer efficiencies, with a characteristic reconfiguration time τr of 120 ± 20 ns (Figs. 3B and S12). This reconfiguration time is compatible with high internal friction effects, as observed for other unstructured proteins44,45, but may also account for the drag of the surrounding domains. The root-mean-square interdye distance corresponding to the low transfer efficiency population r172–245 is equal to 55 ± 2 Å (lp = 5.4 ± 0.4 Å) when assuming a Gaussian chain distribution and 54 ± 2 Å (lp = 5.2 ± 0.4 Å) when using a SAW model (see SI). In contrast, the root-mean-square interdye distance corresponding to the high transfer efficiency population is equal to 42 ± 2 Å when assuming a Gaussian Chain distribution or 45 ± 2 Å using the SAW model (with a corresponding lp = 3.2 ± 0.3 Å and lp = 3.6 ± 0.3 Å, respectively) (see SI).
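The persistence lengths quoted alongside each rms distance can be related to chain dimensions with a short calculation. The sketch below assumes the Gaussian-chain relation ⟨r²⟩ = 2·lp·N·b, where N is the number of peptide bonds between the labeling positions and b = 3.8 Å is the Cα–Cα distance; the exact definition used in the analysis is given in the SI and may include dye-linker corrections that are ignored here. The function name is hypothetical.

```python
def persistence_length(rms, n_bonds, bond=3.8):
    """Persistence length (Angstrom) from an rms distance.

    ASSUMPTION: <r^2> = 2 * lp * N * b with b = 3.8 Angstrom per residue,
    without any correction for dye linkers.
    """
    return rms ** 2 / (2.0 * n_bonds * bond)

# Labeling positions taken from the text; rms distances are the Gaussian-chain values.
constructs = {
    "NTD (1-68)": (48.0, 68 - 1),
    "LINK low-E (172-245)": (55.0, 245 - 172),
    "LINK high-E (172-245)": (42.0, 245 - 172),
}
for name, (rms, n_bonds) in constructs.items():
    print(f"{name}: lp = {persistence_length(rms, n_bonds):.2f} Angstrom")
```

With these assumptions the computed values fall within the uncertainties of the reported persistence lengths (4.5 ± 0.4, 5.4 ± 0.4, and 3.2 ± 0.3 Å).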
Next, we addressed whether the LINK segment populates elements of persistent secondary structure or forms stable interaction with the RBD or dimerization domains. The addition of denaturant leads to the rapid loss of the high transfer efficiency population and a continuous shift of the remaining population toward lower transfer efficiencies (Figs. S6 and S8). These results correspond to an almost linear expansion of the chain in response to denaturant (see Fig. 3C).
To better understand the nature of the two populations and explain the weak dependence of the linker expansion on denaturant, we investigated the same labeling positions in the absence of the DIMER and CTD domains (LINK ΔDIMER) (Table S2). smFRET measurements of this truncated version revealed a single population that undergoes a strong compaction with decreasing GdmCl concentration (Figs. S6 and S8). Interestingly, the transfer efficiency measured in aqueous buffer is equivalent to the one reported by the high transfer efficiency population of the LINK FL construct. The electrostatic nature of this compaction is clearly revealed by titrating a polar non-ionic denaturant (urea) and observing that the chain remains largely compact and recovers the same dimensions measured in GdmCl only when adding salt to the solution (Fig. S10). Overall, the LINK ΔDIMER observations lead us to speculate that the LINK domain can either self-interact or interact with the RBD domain, whereas addition of the DIMER and CTD domains restricts these configurations and largely favors more expanded states, with the exception of very low ionic strength conditions. To further explore the configurations of the LINK, we turned again to Monte Carlo simulations.
As with the NTD, all-atom Monte Carlo simulations provide atomistic insight that can be compared with our spectroscopic results. Given the size of the system, an alternative sampling strategy to the NTD–RBD construct was pursued here that did not include MD simulations of the folded domains, but we instead performed simulations of a construct that included the RBD, LINK, and dimerization domain (but not the NTD and CTD). In addition, we also performed simulations of the LINK in isolation.
We again found good agreement between simulations and experiment (Fig. 3D). The root-mean-square inter-residue distance for the low transfer efficiency population (between simulated positions 172 and 245) is 59.1 Å, which is within the experimental error of the single-molecule observations. The normalized distance map shows a number of regions of repulsion, notably that the RBD repels the N-terminal part of the LINK and the dimerization domain repels the C-terminal part of the LINK (Fig. 3E). We tentatively suggest this may reflect sequence properties chosen to prevent aberrant interactions between the LINK and the two folded domains. In the LINK-only simulations we identified two regions that form transient helices at low populations (20–25%), although these are less prominent in the context of the full-length protein (Fig. 3F). These two helices encompass a serine–arginine (SR) rich region known to mediate both protein–protein and protein–RNA interaction. Helix H3 formation leads to the alignment of three arginine residues along one face of the helix. The second helix (H4) is a leucine/alanine-rich hydrophobic helix which may contribute to oligomerization, or act as a helical recognition motif for other protein interactions (notably as a nuclear export signal (NES) for Crm1, see “Discussion”).
The CTD engages in transient but non-negligible interactions with the dimerization domain
Finally, we again applied single-molecule FRET (Fig. 4A) and ns-FCS (Fig. 4B) to understand the conformational behavior of the CTD FL construct. Single-molecule FRET experiments again reveal a single population with a mean transfer efficiency of 0.59 ± 0.03 (Fig. 4A) and the denaturant dependence follows the expected trend for a disordered region, with a shift of the transfer efficiency toward lower values (Figs. 4C, S6 and S8), from 0.59 to 0.35. Interestingly, when studying the denaturant dependence of the protein, we noticed that the width of the distribution increases while moving toward aqueous buffer conditions. This suggests that the protein may form transient contacts or adopt local structure. Comparison with a truncated variant that contains only the CTD (Fig. S8) reveals a very similar distribution, with almost identical mean transfer efficiency but a narrower width (Fig. S6), suggesting that part of the broadening is due to interactions with the neighboring domains.
To further investigate putative interaction between the CTD and neighboring domains, we turned to the investigation of protein dynamics. Though the comparison of the fluorophore lifetimes against transfer efficiency (Fig. S7A) appears to support a dynamic nature underlying the CTD FL population, ns-FCS reveals a flat acceptor–donor cross-correlation on the nanosecond timescale (Fig. 4B). However, inspection of the donor–donor and acceptor–acceptor autocorrelations reveals a correlated decay. This is different from that expected for a completely static system such as polyprolines60, where the donor–donor and acceptor–acceptor autocorrelations are also flat. An increase in the autocorrelations can be observed for static quenching of the dyes with aromatic residues. Interestingly, donor dye quenching can also contribute to a positive amplitude in the donor–acceptor correlation61,62. Therefore, a plausible interpretation of the flat cross-correlation data is that we are observing two populations in equilibrium whose correlations (one anticorrelated, reflecting conformational dynamics, and one correlated, reflecting quenching due to contact formation) compensate each other.
To further investigate the possibility of two coexisting populations, we performed ns-FCS at increasing GdmCl concentrations. These experiments revealed a progressive increase of the anticorrelated amplitude in the cross-correlation, consistent with an increase of the dynamic population. Moreover, we also observed a simultaneous decrease in the overall donor–donor autocorrelation amplitude, consistent with a decrease in the quenched population (Fig. S12). Taken together, these results support our hypothesis that there are at least two distinct species existing in equilibrium. By analyzing the dynamic species between 0.16 and 0.6 M GdmCl, we quantified an average reconfiguration time (τr) of 64 ± 7 ns for the dynamic population in the CTD. Under the assumption that the mean transfer efficiency still originates (at least partially) from a dynamic distribution, the estimate of the inter-residue root-mean-square distance is r363–419 = 51 ± 2 Å (lp = 6.1 ± 0.5 Å) for a Gaussian chain distribution and r363–419 = 48 ± 1 Å (lp = 5.4 ± 0.4 Å) for the SAW model (see SI). However, some caution should be used when interpreting these numbers since we know there is some contribution from fluorophore static quenching, which may in turn contribute to an underestimate of the effective transfer efficiency63.
We again obtained good agreement between all-atom Monte Carlo simulations and experiments (Fig. 4D). Scaling maps reveal extensive intramolecular interaction by the residues that make up H6, both in terms of local intra-IDR interactions and interaction with the dimerization domain (Fig. 4E). We identified two transient helices: one (H5) is minimally populated, but the second (H6) is more highly populated in the IDR-only simulation and still present at ~20% in the folded state simulations (Fig. 4F). The difference reflects the fact that several of the helix-forming residues interact with the dimerization domain, leading to a competition between helix formation and intramolecular interaction. Mapping normalized distances onto the folded structure reveals that interactions occur primarily with the N-terminal portion of the dimerization domain (Fig. 4G). As with the LINK and the NTD, a positively charged set of residues immediately adjacent to the folded domain in the CTD drives repulsion between this region and the dimerization domain. H6 is the most robust helix observed across all three IDRs, and is a perfect amphipathic helix with a hydrophobic surface on one side and charged/polar residues on the other (Fig. 4H). The cluster of hydrophobic residues in H6 engages in intramolecular contacts and offers a likely physical explanation for the complex ns-FCS data in aqueous buffer.
N protein undergoes phase separation with RNA
Over the last decade, biomolecular condensates formed through phase separation have emerged as a new mode of cellular organization64–67. Many of the proteins that have been shown to drive phase separation in vitro are RNA-binding proteins with intrinsically disordered regions64,68. Moreover, multivalency is the key molecular feature that determines if a biomolecule can undergo higher-order assembly69. Having characterized N protein to reveal three IDRs with distinct binding sites for both protein–protein and protein–RNA interactions, we recognized that N protein possesses all of the features consistent with a protein that may undergo phase separation. With these results in hand, we anticipated that N protein would undergo phase separation with RNA70–72.
In line with this expectation, we observed robust droplet formation with homopolymeric RNA (Fig. 5A, B) under aqueous buffer conditions, both at 50 mM Tris and at a higher salt concentration of 50 mM NaCl. Turbidity assays at different concentrations of protein and poly(rU) (200–250 nucleotides) demonstrate the classical re-entrant phase behavior expected for a system undergoing heterotypic interaction (Fig. 5C, D). It is to be noted that turbidity experiments do not exhaustively cover all the conditions for phase separation and are only indicative of the low-boundary concentration regime explored in the current experiments. In particular, turbidity experiments do not provide a measurement of tie-lines, though they are inherently a reflection of the free energy and chemical potential of the solution mixture73. Interestingly, phase separation occurs at relatively low concentrations, in the low μM range, which are compatible with physiological concentration of the protein and nucleic acids. Though increasing salt concentration results in an upshift of the phase boundaries, one has to consider that in a cellular environment this effect might be counteracted by cellular crowding.
One peculiar characteristic of our measured phase diagram is the narrow regime of conditions in which we observe phase separation of nonspecific RNA at a fixed concentration of protein. This leads us to hypothesize that the protein may have evolved to maintain tight control of concentrations at which phase separation can (or cannot) occur. Interestingly, when rescaling the turbidity curves as a ratio between protein and RNA, we find all the curve maxima aligning at a similar stoichiometry, approximately 20 nucleotides per protein in the absence of added salt and 30 nucleotides when adding 50 mM NaCl (Fig. S13). These ratios are in line with the charge neutralization criterion proposed by Banerjee et al.74, since the estimated net charge of the protein at pH 7.4 is +24. Finally, given we observed phase separation with poly(rU), the behavior we are observing is likely driven by relatively nonspecific protein:RNA interactions. In agreement, work from a number of other groups has also established this phenomenon across a range of solution conditions and RNA types20–27.
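The charge neutralization argument reduces to simple arithmetic, sketched below. It takes the estimated net charge of +24 per protein from the text and assumes that each RNA nucleotide contributes a charge of −1 from its phosphate group, ignoring counterion condensation and any pH or salt dependence; the function names are hypothetical.

```python
def neutral_ratio(protein_charge=24, charge_per_nt=-1):
    """Nucleotides per protein needed for overall charge neutrality.

    ASSUMPTION: one unit of negative charge per nucleotide.
    """
    return protein_charge / abs(charge_per_nt)

def net_charge_per_protein(nt_per_protein, protein_charge=24, charge_per_nt=-1):
    """Net charge of one protein plus its share of nucleotides."""
    return protein_charge + charge_per_nt * nt_per_protein

print("neutral at", neutral_ratio(), "nt per protein")
# Observed turbidity maxima: ~20 nt/protein (no added salt) and ~30 nt/protein (50 mM NaCl)
for ratio in (20, 30):
    print(f"{ratio} nt per protein -> net charge {net_charge_per_protein(ratio):+d}")
```

Under this assumption neutrality falls at 24 nucleotides per protein, which lies between the two observed stoichiometries.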
Having established phase separation through a number of assays, we wondered what—if any—physiological relevance this may have for the normal biology of SARS-CoV-2.
A simple polymer model shows symmetry breaking can facilitate multiple metastable single-polymer condensates instead of a single multi-polymer condensate
Why might phase separation of N protein with RNA be advantageous to SARS-CoV-2? One possible model is that large, micron-sized cytoplasmic condensates of N protein and RNA form through phase separation and facilitate genome packaging. These condensates may act as molecular factories that help concentrate the components for pre-capsid assembly (where we define a pre-capsid here simply as a species that contains a single copy of the genome with multiple copies of the associated N protein), a model that has been proposed in other viruses75.
However, given that phase separation is unavoidable when high concentrations of multivalent species are combined, we propose that an alternative interpretation of our data is that in this context, phase separation is simply an inevitable epiphenomenon that reflects the inherent multivalency of the N protein for itself and for RNA. This poses questions about the origin of specificity for viral genomic RNA (gRNA), and, of focus in our study, how phase separation might relate to a single-genome packaging through RNA compaction.
Given the expectation of a single genome per virion, we reasoned SARS-CoV-2 might have evolved a mechanism to limit phase separation with gRNA (i.e., to avoid multi-genome condensates), with a preference instead for single-genome packaging (single-genome condensates). This mechanism may exist in competition with the intrinsic phase separation of the N protein with other nonspecific RNAs (nsRNA).
One possible way to limit phase separation between two components (e.g., gRNA/nsRNA and N protein) is to ensure the levels of these components are held at a sufficiently low total concentration such that the phase boundary is never crossed. While possible, such a regulatory mechanism is at the mercy of extrinsic factors that may substantially modulate the saturation concentration76–78. Furthermore, not only must phase separation be prevented, but gRNA compaction should also be promoted through the binding of N protein. In this scenario, the affinity between gRNA and N protein plays a central role in determining the required concentration for condensation of the macromolecule (gRNA) by the ligand (N protein).
Given a system composed of components with defined valencies, phase boundaries are encoded by the strength of interaction between the interacting domains in the components. Considering a long polymer (e.g., gRNA) with proteins adsorbed onto that polymer as adhesive points (stickers), the physics of associative polymers predicts that the same interactions that cause phase separation will also control the condensation of individual long polymers69,70,79–82. With this in mind, we hypothesized that phase separation is reporting on the physical interactions that underlie genome compaction.
To explore this hypothesis, we developed a simple computational model where the interplay between compaction and phase separation could be explored. Our setup consists of two types of species: long multivalent polymers and short multivalent binders (Fig. 6A). All interactions are isotropic, and each bead is inherently multivalent as a result. In the simplest instantiation of this model, favorable polymer:binder and binder:binder interactions are encoded, mimicking the scenario in which a binder (e.g., a protein) can engage in nonspecific polymer (RNA) interaction as well as binder–binder (protein–protein) interaction. As expected for simulations of binders with homopolymers, we observed phase separation in a concentration-dependent manner (Fig. 6B, E). Phase separation gives rise to a single large spherical cluster with multiple polymers and binders (Fig. 6D, H–L).
Given our homopolymers undergo robust phase separation, we wondered if a break in the symmetry between intra- and intermolecular interactions would be enough to promote single-polymer condensation in the same concentration regime over which we had previously observed phase separation. Symmetry breaking in our model is achieved through a single high-affinity-binding site (Fig. 6A). We chose this particular mode of symmetry breaking to mimic the presence of a packaging signal (a region of the genome that is essential for efficient viral packaging), an established feature in many viruses (including coronaviruses), although we emphasize this is a general model, as opposed to trying to directly model gRNA with a packaging signal83–85.
We performed identical simulations to those in Fig. 6C, D using the same system with polymers that now possess a single high-affinity binding site (Fig. 6E). Under these conditions we did not observe large phase separated droplets (Fig. 6F). Instead, each individual polymer undergoes collapse to form a single-polymer condensate (Fig. 6E). Collapse is driven by the recruitment of binders to the high-affinity site, where they coat the chain, forming a local cluster of binders on the polymer. This cluster is then able to interact with the remaining regions of the polymer through weak nonspecific interactions, the same interactions that drove phase separation in Fig. 6B–D. Symmetry breaking is achieved because the local concentration of binder around the site is high, such that intramolecular interactions are favored over intermolecular interaction. This high local concentration also drives compaction at low binder concentrations. As a result, instead of a single multi-polymer condensate, we observe multiple single-polymer condensates, where the absolute number matches the number of polymers in the system (Fig. 6G).
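The ingredients of such a model can be illustrated with a minimal sketch: a contact energy table in which one polymer bead (the high-affinity site) binds binders more strongly than the rest of the chain, together with a Metropolis acceptance test. This is not the simulation code used in this work. The cubic-lattice nearest-neighbour contact rule, the bead type names, and the energy values (in units of kT) are all hypothetical choices made for illustration.

```python
import math
import random

# ASSUMPTION: illustrative contact energies in units of kT, not the paper's parameters.
CONTACT_ENERGY = {
    ("polymer", "binder"): -0.4,  # weak nonspecific polymer:binder attraction
    ("binder", "binder"): -0.4,   # weak binder:binder attraction
    ("site", "binder"): -3.0,     # single high-affinity site (symmetry breaking)
}

def pair_energy(type_a, type_b):
    """Symmetric lookup; pairs not listed do not interact."""
    return CONTACT_ENERGY.get((type_a, type_b), CONTACT_ENERGY.get((type_b, type_a), 0.0))

def in_contact(pos_a, pos_b):
    """Nearest neighbours on a cubic lattice (Manhattan distance of 1)."""
    return sum(abs(a - b) for a, b in zip(pos_a, pos_b)) == 1

def total_energy(beads):
    """beads: list of (bead_type, (x, y, z)) tuples."""
    energy = 0.0
    for i in range(len(beads)):
        for j in range(i + 1, len(beads)):
            if in_contact(beads[i][1], beads[j][1]):
                energy += pair_energy(beads[i][0], beads[j][0])
    return energy

def metropolis_accept(delta_e, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-delta_e)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e)

# A binder next to the high-affinity site is far more stable than one on a generic bead.
at_site = total_energy([("site", (0, 0, 0)), ("binder", (1, 0, 0))])
at_generic = total_energy([("polymer", (0, 0, 0)), ("binder", (1, 0, 0))])
print(at_site, at_generic)
```

In a full simulation, trial moves of beads would be proposed, the energy change computed with `total_energy`, and the move kept or rejected with `metropolis_accept`. Setting the `("site", "binder")` entry equal to the generic polymer:binder value recovers the homopolymer case.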
The high-affinity-binding site polarizes the single-polymer condensates, such that they are organized, recalcitrant to fusion, and kinetically metastable. To illustrate this metastable nature, extended simulations using an approximate kinetic Monte Carlo scheme demonstrated that a high-affinity-binding site dramatically slows the formation of multichain assemblies, but that ultimately these are the thermodynamically optimal configuration (Fig. S18). A convenient physical analogy is that of micelles, which are non-stoichiometric stable assemblies. Even for micelles that are far from their optimal size, fusion is slow because it requires substantial molecular reorganization and the breaking of stable interactions86,87.
Finally, we ran simulations under conditions in which binder:polymer interactions were reduced, mimicking the scenario in which nonspecific protein:RNA interactions are inhibited (Fig. 6L). Under these conditions no phase separation occurs for polymers that lack a high-affinity-binding site, while for polymers with a high-affinity-binding site no chain compaction occurs (in contrast to when binder:polymer interactions are present, see Fig. 6J). This result illustrates how phase separation offers a convenient readout for molecular interactions that might otherwise be challenging to measure.
We emphasize that our conclusions from these coarse-grained simulations are subject to the parameters in our model. We present these results to demonstrate an example of how this single-genome packaging could be achieved, offering a class of mechanism that may be in play. This is in contrast to the much stronger statement that this is how it is achieved, a statement that would require much more evidence to make. Recent elegant work by Ranganathan and Shakhnovich88 identified kinetically arrested microclusters, where slow kinetics result from the saturation of stickers within those clusters. This is completely analogous to our results (albeit with homotypic interactions, rather than heterotypic interactions), giving us confidence that the physical principles uncovered are robust and, we tentatively suggest, quite general. Future simulations are required to systematically explore the details of the relevant parameter space in our system. However, regardless of those parameters, our model does establish that if weak multivalent interactions underlie the formation of large multi-polymer droplets, those same interactions cannot also drive polymer compaction inside the droplet.
Discussion
The nucleocapsid (N) protein from SARS-CoV-2 is a multivalent RNA-binding protein critical for viral replication and genome packaging11,12. To better understand how the various folded and disordered domains interact with one another, we applied single-molecule spectroscopy and all-atom simulations to perform a detailed biophysical dissection of the protein, uncovering several putative interaction motifs. Furthermore, based on both sequence analysis and our single-molecule experiments, we anticipated that N protein would undergo phase separation with RNA. In agreement with this prediction, and in line with work from the Gladfelter and Yildiz groups working independently from us, we find that N protein robustly undergoes phase separation in vitro with model RNA under a range of different salt conditions. Using simple polymer models, we propose that the same interactions that drive phase separation may also drive genome packaging into a dynamic, single-genome condensate. The formation of single-genome condensates (as opposed to multi-genome droplets) is influenced by the presence of one (or more) symmetry-breaking interaction sites, which we tentatively suggest could reflect packaging signals in viral genomes.
All three IDRs are highly dynamic
Our single-molecule experiments and all-atom simulations are in good agreement with one another and reveal that all three IDRs are extended and, depending on solution condition, highly dynamic. Simulations suggest the NTD may interact transiently with the RBD, which offers an explanation for the slightly slowed reconfiguration time measured by nanosecond FCS. The LINK shows rapid rearrangement, demonstrating the RBD and dimerization domain are not interacting. Finally, we see a pronounced interaction between the CTD and the dimerization domain, although these interactions are still highly transient.
Single-molecule experiments and all-atom simulations were performed on monomeric versions of the protein, yet N protein has previously been shown to undergo dimerization and form higher-order oligomers in the absence of RNA36. To assess the formation of oligomeric species, we used a combination of NativePAGE, crosslinking, and FCS experiments (see Fig. S14 and SI). These experiments, together with the comparison between full-length and truncated variants, suggest that in the concentration regime used for single-molecule experiments the protein exists as a monomer.
Simulations identify multiple transient helices
We identified a number of transient helical motifs that provide structural insight into previously characterized molecular interactions. Transient helices are ubiquitous in viral disordered regions and have been shown to underlie molecular interactions in a range of systems75,89–91. While the application of molecular simulations to identify transient helices in disordered regions can suffer from forcefield inaccuracies, it is worth noting that in prior work we have found good agreement between experimental and simulated secondary structure analysis across a range of systems explored in an analogous manner70,92–94.
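As a minimal sketch of how transient helices can be located in a simulated ensemble, the Python below computes a per-residue helical fraction from per-frame secondary-structure strings and reports contiguous stretches above a cutoff. The input format (one string per frame, with 'H' marking helix, as in DSSP-style assignment), the function names, and the cutoff and minimum-length values are assumptions made for illustration; they are not the analysis settings used in this work.

```python
def helical_fraction(ss_frames, helix_codes=("H",)):
    """Fraction of frames in which each residue is assigned as helix.

    ss_frames: list of equal-length strings, one per simulation frame,
               with one secondary-structure code per residue.
    """
    if not ss_frames:
        raise ValueError("need at least one frame")
    n_res = len(ss_frames[0])
    counts = [0] * n_res
    for frame in ss_frames:
        if len(frame) != n_res:
            raise ValueError("all frames must have the same length")
        for i, code in enumerate(frame):
            if code in helix_codes:
                counts[i] += 1
    return [c / len(ss_frames) for c in counts]


def helical_segments(fractions, threshold=0.1, min_length=4):
    """Contiguous residue ranges (inclusive, 0-based) above the threshold.

    threshold and min_length are hypothetical cutoffs.
    """
    segments = []
    start = None
    for i, f in enumerate(fractions):
        if f >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(fractions) - start >= min_length:
        segments.append((start, len(fractions) - 1))
    return segments
```

Because a transient helix is by definition only partially populated, the choice of cutoff determines which motifs are reported, and any segment found this way is best treated as a candidate to be checked against experiment.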
Transient helix H2 (in the NTD) and H3 (in the LINK) flank the RBD and organize a set of arginine residues to face the same direction (Figs. 2H and 3F). Both the NTD and LINK have been shown to drive RNA binding, such that we propose these helical arginine-rich motifs (ARMs) may engage in both nonspecific binding and may also contribute to RNA specificity, as has been proposed previously29,95,96. The serine–arginine SR region (which includes H3) has been previously identified as engaging in interaction with a structured acidic helix in Nsp3 in the model coronavirus MHV, consistent with an electrostatic helical interaction97,98. Recent NMR data also show excellent agreement with our results, identifying a transient helix that shows 1:1 overlap with H3 24. The SR region is necessary for recruitment to replication-transcription centers in MHV, and also undergoes phosphorylation, setting the stage for a complex regulatory system awaiting exploration99,100.
Transient helix H4 (in the LINK, Fig. 3F) was previously predicted bioinformatically and identified as a conserved feature across different coronaviruses, in agreement with our own secondary structure predictions (Fig. S19)29. Furthermore, the equivalent region was identified in SARS coronavirus as a NES, such that we suspect this too is a classical Crm1-binding leucine-rich NES101. Jack et al.20 identified helix H4 as enriched for homotypic cross-links in the context of droplets, supporting a model in which this region promotes protein:protein interactions, an interpretation corroborated by hydrogen–deuterium exchange mass spectrometry on RBD–LINK in the dilute phase26.
Concerning the CTD, two transient helices are identified, helix H5 and H6. While transient helix H5 is weakly populated, the positive charge associated with this region may make it critical for protein:RNA interaction, a result strongly supported by the observation that deletion of this region ablates protein:RNA phase separation20. Transient helix H6 is an amphipathic helix with a highly hydrophobic face (Fig. 4H). Recent hydrogen–deuterium exchange mass spectrometry also identified H6 41. Residues in this region have previously been identified as mediating M protein binding in other coronaviruses, such that we propose H6 underlies that interaction21,102–104. Recent work has also identified amphipathic transient helices in disordered proteins as interacting directly with membranes, such that an additional (albeit entirely speculative) role could involve direct membrane interaction, as has been observed in other viral phosphoproteins105,106.
As a final note, while these helices are conserved between SARS, SARS-CoV-2, and in many bat-coronaviruses, they are less well conserved in MHV and MERS, suggesting these regions are malleable over evolution (Fig. S1/3/5).
The physiological relevance of nucleocapsid protein phase separation in SARS-CoV-2 physiology
Our work has revealed that SARS-CoV-2 N protein undergoes phase separation with RNA when reconstituted in vitro. The solution environment and types of RNA used in our experiments are very different from the cytoplasm and viral RNA. However, similar results have been obtained in published and unpublished work by several other groups under a variety of conditions, including via in-cell experiments20–27. Taken together, these results demonstrate that N protein can undergo bona fide phase separation, and that N protein condensates can form in cells. Nevertheless, the complexity introduced by multidimensional linkage effects in vivo could substantially influence the phase behavior and composition of condensates observed in the cell78,81,107. Of note, the regime we have identified in which phase separation occurs (Fig. 5) is relatively narrow, consistent with a model in which single-genome condensates for virion assembly are favored over larger multi-genome droplets.
Does phase separation play a physiological role in SARS-CoV-2 biology? Phase separation has been invoked or suggested in a number of viral contexts to date108–114. In SARS-CoV-2, one possible model suggests phase separation may drive recruitment of components to viral replication sites, although how this dovetails with the fact that replication occurs in double-membrane-bound vesicles (DMVs) remains to be explored24,115. An alternative (and non-mutually exclusive) model is one in which phase separation catalyzes nucleocapsid polymerization, as has been proposed in elegant work on measles virus75. Here, the process of phase separation is decoupled from genome packaging, where gRNA condensation occurs through association with a helical nucleocapsid. If applied to SARS-CoV-2, such a model would suggest that (1) initially N protein and RNA phase separate in the cytosol, (2) some discrete pre-capsid state forms within condensates, and (3) upon maturation, the pre-capsid is released from the condensate and undergoes subsequent virion assembly by interacting with the membrane-bound M, E, and S structural proteins at the ER–Golgi intermediate compartment (ERGIC). While this model is attractive it places a number of constraints on the physical properties of this pre-capsid, not least that the ability to escape the parent condensate dictates that the assembled pre-capsid must interact less strongly with the condensate components than in the unassembled state. This requirement introduces some thermodynamic complexities: how is a pre-capsid state driven to assemble if it is necessarily less stable than the unassembled pre-capsid, and how is incomplete or abortive pre-capsid formation avoided if—as assembly occurs—the pre-capsid becomes progressively less stable?
A phase separation and assembly model raises additional questions, such as the origins of specificity for recruitment of viral proteins and viral RNA, the kinetics of pre-capsid-assembly within a large condensate, and preferential packaging of gRNA over sub-genomic RNA. None of these questions are unanswerable, nor do they invalidate this model, but they should be addressed if the physiological relevance of large cytoplasmic condensates is to be further explored in the context of virion assembly.
Our preferred interpretation is that N protein has evolved to drive genome compaction for packaging (Fig. 7). In this model, a single-genome condensate forms through N protein:gRNA interaction, driven by a small number of high-affinity sites. This (meta)-stable single-genome condensate undergoes subsequent maturation, leading to virion assembly. In this model, condensate-associated N proteins are in exchange with a bulk pool of soluble N protein, such that the interactions that drive compaction are heterogeneous and dynamic. Our model provides a physical mechanism in good empirical agreement with data for N protein oligomerization and assembly116–118. Furthermore, the resulting condensate is then in effect a multivalent binder for M protein, which interacts with N directly, and may drive membrane curvature and budding in a manner similar to that proposed by Bergeron-Sandoval and Michnick (though with a different directionality of the force) and in line with recent observations from cryo-electron tomography (cryoET)115,119–121.
An open question pertains to specificity of packaging gRNA while excluding other RNAs. One possibility is for two high-affinity N-protein-binding sites to flank the 5′ and 3′ ends of the genome, whereby only RNA molecules with both sites are competent for compaction. A recent map of N protein binding to gRNA has revealed high-affinity-binding regions at the 5′ and 3′ ends of the gRNA, in good agreement with this qualitative prediction22. Alternatively, only gRNA condensates may possess the requisite valency for N protein binding to drive virion assembly through interaction with M protein at the cytoplasmic side of the ERGIC, offering a physical selection mechanism for budding.
Genome compaction through dynamic multivalent interactions would be especially relevant for coronaviruses, which have extremely large single-stranded RNA genomes. This is evolutionarily appealing, in that as the genome grows larger, compaction becomes increasingly efficient, as the effective valence of the genome is increased69,80. The ability of multivalent disordered proteins to drive RNA compaction has been observed previously in various contexts14,122. Furthermore, genome compaction by RNA-binding proteins has been proposed and observed in other viruses118,123,124, and the SARS coronavirus N protein has previously been shown to act as an RNA chaperone, an expected consequence of compaction to a dynamic single-RNA condensate that accommodates multiple N proteins with a single RNA14,125. Furthermore, previous work exploring the ultrastructure of phase separated condensates of G3BP1 and RNA through simulations and cryoET revealed a beads-on-a-string type architecture, mirroring recent results obtained from cryo-electron tomography of SARS-CoV-2 virions71,115.
N protein has been shown to interact directly with a number of proteins studied in the context of biological phase separation, which may influence assembly in vivo5,23,70,77,126. In particular, G3BP1, an essential stress-granule protein that undergoes phase separation, was recently shown to co-localize with overexpressed N protein24,71,77,127,128. G3BP1 interaction may be part of the innate immune response, leading to stress-granule formation, or alternatively N protein may attenuate the stress response by sequestering G3BP1, depleting the cytosolic pool, and preventing stress-granule formation, as has been shown for HIV-1 and very recently proposed explicitly for SARS-CoV-2 112,128.
Our model is also in good empirical agreement with recent observations made for other viruses129. Taken together, we speculate that viral packaging may—in general—involve an initial genome compaction through multivalent protein:RNA and protein:protein interactions, followed by a liquid-to-solid transition in cases where well-defined crystalline capsid structures emerge. Liquid-to-solid transitions are well established in the context of neurodegeneration with respect to disease progression130–132. Here we suggest nature is leveraging those same principles as an evolved mechanism for monodisperse particle assembly.
Regardless of whether phase separated condensates form inside cells, all available evidence suggests phase separation is reporting on a physiologically important interaction that underlies genome compaction (Fig. 6L). With this in mind, from a biotechnology standpoint, phase separation may be a convenient readout for in vitro assays to interrogate protein:RNA interaction. Regardless of which model is correct, N protein:RNA interaction is key for viral replication. As such, phase separation provides a macroscopic reporter on a nanoscopic phenomenon, in line with previous work70,80,133,134. In this sense, we propose that the therapeutic implications of understanding and modulating phase separation here (and elsewhere in biology) are conveniently decoupled from the physiological relevance of actual, large phase separated liquid droplets, and instead offer a window into the underlying physical interactions that lead to condensate formation20.
The physics of single-polymer condensates
Depending on the molecular details, single-polymer condensates may be kinetically stable (but thermodynamically unstable, as in our model simulations) or thermodynamically stable. Delineation between these two scenarios will depend on the nature, strength, valency, and anisotropy of the interactions. It is worth noting that from the perspective of functional biology, kinetic stability may be essentially indistinguishable from thermodynamic stability, depending on the lifetime of a metastable species.
It is also important to emphasize that at higher concentrations of N protein and/or after a sufficiently long time period we expect robust phase separation with viral RNA, regardless of the presence of a symmetry-breaking site. Symmetry breaking is achieved when the apparent local concentration of N protein (from the perspective of gRNA) is substantially higher than the actual global concentration. As effective local and global concentrations approach one another, the entropic cost of intramolecular interaction is outweighed by the availability of intermolecular partners. On a practical note, if the readout in question is the presence/absence of liquid droplets, a high-affinity site may be observed as a shift in the saturation concentration which, confusingly, could either suppress or enhance phase separation. Further, if single-genome condensates are kinetically stable and driven through electrostatic interactions, we would expect a complex temperature dependence, in which larger droplets are observed at higher temperature (up to some threshold). Recent work showing a strong temperature dependence of phase separation is consistent with these predictions22.
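The comparison between apparent local and global concentration can be made concrete with a back-of-the-envelope calculation in lattice units. In the sketch below, the two-bead binder and the 70-site box edge follow the coarse-grained setup described in the Methods, whereas the binder counts and the cluster radius are hypothetical numbers chosen only for illustration, and the cluster is approximated as a continuous sphere rather than a set of lattice sites.

```python
import math


def global_fraction(n_binders, beads_per_binder, box_edge):
    """Binder-bead volume fraction averaged over the whole simulation box."""
    return n_binders * beads_per_binder / box_edge ** 3


def local_fraction(n_bound, beads_per_binder, radius):
    """Binder-bead volume fraction inside a sphere around the polymer.

    radius is in lattice units; the sphere volume is a continuum
    approximation to the number of enclosed lattice sites.
    """
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return n_bound * beads_per_binder / volume


def enrichment(n_binders, n_bound, radius, beads_per_binder=2, box_edge=70):
    """Ratio of apparent local to global binder concentration."""
    return (local_fraction(n_bound, beads_per_binder, radius)
            / global_fraction(n_binders, beads_per_binder, box_edge))


if __name__ == "__main__":
    # Hypothetical example: 200 binders in the box, 20 of them
    # clustered within 5 lattice units of the high-affinity site.
    print(enrichment(n_binders=200, n_bound=20, radius=5.0))
```

In this picture the enrichment falls toward one as the total number of binders increases at fixed cluster size, which corresponds to the regime in which intermolecular partners become as available as intramolecular ones.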
Finally, we note that there is no reason to assume single-RNA condensates should be exclusively the purview of viruses. RNAs in eukaryotic cells may also be processed in these types of assemblies, as opposed to in large multi-RNA RNPs. The role of RNA:RNA interactions both here and in other systems is also of particular interest; it is not explored in our current work, but we anticipate it may play a key role in the relevant biology.
Methods
All-atom simulations
All-atom Monte Carlo simulations were performed with the ABSINTH implicit solvent model (abs_3.2_opls.prm) and CAMPARI simulation engine (V2) (http://campari.sourceforge.net/)56,135 with the solution ion parameters of Mao et al.136. Simulations were performed using movesets and Hamiltonian parameters as reported previously70,137. All simulations were performed in sufficiently large box sizes to prevent finite-size effects (where box size varies from system to system). For simulations with IDRs in isolation all degrees of freedom available in CAMPARI are sampled. For simulations with folded domains with IDRs, the backbone dihedral angles in folded domains are not sampled, such that folded domains remain structurally fixed (although sidechains are fully sampled). The IDR has backbone and sidechain degrees of freedom sampled. Simulation sequences used are defined in SI Table S7.
All-atom molecular dynamics simulations were performed using GROMACS (version 5.0.4), using the FAST algorithm in conjunction with the Folding@home platform57,138,139. Post-simulation analysis was performed with Enspara140. For additional simulation details see the Supplementary Information.
Coarse-grained polymer simulations
Coarse-grained Monte Carlo simulations were performed using the PIMMS simulation engine141. All simulations were performed in a 70 × 70 × 70 lattice-site box. The results were averaged over the final 20% of each simulation to give average values at equivalent states. The polymer species is represented as a 61-residue polymer either with or without a central high-affinity binding site. The binder is a two-bead species. All simulations shown in Fig. 6 were run for 20 × 10⁹ Monte Carlo steps, with four independent replicas. Bead interaction strengths were defined as shown in Fig. 6A. For additional simulation details see SI.
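The averaging protocol described above (the final 20% of each trajectory, then across independent replicas) can be sketched as follows. This is an illustrative reimplementation in plain Python, not code from PIMMS; the function names and the idea of passing each replica as a list of per-snapshot values (for example, the size of the largest cluster) are assumptions.

```python
from statistics import mean


def tail_average(series, fraction=0.2):
    """Mean of the last `fraction` of a per-snapshot observable."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_tail = max(1, int(round(len(series) * fraction)))
    return mean(series[-n_tail:])


def replica_average(replicas, fraction=0.2):
    """Average the tail means of several independent replicas."""
    if not replicas:
        raise ValueError("need at least one replica")
    return mean(tail_average(r, fraction) for r in replicas)


if __name__ == "__main__":
    # Made-up observable from four hypothetical replicas.
    replicas = [
        [1, 2, 4, 8, 9, 10, 10, 10, 10, 10],
        [1, 3, 5, 7, 9, 9, 10, 10, 9, 11],
        [2, 2, 3, 6, 8, 10, 10, 11, 10, 10],
        [1, 1, 4, 7, 9, 10, 9, 10, 11, 9],
    ]
    print(replica_average(replicas))
```

Discarding the early part of each trajectory is a common way to limit the influence of the starting configuration, although it does not by itself establish that the retained portion is equilibrated.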
Protein expression, purification, and labeling
SARS-CoV-2 Nucleocapsid protein (NCBI Reference Sequence: YP_009724397.2) including an N-terminal extension containing a His9-HRV 3C protease site was cloned into the BamHI and EcoRI sites in the MCS of the pGEX-6P-1 vector (GE Healthcare). Site-directed mutagenesis was performed on the His9-SARS-CoV-2 Nucleocapsid pGEX vector to create the N protein constructs (SI Table S1) and sequences were verified using Sanger sequencing. All variants were expressed recombinantly in BL21 Codon-plus pRIL cells (Agilent) or Gold BL21(DE3) cells (Agilent) and purified using a FF HisTrap column. The GST-His9-N tag was then cleaved using HRV 3C protease and further purified to remove the cleaved tag. Finally, purified N protein variants were analyzed using SDS-PAGE and verified by electrospray ionization mass spectrometry (LC-MS). Activity of the protein was assessed by testing whether the protein is able to bind and condense nucleic acids (see phase-separation experiments) as well as to form dimers (see oligomerization in SI).
All nucleocapsid variants were labeled with Alexa Fluor 488 maleimide and Alexa Fluor 594 maleimide (Molecular Probes) under denaturing conditions following a two-step sequential labeling procedure (see SI).
Single-molecule fluorescence spectroscopy
Single-molecule fluorescence measurements were performed with a Picoquant MT200 instrument (Picoquant, Germany). FRET experiments were performed by exciting the donor dye with a laser power of 100 μW (measured at the back aperture of the objective). For pulsed interleaved excitation of donor and acceptor, the power used for exciting the acceptor dye was adjusted to match the acceptor emission intensity to that of the donor (between 50 and 70 mW). Single-molecule FRET efficiency histograms were acquired from samples with protein concentrations between 50 and 100 pM and the population with stoichiometry corresponding to 1:1 donor:acceptor labeling was selected. Trigger times for excitation pulses (repetition rate 20 MHz) and photon detection events were stored with 16 ps resolution. For FRET-FCS, samples of double-labeled protein with a concentration of 100 pM were excited by either the diode laser or the supercontinuum laser at the powers indicated above.
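To illustrate how a 1:1 donor:acceptor population is typically selected in pulsed interleaved excitation experiments, the sketch below computes the uncorrected proximity ratio and stoichiometry for each fluorescence burst from three photon counts. It is a simplified illustration and not the analysis software used in this work: background, crosstalk, direct excitation, and detection-efficiency corrections are omitted, and the stoichiometry window and function names are hypothetical.

```python
def stoichiometry(n_dd, n_da, n_aa):
    """Uncorrected stoichiometry S of one burst.

    n_dd: donor-channel photons after donor excitation
    n_da: acceptor-channel photons after donor excitation
    n_aa: acceptor-channel photons after acceptor excitation
    """
    total = n_dd + n_da + n_aa
    if total == 0:
        raise ValueError("burst contains no photons")
    return (n_dd + n_da) / total


def proximity_ratio(n_dd, n_da):
    """Uncorrected FRET efficiency estimate E of one burst."""
    donor_excited = n_dd + n_da
    if donor_excited == 0:
        raise ValueError("no photons after donor excitation")
    return n_da / donor_excited


def select_double_labeled(bursts, s_min=0.3, s_max=0.7):
    """Proximity ratios of bursts whose S lies in a window around 0.5.

    s_min and s_max are hypothetical cutoffs. Donor-only bursts
    (S near 1) and acceptor-only bursts (S near 0) are discarded.
    """
    efficiencies = []
    for n_dd, n_da, n_aa in bursts:
        if n_dd + n_da + n_aa == 0:
            continue  # skip empty bursts
        if s_min <= stoichiometry(n_dd, n_da, n_aa) <= s_max:
            efficiencies.append(proximity_ratio(n_dd, n_da))
    return efficiencies
```

A stoichiometry near 0.5 corresponds to 1:1 labeling only when the donor and acceptor signals are balanced, which is the purpose of matching the acceptor emission intensity to that of the donor as described above.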
All samples were prepared in 50 mM Tris pH 7.32, 143 mM β-mercaptoethanol (for photoprotection), 0.001% Tween 20 (for limiting surface adhesion) and GdmCl at the reported concentrations. All measurements were performed in uncoated polymer coverslip cuvettes (Ibidi, Wisconsin, USA) and custom-made glass cuvettes coated with PEG (see SI). Each sample was measured for at least 30 min at room temperature (295 ± 0.5 K).
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We thank Amy Gladfelter, Christiane Iserman, Christine Roden, Ahmet Yildiz, Amanda Jack, Luke Ferro, Steve Michnick, Pascale Legault, and Jim Omichinski for sharing data and extensive discussion. We also thank Rohit Pappu for placing our groups in contact with one another. We thank the labs of John Cooper, Carl Frieden, and Silvia Jansen for providing some of the reagents we have used in this work. We thank Ben Schuler and Daniel Nettels for developing, maintaining, and sharing with us the software package used to analyze the single-molecule data. J.C. and J.J.A. are supported by NIGMS R25 IMSD Training Grant GM103757. We are grateful to the citizen-scientists of Folding@home for donating their computing resources. G.R.B. holds an NSF CAREER Award MCB-1552471, NIH R01GM12400701, a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, and a Packard Fellowship for Science and Engineering from The David and Lucile Packard Foundation. A.S. holds NIH grant R01AG062837. A.S.H. is supported by the Longer Life Foundation: an RGA/Washington University Collaboration.
Author contributions
J.C. designed, expressed, and purified the constructs, performed the single-molecule spectroscopy and oligomerization experiments, analyzed the corresponding data, and wrote the manuscript. J.J.A. performed coarse-grained simulations and wrote the manuscript. J.J.I. performed turbidity experiments and wrote the manuscript. M.D.S.B. designed the constructs for single-molecule spectroscopy experiments, supervised protein expression and purification and oligomerization experiments, and wrote the manuscript. S.S., M.D.W., M.I.Z. and N.V. set up, curated, analyzed, and managed molecular dynamics simulations on both local resources and the Folding@Home supercomputer. D.G. performed bioinformatic analysis. J.A.W. performed theoretical analysis. G.R.B. acquired funding. K.B.H. wrote the manuscript. A.S. conceived of the study, analyzed data, wrote the manuscript, and acquired funding. A.S.H. conceived of the study, analyzed data, performed and analyzed all-atom Monte Carlo simulations and coarse-grained simulations, wrote the manuscript, and acquired funding. G.R.B., K.B.H., A.S. and A.S.H. jointly supervised the work.
Data availability
Data supporting the findings in this paper are available from the corresponding authors upon request. All-atom simulation data for Monte Carlo simulations and disorder prediction info are provided at https://github.com/holehouse-lab/supportingdata/tree/master/2021/cubuk_nucleocapsid_2021. Simulations and simulation analysis were performed with open source tools (http://campari.sourceforge.net/, https://camparitraj.readthedocs.io/, http://mdtraj.org/, https://www.gromacs.org/) and Folding@Home data are available for further analysis at https://covid.molssi.org//org-contributions/#folding--home.
Competing interests
A.S.H. is a scientific consultant with Dewpoint Therapeutics. This affiliation in no way influenced the content of this study. All other authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewers for their contributions to the peer review of this work. Peer review reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Andrea Soranno, Email: soranno@wustl.edu.
Alex S. Holehouse, Email: alex.holehouse@wustl.edu.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-21953-3.
References
- 1. Zhu N, et al. A novel coronavirus from patients with pneumonia in China, 2019. N. Engl. J. Med. 2020;382:727–733. doi: 10.1056/NEJMoa2001017.
- 2. Corman, V. M., Muth, D., Niemeyer, D. & Drosten, C. In Advances in Virus Research (eds. Kielian, M. et al.) Ch. 8, Vol. 100 163–188 (Academic Press, 2018).
- 3. Roser, M., Ritchie, H., Ortiz-Ospina, E. & Hasell, J. Coronavirus Pandemic (COVID-19). Our World in Data (2020).
- 4. Lurie N, Saville M, Hatchett R, Halton J. Developing Covid-19 vaccines at pandemic speed. N. Engl. J. Med. 2020;382:1969–1973. doi: 10.1056/NEJMp2005630.
- 5. Gordon, D. E. et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. doi: 10.1038/s41586-020-2286-9 (2020).
- 6. Sanders, J. M., Monogue, M. L., Jodlowski, T. Z. & Cutrell, J. B. Pharmacologic treatments for coronavirus disease 2019 (COVID-19): a review. JAMA. doi: 10.1001/jama.2020.6019 (2020).
- 7. Walls AC, et al. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell. 2020;181:281–292.e6. doi: 10.1016/j.cell.2020.02.058.
- 8. Hoffmann M, et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020;181:271–280.e8. doi: 10.1016/j.cell.2020.02.052.
- 9. Shang J, et al. Structural basis of receptor recognition by SARS-CoV-2. Nature. 2020;581:221–224. doi: 10.1038/s41586-020-2179-y.
- 10. Lan J, et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020;581:215–220. doi: 10.1038/s41586-020-2180-5.
- 11. Masters PS. Coronavirus genomic RNA packaging. Virology. 2019;537:198–207. doi: 10.1016/j.virol.2019.08.031.
- 12. Laude, H. & Masters, P. S. In The Coronaviridae (ed. Siddell, S. G.) 141–163 (Springer US, 1995).
- 13. van der Lee R, et al. Classification of intrinsically disordered regions and proteins. Chem. Rev. 2014;114:6589–6631. doi: 10.1021/cr400525m.
- 14. Holmstrom ED, Liu Z, Nettels D, Best RB, Schuler B. Disordered RNA chaperones can enhance nucleic acid folding via local charge screening. Nat. Commun. 2019;10:2453. doi: 10.1038/s41467-019-10356-0.
- 15. Borgia A, et al. Extreme disorder in an ultrahigh-affinity protein complex. Nature. 2018;555:61. doi: 10.1038/nature25762.
- 16. Dimura M, et al. Quantitative FRET studies and integrative modeling unravel the structure and dynamics of biomolecular systems. Curr. Opin. Struct. Biol. 2016;40:163–185. doi: 10.1016/j.sbi.2016.11.012.
- 17. Fuertes G, et al. Decoupling of size and shape fluctuations in heteropolymeric sequences reconciles discrepancies in SAXS vs. FRET measurements. Proc. Natl Acad. Sci. USA. 2017;114:E6342–E6351. doi: 10.1073/pnas.1704692114.
- 18. Warner JB, 4th, et al. Monomeric Huntingtin exon 1 has similar overall structural features for wild-type and pathological polyglutamine lengths. J. Am. Chem. Soc. 2017;139:14456–14469. doi: 10.1021/jacs.7b06659.
- 19. Chung HS, Piana-Agostinetti S, Shaw DE, Eaton WA. Structural origin of slow diffusion in protein folding. Science. 2015;349:1504–1510. doi: 10.1126/science.aab1369.
- 20. Jack, A. et al. SARS CoV-2 nucleocapsid protein forms condensates with viral genomic RNA. Preprint at bioRxiv. doi: 10.1101/2020.09.14.295824 (2020).
- 21. Lu S, et al. The SARS-CoV-2 Nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein. Nat. Commun. 2021;12:502. doi: 10.1038/s41467-020-20768-y.
- 22. Iserman, C. et al. Genomic RNA elements drive phase separation of the SARS-CoV-2 nucleocapsid. Mol. Cell. doi: 10.1016/j.molcel.2020.11.041 (2020).
- 23. Perdikari TM, et al. SARS-CoV-2 nucleocapsid protein phase-separates with RNA and with human hnRNPs. EMBO J. 2020;39:e106478. doi: 10.15252/embj.2020106478.
- 24. Savastano A, Ibáñez de Opakua A, Rankovic M, Zweckstetter M. Nucleocapsid protein of SARS-CoV-2 phase separates into RNA-rich polymerase-containing condensates. Nat. Commun. 2020;11:6041. doi: 10.1038/s41467-020-19843-1.
- 25. Carlson, C. R. et al. Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions. Mol. Cell 80, 1092–1103.e4 (2020).
- 26. Wu, C. et al. Characterization of SARS-CoV-2 N protein reveals multiple functional consequences of the C-terminal domain. Cold Spring Harb. Lab. doi: 10.1101/2020.11.30.404905 (2020).
- 27. Chen H, et al. Liquid–liquid phase separation by SARS-CoV-2 nucleocapsid protein and RNA. Cell Res. 2020;30:1143–1145. doi: 10.1038/s41422-020-00408-2.
- 28. McBride R, van Zyl M, Fielding BC. The coronavirus nucleocapsid is a multifunctional protein. Viruses. 2014;6:2991–3018. doi: 10.3390/v6082991.
- 29. Chang C-K, et al. Multiple nucleic acid binding sites and intrinsic disorder of severe acute respiratory syndrome coronavirus nucleocapsid protein: implications for ribonucleocapsid protein packaging. J. Virol. 2009;83:2255–2264. doi: 10.1128/JVI.02001-08.
- 30. Grossoehme NE, et al. Coronavirus N protein N-terminal domain (NTD) specifically binds the transcriptional regulatory sequence (TRS) and melts TRS-cTRS RNA duplexes. J. Mol. Biol. 2009;394:544–557. doi: 10.1016/j.jmb.2009.09.040.
- 31. Cui L, et al. The nucleocapsid protein of coronaviruses acts as a viral suppressor of RNA silencing in mammalian cells. J. Virol. 2015;89:9029–9043. doi: 10.1128/JVI.01331-15.
- 32. Takeda M, et al. Solution structure of the c-terminal dimerization domain of SARS coronavirus nucleocapsid protein solved by the SAIL-NMR method. J. Mol. Biol. 2008;380:608–622. doi: 10.1016/j.jmb.2007.11.093.
- 33.Jayaram H, et al. X-ray structures of the N- and C-terminal domains of a coronavirus nucleocapsid protein: implications for nucleocapsid formation. J. Virol. 2006;80:6612–6620. doi: 10.1128/JVI.00157-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Yu I-M, et al. Recombinant severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its C-terminal domain. J. Biol. Chem. 2005;280:23280–23286. doi: 10.1074/jbc.M501015200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Luo H, Chen J, Chen K, Shen X, Jiang H. Carboxyl terminus of severe acute respiratory syndrome coronavirus nucleocapsid protein: self-association analysis and nucleic acid binding characterization. Biochemistry. 2006;45:11827–11835. doi: 10.1021/bi0609319. [DOI] [PubMed] [Google Scholar]
- 36.Chang C-K, Chen C-MM, Chiang M-H, Hsu Y-L, Huang T-H. Transient oligomerization of the SARS-CoV N protein–implication for virus ribonucleoprotein packaging. PLoS ONE. 2013;8:e65045. doi: 10.1371/journal.pone.0065045. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Robbins SG, Frana MF, McGowan JJ, Boyle JF, Holmes KV. RNA-binding proteins of coronavirus MHV: detection of monomeric and multimeric N protein with an RNA overlay-protein blot assay. Virology. 1986;150:402–410. doi: 10.1016/0042-6822(86)90305-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.He R, et al. Analysis of multimerization of the SARS coronavirus nucleocapsid protein. Biochem. Biophys. Res. Commun. 2004;316:476–483. doi: 10.1016/j.bbrc.2004.02.074. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Kang, S. et al. Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites. Acta Pharm. Sin. B10.1016/j.apsb.2020.04.009 (2020). [DOI] [PMC free article] [PubMed]
- 40.Zinzula, L., Nagy, M. O. & Bracher, A. 1.45 Angstrom resolution crystal structure of C-terminal dimerization domain of nucleocapsid phosphoprotein from SARS-CoV-2 (PDB: 6YUN). Protein Data Bank (2020).
- 41.Ye Q, West AMV, Silletti S, Corbett KD. Architecture and self‐assembly of the SARS‐CoV‐2 nucleocapsid protein. Protein Sci. 2020;29:1890–1901. doi: 10.1002/pro.3909. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Zeng W, et al. Biochemical characterization of SARS-CoV-2 nucleocapsid protein. Biochem. Biophys. Res. Commun. 2020;527:618–623. doi: 10.1016/j.bbrc.2020.04.136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Nettels D, et al. Single-molecule spectroscopy of the temperature-induced collapse of unfolded proteins. Proc. Natl Acad. Sci. USA. 2009;106:20740–20745. doi: 10.1073/pnas.0900622106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Soranno A, et al. Quantifying internal friction in unfolded and intrinsically disordered proteins with single-molecule spectroscopy. Proc. Natl Acad. Sci. USA. 2012;109:17800–17806. doi: 10.1073/pnas.1117368109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Borgia A, et al. Localizing internal friction along the reaction coordinate of protein folding by combining ensemble and single-molecule fluorescence spectroscopy. Nat. Commun. 2012;3:1195. doi: 10.1038/ncomms2204. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Schuler B, Soranno A, Hofmann H, Nettels D. Single-molecule FRET spectroscopy and the polymer physics of unfolded and intrinsically disordered proteins. Annu. Rev. Biophys. 2016;45:207–231. doi: 10.1146/annurev-biophys-062215-010915. [DOI] [PubMed] [Google Scholar]
- 47.Soranno, A., Cabassi, F. & Orselli, M. E. Dynamics of structural elements of GB1 β-hairpin revealed by tryptophan–cysteine contact formation experiments. J. Phys. Chem. B122, 11468–11477 (2018). [DOI] [PubMed]
- 48.Soranno A, et al. Integrated view of internal friction in unfolded proteins from single-molecule FRET, contact quenching, theory, and simulations. Proc. Natl Acad. Sci. USA. 2017;114:E1833–E1839. doi: 10.1073/pnas.1616672114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Zheng W, et al. Inferring properties of disordered chains from FRET transfer efficiencies. The Journal of Chemical Physics. 2018;148:123329. doi: 10.1063/1.5006954. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Soranno, A., Zosel, F. & Hofmann, H. Internal friction in an intrinsically disordered protein—comparing Rouse-like models with experiments. J. Chem. Phys. 148, 123326 (2018). [DOI] [PubMed]
- 51.Schellman JA. Selective binding and solvent denaturation. Biopolymers. 1987;26:549–559. doi: 10.1002/bip.360260408. [DOI] [PubMed] [Google Scholar]
- 52.Hofmann H, et al. Polymer scaling laws of unfolded and intrinsically disordered proteins quantified with single-molecule spectroscopy. Proc. Natl Acad. Sci. USA. 2012;109:16155–16160. doi: 10.1073/pnas.1207719109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Borgia A, et al. Consistent view of polypeptide chain expansion in chemical denaturants from multiple experimental methods. J. Am. Chem. Soc. 2016;138:11714–11726. doi: 10.1021/jacs.6b05917. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Zheng W, et al. Probing the action of chemical denaturant on an intrinsically disordered protein by simulation and experiment. J. Am. Chem. Soc. 2016;138:11702–11713. doi: 10.1021/jacs.6b05443. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Aznauryan M, et al. Comprehensive structural and dynamical view of an unfolded protein from the combination of single-molecule FRET, NMR, and SAXS. Proc. Natl Acad. Sci. USA. 2016;113:E5389–E5398. doi: 10.1073/pnas.1607193113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Vitalis A, Pappu RV. ABSINTH: a new continuum solvation model for simulations of polypeptides in aqueous solutions. J. Comput. Chem. 2009;30:673–699. doi: 10.1002/jcc.21005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Shirts M, Pande VS. Screen savers of the world unite! Science. 2000;290:1903–1904. doi: 10.1126/science.290.5498.1903. [DOI] [PubMed] [Google Scholar]
- 58.Zimmerman, M. I. et al. Citizen scientists create an exascale computer to combat COVID-19. Preprint at bioRxiv10.1101/2020.06.27.175430 (2020).
- 59.Tompa P, Fuxreiter M. Fuzzy complexes: polymorphism and structural disorder in protein–protein interactions. Trends Biochem. Sci. 2008;33:2–8. doi: 10.1016/j.tibs.2007.10.003. [DOI] [PubMed] [Google Scholar]
- 60.Nettels D, Gopich IV, Hoffmann A, Schuler B. Ultrafast dynamics of protein collapse from single-molecule photon statistics. Proc. Natl Acad. Sci. USA. 2007;104:2655–2660. doi: 10.1073/pnas.0611093104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Sauer M, Neuweiler H. PET-FCS: probing rapid structural fluctuations of proteins and nucleic acids by single-molecule fluorescence quenching. Methods Mol. Biol. 2014;1076:597–615. doi: 10.1007/978-1-62703-649-8_27. [DOI] [PubMed] [Google Scholar]
- 62.Haenni D, Zosel F, Reymond L, Nettels D, Schuler B. Intramolecular distances and dynamics from the combined photon statistics of single-molecule FRET and photoinduced electron transfer. J. Phys. Chem. B. 2013;117:13015–13028. doi: 10.1021/jp402352s. [DOI] [PubMed] [Google Scholar]
- 63.Zosel F, Haenni D, Soranno A, Nettels D, Schuler B. Combining short- and long-range fluorescence reporters with simulations to explore the intramolecular dynamics of an intrinsically disordered protein. J. Chem. Phys. 2017;147:152708. doi: 10.1063/1.4992800. [DOI] [PubMed] [Google Scholar]
- 64.Banani SF, Lee HO, Hyman AA, Rosen MK. Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell Biol. 2017;18:285–298. doi: 10.1038/nrm.2017.7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Shin, Y. & Brangwynne, C. P. Liquid phase condensation in cell physiology and disease. Science357, eaaf4382 (2017). [DOI] [PubMed]
- 66.Brangwynne CP, et al. Germline P granules are liquid droplets that localize by controlled dissolution/condensation. Science. 2009;324:1729–1732. doi: 10.1126/science.1172046. [DOI] [PubMed] [Google Scholar]
- 67.Li P, et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483:336–340. doi: 10.1038/nature10879. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Lin Y, Protter DSW, Rosen MK, Parker R. Formation and maturation of phase-separated liquid droplets by RNA-binding proteins. Mol. Cell. 2015;60:208–219. doi: 10.1016/j.molcel.2015.08.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Choi J-M, Holehouse AS, Pappu RV. Physical principles underlying the complex biology of intracellular phase transitions. Annu. Rev. Biophys. 2020;49:107–133. doi: 10.1146/annurev-biophys-121219-081629. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Martin EW, et al. Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science. 2020;367:694–699. doi: 10.1126/science.aaw8653. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Guillén-Boixet J, et al. RNA-induced conformational switching and clustering of G3BP drive stress granule assembly by condensation. Cell. 2020;181:346–361.e17. doi: 10.1016/j.cell.2020.03.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Wang J, et al. A molecular grammar governing the driving forces for phase separation of prion-like RNA binding proteins. Cell. 2018;174:688–699.e16. doi: 10.1016/j.cell.2018.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Stockmayer WH. Light scattering in multi‐component systems. J. Chem. Phys. 1950;18:58–61. doi: 10.1063/1.1747457. [DOI] [Google Scholar]
- 74.Banerjee PR, Milin AN, Moosa MM, Onuchic PL, Deniz AA. Reentrant phase transition drives dynamic substructure formation in ribonucleoprotein droplets. Angew. Chem. Int. Ed. Engl. 2017;56:11354–11359. doi: 10.1002/anie.201703191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Guseva S, et al. Measles virus nucleo- and phosphoproteins form liquid-like phase-separated compartments that promote nucleocapsid assembly. Sci. Adv. 2020;6:eaaz7095. doi: 10.1126/sciadv.aaz7095. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Posey, A. E., Holehouse, A. S. & Pappu, R. V. In Methods in Enzymology (ed. Rhoades, E.) Ch. 1, Vol. 611, 1–30 (Academic Press, 2018).
- 77.Sanders DW, et al. Competing protein-RNA interaction networks control multiphase intracellular organization. Cell. 2020;181:306–324.e28. doi: 10.1016/j.cell.2020.03.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Riback JA, et al. Composition-dependent thermodynamics of intracellular phase separation. Nature. 2020;581:209–214. doi: 10.1038/s41586-020-2256-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Semenov AN, Rubinstein M. Thermoreversible gelation in solutions of associative polymers. 1. Statics. Macromolecules. 1998;31:1373–1385. doi: 10.1021/ma970616h. [DOI] [Google Scholar]
- 80.Rubinstein, M. & Colby, R. H. Polymer Physics (Oxford University Press, 2003).
- 81.Choi J-M, Dar F, Pappu RV. LASSI: a lattice model for simulating phase transitions of multivalent proteins. PLoS Comput. Biol. 2019;15:e1007028. doi: 10.1371/journal.pcbi.1007028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Post CB, Zimm BH. Internal condensation of a single DNA molecule. Biopolymers. 1979;18:1487–1501. doi: 10.1002/bip.1979.360180612. [DOI] [Google Scholar]
- 83.Hsieh P-K, et al. Assembly of severe acute respiratory syndrome coronavirus RNA packaging signal into virus-like particles is nucleocapsid dependent. J. Virol. 2005;79:13848–13855. doi: 10.1128/JVI.79.22.13848-13855.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Woo K, Joo M, Narayanan K, Kim KH, Makino S. Murine coronavirus packaging signal confers packaging to nonviral RNA. J. Virol. 1997;71:824–827. doi: 10.1128/JVI.71.1.824-827.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Cologna R, Hogue BG. Identification of a bovine coronavirus packaging signal. J. Virol. 2000;74:580–583. doi: 10.1128/JVI.74.1.580-583.2000. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Pool R, Bolhuis PG. Sampling the kinetic pathways of a micelle fusion and fission transition. J. Chem. Phys. 2007;126:244703. doi: 10.1063/1.2741513. [DOI] [PubMed] [Google Scholar]
- 87.Denkova AG, Mendes E, Coppens M-O. Non-equilibrium dynamics of block copolymer micelles in solution: recent insights and open questions. Soft Matter. 2010;6:2351–2357. doi: 10.1039/c001175b. [DOI] [Google Scholar]
- 88.Ranganathan, S. & Shakhnovich, E. I. Dynamic metastable long-living droplets formed by sticker-spacer proteins. Elife9, e56159 (2020). [DOI] [PMC free article] [PubMed]
- 89.Leyrat C, et al. The N0-binding region of the vesicular stomatitis virus phosphoprotein is globally disordered but contains transient α-helices. Protein Sci. 2011;20:542–556. doi: 10.1002/pro.587. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Feuerstein S, et al. Transient structure and SH3 interaction sites in an intrinsically disordered fragment of the hepatitis C virus protein NS5A. J. Mol. Biol. 2012;420:310–323. doi: 10.1016/j.jmb.2012.04.023. [DOI] [PubMed] [Google Scholar]
- 91.Jensen MR, et al. Quantitative conformational analysis of partially folded proteins from residual dipolar couplings: application to the molecular recognition element of Sendai virus nucleoprotein. J. Am. Chem. Soc. 2008;130:8055–8061. doi: 10.1021/ja801332d. [DOI] [PubMed] [Google Scholar]
- 92.Martin EW, et al. Sequence determinants of the conformational properties of an intrinsically disordered protein prior to and upon multisite phosphorylation. J. Am. Chem. Soc. 2016;138:15323–15335. doi: 10.1021/jacs.6b10272. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Das RK, Crick SL, Pappu RV. N-terminal segments modulate the α-helical propensities of the intrinsically disordered basic regions of bZIP proteins. J. Mol. Biol. 2012;416:287–299. doi: 10.1016/j.jmb.2011.12.043. [DOI] [PubMed] [Google Scholar]
- 94.Harmon TS, et al. GADIS: Algorithm for designing sequences to achieve target secondary structure profiles of intrinsically disordered proteins. Protein Eng. Des. Sel. 2016;29:339–346. doi: 10.1093/protein/gzw034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Bayer TS, Booth LN, Knudsen SM, Ellington AD. Arginine-rich motifs present multiple interfaces for specific binding by RNA. RNA. 2005;11:1848–1857. doi: 10.1261/rna.2167605. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Battiste JL, et al. Alpha helix-RNA major groove recognition in an HIV-1 rev peptide-RRE RNA complex. Science. 1996;273:1547–1551. doi: 10.1126/science.273.5281.1547. [DOI] [PubMed] [Google Scholar]
- 97.Hurst KR, Koetzner CA, Masters PS. Characterization of a critical interaction between the coronavirus nucleocapsid protein and nonstructural protein 3 of the viral replicase-transcriptase complex. J. Virol. 2013;87:9159–9172. doi: 10.1128/JVI.01275-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Hurst KR, Ye R, Goebel SJ, Jayaraman P, Masters PS. An interaction between the nucleocapsid protein and a component of the replicase-transcriptase complex is crucial for the infectivity of coronavirus genomic RNA. J. Virol. 2010;84:10276–10288. doi: 10.1128/JVI.01287-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Verheije MH, et al. The coronavirus nucleocapsid protein is dynamically associated with the replication-transcription complexes. J. Virol. 2010;84:11575–11579. doi: 10.1128/JVI.00569-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Surjit M, et al. The severe acute respiratory syndrome coronavirus nucleocapsid protein is phosphorylated and localizes in the cytoplasm by 14-3-3-mediated translocation. J. Virol. 2005;79:11476–11486. doi: 10.1128/JVI.79.17.11476-11486.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Timani KA, et al. Nuclear/nucleolar localization properties of C-terminal nucleocapsid protein of SARS coronavirus. Virus Res. 2005;114:23–34. doi: 10.1016/j.virusres.2005.05.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Kuo L, Masters PS. Genetic evidence for a structural interaction between the carboxy termini of the membrane and nucleocapsid proteins of mouse hepatitis virus. J. Virol. 2002;76:4987–4999. doi: 10.1128/JVI.76.10.4987-4999.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Hurst KR, et al. A major determinant for membrane protein interaction localizes to the carboxy-terminal domain of the mouse coronavirus nucleocapsid protein. J. Virol. 2005;79:13285–13297. doi: 10.1128/JVI.79.21.13285-13297.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Verma S, Bednar V, Blount A, Hogue BG. Identification of functionally important negatively charged residues in the carboxy end of mouse hepatitis coronavirus A59 nucleocapsid protein. J. Virol. 2006;80:4344–4355. doi: 10.1128/JVI.80.9.4344-4355.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Brass V, et al. An amino-terminal amphipathic α-helix mediates membrane association of the hepatitis C virus nonstructural protein 5A. J. Biol. Chem. 2002;277:8130–8139. doi: 10.1074/jbc.M111289200. [DOI] [PubMed] [Google Scholar]
- 106.Braun AR, Lacy MM, Ducas VC, Rhoades E, Sachs JN. α-Synuclein’s uniquely long amphipathic helix enhances its membrane binding and remodeling capacity. J. Membr. Biol. 2017;250:183–193. doi: 10.1007/s00232-017-9946-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Wyman, J. & Gill, S. J. Binding and Linkage: Functional Chemistry of Biological Macromolecules (University Science Books, 1990).
- 108.Nikolic J, et al. Negri bodies are viral factories with properties of liquid organelles. Nat. Commun. 2017;8:58. doi: 10.1038/s41467-017-00102-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Metrick, C. M., Koenigsberg, A. L. & Heldwein, E. E. Conserved outer tegument component UL11 from herpes simplex virus 1 is an intrinsically disordered, RNA-binding protein. MBio10.1128/mBio.00810-20 (2020). [DOI] [PMC free article] [PubMed]
- 110.Heinrich, B. S., Maliga, Z., Stein, D. A., Hyman, A. A. & Whelan, S. P. J. Phase transitions drive the formation of vesicular stomatitis virus replication compartments. MBio10.1128/mBio.02290-17 (2018). [DOI] [PMC free article] [PubMed]
- 111.Zhou, Y., Su, J. M., Samuel, C. E. & Ma, D. Measles virus forms inclusion bodies with properties of liquid organelles. J. Virol. 10.1128/JVI.00948-19 (2019). [DOI] [PMC free article] [PubMed]
- 112.Monette A, et al. Pan-retroviral nucleocapsid-mediated phase separation regulates genomic RNA positioning and trafficking. Cell Rep. 2020;31:107520. doi: 10.1016/j.celrep.2020.03.084. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Nevers Q, Albertini AA, Lagaudrière-Gesbert C, Gaudin Y. Negri bodies and other virus membrane-less replication compartments. Biochim. Biophys. Acta Mol. Cell Res. 2020;1867:118831. doi: 10.1016/j.bbamcr.2020.118831. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Monette, A. & Mouland, A. J. Zinc and copper ions differentially regulate prion-like phase separation dynamics of pan-virus nucleocapsid biomolecular condensates. Viruses12, 1179 (2020). [DOI] [PMC free article] [PubMed]
- 115.Klein S, et al. SARS-CoV-2 structure and replication characterized by in situ cryo-electron tomography. Nat. Commun. 2020;11:5885. doi: 10.1038/s41467-020-19619-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Cong Y, Kriegenburg F, de Haan CAM, Reggiori F. Coronavirus nucleocapsid proteins assemble constitutively in high molecular oligomers. Sci. Rep. 2017;7:5740. doi: 10.1038/s41598-017-06062-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Chang C-K, Hou M-H, Chang C-F, Hsiao C-D, Huang T-H. The SARS coronavirus nucleocapsid protein–forms and functions. Antivir. Res. 2014;103:39–50. doi: 10.1016/j.antiviral.2013.12.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Borodavka A, Tuma R, Stockley PG. A two-stage mechanism of viral RNA compaction revealed by single molecule fluorescence. RNA Biol. 2013;10:481–489. doi: 10.4161/rna.23838. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.He R, et al. Characterization of protein–protein interactions between the nucleocapsid protein and membrane protein of the SARS coronavirus. Virus Res. 2004;105:121–125. doi: 10.1016/j.virusres.2004.05.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Bergeron-Sandoval, L.-P. et al. Endocytosis caused by liquid-liquid phase separation of proteins. Preprint at bioRxiv10.1101/145664 (2018).
- 121.Bergeron-Sandoval L-P, Michnick SW. Mechanics, structure and function of biopolymer condensates. J. Mol. Biol. 2018;430:4754–4761. doi: 10.1016/j.jmb.2018.06.023. [DOI] [PubMed] [Google Scholar]
- 122.Holmstrom ED, Nettels D, Schuler B. Conformational plasticity of hepatitis C virus core protein enables RNA-induced formation of nucleocapsid-like particles. J. Mol. Biol. 2018;430:2453–2467. doi: 10.1016/j.jmb.2017.10.010. [DOI] [PubMed] [Google Scholar]
- 123.Rodríguez L, Cuesta I, Asenjo A, Villanueva N. Human respiratory syncytial virus matrix protein is an RNA-binding protein: binding properties, location and identity of the RNA contact residues. J. Gen. Virol. 2004;85:709–719. doi: 10.1099/vir.0.19707-0. [DOI] [PubMed] [Google Scholar]
- 124.Linger BR, Kunovska L, Kuhn RJ, Golden BL. Sindbis virus nucleocapsid assembly: RNA folding promotes capsid protein dimerization. RNA. 2004;10:128–138. doi: 10.1261/rna.5127104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Zúñiga S, et al. Coronavirus nucleocapsid protein is an RNA chaperone. Virology. 2007;357:215–227. doi: 10.1016/j.virol.2006.07.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Luo H, et al. The nucleocapsid protein of SARS coronavirus has a high binding affinity to the human cellular heterogeneous nuclear ribonucleoprotein A1. FEBS Lett. 2005;579:2623–2628. doi: 10.1016/j.febslet.2005.03.080. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Yang P, et al. G3BP1 is a tunable switch that triggers phase separation to assemble stress granules. Cell. 2020;181:325–345.e28. doi: 10.1016/j.cell.2020.03.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Nabeel-Shah, S. et al. SARS-CoV-2 nucleocapsid protein attenuates stress granule formation and alters gene expression via direct interaction with host mRNAs. Preprint at bioRxiv10.1101/2020.10.23.342113 (2020).
- 129.van Rosmalen MGM, et al. Revealing in real-time a multistep assembly mechanism for SV40 virus-like particles. Sci. Adv. 2020;6:eaaz1639. doi: 10.1126/sciadv.aaz1639. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Patel A, et al. A liquid-to-solid phase transition of the ALS protein FUS accelerated by disease mutation. Cell. 2015;162:1066–1077. doi: 10.1016/j.cell.2015.07.047. [DOI] [PubMed] [Google Scholar]
- 131.Alberti S, Dormann D. Liquid-liquid phase separation in disease. Annu. Rev. Genet. 2019;53:171–194. doi: 10.1146/annurev-genet-112618-043527. [DOI] [PubMed] [Google Scholar]
- 132.Weber SC, Brangwynne CP. Getting RNA and protein in phase. Cell. 2012;149:1188–1191. doi: 10.1016/j.cell.2012.05.022. [DOI] [PubMed] [Google Scholar]
- 133.Dignon GL, Zheng W, Best RB, Kim YC, Mittal J. Relation between single-molecule properties and phase behavior of intrinsically disordered proteins. Proc. Natl Acad. Sci. USA. 2018;115:9929–9934. doi: 10.1073/pnas.1804177115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Zeng X, Holehouse AS, Chilkoti A, Mittag T, Pappu RV. Connecting coil-to-globule transitions to full phase diagrams for intrinsically disordered proteins. Biophys. J. 2020;119:402–418. doi: 10.1016/j.bpj.2020.06.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 135.Mittal, A., Das, R. K., Vitalis, A. & Pappu, R. V. In Computational Approaches to Protein Dynamics: From Quantum to Coarse-Grained Methods (ed. Fuxreiter, M.) 188–222 (CRC Press, 2015).
- 136.Mao AH, Pappu RV. Crystal lattice properties fully determine short-range interaction parameters for alkali and halide ions. J. Chem. Phys. 2012;137:064104. doi: 10.1063/1.4742068. [DOI] [PubMed] [Google Scholar]
- 137.Sherry KP, Das RK, Pappu RV, Barrick D. Control of transcriptional activity by design of charge patterning in the intrinsically disordered RAM region of the Notch receptor. Proc. Natl Acad. Sci. USA. 2017;114:E9243–E9252. doi: 10.1073/pnas.1706083114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Abraham MJ, et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25. doi: 10.1016/j.softx.2015.06.001. [DOI] [Google Scholar]
- 139.Zimmerman MI, Bowman GR. FAST conformational searches by balancing exploration/exploitation trade-offs. J. Chem. Theory Comput. 2015;11:5747–5757. doi: 10.1021/acs.jctc.5b00737. [DOI] [PubMed] [Google Scholar]
- 140.Porter JR, Zimmerman MI, Bowman GR. Enspara: modeling molecular ensembles with scalable data structures and parallel computing. J. Chem. Phys. 2019;150:044108. doi: 10.1063/1.5063794. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 141.Holehouse, A. S. & Pappu, R. V. PIMMS (0.24 pre-beta). 10.5281/zenodo.3588456 (2019).
- 142.Mészáros B, Erdos G, Dosztányi Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res. 2018;46:W329–W337. doi: 10.1093/nar/gky384. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl Acad. Sci. USA. 2001;98:10037–10041. doi: 10.1073/pnas.181342398. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.Holehouse AS, Garai K, Lyle N, Vitalis A, Pappu RV. Quantitative assessments of the distinct contributions of polypeptide backbone amides versus side chain groups to chain expansion via chemical denaturation. J. Am. Chem. Soc. 2015;137:2984–2995. doi: 10.1021/ja512062h. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
Data supporting the findings in this paper are available from the corresponding authors upon request. All-atom Monte Carlo simulation data and disorder prediction information are provided at https://github.com/holehouse-lab/supportingdata/tree/master/2021/cubuk_nucleocapsid_2021. Simulations and simulation analysis were performed with open-source tools (http://campari.sourceforge.net/, https://camparitraj.readthedocs.io/, http://mdtraj.org/, https://www.gromacs.org/), and Folding@Home data are available for further analysis at https://covid.molssi.org//org-contributions/#folding--home.