PLOS Computational Biology. 2021 Aug 5;17(8):e1009286. doi: 10.1371/journal.pcbi.1009286

Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants

Natália Teruel 1, Olivier Mailhot 1,2, Rafael J Najmanovich 1,*
Editor: Roland L Dunbrack Jr 3
PMCID: PMC8384204  PMID: 34351895

Abstract

The SARS-CoV-2 Spike protein needs to be in an open-state conformation to interact with ACE2 to initiate viral entry. We utilise coarse-grained normal mode analysis to model the dynamics of Spike and calculate transition probabilities between states for 17081 variants including experimentally observed variants. Our results correctly model an increase in open-state occupancy for the more infectious D614G via an increase in flexibility of the closed-state and decrease of flexibility of the open-state. We predict the same effect for several mutations on glycine residues (404, 416, 504, 252) as well as residues K417, D467 and N501, including the N501Y mutation recently observed within the B.1.1.7, 501.V2 and P1 strains. This is, to our knowledge, the first use of normal mode analysis to model conformational state transitions and the effect of mutations on such transitions. The specific mutations of Spike identified here may guide future studies to increase our understanding of SARS-CoV-2 infection mechanisms and guide public health in their surveillance efforts.

Author summary

The present work explores the molecular mechanisms that may help new strains of SARS-CoV-2 gain an evolutionary advantage during the ongoing COVID-19 pandemic. We show how a computational method called normal mode analysis, which treats protein dynamics in a simplified manner, is capable of predicting the higher propensity of the Spike protein to be in the open state, in which it can interact with the human ACE2 receptor and thus facilitate cell entry. Because simulating the simplified computational model is less demanding on resources than alternative methods, we were able to simulate over 17000 mutations in the SARS-CoV-2 Spike protein and identify multiple mutations that, if they were to appear as the virus continues to evolve, could confer an evolutionary advantage. Indeed, our predictions foresaw the emergence of particular mutations, such as N501Y, that appeared in several variants of concern. Our results can inform public health regarding new variants and serve as a proof of concept for the application of normal mode analysis to study the effect of mutations on both protein dynamics and conformational transitions in a high-throughput manner.

Introduction

The coronavirus pandemic has emerged as a major and urgent issue affecting individuals, families and society as a whole. Among all outbreaks of aerosol transmissible diseases in the 21st century, the COVID-19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus [1,2], has the highest cumulative numbers of infections and deaths: 61 million infections and over 1.4 million deaths, according to the World Health Organization (WHO) epidemiological report of December 1, 2020 [3]. Recent WHO reports also show significant weekly increases in the number of infections and deaths as countries start to face upcoming waves of the disease. In 2003 the SARS coronavirus (SARS-CoV) pandemic caused 8,098 infections and 774 deaths before it was brought under control [4,5]. In 2012, the Middle East respiratory syndrome-related coronavirus (MERS-CoV) outbreak caused 2,499 infections and 858 deaths, presenting the highest fatality rate [6]. SARS-CoV-2, SARS-CoV and MERS-CoV, like coronaviruses in general, present considerable mutation rates, which may contribute to future outbreaks. For instance, SARS-CoV-2 is estimated to have a mutation rate close to those of MERS-CoV [7] and SARS-CoV [8], as well as other RNA viruses, showing a median of 1.12 × 10⁻³ mutations per site per year [9]. The high mutation rate may in part be responsible for the zoonotic nature of these viruses and points to a clear risk of still-undetected additional members of the Coronaviridae family of viruses making the jump from their traditional hosts to humans in the future.

The SARS-CoV-2 Spike protein (Uniprot ID P0DTC2) is responsible for anchoring the virus to the host cell. The entry receptor for SARS-CoV-2 and other lineages of human coronaviruses is the human cell-surface protein angiotensin converting enzyme 2 (ACE2) (Uniprot ID Q9BYF1) [10]. Therefore, studying the Spike protein family is essential to understand the evolution of coronaviruses.

SARS-CoV-2 Spike is a homo-trimeric glycoprotein, with each chain built from subunits S1 and S2, delimited by a Furin cleavage site at residues 682–685. The S1 subunit comprises the N-terminal domain (NTD), located in the peripheral part of the extramembrane end, and the receptor-binding domain (RBD), the most flexible site, located in the central part of this same end. The S2 subunit consists of the fusion peptide (FP), heptad repeat 1 (HR1), heptad repeat 2 (HR2), the transmembrane domain (TM), and the cytoplasmic tail (CT) (Fig 1). The interaction between Spike and ACE2 relies on Spike being in its open conformation, in which the RBD is extended [11]. The study of the binding properties between Spike and ACE2, although important, cannot explain all the nuances of the infection mechanism. An example of this limitation is the comparison between SARS-CoV and SARS-CoV-2, which have different rates of infection even though they share similar Spike-ACE2 affinities [12]. These facts lead us to consider the contribution of Spike protein dynamics to the infection process.

Fig 1. Domains of the Spike protein.

N-Terminal Domain (NTD), Receptor Binding Domain (RBD), Subunit 1/Subunit 2 junction (S1/S2), Fusion Peptide (FP), Heptad Repeat 1 (HR1), Heptad Repeat 2 (HR2), Transmembrane Domain (TM), and the Cytoplasmic Tail (CT). Structures are shown in the conformational state with all 3 RBD domains closed (PDB 6VXX) and with 1 RBD open (PDB 6VYB), binding to ACE2 (PDB 6M17).

Computational structural biology methods have grown in both accuracy and usability over the years and are increasingly accepted as part of an integrated approach to tackle problems in molecular biology. Such integration speeds up research, decreases needs in infrastructure, reagents, and human resources and allows us to evaluate increasingly larger data sets. Computational approaches are being extensively used in the study of SARS-CoV-2 and its mechanisms of infection [13–15]. Among these, we highlight the study of dynamic properties of the Spike protein, as well as studies of antibody recognition and the search for therapeutic interventions [16–18].

Several aspects of Spike protein dynamics are currently being studied, with a range of particular goals: to evaluate the docking of small molecules to the RBD domain [19], to search for alternative target binding-sites for vaccine development [20], to understand residue-residue interactions and their effects on conformational plasticity [21] and to investigate the flexibility of different domains in particular conformational states [22].

Various combinations of normal mode analysis (NMA) and molecular dynamics (MD) methods are being employed in the study of different conformational states [23] and of different coronavirus variants [24]. These methods, however, are limited with respect to their ability to study the effects of mutations on dynamics since they are either extremely taxing on computational resources in the case of MD or agnostic to the nature of amino acids in the case of traditional coarse-grained NMA. In the past, our group developed a coarse-grained NMA model called ENCoM (for Elastic Network Contact Model) that considers the chemical nature of amino acids and their interactions and consequently their effect on dynamics [25]. This makes ENCoM perform better than other NMA models on traditional applications but more importantly makes it the only coarse-grained NMA model capable of predicting the effect of mutations on protein stability and function as a result of dynamical properties [26–28]. As a coarse-grained NMA model, ENCoM is not much costlier to run than other traditional NMA models and this makes it very attractive as a tool to screen the effect of mutations in a high-throughput manner.

In the present study, we use ENCoM to study the dynamics of the Spike protein, considering different conformational states and several sequence variants observed during the current pandemic, as well as through large-scale analysis of in silico mutations. Experimental analysis of the effect of the SARS-CoV-2 Spike mutation D614G and the comparison between SARS-CoV and SARS-CoV-2 Spike proteins show unique dynamic characteristics that correlate with epidemiological and experimental data on infection. The present work shows that we can replicate such results computationally, suggesting that rigidity or flexibility of different Spike conformational states affects infectivity. We present a high-throughput analysis of the effect of simulated single amino acid mutations on dynamical properties to seek potential hotspots and individual Spike variants that may be more infectious and therefore may guide public health decisions if such variants were to appear in the population. We also introduce a Markov model of occupancy of molecular states with transition probabilities derived from our analysis of dynamics that recapitulates experimental data on conformational state occupancies. This is the first application of an NMA method that derives transition probabilities from normal modes and employs them in a dynamic system to predict the occupancy of different conformational states. We model the occupancy of several variants and highlight those that may be useful in studying future epidemiological trends that could be responsible for new outbreaks. Lastly, we expand and apply the methodology to multiple mutations in variants of concern that were observed during the COVID-19 pandemic, with their full complement of Spike mutations.

Materials and methods

Spike protein models

We performed our analyses using the crystallographic models of the SARS-CoV-2 Spike protein in the open (PDB ID 6VYB) and closed (PDB ID 6VXX) states. The open (prefusion) state was designed with an abrogated Furin S1/S2 cleavage site and two consecutive proline mutations that improve expression [29]. Despite the mutations, the engineered structures correctly represent the conformational states of Spike, as confirmed by independently solved structures [30]. The PDB structures used for the SARS-CoV comparison were 5X58 and 5X5B for closed state and one RBD open state, respectively [31].

We removed heteroatoms, water molecules, and hydrogen atoms from the PDB structures. Missing residues were reconstructed using template-based loop reconstruction and refinement with Modeller [32]. Single amino acid mutants were generated using FoldX4 [33]. Vibrational Difference Score (VDS, defined below) and occupancy calculations were performed with reconstructed closed and one-RBD-open structures using 6VXX and 6VYB as templates. These engineered structures contain the GSAS sequence in the Furin cleavage site as well as two prolines in positions 986 and 987. To minimize potential artefacts in the calculations due to modelling errors, we chose to model all mutations and perform subsequent calculations using the above engineered structures and sequences unless otherwise noted. That is to say, when we refer to the wild type SARS-CoV-2 Spike protein in our calculations, it is the Spike protein with the above alterations in the Furin cleavage site as well as the pair of prolines. The alternative would have required modelling 6 additional mutations to ‘de-engineer’ the structures of the open and closed states, increasing the possibility of modelling artefacts.

For the parameter fitting used in the calculation of occupancies, we utilized the following experimentally determined structures for which occupancy data exists as follows (acronyms described in results): S-GSAS/WT: 7KDG,7KDH; S-GSAS/D614G: 7KDI,7KDJ [30]; S-R/x2: 6ZOX; S-R/PP/x1: 6ZOY,6ZOZ; S-R: 6ZP0; S-R/PP: 6ZP1,6ZP2 [34].

Dynamic analyses

We analysed dynamic properties of the Spike protein with ENCoM [25]. ENCoM employs a potential energy function that includes a pairwise atom-type non-bonded interaction term and thus makes it possible to consider the effect of the specific nature of amino acids on dynamics. Normal mode analysis (NMA) explores protein vibrations around an equilibrium conformation by calculating the eigenvectors and eigenvalues associated with different normal modes [35–37]. Representing each protein residue as a single point, for a given conformation of a protein with N amino acids, we obtain 3N − 6 nontrivial eigenvectors. Each eigenvector represents a linear, harmonic motion of the entire protein in which each amino acid moves along a unique 3-dimensional Euclidean vector. The associated eigenvalues rank the eigenvectors in terms of energetic accessibility, lower values corresponding to global, more easily accessible motions.

NMA calculations allow us to computationally estimate b-factors associated with the protein structure, as shown in Eq 1 for the ith residue, which in turn are related to local flexibility. Higher predicted b-factors denote more flexible positions. Individually calculated b-factors are combined in a vector for a protein sequence or part thereof and called Dynamical Signature.

$$B_i = \sum_{n=7}^{3N} \frac{E_{n,i,x}^2 + E_{n,i,y}^2 + E_{n,i,z}^2}{\lambda_n} \tag{1}$$
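As an illustration of Eq 1, the following minimal sketch sums the squared displacement of each residue over the nontrivial modes, weighted by the inverse eigenvalue. It is not the ENCoM or NRGTEN implementation: the function name `predicted_b_factors`, the flat x, y, z layout of each eigenvector and the toy three-residue input are assumptions made only for this example.

```python
def predicted_b_factors(eigenvalues, eigenvectors, n_trivial=6):
    """Predicted b-factor per residue following Eq 1.

    eigenvalues  : list of 3N values sorted in ascending order
    eigenvectors : list of 3N vectors, each laid out as
                   [x1, y1, z1, x2, y2, z2, ...] (assumed layout)
    n_trivial    : number of leading rigid-body modes to skip
    """
    n_residues = len(eigenvalues) // 3
    b_factors = [0.0] * n_residues
    # Modes 7..3N in the paper's 1-based numbering are index 6 onwards.
    for lam, vec in zip(eigenvalues[n_trivial:], eigenvectors[n_trivial:]):
        for i in range(n_residues):
            x, y, z = vec[3 * i:3 * i + 3]
            b_factors[i] += (x * x + y * y + z * z) / lam
    return b_factors


def unit(index, size=9):
    """Vector of the given size with a single 1.0 at the given index."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


# Toy input (not real Spike data): 3 residues, so 9 modes, 6 of them trivial.
toy_eigenvalues = [0.0] * 6 + [2.0, 4.0, 8.0]
toy_eigenvectors = [[0.0] * 9 for _ in range(6)] + [unit(0), unit(4), unit(8)]

dynamical_signature = predicted_b_factors(toy_eigenvalues, toy_eigenvectors)
print(dynamical_signature)
```

In this toy case the residue that moves in the lowest-eigenvalue mode receives the largest predicted b-factor, which is the behaviour Eq 1 is meant to capture.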

The eigenvectors and associated eigenvalues can also be used to obtain the vibrational contribution of the entropic components of the free energy. Vibrational entropy [38] is calculated as described in Eq 2 in units of J·K⁻¹, where N is the total number of amino acids in the protein, ν_n is the vibrational frequency of the nth normal mode and K_B is the Boltzmann constant. Eq 3 shows the association between eigenvalues and vibrational frequency.

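$$S_{vib} = K_B \sum_{n=7}^{3N} \left\{ \frac{\beta \nu_n}{e^{\beta \nu_n} - 1} - \ln\left[ 1 - e^{-\beta \nu_n} \right] \right\} \tag{2}$$
$$\lambda_n = \nu_n^2 \tag{3}$$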
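Eqs 2 and 3 can be sketched in a few lines. This is an illustrative sketch rather than the ENCoM code: the function names are hypothetical, `k_b` defaults to 1 so the result is expressed in units of K_B, and the sign convention for ΔSvib (wild type minus mutant) is an assumption inferred from the statement that ΔSvib is negative when a mutation makes the protein more flexible.

```python
import math


def vibrational_entropy(eigenvalues, beta=1.0, k_b=1.0, n_trivial=6):
    """Vibrational entropy following Eq 2, with nu_n = sqrt(lambda_n) (Eq 3).

    k_b=1.0 reports the entropy in units of K_B (assumed scaling).
    """
    total = 0.0
    for lam in eigenvalues[n_trivial:]:
        x = beta * math.sqrt(lam)
        # x / (e^x - 1) - ln(1 - e^-x); expm1 keeps precision for small x.
        total += x / math.expm1(x) - math.log(-math.expm1(-x))
    return k_b * total


def delta_s_vib(wt_eigenvalues, mutant_eigenvalues, beta=1.0):
    """Delta S_vib with the assumed convention S(WT) - S(mutant)."""
    return (vibrational_entropy(wt_eigenvalues, beta)
            - vibrational_entropy(mutant_eigenvalues, beta))


# Toy spectra (not real Spike data): the mutant has softer modes.
wt_modes = [0.0] * 6 + [1.0, 4.0]
softer_mutant_modes = [0.0] * 6 + [0.5, 2.0]

print(delta_s_vib(wt_modes, softer_mutant_modes))
```

Lower eigenvalues give lower frequencies and therefore a higher vibrational entropy, so the softer toy mutant yields a negative value under this convention.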

By measuring the difference of vibrational entropy (ΔSvib) between a mutant and the wild type (WT), one can calculate how much a mutation affects the overall flexibility and stability of the mutant relative to the WT. The ΔSvib value predicted by ENCoM is negative when the mutation makes the protein more flexible and positive when the mutation makes the protein more rigid. Vibrational entropy calculations depend on the thermodynamic β factor, which for pseudo-physical models such as ENCoM serves as a scaling factor. This term was optimized to fit experimental Gibbs free energy differences [39] and established as β = 1. The differences between the ΔSvib values for the open and closed states, which we call Vibrational Difference Score (VDS), were calculated for each mutant (VDS = ΔSvib (open) − ΔSvib (closed)) as a means to select mutations of interest. A positive VDS suggests that the mutation makes the open state less flexible and/or the closed state more flexible, favouring the open conformation relative to the WT. Conversely, a negative VDS suggests that the mutation favours the closed state more so than the WT. The computational cost of obtaining the VDS (modelling the mutation with FoldX on both states and computing ΔSvib values for each with ENCoM) is approximately 20 CPU-minutes per mutant, making the total cost of these computations for the 17081 mutants considered in this work around 0.65 CPU-years. Of the 20 CPU-minutes per mutant, around 2 minutes are spent on running FoldX and the rest on running ENCoM.
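The VDS definition and the cost estimate above reduce to simple arithmetic, sketched below. The function names are hypothetical; the D614G input values are the ΔSvib values reported for that mutation in the Results, and a 365-day year is assumed for the CPU-year conversion.

```python
def vibrational_difference_score(ds_open, ds_closed):
    """VDS = dS_vib(open) - dS_vib(closed), both in J/K."""
    return ds_open - ds_closed


def cpu_years(n_mutants, minutes_per_mutant=20.0):
    """Total cost in CPU-years, assuming a 365-day year."""
    minutes_per_year = 60 * 24 * 365
    return n_mutants * minutes_per_mutant / minutes_per_year


# D614G values reported in the Results section (J/K).
d614g_vds = vibrational_difference_score(5.26e-2, -9.27e-2)
print(f"VDS(D614G) = {d614g_vds:.3e} J/K")

# A positive VDS is read as favouring the open state relative to the WT.
print("favours open state" if d614g_vds > 0 else "favours closed state")

# 17081 mutants at roughly 20 CPU-minutes each.
print(f"total cost = {cpu_years(17081):.2f} CPU-years")
```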

The Najmanovich Research Group Toolkit for Elastic Networks (NRGTEN) [39], with the latest implementation of ENCoM, also includes a function to evaluate state occupancies by calculating transition probabilities between different states. A probability Pj of moving along each eigenvector j can be obtained using a Boltzmann distribution given its associated eigenvalue λj and a scaling factor γ.

$$P_j = \frac{e^{-\lambda_j/\gamma}}{\sum_{i=7}^{3N} e^{-\lambda_i/\gamma}} \tag{4}$$

Let us consider two conformations A and B of the same protein and the vector EA→B, which represents the conformational change going from conformation A to conformation B. The overlap between each normal mode Mj computed from conformation A and the EA→B vector is a value between 0 and 1 describing how well that normal mode recapitulates the conformational change required to go from one state to the other [40].

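$$O(E_{A \to B}, M_j) = \frac{\left| E_{A \to B} \cdot M_j \right|}{\left\| E_{A \to B} \right\| \, \left\| M_j \right\|} \tag{5}$$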

We can then calculate the transition probability of going from conformation A to conformation B as the weighted sum of the Boltzmann probability Pj of each normal mode Mj times the overlap between that normal mode and the conformational change EA→B.

$$P_{A \to B} = \sum_{j=7}^{3N} P_j \times O(E_{A \to B}, M_j) \tag{6}$$

The reverse probability PB→A can be computed in the same fashion, giving an indication of which conformation is favored between the two.
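Eqs 4 to 6 can be combined into a short routine, sketched here under stated assumptions. This is not the NRGTEN function: the names are hypothetical, the inputs are assumed to contain only the nontrivial modes (7 to 3N), and the smallest eigenvalue is subtracted before exponentiation purely for numerical stability, which leaves the normalized weights of Eq 4 unchanged.

```python
import math


def boltzmann_weights(eigenvalues, gamma):
    """Eq 4: probability of moving along each nontrivial mode."""
    lam_min = min(eigenvalues)
    # Shifting by lam_min avoids underflow for small gamma; ratios are unchanged.
    raw = [math.exp(-(lam - lam_min) / gamma) for lam in eigenvalues]
    total = sum(raw)
    return [w / total for w in raw]


def overlap(displacement, mode):
    """Eq 5: absolute cosine between the conformational change and a mode."""
    dot = sum(d * m for d, m in zip(displacement, mode))
    norm_d = math.sqrt(sum(d * d for d in displacement))
    norm_m = math.sqrt(sum(m * m for m in mode))
    return abs(dot) / (norm_d * norm_m)


def transition_probability(eigenvalues, modes, displacement, gamma):
    """Eq 6: Boltzmann-weighted sum of overlaps."""
    weights = boltzmann_weights(eigenvalues, gamma)
    return sum(p * overlap(displacement, m) for p, m in zip(weights, modes))


# Toy input (not real Spike data): two nontrivial modes in a 3-coordinate space.
toy_lambdas = [1.0, 2.0]
toy_modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

p_along_soft = transition_probability(toy_lambdas, toy_modes, [1.0, 0.0, 0.0], gamma=1.0)
p_along_stiff = transition_probability(toy_lambdas, toy_modes, [0.0, 1.0, 0.0], gamma=1.0)
print(p_along_soft, p_along_stiff)
```

A conformational change aligned with the lowest-eigenvalue mode receives the higher transition probability, which is why a softer mode along the closed-to-open direction favours opening.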

A simple way of computing the occupancies of these conformations from the transition probabilities is to use a Markov model. Each conformation is represented by a state, and the transition probabilities between states are computed as described above. We add a constant k to all states as the probability of staying in that state. Since all states must have outgoing transition probabilities that sum to 1, we normalize these values after the addition of k. For a two-state Markov chain representing the open and closed states of the Spike protein, we obtain the diagram shown in Fig 2. All transition probabilities are computed using ENCoM and Eq 6. The parameters k and γ need to be optimized for the system being studied as they are not directly coupled to physical quantities because of the pseudo-physical, coarse-grained nature of the ENCoM model. Once the parameters are set, there is a unique equilibrium solution that gives the occupancies of the two states. This approach could be easily generalized to a Markov model with more than two states, where the transition between any two states is computed exactly as described above if that transition is deemed possible.
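The two-state case admits a closed-form equilibrium, sketched below. The normalization shown (dividing each outgoing probability and the constant k by their sum) is one reading of the description above and should be treated as an assumption, as should the function name and the toy transition probabilities.

```python
def two_state_occupancy(p_closed_to_open, p_open_to_closed, k):
    """Equilibrium occupancies (open, closed) of a two-state Markov chain.

    Each state has one outgoing transition (from Eq 6) plus a
    self-transition k; both are normalized so that they sum to 1.
    """
    a = p_closed_to_open / (p_closed_to_open + k)  # closed -> open
    b = p_open_to_closed / (p_open_to_closed + k)  # open -> closed
    # Stationary distribution of a two-state chain: pi_open = a / (a + b).
    occ_open = a / (a + b)
    return occ_open, 1.0 - occ_open


# Toy transition probabilities (not values computed for Spike).
occ_open, occ_closed = two_state_occupancy(0.1, 0.3, k=0.5)
print(f"open: {occ_open:.3f}, closed: {occ_closed:.3f}")
```

With more than two states the same normalized transition matrix would be used, but the equilibrium would have to be obtained numerically, for example by repeated multiplication or from the leading eigenvector.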

Fig 2. Two-state Markov chain of Spike protein conformations.

Data visualization

All raw data is available for download and can be visualized with the help of the dms-view open-access tool [41]. On dms-view, it is possible to see the effects of different mutations for each residue of the Spike protein and visualise these on the 3D structure of Spike. Each analysed site is represented by 20 VDS values, one of them being zero (corresponding to the amino acid found in the wild type). The ‘max’ option shows the top VDS score for each position, that is, the mutation at that position with the highest predicted infectivity, defined here as a propensity to higher occupancy of the open state. The ‘min’ option shows the lowest score for each position and the mutation associated with the candidate predicted as least infectious. The ‘median’ option returns the median score, presenting a general trend for any given position, and ‘var’ shows the variance between the results for each position, highlighting sites in which mutations to different residues lead to a broader range of VDS values. Furthermore, for the mutations for which occupancy was calculated, the data can be accessed through the same menu. As new occupancy data is calculated, it will be added to this resource. Readers interested in the occupancy of particular mutations not yet available are invited to contact the authors via email or through the GitHub repository. When selecting a specific point on the first panel, it is possible to access all VDS values on the second panel and see the highlighted position in 3D on the structural representation.

Results and discussion

Dynamical Signature of different Spike variants

Comparison of the differences in dynamics between G614 and D614

An important event in the progression of the COVID-19 pandemic was the appearance of the D614G variant in mid-February 2020 in Europe. The fast spread of this variant raised the possibility that this mutation conferred advantages relative to other forms of the virus in circulation at the time [42,43]. Studies revealed that the variant indeed has greater infectivity, triggering higher viral loads [44,45]. Several hypotheses have emerged to explain the mechanisms behind this higher infectivity, focusing primarily on possible effects on the Furin cleavage site [30,46,47], but recently also considering possible important dynamic differences [45,48,49].

In order to test if Dynamical Signatures reveal differences between Spike variants, we analysed the 13741 sequences of the protein available on May 8th, 2020 in the COVID-19 Viral Genome Analysis Pipeline, enabled by data from GISAID [50,51]. The mutant Spike proteins harboring mutations (S1 Table) were modelled in the open and closed states. Dynamical Signatures were calculated for each mutant in both states and clustered (Fig 3) using the Euclidean distance between Dynamical Signature vectors as measure of dissimilarity. Mutations in positions that had no occupancy in the original templates used for the open and closed states (positions 5, 8, and 1263) were ignored.
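The dissimilarity measure used for the clustering can be illustrated with a small sketch. The linkage method is not specified here, so the grouping below, which joins variants connected by chains of distances under a threshold (a single-linkage cut), is an assumption made for illustration; the variant names and signature values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two Dynamical Signature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_clusters(signatures, threshold):
    """Group variants linked by pairwise distances <= threshold."""
    names = list(signatures)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if euclidean(signatures[first], signatures[second]) <= threshold:
                parent[find(first)] = find(second)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical three-residue signatures (not real Spike data).
toy_signatures = {
    "WT": [1.0, 1.0, 1.0],
    "mutA": [1.1, 1.0, 0.9],
    "mutB": [2.0, 2.0, 2.0],
    "mutC": [2.1, 2.0, 2.0],
}
toy_clusters = threshold_clusters(toy_signatures, threshold=0.5)
print(toy_clusters)
```

A full dendrogram would require an agglomerative clustering routine; the threshold cut shown here only reproduces the idea of colouring branches within a given similarity threshold.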

Fig 3.

Dynamical Signature clustering for the closed (A) and open (B) state structures for WT and 22 mutants from GISAID (S1 Table). The clustering measures the distance between each pair of Dynamical Signatures. Different colors were used to identify branches within a threshold of similarity (8.7 and 3.2 for closed and open state, respectively). The branches that comprise most strains containing the D614G mutation are highlighted in red.

Analysis of the effect of mutations on the Dynamical Signature shows that the D614G mutation produces similar dynamical patterns largely independent of the other mutations accumulated, and dynamical patterns that are distinct from that of the wild type and other mutants on both the open and closed states, as highlighted in the sections of the dendrogram marked in red. The dynamical characteristics of D614G are very specific and cannot be obtained with random mutations (S1 Fig and S2 Table).

When checking the difference between the Dynamical Signatures of the wild type D614 and the mutant G614 we observe that for the closed conformation, the pattern tends towards negative values, indicating that this mutation makes the closed state more flexible, especially around the position of the mutation. On the other hand, for the open B chain conformation the pattern is positive for the open RBD, the same chain NTD and the adjacent chain NTD, indicating that this mutation makes these areas of the open conformation more rigid (Fig 4).

Fig 4. Effects of the D614G mutation on the Dynamical Signature of the closed (purple) and open B chain RBD (blue) structures, measured by the difference between the calculated b-factors of D614 and G614.

Chains are represented in different colours, matching the colours used for the corresponding regions of the structures, and the position of the mutation is marked in yellow.

This result led us to hypothesize that a more flexible closed state would favor the opening of Spike and that a more rigid open state would disfavor its closing, thus shifting the conformational equilibrium towards the open state and favouring interaction with ACE2, leading to increased cell entry. Mutating position 614 to every other amino acid, we observe a correlation in the closed state between residue size and flexibility. Namely, smaller amino acids tend to make the closed state more flexible. However, we do not observe the opposite effect on the open state. Mutation of D614 to glutamine, which is similar to aspartate, barely shows any effect. Nevertheless, we can see that other amino acids have a similar effect as glycine, such as proline and threonine (S2 Fig).

Comparison of the Dynamical Signatures of Spike from SARS-CoV and SARS-CoV-2

It has been previously observed that RBD flexibility in SARS-CoV influences binding to ACE2 and facilitates fusion with host cells [52]. Thus, considering the lesser infectivity of SARS-CoV relative to SARS-CoV-2 and our aforementioned results for the D614G mutation, we expected the SARS-CoV Spike to be more rigid in the closed state and more flexible in the open state relative to Spike from SARS-CoV-2. This is indeed the case (Fig 5). The Dynamical Signature values of SARS-CoV are smaller than those of SARS-CoV-2 in several areas throughout the closed structure, indicating that when in the closed state, the SARS-CoV Spike protein is more rigid. For the open state we can see that SARS-CoV open RBD and adjacent NTD are significantly more flexible than for SARS-CoV-2 Spike.

Fig 5. Comparison between SARS-CoV-2 and SARS-CoV.

Dynamical Signature difference of the closed (purple) and open B chain RBD (blue) between aligned residues of the Spike protein from SARS-CoV-2 and SARS-CoV, with SARS-CoV chains represented in the top bar and equivalent colors in the structures and SARS-CoV-2 chains represented in the bar just below.

Vibrational entropy

It is possible to combine the trend of a Dynamical Signature into a single value to represent the overall flexibility of any given mutation and compare it to the WT. This can be achieved with ΔSvib, calculated with Eq 2 for each state (see Materials and Methods). For any given state, positive ΔSvib values represent mutants that relative to the wild type make the protein more rigid, whereas negative values of ΔSvib describe mutations that cause the protein to be more flexible in the given state relative to the wild type. In the case of the mutation D614G, we obtain ΔSvib (open) = 5.26 × 10⁻² J·K⁻¹ and ΔSvib (closed) = −9.27 × 10⁻² J·K⁻¹, with a VDS (calculated as ΔSvib (open) − ΔSvib (closed)) of 1.45 × 10⁻¹ J·K⁻¹.

We generated in silico the 19 possible single mutations in each position from residue 14 to residue 913 and calculated ΔSvib (open), ΔSvib (closed) and VDS. Other positions were ignored due to uncertainties in modelling or the fact that they are not expected to have a pronounced effect on dynamics [23]. It should be noted that Spike cannot accommodate the vast majority of such single mutations, particularly in its core as these would lead to unstable or misfolded conformations. However, those that occur near the surface are more likely to represent single residue variations of the Spike protein that lead to a stable, correctly folded protein. Therefore, the stability of specific mutations highlighted in this work, unless otherwise stated (such as those already observed experimentally or within the RBD domain as stated below), needs to be validated experimentally.

The heatmap in Fig 6A shows ΔSvib values associated with mutations on the closed conformational state (left) and open conformational state (right). Lighter colors represent high ΔSvib values, meaning that the specific mutant is more rigid than the WT, and darker colors represent low ΔSvib values, meaning that the specific mutant is more flexible than the WT. The second heatmap (Fig 6B) shows VDS values, highlighting positions and specific mutations with great contrast between their effect on the open and closed states. In this representation, blue mutants are more rigid in the closed state and more flexible in the open state, therefore candidates for less infectious mutants, and red mutants are more flexible when closed and more rigid when open, candidates for more infectious mutants.

Fig 6.

Heatmaps representing the values of ΔSvib for the closed structure (A, left-hand section) and for the open structure (A, right-hand section), and the values of VDS (B) for every possible mutant in Spike from positions 14 to 913. Each column represents one of the 20 amino acids (repeated in the left heatmap). Notice that for each position (represented in a row), one particular column represents the value of the WT amino acid found at that position. Higher values of ΔSvib are represented in yellow and lower values in dark purple. Higher values of VDS are represented in red and lower values in blue. The domain structure of Spike is represented in (C) for reference purposes.

In Fig 7 we map VDS values (Fig 6B) on the structure of Spike, colored according to the median value for each position with the same color scheme as the heatmap. From the 17081 single mutations considered, we show the top 64 mutants with VDS > 3.00 × 10⁻¹ J·K⁻¹ (Tables 1 and S3) as well as the bottom 20 in terms of VDS values (S3 Table). The mutants with predicted open state occupancy higher than that of the wild type are presented in Table 1. The Dynamical Signature comparison for 3 of those most infectious candidates (S3A Fig) and 3 of the least infectious candidates (S3B Fig) shows some of the patterns that could lead to a greater or lesser predicted effect on infectivity. For instance, in Figs 8 and S3A we can see that high scores can come from a large flexibility of the closed state, a very large rigidity of the open state, or have the contribution of both. We can also observe that these effects can be different in each chain and can affect more the NTD, the RBD, or both. Finally, these single mutants also show how a point mutation can have widespread impacts on flexibility across the whole protein.

Fig 7. VDS values represented in the structure of Spike from two angles according to the median value for each position and the same color scheme as in the Difference Score Heatmap (Fig 6B).

Table 1. Putative mutations, associated VDS (ΔSvib (open) − ΔSvib (closed), in units of J·K⁻¹) and predicted occupancies for the open and closed states for the mutants with predicted open-state occupancy higher than that of the wild type.

Predicted occupancy values are shown for the open conformation, the closed conformation, and the difference between the two (open–closed). The data for the remaining mutants with occupancy below that of the wild type but VDS > 0.3 (red) as well as those with the lowest predicted VDS values (blue) is presented in S3 Table.

| Variant | VDS | Predicted occupancy, open state | Predicted occupancy, closed state | Difference (Open–Closed) |
|---|---|---|---|---|
| N501W | 0.372 | 62.705% | 37.295% | 25.410% |
| G504I | 0.349 | 40.491% | 59.509% | -19.019% |
| G416E | 0.314 | 30.111% | 69.889% | -39.778% |
| G416Y | 0.403 | 28.478% | 71.522% | -43.045% |
| R403S | 0.323 | 26.794% | 73.206% | -46.412% |
| D467W | 0.431 | 26.379% | 73.621% | -47.243% |
| G252W | 0.383 | 26.302% | 73.698% | -47.396% |
| G404W | 0.387 | 26.302% | 73.698% | -47.396% |
| G252Q | 0.394 | 26.295% | 73.705% | -47.410% |
| G252H | 0.353 | 26.268% | 73.732% | -47.464% |
| G252E | 0.365 | 26.224% | 73.776% | -47.551% |
| K417G | 0.315 | 26.179% | 73.821% | -47.641% |
| G252C | 0.401 | 26.164% | 73.836% | -47.672% |
| G252S | 0.447 | 26.159% | 73.841% | -47.682% |
| G252T | 0.440 | 26.157% | 73.843% | -47.686% |
| G252D | 0.390 | 26.149% | 73.851% | -47.702% |
| G413M | -0.521 | 26.147% | 73.853% | -47.705% |
| G252M | 0.481 | 26.134% | 73.866% | -47.731% |
| G252P | 0.403 | 26.120% | 73.880% | -47.759% |
| K417D | 0.518 | 26.114% | 73.886% | -47.773% |
| D467Y | 0.399 | 26.106% | 73.894% | -47.788% |
| S161F | 0.395 | 26.056% | 73.944% | -47.888% |
| K417P | 0.502 | 26.035% | 73.965% | -47.930% |
| S161Y | 0.303 | 25.980% | 74.020% | -48.040% |
| K417E | 0.494 | 25.971% | 74.029% | -48.058% |
| D467P | 0.308 | 25.954% | 74.046% | -48.093% |
| R34Y | 0.380 | 25.908% | 74.092% | -48.184% |
| I468T | 0.350 | 25.907% | 74.093% | -48.186% |
| R355F | -0.697 | 25.889% | 74.111% | -48.222% |
| S161I | 0.391 | 25.855% | 74.145% | -48.290% |
| G72W | 0.404 | 25.851% | 74.149% | -48.298% |
| T73F | 0.376 | 25.844% | 74.156% | -48.313% |
| Q14C | 0.312 | 25.839% | 74.161% | -48.322% |
| WT | 0.000 | 25.837% | 74.163% | -48.327% |

Fig 8. Difference in the occupancies for the open and the closed states (open–closed) for six variants of the Spike protein.

Experimental values are shown on the Y-axis and predicted values on the X-axis. Predicted values were obtained with the parameters k = 0.5 and γ = 0.001. The line shows the linear fit Experimental = 192.011 × Predicted + 92.9013. Errors on the experimental measurements are not known.

Conformational state occupancies

We calculated forward and reverse transition probabilities between the open and closed states (Eq 4, 5 & 6) from the calculated normal modes and used the Markov model described in Materials and Methods to calculate the equilibrium occupancies for each state in wild type and mutant Spike proteins. It is unclear whether any conformational states other than those with all three RBD domains closed or with only one RBD open are biologically relevant. Specifically, Yurkovetskiy et al. [45] observed an occupancy for states with two or three RBD domains in the open conformation, but these were not observed by Gobeil et al. [30] and Xiong et al. [34], nor taken into consideration in several other structural studies [20–24]. As such, we employ the two-state model shown in Fig 2, with one state representing all three RBD domains closed and the second state representing one RBD open.
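As an illustration of how equilibrium occupancies follow from a pair of transition probabilities, the sketch below solves a generic two-state Markov chain. It is not the model of Materials and Methods: the function names and the example probabilities are hypothetical, and the parameters k and γ that enter the actual calculation are not represented here.

```python
def two_state_occupancies(p_closed_to_open, p_open_to_closed):
    """Stationary (open, closed) occupancies of a generic two-state Markov chain.

    At equilibrium the probability flux is balanced:
        closed * p_closed_to_open == open * p_open_to_closed
    """
    total = p_closed_to_open + p_open_to_closed
    if total <= 0:
        raise ValueError("at least one transition probability must be positive")
    open_occ = p_closed_to_open / total
    return open_occ, 1.0 - open_occ


def iterate_chain(p_closed_to_open, p_open_to_closed, steps=1000):
    """Reach the same equilibrium by repeatedly applying the transition step."""
    open_occ, closed_occ = 0.5, 0.5  # arbitrary starting populations
    for _ in range(steps):
        new_open = open_occ * (1 - p_open_to_closed) + closed_occ * p_closed_to_open
        new_closed = closed_occ * (1 - p_closed_to_open) + open_occ * p_open_to_closed
        open_occ, closed_occ = new_open, new_closed
    return open_occ, closed_occ


# Hypothetical probabilities chosen only for illustration.
analytic = two_state_occupancies(0.1, 0.3)
iterated = iterate_chain(0.1, 0.3)
print("analytic:", analytic)
print("iterated:", iterated)
```

In this picture, a mutation that raises the closed-to-open probability or lowers the open-to-closed probability shifts the equilibrium towards the open state, which is the kind of shift reported in the occupancy tables.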

The Markov model calculation of occupancies requires two parameters (see Materials and Methods) that were optimized based on experimental data for six Spike variants. These variants were: S-GSAS/D614, an engineered Spike with the sequence GSAS in the Furin cleavage site and no 614 mutation; S-GSAS/G614, with the same Furin site modifications and the D614G mutation [30]; S-R, the Spike protein with the original Furin site RRAR; S-R/x2, with added S383C and D985C mutations inducing a disulfide bond; S-R/PP, engineered with two prolines in positions 986 and 987; and S-R/PP/x1, in which the mutations G413C and V987C were introduced into the double-proline construct to induce a disulfide bond [34]. It is worth stressing that all 6 variants used to calibrate the two parameters affecting the occupancy were modelled on the same open and closed state conformations. All differences in predicted occupancies and the agreement with experimental occupancy data came about as a direct consequence of the effect of the mutations on the normal modes and derived transition probabilities and not as a result of structural differences between variants. We obtained a good fit to the experimental results with k and γ of 0.5 and 0.001, respectively (Pearson correlation = 0.89, p-value = 1.94x10-2). Predicted occupancies of the open and closed states for each of the six variants above, as well as the experimental data, are presented in Table 2.

Table 2. Experimental and predicted occupancies for the open and closed states and their difference for multiple SARS-CoV-2 variants.

Experimental values obtained from Gobeil et al. [30] and Xiong et al. [34].

              Experimental occupancy                              Predicted occupancy
Variant       Open      Closed     Difference (open–closed)   Open      Closed     Difference (open–closed)
S-GSAS/D614   25.160%   74.840%    -49.680%                   25.781%   74.219%    -48.439%
S-GSAS/G614   43.938%   56.062%    -12.123%                   25.812%   74.188%    -48.376%
S-R/x2        0.000%    100.000%   -100.000%                  25.578%   74.422%    -48.843%
S-R/PP/x1     0.000%    100.000%   -100.000%                  25.532%   74.468%    -48.935%
S-R           18.646%   81.354%    -62.707%                   25.599%   74.401%    -48.801%
S-R/PP        77.967%   22.033%    55.935%                    25.848%   74.152%    -48.305%
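The correlation and linear fit quoted for these six variants can be approximately recomputed from the rounded values in Table 2. The sketch below does so in plain Python; the function and variable names are ours, and because the table values are rounded to three decimals the result is expected to land close to, but not exactly on, the reported Pearson correlation of 0.89 and the fit given in the Fig 8 caption.

```python
from math import sqrt

# (variant, experimental open-closed %, predicted open-closed %), copied from Table 2.
TABLE2 = [
    ("S-GSAS/D614", -49.680, -48.439),
    ("S-GSAS/G614", -12.123, -48.376),
    ("S-R/x2", -100.000, -48.843),
    ("S-R/PP/x1", -100.000, -48.935),
    ("S-R", -62.707, -48.801),
    ("S-R/PP", 55.935, -48.305),
]


def pearson_and_fit(xs, ys):
    """Return (Pearson r, slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return sxy / sqrt(sxx * syy), slope, mean_y - slope * mean_x


# Work with fractions (not percentages) so the intercept is comparable to the Fig 8 fit.
predicted = [pred / 100 for _, _, pred in TABLE2]
experimental = [exp / 100 for _, exp, _ in TABLE2]

r, slope, intercept = pearson_and_fit(predicted, experimental)
print(f"Pearson r = {r:.3f}")
print(f"Experimental = {slope:.2f}*Predicted + {intercept:.2f}")
```

The large slope reflects the point made in the text: the predicted differences span well under one percentage point, whereas the experimental differences span more than one hundred.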

We utilized these data to calculate occupancy differences for each variant (Fig 8). The range of variation of our predicted occupancies is small compared to that of experimental values. We believe that given the limitations of our coarse-grained model as well as additional phenomena that ultimately affect occupancy, our predictions reflect only a fraction of the myriad of factors contributing to the occupancy. Nonetheless, our predictions correctly capture the pattern of relative variations of occupancy observed in the experimental data. To ensure that the calculated correlation is not due to chance, we simulated random sets of occupancies for the 6 sequence variants and calculated simulated correlations for the 110 different combinations of k and γ, to determine whether the observed correlations represent an actual signal in the data or could be obtained randomly with different values of the parameters k and γ. We observed a marked shift towards higher correlations for the data representing our predicted occupancies when compared to the Gaussian noise data (S4 Fig), suggesting that the agreement between predicted and experimental occupancies is not due to chance.
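The idea behind this control can be sketched as follows, under stated assumptions. The k and γ grids are those listed in the S4 Fig caption and the experimental differences are taken from Table 2; everything else (the seed, the number of noise vectors drawn per parameter combination, the one-sided comparison against the reported correlation of 0.89) is our own choice for illustration and not the authors' exact procedure.

```python
import random
from itertools import product
from math import sqrt

# Parameter grids as listed in the S4 Fig caption (11 x 10 = 110 combinations).
K_VALUES = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100]
GAMMA_VALUES = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000, 100000, 1000000]

# Experimental (open-closed) differences in %, copied from Table 2.
EXPERIMENTAL_DIFF = [-49.680, -12.123, -100.000, -100.000, -62.707, 55.935]
REPORTED_R = 0.89  # correlation reported in the text for k = 0.5, gamma = 0.001


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def null_correlations(draws_per_combination=100, seed=0):
    """Correlate Gaussian noise vectors of length 6 with the experimental data.

    The loop over the grid only mirrors the bookkeeping of one noise set per
    (k, gamma) combination; the noise itself does not depend on k or gamma.
    """
    rng = random.Random(seed)
    grid = list(product(K_VALUES, GAMMA_VALUES))
    correlations = []
    for _k, _gamma in grid:
        for _ in range(draws_per_combination):
            noise = [rng.gauss(0.0, 1.0) for _ in EXPERIMENTAL_DIFF]
            correlations.append(pearson(noise, EXPERIMENTAL_DIFF))
    return grid, correlations


grid, null_rs = null_correlations()
fraction = sum(r >= REPORTED_R for r in null_rs) / len(null_rs)
print(f"{len(grid)} parameter combinations, {len(null_rs)} noise vectors")
print(f"fraction of noise correlations >= {REPORTED_R}: {fraction:.4f}")
```

A small fraction here indicates that a correlation as high as the reported one is rarely produced by random vectors of length 6, which is the sense in which the observed agreement is unlikely to be a chance result.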

We set a threshold of VDS>0.3 to select candidates for the calculation of occupancies, corresponding to 64 mutations (Tables 1 and S3, in red). Using the parameters k and γ obtained above, we calculated occupancies for these 64 mutants as well as the 20 mutants with lowest VDS values (Tables 1 and S3, in blue) for comparison. In Fig 9A we show the difference in occupancy between the open and closed states using a non-linear scale adapted to better show the results around the wild type occupancy. Whereas VDS values for particular mutations may hint at a more flexible closed state and more rigid open state, this is a global measure that may not reflect the necessary pattern of flexibility across the structure that leads to effective transition probabilities between the open and closed states. Yet, for the most part, VDS can predict the shifts in occupancy, showing a clear distinction between the 64 mutants predicted using VDS as shifting occupancy towards the open state and the 20 mutants predicted to shift the equilibrium towards the closed state (p-value = 2.04x10-6). Fig 9B shows the location in the structure of the mutants in Table 1. We can see that the least infectious candidates (blue) are positioned in the interfaces between NTD and RBD domains, while the most infectious candidates, especially the ones validated by the occupancy prediction (dark red), are more concentrated in the interfaces between different RBD domains.

Fig 9.

(A) Difference in the occupancies for the open and the closed states for the top 64 mutants with VDS>0.3 (red) and the 20 mutants with lower VDS (blue). Occupancy difference for the WT is represented by the dashed green line. Y-axis based on the transformation of a symmetric logarithmic scale. (B) Two visualizations of the 6VYB structure highlighting the mutations. The bottom 20 mutant positions are marked in two shades of blue, with the darker shade indicating positions in which at least one mutant had an (open–closed) occupancy value smaller than wild type. The top 64 mutant positions are marked in two shades of red, with the darker shade indicating positions in which at least one mutant had an (open–closed) occupancy value higher than wild type.

Residue G252 stands out as capable of accommodating a large number of mutations (C, D, E, H, M, P, Q, S, T, W) that shift the occupancy in favour of the open state. The fact that variants in this position do not seem to be prevalent in outbreaks to date points to the possibility that this position may be under additional functional constraints that prevent the emergence of variants. A number of other glycine residues could also accept mutations that we predict to increase the occupancy of the open state: G72W; G404W; G413M; G416E,Y; and G504I. In fact, three of the top four mutations are mutations on glycines. A number of other potential mutations are adjacent to the glycine residues above, namely R403S and K417D,E,G,P. Additionally, D467P,W and I468T are also positions that are adjacent to others that can accommodate mutations that may lead to a conformational shift favouring the open state. The mutation that favours the open state the most in our calculations is N501W with ΔSvib (open) = 6.02x10-1 J.K-1 and ΔSvib (closed) = 2.30x10-1 J.K-1 and a resulting VDS value of 3.72x10-1 J.K-1, leading to occupancies compared to those of the wild type (in parentheses) of 62.7% (25.8%) and 37.3% (74.2%) for the open and closed states respectively. It is important to stress, as discussed in Materials and Methods, that the calculations are performed using structures containing a modified Furin recognition site and prolines in positions 986 and 987. Furthermore, the contribution of vibrational entropy changes is one among potentially several effects, the overall importance of which remains to be determined. Therefore, relative changes in occupancy are relevant whereas the specific values are less so.
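The statement that three of the top four mutations fall on glycines can be checked against the first rows of Table 1. The short sketch below parses the mutation labels and counts glycine positions; the helper name and the choice of copying only the six highest-ranked rows are ours.

```python
import re

# (mutation, predicted open-state occupancy in %), first six rows of Table 1.
TABLE1_TOP = [
    ("N501W", 62.705),
    ("G504I", 40.491),
    ("G416E", 30.111),
    ("G416Y", 28.478),
    ("R403S", 26.794),
    ("D467W", 26.379),
]


def parse_mutation(label):
    """Split a label such as 'N501W' into (wild-type residue, position, mutant residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


# Rank by open-state occupancy and keep the four highest.
top_four = sorted(TABLE1_TOP, key=lambda row: row[1], reverse=True)[:4]
glycine_count = sum(parse_mutation(label)[0] == "G" for label, _ in top_four)

print("top four:", [label for label, _ in top_four])
print("mutations on glycine:", glycine_count)
```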

The COG-UK consortium (https://www.cogconsortium.uk/about/) monitors the appearance and spread of new strains of SARS-CoV-2. COG-UK recently detected a strain containing the mutation N501Y that has been observed to be spreading rapidly at the time of writing. We believe that shifts in occupancy may be in part responsible for its emergence. According to our calculations, the N501Y mutant shows ΔSvib (open) = -1.60x10-2 J.K-1 and ΔSvib (closed) = 2.37x10-1 J.K-1, with VDS = 2.53x10-1 J.K-1. The predicted occupancies for the N501Y mutant compared to those of the wild type (in parentheses) are 54.3% (25.8%) and 45.7% (74.2%) for the open and closed states, respectively. Therefore, the N501Y mutant shows a marked increase of the occupancy of the open state relative to other mutations. Additionally, this mutation was shown to also increase binding affinity to the ACE2 receptor relative to the wild type with a Δlog10(KD,app) of 0.24 [53]. Therefore, we predict that N501Y has a strong potential to contribute to increased transmission. The calculations above were performed in the context of D614. However, the double mutant representing the N501Y mutation in the context of G614 also shows an increase in the occupancy of the open state to 35.06%. The recently observed A222V mutation [54], on the other hand, does not show any propensity to alter the occupancy of states in our analysis, with a negative VDS of -1.64x10-2 J.K-1. Predicted occupancies for A222 and V222 are nearly identical either in the context of D614 (WT) or the mutant containing G614.

Notice that N501Y has a VDS value of 2.53x10-1 J.K-1 that is slightly below the 3.00x10-1 J.K-1 threshold, suggesting that there may be many other mutations with VDS values below our set threshold that turn out to have augmented occupancies for the open state relative to the wild type.

D614G shows that changes in the occupancy of conformational states can impact infectivity despite no changes or even weaker binding affinities [45]. A recent study [53] on binding and expression of Spike mutations within the RBD domain (positions 331 to 531) shows that several (but not all, see below) of the mutations that we predicted to have increased occupancy of the open state are associated with a decrease of binding affinity with ACE2. Incidentally, the data also shows that the mutations in Table 1 within the RBD produce stable and properly folded Spike proteins. As shown for D614G, infection does not rely on binding affinity alone, and even a strain with higher dissociation rates from ACE2 can bring about fitness advantages.

The mutation N501W is predicted to have the largest effect in augmenting the occupancy of the open state relative to the wild type. This mutation is associated with stronger binding to ACE2 (Δlog10(KD,app) = 0.11) [53] relative to the wild type Spike (but weaker than that of N501Y). Furthermore, N501W appears to have increased expression relative to the wild type, with a Δlog(MFI) of 0.1, compared to a decrease in relative expression of -0.14 for N501Y [53]. The authors note that changes in expression correlate with folding stability [53]. However, even with a Δlog(MFI) of -0.14, N501Y is viable and spreading. Therefore, N501W might be even more stable and infective.
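To give a feel for the size of these binding effects, a Δlog10(KD,app) value can be converted into a fold change in apparent affinity by raising 10 to that power. The sketch below assumes, as the text above does, that positive values denote stronger binding than the wild type; the function name is ours and the conversion is plain arithmetic rather than part of the analysis in [53].

```python
def affinity_fold_change(delta_log10_kd_app):
    """Fold change in apparent affinity implied by a difference on a log10 scale.

    Assumes positive values mean stronger binding than the wild type.
    """
    return 10 ** delta_log10_kd_app


# Values quoted in the text for the two substitutions at position 501.
DELTA_LOG10_KD_APP = {"N501Y": 0.24, "N501W": 0.11}

for mutation, delta in DELTA_LOG10_KD_APP.items():
    print(f"{mutation}: about {affinity_fold_change(delta):.2f}-fold the wild type apparent affinity")
```

Under this assumption both substitutions correspond to less than a twofold gain in apparent affinity, with N501Y ahead of N501W.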

We consider all mutations with increased predicted occupancy of the open state in Table 1 as good candidates for further experimental validation to better understand the role of binding and dynamics of Spike and their role in SARS-CoV-2 infectivity. Furthermore, we suggest that their appearance in outbreaks should be closely monitored.

SARS-CoV-2 Variants: B.1.1.7, 501.V2, P.1, Delta and Delta+

The mutation N501Y above appears in the B.1.1.7 variant first observed in the UK [55] as well as the 501.V2 variant first observed in South Africa [56] and the P.1 variant from Brazil [57], all of which are rapidly spreading around the globe. These strains contain additional mutations in Spike. Namely, B.1.1.7 contains N501Y, A570D, D614G, P681H, T716I, S982A, D1118H and deletions on positions 69, 70 and 144. As the number of normal modes is related to the number of amino acids, we are unable to model deletions while still making comparisons with the wild type strain given the nature of the quantities calculated (Eqs 2 and 6). Therefore, the deletions of three residues at positions 69, 70 and 144 that are present in B.1.1.7 were not modelled here. 501.V2 includes the mutations L18F, D80A, D215G, R246I, K417N, E484K, N501Y, D614G, A701V. The P.1 variant includes the mutations L18F, T20N, P26S, D138Y, R190S, K417T, E484K, N501Y, D614G, H655Y, T1027I. The Dynamical Signatures for B.1.1.7, 501.V2 and P.1 show a strong rigidification of the open state and added flexibility of the closed state (S5–S7 Figs respectively), leading to VDS values of 5.30x10-1 J.K-1, 6.45x10-1 J.K-1 and 6.88x10-1 J.K-1 and open state occupancies of 36.2%, 35.8% and 39.8% for B.1.1.7, 501.V2 and P.1 respectively. The three variants thus show an increase in open state occupancy of roughly 10 to 14 percentage points over the wild type (25.8%), that is, a relative increase of approximately 40% for B.1.1.7 and 501.V2 and about 54% for P.1. Despite our preference for modelling the smallest possible number of mutations and therefore using the engineered structure containing the modified Furin cleavage site and proline modification, we also modelled B.1.1.7 (except the deletions), 501.V2 and P.1 using the original sequence of Spike with a non-modified Furin cleavage site. We obtain 33.0%, 33.6%, and 38.9% occupancy for B.1.1.7, 501.V2 and P.1 respectively.
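Because an increase in occupancy can be read either in percentage points or relative to the wild type value, the short calculation below makes both explicit for the three variants, using the open state occupancies quoted above for the engineered background. The names are ours and the numbers are only those given in the text.

```python
WT_OPEN = 25.8  # predicted wild type open-state occupancy (%)

# Predicted open-state occupancies (%) quoted in the text.
VARIANT_OPEN = {"B.1.1.7": 36.2, "501.V2": 35.8, "P.1": 39.8}


def occupancy_gain(variant_open, wt_open=WT_OPEN):
    """Return (gain in percentage points, gain relative to the wild type in %)."""
    absolute = variant_open - wt_open
    relative = 100.0 * absolute / wt_open
    return absolute, relative


for name, open_occ in VARIANT_OPEN.items():
    absolute, relative = occupancy_gain(open_occ)
    print(f"{name}: +{absolute:.1f} points, +{relative:.1f}% relative to wild type")
```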

Different factors may contribute to the apparent evolutionary advantages of the above strains through the course of the pandemic. The N501Y substitution, as mentioned, is among the candidates we pointed to as enabling occupancy shifts towards the open conformation, but was also shown to have higher binding affinity to the receptor ACE2 [53]. The E484K mutation does not increase the occupancy of the open state in our predictions and also does not increase ACE2 binding affinity [53], but was observed to facilitate immune escape in a study with human serum antibodies from subjects that recovered from COVID-19 [58]. Conversely, mutations like K417N and K417T were not observed to cause favourable changes in expression (a proxy measurement for stability), binding [53] or immune recognition [58], but in our high throughput evaluation they were predicted, like several other substitutions in this position, to increase the occupancy of the open state.

All predictive studies on Spike mutations, with both computational and experimental approaches, covered single mutations only. Therefore, it is possible that combinations of substitutions that constitute each new variant bring about advantages due to a number of factors that are hard to decouple or are yet to be determined. For example, the Indian variant B.1.617.2 (also known as Delta) [59] contains a core number of mutations in Spike (G142D, E154K, L452R, E484Q, D614G, P681R, Q1071H) but also additional sub-strain variations (T95I, H1101D or V382L) [60]. The B.1.617.2 strain and its variations above have predicted open state occupancies comparable to or lower than that of the wild type in our calculations. This suggests that other unknown factors, alone or in combination, are at play if the preliminary data suggesting increased transmissibility [61] are confirmed on larger samples. A newer variation of the Delta strain, AY.1 (or Delta+), does however contain the K417N mutation (that we predict to increase open state occupancy), similar to the South African strain 501.V2. A preliminary report [62] places AY.1 as a variant of concern (VOC).

Conclusions

SARS-CoV-2 mutations are still arising and spreading around the world. The A222V mutation, reportedly responsible for many infections, emerged in Spain during the Summer of 2020 and since then has spread to neighbouring countries [54]. In Denmark, new strains related to SARS-CoV-2 transmission in mink farms were confirmed in early October by the WHO and shown to be caused by specific mutations not previously observed, with the novelty of back-and-forth transmission between minks and humans [63]. A new strain containing N501Y first appeared in the UK, and the recent Delta+ strain (AY.1) containing K417N is on the rise at the time of writing. Such occurrences point to the possibility that new mutations in SARS-CoV-2 may bring about more infectious strains.

Using the low computational cost methods described in this paper, it is possible to predict potential variants that might have an advantage over the wild type virus insofar as these are the result of changes in occupancy of states. Despite the limitations of the simplified coarse-grained model employed here, our results correctly model the experimentally observed higher open state occupancy of several mutations. In our analyses, flexibility properties and conformational state occupancy probabilities contribute to the infectivity of SARS-CoV-2. Our results explain the behaviour of the D614G strain and the increased infectivity of SARS-CoV-2 relative to SARS-CoV, and offer a possible explanation for the rise of new strains such as those harboring the N501Y mutation.

The results we present on SARS-CoV-2 Spike mutations have several limitations. First and foremost, some of the in silico mutations discussed may not be thermodynamically stable, may affect expression, cleavage, or binding to ACE2, and our approach does not consider that Spike is, in fact, a glycoprotein and the sugar molecules may have an effect on dynamics. However, the agreement between our model and experimental observations as described above shows that the simplified model of Spike and the coarse-grained methods used here allow us to calculate dynamic properties of Spike that are relevant to understand infection and epidemiological behavior. It is important to keep in mind that all of the mutations that we discuss in Table 1 that lie between positions 331 and 531 within the RBD domain have already been experimentally shown to be viable [53]. However, we highlight the need for experimental validation of our predictions, particularly for those candidates that we believe would help elucidate the extent of the effect of the conformational dynamics of Spike on infectivity. Beyond in vitro biophysical studies, experimental alternatives exist such as using pseudo-type viruses or virus-like-particles that would not require studying gain-of-function mutations using intact viruses. Alternatively, loss-of-function mutations can be created with intact viruses and compared to the wild type SARS-CoV-2 virus to validate the role of dynamics on infectivity.

After this work first became available as a preprint in December 2020 [64], extensive molecular dynamics simulations totalling 20 μs and investigating the effect of the D614G mutation on conformational dynamics have been conducted with the full complement of glycans [65]. The authors found that this mutation increased the occupancy of the up (open) state, in line with the results presented in our work. This makes us confident that our coarse-grained model lacking glycans still captures the essential dynamics of the Spike protein and validates its application for high throughput screening of mutations, which would be too costly computationally to conduct using long all-atom MD simulations.

Two studies that determined the structures of the D614G mutant in the open and closed states via cryo-electron microscopy (cryo-EM) [66,67] were published and came to our attention after this manuscript was posted as a preprint. Both studies come to the same conclusion as ours, namely that the open state is favoured in the D614G mutant. More recently, the structures of the B.1.1.7 (UK), 501.V2 (SA) and P.1 (Brazil) variants modelled here became available [68]. The authors show in all cases an increased propensity for the open state. In summary, while this computational work has been under review, a number of studies corroborating our conclusions have appeared: complementary molecular dynamics simulations and experimental work for the single mutant D614G, as well as experimental work on the UK, SA and Brazil variants.

This work demonstrates how a simplified coarse-grained model can capture essential aspects of the effect of mutations on the dynamics between conformational states for the SARS-CoV-2 Spike protein. Furthermore, the low computational cost of the calculations allowed us to predict mutations with evolutionary advantages that appeared during the COVID-19 pandemic. Lastly, as the COVID-19 pandemic is still ongoing, our high-throughput data can contribute to the risk assessment of future variants.

To the best of our knowledge, this is the first time that a normal mode analysis method is used to model the effect of mutations on the occupancy of conformational states, opening a new opportunity in computational biophysics to create dynamic models of transitions between conformational states of proteins based on physical properties and sensitive to sequence variations. We hope that our results contribute to inform the research community in understanding SARS-CoV-2 infection mechanisms, open new possibilities in computational biophysics to study protein dynamics and help public health surveillance programs decide on the risk posed by new strains given the appropriateness of our method for large-scale uses.

Supporting information

S1 Fig

Dynamical Signature clustering between WT, 22 mutants observed in nature and 30 random mutants designed, both for the full structure (A) and only for the RBD (B). The grey arrow highlights the clusters containing most of the mutants that have the mutation D614G.

(TIF)

S2 Fig. Effects on the Dynamical Signatures of the mutation from D614 to each one of the other 19 possible residues.

(TIF)

S3 Fig

Dynamical Signature differences for three mutations among the top VDS values (A: K417E, G252M and G404Y) and the bottom VDS values (B: R355Y, F464L and I231P) for closed (purple) and open (blue) B chain RBD structures. Chains A, B and C are marked on the top in green, cyan and magenta, respectively, and the point mutations are marked in yellow.

(TIF)

S4 Fig. Simulated Pearson correlations between gaussian noise vectors of length 6 for the 110 iterations associated to each combination of the parameters k = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100] and γ = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000, 100000, 1000000].

(TIF)

S5 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the B.1.1.7 variant.

(TIF)

S6 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the 501.V2 variant.

(TIF)

S7 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the P.1 variant.

(TIF)

S1 Table. Amino acid mutations associated to 13741 sequences of the Spike protein available on May 08 in COVID-19 Viral Genome Analysis Pipeline, enabled by data from GISAID.

(DOCX)

S2 Table. Random mutants accumulating from one to four mutations.

(DOCX)

S3 Table. Putative mutations and associated VDS (ΔSvib (open) − ΔSvib (closed), in units of J.K-1) and predicted occupancies, with open occupancies lower than for the wild type, for the 64 mutants with VDS>0.3 and the 20 mutants with lowest VDS values.

(DOCX)

Acknowledgments

RN is a Fonds de Recherche du Québec—Santé (FRQ-S) Senior Fellow, a member of the Réseau Québécois de Recherche sur les Médicaments (RQRM) and the Quebec Network for Research on Protein Function, Engineering and Applications (PROTEO). The authors would like to dedicate this work to the memory of Mordechai Najmanovich, Z”L, father of RN, who passed away from complications due to COVID-19 on November 26, 2020. RN would like to thank all healthcare workers, particularly ICU nurses and physicians at the Avista Adventist Hospital in Louisville, Colorado, for their efforts.

Data Availability

Raw data and structures used to build the images presented here are available in a Github repository (https://github.com/nataliateruel/data_Spike). All vibrational entropy results are available for visualisation and analysis through a link to the dms-view open-access tool, available on GitHub through the same URL above. The Najmanovich Research Group Toolkit for Elastic Networks (NRGTEN) including the latest ENCoM implementation is freely available at https://github.com/gregorpatof/nrgten_package.

Funding Statement

OM is the recipient of a PhD fellowship from the Fonds de Recherche du Québec - Nature et Technologie (FRQ-NT). This work was funded by grants from Genome Canada (http://genomecanada.ca), Genome Québec (https://www.genomequebec.com) as well as the Natural Sciences and Engineering Research Council (NSERC) grant number RGPIN05332-2019. This research was enabled in part by support provided by Calcul Québec (https://www.calculquebec.ca/) and Compute Canada (www.computecanada.ca). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Zhou P, Yang X-L, Wang X-G, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579: 270–273. doi: 10.1038/s41586-020-2012-7
  • 2. Lu R, Zhao X, Li J, Niu P, Yang B, Wu H, et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet. 2020;395: 565–574. doi: 10.1016/s0140-6736(20)30251-8
  • 3. World Health Organization. Weekly epidemiological update - 5 January 2021. 2021 Jan. Available: https://www.who.int/publications/m/item/weekly-epidemiological-update---5-january-2021
  • 4. US Centers for Disease Control. SARS Basics Fact Sheet. n.d. [cited 2AD]. Available: https://www.cdc.gov/sars/about/fs-sars.html
  • 5. Wang L-F, Shi Z, Zhang S, Field H, Daszak P, Eaton BT. Review of bats and SARS. Emerging infectious diseases. 2006;12: 1834–1840. doi: 10.3201/eid1212.060401
  • 6. Memish ZA, Perlman S, Kerkhove MDV, Zumla A. Middle East respiratory syndrome. Lancet. 2020;395: 1063–1077. doi: 10.1016/s0140-6736(19)33221-0
  • 7. Haagmans BL, Dhahiry SHSA, Reusken CBEM, Raj VS, Galiano M, Myers R, et al. Middle East respiratory syndrome coronavirus in dromedary camels: an outbreak investigation. The Lancet Infectious diseases. 2014;14: 140–145. doi: 10.1016/s1473-3099(13)70690-x
  • 8. Zhao Z, Li H, Wu X, Zhong Y, Zhang K, Zhang Y-P, et al. Moderate mutation rate in the SARS coronavirus genome and its implications. BMC evolutionary biology. 2004;4: 21. doi: 10.1186/1471-2148-4-21
  • 9. Koyama T, Platt D, Parida L. Variant analysis of SARS-CoV-2 genomes. Bulletin of the World Health Organization. 2020;98: 495–504. doi: 10.2471/blt.20.253591
  • 10. Letko M, Marzi A, Munster V. Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nature Microbiology. 2020;5: 562–569. doi: 10.1038/s41564-020-0688-y
  • 11. Yan R, Zhang Y, Li Y, Xia L, Guo Y, Zhou Q. Structural basis for the recognition of the SARS-CoV-2 by full-length human ACE2. Science (New York, NY). 2020; eabb2762. doi: 10.1126/science.abb2762
  • 12. Shang J, Wan Y, Luo C, Ye G, Geng Q, Auerbach A, et al. Cell entry mechanisms of SARS-CoV-2. Proc Natl Acad Sci U S A. 2020;117: 11727–11734. doi: 10.1073/pnas.2003138117
  • 13. Selvaraj C, Dinesh DC, Panwar U, Abhirami R, Boura E, Singh SK. Structure-based virtual screening and molecular dynamics simulation of SARS-CoV-2 Guanine-N7 methyltransferase (nsp14) for identifying antiviral inhibitors against COVID-19. Journal of biomolecular structure & dynamics. 2020;57: 1–12. doi: 10.1080/07391102.2020.1778535
  • 14. Ali A, Vijayan R. Dynamics of the ACE2-SARS-CoV-2/SARS-CoV spike protein interface reveal unique mechanisms. Scientific Reports. 2020;10: 14214–12. doi: 10.1038/s41598-020-71188-3
  • 15. Suárez D, Díaz N. SARS-CoV-2 Main Protease: A Molecular Dynamics Study. J Chem Inf Model. 2020;60: 5815–5831. doi: 10.1021/acs.jcim.0c00575
  • 16. Pinto D, Park Y-J, Beltramello M, Walls AC, Tortorici MA, Bianchi S, et al. Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody. Nature. 2020;583: 290–295. doi: 10.1038/s41586-020-2349-y
  • 17. Rogers TF, Zhao F, Huang D, Beutler N, Burns A, He W-T, et al. Isolation of potent SARS-CoV-2 neutralizing antibodies and protection from disease in a small animal model. Science (New York, NY). 2020;369: 956–963. doi: 10.1126/science.abc7520
  • 18. Cao Y, Su B, Guo X, Sun W, Deng Y, Bao L, et al. Potent Neutralizing Antibodies against SARS-CoV-2 Identified by High-Throughput Single-Cell Sequencing of Convalescent Patients’ B Cells. Cell. 2020;182: 73–84.e16. doi: 10.1016/j.cell.2020.05.025
  • 19. Deganutti G, Prischi F, Reynolds CA. Supervised molecular dynamics for exploring the druggability of the SARS-CoV-2 spike protein. Journal of Computer-Aided Molecular Design. 2020;20: 1015–13. doi: 10.1007/s10822-020-00356-4
  • 20. Arantes PR, Saha A, Palermo G. Fighting COVID-19 Using Molecular Dynamics Simulations. 28Oct2020: 1654–1656.
  • 21. Karathanou K, Lazaratos M, Bertalan É, Siemers M, Buzar K, Schertler GFX, et al. A graph-based approach identifies dynamic H-bond communication networks in spike protein S of SARS-CoV-2. J Struct Biol. 2020;212: 107617. doi: 10.1016/j.jsb.2020.107617
  • 22. Melero R, Sorzano COS, Foster B, Vilas J-L, Martínez M, Marabini R, et al. Continuous flexibility analysis of SARS-CoV-2 spike prefusion structures. IUCrJ. 2020;7: 1059–1069. doi: 10.1107/s2052252520012725
  • 23. Verkhivker GM. Molecular Simulations and Network Modeling Reveal an Allosteric Signaling in the SARS-CoV-2 Spike Proteins. Journal of proteome research. 2020;19: 4587–4608. doi: 10.1021/acs.jproteome.0c00654
  • 24. Majumder S, Chaudhuri D, Datta J, Giri K. Exploring the intrinsic dynamics of SARS-CoV-2, SARS-CoV and MERS-CoV spike glycoprotein through normal mode analysis using anisotropic network model. Journal of molecular graphics & modelling. 2021;102: 107778. doi: 10.1016/j.jmgm.2020.107778
  • 25. Frappier V, Najmanovich RJ. A coarse-grained elastic network atom contact model and its use in the simulation of protein dynamics and the prediction of the effect of mutations. MacKerell AD, editor. PLoS computational biology. 2014;10: e1003569. doi: 10.1371/journal.pcbi.1003569
  • 26. Frappier V, Chartier M, Najmanovich RJ. ENCoM server: exploring protein conformational space and the effect of mutations on protein function and stability. Nucleic acids research. 2015;43: W395–400. doi: 10.1093/nar/gkv343
  • 27. Frappier V, Chartier M, Najmanovich R. Applications of Normal Mode Analysis Methods in Computational Protein Design. Methods in molecular biology (Clifton, NJ). 2017;1529: 203–214. doi: 10.1007/978-1-4939-6637-0_9
  • 28. Frappier V, Najmanovich RJ. Vibrational entropy differences between mesophile and thermophile proteins and their use in protein engineering. Protein Science. 2015;24: 474–483. doi: 10.1002/pro.2592
  • 29. Walls AC, Park Y-J, Tortorici MA, Wall A, McGuire AT, Veesler D. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell. 2020;181: 281–292.e6. doi: 10.1016/j.cell.2020.02.058
  • 30. Gobeil SM-C, Janowska K, McDowell S, Mansouri K, Parks R, Manne K, et al. D614G Mutation Alters SARS-CoV-2 Spike Conformation and Enhances Protease Cleavage at the S1/S2 Junction. Cell Reports. 2021;34: 108630. doi: 10.1016/j.celrep.2020.108630
  • 31. Yuan Y, Cao D, Zhang Y, Ma J, Qi J, Wang Q, et al. Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains. Nature communications. 2017;8: 15092–9. doi: 10.1038/ncomms15092
  • 32. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. Journal of molecular biology. 1993;234: 779–815. doi: 10.1006/jmbi.1993.1626
  • 33. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic acids research. 2005;33: W382–8. doi: 10.1093/nar/gki387
  • 34. Xiong X, Qu K, Ciazynska KA, Hosmillo M, Carter AP, Ebrahimi S, et al. A thermostable, closed SARS-CoV-2 spike protein trimer. Nature structural & molecular biology. 2020;27: 934–941. doi: 10.1038/s41594-020-0478-5
  • 35.Wako H, Endo S. Normal mode analysis as a method to derive protein dynamics information from the Protein Data Bank. Biophysical Reviews. 2017;9: 877–893. doi: 10.1007/s12551-017-0330-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Tang Q-Y, Kaneko K. Long-range correlation in protein dynamics: Confirmation by structural data and normal mode analysis. Groot BL de, editor. PLoS computational biology. 2020;16: e1007670. doi: 10.1371/journal.pcbi.1007670 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Cui Q, Bahar I. Normal Mode Analysis. CRC Press; 2006. [Google Scholar]
  • 38.Xu B, Shen H, Zhu X, Li G. Fast and accurate computation schemes for evaluating vibrational entropy of proteins. Journal of Computational Chemistry. 2011;32: 3188–3193. doi: 10.1002/jcc.21900 [DOI] [PubMed] [Google Scholar]
  • 39.Mailhot O, Najmanovich R. The NRGTEN Python package: an extensible toolkit for coarse-grained normal mode analysis of proteins, nucleic acids, small molecules and their complexes. Bioinformatics. 2021. doi: 10.1093/bioinformatics/btab189 [DOI] [PubMed] [Google Scholar]
  • 40.Marques O, Sanejouand YH. Hinge-bending motion in citrate synthase arising from normal mode calculations. PROTEINS: Structure, Function and Genetics. 1995;23: 557–560. doi: 10.1002/prot.340230410 [DOI] [PubMed] [Google Scholar]
  • 41.Hilton S, Huddleston J, Black A, North K, Dingens A, Bedford T, et al. dms-view: Interactive visualization tool for deep mutational scanning data. Journal of Open Source Software. 2020;5: 2353. doi: 10.21105/joss.02353 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Volz E, Hill V, McCrone JT, Price A, Jorgensen D, O’Toole Á, et al. Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity. Cell. 2020. doi: 10.1016/j.cell.2020.11.020 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Li Q, Wu J, Nie J, Zhang L, Hao H, Liu S, et al. The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity. Cell. 2020;182: 1284–1294.e9. doi: 10.1016/j.cell.2020.07.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Korber B, Fischer WM, Gnanakaran S, Yoon H, Theiler J, Abfalterer W, et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell. 2020;182: 812–827.e19. doi: 10.1016/j.cell.2020.06.043 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Yurkovetskiy L, Wang X, Pascal KE, Tomkins-Tinch C, Nyalile TP, Wang Y, et al. Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant. Cell. 2020;183: 739–751.e8. doi: 10.1016/j.cell.2020.09.032 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Mohammad A, Alshawaf E, Marafie SK, Abu-Farha M, Abubaker J, Al-Mulla F. Higher binding affinity of Furin to SARS-CoV-2 spike (S) protein D614G could be associated with higher SARS-CoV-2 infectivity. International journal of infectious diseases: IJID: official publication of the International Society for Infectious Diseases. 2020. doi: 10.1016/j.ijid.2020.10.033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Tang L, Schulkins A, Chen C-N, Deshayes K, Kenney JS. The SARS-CoV-2 Spike Protein D614G Mutation Shows Increasing Dominance and May Confer a Structural Advantage to the Furin Cleavage Domain. 2020; 2020050407. doi: 10.20944/preprints202005.0407.v1 [DOI] [Google Scholar]
  • 48.Zhang L, Jackson CB, Mou H, Ojha A, Peng H, Quinlan BD, et al. SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity. Nat Commun. 2020;11: 6013. doi: 10.1038/s41467-020-19808-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Berger I, Schaffitzel C. The SARS-CoV-2 spike protein: balancing stability and infectivity. Cell research. 2020;30: 1059–1060. doi: 10.1038/s41422-020-00430-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Shu Y, McCauley J. GISAID: Global initiative on sharing all influenza data—from vision to reality. Euro surveillance: bulletin Europeen sur les maladies transmissibles = European communicable disease bulletin. 2017;22: 957. doi: 10.2807/1560-7917.es.2017.22.13.30494 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Global challenges (Hoboken, NJ). 2017;1: 33–46. doi: 10.1002/gch2.1018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Kirchdoerfer RN, Wang N, Pallesen J, Wrapp D, Turner HL, Cottrell CA, et al. Stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis. Scientific Reports. 2018;8: 15701–11. doi: 10.1038/s41598-018-34171-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Starr TN, Greaney AJ, Hilton SK, Ellis D, Crawford KHD, Dingens AS, et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell. 2020;182: 1295–1310.e20. doi: 10.1016/j.cell.2020.08.012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Hodcroft EB, Zuber M, Nadeau S, Vaughan TG, Crawford KHD, Althaus CL, et al. Spread of a SARS-CoV-2 variant through Europe in the summer of 2020. Nature. 2021; 1–6. doi: 10.1038/s41586-021-03677-y [DOI] [PubMed] [Google Scholar]
  • 55.Rambaut A, Loman N, Pybus O, Barclay W, Barrett J, Carabelli A, et al. Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with Cancer. Dec2020. Available: https://virological.org/t/preliminary-genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-the-uk-defined-by-a-novel-set-of-spike-mutations/563 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Tegally H, Wilkinson E, Giovanetti M, Iranzadeh A, Fonseca V, Giandhari J, et al. Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa. medRxiv: the preprint server for health sciences. 2020;34: 2020.12.21.20248640. doi: 10.1101/2020.12.21.20248640 [DOI] [Google Scholar]
  • 57.Paiva MHS, Guedes DRD, Docena C, Bezerra MF, Dezordi FZ, Machado LC, et al. Multiple Introductions Followed by Ongoing Community Spread of SARS-CoV-2 at One of the Largest Metropolitan Areas of Northeast Brazil. Viruses. 2020;12: 1414. doi: 10.3390/v12121414 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Greaney AJ, Loes AN, Crawford KHD, Starr TN, Malone KD, Chu HY, et al. Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies. Cell Host Microbe. 2021;29: 463–476.e6. doi: 10.1016/j.chom.2021.02.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Mlcochova P, Kemp S, Dhar MS, Papa G, Meng B, Mishra S, et al. SARS-CoV-2 B.1.617.2 Delta variant emergence and vaccine breakthrough. 2021. doi: 10.21203/rs.3.rs-637724/v1 [DOI] [Google Scholar]
  • 60.Yadav PD, Sapkal GN, Abraham P, Ella R, Deshpande G, Patil DY, et al. Neutralization of variant under investigation B.1.617 with sera of BBV152 vaccinees. Biorxiv. 2021; 2021.04.23.441101. doi: 10.1101/2021.04.23.441101 [DOI] [PubMed] [Google Scholar]
  • 61.Allen* H, Vusirikala* A, Flannagan J, Twohig KA, Zaidi A, Consortium C-U, et al. Increased Household Transmission of COVID-19 Cases—national case study.pdf. 2021 Jul. Available: https://khub.net/documents/135939561/405676950/Increased+Household+Transmission+of+COVID-19+Cases+-+national+case+study.pdf/7f7764fb-ecb0-da31-77b3-b1a8ef7be9aa
  • 62.Public-Health-England. Variants of Concern VOC Technical Briefing 15–2. 2021 Jun.
  • 63.Munnink BBO, Sikkema RS, Nieuwenhuijse DF, Molenaar RJ, Munger E, Molenkamp R, et al. Transmission of SARS-CoV-2 on mink farms between humans and mink and back to humans. Science (New York, NY). 2020; eabe5901. doi: 10.1126/science.abe5901 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Teruel N, Mailhot O, Najmanovich RJ. Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants. Biorxiv. 2021; 2020.12.16.423118. doi: 10.1101/2020.12.16.423118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Mansbach RA, Chakraborty S, Nguyen K, Montefiori DC, Korber B, Gnanakaran S. The SARS-CoV-2 Spike variant D614G favors an open conformational state. Sci Adv. 2021;7: eabf3671. doi: 10.1126/sciadv.abf3671 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Zhang J, Cai Y, Xiao T, Lu J, Peng H, Sterling SM, et al. Structural impact on SARS-CoV-2 spike protein by D614G substitution. Science. 2021;372: eabf2303. doi: 10.1126/science.abf2303 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Benton DJ, Wrobel AG, Roustan C, Borg A, Xu P, Martin SR, et al. The effect of the D614G substitution on the structure of the spike glycoprotein of SARS-CoV-2. Proc National Acad Sci. 2021;118: e2022586118. doi: 10.1073/pnas.2022586118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Gobeil SM-C, Janowska K, McDowell S, Mansouri K, Parks R, Stalls V, et al. Effect of natural mutations of SARS-CoV-2 on spike structure, conformation, and antigenicity. Sci New York N Y. 2021. doi: 10.1126/science.abi6226 [DOI] [PMC free article] [PubMed] [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009286.r001

Decision Letter 0

Roland L Dunbrack Jr, Arne Elofsson

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

14 May 2021

Dear Dr. Najmanovich,

Thank you very much for submitting your manuscript "Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Both reviewers were positive overall. The first reviewer had concerns primarily about the presentation. However, the second reviewer had more significant concerns about some of the methodology (glycans) and about a focused comparison to experimental data, which I think would improve the paper.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Roland L. Dunbrack Jr., Ph.D.

Associate Editor

PLOS Computational Biology

Arne Elofsson

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The manuscript describes an interesting and timely computational study of SARS-CoV-2 Spike protein variants and their impact on infectivity. The results will be very useful not only for the scientific community but also for the public health systems. The methodology presented could be of great help to predict the risk of new strains.

It is well written. It would, however, benefit from a thorough review:

- Page 9: Explain the meaning of “May 08”

- In Figure 3, it is not clear which data are represented on the y-axis.

- Figure 8: The results shown in Figure 8 should be better explained in the main text because there are some inconsistencies. The figure legend should be revised.

- The usefulness of section 3.5 is not clear

- Section 3.6 would be better placed in the Materials and Methods section

- A final conclusion should be included. The current conclusions read more like a summary of the manuscript.

Reviewer #2: In the manuscript “Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants” the authors utilize coarse-grained normal mode analyses to model the dynamics of Spike proteins and calculate transition probabilities between states for a number of Spike variants. The results predict an increase in open-state occupancy for the more infectious D614G via an increase in flexibility of the closed-state and decrease of flexibility of the open-state. The manuscript also presents a high throughput analysis of simulated single amino acid mutations on dynamic properties to seek potential hotspots and individual Spike variants that may be more infectious. The authors introduce a Markov model of occupancy of molecular states with transition probabilities derived from their analysis of dynamics that recapitulates experimental data on conformational state occupancies.

The biological problems addressed in this work are of clear fundamental and therapeutic interest and insights from computational approaches are certainly welcome to improve our understanding of the SARS-CoV-2 spike mechanisms and interactions.

Major points:

1. This is a fairly well-executed technical study describing an interesting combination of computational simulation tools to understand mechanisms of SARS-CoV-2 spike proteins in the native and mutant states. Although some of the presented results are certainly very interesting, the manuscript lacks organization, structure and a clearly formulated methodological objective. The overall presentation of the results is fragmented, making it difficult to understand the logic and methodological details of this work.

2. There is an enormous literature on this topic (both computational and experimental) that is only very briefly mentioned in the Introduction. The authors should critically assess the previous studies and, more importantly, identify key issues and questions unanswered thus far.

3. The CG simulations performed apparently do not include the glycosylation of the spike, strongly reducing the biological relevance of the entire work. Perhaps the authors should consider a model to mimic the glycosylated microenvironment in the framework of CG approaches. Although glycans are not supported by many CG methods, which represents an important limitation in adopting this specific computational approach to study viral fusion proteins where glycosylation plays a key role, there exist CG methods (such as Martini) where the energetic parameters have recently been extended to N-glycans. Glycans have been widely found to be crucial in the modulation of the spike conformational dynamics and should be considered for modeling of spike proteins. The inclusion of glycans in the model could potentially change the results, and this comparison would be very important to substantiate the main findings and conclusions.

4. The authors claim that their results correctly model an increase in open-state occupancy for the more infectious D614G via an increase in flexibility of the closed-state and decrease of flexibility of the open-state.

Cryo-EM structures of a full-length G614 trimer have recently been reported (Structural impact on SARS-CoV-2 spike protein by D614G substitution. Science 2021, 372, 525-530) for distinct prefusion conformations in the closed, intermediate and 1-up open states (PDB IDs 7krq, 7krs, 7krr), characterizing previously disordered regions in the spike protein. This study supported the reduced shedding mechanism and suggested increased stability of the G614 mutant. At the same time, another recent work in PNAS (The effect of the D614G substitution on the structure of the spike glycoprotein of SARS-CoV-2. Proc. Natl. Acad. Sci. U. S. A. 2021, 118, e2022586118; PDB IDs 7bnm, 7bnn, 7bno) reported G614 mutant structures that were more flexible and wide-open, which is in line with the increased flexibility of the open state as proposed in the reviewed manuscript. These studies proposed different mechanisms, but this may reflect the diversity of conformational states adopted by the G614 mutant spike trimer.

Given the simplicity of the elastic network models, I would suggest testing these structures (or at least some of them) to try to reconcile conflicting mechanisms and also to understand the effect of the G614 structures on the results and predictions.

5. Could the authors more clearly identify what makes their findings novel to the biological community? What do the results of this study add to our current knowledge of the role of protein dynamics in these mechanisms?

6. It would be desirable to also use all-atom MD simulations for some of the studied systems to allow for a comparative analysis of protein flexibility. In general, the analysis of the computational simulations is not sufficiently justified, which weakens its connection with the biological evidence.

7. I believe that the authors should spend some time thinking how to strengthen the interface between experiment and computations in the manuscript to substantiate key findings.

8. Although the system is fascinating and the computational approach is generally appropriate, the manuscript often reads as a set of disconnected observations rather than a cohesive story with detailed analysis and insightful discussion.

9. The results often lack proper interpretation and integration with experiment to justify findings.

10. I believe that the authors should spend some time thinking how to strengthen the interface between experiment and computations in the manuscript to substantiate key findings.

Minor points:

1. The illustrations are often not sufficiently informative and are generally very poor. Many of the plots and data cannot be seen at all. The authors should redesign and redo most of these figures to make them better organized, more visible and more informative, with the necessary annotations.

2. The manuscript is lacking a systematic statistical framework for assessing significance and quality of predictions. The authors should more clearly formulate and apply their statistical instruments along a common strategy to provide more confidence of quality and reproducibility of their results.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009286.r003

Decision Letter 1

Roland L Dunbrack Jr, Arne Elofsson

17 Jul 2021

Dear Dr. Najmanovich,

We are pleased to inform you that your manuscript 'Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants' has been provisionally accepted for publication in PLOS Computational Biology.

I'm persuaded the new experimental data confirms your calculations and that your paper should be accepted.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Roland L. Dunbrack Jr., Ph.D.

Associate Editor

PLOS Computational Biology

Arne Elofsson

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009286.r004

Acceptance letter

Roland L Dunbrack Jr, Arne Elofsson

3 Aug 2021

PCOMPBIOL-D-21-00189R1

Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants

Dear Dr Najmanovich,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Katalin Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig

    Clustering of Dynamical Signatures for the WT, 22 mutants observed in nature and 30 designed random mutants, both for the full structure (A) and for the RBD only (B). The grey arrow highlights the clusters containing most of the mutants that carry the D614G mutation.

    (TIF)

    S2 Fig. Effects on the Dynamical Signatures of the mutation from D614 to each one of the other 18 possible residues.

    (TIF)

    S3 Fig

    Dynamical Signature differences for three mutations among the top VDS (A)–K417E, G252M and G404Y –and bottom VDS values (B)–R355Y, F464L and I231P –for closed (purple) and open (blue) B chain RBD structures. Chains A, B and C are marked on the top in green, cyan and magenta, respectively, and the point mutations are marked in yellow.

    (TIF)

    S4 Fig. Simulated Pearson correlations between Gaussian noise vectors of length 6 for the 110 iterations associated with each combination of the parameters k = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100] and γ = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000, 100000, 1000000].

    (TIF)

    S5 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the B.1.1.7 variant.

    (TIF)

    S6 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the 501.V2 variant.

    (TIF)

    S7 Fig. Dynamical signature difference for the open (blue) and closed (purple) states relative to the wild type for the P.1 variant.

    (TIF)

    S1 Table. Amino acid mutations associated with 13741 sequences of the Spike protein available on May 08 in the COVID-19 Viral Genome Analysis Pipeline, enabled by data from GISAID.

    (DOCX)

    S2 Table. Random mutants accumulating from one to four mutations.

    (DOCX)

    S3 Table. Putative mutations and associated VDS (ΔSvib(open) − ΔSvib(closed), in units of J·K⁻¹) and predicted occupancies with open occupancies lower than for the wild type for the 64 mutants with FDS>0.3 and the 20 mutants with lowest FDS values.

    (DOCX)

    Attachment

    Submitted filename: Teruel_SARS-CoV-2_Spike20210708_Response.pdf
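As a rough illustration of the kind of null simulation summarised in the S4 Fig caption, the sketch below draws pairs of independent Gaussian noise vectors of length 6 and computes their Pearson correlations. It is not the authors' procedure: the function name `null_pearson`, the number of pairs and the 0.8 threshold are assumptions made for illustration, and the roles of the parameters k and γ are not reproduced.

```python
import numpy as np

def null_pearson(n_points=6, n_pairs=10_000, seed=0):
    """Pearson r between pairs of independent Gaussian noise vectors.

    n_points=6 follows the vector length quoted in the S4 Fig caption;
    n_pairs and seed are arbitrary choices for this sketch.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_pairs, n_points))
    y = rng.standard_normal((n_pairs, n_points))
    # Centre each vector, then apply the textbook Pearson formula row by row.
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    num = (xc * yc).sum(axis=1)
    den = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return num / den

r = null_pearson()
# Fraction of pure-noise pairs whose correlation looks "strong" by chance.
frac_high = float(np.mean(np.abs(r) > 0.8))
print(f"mean r = {r.mean():.3f}, sd = {r.std():.3f}, P(|r| > 0.8) = {frac_high:.3f}")
```

With only six points per vector the null distribution is wide, so a non-negligible share of pure-noise pairs can exceed a seemingly high threshold. This is one reason to compare an observed correlation over six data points against a simulated baseline.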

    Data Availability Statement

    Raw data and structures used to build the images presented here are available in a GitHub repository (https://github.com/nataliateruel/data_Spike). All vibrational entropy results are available for visualisation and analysis through a link to the dms-view open-access tool, available on GitHub through the same URL above. The Najmanovich Research Group Toolkit for Elastic Networks (NRGTEN) including the latest ENCoM implementation is freely available at https://github.com/gregorpatof/nrgten_package.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
