2021 Apr 24;92:104874. doi: 10.1016/j.meegid.2021.104874

Spike protein mutational landscape in India during the complete lockdown phase: Could Muller's ratchet be a future game-changer for COVID-19?

Rachana Banerjee a, Kausik Basak a, Anamika Ghosh b, Vyshakh Rajachandran a, Kamakshi Sureka a, Debabani Ganguly a, Sujay Chattopadhyay a
PMCID: PMC8084351  PMID: 33905891

Abstract

The dire need for effective preventive measures and treatment approaches against the SARS-CoV-2 virus, which causes the COVID-19 pandemic, calls for an in-depth understanding of its evolutionary dynamics with attention to specific geographic locations, since lockdown and social distancing to prevent the spread of the virus could lead to distinct localized dynamics of virus evolution within and between countries owing to different environmental and host-specific selection pressures. To decipher any correlation between SARS-CoV-2 evolution and its epidemiology in India, we studied the mutational diversity of the spike glycoprotein, the key player in the attachment, fusion and entry of the virus into the host cell. For this, we analyzed the sequences of 630 Indian isolates available in the GISAID database till June 07, 2020 (the time-period before the start of Unlock 1.0 in India on June 08, 2020), and found that the spike protein variants emerged from two major ancestors – Wuhan-Hu-1/2019 and its D614G variant. The average stability of the docked spike protein – host receptor (S-R) complexes for these variants correlated strongly (R² = 0.96) with the fatality rates across Indian states. However, while more than half of the variants were found to be unique to India, 67% of all variants showed lower stability of the S-R complex than the respective ancestral variants, indicating a possible fitness loss in recently emerged variants, despite a continuous increase in mutation rate. These results conform to the sharply declining fatality rate countrywide (>7-fold during April 11 – June 28, 2020). Altogether, while we propose the potential of S-R complex stability to track disease severity, we urge an immediate need to explore if SARS-CoV-2 is approaching mutational meltdown in India.

Keywords: SARS-CoV-2, Muller's ratchet, Mutational meltdown, Molecular docking, Viral evolutionary dynamics

1. Introduction

The emergence and rapid global spread of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), has led to an unprecedented worldwide public health crisis, with the virus crossing the species barrier and disseminating rapidly through the global population (Coronaviridae Study Group of the International Committee on Taxonomy of, 2020). As with any other rapidly evolving emerging pathogen, effective preventive measures and treatment approaches call for an in-depth understanding of the evolutionary dynamics of SARS-CoV-2 and its association with the epidemiological data (Holmes et al., 1995; Pybus and Rambaut, 2009), with distinct attention to specific geographic regions. The nature and strength of the selection pressures acting on the pathogen can vary across ethnicities, countries, or even specific administrative regions (such as states or provinces) within a country like India, China or the USA, largely because of the implementation of social distancing via lockdown as a preventive measure against the geographic spread of the pathogen between and within countries.

The transmembrane spike glycoprotein protrudes from the viral surface and is responsible for viral attachment, fusion and entry into the host cells, thereby establishing the infection (Andersen et al., 2020; Tortorici and Veesler, 2019). The two most notable features of this protein are (a) the binding of its surface unit (S1) to the human cellular receptor angiotensin converting enzyme 2 (hACE2), and (b) the fusion of the viral membrane with a host cell membrane via its transmembrane unit (S2). The S1 subunit contains the receptor-binding domain (RBD) and supports the stabilization of the membrane-anchored state of the S2 subunit along with its fusion machinery (Gui et al., 2017; Kirchdoerfer et al., 2016; Pallesen et al., 2017; Song et al., 2018; Walls et al., 2016; Walls et al., 2017). Previous works have reported that extensive and irreversible conformational changes stimulate the cleavage of the spike protein, thereby activating it for membrane fusion (Belouzard et al., 2009; Heald-Sargent and Gallagher, 2012; Millet and Whittaker, 2014, Millet and Whittaker, 2015; Park et al., 2016; Walls et al., 2020; Walls et al., 2017). Although it is not yet confirmed whether the differences owing to conformational changes are aiding the expansion of SARS-CoV-2 varieties by increasing or decreasing infectivity or transmissibility, researchers have confirmed the spike protein to be the key pathogenic determinant that differentiates SARS-CoV-2 from other SARS-related coronaviruses (Walls et al., 2020). It is predicted that mutations in the spike protein increase or decrease protein glycosylation, thereby enhancing or reducing the viral uptake by host cells (Brufsky and Lotze, 2020a; Gordon et al., 2020; Jia et al., 2020; Korber et al., 2020; Yao et al., 2020). Although the RBD in the spike protein is known as the most important player in recognizing the host receptor, it is highly likely that the region outside of this C-terminal domain in the S1 subunit, as well as the domains of the S2 subunit, also influence host-receptor binding allosterically.

The present study analyses spike protein variants from the sequenced genomes of Indian isolates available till June 7, 2020, a day before Unlock 1.0 was initiated by the Government of India. This helped us to understand the location-specific evolutionary patterns and driving forces behind the emerging SARS-CoV-2 infection and its potential epidemiological footprints in different parts of India during the four consecutive phases of the nationwide lockdown. We detected a strong correlation between the average stability of the complexes formed by the circulating spike protein variants with the host receptor (S-R complex) and the disease severity of a given location, suggesting S-R complex stability as a potential marker to assess the severity of the disease. Importantly, the majority of the emerging variants showed decreased stability, indicating accumulation of deleterious mutations in the spike protein. This conforms to the declining fatality rates of the disease countrywide. Could the fixation of these deleterious mutations in the population lead to mutational meltdown, following Muller's ratchet dynamics of evolution?

2. Methods

2.1. Analysis of sequence diversity and reconstruction of phylogeny

The average pairwise nucleotide diversity (π) and the rates of synonymous (dS) and nonsynonymous (dN) mutations for the spike protein-coding genes were calculated using MEGA version X (Kumar et al., 2018). TimeZone software (Chattopadhyay et al., 2013) was used to reconstruct the maximum-likelihood based phylogeny to map the protein variants and identify convergent amino acid changes (i.e. repeated independent or phylogenetically unlinked mutations at the same amino acid positions). The spike gene sequence from Wuhan-Hu-1/2019 genome was used as reference to detect the orthologs in the sequenced Indian genomes based on a threshold value of 95% for both nucleotide sequence diversity and gene length coverage.
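As an illustration of what the average pairwise nucleotide diversity (π) measures, the following minimal sketch computes the mean proportion of differing sites over all pairs of aligned sequences. It uses an uncorrected p-distance and skips gaps and ambiguous bases; MEGA offers several substitution models, and the source does not state which one was used, so this is an assumption. The toy alignment is hypothetical and is not taken from the study.

```python
from itertools import combinations

def pairwise_difference(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    valid = "ACGT"
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                diffs += 1
    return diffs / compared if compared else 0.0

def nucleotide_diversity(aligned_seqs):
    """Average pairwise difference (pi) over all sequence pairs."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy alignment, NOT data from the study.
toy_alignment = ["ATGTTTGTTT", "ATGTTTGTTT", "ATGTTCGTTT", "ATGTTTGTAT"]
pi = nucleotide_diversity(toy_alignment)
```

Multiplying the result by 100 expresses π as a percentage, which is the unit used for the diversity values reported in this paper.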

2.2. Identification of spike protein variants unique to India

All 17,529 spike protein sequences from worldwide isolates available till May 9, 2020 in the GISAID database (https://www.gisaid.org/) were downloaded. We implemented CD-HIT Suite (Huang et al., 2010; Li and Godzik, 2006) to cluster all the spike protein sequences considering 100% amino acid sequence identity as ortholog clustering criteria, and detected a total of 3706 clusters. Of these clusters, we considered 3577 clusters matching the complete length of spike protein (1273 amino acids), using the sequence from Wuhan-Hu-1/2019 (GenBank accession number MN908947) as reference. These 3577 spike protein sequences were aligned using ClustalW program (Higgins and Sharp, 1988; Rice et al., 2000). The resulting alignment was compared with Indian mutational variants mapped in the previous step to distinguish the variants unique to Indian isolates.
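The comparison described above can be sketched as follows: each aligned protein sequence is reduced to a variant label listing its substitutions relative to the reference, and the Indian labels absent from the worldwide set are reported as unique. This is only an illustrative sketch; the function names and toy sequences are hypothetical, insertions and deletions are not handled, and the authors' actual pipeline (CD-HIT clustering followed by ClustalW alignment) is not reproduced here.

```python
def call_variant(reference, sequence):
    """Label a sequence by its substitutions relative to the reference, e.g. 'D614G'.

    Both sequences must already be aligned and of equal length.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    changes = [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt
    ]
    return ":".join(changes) if changes else "ancestral"

def unique_to_india(indian_seqs, worldwide_seqs, reference):
    """Variant labels seen among Indian sequences but not in the worldwide set."""
    indian = {call_variant(reference, s) for s in indian_seqs}
    worldwide = {call_variant(reference, s) for s in worldwide_seqs}
    return indian - worldwide

# Hypothetical five-residue toy sequences, NOT real spike sequences.
toy_reference = "MKDLG"
toy_indian = ["MKDLG", "MKGLG", "MRGLG"]
toy_worldwide = ["MKDLG", "MKGLG"]
toy_unique = unique_to_india(toy_indian, toy_worldwide, toy_reference)
```

Collapsing sequences into a set of labels plays the same role as clustering at 100% amino acid identity: identical sequences map to a single variant.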

2.3. Analysis of state-wise diversity of spike protein variants

For each Indian state, we computed the number of spike protein variants and the frequency of each of those variants. The state-wise calculation of variant diversity was performed using Simpson's index (Simpson, 1949).
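A minimal sketch of a Simpson-type diversity calculation from per-variant isolate counts is given below. The source does not state which form of the index was reported (D, 1 − D, or 1/D); the sketch assumes the finite-sample form 1 − D, where D = Σ n(n − 1) / (N(N − 1)), so that larger values mean higher diversity. The counts are hypothetical.

```python
def simpson_diversity(counts):
    """Simpson's diversity (1 - D) from a list of per-variant isolate counts.

    D is the probability that two isolates drawn without replacement
    belong to the same variant.
    """
    n_total = sum(counts)
    if n_total < 2:
        return 0.0
    dominance = sum(n * (n - 1) for n in counts) / (n_total * (n_total - 1))
    return 1.0 - dominance

# Hypothetical counts for one state: three variants with 5, 3 and 2 isolates.
toy_diversity = simpson_diversity([5, 3, 2])
```

The index reflects both richness (the number of variants) and evenness (how equally isolates are spread among them): a state with a single variant scores 0.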

2.4. Modeling of spike protein – hACE2 complex variants

Variants of the ancestral spike protein (Wuhan-Hu-1/2019) were built by homology modeling using SWISS-MODEL (Waterhouse et al., 2018) with the aid of available templates (residue range: 27–1147), using the Wuhan-Hu-1/2019 isolate as reference (Walls et al., 2020; Wrapp et al., 2020). The X-ray crystal structure of human ACE2 was taken from the complex of the receptor binding domain (RBD) of the spike protein with hACE2 (PDB code: 6lzg) (Wang et al., 2020). We docked the RBD (residue range: 331–524) of the spike protein (residue range: 27–1147) mutants to the binding site of hACE2 using the HADDOCK webserver (van Dijk et al., 2012; van Zundert et al., 2016) by providing binding site information (Lan et al., 2020; Wrapp et al., 2020). The HADDOCK score (often referred to as the docking score in the text), expressed in arbitrary units, is a weighted sum of intermolecular interaction terms: electrostatic and van der Waals interactions between the docking partners, desolvation energy, restraint violation energy and the buried surface area upon binding. For each docked complex the HADDOCK score was estimated, and VMD (Humphrey et al., 1996) was used to visualize the structures.
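The weighted sum described above can be written in general form as below. The symbols are our notation, not the authors'. The weights depend on the docking stage and the HADDOCK version, and the source does not report which values were used. As far as we understand the HADDOCK 2.x documentation, the defaults for the final refinement stage are 1.0, 0.2, 1.0 and 0.1 for the van der Waals, electrostatic, desolvation and restraint terms respectively, with the buried surface area term entering only the earlier stages; this should be treated as an assumption to be checked against the server settings.

```latex
\text{HADDOCK score} =
    w_{\mathrm{vdW}}\,E_{\mathrm{vdW}}
  + w_{\mathrm{elec}}\,E_{\mathrm{elec}}
  + w_{\mathrm{desolv}}\,E_{\mathrm{desolv}}
  + w_{\mathrm{AIR}}\,E_{\mathrm{AIR}}
  + w_{\mathrm{BSA}}\,\mathrm{BSA}
```

Here E_vdW and E_elec are the intermolecular van der Waals and electrostatic energies, E_desolv is the desolvation energy, E_AIR is the restraint violation energy and BSA is the buried surface area.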

3. Results

3.1. Two major ancestors circulating in India lead to a burst of spike protein variants

We identified a total of 630 isolates with complete gene sequences encoding the spike protein based on the submissions of Indian SARS-CoV-2 genome sequences to GISAID till June 7, 2020 (S1 Table). The samples analyzed were isolated from 17 states and 2 union territories of India, collectively called ‘states’ hereafter. We found a countrywide average pairwise nucleotide diversity (π) of 0.048 ± 0.02%. The rates of synonymous (or silent) and nonsynonymous (or amino acid replacement) mutations were found to be 0.097 ± 0.05% and 0.033 ± 0.02% respectively. Phylogenetic analysis showed the Wuhan-Hu-1/2019 variant of the spike protein to be the most ancestral one, as expected, while the D614G variant, which emerged from the Wuhan-Hu-1/2019 variant, appeared to be another stable variant circulating in the Indian population. Since both these variants have established themselves in the worldwide population as two major ancestors of SARS-CoV-2 spike protein variants, we will hereafter refer to the Wuhan-Hu-1/2019 and D614G variants as ancestor 1 and ancestor 2 respectively.

Apart from giving rise to ancestor 2, ancestor 1 led to a total of 20 variants (Fig. 1 and S2 Table). Of these, the variant K77M evolved further to yield three more variants isolated from three different states (Bihar, Tamil Nadu and Telengana), suggesting the emergence of K77M as another stable variant. On the other hand, ancestor 2 showed about twice as much diversity, giving rise to 47 variants (Fig. 1 and S2 Table). In this ancestor 2 clade, several variants (L5F, T22I, L54F, G261S, T572I, E583D, Q677H, A706S, H1083Q) indicated their stability in the population by mutating further and giving rise to additional variants. Besides, although the ancestral variants were predominant in the population circulating in India, we detected a total of 16 spike protein variants that were represented by multiple isolates (Fig. 1), from 2 to as many as 9 isolates, indicating the possible fixation of some of these variants in the population irrespective of the stability of the S-R complex. Of these, mutations in 8 variants (at positions 5, 54, 78, 558, 574, 583, 677, 1243) were convergent: mutations at the same positions were phylogenetically unlinked, i.e. they arose repeatedly and independently (S2 Table). Interestingly, 53% of the total set of variants detected in the Indian population was found to be exclusive to India, i.e. not found in the 17,529 worldwide genomic isolates analyzed from the GISAID database till May 09, 2020 (S2 Table). We believe that this considerable level of uniqueness could be expected in almost all geographical regions during a complete lockdown period, where a newly emerging, rapidly evolving viral pathogen tries to adapt to a new host, and many of these variants might be detrimental to the fitness of the organism. However, specific positive selection pressures could also play a role in this mutational pattern, which needs to be studied separately in greater depth.

Fig. 1.


Schematic representation of the diversity of spike protein variants circulating in India, using maximum likelihood-based phylogeny reconstruction. Each node represents a specific spike protein variant, while the node-size and the number inside depict the frequency of that variant. The red or green color of each arrow indicates the higher or lower stability index respectively of the S-R complex for each variant than the major ancestral variant it emerged from (either Ancestor 1 or Ancestor 2). The black arrows lead to the variants for which the docking scores could not be determined either because of the presence of at least one variation outside the available template region for docking (Walls et al., 2020; Wrapp et al., 2020), or due to non-existing isolate in the lone hypothetical node with H1083Q mutation denoted by black color. This black node signifies a variant with no available isolate in our dataset, while it gives rise to two derived variants, H1083Q:R78M and H1083Q:E583D, for which representative isolates were available.

3.2. The average stability index of S-R complex correlates strongly with the fatality rates in a given location

We next looked into the distribution of these spike protein variants across Indian states (S3 Table). When we computed diversity based on both the richness and the evenness of spike variants, some states, such as Maharashtra, Odisha, West Bengal and Gujarat, demonstrated significantly higher (P < 0.05) diversity of circulating variants than most of the remaining states. However, while we had 196, 97, 75 and 73 sequenced isolates from Gujarat, Telengana, Delhi and Maharashtra respectively, the remaining states were represented by even smaller sample sizes (ranging from 1 to 48 isolates). Of these four, Delhi variants showed the lowest diversity, significantly different from both Maharashtra (P = 0.0002) and Gujarat (P = 0.0006), though not from Telengana (P = 0.16). On the other hand, Maharashtra variants presented significantly higher diversity than Telengana (P = 0.017), but not than Gujarat (P = 0.21).

It is expected that these variations might be in response to strong selection pressures acting on the spike protein, especially its S1 subunit which, being a major immunogenic target for the host, plays the pivotal role to evade the host immune response and to offer a successful viral entry. Therefore, the mutational variations in spike proteins can essentially affect the stability of the S-R complex. To assess this, we modeled each of the spike protein variants, and then docked to the binding site of host receptor, hACE2, using HADDOCK webserver (van Dijk et al., 2012; van Zundert et al., 2016) of data-driven docking algorithm by providing binding site information as the same was already established from the crystal structure (Lan et al., 2020; Wang et al., 2020). In the circulating variants, we detected a significant excess (P = 0.035) of mutations in the S1 subunit (with 41 mutations) compared to the S2 subunit (with 22 mutations). The mutation positions in the secondary structures of the analyzed variants are detailed in the S1 Dataset in supporting information.

The docking score (HADDOCK score) of each variant on hACE2 is hereafter designated as the stability index of the S-R complex. The more negative the docking (HADDOCK) score, the higher the stability of the S-R complex (Pantsar and Poso, 2018). Under the assumption that better stability would lead to better invasion of the virus into the host, we hypothesize that the stability index of a given spike protein variant could be linked to the severity of viral pathogenicity. To test this hypothesis, we measured the severity as the ‘fatality rate’, calculated simply as the ratio of the number of deceased people to the number of recovered people in a given state (available from the Government of India website: www.mygov.in/corona-data/covid19-statewise-status/).

While we aimed to estimate an average stability index for a given state based on the stability indices of the variants circulating in that location, we were limited by the available sample size and the information on collection diversity. Considering this issue, we restricted our study to the states having 50 or more sequenced samples for analysis. We could therefore assess the association of average stability index with fatality rates for three states and one union territory (Maharashtra, Gujarat, Telengana and Delhi) that met our sample size threshold. Importantly, these four regions harbored 70% of all samples analyzed across 19 states, while their variant diversity ranged from the highest to one of the lowest (S3 Table).
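The bookkeeping behind these two quantities can be sketched as follows. The fatality rate follows the definition given in the text (deceased divided by recovered), and states are kept only if they have 50 or more isolates. Whether the state average weights each variant by its number of isolates is not stated in the source; the sketch assumes isolate weighting. All names and numbers below are hypothetical placeholders, not values from the study.

```python
MIN_SAMPLES = 50  # sample-size threshold stated in the text

def fatality_rate(deceased, recovered):
    """Fatality rate as defined in the text: deceased / recovered."""
    if recovered <= 0:
        raise ValueError("recovered count must be positive")
    return deceased / recovered

def average_stability_index(variant_counts, docking_scores):
    """Isolate-weighted mean docking score (the weighting is an assumption).

    Variants without a docking score are left out of the average.
    """
    scored = {v: n for v, n in variant_counts.items() if v in docking_scores}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no variant with a docking score")
    return sum(docking_scores[v] * n for v, n in scored.items()) / total

def eligible_states(state_variant_counts, min_samples=MIN_SAMPLES):
    """Keep only states with at least min_samples sequenced isolates."""
    return {
        state: counts
        for state, counts in state_variant_counts.items()
        if sum(counts.values()) >= min_samples
    }

# Hypothetical placeholder data, NOT values from the study.
toy_scores = {"ancestor": -100.0, "variant_x": -90.0}
toy_states = {
    "state_A": {"ancestor": 40, "variant_x": 20},
    "state_B": {"ancestor": 10},
}
kept = eligible_states(toy_states)
state_a_index = average_stability_index(kept["state_A"], toy_scores)
```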

We detected a strong exponential correlation (R² = 0.96) between the average stability index of the circulating spike variants of a region and the fatality rate in that region (Fig. 2). While Telengana and Delhi showed comparable average stability index values with ~7% fatality rates, Gujarat and Maharashtra had higher stability index values (i.e., more negative docking or HADDOCK scores) with 8% and 9% fatality rates respectively. The averaged docking (HADDOCK) scores for Gujarat and Maharashtra were significantly different from one another as well as from those for Delhi and Telengana (P < 0.0001), while no significant difference was found between the Delhi and Telengana values (P = 0.32). It is highly plausible that the spike protein, as the primary controller of both the attachment to the host cell surface and the initiation of infection by fusing the viral and host cell membranes, would be represented by variants that differ in how efficiently the virus enters human cells and is transmitted among people (Walls et al., 2020). However, our conclusions based on the available initial data are preliminary owing to the low sample size per location and the lack of direct evidence for the correlation between a spike protein variant's docking score and the pathogen's contribution to host fatality, thereby warranting robust population-level association analysis and experimental validation.
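An exponential relationship of this kind, y = a·exp(b·x), can be fitted by ordinary least squares on ln(y), as in the hedged sketch below. The source does not describe its fitting procedure, so the log-linear approach and the computation of R² on the original scale are assumptions. Because the per-state stability indices are not given in the text, the example uses synthetic points generated from a known curve rather than the study's data.

```python
import math

def fit_exponential(x, y):
    """Least-squares fit of y = a * exp(b * x) via linear regression on ln(y)."""
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ln = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ln) for xi, li in zip(x, ln_y))
    b = sxy / sxx
    a = math.exp(mean_ln - b * mean_x)
    return a, b

def r_squared(x, y, a, b):
    """Coefficient of determination of the fitted curve on the original scale."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic points from y = 2 * exp(-0.01 * x); NOT the study's data.
synthetic_x = [-100.0, -105.0, -110.0, -115.0]
synthetic_y = [2.0 * math.exp(-0.01 * xi) for xi in synthetic_x]
a_hat, b_hat = fit_exponential(synthetic_x, synthetic_y)
fit_quality = r_squared(synthetic_x, synthetic_y, a_hat, b_hat)
```

With only four regions in the analysis, any such fit has two free parameters and two residual degrees of freedom, which is one reason the conclusions above are described as preliminary.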

Fig. 2.


Heat map distribution across four Indian states with > 50 sequenced isolates based on average stability index. The average stability index for a particular state denotes the averaged value (± standard error value) of docking scores / HADDOCK scores of S-R complexes for all circulating variants. The values of average stability index and fatality rate in Indian states are plotted to fit an exponential function (R² = 0.96).

3.3. The emerging spike protein variants showing reduced stability of S-R complex are significantly abundant

Of the 630 isolates analyzed, Ancestors 1 and 2 were represented by 253 and 248 isolates respectively, suggesting their steady circulation across India. However, the remaining ones, i.e. more than 20% of the isolates, represented relatively recently emerged variants derived from the two ancestral variants. Interestingly, a quick look at Fig. 1 showed that, for the majority of the variants derived from Ancestor 1 and Ancestor 2, the stability was reduced (less negative stability index values) relative to their respective ancestors. We therefore plotted the trend of those emerging variants with reference to their ancestral backgrounds (Fig. 3). A significant majority (χ² P = 0.03) of the variants that emerged from the two ancestral variants showed reduced stability (less negative docking scores) relative to their respective ancestors. This picture became even more prominent (χ² P = 0.009) when we looked exclusively at the variants that were detected multiple times in the dataset, i.e. represented by more than one isolate (sometimes collected from different states), thereby suggesting possible fixation of those variants in the population (S2 Table). Interestingly, 13 of these variants with multiple occurrences accumulated mutations exclusively in the S1 subunit, while only 3 variants showed all mutations in the S2 subunit, which might suggest an increased selection pressure on the S1 subunit because of its key role in viral entry and the presence of the receptor binding domain.
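One way to test whether variants with reduced stability form a significant majority is a χ² goodness-of-fit test against an equal split, sketched below. The source does not describe the design of its χ² tests, so the null hypothesis (equal numbers of variants with reduced and increased stability) is an assumption, and the counts are hypothetical because the exact numbers of scored variants are not given in the text. For one degree of freedom, the upper-tail probability of the χ² distribution equals erfc(√(χ²/2)).

```python
import math

def chi_square_equal_split(n_reduced, n_increased):
    """Chi-square goodness-of-fit test against a 50:50 expectation (1 df).

    Returns the chi-square statistic and its upper-tail p-value.
    """
    total = n_reduced + n_increased
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2
    chi2 = ((n_reduced - expected) ** 2 + (n_increased - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value

# Hypothetical counts, NOT taken from the study.
chi2_toy, p_toy = chi_square_equal_split(n_reduced=30, n_increased=15)
```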

Fig. 3.


Stability index (i.e., docking score or HADDOCK score) plot of S-R complexes for spike protein variants emerging from (a) Ancestor 1 (Wuhan-Hu-1/2019 variant) and (b) Ancestor 2 (D614G variant). The blue dotted line is used as a reference to denote the stability index value for the respective ancestral variant. The more negative the value, the higher the stability level. The red dotted rectangular block includes the variants that are represented by multiple isolates in our dataset (SI appendix, S2 Table).

At this point, we can propose that the relatively recent variants emerging from the ancestors in India are, on average, losing their ability to form a stable complex with the human receptor as compared with their ancestors, which could possibly result in a lower countrywide fatality rate, if we combine this with our earlier observation of the direct correlation between average stability index and fatality rate. Conforming to this proposal, we found that, after an initial steady increase for the first four weeks, the fatality rate reached a peak of 38.2% on April 11, 2020, followed by a continuous sharp decline to 5.3% on June 28, 2020, the last date for which data were available (Fig. 4), fitting a power law function (R² = 0.94) (data source: http://covidindiaupdates.in/). This was despite the fact that the rate of mutations was found to be increasing during the three-month period from March till May 2020 when our analyzed samples were collected. Interestingly, the increase in nonsynonymous (i.e. amino acid replacement) mutation rate (from 0.027% in March to 0.033% in May 2020) was detected to be about 14% higher than the increase in synonymous mutation rate (from 0.091% in March to 0.097% in May 2020), suggesting a stronger selection pressure for amino acid changes in the spike protein.
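The figures quoted above can be checked with simple arithmetic. The source does not spell out how the "about 14%" comparison was computed; one reading that gives a value close to it is to compare the fold change of the nonsynonymous rate with that of the synonymous rate, since the absolute increases are identical (0.006 percentage points each). This interpretation is an assumption. The fold decline in fatality rate uses the peak and final values stated in the text.

```python
def fold_change(start, end):
    """Ratio of the final value to the initial value."""
    return end / start

# Mutation rates (in %) for March and May 2020, as quoted in the text.
dn_fold = fold_change(0.027, 0.033)   # nonsynonymous
ds_fold = fold_change(0.091, 0.097)   # synonymous

# Assumed reading: how much larger the dN fold change is than the dS fold change.
relative_excess = dn_fold / ds_fold - 1.0

# Fatality rate: 38.2% on April 11, 2020 versus 5.3% on June 28, 2020.
fatality_fold_decline = 38.2 / 5.3
```

Under this reading the relative excess lies between 14% and 15%, and the fatality rate declines slightly more than 7-fold, consistent with the ">7-fold" stated in the abstract.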

Fig. 4.


Countrywide fatality rates (the number of deaths / the number of recovered cases) over time in India. The date of first available data (March 13, 2020), the date after which the decline of fatality rate started (April 11, 2020), and the date until which the analyzed samples were collected (May 27, 2020) are denoted by blue dotted lines. The declining fatality rate curve is best fitted by a power law function (R² = 0.94).

3.4. Could Muller's ratchet be a player in shaping SARS-CoV-2 evolutionary dynamics in India?

In evolutionary genetics, Muller's ratchet signifies the accumulation of deleterious mutations in a population, giving rise to ‘mutational meltdown’ and thereby leading to a gradual extinction of that population (Muller, 1964). In general, genetic mutations that provide adaptive advantages are fixed in the population by natural selection whereas deleterious ones are purged from the population. However, an accelerated mutation rate imposes a huge mutational pressure, and natural selection is unable to purge these deleterious mutations, so the newly formed variants are retained within the population and may become fixed. This irreversible evolutionary mechanism was termed Muller's ratchet by evolutionary biologists (Felsenstein, 1974). When more and more deleterious mutations accumulate and become permanent in the population, the result is ‘mutational meltdown’, or the ultimate loss of the population (Jensen and Lynch, 2020; Lynch et al., 1993).
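The ratchet mechanism described above can be illustrated with a toy Wright-Fisher simulation of an asexual population. This is not an analysis performed in the study, and every parameter value below is hypothetical. Each individual carries a count of deleterious mutations, fitness falls multiplicatively with that count, offspring inherit the parental load plus a Poisson-distributed number of new mutations, and there is no back mutation or recombination. The load of the least-loaded class can therefore never decrease, and each increase is one "click" of the ratchet.

```python
import math
import random

def poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_ratchet(pop_size=200, mutation_rate=0.5, selection=0.02,
                     generations=200, seed=1):
    """Track the mutation load of the least-loaded class over time.

    All parameter defaults are hypothetical illustration values.
    """
    rng = random.Random(seed)
    loads = [0] * pop_size  # deleterious mutations carried by each individual
    least_loaded = []
    for _ in range(generations):
        # Fitness declines multiplicatively with the number of mutations.
        weights = [(1.0 - selection) ** k for k in loads]
        # Parents are sampled in proportion to fitness (asexual reproduction).
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Offspring inherit the parental load plus new mutations.
        loads = [k + poisson(rng, mutation_rate) for k in parents]
        least_loaded.append(min(loads))
    return least_loaded

trajectory = simulate_ratchet()
```

How quickly the least-loaded class is lost in such a model depends on the population size, the mutation rate and the strength of selection, which is why a high mutation rate combined with a limited population can favor the ratchet.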

It is known that the apparent tendency to directly correlate the high mutation rate of a virus with its infectivity and transmissibility is without merit (Grubaugh et al., 2020). On the other hand, it has been suggested that Muller's ratchet, via mutational meltdown, could be a key player in leading the SARS-CoV-2 population to gradual extinction owing to the future accumulation and fixation of deleterious mutations (Brufsky and Lotze, 2020b; Jensen and Lynch, 2020). We observed that the stability of the S-R complex is directly linked to the fatality rates, while the variants continuously emerging from the ancestral ones were found to be less stable than their ancestors. When we combine these findings with the countrywide data showing a sharp decline in the fatality rate over time, the decline could be attributed to multiple factors such as early detection of the disease and the use of steroids or other effective therapeutics (remdesivir, favipiravir, etc.). The disease severity could have been diminished via timely implementation of drugs attenuating the cytokine storm. However, the samples we analyzed here were collected before Unlock 1.0 started on June 8, 2020. At that time, both the diagnostic and the therapeutic management of patients with COVID-19 were at a stage too early to be considerably associated with the emergence of deleterious mutants and the decline of fatality rates. Therefore, as an alternative, we anticipate the possibility of mutational meltdown in action for SARS-CoV-2 in India, indicating Muller's ratchet as a plausible game-changer for the COVID-19 scenario here in the near future.

We understand that, only in three months, we cannot expect a radical increase in the mutation rate to give rise to any significant accumulation of deleterious mutations that could have offered a prominent picture of mutational meltdown. However, our results altogether point toward the trend, thereby suggesting the potential of future studies in this otherwise overlooked domain of microbial dynamics, which could in turn lead to a possibility of a successful therapeutic approach (Brufsky and Lotze, 2020b; Jensen and Lynch, 2020).

4. Discussion

Our work has combined genetic and epidemiological data of SARS-CoV-2 in India to decipher a direct correlation between the average stability of S-R complexes for the circulating spike protein variants and the fatality rate of a geographic region. The docking score of the S-R complex, designated here as the stability index, is estimated by protein-protein docking based on intermolecular interactions, such as electrostatic and van der Waals interactions, desolvation and restraint violation energies, along with the buried surface area upon binding, via detection of the correct binding pose. This score, in essence, quantifies the stability of the docked complex by optimizing the global minimum energy conformation of the complex (Halperin et al., 2002; Moreira et al., 2010). On the other hand, while it is probable that better stability of the S-R complex could lead to an increased viral load and perhaps increased infectivity, previous works showed that the spike protein – hACE2 complex is crucial for viral pathogenesis by causing acute lung damage (Kuba et al., 2005; Lumbers et al., 2020), which suggests a direct link between S-R complex stability and fatality rate. However, the robustness of the S-R complex stability index for the spike protein variants as a tracker of fatality rate or disease severity needs to be studied in greater depth with more structured region-specific patient data (that are primarily available to selected government/non-government agencies) in connection with larger population-level sequence datasets for the given locations. Also, it is worth mentioning that variations in the hACE2 protein among human host individuals, similar to the spike protein variations, or in the expression levels of the hACE2-encoding gene can have both direct and indirect impacts on the stability of the S-R complex. While this lies beyond the scope of our present work, it is worth investigating in the near future to obtain a more detailed view of the impact of polymorphisms on the S-R complex contributed by both the host and the pathogen.

As expected for any fast-moving pathogen outbreaks (Grubaugh et al., 2020), we find an increasing rate of mutations, while the extent of increase is much higher for the nonsynonymous (amino acid replacement) changes. Interestingly, while more than half of the emerged spike protein variants have been found to be exclusive to India, the S-R complex of a significant majority of variants tend to lose stability relative to their two most stable ancestral variants. Alongside, the countrywide data show a continuous sharp decline in the fatality rate after an initial surge, thereby suggesting that despite the burst of mutations in the spike proteins of Indian isolates, the new variants on an average have not severely affected the COVID situation in India. Combining all this information, here we pose an open question for future research: Does the phylodynamics of SARS-CoV-2 in India indicate any nascent action of Muller's ratchet where the otherwise deleterious mutations tend to get fixed in the population as natural selection remains unable to purge them due to excessive mutational pressures? If so, there lies an immense potential of using therapeutics that could facilitate such a process of mutational meltdown, as was demonstrated earlier for influenza A virus (Ormond et al., 2017; Penisson et al., 2017). Altogether, future large-scale population genomics analyses supported by reliable epidemiological metadata are highly important to explore this question for India as well as for other countries across the globe to develop efficient analytical methods, thereby guiding better surveillance programs, prevention and treatment management of COVID-19.

Besides, there are many different factors that can influence the fatality rate in one way or another, such as various co-morbidities and other host factors, absolute growth of viral population and its carrying capacity, rate of dissemination, natural selection pressures, apart from the effect and the rate of deleterious mutations. By the same token, with an aim to strategize the development of vaccines or other therapeutics, experimental studies are warranted to look at the fates of the spike protein mutations. For example, the impact of spike protein mutations in antibody neutralization needs to be analyzed. Indeed, an important observation (Rees-Spear et al., 2021) showed altered sensitivity of antibody neutralization specific to different spike protein variants, although the serum neutralization was hardly affected. However, another study demonstrated the absence of any selective role of spike protein variations in influencing antibody response, possibly because the virus moves to a new host much faster than the time-span needed for a neutralizing antibody response to be developed (Jackson et al., 2021). Similarly, as depicted by earlier works (Shang et al., 2020a; Shang et al., 2020b; Tai et al., 2020), the binding assay studies are of great importance to decipher the impact of spike protein mutations in the isolates circulating in India. Also, apart from the spike protein, there are other viral proteins such as the nucleocapsid (N) protein to confer antigenicity and to harbour potential epitope regions important for diagnostic and therapeutic approaches (Can et al., 2020; Dutta et al., 2020; Oliveira et al., 2020; Zeng et al., 2020). From this viewpoint, a critical angle of future studies would also be to understand the co-evolution of the spike and nucleocapsid protein variants in SARS-CoV-2 infectivity and pathogenicity.

Ethics declarations

No human samples or clinical data were used.

Authorship

All authors have read the manuscript and agreed on the submitted version. All authors participated in the writing of the article.

Author contributions

D.G. and S.C. conceptualized the study, provided the resources and supervised the work; R.B., K.B., A.G., V.R., D.G. and S.C. performed the formal analysis; R.B., K.S., D.G. and S.C. performed the writing.

Declaration of Competing Interest

None.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.meegid.2021.104874.

Appendix A. Supplementary data

Supplementary material

mmc1.pdf (1,011.5KB, pdf)

References

  1. Andersen K.G., Rambaut A., Lipkin W.I., Holmes E.C., Garry R.F. The proximal origin of SARS-CoV-2. Nat. Med. 2020;26:450–452. doi: 10.1038/s41591-020-0820-9.
  2. Belouzard S., Chu V.C., Whittaker G.R. Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites. Proc. Natl. Acad. Sci. U. S. A. 2009;106:5871–5876. doi: 10.1073/pnas.0809524106.
  3. Brufsky A., Lotze M.T. DC/L-SIGNs of hope in the COVID-19 pandemic. J. Med. Virol. 2020;92:1396–1398. doi: 10.1002/jmv.25980.
  4. Brufsky A., Lotze M.T. Ratcheting down the virulence of SARS-CoV-2 in the COVID-19 pandemic. J. Med. Virol. 2020;92:2379–2380. doi: 10.1002/jmv.26067.
  5. Can H., Koseoglu A.E., Erkunt Alak S., Guvendi M., Doskaya M., Karakavuk M., Guruz A.Y., Un C. In silico discovery of antigenic proteins and epitopes of SARS-CoV-2 for the development of a vaccine or a diagnostic approach for COVID-19. Sci. Rep. 2020;10:22387. doi: 10.1038/s41598-020-79645-9.
  6. Chattopadhyay S., Paul S., Dykhuizen D.E., Sokurenko E.V. Tracking recent adaptive evolution in microbial species using TimeZone. Nat. Protoc. 2013;8:652–665. doi: 10.1038/nprot.2013.031.
  7. Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. The species severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nat. Microbiol. 2020;5:536–544. doi: 10.1038/s41564-020-0695-z.
  8. Dutta N.K., Mazumdar K., Gordy J.T. The Nucleocapsid protein of SARS-CoV-2: a target for vaccine development. J. Virol. 2020;94 doi: 10.1128/JVI.00647-20.
  9. Felsenstein J. The evolutionary advantage of recombination. Genetics. 1974;78:737–756. doi: 10.1093/genetics/78.2.737.
  10. Gordon D.E., Jang G.M., Bouhaddou M., Xu J., Obernier K., White K.M., O'Meara M.J., Rezelj V.V., Guo J.Z., Swaney D.L., Tummino T.A., Huttenhain R., Kaake R.M., Richards A.L., Tutuncuoglu B., Foussard H., Batra J., Haas K., Modak M., Kim M., Haas P., Polacco B.J., Braberg H., Fabius J.M., Eckhardt M., Soucheray M., Bennett M.J., Cakir M., McGregor M.J., Li Q., Meyer B., Roesch F., Vallet T., Mac Kain A., Miorin L., Moreno E., Naing Z.Z.C., Zhou Y., Peng S., Shi Y., Zhang Z., Shen W., Kirby I.T., Melnyk J.E., Chorba J.S., Lou K., Dai S.A., Barrio-Hernandez I., Memon D., Hernandez-Armenta C., Lyu J., Mathy C.J.P., Perica T., Pilla K.B., Ganesan S.J., Saltzberg D.J., Rakesh R., Liu X., Rosenthal S.B., Calviello L., Venkataramanan S., Liboy-Lugo J., Lin Y., Huang X.P., Liu Y., Wankowicz S.A., Bohn M., Safari M., Ugur F.S., Koh C., Savar N.S., Tran Q.D., Shengjuler D., Fletcher S.J., O'Neal M.C., Cai Y., Chang J.C.J., Broadhurst D.J., Klippsten S., Sharp P.P., Wenzell N.A., Kuzuoglu-Ozturk D., Wang H.Y., Trenker R., Young J.M., Cavero D.A., Hiatt J., Roth T.L., Rathore U., Subramanian A., Noack J., Hubert M., Stroud R.M., Frankel A.D., Rosenberg O.S., Verba K.A., Agard D.A., Ott M., Emerman M., Jura N., von Zastrow M., Verdin E., Ashworth A., Schwartz O., d'Enfert C., Mukherjee S., Jacobson M., Malik H.S., Fujimori D.G., Ideker T., Craik C.S., Floor S.N., Fraser J.S., Gross J.D., Sali A., Roth B.L., Ruggero D., Taunton J., Kortemme T., Beltrao P., Vignuzzi M., Garcia-Sastre A., Shokat K.M., Shoichet B.K., Krogan N.J. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020;583:459–468. doi: 10.1038/s41586-020-2286-9.
  11. Grubaugh N.D., Petrone M.E., Holmes E.C. We shouldn’t worry when a virus mutates during disease outbreaks. Nat. Microbiol. 2020;5:529–530. doi: 10.1038/s41564-020-0690-4.
  12. Gui M., Song W., Zhou H., Xu J., Chen S., Xiang Y., Wang X. Cryo-electron microscopy structures of the SARS-CoV spike glycoprotein reveal a prerequisite conformational state for receptor binding. Cell Res. 2017;27:119–129. doi: 10.1038/cr.2016.152.
  13. Halperin I., Ma B., Wolfson H., Nussinov R. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins. 2002;47:409–443. doi: 10.1002/prot.10115.
  14. Heald-Sargent T., Gallagher T. Ready, set, fuse! The coronavirus spike protein and acquisition of fusion competence. Viruses. 2012;4:557–580. doi: 10.3390/v4040557.
  15. Higgins D.G., Sharp P.M. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene. 1988;73:237–244. doi: 10.1016/0378-1119(88)90330-7.
  16. Holmes E.C., Nee S., Rambaut A., Garnett G.P., Harvey P.H. Revealing the history of infectious disease epidemics through phylogenetic trees. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 1995;349:33–40. doi: 10.1098/rstb.1995.0088.
  17. Huang Y., Niu B., Gao Y., Fu L., Li W. CD-HIT suite: a web server for clustering and comparing biological sequences. Bioinformatics. 2010;26:680–682. doi: 10.1093/bioinformatics/btq003.
  18. Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:33–38. doi: 10.1016/0263-7855(96)00018-5.
  19. Jackson C.B., Zhang L., Farzan M., Choe H. Functional importance of the D614G mutation in the SARS-CoV-2 spike protein. Biochem. Biophys. Res. Commun. 2021;538:108–115. doi: 10.1016/j.bbrc.2020.11.026.
  20. Jensen J.D., Lynch M. Considering mutational meltdown as a potential SARS-CoV-2 treatment strategy. Heredity (Edinb) 2020;124:619–620. doi: 10.1038/s41437-020-0314-z.
  21. Jia Y., Shen G., Zhang Y., Huang K.-S., Ho H.-Y., Hor W.-S., Yang C.-H., Li C., Wang W.-L. Analysis of the mutation dynamics of SARS-CoV-2 reveals the spread history and emergence of RBD mutant with lower ACE2 binding affinity. bioRxiv. 2020 doi: 10.1101/2020.04.09.034942.
  22. Kirchdoerfer R.N., Cottrell C.A., Wang N., Pallesen J., Yassine H.M., Turner H.L., Corbett K.S., Graham B.S., McLellan J.S., Ward A.B. Pre-fusion structure of a human coronavirus spike protein. Nature. 2016;531:118–121. doi: 10.1038/nature17200.
  23. Korber B., Fischer W., Gnanakaran S., Yoon H., Theiler J., Abfalterer W., Hengartner N., Giorgi E., Bhattacharya T., Foley B., Hastie K., Parker M., Partridge D., Evans C., Freeman T., de Silva T., McDanal C., Perez L., Tang H., Moon-Walker A., Whelan S., LaBranche C., Saphire E., Montefiori D. Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus. Cell. 2020;182:812–827. doi: 10.1016/j.cell.2020.06.043.
  24. Kuba K., Imai Y., Rao S., Gao H., Guo F., Guan B., Huan Y., Yang P., Zhang Y., Deng W., Bao L., Zhang B., Liu G., Wang Z., Chappell M., Liu Y., Zheng D., Leibbrandt A., Wada T., Slutsky A.S., Liu D., Qin C., Jiang C., Penninger J.M. A crucial role of angiotensin converting enzyme 2 (ACE2) in SARS coronavirus-induced lung injury. Nat. Med. 2005;11:875–879. doi: 10.1038/nm1267.
  25. Kumar S., Stecher G., Li M., Knyaz C., Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 2018;35:1547–1549. doi: 10.1093/molbev/msy096.
  26. Lan J., Ge J., Yu J., Shan S., Zhou H., Fan S., Zhang Q., Shi X., Wang Q., Zhang L., Wang X. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020;581:215–220. doi: 10.1038/s41586-020-2180-5.
  27. Li W., Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–1659. doi: 10.1093/bioinformatics/btl158.
  28. Lumbers E.R., Delforce S.J., Pringle K.G., Smith G.R. The lung, the heart, the novel coronavirus, and the renin-angiotensin system; the need for clinical trials. Front Med (Lausanne) 2020;7:248. doi: 10.3389/fmed.2020.00248.
  29. Lynch M., Burger R., Butcher D., Gabriel W. The mutational meltdown in asexual populations. J Hered. 1993;84:339–344. doi: 10.1093/oxfordjournals.jhered.a111354.
  30. Millet J.K., Whittaker G.R. Host cell entry of Middle East respiratory syndrome coronavirus after two-step, furin-mediated activation of the spike protein. Proc. Natl. Acad. Sci. U. S. A. 2014;111:15214–15219. doi: 10.1073/pnas.1407087111.
  31. Millet J.K., Whittaker G.R. Host cell proteases: critical determinants of coronavirus tropism and pathogenesis. Virus Res. 2015;202:120–134. doi: 10.1016/j.virusres.2014.11.021.
  32. Moreira I.S., Fernandes P.A., Ramos M.J. Protein-protein docking dealing with the unknown. J. Comput. Chem. 2010;31:317–342. doi: 10.1002/jcc.21276.
  33. Muller H.J. The relation of recombination to mutational advance. Mutat. Res. 1964;106:2–9. doi: 10.1016/0027-5107(64)90047-8.
  34. Oliveira S.C., de Magalhaes M.T.Q., Homan E.J. Immunoinformatic analysis of SARS-CoV-2 Nucleocapsid protein and identification of COVID-19 vaccine targets. Front. Immunol. 2020;11:587615. doi: 10.3389/fimmu.2020.587615.
  35. Ormond L., Liu P., Matuszewski S., Renzette N., Bank C., Zeldovich K., Bolon D.N., Kowalik T.F., Finberg R.W., Jensen J.D., Wang J.P. The combined effect of Oseltamivir and Favipiravir on influenza A virus evolution. Genome Biol. Evol. 2017;9:1913–1924. doi: 10.1093/gbe/evx138.
  36. Pallesen J., Wang N., Corbett K.S., Wrapp D., Kirchdoerfer R.N., Turner H.L., Cottrell C.A., Becker M.M., Wang L., Shi W., Kong W.P., Andres E.L., Kettenbach A.N., Denison M.R., Chappell J.D., Graham B.S., Ward A.B., McLellan J.S. Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen. Proc. Natl. Acad. Sci. U. S. A. 2017;114:E7348–E7357. doi: 10.1073/pnas.1707304114.
  37. Pantsar T., Poso A. Binding affinity via docking: fact and fiction. Molecules. 2018;23:1899–1910. doi: 10.3390/molecules23081899.
  38. Park J.E., Li K., Barlan A., Fehr A.R., Perlman S., McCray P.B., Jr., Gallagher T. Proteolytic processing of Middle East respiratory syndrome coronavirus spikes expands virus tropism. Proc. Natl. Acad. Sci. U. S. A. 2016;113:12262–12267. doi: 10.1073/pnas.1608147113.
  39. Penisson S., Singh T., Sniegowski P., Gerrish P. Dynamics and fate of beneficial mutations under lineage contamination by linked deleterious mutations. Genetics. 2017;205:1305–1318. doi: 10.1534/genetics.116.194597.
  40. Pybus O.G., Rambaut A. Evolutionary analysis of the dynamics of viral infectious disease. Nat. Rev. Genet. 2009;10:540–550. doi: 10.1038/nrg2583.
  41. Rees-Spear C., Muir L., Griffith S.A., Heaney J., Aldon Y., Snitselaar J.L., Thomas P., Graham C., Seow J., Lee N., Rosa A., Roustan C., Houlihan C.F., Sanders R.W., Gupta R.K., Cherepanov P., Stauss H.J., Nastouli E., Investigators S., Doores K.J., van Gils M.J., McCoy L.E. The effect of spike mutations on SARS-CoV-2 neutralization. Cell Rep. 2021;34:108890. doi: 10.1016/j.celrep.2021.108890.
  42. Rice P., Longden I., Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16:276–277. doi: 10.1016/s0168-9525(00)02024-2.
  43. Shang J., Wan Y., Luo C., Ye G., Geng Q., Auerbach A., Li F. Cell entry mechanisms of SARS-CoV-2. Proc. Natl. Acad. Sci. U. S. A. 2020;117:11727–11734. doi: 10.1073/pnas.2003138117.
  44. Shang J., Ye G., Shi K., Wan Y., Luo C., Aihara H., Geng Q., Auerbach A., Li F. Structural basis of receptor recognition by SARS-CoV-2. Nature. 2020;581:221–224. doi: 10.1038/s41586-020-2179-y.
  45. Simpson E.H. Measurement of diversity. Nature. 1949;163:688.
  46. Song W., Gui M., Wang X., Xiang Y. Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2. PLoS Pathog. 2018;14 doi: 10.1371/journal.ppat.1007236.
  47. Tai W., He L., Zhang X., Pu J., Voronin D., Jiang S., Zhou Y., Du L. Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine. Cell. Mol. Immunol. 2020;17:613–620. doi: 10.1038/s41423-020-0400-4.
  48. Tortorici M.A., Veesler D. Structural insights into coronavirus entry. Adv. Virus Res. 2019;105:93–116. doi: 10.1016/bs.aivir.2019.08.002.
  49. van Dijk M., Wassenaar T.A., Bonvin A.M. A flexible, grid-enabled web portal for GROMACS molecular dynamics simulations. J. Chem. Theory Comput. 2012;8:3463–3472. doi: 10.1021/ct300102d.
  50. van Zundert G.C.P., Rodrigues J., Trellet M., Schmitz C., Kastritis P.L., Karaca E., Melquiond A.S.J., van Dijk M., de Vries S.J., Bonvin A. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J. Mol. Biol. 2016;428:720–725. doi: 10.1016/j.jmb.2015.09.014.
  51. Walls A.C., Tortorici M.A., Bosch B.J., Frenz B., Rottier P.J.M., DiMaio F., Rey F.A., Veesler D. Cryo-electron microscopy structure of a coronavirus spike glycoprotein trimer. Nature. 2016;531:114–117. doi: 10.1038/nature16988.
  52. Walls A.C., Tortorici M.A., Snijder J., Xiong X., Bosch B.J., Rey F.A., Veesler D. Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion. Proc. Natl. Acad. Sci. U. S. A. 2017;114:11157–11162. doi: 10.1073/pnas.1708727114.
  53. Walls A.C., Park Y.J., Tortorici M.A., Wall A., McGuire A.T., Veesler D. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell. 2020;181:281–292. doi: 10.1016/j.cell.2020.02.058.
  54. Wang Q., Zhang Y., Wu L., Niu S., Song C., Zhang Z., Lu G., Qiao C., Hu Y., Yuen K.Y., Wang Q., Zhou H., Yan J., Qi J. Structural and functional basis of SARS-CoV-2 entry by using human ACE2. Cell. 2020;181:894–904. doi: 10.1016/j.cell.2020.03.045.
  55. Waterhouse A., Bertoni M., Bienert S., Studer G., Tauriello G., Gumienny R., Heer F.T., de Beer T.A.P., Rempfer C., Bordoli L., Lepore R., Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46:W296–W303. doi: 10.1093/nar/gky427.
  56. Wrapp D., Wang N., Corbett K.S., Goldsmith J.A., Hsieh C.L., Abiona O., Graham B.S., McLellan J.S. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science. 2020;367:1260–1263. doi: 10.1126/science.abb2507.
  57. Yao H., Lu X., Chen Q., Xu K., Chen Y., Cheng M., Chen K., Cheng L., Weng T., Shi D., Liu F., Wu Z., Xie M., Wu H., Jin C., Zheng M., Wu N., Jiang C., Li L. Patient-derived SARS-CoV-2 mutations impact viral replication dynamics and infectivity in vitro and with clinical implications in vivo. Cell Discovery. 2020;6:76. doi: 10.1038/s41421-020-00226-1.
  58. Zeng W., Liu G., Ma H., Zhao D., Yang Y., Liu M., Mohammed A., Zhao C., Yang Y., Xie J., Ding C., Ma X., Weng J., Gao Y., He H., Jin T. Biochemical characterization of SARS-CoV-2 nucleocapsid protein. Biochem. Biophys. Res. Commun. 2020;527:618–623. doi: 10.1016/j.bbrc.2020.04.136.

Articles from Infection, Genetics and Evolution are provided here courtesy of Elsevier
