. 2021 Feb 13:1–11. doi: 10.1080/07391102.2021.1886175

Structural genetics of circulating variants affecting the SARS-CoV-2 spike/human ACE2 complex

Francesco Ortuso a,b, Daniele Mercatelli c, Pietro Hiram Guzzi c, Federico Manuel Giorgi d,
PMCID: PMC7885719  PMID: 33583326

Abstract

SARS-CoV-2 entry in human cells is mediated by the interaction between the viral Spike protein and the human ACE2 receptor. This mechanism evolved from the ancestor bat coronavirus and is currently one of the main targets for antiviral strategies. However, there currently exist several Spike protein variants in the SARS-CoV-2 population as the result of mutations, and it is unclear if these variants may exert a specific effect on the affinity with ACE2 which, in turn, is also characterized by multiple alleles in the human population. In the current study, the GBPM analysis, originally developed for highlighting host-guest interaction features, has been applied to define the key amino acids responsible for the Spike/ACE2 molecular recognition, using four different crystallographic structures. Then, we intersected these structural results with the current mutational status, based on more than 295,000 sequenced cases, in the SARS-CoV-2 population. We identified several Spike mutations interacting with ACE2 and mutated in at least 20 distinct patients: S477N, N439K, N501Y, Y453F, E484K, K417N, S477I and G476S. Among these, mutation N501Y in particular is one of the events characterizing SARS-CoV-2 lineage B.1.1.7, which has recently risen in frequency in Europe. We also identified five ACE2 rare variants that may affect interaction with Spike and susceptibility to infection: S19P, E37K, M82I, E329G and G352V.

Communicated by Ramaswamy H. Sarma

Keywords: SARS-CoV-2, COVID-19, mutations, spike, ACE2

Introduction

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) emerged in late 2019 (Zhu et al., 2020) as the etiological cause of a pandemic of severe proportions dubbed Coronavirus Disease 19 (COVID-19). The disease has reached virtually every country in the globe (Hilton & Keeling, 2020), with more than 40,000,000 confirmed cases and more than 1,100,000 deaths (source: World Health Organization). SARS-CoV-2 is characterized by a 29,903-nucleotide-long single-stranded RNA genome, densely packed in 11 Open Reading Frames (ORFs); ORF1 encodes a polyprotein which is further split into 16 proteins, for a total of 26 proteins (Mercatelli & Giorgi, 2020).

The second ORF encodes for the Spike (S) protein, which is the key protagonist in the viral entry into host cells, through its interaction with human epithelial cell receptors Angiotensin Converting Enzyme 2 (ACE2) (Tai et al., 2020), Transmembrane Serine Protease 2 (TMPRSS2) (Hoffmann et al., 2020), Furin (Xia et al., 2020) and CD147 (Ulrich & Pillat, 2020). Investigators have focused their attention on the Spike/ACE2 interaction, trying to disrupt it as a potential anti-COVID-19 therapy, using small drugs (Hanson, 2020) or Spike fragments (Peter & Schug, 2020). Using X-ray crystallography, some models of the Spike/ACE2 have been generated (Lan et al., 2020; Shang et al., 2020; Wang, Zhang et al., 2020), providing a structural instrument for the analysis of this key interaction. These models determined that the Receptor Binding Domain (RBD) of Spike, directly interacting with ACE2, is a compact structure of ∼200 amino acids (AAs) over a total of 1273 AAs of the full-length Spike.

The SARS-CoV-2 Spike protein adapted through successive mutations from a wild bat beta-coronavirus (Ou, 2020), in order to exploit the N-terminal ACE2 peptidase domain conformation. As a result, SARS-CoV-2 Spike can establish a strong interaction with the human cell surface, allowing the virus to fuse its membrane with that of the host cell, releasing its proteins and genetic material and starting its replication cycle (Hoffmann et al., 2020). While SARS-CoV-2 shows low mutability (Ceraolo & Giorgi, 2020), with less than 25 predicted events/year (Hadfield et al., 2018), the virus is in continuous evolution from the original Wuhan reference sequence (NC_045512.2) (Tang et al., 2020), and there are currently at least six major variants circulating in the population (Mercatelli, Triboli et al., 2020; Mercatelli & Giorgi, 2020). Some of these strains are characterized by a mutation in Spike, at AA 614, where an Aspartic Acid (D) is substituted by a Glycine (G) (Sashittal et al., 2020). In fact, the Spike D614G mutation gives the name to the most frequent viral clade (G), which was first detected in Europe at the end of January 2020, and is currently present in all continents, with increasing frequency over time (Mercatelli & Giorgi, 2020). D614G does not fall within the putative RBD (AA ∼330–530), but some studies suggest it may have a clinically relevant role: D614G is positively correlated with increased case fatality rate (Becerra‐Flores & Cardozo, 2020), and it shows increased transmissibility and infectivity compared to the reference genome (Korber, 2020). In vitro studies show that viruses carrying the D614G Spike mutation have an increased viral load and cytopathic effect in cultured Vero cells (Tang et al., 2020). Despite these preliminary observations, there are still several doubts on the molecular effects of the D614G variant (Grubaugh et al., 2020).
Other recurring Spike mutations have been observed in the population worldwide, however at frequencies of 1% or below (Mercatelli & Giorgi, 2020); some of these mutations fall within the RBD and therefore may have a direct role in ACE2 interaction.

On the other hand, genetic variants of ACE2 in the human population may influence susceptibility or resistance to SARS-CoV-2 infection, possibly contributing to the difference in clinical features observed in COVID-19 patients (Benetti, 2020). The ACE2 gene is located on chromosome Xp22.2 and consists of 18 exons, coding for an 805-AA-long protein exposed on the cell surface of a variety of human organs, including kidneys, heart, brain, gastrointestinal tract, and lungs (Burrell et al., 2013). It is unclear if tissue-expression patterns of ACE2 may be linked to the severity of symptoms or outcomes of SARS-CoV-2 infections; however, ACE2 levels in lungs were found to be increased in patients with comorbidities associated with severe COVID-19 clinical manifestations (Pinto, 2020), whereas polymorphisms of ACE2 have been already described to play a role in hypertension and cardiovascular diseases (Bosso et al., 2020), particularly in association with type 2 diabetes (Burrell et al., 2013), all conditions predisposing to an increased risk of dying from COVID-19 (Zheng, 2020). Despite early studies, the presence of Spike mutations potentially altering the binding with ACE2 is still largely under-investigated, as is the role of ACE2 variants in the human population in determining patient-specific molecular interactions between these two proteins.

In the present study, we aim at detecting which Spike and ACE2 AAs are the most important in determining the SARS-CoV-2 entry interaction and analyze which ones have already mutated in the population. The task is clinically relevant, providing a functional characterization of present and future mutations targeting the ACE2/Spike binding and detected by sequencing SARS-CoV-2 on a patient-specific basis. Characterizing the variability of both proteins must be taken in consideration in the process of developing anti-COVID-19 strategies, such as the Spike-based vaccine currently deployed by the National Institute of Allergy and Infectious Diseases and Moderna (Jackson, 2020).

Results

We set out to analyze the key AAs involved in the Spike/ACE2 interaction, in order to highlight which ones may alter the binding affinity and therefore etiological and clinical properties of different SARS-CoV-2 variants on different patients. Following that, we determined which Spike and ACE2 AA variations relevant for this interaction have been observed in the SARS-CoV-2 and human population, respectively.

Structural analysis of spike/ACE2 interaction

We obtained structural models of the SARS-CoV-2 Spike interacting with the human ACE2 from three recent X-ray structures, deposited on the Protein Data Bank: 6LZG (Wang, Zhang et al., 2020), 6M0J (Lan et al., 2020) and 6VW1 (Shang et al., 2020). For 6VW1, two Spike/ACE2 complexes were available, so we report results for both separately, as 6VW1-A and 6VW1-B. All models show the core domains of interaction, located in the region of AA 330–530 for Spike and in the region AA 15–615 of ACE2. The full-length proteins are 1273 AAs (Spike only known isoform, from reference SARS-CoV-2 genome NC_045512.2) and 805 AAs (ACE2 isoform 1, UniProt id Q9BYF1-1).

The selected PDB entries are wild type, and their primary sequences and higher-order structures were identical. Residues 517–519 were missing in 6VW1-B. To investigate conformational variability, the PDB complexes were aligned by backbone and the Root Mean Square deviation (RMSd) was computed on all equivalent non-hydrogen atoms. The RMSd data showed some conformational flexibility, which supported our choice to take all PDB structures into account in the subsequent analysis (Figure 1).
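As a minimal illustration of the RMSd computation described above, the sketch below evaluates the deviation between two coordinate sets that are assumed to be already superposed. The function name `rmsd` and the toy coordinates are assumptions made for illustration only; the backbone alignment step and the software actually used in the study are not reproduced here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation, in the unit of the input (e.g. angstrom).

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples for
    equivalent non-hydrogen atoms, assumed to be already superposed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    # Sum of squared distances between each pair of equivalent atoms.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical three-atom example (not taken from any PDB entry).
model_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
model_2 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 3.0)]
print(f"RMSd = {rmsd(model_1, model_2):.3f} angstrom")
```

In practice the two coordinate lists would hold the matched heavy atoms of two complexes after backbone superposition, so that one value is obtained per pair of models.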

Figure 1.

Conformational comparison of Spike-ACE2 PDB complexes: (A) alignment of PDB entries, Spike and ACE2 are respectively surrounded by cyan and orange fog, and (B) bar graph showing RMSd (in Å) computed on protein atoms.

The GBPM method was originally developed for identifying and scoring pharmacophore and protein–protein interaction key features by combining GRID molecular interaction fields (MIFs) according to the GRAB tool algorithm (Ortuso et al., 2006). In the present study, GBPM has been applied to all selected complex models, considering Spike and ACE2 either as host or guest. The DRY, N1 and O GRID probes were used to describe hydrophobic, hydrogen bond donor and hydrogen bond acceptor interactions, respectively. For each probe, the cutoff required for highlighting the most relevant MIF points was fixed at 30% above the corresponding global minimum interaction energy value. Unlike the usual GBPM application, where pharmacophore features are used for virtual screening purposes, here these data guided us in identifying the AAs that stabilize the complex. Spike or ACE2 residues within 3 Å of GBPM points were marked as relevant in the host–guest recognition and were qualitatively scored by assigning them the corresponding GBPM energy. If a residue was suggested by more than one GBPM point, its score was computed as the sum of the energies of the related GBPM points (Figure 2).
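The residue-scoring step described above can be sketched as follows. This is an illustrative reading of the text, not the GBPM implementation: the function name `score_residues`, the input layout and the toy coordinates are hypothetical, and the sketch assumes that a residue counts as "within 3 Å" when any of its atoms lies within that distance of a GBPM point.

```python
import math
from collections import defaultdict

def score_residues(points, atoms, max_dist=3.0):
    """Sum GBPM point energies over the residues lying close to each point.

    points: list of (x, y, z, energy) tuples, one per GBPM point.
    atoms:  list of (residue_id, x, y, z) tuples, one per atom.
    A point contributes its energy once to every residue that has at least
    one atom within `max_dist` of it (assumption, see lead-in).
    """
    scores = defaultdict(float)
    for px, py, pz, energy in points:
        near = {
            residue
            for residue, ax, ay, az in atoms
            if math.dist((px, py, pz), (ax, ay, az)) <= max_dist
        }
        for residue in near:
            scores[residue] += energy
    return dict(scores)

# Hypothetical toy input: coordinates and energies are invented for illustration.
toy_points = [(0.0, 0.0, 0.0, -5.0), (10.0, 0.0, 0.0, -2.5), (0.0, 1.0, 0.0, -1.5)]
toy_atoms = [
    ("ASN 501", 1.0, 0.0, 0.0),
    ("ASN 501", 0.0, 2.0, 0.0),
    ("TYR 41", 11.0, 0.0, 0.0),
    ("GLY 1", 50.0, 50.0, 50.0),
]
print(score_residues(toy_points, toy_atoms))
```

Residues that are never within the distance cutoff of any point are simply absent from the result, which corresponds to a score of 0.00 in the tables.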

Figure 2.

Summary of the pipeline adopted by GBPM to identify key residues contributing to the SARS-CoV-2 Spike/Human ACE2 interface. Spike is depicted in cyan, and ACE2 in orange, based on the 6LZG PDB model (Wang et al., 2020). Residues highlighted by GBPM are then tested for mutation frequency in the worldwide SARS-CoV-2 population.

Finally, for each selected residue, the score averaged over the four models was used to estimate its role in complex stabilization. Based on their average scores, Spike and ACE2 AAs were divided into quartiles to facilitate the interpretation of the results: quartile 1 (Q1) includes the strongest contributors to complex stabilization; quartile 2 (Q2) contains residues less important than those in Q1 but more relevant than those in quartile 3 (Q3); quartile 4 (Q4) indicates the weakest predicted interacting AAs. Such an extension of the original approach allowed us to highlight known relevant interaction residues of both Spike (Table 1) and ACE2 (Table 2).

Table 1.

GBPM scores, average values, and quartile distribution of Spike relevant AAs in the four models derived from three PDB entries.

Residue # 6LZG 6M0J 6VW1-A 6VW1-B Average score Quartile
LYS 417 –43.58 –12.12 0.00 0.00 –13.93 Q2
ASN 439 0.00 0.00 –12.30 –34.94 –11.81 Q2
GLY 446 –22.52 –5.75 0.00 –10.32 –9.65 Q3
GLY 447 –5.63 0.00 0.00 0.00 –1.41 Q3
TYR 449 –25.72 –6.38 –20.37 –24.76 –19.31 Q1
TYR 453 0.00 0.00 –1.77 –1.76 –0.88 Q4
LEU 455 –11.59 –16.82 –21.78 –7.04 –14.31 Q2
PHE 456 –34.20 –30.16 –39.72 –20.76 –31.21 Q1
ALA 475 –52.35 –49.72 –38.73 –77.00 –54.45 Q1
GLY 476 –21.72 0.00 –17.16 –34.59 –18.37 Q2
SER 477 –22.32 0.00 –11.44 –40.68 –18.61 Q2
GLU 484 –8.52 –13.23 0.00 0.00 –5.44 Q3
PHE 486 –28.99 –53.63 –32.56 –53.43 –42.15 Q1
ASN 487 –31.67 –59.57 –33.98 –52.21 –44.36 Q1
TYR 489 –62.10 –27.67 –45.92 –69.38 –51.27 Q1
PHE 490 –4.58 –4.48 –22.90 –40.32 –18.07 Q2
GLN 493 –37.20 –56.08 –79.60 –70.51 –60.85 Q1
GLY 496 –15.54 –8.74 –18.72 –16.80 –14.95 Q2
PHE 497 –8.86 0.00 –4.68 –29.10 –10.66 Q3
GLN 498 –77.24 –80.38 –42.34 0.00 –49.99 Q1
PRO 499 0.00 0.00 0.00 –11.64 –2.91 Q3
THR 500 0.00 –66.00 –92.90 –122.50 –70.35 Q1
ASN 501 –60.14 –61.04 –61.82 –70.59 –63.40 Q1
GLY 502 –24.84 –35.42 –39.45 –40.92 –35.16 Q1
VAL 503 0.00 –5.37 –5.45 –5.54 –4.09 Q3
TYR 505 –30.60 –23.22 –20.90 –40.62 –28.84 Q1

GBPM scores and average values are reported in kcal/mol.
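The averaging over the four models can be checked with a short sketch. The per-model values below are copied from Table 1; the names `SPIKE_SCORES` and `summarize` are hypothetical, and the quartile assignment is intentionally left out because the text does not state the boundaries used.

```python
from statistics import mean

MODELS = ("6LZG", "6M0J", "6VW1-A", "6VW1-B")

# Per-model GBPM scores (kcal/mol) copied from Table 1 for three Spike residues.
SPIKE_SCORES = {
    "LYS 417": (-43.58, -12.12, 0.00, 0.00),
    "GLY 476": (-21.72, 0.00, -17.16, -34.59),
    "ASN 501": (-60.14, -61.04, -61.82, -70.59),
}

def summarize(scores):
    """Return (average score, number of models with a non-zero score)."""
    return mean(scores), sum(1 for s in scores if s != 0.0)

for residue, scores in SPIKE_SCORES.items():
    average, support = summarize(scores)
    print(f"{residue}: average {average:.4f} kcal/mol, "
          f"supported by {support}/{len(MODELS)} models")
```

Note that models in which a residue is not highlighted contribute a score of zero to the average, so a residue supported by fewer models is penalized accordingly.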

Table 2.

GBPM scores, average values and quartile distribution of ACE2 relevant AAs in the four models derived from three PDB entries. GBPM scores and average values are reported in kcal/mol.

Residue # 6LZG 6M0J 6VW1-A 6VW1-B Average score Quartile
SER 19 –31.45 –26.08 –53.61 –79.33 –47.62 Q1
GLN 24 –31.15 –23.62 –34.15 –85.23 –43.54 Q1
THR 27 –16.93 –32.58 –38.70 –16.65 –26.22 Q2
PHE 28 –20.68 –25.02 –14.10 –27.48 –21.82 Q2
ASP 30 0.00 –17.01 0.00 0.00 –4.25 Q3
LYS 31 –84.06 –43.67 –32.98 –46.60 –51.83 Q1
HIS 34 0.00 –30.42 –27.78 –67.56 –31.44 Q2
GLU 35 –11.73 0.00 0.00 –19.40 –7.78 Q2
GLU 37 –11.58 –20.36 –11.83 –20.52 –16.07 Q2
ASP 38 –41.09 –40.52 –25.75 –34.16 –35.38 Q2
TYR 41 –52.50 –75.07 –62.35 –76.07 –66.50 Q1
GLN 42 –36.78 –37.15 –28.53 –63.49 –41.49 Q2
LEU 45 –12.80 –16.43 0.00 –16.20 –11.36 Q2
LEU 79 0.00 0.00 0.00 –5.99 –1.50 Q3
MET 82 0.00 0.00 –6.36 –6.00 –3.09 Q3
TYR 83 –40.50 –66.29 –57.86 –60.81 –56.37 Q1
GLU 329 0.00 0.00 0.00 –17.25 –4.31 Q3
ASN 330 –11.84 –5.92 –11.82 –6.04 –8.91 Q2
GLY 352 –1.97 –8.36 –8.86 –14.66 –8.46 Q2
LYS 353 –79.38 –70.11 –120.73 –46.03 –79.06 Q1
GLY 354 –21.87 –31.15 –12.74 –15.25 –20.25 Q2
ASP 355 –68.95 –81.24 –57.99 –89.12 –74.33 Q1
ARG 357 0.00 –4.99 0.00 0.00 –1.25 Q3
ALA 386 0.00 0.00 –4.85 0.00 –1.21 Q4
ARG 393 0.00 0.00 –4.85 0.00 –1.21 Q4

Roughly the same number of AAs was highlighted for Spike (26 AAs) and ACE2 (25 AAs), and the average scores were in the same range. Spike had a larger Q1 population than ACE2 (12 and 7 AAs, respectively), whereas the opposite was observed for Q2, which comprised 7 residues for Spike and 11 for ACE2. No remarkable difference emerged from the Spike–ACE2 comparison of Q3 and Q4. We reasoned that mutations and variants in Q1 residues could have a more relevant impact on complex stability.

The analysis of all the GBPM models designed suggested that the Spike–ACE2 molecular recognition is largely sustained by polar interactions, such as hydrogen bonds, and by very few putative hydrophobic contributions (Table 3).

Table 3.

Composition of the GBPM models designed.

GBPM feature 6LZG(#) 6LZG(AIE) 6M0J(#) 6M0J(AIE) 6VW1-A(#) 6VW1-A(AIE) 6VW1-B(#) 6VW1-B(AIE) Host/Guest
Hydrophobic 4 –2.07 4 –1.82 5 –2.05 3 –2.12 Spike/ACE2
HBD 18 –6.48 15 –6.47 17 –6.22 19 –6.31 Spike/ACE2
HBA 4 –6.61 13 –5.25 12 –5.47 14 –5.48 Spike/ACE2
Hydrophobic 1 –1.49 3 –1.16 2 –1.49 1 –1.76 ACE2/Spike
HBD 18 –6.26 18 –6.32 24 –5.63 28 –5.94 ACE2/Spike
HBA 7 –4.84 10 –4.53 9 –4.98 12 –4.60 ACE2/Spike

HBD = Hydrogen Bond Donor; HBA = Hydrogen Bond Acceptor; # = number of features; AIE = Average Interaction Energy (in kcal/mol).

Mutational analysis of SARS-CoV-2 spike

We analyzed 295,507 publicly available SARS-CoV-2 full-length genome sequences collected worldwide and deposited on the GISAID database on December 30, 2020 (Shu & McCauley, 2017). From these, we obtained 257,434 samples containing at least one AA-changing mutation in the Spike protein. A total of 3314 different AA-changing mutations were detected in the 1273 AA-long Spike sequence. However, many of these are unique events (or possibly even sequencing errors), as only 2023 mutations were found in more than one sample, 788 were found in more than 10 samples, and 196 in more than 100 samples (Supplementary File 1).
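The tallies reported above (number of distinct mutations, and how many recur in more than 1, 10 or 100 samples) follow from a simple per-sample count. The sketch below shows one way to obtain them; the function names and the toy dataset are hypothetical and do not reproduce the actual GISAID processing used in the study.

```python
from collections import Counter

def count_mutations(samples):
    """Count, for each mutation, the number of samples carrying it.

    samples: iterable of per-sample mutation lists, e.g. ["D614G", "S477N"].
    A mutation listed twice within one sample is counted once.
    """
    counts = Counter()
    for mutations in samples:
        counts.update(set(mutations))
    return counts

def n_above(counts, threshold):
    """Number of distinct mutations found in more than `threshold` samples."""
    return sum(1 for n in counts.values() if n > threshold)

# Hypothetical toy dataset of five samples (not real GISAID records).
toy_samples = [
    ["D614G", "S477N"],
    ["D614G"],
    ["D614G", "N501Y"],
    [],
    ["D614G", "S477N", "S477N"],
]
toy_counts = count_mutations(toy_samples)
print("distinct mutations:", len(toy_counts))
print("found in more than one sample:", n_above(toy_counts, 1))
```

Dividing each count by the total number of analyzed genomes gives the frequency of the mutation in the dataset.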

We then focused on mutations located in the Spike RBD (AA 330–530) with predicted interaction contribution, as assessed by our GBPM method. The majority of mutations here are found in only a handful of samples (Table 4 and Figure 4(A)), with a few notable exceptions. The mutations S477N and N439K are the most frequent in the current population and were identified in 16,547 patients (5.60%) and 5587 patients (1.89%) respectively. These two variants (N439K and S477N) are also amongst the top 20 most frequent in the population and involve two positions productively contributing to the interaction between Spike and ACE2, according to GBPM (see Table 1 and Figure 3 for locations 439 and 477).

Table 4.

Spike mutations located within the RBD (AA 330–530) with at least two cases in the population and non-zero GBPM average score in the ACE2/Spike interaction models.

Mutation Position Abundance Frequency Average GBPM score Quartile
S477N 477 16,547 0.055995 –18.61 Q2
N439K 439 5587 0.018906 –11.81 Q2
N501Y 501 4921 0.016653 –63.3975 Q1
Y453F 453 917 0.003103 –0.8825 Q4
E484K 484 352 0.001191 –5.4375 Q3
K417N 417 260 0.00088 –13.925 Q2
S477I 477 157 0.000531 –18.61 Q2
G446V 446 58 0.000196 –9.6475 Q3
F490S 490 53 0.000179 –18.07 Q2
S477R 477 49 0.000166 –18.61 Q2
N501T 501 47 0.000159 –63.3975 Q1
L455F 455 44 0.000149 –14.3075 Q2
G476S 476 43 0.000146 –18.3675 Q2
E484Q 484 43 0.000146 –5.4375 Q3
A475V 475 35 0.000118 –54.45 Q1
F486L 486 34 0.000115 –42.1525 Q1
F490L 490 18 6.09E-05 –18.07 Q2
YQ505WK 505 14 4.74E-05 –28.835 Q1
Q493L 493 12 4.06E-05 –60.8475 Q1
V503F 503 9 3.05E-05 –4.09 Q3
E484A 484 8 2.71E-05 –5.4375 Q3
G446S 446 7 2.37E-05 –9.6475 Q3
E484D 484 4 1.35E-05 –5.4375 Q3
Q493* 493 4 1.35E-05 –60.8475 Q1
Y505W 505 4 1.35E-05 –28.835 Q1
G476A 476 3 1.02E-05 –18.3675 Q2
S477G 477 3 1.02E-05 –18.61 Q2
F456L 456 2 6.77E-06 –31.21 Q1
V503I 503 2 6.77E-06 –4.09 Q3
Y449F 449 2 6.77E-06 –19.3075 Q1

The asterisk (*) indicates a stop codon. A lower GBPM score indicates a stronger effect in the ACE2/Spike interaction.
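The selection behind Table 4 (mutations inside the RBD whose reference residue has a non-zero GBPM average score) can be sketched as below. Abundances, scores and the total number of genomes are taken from the text and tables; the regular expression, the function names and the data layout are assumptions made for illustration.

```python
import re

TOTAL_SAMPLES = 295_507      # genomes analyzed, as stated in the text
RBD_RANGE = (330, 530)       # Spike RBD boundaries used in the text

# Average GBPM scores (kcal/mol) for three Spike positions, from Table 1.
GBPM_AVERAGE = {439: -11.81, 477: -18.61, 501: -63.40}

# Abundances of three mutations, from Table 4.
ABUNDANCE = {"S477N": 16_547, "N439K": 5_587, "N501Y": 4_921}

# Reference residue(s), position, alternative residue(s) or stop codon (*).
MUTATION_RE = re.compile(r"^([A-Z]+)(\d+)([A-Z*]+)$")

def parse_mutation(name):
    """Split a label such as 'S477N' into ('S', 477, 'N')."""
    match = MUTATION_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {name!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

def annotate(abundance, scores, total):
    """Keep RBD mutations with a non-zero score; add frequency and score."""
    rows = []
    for name, count in abundance.items():
        _, pos, _ = parse_mutation(name)
        score = scores.get(pos, 0.0)
        if RBD_RANGE[0] <= pos <= RBD_RANGE[1] and score != 0.0:
            rows.append((name, pos, count, count / total, score))
    # Most abundant mutations first, as in Table 4.
    return sorted(rows, key=lambda row: row[2], reverse=True)

for name, pos, count, freq, score in annotate(ABUNDANCE, GBPM_AVERAGE, TOTAL_SAMPLES):
    print(f"{name}\t{pos}\t{count}\t{freq:.6f}\t{score}")
```

A mutation such as D614G would be dropped by this filter, since position 614 lies outside the RBD range and has no GBPM score.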

Figure 4.

(A) Occurrence of AA-changing variants on SARS-CoV-2 Spike protein. X-axis indicates the position of the affected AA. Y-axis indicates the log10 of the number of occurrences of the variant in the SARS-CoV-2 dataset. Labels indicate variants affecting ACE2/Spike binding and detected in at least five SARS-CoV-2 sequences. Vertical dashed lines indicate crystalized region analyzed (aa 330 – 530). The D614G variant, located outside the RBD, is also indicated. (B) Scatter plot indicating the occurrence of the variant in the population (x-axis) and the GBPM score of the reference AA in the model (y-axis). Mutations with non-zero GBPM score are indicated. CC indicates the Pearson correlation coefficient and p indicates the p-value of the CC.

Figure 3.

3D ribbon representation of the interaction domains of SARS-CoV-2 Spike (left, orange) and human ACE2 (right, green), based on the crystal structure 6LZG deposited on Protein Data Bank and produced by Wang et al. (2020). The positions of the three most frequent Spike mutations in the interacting region (AA 350-550) with a non-zero GBPM score are indicated: N439K, N501Y and S477N.

The graphical inspection of the PDB structures revealed that Spike Asparagine (N) 439, ranked at GBPM Q2, is mainly involved in intra-protein interactions. In fact, by means of its backbone sp2 oxygen atom, N439 accepts one hydrogen bond from the Spike Serine 443 side chain and, by its side chain amide group, donates one hydrogen bond to the Spike Proline 499 backbone: all these AAs are located in a random coil loop of Spike, so the N439K mutation could minimally modify the Spike-ACE2 recognition. On the other hand, after the theoretical mutation of Asparagine 439 to a Lysine, it is possible to predict a productive electrostatic interaction between the new net positively charged residue and the ACE2 Glutamate 329. Such a long-distance interaction could improve the stabilization of the complex with respect to the Spike wild type (Supporting information Figure S1).

A similar effect could be attributed to the mutation at position 477. Serine (S) 477 is a weak contributor to the complex interaction. In all PDB entries we selected, Serine 477 is located in a solvent-exposed random coil loop, and no interaction with ACE2 or Spike residues can be observed. Accordingly, the GBPM analysis included this residue in Q2. Conversely, its mutation to Asparagine (S477N), in our in silico model, revealed the possibility of establishing a hydrogen bond to the ACE2 Serine 19, which could result in a stabilization of the complex (Supporting information Figure S2). Moreover, position 477 is also affected by three other events with lower occurrence: S477I, S477R and S477G, with 157, 49 and 3 observations (Table 4). Among these, S477R could be the most interesting one: a net positively charged residue, such as Arginine (R), can establish a weak electrostatic interaction with ACE2 Glutamate 87, as suggested by a theoretical model we built. S477I and S477G could modify the conformation of a random coil segment, so they do not appear very relevant. Conversely, S477N and S477R could productively contribute to the Spike ACE2 complex stabilization. Of course, deeper theoretical and experimental investigations should be carried out to confirm this hypothesis. Unfortunately, full-scale simulations cannot be rigorously performed today because the available 3D structural models report only fragments of the complex between Spike and ACE2.

The third most common mutation, N501Y (Figure 3), targets an AA predicted to have a strong role in the interaction in all four models, sitting in the GBPM Q1. N501Y was detected in 4921 patients (1.67% of the dataset), the majority of which were located in the United Kingdom (Shu & McCauley, 2017). From a structural point of view, we predict that the substitution of an Asparagine (N) with a Tyrosine (Y) at position 501 may have an effect: their Total Polar Surface Areas (TPSA), equal to 101.29 and 78.43 Å² respectively, differ, although both side chains can donate or accept a hydrogen bond. Therefore, their contribution to complex stabilization may be slightly different, also taking into account the chemical environment. In fact, the wild type Asparagine 501 donates one hydrogen bond to ACE2 Tyrosine 41: such an interaction could also be possible for the N501Y mutant or, as we observed in our theoretical model, it could be replaced by pi–pi stacking (Supporting information Figure S3). The rapid increase in frequency of mutation N501Y has been recently observed in the United Kingdom and other countries, as it is one of the variants characterizing lineage B.1.1.7 (Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations - SARS-CoV-2 coronavirus/nCoV-2019 Genomic Epidemiology - Virological, 2021). The Asparagine/Tyrosine substitution in Spike position 501 could contribute to an evolutionary advantage for this lineage, based on differential affinity for the human receptor ACE2 (Fratev, 2020; Leung et al., 2020).

A less frequent mutation amongst those predicted to contribute to the ACE2/Spike interaction is G476S, detected in 43 samples (0.015%), and supported by three out of four structural models (Table 1, Figure 4(B)). Glycine (G) 476 was included by the GBPM analysis in Q2: its contribution to the complex stabilization is weak. Unlike the other mutations described here, the replacement of Glycine 476 with a Serine (S) could have more evident effects on Spike ACE2 molecular recognition. In fact, in all PDB entries, the alpha carbon of this Glycine is very close, about 4 Å, to the side chain amide group of the ACE2 Glutamine 24. No productive interaction can be established between these two AAs, but the substitution of the Spike Glycine with a Serine could allow one inter-protein hydrogen bond to ACE2 Glutamine 24. Moreover, G476S could establish the same interaction with Spike Glutamine 478, which could stabilize the conformation of a random coil segment of the viral protein, resulting in a better pre-organization for ACE2 recognition (Supporting information Figure S4).

Another Spike residue predicted by our analysis to play a relevant role in ACE2 recognition is Glutamine 493 (Table 1). The GISAID data revealed that this amino acid is rarely replaced by a Leucine (Q493L) or by an Arginine (Q493R). These mutations could affect the recognition of ACE2 in opposite ways. Spike Glutamine 493 is involved in a hydrogen bond with ACE2 Glutamate 35. The Q493L mutant cannot establish such a productive contribution and could only interact hydrophobically with Spike Leucine 455. Conversely, Q493R could locate its net positively charged side chain in an ACE2 pocket delimited by Aspartate 30, Histidine 34 and Glutamate 35. Such a positioning could produce a remarkable electrostatic stabilization of the complex (Supporting information Figure S5).

In general, we observed that AAs with the strongest evidence for interaction contribution in the Spike/ACE2 interface tend not to diverge from the reference (Figure 4(B)), which may indicate a solid evolutionary constraint to maintain the interface residues unchanged. For example, one of the most relevant 1st quartile AAs in the ACE2/Spike interaction, Glutamine (Q) 493, is rarely mutated, with 12 cases of Q493L, 4 of Q493* (the substitution of Q493 with a stop codon), three of Q493K, and one each of Q493R and Q493H. One possible exception is the aforementioned Spike mutation N501Y, which affects an AA in the strongest (1st) GBPM quartile for ACE2 binding and was found in the considerable number of 4921 different patients.
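The relationship between mutation occurrence and GBPM score summarized in Figure 4(B) is expressed as a Pearson correlation coefficient. The sketch below computes such a coefficient on the six most abundant rows of Table 4 only, so its result is not expected to match the value reported in the figure; the log10 transformation of the abundance and the function name `pearson` are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# (abundance, average GBPM score) pairs for the first six rows of Table 4.
table4_subset = [
    (16547, -18.61),    # S477N
    (5587, -11.81),     # N439K
    (4921, -63.3975),   # N501Y
    (917, -0.8825),     # Y453F
    (352, -5.4375),     # E484K
    (260, -13.925),     # K417N
]
log_abundance = [math.log10(count) for count, _ in table4_subset]
gbpm_scores = [score for _, score in table4_subset]
print(f"CC on this subset = {pearson(log_abundance, gbpm_scores):.2f}")
```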

Mutational analysis of human ACE2

We also investigated the variants of human ACE2, since these could constitute the basis for patient-specific COVID-19 susceptibility and severity. ACE2 protein sequence is highly conserved across vertebrates (Guzzi et al., 2020) and also within the human species (Cao et al., 2020), with the most frequent missense mutation (rs41303171, N720D) present in 1.5% of the world population (Supplementary File 2).

Our analysis shows that only five variants of ACE2 detected in the human population are also located in the ACE2/Spike direct binding interface (Table 5 and Figure 5). Of these, rs73635825 (causing a S19P AA variant) is both the most frequent in the population (0.06%) and the most relevant in the interaction with the viral protein, with a GBPM score of −47.6175 (Q1) and support from all four models (Table 2). The rs73635825 SNP frequency is higher in the population of African descent (0.2%). The second SNP, rs143936283 (E329G, Table 5) is a very rare allele (0.0066%) in the European (non-Finnish) Asian population. The rs766996587 (M82I) SNP is also a very rare allele (0.0066%) found in the African population. E37K (rs146676783) is more frequent in the Finnish population (0.03%) and G352V (rs370610075) in the European non-Finnish population (0.007%). None of these five SNPs has a reported clinical significance, according to dbSNP and literature search (Sherry et al., 2001).

Table 5.

ACE2 variants with non-zero GBPM score in the Spike interaction model.

Variant rsID Allele frequency Average GBPM score Quartile
S19P rs73635825 0.000655 –47.62 Q1
E329G rs143936283 6.63E-05 –4.31 Q3
M82I rs766996587 6.62E-05 –3.09 Q3
E37K rs146676783 5.68E-05 –16.07 Q2
G352V rs370610075 3.8E-05 –8.46 Q2

Figure 5.

Frequency of mutations on ACE2. X-axis indicates the AA position in isoform 1 (UniProt id Q9BYF1-1). Y-axis indicates the allele frequency in the global population according to the GNOMAD v3 database. Labels indicate AA changes observed in the human population with non-zero GBPM average score in the ACE2/Spike interaction models. Vertical dashed lines indicate the crystalized region analyzed in this study (aa 15 – 615).

It must be mentioned that M82I, together with S19P, has been predicted to adversely affect ACE2 stability (Hussain, 2020). M82I, together with E329G, has been simulated to increase binding affinity with Spike when compared to wild type ACE2, hypothesizing greater susceptibility to SARS-CoV-2 for patients carrying these variants (Wang, Xu et al., 2020). Instead, E37K (Wang, Xu et al., 2020) and G352V (Darbani, 2020) were predicted to possess a lower affinity with Spike, suggesting lower susceptibility to the infection. However, while describing potential explanations to the existence of a possible predisposing genetic background to infection, all these studies remain inconclusive in linking allele variants to COVID-19 susceptibility.

Structurally, the S19P variant may greatly differ from the reference sequence in the interaction with Spike: Serine (S) is a polar residue, able to accept and donate a hydrogen bond by means of its side chain alcoholic group. Proline (P), on the other hand, cannot be involved in hydrogen bonding, and therefore should establish a weaker interaction with Spike. In fact, the ACE2 Serine 19 side chain donates a hydrogen bond to the Spike Alanine 475 backbone (Supporting information Figure S6) and could potentially establish the same interaction with Spike Glycine (G) 476, which could also be mutated (Table 4). Both Methionine (M) 82 and Glutamate (E) 329 are in Q3, minimally contributing to Spike ACE2 recognition (Supporting information Figures S7 and S8). They are located within two alpha helices, so their mutation could modify the secondary structure of ACE2, resulting in a different affinity for Spike. Such a possibility should be more evident in the case of E329G, because the Glutamate 329 side chain is involved in a hydrogen bond with ACE2 Glutamine 325.

Discussion

SARS-CoV-2 Spike evolved through a series of adaptive mutations that increased its affinity for the human ACE2 receptor (Ortega et al., 2020). There is no reason to believe that the evolution and adaptation of the virus will stop, making continuous sequencing and mutational tracking studies of paramount importance to strategically contain COVID-19 (Meredith et al., 2020). In our study, we highlighted which specific locations of Spike can influence the ACE2 molecular recognition, required for the viral entry into the host cell (Hoffmann et al., 2020). We further showed that some mutations are already present in the SARS-CoV-2 population that may weakly affect the interaction with the human receptor, specifically Spike N439K, S477N and N501Y. These mutations are rising in the viral population (>1%) and in particular N501Y is one of the key mutations characterizing lineage B.1.1.7 (Leung et al., 2020), which has seen a recent dramatic increase in frequency in the United Kingdom (Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations - SARS-CoV-2 coronavirus/nCoV-2019 Genomic Epidemiology - Virological , XXXX). Having identified this mutation proves that our combination of targeted mutation frequency and GBPM is a useful pipeline to monitor events in the key region used by SARS-CoV-2 to recognize and enter human bronchial cells. The same approach can be used to monitor, in the future, if any of these events will increase in frequency, suggesting an adaptation to the human host leveraging a higher affinity with ACE2.

On the other hand, we studied the variants in the human ACE2 population, identifying five loci that can affect the binding with SARS-CoV-2 Spike. They are all rare variants, with the most frequent, S19P, present in 0.06% of the population, and with no known clinical significance. However, other in silico studies have predicted their role in decreasing ACE2 stability (S19P and M82I) (Hussain, 2020), and in altering the affinity with Spike (increasing it: M82I and E329G (Wang, Xu et al., 2020); decreasing it: E37K (Wang, Xu et al., 2020) and G352V (Darbani, 2020)). The most common ACE2 variant, rs41303171 (N720D), is not located in the binding region, and so far its predicted effects on the etiopathology of COVID-19 are still largely conjectural and associated to neurological complications via mechanisms probably independent from direct interaction with Spike (Strafella et al., 2020).

It remains to be seen whether, in the future, the combination of Spike and ACE2 sequences will produce novel and unexpected COVID-19 specificities that will require granular efforts in developing wider-spectrum anti-SARS-CoV-2 strategies, such as vaccines or antiviral drugs. So far, our analysis has shown a location on the Spike/ACE2 complex where both proteins vary in the viral/human population, specifically on ACE2 S19 and Spike A475/G476. While, as described in our Results, these mutations on Spike are not likely to strongly affect the interaction surface, future combinations of ACE2/Spike variants may have peculiar effects that will require constant mutation monitoring. Identifying single or multiple AAs involved in this viral entry interaction will allow for personalized diagnosis and clinical prediction based on the specific combination of SARS-CoV-2 strain and ACE2 variant. Personalized COVID-19 treatment will require targeted sequencing of the patient's ACE2 and of the viral Spike, to identify the combination causing the specific case. This technical obstacle can be further complicated by the intra-host genetic variability of SARS-CoV-2, which has recently been reported from RNA-Sequencing studies (Shen et al., 2020).

Structural investigation will benefit, in the near future, from the availability of experimental structural models reporting the complete sequence of both Spike and ACE2, or at least of Spike. This will allow more rigorous computational analyses (e.g. molecular dynamics simulation, free energy perturbation) of the effect of mutations on the Spike/ACE2 recognition. Beyond the complex investigated in this manuscript, our approach can be fully extended to any other partners in the SARS-CoV-2/human interactome, for example the recently discovered interaction between viral protease NSP5 (Gordon et al., 2020) and human histone deacetylase HDAC2 (Milazzo et al., 2020), which is indirectly responsible for the transcriptional activation of pro-inflammatory genes. Our approach can also be extended to other viruses exploiting human receptors as an entry mechanism, such as CD4 for the Human Immunodeficiency Virus (HIV) or TIM-1 for the Ebola virus (Grove & Marsh, 2011).

Materials and methods

Structural analysis

The PDB (Berman et al., 2000) was searched for high-resolution Spike/ACE2 complexes. PDB entries 6LZG (Wang, Zhang et al., 2020), 6M0J (Lan et al., 2020) and 6VW1 (Shang et al., 2020), which report the Spike RBD interacting with ACE2, were retrieved and used for our GBPM analysis (Ortuso et al., 2006). This computational approach compares GRID (Goodford, 1985) molecular interaction fields (MIFs) computed on a generic complex (A) and, separately, on its host (B) and guest (C) components. MIFs describe the interaction between a given probe and a given target. If the target is a complex then, depending on the selected area, the MIF energies can refer to the interaction between the probe and one of the complex subunits or, at the host/guest interface, with both of them. The GBPM analysis objectively highlights the latter. Five steps are required: (1) the complex A is disassembled into its subunits B and C; (2) MIFs are computed on A, B and C using the most appropriate GRID probes. Hydrogen bond acceptor/donor probes and a generic hydrophobic probe can describe the basic interactions. Because GRID MIFs are stored as a 3D matrix of interaction energy points (IEPs), the same box dimensions are adopted in all calculations; (3) each IEP of B is compared with the equivalent point of A, generating a new MIF named D. The following algorithm, available in the GRAB tool, is applied: if IEP(A) > 0 and IEP(B) > 0 then IEP(D) = 0; if IEP(A) > 0 and IEP(B) < 0 then IEP(D) = IEP(B); if IEP(A) < 0 and IEP(B) > 0 then IEP(D) = -IEP(A); if IEP(A) < 0 and IEP(B) < 0 then IEP(D) = IEP(A)-IEP(B). The resulting MIF D reports, as negative energy values, the productive interactions of the GRID probe with B and with the host/guest interface; (4) in order to obscure the interaction between the probe and B, MIFs D and C are compared using the GRAB approach, producing a new MIF E; (5) finally, the most relevant interaction points (GBPM features) of the MIF E are selected by applying an energy cutoff 15% above the global minimum. Supplementary figures focusing on the most relevant mutation are available in Supplementary File 3.
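As an illustration only, the point-by-point comparison of step (3) and the cutoff of step (5) can be sketched in Python as follows. This is not the GRAB tool: the function names are hypothetical, the MIFs are represented as flat lists of energies sampled on the same grid, points where either energy is exactly zero are assumed to yield zero (the rule quoted above does not cover that case), and the cutoff "15% above the global minimum" is interpreted as keeping points whose energy is at most 85% of the (negative) global minimum.

```python
def grab_compare(iep_a, iep_b):
    """Combine one interaction energy point of A with the equivalent point of B,
    following the four cases quoted in the text."""
    if iep_a > 0 and iep_b > 0:
        return 0.0
    if iep_a > 0 and iep_b < 0:
        return iep_b
    if iep_a < 0 and iep_b > 0:
        return -iep_a
    if iep_a < 0 and iep_b < 0:
        return iep_a - iep_b
    # Assumption: the text does not say what happens when an energy is exactly 0.
    return 0.0


def grab_field(mif_a, mif_b):
    """Apply grab_compare to every pair of equivalent grid points."""
    if len(mif_a) != len(mif_b):
        raise ValueError("MIFs must be computed on the same grid box")
    return [grab_compare(a, b) for a, b in zip(mif_a, mif_b)]


def select_features(mif_e, fraction=0.15):
    """Return the indices of grid points within `fraction` of the global minimum.

    Assumed interpretation: with a negative minimum E_min, a point is kept
    when its energy is <= E_min * (1 - fraction).
    """
    minimum = min(mif_e)
    if minimum >= 0:
        return []  # no favourable interaction points at all
    threshold = minimum * (1.0 - fraction)
    return [i for i, energy in enumerate(mif_e) if energy <= threshold]
```

Under these assumptions, step (3) would correspond to `mif_d = grab_field(mif_a, mif_b)`, step (4) to `mif_e = grab_field(mif_d, mif_c)`, and step (5) to `select_features(mif_e)`.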

Before starting the GBPM analysis, co-crystallized water molecules were removed from the PDB structures. 6VW1 contains two Spike-ACE2 complexes, namely chains A-E and B-F; both were investigated and are hereafter reported as models A and B, respectively. All selected complexes were conformationally compared with each other by alignment and by computing the RMSd on the Cartesian coordinates of equivalent non-hydrogen atoms. The original GRID probes DRY, N1 and O were used to highlight hydrophobic, hydrogen bond donor and hydrogen bond acceptor areas. In order to identify the most relevant residues of both Spike and ACE2, we conceptually and technically extended the GBPM algorithm, originally designed for drug/target interactions (Ortuso et al., 2006). In the GBPM analysis presented here, each of the two interacting proteins was considered in turn as the host and as the guest unit, and relevant AAs were selected if their distance from GBPM features was less than or equal to 3 Å. For each PDB model, the selected residues were scored as the sum of the interaction energies of the corresponding GBPM features.
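The residue selection and scoring described above can be illustrated with a short Python sketch. It is not the authors' implementation: the function name and data layout are hypothetical, and it assumes that the distance is measured between a GBPM feature and any atom of a residue, and that each feature contributes its energy at most once per residue.

```python
import math
from collections import defaultdict


def score_residues(atoms, features, cutoff=3.0):
    """Score residues by summing the energies of nearby GBPM features.

    atoms    : list of (residue_id, (x, y, z)) tuples, one per atom
    features : list of ((x, y, z), energy) tuples, one per GBPM feature
    cutoff   : maximum atom-feature distance in angstroms (3 A in the text)
    """
    scores = defaultdict(float)
    for feature_xyz, energy in features:
        # Residues with at least one atom within the cutoff of this feature.
        nearby = {
            residue
            for residue, atom_xyz in atoms
            if math.dist(atom_xyz, feature_xyz) <= cutoff
        }
        for residue in nearby:
            scores[residue] += energy
    return dict(scores)


# Toy coordinates and energies, invented purely for illustration.
toy_atoms = [
    ("K417", (0.0, 0.0, 0.0)),
    ("K417", (1.0, 0.0, 0.0)),
    ("N501", (10.0, 0.0, 0.0)),
]
toy_features = [
    ((0.0, 0.0, 2.0), -4.0),
    ((10.0, 0.0, 3.0), -1.5),
    ((50.0, 0.0, 0.0), -9.0),
]
toy_scores = score_residues(toy_atoms, toy_features)
```

In this toy case the third feature lies far from every atom and therefore contributes to no residue; more negative scores indicate residues surrounded by more favourable interaction points.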

In order to prevent unrealistic distortion of the Spike-ACE2 complex, due to the use of structures not covering the full length of the interacting proteins, the effect of the mutations was qualitatively estimated by means of the mutagenesis tool implemented in the PyMOL software (PyMOL, 2017). Wild-type residues were replaced by the mutated ones, and the new side chain conformations were optimized taking into account the neighboring AAs. The graphical analysis was carried out on the predicted most populated rotamers. Because of its better X-ray resolution, the 6M0J PDB structure was selected for this investigation.

Genetical analysis

SARS-CoV-2 genome sequences from human hosts and accounting for a total of 145,201 submissions were obtained from the GISAID database on 15 October 2020 (Shu & McCauley, 2017). Low quality (with more than 5% uncharacterized nucleotides) and incomplete (<29,000 nucleotides, based on a total reference length of 29,903) sequences were removed. The resulting 135,591 genome sequences were aligned on the reference SARS-CoV-2 Wuhan genome (NCBI entry NC_045512.2) using the NUCMER algorithm (Marçais et al., 2018). Position-specific nucleotide differences were merged for neighboring events and converted into protein mutations using the coronapp annotator (Mercatelli, Triboli et al., 2020). The results were further filtered for AA-changing mutations targeting the Spike protein.
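The two quality filters described above can be sketched in Python as follows. This is an illustrative reading rather than the authors' pipeline: the function and constant names are hypothetical, "uncharacterized" is assumed to mean any symbol other than A, C, G or T (so N and other ambiguity codes count), and the 5% fraction is assumed to be computed over the length of each submitted sequence.

```python
MIN_LENGTH = 29_000                  # sequences shorter than this are "incomplete"
MAX_UNCHARACTERIZED_FRACTION = 0.05  # more than 5% uncharacterized is "low quality"


def passes_quality_filter(sequence):
    """Return True if a genome sequence is long enough and sufficiently resolved."""
    seq = sequence.upper()
    if len(seq) < MIN_LENGTH:
        return False
    # Assumption: every symbol other than A, C, G, T is uncharacterized.
    uncharacterized = sum(1 for base in seq if base not in "ACGT")
    return uncharacterized / len(seq) <= MAX_UNCHARACTERIZED_FRACTION


def filter_genomes(genomes):
    """Keep only the (identifier, sequence) pairs that pass the quality filter."""
    return [(name, seq) for name, seq in genomes if passes_quality_filter(seq)]
```

Sequences retained by such a filter would then be passed to the alignment and annotation steps described in the text.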

ACE2 variants in the human population were extracted from the gnomAD database, v3, 18 July 2020 (Karczewski et al., 2020). We considered only missense variants affecting specific AAs in the protein sequence, for a total of 155 entries (Supplementary File 2). Graph generation was performed with the R statistical software and the corto package v1.1.2 (Mercatelli, Lopez-Garcia et al., 2020).

Acknowledgements

We thank the Italian Ministry of Education and Research for their financial support under the Montalcini initiative. We thank Prof. Giovanni Perini for his continued support and scientific enthusiasm, Prof. Massimo Battistini for his lessons on logic and writing, Prof. Elena Bacchelli for her suggestions on the use of gnomAD, and Prof. Stefano Alcaro, who provided the computational resources required by the GBPM analysis. Finally, we thank Mr. George Wolf for the final proofreading of the manuscript.

Glossary

Abbreviations

AA: amino acid

ACE2: angiotensin-converting enzyme 2

COVID-19: coronavirus disease 2019

GBPM: Grid Based Pharmacophore Model

IEP: interaction energy point

MIFs: molecular interaction fields

ORF: open reading frame

PDB: protein data bank

RBD: spike receptor binding domain with ACE2

RMSd: root mean square deviation

SARS-CoV-2: severe acute respiratory syndrome coronavirus 2

Author contributions

FMG, PHG and FO designed the study. FO designed and performed the structural analysis. FMG designed the genetics analysis. FMG and DM performed the genetics analysis. FMG financially supported the study. PHG drafted the manuscript and performed literature search. All authors contributed to the writing of the final version of the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Classification

Biophysics and Computational Biology

Significance statement

We developed a method to identify key amino acids responsible for the initial interaction between SARS-CoV-2 (the COVID-19 virus) and human cells, through the analysis of Spike/ACE2 complexes. We further identified which of these amino acids show variants in the viral and human populations. Our results will facilitate scientists and clinicians alike in identifying the possible role of present and future Spike and ACE2 sequence variants in cell entry and general susceptibility to infection.

References

  1. Becerra‐Flores, M., & Cardozo, T. (2020). SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate. International Journal of Clinical Practice, 74(8), e13525. 10.1111/ijcp.13525
  2. Benetti, E. (2020). ACE2 gene variants may underlie interindividual variability and susceptibility to COVID-19 in the Italian population. Genetic and Genomic Medicine. Advance online publication. 10.1101/2020.04.03.20047977
  3. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., & Bourne, P. E. (2000). The protein data bank. Nucleic Acids Research, 28(1), 235–242. 10.1093/nar/28.1.235
  4. Bosso, M., Thanaraj, T. A., Abu-Farha, M., Alanbaei, M., Abubaker, J., & Al-Mulla, F. (2020). The two faces of ACE2: The role of ACE2 receptor and its polymorphisms in hypertension and COVID-19. Molecular Therapy. Methods & Clinical Development, 18, 321–327. 10.1016/j.omtm.2020.06.017
  5. Burrell, L. M., Harrap, S. B., Velkoska, E., & Patel, S. K. (2013). The ACE2 gene: Its potential as a functional candidate for cardiovascular disease. Clinical Science (London, England: 1979), 124(2), 65–76. 10.1042/CS20120269
  6. Cao, Y., Li, L., Feng, Z., Wan, S., Huang, P., Sun, X., Wen, F., Huang, X., Ning, G., & Wang, W. (2020). Comparative genetic analysis of the novel coronavirus (2019-nCoV/SARS-CoV-2) receptor ACE2 in different populations. Cell Discovery, 6(1), 4. 10.1038/s41421-020-0147-1
  7. Ceraolo, C., & Giorgi, F. M. (2020). Genomic variance of the 2019-nCoV coronavirus. Journal of Medical Virology, 92(5), 522–528. 10.1002/jmv.25700
  8. Darbani, B. (2020). The expression and polymorphism of entry machinery for COVID-19 in human: Juxtaposing population groups, gender, and different tissues. International Journal of Environmental Research and Public Health, 17(10), 3433. 10.3390/ijerph17103433
  9. Fratev, F. (2020). The N501Y and K417N mutations in the spike protein of SARS-CoV-2 alter the interactions with both hACE2 and human derived antibody: A Free energy of perturbation study. Molecular Biology. Advance online publication. 10.1101/2020.12.23.424283
  10. Goodford, P. J. (1985). A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. Journal of Medicinal Chemistry, 28(7), 849–857. 10.1021/jm00145a002
  11. Gordon, D. E., Jang, G. M., Bouhaddou, M., Xu, J., Obernier, K., White, K. M., O'Meara, M. J., Rezelj, V. V., Guo, J. Z., Swaney, D. L., Tummino, T. A., Hüttenhain, R., Kaake, R. M., Richards, A. L., Tutuncuoglu, B., Foussard, H., Batra, J., Haas, K., Modak, M., … Krogan, N. J. (2020). A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature, 583(7816), 459–468. 10.1038/s41586-020-2286-9
  12. Grove, J., & Marsh, M. (2011). The cell biology of receptor-mediated virus entry. The Journal of Cell Biology, 195(7), 1071–1082. 10.1083/jcb.201108131
  13. Grubaugh, N. D., Hanage, W. P., & Rasmussen, A. L. (2020). Making sense of mutation: What D614G means for the COVID-19 pandemic remains unclear. Cell, 182(4), 794–795. 10.1016/j.cell.2020.06.040
  14. Guzzi, P. H., Mercatelli, D., Ceraolo, C., & Giorgi, F. M. (2020). Master Regulator Analysis of the SARS-CoV-2/Human Interactome. Journal of Clinical Medicine, 9(4), 982. 10.3390/jcm9040982
  15. Hadfield, J., Megill, C., Bell, S. M., Huddleston, J., Potter, B., Callender, C., Sagulenko, P., Bedford, T., & Neher, R. A. (2018). Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics (Oxford, England), 34(23), 4121–4123. 10.1093/bioinformatics/bty407
  16. Hanson, Q. M. (2020). Targeting ACE2-RBD interaction as a platform for COVID19 therapeutics: Development and drug repurposing screen of an AlphaLISA proximity assay. bioRxiv. Advance online publication. 10.1101/2020.06.16.154708
  17. Hilton, J., & Keeling, M. J. (2020). Estimation of country-level basic reproductive ratios for novel Coronavirus (SARS-CoV-2/COVID-19) using synthetic contact matrices. PLoS Computational Biology, 16(7), e1008031. 10.1371/journal.pcbi.1008031
  18. Hoffmann, M., Kleine-Weber, H., Schroeder, S., Krüger, N., Herrler, T., Erichsen, S., Schiergens, T. S., Herrler, G., Wu, N.-H., Nitsche, A., Müller, M. A., Drosten, C., & Pöhlmann, S. (2020). SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell, 181(2), 271–280.e8. 10.1016/j.cell.2020.02.052
  19. Hussain, M. (2020). Structural variations in human ACE2 may influence its binding with SARS-CoV-2 spike protein. Journal of Medical Virology, 92(9), 1580–1586. 10.1002/jmv.25832
  20. Jackson, L. A. (2020). An mRNA vaccine against SARS-CoV-2 – Preliminary report. New England Journal of Medicine, 383, 1920–1931. 10.1056/NEJMoa2022483
  21. Karczewski, K. J., Francioli, L. C., Tiao, G., Cummings, B. B., Alföldi, J., Wang, Q., Collins, R. L., Laricchia, K. M., Ganna, A., Birnbaum, D. P., Gauthier, L. D., Brand, H., Solomonson, M., Watts, N. A., Rhodes, D., Singer-Berk, M., England, E. M., Seaby, E. G., Kosmicki, J. A., … MacArthur, D. G., Genome Aggregation Database Consortium (2020). The mutational constraint spectrum quantified from variation in 141,456 humans. Nature, 581(7809), 434–443. 10.1038/s41586-020-2308-7
  22. Korber, B. (2020). Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2. Biorxiv. Advance online publication. 10.1101/2020.04.29.069054
  23. Lan, J., Ge, J., Yu, J., Shan, S., Zhou, H., Fan, S., Zhang, Q., Shi, X., Wang, Q., Zhang, L., & Wang, X. (2020). Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature, 581(7807), 215–220. 10.1038/s41586-020-2180-5
  24. Leung, K., Shum, M. H., Leung, G. M., Lam, T. T., & Wu, J. T. (2020). Early empirical assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom, October to November 2020. Epidemiology. Advance online publication. 10.1101/2020.12.20.20248581
  25. Marçais, G., Delcher, A. L., Phillippy, A. M., Coston, R., Salzberg, S. L., & Zimin, A. (2018). MUMmer4: A fast and versatile genome alignment system. PLoS Computational Biology, 14(1), e1005944. 10.1371/journal.pcbi.1005944
  26. Mercatelli, D., & Giorgi, F. M. (2020). Geographic and genomic distribution of SARS-CoV-2 Mutations. Frontiers in Microbiology, 11, 1800. 10.3389/fmicb.2020.01800
  27. Mercatelli, D., Lopez-Garcia, G., & Giorgi, F. M. (2020). corto: A lightweight R package for gene network inference and master regulator analysis. Bioinformatics (Oxford, England), 36(12), 3916–3917. 10.1093/bioinformatics/btaa223
  28. Mercatelli, D., Triboli, L., Fornasari, E., Ray, F., & Giorgi, F. M. (2020). Coronapp: A web application to annotate and monitor SARS-CoV-2 mutations. Journal of Medical Virology. Advance online publication. 10.1002/jmv.26678
  29. Meredith, L. W., Hamilton, W. L., Warne, B., Houldcroft, C. J., Hosmillo, M., Jahun, A. S., Curran, M. D., Parmar, S., Caller, L. G., Caddy, S. L., Khokhar, F. A., Yakovleva, A., Hall, G., Feltwell, T., Forrest, S., Sridhar, S., Weekes, M. P., Baker, S., Brown, N., … Goodfellow, I. (2020). Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: A prospective genomic surveillance study. The Lancet Infectious Diseases, 20(11), 1263–1271. 10.1016/S1473-3099(20)30562-4
  30. Milazzo, G., Mercatelli, D., Di Muzio, G., Triboli, L., De Rosa, P., Perini, G., & Giorgi, F. M. (2020). Histone deacetylases (HDACs): Evolution, specificity, role in transcriptional complexes, and pharmacological actionability. Genes, 11(5), 556. 10.3390/genes11050556
  31. Ortega, J. T., Serrano, M. L., Pujol, F. H., & Rangel, H. R. (2020). Role of changes in SARS-CoV-2 spike protein in the interaction with the human ACE2 receptor: An in silico analysis. EXCLI Journal, 19, 410–417.
  32. Ortuso, F., Langer, T., & Alcaro, S. (2006). GBPM: GRID-based pharmacophore model: Concept and application studies to protein-protein recognition. Bioinformatics (Oxford, England), 22(12), 1449–1455. 10.1093/bioinformatics/btl115
  33. Ou, J. (2020). Emergence of RBD mutations in circulating SARS-CoV-2 strains enhancing the structural stability and human ACE2 receptor affinity of the spike protein. Biorxiv. Advance online publication. 10.1101/2020.03.15.991844
  34. Peter, E. K., & Schug, A. (2020). The inhibitory effect of a Corona virus spike protein fragment with ACE2. Biorxiv. Advance online publication. 10.1101/2020.06.03.132506
  35. Pinto, B. G. G. (2020). ACE2 expression is increased in the lungs of patients with comorbidities associated with severe COVID-19. Journal of Infectious Diseases. Advance online publication. 10.1093/infdis/jiaa332
  36. Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations – SARS-CoV-2 coronavirus/nCoV-2019 Genomic Epidemiology - Virological (January 3, 2021). Retrieved from https://virological.org/t/preliminary-genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-the-uk-defined-by-a-novel-set-of-spike-mutations/563
  37. PyMOL. (2017). The PyMOL Molecular Graphics System, Version 2.0 (Schrödinger, LLC).
  38. Sashittal, P., Luo, Y., Peng, J., & El-Kebir, M. (2020). Characterization of SARS-CoV-2 viral diversity within and across hosts. Biorxiv. Advance online publication. 10.1101/2020.05.07.083410
  39. Shang, J., Ye, G., Shi, K., Wan, Y., Luo, C., Aihara, H., Geng, Q., Auerbach, A., & Li, F. (2020). Structural basis of receptor recognition by SARS-CoV-2. Nature, 581(7807), 221–224. 10.1038/s41586-020-2179-y
  40. Shen, Z., Xiao, Y., Kang, L., Ma, W., Shi, L., Zhang, L., Zhou, Z., Yang, J., Zhong, J., Yang, D., Guo, L., Zhang, G., Li, H., Xu, Y., Chen, M., Gao, Z., Wang, J., Ren, L., & Li, M. (2020). Genomic diversity of severe acute respiratory syndrome – Coronavirus 2 in patients with coronavirus disease 2019. Clinical Infectious Diseases, 71(15), 713–720. 10.1093/cid/ciaa203
  41. Sherry, S. T., Ward, M. H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., & Sirotkin, K. (2001). dbSNP: The NCBI database of genetic variation. Nucleic Acids Research, 29(1), 308–311. 10.1093/nar/29.1.308
  42. Shu, Y., & McCauley, J. (2017). GISAID: Global initiative on sharing all influenza data – from vision to reality. Euro Surveillance, 22, 30494.
  43. Strafella, C., Caputo, V., Termine, A., Barati, S., Gambardella, S., Borgiani, P., Caltagirone, C., Novelli, G., Giardina, E., & Cascella, R. (2020). Analysis of ACE2 genetic variability among populations highlights a possible link with COVID-19-related neurological complications. Genes, 11(7), 741. 10.3390/genes11070741
  44. Tai, W., He, L., Zhang, X., Pu, J., Voronin, D., Jiang, S., Zhou, Y., & Du, L. (2020). Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: Implication for development of RBD protein as a viral attachment inhibitor and vaccine. Cellular and Molecular Immunology, 17(6), 613–620. 10.1038/s41423-020-0400-4
  45. Tang, X., Wu, C., Li, X., Song, Y., Yao, X., Wu, X., Duan, Y., Zhang, H., Wang, Y., Qian, Z., Cui, J., & Lu, J. (2020). On the origin and continuing evolution of SARS-CoV-2. National Science Review, 7(6), 1012–1023. 10.1093/nsr/nwaa036
  46. Ulrich, H., & Pillat, M. M. (2020). CD147 as a target for COVID-19 treatment: Suggested effects of azithromycin and stem cell engagement. Stem Cell Reviews and Reports, 16(3), 434–440. 10.1007/s12015-020-09976-7
  47. Wang, J., Xu, X., Zhou, X., Chen, P., Liang, H., Li, X., Zhong, W., & Hao, P. (2020). Molecular simulation of SARS-CoV-2 spike protein binding to pangolin ACE2 or human ACE2 natural variants reveals altered susceptibility to infection. The Journal of General Virology, 101(9), 921–924. 10.1099/jgv.0.001452
  48. Wang, Q., Zhang, Y., Wu, L., Niu, S., Song, C., Zhang, Z., Lu, G., Qiao, C., Hu, Y., Yuen, K.-Y., Wang, Q., Zhou, H., Yan, J., & Qi, J. (2020). Structural and functional basis of SARS-CoV-2 entry by using human ACE2. Cell, 181(4), 894–904.e9. 10.1016/j.cell.2020.03.045
  49. Xia, S., Lan, Q., Su, S., Wang, X., Xu, W., Liu, Z., Zhu, Y., Wang, Q., Lu, L., & Jiang, S. (2020). The role of furin cleavage site in SARS-CoV-2 spike protein-mediated membrane fusion in the presence or absence of trypsin. Signal Transduction and Targeted Therapy, 5(1), 92–93. 10.1038/s41392-020-0184-0
  50. Zheng, Z. (2020). Risk factors of critical & mortal COVID-19 cases: A systematic literature review and meta-analysis. Journal of Infection. Advance online publication. 10.1016/j.jinf.2020.04.021
  51. Zhu, N., Zhang, D., Wang, W., Li, X., Yang, B., Song, J., Zhao, X., Huang, B., Shi, W., Lu, R., Niu, P., Zhan, F., Ma, X., Wang, D., Xu, W., Wu, G., Gao, G. F., & Tan, W., China Novel Coronavirus Investigating and Research Team (2020). A novel coronavirus from patients with pneumonia in China, 2019. The New England Journal of Medicine, 382(8), 727–733. 10.1056/NEJMoa2001017

Articles from Journal of Biomolecular Structure & Dynamics are provided here courtesy of Taylor & Francis
