Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2023 Sep 6.
Published in final edited form as: J Phys Chem B. 2020 Oct 28;124(45):10034–10047. doi: 10.1021/acs.jpcb.0c05994

Critical Sequence Hot-spots for Binding of nCOV-2019 to ACE2 as Evaluated by Molecular Simulations

Mahdi Ghorbani 1,3, Bernard R Brooks 3, Jeffery B Klauda 1,2,*
PMCID: PMC7605337  NIHMSID: NIHMS1922495  PMID: 33112147

Abstract

The novel coronavirus (nCOV-2019) outbreak has put the world on edge, causing millions of cases and hundreds of thousands of deaths all around the world, as of June 2020, let alone the societal and economic impacts of the crisis. The spike protein of nCOV-2019 resides on the virion’s surface mediating coronavirus entry into host cells by binding its receptor binding domain (RBD) to the host cell surface receptor protein, angiotensin converter enzyme (ACE2). Our goal is to provide a detailed structural mechanism of how nCOV-2019 recognizes and establishes contacts with ACE2 and its difference with an earlier coronavirus SARS-COV in 2002 via extensive molecular dynamics (MD) simulations. Numerous mutations have been identified in the RBD of nCOV-2019 strains isolated from humans in different parts of the world. In this study, we investigated the effect of these mutations as well as other Ala-scanning mutations on the stability of RBD/ACE2 complex. It is found that most of the naturally occurring mutations to the RBD either slightly strengthen or have the same binding affinity to ACE2 as the wild-type nCOV-2019. This means the virus had sufficient binding affinity to its receptor at the beginning of the crisis. This also have implications for any vaccine design endeavors since these mutations could act as antibody escape mutants. Furthermore, in-silico Ala-scanning and long-timescale MD simulations, highlight the crucial role of the residues at the interface of RBD and ACE2 that may be used as potential pharmacophores for any drug development endeavors. From an evolutional perspective, this study also identifies how the virus has evolved from its predecessor SARS-COV and how it could further evolve to become even more infectious.

Graphical Abstract

graphic file with name nihms-1922495-f0001.jpg

Introduction

The novel coronavirus (nCOV-2019) outbreak emerging from China has become a global pandemic and a major threat for human public health. According to World Health Organization (WHO) as of August 28th 2020, there has been about 25 million confirmed cases and approaching 900,000 deaths due to coronavirus in the world.1, 2 Much of the human population including the United States of America were under lockdown or official stay-at-home orders to minimize the continued spread of the virus.

Coronaviruses are a family of single-stranded enveloped RNA viruses. Phylogenetic analysis of coronavirus genome has shown that nCOV-2019 belongs to the beta-coronavirus family, which also includes MERS-COV, SARS-COV and bat-SARS-related coronaviruses.3, 4 It is worth mentioning that SARS-COV, which was widespread in 2002 caused more than 8,000 cases and about 800 deaths and MERS-COV (middle east respiratory syndrome coronavirus) in 2012 also spread in more than 25 countries, causing about 2,500 cases and more than 850 deaths. (www.who.int/health-topics/coronavirus).5

In all coronaviruses, a homotrimeric spike glycoprotein on the virion’s envelope mediates coronavirus entry into host cells through a mechanism of receptor binding followed by fusion of viral and host membranes.3, 6 Coronavirus spike protein contains two functional subunits S1 and S2. The S1 subunit is responsible for binding to host cell receptor, and the S2 subunit is responsible for fusion of viral and host cell membranes.3, 7 The spike protein in nCOV-2019 exists in a metastable pre-fusion conformation that undergoes a substantial conformational rearrangement to fuse the viral membrane with the host cell membrane.7, 8 nCOV-2019 is closely related to bat coronavirus RaTG13 with about 93.1% sequence similarity in the spike protein gene. The sequence similarity of nCOV-2019 and SARS-COV is less than 80% in the spike sequence.2 S1 subunit in the spike protein includes a receptor binding domain (RBD) that recognizes and binds to the host cells receptor. The RBD of nCOV-2019 shares 72.8% sequence identity to SARS-COV RBD and the root mean squared deviation (RMSD) for the structure between the two proteins is 1.2 Å, which shows the high structural similarity.4, 8, 9 Experimental binding affinity measurements using Surface Plasmon Resonance (SPR) have shown that nCOV-2019 spike protein binds its receptor human angiotensin converter enzyme (ACE2) with 10 to 20 fold higher affinity than SARS-COV binding to ACE2.7 Based on the sequence similarity between RBD of nCOV-2019 and SARS-COV and also the tight binding between RBD of nCOV-2019 and ACE2, it is most probable that nCOV-2019 uses this receptor on human cells to gain entry into the body.3, 6, 7, 10

The spike protein and specifically the RBD in coronaviruses have been a major target for therapeutic antibodies. However, no monoclonal antibodies targeted to RBD have been able to bind efficiently and neutralize nCOV-2019.7, 11 The core of nCOV-2019 RBD is a 5-stranded antiparallel β-sheet with connected short α-helices and loops (Figure 1). Binding interface of nCOV-2019 and SARS-COV with ACE2 are very similar with less than 1.3 Å RMSD. An extended insertion inside the core containing short strands, α-helices and loops called the receptor binding motif (RBM) makes all the contacts with ACE2. In nCOV-2019 RBD, the RBM forms a concave surface with a ridge loop on one side and it binds to a convex exposed surface of ACE2. The overlay of SARS and nCOV-2019 RBD proteins are shown in Figure 1-A. The binding interface in nCOV-2019 contains loops L1 to L4 and short β-strands β5 and β6 and a short helix α5. The location of RBM in nCOV-2019 RBD as well as different helices, strands and loops is shown in Figure 1-B.

Figure 1.

Figure 1.

A) superposition of the RBD of SARS-COV (yellow) and nCOV-2019(red). B) different regions in the binding domain of nCOV-2019 defining the extended loop (non-yellow).

The sequence alignment between SARS-COV in human, SARS civet, Bat RaTG13 coronavirus and nCOV-2019 in the RBM is shown in Figure 2. There is a 50% sequence similarity between the RBM of nCOV-2019 and SARS-COV. RBM mutations played an important role in the SARS epidemic in 2002.3, 12 Two mutations in the RBM of SARS-2002 from SARS-Civet were observed from strains of these viruses. These two mutations were K479N and S487T. These two residues are close to the virus binding hotspots in ACE2 including hotspot-31 and hotspot-353. Hotspot-31 centers on the salt-bridge between K31-E35 and hotspot-353 is centered on the salt-bridge between K353-E358 on ACE2. Residues K479 and S487 in SARS-Civet are in close proximity with these hotspots and mutations at these residues caused SARS to bind ACE2 with significantly higher affinity than SARS-Civet and played a major role in civet-to-human and human-to-human transmission of SARS coronavirus in 2002.3, 13-15 Numerous mutations in the interface of SARS-COV RBD and ACE2 from different strains of SARS isolated from humans in 2002 have been identified and the effect of these mutations on binding ACE2 have been investigated by surface plasmon resonance.14, 16 Two identified RBD mutations (Y442F and L472F) increased the binding affinity of SARS-COV to ACE2 and two mutations (N479K, T487S) decreased the binding affinity. It was demonstrated that these mutations were viral adaptations to either human or civet ACE2.14, 16 A pseudotyped viral infection assay of the interaction between different spike proteins and ACE2 confirmed the correlation between high affinity mutants and their high infection.16 Further investigation of RBD residues in binding of SARS-COV and ACE2 was performed through ala-scanning mutagenesis, which resulted in identification of residues that reduce binding affinity to ACE2 upon mutation to alanine.17 RBD mutations have also been identified in MERS-COV, which affected their affinity to receptor (DPP4) on human cells.14 Multiple monoclonal antibodies have been developed for SARS since 2002 that neutralized spike glycoprotein on SARS-COV surface.18-22 However, multiple escape mutations exist in the RBD of SARS-COV that affect neutralization with antibodies, which led to the use of a cocktail of antibodies as a robust treatment.23

Figure 2.

Figure 2.

Sequence comparison of the receptor binding motif (RBM) in SARS-2002, SARS-civet, Bat RaTG13 and nCOV-2019. The mutations from SARS-2002 to nCOV-2019 are marked with blue. Important mutations in RBM are marked with yellow. Red color shows the 3-resdiue motif in SARS and Civet and 4-residue motif in RaTG13 and nCOV-2019.

Full genome analysis of nCOV-2019 in different countries and the receptor binding surveillance has shown multiple mutations in the RBD of glycosylated spike. The GISAID database24 (www.gisaid.org/) contains genomes on nCOV-2019 from researchers across the world since December 2019. Latest report by the GISAID database on June 2020 have shown 25 different variants of RBD from strains of nCOV-2019 collected from different countries along with the number of occurrences in these regions which is listed below for the seven most occurring mutations:

213x N439K (211 Scotland, England, Romania), 65x T478I in England, 30x V483A (26 USA/WA, 2 USA/UN, USA/CT, England), 10x G476S (8 USA/WA, USA/OR, Belgium), 7x S494P (3 USA/MI, England, Spain, India, Sweden), 5x V483F (4x Spain, England), 4x A475V (2 USA/AZ, USA/NY, Australia/NSW.

It is not known whether these mutations are linked to the severity of coronavirus in these regions. Starr and coworkers25 performed a deep mutational scanning of nCOV-2019 RBD and used Flow Cytometry to measure the effect of single mutations on the expression of the folded protein as well as its binding affinity to ACE2. They showed that RBD is very tolerant to these mutations to maintains its expression level as well as binding affinity to ACE2 in most cases. According to their results, most natural mutations exert similar binding affinities to ACE2 as wild-type nCOV-2019. Furthermore, they showed that mutations at critical positions at the RBD-ACE2 interface at nCOV-2019 such as residues Q493 and Q498 does not reduce the binding affinity to ACE2 which shows the substantial plasticity of the interface.25

Different groups have computationally studied the binding of nCOV-2019 RBD with ACE2.25-29 All these studies point to higher binding affinity of nCOV-2019 RBD than SARS-COV RBD to ACE2. Interestingly, the role of water-mediated interactions has been pointed out to be a driving force which is shown to be the similar for both SARS-COV and nCOV-2019 RBD.27 Spinello and coworkers30 studied the binding of nCOV-2019 and SARS-COV RBD to ACE2 and found that the former binds its receptor with 30kcalmol higher affinity than SARS-COV RBD. Gao et al.31 used free energy perturbation (FEP) and showed that most amino acid mutations at the RBM from SARS-COV to nCOV-2019 increase the affinity of RBD to ACE2. The focus of this article is to elucidate the differences between the interface of SARS-COV and nCOV-2019 with ACE2 to understand with atomic resolution the interaction mechanism and hotspot residues at the RBD/ACE2 interface using long-timescale molecular dynamics (MD) simulation. An alanine-scanning mutagenesis in the RBM of nCOV-2019 helped to identify the key residues in the interaction, which could be used as potential pharmacophores for future drug development. Furthermore, we performed molecular simulations on the seven most common mutations found from surveillance of RBD mutations N439K, T478I, V483A, G476S, S494P, V483F and A475V. From an evolutionary perspective this study shows the residues in which the virus might further evolve to be even more dangerous to human health.

Methods

Sequence comparison and mutant preparation

nCOV-2019 shares 76% sequence similarity with SARS-2002 spike protein, 73% sequence identity for RBD and 50% for the RBM.1 Bat coronavirus RaTG13 seems to be the closest relative of nCOV-2019 sharing about 93% sequence identity in the spike protein.6 The sequence alignment of SARS-2002 (accession number: AFR58742), SARS-civet (accession number: AY304486.1), Bat RaTG13 (accession number: MN996532.1) and nCOV-2019 (accession number: MN908947.1) are in Figure 2.6 To investigate the roles of critical mutations on the complex stability of nCOV-2019 with ACE2, mCSM-PPI2 Webserver32 was used to find the residues in nCOV-2019 that are at the interface with ACE2. This method uses a graph-based signature framework and predicts the effect of alanine substitution at interface residues on the binding energy of the complex. 21 different residues were identified to be in contact with ACE2 and were chosen to do further MD simulation. On the other hand, mutations are also observed in RBD domain from full genome analysis of different nCOV-2019 variants collected from different countries compiled in GISAID database.24 The mutations selected are listed in Table S1 along with their location in RBD.

Molecular dynamics simulations

The crystal structure of nCOV-2019 in complex with hACE2 (pdb id:6M0J)17 as well as SARS-COV complex with human ACE2 (pdb id: 6ACJ)33 were obtained from RCSB (www.rcsb.org).34 The RBD domain of nCOV-2019 comprises of 194 residues (333-526) and SARS-COV includes 190 residues (323-512). ACE2 protein contains 597 residues (19-615) in the complex structure for both systems. All the structures including nCOV-2019, SARS-COV and all 21 alanine substitutions of nCOV-2019 were prepared and solvated in GROMACS.35 A TIP3P water model was used for the solvent and Param99SB-ILDN AMBER force field (FF)36, 37 was used for all the complexes. A few counter ions were added to each systems to neutralize the charges on the RBD and ACE2 as the PBSA method for binding energy calculation is known to be problematic with high ionic strength. Each system contained about 260,000 atoms. It is important to note that, none of the RBD/ACE2 complexes studied here were glycosylated. The glycosylation sites on RBD are far from the binding interface and does not interfere with the binding of RBD to ACE2. Moreover, there are nine Cys residues at the RBD of nCOV-2019 and eight of them form four pairs of disulfide bonds (Cys336-Cys361, Cys379-Cys432, Cys391-Cys525, Cys480-Cys488).

5000 steps of energy minimizations were done using the steepest descent algorithm. In all steps the LINCS algorithm was used to constrain all bonds containing hydrogen atoms and a time step of 2 fs was used as the integration time step. Equilibration of all systems were performed in three steps. In the first step, 100,000 steps of simulation were performed using a velocity-rescaling thermostat to maintain the temperature at 310 K with a 0.1 ps coupling constant in NVT ensemble under periodic boundary conditions and harmonic restraints on the backbone and sidechain atoms of the complex.38 A velocity rescaling thermostat was used in all other steps of simulation. In the next step, we performed 300,000 steps in the isothermal-isobaric NPT ensemble at temperature of 310 K and pressure of 1bar using a Berendsen barostat.39 This was done through decreasing the force constant of the restraint on backbone and side chain atoms of the complex from 1000 to 100 and finally to 10kJmolnm2. Berendsen barostat was only used for the equilibration step due to usefulness in rapidly correcting density. In the next step the restraints were removed, and the systems were subjected to 1,000,000 steps of MD simulation under NPT ensemble.

In the production run, harmonic restraints were removed and all the systems were simulated using a NPT ensemble where the pressure was maintained at 1bar using the Parrinello-Rahman barostat40 with a compressibility of 4.5 × 10 bar−1 and a coupling constant of 0.5 ps. It is important to note that the Berendsen barostat was only used for the equilibration step as it was shown that this barostat can cause unrealistic temperature gradients.41 The production run lasted for 500ns for SARS-COV and nCOV-2019 complexes and 300 ns for all the mutants with a 2 fs timestep and the particle-mesh Ewald (PME)42 for long range electrostatic interactions using GROMACS 2018.3 package.43 All mutant systems were constructed as described before and ran for 300ns of production run. In addition, the simulation time for a few mutants (Y449A, T478I, Y489A and S494P) was extended to 500 ns.

Gibbs free energy and correlated motions.

The last 400ns of simulation was used to explore the dominant motions in SARS-COV, nCOV-2019 and the mutations with extended simulation, and last 200ns for all other mutants using principal component analysis (PCA) as part of the quasiharmonic analysis method.44 For this method the rotational and translational motions of RBD of all systems were eliminated by fitting to a reference (crystal) structure. Next, 4,000 snapshots from the last 400ns of SARS-COV, nCOV-2019 and mutations with extended simulation time, and 2,000 snapshots from last 200ns of all other mutant systems were taken to generate the covariance matrix between Cα atoms of RBD. In the mutant systems with production run, the last 400 ns was used for this analysis. Diagonalization of this matrix resulted in a diagonal matrix of eigenvalues and their corresponding eigenvectors.43, 45 The first eigenvector which indicate the first principal component was used to visualize the dominant global motions of all complexes through porcupine plots using the (PorcupinePlot.tcl) script in vmd.

The principal components were used to calculate and plot the approximate free energy landscape (aFEL). We refer to the free energy landscape produced by this approach to be approximate in that the ensemble with respect to the first few PC’s (lowest frequency quasiharmonic modes) is not close to convergence, but the analysis can still provide valuable information and insight, g_sham, g_covar and g_anaeig functions in GROMACS35 were used to obtain principal components and aFEL.

The dynamic cross-correlation maps (DCCM) were obtained using MD_TASK package to identify the correlated motions of RBD residues.46 In DCCM the cross-correlation matrix Cij is obtained from displacement of backbone Cα atoms in a time interval Δt. The DCCM was constructed using the last 400ns of SARS-COV and nCOV-2019 and the extended mutant systems and last 200ns of all other mutant systems with a 100xps time interval. Hydrogen bonds were analyzed in VMD where the distance cutoff was 3.2 Å and the angle cutoff between donor and acceptor was 30°.

Binding free energy from MMPBSA method.

The Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) method was applied to calculate the binding energy between RBD and ACE2 in all complexes.47, 48 for SARS-COV and nCOV-2019, 200 snapshots of the last 400 ns and for the mutant systems, 100 snapshots of the last 200 ns simulation were used for the calculation of binding free energies with an interval of 2 ns. Simulation for a few mutant systems (Y449A, T478I, Y489A and S494P) were extended to 400 ns a the binding energies were calculated for the last 400 ns to assess the convergence of free energies. The binding free energy of a ligand-receptor complex can be calculated as:

ΔGbind,aq=ΔHTΔS=ΔGcomplex[ΔGprotein+ΔGligand] (1)
ΔGbind,aq=ΔEMM+ΔGbind,solvTΔS (2)
ΔEMM=ΔEcovalent+ΔEElect+ΔEVDW (3)
ΔGbind,solv=ΔGpolar+ΔGnonpolar (4)
ΔGnonpolar=γ×SASA+b (5)

In these equations, ΔEMM, ΔGbind,solv and TΔS are calculated in the gas phase. ΔEMM is the gas phase molecular mechanical energy changes which includes covalent, electrostatic and van der Waals energies. Based on previous studies the entropy change during binding is small and neglected in these calculations.48-50 ΔGbind,solv is the solvation free energy which comprises the polar and non-polar components. The polar solvation is calculated using the MMPBSA method by setting a value of 80 and 2 for solvent and solute dielectric constants. The non-polar free energy is simply estimated from solvent accessible surface area (SASA) of the solute from equation 5.

Results

Structural dynamics

To compute the RMSD of systems, the rotational and translational movements were removed by first fitting the Cα atoms of the RBD to the crystal structure and then computing the RMSD with respect to Cα atoms of RBD in each system.

Figure 3 shows the RMSD plot in the RBD of SARS-COV, nCOV-2019 and some of its variants. Comparison of RMSD of SARS and nCOV-2019 RBD shows that SARS-COV has a larger RMSD throughout the 500 ns simulation. In nCOV-2019, RMSD is very stable with a value of about 1.5 Å whereas in SARS-COV the RMSD increases up to ~4 Å after 100 ns and then fluctuates between 3 and 4 Å. The change in RMSD of SARS is partially related to the motion in the C-terminal which is a flexible loop.

Figure 3.

Figure 3.

The Cα RMSD plots for nCOV-2019 and SARS-COV and a few nCOV-2019 mutations.

The RMSD plots for the nCOV-2019 mutants show similar behaviors to nCOV-2019-wt. In most of the variants, the RMSD is very stable during the 300 ns simulation which shows the great tolerance of the interface for mutations. However, a few mutations showed some RMSD variance. In mutation Y489A, the RMSD increases from 1.37 ± 0.21 Å to 1.88 ± 0.16 Å after 2000 ns. Mutation Y505A resulted in an increase in RMSD up to 100 ns to a value to 1.98 ± 0.20 Å and decreased afterwards. RMSD for mutation N487A shows an increasing behavior with a value of 2.10 ± 0.23 Å. Mutations N439K, V483A and V483F showed a stable RMSD of ~1.5 Å. For mutations T478I, G476S, S494P and A475V RMSD increases up to ~2 Å. These variations in the backbone RMSD show the involvement of these residues in the complex stability. RMSD plots for other mutations are shown in Figure S1.

Since the extended loop (residues 449 to 510 shown in figure 1B from α4 to α5) of the RBD makes all contacts with ACE2, the RMSD was computed by also fitting to the Cα atoms of this region (Figure S2). The extended loop in nCOV-2019 is very stable with less deviation (RMSD = 0.86 ± 0.017 Å) from the crystal structure compared to SARS-COV having an RMSD of 2.79 ± 0.05 Å. Few of the mutants show an increase in the loop RMSD during the simulation. In mutants N487A and Y449A the loop RMSD jumps to a value of about 1.95 ± 0.12 Å and 1.94 ± 0.24 Å respectively after about 200 ns of simulation. Mutants G447A and E484A show a loop RMSD values of 2.22 ± 0.03 Å and 1.96 ± 0.02 Å during the last 100 ns of simulation. Other mutants showed a stable extended loop (observed in loop RMSD) during simulation. The stability of extended loop for mutant systems confirms the high tolerance of this region of RBD.

To characterize the dynamic behavior for each amino acid in the RBD, we analyzed the root mean square fluctuation (RMSF) of all systems. The RMSF plots for nCOV-2019, SARS-COV and four other mutations are shown in Figure 4. nCOV-2019 shows less fluctuations compared to SARS-COV. L3 in nCOV-2019 corresponding to residues 476 to 487 (shown in red in Figure 4) has smaller RMSF (1.5 Å) than SARS-COV L3 residues 463 to 474. L1 in both nCOV-2019 and SARS-COV (green) has small fluctuation (less than 1.5 Å). Moreover, the C-terminal residues of SARS-COV show high fluctuation (Figure 4 shown in orange). Few mutants show higher fluctuation in the L1. Mutants Y505A and S494A had a RMSF of 2.5 Å and mutation N487A has a RMSF of about 4 Å in the L1. Mutation Y449A has a higher RMSF of about 3 Å in the L3. Mutants G496A, E484A and G447A show a high fluctuation of about 4.5 Å in the L3. RMSF of other variants are shown in Figure S3.

Figure 4).

Figure 4)

RMSF plots for nCOV-2019-wt SARS-COV, Y505A, N487A, G496A and E484A. the red shaded region shows the fluctuation in L1 and the green shaded shows the fluctuation in L3. The orange shaded region in SARS-COV shows the fluctuation in C-terminal. For comparison, RMSF of nCOV-2019-wt is shown as cyan in other plots.

Principal component analysis (PCA) and approximate free energy landscape

To identify the dominant motions in the nCOV-2019, SARS-COV, and all the mutants, PCA was performed. Most of the combined motions were captured by the first ten eigenvectors generated from the last 400 ns for SARS-COV, nCOV-2019 and extended mutant systems and the last 200 ns for other nCOV-2019 mutants. The percentage of the motions captured by the first three eigenvectors was 51% for nCOV-2019 and 68% for SARS-COV. In all mutations, more than 50% of the motions were captured by the first three eigenvectors. The first few PC’s describe the highest motions in a protein which are related to a functional motion such as binding or unbinding of protein from receptor. The first three eigenvectors were used to calculate the approximate FEL (aFEL) using the last 400 ns of simulation for nCOV-2019 and SARS-COV shown in Figure 5, which displays the variance in conformational motion. SARS-COV showed two distinct low free energy states shown as blue separated by a metastable state. There is a clear separation between the two regions by a free energy barrier of about 6-7.5 kcal/mol. These two states correspond to the loop motions in the L3 as well as the motion in C-terminal residues of SARS-COV. The L3 motion in nCOV-2019 is stabilized by H-bond between N487 on RBD and Y83 on ACE2 as well as a π-stacking interaction between F486 and Y83. It is evident that the nCOV-2019 RBD is more stable than SASR-COV RBD and exist in one conformation whereas the SARS-COV interface fluctuates and the aFEL is separated into two different regions. The first two eigenvectors were used to calculate and plot the aFEL as a function of first two principal components using the last 200 ns of the simulation for mutant systems. aFEL for other systems are shown in Figure S4.

Figure 5.

Figure 5.

Mapping of the principal components of the RBD for the aFEL from the last 400ns of simulations for SARS-COV (top row) and nCOV-2019-wt (Bottom row). The color bar is relative to the lowest free energy state.

In each system, the first eigenvector was used to construct the porcupine plots to visualize the most dominant movements (Figure S5). nCOV-2019 showed a small motion in L3 and the core region and the extended loop region is very rigid showing small cones in the porcupine plot. The core structure of the RBD remains dormant as the cones are blue in most of the regions (Figure S5-A). In SARS-COV, the C-terminal region shows large motions (Figure S5-B). Mutation N487A showed a large motion in L1 (Figure S5-A). Mutations Y449A, G447A and E484A demonstrate large motions in L3 (Figure S5). Overall, these plots show the involvement of these residues in the dynamic stability of RBD/ACE2 complex.

Dynamic cross-correlation maps (DCCM)

The correlated motions of RBD atoms were also analyzed with the DCCM based on the Cα atoms of RBD from the last 400 ns of simulation for nCOV-2019 , SARS-COV and extended mutant systems and the last 200 ns for the other mutant systems (Figure 6). DCCM for nCOV-2019 showed a correlation between residues 490-505 (containing α5, L4 and β5 regions) and residues 440-455 (containing α4, L1 and β5 regions) shown in red rectangle in Figure 6. This correlation showed the coordination of these regions for binding ACE2 effectively. Another important correlation that appears in DCCM of nCOV-2019 is between residues 473-481 with residues 482-491. These residues are in L3 and β6 regions and their correlation in nCOV-2019 is stronger than SARS-COV. This is due to the presence of β6 strand in nCOV-2019, whereas in SARS-COV these residues all belong to L3. This indicates that L3 in nCOV-2019 has evolved from SARS-COV to adopt a new secondary structure, which causes strong correlation and make the loop act as a recognition region for binding. The correlation in L3 is shown as blue rectangle in Figure 6. Some of the mutations disrupted the patterns of correlation and anti-correlation in nCOV-2019-wt. Mutation N487A showed a stronger correlation in L3 and β6 strand than the wild-type RBD. In mutation E484A, correlation in L3 is stronger than the nCOV-2019-wt. DCCM for other mutants are shown in Figure S6. It is worth mentioning that mutation F486A disrupts the DCCM of nCOV-2019 by introducing strong correlations in the core region of RBD as well as the extended loop region. Residue F486 resides in L3 and plays a crucial role in stabilizing the recognition loop by making a π-stacking interaction with residue Y83A on ACE2.

Figure 6.

Figure 6.

Dynamic cross correlation maps for nCOV-2019, SARS-COV and mutants with residue numbers of the RBD domain. Red box shows the correlation between α5, L4 and β5 regions and α4, L1 and β5. Blue box shows the correlation in L3 and β6 regions.

Binding free energies

The binding energetics between ACE2 and the RBD of SARS-COV, nCOV-2019 and all its mutant complexes were investigated by the MMPBSA method.48 The binding energy was partitioned into its individual components including: electrostatic, van der Waals, polar solvation and solvent accessible surface area (SASA) to identify important factors affecting the interface o RBD and ACE2 in all complexes. nCOV-2019 has a total binding energy of 50.22±1.93kcalmol, whereas SARS-COV has a much higher binding energy of 18.79±1.53kcalmol. Decomposition of binding energy to its components show that the most striking difference between nCOV-2019 an kca SARS-COV is the electrostatic contribution which is 746.69±2.66kcalmol for nCOV-2019 and 600.14±7.65kcalmol for SARS-COV. This high electrostatic contribution is compensated by a large polar solvation free energy which is 797.30±3.12kcalmol for nCOV-2019 and 659.61±8.98kcalmol for SARS-COV. nCOV-2019 also possess a higher van der Waals (vdw) contribution (89.93±0.46kcalmol) than SARS-COV (70.07±1.22kaclmol). Furthermore, the SASA contribution to binding for SARS-COV was 8.30±0.15kcalmol and 10.58kcalmol for nCOV-2019. Both hydrophobic and electrostatic interactions play major roles in the higher affinity of nCOV-2019 RBD than SARS-COV RBD to ACE2.

The binding free energies for nCOV-2019 and SARS-COV were decomposed into a per-residue based binding energy to find the residues that contribute strongly to the binding and are responsible for higher binding affinity of nCOV-2019 than SARS-COV (Figure 7). Most of the residues in the RBM of nCOV-2019 had more favorable contribution to total binding energy than SARS-COV. Residues Q498, Y505, N501, Q493 and K417 in nCOV-2019 RBM contributed more than 5kcalmol to binding affinity and are crucial for complex formation. A few residues such as E484 and S494 contributed unfavorably to the total binding energy. Among all the interface residues K417 had the highest contribution to the total binding energy (12.34±0.23kcalmol). The corresponding residue in SARS-COV is V404 only had a 0.02±0.01kcalmol contribution, which points to the importance of this residue for nCOV-2019 binding to ACE2. Residue Q498 contributed 6.72±0.18kcalmol and its corresponding residue in SARS-COV is a Y484 that contributed to total binding by 1.83±0.06kcalmol. Other important residues Y505 and N501 have more negative contribution to total binding energy than their counterparts in SARS-COV residues Y49 and T487 respectively (Figure 7). Residue D480 in SARS-COV contributed positively to binding energy by 6.2±0.15kcalmol and the corresponding residue in nCOV-2019 which is a S494 residue lowered this positive contribution to only 1.17±0.06kcalmol. Mutation D480A/G appeared to be a dominant mutation in SARS-COV in 2002-2003.51 This mutation was reported to escape neutralization by antibody 80R.52 To investigate the effect of this point mutation on binding of SARS-COV RBD to ACE2 we performed an additional simulation and calculated the binding affinity for this mutant in SARS-COV RBD with the same approach for other mutation in this study. D480A mutation showed a binding affinity of 23.46±3.07kcalmol which is about 5kcalmol higher than the wild-type SARS-COV RBD. In SARS-COV residue R426 had the highest contribution to total binding energy 6.27±0.22kcalmol although the corresponding residue in nCOV-2019 is N439 with a contribution of 0.32±0.02kcalmol. These important mutations on RBM of nCOV-2019 from SARS-COV caused RBD of nCOV-2019 to bind ACE2 with much stronger (about 30kcalmol) affinity.

Figure 7.

Figure 7.

Binding energy decomposition per residue for the RBM of nCOV-2019 and SARS-COV.

Binding free energy decomposition to its individual components for all mutants is represented in Table S2. In all complexes, a large positive polar solvation free energy disfavors the binding and complex formation, which is compensated by a large negative electrostatic free energy of binding. All variants had similar solvent accessible surface area energies. The vdw free energy of binding ranged from 84.68±0.68kcalmol for mutant Q493A to 103.85±0.66kcalmol for Y489A. Mutant K417A had the lowest electrostatic contribution to binding 415.67±5.07kcalmol and mutants N439K and E484A had the highest electrostatic binding contribution of 989.80±5.6kcal,ol and 941.20±3.95kcalmol respectively.

Most alanine substitutions exhibited similar or lower total binding affinities to nCOV-2019, however a few mutants had higher binding affinity than the wild type. Mutant Y489A had a total binding energy of 61.78±2.59kcalmol which was about 11kcalmol lower than wild type binding energy. Mutants G446A, G447A and T478I also demonstrated higher total binding affinities than nCOV-2019. Other alanine substitutions had similar or lower total binding energy than nCOV-2019. Mutant G502A has the lowest binding affinity among all the mutants with a binding energy of 24.31±2.98kcalmol. Mutant systems K417A, L455A, T500A and N501A are the other mutants with total binding affinities significantly lower than the wild type complex. The electrostatic component of binding contributes the most to the low binding affinities for these mutants. The contribution of RBM residues to binding with ACE2 for nCOV-2019 were mapped to the RBD structure and is shown in Figure 9-B.

Figure 9).

Figure 9)

A) H bonds between RBD of A) nCOV-2019 and B) SARS-COV. B) mapping contribution of interface residues to structure in RBD of nCOV-2019. The RBD domain is purple and the ACE2 is yellow. The RBD in contact with AC2 is rendered in a surface format with more red being a favorable contribution to binding (more negative) and blue unfavorable contribution (positive).

Most natural mutants exhibited similar binding affinities compared to wild-type nCOV-2019 with a few exceptions. Mutation T478I which is one of the most frequent mutations based on GISAID database has a binding affinity which is about 6kcalmol higher than wild-type. S494P and A475V showed a slightly lower binding affinity than the wild-type complex. Other natural mutants showed binding affinities similar to wild-type RBD. N439K demonstrated a high electrostatic energy which is compensated by large polar solvation energy and this mutant has a total binding energy of 48.27±3.07kcalmol which is similar to nCOV-2019.

Hydrogen bond, Salt-bridge and Hydrophobic contact analysis:

Important hydrogen bonds (H-bonds) and salt bridges between nCOV-2019 RBD or SARS-COV RBD and ACE2 for the last 400ns of trajectory are shown in Table 1. nCOV-2019 RBD makes 10 H-bonds/1 salt bridge with ACE2 whereas SARS-COV makes only 5 H-bonds/1 salt bridge with ACE2 with more than 30% persistence.

Table 1.

H-bonds and salt-bridges between nCOV-2019 and ACE2 and SARS-COV and ACE2 that persist for >30%. Salt bridge is shown as bold.

# nCOV-2019 ACE2 % occupancy SARS-COV ACE2 % occupancy
1 G502 K353 89 Y436 D38 96
2 Q493 E35 83 R426 E329 87
3 N487 Y83 80 T486 D355 83
4 Q498 D38 73 G488 K353 80
5 K417 D30 55 N479 K31 52
6 T500 D355 53 Y440 H34 47
7 Y505 E37 52
8 Q498 K353 49
9 Y449 D38 45
10 G496 K353 37
11 Q493 K31 32

The evolution of the coronavirus from SARS-COV to nCOV-2019 has reshaped the interfacial hydrogen bonds with ACE2. G502 in nCOV-2019 has a persistent H-bond with residue K353 on ACE2. This residue was G488 in SARS-COV, which also makes H-bond with K353 on ACE2. Q493 in nCOV-2019 makes H-bond with E35 and another H-bond with K31 on ACE2. This residue was an N479 in SARS-COV, which only makes one H-bond with K31 on ACE2. An important mutation from SARS-COV to nCOV-2019 is residue Q498, which was Y484 in SARS-COV. Q498 makes two H-bonds with residues D38 and K353 on ACE2, whereas Y484 in SARS-COV does not make any H-bonds. Importantly a salt bridge between K417 and D30 in nCOV-2019/ACE2 complex contributes to total binding energy by 12.34±0.23kcalmol. This residue is V404 in SARS-COV which is not able to make any salt-bridge and does not make H-bond with ACE2. Gao et al.27 used a free energy perturbation (FEP) approach and showed that mutation V404 to K417 lowers the binding energy of nCOV-2019 RBD to ACE2 by 2.2±0.9kcalmol. A salt bridge between R426 on RBD and E329 on ACE2 stabilizes the complex in SARS-COV/ACE2. This residue is N439 in nCOV-2019 which is unable to make salt-bridge with ACE2 residue E329. One of the most observed mutations in nCOV-2019 according to GISAID database is N439K which recovers some of the electrostatic interactions with ACE2 at this position. Y436 in SARS-COV and Y449 in nCOV-2019 both make H-bonds with D38 on ACE2. The unchanged T486 in SARS-COV corresponds to T500 in nCOV-2019, both of which make consistent H-bonds with ACE2 residue D355.

Hydrophobic interactions also play an important role in stabilizing RBD/ACE2 complex in nCOV-2019. An important interaction between nCOV-2019 RBD and ACE2 is the π-stacking interaction between F486 (RBD) and Y83 (ACE2). This interaction helps in stabilizing L3 in nCOV-2019 compared to SARS-COV where this residue is L472. It was observed by Gao et al.26 that mutation L472 to F486 in nCOV-2019 results in a net change in binding free energy of 1.2±0.2kcalmol. Other interfacial residues in nCOV-2019 RBD that participate in hydrophobic interaction with ACE2 are L455, F456, Y473, A475 and Y489. It is interesting to note that all these residues except Y489 have mutated from SARS-COV. Spinello and co-workers30 performed long-timescale (1μs) simulation of nCOV-2019/ACE2 and SARS-COV ACE2 and found that L3 in nCOV-2019 is more stable due to presence of the β6 strand and existence of two H-bonds in L3 (H-bonds G485-C488 and Q474-G476). Importantly, an amino acid insertion in L3 makes this loop longer than L3 in SARS-COV and enable it to act like a recognition loop and make more persistent H-bonds with ACE2. L455 in nCOV-2019 RBD is important for hydrophobic interaction with ACE2 and mutation L455A lowers the vdw contribution of binding affinity by about 5kcalmol. The H-bonds between RBD of nCOV-2019 and SARS-COV are shown in Figure 9-A. The structural details discussed here are in agreement with other structural studies of nCOV-2019 RBD/ACE2 complex.4, 53

H-bond analysis was also performed for the mutant systems and the results for H-bonds with more than 40% consistency are in Table S3. Few of the alanine substitutions increase the number of interfacial H-bonds between nCOV-2019 RBD and ACE2. Interestingly, the ala-substitution at Y489A increased the number of H-bonds in the wild-type complex. Mutation in some of the residues having consistent H-bonds in the wild type complex such as Q498A and Q493A, stunningly maintain the number of H-bonds in the wild-type complex. This indicates that the plasticity in the network of H-bonds in RBM of nCOV-2019 which can reshape the network and strengthen other H-bonds upon mutation in these locations. However, few mutations decrease the number of H-bonds from the wild-type complex. Alanine substitution at residue G502 has a significant effect on the network of H-bonds between nCOV-2019 and SARS-COV. This residue locates at the end of L4 loop near two other important residues Q498 and T500. This mutation breaks the H-bonds at these residues. Mutation K417A decreases the number of H-bonds to only 5 where the H-bond at residue Q498 is broken. This indicates the delicacy of H-bond from residue Q498 which can easily be broken upon ala-substitution at other residues. Furthermore, mutation N487 also decreases the number of H-bonds by breaking the H-bond at Q498.

Discussion

In this work, we preformed MD simulations to unveil the detailed molecular mechanism for receptor binding of nCOV-2019 and comparing it with SARS-COV. The role of key residues at the interface of nCOV-2019 with ACE2 were investigated by computational ala-scanning. A rigorous 500 ns MD simulation was performed for nCOV-2019, SARS-COV and few mutants (Y449, T478I, Y489A and S494P) as well as 300 ns MD simulation on each mutant. These simulations aiding in our understanding in the dynamic role of RBD/ACE2 interface residues and estimating the binding free energy of these variants, which shed light on crucial residues for the RBD/ACE2 complex stability. Moreover, numerous mutations have been identified in the RBD of different nCOV-2019 strains from all over the world not known to be critical for infection.54 The effect of these mutations on the stability of RBD/ACE2 complex were investigated to shed light on their role in the viral infection of coronavirus.

Changes in RBD structure of nCOV-2019, SARS-COV and mutants from their crystal structure were analyzed by RMSD and RMSF. nCOV-2019 showed a stable structure with a RMSD=1.5 Å, whereas SARS-COV had a larger RMSD value between 3-4 Å during the simulation. Most mutations of nCOV-2019 maintained similar stability to the wild-type. However, a few nCOV-2019 mutations resulted in larger deviations (>2 Å), i.e., Y489A, F456A, Y505A, N487A, K417A, Y473A, Y449A. We further investigated the structure of the extended loop domain (Figure 1-B) and discovered that nCOV-2019 is stable with RMSD of less than 1 Å, whereas the extended loop in SARS-COV shows an RMSD of about 3Å during simulation (figure S2). Some mutants showed high RMSD in this region. Alanine-substitution at residue N487 increased the extended loop RMSD to 2.5 Å. Other mutations that increased the extended loop RMSD (>2 Å) include Y449A, G477A and E484A. The dynamic behavior of RBD was further investigated by analyzing the RMSF of all systems. As shown in Figure 4, nCOV-2019 shows less fluctuation in L3 than SARS-COV. This is due to the presence of a 4-residue motif (GQTQ) in nCOV-2019 L3, which forces the loop to adopt a compact structure by making 2 H-bonds (G485-C488 and Q474-G476) and thereby reducing the fluctuations in the loop. Residues F486 and N487 play major roles in stabilizing the recognition loop by making π-stacking and H-bond interactions with residue Y83 on ACE2. Alanine substitution at N487 introduced large RMSF to L1. Mutation L472 to F486 in SARS-COV was shown to favor binding by 1.2±0.2kcalmol using FEP.26 In addition, this mutation was shown to be among the five mutations that produce a super affinity ACE2 binder based on SARS-COV RBD.6 Alanine mutations at residues Y449, G447 and E484 increased the motion in L3 characterized by large RMSF in this region.

Using principal component analysis, the approximate free energy landscape for nCOV-2019 and SARS-COV demonstrated that the former occupies only one low energy state whereas the latter forms two distinct low energy basins separated by a metastable state with a barrier of about 67.5kcalmol. This confirms that the level of binding for the RBD domain is weaker in SARS-COV due to the presence of two basins. Similarly, alanine-substitution for a few residues caused the free energy landscape to degenerate into separate multiple low energy regions (Figure S4). Dominant motions in the RBD are visualized in Figure S5 using the first eigenvector of the PCA. nCOV-2019 and SARS-COV did not show any strong motion in the extended loop region. Porcupine plots of alanine-mutants demonstrated that mutant N487A shows large motion in L1 region and Y449A, G447A and E484A showed large motions in L3 (Figure S5).

To better characterize the functional motions of RBD, DCCM for all systems are constructed and demonstrated in Figure 6 and Figure S6. nCOV-2019 showed large correlation between the α4-L1-β5 and α5-L4-β5 region. This correlation was stronger in SARS-COV and few mutants such as Y449A, G447A and E484A. Another important correlation in nCOV-2019 is inside L3 and β6. This correlation is stronger in nCOV-2019 than SARS-COV due to the presence of β6 which makes the loop to adopt correlated motions. Few mutants impact the correlation in this region such as N487A. Interestingly, mutant F486A which is in L3 and participates in binding by π-stacking interaction with Y83 on ACE2, disrupts the DCCM of wild-type nCOV-2019 and introduces strong correlation in the extended loop region as well as the core structure of RBD.

The details of hydrogen bond and salt-bridge pattern in nCOV-2019 and SARS-COV to ACE2 (Table 1) are key to the virus attachment to the host. nCOV-2019 residues participate in 10 H-bonds/1 salt bridge with ACE2, whereas SARS-COV only has 5 H-bonds/1 salt bridge with ACE2. This significantly contributes to 30kcalmol difference in total binding free energy of nCOV-2019 and SARS-COV. The binding energies calculated here for nCOV-2019 and SARS-COV (50.22±1.93 and 18.79±1.53kcalmol, respectively) are in good agreement with the binding energies calculated using Generalized Born method (GB) by Spinello et al.30 Moreover, the patterns of H-bonds between nCOV-2019 and ACE2 was also already characterized by other groups26, 30 which agrees with our work. An important H-bond between nCOV-2019 and ACE2 is between G502 on RBD and K353 of ACE2. G502 is in L4 region, which is populated by 5 H-bonds between RBD and ACE2. The contribution of this residue to the total binding energy is 2.03±0.04kcalmol and the Ala-substitution at G502 has the highest effect on the binding energy among all the residues by lowering the total binding affinity to 24.31±2.98kcalmol which is the lowest among all mutants. This mutation breaks the other H-bonds in L4 such as H-bonds from residues Q498 and T500. This residue is preserved and corresponds to residue G488 in SARS-COV, which also makes a H-bond with residue K353 on ACE2. Residue Q493 in nCOV-2019 participates in binding ACE2 by making two H-bonds with residues E35 and K31 on ACE2. Q493 corresponds to residue N479 in SARS-COV, which only makes one H-bond with residue Lys31 on ACE2. This caused Q493 to have more contribution to total binding than its counterpart N479. However, alanine substitution at Q493 did not affect the total binding energy and this mutant had a total binding energy similar to wild-type complex as it maintains the number H-bonds in the wild-type complex. Residues Q498 and T500 in nCOV-2019 are crucial for binding by making H-bonds with ACE2 residues D38, D355 and K353. Residue Q498 corresponds to residue Y484 in SARS-COV which does not make any H-bond in SARS-COV/ACE2 complex. Q498 contributes to binding by 6.72±0.18kcalmol which is more than the contribution of Y484 in SARS-COV (1.83±0.06kcalmol). Ala-substitution at Q498 did not show large impact on total binding energy. Residue T500 is conserved and corresponds to residue T486 which also makes a H-bond with Asp355 on ACE2. Mutation of T500 to Alanine lowers the binding affinity by about 10kcalmol. Residue N487 in nCOV-2019 locates in L3 and plays a crucial role in stabilizing the recognition loop by making H-bond with Y83 on ACE2. This residue contributes to total binding energy of nCOV-2019 by 1.52±0.06kcalmol, whereas its corresponding residue in SARS-COV does not show any contribution to binding energy (0.02±0.05kcalmol). This demonstrates that L3 in SARS-COV has evolved to be an important recognition loop in nCOV-2019, which participates in binding with ACE2. Residue K417 in nCOV-2019 has the most contribution to total binding energy (12.34±0.23kcalmol) by making a salt-bridge with residue D30 on ACE2. This residue is crucial for binding of RBD and ACE2 and alanine substitution lowers the total binding energy to 29.56±2.95kcalmol. This salt-bridge is found to be important for stability of the crystal structure of RBD/ACE2 complex in nCOV-2019.4 K417 is Val404 in SARS-COV which does not participate in binding ACE2. Another important residue in nCOV-2019 is L455 which contributes to binding by 1.86±0.03kcalmol. This residue is important for hydrophobic interaction with ACE2 and mutating it to alanine lowers the total binding affinity by about 17kcalmol. The hydrophobic residue F456 in nCOV-2019 also has a favorable contribution to binding energy and F456A lowers the binding affinity by 5kcalmol. These results are in fair agreement with experimental binding measurements with deep mutational scanning of RBD in nCOV-2019 where they used flow cytometry for different ACE2 concentrations to measure the dissociation constant KD.25 It was shown that mutations at K417, N487, T500 and G502 are detrimental for binding to ACE2 which agrees with the results here. These experiments showed that mutations at Q493 and Q498 does not impact the binding affinity of RBD to ACE2 which demonstrates the high plasticity of the network of H-bonds at the interface where upon mutation at these residues the network can reshape to form new H-bonds. Mutations at hydrophobic residues L455 and F456 are shown to reduce the binding affinity in these experiments.

Total binding energy calculation of all the variants showed that mutation Y489A has the highest binding affinity among all systems which is about 11kcalmol stronger than that of nCOV-2019 complex. This residue is located in β6, which is part of the recognition region of RBD for binding to ACE2. Removal of this bulky hydrophobic residue at the interface with ACE2 caused the extended loop to move closer to ACE2 interface and make more H-bonds with ACE2 (table S3). A high electrostatic interaction energy is the reason for higher binding energy of mutant Y489A than wild-type complex. It is interesting to note that among the 5 residues L455, F456, Y473, A475 and Y489 that make hydrophobic interactions with ACE2, Y489 is the only residue that is conserved from SARS-COV. However, the experimental binding affinity measurements using deep mutational scanning showed that mutations at this position lower the binding affinity to ACE2. Other alanine substitutions that increase the binding energy are G446A, G447A. Residues G446 and G447 reside in L1 and mutation to alanine can make L1 take a more rigid form as shown in RMSF plot in Figure S3. However, experiment showed that these mutations have similar or lower binding affinities to ACE2 than the wild-type RBD and care must be taken when interpreting these results.25 This discrepancy could be due to forcefield inaccuracy and the deficiencies in the PBSA method for the treatment of solvent in binding energy calculation. Further studies are needed to investigate whether these mutations will increase the binding affinity to ACE2. Deep mutational scanning using Flow cytometry is a qualitative method to measure the impact of a large number of mutations of protein-protein interactions and further experiments such as Surface Plasmon Resonance (SPR) or Isothermal Titration Calorimetry (ITC) which are conventional method for measuring binding affinities are needed to study the effect of these mutations in detail.

Important mutations found in naturally occurring nCOV-2019 appear to influence to some extent the binding to ACE2. Mutation T478I which is one of the most frequent mutations according to GISAID database, increases the binding affinity of nCOV-2019 to ACE2 by about 6kcalmol. Mutation N439K has the highest occurrence among all strains of coronavirus in the GISAID database which demonstrated the highest electrostatic interaction among all studied systems. This residue corresponds to R426 in SARS-COV which exerts a salt-bridge interaction with E329 on ACE2. Mutation N439K recovers some of this ACE2 interaction however, it exerts a binding affinity similar to that of wild-type RBD. Contribution of important interface residues to binding affinity was compared for mutations T478I, N439K and wild-type nCOV-2019 (Figure S7). The most striking differences between wild-type RBD and mutation T478I are residues Y449 and Q498 which have significantly higher contribution to binding than the wild type residue. Most other residues at the interface have similar binding affinities to the nCOV-2019. A higher H-bond persistence is also seen for these two residues Y449 and Q498 compared to the wild type RBD which is the reason for their higher contribution to total binding energy. Mutation N439K has a slightly lower binding affinity to ACE2 than the wild type RBD. Per residue binding energy decomposition showed that K439 in this system has a favorable contribution of 1.80±0.15kcalmol to the total binding energy which is balanced by a lower contribution of K417 which resulted in a binding affinity similar to wild-type RBD. Mutant E484A which is also one of the observed mutations based on GISAID database, demonstrates a high electrostatic interaction with ACE2. E484 contributes to binding by 3.56±0.15kcalmol whereas the corresponding residue in SARS-COV, P469 contributes to binding of SARS-COV to ACE2 by 0.27±0.01kcalmol. This residue is close to D30 on ACE2 and have electrostatic repulsion with this residue. Most natural mutants including N439K, A475V, G476S, V483A, V483F, E484A and S494P showed similar or slightly lower binding affinities to ACE2 compared to wild-type complex which agrees with experimental binding measurements.25 However, the experimental binding affinity for T478I also showed similar binding affinity to wild-type complex. This difference could be due to the use of MMPBSA approach for calculation of polar solvation and further studies are needed to study the effect of this mutation on viral infectivity of coronavirus.

Additional sequence differences between nCOV-2019 and SARS-COV influence RBD/ACE2 binding. Residue D480 in SARS-COV contributes negatively to total binding energy (6.25±0.14kcalmol) and mutating this residue to S494 in nCOV-2019 lowers this negative contribution to 1.17±0.06kcalmol. D480 in SARS-COV is located in a region of high negative charge from residues E35, E37 and D38 on ACE2. Electrostatic repulsion between D480 on SARS-COV and the acidic residues on ACE2 is the reason for highly negative contribution of this residue to binding of SARS-COV to ACE2. Mutation to S494 in this location removes this highly negative contribution. Gao and coworkers26 computed the relative free energies of binding due to mutations from the RBD-ACE2 of SARS-COV to the corresponding residues in nCOV-2019. They used a free energy perturbation approach and showed that mutation D480S in SARS-COV changed the binding free energy by 1.9±0.8kcalmol which is consistent with our study. Furthermore, we performed an additional simulation on D480A mutant in SARS-COV and found that this mutation has a binding affinity of 23.46±3.07kcalmol which is about 5kcalmol higher than the wild-type SARS-COV RBD. In addition, experimental binding affinity measurements showed that mutations of S494 to an acidic residue highly reduces the binding affinity to ACE2 which confirms the hypothesis here.

To our knowledge this is first detailed molecular simulation study on the effect of mutations on binding of nCOV-2019 to ACE2. Previous computational studies have found that nCOV-2019 binds to ACE2 with a total binding affinity which was about 30kcalmol stronger than SARS-COV and is in fair agreement with the results here.56 The critical role of interface residues and residues are computationally investigated here and in other articles and the results of all the studies indicate the importance of these residues for the stability of the complex and finding hot-spot residues for the interaction with receptor ACE2.26, 30, 55, 56 It is interesting to note the role of L3 in the stability of the RBD/ACE2 complex. The amino acid insertions in L3 for nCOV-2019 has converted as unessential part of RBD in SARS to a functional domain of the RBD. This loop participates in binding ACE2 by making H-bond as well as π-stacking interaction with ACE2, which makes this region to act as a recognition loop. Previous studies on SARS-COV have shown there is a correlation between higher binding affinity to receptor and higher infection rate by coronavirus.6, 13, 57 Higher binding affinity of nCOV-2019 for ACE2 than SARS-COV to ACE2 is suggested to be the reason for its high infection rate. Most natural mutations showed similar binding affinities to wild-type ACE2 which indicates that the virus was already effective at the beginning of the crisis for binding ACE2. A few mutations such as N489A and T478I are shown to increase the binding affinity to ACE2. However, more studies are needed to investigate the effect of these mutations in detail. Mutations of nCOV-2019 RBD that do not change the binding affinity and complex stability, could have implications for antibody design purposes since they could act as antibody escape mutants. Escape from monoclonal antibodies are observed for mutations of SARS-COV in 2002 and these mutations should be considered for any antibody design endeavors against these escape mutations.

Conclusion

In conclusion, this study unraveled key molecular traits underlying the higher affinity of nCOV-2019 for ACE2 compared to SARS-COV and unveiled critical residues for the interaction by in-silico alanine scanning mutations and binding free energy calculations. The higher affinity of nCOV-2019 to binding with ACE2 correlates with higher human-to-human transmissibility of nCOV-2019 compared to SARS-COV. Ala-scanning mutagenesis of the interface residues of nCOV-2019 RBM has shed light on the crucial interface residues and helped obtain an atomic-level understanding of the interaction between coronavirus and the receptor ACE2 on the host cell. MD simulations on RBD mutations found in strains of nCOV-2019 from different countries aiding in the understanding of how these mutations can play and importance to in viral infection with ACE2 attachment. In addition to previously reported residues, it was found that residue F486 locating in L3 plays a crucial role in dynamic stability of the complex by a π-stacking interaction with ACE2. Per-residue free energy decomposition pinpoints the critical role of residues K417, Y505, Q498, Q493 in binding ACE2. Alanine scanning of interface residues in nCOV-2019 RBD showed that alanine substitution at some residues such as G502, K417 and L455 can significantly decrease the binding affinity of the complex. Moreover, mutation T478I, which is one of the most probable mutations in RBD of nCOV-2019 is found to binding ACE2 with about 7kcalmol higher affinity than wild-type. It is also alerting that some of the alanine substitutions at residues G446, G447 and Y489 substantially increased the binding affinity that may lead to a strongly RBD attachment to ACE2 and influence the infection virulence. However, details of interaction between these mutants and ACE2 should be carefully studied using experimental techniques. On the other hand, most mutations are found not to impact the binding affinity of RBD with ACE2 in nCOV-2019 which could have implications for vaccine design endeavors as these mutations could act as antibody escape mutants. Receptor recognition is the first line of attack for coronavirus and this study gives novel insights to key structural features of interface residues for advancement of effective therapeutic strategies to stop the coronavirus pandemic.

Supplementary Material

jp-2020-059948 SI

Figure 8.

Figure 8.

Total free binding energy of SARS-COV, nCOV-2019 and mutants. Natural mutants are shown with X at the bar base.

Acknowledgement & Dedication:

This work was supported by the Intramural Research Program of the National Heart, Lung and Blood institute (NHLBI) at the National Institute of Health (NIH). The authors acknowledge the Biowulf high performance Linux cluster at the NIH for providing computational resources for this project. The authors would like to dedicate this article to the doctors and nurses who sacrificed their time, health and even their lives to fight COVID-19, particularly those in Iran and the United States. JBK would also like to dedicate this work to family friend Joe Kaplan (Silver Spring, MD) who passed away due to COVID-19 on April 22, 2020.

Footnotes

Supporting information:

Tables S1-S3, listing location of residues picked for mutation, Binding free energies of mutants, and details of H-bonds for all mutations; Figures S1-S7, showing RMSD of mutants, loop RMSD of all systems, RMSF plots for all mutants, Free energy landscapes using first two principal components, Porcupine plots for structural fluctuations, DCCM plots for mutants and binding energy decomposition for T478I and N439K mutants.

References:

  • 1.Zhu N; Zhang D; Wang W; Li X; Yang B; Song J; Zhao X; Huang B; Shi W; Lu R, A novel coronavirus from patients with pneumonia in China, 2019. New England Journal of Medicine 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Zhou P; Yang X-L; Wang X-G; Hu B; Zhang L; Zhang W; Si H-R; Zhu Y; Li B; Huang C-L, A pneumonia outbreak associated with a new coronavirus of probable bat origin, nature 2020, 579 (7798), 270–273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Li F, Structure, function, and evolution of coronavirus spike proteins. Annual review of virology 2016, 3, 237–261. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Wang Q; Zhang Y; Wu L; Niu S; Song C; Zhang Z; Lu G; Qiao C; Hu Y; Yuen K-Y, Structural and functional basis of SARS-CoV-2 entry by using human ACE2. Cell 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Chan PK; To W-K; Ng K-C; Lam RK; Ng T-K; Chan RC; Wu A; Yu W-C; Lee N; Hui DS, Laboratory diagnosis of SARS. Emerging infectious diseases 2004, 10 (5), 825. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Wan Y; Shang J; Graham R; Baric RS; Li F, Receptor recognition by the novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS coronavirus. Journal of virology 2020, 94 (7). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Wrapp D; Wang N; Corbett KS; Goldsmith JA; Hsieh C-L; Abiona O; Graham BS; McLellan JS, Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 2020, 367 (6483), 1260–1263. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Bosch BJ; Van der Zee R; DeHaan CA; Rottier PJ, The coronavirus spike protein is a class I virus fusion protein: structural and functional characterization of the fusion core complex. Journal of virology 2003, 77 (16), 8801–8811. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Batool M; Shah M; Patra MC; Yesudhas D; Choi S, Structural insights into the Middle East respiratory syndrome coronavirus 4a protein and its dsRNA binding mechanism. Scientific reports 2017, 7 (1), 1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Walls AC; Park Y-J; Tortorici MA; Wall A; McGuire AT; Veesler D, Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Tian X; Li C; Huang A; Xia S; Lu S; Shi Z; Lu L; Jiang S; Yang Z; Wu Y, Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirus-specific human monoclonal antibody. Emerging microbes & infections 2020, 9 (1), 382–385. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Li F, Receptor recognition mechanisms of coronaviruses: a decade of structural studies. Journal of virology 2015, 89 (4), 1954–1964. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Li W; Greenough TC; Moore MJ; Vasilieva N; Somasundaran M; Sullivan JL; Farzan M; Choe H, Efficient replication of severe acute respiratory syndrome coronavirus in mouse cells is limited by murine angiotensin-converting enzyme 2. Journal of virology 2004, 75 (20), 11429–11433. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Li F, Structural analysis of major species barriers between humans and palm civets for severe acute respiratory syndrome coronavirus infections. Journal of virology 2008, 82 (14), 6984–6991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Li F, Receptor recognition and cross-species infections of SARS coronavirus. Antiviral research 2013, 100 (1), 246–254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Wu K; Peng G; Wilken M; Geraghty RJ; Li F, Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus. Journal of Biological Chemistry 2012, 287 (12), 8904–8911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Chakraborti S; Prabakaran P; Xiao X; Dimitrov DS, The SARS coronavirus S glycoprotein receptor binding domain: fine mapping and functional characterization. Virology journal 2005, 2 (1), 73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Greenough TC; Babcock GJ; Roberts A; Hernandez HJ; Thomas WD Jr; Coccia JA; Graziano RF; Srinivasan M; Lowy I; Finberg RW, Development and characterization of a severe acute respiratory syndrome—associated coronavirus—neutralizing human monoclonal antibody that provides effective immunoprophylaxis in mice. The Journal of infectious diseases 2005, 191 (4), 507–514. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Coughlin MM; Prabhakar BS, Neutralizing human monoclonal antibodies to severe acute respiratory syndrome coronavirus: target, mechanism of action, and therapeutic potential. Reviews in medical virology 2012, 22 (1), 2–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Ter Meulen J; Van Den Brink EN; Poon LL; Marissen WE; Leung CS; Cox F; Cheung CY; Bakker AQ; Bogaards JA; Van Deventer E, Human monoclonal antibody combination against SARS coronavirus: synergy and coverage of escape mutants. PLoS Med 2006, 3 (7), e237. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Deming D; Sheahan T; Heise M; Yount B; Davis N; Sims A; Suthar M; Harkema J; Whitmore A; Pickles R, Vaccine efficacy in senescent mice challenged with recombinant SARS-CoV bearing epidemic and zoonotic spike variants. PLoS med 2006, 3 (12), e525. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Rockx B; Corti D; Donaldson E; Sheahan T; Stadler K; Lanzavecchia A; Baric R, Structural basis for potent cross-neutralizing human monoclonal antibody protection against lethal human and zoonotic severe acute respiratory syndrome coronavirus challenge. Journal of virology 2008, 82 (7), 3220–3235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Rockx B; Donaldson E; Frieman M; Sheahan T; Corti D; Lanzavecchia A; Baric RS, Escape from human monoclonal antibody neutralization affects in vitro and in vivo fitness of severe acute respiratory syndrome coronavirus. The Journal of infectious diseases 2010, 201 (6), 946–955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Elbe S; Buckland-Merrett G, Data, disease and diplomacy: GISAID's innovative contribution to global health. Global Challenges 2017, 1 (1), 33–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Starr TN; Greaney AJ; Hilton SK; Crawford KH; Navarro MJ; Bowen JE; Tortorici MA; Walls AC; Veesler D; Bloom JD, Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding. BioRxiv 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Wang Y; Liu M; Gao J, Enhanced receptor binding of SARS-CoV-2 through networks of hydrogen-bonding and hydrophobic interactions. Proceedings of the National Academy of Sciences 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Malik A; Prahlad D; Kulkarni N; Kayal A, Interfacial Water Molecules Make RBD of SPIKE Protein and Human ACE2 to Stick Together. bioRxiv 2020. [Google Scholar]
  • 28.Boorla VS; Chowdhury R; Maranas CD, De novo design of high-affinity antibody variable regions (scFv) against the SARS-CoV-2 spike protein. BioRxiv 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Sharma S; Singh I; Haider S; Malik MZ; Ponnusamy K; Rai E, ACE2 Homodimerization, Human Genomic variants and Interaction of Host Proteins Explain High Population Specific Differences in Outcomes of COVID19. BioRxiv 2020. [Google Scholar]
  • 30.Spinello A; Saltalamacchia A; Magistrato A, Is the Rigidity of SARS-CoV-2 Spike Receptor-Binding Motif the Hallmark for Its Enhanced Infectivity? Insights from All-Atoms Simulations. The Journal of Physical Chemistry Letters 2020. [DOI] [PubMed] [Google Scholar]
  • 31.Shang J; Ye G; Shi K; Wan Y; Luo C; Aihara H; Geng Q; Auerbach A; Li F, Structural basis for receptor recognition by the novel coronavirus from Wuhan. 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Rodrigues CH; Myung Y; Pires DE; Ascher DB, mCSM-PPI2: predicting the effects of mutations on protein–protein interactions. Nucleic acids research 2019, 47 (W1), W338–W344. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Lan J; Ge J; Yu J; Shan S; Zhou H; Fan S; Zhang Q; Shi X; Wang Q; Zhang L, Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature 2020, 581 (7807), 215–220. [DOI] [PubMed] [Google Scholar]
  • 34.Berman HM; Westbrook J; Feng Z; Gilliland G; Bhat TN; Weissig H; Shindyalov IN; Bourne PE, The protein data bank. Nucleic acids research 2000, 28 (1), 235–242. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Abraham MJ; Murtola T; Schulz R; Páll S; Smith JC; Hess B; Lindahl E, GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1, 19–25. [Google Scholar]
  • 36.Lindorff-Larsen K; Piana S; Palmo K; Maragakis P; Klepeis JL; Dror RO; Shaw DE, Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins: Structure, Function, and Bioinformatics 2010, 78 (8), 1950–1958. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Best RB; Zhu X; Shim J; Lopes PE; Mittal J; Feig M; MacKerell AD Jr, Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone ϕ, ψ and side-chain χ1 and χ2 dihedral angles. Journal of chemical theory and computation 2012, 8 (9), 3257–3273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Bussi G; Donadio D; Parrinello M, Canonical sampling through velocity rescaling. The Journal of chemical physics 2007, 126 (1), 014101. [DOI] [PubMed] [Google Scholar]
  • 39.Van Der Spoel D; Lindahl E; Hess B; Groenhof G; Mark AE; Berendsen HJ, GROMACS: fast, flexible, and free. Journal of computational chemistry 2005, 26 (16), 1701–1718. [DOI] [PubMed] [Google Scholar]
  • 40.Parrinello M; Rahman A, Polymorphic transitions in single crystals: A new molecular dynamics method. Journal of Applied physics 1981, 52 (12), 7182–7190. [Google Scholar]
  • 41.Feller SE; Zhang Y; Pastor RW; Brooks BR, Constant pressure molecular dynamics simulation: the Langevin piston method. The Journal of chemical physics 1995, 103 (11), 4613–4621. [Google Scholar]
  • 42.Darden T; York D; Pedersen L, Particle mesh Ewald: An N· log (N) method for Ewald sums in large systems. The Journal of chemical physics 1993, 98 (12), 10089–10092. [Google Scholar]
  • 43.Lindahl E; Hess B; Van Der Spoel D, GROMACS 3.0: a package for molecular simulation and trajectory analysis. Molecular modeling annual 2001, 7 (8), 306–317. [Google Scholar]
  • 44.Brooks BR; Janežič D; Karplus M, Harmonic analysis of large systems. I. Methodology. Journal of computational chemistry 1995, 16 (12), 1522–1542. [Google Scholar]
  • 45.David CC; Jacobs DJ, Principal component analysis: a method for determining the essential dynamics of proteins. In Protein dynamics, Springer: 2014; pp 193–226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Brown DK; Penkler DL; Sheik Amamuddy O; Ross C; Atilgan AR; Atilgan C; Tastan Bishop Ö, MD-TASK: a software suite for analyzing molecular dynamics trajectories. Bioinformatics 2017, 33 (17), 2768–2771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Ren J; Yuan X; Li J; Lin S; Yang B; Chen C; Zhao J; Zheng W; Liao H; Yang Z, Assessing the performance of the g_mmpbsa tools to simulate the inhibition of oseltamivir to influenza virus neuraminidase by molecular mechanics Poisson–Boltzmann surface area methods. Journal of the Chinese Chemical Society 2020, 67 (1), 46–53. [Google Scholar]
  • 48.Wang C; Greene DA; Xiao L; Qi R; Luo R, Recent developments and applications of the MMPBSA method. Frontiers in molecular biosciences 2018, 4, 87. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Yang T; Wu JC; Yan C; Wang Y; Luo R; Gonzales MB; Dalby KN; Ren P, Virtual screening using molecular simulations. Proteins: Structure, Function, and Bioinformatics 2011, 79 (6), 1940–1951. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Guan S.-s.; Han W.-w.; Zhang H; Wang S; Shan Y.-m., Insight into the interactive residues between two domains of human somatic Angiotensin-converting enzyme and Angiotensin II by MM-PBSA calculation and steered molecular dynamics simulation. Journal of Biomolecular Structure and Dynamics 2016, 34 (1), 15–28. [DOI] [PubMed] [Google Scholar]
  • 51.Korber B; Fischer WM; Gnanakaran S; Yoon H; Theiler J; Abfalterer W; Hengartner N; Giorgi EE; Bhattacharya T; Foley B, Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus. Cell 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Sui J; Aird DR; Tamin A; Murakami A; Yan M; Yammanuru A; Jing H; Kan B; Liu X; Zhu Q, Broadening of neutralization activity to directly block a dominant antibody-driven SARS-coronavirus evolution pathway. PLoS Pathog 2008, 4 (11), e1000197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Tai W; He L; Zhang X; Pu J; Voronin D; Jiang S; Zhou Y; Du L, Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine. Cellular & molecular immunology 2020, 17 (6), 613–620. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Forster P; Forster L; Renfrew C; Forster M, Phylogenetic network analysis of SARS-CoV-2 genomes. Proceedings of the National Academy of Sciences 2020, 117 (17), 9241–9243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Brielle ES; Schneidman-Duhovny D; Linial M, The SARS-CoV-2 exerts a distinctive strategy for interacting with the ACE2 human receptor. Viruses 2020, 12 (5), 497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Liu H; Lupala C; Li X; Lei J; Chen H; Qi J; Su X, Computational simulations reveal the binding dynamics between human ACE2 and the receptor binding domain of SARS-CoV-2 spike protein. bioRxiv 2020. [Google Scholar]
  • 57.Li Y; Wang Y; Qiu Y; Gong Z; Deng L; Pan M; Yang H; Xu J; Yang L; Li J, SARS-CoV-2 Spike Glycoprotein Receptor Binding Domain is Subject to Negative Selection with Predicted Positive Selection Mutations. bioRxiv 2020. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

jp-2020-059948 SI

RESOURCES