Abstract
The development of reliable ways of predicting the binding free energies of covalent inhibitors is a challenge for computer-aided drug design. Such development is important, for example, in the fight against the SARS-CoV-2 virus, where covalent inhibitors can provide a promising tool for blocking Mpro, the main protease of the virus. This work develops a reliable and practical protocol for evaluating the binding free energy of covalent inhibitors. Our protocol presents a major advance over other approaches that do not consider the chemical contribution to the binding free energy. We have combined the empirical valence bond (EVB) method for evaluating the chemical profiles and the PDLD/S-LRA/β method for evaluating the non-covalent part of the binding process. This protocol has been used to calculate the binding free energy of an alpha-ketoamide inhibitor of Mpro. Encouragingly, our approach reproduces the observed binding free energy. Our study of covalent inhibitors of cysteine proteases has indicated that, in choosing an effective warhead, it is crucial to focus on the exothermicity of the point on the free energy surface of peptide cleavage that connects the acylation and deacylation steps. Overall, we believe that our approach should provide a powerful and effective way for in silico design of covalent drugs.
1. Introduction
Since December 2019, the whole world has been facing a highly contagious pulmonary disease named coronavirus disease 2019 (COVID-19).1 Almost 41 million people in the world have been infected by this virus so far. The first case of this global pandemic was reported in the city of Wuhan, China.2 A coronavirus strain named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)3 is responsible for this global pandemic. So far, no vaccine or antiviral drug has been approved to prevent the spread of SARS-CoV-2. Many proteins of SARS-CoV-2 have been targeted to design new drugs or to repurpose known drugs,4 and the main protease of SARS-CoV-2 (SARS-CoV-2 Mpro, also called 3CLpro)5 is one of those. The SARS-CoV-2 Mpro is a cysteine protease (CP) that takes part in the viral replication process. This protein cleaves the polyproteins pp1a and pp1ab (translated from the viral RNA) at 16 different positions to generate important structural (spike, envelope, membrane and nucleocapsid proteins) as well as non-structural proteins (NSPs).6 Thus, hindering the normal action of Mpro can stop the spread of SARS-CoV-2. The SARS-CoV-2 Mpro has a unique recognition sequence (Leu-Gln↓(Ser, Ala, Gly)), and the cleavage site (denoted by ↓) is between the Gln and the next small amino acid (Ser, Ala, Gly).7 No human protease has this cleavage specificity and, as a result, inhibitors of SARS-CoV-2 Mpro are less likely to be toxic. This makes Mpro an excellent drug target. Several crystal structures7–10 of the inhibitor-bound SARS-CoV-2 Mpro have been solved recently, which have helped immensely in identifying important protein residues near the inhibitors. Most of the published crystal structures contain covalent inhibitors.
Generally, covalent inhibitors are more potent than their non-covalent analogues, since they form covalent bonds with the proteins. In fact, there are many examples of covalent inhibitors, particularly for protease enzymes.11 For example, very recently a few potential broad-spectrum covalent inhibitors against alphacoronavirus, betacoronavirus and enterovirus were reported.12 These inhibitors bind specifically to the main proteases of those viruses. Unfortunately, most of these designs of covalent inhibitors are based solely on experimental studies, and computational research is yet to play a significant role in them. Reasonably accurate computational methods13–14 are available for obtaining relative binding free energies of non-covalent inhibitors, but the main hurdle in developing computational approaches for designing covalent inhibitors is the simulation of the formation of the covalent bond. Unlike that of non-covalent inhibitors, the binding process of a covalent inhibitor depends not only on correct structural complementarity between the protein and the inhibitor, but also on the appropriate chemical reactivity of the inhibitor and on the protein environment that stabilizes the covalent complex. Thus, designing good covalent inhibitors requires an understanding of the energy contributions of the different steps in covalent complex formation, which include both the non-covalent binding free energy and the reaction free energies. In the past few years several interesting computational studies were reported,15–18 where Free Energy Perturbation (FEP) based alchemical transformations were applied to calculate the relative binding free energies of various covalent inhibitors. While in most of these works both the non-covalent and covalent states were considered, the authors of ref 17 used only the covalent state in their calculations. As pointed out in ref. 19, the choice of considering just the covalent state is reasonable only when the contribution of the covalent state to the total binding free energy is more negative than that of the non-covalent state by at least 5.5 kcal/mol. Unfortunately, it is not possible to know a priori what the contributions from the covalent and non-covalent states are. Thus, even for relative covalent binding free energy calculations, both contributions should be calculated. The situation gets further complicated when one tries to calculate absolute covalent binding free energies, because in this case the possibility of error cancelation (as we might expect in relative free energy calculations) is negligible. One might still be able to calculate the absolute binding free energies by considering one inhibitor as a reference and following the thermodynamic cycle used in ref 19 to avoid expensive quantum mechanics (QM) based calculations. On the other hand, a complete mechanistic understanding of the covalent inhibition process is not possible from such an analysis. In fact, sometimes the inhibition is correlated with the transition state binding energy. Furthermore, comparison of different covalent warheads requires explicit evaluation of the chemical bonding process. Thus, it seems to us that there is a need for a protocol that is computationally less expensive but accurate enough (both in terms of convergence and reliability of the quantum mechanics/molecular mechanics (QM/MM) approach) in calculating both non-covalent binding free energies and reaction free energies.
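To illustrate why a gap of this size makes the non-covalent state negligible, the following minimal sketch combines the two bound states with a generic Boltzmann-weighted (two-state) expression. This is an assumption made for illustration and is not claimed to be the formulation used in ref 19; the function names and the choice of T = 300 K are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def combined_binding_free_energy(dg_noncov, dg_cov, temperature=300.0):
    """Free energy of a bound ensemble made of a non-covalent and a covalent state.

    Assumption: the two bound states are combined by a simple Boltzmann sum.
    """
    rt = R_KCAL * temperature
    lowest = min(dg_noncov, dg_cov)
    # Shift by the lowest value before exponentiating for numerical stability.
    weights = (math.exp(-(dg_noncov - lowest) / rt)
               + math.exp(-(dg_cov - lowest) / rt))
    return lowest - rt * math.log(weights)


def error_from_ignoring_noncovalent(gap, temperature=300.0):
    """Error (kcal/mol) of using only the covalent state.

    `gap` is how much more negative the covalent state is than the
    non-covalent state (in kcal/mol).
    """
    covalent_only = 0.0
    both_states = combined_binding_free_energy(gap, 0.0, temperature)
    return covalent_only - both_states


# Error when the covalent state is 5.5 kcal/mol more negative,
# and when both states contribute equally.
error_large_gap = error_from_ignoring_noncovalent(5.5)
error_no_gap = error_from_ignoring_noncovalent(0.0)
```

Under this assumption, the error for a 5.5 kcal/mol gap is far below typical simulation uncertainties, whereas for equal contributions it reaches RT ln 2.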
We have previously combined the semi-microscopic version of the Protein Dipole Langevin Dipole method in the linear response approximation, with a scaled non-electrostatic term (PDLD/S-LRA and PDLD/S-LRA/β),20 and the Empirical Valence Bond (EVB)21 method to investigate the drug resistance of Hepatitis C virus (HCV).22 In that project the PDLD/S-LRA/β method was used to calculate the non-covalent binding free energies of the substrate and inhibitors, whereas EVB was employed to investigate the peptide hydrolysis mechanism. Thus, it is likely that we can also combine these methods to calculate absolute covalent binding free energies of inhibitor-protein complexes.
Considering the current pandemic situation, it is also very timely to try to use computer simulations to get a detailed atomistic understanding of inhibitor binding to SARS-CoV-2 Mpro. Such simulations can accelerate the design of new drugs to combat COVID-19 or other similar viruses that may come into existence in the future.
In this paper, we have used the case of an alpha-ketoamide inhibitor (a reversible covalent inhibitor of Mpro) to explore the mechanism of inhibition of Mpro. Recently, the mechanism of proteolysis of the wild-type substrate (polypeptide chain) by SARS-CoV-2 Mpro has been explored computationally.23–24 These studies are very informative for designing new drugs, but understanding the mechanism of inhibition of SARS-CoV-2 Mpro by a covalent inhibitor would surely help more in the inhibitor design process.
The main protease of SARS-CoV-2 contains a cysteine (Cys145) and a histidine (His41) residue as a catalytic dyad (numbering is based on protomer chain A of PDB 6Y2G7). There are some proposals that the dyads of cysteine protease systems are already in the thiolate-imidazolium ion-pair form in the apo form25 of the proteins, and thus upon substrate/inhibitor binding, the activated Cys can attack the most electrophilic center of the substrate/inhibitor. An alternative proposal is that the Cys-His catalytic dyad remains in the neutral form in the apo form of the protein and, upon substrate/inhibitor binding, the histidine residue activates the cysteine by abstracting the proton from its Sγ atom.26–27 The activated cysteine then attacks the substrate/inhibitor. In ref 23 a third mechanistic proposal was reported, where after the substrate/inhibitor binding the proton transfer can occur concomitantly with the subsequent nucleophilic attack step.23 All of the above-mentioned mechanistic options were considered in this work to investigate the inhibition process and are depicted in Figure 1. In schemes A and B, the nucleophilic attack (NA) occurs after the first proton transfer step (PT1) is completed, whereas those two steps occur in a concerted manner in scheme C. The last proton transfer step is the same for all the schemes. The difference between schemes A and B is in the ordering of the PT1 step and the inhibitor binding process (see Figure 1). To calculate the free energy change for the different steps in Figure 1, the PDLD/S-LRA/β or EVB method was used (depending upon the step). Furthermore, our calculated results have been compared to the experimental results, and the most exothermic step in the entire binding process was also identified. This result can help in designing new covalent inhibitors for SARS-CoV-2 Mpro, especially by focusing on increasing the exothermicity of that step.
Figure 1.
Various mechanistic schemes for the formation of a covalent complex between a peptidomimetic inhibitor and a cysteine protease. The inhibitor is represented in the generic form RR’CO, where R and R’ can be any aliphatic or aromatic chain. The reacting molecular fragments are represented in square boxes. The green, red and blue arrows represent schemes A, B and C, respectively. The differences among schemes A, B and C are discussed in detail in the main text.
2. Computational Methods
2.1. Preparation of the Simulation System
The catalytically active form of Mpro is a homodimer (Figure 2A). Therefore, a homodimer of the main protease (PDB 6Y2G)7 was used in all of our free energy calculations. Chain A of the homodimer was used as the main simulation system. The inhibitor in PDB 6Y2G (named 13b in ref. 7) (see Figure 2B) was used as the alpha-ketoamide inhibitor in our study. The covalent bond between the inhibitor and the sulfhydryl group of Cys145 was removed before starting any simulation. All the non-protein atoms except the inhibitor were removed from the PDB file. The protein was solvated using the simulation software MOLARIS-XG28 to generate a water sphere based on the surface-constrained all-atom solvent (SCAAS) model.29 The simulation system was then energy minimized to remove any bad contacts in the protein system while keeping the coordinates of the inhibitor frozen. The partial charges of the inhibitor were calculated at the B3LYP/6-31+G** level of theory with the Gaussian 09 software.30 The charges obtained from the QM calculations were fitted using the restrained electrostatic potential (RESP)31 procedure to obtain the RESP-based charges. In all of the simulations the polarizable force field ENZYMIX32 was used unless otherwise mentioned.
Figure 2.
Structures of dimeric SARS-CoV-2 Mpro and its α-ketoamide inhibitor 13b. (A) Protomers A and B are shown in orange and blue ribbon representations, respectively. The inhibitor and catalytic residues in protomer A are shown in purple and green stick representations. (B) The most electrophilic center of 13b is highlighted with a black arrow.
2.2. Empirical Valence Bond Simulations
In the EVB simulations, the reacting atoms are defined as “region I” atoms, while the rest of the protein atoms (up to a user-defined radius) are defined as “region II”. In the EVB simulations in which a few atoms of the inhibitor were included in “region I”, the geometric center of the inhibitor was taken as the center of the simulation system. On the other hand, in the simulations where the atoms of the inhibitor were not part of “region I”, the geometric center of all the protein atoms included in “region I” was used as the center. The protein-inhibitor system was immersed in a 22 Å sphere of water molecules. The water molecules at the boundary of the solvent sphere were subjected to polarization and radial restraints by using the surface-constrained all-atom solvent (SCAAS) model. This simulation sphere was further surrounded by a 2 Å spherical shell of Langevin dipoles and then a bulk continuum. The long-range electrostatic effect was treated using the local reaction field (LRF) method.33 All protein atoms beyond the 22 Å sphere were fixed at their initial positions as in the crystal structure. The electrostatic interaction from outside of the sphere was not included in the energy calculations. The partial charges of all “region I” atoms were calculated at the B3LYP/6-31+G** level of theory with the Gaussian 09 software; the partial charges and all other EVB parameters are provided in the supporting information.
Before calculating the free energy surface (FES) using the EVB approach, the simulation system was equilibrated thoroughly. The temperature of the simulation system was increased slowly from 1 to 300 K in increments of 40 K (except from 280 to 300 K, where the increment was 20 K) over a total simulation time of 200 ps. During heating, the EVB “region I” atoms were held with a constraint of 50 kcal/mol. This constraint was then released in six steps to 0.3 kcal/mol over another 100 ps. Finally, three different starting geometries were generated from an equilibration simulation of 300 ps. It is worth mentioning that the equilibration of the system is considerably faster with spherical boundary conditions. Additionally, we observed minimal fluctuation of the root mean square deviation (RMSD) of the positions of the heavy atoms in “region II” during the last 300 ps of the equilibration simulation. In the EVB approach, the free energy surface is calculated using the Free Energy Perturbation/Umbrella Sampling (FEP/US) method, and for that we ran simulations of 51 frames, with each frame simulated for 10 ps. For each reaction, simulations of the reference reaction (a reaction in water without any protein atoms from “region II”) were also performed to obtain the necessary EVB parameters (gas phase shift and coupling constants). The definition of these parameters can be found in the supporting information. The energies reported in this work were obtained by averaging three independent EVB simulations.
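The FEP/US mapping described above can be sketched as follows. This is a minimal illustration, assuming a two-state EVB Hamiltonian and evenly spaced mapping parameters; the frame spacing is not stated in the text, and the function names are hypothetical rather than part of MOLARIS-XG.

```python
import math


def lambda_schedule(n_frames=51):
    """Mapping parameters from 0 (reactant state) to 1 (product state).

    Assumption: the frames are evenly spaced.
    """
    return [i / (n_frames - 1) for i in range(n_frames)]


def mapping_potential(e1, e2, lam):
    """FEP mapping potential: linear mix of the two diabatic energies."""
    return (1.0 - lam) * e1 + lam * e2


def evb_ground_state(e1, e2, h12):
    """Lowest eigenvalue of the 2x2 EVB Hamiltonian [[e1, h12], [h12, e2]]."""
    return 0.5 * (e1 + e2) - 0.5 * math.sqrt((e1 - e2) ** 2 + 4.0 * h12 ** 2)


frames = lambda_schedule(51)
# 51 frames, each simulated for 10 ps, per independent run.
total_sampling_ps = len(frames) * 10
```

In such a scheme the system is propagated on the mapping potential for each frame, and the ground-state surface is then recovered by umbrella sampling; the gas phase shift (added to one diabatic energy) and the coupling `h12` are the parameters fitted on the reference reaction in water.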
2.3. PDLD/S-LRA/β Simulations
The simulation setup for the non-covalent binding free energy calculations was the same as that used in the EVB simulations, except that in the PDLD/S-LRA/β simulations the whole inhibitor was included in “region I” and all atoms (including atoms in “region I”) were represented with the ENZYMIX force field. Five different simulations (starting from different initial structures) were performed for each non-covalent binding free energy calculation. The presented results are averages over those simulations. In every PDLD/S-LRA/β simulation, the linear response approximation (LRA) calculation was performed on 10 different protein configurations. A brief description of the PDLD/S-LRA/β method can be found in the supporting information.
3. Results and Discussion
A generic binding process for a covalent inhibitor can be described formally as a two-step process, where the first step is the formation of a non-covalent protein-inhibitor complex and the second step is the chemical reaction that forms the covalent protein-inhibitor complex (see Figure 3). As we can see from Figure 1, the chemical reaction can consist of multiple steps. Thus, the absolute binding free energy of a covalent complex can be a combination of many free energy terms. In order to compare the calculated covalent binding free energies with the corresponding experimental results, we need reliable calculations of each of those steps. In the following sections we discuss the calculations of the free energy changes for the different steps, and finally we compare our calculated results with the experimental results for inhibitor 13b.
Figure 3.
A general free energy profile that explains the formation of a covalent protein (P) inhibitor (I) complex. P···I, [P---I]‡ and P---I denote the non-covalent, activated complex(s) during chemical reaction(s) and covalent binding state, respectively. ΔGnon-cov, ΔGchem and ΔGcov represent the non-covalent (P···I) binding energy, reaction free energy and total covalent (P---I) binding free energy, respectively.
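The decomposition implied by this profile can be written compactly as follows. The per-step index i is notation introduced here for illustration: it runs over the chemical steps of a given scheme in Figure 1 (PT1, NA and PT2 for the stepwise schemes; the concerted PT1/NA step and PT2 for scheme C).

```latex
\Delta G_{\text{cov}} = \Delta G_{\text{non-cov}} + \Delta G_{\text{chem}},
\qquad
\Delta G_{\text{chem}} = \sum_{i} \Delta G_{i}
```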
3.1. Non-covalent Binding Free Energy (ΔGnon-cov) Calculations
Mpro with different ionization combinations was used to calculate the ΔGnon-cov of two types of non-covalent complexes. In one case, we ionized all ionizable residues (within 20 Å of the geometric center of the inhibitor) as well as His41 (doubly protonated) and Cys145 (negatively charged), whereas in the other case the His41-Cys145 pair was kept in its neutral form. We used the Monte Carlo Proton Transfer (MCPT)34 method to assess the most probable ionization states of the titratable residues. The neutral form corresponds to the situation where the inhibitor binding occurs before the proton transfer between Cys145 and His41. In this way we have tested the hypothesis related to the ordering of the first proton transfer and inhibitor binding (schemes A and B in Figure 1). The calculated ΔGnon-cov values are presented in Table 1. Interestingly, for inhibitor 13b we obtained similar binding free energies in both cases. These results suggest that the ionization of Cys145 and His41 is probably not important for the initial formation of the non-covalent inhibitor-protein complex. Crystal structures of the dimeric complexes of mutated (C145A35–36 or H41A37) Mpro of SARS-CoV with bound C-terminal/N-terminal autocleavage substrates can be found in the literature. A bound substrate in those mutated (C145A or H41A) proteins indicates that the electrostatic influence of Cys145 and His41 might be nominal during the non-covalent complex formation. On the other hand, a more surprising finding is such low values of ΔGnon-cov for a large alpha-ketoamide inhibitor. Here we would like to mention that the structures used to represent the non-covalent states were derived from the coordinates of the covalent hemiketal form of the inhibitor. It is possible that in those structures the protein preorganization (which mostly serves to stabilize the transition state of the covalent bond formation) is not conducive enough to stabilizing the non-covalent states. In other words, there is a possibility that the inhibitor binds in a slightly different mode than the one that has been simulated. At any rate, if we consider the chemical reaction step(s) along with this step, we can understand how strongly the exothermicity of the reactions drives the covalent complex formation despite this small contribution.
Table 1.
Calculated binding free energies (ΔGnon-cov) for the non-covalent SARS-CoV-2 Mpro-13b complexes.
System | ΔGnon-cov (kcal/mol) |
---|---|
13b (before proton transfer) | −3.5 ± 0.5 |
13b (after proton transfer) | −3.6 ± 0.5 |
3.2. Calculating the Reaction Free Energy of the First Proton Transfer (PT1) Step
The investigation of the mechanism of covalent bond formation was started by evaluating the reaction free energy of the PT1 step. The PT1 reaction was simulated in both the apo and holo forms of the protein. It should be noted that the calculations for the respective PT1 reactions in water (reference reactions) were also performed in the presence or absence of the inhibitor. The reaction free energy in water (in the presence of the inhibitor) was taken from ref. 38 to compare it with our calculated value in water. In this case, it was assumed that the effect of the slightly different warheads of the inhibitors (reported in ref. 38 and in this work) on the PT reaction is very similar. On the other hand, the reaction free energy in water (in the absence of the inhibitor) was calculated from the pKa values of ethanethiol (pKa = 10.6) and imidazole (pKa = 6.9). Here, the pKa of ethanethiol was used instead of that of cysteine because, in the case of cysteine, there is a main-chain zwitterionic effect. Thus, the reaction free energy in water (in the absence of the ligand) is 1.38 × (10.6 − 6.9) = 5.1 kcal/mol. We used this value for comparison with our calculated EVB profile (in the absence of the ligand) as well as to fit the EVB parameters. The PT1 profiles obtained for the reactions in the protein were different in the apo and holo forms of the protein (see Table 2). Table 2 shows that the (His-Cys) ion-pair state is less favorable in the holo form of the protein than in the apo form, probably because the ion pair formed at the end of the PT1 step is more stabilized in a more polar environment, such as that of the apo form (with a vacant inhibitor binding pocket), than in the holo form. Our results are also consistent with ref. 24, where the proteolysis mechanism of SARS-CoV-2 Mpro was studied using multiscale DFT/MM simulation methods (see Table 2).
Table 2.
Calculated reaction free energies of the first proton transfer step (PT1) in water and SARS-CoV-2 Mpro.
Scheme | System | Reaction free energy (kcal/mol) | Reference value† (kcal/mol) |
---|---|---|---|
Scheme A | Water | 5.1 | − |
Scheme A | Protein (apo) | 2.9 ± 0.2 | 2.9 |
Scheme B | Water | 7.5 | − |
Scheme B | Protein (holo) | 7.3 ± 1.7 | 4.8 |
3.3. Calculation of the Activation and Reaction Free Energies of the Nucleophilic Attack (NA) step
In the nucleophilic attack step, the activated anionic Sγ atom of Cys145 attacks the α-keto group of the ketoamide and forms an anionic tetrahedral complex. In this case, the observed activation and reaction free energies of the reference reaction in water were taken from the free energy profiles reported in ref. 38. It is worth mentioning that in ref. 38 an amide system was used to study the free energy profile, whereas we have a ketoamide group. Thus, we also performed a QM-based energy calculation to estimate the reaction free energy in water. From the QM calculations we obtained a value of 12.8 kcal/mol (after adding the thermal corrections). The QM-calculated value is close to the value reported in ref. 38, and thus we have used the same PES as in ref. 38 to parameterize our EVB simulations for the NA step. The calculated activation and reaction free energies are reported in Table 3.
Table 3.
Calculated activation and reaction free energies of the nucleophilic attack (NA) step in water and SARS-CoV-2 Mpro.
Scheme | System | Activation free energy (kcal/mol) | Reaction free energy (kcal/mol) |
---|---|---|---|
Scheme A | Water | 17.7 | 12.6 |
Scheme A | Protein | 13.5 ± 2.3 | 2.9 ± 2.2 |
Scheme B | Water | 17.6 | 12.4 |
Scheme B | Protein | 12.5 ± 0.9 | −0.5 ± 1.7 |
In Table 3 two different sets of activation and reaction free energies are reported, as we have explored two different stepwise mechanistic schemes (A and B). In scheme A, the PT1 step happens before the inhibitor binding, and we did not include His41 when we studied the NA step in the water environment (because this helps us to account for the effect of the protonated histidine residue in the protein reaction only). For scheme B, in contrast, we performed a continuation run from the previous PT1 run. Thus, the His41 residue was included in both the water and protein reactions for the NA step. The effect of His41 was therefore incorporated slightly differently in the simulations of these schemes (A and B). On the other hand, even if the representations of the simulation systems are slightly different, the total free energy change of the three steps should converge for both schemes. The calculated free energy changes after the first three steps are 2.2 and 3.3 kcal/mol for schemes A and B, respectively (as follows from the values in Tables 1–3). Thus, the calculated free energy change after the first three steps is endothermic even in the protein. This indicates that the major exothermicity observed for the covalent inhibitor-protein complexation should come from the last proton transfer step.
At this point we would like to discuss the third mechanistic scheme considered in this study. In ref 23 the authors proposed that the first proton transfer occurs concomitantly with the nucleophilic attack step. In our PT1 free energy surface calculations, it was also observed that the Cys-His ion pair (the product of PT1) was less stable in the presence of the inhibitor than in its absence. Thus, it is quite possible that the nucleophilic attack happens almost immediately after the formation of the less stable ion pair in the holo form of the protein. Therefore, a concerted PT1 and NA step is a legitimate option to check.
3.4. Calculation of Activation and Reaction Free Energies of the Concerted PT1 and NA Step
The results for the concerted step are presented in Table 4. In order to parameterize the EVB simulation, here we also performed EVB simulations of the reference reaction in water. For the reference reaction, it was assumed that the observed activation barrier and reaction free energy of the concerted reaction are the same as those of the stepwise reaction discussed in ref 38, and the values 24 and 19 kcal/mol were used for the respective free energy changes. The calculated barrier of the concerted reaction (in scheme C) is 13.6 kcal/mol (Table 4), whereas for the stepwise process the combined barriers (including PT1 and NA) are 17.8 and 20.0 kcal/mol for schemes A and B, respectively (Tables 2 and 3). Thus, it can be concluded that scheme C provides a more plausible mechanism. On the other hand, the total changes in reaction free energy for the formation of the anionic tetrahedral complex are comparable in all three schemes. This confirms the consistency of our simulations.
Table 4.
Calculated activation and reaction free energies of the concerted PT1 and NA step in water and SARS-CoV-2 Mpro.
Scheme | System | Activation free energy (kcal/mol) | Reaction free energy (kcal/mol) |
---|---|---|---|
Scheme C | Water | 24.4 | 18.6 |
Scheme C | Protein | 13.6 ± 0.2 | 6.1 ± 0.8 |
3.5. Calculation of the Reaction Free Energy of the Second Proton Transfer (PT2) Step
The last proton transfer step is common to all three mechanistic schemes. In order to obtain the observed reaction free energy for the reaction in water, here we have also used the pKa values of the reacting groups. The reaction free energy obtained in water was −5.6 kcal/mol, considering the pKa of imidazole = 6.9 and the pKa of the tetrahedral intermediate38 = ~11.0. The calculated results for the reaction in the protein and in water are reported in Table 5.
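The pKa-based estimates used for the water reference reactions (PT1 in Section 3.2 and PT2 here) can be reproduced with a short calculation. This is a minimal sketch, assuming T = 300 K, for which 2.303RT is about 1.37 kcal/mol (close to the factor of 1.38 used in the text); the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def proton_transfer_free_energy(pka_donor, pka_acceptor_acid, temperature=300.0):
    """Reaction free energy (kcal/mol) for transferring a proton in water.

    pka_donor: pKa of the group that gives up the proton.
    pka_acceptor_acid: pKa of the protonated form of the accepting group.
    """
    kcal_per_pka_unit = math.log(10.0) * R_KCAL * temperature
    return kcal_per_pka_unit * (pka_donor - pka_acceptor_acid)


# PT1: thiol (ethanethiol, pKa 10.6) donates to imidazole (pKa 6.9).
dg_pt1_water = proton_transfer_free_energy(10.6, 6.9)

# PT2: imidazolium (pKa 6.9) donates to the tetrahedral intermediate (pKa ~11.0).
dg_pt2_water = proton_transfer_free_energy(6.9, 11.0)
```

With these inputs the two estimates round to the values quoted in the text, 5.1 kcal/mol for PT1 and −5.6 kcal/mol for PT2.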
Table 5.
Calculated reaction free energy of the second proton transfer step (PT2) in water and SARS-CoV-2 Mpro.
System | Reaction free energy (kcal/mol) |
---|---|
Water | −5.6 |
Protein | −11.4 ± 0.9 |
The results in Table 5 suggest that the proton affinity of the tetrahedral complex is very important (along with the electrophilicity of the reacting group of the inhibitor) in designing a better inhibitor than 13b, since the PT2 step contributes the most to the exothermicity of the covalent bond formation. Recently, two mechanistic studies39–40 on the inhibition of Mpro have been reported, where a peptidyl Michael acceptor, N3, was used as the covalent inhibitor. In both works, the last proton transfer step has been reported to be highly exothermic, which is in agreement with our study. Additionally, the reported activation free energy of the covalent bond formation step in ref. 39 is close to what we have obtained here. In another study, Arafet et al.41 reported that the protonated hemiketal intermediates of halomethyl ketone inhibitors of cruzain (a cysteine protease) were highly stable compared to the anionic forms. Although in their study the second proton transfer preceded or happened concurrently with the nucleophilic attack, it led to the same conclusion about the contribution of the second proton transfer step to the overall exothermicity as we have found here.
3.6. Calculation of Binding Free Energy of the Covalent SARS-CoV-2 Inhibitor Complex
Finally, we can add all the calculated free energy contributions to obtain the binding free energy (ΔGcov) of the covalent complex between Mpro and the inhibitor 13b. The experimental IC50 value for 13b is 0.67 μM.7 Therefore, the experimental binding free energy at T = 300 K can be estimated as −8.5 kcal/mol. The total free energy change for each reaction scheme is reported in Table 6, and the overall free energy profiles for the covalent complex formation between SARS-CoV-2 Mpro and the 13b inhibitor (following the different mechanistic schemes) are depicted in Figure 4. As we can see from Table 6, the calculated binding free energy is close to the experimental value.
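The conversion from the IC50 value to the experimental estimate can be checked with a few lines. This is a minimal sketch, assuming that the IC50 is used directly in place of a dissociation constant with a 1 M standard state (the limitations of this assumption are discussed further on in this section); the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def binding_free_energy_from_constant(constant_molar, temperature=300.0):
    """Binding free energy (kcal/mol) from a dissociation-like constant in mol/L."""
    return R_KCAL * temperature * math.log(constant_molar)


# IC50 of 13b reported in ref 7: 0.67 micromolar.
dg_exp = binding_free_energy_from_constant(0.67e-6)
```

At 300 K this evaluates to roughly −8.5 kcal/mol, the experimental estimate used for comparison in Table 6.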
Table 6.
Calculated and experimental binding free energies (ΔGcov) of the covalent SARS-CoV-2 Mpro-13b complex.
Scheme | Calculated ΔGcov (kcal/mol) | Experimental ΔGcov† (kcal/mol) |
---|---|---|
Scheme A | −9.2 | −8.5 |
Scheme B | −8.1 | −8.5 |
Scheme C | −8.9 | −8.5 |
†The experimental binding free energy is calculated from the IC50 value reported in ref 7.
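The bookkeeping behind the calculated totals can be sketched by summing the per-step values from Tables 1–5. The pairing of each scheme with a non-covalent binding value (ionized dyad for scheme A, neutral dyad for schemes B and C) is an assumption inferred from the scheme definitions in Figure 1, not something stated explicitly in the text.

```python
# Per-step free energies in kcal/mol, transcribed from Tables 1-5.
NONCOV_BEFORE_PT = -3.5  # binding to the neutral Cys145-His41 dyad
NONCOV_AFTER_PT = -3.6   # binding to the ion-pair form of the dyad
PT2 = -11.4              # second proton transfer in the protein

# Assumed step ordering for each mechanistic scheme.
schemes = {
    "A": [2.9, NONCOV_AFTER_PT, 2.9, PT2],    # PT1 (apo), binding, NA, PT2
    "B": [NONCOV_BEFORE_PT, 7.3, -0.5, PT2],  # binding, PT1 (holo), NA, PT2
    "C": [NONCOV_BEFORE_PT, 6.1, PT2],        # binding, concerted PT1/NA, PT2
}

totals = {name: round(sum(steps), 1) for name, steps in schemes.items()}

# Free energy after the steps preceding PT2 (everything except the last entry).
before_pt2 = {name: round(sum(steps[:-1]), 1) for name, steps in schemes.items()}
```

Under this assumption the sums for schemes A and B reproduce the tabulated totals, and scheme C comes out within 0.1 kcal/mol of the tabulated value, a difference that may simply reflect rounding of the individual terms. In every scheme the running total is positive before PT2 is added.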
Figure 4.
The free energy surface (FES) for the formation of the covalent protein-ligand (SARS-CoV-2 Mpro-13b) complex. The square boxes near the surfaces represent the reaction coordinate at the corresponding position on the FES.
However, it is worth mentioning that the IC50 value is not the best metric for evaluating the accuracy of a protocol for absolute binding free energy calculations. Nevertheless, because no experimental Ki (inhibition constant) value has been reported for the alpha-ketoamide inhibitor 13b, we had to use the IC50 value, an indirect metric of binding affinity, in our study. It is known that the relationship between the Ki and IC50 values depends on the experimental conditions and on the binding affinity of the substrate (KM). In general, the Ki value is equal to or less than the IC50 value,42 and thus the actual binding free energy of 13b should be even more negative than −8.5 kcal/mol. On the other hand, 13b is known to be a reversible covalent inhibitor. For similar reversible alpha-ketoamide inhibitors of calpains and other cysteine proteases, the reported Ki values43 were around 15 nM, and the corresponding binding free energy is about −11 kcal/mol. Therefore, the expected value may be 3–4 kcal/mol more negative than what we get from the IC50 value of 13b. At any rate, we think that our estimated absolute binding free energy of the Mpro-13b covalent complex is not significantly over- or underestimated, and we can also explain why 13b is not an irreversible inhibitor.
The reversibility of a reaction depends generally on the barrier of the reverse reaction. In other words, for two reactions with comparable activation barriers, the reaction that is more exothermic is also more irreversible. Thus, the calculated reaction free energies of two reactions can be used to determine whether a covalent inhibitor is reversible or not.44 The authors of ref. 44 used semiempirical QM/MM-based methods to calculate the reaction free energies for a few reversible covalent inhibitors of a cysteine protease. They reported that the calculated exothermicity of the chemical reactions was around −10 kcal/mol. We have obtained a similar value of around −11 kcal/mol for the second proton transfer step, which supports the idea that the protonated tetrahedral complex of 13b can undergo a reverse reaction to form the non-covalent complex. At this point we would like to add a cautionary note: the time dependence of the binding process has not been provided in the experimental work, and thus it is not completely clear that our assumption of thermodynamic equilibrium is fully justified.
4. Conclusions
The great potential of covalently bound drugs has been demonstrated in recent years.45–48 Still, most current approaches for in silico design of such inhibitors are not fully consistent for absolute binding free energy calculations, since they do not evaluate the chemical contribution to the binding energy. This work developed a general way of calculating the binding free energies of covalent drugs, including all the relevant energy contributions. That is, our strategy is based on using the EVB method for evaluating the chemical contribution to the covalent binding and the PDLD/S-LRA/β method for determining the non-covalent contribution. This approach was examined by calculating the binding free energy of an alpha-ketoamide inhibitor of the main protease of SARS-CoV-2 (Mpro). Our calculations considered multiple reaction pathways for the inhibitor-protease complex formation and gave a very reasonable estimate of the observed binding free energy. This demonstrates the potential of our strategy in general drug design and particularly in the search for effective inhibitors of SARS-CoV-2.
Supplementary Material
Acknowledgment
This work was supported by the National Institutes of Health grant R35 GM122472 and the National Science Foundation grant MCB 1707167. We also thank the University of Southern California for the Zumberge Multi-School Interdisciplinary Research award, and the University of Southern California High Performance Computing and Communication Center and the Extreme Science and Engineering Discovery Environment’s (XSEDE) Comet facility at the San Diego Supercomputer Center for computational resources. D. M. thanks Dr. Gabriel Oanca for helpful discussions.
Footnotes
Supporting Information
All relevant EVB simulation parameters and partial charges are provided in Tables S2.1–S2.4.
7. References
- 1. Zhou P; Yang XL; Wang XG; Hu B; Zhang L; Zhang W; Si HR; Zhu Y; Li B; Huang CL; Chen HD; Chen J; Luo Y; Guo H; Jiang RD; Liu MQ; Chen Y; Shen XR; Wang X; Zheng XS; Zhao K; Chen QJ; Deng F; Liu LL; Yan B; Zhan FX; Wang YY; Xiao GF; Shi ZL, A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020, 579 (7798), 270–273.
- 2. Wu F; Zhao S; Yu B; Chen YM; Wang W; Song ZG; Hu Y; Tao ZW; Tian JH; Pei YY; Yuan ML; Zhang YL; Dai FH; Liu Y; Wang QM; Zheng JJ; Xu L; Holmes EC; Zhang YZ, A new coronavirus associated with human respiratory disease in China. Nature 2020, 579 (7798), 265–269.
- 3. Gorbalenya AE; Baker SC; Baric RS; de Groot RJ; Drosten C; Gulyaeva AA; Haagmans BL; Lauber C; Leontovich AM; Neuman BW; Penzar D; Perlman S; Poon LLM; Samborskiy DV; Sidorov IA; Sola I; Ziebuhr J; Grp CS, The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nat Microbiol 2020, 5 (4), 536–544.
- 4. Guy RK; DiPaola RS; Romanelli F; Dutch RE, Rapid repurposing of drugs for COVID-19. Science 2020, 368 (6493), 829–830.
- 5. Anand K; Ziebuhr J; Wadhwani P; Mesters JR; Hilgenfeld R, Coronavirus main proteinase (3CL(pro)) structure: Basis for design of anti-SARS drugs. Science 2003, 300 (5626), 1763–1767.
- 6. Chang GG, Quaternary Structure of the SARS Coronavirus Main Protease. Molecular Biology of the Sars-Coronavirus 2010, 115–128.
- 7. Zhang LL; Lin DZ; Sun XYY; Curth U; Drosten C; Sauerhering L; Becker S; Rox K; Hilgenfeld R, Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors. Science 2020, 368 (6489), 409–412.
- 8. Jin ZM; Du XY; Xu YC; Deng YQ; Liu MQ; Zhao Y; Zhang B; Li XF; Zhang LK; Peng C; Duan YK; Yu J; Wang L; Yang KL; Liu FJ; Jiang RD; Yang XL; You T; Liu XC; Yang XN; Bai F; Liu H; Liu X; Guddat LW; Xu WQ; Xiao GF; Qin CF; Shi ZL; Jiang HL; Rao ZH; Yang HT, Structure of M-pro from SARS-CoV-2 and discovery of its inhibitors. Nature 2020, 582 (7811), 289–293.
- 9. Jin ZM; Zhao Y; Sun Y; Zhang B; Wang HF; Wu Y; Zhu Y; Zhu C; Hu TY; Du XY; Duan YK; Yu J; Yang XB; Yang XN; Yang KL; Liu X; Guddat LW; Xiao GF; Zhang LK; Yang HT; Rao ZH, Structural basis for the inhibition of SARS-CoV-2 main protease by antineoplastic drug carmofur. Nat Struct Mol Biol 2020, 27 (6), 529–532.
- 10. Dai WH; Zhang B; Jiang XM; Su HX; Li JA; Zhao Y; Xie X; Jin ZM; Peng JJ; Liu FJ; Li CP; Li Y; Bai F; Wang HF; Cheng X; Cen XB; Hu SL; Yang XN; Wang J; Liu X; Xiao GF; Jiang HL; Rao ZH; Zhang LK; Xu YC; Yang HT; Liu H, Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease. Science 2020, 368 (6497), 1331–1335.
- 11. Powers JC; Asgian JL; Ekici OD; James KE, Irreversible inhibitors of serine, cysteine, and threonine proteases. Chem Rev 2002, 102 (12), 4639–4750.
- 12. Zhang LL; Lin DZ; Kusov Y; Nian Y; Ma QJ; Wang J; von Brunn A; Leyssen P; Lanko K; Neyts J; de Wilde A; Snijder EJ; Liu H; Hilgenfeld R, alpha-Ketoamides as Broad-Spectrum Inhibitors of Coronavirus and Enterovirus Replication: Structure-Based Design, Synthesis, and Activity Assessment. J Med Chem 2020, 63 (9), 4562–4578.
- 13. Mondal D; Florian J; Warshel A, Exploring the Effectiveness of Binding Free Energy Calculations. J Phys Chem B 2019, 123 (42), 8910–8915.
- 14. Wang L; Wu YJ; Deng YQ; Kim B; Pierce L; Krilov G; Lupyan D; Robinson S; Dahlgren MK; Greenwood J; Romero DL; Masse C; Knight JL; Steinbrecher T; Beuming T; Damm W; Harder E; Sherman W; Brewer M; Wester R; Murcko M; Frye L; Farid R; Lin T; Mobley DL; Jorgensen WL; Berne BJ; Friesner RA; Abel R, Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J Am Chem Soc 2015, 137 (7), 2695–2703.
- 15. Chatterjee P; Botello-Smith WM; Zhang H; Qian L; Alsamarah A; Kent D; Lacroix JJ; Baudry M; Luo Y, Can Relative Binding Free Energy Predict Selectivity of Reversible Covalent Inhibitors? J Am Chem Soc 2017, 139 (49), 17945–17952.
- 16. Yu HS; Gao C; Lupyan D; Wu YJ; Kimura T; Wu CJ; Jacobson L; Harder E; Abel R; Wang LL, Toward Atomistic Modeling of Irreversible Covalent Inhibitor Binding Kinetics. J Chem Inf Model 2019, 59 (9), 3955–3967.
- 17. Kuhn B; Tichy M; Wang LL; Robinson S; Martin RE; Kuglstatter A; Benz J; Giroud M; Schirmeister T; Abel R; Diederich F; Hert J, Prospective Evaluation of Free Energy Calculations for the Prioritization of Cathepsin L Inhibitors. J Med Chem 2017, 60 (6), 2485–2497.
- 18. Lameira J; Bonatto V; Cianni L; Rocho FD; Leitao A; Montanari CA, Predicting the affinity of halogenated reversible covalent inhibitors through relative binding free energy. Phys Chem Chem Phys 2019, 21 (44), 24723–24730.
- 19. Zhang H; Jiang WJ; Chatterjee P; Luo Y, Ranking Reversible Covalent Drugs: From Free Energy Perturbation to Fragment Docking. J Chem Inf Model 2019, 59 (5), 2093–2102.
- 20. Singh N; Warshel A, Absolute binding free energy calculations: On the accuracy of computational scoring of protein-ligand interactions. Proteins 2010, 78 (7), 1705–1723.
- 21. Warshel A; Weiss RM, An Empirical Valence Bond Approach for Comparing Reactions in Solutions and in Enzymes. J Am Chem Soc 1980, 102 (20), 6218–6226.
- 22. Jindal G; Mondal D; Warshel A, Exploring the Drug Resistance of HCV Protease. J Phys Chem B 2017, 121 (28), 6831–6840.
- 23. Świderek K; Moliner V, Revealing the molecular mechanisms of proteolysis of SARS-CoV-2 Mpro by QM/MM computational methods. Chem Sci 2020, 11, 10626–10630.
- 24. Ramos-Guzmán CA; Ruiz-Pernía JJ; Tuñón I, Unraveling the SARS-CoV-2 Main Protease Mechanism Using Multiscale DFT/MM Methods 2020.
- 25. Polgár L, On the mode of activation of the catalytically essential sulfhydryl group of papain. Eur J Biochem 1973, 33 (1), 104–109.
- 26. Huang CK; Wei P; Fan KQ; Liu Y; Lai LH, 3C-like proteinase from SARS coronavirus catalyzes substrate hydrolysis by a general base mechanism. Biochemistry-Us 2004, 43 (15), 4568–4574.
- 27. Connolly KM; Smith BT; Pilpa R; Ilangovan U; Jung ME; Clubb RT, Sortase from Staphylococcus aureus does not contain a thiolate-imidazolium ion pair in its active site. J Biol Chem 2003, 278 (36), 34061–34065.
- 28. Warshel A; Chu Z; Villa J; Strajbl M; Schutz C; Shurki A; Vicatos S; Plotnikov N; Schopf P, Molaris-XG, v 9.15; University of Southern California: Los Angeles, 2012.
- 29. King G; Warshel A, A Surface Constrained All-Atom Solvent Model for Effective Simulations of Polar Solutions. J Chem Phys 1989, 91 (6), 3647–3661.
- 30. Frisch MJ; Trucks GW; Schlegel HB; Scuseria GE; Robb MA; Cheeseman JR; Scalmani G; Barone V; Mennucci B; Petersson G; et al., Gaussian 09, Revision D.01; 2014.
- 31. Bayly CI; Cieplak P; Cornell WD; Kollman PA, A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges - the Resp Model. J Phys Chem-Us 1993, 97 (40), 10269–10280.
- 32. Lee FS; Chu ZT; Warshel A, Microscopic and Semimicroscopic Calculations of Electrostatic Energies in Proteins by the Polaris and Enzymix Programs. Journal of Computational Chemistry 1993, 14 (2), 161–185.
- 33. Lee FS; Warshel A, A Local Reaction Field Method for Fast Evaluation of Long-Range Electrostatic Interactions in Molecular Simulations. J Chem Phys 1992, 97 (5), 3100–3107.
- 34. Vorobyov I; Kim I; Chu ZT; Warshel A, Refining the treatment of membrane proteins by coarse-grained models. Proteins 2016, 84 (1), 92–117.
- 35. Hsu MF; Kuo CJ; Chang KT; Chang HC; Chou CC; Ko TP; Shr HL; Chang GG; Wang AHJ; Liang PH, Mechanism of the maturation process of SARS-CoV 3CL protease. J Biol Chem 2005, 280 (35), 31257–31266.
- 36. Muramatsu T; Takemoto C; Kim YT; Wang HF; Nishii W; Terada T; Shirouzu M; Yokoyama S, SARS-CoV 3CL protease cleaves its C-terminal autoprocessing site by novel subsite cooperativity. P Natl Acad Sci USA 2016, 113 (46), 12997–13002.
- 37. Xue XY; Yu HW; Yang HT; Xue F; Wu ZX; Shen W; Li J; Zhou Z; Ding Y; Zhao Q; Zhang XJC; Liao M; Bartlam M; Rao Z, Structures of two coronavirus main proteases: Implications for substrate binding and antiviral drug design. J Virol 2008, 82 (5), 2515–2527.
- 38. Strajbl M; Florian J; Warshel A, Ab initio evaluation of the free energy surfaces for the general base/acid catalyzed thiolysis of formamide and the hydrolysis of methyl thiolformate: A reference solution reaction for studies of cysteine proteases. J Phys Chem B 2001, 105 (19), 4471–4484.
- 39. Arafet K; Serrano-Aparicio N; Lodola A; Mulholland A; González FV; Swiderek K; Moliner V, Mechanism of Inhibition of SARS-CoV-2 Mpro by N3 Peptidyl Michael Acceptor Explained by QM/MM Simulations and Design of New Derivatives with Tunable Chemical Reactivity 2020.
- 40. Ramos-Guzmán CA; Ruiz-Pernía JJ; Tuñón I, A Microscopic Description of SARS-CoV-2 Main Protease Inhibition with Michael Acceptors. Strategies for Improving Inhibitors Design 2020.
- 41. Arafet K; Ferrer S; Moliner V, First quantum mechanics/molecular mechanics studies of the inhibition mechanism of cruzain by peptidyl halomethyl ketones. Biochemistry-Us 2015, 54 (21), 3381–3391.
- 42. Brandt RB; Laux JE; Yates SW, Calculation of Inhibitor Ki and Inhibitor Type from the Concentration of Inhibitor for 50-Percent Inhibition for Michaelis-Menten Enzymes. Biochem Med Metab B 1987, 37 (3), 344–349.
- 43. Li ZZ; OrtegaVilain AC; Patil GS; Chu DL; Foreman JE; Eveleth DD; Powers JC, Novel peptidyl alpha-keto amide inhibitors of calpains and other cysteine proteases. J Med Chem 1996, 39 (20), 4089–4098.
- 44. da Costa CHS; Bonatto V; dos Santos AM; Lameira J; Leitao A; Montanari CA, Evaluating QM/MM Free Energy Surfaces for Ranking Cysteine Protease Covalent Inhibitors. J Chem Inf Model 2020, 60 (2), 880–889.
- 45. Lanman BA; Allen JR; Allen JG; Amegadzie AK; Ashton KS; Booker SK; Chen JJ; Chen N; Frohn MJ; Goodman G; Kopecky DJ; Liu L; Lopez P; Low JD; Ma V; Minatti AE; Nguyen TT; Nishimura N; Pickrell AJ; Reed AB; Shin Y; Siegmund AC; Tamayo NA; Tegley CM; Walton MC; Wang HL; Wurz RP; Xue M; Yang KC; Achanta P; Bartberger MD; Canon J; Hollis LS; McCarter JD; Mohr C; Rex K; Saiki AY; San Miguel T; Volak LP; Wang KH; Whittington DA; Zech SG; Lipford JR; Cee VJ, Discovery of a Covalent Inhibitor of KRAS(G12C) (AMG 510) for the Treatment of Solid Tumors. J Med Chem 2020, 63 (1), 52–65.
- 46. Ghosh AK; Samanta I; Mondal A; Liu WR, Covalent Inhibition in Drug Discovery. Chemmedchem 2019, 14 (9), 889–906.
- 47. Byrd JC; Harrington B; O’Brien S; Jones JA; Schuh A; Devereux S; Chaves J; Wierda WG; Awan FT; Brown JR, Acalabrutinib (ACP-196) in relapsed chronic lymphocytic leukemia. New England Journal of Medicine 2016, 374 (4), 323–332.
- 48. Bradshaw JM; McFarland JM; Paavilainen VO; Bisconte A; Tam D; Phan VT; Romanov S; Finkle D; Shu J; Patel V; Ton T; Li XY; Loughhead DG; Nunn PA; Karr DE; Gerritsen ME; Funk JO; Owens TD; Verner E; Brameld KA; Hill RJ; Goldstein DM; Taunton J, Prolonged and tunable residence time using reversible covalent kinase inhibitors. Nat Chem Biol 2015, 11 (7), 525–531.