Abstract

Protein–ligand binding free-energy calculations using molecular dynamics (MD) simulations have emerged as a powerful tool for in silico drug design. Here, we present results obtained with the ARROW force field (FF)—a multipolar polarizable and physics-based model with all parameters fitted entirely to high-level ab initio quantum mechanical (QM) calculations. ARROW has already proven its ability to determine solvation free energy of arbitrary neutral compounds with unprecedented accuracy. The ARROW FF parameterization is now extended to include coverage of all amino acids including charged groups, allowing molecular simulations of a series of protein–ligand systems and prediction of their relative binding free energies. We ensure adequate sampling by applying a novel technique that is based on coupling the Hamiltonian Replica exchange (HREX) with a conformation reservoir generated via potential softening and nonequilibrium MD. ARROW provides predictions with near chemical accuracy (mean absolute error of ∼0.5 kcal/mol) for two of the three protein systems studied here (MCL1 and Thrombin). The third protein system (CDK2) reveals the difficulty in accurately describing dimer interaction energies involving polar and charged species. Overall, for all of the three protein systems studied here, ARROW FF predicts relative binding free energies of ligands with a similar accuracy level as leading nonpolarizable force fields.
Introduction
Free-energy calculations of ligand binding to a protein can serve as a powerful tool for structure-based small-molecule drug design, especially at the stages of lead selection (hit-to-lead) and lead optimization, where ligands of high binding affinity are desired. A change of free energy, ΔG, upon binding quantitatively describes ligand–protein affinity. In silico calculations of protein–ligand binding ΔG have numerous advantages over expensive experimental approaches. The calculations can be performed in a fully automated manner; consequently, a large number of ligands can be evaluated and numerous drug candidates with diverse structures can be selected. The relative energy, ΔΔG, i.e., calculation of ΔG of one ligand with respect to another, is usually sufficient to guide the process of ligand optimization and can be calculated more accurately than the absolute energy via alchemical transformations.1,2
In recent years, calculations of binding free energies of ligands in proteins, using all-atom force fields, have grown in importance. Effective software packages have been developed to perform such calculations.3−5 All-atom force fields used in these calculations6−8 typically combine parameters derived from quantum mechanics (QM) calculations and empirical parameters fitted to reproduce certain experimental observables. However, systematic studies performed on a large set of protein–ligand systems suggest that current methodologies may have reached a limit of about 1 kcal/mol mean absolute error (MAE) from the experiment.9−12 Two factors are responsible for this limited accuracy: insufficient quality of molecular force fields (FF) and poor conformational sampling.
Most of the currently available force fields are all-atom fixed-charge models, e.g., AMBER/GAFF,6 CHARMM/CGenFF,13 OPLS,10,14,15 GROMOS,16 and MMFF.17 Although they are well established, highly refined, and computationally efficient, it is generally accepted that they are not sophisticated enough to describe complicated protein–ligand interactions with an accuracy required for drug design. Specifically, they do not allow an accurate description of electrostatic and exchange interactions in the proteins, or nonadditive effects of atomic interactions, e.g., electronic polarizability. The absence of electronic polarizability significantly limits the force field’s ability to correctly describe highly heterogeneous environments such as protein active sites. For the same reason, these force fields are marked by low transferability of parameters.
A solution for these issues would be to develop more advanced models that are physics-based and explicitly represent nonadditive effects. Thanks to recent developments of computational hardware, especially graphics processing units (GPU), simulations of these models are currently perfectly feasible on the size and time scales required by protein–ligand systems. Additionally, more robust procedures for fitting analytical expressions to molecular potential surfaces and advanced analytical expressions themselves are available. Thus, a few polarizable force fields are under continuous development, e.g., AMOEBA18 or CHARMM-Drude.19 Nevertheless, they do not fully meet the expectations in the field of drug design, such as high accuracy, transferability, universality, or productivity. An increasing number of parameters in polarizable FF makes it harder to derive them unambiguously from the available experimental information. Consequently, applications of polarizable force fields to model complex systems, such as protein–ligand complexes, are relatively rarely reported in the literature.20−23 Reported ligand-binding free-energy calculations with AMOEBA FF showed lower accuracy20 for some systems than calculations with simpler nonpolarizable FF.
An appealing alternative to empirical force fields is offered by force fields that rely purely on QM calculations, e.g., MMFF94,17 MB-pol,24 QMPFF,25,26 QMFF,27,28 QMDFF,29 or QMPFF3.26 Recently, these kinds of force fields have gained traction as accurate but computationally intensive QM calculations, such as those based on the coupled-cluster (CCSD(T)) method, have become more accessible. QM calculations can provide detailed insights into interaction energies within and between individual particles, as well as into the individual energy components, based on suitable decomposition schemes.30 Thus, QM can be used to parameterize advanced physics-based models whose energy components, e.g., electrostatics, exchange, induction, or dispersion, can be fitted separately. Development of models with terms that have a physical interpretation is essential for force field transferability, and high transferability is especially required from QM-based force fields since QM calculations are feasible only for small fragments of compounds.
In this paper, we present results obtained with the ARROW force field. It is an advanced physics-based model that includes multipolar electrostatics and anisotropic polarization. Additionally, its parameters are fitted exclusively to high-level QM data for a set of small compound monomers and dimers, without fitting to any experimental data. We have already shown that ARROW FF provides ΔG of solvation for arbitrary neutral molecules with unprecedented accuracy.31 Here, we expand the coverage to all standard amino acids (neutral and charged) and to a limited region of ligand chemical space to make protein–ligand simulations feasible. We probe the accuracy of ARROW FF by calculating the relative binding free energies (ΔΔG) for a series of ligands in MCL1, Thrombin, and CDK2 proteins for which experimental results are known. These systems are also related to real drug design projects and are commonly used by various research groups as benchmarks for testing their molecular models.
An equally vital component of any ΔG prediction is thorough sampling of the configurational space of a simulated system. Successful sampling is especially important for protein–ligand systems that often exist in multiple binding states. Numerous enhanced sampling techniques have been developed to make more states accessible during the molecular dynamics simulations, e.g., umbrella sampling,32 TREX (Temperature Replica Exchange), HREX (Hamiltonian Replica Exchange),33 REST1, and REST2 (Replica Exchange with Solute Tempering).34,35 To ensure adequate conformational sampling, we developed and applied an enhanced sampling technique: a modified HREX coupled to a conformation reservoir generated through softening of the molecular potential and nonequilibrium (NEQ) MD.
To successfully disseminate the computational methodology presented in this paper to applications in the pharmaceutical industry, we have developed a user-friendly software package that facilitates ligand parameterization, system setup, running simulations, and their analysis. A key module of the package is ARBALEST,36 an MD simulation program that supports the ARROW force field. ARBALEST is capable of running simulations on computer clusters with multiple CPUs and GPUs. It allows a user to perform free-energy calculations and use various enhanced sampling techniques including those described here.
Theory
Generation of Conformation Reservoir Using Nonequilibrium MD
Recently, nonequilibrium MD-based techniques were used by several research groups for ΔΔG calculations37 and enhanced conformational sampling of ligand-binding complexes.38 Free energies for alchemical transitions were calculated from work values computed from multiple nonequilibrium MD runs as the Hamiltonian of the system gradually changes from one state to another in one or both directions.39,40 Enhanced sampling on a rugged potential energy landscape of protein–ligand complexes was attempted by Gill et al.38 using Nonequilibrium Candidate Monte Carlo (NCMC) moves. NCMC moves consist of nonequilibrium MD runs with the Hamiltonian of the system changing from nonsoftened to a softened state, with an addition of a regular MD stretch in the softened state of the Hamiltonian, followed by a reverse NEQ MD to the nonsoftened state of the Hamiltonian. An enhanced sampling is achieved due to fast interconversions of torsional states of a ligand in MD simulations with “softened” Hamiltonian. NEQ MD work calculations give proper acceptance probabilities for such MC moves that preserve the Boltzmann distribution of molecular system geometries.
In this paper, we use a methodology that can be considered a modification of the NCMC sampling procedure.38 Instead of the complex MC moves described above, we perform nonequilibrium runs from snapshots of equilibrated MD obtained with the softened Hamiltonian, changing the system Hamiltonian from the softened state to the regular (nonsoftened) state. Molecular system geometries at the end of the NEQ MD runs that pass a Metropolis-like acceptance criterion are used to prepare conformation reservoirs for HREX ΔΔG calculations.
First, a sufficiently long MD trajectory with the “softened” Hamiltonian that samples the ligand and protein conformations is generated. The “softened” Hamiltonian is constructed to reduce the potential barriers between the local minima and increase the inter-minima transition rates. For the chosen parameters, 10 ns long MD runs were sufficient to get an equilibrated ensemble of configurations in the “softened” Hamiltonian. To generate a Boltzmann-distributed ensemble of configurations in the original physical (nonsoftened) Hamiltonian, we run a set of nonequilibrium MD simulations starting from a set of geometries in the ensemble of the softened Hamiltonian and filtering the final conformations using a criterion that is based on nonequilibrium work. During the course of a nonequilibrium MD run, the molecular potential quickly changes from the “softened” to “nonsoftened” state. The work for the process is computed as the integral
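$$W = \int_{\lambda_{s}}^{\lambda_{ns}} \frac{\partial H(\lambda)}{\partial \lambda}\, d\lambda \qquad (1)$$
where λ is the coupling parameter that switches the Hamiltonian H(λ) from the softened state (λ_s) to the nonsoftened state (λ_ns), and the integrand is evaluated along the NEQ MD trajectory.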
A Metropolis test is performed for the end point of the NEQ MD trajectory that determines whether to add the end system configuration to the reservoir or not
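$$\xi < \exp\left[-\frac{W - \Delta G_{s\to ns}}{k_{B}T}\right] \qquad (2)$$
where ξ is a random number from the [0,1] interval, k_BT is the thermal energy, and ΔGs→ns is the free energy of the transition between softened and nonsoftened Hamiltonians. ΔGs→ns is computed iteratively.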
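The acceptance test can be illustrated with a minimal Python sketch. It assumes the standard Metropolis form, in which an end point is kept when ξ < exp[−(W − ΔGs→ns)/k_BT]; the function names `accept_endpoint` and `filter_endpoints`, the fixed random seed, and the use of kcal/mol units are illustrative choices, not part of ARBALEST.

```python
import math
import random

# k_B*T in kcal/mol at 298 K (gas constant 0.0019872041 kcal/mol/K)
KT_298 = 0.0019872041 * 298.0


def accept_endpoint(work, dg_s_to_ns, kt=KT_298, rng=random):
    """Metropolis-like test for one NEQ end point (hypothetical helper).

    Accepts with probability min(1, exp(-(W - dG) / kT)).
    """
    exponent = -(work - dg_s_to_ns) / kt
    if exponent >= 0.0:
        return True  # W <= dG: acceptance probability is 1
    return rng.random() < math.exp(exponent)


def filter_endpoints(works, dg_s_to_ns, kt=KT_298, seed=0):
    """Return indices of NEQ runs whose end points enter the reservoir."""
    rng = random.Random(seed)
    return [i for i, w in enumerate(works)
            if accept_endpoint(w, dg_s_to_ns, kt, rng)]
```

Runs with work far above ΔGs→ns are almost always rejected, which is what removes poorly relaxed end points from the reservoir.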
The first approximation of ΔGs→ns is obtained using the Jarzynski equality
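$$\exp\left[-\frac{\Delta G_{s\to ns}}{k_{B}T}\right] = \left\langle \exp\left[-\frac{W}{k_{B}T}\right] \right\rangle \qquad (3)$$
where ⟨ ⟩ denotes averaging over all of the nonequilibrium MD runs and W is the work computed for a nonequilibrium run; thus
$$\Delta G_{s\to ns} = -k_{B}T \ln \left\langle \exp\left[-\frac{W}{k_{B}T}\right] \right\rangle \qquad (4)$$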
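As an illustration, the Jarzynski estimate can be evaluated from a list of work values as in the sketch below. The function name `jarzynski_dg` is hypothetical, and the shift by the smallest work value is a numerical-stability choice of this sketch rather than something stated in the text.

```python
import math


def jarzynski_dg(works, kt):
    """Estimate dG = -kT * ln< exp(-W/kT) > from forward work values.

    The minimum work is factored out of the exponential average so that
    the largest term equals exp(0) = 1 and nothing overflows.
    """
    if not works:
        raise ValueError("at least one work value is required")
    w_min = min(works)
    total = sum(math.exp(-(w - w_min) / kt) for w in works)
    return w_min - kt * math.log(total / len(works))
```

Because the exponential average is dominated by the lowest work values, the estimate is always at or below the mean work, and it converges slowly when the work distribution is broad. This is consistent with the remark that the one-sided estimate is not expected to be very accurate.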
It is expected that an approximation to ΔGs→ns obtained with the Jarzynski equality (eq 4) is not very accurate and a better estimate for ΔGs→ns can be obtained using the Maximum-Likelihood method41 based on bidirectional nonequilibrium runs, the Bennett acceptance ratio method and the Crooks fluctuation theorem
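$$\frac{P_{F}(W)}{P_{R}(-W)} = \exp\left[\frac{W - \Delta G_{s\to ns}}{k_{B}T}\right] \qquad (5)$$
where P_F(W) and P_R(−W) are the probability densities of the work in the forward and reverse nonequilibrium runs, respectively.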
In the Maximum-Likelihood method, ΔGs→ns is found by solving the equation
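$$\sum_{i=1}^{n_{F}} \frac{1}{1 + \exp\left[\left(M + W_{i}^{F} - \Delta G_{s\to ns}\right)/k_{B}T\right]} - \sum_{j=1}^{n_{R}} \frac{1}{1 + \exp\left[\left(-M + W_{j}^{R} + \Delta G_{s\to ns}\right)/k_{B}T\right]} = 0 \qquad (6)$$
where M = k_BT ln(nF/nR), W_i^F and W_j^R are the work values of the forward and reverse runs, and nF and nR are the numbers of forward and reverse nonequilibrium MD runs, respectively.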
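A bidirectional estimate of this kind can be obtained numerically, as in the following sketch. It assumes the standard Bennett acceptance ratio form of the maximum-likelihood equation, with reverse work values defined as the work accumulated in the nonsoftened-to-softened direction. The bisection solver and the names `fermi` and `bar_dg` are choices made for this illustration only.

```python
import math


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_dg(w_forward, w_reverse, kt, tol=1e-8):
    """Solve the maximum-likelihood (BAR) equation for dG by bisection."""
    n_f, n_r = len(w_forward), len(w_reverse)
    m = kt * math.log(n_f / n_r)

    def residual(dg):
        fwd = sum(fermi((m + w - dg) / kt) for w in w_forward)
        rev = sum(fermi((-m + w + dg) / kt) for w in w_reverse)
        return fwd - rev  # increases monotonically with dg

    # Bracket the root, widening the interval if necessary.
    lo = min(min(w_forward), -max(w_reverse)) - 10.0 * kt
    hi = max(max(w_forward), -min(w_reverse)) + 10.0 * kt
    while residual(lo) > 0.0:
        lo -= 10.0 * kt
    while residual(hi) < 0.0:
        hi += 10.0 * kt

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The residual is monotonic in ΔG, so bisection is guaranteed to converge once the root is bracketed; a Newton iteration would be faster but needs more care.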
We use values of ΔGs→ns obtained from eq 4 to initially filter the end point conformations of NEQ MD runs using the criterion in eq 2. Then, we use these configurations as starting points to run NEQ MD from the nonsoftened to the softened Hamiltonian state. Work values obtained in MD runs in both directions are combined by solving eq 6, and the Bennett acceptance ratio (BAR) approach42 is used to obtain an improved estimate of ΔGs→ns, which we insert into eq 2 to obtain an improved equilibrium conformation distribution for the nonsoftened Hamiltonian. The Maximum-Likelihood equation (eq 6) can be solved again to correct the values of ΔGs→ns based on an updated set of accepted conformations of the nonsoftened Hamiltonian. We found that a few iterations were sufficient to converge ΔGs→ns to 0.1–0.2 kcal/mol accuracy so that the computed distribution of accepted configurations of the nonsoftened Hamiltonian did not significantly change on further iterations.
Methods
Quantum Mechanics
In our work, we use a variety of quantum mechanical data as a benchmark for energies and conformations. QM calculations were performed for the monomer model compounds at the MP2/aug-cc-pVQZ level, and dimers of the ligand model compounds with amino acid fragments and water were computed with the silver standard, i.e., MP2/CBS, calculated with Helgaker cubic extrapolation from aug-cc-pVTZ → aug-cc-pVQZ, as well as a post-MP2 correction (i.e., plus CCSD(T)/aug-cc-pVDZ − MP2/aug-cc-pVDZ). More details on QM can be found in the Supporting Information and in our previous publication.31
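The composite "silver standard" energy can be sketched as follows, using the usual two-point cubic (X⁻³) extrapolation formula with cardinal numbers 3 (aug-cc-pVTZ) and 4 (aug-cc-pVQZ). Whether the extrapolation is applied to the total MP2 energy or only to its correlation part is not stated here, so the sketch extrapolates whatever energies are passed in. The function names are hypothetical.

```python
def helgaker_cbs(e_small, e_large, x_small=3, x_large=4):
    """Two-point cubic extrapolation to the complete basis set limit.

    Assumes E(X) = E_CBS + A / X**3, with X the basis-set cardinal number.
    """
    xs3 = x_small ** 3
    xl3 = x_large ** 3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


def silver_standard(e_mp2_tz, e_mp2_qz, e_ccsdt_dz, e_mp2_dz):
    """MP2/CBS plus the CCSD(T) - MP2 correction in aug-cc-pVDZ."""
    return helgaker_cbs(e_mp2_tz, e_mp2_qz) + (e_ccsdt_dz - e_mp2_dz)
```

All four inputs must be the same kind of quantity (for example, dimer interaction energies in kcal/mol) for the correction to be meaningful.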
Force Field
In the ARROW force field, the nonbonded interactions are composed of electrostatic, exchange-repulsion, and dispersion terms. The electrostatic and exchange-repulsion terms are multipolar with inclusion of charges, dipoles, and quadrupoles, and their radial dependence is a Slater-like exponential so that they well describe charge penetration effects. The dispersion term is conventionally represented by spherical terms (C6 and C8), and a Tang–Toennies-damped interaction. Many-body effects are modeled by anisotropic-induced dipoles interacting with the electrostatic and exchange-repulsion terms, as well as with one another. They are iterated to self-consistent field convergence on every nonbonded step. Additional description of the ARROW force field can be found in the SI. For a detailed description, including functional forms, the reader is referred to our previous works.31,43
Parameterization
Proteins and ligands were split into chemical functional groups. Their intermolecular parameters were determined by agreement with QM values of dimer and multimer energies, electrostatic potentials, multipole moments of monomers, polarization tensor, and interaction of fragments with point charges. To aid transferability, we also attempted to match the individual FF energy components to their corresponding QM counterparts, in addition to reproducing the total energy. The typical size of the fragments was not bigger than 10 heavy atoms, e.g., phenol. Larger molecules were built by joining together smaller fragments. We assume that all interactions except electrostatic stay the same (e.g., like in GAFF or AMOEBA) and we refine multipoles on the boundary atoms (e.g., boundary atoms in biphenyl when two benzene rings are merged) to have the best fit to the electrostatic potential around the merged place using the RESP44 procedure that is applied to charges, dipoles, and multipoles. For this, we perform QM calculations of lower quality on joined molecular pieces of two fragments. Typically, we include fragments where boundary atoms are either hydrogens attached to carbons or carbons due to their typically more neutral charges in comparison to other more electronegative elements, e.g., oxygen, nitrogen, chlorine, etc. The details of nonbonded parameterization have been described in our previous publication.31
Molecular Dynamics
MD simulations were performed with the ARBALEST simulation package using multiple CPUs with the OpenMP and MPI libraries and NVIDIA graphics processing units (GPU) with the CUDA library.45 For long-range electrostatic interactions, Particle Mesh Ewald (PME)46,47 was used. Dispersion and PME direct sum electrostatic interactions were cut off at a distance of 9 Å. A multiple time step algorithm was used to integrate the equations of motion.48 The system temperature was kept at 298 K using the Nosé–Hoover thermostat.49 Pressure was maintained at 1 atm using the Berendsen barostat.50
Systems Setup
The following structures were used to set up three protein–ligand systems for MD simulations: MCL1 (PDB: 4hw2), Thrombin (PDB: 2zc9), and CDK2 (PDB: 1h1q). Missing residues and side chains in Thrombin were modeled using the Swiss-Model server.51 Protonation states of protein residues were determined using PropKa.52 Each complex was centered and aligned along its principal axes in a rectangular simulation box. The size of the box was adjusted to leave at least 5 Å distance between the protein and the box edges. The systems were solvated using tools from the GROMACS package.53 Some water molecules inserted into the binding pocket were manually removed.
Systems Equilibration
The solvated protein–ligand systems were equilibrated in two steps. In the first step, all heavy atoms of protein, ligand, and crystallographic water were positionally restrained (k = 2.5 kcal/mol/Å²). The potential energy of the system was minimized and the system was equilibrated for 0.5 ns in the NVT ensemble. In the second step, only the Cα atoms that were farther away than 7 Å from the ligand were restrained. These restraints were maintained in all of the following production and reservoir generation simulations. The energy of the system was minimized again and the system was equilibrated for another 2 ns in the NPT ensemble.
Free-Energy Calculations
An alchemical transformation method was used for calculations of the free-energy change, ΔGR→T, associated with mutation of the reference ligand, R, to the target ligand, T. In this method, the Hamiltonian of the reference ligand, HR, is incrementally transformed to the Hamiltonian of the target ligand, HT, using a chain of replicas with intermediate hybrid Hamiltonian states, governed by a scalar parameter λ changing from 0 to 1, where 0 corresponds to HR and 1 to HT. The exact coupling relation is described by eqs S1 and S2.
The transformations were performed in the protein and solvent to determine ΔGR→Tprotein and ΔGR→Tsolvent, respectively. Free-energy differences, ΔG, associated with the alchemical transformations were computed using the Bennett Acceptance Ratio (BAR)42,54 and Thermodynamic Integration (TI) methods.54,55 Finally, the relative binding free energy, ΔΔGR→T, was determined as a difference between ΔGR→Tprotein and ΔGR→Tsolvent.
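The bookkeeping of the thermodynamic cycle can be sketched in a few lines of Python. The trapezoidal quadrature used for thermodynamic integration is an assumption of this sketch (the quadrature scheme is not specified here), and the function names are hypothetical.

```python
def ti_free_energy(lambdas, dh_dlambda_means):
    """Thermodynamic integration: integrate <dH/dlambda> over lambda.

    Uses the trapezoidal rule over the replica lambda values.
    """
    if len(lambdas) != len(dh_dlambda_means) or len(lambdas) < 2:
        raise ValueError("need matching lists with at least two points")
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * (dh_dlambda_means[k] + dh_dlambda_means[k + 1]) * width
    return total


def relative_binding_free_energy(dg_protein, dg_solvent):
    """ddG(R->T) = dG(R->T, protein) - dG(R->T, solvent)."""
    return dg_protein - dg_solvent
```

A negative ΔΔGR→T means that the target ligand is predicted to bind more strongly than the reference ligand.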
For asymmetrical ligands, two values, ΔΔGA and ΔΔGB, were determined depending on the site, A or B, into which the target ligand was modeled. The combined ΔΔGA/B was calculated using the following formulas
| 7 |
| 8 |
Enhanced Sampling
HREX with Potential Softening and Conformation Reservoir
Sampling of the conformational space of replicas used for alchemical transformation was enhanced by the Hamiltonian replica exchange (HREX)56 method. In this method, conformations of neighboring replicas are periodically exchanged if they fulfill certain energy conditions, which increases the overall sampling. In our simulations, the exchanges are attempted every 120 s of wall-clock time. Using wall-clock time rather than simulation time for the exchange interval allowed us to efficiently use a cluster of diverse GPUs. A typical alchemical transformation went through 800 exchange cycles in proteins and 400 exchange cycles in water, roughly corresponding to 2–4 ns.
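The "energy conditions" for an exchange can be illustrated with the standard Metropolis criterion for two replicas at the same temperature. This is a generic sketch of HREX, not the ARBALEST implementation; the function name and argument names are hypothetical.

```python
import math
import random


def hrex_accept(u_i_xi, u_j_xj, u_i_xj, u_j_xi, kt, rng=random):
    """Decide whether replicas i and j swap their configurations.

    u_a_xb is the potential energy of configuration x_b evaluated with
    the Hamiltonian of replica a. The swap is accepted with probability
    min(1, exp(-delta)), where delta is the change in total reduced energy.
    """
    delta = ((u_i_xj + u_j_xi) - (u_i_xi + u_j_xj)) / kt
    if delta <= 0.0:
        return True  # swap lowers (or keeps) the total energy
    return rng.random() < math.exp(-delta)
```

Each attempt requires evaluating both configurations with both Hamiltonians, which is why a sufficient overlap between neighboring λ-states is needed to keep the exchange rate useful.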
To efficiently sample conformations of ligands in the protein binding pockets, we reduced the energetic barriers between the potential minima by “softening” selected protein–ligand and ligand–ligand interactions. The softening was introduced into the HREX chain either directly or indirectly via a pre-prepared conformation reservoir. In the direct approach, the softening was gradually turned on from the terminal replicas (λ = 0.0 and λ = 1.0) toward the middle replica (λ = 0.5). As a result, ligands close to the middle were able to sample conformations more efficiently and propagate them toward the terminal nonsoftened replicas through HREX. In the indirect approach, the softening was used to generate a reservoir of enhanced conformations for the reference ligand. Then, the reservoir was coupled to a corresponding replica (λ = 0.0) from which the conformations propagated toward the target ligands (λ = 1.0) through HREX.
Generation of the conformation reservoir consists of two steps. In the first step, a long simulation with softened interactions was performed. In the second step, the softened ensemble was converted to a nonsoftened ensemble, i.e., the desired reservoir. We explored two methods for the second step: HREX and NEQ MD. In the HREX approach, the softened ensemble was alchemically transformed to the nonsoftened ensemble by a set of intermediate λ-states (similarly to the ligand mutation). In the NEQ approach, the Hamiltonian of the softened ensemble quickly changes from the softened to the nonsoftened state. Work and ΔG calculated during this process serve to filter the generated nonsoftened ensemble (see the following section for details). Both approaches allow generation of a Boltzmann-distributed ensemble from which random conformations can be periodically inserted into the corresponding replica (here, the reference replica, λ = 0) of the mutation HREX.
Conformation Reservoir Generation Using Nonequilibrium MD
In our NEQ MD runs, the Hamiltonian of the molecular system linearly changes from the softened to the nonsoftened state during the time interval T, governed by a coupling parameter λ. For the systems studied, it was found that T = 10 ps provides a good tradeoff between the computational costs of the simulations and the accuracy of the results. For each of the NEQ MD runs, work values were computed as explained in the Theory section. The starting conformations of NEQ MD runs were drawn from the trajectory generated in MD with a softened Hamiltonian as described above. The following workflow (see Figure 1) was used to generate a conformation reservoir:
Figure 1. Generation of conformation reservoir using nonequilibrium MD.
(0) Generate the softened trajectory (MD trajectory of the system with “softened” Hamiltonian, having reduced potential barriers between relevant local potential minima of the ligand–protein complex).
(1) Take equally spaced conformations from the softened MD trajectory.
(2) Run NEQ MD starting from the chosen conformations changing the Hamiltonian from the softened to the nonsoftened state (forward NEQ MD runs).
(3) Compute work values for forward NEQ MD simulations, and compute ΔG between softened and nonsoftened Hamiltonian states using Jarzynski equality.
(4) Filter the NEQ simulations based on computed work and ΔG values using a Metropolis algorithm as described in the Theory section.
(5) Run NEQ MD from the nonsoftened to the softened Hamiltonian state (reverse NEQ MD) starting from the end conformations of filtered forward NEQ MD runs.
(6) Use forward and reverse NEQ MD results to compute ΔG with the bidirectional method.
(7) Go to 4.
(8) Repeat steps 4–7 500 times. We need to run reverse NEQ MD only for those configurations that were not filtered in the previous cycles.
(9) Average ΔG values obtained in cycles 4–7.
(10) Obtain a final set of filtered MD conformations from forward NEQ runs using the averaged value of ΔG between softened and nonsoftened Hamiltonian states (Figure 1).
The methods are described in more detail in the Supporting Information.
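The forward switching runs in step 2 of the workflow accumulate work as the Hamiltonian changes. The sketch below shows one way to do this in discrete steps. It assumes, purely for illustration, that the intermediate Hamiltonian is a linear mixture H(λ) = (1 − λ)H_soft + λH_full with λ increasing linearly from 0 to 1. The actual softening scheme may instead scale individual force field terms, and `propagate` is a hypothetical stand-in for the MD integrator.

```python
def neq_work(x0, h_soft, h_full, propagate, n_steps):
    """Accumulate switching work along one forward NEQ run.

    At every step lambda is advanced at a fixed configuration, and the
    resulting energy jump H(lam_new, x) - H(lam_old, x) is added to the
    work. The configuration is then propagated at the new lambda.
    """
    x = x0
    work = 0.0
    for k in range(n_steps):
        lam_old = k / n_steps
        lam_new = (k + 1) / n_steps
        # For a linear mixture the energy jump is d_lambda * (H_full - H_soft).
        work += (lam_new - lam_old) * (h_full(x) - h_soft(x))
        x = propagate(x, lam_new)
    return work, x
```

With T = 10 ps, `n_steps` would be the number of integration steps in the switching interval, which depends on the time step used. The returned end configuration is the candidate that is subsequently tested in step 4.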
Results and Discussion
Relative Binding Free-Energy Predictions
ARROW FF has shown its ability to predict solvation free energy of arbitrary small neutral molecules with unprecedented accuracy (MAE: 0.2–0.3 kcal/mol).21 Here, we probe its ability to predict protein–ligand relative binding free energy. Our test set consists of three proteins (MCL1, Thrombin, and CDK2; Figure S1), each with a series of binding ligands (Figures S2–S4). The complexes proved to be stable during 10 ns long MD simulations with an average RMSD of Cα atoms from the X-ray structures of 1.3, 1.2, and 1.9 Å for MCL1, Thrombin, and CDK2, respectively (Figure S5). Such a deviation is on a similar level to that of other force fields.19 Nonetheless, the following ligand-binding simulations were performed with peripheral Cα atoms positionally restrained to avoid the effects of potential slow conformational changes. The ΔΔG predictions obtained with ARROW FF are shown against experimental values in Figure 2. The exact values, correlation coefficients, slopes, and errors can also be found in Tables S1–S3.
Figure 2.

Parity plot comparing the relative binding free energies ΔΔG for ligand mutations in MCL1, Thrombin, and CDK2 as predicted by ARROW FF and the experiment. Results with ARROW FF were calculated with HREX and a conformation reservoir generated via potential softening and NEQ MD. Selected ARROW ΔΔG values with the largest deviation from the experiment (absolute error > 1.5 kcal/mol) are marked with labels: ligands 1h1r, 1oi9, and 1oiy bound to CDK2, and ligand 39 bound to MCL1. The thin gray lines are ±0.5 kcal/mol from the diagonal.
Overall, ARROW FF predicts ΔΔG’s well (MAE: 0.7 kcal/mol) and at a similar accuracy level as leading nonpolarizable force fields OPLS10 (MAE: 0.6 kcal/mol), GAFF9,11 (MAE: 0.8 kcal/mol), and CGenFF (MAE: 0.8 kcal/mol)12 (see Figure S6 for comparison). Notably, the three ΔΔG’s with the largest deviation from the experiment (∼2 kcal/mol) come from mutations in the same protein, CDK2, i.e., mutations of the 1h1r, 1oi9, and 1oiy ligands. We analyzed these simulations in detail, looking for putatively incorrectly described protein–ligand interactions, and found 1oi9 to be the most evident case. Namely, the X-ray structure of 1oi9 in CDK2 (PDB: 1oi9) indicates that a hydroxyl group of the ligand forms a hydrogen bond with the carboxylate group of ASP87. However, such a hydrogen bonding interaction is not observed during the simulations. The lack of a strong O–H···O hydrogen bond explains well why the binding is underestimated by ∼2 kcal/mol. Additionally, we found that in simulations with the GAFF force field, this hydrogen bond is present and persists over the entire simulation, ultimately producing ΔΔG with a much smaller deviation from the experiment. To check if the ARROW FF misrepresents this or any other interaction with the 1oi9 ligand, we extracted dimers of protein–ligand fragments from the GAFF simulation and calculated their interaction energy using ARROW FF and QM (“silver standard”). The difference between the two energies, i.e., FF-QM, can be seen in Figure 3a. Indeed, the largest inconsistency is found for a pair of phenol (fragment of 1oi9) and acetate (fragment of ASP87).
Figure 3.
Difference in the interaction energy determined with ARROW FF and QM for amino acid fragments of CDK2 and ligands: (a) 1oi9 (phenol), (b) 1oiy (benzamide), and (c) 1h1r (chlorobenzene). Configurations with the largest discrepancy are shown on the right (opaque) along with a few others (transparent).
Similar FF-QM calculations were performed for ligand 1oiy. Although a hydrogen bond between an amide group of the ligand and a carboxylate group of ASP87 periodically forms during the simulation with ARROW FF, which is consistent with the X-ray structure (PDB: 1oiy), the FF-QM difference indicates a significant discrepancy (Figure 3b). ΔΔG of the third questionable ligand, 1h1r, as opposed to 1oi9 and 1oiy, is overestimated with respect to the experiment. Nevertheless, interactions with ASP87 are likely to be the key in this case too, since the X-ray structure (PDB: 1h1r) indicates that the chlorine of the ligand and the carboxylate group of ASP87 are in close proximity (∼3 Å). FF-QM calculations confirm this: acetate–chlorobenzene dimers show the largest discrepancy (Figure 3c). These observations suggest that ARROW FF might not reproduce the interactions that involve charged groups sufficiently well.
It has been shown that for accurate free-energy (ΔG) calculations, nuclear quantum effect (NQE) should be taken into account.43 However, we found that this effect mostly cancels out in our relative free-energy (ΔΔG) calculations, where ΔG determined in water is subtracted from ΔG determined in a protein. We repeated two of our calculations of ligands binding to the CDK2 protein with PIMD = 4 (Path Integral Molecular Dynamics), which models NQE. As can be seen in Table S4, although corresponding ΔG values are reduced due to NQE, the final ΔΔG values are not very different. For this reason, as well as the high computational cost of using PIMD, we neglected NQE in our present calculations.
Conformational Sampling
In addition to force field accuracy, adequate conformational sampling is another factor that determines the validity of binding free-energy calculations. Missing or inadequate sampling of strong or weak binding states can result in underestimation or overestimation of the binding free energy, respectively. This question is even more compelling in the case of polarizable models that can significantly raise the bar for the currently available enhanced sampling techniques. Having this in mind, we paid particular attention to extensively sample the conformations in our protein–ligand systems.
For all of the three protein systems studied here, we chose ligands with the simplest benzyl group as a reference. Thus, mutations to the target ligands mostly relied on growing an additional group from the benzyl site. If the group grows asymmetrically, i.e., in the ortho or meta position, then there are two alternative sites for it (see Figure S7). We call them sites A and B. It is important to mention that in our simulations, no ligand, not even the reference, flips from site A to B, or vice versa, when it is in the binding pocket. It is unclear if such a flip is possible in reality, or whether a ligand needs to leave and reenter the pocket to do so. In either case, it is unknown a priori how these two sites contribute to the experimental ΔΔG. In the case of the 1h1r ligand in CDK2, both orientations of the chlorobenzene group in the meta position were resolved in the X-ray experiment (Figure S9). This suggests that each of them should be taken into account in the binding free-energy calculations.
Here, we discuss this sampling problem, and how we deal with it, in detail for the case of MCL1. Indeed, for asymmetrical ligands, we obtained two different sets of ΔΔG values (see Figure 4a) depending on which site, A or B, the ligands started the simulations from (MAE: 0.61/1.38 kcal/mol). Although the obtained ΔG’s can be combined according to eq 8, it remains unclear whether any other states are missing. Furthermore, analyzing the simulations, we noticed that certain torsional angles along the ligand backbone undergo rare transitions. We found that the X-ray structures (PDB: 4hw2, 4hw3) also contain their alternative states (Figure S10). Thus, to sample all of the possible states extensively, including flipping of the benzyl ring, we softened all of the backbone torsions of the ligand (Figure S8a) and also the protein–benzyl interactions. In our first approach, we applied the softening directly to the mutation HREX chain (maximum for the middle replica, λ = 0.5). Although this approach is convenient, we had to increase the number of λ-states from 11 to 21 to keep a sufficient exchange rate between the replicas. As expected, the ΔΔG values obtained with this method were mostly found between the A and B values determined with the regular HREX (Figure 4b) (MAE: 0.66/0.49 kcal/mol). Nevertheless, they still depend on which site the simulation started from, which indicates a convergence issue. Moreover, as the softening applies to the hybrid ligand (λ = 0.5), it might not perform equally effectively for different mutations.
Figure 4.
Comparison of ΔΔG as determined with ARROW force field and the experiment for MCL1. Green and red markers correspond to values obtained with HREX and a target ligand starting at A and B sides, respectively. Blue markers correspond to (a) combined A/B sides, (b) HREX with softening from A and B sides, (c) HREX with reservoir (from HREX), and (d) HREX with reservoir (from NEQ). Thin gray lines are +/– 0.5 kcal/mol from the diagonal.
In our second approach, we deconvoluted the softening from the mutation. We ran a single long simulation with potential softening of the reference ligand 27 in MCL1. We made sure that all of the torsions and the benzyl ring flipped between the different states multiple times (Figures 5a, S14, and S15). To convert the softened ensemble to the nonsoftened ensemble, i.e., the reservoir, we used two different methods: HREX and NEQ. We found that both methods produced similar conformational ensembles (compare Figure 5c,d). Moreover, we found that the reservoirs contain conformations that are otherwise not sampled (compare, e.g., Figure 5d,b). When the reservoirs were attached to the replica of reference ligand 27 (λ = 0.0), the conformations efficiently propagated along the mutation HREX chain, also enhancing the sampling of the target ligands (see Figure S13 for an example of replica exchanges). The ΔΔG values obtained with the two reservoirs were also consistent with each other (compare Figure 4c,d), and both reduced the discrepancy with the experiment (MAE: 0.56/0.59). The only significant difference was noticed for ligand 39 (see Figure 1), whose sampling is particularly challenging because of a large phenyl group. With the NEQ-generated reservoir, ligand 39 was found to partially leave the binding pocket. Nonetheless, because of its high computational parallelizability, the NEQ method was chosen to generate conformational reservoirs for the other systems studied here.
Figure 5.
Pairwise distribution of rotatable torsions of ligand 27 in MCL1 from (a) softened potential MD, (b) nonsoftened potential MD, (c) nonsoftened potential MD with a reservoir (generated with HREX), and (d) nonsoftened potential MD with a reservoir (generated with nonequilibrium protocol).
Figure 6a shows ΔΔG values determined for a series of ligands binding Thrombin in two alternative orientations, A and B. The A orientation is clearly preferred over the B orientation, so the “combined” ΔΔG (A/B) values computed with eq 8 are very close to the A-side ΔΔG’s and agree well with the Ki experiment (fluorescence labeling) (MAE: 0.66). Nevertheless, coupling of HREX with a conformation reservoir generated using nonequilibrium MD makes the predictions even more accurate (Figure 6b, MAE: 0.50).
Figure 6.
Comparison of ΔΔG as determined with the ARROW force field and experiment (Ki)57 for Thrombin. Green and red markers correspond to values obtained with HREX and a target ligand starting at A and B sides. Blue markers correspond to (a) combined A/B sides and (b) HREX with reservoir (from NEQ). The thin gray lines are +/– 0.5 kcal/mol from the diagonal.
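The observation that the combined A/B value stays close to the more favorable orientation can be illustrated with a short numerical sketch. Equation 8 is not reproduced in this section, so the code below assumes the standard Boltzmann-weighted combination of non-interconverting states, ΔG = −kT ln Σ exp(−ΔG_i/kT); any normalization with respect to the states of the reference ligand is omitted. The function name `combine_states`, the temperature, and the numerical values are illustrative assumptions, not data from this work.

```python
import math

# kT in kcal/mol; 298.15 K is an assumed temperature
KT = 0.0019872041 * 298.15


def combine_states(dg_values, kt=KT):
    """Combine free energies of non-interconverting states (assumed form):
    dG = -kT * ln(sum_i exp(-dG_i / kT)), evaluated with a log-sum-exp shift."""
    lowest = min(dg_values)
    total = sum(math.exp(-(dg - lowest) / kt) for dg in dg_values)
    return lowest - kt * math.log(total)


# Hypothetical A-side and B-side values (kcal/mol), for illustration only
ddg_a, ddg_b = -1.0, 1.0
combined = combine_states([ddg_a, ddg_b])
print(f"A: {ddg_a:.2f}  B: {ddg_b:.2f}  combined: {combined:.2f} kcal/mol")
```

With a gap of a few kT between the two orientations, the combined value differs from the lower one by only a small fraction of kT, whereas two equally favorable states would lower it by kT ln 2.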
Nonequilibrium MD
To generate a Boltzmann-distributed reservoir of conformations with the regular, i.e., nonsoftened, Hamiltonian, conformations from the trajectory with “softened” protein–ligand interactions were used as starting points for nonequilibrium MD runs. Figure 7a shows the distribution of the torsion angle τ5 (see Figure S8a), which describes the orientation of the benzyl ring of ligand 27 in MCL1, at three stages: before the NEQ MD runs (the softened MD trajectory), at the end of the 10 ps NEQ MD runs, and after filtering based on the computed work as described in the Theory section. Figure 7b shows the same distributions for the torsion angle ω (see Figure S8b), which describes the orientation of the benzyl ring of ligand 5 in Thrombin. One can see how the NEQ MD runs and filtering change the observed distributions of the benzyl torsions. This is especially clear for ligand 5 of Thrombin (Figure 7b). For the regular nonsoftened Hamiltonian, the distribution of the benzyl torsion angle ω has narrow peaks around −120 and 60°. The distribution of ω for the softened Hamiltonian MD before the NEQ MD runs is wide, with probability maxima around −20 and 160°. After the NEQ MD runs, the distribution of ω shows four narrow peaks at −120, −60, 60, and 130°. NEQ work-based filtering removes the configurations at −60 and 130°, so the filtered configurations have the correct Boltzmann distribution corresponding to the nonsoftened Hamiltonian of the system.
Figure 7.
Distributions of benzyl ring (a) torsion τ5 for ligand 27 bound to MCL1 and (b) torsion ω of ligand 5 bound to Thrombin before and after NEQ MD runs for conformer reservoir generation.
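The work-based filtering step can be sketched as follows. The exact acceptance rule is given in the Theory section and is not reproduced here; the sketch assumes a Metropolis-type probability min(1, exp(−(W − ΔG)/kT)), where W is the work of a forward NEQ run and ΔG is the free-energy difference between the softened and nonsoftened states. The function name `metropolis_filter`, the temperature, and the synthetic work values are illustrative assumptions.

```python
import numpy as np

# kT in kcal/mol; 298.15 K is an assumed temperature
KT = 0.0019872041 * 298.15


def metropolis_filter(work, delta_g, rng, kt=KT):
    """Return a boolean mask of accepted NEQ end conformations.

    Assumed rule: accept with probability min(1, exp(-(W - dG) / kT)).
    """
    work = np.asarray(work, dtype=float)
    # Clamp the exponent at 0 so that the probability never exceeds 1
    p_accept = np.exp(np.minimum(0.0, -(work - delta_g) / kt))
    return rng.random(work.shape) < p_accept


rng = np.random.default_rng(0)
# Synthetic forward work values (kcal/mol), not data from this study
work = rng.normal(loc=2.0, scale=1.5, size=10_000)
accepted = metropolis_filter(work, delta_g=0.0, rng=rng)
acceptance_ratio = accepted.mean()
print(f"acceptance ratio: {acceptance_ratio:.3f}")
```

Runs that dissipate a large amount of work are rarely accepted, which is how end conformations that are improbable under the nonsoftened Hamiltonian get removed from the reservoir.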
As outlined in the Methods section, the bidirectional ΔG was computed via an iterative procedure. The converged ΔG value is defined here as ΔG (bidirectional). In the two systems studied, the free energies computed via the bidirectional approach and the Jarzynski equality differ. For MCL1 ligand 27: ΔG (Jarzynski) = −34.20 kcal/mol, ΔG (bidirectional) = −32.26 kcal/mol. For Thrombin ligand 5: ΔG (Jarzynski) = −2.20 kcal/mol, ΔG (bidirectional) = −1.83 kcal/mol. The Jarzynski estimator is known to be strongly affected by the tails of the work distribution, often resulting in computed ΔG values that are too negative. The bidirectional approach is not prone to these problems and provides more robust estimates of ΔG between the softened and nonsoftened states.
Figure 8 shows the distributions of work values computed for the forward and reverse NEQ MD runs. The distribution of the reverse work is sparse since the reverse NEQ MD starts from the end conformations of the forward NEQ runs that passed the Metropolis criterion (see Theory section). The acceptance ratio for the MCL1 ligand 27 system was 4.8% and that for Thrombin ligand 5 was 4.7%. Blue and orange curves represent Gaussian curves fitted to the distributions of forward and reverse work values, respectively. The crossing point of these curves may be used as another estimate of the free-energy difference between the softened and nonsoftened states of the molecular system and is close to the estimate obtained by the bidirectional approach. Green and black vertical lines correspond to the ΔG values computed via the Jarzynski and bidirectional approaches, respectively.
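The two estimators can be compared on synthetic data with the sketch below. It is not the implementation used in this work: the bidirectional estimate is written as a Bennett acceptance ratio equation solved by simple bisection, the sign convention takes the reverse work as the work performed along the reverse process, and the Gaussian work values are generated to satisfy the Crooks relation for an assumed true ΔG. All function names and numbers are illustrative assumptions.

```python
import numpy as np

# kT in kcal/mol; 298.15 K is an assumed temperature
KT = 0.0019872041 * 298.15


def jarzynski(w_forward, kt=KT):
    """One-directional estimate: dG = -kT ln <exp(-W/kT)>."""
    x = -np.asarray(w_forward, dtype=float) / kt
    shift = x.max()  # log-sum-exp shift for numerical stability
    return -kt * (shift + np.log(np.mean(np.exp(x - shift))))


def _fermi(x):
    # 1 / (1 + exp(x)), written with tanh to avoid overflow
    return 0.5 * (1.0 - np.tanh(0.5 * x))


def bar(w_forward, w_reverse, kt=KT, tol=1e-6):
    """Bidirectional estimate: solve the BAR equation for dG by bisection."""
    w_f = np.asarray(w_forward, dtype=float)
    w_r = np.asarray(w_reverse, dtype=float)
    m = np.log(len(w_f) / len(w_r))

    def imbalance(dg):
        # Monotonically increasing in dg; its root is the BAR estimate
        return (_fermi(m + (w_f - dg) / kt).sum()
                - _fermi(-m + (w_r + dg) / kt).sum())

    lo = min(w_f.min(), (-w_r).min()) - 50.0 * kt
    hi = max(w_f.max(), (-w_r).max()) + 50.0 * kt
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic Gaussian work values (kcal/mol) consistent with the Crooks relation
rng = np.random.default_rng(1)
dg_true, sigma = -2.0, 1.0
w_f = rng.normal(dg_true + sigma**2 / (2 * KT), sigma, size=5000)
w_r = rng.normal(-dg_true + sigma**2 / (2 * KT), sigma, size=500)
dg_jar = jarzynski(w_f)
dg_bar = bar(w_f, w_r)
print(f"Jarzynski: {dg_jar:.2f}  bidirectional: {dg_bar:.2f} kcal/mol")
```

Because the exponential average in `jarzynski` is dominated by the few lowest work values, its result depends strongly on how well the low-work tail is sampled, whereas `bar` draws on the overlap region of both distributions.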
Figure 8.
Forward and reverse work distribution for (a) MCL1 ligand 27 and (b) Thrombin ligand 5. ΔG values between softened and nonsoftened Hamiltonian states of the system computed with one-directional (Jarzynski) and bidirectional (Crooks theorem-based) approaches are shown with green and black vertical lines.
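The crossing-point estimate mentioned above can be sketched in a few lines. The sketch assumes that the reverse work is plotted with its sign inverted, so that by the Crooks relation the forward and sign-inverted reverse distributions intersect at W = ΔG. Gaussians are fitted simply by the sample mean and standard deviation, and the intersection is obtained from the resulting quadratic equation. The function name `gaussian_crossing` and the synthetic work values are illustrative assumptions.

```python
import numpy as np

# kT in kcal/mol; 298.15 K is an assumed temperature (used for synthetic data only)
KT = 0.0019872041 * 298.15


def gaussian_crossing(w_forward, w_reverse):
    """Crossing point of Gaussian fits to P_F(W) and P_R(-W)."""
    fwd = np.asarray(w_forward, dtype=float)
    rev = -np.asarray(w_reverse, dtype=float)  # sign-inverted reverse work
    mu1, s1 = fwd.mean(), fwd.std(ddof=1)
    mu2, s2 = rev.mean(), rev.std(ddof=1)
    # ln N(x; mu1, s1) - ln N(x; mu2, s2) = a x^2 + b x + c
    a = 0.5 / s2**2 - 0.5 / s1**2
    b = mu1 / s1**2 - mu2 / s2**2
    c = 0.5 * mu2**2 / s2**2 - 0.5 * mu1**2 / s1**2 + np.log(s2 / s1)
    if abs(a) < 1e-12:
        return -c / b  # equal widths: single crossing
    roots = np.roots([a, b, c])
    roots = roots[np.isreal(roots)].real
    # Of the two crossings, keep the one closest to the midpoint of the means
    midpoint = 0.5 * (mu1 + mu2)
    return roots[np.argmin(np.abs(roots - midpoint))]


# Synthetic Gaussian work values (kcal/mol) consistent with the Crooks relation
rng = np.random.default_rng(2)
dg_true, sigma = -2.0, 1.0
w_f = rng.normal(dg_true + sigma**2 / (2 * KT), sigma, size=5000)
w_r = rng.normal(-dg_true + sigma**2 / (2 * KT), sigma, size=500)
dg_cross = gaussian_crossing(w_f, w_r)
print(f"crossing-point estimate: {dg_cross:.2f} kcal/mol")
```

The estimate is only as good as the Gaussian approximation of the two work distributions, which is why it serves here as a consistency check rather than as the primary estimator.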
Conclusions
Based on the three protein–ligand systems, we have shown that ARROW FF is able to predict relative binding free energies with almost chemical accuracy and is currently on par with leading all-atom fixed-charge force fields. We identified that the largest discrepancy with the experimental results is associated with binding interactions that involve charged groups. Analogous QM simulations demonstrated that these kinds of interactions are not well represented by the model; this will be addressed in a later publication.
Despite current limitations, ARROW proves its potential in the field of drug design. As our model is physics-based and relies purely on QM calculations, the sources of error can be narrowed down to particular energy terms and refined separately. Additionally, because our technology does not require any experimental data, the force field refinement can be performed in a systematic manner. Although, in general, this is an endless process, we believe that there is a particular complexity and accuracy of the model that needs to be reached for successful drug design. We think that making a force field faithful to high-level ab initio calculations is a step in that direction.
Since effective sampling is a common challenge and usually cannot be completely deconvoluted from the force field accuracy, we paid particular attention here to sufficient sampling of protein–ligand conformations. We have shown that a conformation reservoir generated through potential softening in a nonequilibrium process is an efficient way to extend the sampled conformational space of a ligand in the protein binding pocket.
Acknowledgments
The authors thank InterX Inc. for their generous support. The authors also thank Sean Greenslade for his help with cluster maintenance and Hulda Chen for useful comments. Work performed at the Center for Nanoscale Materials, a U.S. Department of Energy Office of Science User Facility, was supported by the U.S. DOE, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.2c00930.
Details of the methods, comparison of the enhanced sampling techniques, comment on alternative experimental results for Thrombin, structures of proteins and ligands, tables with values of the binding free energy, comparison with other force fields, and RMSD of the proteins (PDF)
The authors declare no competing financial interest.
Notes
InterX Inc. is a subsidiary of NeoTX Holdings Ltd., Rehovot, Israel 7670202.
References
- Cournia Z.; Allen B.; Sherman W. Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations. J. Chem. Inf. Model. 2017, 57, 2911–2937. 10.1021/acs.jcim.7b00564.
- Bash P. A.; Singh U. C.; Brown F. K.; Langridge R.; Kollman P. A. Calculation of the Relative Change in Binding Free Energy of a Protein-Inhibitor Complex. Science 1987, 235, 574–576. 10.1126/science.3810157.
- Case D. A.; Cheatham T. E. 3rd; Darden T.; Gohlke H.; Luo R.; Merz K. M. Jr; Onufriev A.; Simmerling C.; Wang B.; Woods R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem. 2005, 26, 1668–1688. 10.1002/jcc.20290.
- Brooks B. R.; Brooks C. L. 3rd; Mackerell A. D. Jr; Nilsson L.; Petrella R. J.; Roux B.; Won Y.; Archontis G.; Bartels C.; Boresch S.; Caflisch A.; Caves L.; Cui Q.; Dinner A. R.; Feig M.; Fischer S.; Gao J.; Hodoscek M.; Im W.; Kuczera K.; Lazaridis T.; Ma J.; Ovchinnikov V.; Paci E.; Pastor R. W.; Post C. B.; Pu J. Z.; Schaefer M.; Tidor B.; Venable R. M.; Woodcock H. L.; Wu X.; Yang W.; York D. M.; Karplus M. CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30, 1545–1614. 10.1002/jcc.21287.
- Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001.
- Wang J.; Wolf R. M.; Caldwell J. W.; Kollman P. A.; Case D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174. 10.1002/jcc.20035.
- MacKerell A. D. Jr; Banavali N.; Foloppe N. Development and Current Status of the CHARMM Force Field for Nucleic Acids. Biopolymers 2000, 56, 257–265.
- Tian C.; Kasavajhala K.; Belfon K. A. A.; Raguette L.; Huang H.; Migues A. N.; Bickel J.; Wang Y.; Pincay J.; Wu Q.; Simmerling C. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J. Chem. Theory Comput. 2020, 16, 528–552. 10.1021/acs.jctc.9b00591.
- Song L. F.; Lee T.-S.; Zhu C.; York D. M.; Merz K. M. Jr. Using AMBER18 for Relative Free Energy Calculations. J. Chem. Inf. Model. 2019, 59, 3128–3135. 10.1021/acs.jcim.9b00105.
- Lu C.; Wu C.; Ghoreishi D.; Chen W.; Wang L.; Damm W.; Ross G. A.; Dahlgren M. K.; Russell E.; Von Bargen C. D.; Abel R.; Friesner R. A.; Harder E. D. OPLS4: Improving Force Field Accuracy on Challenging Regimes of Chemical Space. J. Chem. Theory Comput. 2021, 17, 4291–4300. 10.1021/acs.jctc.1c00302.
- He X.; Liu S.; Lee T.-S.; Ji B.; Man V. H.; York D. M.; Wang J. Fast, Accurate, and Reliable Protocols for Routine Calculations of Protein–Ligand Binding Affinities in Drug Design Projects Using AMBER GPU-TI with ff14SB/GAFF. ACS Omega 2020, 5, 4611–4619. 10.1021/acsomega.9b04233.
- Raman E. P.; Paul T. J.; Hayes R. L.; Brooks C. L. Automated, Accurate, and Scalable Relative Protein–Ligand Binding Free-Energy Calculations Using Lambda Dynamics. J. Chem. Theory Comput. 2020, 16, 7895–7914. 10.1021/acs.jctc.0c00830.
- Vanommeslaeghe K.; Hatcher E.; Acharya C.; Kundu S.; Zhong S.; Shim J.; Darian E.; Guvench O.; Lopes P.; Vorobyov I.; Mackerell A. D. CHARMM General Force Field: A Force Field for Drug-like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J. Comput. Chem. 2009, 31, 671–690. 10.1002/jcc.21367.
- Wang L.; Wu Y.; Deng Y.; Kim B.; Pierce L.; Krilov G.; Lupyan D.; Robinson S.; Dahlgren M. K.; Greenwood J.; Romero D. L.; Masse C.; Knight J. L.; Steinbrecher T.; Beuming T.; Damm W.; Harder E.; Sherman W.; Brewer M.; Wester R.; Murcko M.; Frye L.; Farid R.; Lin T.; Mobley D. L.; Jorgensen W. L.; Berne B. J.; Friesner R. A.; Abel R. Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 2015, 137, 2695–2703. 10.1021/ja512751q.
- Harder E.; Damm W.; Maple J.; Wu C.; Reboul M.; Xiang J. Y.; Wang L.; Lupyan D.; Dahlgren M. K.; Knight J. L.; Kaus J. W.; Cerutti D. S.; Krilov G.; Jorgensen W. L.; Abel R.; Friesner R. A. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12, 281–296. 10.1021/acs.jctc.5b00864.
- Schmid N.; Eichenberger A. P.; Choutko A.; Riniker S.; Winger M.; Mark A. E.; van Gunsteren W. F. Definition and Testing of the GROMOS Force-Field Versions 54A7 and 54B7. Eur. Biophys. J. 2011, 40, 843–856. 10.1007/s00249-011-0700-9.
- Halgren T. A. Merck Molecular Force Field. I. Basis, Form, Scope, Parameterization, and Performance of MMFF94. J. Comput. Chem. 1996, 17, 490–519.
- Ponder J. W.; Wu C.; Ren P.; Pande V. S.; Chodera J. D.; Schnieders M. J.; Haque I.; Mobley D. L.; Lambrecht D. S.; DiStasio R. A.; Head-Gordon M.; Clark G. N. I.; Johnson M. E.; Head-Gordon T. Current Status of the AMOEBA Polarizable Force Field. J. Phys. Chem. B 2010, 114, 2549–2564. 10.1021/jp910674d.
- Lopes P. E. M.; Huang J.; Shim J.; Luo Y.; Li H.; Roux B.; Mackerell A. D. Jr. Force Field for Peptides and Proteins Based on the Classical Drude Oscillator. J. Chem. Theory Comput. 2013, 9, 5430–5449. 10.1021/ct400781b.
- Jiao D.; Golubkov P. A.; Darden T. A.; Ren P. Calculation of Protein–ligand Binding Free Energy by Using a Polarizable Potential. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 6290–6295. 10.1073/pnas.0711686105.
- Jiao D.; Zhang J.; Duke R. E.; Li G.; Schnieders M. J.; Ren P. Trypsin-Ligand Binding Free Energies from Explicit and Implicit Solvent Simulations with Polarizable Potential. J. Comput. Chem. 2009, 30, 1701–1711. 10.1002/jcc.21268.
- Harger M.; Lee J.-H.; Walker B.; Taliaferro J. M.; Edupuganti R.; Dalby K. N.; Ren P. Computational Insights into the Binding of IN17 Inhibitors to MELK. J. Mol. Model. 2019, 25, 151. 10.1007/s00894-019-4036-1.
- Khoruzhii O.; Donchev A. G.; Galkin N.; Illarionov A.; Olevanov M.; Ozrin V.; Queen C.; Tarasov V. Application of a Polarizable Force Field to Calculations of Relative Protein–ligand Binding Affinities. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 10378–10383. 10.1073/pnas.0803847105.
- Medders G. R.; Babin V.; Paesani F. Development of a “First-Principles” Water Potential with Flexible Monomers. III. Liquid Phase Properties. J. Chem. Theory Comput. 2014, 10, 2906–2910. 10.1021/ct5004115.
- Donchev A. G.; Galkin N. G.; Pereyaslavets L. B.; Tarasov V. I. Quantum Mechanical Polarizable Force Field (QMPFF3): Refinement and Validation of the Dispersion Interaction for Aromatic Carbon. J. Chem. Phys. 2006, 125, 244107. 10.1063/1.2403855.
- Donchev A. G.; Galkin N. G.; Illarionov A. A.; Khoruzhii O. V.; Olevanov M. A.; Ozrin V. D.; Pereyaslavets L. B.; Tarasov V. I. Assessment of Performance of the General Purpose Polarizable Force Field QMPFF3 in Condensed Phase. J. Comput. Chem. 2008, 29, 1242–1249. 10.1002/jcc.20884.
- Ewig C. S.; Berry R.; Dinur U.; Hill J.-R.; Hwang M.-J.; Li H.; Liang C.; Maple J.; Peng Z.; Stockfisch T. P.; Thacher T. S.; Yan L.; Ni X.; Hagler A. T. Derivation of Class II Force Fields. VIII. Derivation of a General Quantum Mechanical Force Field for Organic Compounds. J. Comput. Chem. 2001, 22, 1782–1800. 10.1002/jcc.1131.
- Hagler A. T. Quantum Derivative Fitting and Biomolecular Force Fields: Functional Form, Coupling Terms, Charge Flux, Nonbond Anharmonicity, and Individual Dihedral Potentials. J. Chem. Theory Comput. 2015, 11, 5555–5572. 10.1021/acs.jctc.5b00666.
- Grimme S. A General Quantum Mechanically Derived Force Field (QMDFF) for Molecules and Condensed Phase Simulations. J. Chem. Theory Comput. 2014, 10, 4497–4514. 10.1021/ct500573f.
- Naseem-Khan S.; Gresh N.; Misquitta A. J.; Piquemal J.-P. Assessment of SAPT and Supermolecular EDA Approaches for the Development of Separable and Polarizable Force Fields. J. Chem. Theory Comput. 2021, 17, 2759–2774. 10.1021/acs.jctc.0c01337.
- Pereyaslavets L.; Kamath G.; Butin O.; Illarionov A.; Olevanov M.; Kurnikov I.; Sakipov S.; Leontyev I.; Voronina E.; Gannon T.; Nawrocki G.; Darkhovskiy M.; Ivahnenko I.; Kostikov A.; Scaranto J.; Kurnikova M. G.; Banik S.; Chan H.; Sternberg M. G.; Sankaranarayanan S. K. R. S.; Crawford B.; Potoff J.; Levitt M.; Kornberg R. D.; Fain B. Accurate Determination of Solvation Free Energies of Neutral Organic Compounds from First Principles. Nat. Commun. 2022, 13, 414. 10.1038/s41467-022-28041-0.
- Bernardi R. C.; Melo M. C. R.; Schulten K. Enhanced Sampling Techniques in Molecular Dynamics Simulations of Biological Systems. Biochim. Biophys. Acta, Gen. Subj. 2015, 1850, 872–877. 10.1016/j.bbagen.2014.10.019.
- Sugita Y.; Okamoto Y. Replica-Exchange Multicanonical Algorithm and Multicanonical Replica-Exchange Method for Simulating Systems with Rough Energy Landscape. Chem. Phys. Lett. 2000, 329, 261–270. 10.1016/S0009-2614(00)00999-4.
- Liu P.; Kim B.; Friesner R. A.; Berne B. J. Replica Exchange with Solute Tempering: A Method for Sampling Biological Systems in Explicit Water. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 13749–13754. 10.1073/pnas.0506346102.
- Wang L.; Friesner R. A.; Berne B. J. Replica Exchange with Solute Scaling: A More Efficient Version of Replica Exchange with Solute Tempering (REST2). J. Phys. Chem. B 2011, 115, 9431–9438. 10.1021/jp204407d.
- InterX Inc., Berkeley, CA, USA. Arbalest - MD Simulation Program. 2022.
- Gapsys V.; Pérez-Benito L.; Aldeghi M.; Seeliger D.; van Vlijmen H.; Tresadern G.; de Groot B. L. Large Scale Relative Protein Ligand Binding Affinities Using Non-Equilibrium Alchemy. Chem. Sci. 2020, 11, 1140–1152. 10.1039/C9SC03754C.
- Gill S. C.; Lim N. M.; Grinaway P. B.; Rustenburg A. S.; Fass J.; Ross G. A.; Chodera J. D.; Mobley D. L. Binding Modes of Ligands Using Enhanced Sampling (BLUES): Rapid Decorrelation of Ligand Binding Modes via Nonequilibrium Candidate Monte Carlo. J. Phys. Chem. B 2018, 122, 5579–5598. 10.1021/acs.jpcb.7b11820.
- Jarzynski C. Nonequilibrium Equality for Free Energy Differences. Phys. Rev. Lett. 1997, 78, 2690–2693. 10.1103/PhysRevLett.78.2690.
- Crooks G. E. Entropy Production Fluctuation Theorem and the Nonequilibrium Work Relation for Free Energy Differences. Phys. Rev. E 1999, 60, 2721–2726. 10.1103/PhysRevE.60.2721.
- Shirts M. R.; Bair E.; Hooker G.; Pande V. S. Equilibrium Free Energies from Nonequilibrium Measurements Using Maximum-Likelihood Methods. Phys. Rev. Lett. 2003, 91, 140601. 10.1103/PhysRevLett.91.140601.
- Bennett C. H. Efficient Estimation of Free Energy Differences from Monte Carlo Data. J. Comput. Phys. 1976, 22, 245–268. 10.1016/0021-9991(76)90078-4.
- Pereyaslavets L.; Kurnikov I.; Kamath G.; Butin O.; Illarionov A.; Leontyev I.; Olevanov M.; Levitt M.; Kornberg R. D.; Fain B. On the Importance of Accounting for Nuclear Quantum Effects in Ab Initio Calibrated Force Fields in Biological Simulations. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, 8878–8882. 10.1073/pnas.1806064115.
- Bayly C. I.; Cieplak P.; Cornell W.; Kollman P. A. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The RESP Model. J. Phys. Chem. A 1993, 97, 10269–10280. 10.1021/j100142a004.
- Vingelmann P.; Fitzek F. H.; NVIDIA. CUDA, Release: 10.2.89; NVIDIA, 2020. https://developer.nvidia.com/cuda-toolkit.
- Darden T.; York D.; Pedersen L. Particle Mesh Ewald: An N·log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089–10092. 10.1063/1.464397.
- Giese T. J.; Panteva M. T.; Chen H.; York D. M. Multipolar Ewald Methods, 1: Theory, Accuracy, and Performance. J. Chem. Theory Comput. 2015, 11, 436–450. 10.1021/ct5007983.
- Tuckerman M.; Berne B. J.; Martyna G. J. Reversible Multiple Time Scale Molecular Dynamics. J. Chem. Phys. 1992, 97, 1990–2001. 10.1063/1.463137.
- Martyna G. J.; Klein M. L.; Tuckerman M. Nosé–Hoover Chains: The Canonical Ensemble via Continuous Dynamics. J. Chem. Phys. 1992, 97, 2635–2643. 10.1063/1.463940.
- Berendsen H. J. C.; Postma J. P. M.; van Gunsteren W. F.; DiNola A.; Haak J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684–3690. 10.1063/1.448118.
- Waterhouse A.; Bertoni M.; Bienert S.; Studer G.; Tauriello G.; Gumienny R.; Heer F. T.; de Beer T. A. P.; Rempfer C.; Bordoli L.; Lepore R.; Schwede T. SWISS-MODEL: Homology Modelling of Protein Structures and Complexes. Nucleic Acids Res. 2018, 46, W296–W303. 10.1093/nar/gky427.
- Olsson M. H. M.; Søndergaard C. R.; Rostkowski M.; Jensen J. H. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J. Chem. Theory Comput. 2011, 7, 525–537. 10.1021/ct100578z.
- Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001.
- Shirts M. R.; Pande V. S. Comparison of Efficiency and Bias of Free Energies Computed by Exponential Averaging, the Bennett Acceptance Ratio, and Thermodynamic Integration. J. Chem. Phys. 2005, 122, 144107. 10.1063/1.1873592.
- Straatsma T. P.; McCammon J. A. Multiconfiguration Thermodynamic Integration. J. Chem. Phys. 1991, 95, 1175–1188. 10.1063/1.461148.
- Woods C. J.; Essex J. W.; King M. A. Enhanced Configurational Sampling in Binding Free-Energy Calculations. J. Phys. Chem. B 2003, 107, 13711–13718. 10.1021/jp036162+.
- Baum B.; Mohamed M.; Zayed M.; Gerlach C.; Heine A.; Hangauer D.; Klebe G. More than a Simple Lipophilic Contact: A Detailed Thermodynamic Analysis of Nonbasic Residues in the s1 Pocket of Thrombin. J. Mol. Biol. 2009, 390, 56–69. 10.1016/j.jmb.2009.04.051.