Abstract
This study reports the results of binding free energy calculations for CB[8] host-guest systems in the SAMPL6 blind challenge (receipt ID 3z83m). Force-field parameters were developed specifically for each host and guest molecule to improve configurational sampling. We used quantum mechanical (QM) implicit solvent calculations and QM force matching to determine non-bonded (partial atomic charges) and bonded terms, respectively. Free energy calculations were carried out using the double-decoupling method (DDM) combined with the Hamiltonian replica exchange method (HREM) and the Bennett acceptance ratio (BAR) method. The root mean square error (RMSE) of the DDM predictions with respect to the experimental results was 4.32 kcal/mol. The coefficient of determination (R²) and Kendall rank coefficient (τ) were 0.49 and 0.52, respectively, the highest of all submissions. In addition, these results were compared to those obtained by umbrella sampling (US) combined with the weighted histogram analysis method (WHAM). Overall, DDM achieved a higher prediction accuracy than the US method. Results are discussed in terms of parameterization and free energy simulations.
Keywords: binding free energy, double-decoupling, Hamiltonian replica exchange, Bennett acceptance ratio, umbrella sampling, weighted histogram analysis
1. INTRODUCTION
Binding free energy calculations using molecular simulations provide a way to understand the binding affinity between proteins and ligands at the atomic level. The ability to predict binding affinities of ligands to proteins is crucial for building a comprehensive profile of protein interactions, not only with naturally occurring activators and inhibitors in cells but also with synthesized drug-like small molecules. For this reason, many computational methods for calculating binding free energy have been developed and are increasingly used in the fields of drug discovery, biochemical engineering, and nanotechnology [1,2]. This is especially important for early-stage drug discovery [1–3]. Accurate molecular modeling and virtual screening based on the prediction of protein-ligand binding free energies can accelerate hit-to-lead and lead optimization processes. Furthermore, the ability to predict binding free energy can be used to evaluate selectivity and off-target interactions of leads, enabling in silico testing of potential target toxicity, side effects, and resistance during drug development.
While many computational techniques have been developed [3–5], the search for optimal methods for molecular modeling, force field development, and free energy simulations remains challenging. Moreover, it is difficult to rank current state-of-the-art methods against one another, as most approaches are assessed on different systems.
The Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) blind challenge [6] provides a unique platform for validating computational methods for predicting protein-ligand binding free energies in an unbiased manner. Since SAMPL3 in 2011, the scope of the SAMPL challenge has expanded from calculating solvation free energies (SAMPL1) [6] and tautomeric states of drug-like small molecules (SAMPL2) [7] to predicting binding free energies of host-guest systems (SAMPL3–6) [8–11].
Host-guest systems are useful models for testing computational methods for predicting binding free energies of drug-like small molecules. Host molecules used in SAMPL challenges are smaller (~100 non-hydrogen atoms) and have fewer conformational degrees of freedom than proteins, which significantly reduces the complexity and cost of computations. Hence, it is beneficial in terms of circumventing the quasi-nonergodicity problem in protein-ligand binding affinity predictions [5,12], as the incomplete sampling due to the slow structural reorganizations can affect the simulation results to a significant degree [5,12–14].
Cucurbit[8]uril (CB[8]) [15,16] is a macrocyclic methylene-bridged glycoluril oligomer (CB[n] family) containing eight glycoluril units. The CB[n] family has attracted considerable research interest recently owing to its rich structural and chemical functionality. Validating computational methods for calculating binding free energies of CB[8] host-guest systems is important both for developing tools to accurately predict protein-ligand binding affinities and for supporting experimental studies on developing CB[8] as a drug delivery carrier, molecular switch, and catalyst [15,16].
CB[8] was previously used as the host molecule in the SAMPL3 challenge with two guest molecules, and it is treated far more extensively in the current SAMPL6 challenge, with fourteen guest molecules including FDA-approved drugs (Fig. 1). In the current challenge, eleven guests (G0-G10) are included in the main challenge and three guests (G11-G13) are offered as bonus cases because they may bind with a 1:2 or 1:3 stoichiometry between host and guest. A blinded dataset measured by isothermal titration calorimetry (ITC) was provided by the SAMPL6 organizers following the submission of calculated binding free energies [17].
Learning from previous experience with SAMPL host-guest binding affinity prediction challenges, we focus in this study on understanding the effects of two major sources of error: 1) force-field parameters and 2) free energy simulation methods. Using the classical molecular mechanics potential energy functions of Chemistry at HARvard Molecular Mechanics (CHARMM) [18], we generated parameters ‘specific’ to each host and guest molecule. Although this approach inherently negates transferability, the more accurate description of the conformational degrees of freedom is expected to improve sampling in bound and unbound states. To investigate the limitations of free energy methods, we compared results from two methods based on different sampling approaches, i.e., the double-decoupling method (DDM) [19,20] used with the Hamiltonian replica exchange method (HREM) [21] and Bennett acceptance ratio (BAR) [22] (method 1), and the umbrella sampling (US) method [23] combined with the weighted histogram analysis method (WHAM) [24,25] (method 2).
The paper is organized as follows. Section 2 describes the force field parameterization (Section 2.1), i.e., calculations of partial atomic charges (Section 2.1.1) and generation of bonded parameters (Section 2.1.2), the methods used for docking and molecular dynamics (MD) simulations (Section 2.2), and the free energy simulations (Section 2.3), namely DDM (Section 2.3.1) and the US method (Section 2.3.2). Section 3 presents the DDM results (Section 3.1) and compares them with those obtained using the US method (Section 3.2). Section 4 combines the Discussion and Conclusions.
2. METHODS
The workflow for calculating the binding free energy is shown in Fig. 2. We first generated force field parameters for the host and guest molecules and then initial binding poses for host-guest complexes were obtained via molecular docking. Systems consisting of a guest molecule or a host-guest complex were subjected to MD simulations. Free energy simulations were started from the last coordinates of the MD simulations. All simulations were performed with CHARMM (c41b1 and c41b2).
2.1. Parameterization
Force-field parameters were generated based on the classical molecular mechanics potential energy functions used in CHARMM [18,26–28]. Given R, the vector of the coordinates of the atoms, the non-bonded and bonded parts are evaluated by electrostatic and van der Waals (vdW) interactions (eq. 1) and bond, Urey-Bradley, angle, dihedral and improper dihedral terms (eq. 2), respectively.
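$$U_{\text{non-bonded}}(\mathbf{R}) = \sum_{i<j} \left\{ \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + \varepsilon_{ij}\left[\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{6}\right] \right\} \tag{1}$$

$$U_{\text{bonded}}(\mathbf{R}) = \sum_{\text{bonds}} K_b (b-b_0)^2 + \sum_{\text{UB}} K_{UB}(S-S_0)^2 + \sum_{\text{angles}} K_\theta(\theta-\theta_0)^2 + \sum_{\text{dihedrals}} K_\chi\left[1+\cos(n\chi-\delta)\right] + \sum_{\text{impropers}} K_{imp}(\varphi-\varphi_0)^2 \tag{2}$$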
The first term on the right-hand side of eq. 1 refers to the electrostatic interactions, given by Coulomb’s law, where qi and qj are the partial atomic charges of atoms i and j, rij is the distance between them, and ɛ0 is the vacuum permittivity (≈ 8.854 × 10⁻¹² F/m) [29]. In the second term, the vdW interaction is treated by the standard 6–12 Lennard-Jones (L-J) function, where Rminij and ɛij are the distance at the minimum of the L-J function and the depth of the minimum, respectively. For the host and guests (G0-G12), the L-J parameters provided by the CHARMM General Force Field (CGenFF) program [28] were used, while the parameters for G13 were adopted from a modeling study of the platinum atom (Pt(II)) [30]. The L-J parameters between pairs of different atoms are obtained via the Lorentz-Berthelot combination rules [31], in which ɛij is the geometric mean of ɛi and ɛj and Rminij is the arithmetic mean of Rmini and Rminj.
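The combination rules and the pairwise form of eq. 1 can be illustrated with a minimal sketch. The electrostatic conversion factor, the function names, and the example parameters are assumptions made for illustration (they are not taken from this study), and well depths are written as positive numbers.

```python
import math

# Assumption: 1/(4*pi*eps0) expressed in kcal*Å/(mol*e^2), the value commonly
# used with CHARMM-style force fields.
COULOMB = 332.0716


def combine_lj(eps_i, rmin_i, eps_j, rmin_j):
    """Lorentz-Berthelot rules: geometric-mean depth, arithmetic-mean Rmin."""
    return math.sqrt(eps_i * eps_j), 0.5 * (rmin_i + rmin_j)


def pair_energy(q_i, q_j, r_ij, eps_ij, rmin_ij):
    """Coulomb plus 6-12 Lennard-Jones energy of one atom pair (kcal/mol)."""
    coulomb = COULOMB * q_i * q_j / r_ij
    ratio6 = (rmin_ij / r_ij) ** 6
    return coulomb + eps_ij * (ratio6 * ratio6 - 2.0 * ratio6)


# Hypothetical parameters for two unlike, uncharged atoms.
eps_ij, rmin_ij = combine_lj(0.1, 3.0, 0.4, 4.0)
energy_at_minimum = pair_energy(0.0, 0.0, rmin_ij, eps_ij, rmin_ij)
print(eps_ij, rmin_ij, energy_at_minimum)
```

With the charges set to zero, the energy at r = Rminij equals the negative of the well depth, which is a quick check that the 6–12 form is written consistently with the definitions of Rminij and ɛij above.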
In eq. 2, Kb, KUB, Kθ, Kχ and Kimp are the bond, Urey-Bradley, angle, dihedral angle, and improper dihedral angle force constants, respectively; b, S, θ, χ and φ are the bond length, Urey-Bradley 1,3-distance, bond angle, dihedral angle and improper torsion angle, respectively, with the subscript zero representing the equilibrium values for the individual terms; n and δ are the multiplicity and phase of the dihedral term. More details of the potential energy function can be obtained from references 26–28.
2.1.1. Partial atomic charges
Partial atomic charges in eq. 1 were obtained via QM calculations. Equilibrium geometries for the host CB[8] and guest molecules G0-G12 were optimized using the B3LYP density functional theory [32,33] and Møller-Plesset second-order perturbation theory (MP2) [34], respectively, with Pople’s split-valence double-zeta plus polarization basis set (6–31G(d)) [35,36]. Subsequently, single-point calculations were carried out to determine electrostatic potential (ESP) charges based on MP2/6–31G(d) densities. ESP charges were redistributed so that equivalent atoms have the same partial charge using the restrained electrostatic potential (RESP) charge fitting procedure [37]. In the case of G13, triple-zeta quality correlation-consistent basis sets (cc-pVTZ) [38] with the respective relativistic pseudopotential for Pt (cc-pVTZ-PP) [39] were used. Geometry optimizations were performed using B3LYP, and ESP charges were obtained based on MP2 densities, using an atomic radius of 2.0 Å for Pt. All calculations were performed with the solvation model based on density (SMD) implicit solvent model [40]. Electronic structure calculations and charge fitting were performed using Gaussian16 [41] and Antechamber [42], respectively.
2.1.2. Bonded parameters
Bonded parameters in eq. 2 were determined by force matching to QM forces. To generate classical conformational ensembles of the host and guests, Langevin dynamics (LD) simulations were performed in the gas phase (i.e., no tapering of non-bonded contributions) with a time step of 1 fs, a temperature of 298.15 K, a collision frequency of 5 ps⁻¹, and a coordinate saving interval of 1 ps, preceded by 500 ps of equilibration. L-J parameters and initial bonded parameters were taken directly from the CGenFF program. A total of 10,000 snapshots (i.e., a 10 ns production run) were collected for each system (excluding G13), and B3LYP/6–31G(d) forces were calculated for each snapshot; these forces were then subjected to force matching using the ForceSolve program [43,44]. Partial charges and L-J parameters were fixed during force matching. For G13, classical ensemble generation proceeded in a similar manner to the systems of G0-G12, except that 20,000 snapshots were collected from 5 ns of LD simulations. Palladium (Pd(II)) was used as a substitute for the complex-coordinated Pt(II). The semi-empirical QM method MNDO(d) [45] was used to perform the LD simulations, and the force matching was performed using B3LYP/LANL2DZ forces [46–48].
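The idea behind force matching can be shown with a deliberately reduced example: a single harmonic bond whose force constant and equilibrium length are fitted to reference forces by linear least squares. This is only a sketch of the principle and not the algorithm implemented in ForceSolve; the function name and the synthetic "reference" data are hypothetical.

```python
def fit_harmonic_bond(lengths, forces):
    """Least-squares fit of F(b) = -2*K*(b - b0) to reference forces.

    The force is linear in b (slope = -2*K, intercept = 2*K*b0), so an
    ordinary straight-line fit recovers both parameters. Returns (K, b0).
    """
    n = len(lengths)
    mean_b = sum(lengths) / n
    mean_f = sum(forces) / n
    s_bb = sum((b - mean_b) ** 2 for b in lengths)
    s_bf = sum((b - mean_b) * (f - mean_f) for b, f in zip(lengths, forces))
    slope = s_bf / s_bb
    intercept = mean_f - slope * mean_b
    return -0.5 * slope, -intercept / slope


# Synthetic reference forces generated from hypothetical parameters,
# standing in for QM forces projected onto the bond.
K_REF, B0_REF = 300.0, 1.53
lengths = [1.45, 1.50, 1.53, 1.58, 1.62]
forces = [-2.0 * K_REF * (b - B0_REF) for b in lengths]
k_fit, b0_fit = fit_harmonic_bond(lengths, forces)
print(k_fit, b0_fit)
```

The energy convention K(b − b0)² (without a factor of 1/2) follows eq. 2; a real force-matching problem fits all bonded terms simultaneously against full Cartesian force vectors.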
2.2. Docking and molecular dynamics simulations
Initial binding poses for the host-guest complexes were obtained via molecular docking using the GalaxyDock-HG program, which was developed based on the GalaxyDock2 program [49,50]. For cases with a 1:2 or 1:3 stoichiometry between host and guest (G11-G13) and for cases lacking parameters (i.e., Pt(II) for G13), we instead resorted to manual docking, followed by short MD simulations to find the lowest-energy binding poses.
The host-guest complex and guest-only systems were then solvated using pre-equilibrated TIP3P water boxes (cubic, l ≈ 50 Å). We added Na+ and Cl− ions (three of each) to reach the experimental [Na+] of the 25 mM sodium phosphate buffer at pH 7.4 and 298.15 K, in addition to neutralizing Cl− ions for positively charged guest molecules. The systems were then subjected to MD simulations. They were minimized using the steepest descent (50,000 steps) and adopted basis Newton-Raphson (500,000 steps) methods, heated to 298.15 K over 100 ps, and then simulated for 50 ns in the constant number, pressure, and temperature (NPT) ensemble at 298.15 K and 1 atm using the Nosé-Hoover thermostat [51]. The integration time step was 1 fs, and coordinate sets were saved every 10 ps. Electrostatics were evaluated using particle-mesh Ewald (PME) with ca. one grid point per angstrom (Å), a sixth-order spline interpolation for the complementary error function, a κ value of 0.36, and a 12 Å real-space cutoff. The vdW term used a standard 6–12 L-J form, with force-switched truncation over the range 10–12 Å. The SHAKE constraint method [52] was applied to the covalent bonds between the hydrogen and oxygen atoms in TIP3P waters, with the default tolerance (1.0 × 10⁻¹⁰ Å).
2.3. Free energy simulations
We partitioned the free energy simulation scheme into two parts: 1) DDM [19,20] combined with HREM [21] and BAR [22] (method 1) and 2) US method [23] used with WHAM [24,25] (method 2). All the simulations were repeated three times using different randomly assigned initial velocities to estimate statistical error.
2.3.1. Double-decoupling method
The binding free energies for all the host-guest systems, including the three bonus cases, were calculated based on the DDM [19,20]. The thermodynamic analysis that underlies DDM is summarized in Fig. 3. The top pathway denotes the annihilation of the guest from the host-guest complex in solution and the bottom pathway is the annihilation of the guest from the solution, where the electrostatic and vdW interactions of the guest are turned off alchemically, gradually converting the guest into an ideal-gas molecule. The corresponding free energy changes are denoted by ΔGelec–off and ΔGvdw–off, respectively, with the superscripts ‘C’ and ‘G’ representing the host-guest complex and guest-only systems, respectively. The free energy changes of the two alchemical pathways combine to yield the “absolute free energy of binding” [19,20].
The free energy changes were computed as follows: Equilibrium simulations of states of interest and intermediate states were performed using the HREM [21] to enhance configurational sampling. The simulations were done under constant number, volume, and temperature (NVT) ensemble at 298.15 K. The integration time step was 1 fs, and coordinate sets were collected every 1 ps. The other simulation conditions were the same as the MD simulations (Section 2.2). Statistical analysis of energetic information collected from simulation samples was done using BAR method [22].
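For orientation, the exchange step in a Hamiltonian replica exchange scheme is usually decided with a Metropolis criterion on the cross-evaluated potential energies. The sketch below shows that standard criterion for two replicas at the same temperature; the function names are hypothetical, and whether the CHARMM implementation used here follows exactly this form is an assumption.

```python
import math
import random

KT = 0.0019872041 * 298.15  # kB*T in kcal/mol at 298.15 K


def swap_probability(u_i_xi, u_j_xj, u_i_xj, u_j_xi, kt=KT):
    """Metropolis acceptance probability for exchanging configurations.

    u_a_xb is the energy of configuration x_b evaluated with Hamiltonian a.
    """
    delta = (u_i_xj + u_j_xi) - (u_i_xi + u_j_xj)
    return 1.0 if delta <= 0.0 else math.exp(-delta / kt)


def attempt_swap(u_i_xi, u_j_xj, u_i_xj, u_j_xi, rng=random):
    """Return True if the exchange between replicas i and j is accepted."""
    return rng.random() < swap_probability(u_i_xi, u_j_xj, u_i_xj, u_j_xi)


# Hypothetical energies (kcal/mol): the swap raises the total energy by kT.
p_example = swap_probability(-10.0, -12.0, -10.0 + KT, -12.0)
print(p_example)
```

Exchanges that lower the summed energy are always accepted, while unfavorable exchanges are accepted with a Boltzmann-weighted probability, which is what allows configurations to migrate between neighboring intermediate states.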
For systems of host-guest complexes, a harmonic restraint in the form of U(r) = Kf(ri − rj)² was applied to the distance between the reference positions ri and rj of atom i (for the host) and atom j (for the guest). These atoms were selected to form the shortest distance between the host and guest. The free energy cost for applying the restraint (ΔGrest–on) was calculated by HREM simulations using 6 replicas (R1-R6). The force constant Kf was gradually increased over R1-R6 to reach its final value of 0.4–2.0 kcal/mol/rad². The energetic information was analyzed via the BAR method [22].
To calculate the contributions of the electrostatic (ΔGelec–off) and van der Waals (ΔGvdw–off) interactions for the complex, we performed HREM simulations using 60 replicas (R6-R65), followed by analysis with the BAR method. The electrostatic interactions were gradually scaled down over R6-R16 and then the vdW interactions were gradually turned off over R16-R65. The restraint between the host and guest remained the same as in R6. The simulation for each replica was performed for 1 ns, for a total of 60 ns. In cases of binding of two or three guests to the host cavity (i.e., G11-G13), only one guest molecule was decoupled in the presence of the other guests in the cavity to yield the second or third binding free energies. Representative binding configurations are shown in Fig. 4.
Free energy costs upon applying restraints between the host and guest were corrected for the standard state (i.e., restraint correction). The free energy for releasing the restraint was assumed to be ΔGrest–off = −kBT ln(V0/Veff) [19,20,53,54], where kB is the Boltzmann constant and T the temperature in kelvin. V0 is the estimated standard state volume for an ideal gas at 298.15 K (1649.76 Å³). Veff was estimated from the maximum (rmax) and minimum (rmin) distances between the reference atoms, which were taken as the upper and lower limits of the middle 95% of the distance distribution. The distribution was calculated using the CORREL module in CHARMM for the 1 ns trajectory of replica R65, where both electrostatic and vdW forces are completely turned off.
The free energy changes due to the electrostatic and vdW interactions of the guest-only systems were calculated with HREM/BAR simulations for 60 replicas (r1-r60) using the same procedure as for the host-guest system through replicas R6-R65. The electrostatic and vdW interactions were gradually scaled down over r1-r11 and r11-r60, respectively.
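A minimal sketch of the restraint correction is given below. It uses the common analytical form −kBT ln(V0/Veff) and, as an explicit assumption, models Veff as the spherical shell between rmin and rmax; the expression actually used for Veff in this work may differ. The function names and the example distances are hypothetical.

```python
import math
import statistics

KB = 0.0019872041   # Boltzmann constant in kcal/(mol*K)
V0 = 1649.76        # standard-state volume in Å^3, as quoted in the text


def middle_95_limits(distances):
    """Lower and upper limits enclosing the middle 95% of the distances."""
    cuts = statistics.quantiles(distances, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]


def shell_volume(r_min, r_max):
    """ASSUMPTION: effective volume as a spherical shell between r_min and r_max."""
    return 4.0 / 3.0 * math.pi * (r_max ** 3 - r_min ** 3)


def restraint_release_free_energy(v_eff, temperature=298.15, v0=V0):
    """Free energy (kcal/mol) of releasing the restraint into the standard volume."""
    return -KB * temperature * math.log(v0 / v_eff)


# Hypothetical host-guest reference distances (Å) from a decoupled-state trajectory.
distances = [0.5 + 0.01 * i for i in range(300)]
r_min, r_max = middle_95_limits(distances)
dg_example = restraint_release_free_energy(shell_volume(0.0, 3.0))
print(r_min, r_max, dg_example)
```

For an accessible radius of a few ångströms this form gives corrections on the order of −1 to −2 kcal/mol, the same magnitude as the ΔGrest–off values reported in Table 1.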
The work associated with pressure (ΔP) and volume (ΔV) differences between the alchemical pathways of the host-guest complex and guest-only systems (i.e., the ΔPΔV correction) was accounted for as follows. As the relationship between pressure and intermediate state was linear for both pathways, the pressure difference was estimated as the difference between the average pressure over R6-R65 and the average pressure over r1-r60. The volume difference was taken as the volume of each guest in the last coordinate set of the MD simulation of the guest-only system, calculated using the CORMAN module in CHARMM. The work associated with ΔPΔV, in units of atm·Å³, was converted to kcal/mol.
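The unit conversion in the last step follows directly from SI constants. The short sketch below derives the factor; the function name and the example pressure and volume are hypothetical and are not values from this study.

```python
ATM_TO_PA = 101325.0          # Pa per atm
A3_TO_M3 = 1.0e-30            # m^3 per Å^3
AVOGADRO = 6.02214076e23      # 1/mol
J_PER_KCAL = 4184.0           # J per kcal

# kcal/mol per (atm * Å^3)
ATM_A3_TO_KCAL_MOL = ATM_TO_PA * A3_TO_M3 * AVOGADRO / J_PER_KCAL


def pv_work_kcal_mol(delta_p_atm, delta_v_a3):
    """Convert a pressure difference (atm) times a volume (Å^3) to kcal/mol."""
    return delta_p_atm * delta_v_a3 * ATM_A3_TO_KCAL_MOL


# Hypothetical example: 100 atm pressure difference acting on 250 Å^3.
example_work = pv_work_kcal_mol(100.0, 250.0)
print(ATM_A3_TO_KCAL_MOL, example_work)
```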
The (absolute) binding free energy (ΔGbind ) was obtained by using the following thermodynamic cycle:
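$$\Delta G_{\text{bind}} = \left(\Delta G^{G}_{\text{elec-off}} + \Delta G^{G}_{\text{vdw-off}}\right) - \left(\Delta G_{\text{rest-on}} + \Delta G^{C}_{\text{elec-off}} + \Delta G^{C}_{\text{vdw-off}} + \Delta G_{\text{rest-off}}\right) + \Delta P\Delta V \tag{3}$$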
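As a worked check of how the terms combine, the sketch below sums the guest-only leg, subtracts the complex leg (including both restraint terms), and adds the ΔPΔV correction. This sign convention is inferred from the tabulated terms in Table 1, and the G0 row is used as the example; the function name is hypothetical.

```python
def ddm_binding_free_energy(rest_on, elec_c, vdw_c, rest_off, elec_g, vdw_g, pv):
    """Combine the DDM terms (all in kcal/mol) into the binding free energy."""
    complex_leg = rest_on + elec_c + vdw_c + rest_off
    guest_leg = elec_g + vdw_g
    return guest_leg - complex_leg + pv


# Mean values for G0 as listed in Table 1 (kcal/mol).
dg_g0 = ddm_binding_free_energy(
    rest_on=0.83, elec_c=-23.75, vdw_c=-10.02, rest_off=-1.66,
    elec_g=-26.56, vdw_g=-40.73, pv=24.63,
)
print(round(dg_g0, 2))
```

The result agrees with the tabulated ΔGbind of G0 to within rounding of the individual terms.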
2.3.2. Umbrella sampling method
Umbrella sampling (US) was used to calculate the binding free energies for G0-G11 (1:1 binding cases) [23,55–64]. The calculations were done using the same initial coordinates and force-field parameters as used in DDM. The thermodynamic cycle used in the US method is presented in Fig. 5.
A series of MD simulations were performed on each host-guest complex with a harmonic distance restraint between the centers of mass of the host and the guest molecules. The biasing potential in the i-th simulation window is $U_i(r) = K_i (r - r_i^{0})^2$, where Ki is the force constant, r is the reaction coordinate, and $r_i^{0}$ is the reference distance for the i-th sampling window. The full range of the reaction coordinate is covered using approximately 100 windows.
Starting from the initial binding pose obtained from molecular docking followed by MD simulations, the guest molecule was gradually moved along the reaction coordinate out of the cavity by incrementing the reference distance in steps of 0.2 Å until the guest was fully dissociated. In most cases, a 500 ps simulation was performed for each sampling window, starting from the last configuration of the previous sampling window. The first 250 ps of each run were treated as equilibration, which allowed the system to adjust to the current umbrella potential.
The biased probability distributions of the distance, P′(ri), collected in the sampling windows were unbiased using WHAM [24]. The WHAM program (version 2.0.9.1) implemented by Grossfield [25] was used to obtain the unbiased distribution P(ri) and the associated potential of mean force (PMF), indicated by ΔGus.
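The WHAM unbiasing step can be sketched in a few lines for a one-dimensional distance coordinate. The code below is an illustrative re-implementation of the standard histogram-based WHAM iteration, not the Grossfield program itself. It assumes biases of the form K(r − r0)², a force constant in kcal/mol/Å², and synthetic windows drawn for a flat underlying PMF so that the expected answer is known.

```python
import math
import random

KT = 0.0019872041 * 298.15  # kB*T in kcal/mol at 298.15 K


def wham_pmf(windows, centers, k_force, r_lo, r_hi, n_bins, max_iter=5000, tol=1e-7):
    """Return (bin midpoints, PMF in kcal/mol) from umbrella-sampling windows.

    windows[i] holds distances sampled under the bias k_force*(r - centers[i])**2.
    """
    width = (r_hi - r_lo) / n_bins
    mids = [r_lo + (b + 0.5) * width for b in range(n_bins)]
    counts = [0] * n_bins        # pooled histogram over all windows
    n_in = [0] * len(windows)    # samples of each window inside [r_lo, r_hi)
    for i, samples in enumerate(windows):
        for r in samples:
            b = math.floor((r - r_lo) / width)
            if 0 <= b < n_bins:
                counts[b] += 1
                n_in[i] += 1
    # Boltzmann factor of each bias potential at each bin midpoint
    boltz = [[math.exp(-k_force * (m - c) ** 2 / KT) for m in mids] for c in centers]
    z = [1.0] * len(windows)     # z[i] = exp(-f_i/kT), per-window normalization
    prob = [0.0] * n_bins
    for _ in range(max_iter):
        prob = [
            counts[b] / sum(n_in[i] * boltz[i][b] / z[i] for i in range(len(windows)))
            for b in range(n_bins)
        ]
        norm = sum(prob)
        prob = [p / norm for p in prob]
        z_new = [sum(p * w for p, w in zip(prob, row)) for row in boltz]
        change = max(abs(math.log(a / b)) for a, b in zip(z_new, z))
        z = z_new
        if change < tol:
            break
    shift = min(-KT * math.log(p) for p in prob if p > 0.0)
    pmf = [(-KT * math.log(p) - shift) if p > 0.0 else float("inf") for p in prob]
    return mids, pmf


# Synthetic data: for a flat PMF each biased window is Gaussian around its center.
random.seed(0)
K_FORCE = 5.0
centers = [0.5 * i for i in range(11)]            # 0.0, 0.5, ..., 5.0 Å
sigma = math.sqrt(KT / (2.0 * K_FORCE))
windows = [[random.gauss(c, sigma) for _ in range(2000)] for c in centers]
mids, pmf = wham_pmf(windows, centers, K_FORCE, r_lo=0.0, r_hi=5.0, n_bins=50)
interior = [g for m, g in zip(mids, pmf) if 0.5 <= m <= 4.5]
print(max(interior) - min(interior))
```

Because the synthetic windows were generated without any underlying free energy profile, the recovered PMF should be flat to within statistical noise; with real trajectories the same iteration yields the dissociation profile along the host-guest distance.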
The free energy cost for imposing the restraint (ΔGrest–on) was calculated using the thermodynamic integration (TI) method. The harmonic force constant Ki was gradually increased from 0 to 30 kcal/mol/rad² over 30 λ states. For each λ state, we performed an equilibration for 20 ps and a production run of 180 ps. The simulations were performed in the NVT ensemble using LD with a time step of 1 fs, a temperature of 298.15 K, a collision frequency of 1 ps⁻¹, and a coordinate saving frequency of 1 ps⁻¹. The restraint correction was done in a similar manner to the DDM to yield the free energy for releasing the restraint (ΔGrest–off).
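Thermodynamic integration of a restraint that is switched on linearly can be sketched as below. The sketch assumes U(λ) = λ·Kmax·(r − r0)², so that ∂U/∂λ = Kmax·(r − r0)², and integrates the per-state averages with the trapezoidal rule. The quadrature rule, the function names, and the example numbers are assumptions for illustration.

```python
def mean_du_dlambda(distances, r_ref, k_max):
    """Average of dU/dlambda = k_max*(r - r_ref)**2 over one lambda state."""
    return sum(k_max * (r - r_ref) ** 2 for r in distances) / len(distances)


def integrate_ti(lambdas, averages):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda."""
    total = 0.0
    for k in range(1, len(lambdas)):
        total += 0.5 * (averages[k] + averages[k - 1]) * (lambdas[k] - lambdas[k - 1])
    return total


# Hypothetical check with a linear integrand, for which the rule is exact.
lambdas = [0.0, 0.5, 1.0]
averages = [0.0, 1.0, 2.0]
dg_linear = integrate_ti(lambdas, averages)
sample_mean = mean_du_dlambda([4.9, 5.0, 5.1], r_ref=5.0, k_max=30.0)
print(dg_linear, sample_mean)
```

In practice the integrand changes most rapidly near λ = 0, where the guest is nearly unrestrained, so the spacing of λ states in that region largely determines the quadrature error.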
The (absolute) binding free energy (ΔGbind ) was obtained via the thermodynamic cycle:
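$$\Delta G_{\text{bind}} = -\left(\Delta G_{\text{rest-on}} + \Delta G_{\text{us}} + \Delta G_{\text{rest-off}}\right) \tag{4}$$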
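The corresponding bookkeeping for the US route is a single sum with a sign change, since the simulated process is dissociation. The sign convention is inferred from the tabulated terms in Table 3, and the G0 row is used as the example; the function name is hypothetical.

```python
def us_binding_free_energy(rest_on, dg_us, rest_off):
    """Binding free energy (kcal/mol) as the negative of the dissociation cycle."""
    return -(rest_on + dg_us + rest_off)


# Mean values for G0 as listed in Table 3 (kcal/mol).
dg_us_g0 = us_binding_free_energy(rest_on=2.41, dg_us=18.56, rest_off=-0.31)
print(round(dg_us_g0, 2))
```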
3. RESULTS
3.1. Prediction accuracy of computed values using the double-decoupling method
Table 1 shows computed binding free energy (ΔGbind ) using DDM and the values of the thermodynamic terms. We assumed a 1:1 stoichiometry between the host and guest for eleven guest molecules included in the main challenge (G0-G10), whereas both single and multiple binding modes were considered for three bonus cases (G11-G13).
Table 1.
Guest | ΔGrest–on | ΔGelec–off (C) | ΔGvdw–off (C) | ΔGrest–off | ΔGelec–off (G) | ΔGvdw–off (G) | ΔPΔV | ΔGbind | ΔGexpt |
---|---|---|---|---|---|---|---|---|---|
G0 | 0.83 ± 0.06 | −23.75 ± 0.03 | −10.02 ± 0.95 | −1.66 ± 0.02 | −26.56 ± 0.03 | −40.73 ± 0.05 | 24.63 | −8.05 ± 0.88 | −6.69 ± 0.05 |
G1 | 0.10 ± 0.04 | 24.24 ± 0.15 | −47.94 ± 1.31 | −1.83 ± 0.03 | 19.43 ± 0.20 | −67.54 ± 0.03 | 21.25 | −1.43 ± 1.51 | −7.65 ± 0.04 |
G2 | 0.84 ± 0.05 | −2.96 ± 0.06 | −19.81 ± 0.20 | −1.50 ± 0.01 | −6.67 ± 0.01 | −48.80 ± 0.07 | 21.79 | −10.25 ± 0.20 | −7.66 ± 0.05 |
G3 | 0.37 ± 0.06 | −18.83 ± 0.04 | −44.21 ± 0.99 | −1.49 ± 0.01 | −22.34 ± 0.05 | −76.03 ± 0.16 | 23.39 | −10.82 ± 0.85 | −6.45 ± 0.06 |
G4 | 2.01 ± 0.13 | −56.13 ± 0.19 | −38.10 ± 0.11 | −1.46 ± 0.00 | −60.58 ± 0.08 | −78.37 ± 0.56 | 39.85 | −5.42 ± 0.51 | −7.80 ± 0.04 |
G5 | 1.03 ± 0.08 | 26.31 ± 0.05 | 4.98 ± 0.11 | −0.43 ± 0.01 | 25.59 ± 0.08 | −22.59 ± 0.08 | 12.60 | −16.29 ± 0.34 | −8.18 ± 0.05 |
G6 | 2.10 ± 0.05 | 46.88 ± 0.19 | 8.86 ± 0.08 | −1.52 ± 0.02 | 47.25 ± 0.06 | −11.84 ± 0.01 | 9.18 | −11.73 ± 0.23 | −8.34 ± 0.05 |
G7 | 1.11 ± 0.02 | 45.69 ± 0.07 | 10.38 ± 0.07 | −0.82 ± 0.01 | 46.81 ± 0.03 | −13.50 ± 0.08 | 9.71 | −13.34 ± 0.14 | −10.00 ± 0.10 |
G8 | 1.16 ± 0.40 | 36.29 ± 0.07 | 8.21 ± 0.33 | −1.97 ± 0.02 | 32.23 ± 0.03 | −22.74 ± 0.01 | 14.54 | −19.67 ± 0.30 | −13.50 ± 0.04 |
G9 | 1.50 ± 0.05 | 101.92 ± 0.18 | 11.74 ± 0.12 | −1.37 ± 0.00 | 100.09 ± 0.04 | −9.79 ± 0.05 | 9.63 | −13.87 ± 0.13 | −8.68 ± 0.08 |
G10 | 1.83 ± 0.07 | 122.60 ± 0.04 | 15.10 ± 0.30 | −1.44 ± 0.00 | 126.17 ± 0.03 | −11.27 ± 0.13 | 11.43 | −11.76 ± 0.20 | −8.22 ± 0.07 |
G11 | 3.10 ± 0.06 | 104.23 ± 0.16 | 9.23 ± 0.06 | −1.48 ± 0.01 | 106.86 ± 0.06 | −9.74 ± 0.08 | 10.26 | −7.70 ± 0.21 | −7.77 ± 0.05 |
G11* | 2.17 ± 1.04 | 94.91 ± 0.59 | −4.18 ± 0.61 | −1.60 ± 0.02 | 106.86 ± 0.06 | −9.74 ± 0.08 | 9.71 | 15.54 ± 2.21 | N/A |
G12 | 8.22 ± 1.13 | −23.63 ± 0.08 | −29.05 ± 0.66 | −1.50 ± 0.00 | −27.94 ± 0.03 | −58.74 ± 1.60 | 26.83 | −13.90 ± 2.40 | −9.86 ± 0.03 |
G13 | 0.77 ± 0.08 | 226.92 ± 0.16 | 6.04 ± 0.11 | −1.87 ± 0.02 | 225.54 ± 0.04 | −12.55 ± 0.15 | 14.09 | −4.79 ± 0.36 | −7.11 ± 0.03 |
G13* | 2.32 ± 0.24 | 232.26 ± 0.31 | −1.68 ± 0.14 | −1.77 ± 0.00 | 225.54 ± 0.04 | −12.55 ± 0.15 | 14.00 | −4.11 ± 0.27 | N/A |
All values are in kcal/mol. ΔGbind and ΔGexpt are the computed and experimental binding free energies, respectively. ΔGrest–on and ΔGrest–off are the free energy changes due to applying and releasing the restraint, and ΔGelec–off and ΔGvdw–off denote the contributions of electrostatic and vdW interactions, respectively. Superscripts ‘C’ and ‘G’ denote the host-guest complex and guest-only systems, respectively. ΔPΔV denotes the work associated with differences in pressure and volume between the alchemical pathways of the host-guest complex and guest-only systems. All standard errors for ΔPΔV are less than 10⁻² kcal/mol. Means and standard errors are obtained from three independent simulations using different randomly assigned initial velocities.
* Represents values for the second guest molecule.
The mean absolute error (MAE) and root mean square error (RMSE) with respect to experiment were 3.82 and 4.32 kcal/mol, respectively (see Table 2 and Fig. 6). The slope of the linear regression line (indicated by s) was 1.92. The Pearson correlation coefficient (r) [65] was 0.70 with a p-value of 0.0042 (two-tailed test; n=14, α=0.05), indicating that the linear correlation between the experimental and computed values is statistically significant at the 0.05 level of significance. The corresponding coefficient of determination (R²) was 0.49. Although the correlation between DDM predictions and experiment was the highest among the SAMPL6 submissions, the linear correlation (r) is generally considered moderate and the accuracy of the linear regression model (measured by R²) is not satisfactory: 51% of the variance in the experimental values still cannot be predicted from the computed values.
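These error and correlation measures can be recomputed from the ΔGbind and ΔGexpt columns of Table 1. The sketch below uses the first binding free energy for G11 and G12 and the averaged value for G13, and regresses the computed values on the experimental ones; the variable and function names are hypothetical.

```python
import math

# kcal/mol, G0 ... G13 (G13: average of first and second binding free energy)
calc = [-8.05, -1.43, -10.25, -10.82, -5.42, -16.29, -11.73,
        -13.34, -19.67, -13.87, -11.76, -7.70, -13.90, -4.45]
expt = [-6.69, -7.65, -7.66, -6.45, -7.80, -8.18, -8.34,
        -10.00, -13.50, -8.68, -8.22, -7.77, -9.86, -7.11]


def mae(x, y):
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def rmse(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def regression(x, y):
    """Slope of y on x and the Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


slope, r = regression(expt, calc)
print(round(mae(calc, expt), 2), round(rmse(calc, expt), 2),
      round(slope, 2), round(r, 2), round(r * r, 2))
```

A slope well above one indicates that the computed values span a wider range than the experimental ones, which contributes to the RMSE even when the ranking is largely preserved.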
Table 2.
Method | RMSE | MAE | s | r | R2 | τ | ρ |
---|---|---|---|---|---|---|---|
DDM* | 4.32 | 3.82 | 1.92 | 0.70a | 0.49 | 0.52b | 0.75c |
DDM** | 4.45 | 3.89 | 1.77 | 0.67d | 0.45 | 0.48e | 0.73f |
US** | 5.94 | 4.99 | 0.82 | 0.41g | 0.17 | 0.15h | 0.24i |
* Results including all guest molecules (G0-G13).
** Results of guest molecules G0-G11.
RMSE and MAE are the root mean square error and mean absolute error (in kcal/mol) between computed and experimental values, respectively.
s denotes the slope of the linear regression line.
r, τ, and ρ are the Pearson, Kendall, and Spearman correlation coefficients, respectively.
R2 is the coefficient of determination.
a p-value of 0.0042 with a confidence interval of [0.27, 0.90] (two-tailed test; n=14, α=0.05 with Fisher z-transformation)
b p-value of 0.010 with a confidence interval of [0.12, 0.91] (two-tailed test; n=14, α=0.05)
c p-value of 0.0020 (two-tailed test; n=14, α=0.05)
d p-value of 0.015 with a confidence interval of [0.15, 0.90] (two-tailed test; n=12, α=0.05 with Fisher z-transformation)
e p-value of 0.028 with a confidence interval of [0.05, 0.92] (two-tailed test; n=12, α=0.05)
f p-value of 0.0074 (two-tailed test; n=12, α=0.05)
g p-value of 0.19 with a confidence interval of [−0.21, 0.80] (two-tailed test; n=12, α=0.05 with Fisher z-transformation)
h p-value of 0.49 with a confidence interval of [−0.28, 0.58] (two-tailed test; n=12, α=0.05)
i p-value of 0.46 (two-tailed test; n=12, α=0.05)
Note that the computed first binding free energy value of G11 (−7.70 ± 0.21 kcal/mol) was used for the analysis.
In case of G12, the computed value was compared with the first binding free energy of experiment (−9.86 ± 0.03 kcal/mol).
For G13, the average of the first and second binding free energies (−4.45 ± 0.19 kcal/mol) was compared to the experimental value (−7.11 ± 0.03 kcal/mol).
In addition, the similarity between the ranks of the experimental and computed values was evaluated (Table 2). The Kendall rank correlation coefficient (τ) [66] was 0.52. The p-value of the two-tailed test at n=14 and α=0.05 was 0.010, indicating a statistically significant rank-order correlation between the two sets at the 95% confidence level. The Spearman correlation coefficient (ρ) (i.e., Pearson’s r on the ranks of the data) was 0.75 (p-value = 0.0020; two-tailed test at n=14 and α=0.05). Altogether, this indicates that the rank ordering is robust, as the probability of observing rank correlations at least this strong by chance alone is below 5%.
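Both rank statistics can be recomputed from the ΔGbind and ΔGexpt columns of Table 1 with a short sketch. The implementation below assumes there are no tied values (true for this data set), uses the first binding free energy for G11 and G12 and the averaged value for G13, and uses hypothetical function names.

```python
# kcal/mol, G0 ... G13 (G13: average of first and second binding free energy)
calc = [-8.05, -1.43, -10.25, -10.82, -5.42, -16.29, -11.73,
        -13.34, -19.67, -13.87, -11.76, -7.70, -13.90, -4.45]
expt = [-6.69, -7.65, -7.66, -6.45, -7.80, -8.18, -8.34,
        -10.00, -13.50, -8.68, -8.22, -7.77, -9.86, -7.11]


def kendall_tau(x, y):
    """Kendall tau for untied data: (concordant - discordant) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += 1 if product > 0 else -1
    return score / (n * (n - 1) / 2)


def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rho for untied data via the squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


tau = kendall_tau(expt, calc)
rho = spearman_rho(expt, calc)
print(round(tau, 2), round(rho, 2))
```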
It is worth noting that our submission for the bonus cases (G11-G13) was the top result in terms of error with respect to experimental results. The MAE and RMSE of our predictions with respect to experiment were 2.26 and 2.79 kcal/mol, respectively. The absolute errors for G11, G12, and G13 were 0.07, 4.04, and 2.66 kcal/mol, respectively, while the MAE reported for each molecule over all six participants was 6.44 (for G11), 18.84 (G12), and 11.67 kcal/mol (G13).
For G11, we tested host:guest binding ratios of 1:1 and 1:2. The ΔGbind values of the first and second guests were −7.70 ± 0.21 (mean and standard error) and 15.54 ± 2.21 kcal/mol, respectively, indicating that binding of the second guest is energetically unfavorable. Therefore, we submitted only the ΔGbind obtained for the 1:1 binding mode, excluding the second ΔGbind. This choice is confirmed by the ITC experiment provided by the organizers [17], and the submitted value was very close to the experimental result (−7.77 ± 0.05 kcal/mol). We investigated 1:1 – 1:3 binding modes for G12, given that in the ITC experiment the final concentration of guest was approximately three times greater than the host concentration (i.e., 0.0729 mM (for CB[8]) and 0.271 mM (G12)) [17]. However, it became clear that having two or three guests in the host cavity was energetically unfavorable. Instead, we calculated the binding free energy of a guest bound to the cavity of the host in the presence of two other molecules interacting with the portal regions of the host, based on observations from MD trajectories (Fig. 4). In the case of G13, both the first and second binding free energies were energetically favorable (i.e., −4.79 ± 0.36 and −4.11 ± 0.27 kcal/mol, respectively). Therefore, the average of the first and second binding free energies (−4.45 ± 0.19 kcal/mol) was taken as the final binding free energy.
3.2. Comparison between the double decoupling and umbrella sampling method
We conducted a follow-up study on the 1:1 binding cases (i.e., G0-G11) after the SAMPL6 submission. To investigate the limitations of free energy simulation methods, we used a different strategy (the US method), which considers the PMF as a function of the physical separation of the host and guest [23,55–64]. The (absolute) binding free energies (ΔGbind) determined using the US method are summarized in Table 3. The performance of the DDM and US methods with respect to experiment is compared for twelve host-guest systems (Table 2 and Fig. 7). The correlations between the two methods are presented in Table 4.
Table 3.
Guest | ΔGrest–on | ΔGus | ΔGrest–off | ΔGbind | ΔGexpt |
---|---|---|---|---|---|
G0 | 2.41 ± 0.19 | 18.56 ± 0.90 | −0.31 ± 0.14 | −20.67 ± 0.93 | −6.69 ± 0.05 |
G1 | 2.42 ± 0.06 | 5.97 ± 0.64 | −0.41 ± 0.18 | −7.98 ± 0.85 | −7.65 ± 0.04 |
G2 | 1.99 ± 0.04 | 13.53 ± 1.04 | −0.84 ± 0.02 | −14.68 ± 1.05 | −7.66 ± 0.05 |
G3 | 4.01 ± 0.21 | 7.74 ± 0.15 | −0.52 ± 0.02 | −11.23 ± 0.19 | −6.45 ± 0.06 |
G4 | 2.08 ± 0.05 | 9.86 ± 0.82 | −0.39 ± 0.06 | −11.55 ± 0.87 | −7.80 ± 0.04 |
G5 | 1.69 ± 0.01 | 13.54 ± 0.95 | −1.34 ± 0.12 | −13.89 ± 0.83 | −8.18 ± 0.05 |
G6 | 1.87 ± 0.02 | 10.87 ± 0.56 | −1.25 ± 0.17 | −11.49 ± 0.74 | −8.34 ± 0.05 |
G7 | 1.80 ± 0.03 | 12.22 ± 1.07 | −1.26 ± 0.06 | −12.77 ± 1.00 | −10.00 ± 0.10 |
G8 | 1.60 ± 0.01 | 19.36 ± 0.93 | −0.82 ± 0.14 | −20.14 ± 0.78 | −13.50 ± 0.04 |
G9 | 1.65 ± 0.01 | 12.66 ± 1.04 | −1.14 ± 0.04 | −13.18 ± 1.02 | −8.68 ± 0.08 |
G10 | 2.63 ± 0.03 | 9.74 ± 0.10 | −1.10 ± 0.11 | −11.26 ± 0.10 | −8.22 ± 0.07 |
G11 | 4.07 ± 0.04 | 8.46 ± 0.45 | −0.49 ± 0.27 | −12.04 ± 0.39 | −7.77 ± 0.05 |
All values are in kcal/mol. ΔGbind and ΔGexpt are the computed and experimental binding free energies, respectively. ΔGrest–on and ΔGrest–off are the free energy changes due to applying and releasing the harmonic restraint, respectively. ΔGus denotes the free energy cost to pull the guest molecule from the binding pocket to outside the host molecule. Means and standard errors are obtained from three independent simulations using different randomly assigned initial velocities.
Table 4.
s denotes the slope of the linear regression line.
r, τ, and ρ are the Pearson, Kendall, and Spearman correlation coefficients, respectively.
R2 is the coefficient of determination.
p-value of 0.09 with a confidence interval of [−0.09, 0.84] (two-tailed test; n=12, α=0.05 with Fisher z-transformation)
p-value of 0.10 with a confidence interval of [−0.07, 0.80] (two-tailed test; n=12, α=0.05)
p-value of 0.19 (two-tailed test; n=12, α=0.05)
Note that correlations were obtained from the results of G0-G11.
The first binding free energy value of G11 was used for the analysis.
DDM achieved higher accuracy than the US method in terms of error with respect to experiment. The MAE and RMSE were 3.89 and 4.45 kcal/mol for DDM and 4.99 and 5.94 kcal/mol for US, respectively (see Table 2). In addition, correlations with experiment were higher for DDM than for the US method, with the exception of the slope of the linear regression (s = 1.77 for DDM and 0.82 for US). While the binding free energies (ΔGbind) computed using DDM showed a statistically significant linear relationship with experiment, the values from the US method did not. For DDM, Pearson’s r was 0.67 with a p-value of 0.015 (two-tailed test; n=12, α=0.05), indicating a statistically significant linear relationship between the experimental and computed values at the 0.05 significance level. The corresponding coefficient of determination (R2) was 0.45. For the US method, r was 0.41 (R2 = 0.17) with a p-value of 0.19 (n=12, α=0.05).
Moreover, the agreement between the rank orders of the experimental and computed values was higher for DDM than for the US method. For DDM, Kendall’s τ was 0.48, with a p-value of 0.028 (two-tailed test; n=12, α=0.05), indicating a statistically significant rank-order correlation between the two sets at the 0.05 significance level. Spearman’s ρ was 0.73 (p-value = 0.0074; two-tailed test; n=12, α=0.05). For the US method, τ and ρ were 0.15 and 0.24, respectively, far below the critical values at the 0.05 significance level.
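As a rough illustration of how the error and correlation metrics for the US method can be recomputed, the sketch below takes the ΔGbind and ΔGexpt columns from the table above and evaluates the MAE, RMSE, Pearson’s r, and the regression slope. It assumes that the slope regresses the computed values on the experimental ones; the function names are illustrative only and are not taken from the authors’ analysis scripts. Under these assumptions the results come out close to the values quoted in the text (MAE ≈ 4.99, RMSE ≈ 5.94, r ≈ 0.41, s ≈ 0.82).

```python
import math

# (guest, computed ΔGbind, experimental ΔG) in kcal/mol, copied from the table above
rows = [
    ("G0", -20.67, -6.69), ("G1", -7.98, -7.65), ("G2", -14.68, -7.66),
    ("G3", -11.23, -6.45), ("G4", -11.55, -7.80), ("G5", -13.89, -8.18),
    ("G6", -11.49, -8.34), ("G7", -12.77, -10.00), ("G8", -20.14, -13.50),
    ("G9", -13.18, -8.68), ("G10", -11.26, -8.22), ("G11", -12.04, -7.77),
]
calc = [c for _, c, _ in rows]
expt = [e for _, _, e in rows]


def error_metrics(computed, reference):
    """Return (MAE, RMSE) of computed values with respect to reference values."""
    diffs = [c - r for c, r in zip(computed, reference)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


def pearson_and_slope(x, y):
    """Return (Pearson r, least-squares slope of y regressed on x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


mae, rmse = error_metrics(calc, expt)
r, slope = pearson_and_slope(expt, calc)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} r={r:.3f} R2={r * r:.3f} slope={slope:.3f}")
```

The same two helpers could be applied to any other pair of computed and reference columns, for example when comparing two methods with each other.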
The correlation between the two methods was moderate. The RMSE between the DDM and US results was 4.88 kcal/mol, and the slope of the linear regression line was 0.68 (Table 4). Pearson’s r was 0.51 with a p-value of 0.09 (two-tailed test; n=12, α=0.05). The rank correlation coefficients τ and ρ were 0.36 and 0.41, with p-values of 0.10 and 0.19 (two-tailed test; n=12, α=0.05), respectively.
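The footnotes to Table 4 mention a Fisher z-transformation for the interval on Pearson’s r. A minimal sketch of that standard construction is given below; the function name `fisher_ci` is hypothetical, and a normal approximation with standard error 1/√(n − 3) is assumed. For r = 0.51 and n = 12 it gives an interval that matches the reported [−0.09, 0.84] to two decimals.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, alpha=0.05):
    """Two-sided confidence interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)                      # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


lo, hi = fisher_ci(0.51, 12)
print(f"{lo:.2f} {hi:.2f}")
```

Because the interval contains zero, it is consistent with the p-value above 0.05 reported for the DDM versus US comparison.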
4. DISCUSSION AND CONCLUSIONS
In an attempt to improve force-field accuracy, we generated tailored force-field parameters ‘specific’ to each host and guest molecule. That is, even when different molecules share the same atom types, each parameter differs slightly according to the QM calculations for each molecule. Although this approach inherently sacrifices transferability, the gains in configurational degrees of freedom are expected to improve sampling in the bound and unbound states. However, because they are derived within a classical molecular mechanics model, the parameters do not account for polarizability.
Partial charges were obtained from QM calculations. The structure of CB[8] was optimized with density functional theory (B3LYP/6-31G(d)), whereas the geometry-optimized structures of the guests (G0–G12) were obtained using MP2/6-31G(d). Merz-Kollman ESP charges were obtained for each optimized structure and processed through the antechamber RESP facility to symmetrize the charge distribution among topologically equivalent atoms. However, partial charges for the bonus guest molecule G13 needed special consideration because the guest contains a platinum (Pt(II)) atom, which required the use of effective core potentials. For G13, triple-zeta quality correlation consistent basis sets (cc-pVTZ) with the corresponding relativistic pseudopotential for Pt (cc-pVTZ-PP) [39] were used, based on an effective pseudopotential-based composite method, rp-ccCA [67]. Geometry optimizations were performed using B3LYP, and ESP charges were obtained from MP2 densities using an atomic radius of 2.0 Å for Pt. All QM calculations were carried out using the SMD implicit solvent (water) model [40,68] to include approximate solvent effects in the partial charge calculations, in the hope of providing a reasonable description of the solute in solution. As a continuum solvation method, the SMD model differs from other implicit models in that it uses the solute’s charge density rather than partial atomic charges, allowing the method to be used with wavefunction theory and density functional theory methods [40]. The parameterization of the bonded terms for G13 proved more difficult, as classical parameters for Pt(II) complexes are not readily available. As a workaround, palladium (Pd(II)) was used as a substitute for the coordinated Pt(II), on the assumption that the differences in internal degrees of freedom upon exchanging Pd(II) for Pt(II) are minimal and that the two are reasonably close in vibrational motion.
Although the tailored parameterization approach would ideally improve sampling and thereby allow more accurate prediction of the binding free energy, we encountered unexpected technical problems in the parameterization process caused by software issues. First, improper dihedral terms assigned by force matching were not used in the free energy simulations. The ForceSolve program creates additional improper dihedrals based on bond connectivity, beyond those existing in the CHARMM force field, and these additional parameters were not read by the CHARMM program. For instance, in the CB[8] host, the original CHARMM topology assigns a single improper dihedral centered on the carbonyl carbon to ensure planarity of the associated oxygen and the two nitrogen atoms attached to the carbon. However, the force-matching procedure generates all possible combinations of atoms with a bond connectivity of three, yielding an additional improper dihedral on the nitrogen atoms between the CB[8] carbonyls, for a total of two improper dihedral parameters. This new nitrogen-centered improper dihedral was not used because it was not in the CHARMM topology file used to generate the initial systems. However, this manifested solely in the aromatic substituents of the guest systems and thus caused some minor excess flexibility in planar rings. The rings maintained reasonable structural features, and the omission did not significantly affect either the binding poses or the non-bonded interactions between the host and guest. These issues were corrected after the SAMPL6 challenge. Second, we found a dihedral-overwriting parser problem. For singly defined dihedrals (i.e., a dihedral with one associated multiplicity), the overwriting occurred as expected. However, dihedral terms with several multiplicities ended up as a combination of CHARMM and force-matched parameters. For example, after reading in a CHARMM parameter set and then appending the force-matched parameter set, a dihedral with multiplicities n = 2, 3, and 6 would take only the n = 6 parameter from the force-matched set. This resulted in a dihedral with CHARMM parameters for n = 2 and 3 and a force-matched parameter for n = 6. This is a bug associated with CHARMM, and the exact cause is still uncertain. Last, parameters of the same type in the host and guest overlapped when the parameters were read. Further discussion of the parameterization issues can be found in the work of Hudson et al., also in this issue [69].
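One way to catch this kind of mixed parameter set before running simulations is to group the dihedral terms that are actually in use by their atom types and flag any dihedral whose multiplicity terms come from more than one source. The sketch below is a hypothetical diagnostic, not part of CHARMM or ForceSolve; the atom-type names, the `source` labels, and the function names are all invented for illustration, and it does not attempt to reproduce the cause of the bug, which remains uncertain.

```python
from collections import defaultdict


def canonical(atom_types):
    """Treat A-B-C-D and D-C-B-A as the same dihedral."""
    forward = tuple(atom_types)
    return min(forward, forward[::-1])


def find_mixed_dihedrals(terms):
    """Return dihedrals whose multiplicity terms originate from different sources.

    `terms` is an iterable of (atom_types, multiplicity, source) describing the
    parameters that ended up active after all parameter files were read.
    """
    by_dihedral = defaultdict(dict)
    for atom_types, multiplicity, source in terms:
        by_dihedral[canonical(atom_types)][multiplicity] = source
    return {
        key: by_n
        for key, by_n in by_dihedral.items()
        if len(set(by_n.values())) > 1
    }


# Hypothetical active parameters mimicking the n = 2, 3, 6 case described above
active_terms = [
    (("CA", "CB", "NC", "CD"), 2, "charmm"),
    (("CA", "CB", "NC", "CD"), 3, "charmm"),
    (("CD", "NC", "CB", "CA"), 6, "force_matched"),  # same dihedral, reversed order
    (("HA", "CA", "CB", "HB"), 3, "force_matched"),
]

mixed = find_mixed_dihedrals(active_terms)
for key, by_n in mixed.items():
    print("-".join(key), by_n)
```

A non-empty result would indicate that at least one dihedral combines terms from both parameter sets and should be inspected by hand.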
Free energy simulations using two well-known approaches, the alchemical DDM and the US method, were considered for side-by-side comparison. The methods are based on different perturbation techniques. In DDM, the complex of host and guest is gradually separated via an alchemical transformation, while in the US method, the complex is separated along a physically realizable path (reaction coordinate) such as the distance between host and guest. In addition, these methods are combined with different sampling and analysis approaches. For DDM, equilibrium simulations of the states of interest and the intermediate states were performed using HREM to enhance configurational sampling, and statistical analysis of the energetic information collected from sampling was performed using the BAR method. In contrast, non-Boltzmann sampling was used for the US method. A non-Boltzmann weighting function was added to each window to acquire adequate statistics about low-energy configurations. The biased probability distributions accumulated in these sampling windows were unbiased using WHAM to yield the PMF.
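To make the BAR step more concrete, the following is a minimal sketch of the Bennett acceptance ratio estimator for a single pair of neighboring states, solved by bisection. It works in reduced units (energies divided by kBT) and is tested on synthetic Gaussian energy differences rather than on simulation output; it is not the CHARMM-based workflow used in this study, and all names in it are illustrative.

```python
import math
import random


def fermi(x):
    """Numerically safe evaluation of 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-8):
    """BAR estimate of the free energy difference F1 - F0 in units of kBT.

    w_forward: U1 - U0 evaluated on samples drawn from state 0
    w_reverse: U0 - U1 evaluated on samples drawn from state 1
    """
    m = math.log(len(w_forward) / len(w_reverse))

    def residual(df):
        # Monotonically increasing in df; its root is the BAR estimate.
        return (sum(fermi(m + w - df) for w in w_forward)
                - sum(fermi(-m + w + df) for w in w_reverse))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Synthetic test data: Gaussian energy differences consistent with a known answer
random.seed(0)
df_true, sigma, n = 2.0, 1.0, 4000
w_f = [random.gauss(df_true + 0.5 * sigma ** 2, sigma) for _ in range(n)]
w_r = [random.gauss(-df_true + 0.5 * sigma ** 2, sigma) for _ in range(n)]

df_est = bar(w_f, w_r)
print(f"true = {df_true:.2f} kT, BAR estimate = {df_est:.2f} kT")
```

In a multi-state calculation such as one with HREM replicas, an estimator of this kind would be applied to each pair of adjacent states and the resulting differences summed along the alchemical path.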
In both cases, starting with plausible binding poses is essential for accurate predictions, given that both DDM and the US method aim to calculate “absolute” binding free energies, which are the free energy differences between two thermodynamic states: the ligand restricted to the host (bound state) and the unrestricted ligand, free to explore all possible conformations (unbound state). From a sampling perspective, energetically unfavorable initial binding poses lead to insufficient sampling of the low-energy bound states. Therefore, we conducted molecular docking to obtain low-energy initial binding poses (in the context of the force-field) for the extensive free energy simulations (i.e., DDM and the US method). Docking algorithms are designed to estimate binding affinities between proteins and ligands using phenomenological scoring functions based on (nearly) rigid representations of the molecules. Although comparative studies (including the SAMPL challenges) consistently show that binding free energies obtained from docking simulations correlate poorly with experiment [8–10,70,71], molecular docking, if used properly, offers useful insights such as plausible configurations of a protein-ligand complex in a fast and computationally inexpensive way.
We used the GalaxyDock-HG program for docking simulations of most host-guest systems. The program finds binding poses of host and guest via global optimization of the AutoDock4 energy [72] using the conformational space annealing (CSA) algorithm [73]. The energy is evaluated in continuous space instead of by interpolating energy values at grid points as in the GalaxyDock2 program, and the initial set of conformations for CSA was generated by random perturbations of the initial structures, unlike the geometry-based pre-docking used in GalaxyDock2. For cases in which two or three guests bind to the host, or in which parameters were absent (i.e., Pt(II) for G13), we instead resorted to manual docking followed by short MD simulations to find low-energy binding poses.
In the free energy calculations, we accounted for corrections for free energy changes due to pressure (P) and volume (V) differences, referred to as the “ΔPΔV correction” (for DDM), and for the free energy cost of releasing restraints, the “restraint correction” (for both DDM and the US method). For the “ΔPΔV correction”, the work associated with the difference in pressure and volume between the intermediate states of the host-guest complex and the guest-only system was corrected for the standard state. Specifically, the potential energy of the i-th state of the host-guest complex and that of the guest-only system were each replaced with their corrected counterparts. Although applying an artificial restraint is useful for avoiding sampling problems due to complex energy barriers along the alchemical pathway, it causes a loss of positional and orientational freedom and ultimately alters the free energy. The corresponding free energy for releasing the restraint can be estimated from the ratio between the standard state volume for an ideal gas (V0) and the effective volume of the guest molecule (Veff), namely, ΔGrest−off = −kBT ln(V0/Veff), where kB is the Boltzmann constant and T is the temperature [19,20,53,54]. In the bound states for both DDM and the US method, imposing the restraint confines the guest molecule to a small volume, i.e., Veff < V0, and the corresponding free energy for releasing the restraint (ΔGrest−off) is negative. On the other hand, in the US scheme, when guest molecules are dissociated from the host beyond a certain distance (i.e., r ≈ 20 Å), the Veff values become larger than V0 and yield positive ΔGrest−off values. It is worth pointing out, however, that the resulting binding free energies (ΔGbind) in cases of positive ΔGrest−off values are almost the same as those with negative ΔGrest−off values because ΔGus decreases as ΔGrest−off increases. We confirmed that the ΔGbind values did not depend on the final dissociation distance: the variances of the calculated ΔGbind values beyond a “cutoff limit” (approximately 10 Å) were negligible. Configurations in which the distance between the host and guest exceeds the “cutoff limit” can be considered “fully dissociated” states, in which the host and guest are separated by multiple water layers.
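The sign behavior of the restraint-release term described above can be illustrated with a short sketch. It assumes the commonly used form ΔGrest−off = −kBT ln(V0/Veff), a 1 M standard state (V0 ≈ 1660.5 Å³ per molecule), and a temperature of 298.15 K; these values, the example effective volumes, and the function name are assumptions made for illustration and are not taken from the simulations reported here.

```python
import math

K_B = 0.0019872041   # Boltzmann constant in kcal/(mol K)
V0 = 1660.54         # volume per molecule at 1 M, in cubic angstroms (assumed standard state)


def restraint_release_free_energy(v_eff, temperature=298.15):
    """Free energy (kcal/mol) for releasing a restraint that confines the guest to v_eff."""
    return -K_B * temperature * math.log(V0 / v_eff)


# Hypothetical effective volumes (cubic angstroms), chosen only to show the sign change
for v_eff in (100.0, V0, 5000.0):
    dg = restraint_release_free_energy(v_eff)
    print(f"Veff = {v_eff:8.2f} A^3  ->  dG_rest-off = {dg:+.2f} kcal/mol")
```

An effective volume smaller than V0 gives a negative term, as for the tightly restrained bound state, while an effective volume larger than V0 gives a positive term, as for guests pulled far from the host.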
We note that in the US simulations some systems needed special consideration for sampling. In an effort to achieve a well-converged PMF, we checked the ‘pulling-out’ scheme (starting from the bound state) by comparing it with ‘pushing-in’ simulations (starting from the unbound state). In cases where the PMF profiles from pulling-out simulations (500 ps for each window) were not consistent with those from pushing-in simulations (i.e., CB8-G0, CB8-G4, and CB8-G10), we performed a 1 ns simulation for each window to achieve consistency. For the other systems, consistency was observed with 200–500 ps of simulation for each window. Even though applying a force to separate the host and guest is effective in producing representative unbound states, it can create nonnative unbound states. In addition, the initial coordinates for pulling-out simulations do not uniquely define the bound state. Therefore, under-sampled distributions can lead to a poorly converged PMF. Although consistency of the PMF profiles is not sufficient to demonstrate that a simulation is well sampled and converged, it is a necessary condition for convergence [74]. Further discussion of the US method can be found in a companion paper by Nishikawa et al. in the same issue [75].
Using Na+ and Cl− ions in the simulations in place of the sodium phosphate buffer used in the ITC experiments could be another source of error in the binding free energy calculations. This approach neglects the properties of phosphate in solution, such as specific electrostatic interactions with the host and guest molecules and with ions [76–78] and its kosmotropic behavior [79,80]. Unfortunately, we lack a better alternative to using Cl− (instead of phosphate), as reliable force-field parameters for phosphates and interacting ions are still under development [78,81–83].
The current results demonstrate the limitations of the DDM and US methods based on classical force-fields for predicting experimental binding free energies. Although the correlation of the DDM results with experiment was statistically significant and the highest among all submissions, about half of the variance in the experimental values still cannot be predicted by the computed values. Many theoretical and practical issues associated with the methodology we used in the SAMPL6 challenge remain unresolved. We developed force-field parameters specific to each host and guest molecule to best reproduce QM properties such as partial charges and forces. However, as a derivative of the CHARMM classical force-field parameters, they do not account for polarizability. This is likely the major weakness of the force-field parameters used in this study. From the comparison of the DDM and US results, we have learned that methodological limitations are not negligible. Although both methods are based on the same initial binding poses and force-field parameters, the correlation between the two methods is not significant. Inaccuracies in both configurational sampling and the statistical analysis of the collected energetic information could be another major source of error in binding free energy predictions.
5. ACKNOWLEDGEMENTS
The authors would like to thank Gerhard König, Xiongwu Wu, Qiao Zheng and Daniel R. Roe for helpful advice and discussion. We extend our appreciation to Richard M. Venable, Andrew C. Simmonett, John Legato, Andrea Rizzi, Minkyung Baek and Chaok Seok for valuable comments and technical support. Kyungreem Han wishes to express his deepest gratitude to Richard W. Pastor. This work was partially supported by the intramural research program of the National Heart, Lung and Blood Institute (NHLBI) of the National Institutes of Health and employed the high-performance computational capabilities of the LoBoS and Biowulf Linux clusters at the National Institutes of Health (http://www.lobos.nih.gov and http://biowulf.nih.gov). Kyungreem Han’s research was partially supported by a grant from the KRIBB Research Initiative Program (Korean Biomedical Scientist Fellowship Program), Korea Research Institute of Bioscience and Biotechnology, Republic of Korea.
REFERENCES
- 1. Jorgensen WL (2004) The Many Roles of Computation in Drug Discovery. Science 303 (5665):1813–1818. doi: 10.1126/science.1096361
- 2. Sliwoski G, Kothiwale S, Meiler J, Lowe EW (2014) Computational Methods in Drug Discovery. Pharmacological Reviews 66 (1):334–395. doi: 10.1124/pr.112.007336
- 3. Shirts MR (2012) Best Practices in Free Energy Calculations for Drug Design. In: Baron R (ed) Computational Drug Discovery and Design. Springer New York, New York, NY, pp 425–467. doi: 10.1007/978-1-61779-465-0_26
- 4. Kollman P (1993) Free energy calculations: Applications to chemical and biochemical phenomena. Chemical Reviews 93 (7):2395–2417. doi: 10.1021/cr00023a004
- 5. Chipot C, Pohorille A (2007) Free Energy Calculations: Theory and Applications in Chemistry and Biology. Springer
- 6. Guthrie JP (2009) A Blind Challenge for Computational Solvation Free Energies: Introduction and Overview. The Journal of Physical Chemistry B 113 (14):4501–4507. doi: 10.1021/jp806724u
- 7. Geballe MT, Skillman AG, Nicholls A, Guthrie JP, Taylor PJ (2010) The SAMPL2 blind prediction challenge: introduction and overview. Journal of Computer-Aided Molecular Design 24 (4):259–279. doi: 10.1007/s10822-010-9350-8
- 8. Muddana HS, Daniel Varnado C, Bielawski CW, Urbach AR, Isaacs L, Geballe MT, Gilson MK (2012) Blind prediction of host–guest binding affinities: a new SAMPL3 challenge. Journal of Computer-Aided Molecular Design 26 (5):475–487. doi: 10.1007/s10822-012-9554-1
- 9. Muddana HS, Fenley AT, Mobley DL, Gilson MK (2014) The SAMPL4 host–guest blind prediction challenge: an overview. Journal of Computer-Aided Molecular Design 28 (4):305–317. doi: 10.1007/s10822-014-9735-1
- 10. Yin J, Henriksen NM, Slochower DR, Shirts MR, Chiu MW, Mobley DL, Gilson MK (2017) Overview of the SAMPL5 host–guest challenge: Are we doing better? Journal of Computer-Aided Molecular Design 31 (1):1–19. doi: 10.1007/s10822-016-9974-4
- 11. Rizzi A, Murkli S, McNeill JN, Yao W, Sullivan M, Gilson MK, Chiu MW, Isaacs L, Gibb BC, Mobley DL, Chodera JD (2018) Overview of the SAMPL6 host-guest binding affinity prediction challenge. Journal of Computer-Aided Molecular Design Submitted
- 12. Wang L, Berne BJ, Friesner RA (2012) On achieving high accuracy and reliability in the calculation of relative protein–ligand binding affinities. Proceedings of the National Academy of Sciences 109 (6):1937–1942. doi: 10.1073/pnas.1114017109
- 13. Mobley DL, Chodera JD, Dill KA (2007) Confine-and-Release Method: Obtaining Correct Binding Free Energies in the Presence of Protein Conformational Change. Journal of Chemical Theory and Computation 3 (4):1231–1235. doi: 10.1021/ct700032n
- 14. Jiang W, Roux B (2010) Free Energy Perturbation Hamiltonian Replica-Exchange Molecular Dynamics (FEP/H-REMD) for Absolute Ligand Binding Free Energy Calculations. Journal of Chemical Theory and Computation 6 (9):2559–2565. doi: 10.1021/ct1001768
- 15. Liu S, Ruspic C, Mukhopadhyay P, Chakrabarti S, Zavalij PY, Isaacs L (2005) The Cucurbit[n]uril Family: Prime Components for Self-Sorting Systems. Journal of the American Chemical Society 127 (45):15959–15967. doi: 10.1021/ja055013x
- 16. Lagona J, Mukhopadhyay P, Chakrabarti S, Isaacs L (2005) The Cucurbit[n]uril Family. Angewandte Chemie International Edition 44 (31):4844–4870. doi: 10.1002/anie.200460675
- 17. Murkli S, McNeill JN, Isaacs L (2018) Cucurbit[8]uril-Guest Complexes: Blinded Dataset for the SAMPL6 Challenge. Supramolecular Chemistry Submitted (XX)
- 18. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M (1983) CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry 4 (2):187–217. doi: 10.1002/jcc.540040211
- 19. Gilson MK, Given JA, Bush BL, McCammon JA (1997) The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophysical Journal 72 (3):1047–1069
- 20. Boresch S, Tettinger F, Leitgeb M, Karplus M (2003) Absolute Binding Free Energies: A Quantitative Approach for Their Calculation. The Journal of Physical Chemistry B 107 (35):9535–9551. doi: 10.1021/jp0217839
- 21. Fukunishi H, Watanabe O, Takada S (2002) On the Hamiltonian replica exchange method for efficient sampling of biomolecular systems: Application to protein structure prediction. The Journal of Chemical Physics 116 (20):9058–9067. doi: 10.1063/1.1472510
- 22. Bennett CH (1976) Efficient estimation of free energy differences from Monte Carlo data. Journal of Computational Physics 22 (2):245–268. doi: 10.1016/0021-9991(76)90078-4
- 23. Torrie GM, Valleau JP (1977) Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics 23 (2):187–199. doi: 10.1016/0021-9991(77)90121-8
- 24. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA (1992) The weighted histogram analysis method for free‐energy calculations on biomolecules. I. The method. Journal of Computational Chemistry 13 (8):1011–1021. doi: 10.1002/jcc.540130812
- 25. Grossfield A (2013) “WHAM: an implementation of the weighted histogram analysis method”, http://membrane.urmc.rochester.edu/content/wham/, version 2.0.9.
- 26. MacKerell AD (2004) Empirical force fields for biological macromolecules: Overview and issues. Journal of Computational Chemistry 25 (13):1584–1604. doi: 10.1002/jcc.20082
- 27. MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiórkiewicz-Kuczera J, Yin D, Karplus M (1998) All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. The Journal of Physical Chemistry B 102 (18):3586–3616. doi: 10.1021/jp973084f
- 28. Vanommeslaeghe K, Hatcher E, Acharya C, Kundu S, Zhong S, Shim J, Darian E, Guvench O, Lopes P, Vorobyov I, MacKerell AD (2010) CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. Journal of Computational Chemistry 31 (4):671–690. doi: 10.1002/jcc.21367
- 29. The NIST Reference on Constants, Units, and Uncertainty. US National Institute of Standards and Technology. June 2015. Retrieved 2015-09-25. 2014 CODATA recommended values
- 30. Yao S, Plastaras JP, Marzilli LG (1994) A Molecular Mechanics AMBER-Type Force Field for Modeling Platinum Complexes of Guanine Derivatives. Inorganic Chemistry 33 (26):6061–6077. doi: 10.1021/ic00104a015
- 31. Allen MP, Tildesley DJ (1987) Computer simulation of liquids. Clarendon Press
- 32. Becke AD (1993) Density‐functional thermochemistry. III. The role of exact exchange. The Journal of Chemical Physics 98 (7):5648–5652. doi: 10.1063/1.464913
- 33. Lee C, Yang W, Parr RG (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Physical Review B 37 (2):785–789. doi: 10.1103/PhysRevB.37.785
- 34. Møller C, Plesset MS (1934) Note on an Approximation Treatment for Many-Electron Systems. Physical Review 46 (7):618–622. doi: 10.1103/PhysRev.46.618
- 35. Hariharan PC, Pople JA (1973) The influence of polarization functions on molecular orbital hydrogenation energies. Theoretica chimica acta 28 (3):213–222. doi: 10.1007/bf00533485
- 36. Francl MM, Pietro WJ, Hehre WJ, Binkley JS, Gordon MS, DeFrees DJ, Pople JA (1982) Self‐consistent molecular orbital methods. XXIII. A polarization‐type basis set for second‐row elements. The Journal of Chemical Physics 77 (7):3654–3665. doi: 10.1063/1.444267
- 37. Bayly CI, Cieplak P, Cornell W, Kollman PA (1993) A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. The Journal of Physical Chemistry 97 (40):10269–10280. doi: 10.1021/j100142a004
- 38. Dunning TH Jr. (1989) Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. The Journal of Chemical Physics 90 (2):1007–1023. doi: 10.1063/1.456153
- 39. Figgen D, Peterson KA, Dolg M, Stoll H (2009) Energy-consistent pseudopotentials and correlation consistent basis sets for the 5d elements Hf–Pt. The Journal of Chemical Physics 130 (16):164108. doi: 10.1063/1.3119665
- 40. Marenich AV, Cramer CJ, Truhlar DG (2009) Universal Solvation Model Based on Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk Dielectric Constant and Atomic Surface Tensions. The Journal of Physical Chemistry B 113 (18):6378–6396. doi: 10.1021/jp810292n
- 41. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Petersson GA, Nakatsuji H, Li X, Caricato M, Marenich AV, Bloino J, Janesko BG, Gomperts R, Mennucci B, Hratchian HP, Ortiz JV, Izmaylov AF, Sonnenberg JL, Williams-Young D, Ding F, Lipparini F, Egidi F, Goings J, Peng B, Petrone A, Henderson T, Ranasinghe D, Zakrzewski VG, Gao J, Rega N, Zheng G, Liang W, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Throssell K, Montgomery JA Jr., Peralta JE, Ogliaro F, Bearpark MJ, Heyd JJ, Brothers EN, Kudin KN, Staroverov VN, Keith TA, Kobayashi R, Normand J, Raghavachari K, Rendell AP, Burant JC, Iyengar SS, Tomasi J, Cossi M, Millam JM, Klene M, Adamo C, Cammi R, Ochterski JW, Martin RL, Morokuma K, Farkas O, Foresman JB, Fox DJ (2016) Gaussian 16 Rev. B.01. Wallingford, CT
- 42. Wang J, Wang W, Kollmann P, Case D (2005) Antechamber, An Accessory Software Package For Molecular Mechanical Calculation. J Comput Chem 25:1157–1174. doi:citeulike-article-id:10121022
- 43. Rogers DM, Beck TL (2008) ForceSolve. Sourceforge
- 44. Hudson PS, Boresch S, Rogers D, Woodcock HL (2018) Accelerating QM/MM Free Energy Computations via Intramolecular Force Matching. Journal of Chemical Theory and Computation In press
- 45. Thiel W, Voityuk AA (1996) Extension of MNDO to d Orbitals: Parameters and Results for the Second-Row Elements and for the Zinc Group. The Journal of Physical Chemistry 100 (2):616–626. doi: 10.1021/jp952148o
- 46. Hay PJ, Wadt WR (1985) Ab initio effective core potentials for molecular calculations. Potentials for K to Au including the outermost core orbitals. The Journal of Chemical Physics 82 (1):299–310. doi: 10.1063/1.448975
- 47. Wadt WR, Hay PJ (1985) Ab initio effective core potentials for molecular calculations. Potentials for main group elements Na to Bi. The Journal of Chemical Physics 82 (1):284–298. doi: 10.1063/1.448800
- 48. Hay PJ, Wadt WR (1985) Ab initio effective core potentials for molecular calculations. Potentials for the transition metal atoms Sc to Hg. The Journal of Chemical Physics 82 (1):270–283. doi: 10.1063/1.448799
- 49. Shin W-H, Seok C (2012) GalaxyDock: Protein–Ligand Docking with Flexible Protein Side-chains. Journal of Chemical Information and Modeling 52 (12):3225–3232. doi: 10.1021/ci300342z
- 50. Shin WH, Kim JK, Kim DS, Seok C (2013) GalaxyDock2: Protein–ligand docking using beta‐complex and global optimization. Journal of Computational Chemistry 34 (30):2647–2656. doi: 10.1002/jcc.23438
- 51. Hoover WG (1985) Canonical dynamics: Equilibrium phase-space distributions. Physical Review A 31 (3):1695–1697. doi: 10.1103/PhysRevA.31.1695
- 52. Ryckaert J-P, Ciccotti G, Berendsen HJC (1977) Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. Journal of Computational Physics 23 (3):327–341. doi: 10.1016/0021-9991(77)90098-5
- 53. Zhang Y, McCammon JA (2003) Studying the affinity and kinetics of molecular association with molecular-dynamics simulation. The Journal of Chemical Physics 118 (4):1821–1827. doi: 10.1063/1.1530162
- 54. Hermans J, Wang L (1997) Inclusion of Loss of Translational and Rotational Freedom in Theoretical Estimates of Free Energies of Binding. Application to a Complex of Benzene and Mutant T4 Lysozyme. Journal of the American Chemical Society 119 (11):2707–2714. doi: 10.1021/ja963568
- 55. Northrup SH, Pear MR, Lee CY, McCammon JA, Karplus M (1982) Dynamical theory of activated processes in globular proteins. Proceedings of the National Academy of Sciences of the United States of America 79 (13):4035–4039
- 56. Jorgensen WL (1983) Theoretical studies of medium effects on conformational equilibria. The Journal of Physical Chemistry 87 (26):5304–5314. doi: 10.1021/j150644a002
- 57. Jorgensen WL (1989) Interactions between amides in solution and the thermodynamics of weak binding. Journal of the American Chemical Society 111 (10):3770–3771. doi: 10.1021/ja00192a057
- 58. Boczko EM, Brooks CL (1993) Constant-temperature free energy surfaces for physical and chemical processes. The Journal of Physical Chemistry 97 (17):4509–4513. doi: 10.1021/j100119a043
- 59. Boczko E, Brooks C (1995) First-principles calculation of the folding free energy of a three-helix bundle protein. Science 269 (5222):393–396. doi: 10.1126/science.7618103
- 60. Sugita Y, Kitao A (1998) Dependence of protein stability on the structure of the denatured state: free energy calculations of I56V mutation in human lysozyme. Biophysical Journal 75 (5):2178–2187
- 61. Woo H-J, Roux B (2005) Calculation of absolute protein–ligand binding free energy from computer simulations. Proceedings of the National Academy of Sciences of the United States of America 102 (19):6825–6830. doi: 10.1073/pnas.0409005102
- 62. Gumbart JC, Roux B, Chipot C (2013) Efficient Determination of Protein–Protein Standard Binding Free Energies from First Principles. Journal of Chemical Theory and Computation 9 (8):3789–3798. doi: 10.1021/ct400273t
- 63. Heinzelmann G, Henriksen NM, Gilson MK (2017) Attach-Pull-Release Calculations of Ligand Binding and Conformational Changes on the First BRD4 Bromodomain. Journal of Chemical Theory and Computation 13 (7):3260–3275. doi: 10.1021/acs.jctc.7b00275
- 64. Lee MS, Olson MA (2006) Calculation of Absolute Protein-Ligand Binding Affinity Using Path and Endpoint Approaches. Biophysical Journal 90 (3):864–877. doi: 10.1529/biophysj.105.071589
- 65. Stigler SM (1989) Francis Galton's Account of the Invention of Correlation. Statist Sci 4 (2):73–79. doi: 10.1214/ss/1177012580
- 66. Kendall MG (1938) A new measure of rank correlation. Biometrika 30 (1–2):81–93. doi: 10.1093/biomet/30.1-2.81
- 67. Laury ML, DeYonker NJ, Jiang W, Wilson AK (2011) A pseudopotential-based composite method: The relativistic pseudopotential correlation consistent composite approach for molecules containing 4d transition metals (Y–Cd). The Journal of Chemical Physics 135 (21):214103. doi: 10.1063/1.3662415
- 68. Riojas AG, Wilson AK (2014) Solv-ccCA: Implicit Solvation and the Correlation Consistent Composite Approach for the Determination of pKa. Journal of Chemical Theory and Computation 10 (4):1500–1510. doi: 10.1021/ct400908z
- 69. Hudson PS, Han K, Woodcock HL, Brooks BR (2018) Force Matching as a stepping stone to QM/MM CB[8] host/guest binding free energies: A SAMPL6 Cautionary Tale. Journal of Computer-Aided Molecular Design Submitted
- 70. Damm-Ganamet KL, Smith RD, Dunbar JB, Stuckey JA, Carlson HA (2013) CSAR Benchmark Exercise 2011–2012: Evaluation of Results from Docking and Relative Ranking of Blinded Congeneric Series. Journal of Chemical Information and Modeling 53 (8):1853–1870. doi: 10.1021/ci400025f
- 71.Xie B, Nguyen TH, Minh DDL (2017) Absolute Binding Free Energies between T4 Lysozyme and 141 Small Molecules: Calculations Based on Multiple Rigid Receptor Configurations. Journal of Chemical Theory and Computation 13 (6):2930–2944. doi: 10.1021/acs.jctc.6b01183 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Huey R, Morris GM, Olson AJ, Goodsell DS (2007) A semiempirical free energy force field with charge-based desolvation. Journal of Computational Chemistry 28 (6):1145–1152. doi: 10.1002/jcc.20634
- 73.Lee J, Scheraga HA, Rackovsky S (1997) New optimization method for conformational energy calculations on polypeptides: Conformational space annealing. Journal of Computational Chemistry 18 (9):1222–1232
- 74.Domański J, Hedger G, Best RB, Stansfeld PJ, Sansom MSP (2017) Convergence and Sampling in Determining Free Energy Landscapes for Membrane Protein Association. The Journal of Physical Chemistry B 121 (15):3364–3375. doi: 10.1021/acs.jpcb.6b08445
- 75.Nishikawa N, Han K, Wu X, Tofoleanu F, Brooks BR (2018) Comparison of the umbrella sampling and the double decoupling method in binding free energy predictions for SAMPL6 octa-acid host-guest challenges. Journal of Computer-Aided Molecular Design Submitted
- 76.Bilkova E, Pleskot R, Rissanen S, Sun S, Czogalla A, Cwiklik L, Róg T, Vattulainen I, Cremer PS, Jungwirth P, Coskun Ü (2017) Calcium Directly Regulates Phosphatidylinositol 4,5-Bisphosphate Headgroup Conformation and Recognition. Journal of the American Chemical Society 139 (11):4019–4024. doi: 10.1021/jacs.6b11760
- 77.Jurkiewicz P, Cwiklik L, Vojtíšková A, Jungwirth P, Hof M (2012) Structure, dynamics, and hydration of POPC/POPS bilayers suspended in NaCl, KCl, and CsCl solutions. Biochimica et Biophysica Acta (BBA) - Biomembranes 1818 (3):609–616. doi: 10.1016/j.bbamem.2011.11.033
- 78.Han K, Venable RM, Bryant AM, Legacy CJ, Shen R, Li H, Roux B, Gericke A, Pastor RW (2018) Graph-Theoretic Analysis of Monomethyl Phosphate Clustering in Ionic Solutions. J Phys Chem B 122 (4):1484–1494. doi: 10.1021/acs.jpcb.7b10730
- 79.Collins KD (1997) Charge density-dependent strength of hydration and biological structure. Biophysical Journal 72 (1):65–76
- 80.Hribar B, Southall NT, Vlachy V, Dill KA (2002) How Ions Affect the Structure of Water. Journal of the American Chemical Society 124 (41):12302–12311. doi: 10.1021/ja026014h
- 81.Margreitter C, Reif MM, Oostenbrink C (2017) Update on phosphate and charged post-translationally modified amino acid parameters in the GROMOS force field. Journal of Computational Chemistry 38 (10):714–720. doi: 10.1002/jcc.24733
- 82.Steinbrecher T, Latzer J, Case DA (2012) Revised AMBER parameters for bioorganic phosphates. Journal of Chemical Theory and Computation 8 (11):4405–4412. doi: 10.1021/ct300613v
- 83.Venable RM, Luo Y, Gawrisch K, Roux B, Pastor RW (2013) Simulations of Anionic Lipid Membranes: Development of Interaction-Specific Ion Parameters and Validation Using NMR Data. The Journal of Physical Chemistry B 117 (35):10183–10192. doi: 10.1021/jp401512z