Abstract
Accurate methods to estimate free energies play an important role for studying diverse condensed-phase problems in chemistry and biochemistry. The most common methods used in conjunction with molecular dynamics (MD) and Monte Carlo statistical mechanics (MC) simulations are free energy perturbation (FEP) and thermodynamic integration (TI). For common applications featuring the conversion of one molecule to another, simulations are run in stages or multiple “λ-windows” to promote convergence of the results. For computation of absolute free energies of solvation or binding, calculations are needed in which the solute is typically annihilated in the solvent and in the complex. The present work addresses identification of optimal protocols for such calculations, specifically, the creation/annihilation of organic molecules in aqueous solution. As is common practice, decoupling of the perturbations for electrostatic and Lennard-Jones interactions was performed. Consistent with earlier reports, FEP calculations for molecular creations are much more efficient, while annihilations require many more windows and may converge to incorrect values. Strikingly, we find that as few as four windows may be adequate for creation calculations for solutes ranging from argon to ethylbenzene. For a larger drug-like molecule, MIF180, which contains 22 non-hydrogen atoms and three rotatable bonds, 10 creation windows are found to be adequate to yield the correct free energy of hydration. Convergence is impeded with procedures that use any sampling in the annihilation direction, and there is no need for post-processing methods such as BAR.
INTRODUCTION
Use of free energy perturbation (FEP) theory1 to calculate the difference in free energy of solvation of two molecules began with Monte Carlo simulations for transforming methanol to ethane in water in 1985.2 It was shown that five evenly spaced λ-windows were sufficient to obtain the same results within 0.12 kcal/mol for the forward series of perturbations (λ= 0 →1) and the reverse. As expectations were uncertain, this level of precision was striking and led to interest in applying FEP calculations to many problems including computation of solvent effects on reaction barriers, pKa values for weak acids, partition coefficients, and relative free energies of binding for organic host-guest systems and protein-ligand complexes.3–7 Starting in 1986, FEP and related methods including thermodynamic integration (TI)8 began to be used to compute absolute free energies of hydration for small solutes such as a noble gas atom, water, or an atomic ion by determining the free energy change when the solute is created or annihilated in water and in the gas phase.9–11
In general, equilibrium constants for partitioning phenomena can be evaluated by comparing the annihilation or creation of a solute in two media such as two liquids or water and a cell membrane. Thus, it was also recognized at this time that absolute free energies of binding could be computed for host-guest complexes by “double annihilation”, i.e., by taking the difference from annihilating the guest alone in solution and in the complex.12 Such calculations were computationally taxing and application to larger systems was slow to advance. Though computation of relative free energies of binding for protein-ligand complexes is now widespread,6,7,13,14 computation of absolute free energies of binding is still not common.15–22 Nevertheless, there is increasing activity and the potential value of such calculations is clear to facilitate, for example, core hopping for drug lead optimization23 and improved scoring of docking poses.24,25 With these goals in mind, we have performed deeper analyses of FEP protocols for molecular creations/annihilations to identify optimal procedures. Our focus has been on methods that feature on-the-fly averaging of all configurations using the Zwanzig equation (eq 1),1 specifically, direct forward and backward, and double-wide sampling.2,5,26,27 Bennett acceptance ratio (BAR) methods are also popular; however, these require post-processing of results for sampled configurations and large storage capacity for long MD or MC runs.28,29
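$$\Delta G_{AB} = G_B - G_A = -k_B T \ln \left\langle \exp\left[-\frac{E_B - E_A}{k_B T}\right] \right\rangle_A \qquad (1)$$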
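As an illustration of how the exponential average in the Zwanzig equation can be accumulated, the following is a minimal Python sketch, not the BOSS implementation. The function name `fep_zwanzig`, the list-based input, and the default temperature are assumptions made for the example; the energy differences E_B − E_A are taken to be in kcal/mol and sampled in the reference state A.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def fep_zwanzig(delta_e, temperature=298.15):
    """Return -kT ln <exp(-dE/kT)>_A for samples dE = E_B - E_A (kcal/mol).

    Hypothetical helper for illustration; delta_e is any non-empty
    sequence of energy differences evaluated on configurations of A.
    """
    kt = KB_KCAL * temperature
    # Factor out the smallest dE so that no exponential can overflow.
    d_min = min(delta_e)
    total = sum(math.exp(-(d - d_min) / kt) for d in delta_e)
    return d_min - kt * math.log(total / len(delta_e))
```

Because the average is dominated by the lowest values of E_B − E_A, the estimate is only reliable when the reference simulation actually visits the low-energy configurations of the perturbed state.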
A key problem is that the reliability of FEP and TI calculations depends on sufficient overlap of sampled states for the reference system (A in eq 1) and the perturbed one (B).4,5 Thus, the perturbations for drug-sized solutes typically utilize many intermediate stages with different values of λ, which may be used to scale all geometrical and force-field parameters χ from those for A to those for B (eq 2).
$$\chi(\lambda) = \chi_A + \lambda\,(\chi_B - \chi_A) \qquad (2)$$
For creation/annihilation of a molecule, A or B is normally a collection of null particles at the former positions of the real atoms or a collapsed collection formed by shrinking bond lengths.2 Kofke and co-workers have stressed that for insertion or deletion of a hard sphere or Lennard-Jones (LJ) particle in an atomic liquid, insertion is much preferred for convergence of the free energy change and that sampling in the deletion direction has little value.30,31 They then framed this by proposing that the preferred direction for perturbations is to go from the higher entropy system to the lower entropy one, e.g., from a less-dense state to a more dense one. In a subsequent paper, they supported this position with FEP results for converting a hard-sphere particle to an LJ one in an LJ liquid.32 Our interpretation of this preference for insertion (creation) over deletion (annihilation) stems from the steeply rising 1/r^12 repulsion in LJ potentials when the inter-particle contact falls below the LJ diameter σ. As schematized in Figure 1a, it is possible to have an outward perturbation from a configuration in which a solvent molecule turns out to be in contact with the perturbed larger sphere, which is a low-energy arrangement for the perturbed state. On the other hand, for the inward perturbation in Figure 1b, the reference state cannot sample the low-energy configuration for the perturbed state because the repulsive LJ wall keeps the solvent and reference solute from interpenetrating. Thus, low-energy configurations for the inward-perturbed state cannot be sampled, while low-energy ones for an outward perturbation may be within the manifold of the reference state.
Figure 1.

(a) The outward perturbation of the reference solute (solid circle) is favourable because the perturbed solute is near contact with a solvent molecule (blue sphere) from the reference configuration. (b) For the inward perturbation, a low-energy configuration of the perturbed state would again have a solvent molecule in contact with the perturbed solute. However, this positioning of the solvent molecule is not possible by sampling the reference state because it requires interpenetration of the solvent molecule and reference solute.
In the following, we begin with a deeper theoretical analysis of the sampling problem in Figure 1. Then, the implications of the preference for creation over annihilation are reinvestigated for an argon atom and water molecule in water, followed by extension to seven organic solutes as large as ethylbenzene, and finally to a druglike molecule, MIF180. A primary goal is to establish the minimum number of windows that are needed to perform creation/annihilation of such molecules yielding converged free energies of solvation.
THEORY
Free Energy Asymmetry.
The thermodynamic relation ∆GAB = −∆GBA must hold since G is a state function. In practical terms, ∆GAB and ∆GBA can be computed using methods such as FEP or TI where total overlap between the configurational spaces for A and B is required to ensure the equality. In the common case where the configurational space of A (ΩA) is a superset of that of B (ΩB), i. e. (ΩA=ΩA\B+ΩB), sufficient sampling can yield a correct ∆GAB; however, ∆GBA is likely to be erroneous, even though it may be well converged, since sampling for B cannot cover all configurations for A. As shown in Figure 2 for the case of creation of an LJ potential (B) from a null point (A), the repulsive LJ term will exclude necessary configurations in the annihilation direction (B→A) (green area), but in the creation direction (A→B) (green area + blue area), the whole configurational space may be sampled to yield a correct ∆GAB.
Figure 2.

Representation of a null potential (red line – state A) and a Lennard-Jones potential (blue line – state B). The dashed blue line can be taken to correspond to the maximum energy sampled in an MC or MD simulation using this LJ potential. The blue area shows the accessible configurational space for state B (ΩB), while the green area (ΩA\B) depicts the configurational space inaccessible from sampling B, leading to an incorrect value for ∆GBA.
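An asymmetry coefficient ξ can be defined as the free energy difference that would arise (eq 3).
$$\xi = \Delta G_{AB} + \Delta G_{BA} \qquad (3)$$
Expansion of this formula can be made using the Gibbs free energy definition (G_A = −k_BT ln Z_A, where Z_A is the configurational integral of exp(−E_A/k_BT) for potential energy E_A), and assuming that the configurational space of B (Ω_B) is a subset of that of A (Ω_A), i.e., Ω_B ⊆ Ω_A, and that the sampling is performed in the initial state, A or B (eqs 4 and 5).
$$\Delta G_{AB} = -k_B T \ln \frac{\int_{\Omega_B} e^{-E_B/k_B T}\,d\Omega + \int_{\Omega_{A\setminus B}} e^{-E_B/k_B T}\,d\Omega}{\int_{\Omega_A} e^{-E_A/k_B T}\,d\Omega} \qquad (4)$$
$$\Delta G_{BA} = -k_B T \ln \frac{\int_{\Omega_B} e^{-E_A/k_B T}\,d\Omega}{\int_{\Omega_B} e^{-E_B/k_B T}\,d\Omega} \qquad (5)$$
Here, on the right side of eq 4 the configurational space of state A is separated as Ω_B and Ω_{A\B}, since the configurations in Ω_{A\B} (the green area) are not accessible to state B. Because the Boltzmann factor of B is negligible over Ω_{A\B}, the second integral in the numerator of eq 4 can be dropped.
Therefore
$$\xi = -k_B T \ln \frac{\int_{\Omega_B} e^{-E_B/k_B T}\,d\Omega}{\int_{\Omega_A} e^{-E_A/k_B T}\,d\Omega} - k_B T \ln \frac{\int_{\Omega_B} e^{-E_A/k_B T}\,d\Omega}{\int_{\Omega_B} e^{-E_B/k_B T}\,d\Omega}$$
and thus
$$\xi = -k_B T \ln \frac{\int_{\Omega_B} e^{-E_A/k_B T}\,d\Omega}{\int_{\Omega_A} e^{-E_A/k_B T}\,d\Omega}$$
which makes ξ = 0 only if $\int_{\Omega_B} e^{-E_A/k_B T}\,d\Omega = \int_{\Omega_A} e^{-E_A/k_B T}\,d\Omega$. This last identity is satisfied, for instance, in the case of total overlap Ω_B = Ω_A (there is no excluded region). Generally, the inequality $\int_{\Omega_B} e^{-E_A/k_B T}\,d\Omega < \int_{\Omega_A} e^{-E_A/k_B T}\,d\Omega$ is obtained, which implies eq 6.
$$\xi = \Delta G_{AB} + \Delta G_{BA} > 0 \qquad (6)$$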
Fundamentally, owing to the excluded configurations, the computed ∆GBA is not negative enough, so −∆GBA is not positive enough to match ∆GAB. Again, the inequality originates from the inaccessibility of important regions of configurational space, and it is a general phenomenon for free energy calculations which feature creation/annihilation of particles. It implies the existence of a direction of transformation which will yield the correct free energy difference independent of the transformation size, although larger transformations will require longer sampling. In other words, in the correct direction (creation), the free energy change will be independent of the number of λ-windows with enough sampling. However, in the annihilation direction, use of different numbers of λ-windows will give different converged results.
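The asymmetry can be made concrete with a toy model. This is a hedged sketch and not one of the systems simulated in this work: a single "solvent" coordinate x moves in a one-dimensional box with k_BT = 1, state A is a null solute (E_A = 0 everywhere), and state B is a hard-core solute that forbids x < core, an idealization of the repulsive LJ wall. The function name and all parameter values are hypothetical.

```python
import math
import random

def toy_asymmetry(box=10.0, core=2.0, n_samples=200_000, seed=1):
    """Single-window FEP in both directions for a hard-core toy model (kT = 1).

    State A: null solute, E_A = 0 for all x in [0, box].
    State B: hard core, E_B = infinity for x < core and 0 otherwise.
    Returns (dG_AB from sampling A, dG_BA from sampling B, exact dG_AB).
    """
    rng = random.Random(seed)
    # Creation (A -> B): A samples the whole box uniformly, and
    # exp(-(E_B - E_A)) is 1 outside the core and 0 inside it.
    hits = sum(1 for _ in range(n_samples) if rng.uniform(0.0, box) >= core)
    dg_ab = -math.log(hits / n_samples)
    # Annihilation (B -> A): B never visits x < core, and E_A - E_B = 0
    # at every configuration it does visit, so the average is exactly 1.
    dg_ba = 0.0 - math.log(1.0)
    exact = -math.log((box - core) / box)
    return dg_ab, dg_ba, exact

dg_ab, dg_ba, exact = toy_asymmetry()
xi = dg_ab + dg_ba  # positive: the backward estimate misses the excluded region
```

In this model the creation estimate approaches the exact value as sampling increases, whereas the annihilation estimate is perfectly converged at zero and wrong, so ξ stays positive no matter how long state B is sampled.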
METHODOLOGY
Monte Carlo Simulations.
MC simulations were carried out with BOSS (Biochemical and Organic Simulation System) software.33 The MC moves in each step are controlled by a user-defined set of move frequencies and ranges to sample solute and solvent degrees of freedom. Water molecules are represented by the rigid TIP4P model;34 thus, solvent moves only involve random rigid-body translations and rotations. The solute moves are a combination of rigid-body translations and rotations with variations in bond lengths, bond angles, and dihedral angles. The default BOSS perturbation parameters, which were adjusted to produce global acceptance rates of 30–40%, were applied in all cases. Simulations have been performed in the NPT ensemble including volume perturbations with the pressure at 1 atm. A pre-equilibrated periodic cube of 512 water molecules was large enough to ensure at least 10-Å padding for all molecules except MIF180,35 for which a rectangular box with 750 water molecules was used. In all cases, a number of water molecules equal to the number of heavy atoms of the solute were removed to reduce the equilibration time. For each attempted move, the new configuration is accepted or rejected using the Metropolis criterion36 at a temperature of 25 °C.
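For readers unfamiliar with the acceptance test, a minimal sketch of the Metropolis criterion for a fixed-volume solute or solvent move follows. It is not BOSS code: the function name and arguments are hypothetical, and the additional pressure-volume terms needed for NPT volume moves are deliberately omitted.

```python
import math
import random

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def metropolis_accept(delta_e, temperature=298.15, rng=random):
    """Decide whether to accept a move that changes the energy by delta_e (kcal/mol).

    Downhill moves are always accepted; uphill moves are accepted with
    probability exp(-delta_e / kT). An infinite delta_e is always rejected.
    """
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / (KB_KCAL * temperature))
```

Under this test an attempted move into a strongly overlapping, very high-energy configuration is essentially never accepted.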

Hydration Free Energies.
Free energy changes were computed for annihilating and creating the solutes both in aqueous solution and the gas phase using eqs 1 and 2. The solutes were described by the OPLS-AA force field37 except that the partial atomic charges came from the 1.14*CM1A-LBCC model,38 as provided by the LigParGen server.39 For argon, the LJ σ and ε are 3.401 Å and 0.2339 kcal/mol. Solutes were created and annihilated by decoupling the electrostatic and LJ interactions; the partial charges and then the σ and ε parameters of the LJ potentials were scaled linearly as dictated by the λ parameter. These annihilation/creation paths were split into 1, 2, 4, 10, 20, 30 and 40 λ-windows. It is important to note that ten independent simulations were carried out for each transformation and the reported statistical uncertainties are the standard deviation of the average result. For the aqueous-phase calculations, 25 million (25M) MC configurations were sampled for equilibration and 950M configurations were covered for averaging for each λ-window. The corresponding numbers that were sufficient for the gas-phase MC calculations were 25M and 57M. The final free energy of hydration for transfer of the solute from the gas phase into aqueous solution then comes from the difference in results for the creation processes (eq 7). With single-topology FEP calculations in MC simulations there are no end-point problems for perturbations near λ = 0 or 1; high-energy configurations are simply rejected with the Metropolis sampling.2
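$$\Delta G_{\mathrm{hyd}} = \Delta G_{\mathrm{aq}}^{\mathrm{creation}} - \Delta G_{\mathrm{gas}}^{\mathrm{creation}} \qquad (7)$$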
Direct sampling calculations were performed for both the creation (forward) and annihilation (backward) directions. For example, for a 10-window creation, FEP calculations were performed for λ = 0.0 → 0.1, 0.1 → 0.2, …, 0.9 → 1.0. Direct FEP calculations were also performed with shrinking (SRK) in which the bond lengths were scaled from 0.3 Å to their final values. In addition, the results for the direct forward and backward calculations could be combined to yield values for double-wide sampling (DWS) with twice the λ spacing. For example, the 10-window direct calculations can be combined to yield results for 5 windows of DWS as λ = 0.0 ← 0.1 → 0.2 ← 0.3 → 0.4 … 1.0. The computational time per window for the aqueous phase ranged from 12 to 22 hours depending on the system size using an Intel Xeon E5–2660 V3 processor.
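The window bookkeeping described above can be sketched as follows. This is an illustrative Python fragment only; the helper names `interpolate`, `direct_windows`, and `dws_windows` are hypothetical and evenly spaced λ values are assumed.

```python
def interpolate(chi_a, chi_b, lam):
    """Linear scaling of a parameter between its values in A and B (eq 2)."""
    return chi_a + lam * (chi_b - chi_a)

def direct_windows(n, creation=True):
    """(reference, perturbed) lambda pairs for n evenly spaced direct windows."""
    lam = [i / n for i in range(n + 1)]
    pairs = list(zip(lam[:-1], lam[1:]))
    if creation:
        return pairs
    # Annihilation runs the same grid from lambda = 1 down to 0.
    return [(b, a) for a, b in reversed(pairs)]

def dws_windows(n_direct):
    """Double-wide sampling windows on the grid of n_direct direct windows.

    Each entry is (reference, backward target, forward target): the
    simulation runs at an odd grid point and perturbs to both neighbours.
    """
    if n_direct % 2:
        raise ValueError("n_direct must be even")
    lam = [i / n_direct for i in range(n_direct + 1)]
    return [(lam[i], lam[i - 1], lam[i + 1]) for i in range(1, n_direct, 2)]
```

For instance, `direct_windows(10)` yields ten creation windows starting with (0.0, 0.1), while `dws_windows(10)` yields five simulations referenced at λ = 0.1, 0.3, 0.5, 0.7, and 0.9, each of which includes one perturbation in the backward direction.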
RESULTS AND DISCUSSION
Argon Atom.
Table 1 contains the ΔGhyd results for the annihilation and creation processes for argon and a TIP4P water molecule using different numbers of λ-windows. The results for ΔGhyd of argon have disparate values for the forward and backward pathways for 10 or fewer λ-windows; the values do not agree to within 0.1 kcal/mol until 20 λ-windows are used. It is apparent that the backward perturbations are not well behaved, while the creation calculations yield the correct, converged result with use of only one or two windows. As illustrated in Figure 3, with the long averaging runs used here, the results for different numbers of λ-windows are all converged; however, the annihilation calculations with fewer than 20 windows converge to incorrect values for ΔGhyd owing to the sampling problem illustrated in Figure 1b. The discrepancies are −3.17, −2.45, −0.74 and −0.08 kcal/mol for 1, 2, 4 and 10 λ-windows. The perturbations in the annihilation direction are not favourable enough since the lowest-energy perturbed states are not sampled (Figure 1b); this corresponds to the creation being too favourable and ΔGhyd ending up too negative. The same pattern is apparent in the results from 1989 for an LJ particle in water using 10 λ-windows, though the problem with annihilations was not realized at that time.11 Consistently, the reported uncertainties for ΔGhyd in Table 1, which correspond to the standard deviations of the average from the ten independent trajectories, are substantially larger for the annihilations. For creation processes, all the uncertainties are small because the final state is a subset of the initial one and the free energy change converges to the correct value independent of the number of λ-windows. With a single window, the ΔGhyd is well converged after 250M configurations are sampled. The number declines with increasing number of windows such that similar convergence with four windows takes ca. 100M configurations. 
However, since the latter calculations require a total of 400M configurations, there is no benefit in total computer time for doing the windowing in this simple case. As an aside, it may be noted that the experimental value for the ΔGhyd of argon is 2.0 kcal/mol,40 in good agreement with the computed result.
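The quoted discrepancies for the argon annihilations can be checked directly from Table 1. In the short sketch below, the choice of reference is an inference rather than something stated in the text: the four numbers are reproduced when the 20-window annihilation result (2.21 kcal/mol) is used as the reference value.

```python
# Argon hydration free energies from annihilation (kcal/mol), Table 1.
annihilation = {1: -0.96, 2: -0.24, 4: 1.47, 10: 2.13}

# Assumed reference: the 20-window annihilation value from Table 1.
reference = 2.21

# Discrepancy of each annihilation result from the reference, by window count.
discrepancies = {n: round(dg - reference, 2) for n, dg in annihilation.items()}
print(discrepancies)
```

Measured against the converged creation value of 2.25 kcal/mol instead, each discrepancy would be 0.04 kcal/mol more negative.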
Table 1.
MC Results for Free Energies of Hydration (kcal/mol) of Argon and a TIP4P Water Molecule in TIP4P Water as a Function of the Number of λ-Windows.a
| #λ | Argon ΔGhyd (annihilation) | Argon ΔGhyd (creation) | Water ΔGhyd (annihilation) | Water ΔGhyd (creation) | Water ΔGQ (annihilation) | Water ΔGQ (creation) | Water ΔGLJ (annihilation) | Water ΔGLJ (creation) |
|---|---|---|---|---|---|---|---|---|
| 1 | −0.96 ± 0.15 | 2.32 ± 0.10 | −9.24 ± 1.19 | −4.40 ± 0.33 | −8.96 ± 1.16 | −6.73 ± 0.31 | −0.20 ± 0.28 | 2.42 ± 0.11 |
| 2 | −0.24 ± 0.29 | 2.27 ± 0.07 | −8.23 ± 0.38 | −6.05 ± 0.17 | −8.51 ± 0.25 | −8.41 ± 0.16 | 0.36 ± 0.28 | 2.45 ± 0.05 |
| 4 | 1.47 ± 0.50 | 2.25 ± 0.04 | −7.03 ± 0.19 | −6.21 ± 0.04 | −8.54 ± 0.05 | −8.55 ± 0.03 | 1.60 ± 0.18 | 2.42 ± 0.02 |
| 10 | 2.13 ± 0.13 | 2.25 ± 0.02 | −6.32 ± 0.08 | −6.22 ± 0.03 | −8.55 ± 0.03 | −8.55 ± 0.01 | 2.31 ± 0.08 | 2.41 ± 0.02 |
| 20 | 2.21 ± 0.02 | 2.25 ± 0.02 | −6.25 ± 0.02 | −6.22 ± 0.02 | −8.55 ± 0.02 | −8.55 ± 0.01 | 2.39 ± 0.02 | 2.42 ± 0.01 |
| Exp.40 | 2.00 | | −6.32 | | | | | |
a For water, separate results are presented for the charge and LJ perturbations. Statistical uncertainties are determined from 10 independent simulations.
Figure 3.
Evolution of the ΔGhyd for an argon atom in TIP4P water covering 950M MC steps. Panels A, B, C and D correspond to use of 1, 2, 4 and 10 λ-windows, respectively. Blue (green) points correspond to the creation (annihilation) process and the red line shows the properly converged value for ΔGhyd.
TIP4P Water Molecule.
In the TIP4P model, a water molecule is represented by an LJ site centered on oxygen with partial positive charges on the hydrogens and a compensating negative site 0.15 Å from the oxygen on the bisector of the HOH angle. For annihilation, the partial charges were neutralized first and then the LJ particle was removed; for creation, the LJ particle is grown, then the partial charges are added. Table 1 contains results for ΔGhyd and the separate charge and LJ components. The latter do not exactly add up to ΔGhyd owing to a small correction that has been added (−0.085 kcal/mol) for the neglect of LJ interactions with the solute beyond the 10-Å cutoff.38 Interestingly, the charge contribution converges to the correct answer using four or more λ-windows for both the creation and annihilation pathways. However, as expected from the argon case, the LJ contribution is still in error by 0.1 kcal/mol with ten windows for the annihilation pathway. Overall, it is notable that only four creation windows are needed to obtain the correct, converged ΔGhyd for a TIP4P water molecule in TIP4P water. The result of −6.22 ± 0.02 kcal/mol is in good agreement with the report from 1989 (−6.1 ± 0.3 kcal/mol)11 and the experimental value (−6.32 kcal/mol).40 The agreement with the prior result reflects that 10 λ-windows were used in that study for both the conversion of a TIP4P water molecule to an LJ particle and for creation/annihilation of the LJ particle, though the averaging only covered 2M configurations for each window.11
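As a worked check of how the components combine, the 20-window creation values in Table 1 can be summed with the cutoff correction. This sketch assumes that no separate gas-phase term enters for the rigid TIP4P solute, which the text does not state explicitly; the variable names are illustrative.

```python
# 20-window creation components for TIP4P water in kcal/mol (Table 1).
dg_q = -8.55    # charge perturbation
dg_lj = 2.42    # Lennard-Jones perturbation
lj_cutoff_correction = -0.085  # neglected LJ interactions beyond the 10-Å cutoff

# Assumption: no gas-phase contribution for the rigid solute.
dg_hyd = dg_q + dg_lj + lj_cutoff_correction
print(f"{dg_hyd:.3f} kcal/mol")
```

The sum agrees with the tabulated ΔGhyd of −6.22 kcal/mol to within rounding of the components.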
Methanol, Ethanol and Benzene Analogs.
These molecules, which contain two or more non-hydrogen atoms, provide general testing for small organic molecules. The ΔGhyd results for direct sampling as a function of the number of λ-windows are shown in Table 2. The immediate, striking result is that the ΔGhyd values from use of only 4 creation windows for both charge and LJ perturbations are within 0.07 kcal/mol of the final values from 20 or 30 windows; the statistical uncertainties from the 10 independent runs with 4 windows are all also under 0.2 kcal/mol. In all cases, the ΔGhyd results from the creation calculations are within 0.05 kcal/mol for 10, 20 or 30 λ-windows. In contrast, there are discrepancies of 1–2 kcal/mol between the results with 10 and 30 windows for the annihilations of the benzene analogs and of still ca. 0.3 kcal/mol with 20 and 30 windows for pyridine and ethylbenzene. For small numbers of windows, the annihilations may converge with small statistical uncertainties, e.g., 0.5 kcal/mol for methanol or benzene with two windows, but they again converge to ΔGhyd values that are too negative, by 4–9 kcal/mol in these cases. With two creation windows, the results in these two cases are within 0.2 kcal/mol of the correct values; however, two creation windows are insufficient for the substituted benzenes and pyridine. Overall, it is apparent that reliable ΔGhyd values can be obtained for such solutes with as few as four windows of direct sampling in the creation direction, while FEP annihilations are not well behaved and should be avoided. The agreement between the computed and experimental data in Table 2 is very good, as expected for the current force field with the 1.14*CM1A-LBCC partial charges.38
Table 2.
Computed Free Energies of Hydration (kcal/mol) for Seven Organic Molecules as a Function of the Number of λ-Windows.
| #λ | Methanol (annihilation) | Methanol (creation) | Ethanol (annihilation) | Ethanol (creation) | Benzene (annihilation) | Benzene (creation) | Toluene (annihilation) | Toluene (creation) | p-Xylene (annihilation) | p-Xylene (creation) | Pyridine (annihilation) | Pyridine (creation) | Ethylbenzene (annihilation) | Ethylbenzene (creation) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −10.06 ± 0.67 | −3.43 ± 2.07 | −13.26 ± 0.67 | 0.51 ± 2.36 | −10.86 ± 0.53 | 60.1 ± 38.0 | −12.99 ± 0.47 | 111.0 ± 115.9 | −14.79 ± 0.71 | 187.36 ± 213.92 | −14.19 ± 0.53 | 38.4 ± 28.8 | −14.61 ± 0.64 | 44.5 ± 63.5 |
| 2 | −9.24 ± 0.48 | −5.08 ± 0.21 | −11.36 ± 0.86 | −5.16 ± 0.28 | −10.02 ± 0.53 | −0.49 ± 0.93 | −11.74 ± 0.66 | −0.28 ± 1.25 | −13.50 ± 0.61 | 1.89 ± 1.68 | −12.99 ± 0.67 | −3.72 ± 0.55 | −13.31 ± 0.64 | 3.83 ± 2.24 |
| 4 | −6.91 ± 0.96 | −5.15 ± 0.05 | −8.03 ± 0.81 | −5.21 ± 0.06 | −5.56 ± 0.90 | −0.63 ± 0.11 | −6.93 ± 1.08 | −0.71 ± 0.17 | −8.09 ± 0.93 | −0.73 ± 0.16 | −9.10 ± 1.13 | −4.11 ± 0.12 | −8.26 ± 1.01 | −0.54 ± 0.19 |
| 10 | −5.22 ± 0.66 | −5.15 ± 0.04 | −5.78 ± 0.49 | −5.28 ± 0.04 | −1.10 ± 1.02 | −0.67 ± 0.04 | −1.62 ± 0.95 | −0.71 ± 0.04 | −2.35 ± 0.96 | −0.76 ± 0.08 | −4.92 ± 1.04 | −4.18 ± 0.04 | −2.00 ± 0.74 | −0.53 ± 0.06 |
| 20 | −5.16 ± 0.45 | −5.15 ± 0.01 | −5.44 ± 0.21 | −5.25 ± 0.02 | −0.81 ± 0.66 | −0.68 ± 0.03 | −0.72 ± 0.67 | −0.70 ± 0.03 | −0.56 ± 0.92 | −0.80 ± 0.04 | −4.42 ± 0.44 | −4.17 ± 0.03 | −0.83 ± 0.48 | −0.54 ± 0.05 |
| 30 | −5.21 ± 0.16 | −5.16 ± 0.02 | −5.34 ± 0.25 | −5.26 ± 0.03 | −0.81 ± 0.30 | −0.67 ± 0.03 | −0.69 ± 0.67 | −0.71 ± 0.02 | −0.68 ± 0.86 | −0.80 ± 0.04 | −4.14 ± 0.73 | −4.16 ± 0.02 | −0.48 ± 0.64 | −0.52 ± 0.03 |
| Exp.41 | −5.10 | | −5.01 | | −0.86 | | −0.89 | | −0.80 | | −4.69 | | −0.79 | |
The effects of performing bond shrinking (SRK) with the annihilation of the LJ interactions and expansion for the molecular creations as well as double-wide sampling (DWS) were evaluated for the ΔGhyd of ethylbenzene (Table 3). As noted above, for the direct sampling in the creation direction, use of four windows gives essentially the correct result (−0.54 kcal/mol), though the uncertainty is ±0.19 kcal/mol. The result is poorer (0.39 kcal/mol) with four windows and the shrinking included, while for 10 windows the direct creation results with or without the shrinking are in close agreement with the target value of −0.52 kcal/mol. For 10 windows of double-wide sampling, the results with or without the shrinking are very close, but they are lower than the target by 0.12 and 0.14 kcal/mol. Going to 15 windows of DWS brings the results within 0.02 kcal/mol of the target with or without the shrinking. These results show that for computing the free energy of hydration for ethylbenzene there is no advantage to performing the bond shrinking during the FEP calculations or for double-wide sampling over direct creation, which is consistent with the prior results for insertion or deletion of a hard sphere or LJ particle in an LJ liquid.30,31
Table 3.
Computed Free Energies of Hydration (kcal/mol) for Ethylbenzene Using Direct and Double-Wide Sampling with and without Bond Shrinking.
| #λ | Direct Annihilation | Direct Creation | DWS | Direct SRK Annihilation | Direct SRK Creation | DWS SRK |
|---|---|---|---|---|---|---|
| 1 | −14.61 ± 0.64 | 44.51 ± 63.48 | −4.36 ± 1.53 | −14.78 ± 0.54 | 281.03 ± 81.77 | 29.34 ± 16.97 |
| 2 | −13.31 ± 0.64 | 3.83 ± 2.24 | −4.08 ± 1.03 | −15.28 ± 0.60 | 32.96 ± 16.99 | −4.47 ± 0.74 |
| 4a/5b | −8.26 ± 1.01 | −0.54 ± 0.19 | −1.22 ± 0.48 | −12.60 ± 0.70 | 0.39 ± 0.59 | −1.78 ± 0.44 |
| 10 | −2.00 ± 0.74 | −0.53 ± 0.06 | −0.64 ± 0.23 | −3.05 ± 0.70 | −0.51 ± 0.09 | −0.66 ± 0.29 |
| 20a/15b | −0.83 ± 0.48 | −0.54 ± 0.05 | −0.50 ± 0.30 | −0.83 ± 0.55 | −0.54 ± 0.07 | −0.54 ± 0.11 |
| 30 | −0.48 ± 0.64 | −0.52 ± 0.03 | | −0.56 ± 0.35 | −0.53 ± 0.08 | |
a For direct sampling.
b For double-wide sampling.
Comparison with BAR for Phenol.
A different benzene analog, phenol, was used to compare the results of the FEP direct creation method with the often-used post-processing Bennett Acceptance Ratio (BAR) method.28 Phenol was chosen primarily because a more polar, hydrogen-bonding molecule would be more likely to reveal problematic issues. The MC simulations were conducted as above using four windows in both directions; the energy differences (ΔE) were written to a file at each step and used to estimate the free energy with the pymbar program.42 A comparison of the free energy changes from the Lennard-Jones and charge simulations, and the resulting hydration free energies (kcal/mol) estimated using BAR for different ΔE frequencies are given in Table 4. In all cases, the free energies converged to very similar values. For comparison, the experimental free energy of hydration is −6.61 kcal/mol, and the prior value from the literature with the same force field is −6.51 kcal/mol.38
Table 4.
Comparison of Hydration Free Energy of Phenol (kcal/mol) estimated using BAR and FEP Direct Creation.
| Method | MC Steps | ΔE freq.a | ΔGLJ | ΔGQ | ΔGhydb |
|---|---|---|---|---|---|
| BAR | 280 M | 1 | −7.85 ± 0.10 | 17.05 ± 0.06 | −6.49 ± 0.12 |
| BAR | 280 M | 100 | −7.85 ± 0.10 | 17.05 ± 0.06 | −6.49 ± 0.12 |
| BAR | 280 M | 10000 | −7.87 ± 0.05 | 17.06 ± 0.06 | −6.49 ± 0.08 |
| BAR | 950 Mc | 100 | −7.79 ± 0.09 | 17.04 ± 0.04 | −6.53 ± 0.10 |
| BAR | 950 Mc | 10000 | −7.77 ± 0.04 | 17.02 ± 0.03 | −6.54 ± 0.05 |
| FEP creation | 950 M | 1 | −7.76 ± 0.08 | 17.02 ± 0.07 | −6.55 ± 0.11 |
a Frequency of ΔE inclusion in the BAR calculations.
b The hydration free energies include a long range correction of −1.32 kcal/mol and a gas-phase contribution of 4.04 kcal/mol.
c The pymbar software could not use the full 950 M steps at a frequency of 1.
A plot comparing the convergence of the free energy estimates with number of steps is given in the Supplementary Information. It can be easily seen from the values in Table 4 and the evolution plot that convergence to the same numerical value is obtained in approximately the same number of steps using BAR and direct creation FEP. Since the latter can be done on-the-fly with no need for storage of the energy changes or post-processing, it is the preferred procedure.
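To show what the BAR post-processing involves, here is a minimal, self-contained sketch of Bennett's self-consistent equation for equal numbers of forward and reverse samples, solved by bisection. It is not the pymbar implementation and does not use its API; the function names are hypothetical and the stored ΔE values are assumed to be uncorrelated.

```python
import math

def fermi(x):
    """Numerically safe Fermi function 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def bar_estimate(w_forward, w_reverse, kt=1.0, tol=1e-8):
    """Free energy change A -> B from Bennett's acceptance ratio.

    w_forward: E_B - E_A evaluated on configurations sampled in A.
    w_reverse: E_A - E_B evaluated on configurations sampled in B.
    Requires equal sample counts in the two directions.
    """
    if len(w_forward) != len(w_reverse):
        raise ValueError("this sketch requires equal sample sizes")

    def imbalance(dg):
        lhs = sum(fermi((w - dg) / kt) for w in w_forward)
        rhs = sum(fermi((w + dg) / kt) for w in w_reverse)
        return lhs - rhs  # increases monotonically with dg

    # These limits bracket the root: imbalance(lo) < 0 < imbalance(hi).
    lo = min(min(w_forward), -max(w_reverse)) - 1.0
    hi = max(max(w_forward), -min(w_reverse)) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Unlike the on-the-fly exponential average of eq 1, this estimator needs the stored energy differences from both directions before it can be evaluated, which is the storage and post-processing cost referred to above.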
MIF180.
The results for the ΔGhyd of MIF180 are listed in Table 5. For this large a molecule, some of the calculations with just one λ-window were ill-behaved with the occurrence of infinite energies, and results are not provided. From the calculations with 40 windows, the converged ΔGhyd is −18.4 ± 0.2 kcal/mol. Remarkably, this value is closely approached using just 10 windows with the creation route or double-wide sampling. As for ethylbenzene, the bond shrinking is found to have little impact for these calculations in aqueous solution, though it may be valuable in more constrained environments such as a binding site of a protein. The annihilation procedure requires 30 or 40 windows to yield results within 0.2 kcal/mol of the target, and it again yields ΔGhyd values that are too negative with smaller numbers of windows. The discrepancies between the creation and annihilation results are further illustrated in Figure 4, which shows the computed free energy changes for the LJ part of the calculations in water. Consistent with the results in Table 5, there are significant differences, particularly for λ > 0.5, until at least 20 windows are utilized.
Table 5.
Computed Free Energies of Hydration (kcal/mol) for MIF180 Using Direct and Double-Wide Sampling with and without Bond Shrinking.
| #λ | Direct Annihilation | Direct Creation | DWS | Direct SRK Annihilation | Direct SRK Creation | DWS SRK |
|---|---|---|---|---|---|---|
| 1 | −45.31 ± 1.60 | −10.50 ± 6.27 | 164.0 ± 64.7 | | | |
| 2 | −38.11 ± 1.18 | 76.30 ± 32.01 | −23.10 ± 1.57 | −43.31 ± 1.21 | 185.90 ± 63.58 | −22.54 ± 1.71 |
| 4a/5b | −31.28 ± 1.00 | −16.86 ± 1.59 | −20.16 ± 0.47 | −38.54 ± 0.97 | −7.00 ± 3.24 | −21.86 ± 0.77 |
| 10 | −21.88 ± 0.86 | −18.37 ± 0.20 | −18.17 ± 0.78 | −25.37 ± 1.19 | −18.53 ± 0.28 | −18.87 ± 0.84 |
| 20a/15b | −18.07 ± 0.83 | −18.42 ± 0.19 | −18.43 ± 0.26 | −19.28 ± 1.14 | −18.55 ± 0.21 | −18.68 ± 0.44 |
| 30a/20b | −18.54 ± 0.48 | −18.41 ± 0.13 | −18.13 ± 0.61 | −18.85 ± 0.67 | −18.55 ± 0.11 | −18.41 ± 0.38 |
| 40 | −18.18 ± 0.77 | −18.35 ± 0.17 | | −18.26 ± 0.75 | −18.51 ± 0.15 | |
a For direct sampling.
b For double-wide sampling.
Figure 4.
Free energy changes for the Lennard-Jones component in the direct FEP calculations with different numbers of λ-windows for MIF180. Blue and green lines correspond to creation and annihilation, respectively.
It is also important to note that some results with fewer than 10 windows appear well converged based on the relatively small statistical uncertainty from the 10 independent runs, e.g., ±0.47 kcal/mol for five windows of DWS; however, as in Figure 3, the convergence is to an incorrect value of ΔGhyd. This can be attributed to inclusion of the backward results in the DWS calculations. The ΔGhyd from four windows of direct creation (−16.86 kcal/mol) is also incorrect, though the rms deviation of the average from the ten runs (±1.59 kcal/mol) provides a proper warning. The message is clear that to compute absolute free energies of solution it is best to just perform FEP calculations that only sample in the creation direction.
CONCLUSION
Procedures have been investigated for the complete creation/annihilation of molecules in solution using FEP calculations. Such calculations are needed for the computation of absolute free energies of solution and for important related problems including computation of partition coefficients for equilibria between two media and of absolute free energies of binding for host-guest complexes. The focus was on simulations with on-the-fly averaging for all sampled configurations, namely, direct forward (creation) and backward (annihilation) calculations and double-wide sampling, which involves a combination of the two. As found previously with atomic fluids,30,31 direct creation pathways are much preferred in the present cases covering argon, water, methanol, ethanol, five benzene analogs, and a drug-like molecule, MIF180. The inconsistencies between the creation and annihilation results stem primarily from inadequate representation of low-energy perturbed states for the annihilation of the LJ interactions, as illustrated in Figure 1. Remarkably, for the molecules with as many as eight non-hydrogen atoms, only four forward λ-windows for both the perturbation of the partial charges and LJ terms were found to be necessary to obtain correct, converged results for absolute free energies of hydration. With the forward-only FEP protocol, there is no need for post-processing the configurational energy differences with BAR. Furthermore, ten windows were sufficient to treat MIF180, which has 22 non-hydrogen atoms, properly. The charge perturbations converge more readily and show little sensitivity to the direction of the perturbation (Table 1); it appears to be necessary to use only half as many λ-windows for the charge perturbations as for the LJ ones.
These results provide valuable, time-saving guidelines for future performance of molecular creations/annihilations, which can be expected to become heavily utilized for diverse applications including evaluation of membrane partitioning and guidance for core-hopping and scoring of docking poses in drug discovery.
Acknowledgments
Funding
Gratitude is expressed to the National Institutes of Health (GM32136) and to the Yale University Faculty of Arts and Sciences High-Performance Computing Center for support.
Footnotes
ASSOCIATED CONTENT
Supporting Information. This information is available free of charge on the ACS Publications website at http://pubs.acs.org. It consists of a PDF file showing the evolution of the aqueous free energy for the different methods used in this study and an Excel file (HFEtables.xlsx) with the detailed results (aqueous/gas phase for LJ and charge calculations) for each molecule.
The authors declare no competing financial interest.
REFERENCES
- 1. Zwanzig RW High-temperature equation of state by a perturbation method. I. Nonpolar gases. J. Chem. Phys 1954, 22, 1420–1426.
- 2. Jorgensen WL; Ravimohan C Monte Carlo simulation of differences in free energies of hydration. J. Chem. Phys 1985, 83, 3050–3054.
- 3. Jorgensen WL Free Energy Calculations, A Breakthrough for Modeling Organic Chemistry in Solution. Acc. Chem. Res 1989, 22, 184–189.
- 4. Kollman PA Free Energy Calculations: Applications to Chemical and Biochemical Phenomena. Chem. Rev 1993, 93, 2395–2417.
- 5. Chipot C; Pohorille A Calculating Free Energy Differences Using Perturbation Theory. In Springer Series in Chemical Physics, Vol 86: Free Energy Calculations: Theory and Applications in Chemistry and Biology; Chipot C; Pohorille A, Eds.; Springer-Verlag, Berlin, 2007, p 33–75.
- 6. De Vivo M; Masetti M; Bottegoni G; Cavalli A Role of Molecular Dynamics and Related Methods in Drug Discovery. J. Med. Chem 2016, 59, 4035–4061.
- 7. Cournia Z; Allen B; Sherman W Relative Binding Free Energy Calculations in Drug Discovery: Advances and Practical Considerations. J. Chem. Inf. Model 2017, 57, 2911–2937.
- 8. Kirkwood JG Statistical Mechanics of Fluid Mixtures. J. Chem. Phys 1935, 3, 300–313.
- 9. Straatsma TP; Berendsen HJC; Postma JPM Free energy of hydrophobic hydration: A molecular dynamics study of noble gases in water. J. Chem. Phys 1986, 85, 6720–6727.
- 10. Hermans J; Pathiaseril A; Anderson A Excess free energy of liquids from molecular dynamics simulations. Application to water models. J. Am. Chem. Soc 1988, 110, 5982–5986.
- 11. Jorgensen WL; Blake JF; Buckner JK Free energy of TIP4P water and free energies of hydration of CH4 and Cl- from statistical perturbation theory. Chem. Phys 1989, 129, 193–200.
- 12. Jorgensen WL; Buckner JK; Boudon S; Tirado-Rives J Efficient computation of absolute free energies of binding by computer simulations. Application to the methane dimer in water. J. Chem. Phys 1988, 89, 3742–3746.
- 13. Jorgensen WL Efficient Drug Lead Discovery and Optimization. Acc. Chem. Res 2009, 42, 724–733.
- 14. Abel R; Wang L; Mobley DL; Friesner RA A critical review of validation, blind testing, and real-world use of alchemical protein-ligand binding free energy calculations. Curr. Top. Med. Chem 2017, 17, 1–9.
- 15. Fujitani H; Tanida Y; Ito M; Jayachandran G; Snow CD; Shirts MR; Sorin EJ; Pande VS Direct calculation of the binding energies of FKBP ligands. J. Chem. Phys 2005, 123, 084108.
- 16. Deng Y; Roux B Calculation of standard binding free energies: Aromatic molecules in the T4 lysozyme L99A mutant. J. Chem. Theory Comput 2006, 2, 1255–1273.
- 17. Mobley DL; Graves AP; Chodera JD; McReynolds AC; Shoichet BK; Dill KA Predicting absolute ligand binding free energies to a simple model site. J. Mol. Biol 2007, 371, 1118–1134.
- 18. Rodinger T; Howell PL; Pomès R Calculation of absolute protein-ligand binding free energy using distributed replica sampling. J. Chem. Phys 2008, 129, 155102.
- 19. Jiang W; Roux B Free energy perturbation Hamiltonian replica-exchange molecular dynamics (FEP/H-REMD) for absolute ligand binding free energy calculations. J. Chem. Theory Comput 2010, 6, 2559–2565.
- 20. Jo S; Jiang W; Lee HS; Roux B; Im W CHARMM-GUI Ligand Binder for absolute binding free energy calculations and its application. J. Chem. Inf. Model 2013, 53, 267–277.
- 21. Aldeghi M; Heifetz A; Bodkin MJ; Knapp S; Biggin PC Accurate calculation of the absolute free energy of binding for drug molecules. Chem. Sci 2016, 7, 207–218.
- 22. Cabeza de Vaca I; Qian Y; Vilseck JZ; Tirado-Rives J; Jorgensen WL Enhanced Monte Carlo Methods for Modeling Proteins Including Computation of Absolute Free Energies of Binding. J. Chem. Theory Comput 2018, 14, 3279–3288.
- 23. Wang L; Deng Y; Wu Y; Kim B; LeBard DN; Wandschneider D; Beachy M; Friesner RA; Abel R Accurate modeling of scaffold hopping transformations in drug discovery. J. Chem. Theory Comput 2017, 13, 42–54.
- 24. Gaieb Z; Liu S; Gathiaka S; Chiu M; Yang H; Shao C; Feher VA; Walters WP; Kuhn B; Rudolph MG; Burley SK; Gilson MK; Amaro RE D3R grand challenge 2: blind prediction of protein-ligand poses, affinity rankings, and relative binding free energies. J. Comput. Aided Mol. Design 2018, 32, 1–20.
- 25. Dittrich J; Schmidt D; Pfleger C; Gohlke H Converging a knowledge-based scoring function: DrugScore2018. J. Chem. Inf. Model 2019, 59, 509–521.
- 26. Lu N; Kofke DA; Woolf TB Improving the efficiency and reliability of free energy perturbation calculations using overlap sampling methods. J. Comput. Chem 2004, 25, 28–39.
- 27. Jorgensen WL; Thomas LL Perspective on free-energy perturbation calculations for chemical equilibria. J. Chem. Theory Comput 2008, 4, 869–876.
- 28. Bennett CH Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys 1976, 22, 245–268.
- 29. Ding X; Vilseck JZ; Brooks CL III Fast solver for large scale multistate Bennett acceptance ratio equations. J. Chem. Theory Comput 2019, 15, 799–802.
- 30. Kofke DA; Cummings PT Precision and accuracy of staged free-energy perturbation methods for computing the chemical potential by molecular simulation. Fluid Phase Equil 1998, 150–151, 41–49.
- 31. Lu N; Kofke DA Optimal intermediates in staged free energy calculations. J. Chem. Phys 1999, 111, 4414–4423.
- 32. Lu N; Kofke DA Accuracy of Free-Energy Perturbation Calculations in Molecular Simulation. I. Modeling. J. Chem. Phys 2001, 114, 7303–7311.
- 33. Jorgensen WL; Tirado-Rives J Molecular modeling of organic and biomolecular systems using BOSS and MCPRO. J. Comput. Chem 2005, 26, 1689–1700.
- 34. Jorgensen WL; Chandrasekhar J; Madura JD; Impey RW; Klein ML Comparison of simple potential functions for simulating liquid water. J. Chem. Phys 1983, 79, 926–935.
- 35. Dziedzic P; Cisneros JA; Robertson MJ; Hare AA; Danford NE; Baxter RH; Jorgensen WL Design, Synthesis, and Protein Crystallography of Biaryltriazoles as Potent Tautomerase Inhibitors of Macrophage Migration Inhibitory Factor. J. Am. Chem. Soc 2015, 137, 2996–3003.
- 36. Metropolis N; Rosenbluth AW; Rosenbluth MN; Teller AH; Teller E Equation of state calculations by fast computing machines. J. Chem. Phys 1953, 21, 1087–1092.
- 37. Jorgensen WL; Maxwell DS; Tirado-Rives J Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J. Am. Chem. Soc 1996, 118, 11225–11236.
- 38. Dodda LS; Vilseck JZ; Tirado-Rives J; Jorgensen WL 1.14*CM1A-LBCC: Localized Bond-Charge Corrected CM1A Charges for Condensed-Phase Simulations. J. Phys. Chem. B 2017, 121, 3864–3870.
- 39. Dodda LS; Cabeza de Vaca I; Tirado-Rives J; Jorgensen WL LigParGen web server: an automatic OPLS-AA parameter generator for organic ligands. Nuc. Acids Res 2017, 45 (W1), W331–W336.
- 40. Ben-Naim A; Marcus Y Solvation thermodynamics of nonionic solutes. J. Chem. Phys 1984, 81, 2016–2027.
- 41. Abraham MH; Andonian-Haftvan J; Whiting GS; Leo A; Taft RS Hydrogen bonding. Part 34. The factors that influence the solubility of gases and vapours in water at 298 K, and a new method for its determination. J. Chem. Soc. Perkin Trans 2 1994, 1777–1791.
- 42. Shirts MR; Chodera JD Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys 2008, 129, 124105.