Author manuscript; available in PMC: 2022 Jan 11.
Published in final edited form as: J Comput Aided Mol Des. 2020 Feb 14;34(5):471–483. doi: 10.1007/s10822-020-00285-2

Multi-Phase Boltzmann Weighting

Accounting for Local Inhomogeneity in Molecular Simulations of Water-Octanol Partition Coefficients in the SAMPL6 Challenge

Andreas Krämer 1,*, Phillip S Hudson 1,2,*, Michael R Jones 1, Bernard R Brooks 1
PMCID: PMC8750956  NIHMSID: NIHMS1561921  PMID: 32060677

Abstract

Accurately computing partition coefficients is a pivotal part of drug discovery. Specifically, octanol-water partition coefficients provide insight into the hydrophobicity of drug-like molecules and serve as a de facto proxy for membrane permeability. However, one challenge facing the computation of partition coefficients is the need to encapsulate various microscopic environments. These include regions of largely bulk solvent (i.e., either water or octanol), regions where octanol is saturated with water, and regions of higher salt concentration. Tautomeric effects also require consideration. Thus, we present a Boltzmann weighting approach that incorporates transfer free energies across varying microscopic media, as well as varying tautomeric states, to compute partition coefficients in the SAMPL6 challenge.

Keywords: Alchemical Simulation, Solvation Free Energy, Multi-Phase, Boltzmann Weighting, SAMPL6

1. Introduction

The ability to correctly identify thermochemical properties of drug-like molecules is an essential step in the drug design process [1]. Examples include the resolution of major tautomeric and protonation states, as well as solvent affinity (e.g., solvation free energies and partition coefficients). Various methods exist for these computations, spanning a range of theoretical rigor (e.g., empirical and physics-based) [2]. However, comparing accuracy and precision across methods is difficult. Thus, the statistical assessment of the modelling of proteins and ligands (SAMPL) challenge has become a pivotal benchmark for the computational chemistry community [3–8].

This work is a molecular simulation study. We describe an attempt to determine water-octanol partition coefficients from alchemical free energy simulations in the context of the SAMPL6 log P prediction challenge [9]. Logarithmic partition coefficients, log P, are an often-used indicator for drug efficacy, mainly because they reveal a substance’s ability to permeate through cell membranes and access binding targets. The correlation between partitioning into an organic phase and permeability is famously stated by Overton’s rule [10]. While permeability measurements are tedious in both simulation and experiment, partition coefficients are generally easier to obtain with both methodologies [11]. Although Overton’s rule does not explicitly state a solvent of choice for the organic phase, hexadecane and octanol are the most popular phases in practice [12]. Hexadecane resembles the hydrophobic interior of lipid bilayers, which is often most informative of the permeability [13]. Octanol, in contrast, better models the amphiphilic molecular structure of lipids, which contain apolar tails as well as polar head groups. This leads to a richer microscopic structure; octanol forms inverse micelles in bulk since the polar hydroxyl groups cluster. The non-negligible amount of water that gets solvated in bulk octanol further amplifies the structural heterogeneity; different-sized water clusters and chains form, around which octanol organizes in micelle- and column-like structures [14, 15].

The microscopic heterogeneity in the octanol phase presents a challenge for classical approaches that calculate partition coefficients based on atomistic simulations [16]. Compared to hexadecane, the presence of substructure increases the effective dimension of the configurational space, and sampling this space exhaustively becomes computationally expensive. Free energy simulations in different alchemical states cause a reorganization of solvent in the presence or absence of solute, and converging those simulations in each state becomes tedious. In addition, various microscopic environments coexist throughout both the organic (i.e., octanol) and aqueous phases. This includes environmental heterogeneity in the octanol phase from water saturation, as well as salt accumulation in the aqueous phase (as salt ions do not cross into the octanol [17] and phases are never perfectly mixed on the nano-scale). Other effects to consider in the classical scheme are the possibility of major tautomeric changes between phases, and thus the need to accommodate possible tautomeric states [18–21].

To account for local (and spontaneous) inhomogeneity, the present work follows a multi-phase approach, where both solvents are represented by multiple systems that contain different mixtures of octanol, water, and ions, see Fig. 1. Free energies of solvation for all considered tautomers in all solvent phases are calculated using the CHARMM generalized force field (CGenFF) [22] and a standard free energy simulation protocol. To combine the solvation free energies for these systems into a single partition coefficient, each mixture is assumed to be equally accessible to the solutes. Boltzmann weighting is performed with respect to a mutual reference state, namely the lowest-energy tautomer in gas phase. This approach captures the local inhomogeneity of the system, while allowing a small system size for each simulation.

Fig. 1. Illustration of the various environments considered in this work.

This manuscript presents the methods and results for SAMPL6 submissions ujsvj, 3wvyh, 2mi5w, 5855v, and tzzb5. As detailed later, the results submitted as a blind prediction were affected by a numerical inconsistency in an analysis script. This error was fixed in post-submission analysis and we mostly discuss the updated results. The manuscript is structured as follows: Section 2 presents the methods that were used to identify and parameterize tautomers, set up and run the simulations, and compute partition coefficients. Section 3 describes the results. Section 4 discusses the results and some of the challenges faced on the way. Section 5 provides concluding remarks.

2. Methodology

2.1. Generation of tautomers

Tautomer identification was performed through the OpenBabel, MolVS, and RDKit interfaces for converting SMILES strings into XYZ files [23, 24]. The resulting preliminary tautomers were then examined for possible E/Z isomerization, and IQMol was used to generate initial configurations [25]. All the identified tautomers (labeled with letters a, b, c, etc.) are provided as PDB files in supplemental data. Once all tautomers were identified, geometry optimizations were performed at the PM7 level of theory using MOPAC [26].

Relative energy differences between tautomers, $\Delta A_{\mathrm{gas}}^{\mathrm{ref}\to\mathrm{tautomer}}$, were computed via geometry optimizations of all possible tautomers using MP2 [27] with 6-31G* [28–31] via Gaussian09 [32] or Gaussian16 [33]. Following this, frequency calculations were used to determine thermal corrections to the free energy with a tight SCF convergence threshold.

2.2. Parameterization

Initial CHARMM tautomer parameters (i.e., CSET) were obtained via the Paramchem server [34]. Reparameterization for each respective set of tautomers was attempted via the FFTK program available in VMD [35, 36]. This process was performed in two ways: once where only parameters with penalties based on the Paramchem output were adjusted (i.e., PSET), and once where each molecule was reparameterized from scratch (i.e., ASET). However, a handful of problems occurred in the parameterization process with the FFTK interface. First, through successive rounds of simulated annealing to obtain partial atomic charges, it was found that the resulting charges had fallen outside of the constraint bounds for select cases (e.g., hydrogens with a negatively assigned charge and halogens with a positive charge). This was the case for molecules SM04, SM07, SM12, and SM13 of the PSET as well as SM07 and SM11 of the ASET. A second problem stemmed from a hard-coded bug in the FFTK interface, where the standard TIP3 water-solute QM interaction energy required for CHARMM charge parameterization had an erroneous water OH bond length (0.9527 Å as opposed to 0.9572 Å) in the generated Gaussian input files. This bug seems to be specific to halogenated species SM02, SM04, SM12, and SM16, and had a non-negligible effect on interaction energies (e.g., ~0.5 kcal/mol difference). Also, based on an update to the treatment of halogen complexes in the CHARMM general force field [37], lone pairs are needed to encapsulate “sigma hole” effects. At present, FFTK does not support lone pairs, as the background functionality relies on NAMD, which does not have lone pair functionality [38]. Furthermore, fitting of Urey-Bradley terms and improper dihedrals is not currently supported by FFTK. Due to the lack of support, lone pairs were omitted in all calculations (e.g., the +0.05e charge on each lone pair was added onto its respective parent atom for the CSET parameters).

2.3. System Preparation

Four solvent environments were considered in the log P calculations (see Table 1). For bulk water (wat), a box of 1000 TIP3 waters [39] was constructed and equilibrated for 500 ps at 1 atm and 300 K under the isobaric-isothermal ensemble to obtain a consensus box size of ~31 Å. The equilibration was run in CHARMM c41b2 [40] using a Nosé-Hoover thermostat [41, 42], Particle-mesh Ewald (PME) for electrostatics, and a polynomial potential switching function between 10 and 12 Å for Lennard-Jones interactions. A water-saturated octanol (wso) box was constructed by randomly placing and orienting 356 octanol molecules and 132 TIP3 waters to approximate the experimentally observed mole fraction of water found in octanol (≈0.2705) [43]. The box was equilibrated at 1 atm and 300 K for 2.5 ns in the isobaric-isothermal ensemble in CHARMM, followed by 10 ns equilibration in the same ensemble in OpenMM 7.3.1 [44]. The equilibration yielded a consensus box size of ~45 Å. The pure octanol (ocoh) box was constructed from the equilibrated water-saturated octanol box via removal of waters (i.e., a box composed of 356 octanols). The ionic strength adjusted water (isa_wat) was set up by adding 3 K+/Cl− ion pairs to the bulk water box.
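The compositions quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the helper names are hypothetical, and the box edge of 31 Å is the approximate value from the text, so the resulting salt concentration is a rough estimate.

```python
# Sanity check of the solvent compositions (helper names are illustrative only).
AVOGADRO = 6.02214076e23  # particles per mole


def mole_fraction(n_component: int, n_other: int) -> float:
    """Mole fraction of one component in a two-component mixture."""
    return n_component / (n_component + n_other)


def molarity(n_particles: int, box_edge_angstrom: float) -> float:
    """Concentration in mol/L of n_particles in a cubic box with the given edge."""
    volume_liters = (box_edge_angstrom * 1e-9) ** 3  # 1 Å = 1e-9 dm, 1 dm^3 = 1 L
    return n_particles / (AVOGADRO * volume_liters)


x_water = mole_fraction(132, 356)  # water in water-saturated octanol
c_kcl = molarity(3, 31.0)          # 3 ion pairs in the ~31 Å water box

print(f"water mole fraction in wso: {x_water:.4f}")
print(f"approximate KCl concentration in isa_wat: {c_kcl:.2f} M")
```

With 132 waters and 356 octanols the mole fraction reproduces the quoted 0.2705, and three ion pairs in a box of this size give a concentration of the same order as the nominal 0.15 M.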

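Table 1. Designation of solvent systems. The last four columns give the number of molecules/ions in the solvent.

| Abbreviation | Name | Water | Octanol | K+ | Cl− |
|---|---|---|---|---|---|
| wat | pure water | 1000 | 0 | 0 | 0 |
| isa_wat | ionic strength adjusted water (0.15 M KCl) | 1000 | 0 | 3 | 3 |
| ocoh | pure octanol | 0 | 356 | 0 | 0 |
| wso | water-saturated octanol | 132 | 356 | 0 | 0 |
| gas | gas phase | 0 | 0 | 0 | 0 |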

The solvated systems were initialized by dragging each tautomer into the center of the solvent box by imposing a center of mass restraint. The initial restraint force constant was set to 0.00001 kcal/(mol·Å²) and increased by an order of magnitude every 1 ps over 5 iterations. Some systems ended up with octanol sticking through rings of the solute molecules. These systems were manually postprocessed by shifting the coordinates of the penetrating octanols and running a short equilibration at 200 K.
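The restraint ramp can be written down as a small schedule. This sketch assumes that the five iterations start at the initial force constant and that each stage lasts 1 ps; the function name and return format are hypothetical, not taken from the authors' scripts.

```python
def restraint_schedule(k_initial=1e-5, factor=10.0, n_stages=5, stage_length_ps=1.0):
    """Return (start time in ps, force constant in kcal/(mol*A^2)) for each stage.

    Assumption: stage 0 uses the initial force constant, and every later
    stage multiplies the previous force constant by `factor`.
    """
    return [(i * stage_length_ps, k_initial * factor ** i) for i in range(n_stages)]


for start_ps, k in restraint_schedule():
    print(f"t = {start_ps:.0f} ps: k = {k:.0e} kcal/(mol*A^2)")
```

Under these assumptions the force constant grows from 1e-5 to 1e-1 kcal/(mol·Å²) within 5 ps, which pulls the solute in gently at first and more firmly at the end.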

2.4. Molecular Dynamics Simulations

Simulations were run in OpenMM 7.3.1 using Langevin Dynamics with a 1 fs time step and a 5/ps friction coefficient [44]. Nonbonded interactions were calculated as described in the previous subsection. Preceded by an energy minimization, each solute/solvent system was first equilibrated in the isothermal-isobaric ensemble. A Monte-Carlo barostat [45] was used for pressure control. For the first 20 ps, the reference pressure and temperature were set to 1000 atm and 200 K, respectively, ensuring stable simulations for all systems and preventing blow-ups. The box size was then equilibrated at atmospheric pressure, where the temperature was gradually increased from 200 K to 298.15 K for 250 ps, and left constant at 298.15 K for another 250 ps.

Starting from the equilibrated configurations, independent alchemical simulations were run in parallel in 13 λ-states. Each of these 13 equilibrium simulations redefines the electrostatic and van-der-Waals interactions between solute and solvent atoms, and among solute atoms: At lambda index Λ = 0, the solvent and solute are fully interacting. At Λ = 12, the intermolecular interactions of the solute are fully annihilated. In order to enable simulations of all tautomers in all phases within a reasonable amount of time, larger errors (i.e., ~1 kcal/mol) were tolerated and the lambda spacing was intentionally chosen somewhat wider than usual.

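Each λ-state in Table 2 defines the solute’s intermolecular interactions as follows: The electrostatic scaling parameter, $\lambda_{\mathrm{elec}}$, is multiplied onto the solute’s partial charges. The van-der-Waals scaling parameter, $\lambda_{\mathrm{vdw}}$, designates a soft-core Lennard-Jones (LJ) potential between atoms i and j, where either i or j, or both, represent the index of a solute atom,

$$
U^{\mathrm{LJ}}_{\lambda_{\mathrm{vdw}}}(r_{ij}) = 4\,\varepsilon_{ij}^{\mathrm{eff}} \left[ \left( \frac{\sigma_{ij}}{r_{ij}^{\mathrm{eff}}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}^{\mathrm{eff}}} \right)^{6} \right]. \tag{1}
$$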

Table 2. Parameters for alchemical states. Lambda index Λ and decoupling parameters for van-der-Waals and electrostatic interactions in each λ-state, λ_vdw and λ_elec.

| Λ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| λ_vdw | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.9 | 0.8 | 0.7 | 0.6 | 0.4 | 0.2 | 0.0 |
| λ_elec | 1.0 | 0.8 | 0.6 | 0.4 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |

The effective well-depth and radius are defined as

$$
\varepsilon_{ij}^{\mathrm{eff}} = \lambda_{\mathrm{vdw}}\, \varepsilon_{ij}
$$

and

$$
r_{ij}^{\mathrm{eff}} = r_{ij} + \sigma_{ij} \left( 1 - \lambda_{\mathrm{vdw}} \right),
$$

respectively, where $\sigma_{ij} = (\sigma_i + \sigma_j)/2$ and $\varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}$ are the Lennard-Jones parameters. The use of soft-core potentials prevents the van-der-Waals endpoint problem [46] as well as van-der-Waals clashes, i.e., too short interatomic distances $r_{ij}$ that would lead to large potential energies and thereby cause instabilities. For $\lambda_{\mathrm{vdw}} = 1$, $U^{\mathrm{LJ}}_{\lambda_{\mathrm{vdw}}}$ represents the customary Lennard-Jones potential, while for $\lambda_{\mathrm{vdw}} \to 0$, steric interactions are gradually switched off. The potentials were used as implemented in openmmtools [47], version 0.18.0. Note that openmmtools did not support soft-core potentials for systems that contain any NBFIXes (pair-specific Lennard-Jones parameters), a restriction that initially prevented alchemical simulations for the tautomers m16a, m16c and m16e, and for all systems containing ions.
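To make the behavior of the soft-core form concrete, the sketch below evaluates the functional form described in this section together with the λ schedule from Table 2. It is a plain-Python illustration, not the openmmtools implementation; the function names and the example values for σ and ε are hypothetical.

```python
import math

# Lambda schedule from Table 2 (index = lambda index, 0 to 12).
LAMBDA_VDW = [1.0] * 6 + [0.9, 0.8, 0.7, 0.6, 0.4, 0.2, 0.0]
LAMBDA_ELEC = [1.0, 0.8, 0.6, 0.4, 0.2] + [0.0] * 8


def combine(sigma_i, sigma_j, eps_i, eps_j):
    """Arithmetic mean for sigma, geometric mean for epsilon."""
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)


def softcore_lj(r, sigma, eps, lam_vdw):
    """Soft-core LJ energy with eps_eff = lam*eps and r_eff = r + sigma*(1 - lam)."""
    eps_eff = lam_vdw * eps
    r_eff = r + sigma * (1.0 - lam_vdw)
    x6 = (sigma / r_eff) ** 6
    return 4.0 * eps_eff * (x6 * x6 - x6)


# Hypothetical pair parameters, chosen only for illustration.
sigma, eps = combine(3.5, 3.5, 0.1, 0.1)  # Å, kcal/mol
for lam in sorted(set(LAMBDA_VDW), reverse=True):
    # Energy at a short separation of 2 Å, where the plain LJ wall is steep.
    print(f"lambda_vdw = {lam:.1f}: U(2 Å) = {softcore_lj(2.0, sigma, eps, lam):10.4f} kcal/mol")
```

At λ_vdw = 1 the function reduces to the ordinary Lennard-Jones potential, at λ_vdw = 0 it vanishes, and for intermediate values the energy stays finite even at zero separation because the effective distance never drops below σ(1 − λ_vdw).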

The energy minimization and equilibration were repeated in each λ-state (now with the box size fixed), allowing 500 ps for the system to adapt to the alchemically modified potential energy function. Production simulations, 10 ns in each state, were run in the canonical ensemble. Given the potentially slow relaxation times of octanol configurations, this simulation time presents a compromise between accuracy and computational cost that values efficiency over accuracy. Trajectory frames and energies were saved once per picosecond. Gas phase calculations were carried out in a cubic box spanning 30 Å in each direction. The annihilation of electrostatic and steric interactions was done in the same way as for the solvated systems.

2.5. Free Energy Calculations

The Hamiltonian from each λ-state was cross-evaluated in each trajectory frame from all other λ-states. Free energies of annihilation, $\Delta A_{\mathrm{solvent}\to\mathrm{off}}^{\mathrm{tautomer}}$, were calculated using the Multistate Bennett Acceptance Ratio method (MBAR) for each tautomer and solvent (including gas phase), following a subsampling of statistically uncorrelated data [48].

Using the free energies of annihilation, the free energy of solvation for each tautomer and solvent is given as

$$
\Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{tautomer}} = \Delta A_{\mathrm{gas}\to\mathrm{off}}^{\mathrm{tautomer}} - \Delta A_{\mathrm{solvent}\to\mathrm{off}}^{\mathrm{tautomer}},
$$

see Fig. 2. Statistical errors were calculated by averaging $\Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{tautomer}}$ over five 2 ns blocks. In contrast to the error estimate provided by MBAR, the error of the block-averaging recognizes configurational changes of the solvent that take place within a few nanoseconds.
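The thermodynamic cycle and the block-averaged error can be sketched as follows. The text does not specify whether the reported uncertainty is the standard deviation or the standard error of the block values; this sketch assumes the standard error of the mean, and the block values are made up purely for illustration.

```python
import math
import statistics


def solvation_free_energy(dA_gas_off, dA_solvent_off):
    """Close the cycle gas -> off -> solvent: dA(gas->solvent) = dA(gas->off) - dA(solvent->off)."""
    return dA_gas_off - dA_solvent_off


def block_estimate(block_values):
    """Mean and standard error of the mean over per-block free energies (assumed error definition)."""
    mean = statistics.mean(block_values)
    sem = statistics.stdev(block_values) / math.sqrt(len(block_values))
    return mean, sem


# Hypothetical annihilation free energies (kcal/mol) from five 2 ns blocks.
gas_off = [12.1, 12.0, 12.2, 12.1, 12.0]
solvent_off = [18.9, 19.6, 18.2, 19.1, 18.7]

per_block = [solvation_free_energy(g, s) for g, s in zip(gas_off, solvent_off)]
mean, sem = block_estimate(per_block)
print(f"solvation free energy: {mean:.2f} +/- {sem:.2f} kcal/mol")
```

Because each block is analyzed separately, slow solvent rearrangements that shift the free energy from one block to the next show up directly in the spread of the block values.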

Fig. 2. Schematic diagram of the individual terms used in the free energy calculations. Tautomers are represented by orange circles, solvent phases are represented by blue, green, and yellow rectangles. The tautomers’ intermolecular interactions were alchemically switched off from both solvent and gas phase to compute annihilation free energies, $\Delta A_{\mathrm{solvent}\to\mathrm{off}}^{\mathrm{tautomer}}$ and $\Delta A_{\mathrm{gas}\to\mathrm{off}}^{\mathrm{tautomer}}$. These were related to a mutual reference state through their relative free energies in gas phase, $\Delta A_{\mathrm{gas}}^{\mathrm{ref}\to\mathrm{tautomer}}$.

2.6. Multi-Phase Boltzmann Weighting

Finally, the solvation free energies for all tautomers of a given solute molecule were combined into a single partition coefficient log P. To determine the partitioning into different solvent systems and to identify the prevalent tautomers, the free energy differences between tautomers in solution were evaluated with respect to the reference tautomer, i.e. the one with the lowest QM energy in gas phase

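$$
\Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{ref}\to\mathrm{tautomer}} = \Delta A_{\mathrm{gas}}^{\mathrm{ref}\to\mathrm{tautomer}} + \Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{tautomer}}.
$$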

These free energies with respect to a mutual reference point determine the partitioning through Boltzmann weighting. If different solvent systems are equally accessible to the solute, the probability of finding a given tautomer in a given solvent is

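$$
P(\mathrm{tautomer}, \mathrm{solvent}) = \frac{\exp\left( -\beta\, \Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{ref}\to\mathrm{tautomer}} \right)}{\Sigma_0},
$$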

where the normalization factor Σ0 is the sum over all tautomers and solvent systems

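$$
\Sigma_0 = \sum_{\mathrm{tautomers}} \sum_{\mathrm{solvents}} \exp\left( -\beta\, \Delta A_{\mathrm{gas}\to\mathrm{solvent}}^{\mathrm{ref}\to\mathrm{tautomer}} \right).
$$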

Consequently, to determine the octanol/water partitioning, these probabilities are summed over, and normalized with the number of different water and octanol systems, nw and no, as follows:

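$$
\log P = \log_{10} \left( \frac{n_w}{n_o} \cdot \frac{\sum_{\mathrm{tautomers}} \sum_{\mathrm{octanol\ systems}} P(\mathrm{tautomer}, \mathrm{solvent})}{\sum_{\mathrm{tautomers}} \sum_{\mathrm{water\ systems}} P(\mathrm{tautomer}, \mathrm{solvent})} \right).
$$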
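The weighting scheme of this subsection can be condensed into a short function. The sketch below is an illustration under stated assumptions rather than the authors' analysis script: the function name, the dictionary layout, and the example free energies are hypothetical, and the temperature defaults to the 298.15 K used in the production simulations.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def multi_phase_log_p(dA, water_systems, octanol_systems, temperature=298.15):
    """Boltzmann-weighted log P.

    dA maps (tautomer, solvent) to the free energy in kcal/mol relative to the
    reference tautomer in gas phase. Returns (log P, probabilities).
    """
    beta = 1.0 / (KB * temperature)
    shift = min(dA.values())  # subtract the minimum for numerical stability
    weights = {key: math.exp(-beta * (value - shift)) for key, value in dA.items()}
    sigma0 = sum(weights.values())
    prob = {key: w / sigma0 for key, w in weights.items()}

    p_oct = sum(p for (_, solvent), p in prob.items() if solvent in octanol_systems)
    p_wat = sum(p for (_, solvent), p in prob.items() if solvent in water_systems)
    n_w, n_o = len(water_systems), len(octanol_systems)
    return math.log10((n_w / n_o) * p_oct / p_wat), prob


# Hypothetical free energies (kcal/mol) for two tautomers in three solvent systems.
example = {
    ("a", "wat"): -8.0, ("a", "ocoh"): -10.5, ("a", "wso"): -11.0,
    ("b", "wat"): -6.5, ("b", "ocoh"): -9.0, ("b", "wso"): -8.0,
}
log_p, prob = multi_phase_log_p(example, {"wat"}, {"ocoh", "wso"})
print(f"log P = {log_p:.2f}")
```

The factor n_w/n_o keeps the result unchanged when an identical copy of a solvent system is added, so the number of systems chosen to represent a phase does not by itself bias the partition coefficient.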

3. Results

3.1. Predictions Submitted to the SAMPL6 Challenge

Three sets of partition coefficients were submitted to the SAMPL6 challenge based on the above-described protocol and the CSET (CGenFF) parameters, see Table S2. The simulations differed by the solvents that were considered in the log P calculation: pure water/pure octanol; pure water/wet octanol; pure water/pure octanol/wet octanol. (At the time of submission, the results for the ion-saturated water box were not available due to lacking support for NBFIXes in openmmtools’s alchemy module.) Also, a reweighting to CHARMM energies was included to account for the different van-der-Waals switching functions as described in the Supplementary Information, Section S3. The submissions all achieved mean absolute errors (MAE) and root mean squared errors (RMSE) below 1 kcal/mol and positive correlation with experiment, see Supplementary Information, Figures S1–S3. The partition coefficients were also evaluated for the reoptimized parameters PSET and ASET, see Figs S4 and S5, respectively. The optimized parameters worsened the match with experiment due to the aforementioned problems with FFTK.

In a follow-up analysis we found a numerical inconsistency in an analysis script that changed the result drastically and for the worse with the CSET parameters. Moreover, in one of the simulations for SM09 an octanol molecule stuck through a ring of the solute during the entire simulation, causing large potential energies and effectively removing this tautomer (m9e) from the Boltzmann weighting. In fact, this situation occurred for multiple solutes after the initial generation of the solvation box but all other systems were detected and fixed earlier. Lastly, three tautomers of SM16 (m16a, m16c, and m16e) could not be considered in the submissions due to NBFIXes in their CGenFF parameters. Due to these major flaws in the original results, the remainder of this paper will only discuss the post-submission results where these problems were resolved. The original submissions are briefly described in the supplementary information (see Section S1).

3.2. Consistency between CHARMM and OpenMM

OpenMM was chosen as the molecular dynamics engine for efficiency reasons (CHARMM does not yet have a fast implementation of its alchemical functionality). One concern in using OpenMM instead of CHARMM for sampling was the consistency of CHARMM parameters with the OpenMM simulation program [49, 50]. Some measures had to be taken to improve consistency. First, the analytic long-range correction that was customary in OpenMM systems generated from CHARMM PSF files had to be manually removed. Second, trajectory files generated by the OpenMM DCD reporter were not flawlessly readable by CHARMM due to inconsistent indexing of the trajectory frames (zero-based versus one-based). Therefore, the reporter had to be customized to match the format expected by CHARMM. Third, the selection of the alchemical region had to be handled carefully, since OpenMM’s PSF parser merged residues from different segments together. These three problems were fixed and the fixes were included into OpenMM Version 7.4.0. (Note that OpenMM version 7.4.0 also supports lone pairs more comprehensively which further extends compatibility with the CHARMM force field.)

Another difference between the two programs is the van-der-Waals switching function. While CHARMM implements the potential switching and force switching schemes introduced by Steinbach and Brooks [51], OpenMM’s polynomial switching function has a different functional form. This is an important detail, for example, in simulations of lipid bilayers. Therefore, the influence of the switching function on potential and free energies was studied. In OpenMM, the average potential energy of a 5 ns long simulation in the fully solvated state of m11b (a tautomer of SM11) in water was −16852.3 kcal/mol with a standard deviation of 72.191 kcal/mol. Reevaluating the energies for this trajectory in CHARMM with the VSWITCH potential switching function yielded good agreement with OpenMM, an average of −16851.4 kcal/mol and standard deviation 72.196 kcal/mol. The free energy difference between the two programs was evaluated from Zwanzig’s rule [52] as $\Delta A_{\Lambda=0}^{\mathrm{openmm}\to\mathrm{charmm}} = 0.66 \pm 0.01$ kcal/mol, 2.4 ± 0.2 kcal/mol, and 2.6 ± 0.2 kcal/mol for pure water, pure octanol, and water-saturated octanol, respectively. See Supplementary Information, Section S3, for details. Note that all interactions (solvent-solvent, solvent-solute, and solute-solute) contribute to this difference so that the influence on the solvation free energy, approximately 0.02 kcal/mol, is negligible. Nevertheless, the CHARMM switching functions were implemented for openmmtools, mostly with respect to lipid bilayer applications but also to further improve consistency in the present work. Since this implementation achieved consistent energies between CHARMM and OpenMM, the reweighting to CHARMM was not performed for the post-submission results presented in this manuscript.
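Zwanzig’s rule estimates the free energy difference between two potential energy functions from samples of only one of them. The following is a minimal sketch, assuming per-frame energy differences U_CHARMM − U_OpenMM evaluated on an OpenMM trajectory; the function name and the example numbers are hypothetical.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def zwanzig(delta_u, temperature=298.15):
    """Free energy difference -kT ln < exp(-beta * delta_u) > in kcal/mol.

    delta_u: energy differences (target minus sampled potential) per frame.
    """
    beta = 1.0 / (KB * temperature)
    shift = min(delta_u)  # factor out the smallest value to avoid overflow
    average = sum(math.exp(-beta * (du - shift)) for du in delta_u) / len(delta_u)
    return shift - math.log(average) / beta


# Hypothetical per-frame energy differences in kcal/mol.
differences = [0.70, 0.62, 0.68, 0.65, 0.71, 0.60]
print(f"dA(openmm -> charmm) = {zwanzig(differences):.3f} kcal/mol")
```

The exponential average is dominated by frames with small energy differences, so the estimate is only reliable when the two potentials overlap well, as is the case when they differ merely in the switching function.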

Lastly, the soft-core functionality of openmmtools was expanded to these new switching functions as well as to NBFIXes. The soft-core implementations were tested by comparing the potential energies of SM14 solvated in ionic strength adjusted water. Both end states (fully solvated and fully annihilated) were in excellent agreement with the corresponding CHARMM energies for the initial coordinates.

3.3. Free Energies of Solvation

The free energies of solvation for all tautomers and all solvents are given in the supplementary information, Table S3. The wide lambda spacing and relatively short simulations resulted in large uncertainties. On average the errors were 0.43 kcal/mol in water and 0.78 kcal/mol in octanol, which in combination amounts to approximately 1 log P unit.

The solvation free energies were remarkably sensitive to the solvent composition (e.g., isa_wat vs. wat). Fig. 3 shows the differences in solvation free energies between the pure solvents and the experimental compositions. For some tautomers, the solvation free energies in the pure and mixed systems differed by up to 2.5 kcal/mol. For example, SM04’s tautomer m4a was a factor of ~30 more soluble in water-saturated octanol than in pure octanol, whereas m4b was a factor of ~100 more soluble in pure octanol (see probabilities in Table S4). Despite large error bars, these differences were statistically significant. Similar findings were observed for the same tautomers regarding the two water systems.
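Solubility ratios and free energy differences are related through ΔΔA = k_B T ln(ratio). The helper functions below are an illustrative sketch (hypothetical names, temperature assumed to be 298.15 K) for converting between the two.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def ratio_to_free_energy(ratio, temperature=298.15):
    """Free energy difference in kcal/mol corresponding to a population ratio."""
    return KB * temperature * math.log(ratio)


def free_energy_to_ratio(delta_a, temperature=298.15):
    """Population ratio corresponding to a free energy difference in kcal/mol."""
    return math.exp(delta_a / (KB * temperature))


print(f"ratio 30 -> {ratio_to_free_energy(30.0):.2f} kcal/mol")
print(f"1.0 kcal/mol -> ratio {free_energy_to_ratio(1.0):.1f}")
```

At this temperature a factor of ~30 in solubility corresponds to roughly 2 kcal/mol, which is the order of magnitude of the differences between pure and mixed solvents discussed here.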

Fig. 3. Differences in solvation free energies between the pure solvents (wat and ocoh) and the experimental compositions (isa_wat and wso).

The compilation of all different tautomers (Fig. 3) did not reveal a general preference for either the pure or mixed phases; the results were highly tautomer-dependent and in many cases statistically significant.

3.4. Partition Coefficients

Table 3 lists log P for four different solvent combinations, the three combinations that were submitted to the prediction challenge as well as one that included all four phases. Fig. 4 shows the correlation with experiment for three of them. As expected from the discrepancies and uncertainties in the solvation free energies, the partition coefficients varied depending on the choice of solvents for many of the molecules. The greatest difference, 2.4 log P units, was observed for SM14, which lost its preference for the organic solvents when water-saturated octanol was excluded from the Boltzmann weighting. The best consistency between the different solvent systems was observed for SM08 (0.2 log P).

Table 3. Logarithmic water-octanol partition coefficients, log P, for simulations including four different combinations of solvent and from experiment.

| Molecule | ocoh/wat | wso/wat | ocoh/wso/wat | ocoh/wso/wat/isa_wat | Experiment [9] |
|---|---|---|---|---|---|
| SM02 | 0.4 ± 1.1 | 0.0 ± 1.0 | 0.2 ± 1.0 | 0.2 ± 1.0 | 4.1 |
| SM04 | 5.4 ± 1.2 | 6.8 ± 1.0 | 6.5 ± 1.1 | 6.1 ± 1.0 | 4.0 |
| SM07 | 1.0 ± 0.8 | 2.5 ± 0.9 | 2.3 ± 0.8 | 1.9 ± 0.8 | 3.2 |
| SM08 | 5.0 ± 1.1 | 5.1 ± 1.0 | 5.0 ± 1.1 | 5.2 ± 1.1 | 3.1 |
| SM09 | 0.6 ± 0.9 | 1.3 ± 0.8 | 1.1 ± 0.9 | 1.2 ± 0.8 | 3.0 |
| SM11 | 2.6 ± 0.6 | 2.3 ± 0.7 | 2.5 ± 0.7 | 2.6 ± 0.7 | 2.1 |
| SM12 | 3.8 ± 0.7 | 5.2 ± 0.7 | 4.9 ± 0.7 | 4.2 ± 0.7 | 3.8 |
| SM13 | 1.7 ± 1.4 | 1.3 ± 1.2 | 1.5 ± 1.3 | 0.6 ± 1.3 | 2.9 |
| SM14 | -0.1 ± 0.7 | 2.3 ± 0.5 | 2.0 ± 0.6 | 1.6 ± 0.6 | 2.0 |
| SM15 | 1.2 ± 0.5 | 1.3 ± 0.6 | 1.2 ± 0.6 | 0.6 ± 0.5 | 3.1 |
| SM16 | 0.8 ± 0.6 | 1.6 ± 0.6 | 1.4 ± 0.6 | 1.6 ± 0.7 | 2.6 |

Fig. 4. Simulated versus experimental water-octanol partition coefficients [9], log P, for three different combinations of solvent.

The overall performance with respect to experiment was ~1.7 in terms of the mean absolute error (MAE) and ~2 in terms of the root mean squared error (RMSE). In most cases, the partition coefficients obtained from only two solvent phases bracketed the multiple-solvent results, as expected from a Boltzmann weighted average. The inclusion of all four solvents led to an overall stronger solubility in the organic phase in comparison to the ocoh/wat and wso/wat systems. The molecule that was most soluble in octanol experimentally, SM02, was a major outlier in the simulations. While all other solutes matched experiment within at least 2 log P units, the simulated log P of SM02 consistently underestimated the experimental value by at least 3.7 log P units for all solvent combinations. For most solvent combinations, SM02 was even predicted to be the least soluble in octanol.

The result for SM02 had a strongly negative effect on the overall correlation with experiment. Table 4 shows that R² was in the range of 0.05–0.15. In all cases, the inclusion of water-saturated octanol led to a better match with experiment over pure octanol, whereas the inclusion of isa_wat deteriorated accuracy. The best result was obtained using a combination of pure water and the two different octanol phases, where the MAE and RMSE were 1.57 log P units and 1.86 log P units, respectively.

Table 4. Statistics for all considered solvent combinations. Mean absolute error (MAE), root mean squared error (RMSE) and squared correlation coefficient R². MAE and RMSE are in log P units.

| Solvent combination | MAE | RMSE | R² |
|---|---|---|---|
| ocoh/wat | 1.74 | 1.98 | 0.15 |
| ocoh/isa_wat | 1.99 | 2.26 | 0.07 |
| ocoh/wat/isa_wat | 1.98 | 2.17 | 0.12 |
| wso/wat | 1.60 | 1.93 | 0.10 |
| wso/isa_wat | 1.70 | 2.10 | 0.04 |
| wso/wat/isa_wat | 1.64 | 2.01 | 0.07 |
| ocoh/wso/wat | 1.57 | 1.86 | 0.10 |
| ocoh/wso/isa_wat | 1.68 | 2.04 | 0.05 |
| ocoh/wso/wat/isa_wat | 1.66 | 1.96 | 0.08 |
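The error statistics can be recomputed from the rounded log P values in Table 3. The sketch below does this for the ocoh/wso/wat column; the function names are illustrative, and small deviations from Table 4 are expected because the tabulated log P values carry only one decimal.

```python
import math

# log P values for SM02 ... SM16, copied from the ocoh/wso/wat and experiment columns of Table 3.
predicted = [0.2, 6.5, 2.3, 5.0, 1.1, 2.5, 4.9, 1.5, 2.0, 1.2, 1.4]
experiment = [4.1, 4.0, 3.2, 3.1, 3.0, 2.1, 3.8, 2.9, 2.0, 3.1, 2.6]


def mean_absolute_error(x, y):
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def root_mean_squared_error(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def squared_pearson(x, y):
    """Squared Pearson correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


mae = mean_absolute_error(predicted, experiment)
rmse = root_mean_squared_error(predicted, experiment)
r2 = squared_pearson(predicted, experiment)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}, R^2 = {r2:.2f}")
```

From the rounded inputs this gives an MAE of about 1.55, an RMSE of about 1.86, and an R² of about 0.10, in line with the ocoh/wso/wat row of Table 4.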

4. Discussion

The calculated solvation free energies and log P show unequivocally that solvent composition matters in molecular dynamics studies of solvation. The presence of salt in the water phase and water in the octanol phase had a non-trivial, strongly tautomer-dependent effect. While some differences in solubility due to solvent composition are expected [53, 54], the ~2 kcal/mol disparity observed in the simulations for many tautomers, to the knowledge of the authors, has no experimental precedent. These discrepancies are explained by flaws in both the entropic and enthalpic contributions to the free energy, as detailed in the succeeding subsections 4.1 and 4.2, respectively.

4.1. Sampling Issues

Part of the problem may be the imprecision of the free energy estimates in this work. While a certain degree of uncertainty was intentionally tolerated for computational efficiency, the overall error margins were rather large for alchemical free energy simulations. The average statistical uncertainty over all annihilation free energies was 0.83 kcal/mol from the block-averaging. (In comparison, the MBAR error estimates were 0.32 kcal/mol on average even after subsampling the data according to statistical inefficiency.) The large errors are a result of wide lambda spacings specifically in the annihilation of van-der-Waals interactions and of relatively short simulations (10 ns) in each λ-state.

The octanol phases in particular could require longer simulations due to slow solvent reorganization in the presence or absence of solute. Unfortunately, the slow relaxation affects not only the initial system setup but also every λ-state: In principle, the annihilation of a solute’s electrostatic interactions turns it maximally hydrophobic, which would likely cause a reorientation of all its surrounding octanol molecules and ultimately affect the whole system [55]. Such drastic configurational changes are easily missed on the 10 ns timescale used by the presented free energy protocol [16, 56].

To get an idea of the timescales involved in octanol equilibration, we ran a microsecond long simulation of a larger water/octanol system that started out with ionic strength adjusted water in one slab and pure octanol in the other (see Section S2 in the SI for details). Even after 300 ns, the slab had yet to reach equilibrium, with the concentration of water in the octanol slab steadily increasing. Water transport into the organic phase was mediated by potassium ions approaching the octanol/water interface, which fits in with the overestimated coordination of TIP3 water with potassium [57]. The slab simulation served as proof of concept of two things. First, over the course of the entire simulation, no ions permeated into octanol. Second, the time scales required to properly relax the octanol layer are computationally infeasible. This observation shows that equilibration and sampling of water-saturated octanol are major concerns, even in the absence of solute.

To demonstrate and quantify the convergence problems in the present simulations, we studied radial distribution functions and interaction energies of the tautomer m2a in water-saturated octanol, see Supplementary Figures S8 and S9. The observed discrepancies between four blocks of the simulations are sobering and suggest that simulations much longer than 10 ns are required to sample the configurational space. A detailed discussion of these results is provided in Supplementary Section S4.

Since the equilibration and sampling problems scale with system size, the multi-phase approach provides some alleviation. Various other methods could be considered to further improve sampling, such as a more favorable alchemical transformation, Hamiltonian replica exchange, or nonequilibrium approaches [58–60].

4.2. Force Field Deficiencies

Despite large statistical uncertainty, the widely different solubilities observed for the different water and octanol phases were statistically significant. This points out severe flaws in the employed force field parameters. In part, these issues can be blamed on deficiencies in the TIP3P water model (e.g., overestimation of K+ interaction and under-structuring in the second and third shells). More importantly though, the CGenFF parameters for the different tautomers are assigned by analogy. They have not been validated for the molecules under consideration. Aggregate statistics of intramolecular and charge parameter penalties for tautomer sets are provided in the supporting information. On average, the penalties are high, particularly for the partial atomic charges.

The largest outlier from experiment, SM02, proved particularly problematic due to its trifluoromethyl substituent. The aberrant behavior of fluorinated compounds in the context of partition coefficients is well documented and highlights the need for more sophisticated treatment of non-covalent interactions for fluorinated species [61]. This is emphasized by the fact that efforts to reparameterize this compound (as well as others) resulted in a net increase in hydrophilicity. The charge penalties of SM02 (i.e., measures of how reliable the initial guess charges are) average 156.34, indicating that the initial charges are highly suspect.

Large deviations from experimental partition coefficients are therefore not surprising. Even for simple, well-optimized compounds such as amino-acid side chains, Ahmed and Sandler found that CGenFF yields deviations from experiment as large as 0.45 in terms of the RMSD [62]. This result demonstrates the need for force fields that better describe the intermolecular interactions. In particular, recent calculations of hexadecane/water partition coefficients show that transfer free energies can be drastically improved by including explicit polarizability [63]. Alternatives to assignment via ParamChem, such as MATCH (particularly the machine learning variants of MATCH), are also attractive as a way to better assign initial parameters [64].
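For orientation, deviations in log P can be related to deviations in transfer free energy through the standard relation log P = (ΔG_water − ΔG_octanol) / (RT ln 10), where the ΔG values are solvation free energies. The sketch below is a minimal illustration of that conversion; the temperature of 298.15 K, the kcal/mol units, and the function names are assumptions, not values taken from this work.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def log_p_from_solvation(dg_water, dg_octanol, temperature=298.15):
    """Octanol-water log P from solvation free energies in kcal/mol.
    A more favorable (more negative) octanol value gives a positive log P."""
    return (dg_water - dg_octanol) / (R_KCAL * temperature * math.log(10.0))


def log_units_to_kcal(delta_log_p, temperature=298.15):
    """Convert a difference in log P units into kcal/mol."""
    return delta_log_p * R_KCAL * temperature * math.log(10.0)


# One log unit corresponds to RT ln 10 at the assumed temperature.
print(round(log_units_to_kcal(1.0), 3))
```

Under these assumptions, an error of one log unit corresponds to roughly 1.36 kcal/mol in the transfer free energy, which gives a sense of how tightly the solvation free energies must be converged.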

4.3. Gas-Phase Calculations

The weighting scheme employed for calculating the free energies between tautomers in gas phase was heavily dependent on the level of theory implemented. In principle, using QM calculations to weight MM solvation free energies can be precarious, as there is a level of “error cancellation” granted by using a consistent level of theory for all calculations. However, the MM free energy differences between gas phase tautomers are physically unreasonable, which necessitates QM corrections. Hence, using a QM/MM Hamiltonian for all free energy calculations is desirable. In fact, use of indirect approaches to free energy calculations is strongly advised for these scenarios [65–74].
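To make the sensitivity to the gas-phase free energies concrete, the following sketch computes normalized Boltzmann populations of tautomers from their relative free energies. It is a generic illustration rather than the exact weighting scheme of this work; the function name, the kcal/mol units, and the temperature of 298.15 K are assumptions.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def boltzmann_populations(free_energies, temperature=298.15):
    """Normalized Boltzmann populations for a list of relative free
    energies in kcal/mol. Energies are shifted by the minimum before
    exponentiation to avoid overflow; the shift cancels on normalization."""
    beta = 1.0 / (R_KCAL * temperature)
    g_min = min(free_energies)
    weights = [math.exp(-beta * (g - g_min)) for g in free_energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative gas-phase free energies of three tautomers.
populations = boltzmann_populations([0.0, 1.364, 5.0])
print([round(p, 4) for p in populations])
```

Because the populations depend exponentially on the free energy differences, a shift of about 1.4 kcal/mol between levels of theory changes the population ratio of two tautomers by roughly a factor of ten.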

4.4. High-throughput Parameterization in CHARMM

The parameterization process of CHARMM is somewhat controversial, particularly the ad hoc nature of the added potential terms (e.g., Urey-Bradley terms, improper dihedrals, and CMAP terms). For most intramolecular terms, the LSFITPAR [75] tool has been released to accommodate quick and efficient fitting. However, the fitting of intramolecular parameters requires consistency with the intermolecular terms (i.e., charge and van der Waals parameters). Unfortunately, charge fitting in CHARMM is somewhat ill-defined and ill-advised for large-scale parameterization. In particular, the nature of charge parameterization in CHARMM (e.g., matching to idealized interactions of molecules with TIP3P waters) requires a fair amount of user intervention, largely because users must define “building blocks” as parameterization components of the larger system of interest. Although the FFTK toolkit can be used for de novo parameterization of molecules, it still requires a fair amount of user input via a GUI, making it a poor choice for high-throughput parameterization. With FFTK among the only automated interfaces for CHARMM parameterization, the need for a more streamlined approach to parameter optimization in the CHARMM force field is more apparent than ever.

4.5. CHARMM Force Fields in the OpenMM Ecosystem

The CHARMM force fields remain among the most-used force fields in the MD community. While OpenMM generally supported CHARMM input files, the present work has improved the integration between the programs. All bugfixes and newly implemented features were made available in OpenMM 7.4.0 and through openmmtools (pull request under review). These include: fixes in the PSF parser and DCD reporter, an implementation of CHARMM’s van-der-Waals switching functions, and support for softcore potentials. With these amendments, OpenMM can be utilized as a highly efficient MD engine for alchemical simulations using CHARMM force fields.
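For readers unfamiliar with CHARMM's van der Waals switching, the sketch below evaluates the energy switching function commonly associated with CHARMM's vswitch option, S(r) = (r_off² − r²)² (r_off² + 2r² − 3r_on²) / (r_off² − r_on²)³ for r_on < r < r_off, applied to a Lennard-Jones pair energy in the CHARMM (ε, R_min) convention. This is plain Python for illustration and not the OpenMM implementation referred to above; the function names, the cutoff values of 1.0 and 1.2 nm, and the example parameters are assumptions. The force-based variant (vfswitch) is not shown.

```python
def charmm_vdw_switch(r, r_on, r_off):
    """Energy switching function: 1 below r_on, 0 beyond r_off,
    and a smooth polynomial in r**2 in between."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    r2, on2, off2 = r * r, r_on * r_on, r_off * r_off
    return (off2 - r2) ** 2 * (off2 + 2.0 * r2 - 3.0 * on2) / (off2 - on2) ** 3


def switched_lj(r, epsilon, r_min, r_on=1.0, r_off=1.2):
    """Lennard-Jones energy eps * [(r_min/r)^12 - 2 (r_min/r)^6]
    multiplied by the switching function (distances in nm here)."""
    x6 = (r_min / r) ** 6
    return epsilon * (x6 * x6 - 2.0 * x6) * charmm_vdw_switch(r, r_on, r_off)


# Hypothetical pair parameters: well depth 0.5 kJ/mol at r_min = 0.35 nm.
for distance in (0.35, 1.0, 1.1, 1.2):
    print(distance, switched_lj(distance, epsilon=0.5, r_min=0.35))
```

The switch leaves the potential untouched below r_on and brings it smoothly to zero at r_off, which is why energies from two programs only agree when both use the same switching form and the same cutoffs.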

5. Conclusion

This work presents a physics-based approach for predicting water-octanol partition coefficients in the SAMPL6 log P prediction challenge. This task is well-suited to empirical approaches due to the wealth of experimental data. In contrast, methods based on molecular mechanics are troubled by long octanol relaxation times that require extensive simulations to equilibrate the systems and sample all relevant molecular configurations. Additionally, assigning force field parameters in a consistent and program-independent manner remains challenging, which reflects the general immaturity of the tools available to the community for these tasks. The emergence of more suitable descriptions of the molecular interactions [76–78] as well as better algorithms and software for automated force field development [79–83] promises alleviation. With these improvements underway, the multi-phase reweighting described here offers a promising path to mitigate the sampling problem by breaking a complex molecular system into smaller, more tractable subsystems.

Supplementary Material

10822_2020_285_MOESM2_ESM
10822_2020_285_MOESM1_ESM

Acknowledgements

The authors would like to thank Richard Venable and John Legato for technical assistance. We extend our gratitude to Mehtap Işık, David Mobley, Andy Simmonett, Samarjeet Prasad, and Richard Pastor for helpful comments on the manuscript and general insights. This work was partially supported by the intramural research program of the National Heart, Lung and Blood Institute (NHLBI) of the National Institutes of Health and employed the high-performance computational capabilities of the LoBoS and Biowulf Linux clusters at the National Institutes of Health (http://www.lobos.nih.gov and http://biowulf.nih.gov). P.S.H. acknowledges funding support from the Intramural Research Program of the NIH, NHLBI. A.K. and P.S.H. contributed equally to this work.

Footnotes

1

See openmmtools pull request #431, which implements this system as a testsystem and runs the comparison as a unit test. The CHARMM reference energies and all input files are provided on https://github.com/Olllom/charmm-vs-openmm-energies

Publisher's Disclaimer: This Author Accepted Manuscript is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication but has not been copyedited or corrected. The official version of record that is published in the journal is kept up to date and so may therefore differ from this version.

Supporting Information

The following Supporting Information is available:

- Details on submissions and slab setup.

- Table of Penalty statistics from CGenFF.

- Results of the submitted predictions.

- Illustration of tautomers.

- Tautomer solvation free energies for all solvent phases and associated probabilities.

- Plot of water saturation in the slab simulation.

- PDBs of all tautomers.

References

  • 1. Sangster J (1997) Octanol-water partition coefficients: Fundamentals and physical chemistry. Eur J Med Chem 32(11):842, DOI 10.1016/s0223-5234(97)82764-x
  • 2. Bannan CC, Burley KH, Chiu M, Shirts MR, Gilson MK, Mobley DL (2016) Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge. Journal of Computer-Aided Molecular Design 30(11):927–944
  • 3. Guthrie JP (2009) A blind challenge for computational solvation free energies: Introduction and overview. J Phys Chem B 113(14):4501–4507, DOI 10.1021/jp806724u
  • 4. Geballe MT, Skillman AG, Nicholls A, Guthrie JP, Taylor PJ (2010) The SAMPL2 blind prediction challenge: Introduction and overview. J Comput Aided Mol Des 24(4):259–279, DOI 10.1007/s10822-010-9350-8
  • 5. Muddana HS, Daniel Varnado C, Bielawski CW, Urbach AR, Isaacs L, Geballe MT, Gilson MK (2012) Blind prediction of host–guest binding affinities: A new SAMPL3 challenge. J Comput Aided Mol Des 26(5):475–487, DOI 10.1007/s10822-012-9554-1
  • 6. Muddana HS, Fenley AT, Mobley DL, Gilson MK (2014) The SAMPL4 host–guest blind prediction challenge: An overview. J Comput Aided Mol Des 28(4):305–317, DOI 10.1007/s10822-014-9735-1
  • 7. Yin J, Henriksen NM, Slochower DR, Shirts MR, Chiu MW, Mobley DL, Gilson MK (2016) Overview of the SAMPL5 host–guest challenge: Are we doing better? J Comput Aided Mol Des 31(1):1–19, DOI 10.1007/s10822-016-9974-4
  • 8. Rizzi A, Murkli S, McNeill JN, Yao W, Sullivan M, Gilson MK, Chiu MW, Isaacs L, Gibb BC, Mobley DL, Chodera JD (2018) Overview of the SAMPL6 host–guest binding affinity prediction challenge. Journal of Computer-Aided Molecular Design 32(10):937–963, DOI 10.1007/s10822-018-0170-6
  • 9. Işık M, Levorse D, Mobley DL, Rhodes T, Chodera JD (2019) Octanol-water partition coefficient measurements for the SAMPL6 Blind Prediction Challenge. bioRxiv p 757393, DOI 10.1101/757393
  • 10. Finkelstein A (1976) Water and nonelectrolyte permeability of lipid bilayer membranes. J Gen Physiol 68(2):127–135, DOI 10.1085/jgp.68.2.127
  • 11. Venable RM, Krämer A, Pastor RW (2019) Molecular Dynamics Simulations of Membrane Permeability. Chemical Reviews 119(9):5954–5997, DOI 10.1021/acs.chemrev.8b00486
  • 12. Li S, Hu PC, Malmstadt N (2011) Imaging molecular transport across lipid bilayers. Biophys J 101(3):700–708, DOI 10.1016/j.bpj.2011.06.044
  • 13. Walter A, Gutknecht J (1986) Permeability of small nonelectrolytes through lipid bilayer membranes. J Membr Biol 90(3):207–217, DOI 10.1007/BF01870127
  • 14. MacCallum JL, Tieleman DP (2002) Structures of neat and hydrated 1-octanol from computer simulations. Journal of the American Chemical Society 124(50):15085–15093, DOI 10.1021/ja027422o
  • 15. Chen B, Siepmann JI (2006) Microscopic structure and solvation in dry and wet octanol. The Journal of Physical Chemistry B 110(8):3555–3563, DOI 10.1021/jp0548164
  • 16. Procacci P (2019) Solvation free energies via alchemical simulations: Let’s get honest about sampling, once more. Phys Chem Chem Phys 21(25):13826–13834, DOI 10.1039/c9cp02808k
  • 17. Zhao YH, Abraham MH (2005) Octanol/water partition of ionic species, including 544 cations. The Journal of Organic Chemistry 70(7):2633–2640, DOI 10.1021/jo048078b
  • 18. Yue Z, Li C, Voth GA, Swanson JMJ (2019) Dynamic protonation dramatically affects the membrane permeability of drug-like molecules. Journal of the American Chemical Society 141(34):13421–13433, DOI 10.1021/jacs.9b04387
  • 19. Martin YC (2009) Let’s not forget tautomers. J Comput Aided Mol Des 23(10):693–704, DOI 10.1007/s10822-009-9303-2
  • 20. Pickard FC, König G, Tofoleanu F, Lee J, Simmonett AC, Shao Y, Ponder JW, Brooks BR (2016) Blind prediction of distribution in the SAMPL5 challenge with QM based protomer and pKa corrections. J Comput Aided Mol Des 30(11):1087–1100, DOI 10.1007/s10822-016-9955-7
  • 21. Prasad S, Huang J, Zeng Q, Brooks BR (2018) An explicit-solvent hybrid QM and MM approach for predicting pKa of small molecules in SAMPL6 challenge. Journal of Computer-Aided Molecular Design 32(10):1191–1201, DOI 10.1007/s10822-018-0167-1
  • 22. Vanommeslaeghe K, Hatcher E, Acharya C, Kundu S, Zhong S, Shim J, Darian E, Guvench O, Lopes P, Vorobyov I, Mackerell AD (2009) CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J Comput Chem 31(4), DOI 10.1002/jcc.21367
  • 23. Landrum G (2006) RDKit: Open-source Cheminformatics. DOI 10.2307/3592822, URL http://www.rdkit.org
  • 24. O’Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR (2011) Open Babel: An open chemical toolbox. Journal of Cheminformatics 3(1):33, DOI 10.1186/1758-2946-3-33
  • 25. Gilbert AT (2019) iQmol. URL http://iqmol.org
  • 26. Stewart JJP (2007) Optimization of parameters for semiempirical methods V: Modification of NDDO approximations and application to 70 elements. J Mol Model 13(12):1173–1213, DOI 10.1007/s00894-007-0233-4
  • 27. Møller C, Plesset MS (1934) Note on an approximation treatment for many-electron systems. Phys Rev 46(7):618–622, DOI 10.1103/physrev.46.618
  • 28. Francl MM, Pietro WJ, Hehre WJ, Binkley JS, Gordon MS, DeFrees DJ, Pople JA (1982) Self-consistent molecular orbital methods. XXIII. A polarization-type basis set for second-row elements. J Chem Phys 77, DOI 10.1063/1.444267
  • 29. Gordon MS, Binkley JS, Pople JA, Pietro WJ, Hehre WJ (1982) Self-consistent molecular-orbital methods. 22. Small split-valence basis sets for second-row elements. J Am Chem Soc 104, DOI 10.1021/ja00374a017
  • 30. Hariharan PC, Pople JA (1973) The influence of polarization functions on molecular orbital hydrogenation energies. Theor Chim Acta 28, DOI 10.1007/bf00533485
  • 31. Hehre WJ, Ditchfield R, Pople JA (1972) Self-consistent molecular orbital methods. XII. Further extensions of Gaussian-type basis sets for use in molecular orbital studies of organic molecules. J Chem Phys 56, DOI 10.1063/1.1677527
  • 32. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Mennucci B, Petersson GA, Nakatsuji H, Caricato M, Li X, Hratchian HP, Izmaylov AF, Bloino J, Zheng G, Sonnenberg JL, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Montgomery JA Jr, Peralta JE, Ogliaro F, Bearpark M, Heyd JJ, Brothers E, Kudin KN, Staroverov VN, Kobayashi R, Normand J, Raghavachari K, Rendell A, Burant JC, Iyengar SS, Tomasi J, Cossi M, Rega N, Millam JM, Klene M, Knox JE, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Martin RL, Morokuma K, Zakrzewski VG, Voth GA, Salvador P, Dannenberg JJ, Dapprich S, Daniels AD, Farkas O, Foresman JB, Ortiz JV, Cioslowski J, Fox DJ (2010) Gaussian09 Revision D.01. Gaussian Inc. Wallingford CT
  • 33. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Petersson GA, Nakatsuji H, Li X, Caricato M, Marenich AV, Bloino J, Janesko BG, Gomperts R, Mennucci B, Hratchian HP, Ortiz JV, Izmaylov AF, Sonnenberg JL, Williams-Young D, Ding F, Lipparini F, Egidi F, Goings J, Peng B, Petrone A, Henderson T, Ranasinghe D, Zakrzewski VG, Gao J, Rega N, Zheng G, Liang W, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Throssell K, Montgomery JA Jr, Peralta JE, Ogliaro F, Bearpark MJ, Heyd JJ, Brothers EN, Kudin KN, Staroverov VN, Keith TA, Kobayashi R, Normand J, Raghavachari K, Rendell AP, Burant JC, Iyengar SS, Tomasi J, Cossi M, Millam JM, Klene M, Adamo C, Cammi R, Ochterski JW, Martin RL, Morokuma K, Farkas O, Foresman JB, Fox DJ (2016) Gaussian16 Revision B.01. Gaussian Inc. Wallingford CT
  • 34. Vanommeslaeghe K, Raman EP, MacKerell AD (2012) Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges. J Chem Inf Model 52(12):3155–3168, DOI 10.1021/ci3003649
  • 35. Mayne CG, Saam J, Schulten K, Tajkhorshid E, Gumbart JC (2013) Rapid parameterization of small molecules using the force field toolkit. Journal of Computational Chemistry 34(32):2757–2770, DOI 10.1002/jcc.23422
  • 36. Prall M (2001) VMD: a graphical tool for the modern chemists. Journal of Computational Chemistry 22(1):132–134
  • 37. Soteras Gutiérrez I, Lin FY, Vanommeslaeghe K, Lemkul JA, Armacost KA, Brooks CL, MacKerell AD (2016) Parametrization of halogen bonds in the CHARMM general force field: Improved treatment of ligand–protein interactions. Bioorg Med Chem 24(20):4812–4825, DOI 10.1016/j.bmc.2016.06.034
  • 38. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kalé L, Schulten K (2005) Scalable molecular dynamics with NAMD. Journal of Computational Chemistry 26(16):1781–1802, DOI 10.1002/jcc.20289
  • 39. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of simple potential functions for simulating liquid water. The Journal of Chemical Physics 79(2):926–935, DOI 10.1063/1.445869
  • 40. Brooks BR, Brooks CL, Mackerell AD, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S, Caflisch A, Caves L, Cui Q, Dinner AR, Feig M, Fischer S, Gao J, Hodoscek M, Im W, Kuczera K, Lazaridis T, Ma J, Ovchinnikov V, Paci E, Pastor RW, Post CB, Pu JZ, Schaefer M, Tidor B, Venable RM, Woodcock HL, Wu X, Yang W, York DM, Karplus M (2009) CHARMM: The biomolecular simulation program. J Comput Chem 30(10):1545–1614, DOI 10.1002/jcc.21287
  • 41. Nosé S (1984) A molecular dynamics method for simulations in the canonical ensemble. Mol Phys 52(2):255–268, DOI 10.1080/00268978400101201
  • 42. Hoover WG (1985) Canonical dynamics: Equilibrium phase-space distributions. Phys Rev A 31(3):1695–1697, DOI 10.1103/PhysRevA.31.1695
  • 43. Lang BE (2012) Solubility of water in octan-1-ol from (275 to 369) K. Journal of Chemical & Engineering Data 57(8):2221–2226, DOI 10.1021/je3001427
  • 44. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang LP, Simmonett AC, Harrigan MP, Stern CD, Wiewiora RP, Brooks BR, Pande VS (2017) OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS Comput Biol 13(7):e1005659, DOI 10.1371/journal.pcbi.1005659
  • 45. Chow KH, Ferguson DM (1995) Isothermal-isobaric molecular dynamics simulations with Monte Carlo volume sampling. Comput Phys Commun 91(1–3):283–289, DOI 10.1016/0010-4655(95)00059-O
  • 46. Simonson T (1993) Free energy of particle insertion an exact analysis of the origin singularity for simple liquids. Mol Phys 80(2):441–447, DOI 10.1080/00268979300102371
  • 47. Rizzi A, Chodera J, Naden L, Beauchamp K, Grinaway P, Fass J, Rustenburg B, Ross GA, Swenson DW, Simmonett AC, Krämer A (????) DOI 10.5281/zenodo.2592819
  • 48. Shirts MR, Chodera JD (2008) Statistically optimal analysis of samples from multiple equilibrium states. J Chem Phys 129(12):124105, DOI 10.1063/1.2978177
  • 49. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang LP, Simmonett AC, Harrigan MP, Stern CD, Wiewiora RP, Brooks BR, Pande VS (2017) OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLOS Computational Biology 13(7):e1005659, DOI 10.1371/journal.pcbi.1005659
  • 50. Huang J, Lemkul JA, Eastman PK, Mackerell AD (2018) Molecular dynamics simulations using the Drude polarizable force field on GPUs with OpenMM: Implementation, validation, and benchmarks. J Comput Chem 39(21):1682–1689, DOI 10.1002/jcc.25339
  • 51. Steinbach PJ, Brooks BR (1994) New spherical-cutoff methods for long-range forces in macromolecular simulation. J Comput Chem 15(7):667–683, DOI 10.1002/jcc.540150702
  • 52. Zwanzig RW (1954) High-temperature equation of state by a perturbation method. I. Nonpolar gases. J Chem Phys 22(8):1420–1426, DOI 10.1063/1.1740409
  • 53. Xie WH, Shiu WY, Mackay D (1997) A review of the effect of salts on the solubility of organic compounds in seawater. Marine Environmental Research 44(4):429–444, DOI 10.1016/S0141-1136(97)00017-2
  • 54. Setschenow J (1889) Über die konstitution der salzlösungen auf grund ihres verhaltens zu kohlensäure. Zeitschrift für Physikalische Chemie 4(1):117–125
  • 55. Best SA, Merz KM, Reynolds CH (1999) Free energy perturbation study of octanol/water partition coefficients: Comparison with continuum GB/SA calculations. DOI 10.1021/jp984215v
  • 56. Mobley DL (2012) Let’s get honest about sampling. DOI 10.1007/s10822-011-9497-y
  • 57. Whitfield TW, Varma S, Harder E, Lamoureux G, Rempe SB, Roux B (2007) Theoretical study of aqueous solvation of K+ comparing ab initio, polarizable, and fixed-charge models. Journal of Chemical Theory and Computation 3(6):2068–2082, DOI 10.1021/ct700172b
  • 58. Pohorille A, Jarzynski C, Chipot C (2010) Good practices in free-energy calculations. J Phys Chem B 114(32):10235–10253, DOI 10.1021/jp102971x
  • 59. Williams-Noonan BJ, Yuriev E, Chalmers DK (2018) Free Energy Methods in Drug Design: Prospects of “alchemical Perturbation” in Medicinal Chemistry. DOI 10.1021/acs.jmedchem.7b00681
  • 60. Procacci P (2019) Accuracy, precision, and efficiency of nonequilibrium alchemical methods for computing free energies of solvation. I. Bidirectional approaches. J Chem Phys 151(14):144113, DOI 10.1063/1.5120615
  • 61. Kolář MH, Hobza P (2016) Computer Modeling of Halogen Bonds and Other σ-Hole Interactions. Chemical Reviews 116(9):5155–5187, DOI 10.1021/acs.chemrev.5b00560
  • 62. Ahmed A, Sandler SI (2016) Predictions of the physicochemical properties of amino acid side chain analogs using molecular simulation. Phys Chem Chem Phys 18(9):6559–6568, DOI 10.1039/c5cp05393e
  • 63. Krämer A, Pickard FC, Huang J, Venable RM, Simmonett AC, Reith D, Kirschner KN, Pastor RW, Brooks BR (2019) Interactions of Water and Alkanes: Modifying Additive Force Fields to Account for Polarization Effects. Journal of Chemical Theory and Computation 15(6):3854–3867, DOI 10.1021/acs.jctc.9b00016
  • 64. Yesselman JD, Price DJ, Knight JL, Brooks III CL (2012) MATCH: An atom-typing toolset for molecular mechanics force fields. Journal of Computational Chemistry 33(2):189–202, DOI 10.1002/jcc.21963
  • 65. Gao J, Xia X (1992) A priori evaluation of aqueous polarization effects through Monte Carlo QM-MM simulations. Science 258(5082):631–635, DOI 10.1126/science.1411573
  • 66. Gao J, Luque FJ, Orozco M (1993) Induced dipole moment and atomic charges based on average electrostatic potentials in aqueous solution. J Chem Phys 98(4):2975–2982, DOI 10.1063/1.464126
  • 67. Luzhkov V, Warshel A (1992) Microscopic models for quantum mechanical calculations of chemical processes in solutions: LD/AMPAC and SCAAS/AMPAC calculations of solvation energies. J Comput Chem 13(2):199–213, DOI 10.1002/jcc.540130212
  • 68. Wesolowski T, Warshel A (1994) Ab initio free energy perturbation calculations of solvation free energy using the frozen density functional approach. J Phys Chem 98(20):5183–5187, DOI 10.1021/j100071a003
  • 69. Gao J, Freindorf M (1997) Hybrid ab initio QM/MM simulation of N-methylacetamide in aqueous solution. J Phys Chem A 101(17):3182–3188, DOI 10.1021/jp970041q
  • 70. Zheng YJ, Merz KM (1992) Mechanism of the human carbonic anhydrase II-catalyzed hydration of carbon dioxide. J Am Chem Soc 114(26):10498–10507, DOI 10.1021/ja00052a054
  • 71. Hudson PS, White JK, Kearns FL, Hodoscek M, Boresch S, Lee Woodcock H (2015) Efficiently computing pathway free energies: New approaches based on chain-of-replica and non-Boltzmann Bennett reweighting schemes. Biochim Biophys Acta, Gen Subj 1850(5):944–953, DOI 10.1016/j.bbagen.2014.09.016
  • 72. Hudson PS, Woodcock HL, Boresch S (2015) Use of nonequilibrium work methods to compute free energy differences between molecular mechanical and quantum mechanical representations of molecular systems. J Phys Chem Lett 6(23):4850–4856, DOI 10.1021/acs.jpclett.5b02164
  • 73. Hudson PS, Boresch S, Rogers DM, Woodcock HL (2018) Accelerating QM/MM free energy computations via intramolecular force matching. Journal of Chemical Theory and Computation 14(12):6327–6335, DOI 10.1021/acs.jctc.8b00517
  • 74. Hudson PS, Han K, Woodcock HL, Brooks BR (2018) Force matching as a stepping stone to QM/MM CB[8] host/guest binding free energies: a SAMPL6 cautionary tale. Journal of Computer-Aided Molecular Design 32(10):983–999, DOI 10.1007/s10822-018-0165-3
  • 75. Vanommeslaeghe K, Yang M, Mackerell AD (2015) Robustness in the fitting of molecular mechanics parameters. Journal of Computational Chemistry 36(14):1083–1101, DOI 10.1002/jcc.23897
  • 76. Huang J, Simmonett AC, Pickard FC, MacKerell AD, Brooks BR (2017) Mapping the Drude polarizable force field onto a multipole and induced dipole model. The Journal of Chemical Physics 147(16):161702, DOI 10.1063/1.4984113
  • 77. Ponder JW, Wu C, Ren P, Pande VS, Chodera JD, Schnieders MJ, Haque I, Mobley DL, Lambrecht DS, DiStasio RA, Head-Gordon M, Clark GNI, Johnson ME, Head-Gordon T (2010) Current status of the AMOEBA polarizable force field. The Journal of Physical Chemistry B 114(8):2549–2564, DOI 10.1021/jp910674d
  • 78. Lamoureux G, Roux B (2003) Modeling induced polarization with classical Drude oscillators: Theory and molecular dynamics simulation algorithm. The Journal of Chemical Physics 119(6):3025–3039, DOI 10.1063/1.1589749
  • 79. Wang LP, Martinez TJ, Pande VS (2014) Building force fields: An automatic, systematic, and reproducible approach. J Phys Chem Lett 5(11):1885–1891, DOI 10.1021/jz500737m
  • 80. Mobley DL, Bannan CC, Rizzi A, Bayly CI, Chodera JD, Lim VT, Lim NM, Beauchamp KA, Slochower DR, Shirts MR, Gilson MK, Eastman PK (2018) Escaping atom types in force fields using direct chemical perception. Journal of Chemical Theory and Computation 14(11):6076–6092, DOI 10.1021/acs.jctc.8b00640
  • 81. Krämer A, Hülsmann M, Köddermann T, Reith D (2014) Automated parameterization of intermolecular pair potentials using global optimization techniques. Computer Physics Communications 185(12):3228–3239, DOI 10.1016/j.cpc.2014.08.022
  • 82. Hülsmann M, Kirschner KN, Krämer A, Heinrich DD, Krämer-Fuhrmann O, Reith D (2016) Optimizing Molecular Models Through Force-Field Parameterization via the Efficient Combination of New Modular Program Packages. In: Snurr RQ, Adjiman CS, Kofke DA (eds) Found. Mol. Model. Simulation. Select Pap. from FOMMS 2015, Molecular Modeling and Simulation, Springer Singapore, Singapore, pp 53–77, DOI 10.1007/978-981-10-1128-3
  • 83. Huang L, Roux B (2013) Automated force field parameterization for nonpolarizable and polarizable atomic models based on ab initio target data. Journal of Chemical Theory and Computation 9(8):3543–3556, DOI 10.1021/ct4003477
