Author manuscript; available in PMC: 2020 Aug 13.
Published in final edited form as: J Chem Theory Comput. 2019 Jul 2;15(8):4632–4645. doi: 10.1021/acs.jctc.9b00084

On the use of interaction energies in QM/MM free energy simulations

Phillip S Hudson †,, H Lee Woodcock †,*, Stefan Boresch ¶,*
PMCID: PMC6745021  EMSID: EMS83979  PMID: 31142113

Abstract

The use of the most accurate (i.e., QM or QM/MM) levels of theory for free energy simulations (FES) is typically not possible. Primarily, this is because the computational cost associated with the extensive configurational sampling needed for converging FES is prohibitive. To ensure the feasibility of QM-based FES, the “indirect” approach is generally taken, necessitating a free energy calculation between the MM and QM/MM potential energy surfaces. Ideally, this step is performed with standard free energy perturbation (Zwanzig’s equation), as it only requires simulations to be carried out at the low level of theory; however, work from several groups over the past few years has conclusively shown that Zwanzig’s equation is ill-suited to this task. As such, many approximations have arisen to mitigate difficulties with Zwanzig’s equation. One particularly popular notion is that the convergence of Zwanzig’s equation can be improved by using interaction energy differences instead of total energy differences. Although problematic numerical fluctuations (a major problem when using Zwanzig’s equation) are indeed reduced, our results and analysis demonstrate that this “interaction energy approximation” (IEA) is theoretically incorrect, and the implicit approximation invoked is spurious at best. Herein, we demonstrate this via solvation free energy calculations using the IEA from two different low levels of theory to the same target high level. This proof-of-concept consistently yields incorrect results, deviating by ~1.5 kcal/mol from the rigorously obtained value.

1. Introduction

For many applications, the use of (S)QM/MM1 Hamiltonians is essential.2–7 Unfortunately, when implementing molecular dynamics (MD) based methods for computing free energy differences (i.e., free energy simulations, FES), direct use of (S)QM/MM Hamiltonians causes two major problems. First, particularly for ai-QM methods, the computational cost for the requisite configurational sampling quickly becomes prohibitive. Second, many crucial techniques frequently employed in FES using MM force fields, most notably the use of soft-core potentials,8,9 are incompatible with (S)QM/MM Hamiltonians.3 Many of these limitations can be circumvented by the so-called “indirect” approach to (S)QM/MM FES.10–15 The key idea is outlined in Fig. 1. The goal is to compute the free energy difference between two states X and Y at a high level of theory ($\Delta A^{\mathrm{high}}_{X\to Y}$, dashed arrow), where high denotes an accurate, but expensive method; e.g., ai-QM/MM. Similarly, the label low denotes a computationally affordable method that may not be accurate enough for the intended application. Often, the low level of theory is a standard MM force field, but it could also be a sufficiently fast SQM/MM method (although at the cost of most alchemical tricks). Since the free energy difference along any closed path is zero, the identity

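$$\Delta A^{\mathrm{high}}_{X\to Y} = -\Delta A^{\mathrm{low}\to\mathrm{high}}_{X} + \Delta A^{\mathrm{low}}_{X\to Y} + \Delta A^{\mathrm{low}\to\mathrm{high}}_{Y} \tag{1}$$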

follows immediately from Fig. 1.

Figure 1. Thermodynamic cycle underlying the “indirect” scheme in FES employing (S)QM/MM potential energy functions.

Eq. 1 is the foundation of indirect FES using (S)QM/MM Hamiltonians. The approach has recently re-emerged as a very active area of research, as it is both attractive and littered with potential pitfalls.16–35 In particular, the challenge is to compute $\Delta A^{\mathrm{low}\to\mathrm{high}}$ for states X and Y; the free energy difference at the respective low level of theory, $\Delta A^{\mathrm{low}}_{X\to Y}$, is assumed to be easy to compute. In the majority of past applications of the indirect scheme, $\Delta A^{\mathrm{low}\to\mathrm{high}}$ was computed using Zwanzig’s equation, often called free energy perturbation (FEP),36,37 i.e.

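$$\Delta A^{\mathrm{low}\to\mathrm{high}} = -k_B T \ln \left\langle \exp\left[-\frac{\Delta U^{\mathrm{low}\to\mathrm{high}}}{k_B T}\right] \right\rangle_{\mathrm{low}} \tag{2}$$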

Since the term “free energy perturbation” also encompasses, e.g., Bennett’s acceptance ratio method, we will preferentially refer to Eq. 2 as Zwanzig’s equation in this work. Here $k_B$ and $T$ are Boltzmann’s constant and the temperature; $\Delta U^{\mathrm{low}\to\mathrm{high}} = U^{\mathrm{high}} - U^{\mathrm{low}}$ is the difference in energy of the system evaluated at the low and high levels of theory, and $\langle\ldots\rangle_{\mathrm{low}}$ denotes an ensemble average over configurations sampled at the low level of theory. Use of Eq. 2 has the advantage that MD simulations need to be carried out only at the low level of theory and the energy $U^{\mathrm{high}}$ is computed only for selected configurations during a post-processing step.
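As a minimal sketch of how Eq. 2 can be evaluated in post-processing, the following function computes the exponential average from a list of energy differences. The function name `zwanzig`, the unit choice (kcal/mol) and the default temperature are assumptions made for this illustration and are not taken from the original work; the shift by the minimum is a standard numerical safeguard against overflow.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def zwanzig(delta_u, temperature=300.0):
    """Eq. 2: -kT ln <exp(-dU/kT)> over samples from the low level.

    delta_u: iterable of U_high - U_low values in kcal/mol (hypothetical input).
    """
    delta_u = list(delta_u)
    kt = KB_KCAL * temperature
    # Shift by the smallest value so that the largest exponent is exactly zero.
    shift = min(delta_u)
    total = sum(math.exp(-(du - shift) / kt) for du in delta_u)
    return shift - kt * math.log(total / len(delta_u))
```

Because the exponential average is dominated by the smallest energy differences, a handful of configurations can determine the estimate, which is one way to see why large fluctuations in the energy difference hamper convergence.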

Recent evidence, however, shows that attempts to compute $\Delta A^{\mathrm{low}\to\mathrm{high}}$ by Zwanzig’s equation converge poorly, if at all.18,19,21,38–42 Analysis carried out by various groups suggests that this is mostly caused by differences in the internal energy at the two levels of theory, in particular in bond lengths and angles. E.g., in MM, simple harmonic terms approximate the bond stretching energy, whereas the true functional dependence is more complicated. Thus, configurations generated with MM Hamiltonians are always “slightly wrong” at the (S)QM level of theory. By “slightly wrong” we mean that a bond length, which is perfectly reasonable at the MM level of theory, may actually be quite far from the minimum geometry at the (S)QM/MM level of theory. In principle, the same can happen between two (S)QM/MM levels of theory employing different quantum chemical techniques, but the harmonic terms used in force fields likely exacerbate this. These individually small “configurational mismatches” sum up to large fluctuations of the central quantity entering Zwanzig’s equation, the energy difference $\Delta U^{\mathrm{low}\to\mathrm{high}}$, and, thus, to failure of Zwanzig’s equation to converge.37 Given sufficiently large differences between the potential energy surfaces of two states, such problems can occur even for diatomic molecules.28,43 In the context of indirect (S)QM/MM FES we have observed this for quantum regions as small as 10 atoms.44 This configurational mismatch between levels of theory only compounds with growing system size, and so do the observed fluctuations in $\Delta U^{\mathrm{low}\to\mathrm{high}}$, i.e., $\sigma_{\Delta U^{\mathrm{low}\to\mathrm{high}}}$ becomes large, easily exceeding several kcal/mol.

Usually, convergence problems in Zwanzig’s equation are mitigated by the use of intermediate states with Hamiltonian $H(\lambda)$, where $\lambda$ is the so-called coupling parameter.37 However, in the present context $H(\lambda) = (1-\lambda)H_{\mathrm{low}} + \lambda H_{\mathrm{high}}$, so simulations at any intermediate state $\lambda > 0$ require energy/force evaluations with the high level Hamiltonian. Thus, any performance gain of the indirect scheme is lost. The same problem arises in the other standard methods to compute free energy differences. For Bennett’s acceptance ratio (BAR) method,45 simulations at both the low and the high level of theory are needed. Thermodynamic integration46,47 by construction requires several MD simulations at intermediate Hamiltonians $H(\lambda)$, all of which need force evaluations at the high level of theory. It has been pointed out very recently that the introduction of (S)QM intermediate states might still improve the overall performance of the free energy simulation if it can significantly improve the phase space overlap between levels of theory, thus reducing the computational effort to achieve convergence when computing $\Delta A^{\mathrm{low}\to\mathrm{high}}$.28

In recent years, a growing number of studies attempted to circumvent the poor convergence of Zwanzig’s equation in connection with (S)QM/MM FES by taking into account changes in interaction energies, rather than full energies.16,21,27,38,39,48–59 We will refer to this practice as the interaction energy approximation (IEA). To clarify the term interaction energy, consider a solute, S, in water, W. The interaction energy $U_{\mathrm{inter}}$ of such a system is defined as

$$U_{\mathrm{inter}} = U_{\mathrm{total}}(\mathbf{r}_S,\mathbf{r}_W) - U_S(\mathbf{r}_S) - U_W(\mathbf{r}_W) \tag{3}$$

where $U_{\mathrm{total}}(\mathbf{r}_S,\mathbf{r}_W)$ is the total potential energy of the solute–solvent system, with $\mathbf{r}_S$, $\mathbf{r}_W$ denoting solute and solvent coordinates. $U_S(\mathbf{r}_S)$ and $U_W(\mathbf{r}_W)$ are the potential energies for just the solute and solvent coordinates, respectively. The separation of energy terms in Eq. 3 is always possible, even in the case of non-pairwise interactions, such as a (S)QM/MM description of interactions. The interaction energy between a protein and a ligand can be defined analogously. The IEA consists of replacing total energies in Zwanzig’s equation by interaction energies. Thus, the free energy difference between a system described at the low and high level of theory becomes

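$$\Delta A^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}} = -k_B T \ln \left\langle \exp\left[-\frac{\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}}}{k_B T}\right] \right\rangle_{\mathrm{low}}. \tag{4}$$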

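Since the interaction energy difference $\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}} = U^{\mathrm{high}}_{\mathrm{inter}} - U^{\mathrm{low}}_{\mathrm{inter}}$ does not contain contributions from “mismatches” in intramolecular terms, in particular bond stretching and angle bending terms, $\sigma_{\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}}} \ll \sigma_{\Delta U^{\mathrm{low}\to\mathrm{high}}}$.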
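The effect on the fluctuations can be illustrated with a small sketch of Eq. 3 applied at both levels of theory. All per-frame energies below are invented purely for illustration (they are not data from this work), and the helper name `interaction_energy` is an assumption; the water–water energy is taken to be identical at both levels, as it would be for MM water.

```python
import statistics


def interaction_energy(u_total, u_solute, u_solvent):
    """Eq. 3: total energy minus the isolated solute and solvent energies."""
    return u_total - u_solute - u_solvent


# Hypothetical per-frame energies in kcal/mol (illustrative numbers only).
# Each tuple: (U_total_low, U_S_low, U_total_high, U_S_high, U_W)
frames = [
    (-95.0, 10.0, -91.5, 14.0, -100.0),
    (-94.0, 11.0, -96.2, 9.0, -101.0),
    (-93.0, 12.0, -88.1, 17.0, -99.0),
]

# Total energy difference per frame (enters Eq. 2).
du_total = [hi - lo for lo, _, hi, _, _ in frames]

# Interaction energy difference per frame (enters Eq. 4).
du_inter = [
    interaction_energy(hi, s_hi, w) - interaction_energy(lo, s_lo, w)
    for lo, s_lo, hi, s_hi, w in frames
]

sigma_total = statistics.pstdev(du_total)
sigma_inter = statistics.pstdev(du_inter)
```

In this constructed example the intramolecular mismatch dominates the total energy difference, so `sigma_inter` comes out much smaller than `sigma_total`; whether the resulting free energy is still the desired one is a separate question.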

An early use of interaction energies is a study by Wood et al. from 1999.54 Using Zwanzig’s equation, they sought to account for changes in interaction energies in solute–solvent systems between classical and quantum levels of theory. Specifically, they considered a classical solvated solute system, which was then perturbed to a state where both solute and solvent were treated classically, but the solute–solvent interactions were obtained with DFT. Within the well-defined scope of this study $\Delta U^{\mathrm{low}\to\mathrm{high}}$ is identical to $\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}}$. The work focused on the averages and fluctuations of $\Delta U^{\mathrm{low}\to\mathrm{high}}$, with the goal of refining solvent nonbonded parameters in a way that minimized those quantities. In follow-up work, Wood and co-workers used this approach, referred to as “ABC-FEP”, in several applications ranging from solvation to potential of mean force (PMF) calculations.54–59 More recently, Essex and co-workers adopted the notion of using interaction energy rather than total energy differences.16,21,38,39 Unlike in the earlier work by Wood et al., however, the approach appears more and more as a workaround to avoid the use of total energies as required by the cycle in Fig. 1 and Eq. 2, whilst researchers maintain that the results are full QM/MM free energy differences.

Based on statistical mechanics (see also Theory), Eq. 4 as replacement for Eq. 2 in the context of the indirect thermodynamic cycle (i.e., Fig. 1) cannot be justified rigorously. In fact, in their first use of the IEA, Essex and co-workers stated that use of $\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}}$ constitutes an approximation.16 However, the nature of the approximation was not analyzed; the argument given in favor of interaction energies is that their use captures polarization between solute and solvent, or protein and ligand, and, hence, includes types of interactions not accounted for by, e.g., MM force fields. Essex and co-workers also showed conclusively that Eq. 2 (i.e., Zwanzig’s equation with the full energy difference $\Delta U^{\mathrm{low}\to\mathrm{high}}$) led to an incorrect result, as convergence could not be achieved.39 Lately, the improved convergence seems to have become the main justification for use of interaction energies.39,49 On the other hand, there are cases when use of interaction energies is not appropriate. One obvious counterexample would be the calculation of a PMF along a reaction coordinate when studying a chemical reaction. Clearly, during bond breakage and formation, changes in intramolecular geometries and interactions will lead to an essential contribution to the PMF; hence, they cannot be omitted. Despite the increasing use of the IEA, we are not aware of any detailed investigation into the approximation(s) implied by the use of interaction energies, and no systematic comparison of results obtained with $\Delta U^{\mathrm{low}\to\mathrm{high}}_{\mathrm{inter}}$ instead of $\Delta U^{\mathrm{low}\to\mathrm{high}}$ for computing $\Delta A^{\mathrm{low}\to\mathrm{high}}$ is available.

Providing such an analysis and comparison is the goal of this work. Until recently, this would have been challenging because of the failure of Zwanzig’s equation to converge when using energy differences ΔUlow→high in all but the simplest applications. In recent studies, however, we have demonstrated that Jarzynski’s equation (JAR)60 can be used to obtain converged values for ΔAlow→high via non-equilibrium fast switching work simulations.24,61 Since the mismatches in bond lengths and bond angles correct themselves during even short (less than 100 fs) switching simulations, this source of error is removed easily, regardless of the size of the quantum region. While the method may be too expensive in large scale applications, it can provide reference results in select applications. Several examples are presented in this work, which is organized as follows. In the Theory section we work out in detail the differences between the use of full energies and interaction energies, providing indications when use of the latter may or may not be appropriate. In addition, we briefly summarize the use of non-equilibrium work methods to compute ΔAlow→high. The theoretical considerations are illustrated by numerical results for several model tasks, all concerned with the determination of solvation free energies. In Methods we present the rationale behind the choice of model problems and provide the technical details of the calculations. The Results section is followed by a discussion in which we point out similarities between the IEA and the role of “self-terms” or “intraperturbed-group interactions” in alchemical FES.62,63

2. Theory

2.1. Interaction vs. total energies

2.1.1. Brief recapitulation of Zwanzig’s equation

At the risk of belaboring the obvious, we start by briefly outlining the derivation of Zwanzig’s equation. The customary starting point is two systems (or alternatively, energetic states) X and Y, having potential energies $U_X$ and $U_Y$. Here and in the following, we assume that the kinetic energy part of the Hamiltonian can always be decoupled and, thus, can be ignored. This holds true even when (S)QM/MM Hamiltonians are considered, as the equations of motion are treated classically. Defining $\Delta U_{X\to Y} = U_Y - U_X$, one can obviously write $U_Y = U_X + \Delta U_{X\to Y}$. Further, the free energy of system X, $A_X$, is assumed to be known. Thus, to determine $A_Y$ it suffices to compute the free energy difference

$$\Delta A_{X\to Y} = A_Y - A_X = -k_B T \ln \frac{Z_Y}{Z_X}, \tag{5}$$

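where $Z = \int_\Gamma d\mathbf{r}\, \exp[-U(\mathbf{r})/k_B T]$ is the configurational partition function, with the integration being carried out over all configuration space $\Gamma$ in the canonical ensemble of the two systems X and Y, respectively. Employing the identity $1 = \exp[+U_X(\mathbf{r})/k_B T]\, \exp[-U_X(\mathbf{r})/k_B T]$ and the definition of the expectation value $\langle\Theta\rangle_X = \int_\Gamma d\mathbf{r}\, \Theta(\mathbf{r}) P_X(\mathbf{r}) = \frac{\int_\Gamma d\mathbf{r}\, \Theta(\mathbf{r}) \exp[-U_X(\mathbf{r})/k_B T]}{\int_\Gamma d\mathbf{r}\, \exp[-U_X(\mathbf{r})/k_B T]}$ for some observable $\Theta$ with the Boltzmann probability distribution of state X, $P_X$, leads to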

$$\Delta A_{X\to Y} = -k_B T \ln \left\langle \exp\left[-\Delta U_{X\to Y}/k_B T\right] \right\rangle_X; \tag{6}$$

i.e., the general form of Eq. 2. The subscript X emphasizes that the average 〈…〉X is sampled using the potential energy function of state X. When analyzing what contributes to a free energy difference obtained by Zwanzig’s equation (Eq. 2 or 6), one has to consider both the details of ΔU, as well as the (energy function of the) ensemble at which configurations are sampled.

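While the exponentials within the integral can be rearranged in various manners, the integral in the ensemble average $\langle \exp(-\Delta U/k_B T)\rangle_X$ cannot be split into two (or more) additive terms in a unique manner. Specifically, the energy difference between the two states may consist of two or more distinct terms, e.g., differences in electrostatic (elec) and Lennard-Jones (LJ) interactions, i.e., $\Delta U = \Delta U^{\mathrm{elec}} + \Delta U^{\mathrm{LJ}}$. While separable at the level of energies, it is not possible in an unambiguous fashion to split the exponential average $\langle \exp[-(\Delta U^{\mathrm{elec}} + \Delta U^{\mathrm{LJ}})/k_B T]\rangle_X$, and, hence, the corresponding free energy difference, into contributions from $\Delta U^{\mathrm{elec}}$ and $\Delta U^{\mathrm{LJ}}$.64–66 In other words, $\Delta A_{X\to Y} \neq \Delta A^{\mathrm{elec}}_{X\to Y} + \Delta A^{\mathrm{LJ}}_{X\to Y}$ since $-k_B T \ln \langle \exp[-\Delta U_{X\to Y}/k_B T]\rangle_X \neq -k_B T \ln \langle \exp[-\Delta U^{\mathrm{elec}}_{X\to Y}/k_B T]\rangle_X - k_B T \ln \langle \exp[-\Delta U^{\mathrm{LJ}}_{X\to Y}/k_B T]\rangle_X$.67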
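The non-additivity can be checked numerically with a deliberately tiny, artificial sample. In the sketch below the two energy components are perfectly correlated across two configurations; the numbers, the value of kT (roughly 300 K in kcal/mol) and the helper name `exp_average` are assumptions chosen only to make the inequality visible.

```python
import math

KT = 0.596  # approximately k_B T at 300 K in kcal/mol


def exp_average(delta_u, kt=KT):
    """-kT ln <exp(-dU/kT)> for a list of energy differences."""
    mean_boltzmann = sum(math.exp(-du / kt) for du in delta_u) / len(delta_u)
    return -kt * math.log(mean_boltzmann)


# Two hypothetical configurations; elec and LJ terms rise and fall together.
du_elec = [0.0, 2.0]
du_lj = [0.0, 2.0]
du_full = [e + l for e, l in zip(du_elec, du_lj)]

da_full = exp_average(du_full)
da_sum_of_parts = exp_average(du_elec) + exp_average(du_lj)
```

Here `da_full` is about 0.41 kcal/mol, whereas `da_sum_of_parts` is about 0.79 kcal/mol. The two would agree only if the components were statistically independent in the sampled ensemble, which cannot be assumed in general.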

2.1.2. The special case of solvation free energies — definitions

For the following analysis we focus on the calculation of a solvation free energy for some solute S by an indirect QM/MM FES (Fig. 2). We assume that a solvation free energy has been computed using a MM force field ($\Delta A^{\mathrm{MM}}_{\mathrm{solv}}$); this result is to be refined by a QM/MM method. The IEA has been used in such scenarios;48–50 also, all model problems considered in this work are related to solvation free energies. The considerations given below can be straightforwardly extended to other situations, such as the calculation of binding affinities of protein–ligand interactions.

Figure 2. Indirect scheme to refine a MM solvation free energy at a QM/MM level of theory.

In other words, we are interested in the following specialization of Fig. 1, i.e., the computation of the free energy difference $\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv}}$ for transferring solute S from the gas phase (gasp) into aqueous solution (aq). In the full indirect scheme, instead of following the dotted arrow in Fig. 2, we would use Eq. 1 and compute

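$$\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv}} = \Delta A^{\mathrm{MM}}_{\mathrm{solv}} + \left(\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}} - \Delta A^{\mathrm{MM}\to\mathrm{QM}}_{\mathrm{gasp}}\right). \tag{7}$$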

Using the IEA, i.e., using Eq. 4 instead of Eq. 2, we would instead compute

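$$\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv(inter)}} = \Delta A^{\mathrm{MM}}_{\mathrm{solv}} + \left(\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}}\right) \tag{8}$$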

since by construction there is no (solute–solvent) interaction energy in the gas phase. The central question we want to address is how $\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv}}$ and $\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv(inter)}}$ differ and whether one can expect that $\Delta A^{\mathrm{QM/MM}}_{\mathrm{solv}} \approx \Delta A^{\mathrm{QM/MM}}_{\mathrm{solv(inter)}}$, i.e., whether use of Eq. 8 instead of Eq. 7 will lead to an identical or, at least, comparable result. As a shorthand for later, we add a conditional expression for the MM → QM/MM (low → high) correction encompassing both cases:

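$$\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{corr}} = \begin{cases} \Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}} - \Delta A^{\mathrm{MM}\to\mathrm{QM}}_{\mathrm{gasp}} & \text{full cycle (Fig. 2)} \\ \Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}} & \text{IEA} \end{cases} \tag{9}$$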

We start by writing down, in a schematic way, the various potential energy functions needed to compute the terms in Eqs. 7 and 8. The gas phase is straightforward: the solute (S) is described either by a MM ($U^{\mathrm{MM}}_{\mathrm{gasp}}(\mathbf{r}_S)$) or a QM potential energy function ($U^{\mathrm{QM}}_{\mathrm{gasp}}(\mathbf{r}_S)$). For the aqueous solution, we assume that water (W) is always treated classically, i.e., by a MM force field ($U^{\mathrm{MM}}_{W}(\mathbf{r}_W)$). Here $\mathbf{r}_S$ and $\mathbf{r}_W$ denote the coordinates of the solute and water, respectively. Therefore, the potential energy describing the solute in water at the MM and QM/MM levels of theory is given by

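$$U^{\mathrm{MM}}_{\mathrm{aq}} = U^{\mathrm{MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) + U^{\mathrm{MM}}_{W}(\mathbf{r}_W) = U^{\mathrm{MM}}_{S}(\mathbf{r}_S) + U^{\mathrm{MM}}_{SW}(\mathbf{r}_S,\mathbf{r}_W) + U^{\mathrm{MM}}_{W}(\mathbf{r}_W) \tag{10}$$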

and

$$U^{\mathrm{QM/MM}}_{\mathrm{aq}} = U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) + U^{\mathrm{MM}}_{W}(\mathbf{r}_W). \tag{11}$$

We assume an additive MM force field; hence, in Eq. 10 we have separated intra-solute interactions $U^{\mathrm{MM}}_{S}(\mathbf{r}_S)$ and solute–solvent interactions $U^{\mathrm{MM}}_{SW}(\mathbf{r}_S,\mathbf{r}_W)$. For QM/MM this is not possible, as reflected by the combined term $U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W)$ in Eq. 11.

In preparation for the analysis of the IEA, we also need to give matching expressions for the interaction energies at the two levels of theory:

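$$U^{\mathrm{MM}}_{\mathrm{aq,inter}} = U^{\mathrm{MM}}_{\mathrm{aq}} - U^{\mathrm{MM}}_{S} - U^{\mathrm{MM}}_{W} = U^{\mathrm{MM}}_{SW}(\mathbf{r}_S,\mathbf{r}_W) \tag{12}$$
$$U^{\mathrm{QM/MM}}_{\mathrm{aq,inter}} = U^{\mathrm{QM/MM}}_{\mathrm{aq}} - U^{\mathrm{QM}}_{S} - U^{\mathrm{MM}}_{W} = U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) - U^{\mathrm{QM}}_{S}(\mathbf{r}_S) \tag{13}$$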

Eqs. 12 and 13 follow directly from Eqs. 10 and 11 above; $U_S$ denotes the potential energy of the solute at either the MM or QM level of theory, computed as if the waters were not present.

2.1.3. Analysis of using interaction energy differences

We now turn to the analysis of the free energy differences involved in computing a solvation free energy when using the exact thermodynamic cycle (Fig. 2) and when using the IEA. For the former, the difference in total energies between the MM and QM/MM levels of theory is needed. In the gas phase one trivially has

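$$\Delta U^{\mathrm{MM}\to\mathrm{QM}}_{\mathrm{gasp}} = U^{\mathrm{QM}}_{\mathrm{gasp}}(\mathbf{r}_S) - U^{\mathrm{MM}}_{\mathrm{gasp}}(\mathbf{r}_S). \tag{14}$$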

In aqueous solution, subtracting Eq. 10 from Eq. 11 one obtains

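$$\Delta U^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}} = U^{\mathrm{QM/MM}}_{\mathrm{aq}} - U^{\mathrm{MM}}_{\mathrm{aq}} = U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) - U^{\mathrm{MM}}_{S}(\mathbf{r}_S) - U^{\mathrm{MM}}_{SW}(\mathbf{r}_S,\mathbf{r}_W). \tag{15}$$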

Since interactions between waters are always treated classically, the term $U^{\mathrm{MM}}_{W}(\mathbf{r}_W)$ present in both Eqs. 10 and 11 cancels from the difference in Eq. 15. If interaction energies are used instead, the corresponding interaction energy difference (subtracting Eq. 12 from Eq. 13) is

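$$\Delta U^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}} = U^{\mathrm{QM/MM}}_{\mathrm{aq,inter}} - U^{\mathrm{MM}}_{\mathrm{aq,inter}} = U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) - U^{\mathrm{QM}}_{S}(\mathbf{r}_S) - U^{\mathrm{MM}}_{SW}(\mathbf{r}_S,\mathbf{r}_W) \tag{16}$$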

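Eqs. 14, 15, and 16 are the energy differences entering Zwanzig’s equation to compute $\Delta A^{\mathrm{MM}\to\mathrm{QM}}_{\mathrm{gasp}}$, $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}}$, and $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}}$, respectively.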

We first focus on comparing $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}}$ and $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}}$. These two quantities are obtained from the same underlying simulation of the solute in water, modeled at the low level of theory (MM). Its potential energy $U^{\mathrm{MM}}_{\mathrm{aq}}$ is given by Eq. 10. Recall from the recapitulation of Zwanzig’s equation that, given an initial state X, the energy of the final state Y can be written as $U_Y = U_X + \Delta U_{X\to Y}$. Here, $U_X = U^{\mathrm{MM}}_{\mathrm{aq}}$ and is given by Eq. 10. One can easily convince oneself that adding $\Delta U^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq}}$ (Eq. 15) to $U^{\mathrm{MM}}_{\mathrm{aq}}$ (Eq. 10) leads to the potential energy function $U^{\mathrm{QM/MM}}_{\mathrm{aq}}$ of the QM/MM description (Eq. 11), as expected.

Repeating this seemingly trivial exercise for the IEA, we add $\Delta U^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{aq,inter}}$ (Eq. 16) to the potential energy $U^{\mathrm{MM}}_{\mathrm{aq}}$ (Eq. 10) of the reference state. This leads to

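$$\tilde{U}^{\mathrm{QM/MM}}_{\mathrm{aq}} = U^{\mathrm{QM/MM}}_{S,SW}(\mathbf{r}_S,\mathbf{r}_W) + U^{\mathrm{MM}}_{W}(\mathbf{r}_W) + \left[U^{\mathrm{MM}}_{S}(\mathbf{r}_S) - U^{\mathrm{QM}}_{S}(\mathbf{r}_S)\right] = U^{\mathrm{QM/MM}}_{\mathrm{aq}} + \left[U^{\mathrm{MM}}_{S}(\mathbf{r}_S) - U^{\mathrm{QM}}_{S}(\mathbf{r}_S)\right], \tag{17}$$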

where the tilde is used to distinguish this result from the true $U^{\mathrm{QM/MM}}_{\mathrm{aq}}$ (Eq. 11). The difference between Eq. 17 and Eq. 11, $U^{\mathrm{MM}}_{S}(\mathbf{r}_S) - U^{\mathrm{QM}}_{S}(\mathbf{r}_S)$, reflects that the difference in intramolecular energies between the two levels of theory is excluded when using interaction energies. However, we can go further. Eq. 17, after all, is the effective potential energy function corresponding to the high level of theory when using the IEA. What is the nature of a system described by $\tilde{U}^{\mathrm{QM/MM}}_{\mathrm{aq}}$? The intramolecular energy of the solute in this hypothetical state is described by the low level of theory (MM), whereas solute–water interactions are treated at the desired high level of theory (QM/MM). Thus, when using the IEA, $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}_{\mathrm{corr}}$ (Eq. 9) is the free energy difference between the MM representation of the system and a hybrid representation in which the intramolecular interactions of the solute, our quantum region, are described by MM, and only the solute–water interactions correspond to a QM/MM treatment.
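The bookkeeping behind Eqs. 10 to 17 can be verified for a single configuration with a few lines of arithmetic. The sketch below uses invented energy values and a hypothetical helper name (`end_states`); it only illustrates which end state each choice of energy difference implies.

```python
def end_states(u_s_mm, u_sw_mm, u_w_mm, u_ssw_qmmm, u_s_qm):
    """Return (U_aq^QM/MM, end state reached with total dU, end state with IEA dU)."""
    u_aq_mm = u_s_mm + u_sw_mm + u_w_mm            # Eq. 10
    u_aq_qmmm = u_ssw_qmmm + u_w_mm                # Eq. 11
    du_total = u_aq_qmmm - u_aq_mm                 # Eq. 15
    du_inter = (u_ssw_qmmm - u_s_qm) - u_sw_mm     # Eq. 16
    return u_aq_qmmm, u_aq_mm + du_total, u_aq_mm + du_inter


# Hypothetical energies in kcal/mol for one configuration.
U_S_MM, U_SW_MM, U_W_MM = 10.0, -5.0, -100.0
U_SSW_QMMM, U_S_QM = 8.5, 14.0

true_high, via_total, via_inter = end_states(
    U_S_MM, U_SW_MM, U_W_MM, U_SSW_QMMM, U_S_QM
)

# Residual intramolecular term that the IEA leaves in the end state (Eq. 17).
residual = U_S_MM - U_S_QM
```

With these numbers `via_total` equals `true_high` (-91.5), while `via_inter` equals `true_high + residual` (-95.5), i.e., the chimeric state in which the solute's intramolecular energy is still the MM one.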

In the special case of solvation free energies, there is a second aspect worth pointing out in connection with the IEA. In the full indirect cycle (Fig. 2), the correction $\Delta A^{\mathrm{MM}\to\mathrm{QM}}$ is computed both for the gas phase and for solution. Since in the gas phase there are by definition no interaction energies, any contributions from the gas phase are not taken into account when the IEA is employed. For flexible molecules the minimum energy conformation(s) in the gas phase and in solution need not be the same. Clearly, any free energy contribution from such conformational preferences is omitted in the IEA. This constitutes an additional source of error that goes beyond the earlier considerations, where we identified the chimeric high level state employed by the IEA. Assume a solute–solvent system for which the low-level (MM) description of intra-solute molecular forces in solution is comparable to that of the high-level description; i.e., the solute’s behavior in aqueous solution in the MM simulation is very similar to what it would be during a true QM/MM simulation. Since in this case the remaining difference between the two levels of theory is the solute–solvent interaction, the use of the IEA would seem justified. However, just because the intra-solute molecular forces are adequately described in solution, this does not mean that this is true in the gas phase as well. E.g., biomolecular force fields often sacrifice gas phase properties in order to describe interactions in and with water correctly.68 Since the gas phase is not even considered in the IEA, any such free energy contributions are overlooked. An alternative way of viewing this is to state that in the IEA the properties of the solute, in particular its conformational preferences, are assumed to be identical in the gas phase and in aqueous solution.

Thus, two assumptions are made when applying the IEA to the calculation of solvation free energy differences. (1) The IEA connects the low level of theory (MM in our analysis) with a hypothetical high level state, in which the solute is described by MM and only the solute-solvent interactions obey the QM/MM potential energy function. (2) Further, the IEA implicitly requires that the solute’s conformational preferences are identical in the gas phase and in aqueous solution.

2.2. A brief summary of Jarzynski’s and Crooks’ equations

Next, we briefly describe the theoretical basis of non-equilibrium work methods to accurately compute $\Delta A^{\mathrm{MM}\to\mathrm{QM/MM}}$ corrections even in cases when Zwanzig’s equation fails to converge. Approximately twenty years ago, Jarzynski60 showed that Zwanzig’s equation is merely a limiting case of a much more general identity, i.e.,

$$\Delta A_{X\to Y} = -k_B T \ln \left\langle \exp\left[-\frac{W_{X\to Y}}{k_B T}\right] \right\rangle_X \tag{18}$$

Here, $W_{X\to Y}$ is the non-equilibrium work (NEW) of transforming state X into Y in finite time. We refer to a simulation during which the potential energy function is changed between two states and the corresponding work $W$ is recorded as a “switch”. The average $\langle\ldots\rangle_X$ is carried out over multiple such switches, each starting from an equilibrium configuration and velocity set obtained at state X. Zwanzig’s equation is recovered when carrying out the switch instantaneously; in this case $W_{X\to Y}$ reduces to $\Delta U_{X\to Y} = U_Y - U_X$. Similarly, thermodynamic integration can be viewed as Jarzynski’s equation in the limit of infinitely slow switching times, thus keeping the system at each intermediate step in equilibrium.
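The relation between the switching work and the instantaneous limit can be sketched as follows. The code assumes a linear mixing protocol, $U(\lambda) = (1-\lambda)U_{\mathrm{low}} + \lambda U_{\mathrm{high}}$, with $\lambda$ incremented in equal steps; this protocol, the function names and the value of kT are assumptions made for illustration and are not a description of the simulation setup used in this work.

```python
import math


def switching_work(u_low, u_high):
    """Work accumulated along one switch under an assumed linear lambda protocol.

    u_low[i] and u_high[i] are the two energies of the configuration present
    just before the i-th lambda increment; each increment contributes
    dlambda * (U_high - U_low).
    """
    n_steps = len(u_low)
    dlam = 1.0 / n_steps
    return sum(dlam * (uh - ul) for ul, uh in zip(u_low, u_high))


def jarzynski(work_values, kt=0.596):
    """Eq. 18: -kT ln <exp(-W/kT)> over independent switches (kt in kcal/mol)."""
    work_values = list(work_values)
    shift = min(work_values)  # keeps the largest exponent at zero
    total = sum(math.exp(-(w - shift) / kt) for w in work_values)
    return shift - kt * math.log(total / len(work_values))
```

For a single-step switch, `switching_work([u_low], [u_high])` returns `u_high - u_low`, so feeding such values into `jarzynski` reproduces Zwanzig's equation, in line with the limiting case described above.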

Jarzynski’s equation has become an indispensable tool to model biophysical/biochemical processes that range from pulling experiments to ligand (un)binding; more recently it has also been used in alchemical FES.69,70 As we showed in earlier work,24,61 it is also well suited for the problem at hand, the reliable computation of $\Delta A^{\mathrm{low}\to\mathrm{high}}$. Even short switching protocols (50–100 MD steps) can ameliorate mismatches between the low and high level description of the system, in particular disparities in bonded degrees of freedom. While quite costly, in particular if ai-QM methods are used, the switches are a post-processing step, and hence any number of switching simulations can be trivially carried out in parallel.

Just as Jarzynski’s equation is the non-equilibrium extension of Zwanzig’s equation, its two-sided equilibrium variant, BAR, can also be generalized to averages of forward and backward non-equilibrium work values. In particular, Crooks71 proved the general theorem

$$\exp(-\beta \Delta A) = \frac{\langle f(W)\rangle_X}{\langle f(-W)\exp(-\beta W)\rangle_Y} \tag{19}$$

Eq. 19 is true for any function f (W), where, as in the above discussion of Jarzynski’s equation, W denotes a non-equilibrium work value. The average 〈. . .〉 is taken over multiple work values, starting from equilibrated configurations corresponding to state X or Y (as indicated by the subscript). By minimizing the statistical variance with respect to the arbitrary function f (W), Bennett was the first to show that the choice

$$f(W) = \left[1 + \frac{n_{X\to Y}}{n_{Y\to X}} \exp\left(\beta(W - \Delta A)\right)\right]^{-1} \tag{20}$$

minimizes the variance in the free energy estimate.45 Here $n_{X\to Y}$ and $n_{Y\to X}$ are the number of forward and backward switches, respectively. While Bennett’s derivation was carried out for the equilibrium case, Eq. 20 can also be used to compute free energy differences from Crooks’ theorem. Additional details about using Jarzynski’s and Crooks’ equations to compute $\Delta A^{\mathrm{low}\to\mathrm{high}}$ can be found in our earlier work; for general background we refer the reader to a recent review of NEW methods by Dellago and Hummer.72
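Inserting Eq. 20 into Eq. 19 gives an implicit equation for the free energy difference, which can be rearranged so that a sum over forward work values balances a sum over backward work values. The sketch below solves this balance by bisection; the function names, the bracketing strategy and the value of kT are assumptions for illustration, and the code is not the implementation used in this work.

```python
import math


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar(w_forward, w_reverse, kt=0.596, tol=1e-10):
    """Solve Eqs. 19 and 20 for dA from forward (X->Y) and reverse (Y->X) work."""
    n_f, n_r = len(w_forward), len(w_reverse)
    m = math.log(n_f / n_r)

    def imbalance(da):
        # Forward sum minus reverse sum; increases monotonically with da.
        lhs = sum(fermi(m + (w - da) / kt) for w in w_forward)
        rhs = sum(fermi(-m + (w + da) / kt) for w in w_reverse)
        return lhs - rhs

    # Initial bracket around the observed work values, widened if necessary.
    lo = min(min(w_forward), -max(w_reverse)) - 10.0 * kt
    hi = max(max(w_forward), -min(w_reverse)) + 10.0 * kt
    while imbalance(lo) > 0.0:
        lo -= 10.0 * kt
    while imbalance(hi) < 0.0:
        hi += 10.0 * kt

    # Plain bisection on the monotonic imbalance function.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Passing instantaneous energy differences instead of switching work turns the same routine into equilibrium BAR. Note that the reverse work values are those recorded for the Y to X direction, so for a dissipation-free process they equal the negative of the forward values.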

In principle, BAR and Crooks are problematic methods for computing ΔAlow→high since in both cases simulations at the high level of theory are required (cf. the Introduction). However, both in previous work,24,61 as well as in this study SQM/MM was/is employed as the high level of theory for model problems. For these systems, simulations at an SQM/MM level of theory are feasible. Being able to use two-sided free energy perturbation methods, such as BAR or Crooks serves as a stringent test whether free energy differences obtained by one-sided methods (i.e., Jarzynski’s equation) have indeed converged.

3. Methods

Overview of model calculations

Validating the IEA is challenging due to the poor convergence of Zwanzig’s equation when computing $\Delta A^{\mathrm{low}\to\mathrm{high}}$. Specifically, it is difficult to obtain reference results for the full thermodynamic cycle (Fig. 2) for all but the simplest systems. E.g., Ref. 39 convincingly demonstrated the convergence failure of Zwanzig’s equation, but because of the practical impossibility of achieving convergence, it could not be shown that the use of interaction energies instead of the full thermodynamic cycle would lead to identical results. In contrast, computing $\Delta A^{\mathrm{low}\to\mathrm{high}}$ with NEW methods is much more robust than with Zwanzig’s equation; thus, we will rely heavily on these methods to obtain reference results.

Although (S)QM/MM calculations are the ultimate target, it seems instructive to start with purely MM test cases. For such systems reference results and results based on the IEA can be calculated with high precision. The application we will be studying is the solvent affinity of blocked amino acids using three CHARMM force fields to describe the solute with the TIP3 water model employed in all three cases. We arbitrarily consider one force field the low level of theory, and the two others high levels of theory.

Our proof-of-concept work demonstrating the utility of Jarzynski’s equation to compute ΔAlow→high contained examples of varying complexity.24,61 Thus, we will use some of these earlier results for ΔAlow→high as reference values and compare them to values obtained with the IEA. First, we chose the zeroth order test application of FES, the solvation free energy differences of ethane and methanol, and refine MM results at the SQM/MM level of theory (see below for details), using both the full indirect cycle as shown in Fig. 2, as well as the approximation based on interaction energies. For these systems, we expect the use of interaction energies to suffice. Second, we report results for the blocked amino acids alanine (N-acetyl-alanine-methylamide, Ala) and serine (N-acetyl-serine-methylamide, Ser). These are flexible systems, with different conformational preferences in the gas phase and in aqueous solution, as well as when described by force fields and at the SQM/MM level of theory. Thus, they contain all the potential complications discussed in the Theory section which might result in incorrect results relying on interaction energies.

In Ref. 61 we also reported results for bis-2-chloro diethyl ether (2CLE) in the gas phase. This rather small molecule proved quite difficult since MM parameters obtained from the CHARMM generalized (CGenFF) force field 73 led to rather different conformational preferences compared to SQM. While we could obtain converged results for ΔAlow→high using Jarzynski’s equation, the low to high correction could be obtained more easily using reoptimized force field parameters. Here, we present results not only for the gas phase but for 2CLE’s full free energy of solvation calculation, using two different MM representations as the low and SQM/MM as the high level of theory according to Fig. 2. The result obtained with the full indirect cycle will be compared to the use of the IEA.

Free energy differences between force fields

The first generation of the CHARMM all-atom protein force field74 was revised in 2004 by the introduction of the so-called cross-term correction (“CMAP”).68 We consider the initial protein force field,74 referred to here as C22,noCMAP, to be the low level of theory, and the revised force field,68 referred to as C22, a possible “high” level of theory. The two parameter sets differ only by the absence or presence of the cross-term correction. The CMAP force field term is an intramolecular energy term and pertains only to the solute. Hence, using the interaction energy approach, the solvation free energy difference between any system described with the C22,noCMAP and the C22 force fields will be zero by definition, as the interaction energies are the same in both cases. We will compare this theoretical result based on the IEA with the true relative solvation free energy difference for Ala and Ser when using the two force fields. Further, we will repeat this comparison for C22,noCMAP and the current CHARMM36 protein force field (C36).75,76 In this case, there are several differences between the two parameter sets, some also directly affecting protein–water interactions.77 Hence, both corrections $\Delta\Delta A^{\mathrm{low}\to\mathrm{high}}_{\mathrm{solv}}$, the one based on the IEA and the one based on the full thermodynamic cycle (Fig. 2), need to be computed.

Specifically, MD simulations of blocked Ala and Ser were carried out in the gas phase and in aqueous solution, using the three force fields C22,noCMAP, C22, and C36. Trajectories were saved and energies recomputed at the respective other states. E.g., configurations saved during simulations with C22 were reevaluated with C22,noCMAP and C36. Based on these energy time series, BAR was used to compute the free energy differences between force fields in a single step, in analogy to Ref. 78. Trajectories generated in aqueous solution were further used to compute interaction energies (the energy of the solute and water-water energies were subtracted from the full energies). Then, Zwanzig’s equation was used to compute the free energy difference corrections based on the IEA. Error bars for the BAR and Zwanzig’s equation free energy differences were obtained as follows. Dividing the full data set (here 100,000 ΔU values) into ten sequential blocks, we computed $\Delta A_i$ (i = 1, . . . 10) for each block independently. The standard deviation s of these ten $\Delta A_i$ is reported as statistical error estimate ($s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(\Delta A_i - \overline{\Delta A}\right)^2}$, where in this work N = 10). The approach just outlined was motivated by an approach of Wood et al.79 to estimate “sample size hysteresis” and was also used in Refs. 44, 61. Simulation details are provided in SI.
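The block-based error estimate described above can be sketched as follows. This is a minimal illustration, not the authors' script: the thermal energy `KT` (about 298 K), the function names, and the synthetic Gaussian ΔU series are assumptions made for the example, and Zwanzig's equation is used as the per-block estimator (a BAR estimator could be passed in the same way).

```python
import math
import random
import statistics

KT = 0.593  # kcal/mol, roughly 298 K; an assumed value, not taken from the paper


def zwanzig(delta_u, kt=KT):
    """Zwanzig estimate dA = -kT ln <exp(-dU/kT)>, using a log-sum-exp shift."""
    x = [-du / kt for du in delta_u]
    x_max = max(x)
    log_mean = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return -kt * log_mean


def block_error(delta_u, n_blocks=10, estimator=zwanzig):
    """Return (estimate from the full series, standard deviation s of the
    estimates obtained from n_blocks sequential blocks)."""
    size = len(delta_u) // n_blocks
    per_block = [estimator(delta_u[i * size:(i + 1) * size]) for i in range(n_blocks)]
    # statistics.stdev uses the 1/(N-1) normalization of the formula above
    return estimator(delta_u), statistics.stdev(per_block)


if __name__ == "__main__":
    rng = random.Random(0)
    series = [rng.gauss(1.0, 0.5) for _ in range(100_000)]  # synthetic dU values
    estimate, s = block_error(series)
    print(f"dA = {estimate:.3f} +/- {s:.3f} kcal/mol")
```

Because the blocks are sequential, slow drifts in the time series show up as a larger spread between blocks, which is the “sample size hysteresis” effect the estimate is meant to expose.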

Ethane/methanol, Ala/Ser

We take results from Ref. 61 for ΔAlow→high, with low referring to the C36 force field and high to the semi-empirical SCC-DFTB method80–84 with the 3ob-3-1 parameter set85–88 and third order corrections (herein referred to as DFTB3). The trajectories saved during earlier work61 were used to compute interaction energy differences and to compute ΔAlow→high according to the IEA. In particular, we report results for ethane (ETHA), methanol (MEOH), blocked Ala and Ser, for which we had computed values for ΔAlow→high both in the gas phase and in aqueous solution. The details of the underlying simulations can be found in Ref. 61.

Solvation free energy of bis-2-chloro diethyl ether (2CLE)

Finally, we carried out a full solvation free energy calculation for 2CLE. As reported for the gas phase earlier,61 for this molecule the CGenFF force field parameters generated by PARAMCHEM89,90 turned out to lead to very different conformational sampling than at the DFTB3 level of theory. Using refined force field parameters obtained with GAAMP91 resulted in more similar conformational sampling and facilitated convergence of $\Delta A_{\mathrm{gasp}}^{\mathrm{low}\rightarrow\mathrm{high}}$.61 Results using this parameterization will be referred to as “GAAMP”. Here we again use the CGenFF and GAAMP force field parameters as low levels of theory and refine results obtained with them to the DFTB3 level of theory.92

Fig. 3 depicts the individual steps used to compute the solvation free energy of 2CLE, using either the CGenFF or the GAAMP force field, and to refine the classical results to the DFTB3 level of theory. The calculation of the classical solvation free energy $\Delta A_{\mathrm{solv}}^{\mathrm{MM}}$ consists of three steps, denoted (ii)–(iv) in the figure: First, in step (ii) the charges of the solute atoms were turned off in five steps. Next, in step (iii), all Lennard-Jones interactions of the solute were turned off (six steps, see below for details). In the PERT module of CHARMM, one cannot selectively turn on/off solute–solvent interactions, so steps (ii) and (iii) also removed solute intramolecular nonbonded interactions. This contribution was calculated in step (iv), in which nonbonded intramolecular interactions were turned on again (vide infra). Steps (i) and (v) are the correction steps ΔAlow→high for aqueous solution and the gas phase, respectively. Simulation details for the five steps can be found in SI.

Figure 3.

Detailed thermodynamic cycle to compute the solvation free energy of 2CLE, including an overview of the purely classical steps. S denotes the solute, W water.

Given the relatively small system size and the semi-empirical level of theory chosen, we could afford to carry out MD simulations at the high level of theory. Thus, we used Crooks’ equation (Eqs. 19 and 20) to compute the free energy difference between the high and low level of theory (steps (i) and (v) in Fig. 3). Non-equilibrium switches in the two directions were carried out using the MSCALE93 facility of CHARMM combined with the slow-growth mode of the PERT module; cf. Refs. 24, 61 for further details. In the gas phase, 10,000 forward (low → high) and backward (high → low) switches were carried out; only every fourth of the 40,000 restart files saved at the low level of theory was used. Switches were started from the restart files saved during the production simulations at the two levels of theory and were carried out during 1000 steps of Langevin dynamics (friction coefficient 5 ps⁻¹ on all atoms, 0.5 fs timestep). The same switching protocol was also used in solution. Out of the 40,000 restart files collected during simulations using the CGenFF and GAAMP force fields, as well as the SQM/MM simulation (see SI), every second was used, resulting in 20,000 forward and backward work values.
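To illustrate how forward and backward work values can be turned into a free energy difference, the sketch below solves the Bennett-type self-consistency condition that follows from Crooks' theorem. Whether this is exactly the form of Eqs. 19 and 20 is an assumption; the function names, the value of `KT`, and the synthetic Gaussian work values are likewise hypothetical and serve only as an example.

```python
import math
import random

KT = 0.593  # kcal/mol, roughly 298 K; an assumed value, not taken from the paper


def _fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def crooks_bar(w_forward, w_backward, kt=KT, iterations=200):
    """Estimate dA(low -> high) from bidirectional work values by bisection.

    w_forward:  work values of low -> high switches
    w_backward: work values of high -> low switches
    """
    m = math.log(len(w_forward) / len(w_backward))

    def imbalance(d_a):
        fwd = sum(_fermi((w - d_a) / kt + m) for w in w_forward)
        bwd = sum(_fermi((w + d_a) / kt - m) for w in w_backward)
        return fwd - bwd  # grows monotonically with d_a; zero at the estimate

    candidates = list(w_forward) + [-w for w in w_backward]
    lo, hi = min(candidates) - 50 * kt, max(candidates) + 50 * kt
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(1)
    true_da, sigma = -2.0, 1.0
    shift = sigma ** 2 / (2 * KT)  # mean dissipated work for Gaussian work values
    w_f = [rng.gauss(true_da + shift, sigma) for _ in range(2000)]
    w_b = [rng.gauss(-true_da + shift, sigma) for _ in range(2000)]
    print(f"estimated dA = {crooks_bar(w_f, w_b):.2f} kcal/mol")
```

Bisection is used here only to keep the example free of external dependencies; any one-dimensional root finder would serve.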

Finally, all coordinates saved during the force field simulations in aqueous solution (CGenFF, GAAMP) were also used to compute interaction energies, $U_{\mathrm{inter}}$, at the low and high levels of theory, respectively. The interaction energy differences, $\Delta U_{\mathrm{inter}}^{\mathrm{low}\rightarrow\mathrm{high}}$, were then inserted into Eq. 4 to compute $\Delta A_{\mathrm{inter}}^{\mathrm{low}\rightarrow\mathrm{high}}$ according to the IEA.
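The post-processing just described can be sketched as follows. This is a hedged illustration: it assumes that Eq. 4 is Zwanzig's equation with interaction energy differences in place of total energy differences (as the Results section describes it), that the interaction energy is obtained by subtracting the solute and solvent energies from the full energy, and that `KT`, the function names, and the example numbers are hypothetical.

```python
import math

KT = 0.593  # kcal/mol, roughly 298 K; an assumed value, not taken from the paper


def interaction_energy(u_total, u_solute, u_solvent):
    """U_inter = U(full system) - U(solute alone) - U(solvent alone)."""
    return u_total - u_solute - u_solvent


def iea_correction(frames_low, frames_high, kt=KT):
    """Zwanzig average, over low-level frames, of the interaction energy difference.

    frames_low / frames_high: per-frame (u_total, u_solute, u_solvent) tuples
    evaluated on the SAME coordinates at the low and the high level of theory.
    """
    d_u = [interaction_energy(*hi) - interaction_energy(*lo)
           for lo, hi in zip(frames_low, frames_high)]
    x = [-v / kt for v in d_u]
    x_max = max(x)
    log_mean = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return -kt * log_mean


if __name__ == "__main__":
    # Invented numbers: the high level only shifts the solute's intramolecular
    # energy, so the total and the solute energies move together.
    low = [(-100.0, -10.0, -80.0), (-101.0, -11.0, -79.5)]
    high = [(-100.7, -10.7, -80.0), (-101.2, -11.2, -79.5)]
    print(iea_correction(low, high))
```

In the example, the purely intramolecular change cancels out of every interaction energy difference, which is the situation of the C22,noCMAP vs. C22 comparison.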

4. Results

Free energy differences between force fields

Table 1 summarizes results for the differences in solvation free energies for Ala and Ser using three generations of CHARMM protein force fields: C22,noCMAP, C22, and C36. As described in Methods, these were computed exactly and employing the IEA. The case C22,noCMAP vs. C22 is of particular interest, as it is clear in this case that the free energy resulting from the IEA has to be exactly zero (cf. Methods). By contrast, as one sees in Table 1, the exact result is approximately 0.5 kcal/mol for both Ala and Ser. A similar discrepancy is obtained for C22,noCMAP vs. C36. For Ala, the IEA again suggests a free energy difference of zero, whereas the correct value is 0.4 kcal/mol. For Ser, the exact result is 0.1 kcal/mol, whereas the IEA leads to a value of –0.2 kcal/mol. While the discrepancy in results is not large, it clearly is not negligible.

Table 1.

Toy problems based on purely classical mechanical force fields. All free energy differences are in kcal/mol.

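| | Ala, full (a) | Ala, IEA (b) | Ser, full (a) | Ser, IEA (b) |
|---|---|---|---|---|
| $\Delta A_{\mathrm{gasp}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C22}}$ | -0.57±0.01 | | -0.82±0.06 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C22}}$ | -0.12±0.11 | | -0.27±0.07 | |
| $\Delta A_{\mathrm{corr}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C22}}$ (c) | 0.45±0.11 | 0.00 (d) | 0.55±0.09 | 0.00 (d) |
| $\Delta A_{\mathrm{gasp}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C36}}$ | -0.04±0.01 | | 2.85±0.10 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C36}}$ | 0.39±0.07 | | 2.93±0.08 | |
| $\Delta A_{\mathrm{corr}}^{\mathrm{C22,noCMAP}\rightarrow\mathrm{C36}}$ (c) | 0.43±0.07 | -9×10⁻⁴±0.00 | 0.08±0.13 | -0.18±0.00 |

(a) Corrections ΔAlow→high (low = C22,noCMAP, high = C22 or C36) computed separately in gas phase and in solution, cf. Fig. 2.

(b) Correction $\Delta A_{\mathrm{aq,inter}}^{\mathrm{low}\rightarrow\mathrm{high}}$ (low = C22,noCMAP, high = C22 or C36) computed according to the IEA.

(d) Theoretical result: there is no change in interaction energy between C22,noCMAP and C22.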
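As a worked check of the 'full' entries in Table 1, the sketch below recomputes each correction as the difference between the aqueous and gas phase values and combines the two uncertainties in quadrature. Both the relation $\Delta A_{\mathrm{corr}} = \Delta A_{\mathrm{aq}} - \Delta A_{\mathrm{gasp}}$ and the quadrature rule are inferred here from the tabulated numbers, which they reproduce; the text of this section does not state them explicitly. The dictionary layout and function name are hypothetical.

```python
import math

# (value, error) pairs in kcal/mol, copied from the 'full' columns of Table 1
TABLE1 = {
    ("Ala", "C22"): {"gasp": (-0.57, 0.01), "aq": (-0.12, 0.11)},
    ("Ser", "C22"): {"gasp": (-0.82, 0.06), "aq": (-0.27, 0.07)},
    ("Ala", "C36"): {"gasp": (-0.04, 0.01), "aq": (0.39, 0.07)},
    ("Ser", "C36"): {"gasp": (2.85, 0.10), "aq": (2.93, 0.08)},
}


def correction(entry):
    """dA_corr = dA_aq - dA_gasp, with the two errors added in quadrature."""
    (a_gas, err_gas), (a_aq, err_aq) = entry["gasp"], entry["aq"]
    return a_aq - a_gas, math.hypot(err_gas, err_aq)


if __name__ == "__main__":
    for (solute, high), entry in TABLE1.items():
        value, error = correction(entry)
        print(f"{solute}, C22,noCMAP -> {high}: {value:.2f} +/- {error:.2f} kcal/mol")
```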

The cross-term (CMAP) correction modifies the φ/ψ backbone angle distribution sampled by peptides and proteins. Thus, there are two ways in which its presence / absence contributes to the solvation free energy. First, as the average conformation of the peptide is different with and without the CMAP term, solute–solvent interactions are changed, even though the nonbonded force field terms are identical in both cases. Since in the framework of the IEA the respective other state is never simulated, this contribution is not accounted for. Second, the effect of the CMAP correction on the preferred φ/ψ values is different in the gas phase and in water. As the gas phase is not even considered in the IEA, any free energy contributions resulting from this cannot be accounted for. Thus, although the numerical discrepancy for these purely classical systems is not large, it nicely illustrates the types of contributions ignored by the IEA, in line with the earlier theoretical analysis.
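The argument above can be made concrete with a deliberately simple toy model, which is not taken from the paper: a solute with two conformers, an intramolecular correction that exists only at the “high” level, and a per-conformer solvation free energy that is identical at both levels (the solvent is integrated out). All numbers, names, and the value of `KT` are invented for illustration.

```python
import math

KT = 0.593  # kcal/mol, roughly 298 K; an assumed value

# Hypothetical two-conformer solute; every number below is invented.
U_LOW = [0.0, 1.0]     # intramolecular energy per conformer, low level
CMAP = [0.0, -1.5]     # intramolecular correction present only at the high level
G_SOLV = [-5.0, -7.0]  # solvation free energy per conformer, same at both levels


def free_energy(energies, kt=KT):
    """A = -kT ln sum_c exp(-E_c / kT), evaluated with a log-sum-exp shift."""
    x = [-e / kt for e in energies]
    x_max = max(x)
    return -kt * (x_max + math.log(sum(math.exp(v - x_max) for v in x)))


def exact_correction(u_low=U_LOW, cmap=CMAP, g_solv=G_SOLV):
    """Full cycle: dA(low -> high) in solution minus dA(low -> high) in the gas phase."""
    u_high = [u + c for u, c in zip(u_low, cmap)]
    da_gas = free_energy(u_high) - free_energy(u_low)
    da_aq = (free_energy([u + g for u, g in zip(u_high, g_solv)])
             - free_energy([u + g for u, g in zip(u_low, g_solv)]))
    return da_aq - da_gas


def iea_correction(u_low=U_LOW, g_solv=G_SOLV, kt=KT):
    """Zwanzig average of the interaction energy difference over the low-level
    aqueous ensemble; the difference vanishes for every conformer."""
    d_u_inter = [0.0 for _ in g_solv]  # solute-solvent terms identical at both levels
    weights = [math.exp(-(u + g) / kt) for u, g in zip(u_low, g_solv)]
    avg = sum(w * math.exp(-d / kt) for w, d in zip(weights, d_u_inter)) / sum(weights)
    return -kt * math.log(avg)


if __name__ == "__main__":
    print(f"full cycle: {exact_correction():.2f} kcal/mol")
    print(f"IEA:        {iea_correction():.2f} kcal/mol")
```

In this model the exact correction is nonzero whenever the two conformers are solvated differently, because the intramolecular term shifts the conformer populations differently in the gas phase and in solution; the IEA returns zero regardless.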

Ethane/methanol, Ala/Ser

Table 2 summarizes several results from Ref. 61 concerning MM → SQM(/MM) corrections which are required to compute solvation free energies. In addition, we used our raw data to compute the corrections using the IEA. For two model systems, ethane and methanol, the results are in quite good agreement. Both systems have little conformational flexibility, and the MM force field can be expected to describe intramolecular interactions, as well as interactions with water, well; hence, this finding is not surprising. For Ala and Ser, on the other hand, the deviations between the two approaches (exact thermodynamic cycle, Fig. 2, vs. IEA) are considerable. In both cases, the MM → SQM/MM correction obtained from the IEA is too negative, by 1.9 kcal/mol for Ala and by 1.2 kcal/mol for Ser. Given the flexibility of the two peptides and their different conformational preferences observed when described by MM and by SQM/MM,24,61 these findings are not too surprising. As outlined in the Theory section, use of the IEA implicitly relies on the assumption that the simulations at the low level of theory result in configurational and conformational sampling that is also representative for the high level of theory. Our earlier results showed this not to be the case; hence, the expected failure.

Table 2.

Test of IEA in an SQM/MM setting. All free energy differences are in kcal/mol.

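| | Etha, full (b) | Etha, IEA (c) | Meoh, full (b) | Meoh, IEA (c) |
|---|---|---|---|---|
| $\Delta A_{\mathrm{gasp}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -86.49±0.01 | | -8.64±0.01 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -86.83±0.01 | | -6.82±0.01 | |
| $(\Delta)\Delta A^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (d) | -0.34±0.01 | -0.18±0.01 | 1.82±0.01 | 1.92±0.01 |

| | Ala, full (b) | Ala, IEA (c) | Ser, full (b) | Ser, IEA (c) |
|---|---|---|---|---|
| $\Delta A_{\mathrm{gasp}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -46.09±0.04 | | -27.31±0.17 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -56.97±0.03 | | -36.06±0.09 | |
| $\Delta A_{\mathrm{corr}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (d) | -10.88±0.05 | -12.79±1.01 | -8.75±0.19 | -9.98±0.81 |

(a) For better readability, free energies listed in the table are offset by +4,100 (Meoh), +3,500 (Etha), +16,400 (Ala) and +18,500 kcal/mol (Ser). All ’full’ results taken from Tables 2 and 3 of Ref. 61.

(b) Corrections $\Delta A^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ computed separately in gas phase and in solution, cf. Fig. 2.

(c) Correction $\Delta A_{\mathrm{aq,inter}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ computed according to the IEA.

(d) Eq. 9.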

Finally, the Ala/Ser results of Table 2 are worrisome in a second regard. One main justification for the IEA is that the modified Zwanzig equation (Eq. 4) is expected to converge more quickly than the exact expression, Eq. 2. Indeed, we showed in Refs. 24,61 that attempts to compute ΔAMM→SQM/MM for Ala and Ser by Zwanzig’s equation fail to converge, both in the gas phase and in solution. Using non-equilibrium work methods, on the other hand, the exact correction can be computed, though in aqueous solution relatively long switching protocols were required. The interaction energy results for Ala and Ser in Table 2 have relatively high standard deviations, almost ±1 kcal/mol. Further, they fail several of the criteria we recently suggested to gauge convergence of Zwanzig’s equation.44 Despite being computed from 200,000 interaction energy differences each, the Π value,94 which ideally should be > +0.5, is only –0.26 for Ala and –0.15 for Ser. The standard deviation of the interaction energy differences themselves is ≈ 4 kBT in both cases, the absolute upper limit we found for Zwanzig’s equation to lead to correct results. Thus, while the use of interaction energies helps convergence and makes it possible to use somewhat larger quantum regions, it does not remove all of the underlying difficulties hindering convergence. In Ref. 61 we pointed out that the solute’s charge distribution was quite different for MM and SQM; hence, water configurations appropriate for an MM charge distribution may be quite “incorrect” for an SQM solute. Clearly, this factor hampering convergence is also present when using interaction energies.
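The simplest of the diagnostics mentioned above, the spread of the energy differences in units of kBT, is easy to monitor. The sketch below assumes a thermal energy of about 0.593 kcal/mol (roughly 298 K) and uses the 4 kBT limit quoted in the text; the function names are hypothetical, and the Π measure of Ref. 94 is deliberately not implemented here.

```python
import statistics

KT = 0.593  # kcal/mol, roughly 298 K; an assumed value, not taken from the paper


def sigma_in_kt(delta_u, kt=KT):
    """Sample standard deviation of the energy differences in units of kT."""
    return statistics.stdev(delta_u) / kt


def zwanzig_looks_risky(delta_u, kt=KT, limit=4.0):
    """Flag a series whose spread reaches the limit quoted in the text (4 kT)."""
    return sigma_in_kt(delta_u, kt) >= limit


if __name__ == "__main__":
    narrow = [0.0, 1.0] * 50  # synthetic series with a spread below 1 kT
    broad = [0.0, 6.0] * 50   # synthetic series with a spread above 4 kT
    print(zwanzig_looks_risky(narrow), zwanzig_looks_risky(broad))
```

Passing this check is a necessary rather than a sufficient condition, as the well-converged but incorrect 2CLE results reported below make clear.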

Solvation free energy of bis-2-chloro diethyl ether (2CLE)

Finally, we report solvation free energies of 2CLE at the SQM/MM level of theory. Two MM force fields were used as the low level of theory; the MM results were refined either by the full thermodynamic cycle (Fig. 2) or according to the IEA. All results are summarized in Table 3.

Table 3.

IEA compared to full free energy correction (Fig. 2) as applied to the computation of the solvation free energy $\Delta A_{\mathrm{solv}}$ of bis-2-chloroethylether (2CLE). All free energies in kcal/mol. The experimental solvation free energy is –4.23 kcal/mol.95

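| | full (b) | IEA (c) |
|---|---|---|
| $\Delta A_{\mathrm{gasp}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -27.34±0.20 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (a) | -27.54±0.53 | |
| $\Delta A_{\mathrm{corr}}^{\mathrm{C36}\rightarrow\mathrm{DFTB3}}$ (d) | -0.20±0.57 | 1.73±0.26 |
| $\Delta A_{\mathrm{solv}}^{\mathrm{C36}}$ (e) | -3.26±0.08 | -3.26±0.08 |
| $\Delta A_{\mathrm{solv}}^{\mathrm{DFTB3/MM}}$ (f) | -3.46±0.57 | -1.53±0.27 |
| $\Delta A_{\mathrm{gasp}}^{\mathrm{GAAMP}\rightarrow\mathrm{DFTB3}}$ (a) | -33.55±0.13 | |
| $\Delta A_{\mathrm{aq}}^{\mathrm{GAAMP}\rightarrow\mathrm{DFTB3}}$ (a) | -35.79±0.04 | |
| $\Delta A_{\mathrm{corr}}^{\mathrm{GAAMP}\rightarrow\mathrm{DFTB3}}$ (d) | -2.24±0.14 | -0.88±0.20 |
| $\Delta A_{\mathrm{solv}}^{\mathrm{GAAMP}}$ (e) | -0.81±0.03 | -0.81±0.03 |
| $\Delta A_{\mathrm{solv}}^{\mathrm{DFTB3/MM}}$ (f) | -3.05±0.14 | -1.69±0.20 |

(a) For better readability, free energies listed in the table are offset by +12,200 kcal/mol.

(b) Corrections $\Delta A^{\mathrm{MM}\rightarrow\mathrm{DFTB3}}$ (MM is C36 or GAAMP) computed separately in gas phase and in solution, cf. Fig. 2. Results as reported were computed using non-equilibrium work methods, specifically Crooks’ equation.

(c) Correction $\Delta A_{\mathrm{aq,inter}}^{\mathrm{MM}\rightarrow\mathrm{DFTB3}}$ (MM is C36 or GAAMP) computed according to the IEA; cf. Eq. 4.

(d) Eq. 9, MM is C36 or GAAMP.

(e) Classical solvation free energy for 2CLE computed using the C36 or GAAMP force field parameters for the solute. This corresponds to the horizontal arrow at the bottom of Fig. 2. For details of the computational protocol see SI.

(f) Solvation free energy for 2CLE corrected to the DFTB3/MM level of theory: $\Delta A_{\mathrm{solv}}^{\mathrm{DFTB3/MM}} = \Delta A_{\mathrm{solv}}^{\mathrm{MM}} + \Delta A_{\mathrm{corr}}^{\mathrm{MM}\rightarrow\mathrm{DFTB3}}$, where MM is C36 or GAAMP.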

With the regular CHARMM CGenFF force field (C36), one obtains a solvation free energy $\Delta A_{\mathrm{solv}}^{\mathrm{MM}}$ of –3.26 kcal/mol, deviating slightly less than 1 kcal/mol from the experimental value of –4.23 kcal/mol.95 When this MM result is refined through the indirect cycle (Fig. 2), the solvation free energy $\Delta A_{\mathrm{solv}}^{\mathrm{SQM/MM}}$ becomes –3.46 kcal/mol, in slightly better agreement with experiment. If one does the same with the IEA (column ’IEA’ in Table 3), the agreement with experiment becomes much poorer. The “corrected” solvation free energy is $\Delta A_{\mathrm{solv(inter)}}^{\mathrm{SQM/MM}} = -1.53$ kcal/mol, which deviates by almost 3 kcal/mol from the experimental value.

Since the C36 MM simulations led to conformations of 2CLE which were rather different from those obtained with SQM,61 we repeated the MM simulations with parameters obtained from GAAMP (lower half of Table 3). Somewhat surprisingly, the MM solvation free energy is in much poorer agreement with experiment than the one obtained using the C36 force field (see below). Correcting to the SQM/MM level of theory improves this result considerably, although the deviation of ≈ 1.2 kcal/mol from the experimental result remains higher than for the SQM/MM corrected result based on C36 (deviation ≈ 0.8 kcal/mol). In hindsight, this result is not as surprising as first thought as recent work has clearly shown that solvation free energies are very sensitive to differences in (S)QM levels of theory (i.e., differences in local intermolecular interactions with waters in the first solvation shell).22,26,96 Using the IEA, the “corrected” number is close to the one based on C36 and is far off the experimental value. Thus, we have two clear cases where the use of the IEA leads to a poor result and in one case even makes the MM result worse. The “performance” of the IEA here is particularly treacherous, as the results appear well converged (pass all our usual convergence criteria). Further, the “corrected” solvation free energy differences obtained from the two MM representations agree almost perfectly.
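The deviations from experiment discussed in the last two paragraphs follow directly from the final rows of Table 3. The short sketch below recomputes them; the dictionary layout and function name are hypothetical, while the numbers are copied from Table 3 and from the experimental value of –4.23 kcal/mol quoted in the text.

```python
EXPERIMENT = -4.23  # kcal/mol, experimental solvation free energy quoted in the text

# Solvation free energies of 2CLE in kcal/mol, copied from Table 3
RESULTS = {
    ("C36", "MM only"): -3.26,
    ("C36", "full cycle"): -3.46,
    ("C36", "IEA"): -1.53,
    ("GAAMP", "MM only"): -0.81,
    ("GAAMP", "full cycle"): -3.05,
    ("GAAMP", "IEA"): -1.69,
}


def deviations(results=RESULTS, reference=EXPERIMENT):
    """Absolute deviation of each result from the experimental reference."""
    return {key: round(abs(value - reference), 2) for key, value in results.items()}


if __name__ == "__main__":
    for (force_field, route), dev in deviations().items():
        print(f"{force_field:5s} {route:10s} |deviation| = {dev:.2f} kcal/mol")
```

The two full-cycle results land within roughly 0.8 and 1.2 kcal/mol of experiment, whereas both IEA results are off by more than 2.5 kcal/mol.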

Two aspects of the results merit additional consideration. First, we remind the reader of the goal of the GAAMP re-parameterization, which is to make the configurations / conformations sampled more high-level like. We already showed that this is the case in Ref. 61, and this is also reflected by the much lower statistical uncertainty of $\Delta A_{\mathrm{corr}}^{\mathrm{MM}\rightarrow\mathrm{SQM/MM}}$, ±0.14 kcal/mol for GAAMP vs. ±0.57 kcal/mol for C36. One should keep in mind the role of the low level in indirect (S)QM/MM FES. The purpose of the low level is to act as a bridge, helping to generate configurations/conformations which are also meaningful at the desired high level of theory.

The second observation worth mentioning is that the two corrected SQM/MM results do not fully agree, though, given the rather high statistical uncertainty of the C36 based result, stemming from the difficulty of converging the ΔAMM→SQM/MM corrections, the two final results are statistically indistinguishable. Further, the GAAMP re-parameterization led to changes of some Lennard-Jones parameters. In CHARMM based (S)QM/MM calculations, electrostatic interactions between the (S)QM and the MM region are computed based on the (S)QM charge distribution; however, the apolar interactions are still calculated using the classical Lennard-Jones parameters for the (S)QM region. Since these are different in the two parameterizations, differences in solvation free energies are to be expected.

5. Discussion

In this section we address several topics related to the IEA. First, the rationale for its use is to avoid the configurational mismatch between the two levels of theory, i.e., the mismatch between bond lengths and angles sampled at the MM level of theory, which are “slightly wrong” at the desired (S)QM level of theory. Since by definition interaction energies do not consider bonded degrees of freedom, this issue and the resulting poor convergence are avoided. We note, however, that our results for Ala and Ser (Table 2) suggest that the IEA is still susceptible to other sources of poor convergence. In addition, we argued in the Theory section that, when applied to solvation free energies, use of the IEA removes any consideration of the gas phase. If, e.g., a solute adopts different conformational states in the gas phase and in solution, the resulting free energy contributions are not accounted for.

Curiously, this last issue bears resemblance to a 20+ year old question in classical alchemical FES, specifically, whether the so-called “self-terms” or “intra-perturbed group interactions” cancel from relative free energy differences.62,63,97 As a reminder, consider the thermodynamic cycle used to compute a relative free energy difference of solvation. Along the alchemical, horizontal paths, identical changes in intra-molecular interactions occur in the gas phase and in aqueous solution. The resulting contributions to the free energy changes $\Delta A_{\mathrm{gasp}}^{A\rightarrow B}$ and $\Delta A_{\mathrm{aq}}^{A\rightarrow B}$ have been called self-terms (intra-perturbed group interactions). However, while, at least in additive force fields, the changes in potential energy are always separable, the additivity of interactions is not mirrored by an additivity of free energy differences, i.e., in general

$$\Delta A_{\mathrm{aq}}^{A\rightarrow B} \neq \Delta A_{\mathrm{gasp}}^{A\rightarrow B} + \Delta A_{\mathrm{aq,inter}}^{A\rightarrow B}. \qquad (21)$$

Nevertheless, several free energy codes in wide use 20 years ago excluded or made it possible to exclude intramolecular energy terms from the calculations of free energies, assuming that $\Delta A_{\mathrm{gasp}}^{A\rightarrow B}$ would approximately cancel out of Eq. 21. This approach had the added advantage that no gas phase calculations were required. E.g., the alchDecouple option of NAMD still makes it possible to exclude self-terms.98 Boresch and Karplus investigated this question in detail,62,63 taking into account the underlying differences between a single or a dual topology approach to carry out the alchemical transformation.99 It was concluded at the time that “the present results indicate that selected terms of the energy function might be omitted from the free energy formalism unless single free energy differences are required”.63 In particular, it could be derived that contributions from changes in stiff, bonded degrees of freedom, i.e., bond stretching and angle bending terms, canceled more or less completely from parallel legs of a thermodynamic cycle. For the soft bonded terms, dihedral angles etc., the theoretical considerations were less conclusive, though for the model systems investigated any errors were negligible. However, in a later study focused on computing the relative solvation free energy differences between leucine and asparagine, it was concluded that “. . . because of the different conformations adopted by the two molecules in the gas phase and in solution . . ., the contributions to ΔAgasp and ΔAaq from changes in the potential energy function that are identical along the alchemical portions of the thermodynamic cycle (i.e., all intramolecular energy terms of leucine and asparagine affected by the alchemical mutation) are not the same”.100 Thus, there is very concrete evidence in the context of classical FES that (indirectly) “omitting” different conformational preferences in two parallel legs of a thermodynamic cycle can lead to systematic errors.

In MM simulations it is customary to constrain some or all bond lengths through SHAKE101,102 or a similar algorithm. Fixing bond lengths (and angles) may, in principle, provide a different route to avoid the “configurational mismatch” problem, though it is not immediately clear whether constraining internal degrees of freedom is appropriate at (S)QM/MM levels of theory. Recently, König and Brooks outlined how to separate approximately hard and soft bonded degrees of freedom in (S)QM/MM FES; the low level simulations employ SHAKE and the loss of vibrational entropy is accounted for in a post-processing step.40

It seems appropriate to comment on the relative computational cost of the full indirect cycle vs. the use of the IEA, as well as provide an overview of available toolchains. First, if inserting full energy differences into Zwanzig’s equation led to converged free energy differences, then use of the IEA would actually increase the computational cost. In addition to recomputing the total energy of the system at the (S)QM/MM level of theory, one has to compute the (S)QM energy of the intended high level region in order to extract interaction energies. However, in practical terms the overhead is relatively small; extracting interaction energies from trajectories is more tedious than costly. Carrying out NEW switching simulations is computationally more demanding, though overall there are mitigating factors. As already pointed out in Refs. 24,61, the switching simulations are a post-processing step and are completely independent; hence, they are trivial to parallelize. Further, we typically carry out far fewer switches compared with using Zwanzig’s equation or, in the context of the present study, employing the IEA. E.g., the full results for ethane, methanol, Ala and Ser (Table 2) were obtained from 10,000 NEW switches, whereas the IEA results were computed from 100,000 coordinate sets. The investigation into optimal switching lengths/protocols, as well as finding criteria for how many switches are needed, is work in progress. Finally, carrying out NEW switches may seem daunting. As described in Methods/SI, we employed the MSCALE module of CHARMM,93 with which such calculations can be set up in a straightforward manner, though details remain tricky (cf. 10.5281/zenodo.2328952).103 Very recently, Wang et al.34 carried out NEW switching simulations using the AMBER SQM/MM code.104

One key result of the present study is highlighting that use of the IEA amounts to employing a chimeric potential energy function as the high level of theory: interactions between the MM and (S)QM/MM regions are modeled at the high level of theory, whereas interactions within the “(S)QM/MM” region continue to be modeled at the low level. The use of chimeric potential energy functions in (S)QM/MM is actually not uncommon. Already in the 1990s Gao worked out (S)QM/MM methods, intended for the calculation of free energy profiles along a reaction coordinate, in which the treatment within the (S)QM region itself and the treatment of (S)QM/MM interactions could be described at different levels of theory.105,106 Such dual-level strategies, refined, e.g., by the groups of Moliner and Tuñón,107 are essential in several applications of (S)QM/MM methods. A review of such approaches applied to the study of kinetic isotope effects was published recently.108 An important difference between these dual-level strategies and the IEA, however, is the fact that the former employ a higher level of theory for the (S)QM region itself, whereas the IEA, as shown in this work, employs the low level of theory.

6. Conclusions

The data just presented clearly indicate that the IEA may lead to erroneous results, or, at least, to results which deviate systematically from a full correction to a (S)QM/MM level of theory. We showed that the use of the IEA corresponds to a ‘chimeric’ potential energy function: the interactions within the high level region (in our cases the intramolecular energy of the solute) are described at the low level of theory, and only the interactions between the high level region of interest and the remainder of the system are computed at the high level of theory. If one looks back to the initial use of interaction energies by Wood and co-workers,54 this is the exact construction of the Hamiltonian they employed. However, in recent applications of the IEA this distinction became blurred, and use of interaction energy differences rather than differences in total energy was suggested as a means to improve convergence.39,49 Our data also show that IEA only mitigates convergence problems, but does not resolve them. Based on the theoretical considerations and the results reported here, we advise against the IEA.

Overall, the free energy community at large needs to refocus on improving ways to rigorously compute indirect free energies as opposed to finding ways to “cut corners” with theoretically unjustified approximations. Our own work is based on a two-pronged approach: on the one hand, we explore more powerful methods to compute ΔAlow→high, such as NEW based techniques.24,61 Recent work by Ryde and co-workers is moving in this direction as well.33,34 On the other hand, we are searching for ways to make the low level of theory more high level like, as reflected in present work focusing on using force matching as a means to reparameterize force fields so that they support more accurate indirect free energy calculations.30,31,109 Additionally, we are very interested in machine learning approaches that facilitate more quantum-like treatment of systems at a fraction of the computational cost.110–113

Supplementary Material

The following can be found in the supporting information: simulation details for the force field model calculations, as well as the 2CLE calculations; and the parameters of 2CLE from both CGenFF and the modified ones from GAAMP.

Figure 4.

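Thermodynamic cycle used to compute the relative free energy difference of solvation $\Delta\Delta A_{\mathrm{solv}}^{A\rightarrow B} = \Delta A_{\mathrm{solv}}^{B} - \Delta A_{\mathrm{solv}}^{A} = \Delta A_{\mathrm{aq}}^{A\rightarrow B} - \Delta A_{\mathrm{gasp}}^{A\rightarrow B}$ for two solutes A and B.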

Acknowledgment

HLW would like to highlight that this material is based upon work supported by the National Science Foundation under CHE–1464946. Additionally, research reported in this publication was supported by NIGMS of the National Institutes of Health under award number R01GM129519. Further, HLW and PSH thank USF Research Computing (Circe) for their patience and assistance and the NSF for support via their Major Research Instrumentation Program (MRI–1531590). PSH acknowledges funding support from the Intramural Research Program of the NIH, NHLBI. Finally, SB gratefully acknowledges support of this work from the FWF (grant P31024).

References

  • (1).Throughout this paper, the following nomenclature/abbreviations are used: MM: molecular mechanics, QM: quantum chemistry, ai-QM: strict (i.e., ab initio or density functional theory) quantum chemical methods, SQM: semi-empirical quantum chemical techniques, (S)QM applies to use of either strict or semi-empirical quantum chemical methods.
  • (2).Kästner J, Senn HM, Thiel S, Otte N, Thiel W. QM/MM Free-Energy Perturbation Compared to Thermodynamic Integration and Umbrella Sampling: Application to an Enzymatic Reaction. J Chem Theory Comput. 2006;2:452–461. doi: 10.1021/ct050252w. [DOI] [PubMed] [Google Scholar]
  • (3).Yang W, Cui Q, Min D, Li H. QM/MM Alchemical Free Energy Simulations: Challenges and Recent Developments. In: Wheeler RA, editor. Annual Reports in Computational Chemistry. Vol. 6. Elsevier; 2010. pp. 51–62. [Google Scholar]
  • (4).Rathore RS, Sumakanth M, Reddy MS, Reddanna P, Rao AA, Erion MD, Reddy M. Advances in Binding Free Energies Calculations: QM/MM-Based Free Energy Perturbation Method for Drug Design. Curr Pharm Des. 2017;19:4674–4686. doi: 10.2174/1381612811319260002. [DOI] [PubMed] [Google Scholar]
  • (5).Ryde U, Söderhjelm P. Ligand-Binding Affinity Estimates Supported by Quantum-Mechanical Methods. Chem Rev. 2016;116:5520–5566. doi: 10.1021/acs.chemrev.5b00630. [DOI] [PubMed] [Google Scholar]
  • (6).Lu X, Fang D, Ito S, Okamoto Y, Ovchinnikov V, Cui Q. QM/MM free energy simulations: recent progress and challenges. Mol Simulat. 2016;42:1056–1078. doi: 10.1080/08927022.2015.1132317. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (7).Olsson MA, Ryde U. Comparison of QM/MM Methods to Obtain Ligand-Binding Free Energies. J Chem Theory Comput. 2017;13:2245–2253. doi: 10.1021/acs.jctc.6b01217. [DOI] [PubMed] [Google Scholar]
  • (8).Beutler TC, Mark AE, van Schaik RC, Gerber PR, van Gunsteren WF. Avoiding Singularities and Numerical Instabilities in Free Energy Calculations Based on Molecular Simulations. Chem Phys Lett. 1994;222:529–539. [Google Scholar]
  • (9).Zacharias M, Straatsma TP, McCammon JA. Separation-Shifted Scaling, a New Scaling Method for Lennard-Jones Interactions in Thermodynamic Integration. J Chem Phys. 1994;100:9025–9031. [Google Scholar]
  • (10).Gao J, Xia X. A priori evaluation of aqueous polarization effects through Monte Carlo QM-MM simulations. Science. 1992;258:631–5. doi: 10.1126/science.1411573. [DOI] [PubMed] [Google Scholar]
  • (11).Gao J, Luque FJ, Orozco M. Induced dipole moment and atomic charges based on average electrostatic potentials in aqueous solution. J Chem Phys. 1993;98:2975. [Google Scholar]
  • (12).Gao J, Freindorf M. Hybrid ab Initio QM/MM Simulation of N -Methylacetamide in Aqueous Solution. J Phys Chem A. 1997;101:3182–3188. [Google Scholar]
  • (13).Luzhkov V, Warshel A. Microscopic models for quantum mechanical calculations of chemical processes in solutions: LD/AMPAC and SCAAS/AMPAC calculations of solvation energies. J Comput Chem. 1992;13:199–213. [Google Scholar]
  • (14).Wesolowski T, Warshel A. Ab Initio Free Energy Perturbation Calculations of Solvation Free Energy Using the Frozen Density Functional Approach. J Phys Chem. 1994;98:5183–5187. [Google Scholar]
  • (15).Zheng YJ, Merz KM. Mechanism of the human carbonic anhydrase II-catalyzed hydration of carbon dioxide. J Am Chem Soc. 1992;114:10498–10507. [Google Scholar]
  • (16).Beierlein FR, Michel J, Essex JW. A Simple QM/MM Approach for Capturing Polarization Effects in Protein-Ligand Binding Free Energy Calculations. J Phys Chem B. 2011;115:4911–4926. doi: 10.1021/jp109054j. [DOI] [PubMed] [Google Scholar]
  • (17).Heimdal J, Ryde U. Convergence of QM/MM free-energy perturbations based on molecular-mechanics or semiempirical simulations. Phys Chem Chem Phys. 2012;14:12592–12604. doi: 10.1039/c2cp41005b. [DOI] [PubMed] [Google Scholar]
  • (18).Heimdal J, Ryde U. Convergence of QM/MM free-energy perturbations based on molecular-mechanics or semiempirical simulations. Phys Chem Chem Phys. 2012;14:12592–12604. doi: 10.1039/c2cp41005b. [DOI] [PubMed] [Google Scholar]
  • (19).König G, Hudson PS, Boresch S, Woodcock HL. Multiscale Free Energy Simulations: An Efficient Method for Connecting Classical MD Simulations to QM or QM/MM Free Energies Using Non-Boltzmann Bennett Reweighting Schemes. J Chem Theory Comput. 2014;10:1406–1419. doi: 10.1021/ct401118k. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (20).König G, Pickard FC, Mei Y, Brooks BR. Predicting hydration free energies with a hybrid QM/MM approach: an evaluation of implicit and explicit solvation models in SAMPL4. Journal of Computer-Aided Molecular Design. 2014;28:245–257. doi: 10.1007/s10822-014-9708-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (21).Sampson C, Fox T, Tautermann CS, Woods C, Skylaris C-K. A “Stepping Stone” Approach for Obtaining Quantum Free Energies of Hydration. J Phys Chem B. 2015;119:7030–7040. doi: 10.1021/acs.jpcb.5b01625. [DOI] [PubMed] [Google Scholar]
  • (22).König G, Mei Y, Pickard FC, Simmonett AC, Miller BT, Herbert JM, Woodcock HL, Brooks BR, Shao Y. Computation of Hydration Free Energies Using the Multiple Environment Single System Quantum Mechanical/Molecular Mechanical Method. J Chem Theory Comput. 2015;12:332–344. doi: 10.1021/acs.jctc.5b00874. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (23).Duarte F, Amrein BA, Blaha-Nelson D, Kamerlin SC. Recent advances in QM/MM free energy calculations using reference potentials. Biochimica et Biophysica Acta (BBA) - General Subjects. 2015;1850:954–965. doi: 10.1016/j.bbagen.2014.07.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (24).Hudson PS, Woodcock HL, Boresch S. Use of Nonequilibrium Work Methods to Compute Free Energy Differences Between Molecular Mechanical and Quantum Mechanical Representations of Molecular Systems. J Phys Chem Lett. 2015;6:4850–4856. doi: 10.1021/acs.jpclett.5b02164. [DOI] [PubMed] [Google Scholar]
  • (25).König G, Pickard FC, Huang J, Simmonett AC, Tofoleanu F, Lee J, Dral PO, Prasad S, Jones M, Shao Y, Thiel W, et al. Calculating distribution coefficients based on multi-scale free energy simulations: an evaluation of MM and QM/MM explicit solvent simulations of water-cyclohexane transfer in the SAMPL5 challenge. J Comput-Aided Mol Des. 2016;30:989–1006. doi: 10.1007/s10822-016-9936-x. [DOI] [PubMed] [Google Scholar]
  • (26).Pickard FC, König G, Tofoleanu F, Lee J, Simmonett AC, Shao Y, Ponder JW, Brooks BR. Blind prediction of distribution in the SAMPL5 challenge with QM based protomer and pKa corrections. J Comput-Aided Mol Des. 2016;30:1087–1100. doi: 10.1007/s10822-016-9955-7. [DOI] [PubMed] [Google Scholar]
  • (27).Genheden S, Ryde U, Söderhjelm P. Binding affinities by alchemical perturbation using QM/MM with a large QM system and polarizable MM model. J Comput Chem. 2015;36:2114–2124. doi: 10.1002/jcc.24048. [DOI] [PubMed] [Google Scholar]
  • (28).König G, Brooks BR, Thiel W, York DM. On the convergence of multi-scale free energy simulations. Mol Simul. 2018;44:1062–1081. doi: 10.1080/08927022.2018.1475741. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (29).Dybeck EC, König G, Brooks BR, Shirts MR. Comparison of Methods To Reweight from Classical Molecular Simulations to QM/MM Potentials. J Chem Theory Comput. 2016;12:1466–1480. doi: 10.1021/acs.jctc.5b01188. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (30).Hudson PS, Boresch S, Rogers DM, Woodcock HL. Accelerating QM/MM Free Energy Computations via Intramolecular Force Matching. J Chem Theory Comput. 2018;14:6327–6335. doi: 10.1021/acs.jctc.8b00517. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (31).Hudson PS, Han K, Woodcock HL, Brooks BR. Force matching as a stepping stone to QM/MM CB[8] host/guest binding free energies: a SAMPL6 cautionary tale. J Comput Aided Mol Des. 2018;32:983–999. doi: 10.1007/s10822-018-0165-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (32).Pickard FC, König G, Simmonett AC, Shao Y, Brooks BR. An efficient protocol for obtaining accurate hydration free energies using quantum chemistry and reweighting from molecular dynamics simulations. Bioorganic & Medicinal Chemistry. 2016;24:4988–4997. doi: 10.1016/j.bmc.2016.08.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (33).Steinmann C, Olsson MA, Ryde U. Relative Ligand-Binding Free Energies Calculated from Multiple Short QM/MM MD Simulations. J Chem Theory Comput. 2018;14:3228–3237. doi: 10.1021/acs.jctc.8b00081. [DOI] [PubMed] [Google Scholar]
  • (34).Wang M, Mei Y, Ryde U. Predicting Relative Binding Affinity Using Nonequilibrium QM/MM Simulations. J Chem Theory Comput. 2018;14:6613–6622. doi: 10.1021/acs.jctc.8b00685. [DOI] [PubMed] [Google Scholar]
  • (35).Caldararu O, Olsson MA, Ignjatović MM, Wang M, Ryde U. Binding free energies in the SAMPL6 octa-acid host–guest challenge calculated with MM and QM methods. J Comput-Aided Mol Des. 2018;32:1027–1046. doi: 10.1007/s10822-018-0158-2. [DOI] [PubMed] [Google Scholar]
  • (36).Zwanzig RW. High-Temperature Equation Of State By A Perturbation Method. 1. Nonpolar Gases. J Chem Phys. 1954;22:1420–1426. [Google Scholar]
  • (37).Pohorille A, Jarzynski C, Chipot C. Good Practices in Free-Energy Calculations. J Phys Chem B. 2010;114:10235–10253. doi: 10.1021/jp102971x. [DOI] [PubMed] [Google Scholar]
  • (38).Genheden S, Cabedo Martinez AI, Criddle MP, Essex JW. Extensive all-atom Monte Carlo sampling and QM/MM corrections in the SAMPL4 hydration free energy challenge. J Comput Aided Mol Des. 2014;28:187–200. doi: 10.1007/s10822-014-9717-3. [DOI] [PubMed] [Google Scholar]
  • (39).Cave-Ayland C, Skylaris C-K, Essex JW. Direct Validation of the Single Step Classical to Quantum Free Energy Perturbation. J Phys Chem B. 2015;119:1017–1025. doi: 10.1021/jp506459v. [DOI] [PubMed] [Google Scholar]
  • (40).König G, Brooks BR. Correcting for the free energy costs of bond or angle constraints in molecular dynamics simulations. Biochim Biophys Acta. 2015;1850:932–943. doi: 10.1016/j.bbagen.2014.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (41).Hudson PS, White JK, Kearns FL, Hodoscek M, Boresch S, Woodcock HL. Efficiently computing pathway free energies: New approaches based on chain-of-replica and non-Boltzmann Bennett reweighting schemes. Biochim Biophys Acta. 2015;1850:944–953. doi: 10.1016/j.bbagen.2014.09.016. [DOI] [PubMed] [Google Scholar]
  • (42).Ryde U. How Many Conformations Need To Be Sampled To Obtain Converged QM/MM Energies? The Curse of Exponential Averaging. J Chem Theory Comput. 2017;13:5745–5752. doi: 10.1021/acs.jctc.7b00826. [DOI] [PubMed] [Google Scholar]
  • (43).Shirts MR, Pande VS. Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration. J Chem Phys. 2005;122:144107–1–144107–6. doi: 10.1063/1.1873592. [DOI] [PubMed] [Google Scholar]
  • (44).Boresch S, Woodcock HL. Convergence of single-step free energy perturbation. Mol Phys. 2017;115:1200–1213. [Google Scholar]
  • (45).Bennett CH. Efficient estimation of free energy differences from Monte Carlo data. J Comput Phys. 1976;22:245–268. [Google Scholar]
  • (46).Kirkwood JG. Statistical Mechanics of Fluid Mixtures. J Chem Phys. 1935;3:300–313. [Google Scholar]
  • (47).Bruckner S, Boresch S. Efficiency of alchemical free energy simulations. II. Improvements for thermodynamic integration. J Comput Chem. 2011;32:1320–1333. doi: 10.1002/jcc.21712. [DOI] [PubMed] [Google Scholar]
  • (48).Fox SJ, Pittock C, Tautermann CS, Fox T, Christ C, Malcolm NOJ, Essex JW, Skylaris C-K. Free Energies of Binding from Large-Scale First-Principles Quantum Mechanical Calculations: Application to Ligand Hydration Energies. J Phys Chem B. 2013;117:9478–9485. doi: 10.1021/jp404518r. [DOI] [PubMed] [Google Scholar]
  • (49).Jia X, Wang M, Shao Y, König G, Brooks BR, Zhang JZH, Mei Y. Calculations of Solvation Free Energy through Energy Reweighting from Molecular Mechanics to Quantum Mechanics. J Chem Theory Comput. 2016;12:499–511. doi: 10.1021/acs.jctc.5b00920. [DOI] [PubMed] [Google Scholar]
  • (50).Wang M, Li P, Jia X, Liu W, Shao Y, Hu W, Zheng J, Brooks BR, Mei Y. Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level. J Chem Inf Model. 2017;57:2476–2489. doi: 10.1021/acs.jcim.7b00001. [DOI] [PubMed] [Google Scholar]
  • (51).Olsson MA, Söderhjelm P, Ryde U. Converging ligand-binding free energies obtained with free-energy perturbations at the quantum mechanical level. J Comput Chem. 2016;37:1589–1600. doi: 10.1002/jcc.24375. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (52).Mikulskis P, Cioloboc D, Andrejić M, Khare S, Brorsson J, Genheden S, Mata RA, Söderhjelm P, Ryde U. Free-energy perturbation and quantum mechanical study of SAMPL4 octa-acid host–guest binding energies. J Comput Aided Mol Des. 2014;28:375–400. doi: 10.1007/s10822-014-9739-x. [DOI] [PubMed] [Google Scholar]
  • (53).Rod TH, Rydberg P, Ryde U. Implicit versus explicit solvent in free energy calculations of enzyme catalysis: Methyl transfer catalyzed by catechol O-methyltransferase. J Chem Phys. 2006;124:174503. doi: 10.1063/1.2186635. [DOI] [PubMed] [Google Scholar]
  • (54).Wood RH, Yezdimer EM, Sakane S, Barriocanal JA, Doren DJ. Free energies of solvation with quantum mechanical interaction energies from classical mechanical simulations. J Chem Phys. 1999;110:1329–1337. [Google Scholar]
  • (55).Liu W, Sakane S, Wood RH, Doren DJ. The Hydration Free Energy of Aqueous Na+ and Cl− at High Temperatures Predicted by ab Initio/Classical Free Energy Perturbation: 973 K with 0.535 g/cm3 and 573 K with 0.725 g/cm3. J Phys Chem A. 2002;106:1409–1418. [Google Scholar]
  • (56).Sakane S, Yezdimer EM, Liu W, Barriocanal JA, Doren DJ, Wood RH. Exploring the ab initio/classical free energy perturbation method: The hydration free energy of water. J Chem Phys. 2000;113:2583–2593. [Google Scholar]
  • (57).Ming Y, Lai G, Tong C, Wood RH, Doren DJ. Free energy perturbation study of water dimer dissociation kinetics. J Chem Phys. 2004;121:773–777. doi: 10.1063/1.1756574. [DOI] [PubMed] [Google Scholar]
  • (58).Sakane S, Liu W, Doren DJ, Shock EL, Wood RH. Prediction of the Gibbs energies and an improved equation of state for water at extreme conditions from ab initio energies with classical simulations. Geochim Cosmochim Acta. 2001;65:4067–4075. [Google Scholar]
  • (59).Wood RH, Liu W, Doren DJ. Rapid Calculation of the Structures of Solutions with ab Initio Interaction Potentials. J Phys Chem A. 2002;106:6689–6693. [Google Scholar]
  • (60).Jarzynski C. Nonequilibrium equality for free energy differences. Phys Rev Lett. 1997;78:2690–2693. [Google Scholar]
  • (61).Kearns FL, Hudson PS, Woodcock HL, Boresch S. Computing Converged Free Energy Differences between Levels of Theory via Nonequilibrium Work Methods: Challenges and Opportunities. J Comput Chem. 2017;38:1376–1388. doi: 10.1002/jcc.24706. [DOI] [PubMed] [Google Scholar]
  • (62).Boresch S, Karplus M. The Role of Bonded Terms in Free Energy Simulations: 1. Theoretical Analysis. J Phys Chem A. 1999;103:103–118. [Google Scholar]
  • (63).Boresch S, Karplus M. The Role of Bonded Terms in Free Energy Simulations. 2. Calculation of Their Influence on Free Energy Differences of Solvation. J Phys Chem A. 1999;103:119–136. [Google Scholar]
  • (64).Smith PE, van Gunsteren WF. When Are Free Energy Components Meaningful? J Phys Chem. 1994;98:13735–13740. [Google Scholar]
  • (65).Mark AE, van Gunsteren WF. Decomposition of the Free Energy of a System in Terms of Specific Interactions. J Mol Biol. 1994;240:167–176. doi: 10.1006/jmbi.1994.1430. [DOI] [PubMed] [Google Scholar]
  • (66).Boresch S, Karplus M. The Meaning of Component Analysis: Decomposition of the Free Energy in Terms of Specific Interactions. J Mol Biol. 1995;254:801–807. doi: 10.1006/jmbi.1995.0656. [DOI] [PubMed] [Google Scholar]
  • (67).Formally, one can obtain “components” ΔAXY = ΔAXY(elec),X(LJ) + ΔAY(elec),XY(LJ) by changing, e.g., first the electrostatic interactions (while retaining the LJ interactions of state X), then the LJ interactions from X to Y in separate steps. However, any such decomposition depends on the path, i.e., the order in which the interactions are changed. A numerical example can be found, e.g., in Ref. 66.
  • (68).MacKerell AD, Feig M, Brooks CL. Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J Comput Chem. 2004;25:1400–1415. doi: 10.1002/jcc.20065. [DOI] [PubMed] [Google Scholar]
  • (69).Seeliger D, de Groot BL. Protein Thermostability Calculations Using Alchemical Free Energy Simulations. Biophys J. 2010;98:2309–2316. doi: 10.1016/j.bpj.2010.01.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (70).Giovannelli E, Cioni M, Procacci P, Cardini G, Pagliai M, Volkov V, Chelli R. Binding Free Energies of Host–Guest Systems by Nonequilibrium Alchemical Simulations with Constrained Dynamics: Illustrative Calculations and Numerical Validation. J Chem Theory Comput. 2017;13:5887–5899. doi: 10.1021/acs.jctc.7b00595. [DOI] [PubMed] [Google Scholar]
  • (71).Crooks GE. Path-ensemble averages in systems driven far from equilibrium. Phys Rev E. 2000;61:2361–2366. [Google Scholar]
  • (72).Dellago C, Hummer G. Computing Equilibrium Free Energies Using Non-Equilibrium Molecular Dynamics. Entropy. 2013;16:41–61. [Google Scholar]
  • (73).Vanommeslaeghe K, Hatcher E, Acharya C, Kundu S, Zhong S, Shim J, Darian E, Guvench O, Lopes P, Vorobyov I, MacKerell AD. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J Comput Chem. 2010;31:671–690. doi: 10.1002/jcc.21367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (74).MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, et al. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J Phys Chem B. 1998;102:3586–3616. doi: 10.1021/jp973084f. [DOI] [PubMed] [Google Scholar]
  • (75).Best RB, Zhu X, Shim J, Lopes PEM, Mittal J, Feig M, MacKerell AD. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone φ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J Chem Theory Comput. 2012;8:3257–3273. doi: 10.1021/ct300400x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (76).Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, Grubmüller H, MacKerell AD Jr. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods. 2016;14:71. doi: 10.1038/nmeth.4067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (77).Specifically, in C36 the Lennard-Jones parameters of the aliphatic carbon types were changed; furthermore, various parameters relevant for the Ser sidechain (angles, dihedrals) were reparameterized.
  • (78).König G, Bruckner S, Boresch S. Unorthodox uses of Bennett’s acceptance ratio method. J Comput Chem. 2009;30:1712–1718. doi: 10.1002/jcc.21255. [DOI] [PubMed] [Google Scholar]
  • (79).Wood RH, Muhlbauer WCF, Thompson PT. Systematic errors in free energy perturbation calculations due to a finite sample of configuration space: sample-size hysteresis. J Phys Chem. 1991;95:6670–6675. [Google Scholar]
  • (80).Porezag D, Frauenheim T, Köhler T, Seifert G, Kaschner R. Construction of tight-binding-like potentials on the basis of density-functional theory: Application to carbon. Phys Rev B. 1995;51:12947–12957. doi: 10.1103/physrevb.51.12947. [DOI] [PubMed] [Google Scholar]
  • (81).Elstner M, Porezag D, Jungnickel G, Elsner J, Haugk M, Frauenheim T, Suhai S, Seifert G. Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties. Phys Rev B. 1998;58:7260–7268. [Google Scholar]
  • (82).Elstner M, Frauenheim T, Kaxiras E, Seifert G, Suhai S. A Self-Consistent Charge Density-Functional Based Tight-Binding Scheme for Large Biomolecules. phys stat sol (b) 2000;217:357–376. [Google Scholar]
  • (83).Frauenheim T, Seifert G, Elstner M, Niehaus T, Köhler C, Amkreutz M, Sternberg M, Hajnal Z, Carlo AD, Suhai S. Atomistic simulations of complex materials: ground-state and excited-state properties. J Phys: Condens Matter. 2002;14:3015–3047. [Google Scholar]
  • (84).Elstner M. The SCC-DFTB method and its application to biological systems. Theor Chem Acc. 2006;116:316–325. [Google Scholar]
  • (85).Gaus M, Goez A, Elstner M. Parametrization and Benchmark of DFTB3 for Organic Molecules. J Chem Theory Comput. 2013;9:338–354. doi: 10.1021/ct300849w. [DOI] [PubMed] [Google Scholar]
  • (86).Gaus M, Lu X, Elstner M, Cui Q. Parameterization of DFTB3/3OB for Sulfur and Phosphorus for Chemical and Biological Applications. J Chem Theory Comput. 2014;10:1518–1537. doi: 10.1021/ct401002w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (87).Lu X, Gaus M, Elstner M, Cui Q. Parametrization of DFTB3/3OB for Magnesium and Zinc for Chemical and Biological Applications. J Phys Chem B. 2015;119:1062–1082. doi: 10.1021/jp506557r. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (88).Kubillus M, Kubař T, Gaus M, Řezáč J, Elstner M. Parameterization of the DFTB3 Method for Br, Ca, Cl, F, I, K, and Na in Organic and Biological Systems. J Chem Theory Comput. 2015;11:332–342. doi: 10.1021/ct5009137. [DOI] [PubMed] [Google Scholar]
  • (89).Vanommeslaeghe K, MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) I: Bond perception and atom typing. J Chem Inf Model. 2012;52:3144–3154. doi: 10.1021/ci300363c. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (90).Vanommeslaeghe K, Raman EP, MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) II: Assignment of bonded parameters and partial atomic charges. J Chem Inf Model. 2012;52:3155–3168. doi: 10.1021/ci3003649. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (91).Huang L, Roux B. Automated Force Field Parameterization for Nonpolarizable and Polarizable Atomic Models Based on Ab Initio Target Data. J Chem Theory Comput. 2013;9:3543–3556. doi: 10.1021/ct4003477. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (92).We note that certain nonbonded options of the GAAMP derived force field parameters were set differently here to make them consistent with regular CHARMM force field calculations (constant vs. distance dependent dielectric constant, use of 1-4 Lennard-Jones parameters for atom types from the regular force field). Therefore, the numerical results reported for the gas phase differ from those of Ref. 61.
  • (93).Woodcock HL, Miller BT, Hodoscek M, Okur A, Larkin JD, Ponder JW, Brooks BR. MSCALE: A General Utility for Multiscale Modeling. J Chem Theory Comput. 2011;7:1208–1219. doi: 10.1021/ct100738h. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (94).Wu D, Kofke DA. Phase-space overlap measures. I. Fail-safe bias detection in free energies calculated by molecular simulation. J Chem Phys. 2005;123:054103. doi: 10.1063/1.1992483. [DOI] [PubMed] [Google Scholar]
  • (95).Guthrie JP. Concerning the distant polar interaction in free energies of transfer. An explanation and an estimation procedure. Can J Chem. 1991;69:1893–1903. [Google Scholar]
  • (96).König G, Pickard F, Huang J, Thiel W, MacKerell A, Brooks B, York DM. Comparison of QM/MM Simulations with and without the Drude Oscillator Model Based on Hydration Free Energies of Simple Solutes. Molecules. 2018;23:2695. doi: 10.3390/molecules23102695. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (97).Pearlman DA, Kollman PA. The overlooked bond-stretching contribution in free energy perturbation calculations. J Chem Phys. 1991;94:4532–4545. [Google Scholar]
  • (98).Bernardi R, Bhandarkar M, Bhatele A, Bohm E, Brunner R, Buelens F, Chipot C, Dalke A, Dixit S, Fiorin G, Freddolino P, et al. NAMD User’s Guide. Theoretical Biophysics Group, University of Illinois and Beckman Institute, 405 N. Mathews, Urbana, IL 61801; 2018. [Google Scholar]
  • (99).Pearlman DA. Comparison of Alternative Approaches to Free Energy Calculations. J Phys Chem. 1994;98:1487–1493. [Google Scholar]
  • (100).Leitgeb M, Schröder C, Boresch S. Alchemical free energy calculations and multiple conformational substates. J Chem Phys. 2005;122:084109. doi: 10.1063/1.1850900. [DOI] [PubMed] [Google Scholar]
  • (101).Ryckaert J-P, Ciccotti G, Berendsen HJC. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys. 1977;23:327–341. [Google Scholar]
  • (102).van Gunsteren W, Berendsen H. Algorithms for macromolecular dynamics and constraint dynamics. Mol Phys. 1977;34:1311–1327. [Google Scholar]
  • (103).Kearns F, Warrensford L, Boresch S, Woodcock HL. The Good, the Bad, and the Ugly: HiPen, a New Dataset for Validating (S)QM/MM Free Energy Simulations. Molecules. 2019;24:681–708. doi: 10.3390/molecules24040681. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (104).Walker RC, Crowley MF, Case DA. The implementation of a fast and accurate QM/MM potential method in Amber. J Comput Chem. 2008;29:1019–31. doi: 10.1002/jcc.20857. [DOI] [PubMed] [Google Scholar]
  • (105).Gao J. Potential of mean force for the isomerization of DMF in aqueous solution: a Monte Carlo QM/MM simulation study. J Am Chem Soc. 1993;115:2930–2935. [Google Scholar]
  • (106).Gao J. Origin of the solvent effects on the barrier to amide isomerization from the combined QM/MM Monte Carlo simulations. Proc Indian Acad Sci (Chem Sci) 1994;106:507–519. [Google Scholar]
  • (107).Martí S, Moliner V, Tuñón I. Improving the QM/MM Description of Chemical Processes: A Dual Level Strategy To Explore the Potential Energy Surface in Very Large Systems. J Chem Theory Comput. 2005;1:1008–1016. doi: 10.1021/ct0501396. [DOI] [PubMed] [Google Scholar]
  • (108).Gao J. Methods in Enzymology. Elsevier; 2016. pp. 359–388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (109).Han K, Hudson PS, Jones MR, Nishikawa N, Tofoleanu F, Brooks BR. Prediction of CB[8] host–guest binding free energies in SAMPL6 using the double-decoupling method. J Comput Aided Mol Des. 2018;32:1059–1073. doi: 10.1007/s10822-018-0144-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (110).Smith JS, Isayev O, Roitberg AE. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem Sci. 2017;8:3192–3203. doi: 10.1039/c6sc05720a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (111).Shen L, Yang W. Molecular Dynamics Simulations with Quantum Mechanics/Molecular Mechanics and Adaptive Neural Networks. J Chem Theory Comput. 2018;14:1442–1455. doi: 10.1021/acs.jctc.7b01195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (112).Smith JS, Nebgen B, Lubbers N, Isayev O, Roitberg AE. Less is more: Sampling chemical space with active learning. J Chem Phys. 2018;148:241733. doi: 10.1063/1.5023802. [DOI] [PubMed] [Google Scholar]
  • (113).Smith JS, Isayev O, Roitberg AE. ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules. Scientific Data. 2017;4:170193. doi: 10.1038/sdata.2017.193. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

SI1
SI2
SI3
