Reaction Path-Force Matching in Collective Variables: Determining Ab Initio QM/MM Free Energy Profiles by Fitting Mean Force

Bryant Kim; Ryan Snyder; Mulpuri Nagaraju; Yan Zhou; Pedro Ojeda-May; Seth Keeton; Mellisa Hege; Yihan Shao; Jingzhi Pu

doi:10.1021/acs.jctc.1c00245

Author manuscript; available in PMC: 2022 Aug 10.

Published in final edited form as: J Chem Theory Comput. 2021 Jul 20;17(8):4961–4980. doi: 10.1021/acs.jctc.1c00245


PMCID: PMC9064116 NIHMSID: NIHMS1790287 PMID: 34283604

Abstract

First-principles determination of free energy profiles for condensed-phase chemical reactions is hampered by the daunting costs associated with configurational sampling on ab initio quantum mechanical/molecular mechanical (AI/MM) potential energy surfaces. Here, we report a new method that enables efficient AI/MM free energy simulations through mean force fitting. In this method, a free energy path in collective variables (CVs) is first determined on an efficient reactive aiding potential. Based on the configurations sampled along the free energy path, correcting forces to reproduce the AI/MM forces on the CVs are determined through force matching. The AI/MM free energy profile is then predicted from simulations on the aiding potential in conjunction with the correcting forces. Such cycles of correction-prediction are repeated until convergence is established. As the instantaneous forces on the CVs sampled in equilibrium ensembles along the free energy path are fitted, this procedure faithfully restores the target free energy profile by reproducing the free energy mean forces. Due to its close connection with the reaction path-force matching (RP-FM) framework recently introduced by us, we designate the new method as RP-FM in collective variables (RP-FM-CV). We demonstrate the effectiveness of this method on a type-II solution-phase S_N2 reaction, NH₃ + CH₃Cl (the Menshutkin reaction), simulated with an explicit water solvent. To obtain the AI/MM free energy profiles, we employed the semiempirical AM1/MM Hamiltonian as the base level for determining the string minimum free energy pathway, along which the free energy mean forces are fitted to various target AI/MM levels using the Hartree-Fock (HF) theory, density functional theory (DFT), and the second-order Møller-Plesset perturbation (MP2) theory as the AI method.
The forces on the bond-breaking and bond-forming CVs at both the base and target levels are obtained by force transformation from Cartesian to redundant internal coordinates under the Wilson B-matrix formalism, where the linearized FM is facilitated by the use of spline functions. For the Menshutkin reaction tested, our FM treatment greatly reduces the deviations on the CV forces, from an original range of 12–33 kcal/mol/Å to ∼2 kcal/mol/Å. Comparisons with the experimental and benchmark AI/MM results, tests of the new method under a variety of simulation protocols, and analyses of the solute-solvent radial distribution functions suggest that RP-FM-CV can be used as an efficient, accurate, and robust method for simulating solution-phase chemical reactions.


1. Introduction

The holy grail of simulating condensed-phase chemical/biochemical reactions is to obtain reliable free energy profiles through sampling highly accurate potential energy surfaces (PES) described by first-principles quantum mechanical (QM) methods such as ab initio molecular orbital¹ (AI-MO) and density functional theory^2–3 (DFT) methods, which are collectively referred to as the AI methods here for convenience. Even with the aid of the combined quantum mechanical and molecular mechanical (QM/MM)^4–8 technique, such simulation will likely remain impractical in the near future because of the daunting computational demands associated with free energy sampling on the already costly AI/MM PESs.⁹ Alternatively, efficient semiempirical (SE) MO¹⁰ methods are often used in QM/MM for more efficient PES calculations; although they make adequate free energy sampling more affordable, their accuracy and reliability may not always be guaranteed. How to systematically improve an SE/MM method to reach AI/MM-level quality is a long-standing challenge in the fields of computational chemistry and computational enzymology.¹¹ Recently, we introduced a new computational framework called reaction path-force matching¹² (RP-FM) to address this challenge. The central idea of RP-FM is to bridge the SE/MM and AI/MM levels in the context of QM/MM free energy simulations through force,¹² which encodes all dynamical information of the system, by making use of the force matching (FM) technique.^13–21 The second key element in RP-FM is that we conduct FM along free energy reaction pathways,^22–24 which is a natural choice for reactive FM. Because FM is performed between two electronic-structure-based QM/MM potentials, RP-FM enables cost-effective fitting of highly accurate reactive potentials for studying chemical reactions. 
As the molecular configurations for FM and for determining free energy profiles are always sampled at an efficient SE/MM level, direct sampling of the expensive AI/MM surface is avoided. The RP-FM method therefore offers a promising tool for accurate and efficient free energy simulations of condensed-phase reactions.

In our recently published work,¹² we demonstrated the idea of RP-FM in the framework of optimizing specific reaction parameters (SRPs) for SE methods. Based on a set of condensed-phase configurations sampled along the free energy reaction pathway using SE/MM simulations, the selected SE-SRPs are adjusted until atomic forces match with those from the high-level AI/MM calculations. Then, the free energy pathway and the corresponding free energy profile are updated with the FM-optimized SE-SRP/MM simulations. Because of the “self-consistent” nature of the problem, RP-FM is conducted iteratively until the free energy results converge.

Although conceptually elegant, the RP-FM method can be difficult to apply to complex systems due to the nonlinear optimization involved when parametrizing a QM potential. In the framework of FM optimization of SE-SRPs, the forces from the SE QM calculations are not linearly dependent on the electronic-structure parameters to be adjusted, which makes the situation quite different from using FM to fit MM force-field parameters. One way to enable nonlinear FM, as we demonstrated previously, is to fit forces using nonlinear optimization algorithms such as the genetic algorithm (GA).¹² For simple reactions such as a proton-transfer reaction in the gas phase and in solution, the GA-based nonlinear FM strategy works reasonably well. For example, for the RP-FM simulations of the proton-transfer reaction between ammonia and ammonium in the gas phase, the force deviation between PM3 and Hartree−Fock (HF) is reduced remarkably from an average of 12 kcal/mol/Å per force component to less than 1 kcal/mol/Å, which brings the PM3 barrier height to agree with the HF/3–21G benchmark results after a change of ∼10 kcal/mol.¹² For RP-FM of the same reaction in solution, although we observed a similar convergence of the PM3-SRP/MM free energy profile toward its AI/MM benchmark, the solution-phase FM, based on explicit QM/MM configurations, seems to be more challenging for a GA optimizer to handle; the average force deviation plateaus at 3.5 kcal/mol/Å, a value significantly higher than seen in the gas-phase FM.

When the size and complexity of the reactive system increase, nonlinear SE-SRP optimization can become a practical bottleneck in Cartesian-based FM. With large numerical errors in force fitting, RP-FM may be insufficient on its own but can be complemented by the weighted thermodynamics perturbation (wTP) approach;²⁵ the powerful combination of the two outperforms the use of either method individually in reproducing the AI/MM free energy profiles.²⁶ Without the help of free energy perturbation, the problem of nonlinear FM, however, can be alleviated by parametrizing a classical energy component in (or on top of) the SE potential; when forces from such a classical energy term display a linear dependence on its parameters, the associated FM morphs into a classical one. One example of this type of approach is the FM-optimized density functional-tight binding (FM-DFTB) method developed by Goldman and co-workers,^27–28 who used FM to optimize the pairwise repulsive potential terms in DFTB to account for the force differences between DFTB and the target AI level. Another exciting direction is to introduce machine learning (ML)-optimized corrections on energy,^29–30 forces,³¹ or both^32–34 for SE methods. Although FM serves as an important component in these developments either for optimizing potentials,^{12, 27–28, 32–34} or for reproducing high-level molecular dynamics (MD) trajectories on selected internal degrees of freedom,³¹ a direct link between FM and determining the target-level free energy profiles is lacking. To overcome this hurdle, it is highly desirable to build a rigorous connection between FM and free energy, ideally through a linearized force-only-based framework.

In the process of forging this missing link and establishing the conceptual framework we desired, we noticed that collective variables (CVs) and the associated forces play important roles in free energy simulations such as the minimum free energy path (MFEP) simulations using the string method.^22–24 In the context of RP-FM, we found that instead of fitting all the atomic forces, matching the AI/MM target forces on the CVs offers a theoretically elegant way to reproduce the AI/MM free energy profiles. Following a similar line of reasoning by Voth and co-workers, who pointed out that mapping all-atom potentials to coarse-grained potentials by FM rigorously reproduces the many-body potential of mean force (PMF),³⁵ we show here that fitting the CV forces along the MFEP reproduces the free energy mean force at the target level, the integration of which directly leads to the high-level PMF coarse-grained to the consistent CV degrees of freedom. Under this strategy, because usually only a few selected CVs are subject to FM, the high-dimensional nonlinear optimization problem in a complex parameter space will be reduced to a much lower dimension. In this paper, we report our development in this direction, which results in a new method we designate as reaction path-force matching in collective variables (RP-FM-CV). As we will demonstrate below, formulation of RP-FM in the CV space leads to a smooth connection between the target-level free energy profiles and mean force fitting. Because we directly operate on force, no explicit modifications of the potential energy function are needed in RP-FM-CV.

The rest of the paper is organized as follows. The related theory is presented in Section 2. The benchmark system for testing the method is described and reviewed in Section 3. Section 4 provides the computational details. Results and discussion are given in Section 5. The relations of this work to others and its future are discussed in Section 6. Concluding remarks are presented in Section 7.

2. Theory

2.1. RP-FM is equivalent to fitting free energy mean force

Although serving as a convenient vehicle for optimizing SE-SRP/MM potentials,¹² RP-FM, from a free energy perspective, is equivalent to fitting the many-body PMF at the target AI/MM level. Such a free-energy-based understanding of the method can be shown by starting from the familiar expression of mean force $〈 F 〉$ of free energy on a reaction coordinate (RC) $ξ$ , represented by a set of n collective variables, i.e., $ξ = (ξ_{1}, ..., ξ_{n})$ :

$$ {〈 F 〉}_{ξ = ξ^{*}} = \frac{\int d x_{1} \cdots d x_{N} \, d p_{1} \cdots d p_{N} \, δ (ξ_{1} - ξ_{1}^{*}) \cdots δ (ξ_{n} - ξ_{n}^{*}) \exp \left( - \frac{H}{k_{B} T} \right) F [f (x_{1}, ..., x_{N})]}{\int d x_{1} \cdots d x_{N} \, d p_{1} \cdots d p_{N} \, δ (ξ_{1} - ξ_{1}^{*}) \cdots δ (ξ_{n} - ξ_{n}^{*}) \exp \left( - \frac{H}{k_{B} T} \right)} \tag{1} $$

where $x_{i}$ denotes the $i$th Cartesian coordinate out of N degrees of freedom, $p_{i}$ is the conjugate momentum, H is the Hamiltonian, $k_{B}$ is the Boltzmann constant, $T$ is the temperature, and $δ$ is the Dirac delta function; F represents the instantaneous force on the reaction coordinate and can be obtained from transformation of the Cartesian atomic force $f (x_{1}, ..., x_{N})$ . Cast into the context of RP-FM, Eq. (1) indicates that matching the SE/MM atomic forces to the corresponding AI/MM atomic forces, both in Cartesian coordinates, would indirectly reproduce the free energy mean force $〈 F 〉$ at the target AI/MM level.

Here we demonstrate an alternative idea, where instead of fitting all the AI/MM Cartesian atomic forces in Eq. (1), we will conduct FM directly on the reaction coordinate $ξ$ . Because the instantaneous AI/MM forces along the reaction coordinate are reproduced over an ensemble of configurations sampled on an efficient SE/MM potential, the resulting method is equivalent to directly fitting the free energy mean forces, the integration of which over the reaction coordinate would faithfully restore the AI/MM free energy profile (interchangeably referred to as PMF in this work, only for convenience of discussion when the distinction between them is small³⁶).
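The link between mean force and free energy profile described above can be illustrated with a minimal sketch. It assumes a one-dimensional reaction coordinate, a simple arithmetic average over sampled instantaneous forces, and trapezoidal integration with the sign convention that the mean force is the negative derivative of the free energy; the function names and the harmonic test data are hypothetical and are not taken from the paper.

```python
def ensemble_mean(instantaneous_forces):
    """Average the instantaneous forces sampled at one value of the coordinate."""
    return sum(instantaneous_forces) / len(instantaneous_forces)


def integrate_mean_force(xi, mean_force):
    """Free energy relative to the first point, assuming dA/dxi = -<F>.

    Uses the trapezoidal rule on a (possibly non-uniform) grid of xi values.
    """
    if len(xi) != len(mean_force):
        raise ValueError("xi and mean_force must have the same length")
    profile = [0.0]
    for i in range(1, len(xi)):
        step = xi[i] - xi[i - 1]
        avg_force = 0.5 * (mean_force[i] + mean_force[i - 1])
        profile.append(profile[-1] - avg_force * step)
    return profile


# Hypothetical data: a harmonic free energy well with k = 2 centered at 0.5.
k, center = 2.0, 0.5
xi = [0.1 * i for i in range(11)]
# Three symmetric "samples" per window stand in for an equilibrium ensemble.
samples = [[-k * (x - center) + noise for noise in (-1.0, 0.0, 1.0)] for x in xi]
mean_force = [ensemble_mean(window) for window in samples]
profile = integrate_mean_force(xi, mean_force)
```

Because the integration only sees the averaged forces, any correction that reproduces the target-level average at each point along the coordinate reproduces the integrated profile as well.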

2.2. RP-FM-CV fits mean force on collective variables in internal coordinates

Free energy profiles for complex chemical/biochemical reactions can be conveniently obtained using the string method²² through the determination of mean forces on multidimensional collective variables (CVs) along the minimum free energy pathway (MFEP).^23–24 Therefore, we choose to formulate FM of the instantaneous forces F in the CV space using the ansatz in Eq. (1). After the change of variables, we write the free energy mean force in terms of a set of generalized coordinates $(q_{1}, q_{s})$ :

$$ {〈 F 〉}_{q_{1}} = \frac{\int d q_{s} \, d p_{q_{s}} \, d p_{q_{1}} \exp \left( - \frac{H}{k_{B} T} \right) F (q_{1}, q_{s})}{\int d q_{s} \, d p_{q_{s}} \, d p_{q_{1}} \exp \left( - \frac{H}{k_{B} T} \right)} \tag{2} $$

where $q_{1} \equiv ξ$ now represents the reaction coordinate expressed in a set of CVs, which is also used consistently for defining the MFEP, and $q_{s}$ denotes its complementary set for completing the generalized coordinate system.^37–38 The instantaneous forces F on CVs for evaluating the free energy force (also known as the thermodynamics force) can be further expressed as:^37–39

$$ F (q_{1}, q_{s}) = \frac{1}{β} \frac{\partial \ln | J (q_{1}, q_{s}) |}{\partial q_{1}} - \frac{\partial U (q_{1}, q_{s})}{\partial q_{1}} \tag{3} $$

where J is the Jacobian matrix that transforms the Cartesian to the generalized coordinate system, $β$ is ${(k_{B} T)}^{- 1}$ , and $\frac{\partial U (q_{1}, q_{s})}{\partial q_{1}}$ denotes the corresponding mechanical force, which is evaluated as the partial derivative of the potential energy $U$ with respect to the CVs. Although the partial derivative form of Eq. (3) appears to suggest that the calculation of the instantaneous forces on the CVs depends not only on the definition of the reaction coordinate $q_{1}$ but also on the choice of the complementary generalized coordinate $q_{s}$ , the thermodynamics force $〈 F 〉$ integrated from Eq. (2), however, does not depend on the specific choice of the complementary generalized coordinate. Indeed, Ruiz-Montero et al. pointed out that as long as one can identify a set of complementary generalized coordinates that makes the union of $q_{s}$ and $q_{1}$ an orthogonal set, the explicit dependence of the partial derivative term on $q_{s}$ can be removed and Eq. (3) can be conveniently written as:³⁸

$$ F (q_{1}, q_{s}) = - \frac{1}{β} \frac{(\nabla | \nabla q_{1} |) \cdot \nabla q_{1}}{| \nabla q_{1} |^{3}} - \frac{\nabla U \cdot \nabla q_{1}}{| \nabla q_{1} |^{2}} \tag{4} $$

where $\nabla$ denotes the first derivative operator with respect to the Cartesian coordinates. Note that for a one-dimensional case that uses a single bond as the CV, the mechanical force term in Eq. (4) leads to an expression equivalent to a simple projection of Cartesian forces along the bond vector.
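The single-bond case mentioned above can be checked numerically. The following sketch evaluates only the mechanical term of Eq. (4) for a distance CV between two atoms; the function names, the harmonic test potential, and its parameters are illustrative assumptions rather than anything specified in the paper.

```python
import math


def bond_cv_force(r_a, r_b, f_a, f_b):
    """Mechanical term of Eq. (4) for the distance CV q = |r_a - r_b|.

    f_a and f_b are the Cartesian forces (-grad U) on the two atoms.
    Since dq/dr_a = u and dq/dr_b = -u (u is the unit bond vector),
    |grad q|^2 = 2 and -grad U . grad q = f_a.u - f_b.u.
    """
    d = [a - b for a, b in zip(r_a, r_b)]
    q = math.sqrt(sum(c * c for c in d))
    u = [c / q for c in d]
    projection = sum(fa * c for fa, c in zip(f_a, u)) - sum(
        fb * c for fb, c in zip(f_b, u)
    )
    return projection / 2.0


def harmonic_bond_forces(r_a, r_b, k, r0):
    """Cartesian forces for the hypothetical test potential U = k/2 (q - r0)^2."""
    d = [a - b for a, b in zip(r_a, r_b)]
    q = math.sqrt(sum(c * c for c in d))
    scale = -k * (q - r0) / q
    f_a = [scale * c for c in d]
    f_b = [-c for c in f_a]
    return f_a, f_b


r_a, r_b = (0.0, 0.0, 0.0), (0.0, 0.0, 2.0)
f_a, f_b = harmonic_bond_forces(r_a, r_b, k=300.0, r0=1.8)
force_on_cv = bond_cv_force(r_a, r_b, f_a, f_b)  # equals -k (q - r0) here
```

For this test potential the projected value coincides with the analytical derivative along the bond, which is the equivalence the text refers to.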

As den Otter and Briels later pointed out,³⁹ the existence of such orthogonal complementary generalized coordinate sets may not always be guaranteed, especially when the CVs are global in nature. We realized that for most QM/MM applications in reaction mechanism studies, it is usually sufficient to use internal coordinates such as bond distances, bond angles, and dihedral angles, which are all local variables, to describe the reaction progress. For these local CVs in internal coordinates, construction of the complementary generalized coordinates that are orthogonal to the selected CVs is attainable: one can use the Cartesian coordinates of the atoms that are not involved in the CVs; these non-CV Cartesian coordinates by definition are orthogonal to the internal coordinates in the CVs. For a solute-solvent system, such a treatment readily justifies the omission of solvent coordinates from $q_{s}$ when evaluating the force on $q_{1}$ , if the reaction coordinate only involves the solute atoms. For the solutes, since the CVs can couple to other solute degrees of freedom through shared atoms and chemical bonds, $q_{s}$ needs to be constructed explicitly with its various choices tested systematically for convergence. Our demonstration of the RP-FM-CV method in this paper will focus on the CVs (as well as the complementary generalized coordinate) that are defined by local internal coordinates. With this clarification of the coordinate system, now we identify Eq. (3) as the key equation for conducting FM in CVs.

In the context of FM in QM/MM, the Jacobian force term in Eq. (3), i.e., the term containing $\ln | J |$ , which arises from coordinate transformation (zero for a rectilinear transformation but nonzero for a curvilinear transformation), is purely geometrical, regardless of whether an SE/MM or AI/MM method is used for the potential energy calculations; therefore, this term does not contribute to the force differences between the two QM/MM levels involved in FM. By contrast, the $- \partial U / \partial q_{1}$ term on the right-hand side of Eq. (3), which gives the mechanical forces on the CV internal coordinates, is PES dependent and will be subject to force matching.
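The cancellation can be written out explicitly. In the short derivation below, the superscripts AI/MM and SE/MM are labels introduced here only for illustration; applying Eq. (3) at both levels to the same configuration $(q_{1}, q_{s})$ and subtracting gives

$$ \Delta F (q_{1}, q_{s}) = F^{AI/MM} - F^{SE/MM} = \left[ \frac{1}{β} \frac{\partial \ln | J |}{\partial q_{1}} - \frac{\partial U^{AI/MM}}{\partial q_{1}} \right] - \left[ \frac{1}{β} \frac{\partial \ln | J |}{\partial q_{1}} - \frac{\partial U^{SE/MM}}{\partial q_{1}} \right] = - \left( \frac{\partial U^{AI/MM}}{\partial q_{1}} - \frac{\partial U^{SE/MM}}{\partial q_{1}} \right) $$

so that, for a fixed choice of coordinates, the quantity to be fitted is the difference between the mechanical forces alone.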

At this point, we reiterate the rationale of conducting RP-FM in CVs: if one reproduces the potential-energy-dependent part of the instantaneous internal forces on the CVs in Eq. (3) at the AI/MM level, the ensemble-averaged free energy mean forces in Eq. (2) would be reproduced at the target level. Integration of the resulting AI/MM-quality mean force along the string MFEP expressed in the same set of CVs would faithfully restore the target free energy profile. Next, we present the practical procedure of obtaining the internal forces on the CVs.

2.3. Determining force on CVs using redundant internal coordinate transformation

The internal forces on the CVs can be conveniently obtained through force transformation from Cartesian to internal coordinates using the Wilson B-matrix formalism.⁴⁰ Because the number of possible internal coordinates that can be constructed for a polyatomic molecule quickly exceeds the degrees of freedom in the system, we choose to use redundant internal coordinates. To remove the linear dependency in the redundant internal coordinate set and transform the Cartesian forces to the internal forces on the CVs, we adopted a procedure developed by Pulay and co-workers,⁴¹ where the redundancy of the coordinate system is identified and removed when forming the generalized inverse of the G-matrix (see Appendix B for details). Note that both the non-redundant and redundant forms of Cartesian-to-internal coordinate transformation have been widely used in geometry optimization^41–42 and in generalized vibrational analysis along the reaction path.^43–44
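As a rough illustration of this transformation, the sketch below builds Wilson B-matrix rows for bond distances and converts Cartesian forces to internal forces through a generalized inverse of $G = B B^{T}$. Several points are assumptions made here: the appendix cited above is not reproduced in this section, so the unit weighting in G, the use of NumPy's `pinv` for the generalized inverse, the restriction of the internal coordinate set to the two bond-distance CVs, and the geometry are all illustrative choices rather than the authors' implementation.

```python
import numpy as np


def bond_b_row(coords, i, j):
    """Wilson B-matrix row (dq/dx) for the distance q = |r_i - r_j|."""
    row = np.zeros_like(coords, dtype=float)
    d = coords[i] - coords[j]
    u = d / np.linalg.norm(d)
    row[i] = u
    row[j] = -u
    return row.ravel()


def cartesian_to_internal_forces(b_matrix, cart_forces):
    """Internal forces f_q = G^- B f_x with G = B B^T (unit weights assumed).

    The pseudo-inverse discards the null space of G, which is how linear
    dependencies in a redundant coordinate set would be handled in this sketch.
    """
    g_matrix = b_matrix @ b_matrix.T
    g_inverse = np.linalg.pinv(g_matrix)
    return g_inverse @ b_matrix @ np.asarray(cart_forces, dtype=float).ravel()


# Hypothetical geometry (Angstrom) for atoms ordered as N, C, Cl.
coords = np.array([[0.0, 0.0, -2.0], [0.0, 0.1, 0.0], [0.2, 0.0, 1.9]])
b_matrix = np.vstack([bond_b_row(coords, 1, 2),   # C-Cl distance
                      bond_b_row(coords, 0, 1)])  # N-C distance

# Round trip: spread known internal forces to Cartesian, then recover them.
internal_true = np.array([5.0, -3.0])
cart_forces = (b_matrix.T @ internal_true).reshape(3, 3)
internal_recovered = cartesian_to_internal_forces(b_matrix, cart_forces)
```

In a realistic setup the internal coordinate set would also contain the complementary solute coordinates discussed in Section 2.2, and the resulting CV forces would need to be tested for convergence with respect to that choice.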

Unlike classical FM based on pairwise classical potentials, where force matching can be conveniently cast into a linearized least-square problem, our previous implementation of RP-FM between two QM/MM potentials employed a genetic algorithm to handle the nonlinear optimization of SE-SRPs for fitting atomic forces in Cartesian coordinates. To circumvent the need for nonlinear optimization in RP-FM-CV, we introduce a set of empirical force correction terms to directly fit the target forces on the CVs, which is described below. Alternatively, we have also formulated the RP-FM-CV method in a machine learning (ML) framework, which will be reported in a companion paper.

2.4. Linearized force matching in RP-FM-CV using spline functions

After the internal forces on the CVs are obtained at both the base (SE/MM) and target (AI/MM) levels for configurations sampled along the free energy pathway, we conduct FM through a force correction term for each CV to minimize the force differences between the two levels. Specifically, we fit the internal force corrections using a set of grid-based cubic spline functions (see Appendix A for implementation details), which is a numerical treatment originally introduced by Voth and co-workers¹⁴ for FM optimization of classical force fields. As shown by the original developers,¹⁴ FM under this framework can be cast into solving an overdetermined linear equation system. With this linearization treatment, our FM between the SE/MM and AI/MM levels on each CV is converted to a least-square problem and then solved by QR decomposition⁴⁵ or singular value decomposition (SVD),⁴⁵ in a way similar to FM optimization of classical force fields.¹⁴
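The linear structure of this fit can be sketched as follows. Note that the example deliberately swaps in a simpler basis: piecewise-linear "hat" functions on a grid instead of the cubic splines used in the paper, since the appendix with the spline details is not part of this section. The function names, grid, and synthetic forces are hypothetical; the least-squares solve uses `numpy.linalg.lstsq`, which is SVD-based.

```python
import numpy as np


def hat_design_matrix(q, grid):
    """Design matrix of piecewise-linear basis functions evaluated at q.

    Each row has two nonzero entries: the weights of the grid points that
    bracket the sample. Samples are assumed to lie within the grid range.
    """
    q = np.asarray(q, dtype=float)
    grid = np.asarray(grid, dtype=float)
    idx = np.clip(np.searchsorted(grid, q, side="right") - 1, 0, len(grid) - 2)
    t = (q - grid[idx]) / (grid[idx + 1] - grid[idx])
    design = np.zeros((len(q), len(grid)))
    rows = np.arange(len(q))
    design[rows, idx] = 1.0 - t
    design[rows, idx + 1] = t
    return design


def fit_force_correction(q, f_base, f_target, grid):
    """Least-squares coefficients of the correction f_target - f_base on the grid."""
    design = hat_design_matrix(q, grid)
    delta = np.asarray(f_target, dtype=float) - np.asarray(f_base, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design, delta, rcond=None)
    return coeffs


def evaluate_correction(q, grid, coeffs):
    """Correcting force on the CV at the given CV values."""
    return hat_design_matrix(q, grid) @ coeffs


# Synthetic example: the "target" differs from the "base" by a smooth offset.
grid = np.linspace(1.5, 3.0, 7)
q = np.linspace(1.5, 3.0, 61)
f_base = -20.0 * (q - 2.0)
f_target = f_base + (10.0 - 4.0 * q)
coeffs = fit_force_correction(q, f_base, f_target, grid)
rmsd_after = np.sqrt(np.mean((f_base + evaluate_correction(q, grid, coeffs) - f_target) ** 2))
```

Because the unknowns enter the correction linearly, the overdetermined system has one row per sampled configuration and one column per grid coefficient, which is what makes a direct QR or SVD solve applicable.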

2.5. Force modification for iterative RP-FM-CV

To obtain the updated free energy results through MD sampling, the spline-based correcting forces on the CVs resulting from FM are incorporated in the SE/MM forces by distributing the internal force correction on each CV to the associated Cartesian atomic force components using the chain rule. Note that the same Cartesian force modifications can also be obtained by using the backward transformation of Eq. (A22) in Appendix B, which transforms the force corrections on the CVs from the redundant internal coordinates back to Cartesian coordinates.⁴⁶ The two procedures are equivalent, where the chain-rule procedure is used in the implementation for simplicity.
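For a bond-distance CV, the chain-rule distribution amounts to applying equal and opposite force increments along the bond vector. The sketch below assumes a correction `delta_f` defined as a force on the CV (positive values push the two atoms apart); the function name and coordinates are hypothetical.

```python
import math


def distribute_bond_correction(r_a, r_b, delta_f):
    """Cartesian force increments from a correction on q = |r_a - r_b|.

    Chain rule: df_a = delta_f * dq/dr_a = delta_f * u and
    df_b = delta_f * dq/dr_b = -delta_f * u, with u the unit bond vector.
    """
    d = [a - b for a, b in zip(r_a, r_b)]
    q = math.sqrt(sum(c * c for c in d))
    u = [c / q for c in d]
    df_a = [delta_f * c for c in u]
    df_b = [-delta_f * c for c in u]
    return df_a, df_b


# The increments are added to the base-level Cartesian forces on the two atoms.
df_a, df_b = distribute_bond_correction((0.0, 0.0, 0.0), (0.0, 3.0, 4.0), 2.5)
```

The two increments sum to zero, so a correction of this form adds no net force to the solute.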

After incorporating the FM corrections, the modified SE/MM atomic forces are used to propagate the SE/MM MD trajectories for free energy sampling, with which the updated free energy pathways and free energy profiles are determined. The cycle of free energy path sampling and FM is repeated iteratively until convergence is established. A schematic representation of the RP-FM-CV procedure is shown in Figure 1.

Figure 1. — Schematic representation of the RP-FM-CV method.

3. Critical Test: Menshutkin Reaction NH₃ + CH₃Cl

To demonstrate its effectiveness, we applied the RP-FM-CV method to a type-II S_N2 reaction between ammonia and methyl chloride (NH₃ + CH₃Cl → CH₃NH₃⁺ + Cl^-) in an aqueous solution (Figure 2). Following the literature convention, we refer to this reaction as the Menshutkin⁴⁷ reaction. Understanding kinetics and thermodynamics for the Menshutkin reaction through free energy simulation is challenging in that it requires quantitatively accurate descriptions of both the PES for the solute and the change in solvation effects when the solute system evolves from the charge-neutral reactant state to the charge-separated transition and product states; for a statistically reliable free energy description of this reaction, both these components need to be properly sampled over explicit solute/solvent configurations. Due to its fundamental importance, the Menshutkin reaction has been used as a workhorse for developing a host of computational methods for accurate and efficient treatments of PES and solvation over the past three decades.^{26, 31, 34, 48–66}

Figure 2. — Menshutkin reaction between ammonia and methyl chloride (NH₃ + CH₃Cl → CH₃NH₃⁺ + Cl^-).

Combined QM/MM methods offer a powerful tool for studying solution-phase chemical reactions and solvation effects due to solvent polarization.⁶⁷ The early QM/MM studies of the Menshutkin reaction were pioneered by Gao and co-workers.^{48–49, 68} The solution-phase free energy profiles in their work were obtained either by Monte-Carlo-based free energy of hydration calculations using an AI-level gas-phase minimum energy path under a static solvation assumption⁴⁸ or by PMF simulations using two-dimensional umbrella sampling at the semiempirical AM1/TIP3P level,⁴⁹ where the latter allows polarization of the solute to be treated at a consistent electronic-structure level through the QM/MM interaction Hamiltonian.

Although Gao’s QM/MM approach provided detailed molecular-level information, the high costs of sampling a large number of explicit solute/solvent configurations motivated the studies of the Menshutkin reaction by AI calculations at various MO and DFT levels coupled with implicit solvation treatments. These include the multipolar expansion model by Dillet et al.⁵⁰ at the HF level, the polarizable continuum model (PCM) calculations at the MP2/3–21G level by Fradera⁵¹ and at the complete active space self-consistent field (CASSCF) level by Amovilli,⁵² and the generalized conductor-like screening model (GCOSMO) at the DFT and MP2 levels by Truong.⁵³ In particular, Truong et al. showed that by mixing a significant amount of HF exchange in hybrid DFT, the BH&HLYP method agrees well with the MP4 benchmark and experiments for the Menshutkin reaction in terms of its reaction energy and barrier height, both in the gas phase and in solution.⁵³

To strike a balance between the efficiency of using an implicit solvent and the microscopic/dynamic level of accuracy using an explicit solvent, several intermediate methods that bridge the continuum and QM/MM approaches have been developed. For example, Kato and co-workers^{54, 69} explored the Menshutkin reaction using the reference interaction site model self-consistent field (RISM-SCF) method, which combines the RISM integral equation for solvent with the solute electronic structures for description of local solute-solvent interactions.^70–71 Employing the free energy gradient (FEG) strategy of Okuyama-Yoshida et al.,⁷² Hirao et al.⁵⁶ optimized the solution-phase transition-state (TS) geometry for the Menshutkin reaction on a multidimensional free energy surface implicit of solvent coordinates based on the solute FEG derived from explicit QM/MM simulations. To reduce the computational costs of determining FEG explicitly from sampling the QM/MM potential energy surface, Galvan et al.⁵⁹ developed a mean-field approach called the averaged solvent electrostatic potential QM/MM (ASEP-QM/MM), where they used a fixed solute geometry and charge distribution while sampling the solvent configurations to obtain the ASEP for subsequent implicit polarization of the solute at the quantum mechanical level; with this approach, the free energy of activation and free energy TS properties were characterized at the BH&HLYP/aug-cc-pVDZ level.⁵⁹ A related but different strategy was also employed by Gordon and co-workers,⁵⁵ who studied the Menshutkin reaction in the effective fragment potential (EFP) framework; in their approach, the electronic structure of the solute molecule is determined under the polarization of EFP generated from the explicitly represented surrounding solvent clusters.⁷³

As the mean-field QM/MM treatment helps eliminate the costs for explicit AI/MM MD simulations, the related approaches enable free energy to be computed with high-level AI methods. For example, Nakano et al. reported the free energy profiles for the Menshutkin reaction at the MP2/6–31+G(d,p)/MM level using their own mean-field QM/MM approach,⁶⁶ which is similar in spirit to the QM/MM MFEP method developed earlier by Yang and co-workers,^74–76 although the latter has not been applied to the Menshutkin reaction. To assess the performance of their mean-field method, Nakano et al. also obtained a benchmark QM/MM PMF at the MP2/6–31+G(d,p)/MM level using umbrella sampling.⁶⁶ One notable inconvenience in these mean-field QM/MM treatments (as well as in the implicit solvation calculations) is that, as the MFEP is optimized in terms of the QM solute coordinates, the dynamic sampling of the QM atoms is lacking; therefore, the missing vibrational entropy of the solute has to be estimated and added separately (e.g., using a harmonic approximation⁷⁵). Unfortunately, for the Menshutkin reaction, these solute entropy corrections to the free energy profile seem to be substantial (ca. 7 and 9–13 kcal/mol^{53, 59, 61} for the reaction free energy and free energy barrier, respectively), which makes a direct comparison between the mean-field QM/MM simulation results and experiments less straightforward.

The Menshutkin reaction has recently been revisited using explicit QM/MM simulations. On the one hand, alternative SE and solvent models have been tested. For example, Acevedo and Jorgensen⁶⁰ combined the semiempirical PDDG/PM3 method with the TIP4P water model and computed the free energy profiles for the Menshutkin reaction through Monte-Carlo-based free energy perturbation calculations. A similar approach has been employed by Vilseck et al.⁶⁴ to obtain the QM/MM free energy profiles for the same reaction based on the semiempirical RM1 method, where the more sophisticated CM1/3 charge model is used for treating QM/MM electrostatic interactions with the solvent. On the other hand, because of the inaccuracy in SE/MM methods and the daunting costs of AI/MM free energy simulations, a number of multiscale QM/MM methods utilizing AI information have been developed, aiming at simulating the Menshutkin reaction in a highly accurate but also affordable manner. For example, Tunon and co-workers^{57–58, 77} developed a dual-level QM/MM strategy, where a PES correction term is first obtained for an SE(/MM) method to fit the AI(/MM) energy results and then used in the subsequent PMF simulations. Technically, the energy correction needed for the two levels to match is treated as a spline interpolation function either along a one-dimensional reaction coordinate⁵⁷ or on a two-dimensional surface.⁷⁷ Depending on whether an unperturbed gas-phase Hamiltonian or an electrostatically perturbed QM/MM Hamiltonian is used, they developed two interpolated correction schemes, referred to as unperturbed interpolated correction (UIC) and perturbed interpolated correction (PIC), both applied to the Menshutkin reaction with AM1 being corrected to MP2.⁵⁷

Most recently, the Menshutkin reaction has been revisited using a few newly emerging multiscale QM/MM techniques, including the machine learning approaches,^{31, 34} the force-matching-aided weighted thermodynamics perturbation (FM+wTP) method,²⁶ and the RP-FM-CV method we present here. Despite the common theme that they all aim at reproducing the highly accurate AI/MM free energy profiles at a reduced cost, the ways these methods utilize the high-level information are considerably different. Therefore, assessing the RP-FM-CV method on the Menshutkin reaction would make it possible to cross-validate the related approaches against one another for consistent first-principles AI/MM free energy simulations.

4. Computational Details

This section contains the detailed descriptions of the RP-FM-CV free energy simulations outlined above. The general features of the simulations, including the solute/solvent models, computation of the potential energies, definition of the collective variables, boundary conditions, and electrostatic treatment, are described in Secs. 4.1–4.4. The specific details associated with the restraints used in the simulations are given in Sec. 4.5. Additional details for the string MFEP simulations and FM in redundant internal coordinates are provided in Secs. 4.6–4.7.

4.1. Description of the solute model

The topologies for the solute molecules NH₃ and CH₃Cl were built based on similar residues available in the standard CHARMM topology files. Specifically, the atom types NH3, HC, CT3, HA, and CLA are used for the nitrogen, hydrogens in NH₃, carbon, hydrogens in CH₃, and chlorine, respectively. With these atom types specified, the van der Waals (vdW) parameters were assigned based on the standard CHARMM22 force field⁷⁸ during the initial setup of the system. In the actual simulations, the values of these parameters, required for computing the nonbonded QM/MM interactions between the solute and solvent atoms, were replaced by their pair-specific version tailored for the Menshutkin reaction (see below).

4.2. Potential energy calculations

For the SE/MM simulations of the Menshutkin reaction in water, the solute molecules consisting of NH₃ and CH₃Cl are treated by the SE method AM1,⁷⁹ whereas the solvent molecules are treated by MM using the modified TIP3P model.⁸⁰ The MNDO97 package⁸¹ incorporated into the CHARMM program⁸² (version c42a2) was used for the AM1/MM calculations. The related AI/MM calculations were conducted using the Q-Chem package⁸³ (version 4.0.1) interfaced with CHARMM; the specific combinations of AI methods and basis sets used are discussed in Sec. 5. For QM/MM vdW interactions, we adopted the pair-specific vdW parameters previously optimized by Gao et al. for the Menshutkin reaction,⁴⁹ which were implemented in our simulations by using the NBFIX facility in CHARMM [see Supporting Information Sec. 1 (SI.1)].

4.3. Definition of the collective variables

Let N and Cl represent the nitrogen and chlorine atoms in the NH₃ and CH₃Cl groups, respectively, and let C represent the carbon atom in the transferred methyl group. To describe the free energy path for the Menshutkin reaction (NH₃ + CH₃Cl → CH₃NH₃⁺ + Cl^-), we use the bond-breaking distance (C-Cl) and the bond-forming distance (N-C) as the two CVs in the string MFEP simulations; the same CVs are also used consistently in FM.
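As a minimal illustration of the two CVs defined above, the sketch below evaluates the N-C and C-Cl distances from Cartesian coordinates. The function name `bond_cvs` and the collinear test geometry are hypothetical and serve only to show the definition; no minimum-image correction is applied, on the assumption that the three solute atoms stay close together near the box center.

```python
import numpy as np

def bond_cvs(xyz_n, xyz_c, xyz_cl):
    """Return (r_NC, r_CCl) in the length unit of the input coordinates."""
    xyz_n, xyz_c, xyz_cl = (np.asarray(v, dtype=float) for v in (xyz_n, xyz_c, xyz_cl))
    r_nc = np.linalg.norm(xyz_c - xyz_n)    # bond-forming distance (N-C)
    r_ccl = np.linalg.norm(xyz_cl - xyz_c)  # bond-breaking distance (C-Cl)
    return r_nc, r_ccl

# Hypothetical collinear N...C...Cl arrangement (Angstrom), for illustration only
r_nc, r_ccl = bond_cvs([0.0, 0.0, -2.0], [0.0, 0.0, 0.0], [0.0, 0.0, 2.1])
print(r_nc, r_ccl)
```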

4.4. Boundary conditions and treatment of long-range electrostatics

In the original and subsequent force-corrected SE/MM simulations, a 40×40×40 Å³ cubic box of modified TIP3P⁸⁰ water molecules is used to solvate the reactive solute system. The SHAKE algorithm⁸⁴ is used to constrain the internal geometries of water during the MD simulations. In all cases, we adopted periodic boundary conditions in the simulations. In the SE/MM simulations, long-range electrostatics for MM/MM and QM/MM interactions are treated by the particle mesh Ewald (PME)⁸⁵ and QM/MM-PME^86–87 methods, respectively. In both PME treatments, the κ parameter that represents the width of the Gaussian screening charge distributions is set to 0.34 Å^-1, and the reciprocal space summations are performed on a 40 × 40 × 40 FFT grid, with up to 5 k-vectors included in each Cartesian direction. For the real-space contribution of QM/MM-PME electrostatics, a switching function available in CHARMM is applied from 12 to 13 Å to smoothly attenuate the real-space QM/MM electrostatic interactions at a cutoff of 13 Å.
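The text does not specify which switching function is used. As a sketch only, the block below assumes the commonly used atom-based energy switch, S(r) = (r_off² − r²)²(r_off² + 2r² − 3r_on²)/(r_off² − r_on²)³ for r_on < r < r_off, with r_on = 12 Å and r_off = 13 Å taken from the text. The function name `switch` is hypothetical.

```python
import numpy as np

def switch(r, r_on=12.0, r_off=13.0):
    """Assumed energy-switching factor: 1 below r_on, 0 above r_off, smooth in between."""
    r = np.asarray(r, dtype=float)
    num = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    den = (r_off**2 - r_on**2) ** 3
    return np.where(r <= r_on, 1.0, np.where(r >= r_off, 0.0, num / den))

# Scaling factor applied to the real-space interaction at a few distances (Angstrom)
r_grid = np.linspace(11.5, 13.5, 9)
s_grid = switch(r_grid)
print(np.column_stack([r_grid, s_grid]))
```

The factor multiplies the real-space pair interaction, so the interaction is unmodified inside 12 Å and vanishes at 13 Å.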

4.5. Restraints and MD simulations

Colinear-like and C_3v geometry is imposed on the solute complex during all QM/MM MD simulations to reduce the system’s distortion and prevent it from visiting irrelevant configurations that may slow down the MFEP convergence. A similar treatment was used by Truong and co-workers in their implicit solvation calculations,⁵³ where they showed that the free energy barriers obtained for the Menshutkin reaction are not impacted significantly when applying these geometric constraints. Specifically, to impose the C_3v symmetry in CHARMM, we used three relative-distance (RESD) restraints to keep the three H’s in ammonia at the same distance from the central C. To prevent any potential drift of the solute toward the edge of the simulation box (commonly observed for small solutes), we also placed a one-sided quartic spherical repulsive potential at 8 Å away from the box center, using the MMFP facility in CHARMM. The QM/MM MD simulations were carried out under constant-pressure and constant-temperature conditions at 1 atm and 298.15 K.

4.6. String MFEP simulations

For the string MFEP simulations in CVs, we adopted a protocol we recently used for simulating the QM/MM free energy profiles of adenosine 5’-triphosphate (ATP) hydrolysis in the ATP-binding cassette (ABC) transporter HlyB.²⁴ Here, we only provide a brief description of the simulation procedure; the detailed simulation parameters can be found in our published work.²⁴ The MFEP, represented by the two bond CVs described in Sec. 4.3, is discretized into 25 images of the system, whose initial coordinates were obtained from a QM/MM potential energy scan along the reaction coordinate. For each iteration of the string MFEP optimization, the free energy mean force on each CV was estimated from the CV’s fluctuation during 20 ps QM/MM MD simulations in which the CVs are harmonically anchored at their previous path values using a uniform force constant of 1000 kcal/mol/Å². For the projection, reparametrization, and evolution of the MFEP, as well as the integration of the free energy profiles, see our previous implementation²⁴ based on the original string method developed by Maragliano et al.²³
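The mean-force estimate from restrained sampling can be sketched as follows. We assume here the standard restrained-ensemble estimator for a harmonic restraint (k/2)(θ − z)², in which the free energy gradient at the anchor z is approximated by k(z − ⟨θ⟩); the exact estimator used in the production simulations is the one described in ref 24. The function name and the two sample frames are hypothetical; only the force constant of 1000 kcal/mol/Å² comes from the text.

```python
import numpy as np

def free_energy_gradient(cv_samples, z_anchor, k=1000.0):
    """Approximate dF/dz (kcal/mol/A) for each CV as k * (z - <theta>).

    cv_samples: (n_frames, n_cvs) CV values sampled under the harmonic restraint
    z_anchor:   (n_cvs,) restraint centers, i.e., the current path values
    """
    cv_samples = np.asarray(cv_samples, dtype=float)
    z_anchor = np.asarray(z_anchor, dtype=float)
    return k * (z_anchor - cv_samples.mean(axis=0))

# Hypothetical two-frame sample of (N-C, C-Cl) for one image, in Angstrom
samples = [[2.01, 2.19], [2.03, 2.21]]
grad = free_energy_gradient(samples, z_anchor=[2.00, 2.20])
print(grad)
```

With a stiff restraint, a small displacement of the sampled average from the anchor translates into a sizable mean force, which is why the CV fluctuations alone are sufficient input for each string iteration.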

4.7. Force matching in redundant internal coordinates

The generalized inverse of the G-matrix is formed by following the procedure outlined by Pulay and co-workers,⁴¹ where a threshold of 0.02 is used to identify the negligible eigenvalues that are related to redundancy in the coordinate system (see Appendix B). The singular value decomposition (SVD) method, adapted from the Numerical Recipes in Fortran77,⁴⁵ is used as a default solver in the spline-based FM on CVs (see Appendix A), where a maximum eigenvalue scaled by 10^-6 is employed as a cutoff for removal of linear dependency when solving the overdetermined system under Eq. (A15) [see SI.2 for a concrete example of casting Eq. (A15) in its matrix form]. For each of the bond CVs, the internal force correction needed for matching the SE/MM and AI/MM forces is fitted into a spline function with a grid interval of 0.2 Å. The CHARMM code in its c42a2 version was modified to incorporate the force correction terms on the selected CVs in internal coordinates by distributing the internal force corrections to the related Cartesian force components with the chain rule.
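The chain-rule step mentioned above can be illustrated for a single bond-distance CV. This is a minimal sketch, not the modified CHARMM code: for r = |x_i − x_j|, a scalar internal force f along r maps to the Cartesian force f·∂r/∂x_i on atom i and the opposite force on atom j. The function name and the test geometry are hypothetical.

```python
import numpy as np

def distribute_bond_force(xyz_i, xyz_j, f_internal):
    """Map a scalar force along the i-j distance onto Cartesian forces on i and j.

    f_internal is the force conjugate to r (positive values push the atoms apart).
    """
    d = np.asarray(xyz_i, dtype=float) - np.asarray(xyz_j, dtype=float)
    r = np.linalg.norm(d)
    dr_dxi = d / r                      # dr/dx_i; dr/dx_j is its negative
    return f_internal * dr_dxi, -f_internal * dr_dxi

# Hypothetical example: C at the origin, Cl at 2.1 A along z, correction of 5 kcal/mol/A
f_cl, f_c = distribute_bond_force([0.0, 0.0, 2.1], [0.0, 0.0, 0.0], 5.0)
print(f_cl, f_c)
```

Because the two Cartesian contributions are equal and opposite, a correction applied this way adds no net force on the solute.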

5. Results and Discussion

As Gao found in the early 1990s, due to charge separation in the Menshutkin reaction during its product formation, the presence of a polar solvent generates a tremendous amount of solvation free energy that stabilizes the products as well as the transition state, thereby lowering the free energy of activation compared with the gas phase.⁴⁸ As a result, although the reaction is endothermic in the gas phase, it becomes highly exergonic in aqueous solutions. However, it is known that the semiempirical AM1/MM method is unreliable in simulating this reaction, which gives a reaction free energy significantly higher than experiment;⁴⁹ this is possibly related to the inaccurate description of the chloride anion using a minimal basis set in the AM1 Hamiltonian.

To obtain the first-principles free energy profiles, we use RP-FM-CV to correct the AM1/MM forces to their AI/MM target values. Specifically, the AI levels for computing the target forces include two DFT methods, namely, B3LYP^88–90 and BH&HLYP,^{53, 89, 91} and the second-order Møller-Plesset perturbation (MP2)^92–93 method, with 6–31+G(d,p)⁹⁴ as the default basis set. For brevity, we refer to the RP-FM-CV simulation methods using this default basis set as AI:SE/MM; for other basis sets or when multiple basis sets are compared explicitly, the more specific label AI/BasisSet:SE/MM is used instead for clarity. Force matching in each case was done based on 300 solution-phase configurations using a set of 28 redundant internal coordinates as a default. Unless stated otherwise, the results presented in this section are from the simulations using this default RP-FM-CV protocol. The convergence tests of the free energy results with respect to the redundant internal coordinate sets, sample sizes, and basis sets can be found in Secs. 5.4–5.6. Due to the rapid convergence of the overall procedure when performing the method iteratively (see Sec. 5.8 for details), the results after a single cycle of RP-FM-CV are reported by default.

We first examine the ability of RP-FM-CV to converge a wide variety of target AI/MM results, including the free energy profiles, internal forces on the CVs, free energy pathways, and transition-state locations. Note that due to the daunting costs of obtaining the AI/MM benchmark free energy profiles simulated under the same sampling requirement, here we focus on cross-validating the RP-FM-CV method under various AI and FM conditions; a separate validation of the RP-FM-CV method against the directly obtained AI/MM benchmark results, both using a shorter sampling time, can be found in SI.8. As an additional validation, the relevant AI/MM free energy results and transition-state geometries available in the literature are also compiled in Table 1 for comparison with our results.

Table 1.

Free energy barriers (ΔG^‡), reaction free energies (ΔG_r), and transition-state geometries (Å) for the Menshutkin reaction between NH₃ and CH₃Cl in water

Method	ΔG^‡ (kcal/mol)	ΔG_r (kcal/mol)	N-C (Å)	C-Cl (Å)	Ref.

MP2(fc)/6–31+G(d,p)/AM1/TIP3P (UIC)	23.8	−31.1	2.24	1.99	57
MP2(fc)/6–31+G(d,p)/AM1/TIP3P (PIC)	19.1	−27.4	2.22	2.01	57
MP2/6–31+G(d,p)/MM benchmark	27.6	−16.9			66
HF/6–31G(d)/MM benchmark	21.2	−25.9			31
HF/6–31G(d):DFTB/MM ML int. force corr.	20.8	−23.8			31
B3LYP/6–31G(d)/MM benchmark	15.3	~ −28			26
B3LYP/6–31G(d):PM3/MM FM+wTP	15.7	~ −28			26

AM1/TIP3P (with QM/MM-cutoff)	29.3	−10.4	2.10	2.10	57
AM1/TIP3P (with QM/MM-cutoff)	26.3	−18.0	1.96	2.09	49
AM1/TIP3P (with QM/MM-Ewald; string)	30.9	−10.6	1.970	2.129	this work

HF/6–31G(d):AM1/MM	18.3	−33.0	2.270	2.248	this work
B3LYP:AM1/MM^a	14.7	−27.2	2.213	2.194	this work
BH&HLYP:AM1/MM^a	17.8	−28.0	2.187	2.222	this work
MP2/6–31G(d):AM1/MM	19.1	−28.7	2.171	2.196	this work
MP2:AM1/MM^a	21.3	−26.0	2.170	2.193	this work
MP2/6–311++G(d,p):AM1/MM	22.2	−23.9	2.129	2.209	this work
MP2/6–311++G(2df,2p):AM1/MM	19.6	−24.6	2.145	2.168	this work

B3LYP/6–31G(d)/MM benchmark^b	18.8	−29.4	2.295	2.133	this work

Experiment	23.5	−34 ± 10			48
		−36 ± 6			64


^a RP-FM-CV simulations with the target AI/MM forces evaluated at the default basis set 6–31+G(d,p)

^b DFT/MM-cutoff simulations with 30 iterations of string MFEP optimization; 1 ps sampling is used in each iteration for mean force evaluation (see SI.8).

5.1. Free energy profiles

For the present work, the base method we used in the RP-FM-CV simulations is AM1/MM, which by itself generates large errors in free energy. As shown in Figure 3, the AM1/MM simulations predict a free energy barrier of 30.9 kcal/mol for the Menshutkin reaction, which is 7.4 kcal/mol higher than the experimental value of 23.5 kcal/mol.⁴⁸ The reaction free energy for the Menshutkin reaction obtained from our AM1/MM simulations is −10.6 kcal/mol, which corresponds to an overestimate of 23.4 or 25.4 kcal/mol compared with the experimental value of −34 ± 10 kcal/mol⁴⁸ or −36 ± 6 kcal/mol,⁶⁴ both established from the gas-phase thermodynamics and free energy of hydration data. Our free energy barrier is consistent with the AM1/MM simulations previously performed by Gao & Xia⁴⁹ and by Ruiz-Pernia et al.⁵⁷ using electrostatic cutoff, who obtained slightly lower free energy barriers of 26.3⁴⁹ and 29.3⁵⁷ kcal/mol, respectively (see also Table 1). For free energy of reaction, our result is also qualitatively comparable to the previous results of −18.0⁴⁹ and −10.4 kcal/mol.⁵⁷
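The deviations quoted above follow directly from the tabulated values; the short check below reproduces the arithmetic. The variable names are ours, and the numbers are those given in the text and Table 1.

```python
# Values quoted in the text (kcal/mol)
barrier_am1_mm = 30.9
barrier_expt = 23.5
dg_rxn_am1_mm = -10.6
dg_rxn_expt = {"ref 48": -34.0, "ref 64": -36.0}

# Positive numbers mean AM1/MM lies above the experimental estimate
barrier_error = round(barrier_am1_mm - barrier_expt, 1)
rxn_errors = {ref: round(dg_rxn_am1_mm - val, 1) for ref, val in dg_rxn_expt.items()}
print(barrier_error, rxn_errors)
```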

Figure 3. — Free energy profiles along the string MFEPs (with α=0 being the reactant and 1 being the product) from the AM1/MM (dashed red) and RP-FM-CV simulations of the Menshutkin reaction in aqueous solution. Results from the RP-FM-CV AI:AM1/MM simulations were obtained by matching AM1/MM forces on the CVs to various target AI/MM levels using the default 6–31+G(d,p) basis set: B3LYP:AM1/MM (dotted black), BH&HLYP:AM1/MM (green with circles), and MP2:AM1/MM (solid blue).

All three RP-FM-CV-based AI:AM1/MM methods improve the reaction free energy toward the experimental value. While improving the reaction free energy, the two DFT:AM1/MM methods lower the free energy barriers compared with the AM1/MM method. Specifically, the RP-FM-CV simulations that fit forces to the B3LYP/MM level yield a free energy barrier height of 14.7 kcal/mol, and the RP-FM-CV simulations at the BH&HLYP:AM1/MM level give a free energy barrier of 17.8 kcal/mol. These results are in line with the literature observation that these DFT methods tend to underestimate the barrier height for this system. For example, based on the GCOSMO continuum solvation calculations, Truong et al.⁵³ showed that B3LYP underestimates the barrier height for the Menshutkin reaction, whereas BH&HLYP (with 50% HF exchange) produces much better results comparable to the data obtained at the highly correlated MP4(SDTQ) level. This trend is successfully reproduced in all of our all-atom explicit-solvent RP-FM-CV simulations, where the highest-level AI:AM1/MM simulations that match forces to the MP2/6–31+G(d,p)/MM level strike a good balance between the free energy barrier and reaction free energy predictions. While improving the reaction free energy to −26.0 kcal/mol, compared with the experimental value of −34 ± 10 kcal/mol, our MP2:AM1/MM simulations maintain a free energy of activation at 21.3 kcal/mol, in close agreement with the experimental value of 23.5 kcal/mol established by Gao⁴⁸ on the basis of the NH₃ + CH₃I reaction.^95–96

Similarly, compared with the literature data shown in Table 1, our MP2:AM1/MM results are in great agreement with those of Ruiz-Pernia et al. using the perturbed interpolated potential energy correction QM/MM method at a similar dual-level of MP2(fc)/6–31+G(d,p)/AM1/TIP3P (PIC), which yields a free energy barrier and reaction free energy of 19.1 and −27.4 kcal/mol, respectively.⁵⁷

5.2. Force correlations

The effectiveness of the RP-FM-CV method is also monitored by comparing the internal forces on the CVs between the base and target levels. Specifically, the internal forces obtained for the two bond-based CVs are compared between AM1/MM and the AI/MM levels that use B3LYP/6–31+G(d,p) and MP2/6–31+G(d,p). In Figure 4, we show the internal force correlations for the levels involved before and after force matching. In the same figure, the average internal force deviations (ΔF) between the base and target levels are also given for comparison. As we can see from Figure 4, based on the average of 300 configurations, the internal forces on the CVs computed at the original AM1/MM level differ from their AI/MM targets by 12~32 kcal/mol/Å, depending on the bonds and the target levels [vs. B3LYP/6–31+G(d,p)/MM: 16.5 (N-C) and 32.2 (C-Cl) kcal/mol/Å; vs. MP2/6–31+G(d,p)/MM: 12.4 (N-C) and 23.4 (C-Cl) kcal/mol/Å]; after force matching, the average force differences between RP-FM-CV and the target levels are greatly reduced to only 2.0~2.1 kcal/mol/Å for both bonds. These results demonstrate that the FM component in the RP-FM-CV method works effectively. As the AI/MM forces on the CVs are faithfully reproduced after FM, the force correlation results presented here also help us rationalize the improvements that we see in Sec. 5.1 on the free energy profiles.

Figure 4. — Internal force correlations between the base AM1/MM and target AI/MM methods [at the B3LYP/MM and MP2/MM levels both using the 6–31+G(d,p) basis set]: before (red squares) and after (blue circles) applying the RP-FM-CV internal force corrections; the corresponding trend lines are shown as dashed and solid lines. Internal forces on the two bond CVs were computed based on 300 configurations sampled along the condensed-phase MFEP from the AM1/MM string simulations. The average internal force deviations (ΔF; in kcal/mol/Å) between the base and target levels, before (red) and after (blue) force matching, are also shown for comparison.

5.3. Internal force corrections on CVs along the MFEP

Although RP-FM-CV delivers great numerical agreement on the CV forces between the SE/MM and AI/MM levels based on the sampled configurations, one important question that remains is whether the spline-based correcting forces are well-behaved and smooth functions of the reaction coordinate along the MFEP. These properties of the force correction terms are highly desirable for numerically stable dynamics when the modified forces are plugged back into the MD simulations for obtaining the updated free energy profiles.

To demonstrate the smoothness of the spline-based internal force correction terms, in Figure 5 we plot the force deviations between the SE/MM and AI/MM levels as well as their FM-optimized spline fits for both CVs along their bond distances. One can see that the spline functions nicely fit the averages of the individual force deviations sampled along the MFEP. We noticed that the distributions of the CV force deviations (i.e., the desired force corrections for accomplishing a perfect FM between the two levels) are indeed quite smooth, which justifies the use of spline functions in fitting these corrections. In spite of the deceptive smoothness of the fits, our spline-corrected internal forces successfully reproduce their instantaneous AI/MM internal force targets with small errors of 2.0~2.1 kcal/mol/Å (see Sec. 5.2 and Figure 4), thereby capturing the detailed internal force fluctuations at the target levels. To this end, the spline-based force correction scheme serves the designed purpose of RP-FM-CV well, which, as we have discussed in Secs. 2.1–2.2, is to fit free energy mean force through matching instantaneous forces for individual configurations in an ensemble.
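The idea of fitting scattered force deviations to a smooth function of a bond CV can be sketched with a linear least-squares problem. The block below is a simplified stand-in, not the implementation of Appendix A: it assumes a piecewise-linear (hat-function) basis on a 0.2 Å grid instead of the spline form used in this work, uses synthetic data, and relies on `numpy.linalg.lstsq`, whose `rcond` argument discards singular values below a fraction of the largest one (set to 10^-6 here, in the spirit of the cutoff quoted in Sec. 4.7). All function names are hypothetical.

```python
import numpy as np

def hat_basis(r, knots):
    """Design matrix of piecewise-linear basis functions on a uniform knot grid."""
    r = np.asarray(r, dtype=float)
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(r[:, None] - knots[None, :]) / h, 0.0, None)

def fit_force_correction(r, delta_f, knots, rel_cutoff=1e-6):
    """Least-squares knot values; small singular values are dropped via rcond."""
    coeff, *_ = np.linalg.lstsq(hat_basis(r, knots), delta_f, rcond=rel_cutoff)
    return coeff

# Synthetic data: a smooth trend plus configuration-to-configuration scatter
rng = np.random.default_rng(0)
knots = np.linspace(1.4, 3.0, 9)                    # 0.2 A grid interval
r = rng.uniform(1.5, 2.9, size=300)                 # sampled bond distances (A)
delta_f = 20.0 * np.sin(2.0 * (r - 1.4)) + rng.normal(0.0, 2.0, size=r.size)

coeff = fit_force_correction(r, delta_f, knots)
fitted = hat_basis(r, knots) @ coeff
rms_residual = np.sqrt(np.mean((fitted - delta_f) ** 2))
print(coeff.round(2), rms_residual)
```

The fitted curve follows the average of the scattered deviations at each distance, and the residual is dominated by the scatter that a function of the CV alone cannot describe.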

Finally, the smoothness of the spline-based force corrections also indicates their numerical stability when incorporated in the SE/MM force calculations for FM-corrected MD trajectories, with which the free energy profiles and pathways can be updated in a robust way.

5.4. Tests of different sets of redundant internal coordinates

In our formulation of the RP-FM-CV method, computation of the forces on the CVs is based on the force transformation from the Cartesian to a set of redundant internal coordinates, for which the definition is not unique. To test the robustness of the algorithm with respect to the choice of the internal coordinate system, we examined three different sets of redundant internal coordinates for the Menshutkin reaction. As shown in Figure 6, the first redundant internal coordinate set (also the default set), denoted “Int28”, includes 8 bonds, 15 angles, 1 doubly-degenerate linear bend, and 3 torsions. The other two redundant sets, denoted “Int31” and “Int34”, are constructed by adding 3 and 6 more torsions, respectively (see also Figure 6). All three sets include the N-C and C-Cl bonds that define a common set of CVs in both the string MFEP and FM simulations.
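The sizes of the three coordinate sets can be checked by simple counting, with the doubly-degenerate linear bend contributing two components. The comparison with 3N − 6 for the nine-atom solute is our addition, included only to make the redundancy explicit.

```python
# Components of the default set as listed in the text
int28_parts = {"bonds": 8, "angles": 15, "linear_bend_components": 2, "torsions": 3}
int28 = sum(int28_parts.values())
int31 = int28 + 3          # three additional torsions
int34 = int28 + 6          # six additional torsions

n_atoms = 9                # NH3 (4 atoms) + CH3Cl (5 atoms)
n_nonredundant = 3 * n_atoms - 6
print(int28, int31, int34, n_nonredundant)
```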

Figure 6. — Various redundant internal coordinate schemes tested for the RP-FM-CV simulations of the Menshutkin reaction in solution. The two distance-based CVs for bond forming and breaking, i.e., N-C (1–5) and C-Cl (5–9) (shown in red in the “Bonds” section), are used consistently in both the string MFEP simulations and the FM calculations.

In Figure 7, we compare the RP-FM-CV free energy profiles obtained at the MP2:AM1/MM level when the internal CV forces and FM are determined and conducted using the three different redundant internal coordinate systems described above. Our results show that based on the redundant internal coordinate transformation, the internal forces on the CVs, in this case the bonds being broken (C-Cl) and formed (N-C), only vary marginally when using different redundant sets. On average (over 300 configurations), the internal forces using the Int31 and Int34 sets only differ from those using the Int28 set by less than 0.01 kcal/mol/Å at the MP2:AM1/MM level. Consequently, the free energy profiles resulting from the RP-FM-CV simulations using the three internal coordinate sets are almost identical. These results demonstrate the robustness of the RP-FM-CV method in converging the free energy results when conducting FM in different redundant internal coordinate systems. The invariance of the free energy profiles with respect to the three redundant internal coordinates tested also indicates that under the Int28 default set the coordinate system is already complete.

5.5. Tests of number of configurations included in FM

One advantage of the RP-FM(-CV) approach is that once the configurations sampled at the efficient SE/MM level are collected, the computationally expensive single-point AI/MM force calculations can be conducted in an “embarrassingly parallel” manner. As long as one has access to enough central processing units (CPUs), the wall time for computing the target AI/MM forces does not grow with the number of configurations used. In practice, however, the free energy results from the RP-FM-CV simulations may vary with the number of configurations included in FM. In our default simulation scheme, we conducted FM based on 300 solution-phase configurations taken from 25 images each sampled along the string MFEP over a period of 60 ps. To test how sensitive the free energy results are to the sample sizes in fitting the internal CV forces, we repeated the RP-FM-CV simulations at the MP2:AM1/MM level with three additional FM schemes, in which 1500, 3000, and 15000 configurations are used respectively. The resulting free energy profiles using different sample sizes for FM are compared in Figure 8.
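Because each single-point force evaluation depends only on its own configuration, the jobs can be dispatched independently. The sketch below illustrates this pattern with a thread pool and a placeholder function; `target_forces` is hypothetical and stands in for launching one AI/MM calculation, and the even split of 12 configurations per image is inferred from 300 configurations over 25 images.

```python
from concurrent.futures import ThreadPoolExecutor

N_IMAGES = 25
N_CONFIGS = 300
configs_per_image = N_CONFIGS // N_IMAGES   # 12, assuming an even split

def target_forces(config_id):
    """Placeholder for one independent single-point AI/MM force job."""
    image, frame = divmod(config_id, configs_per_image)
    return {"image": image, "frame": frame, "forces": None}  # forces filled by the real job

# Jobs share no state, so they can run concurrently in any order
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(target_forces, range(N_CONFIGS)))
print(len(results))
```

On a cluster, the same pattern would more naturally map onto one queue job per configuration, which is the sense in which the wall time is independent of the sample size.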

The results in Figure 8 show that the free energy profiles computed at the MP2:AM1/MM level converge well with respect to the FM sample sizes. The free energy profiles essentially overlap with one another even when the number of configurations for FM varies by 50-fold from 300 to 15000. These results once again demonstrate the robustness and statistical reliability of the RP-FM-CV method.

While our tests on the Menshutkin reaction suggest a good convergence of the free energy profile using a small to medium FM sample size, the free energy convergence for more complex systems with large dynamical fluctuations could be more challenging. For those systems, greater numbers of FM configurations drawn from long simulations may be required especially when slow non-CV degrees of freedom are present.

5.6. Tests of basis-set convergence

Because of the computational costs associated with a great number of sequential potential energy calculations for configurational sampling, free energy simulations at AI/MM levels are often limited to single-determinant electronic-structure AI methods such as HF and hybrid DFT, whose N⁴ scaling behavior (with N being the number of basis functions) allows them to be used in combination with relatively small double-zeta basis sets. The use of larger basis sets at and beyond these levels would dramatically increase the computational costs and therefore is rarely seen in practical AI/MM free energy simulations. Therefore, having an affordable strategy that allows AI/MM free energy simulations to be used with large-sized basis sets would greatly ease some of the concerns regarding the otherwise unknown basis-size convergence behavior of the simulations.

In RP-FM-CV, because FM is decoupled from dynamical sampling and conducted separately in a parallel fashion, the AI/MM force calculations are no longer the computational bottleneck for the simulations and therefore can be done at post-HF correlated levels such as MP2 with large basis sets. This enables us to carry out FM at AI/MM levels using basis sets in various sizes, including the very large ones, to systematically check convergence of the free energy results in a way routinely done for gas-phase quantum chemistry calculations.

In Figure 9, we compare the RP-FM-CV free energy profiles for the Menshutkin reaction obtained at the MP2/BasisSet:AM1/MM level using the 6–31G(d),⁹⁴ 6–31+G(d,p),^{94, 97} 6–311++G(d,p),^{94, 97–98} and 6–311++G(2df,2p)⁹⁹ basis sets, which correspond to 61, 91, 116, and 170 basis functions, respectively. For the smallest basis set we tested, i.e., 6–31G(d), the free energy profile deviates notably from the other results. When the medium- to large-sized basis sets are used with extra split-valence, polarization, and diffuse functions added, the reaction free energies converge to a similar value. Specifically, the MP2/6–31G(d):AM1/MM simulation gives a reaction free energy of −28.7 kcal/mol, compared with −26.0, −23.9, and −24.6 kcal/mol when the basis set is upgraded to 6–31+G(d,p), 6–311++G(d,p), and 6–311++G(2df,2p), respectively (see Table 1). On the other hand, for the RP-FM-CV simulations in which the internal forces are fitted to the MP2/MM level with 6–31G(d), 6–31+G(d,p), 6–311++G(d,p), and 6–311++G(2df,2p) basis sets, the free energy barriers are 19.1, 21.3, 22.2, and 19.6 kcal/mol, respectively. With these data, we conclude that our RP-FM-CV simulations show good convergence with basis set.
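To make the basis-set comparison concrete, the snippet below tabulates the differences of each result from the largest basis set tested. The numbers are those of Table 1; the choice of 6–311++G(2df,2p) as the reference is ours and is made only for illustration.

```python
# (barrier, reaction free energy) in kcal/mol from Table 1, MP2/BasisSet:AM1/MM
results = {
    "6-31G(d)": (19.1, -28.7),
    "6-31+G(d,p)": (21.3, -26.0),
    "6-311++G(d,p)": (22.2, -23.9),
    "6-311++G(2df,2p)": (19.6, -24.6),
}
reference = "6-311++G(2df,2p)"
ref_barrier, ref_rxn = results[reference]

# Differences relative to the largest basis set, rounded to the reported precision
deviation = {
    name: (round(barrier - ref_barrier, 1), round(dg_rxn - ref_rxn, 1))
    for name, (barrier, dg_rxn) in results.items()
}
for name, (d_barrier, d_rxn) in deviation.items():
    print(f"{name:18s} {d_barrier:+5.1f} {d_rxn:+5.1f}")
```

In this comparison the reaction free energy from 6–31G(d) differs from the reference by about 4 kcal/mol, whereas the basis sets with diffuse functions stay within about 1.5 kcal/mol of it.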

It is worth noting that for the largest basis set we tested, direct QM/MM free energy simulations at the MP2/6–311++G(2df,2p)/MM level are out of reach but made possible by the RP-FM-CV method. Agreements among the results using the 6–31+G(d,p) basis set and beyond also suggest that AI/MM free energy simulations in condensed phases likely display a similar convergence behavior seen in gas-phase systems, as long as the basis sets used are sufficiently large. For obtaining reasonably converged results, we recommend inclusion of diffuse and polarization functions in any attempt of AI/MM free energy simulations.

5.7. RP-FM-CV produces AI/MM-quality free energy paths

Above we showed that RP-FM-CV generates the AI/MM-quality free energy profiles for the Menshutkin reaction. The next question we seek to answer is whether RP-FM-CV can improve the free energy path to the target-level quality. Note that although RP-FM-CV is formulated to directly fit the free energy mean force (see Sec. 2), there is no a priori knowledge that the target-level free energy path would also be faithfully reproduced.

In Figure 10, we plot the MFEPs (in terms of the two bond CVs, i.e., the N-C and C-Cl bond distances) determined by the original AM1/MM simulations, as well as those obtained from the RP-FM-CV simulations at the B3LYP:AM1/MM, BH&HLYP:AM1/MM, and MP2:AM1/MM levels. The MFEP obtained at the original AM1/MM level differs from the FM-optimized ones in predicting a more “convex” path as a result of a much “tighter” TS, i.e., the sum of the N-C and C-Cl bond distances along the MFEP is significantly shorter than that produced at the various AI:AM1/MM levels. After the RP-FM-CV force corrections, the MFEPs obtained at all three AI:AM1/MM levels essentially converge to one another, which indicates that the free energy paths at the target AI/MM levels are also successfully reproduced.

The corresponding CV bond distances that characterize the location of a free energy TS (defined as the highest free energy point along the MFEP) are given in Table 1. For the free energy TS located on the MFEP, the original AM1/MM level gives an N-C bond distance of 1.970 Å, which is significantly shorter than the C-Cl bond of 2.129 Å in the same TS; this trend is in close agreement with the values of 1.96 Å (N-C) and 2.09 Å (C-Cl) reported by Gao and Xia⁴⁹ from their earlier AM1/TIP3P simulations. The dual-level AI/MM free energy simulations reported by Ruiz-Pernia et al.⁵⁷ suggest that the N-C bond is likely extended to ~2.2 Å in the TS when the PES is corrected to the MP2(fc)/6–31+G(d,p)/TIP3P level; a similar trend has been observed from various AI calculations using implicit solvent (e.g., see data compiled by Vilseck et al.⁶⁴). Our RP-FM-CV simulations at the B3LYP:AM1/MM and BH&HLYP:AM1/MM levels both successfully reproduce this feature, locating the free energy TS at 2.213 Å (N-C) & 2.194 Å (C-Cl), and at 2.187 Å (N-C) & 2.222 Å (C-Cl), respectively. Our MP2:AM1/MM simulations also converge the TS geometry toward the benchmark and literature results, by giving distances of 2.170 and 2.193 Å for the N-C and C-Cl bonds, respectively.

5.8. Convergence of the overall procedure

Due to the self-consistent nature of RP-FM-CV, cycles consisting of the RP and FM steps ideally need to be conducted iteratively until convergence of the free energy profile is established. Using the MP2:AM1/MM and B3LYP:AM1/MM methods, we examined the convergence behavior of the overall procedure by performing multiple cycles of RP-FM-CV. In the first cycle of the simulations, we conducted 10 iterations of string MFEP optimization at the AM1/MM level followed by FM to fit the CV forces to the target AI/MM levels. In each of the subsequent cycles, we updated the MFEPs by repeating the string simulations under AM1/MM forces in conjunction with the CV force corrections obtained from the previous RP-FM-CV cycle. Such cycles of MFEP optimization and FM in CVs are repeated five times.

From Table 2, we can see that throughout the five cycles of RP-FM-CV simulations the free energy barriers and reaction free energies for the Menshutkin reaction obtained at the MP2:AM1/MM level display small fluctuations of 0.6 and 0.7 kcal/mol about the corresponding average values of 20.6 and −25.3 kcal/mol, respectively, whereas the first cycle produces 21.3 and −26.0 kcal/mol for these free energy results. In terms of geometry, the N-C and C-Cl bond distances found at the free energy TS throughout the five cycles fluctuate closely about their average values of 2.164 ± 0.016 and 2.202 ± 0.014 Å, respectively, compared with the values of 2.170 and 2.193 Å obtained after the first cycle. For the Menshutkin reaction, we found that even one cycle of RP-FM-CV is sufficient to converge the free energy and TS geometry results reasonably well to the average values obtained after five cycles. The free energy profiles determined at the MP2:AM1/MM level from each of its five cycles are further compared in Figure 11, which shows that they overlap well with no systematic drift detected during the iterative applications of RP-FM-CV.
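The averages and fluctuations quoted for the reaction free energy and the TS bond distances can be reproduced from the per-cycle values listed in Table 2. The check below assumes that the quoted "±" is the sample standard deviation (n − 1 in the denominator); this convention is not stated in the text but appears consistent with the reported numbers.

```python
from statistics import mean, stdev

# Per-cycle values from Table 2 (MP2:AM1/MM, cycles 1-5)
dg_rxn = [-26.0, -25.1, -25.8, -25.6, -24.2]      # kcal/mol
r_nc = [2.170, 2.170, 2.174, 2.170, 2.135]        # Angstrom
r_ccl = [2.193, 2.196, 2.195, 2.199, 2.226]       # Angstrom

summary = {
    "dG_r": (round(mean(dg_rxn), 1), round(stdev(dg_rxn), 1)),
    "N-C": (round(mean(r_nc), 3), round(stdev(r_nc), 3)),
    "C-Cl": (round(mean(r_ccl), 3), round(stdev(r_ccl), 3)),
}
for label, (avg, sd) in summary.items():
    print(f"{label}: {avg} +/- {sd}")
```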

Table 2.

Computed free energy barriers (ΔG^‡), reaction free energies (ΔG_r), and transition-state geometries for the Menshutkin reaction between NH₃ and CH₃Cl in water over five cycles of RP-FM-CV simulations at the MP2:AM1/MM level

Cycle	ΔG^‡ (kcal/mol)	ΔG_r (kcal/mol)	N-C (Å)	C-Cl (Å)

1	21.3	−26.0	2.170	2.193
2	20.8	−25.1	2.170	2.196
3	19.9	−25.8	2.174	2.195
4	20.1	−25.6	2.170	2.199
5	20.6	−24.2	2.135	2.226
Average	20.6 ± 0.6	−25.3 ± 0.7	2.164 ± 0.016	2.202 ± 0.014


Figure 11. — Free energy profiles for the Menshutkin reaction in solution obtained from the RP-FM-CV simulations at the MP2:AM1/MM level using the 6–31+G(d,p) basis set over five consecutive RP and FM cycles: the first (dotted black), second (green with circles), third (solid blue), fourth (pink with crosses), and fifth (light blue with triangles) cycles, compared with AM1/MM (dashed red).

A similar convergence behavior is observed for the RP-FM-CV simulations at the B3LYP:AM1/MM level (see SI.6). Altogether, these results strongly suggest a rapid self-consistent convergence of the RP-FM-CV procedure for the Menshutkin reaction studied here, which justifies our use of a single RP-FM-CV cycle as a default.

5.9. Radial distribution functions

To understand how the force correction terms applied in RP-FM-CV simulations would impact the solvent-solute interactions, we computed the radial distribution functions (RDFs) for the selected solute-solvent atom pairs in the reactant (R), transition state (TS), and product (P) regions along the MFEP. Specifically, the N-O_w, C-O_w, and Cl-O_w RDFs involving the water oxygen (O_w) atoms were obtained at the MP2:AM1/MM and AM1/MM levels and are compared in Figure 12; a similar comparison made for the B3LYP:AM1/MM level can be found in SI.7.
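For readers unfamiliar with how such RDFs are accumulated, a minimal sketch is given below. It assumes a fixed cubic box with minimum-image distances and normalization by the ideal-gas count in each spherical shell; the analysis tools actually used for Figure 12 are not specified in the text, and the function name and the random test coordinates are hypothetical.

```python
import numpy as np

def rdf(ref_xyz, solvent_xyz, box, r_max=10.0, n_bins=100):
    """g(r) between one solute atom and a set of solvent atoms.

    ref_xyz:     (n_frames, 3) solute-atom positions
    solvent_xyz: (n_frames, n_solvent, 3) solvent-atom positions (e.g., water O)
    box:         cubic box edge length; r_max must not exceed box / 2
    """
    ref_xyz = np.asarray(ref_xyz, dtype=float)
    solvent_xyz = np.asarray(solvent_xyz, dtype=float)
    edges = np.linspace(0.0, r_max, n_bins + 1)
    d = solvent_xyz - ref_xyz[:, None, :]
    d -= box * np.round(d / box)                     # minimum-image convention
    dist = np.linalg.norm(d, axis=-1).ravel()
    counts, _ = np.histogram(dist, bins=edges)
    n_frames, n_solvent = solvent_xyz.shape[:2]
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = n_solvent / box**3
    g = counts / (n_frames * shell_vol * density)    # ratio to ideal-gas expectation
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, g

# Uncorrelated random positions should give g(r) close to 1 at all distances
rng = np.random.default_rng(1)
box = 40.0
solvent = rng.uniform(0.0, box, size=(50, 2000, 3))
solute = np.full((50, 3), box / 2.0)
centers, g = rdf(solute, solvent, box)
print(g[centers > 5.0].mean())
```

Applied to real trajectories, the height and position of the first peak of g(r) are the quantities compared between the R, TS, and P regions in the discussion that follows.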

Figure 12. — Solute-solvent radial distribution functions (RDFs) obtained from the RP-FM-CV simulations at the MP2:AM1/MM level using the 6–31+G(d,p) basis set, compared with the AM1/MM results. The RDFs for the solute (heavy atoms) and solvent (water oxygens: O_w) (i.e., N-O_w, C-O_w, and Cl-O_w) determined using an average of 3,600 configurations in each region are shown: reactant (R; dotted red), transition state (TS; solid green), and product (P; dashed blue).

The free energy barrier and reaction free energy we obtained from the AM1/MM simulations are 30.9 and −10.6 kcal/mol, which are lowered in the MP2:AM1/MM simulations to 21.3 and −26.0 kcal/mol, respectively (see also Table 1); this suggests that both the TS and the P state are more stabilized by FM than is the R state. If such stabilization involves any changes in solvation, impacts on the solvent structures would be observed in the related RDFs. In the AM1/MM results (Figure 12), the first solvation peaks of all three solute-solvent RDFs are found higher and shifted toward shorter distances when the system evolves from the R, through the TS, to the P region, which is in line with enhanced solvation upon forming the ionic products in the Menshutkin reaction. This feature is largely preserved in the MP2:AM1/MM results (Figure 12) after the CV force corrections are applied, which suggests that the physical description of solvation in the RP-FM-CV simulations is well retained.

On the other hand, FM seems to lead to a few quantitatively notable changes in RDFs. For example, the Cl-O_w RDF displays a lowered first peak in the TS region after the CV forces are corrected to the MP2:AM1/MM level, but no obvious changes in the peak height are found in the R and P regions; this observation suggests a less solvated TS and therefore a higher solvation barrier than without the FM corrections, which does not seem to directly contribute to the reduced free energy barrier seen in our MP2:AM1/MM simulations. Moreover, the first peak of the N-O_w RDF obtained from our AM1/MM simulations is found at 3.12 Å in the TS, which is 0.48 Å shorter than the corresponding location of 3.60 Å in the R state (Figure 12). After FM to the MP2/6–31+G(d,p)/MM level, the corresponding peak in the TS is moved to 3.36 Å, which becomes only 0.24 Å shorter than the peak location of 3.60 Å in the R state. This result also suggests that the enhanced solvation along the reaction coordinate that preferentially stabilizes the charge-separated TS over the charge-neutral R state is weakened after FM, which again would lead to a higher solvation barrier.

Based on the above data, the lowered overall free energy barrier after FM does not seem to correlate with a reduced solvation barrier; the improved free energy profile is more likely a result of the modified intramolecular forces in the solute than of changes in the solvent structures. This suggests that the free energy stabilization seen in the FM results is dominated by the force corrections on the CVs, as opposed to solvation itself. Note that in the present RP-FM-CV implementation, the CVs we used for correcting the internal forces only involve the solute coordinates; therefore, any changes in the solute-solvent interactions are likely caused indirectly by the solvent’s response to the modified solute charge distribution. Because RP-FM-CV does not modify SE-SRP parameters, any changes in solvation would be realized more through a modified solute geometry (e.g., a looser TS found after FM) than through an explicit alteration of the electronic-structure part of the SE/MM interaction Hamiltonian. To further improve the structural and dynamical descriptions of the solvent to the AI/MM levels, it is highly desirable to include solvent coordinates into the CVs for FM, which is a topic of our ongoing work.

6. Outlook

A common theme found in the recent developments of multilevel QM/MM free energy methods is to utilize high-level AI/MM information based on configurations efficiently sampled using low-level PES methods, such as SE/MM. Depending on how the high-level data are used, two related strategies, namely, energy matching and force matching, have emerged.

The most straightforward way of using the high-level information is to match the AI/MM total energy; this can often be done by fitting parameterized energy correction terms for the base level. Examples of the energy-matching-based methods include the interpolated PES correction approach using spline functions^49,70 and the recent work of fitting AI/MM energy by machine learning.^29–30 An obvious limitation of energy fitting is that even with the high-level energy data reproduced, there is little to no control on the improvement on atomic forces, which are essential for MD-based free energy simulations.

A different strategy of using the high-level information is to directly fit the AI/MM forces as the only target data,¹² which can be viewed as the reactive version of the more generalized force matching strategy.¹⁰⁰ Connected to Voth and co-workers’ pioneering work on the multistate-empirical valence bond (MS-EVB) method,¹⁰¹ the multilevel QM(/MM) methods under the reactive FM umbrella include the SRP-fitting-based RP-FM,¹² FM-DFTB,²⁸ machine learning-based internal force correction,³¹ and RP-FM-CV reported in this paper. The FM strategy is especially appealing for QM/MM free energy simulations that use MD as the sampling tool. From a dynamics perspective, the multiple time step (MTS) integration approach developed by Nam,⁸⁷ in which the SE/MM forces are directly corrected to their target AI/MM values at less frequent MD steps, can also be viewed as an FM QM/MM method with FM done on the fly.

As emphasized by us,¹² because force serves as the central quantity that encodes all the dynamical information of the system, FM would restore the detailed dynamics at the target level. Therefore, in its purest form, the FM QM/MM strategy fits forces as the only objective quantities without an explicit use of any energy information. As forces are based on the first derivatives of the potential energy, the FM strategy can sometimes be used in a hybrid form in combination with either energy matching or a construction of the potential energy function. For example, in our earlier implementation of RP-FM,¹² we fit SRPs for an SE/MM method to restore the AI/MM atomic forces; as a byproduct, the FM-optimized SRPs also lead to an explicit SE potential energy function that gives the target forces although we never include the target-level energy in the objective function during the SRP-fitting process. In our recent work,²⁶ we developed a hybrid strategy, where FM is first used to obtain SRPs to reproduce the target forces, on top of which weighted thermodynamics perturbation (wTP) utilizing the AI/MM energy data is further employed to restore the high-level free energy. In the FM-DFTB method developed by Kroonblawd et al.,²⁸ parametrized pairwise energy terms are used to represent the repulsive potential part in the DFTB Hamiltonian; the linear dependence of the associated forces on the parameters makes these pairwise energy terms well suited for FM in a linear optimization framework. In a few recent machine learning (ML)-assisted QM/MM approaches developed by Riniker and co-workers,³² by York and co-workers,³³ and by Shao and co-workers,³⁴ both energy and force matching are accomplished; some of these works are enabled by the deep-learning tools developed by E and co-workers,^102–104 or follow their strategy of folding both energy and the associated atomic forces into a combined loss function when optimizing the ML potentials. 
In all of these works, there are potential energy functions resulting from FM. When serving as a standalone objective, FM can be otherwise achieved without explicitly constructing the corresponding potential energy function. Examples of fitting forces without an explicit potential energy term include force corrections of Yang and co-workers³¹ and our RP-FM-CV.

A particular advantage of the RP-FM-CV method is dimension reduction in terms of fitting the CV forces along a one-dimensional free energy path. This choice makes our method more convenient than fitting a multidimensional potential energy correction term (e.g., the work of Ruiz-Pernia et al.⁷⁷), which requires knowledge on the couplings among multiple reaction coordinates to maintain the global correctness of the PES and therefore would quickly become unmanageable beyond two dimensions. We note that fitting AI(/MM) data in high dimensions can be handled by alternative strategies such as the pairwise energy correction scheme²⁸ and the more generalized ML approaches,^{29–34, 102–104} by which multiple reaction coordinates can be incorporated explicitly or through atom-centered local descriptors so that their couplings can be parametrically represented in the ML potentials.

Recently, Yang and co-workers also reported a force-based machine-learning QM/MM approach,³¹ where they obtained internal force corrections for DFTB/MM to match with the AI/MM results. Our work differs from theirs in the way the internal forces are defined. Yang and co-workers obtained their internal force expression with an aim to reproduce the MD trajectory integration step at the target AI/MM level. By contrast, our formalism directly aims at force matching. As a result, their “trajectory matching” formalism seems to involve additional mass factors compared with our “force matching” formalism (see SI.3–5 for details). For a special case of one-dimensional internal coordinate where a single bond is used as the only CV, the two formalisms conditionally converge to each other and to the projection operator formalism¹⁰⁵ (see SI.5). For more complex reactions such as the Menshutkin reaction, where multidimensional non-orthogonal CVs are involved, the two strategies lead to internal forces that differ both in definition and in numerical values (see SI.3–4).

Besides the definition of internal forces, which is the major distinction between our method and Wu et al.’s, RP-FM-CV is formulated in a different theoretical framework. Importantly, RP-FM-CV is framed in terms of fitting the free energy mean force, which builds a rigorous connection to fitting the high-level PMF. Interestingly, despite the very different definitions of internal forces, theoretical rationales, and technical details on how the force corrections are fitted (i.e., using spline functions vs. machine learning), the two approaches both seem to satisfactorily reproduce their corresponding AI/MM free energy results for the Menshutkin reaction (see Table 1); this suggests that some of the numerical differences are perhaps averaged out when the internal force corrections are fitted over ensembles of configurations.

With the powerful deep-learning tools available now for molecular systems,^102–104 combined energy and force matching is made possible to train ML models for AI/MM-quality free energy simulations.^33–34 In these deep-learning works, the Cartesian atomic forces are fitted through differentiating the rotational- and translational-invariant ML energy. The internal force framework used by RP-FM-CV may provide an alternative way for learning forces, as the internal forces by construction are invariant to rotation and translation. Moreover, due to the special role of reaction coordinate in chemical reactions, it is highly desirable for ML models to be able to selectively learn forces on the essential degrees of freedom; RP-FM-CV can complement ML to serve this purpose. To this end, our results suggest that it is very important to obtain the correct internal forces through proper coordinate transformation.

Other uses of the RP-FM-CV method can also be envisioned. As we discussed above, because the internal forces obtained from RP-FM-CV can serve as a vehicle for fitting differentiable potential energy functions, the method can be used, for example, to optimize the parameters in the empirical energy correction term represented by a simple valence bond (SVB) potential.¹⁰⁶ The method can also be combined with the MTS⁸⁷ approach to directly correct the internal CV forces on the fly.

7. Concluding Remarks

In summary, we have developed RP-FM-CV, an FM-based multilevel QM/MM method, for determining first-principles free energy profiles for chemical reactions in condensed phases. At a conceptual level, our RP-FM-CV method reproduces the highly accurate AI/MM free energy profile by fitting the corresponding mean forces on a set of CVs based on which a free energy pathway is consistently defined. Mean force fitting in our method is accomplished by matching the target forces acting on the CVs, obtained properly from the redundant internal coordinate transformation, for the condensed-phase configurations sampled at an efficient SE/MM level. Application of the RP-FM-CV method to the Menshutkin reaction demonstrates its remarkable capability in reducing the errors on the CV forces, which greatly improves the quality of the free energy pathway and free energy profile to a level comparable to the AI/MM benchmarks and experimental results. This development therefore offers a systematic and practical strategy for first-principles free energy simulations; it is our expectation that this method will find more applications in AI/MM mechanistic studies of complex chemical and biochemical reactions, for which chemical accuracy and statistically adequate free energy sampling would otherwise be seemingly infeasible to achieve at the same time.

Acknowledgements.

We thank Profs. Wei Yang, Feng Wang, Jiali Gao, and Greg Voth for helpful discussions. This work was supported by a start-up grant from Indiana University-Purdue University Indianapolis (IUPUI), a Summer Faculty Grant from the Purdue Research Foundation (PRF), a Research Support Funds Grant (RSFG) from IUPUI, and grants R15-GM116057 (JP) and R01-GM135392 (YS & JP) from the US National Institutes of Health (NIH). Computing time was provided by the School of Science at IUPUI and by the BigRed2 High Performance Computing facilities at Indiana University.

APPENDIX

Appendix A. Force matching in CVs using spline functions

For FM in CVs, using a formalism that mimics the one used by Izvekov et al.¹⁴ for Cartesian-based FM, we define the objective function $χ^{2}$ as:

χ^{2} = \frac{1}{L N} \sum_{l = 1}^{L} \sum_{i = 1}^{N} {| Δ F_{i l}^{Ref} - Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i}) |}^{2}

(A1)

where L denotes the number of sampled configurations for FM and N is the number of CVs for representing the MFEP; $Δ F_{i l}^{Ref}$ denotes the reference force correction needed for the internal force F on the i^th CV in the l^th configuration at the SE/MM level to match with the corresponding force at the target AI/MM level, i.e.,

Δ F_{i l}^{Ref} = F_{i l}^{AI/MM} - F_{i l}^{SE/MM}

(A2)

Plugging Eq. (A2) into Eq. (A1) and then setting the objective function $χ^{2}$ to zero leads to the force matching condition:

χ^{2} = \frac{1}{L N} \sum_{l = 1}^{L} \sum_{i = 1}^{N} {| F_{i l}^{AI/MM} - F_{i l}^{SE/MM} - Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i}) |}^{2} = 0

(A3)

In Eqs. (A1) and (A3), $Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i})$ denotes the corresponding parametrized force correction term that is to be determined numerically for matching the internal forces between the SE/MM and AI/MM levels, where $(g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i})$ denotes a set of m_i parameters for fitting the force correction term for the i^th CV. In the present work, we adopt a numerical treatment used by Voth and co-workers¹⁴ in force-matching optimization of classical force fields, where the correcting force on each CV is expressed as a cubic spline function along evenly distributed grid points. Specifically, for the i^th CV (of a bond-distance type) whose sampled values $r^{i}$ fall in the interval of $[r_{min}^{i}, r_{max}^{i}]$ , the corresponding spline function is defined as:

Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i}) = f (r^{i}, {r_{k}^{i}}, {f_{k}^{i}}, {f_{k}^{i''}}) = A (r^{i}, {r_{k}^{i}}) f_{j}^{i} + B (r^{i}, {r_{k}^{i}}) f_{j + 1}^{i} + C (r^{i}, {r_{k}^{i}}) f_{j}^{i''} + D (r^{i}, {r_{k}^{i}}) f_{j + 1}^{i''} (k = 1, 2, ..., n_{grid}^{i}; r^{i} \in [r_{j}^{i}, r_{j + 1}^{i}])

(A4)

where $r_{j}^{i}$ denotes the position of the j^th grid point over the radial mesh ${r_{k}^{i}}$ consisting of $n_{grid}^{i}$ grid points for the i^th CV:

r_{j}^{i} = r_{\min}^{i} + (r_{\max}^{i} - r_{\min}^{i}) / (n_{grid}^{i} - 1) \times (j - 1) (j = 1, 2, ..., n_{grid}^{i})

(A5)

and A, B, C, and D are derived quantities in cubic spline,⁴⁵ determined from the sampled CV value $r^{i}$ and its neighboring grid points, given that $r^{i} \in [r_{j}^{i}, r_{j + 1}^{i}]$ :

A = \frac{r_{j + 1}^{i} - r^{i}}{r_{j + 1}^{i} - r_{j}^{i}}

(A6a)

B = 1 - A = \frac{r^{i} - r_{j}^{i}}{r_{j + 1}^{i} - r_{j}^{i}}

(A6b)

C = \frac{1}{6} (A^{3} - A) {(r_{j + 1}^{i} - r_{j}^{i})}^{2}

(A6c)

D = \frac{1}{6} (B^{3} - B) {(r_{j + 1}^{i} - r_{j}^{i})}^{2}

(A6d)
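Eqs. (A4)–(A6) can be sketched in a few lines of code. The example here is only illustrative and is not the implementation used in this work: it assumes zero-based grid indices, NumPy arrays for the grid values $f$ and second-derivative parameters $f''$, and hypothetical function names (`make_grid`, `spline_weights`, `spline_value`).

```python
import numpy as np


def make_grid(r_min, r_max, n_grid):
    """Evenly spaced grid points r_1 ... r_n, as in Eq. (A5)."""
    return np.linspace(r_min, r_max, n_grid)


def spline_weights(r, grid):
    """Return (j, A, B, C, D) of Eqs. (A6a)-(A6d) for a sampled CV value r.

    j is the zero-based index of the left grid point, so that
    grid[j] <= r <= grid[j + 1].
    """
    grid = np.asarray(grid, dtype=float)
    if not (grid[0] <= r <= grid[-1]):
        raise ValueError("r lies outside the spline grid")
    # Locate the interval; clip so that r == grid[-1] uses the last interval.
    j = int(np.clip(np.searchsorted(grid, r, side="right") - 1, 0, len(grid) - 2))
    h = grid[j + 1] - grid[j]
    A = (grid[j + 1] - r) / h
    B = 1.0 - A
    C = (A**3 - A) * h**2 / 6.0
    D = (B**3 - B) * h**2 / 6.0
    return j, A, B, C, D


def spline_value(r, grid, f, f2):
    """Evaluate Eq. (A4) from grid values f and second-derivative parameters f2."""
    j, A, B, C, D = spline_weights(r, grid)
    return A * f[j] + B * f[j + 1] + C * f2[j] + D * f2[j + 1]
```

At a grid point the weights reduce to A = 1 and B = C = D = 0 (or B = 1 at the right end), so the spline returns the grid value itself; with all second-derivative parameters set to zero, Eq. (A4) reduces to linear interpolation between neighboring grid values.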

In Eq. (A4), $f_{j}^{i}$ and $f_{j}^{i''}$ denote the parametrized force correction and its second derivative parameter with respect to the i^th CV at the j^th grid point of the spline function, respectively. Note that here we label these spline functions as the italic f to follow the literature convention, and they should not be confused with the Cartesian atomic force vectors, which are labeled as bold non-italic $f$ in the text. As a result, the spline function described in Eq. (A4) contains altogether $m_{i} = 2 n_{grid}^{i}$ adjustable parameters that need to be solved for the i^th CV’s FM condition, i.e.,

Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i}) \equiv f (r^{i}, {r_{k}^{i}}, f_{1}^{i}, f_{1}^{i''}, f_{2}^{i}, f_{2}^{i''}, ..., f_{n_{grid}^{i}}^{i}, f_{n_{grid}^{i}}^{i''})

(A7a)

where

(g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i} = 2 n_{grid}^{i}}^{i}) \equiv (f_{1}^{i}, f_{1}^{i''}, f_{2}^{i}, f_{2}^{i''}, ..., f_{n_{grid}^{i}}^{i}, f_{n_{grid}^{i}}^{i''})

(A7b)

In FM, the numerical solution of Eq. (A3) is obtained at a stationary condition that minimizes the objective function $χ^{2}$ with respect to the parameters $g_{j}^{i} (j = 1, 2, ..., m_{i})$ :

\frac{d χ^{2}}{d g_{j}^{i}} = \frac{1}{L N} \sum_{l = 1}^{L} \sum_{i = 1}^{N} 2 [Δ F_{i l}^{Ref} - Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i})] [- \frac{\partial Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i})}{\partial g_{j}^{i}}] = 0

(A8)

Using a short-hand notation:

{(Δ F_{i l}^{P})}_{g_{j}^{i}}^{'} = \frac{\partial Δ F_{i l}^{P} (g_{1}^{i}, g_{2}^{i}, ..., g_{m_{i}}^{i})}{\partial g_{j}^{i}} (j = 1 , 2, ..., m_{i})

(A9)

as well as the specific functional form of $Δ F_{i l}^{P}$ defined in Eq. (A4), we have:

{(Δ F_{i l}^{P})}_{f_{j}^{i}}^{'} = {(Δ F_{i l}^{P})}_{g_{2 j - 1}^{i}}^{'} = A

(A10a)

{(Δ F_{i l}^{P})}_{f_{j + 1}^{i}}^{'} = {(Δ F_{i l}^{P})}_{g_{2 j + 1}^{i}}^{'} = B

(A10b)

{(Δ F_{i l}^{P})}_{f_{j}^{i''}}^{'} = {(Δ F_{i l}^{P})}_{g_{2 j}^{i}}^{'} = C

(A10c)

{(Δ F_{i l}^{P})}_{f_{j + 1}^{i''}}^{'} = {(Δ F_{i l}^{P})}_{g_{2 j + 2}^{i}}^{'} = D

(A10d)

After rearrangement, Eq. (A8) can be written as:

\sum_{l = 1}^{L} \sum_{i = 1}^{N} {(Δ F_{i l}^{P})}_{g_{j}^{i}}^{'} Δ F_{i l}^{P} = \sum_{l = 1}^{L} \sum_{i = 1}^{N} {(Δ F_{i l}^{P})}_{g_{j}^{i}}^{'} Δ F_{i l}^{Ref} (j = 1 , 2, ..., m_{i})

(A11)

Now consider a column vector $g$ as the union of all parameters for the N collective variables,

g = {g_{j}^{i}}^{T} (j = 1, 2, ..., m_{i}; i = 1, 2, ..., N)

(A12)

where the superscript “T” denotes a transpose; the dimension $M$ of the combined parameter vector $g$ is:

M = \sum_{i = 1}^{N} m_{i}

(A13)

Then Eq. (A8) can be written in a more compact matrix form:

{(Δ F^{P})}_{g}^{' T} Δ F^{P} = {(Δ F^{P})}_{g}^{' T} Δ F^{Ref}

(A14a)

With the following identity:

Δ F^{P} = {(Δ F^{P})}_{g}^{'} g

(A14b)

Eq. (A14a) can be written as:

{(Δ F^{P})}_{g}^{' T} [{(Δ F^{P})}_{g}^{'} g] = {(Δ F^{P})}_{g}^{' T} Δ F^{Ref}

(A14c)

This is equivalent to solving the parameters set $g$ for a linear equation system:

[{(Δ F^{P})}_{g}^{' T} {(Δ F^{P})}_{g}^{'}] g = {(Δ F^{P})}_{g}^{' T} Δ F^{Ref}

(A15)

Note that because the FM problem is typically overdetermined (NL force equations for M unknown parameters), the numerical solution of Eq. (A15) can be obtained by QR decomposition⁴⁵ or singular value decomposition (SVD),⁴⁵ with which Eq. (A3) would be satisfied in a least-squares sense; for a perfect FM, the parametrized force correction in Eq. (A15) would restore the reference force correction exactly:

{(Δ F^{P})}_{g}^{'} g = Δ F^{Ref}

(A16)

Therefore we identify Eq. (A15) as the key working equation for conducting FM in multidimensional CVs.
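The linear problem in Eq. (A15) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: each CV's correction is taken to depend only on that CV, as in Eq. (A4), so the derivative matrix is block diagonal; parameters are ordered as in Eq. (A7b); and the least-squares solution is obtained with NumPy's SVD-based `numpy.linalg.lstsq` instead of forming the normal equations explicitly. The function names `design_rows` and `fit_force_corrections` are hypothetical.

```python
import numpy as np


def design_rows(r_samples, grid):
    """Rows of d(Delta F^P)/dg for one CV; columns ordered (f_1, f_1'', f_2, f_2'', ...)."""
    grid = np.asarray(grid, dtype=float)
    r_samples = np.asarray(r_samples, dtype=float)
    rows = np.zeros((r_samples.size, 2 * grid.size))
    for l, r in enumerate(r_samples):
        # Zero-based index j of the interval [grid[j], grid[j + 1]] containing r.
        j = int(np.clip(np.searchsorted(grid, r, side="right") - 1, 0, grid.size - 2))
        h = grid[j + 1] - grid[j]
        A = (grid[j + 1] - r) / h
        B = 1.0 - A
        rows[l, 2 * j] = A                             # Eq. (A10a)
        rows[l, 2 * j + 2] = B                         # Eq. (A10b)
        rows[l, 2 * j + 1] = (A**3 - A) * h**2 / 6.0   # Eq. (A10c)
        rows[l, 2 * j + 3] = (B**3 - B) * h**2 / 6.0   # Eq. (A10d)
    return rows


def fit_force_corrections(cv_samples, dF_ref, grids):
    """Least-squares solution of Eq. (A15) for N CVs.

    cv_samples, dF_ref : arrays of shape (L, N)
    grids              : list of N one-dimensional grids
    Returns one parameter vector per CV and the (NL x M) derivative matrix.
    """
    cv_samples = np.asarray(cv_samples, dtype=float)
    dF_ref = np.asarray(dF_ref, dtype=float)
    L, N = cv_samples.shape
    sizes = [2 * len(g) for g in grids]
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    X = np.zeros((N * L, int(offsets[-1])))
    for i in range(N):
        X[i * L:(i + 1) * L, offsets[i]:offsets[i + 1]] = design_rows(cv_samples[:, i], grids[i])
    y = dF_ref.T.reshape(-1)                     # NL-vector, grouped by CV
    g, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimizes |X g - y|^2
    return [g[offsets[i]:offsets[i + 1]] for i in range(N)], X
```

Solving the overdetermined system directly by SVD gives the same minimizer as Eq. (A15) when the derivative matrix has full column rank, while avoiding the squared condition number of the normal-equation matrix.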

Note that in Eqs. (A14–16), $Δ F^{P}$ and $Δ F^{Ref}$ are both NL-dimensional column vectors, ${(Δ F^{P})}_{g}^{'}$ is an $N L \times M$ matrix with the leading dimension (i.e., number of rows) being $N L$ , and $g$ is an M-dimensional column vector to be solved. For implementation purposes, the consistency of Eq. (A15) in its matrix form can be verified by analyzing the dimensionalities of the matrix operations involved:

{(M \times N L)}_{{(Δ F^{P})}_{g}^{' T}} \times {(N L \times M)}_{{(Δ F^{P})}_{g}^{'}} \times {(M \times 1)}_{g} = {(M \times N L)}_{{(Δ F^{P})}_{g}^{' T}} \times {(N L \times 1)}_{Δ F^{Ref}} = M \times 1

(A17)

For readers who are interested in more details about the implementation, a concrete example can be found in SI.2, where we illustrate the matrix form of Eq. (A15) for a two-bond CV case based on a specific set of sample and grid distributions.

Appendix B. Determination of internal forces on CVs using redundant internal coordinate transformation

The transformation of forces from Cartesian to the selected redundant internal coordinates is conducted by using the procedure developed by Pulay and co-workers for geometry optimization.⁴¹ Based on Wilson’s B-matrix formalism, this procedure uses an eigenvalue decomposition technique to remove the linear dependence among the redundant internal coordinates. The redundant internal coordinates and Wilson’s B-matrix are connected through:

q = BX

(A18)

where $q$ is a set of N_R redundant internal coordinates (e.g., bonds, angles, torsions, doubly-degenerate linear bends, out-of-plane wags, etc.), containing the CVs used in FM and the string MFEP simulations, $X$ represents the corresponding Cartesian displacement coordinates (in a dimension of 3n, with n being the number of atoms involved in the coordinate system), and $B$ is the aforementioned Wilson’s B-matrix,⁴⁰ an $N_{R} \times 3 n$ matrix accounting for the derivatives of the internal coordinates with respect to the Cartesian displacement coordinates. For construction of the B-matrix elements for bonds, angles, torsions, and out-of-plane wags, we follow the equations given in Wilson et al.,⁴⁰ whereas for doubly-degenerate linear bends, we follow the equations given in Califano¹⁰⁷ and an implementation by Jackels et al.⁴³ For force transformation from Cartesian to redundant internal coordinates, the B-matrix is then used to form the condensed G-matrix, which is an $N_{R} \times N_{R}$ matrix defined as:

G = B u B^{T}

(A19)

where $u$ is an arbitrary diagonal matrix (a $3 n \times 3 n$ identity matrix is used in the present work). Taking on the form of an eigenvalue equation, the condensed G-matrix can be diagonalized as:

G (\begin{matrix} K & L \end{matrix}) = (\begin{matrix} K & L \end{matrix}) (\begin{matrix} Λ & 0 \\ 0 & 0 \end{matrix})

(A20)

where $K$ is formed by 3n-6 eigenvectors of the G-matrix that give non-zero eigenvalues corresponding to the diagonal elements of $Λ$ , and $L$ is the remaining $N_{R} - (3 n - 6)$ redundant eigenvectors. In practice, to remove redundancy of the internal coordinate system, the $L$ eigenvectors in Eq. (A20) are identified as the ones whose eigenvalues are below a pre-selected threshold; these numerically small eigenvalues are then set to zeros such that approximately 3n - 6 largest eigenvalues are kept across the training samples. With $K$ , $L$ , and $Λ$ in Eq. (A20) determined, the generalized inverse of the G-matrix, denoted $G^{-}$ , is constructed as:

G^{-} = (\begin{matrix} K & L \end{matrix}) (\begin{matrix} Λ^{-} & 0 \\ 0 & 0 \end{matrix}) (\begin{matrix} K^{T} \\ L^{T} \end{matrix})

(A21)

where $Λ^{-}$ represents the inverse of the non-zero eigenvalues, and $(\begin{matrix} K^{T} \\ L^{T} \end{matrix})$ is the transpose of $(\begin{matrix} K & L \end{matrix})$ . Once the $G^{-}$ matrix becomes available, the internal forces on the CVs in the redundant internal coordinates can be conveniently determined by the following transformation:

F = G^{-} B u f

(A22)

where the lower case $f$ is the Cartesian atomic forces obtained from the conventional QM/MM simulations, and $F$ represents the internal forces determined in the user-defined redundant coordinates $q$. In RP-FM-CV, because the CVs form a subset of the redundant internal coordinates $q$, this transformation procedure is used to obtain the internal forces on the CVs at both the SE/MM and AI/MM levels, which are subsequently used to determine the force corrections needed to match the internal CV forces at the two levels.
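The transformation in Eqs. (A19)–(A22) can be illustrated with a short sketch. It is not the implementation used in this work: for brevity, the B-matrix is built by central finite differences rather than from the analytic expressions of Wilson et al., $u$ is the identity matrix, and the relative eigenvalue threshold `tol` is a hypothetical choice. The helper names (`bond`, `angle`, `wilson_b`, `internal_forces`) are also illustrative.

```python
import numpy as np


def bond(x, a, b):
    """Distance between atoms a and b."""
    return np.linalg.norm(x[b] - x[a])


def angle(x, a, b, c):
    """Valence angle a-b-c in radians."""
    u, v = x[a] - x[b], x[c] - x[b]
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos_t, -1.0, 1.0))


def wilson_b(x, coords, h=1e-5):
    """Numerical B-matrix (N_R x 3n); coords is a list of callables q(x)."""
    x = np.asarray(x, dtype=float)
    B = np.zeros((len(coords), x.size))
    for k in range(x.size):
        step = np.zeros(x.size)
        step[k] = h
        xp = (x.ravel() + step).reshape(x.shape)
        xm = (x.ravel() - step).reshape(x.shape)
        B[:, k] = [(q(xp) - q(xm)) / (2.0 * h) for q in coords]
    return B


def internal_forces(B, f_cart, tol=1e-8):
    """F = G^- B u f with u = identity; also returns the number of kept eigenvalues."""
    G = B @ B.T                                  # Eq. (A19)
    evals, evecs = np.linalg.eigh(G)             # Eq. (A20)
    keep = evals > tol * evals.max()             # discard redundant directions
    K = evecs[:, keep]
    G_inv = K @ np.diag(1.0 / evals[keep]) @ K.T  # Eq. (A21)
    return G_inv @ (B @ np.ravel(f_cart)), int(keep.sum())  # Eq. (A22)
```

For a triatomic system described by three bonds and one angle, for example, one of the four eigenvalues of the G-matrix is numerically zero and is discarded, leaving 3n − 6 = 3 non-redundant directions.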

Footnotes

Supporting Information. Description of the pair-specific QM/MM van der Waals parameters using NBFIX in CHARMM; illustration of the matrix form of Eq. (A15) for spline-based FM in a two-bond CV case; implementation and comparison of internal force corrections based on Wu et al. and our redundant internal coordinate transformation formalism; overall convergence of free energy profiles along RP-FM-CV cycles at the B3LYP:AM1/MM level; RDFs obtained by RP-FM-CV at the B3LYP:AM1/MM level; our own AI/MM benchmark obtained at the B3LYP/6–31G(d)/MM level and its comparison with the consistent RP-FM-CV simulations. This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Hehre WJ; Radom L; Schleyer P. v. R.; Pople JA Ab Initio Molecular Orbital Theory. John Wiley: New York, 1986.
(2) Kohn W; Sham LJ Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. A 1965, 140, 1133–1138.
(3) Parr RG; Yang W. Density-Functional Theory of Atoms and Molecules. Oxford University Press, USA: 1994.
(4) Warshel A; Levitt M. Theoretical Studies of Enzymic Reactions: Dielectric, Electrostatic and Steric Stabilization of the Carbonium Ion in the Reaction of Lysozyme. J. Mol. Biol 1976, 103, 227–249.
(5) Field MJ; Bash PA; Karplus M. A Combined Quantum Mechanical and Molecular Mechanical Potential for Molecular Dynamics Simulations. J. Comput. Chem 1990, 11, 700–733.
(6) Singh UC; Kollman PA A Combined Ab Initio Quantum Mechanical and Molecular Mechanical Method for Carrying out Simulations on Complex Molecular Systems: Applications to the CH3Cl + Cl- Exchange Reaction and Gas Phase Protonation of Polyethers. J. Comput. Chem 1986, 7, 718–730.
(7) Gao J; Thompson MA Combined Quantum Mechanical and Molecular Mechanical Methods ACS Symposium Series 712; American Chemical Society: Washington, DC, 1998.
(8) Senn HM; Thiel W. QM/MM Methods for Biomolecular Systems. Angew. Chem. Int. Ed 2009, 48, 1198–1229.
(9) Lu X; Fang D; Ito S; Okamoto Y; Ovchinnikov V; Cui Q. QM/MM Free Energy Simulations: Recent Progress and Challenges. Mol. Simul 2016, 42, 1056.
(10) Thiel W. Semiempirical Quantum-Chemical Methods. WIREs Comput. Mol. Sci 2014, 4, 145–157.
(11) Cui Q; Pal T; Xie L. Biomolecular QM/MM Simulations: What Are Some of the "Burning Issues"? J. Phys. Chem. B 2021, 125, 689–702.
(12) Zhou Y; Pu J. Reaction Path Force Matching: A New Strategy of Fitting Specific Reaction Parameters for Semiempirical Methods in Combined QM/MM Simulations. J. Chem. Theory Comput 2014, 10, 3038–3054.
(13) Ercolessi F; Adams JB Interatomic Potentials from First-Principles Calculations: the Force-Matching Method. Europhys. Lett 1994, 26, 583–588.
(14) Izvekov S; Parrinello M; Burnham CJ; Voth GA Effective Force Fields for Condensed Phase Systems from Ab Initio Molecular Dynamics Simulation: A New Method for Force-Matching. J. Chem. Phys 2004, 120, 10896–10913.
(15) Izvekov S; Voth GA A Multiscale Coarse-Graining Method for Biomolecular Systems. J. Phys. Chem. B 2005, 109, 2469–2473.
(16) Laio A; Bernard S; Chiarotti GL; Scandolo S; Tosatti E. Physics of Iron at Earth's Core Conditions. Science 2000, 287, 1027–1030.
(17) Csanyi G; Albaret T; Payne MC; De Vita A. "Learn on the Fly": A Hybrid Classical and Quantum-Mechanical Molecular Dynamics Simulation. Phys. Rev. Lett 2004, 93, 175503.
(18) Maurer P; Laio A; Hugosson HW; Colombo MC; Rothlisberger U. Automated Parametrization of Biomolecular Force Fields from Quantum Mechanics/Molecular Mechanics (QM/MM) Simulations through Force Matching. J. Chem. Theory Comput 2007, 3, 628–639.
(19) Akin-Ojo O; Song Y; Wang F. Developing Ab Initio Quality Force Field from Condensed Phase Quantum-Mechanics/Molecular-Mechanics Calculations through the Adaptive Force Matching Method. J. Chem. Phys 2008, 129, 064108.
(20) Hudson PS; Boresch S; Rogers DM; Woodcock HL Accelerating QM/MM Free Energy Computations via Intramolecular Force Matching. J. Chem. Theory Comput 2018, 14, 6327–6335.
(21) Giese TJ; York DM Development of a Robust Indirect Approach for MM → QM Free Energy Calculations That Combines Force-Matched Reference Potential and Bennett’s Acceptance Ratio Methods. J. Chem. Theory Comput 2019, 15, 5543–5562.
(22) E W; Ren W; Vanden-Eijnden E. Finite Temperature String Method for the Study of Rare Events. J. Phys. Chem. B 2005, 109, 6688–6693.
(23) Maragliano L; Fischer A; Vanden-Eijnden E; Ciccotti G. String Method in Collective Variables: Minimum Free Energy Paths and Isocommittor Surfaces. J. Chem. Phys 2006, 125, 024106.
(24) Zhou Y; Ojeda-May P; Nagaraju M; Kim B; Pu J. Mapping Free Energy Pathways for ATP Hydrolysis in the E. coli ABC Transporter HlyB by the String Method. Molecules 2018, 23, 2652.
(25) Li P; Jia X; Pan X; Shao Y; Mei Y. Accelerated Computation of Free Energy Profile at ab initio QM/MM Accuracy via a Semi-Empirical Reference-Potential. I. Weighted Thermodynamics Perturbation. J. Chem. Theory Comput 2018, 14, 5583–5596.
(26) Pan X; Li P; Ho J; Pu J; Mei Y; Shao Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanical Accuracy via a Semi-empirical Reference Potential. II. Recalibrating Semi-empirical Parameters with Force Matching. Phys. Chem. Chem. Phys 2019, 21, 20595–20605.
(27) Goldman N; Fried LE; Koziol L. Using Force-Matched Potentials To Improve the Accuracy of Density Functional Tight Binding for Reactive Conditions. J. Chem. Theory Comput 2015, 11, 4530–4535.
(28) Kroonblawd MP; Pietrucci F; Saitta AM; Goldman N. Generating Converged Accurate Free Energy Surfaces for Chemical Reactions with a Force-Matched Semiempirical Model. J. Chem. Theory Comput 2018, 14, 2207–2218.
(29) Shen L; Wu J; Yang W. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput 2016, 12, 4934–4946.
(30) Shen L; Yang W. Molecular Dynamics Simulations with Quantum Mechanics/Molecular Mechanics and Adaptive Neural Networks. J. Chem. Theory Comput 2018, 14, 1442–1455.
(31).Wu J; Shen L; Yang W. Internal Force Corrections with Machine Learning for Quantum Mechanics/Molecular Mechanics Simulations. J. Chem. Phys 2017, 147, 161732.
(32).Boselt L; Thurlemann M; Riniker S. Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed-Phase Systems. J. Chem. Theory Comput 2021, 17, 2641–2658.
(33).Zeng J; Giese TJ; Ekesan S; York DM Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution. ChemRxiv 2021, DOI: 10.26434/chemrxiv.14120447.v1.
(34).Pan X; Yang J; Van R; Epifanovsky E; Ho J; Huang J; Pu J; Mei Y; Nam K; Shao Y. Machine Learning Assisted Free Energy Simulation of Solution–Phase and Enzyme Reactions. ChemRxiv 2021, DOI: 10.26434/chemrxiv.14745510.v1.
(35).Noid WG; Chu J-W; Ayton GS; Krishna V; Izvekov S; Voth GA; Das A; Andersen HC The Multiscale Coarse-Graining Method. I. A Rigorous Bridge Between Atomistic and Coarse-Grained Models. J. Chem. Phys 2008, 128, 244114.
(36).Zinovjev K; Ruiz-Pernia JJ; Tunon I. Toward an Automatic Determination of Enzymatic Reaction Mechanisms and Their Activation Free Energies. J. Chem. Theory Comput 2013, 9, 3740–3749.
(37).Darve E; Pohorille A. Calculating Free Energies Using Average Force. J. Chem. Phys 2001, 115, 9169–9183.
(38).Ruiz-Montero MJ; Frenkel D; Brey JJ Efficient Schemes to Compute Diffusive Barrier Crossing Rates. Mol. Phys 1997, 90, 925–941.
(39).den Otter WK; Briels WJ The Calculation of Free-Energy Differences by Constrained Molecular-Dynamics Simulations. J. Chem. Phys 1998, 109, 4139–4146.
(40).Wilson EB Jr.; Decius JC; Cross PC Molecular Vibrations. McGraw-Hill: New York, 1955.
(41).Pulay P; Fogarasi G. Geometry Optimization in Redundant Internal Coordinates. J. Chem. Phys 1992, 96, 2856–2860.
(42).Peng C; Ayala PY; Schlegel HB; Frisch MJ Using Redundant Internal Coordinates to Optimize Equilibrium Geometries and Transition States. J. Comput. Chem 1996, 17, 49–56.
(43).Jackels CF; Gu Z; Truhlar DG Reaction-path potential and vibrational frequencies in terms of curvilinear internal coordinates. J. Chem. Phys 1995, 102, 3188–3201.
(44).Chuang Y-Y; Truhlar DG Reaction-Path Dynamics in Redundant Internal Coordinates. J. Phys. Chem. A 1998, 102, 242–247.
(45).Press WH; Teukolsky SA; Vetterling WT; Flannery BP Numerical Recipes in FORTRAN 77: The Art of Scientific Computing. 2nd ed.; Cambridge University Press: New York, 1992.
(46). In the present work, we correct only the internal forces on the CVs to reproduce the high-level mean forces along the free energy path defined in the same set of CVs. However, the force corrections on the non-CV degrees of freedom in the redundant internal coordinate system are also available as a byproduct of Eq. (A22) in Appendix B. Under the CV-only FM scheme, when the backward coordinate transformation is used to obtain the corresponding Cartesian force corrections, the internal force corrections on the non-CV degrees of freedom must be neglected by setting them to zero. The effects of including the additional non-CV internal force corrections in FM are being examined in our ongoing work and will be reported separately.
(47).Menshutkin N. Beiträgen zur Kenntnis der Affinitätskoeffizienten der Alkylhaloide und der Organischen Amine. Z. Physik. Chem 1890, 5, 589–600.
(48).Gao J. A Priori Computation of a Solvent-Enhanced SN2 Reaction Profile in Water: The Menshutkin Reaction. J. Am. Chem. Soc 1991, 113, 7796–7797.
(49).Gao J; Xia X. A Two-Dimensional Energy Surface for a Type II SN2 Reaction in Aqueous Solution. J. Am. Chem. Soc 1993, 115, 9667–9675.
(50).Dillet V; Rinaldi D; Bertran J; Rivail J-L Analytical Energy Derivatives for a Realistic Continuum Model of Solvation: Application to the Analysis of Solvent Effects on Reaction Paths. J. Chem. Phys 1996, 104, 9437–9444.
(51).Fradera X; Amat L; Torrent M; Mestres J; Constans P; Besalu E; Marti J; Simon S; Lobato M; Oliva JM; Luis JM; Andres JL; Sola M; Carbo R; Duran M. Analysis of the Changes on the Potential Energy Surface of Menshutkin Reactions Induced by External Perturbations. J. Mol. Struct: THEOCHEM 1996, 371, 171–183.
(52).Amovilli C; Mennucci B; Floris FM MCSCF Study of the SN2 Menshutkin Reaction in Aqueous Solution within the Polarizable Continuum Model. J. Phys. Chem. B 1998, 102, 3023–3028.
(53).Truong TN; Truong T-TT; Stefanovich EV A General Methodology for Quantum Modeling of Free-Energy Profile of Reaction in Solution: An Application to the Menshutkin NH3 + CH3Cl Reaction in Water. J. Chem. Phys 1997, 107, 1881–1889.
(54).Naka K; Sato H; Morita A; Hirata F; Kato S. RISM-SCF Study of the Free-Energy Profile of the Menshutkin-Type Reaction NH3 + CH3Cl -> NH3CH3+ + Cl- in Aqueous Solution. Theor. Chem. Acc 1999, 102, 165–169.
(55).Webb SP; Gordon MS Solvation of the Menshutkin Reaction: A Rigorous Test of the Effective Fragment Method. J. Phys. Chem. A 1999, 103, 1265–1273.
(56).Hirao H; Nagae Y; Nagaoka M. Transition-State Optimization by the Free Energy Gradient Method: Application to Aqueous-Phase Menshutkin Reaction between Ammonia and Methyl Chloride. Chem. Phys. Lett 2001, 348, 350–356.
(57).Ruiz-Pernia JJ; Silla E; Tunon I; Marti S; Moliner V. Hybrid QM/MM Potentials of Mean Force with Interpolated Corrections. J. Phys. Chem. B 2004, 108, 8427–8433.
(58).Marti S; Moliner V; Tunon I. Improving the QM/MM Description of Chemical Processes: A Dual Level Strategy to Explore the Potential Energy Surface in Very Large Systems. J. Chem. Theory Comput 2005, 1, 1008–1016.
(59).Fdez Galvan I; Martin ME; Aguilar MA A New Method to Locate Saddle Points for Reactions in Solution by Using the Free-Energy Gradient Method and the Mean Field Approximation. J. Comput. Chem 2004, 25, 1227–1233.
(60).Acevedo O; Jorgensen WL Solvent Effects on Organic Reactions from QM/MM Simulations. In Annual Reports in Computational Chemistry, Spellmeyer D, Ed. Elsevier: Amsterdam, The Netherlands, 2006; Vol. 2, p 263.
(61).Yamamoto T. Variational and Perturbative Formulations of Quantum Mechanical/Molecular Mechanical Free Energy with Mean-Field Embedding and its Analytical Gradients. J. Chem. Phys 2008, 129, 244104.
(62).Komeiji Y; Ishikawa T; Mochizuki Y; Yamataka H; Nakano T. Fragment Molecular Orbital Method-Based Molecular Dynamics (FMO-MD) as a Simulator for Chemical Reactions in Explicit Solvation. J. Comput. Chem 2009, 30, 40–50.
(63).Acevedo O; Jorgensen WL Exploring Solvent Effects upon the Menshutkin Reaction Using a Polarizable Force Field. J. Phys. Chem. B 2010, 114, 8425–8430.
(64).Vilseck JZ; Sambasivarao SV; Acevedo O. Optimal Scaling Factors for CM1 and CM3 Atomic Charges in RM1-Based Aqueous Simulations. J. Comput. Chem 2011, 32, 2836–2842.
(65).Nakano H; Yamamoto T. Variational Calculation of Quantum Mechanical/Molecular Mechanical Free Energy with Electronic Polarization of Solvent. J. Chem. Phys 2012, 136, 134107.
(66).Nakano H; Yamamoto T. Accurate and Efficient Treatment of Continuous Solute Charge Density in the Mean-Field QM/MM Free Energy Calculation. J. Chem. Theory Comput 2013, 9, 188–203.
(67).Gao J; Xia X. A Priori Evaluation of Aqueous Polarization Effects Through Monte Carlo QM-MM Simulations. Science 1992, 258, 631–635.
(68).Gao J. Hybrid Quantum and Molecular Mechanical Simulations: An Alternative Avenue to Solvent Effects in Organic Chemistry. Acc. Chem. Res 1996, 29, 298–305.
(69).Ohmiya K; Kato S. Solution Reaction Path Hamiltonian Based on Reference Interaction Site Model Self-Consistent Field Method: Application to Menshutkin-Type Reactions. J. Chem. Phys 2003, 119, 1601–1610.
(70).Ten-no S; Hirata F; Kato S. A Hybrid Approach for the Solvent Effect on the Electronic Structure of a Solute Based on the RISM and Hartree-Fock Equations. Chem. Phys. Lett 1993, 214, 391–396.
(71).Ten-no S; Hirata F; Kato S. Reference Interaction Site Model Self-Consistent Field Study for Solvation Effect on Carbonyl Compounds in Aqueous Solution. J. Chem. Phys 1994, 100, 7443–7453.
(72).Okuyama-Yoshida N; Nagaoka M; Yamabe T. Transition-State Optimization on Free Energy Surface: Toward Solution Chemical Reaction Ergodography. Int. J. Quantum Chem 1998, 70, 95–103.
(73).Day PN; Jensen JH; Gordon MS; Webb SP; Stevens WJ; Krauss M; Garmer D; Basch H; Cohen D. An Effective Fragment Method for Modeling Solvent Effects in Quantum Mechanical Calculations. J. Chem. Phys 1996, 105, 1968–1986.
(74).Hu H; Lu Z; Yang W. QM/MM Minimum Free-Energy Path: Methodology and Application to Triosephosphate Isomerase. J. Chem. Theory Comput 2007, 3, 390–406.
(75).Hu H; Lu Z; Parks JM; Burger SK; Yang W. Quantum Mechanics/Molecular Mechanics Minimum Free-Energy Path for Accurate Reaction Energies in Solution and Enzymes: Sequential Sampling and Optimization on the Potential of Mean Force Surface. J. Chem. Phys 2008, 128, 034105.
(76).Hu H; Yang W. Free Energies of Chemical Reactions in Solution and in Enzymes with Ab Initio QM/MM Methods. Annu. Rev. Phys. Chem 2008, 59, 573–601.
(77).Ruiz-Pernia JJ; Silla E; Tunon I; Marti S. Hybrid Quantum Mechanics/Molecular Mechanics Simulations with Two-Dimensional Interpolated Corrections: Application to Enzymatic Processes. J. Phys. Chem. B 2006, 110, 17663–17670.
(78).MacKerell AD Jr.; Bashford D; Bellott M; Dunbrack RL Jr.; Evanseck JD; Field MJ; Fischer S; Gao J; Guo H; Ha S; Joseph-McCarthy D; Kuchnir L; Kuczera K; Lau FTK; Mattos C; Michnick S; Ngo T; Nguyen DT; Prodhom B; Reiher III WE; Roux B; Schlenkrich M; Smith JC; Stote R; Straub J; Watanabe M; Wiorkiewicz-Kuczera J; Yin D; Karplus M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.
(79).Dewar MJS; Zoebisch EG; Healy EF; Stewart JJP AM1: A New General Purpose Quantum Mechanical Molecular Model. J. Am. Chem. Soc 1985, 107, 3902–3909.
(80).Neria E; Fischer S; Karplus M. Simulation of Activation Free Energies in Molecular Systems. J. Chem. Phys 1996, 105, 1902–1921.
(81).Thiel W. MNDO97, v5.0; University of Zurich, Zurich, Switzerland, 1998.
(82).Brooks BR; Brooks III CL; MacKerell AD Jr.; Nilsson L; Petrella RJ; Roux B; Won Y; Archontis G; Bartels C; Boresch S; Caflisch A; Caves L; Cui Q; Dinner AR; Feig M; Fischer S; Gao J; Hodoscek M; Im W; Kuczera K; Lazaridis T; Ma J; Ovchinnikov V; Paci E; Pastor RW; Post CB; Pu JZ; Schaefer M; Tidor B; Venable RM; Woodcock HL; Wu X; Yang W; York DM; Karplus M. CHARMM: The Molecular Simulation Program. J. Comput. Chem 2009, 30, 1545–1614.
(83).Shao Y; Molnar LF; Jung Y; Kussmann J; Ochsenfeld C; Brown ST; Gilbert ATB; Slipchenko LV; Levchenko SV; O'Neill DP; DiStasio RA Jr.; Lochan RC; Wang T; Beran GJO; Besley NA; Herbert JM; Lin CY; Van Voorhis T; Chien SH; Sodt A; Steele RP; Rassolov VA; Maslen PE; Korambath PP; Adamson RD; Austin B; Baker J; Byrd EFC; Dachsel H; Doerksen RJ; Dreuw A; Dunietz BD; Dutoi AD; Furlani TR; Gwaltney SR; Heyden A; Hirata S; Hsu C-P; Kedziora G; Khaliullin RZ; Klunzinger P; Lee AM; Lee MS; Liang W; Lotan I; Nair N; Peters B; Proynov EI; Pieniazek PA; Rhee YM; Ritchie J; Rosta E; Sherrill CD; Simmonett AC; Subotnik JE; Woodcock III HL; Zhang W; Bell AT; Chakraborty AK; Chipman DM; Keil FJ; Warshel A; Hehre WJ; Schaefer III HF; Kong J; Krylov AI; Gill PMW; Head-Gordon M. Advances in Methods and Algorithms in a Modern Quantum Chemistry Program Package. Phys. Chem. Chem. Phys 2006, 8, 3172–3191.
(84).Ryckaert J-P; Ciccotti G; Berendsen HJC Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys 1977, 23, 327–337.
(85).Darden T; York D; Pedersen L. Particle Mesh Ewald: An N.log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys 1993, 98, 10089–10092.
(86).Nam K; Gao J; York DM An Efficient Linear-Scaling Ewald Method for Long-Range Electrostatics in Combined QM/MM Calculations. J. Chem. Theory Comput 2005, 1, 2–13.
(87).Nam K. Acceleration of Ab Initio QM/MM Calculations under Periodic Boundary Conditions by Multiscale and Multiple Time Step Approaches. J. Chem. Theory Comput 2014, 10, 4175–4183.
(88).Becke AD Density‐Functional Thermochemistry. III. The Role of Exact Exchange. J. Chem. Phys 1993, 98, 5648–5652.
(89).Lee C; Yang W; Parr RG Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of Electron Density. Phys. Rev. B 1988, 37, 785–789.
(90).Stephens PJ; Devlin FJ; Chabalowski CF; Frisch MJ Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem 1994, 98, 11623–11627.
(91).Becke AD A New Mixing of Hartree–Fock and Local Density-Functional Theories. J. Chem. Phys 1993, 98, 1372–1377.
(92).Møller C; Plesset MS Note on an Approximation Treatment for Many-Electron Systems. Phys. Rev 1934, 46, 618–622.
(93).Pople JA; Binkley JS; Seeger R. Theoretical Models Incorporating Electron Correlation. Int. J. Quantum Chem 1976, Supp. Y-10, 1–19.
(94).Francl MM; Pietro WJ; Hehre WJ; Binkley JS; Gordon MS; DeFrees DJ; Pople JA Self‐Consistent Molecular Orbital Methods. XXIII. A Polarization‐Type Basis Set for Second‐Row Elements. J. Chem. Phys 1982, 77, 3654–3665.
(95).Okamoto K; Fukui S; Shingu H. Kinetic Studies of Bimolecular Nucleophilic Substitution. VI. Rates of the Menshutkin Reaction of Methyl Iodide with Methylamines and Ammonia in Aqueous Solutions. Bull. Chem. Soc. Jpn 1967, 40, 1920–1925.
(96).Okamoto K; Fukui S; Nitta I; Shingu H. Kinetic Studies of Bimolecular Nucleophilic Substitution. VII. Effect of Hydroxylic Solvents on the Nucleophilicity of Aliphatic Amines in the Menschutkin Reaction. Bull. Chem. Soc. Jpn 1967, 40, 2354–2357.
(97).Frisch MJ; Pople JA; Binkley JS Self-Consistent Molecular Orbital Methods 25. Supplementary Functions for Gaussian Basis Sets. J. Chem. Phys 1984, 80, 3265–3269.
(98).McLean AD; Chandler GS Contracted Gaussian-Basis Sets for Molecular Calculations. 1. 2nd Row Atoms, Z=11–18. J. Chem. Phys 1980, 72, 5639–5648.
(99).Krishnan R; Binkley JS; Seeger R; Pople JA Self‐Consistent Molecular Orbital Methods. XX. A Basis Set for Correlated Wave Functions. J. Chem. Phys 1980, 72, 650–654.
(100).Knight C; Lindberg GE; Voth GA Multiscale Reactive Molecular Dynamics. J. Chem. Phys 2012, 137, 22A525.
(101).Knight C; Maupin CM; Izvekov S; Voth GA Defining Condensed Phase Reactive Force Fields from Ab Initio Molecular Dynamics Simulations: The Case of the Hydrated Excess Proton. J. Chem. Theory Comput 2010, 6, 3223–3232.
(102).Zhang L; Han J; Wang H; Saidi WA; Car R; E W. End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems. Advances in Neural Information Processing Systems 2018.
(103).Wang H; Zhang L; Han J; E W. DeePMD-kit: A Deep Learning Package for Many-Body Potential Energy Representation and Molecular Dynamics. Comput. Phys. Commun 2018, 228, 178–184.
(104).Zhang Y; Wang H; Chen W; Zeng J; Zhang L; Wang H; E W. DP-GEN: A Concurrent Learning Platform for the Generation of Reliable Deep Learning Based Potential Energy Models. Comput. Phys. Commun 2020, 253, 107206.
(105).Lu D.-h.; Zhao M; Truhlar DG Projection Operator Method for Geometry Optimization with Constraints. J. Comput. Chem 1991, 12, 376–384.
(106).Devi-Kesavan LS; Garcia-Viloca M; Gao J. Semiempirical QM/MM Potential with Simple Valence Bond (SVB) for Enzyme Reactions. Application to the Nucleophilic Addition Reaction in Haloalkane Dehalogenase. Theor. Chem. Acc 2003, 109, 133–139.
(107).Califano S. Vibrational States. Wiley: New York, 1976.


[R21] (21).Giese TJ; York DM Development of a Robust Indirect Approach for MM → QM Free Energy Calculations That Combines Force-Matched Reference Potential and Bennett’s Acceptance Ratio Methods. J. Chem. Theory Comput 2019, 15, 5543–5562. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R22] (22).E W; Ren W; Vanden-Eijnden E. Finite Temperature String Method for the Study of Rare Events. J. Phys. Chem. B 2005, 109, 6688–6693. [DOI] [PubMed] [Google Scholar]

[R23] (23).Maragliano L; Fischer A; Vanden-Eijnden E; Ciccotti G. String Method in Collective Variables: Minimum Free Energy Paths and Isocommittor Surfaces. J. Chem. Phys 2006, 125, 024106. [DOI] [PubMed] [Google Scholar]

[R24] (24).Zhou Y; Ojeda-May P; Nagaraju M; Kim B; Pu J. Mapping Free Energy Pathways for ATP Hydrolysis in the E. coli ABC Transporter HlyB by the String Method. Molecules 2018, 23, 2652. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] (25).Li P; Jia X; Pan X; Shao Y; Mei Y. Accelerated Computation of Free Energy Profile at ab initio QM/MM Accuracy via a Semi-Empirical Reference-Potential. I. Weighted Thermodynamics Perturbation. J. Chem. Theory Comput 2018, 14, 5583–5596. [DOI] [PubMed] [Google Scholar]

[R26] (26).Pan X; Li P; Ho J; Pu J; Mei Y; Shao Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanical Accuracy via a Semi-empirical Reference Potential. II. Recalibrating Semi-empirical Parameters with Force Matching. Phys. Chem. Chem. Phys 2019, 21, 20595–20605. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] (27).Goldman N; Fried LE; Koziol L. Using Force-Matched Potentials To Improve the Accuracy of Density Functional Tight Binding for Reactive Conditions. J. Chem. Theory Comput 2015, 11, 4530–4535. [DOI] [PubMed] [Google Scholar]

[R28] (28).Kroonblawd MP; Pietrucci F; Saitta AM; Goldman N. Generating Converged Accurate Free Energy Surfaces for Chemical Reactions with a Force-Matched Semiempirical Model. J. Chem. Theory Comput 2018, 14, 2207–2218. [DOI] [PubMed] [Google Scholar]

[R29] (29).Shen L; Wu J; Yang W. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput 2016, 12, 4934–4946. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R30] (30).Shen L; Yang W. Molecular Dynamics Simulations with Quantum Mechanics/Molecular Mechanics and Adaptive Neural Networks. J. Chem. Theory Comput 2018, 14, 1442–1455. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R31] (31).Wu J; Shen L; Yang W. Internal Force Corretions with Machine Learning for Quantum Mechanics/Molecular Mechanics Simulations. J. Chem. Phys 2017, 147, 161732. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R32] (32).Boselt L; Thurlemann M; Riniker S. Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed-Phase Systems. J. Chem. Theory Comput 2021, 17, 2641–2658. [DOI] [PubMed] [Google Scholar]

[R33] (33).Zeng J; Giese TJ; Ekesan S; York DM Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution. ChemRxiv 2021, DOI: 10.26434/chemrxiv.14120447.v1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R34] (34).Pan X; Yang J; Van R; Epifanovsky E; Ho J; Huang J; Pu J; Mei Y; Nam K; Shao Y. Machine Learning Assisted Free Energy Simulation of Solution–Phase and Enzyme Reactions. ChemRxiv 2021, DOI: 10.26434/chemrxiv.14745510.v1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R35] (35).Noid WG; Chu J-W; Ayton GS; Krishna V; Izvekov S; Voth GA; Das A; Andersen HC The Multiscale Coarse-Graining Method. I. A Rigorous Bridge Between Atomistic and Coarse-Grained Models. J. Chem. Phys 2008, 128, 244114. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R36] (36).Zinovjev K; Ruiz-Pernia JJ; Tunon I. Toward an Automatic Determination of Enzymatic Reaction Mechanisms and Their Activation Free Energies. J. Chem. Theory Comput 2013, 9, 3740–3749. [DOI] [PubMed] [Google Scholar]

[R37] (37).Darve E; Pohorille A. Calculating Free Energies Using Average Force. J. Chem. Phys 2001, 115, 9169–9183. [Google Scholar]

[R38] (38).Ruiz-Montero MJ; Frenkel D; Brey JJ Efficient Schemes to Compute Diffusive Barrier Crossing Rates. Mol. Phys 1997, 90, 925–941. [Google Scholar]

[R39] (39).den Otter WK; Briels WJ The Calculation of Free-Energy Differences by Constrianed Molecular-Dynamics Simulations. J. Chem. Phys 1998, 109, 4139–4146. [Google Scholar]

[R40] (40).Wilson EB Jr.; Decius JC; Cross PC Molecular Vibrations. McGraw-Hill: New York, 1955. [Google Scholar]

[R41] (41).Pulay P; Fogarasi G. Geometry Optimization in Redundant Internal Coordinates. J. Chem. Phys 1992, 96, 2856–2860. [Google Scholar]

[R42] (42).Peng C; Ayala PY; Schlegel HB; Frisch MJ Using Redundant Internal Coordinates to Optimize Equilibrum Geometries and Transition States. J. Comput. Chem 1996, 17, 49–56. [Google Scholar]

[R43] (43).Jackels CF; Gu Z; Truhlar DG Reaction-path potential and vibrational frequencies in terms of curviliear interal coordinates. J. Chem. Phys 1995, 102, 3188–3201. [Google Scholar]

[R44] (44).Chuang Y-Y; Truhlar DG Reaction-Path Dynamics in Redundant Internal Coordinates. J. Phys. Chem. A 1998, 102, 242–247. [Google Scholar]

[R45] (45).Press WH; Teukolsky SA; Vetterling WT; Flannery BP Numerical Recipes in FORTRAN 77: The Art of Scientific Computing. 2nd ed.; Cambridge University Press: New York, 1992. [Google Scholar]

[R46] (46). In the present work, we only correct the internal forces on the CVs to reproduce the high-level mean forces along the free energy path defined in the same set of CVs. However, the force corrections on the non-CV degrees of freedom in the redundant internal coordinate system provided are also available as a byproduct of Eq. (A22) in Appendix B. Under the CV-only FM scheme, when the backward coordinate transformation procedure is used to obtain the corresponding Cartesian force corrections, one needs to neglect the internal force corrections on the non-CV degrees of freedom by setting them to zeros. Effects of including the additional non-CV internal force corrections in FM are being examined in our ongoing work and will be reported separately.

[R47] (47).Menshutkin N. Beiträgen zur Kenntnis der Affinitätskoeffizienten der Alkylhaloide und der Organischen Amine Z. Physik. Chem 1890, 5, 589–600. [Google Scholar]

[R48] (48).Gao J. A Priori Computation of a Solvent-Enhanced SN2 Reaction Profile in Water: The Menshutkin Reaction. J. Am. Chem. Soc 1991, 113, 7796–7797. [Google Scholar]

[R49] (49).Gao J; Xia X. A Two-Dimensional Energy Surface for a Type II SN2 Reaction in Aqueous Solution. J. Am. Chem. Soc 1993, 115, 9667–9675. [Google Scholar]

[R50] (50).Dillet V; Rinaldi D; Bertran J; Rivail J-L Analytical Energy Derivatives for a Realistic Continuum Model of Solvation: Application to the Analysis of Solvent Effects on Reaction Paths. J. Chem. Phys 1996, 104, 9437–9444. [Google Scholar]

[R51] (51).Fradera X; Amat L; Torrent M; Mestres J; Constans P; Besalu E; Marti J; Simon S; Lobato M; Oliva JM; Luis JM; Andres JL; Sola M; Carbo R; Duran M. Analysis of the Changes on the Potential Energy Surface of Menshutkin Reactions Induced by External Perturbations. J. Mol. Struct: THEOCHEM 1996, 371, 171–183. [Google Scholar]

[R52] (52).Amovilli C; Mennucci B; Floris FM MCSCF Study of the SN2 Menshutkin Reaction in Aqueous Soluiton within the Polarizable Continuum Model. J. Phys. Chem. B 1998, 102, 3023–3028. [Google Scholar]

[R53] (53).Truong TN; Truong T-TT; Stefanovich EV A General Methodology for Quantum Modeling of Free-Energy Profile of Reaction in Solution: An Application to the Menshutkin NH3 + CH3Cl Reaction in Water. J. Chem. Phys 1997, 107, 1881–1889. [Google Scholar]

[R54] (54).Naka K; Sato H; Morita A; Hirata F; Kato S. RISM-SCF Stuy of the Free-Energy Profile of the Menshutkin-Type Reaction NH3 + CH3Cl -> NH3CH3+ + Cl- in Aqueous Solution. Theor. Chem. Acc 1999, 102, 165–169. [Google Scholar]

[R55] (55).Webb SP; Gordon MS Solvation of the Menshutkin Reaction: A Rigorous Test of the Effective Fragment Method. J. Phys. Chem. A 1999, 103, 1265–1273. [Google Scholar]

[R56] (56).Hirao H; Nagae Y; Nagaoka M. Transition-State Optimizatino by the Free Energy Gradient Method: Application to Aqueous-Phase Menshutkin Reaction between Ammonia and Methyl Chloride. Chem. Phys. Lett 2001, 348, 350–356. [Google Scholar]

[R57] (57).Ruiz-Pernia JJ; Silla E; Tunon I; Marti S; Moliner V. Hybrid QM/MM Potentials of Mean Force with Interpolated Corrections. J. Phys. Chem. B 2004, 108, 8427–8433. [Google Scholar]

[R58] (58).Marti S; Moliner V; Tunon I. Improving the QM/MM Description of Chemical Processes: A Dual Level Strategy to Explore the Potential Energy Surface in Vary Large System. J. Chem. Theory Comput 2005, 1, 1008–1016. [DOI] [PubMed] [Google Scholar]

[R59] (59).Fdez Galvan I; Martin ME; Aguilar MA A New Method to Locate Saddle Points for Reactions in Solution by Using the Free-Energy Gradient Method and the Mean Field Approximation. J. Comput. Chem 2004, 25, 1227–1233. [DOI] [PubMed] [Google Scholar]

[R60] (60).Acevedo O; Jorgensen WL Solvent Effects on Organic Reactions from QM/MM Simulations. In Annual Reports in Computational Chemistry, Spellmeyer D, Ed. Elsevier: Amsterdam, The Netherlands, 2006; Vol. 2, p 263. [Google Scholar]

[R61] (61).Yamamoto T. Variational and Perturbative Formulations of Quantum Mechanical/Molecular Mechanical Free Energy with Mean-Field Embedding and its Analytical Gradients. J. Chem. Phys 2008, 129, 244104. [DOI] [PubMed] [Google Scholar]

[R62] (62).Komeiji Y; Ishikawa T; Mochizuki Y; Yamataka H; Nakano T. Fragment Molecular Orbital Mothod-Based Molecular Dynamics (FMO-MD) as a Simulator for Chemical Reactions in Explicit Solvation. J. Comput. Chem 2009, 30, 40–50. [DOI] [PubMed] [Google Scholar]

[R63] (63) Acevedo O; Jorgensen WL. Exploring Solvent Effects upon the Menshutkin Reaction Using a Polarizable Force Field. J. Phys. Chem. B 2010, 114, 8425–8430.

[R64] (64) Vilseck JZ; Sambasivarao SV; Acevedo O. Optimal Scaling Factors for CM1 and CM3 Atomic Charges in RM1-Based Aqueous Simulations. J. Comput. Chem. 2011, 32, 2836–2842.

[R65] (65) Nakano H; Yamamoto T. Variational Calculation of Quantum Mechanical/Molecular Mechanical Free Energy with Electronic Polarization of Solvent. J. Chem. Phys. 2012, 136, 134107.

[R66] (66) Nakano H; Yamamoto T. Accurate and Efficient Treatment of Continuous Solute Charge Density in the Mean-Field QM/MM Free Energy Calculation. J. Chem. Theory Comput. 2013, 9, 188–203.

[R67] (67) Gao J; Xia X. A Priori Evaluation of Aqueous Polarization Effects Through Monte Carlo QM-MM Simulations. Science 1992, 258, 631–635.

[R68] (68) Gao J. Hybrid Quantum and Molecular Mechanical Simulations: An Alternative Avenue to Solvent Effects in Organic Chemistry. Acc. Chem. Res. 1996, 29, 298–305.

[R69] (69) Ohmiya K; Kato S. Solution Reaction Path Hamiltonian Based on Reference Interaction Site Model Self-Consistent Field Method: Application to Menshutkin-Type Reactions. J. Chem. Phys. 2003, 119, 1601–1610.

[R70] (70) Ten-no S; Hirata F; Kato S. A Hybrid Approach for the Solvent Effect on the Electronic Structure of a Solute Based on the RISM and Hartree-Fock Equations. Chem. Phys. Lett. 1993, 214, 391–396.

[R71] (71) Ten-no S; Hirata F; Kato S. Reference Interaction Site Model Self-Consistent Field Study for Solvation Effect on Carbonyl Compounds in Aqueous Solution. J. Chem. Phys. 1994, 100, 7443–7453.

[R72] (72) Okuyama-Yoshida N; Nagaoka M; Yamabe T. Transition-State Optimization on Free Energy Surface: Toward Solution Chemical Reaction Ergodography. Int. J. Quantum Chem. 1998, 70, 95–103.

[R73] (73) Day PN; Jensen JH; Gordon MS; Webb SP; Stevens WJ; Krauss M; Garmer D; Basch H; Cohen D. An Effective Fragment Method for Modeling Solvent Effects in Quantum Mechanical Calculations. J. Chem. Phys. 1996, 105, 1968–1986.

[R74] (74) Hu H; Lu Z; Yang W. QM/MM Minimum Free-Energy Path: Methodology and Application to Triosephosphate Isomerase. J. Chem. Theory Comput. 2007, 3, 390–406.

[R75] (75) Hu H; Lu Z; Parks JM; Burger SK; Yang W. Quantum Mechanics/Molecular Mechanics Minimum Free-Energy Path for Accurate Reaction Energies in Solution and Enzymes: Sequential Sampling and Optimization on the Potential of Mean Force Surface. J. Chem. Phys. 2008, 128, 034105.

[R76] (76) Hu H; Yang W. Free Energies of Chemical Reactions in Solution and in Enzymes with Ab Initio QM/MM Methods. Annu. Rev. Phys. Chem. 2008, 59, 573–601.

[R77] (77) Ruiz-Pernia JJ; Silla E; Tunon I; Marti S. Hybrid Quantum Mechanics/Molecular Mechanics Simulations with Two-Dimensional Interpolated Corrections: Application to Enzymatic Processes. J. Phys. Chem. B 2006, 110, 17663–17670.

[R78] (78) MacKerell AD Jr.; Bashford D; Bellott M; Dunbrack RL Jr.; Evanseck JD; Field MJ; Fischer S; Gao J; Guo H; Ha S; Joseph-McCarthy D; Kuchnir L; Kuczera K; Lau FTK; Mattos C; Michnick S; Ngo T; Nguyen DT; Prodhom B; Reiher III WE; Roux B; Schlenkrich M; Smith JC; Stote R; Straub J; Watanabe M; Wiorkiewicz-Kuczera J; Yin D; Karplus M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.

[R79] (79) Dewar MJS; Zoebisch EG; Healy EF; Stewart JJP. AM1: A New General Purpose Quantum Mechanical Molecular Model. J. Am. Chem. Soc. 1985, 107, 3902–3909.

[R80] (80) Neria E; Fischer S; Karplus M. Simulation of Activation Free Energies in Molecular Systems. J. Chem. Phys. 1996, 105, 1902–1921.

[R81] (81) Thiel W. MNDO97, v5.0; University of Zurich, Zurich, Switzerland, 1998.

[R82] (82) Brooks BR; Brooks III CL; MacKerell AD Jr.; Nilsson L; Petrella RJ; Roux B; Won Y; Archontis G; Bartels C; Boresch S; Caflisch A; Caves L; Cui Q; Dinner AR; Feig M; Fischer S; Gao J; Hodoscek M; Im W; Kuczera K; Lazaridis T; Ma J; Ovchinnikov V; Paci E; Pastor RW; Post CB; Pu JZ; Schaefer M; Tidor B; Venable RM; Woodcock HL; Wu X; Yang W; York DM; Karplus M. CHARMM: The Molecular Simulation Program. J. Comput. Chem. 2009, 30, 1545–1614.

[R83] (83) Shao Y; Molnar LF; Jung Y; Kussmann J; Ochsenfeld C; Brown ST; Gilbert ATB; Slipchenko LV; Levchenko SV; O'Neill DP; DiStasio RA Jr.; Lochan RC; Wang T; Beran GJO; Besley NA; Herbert JM; Lin CY; Van Voorhis T; Chien SH; Sodt A; Steele RP; Rassolov VA; Maslen PE; Korambath PP; Adamson RD; Austin B; Baker J; Byrd EFC; Dachsel H; Doerksen RJ; Dreuw A; Dunietz BD; Dutoi AD; Furlani TR; Gwaltney SR; Heyden A; Hirata S; Hsu C-P; Kedziora G; Khaliullin RZ; Klunzinger P; Lee AM; Lee MS; Liang W; Lotan I; Nair N; Peters B; Proynov EI; Pieniazek PA; Rhee YM; Ritchie J; Rosta E; Sherrill CD; Simmonett AC; Subotnik JE; Woodcock III HL; Zhang W; Bell AT; Chakraborty AK; Chipman DM; Keil FJ; Warshel A; Hehre WJ; Schaefer III HF; Kong J; Krylov AI; Gill PMW; Head-Gordon M. Advances in Methods and Algorithms in a Modern Quantum Chemistry Program Package. Phys. Chem. Chem. Phys. 2006, 8, 3172–3191.

[R84] (84) Ryckaert J-P; Ciccotti G; Berendsen HJC. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327–337.

[R85] (85) Darden T; York D; Pedersen L. Particle Mesh Ewald: An N·log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089–10092.

[R86] (86) Nam K; Gao J; York DM. An Efficient Linear-Scaling Ewald Method for Long-Range Electrostatics in Combined QM/MM Calculations. J. Chem. Theory Comput. 2005, 1, 2–13.

[R87] (87) Nam K. Acceleration of Ab Initio QM/MM Calculations under Periodic Boundary Conditions by Multiscale and Multiple Time Step Approaches. J. Chem. Theory Comput. 2014, 10, 4175–4183.

[R88] (88) Becke AD. Density‐Functional Thermochemistry. III. The Role of Exact Exchange. J. Chem. Phys. 1993, 98, 5648–5652.

[R89] (89) Lee C; Yang W; Parr RG. Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of Electron Density. Phys. Rev. B 1988, 37, 785–789.

[R90] (90) Stephens PJ; Devlin FJ; Chabalowski CF; Frisch MJ. Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem. 1994, 98, 11623–11627.

[R91] (91) Becke AD. A New Mixing of Hartree–Fock and Local Density-Functional Theories. J. Chem. Phys. 1993, 98, 1372–1377.

[R92] (92) Møller C; Plesset MS. Note on an Approximation Treatment for Many-Electron Systems. Phys. Rev. 1934, 46, 618–622.

[R93] (93) Pople JA; Binkley JS; Seeger R. Theoretical Models Incorporating Electron Correlation. Int. J. Quantum Chem. 1976, Supp. Y-10, 1–19.

[R94] (94) Francl MM; Pietro WJ; Hehre WJ; Binkley JS; Gordon MS; DeFrees DJ; Pople JA. Self‐Consistent Molecular Orbital Methods. XXIII. A Polarization‐Type Basis Set for Second‐Row Elements. J. Chem. Phys. 1982, 77, 3654–3665.

[R95] (95) Okamoto K; Fukui S; Shingu H. Kinetic Studies of Bimolecular Nucleophilic Substitution. VI. Rates of the Menshutkin Reaction of Methyl Iodide with Methylamines and Ammonia in Aqueous Solutions. Bull. Chem. Soc. Jpn. 1967, 40, 1920–1925.

[R96] (96) Okamoto K; Fukui S; Nitta I; Shingu H. Kinetic Studies of Bimolecular Nucleophilic Substitution. VII. Effect of Hydroxylic Solvents on the Nucleophilicity of Aliphatic Amines in the Menschutkin Reaction. Bull. Chem. Soc. Jpn. 1967, 40, 2354–2357.

[R97] (97) Frisch MJ; Pople JA; Binkley JS. Self-Consistent Molecular Orbital Methods 25. Supplementary Functions for Gaussian Basis Sets. J. Chem. Phys. 1984, 80, 3265–3269.

[R98] (98) McLean AD; Chandler GS. Contracted Gaussian-Basis Sets for Molecular Calculations. 1. 2nd Row Atoms, Z=11–18. J. Chem. Phys. 1980, 72, 5639–5648.

[R99] (99) Krishnan R; Binkley JS; Seeger R; Pople JA. Self‐Consistent Molecular Orbital Methods. XX. A Basis Set for Correlated Wave Functions. J. Chem. Phys. 1980, 72, 650–654.

[R100] (100) Knight C; Lindberg GE; Voth GA. Multiscale Reactive Molecular Dynamics. J. Chem. Phys. 2012, 137, 22A525.

[R101] (101) Knight C; Maupin CM; Izvekov S; Voth GA. Defining Condensed Phase Reactive Force Fields from Ab Initio Molecular Dynamics Simulations: The Case of the Hydrated Excess Proton. J. Chem. Theory Comput. 2010, 6, 3223–3232.

[R102] (102) Zhang L; Han J; Wang H; Saidi WA; Car R; E W. End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems. Advances in Neural Information Processing Systems 2018.

[R103] (103) Wang H; Zhang L; Han J; E W. DeePMD-kit: A Deep Learning Package for Many-Body Potential Energy Representation and Molecular Dynamics. Comput. Phys. Commun. 2018, 228, 178–184.

[R104] (104) Zhang Y; Wang H; Chen W; Zeng J; Zhang L; Wang H; E W. DP-GEN: A Concurrent Learning Platform for the Generation of Reliable Deep Learning Based Potential Energy Models. Comput. Phys. Commun. 2020, 253, 107206.

[R105] (105) Lu D.-h.; Zhao M; Truhlar DG. Projection Operator Method for Geometry Optimization with Constraints. J. Comput. Chem. 1991, 12, 376–384.

[R106] (106) Devi-Kesavan LS; Garcia-Viloca M; Gao J. Semiempirical QM/MM Potential with Simple Valence Bond (SVB) for Enzyme Reactions. Application to the Nucleophilic Addition Reaction in Haloalkane Dehalogenase. Theor. Chem. Acc. 2003, 109, 133–139.

[R107] (107) Califano S. Vibrational States. Wiley: New York, 1976.

3. Critical Test: Menshutkin Reaction NH₃ + CH₃Cl