Abstract
In silico investigations of enzymatic reactions and chemical reactions in condensed phases often suffer from formidable computational costs due to the large number of degrees of freedom and the enormous volume of the important region in phase space. Usually, accuracy must be traded for efficiency by lowering the reliability of the Hamiltonians employed or reducing the sampling time. Reference-potential methods (RPM) offer an alternative approach to reaching high simulation accuracy without much loss of efficiency. In this perspective, we summarize the idea of RPM and showcase some recent applications. Most importantly, the pitfalls of these methods are also discussed, and remedies to these pitfalls are presented.
Computer simulation is now widely used to study chemical processes and enzymatic reactions. However, its applications still face three fundamental challenges, as summarized in a recent review by Hansen and van Gunsteren: insufficient or inefficient sampling of the phase space, limited accuracy of the Hamiltonian describing the interaction potential, and the statistical reliability of the methods for data post-processing.1 Unfortunately, the first two difficulties require opposite solutions, which poses a dilemma. If one wants to reach a longer time scale (so that interesting physical processes may take place), further approximations to the Hamiltonian in use will be required. On the other hand, in order to achieve a higher accuracy, one needs to employ a higher-level Hamiltonian, which will further limit the simulation time scale. This dilemma becomes even harder to reconcile when studying chemical or enzymatic reactions, for which a hybrid quantum mechanical/molecular mechanical (QM/MM) description (QM for atoms directly involved in bond forming/breaking and MM for solvent and enzyme atoms) is required. In order to fully converge the calculations of thermodynamic properties along a reaction, tens of ns of simulation are needed, while the time step for all-atom QM/MM molecular dynamics (MD) simulation is typically 0.5 or 1.0 fs. Each step of QM/MM MD propagation may take hundreds or thousands of seconds on a mainstream computer. Therefore, a single QM/MM MD simulation may take years or even tens of years of wall-clock time on a single computer. Although the simulation can be massively parallelized on modern supercomputers, it is still too expensive for routine use. Therefore, it is essential to develop a more efficient methodology for sampling a large number of configurations.
For the trajectory analysis, on the other hand, the calculations of statistical averages require independent and identically distributed (i.i.d.) samples. In order to avoid dependency among the sampled configurations, the sampling time interval should be no less than the correlation time of the physical properties in question. For simulations in condensed phase, the correlation time is on the order of ps. With a time step around 1 fs, only one configuration in every 1000 or more propagation steps can be used for further analysis, with the remaining 99.9% of the configurations discarded. Such a need to sample a large number of configurations of a condensed-phase system and then use only a small percentage of these configurations for subsequent analysis opens up the opportunity for more efficient sampling/analysis methodologies.
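The subsampling argument above can be illustrated with a minimal sketch. It estimates an integrated autocorrelation time from a time series and keeps one frame per correlation time. The AR(1) toy series, the function names, and the truncation rule (stop at the first non-positive autocorrelation) are assumptions made for illustration and do not come from the original work.

```python
import math
import random

def integrated_autocorrelation_time(series):
    """Estimate tau = 1 + 2 * sum_k rho(k), truncated at the first non-positive rho(k)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0.0:
        return 1.0
    tau = 1.0
    for lag in range(1, n // 2):
        cov = sum((series[i] - mean) * (series[i + lag] - mean)
                  for i in range(n - lag)) / (n - lag)
        rho = cov / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau

def subsample(series, tau):
    """Keep one frame every ceil(tau) steps so the retained frames are roughly independent."""
    stride = max(1, math.ceil(tau))
    return series[::stride]

# Toy correlated "trajectory": an AR(1) process standing in for an MD observable.
random.seed(0)
phi = 0.9
x = [0.0]
for _ in range(4999):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))

tau = integrated_autocorrelation_time(x)
decorrelated = subsample(x, tau)
print(f"tau ~ {tau:.1f} steps, kept {len(decorrelated)} of {len(x)} frames")
```

In an MD setting the same logic, applied with a 1 fs time step and a correlation time on the order of ps, leads to the one-in-a-thousand retention rate quoted above.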
To this end, in the 1990s Gao and Warshel and their coworkers independently proposed the reference-potential method (RPM) for the calculation of hydration free energies and of free energy profiles for chemical reactions.2-5 Basically, the RPM is an importance sampling method, which exploits the fact that the expectation
$$\langle O \rangle_{p} = \int O(\mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}$$
under the distribution function $p(\mathbf{x})$, which is unavailable or difficult to obtain, can be efficiently calculated via
$$\langle O \rangle_{p} = \int O(\mathbf{x})\, \frac{p(\mathbf{x})}{q(\mathbf{x})}\, q(\mathbf{x})\, \mathrm{d}\mathbf{x} = \left\langle O\, \frac{p}{q} \right\rangle_{q}$$
with samples drawn from an easy-to-get distribution $q(\mathbf{x})$, if the surrogate distribution $q$ is very close to the target distribution $p$. In the RPM, an initial simulation is carried out utilizing some inexpensive (surrogate) potential energy functions, such as empirical valence bond (EVB)6 and AM1,7 which are expected to be close to the target potential-of-interest, usually an ab initio (ai) Hamiltonian. A perturbation rectification in the spirit of the second equation above is performed in a subsequent step to obtain the statistical properties at the level of the potential-of-interest. Inspired by these pioneering works, RPMs have emerged as powerful tools and have been applied in many studies, especially for the calculation of free energy landscapes. Rod and Ryde used the RPM to compute the free energy barrier of a methyl transfer reaction catalyzed by catechol O-methyltransferase; their combination of molecular mechanical calculations with density functional theory corrections provided valuable insights into the underlying thermodynamics of this biologically important process.8 Beierlein et al. applied the RPM to the calculation of the free energy of protein-ligand binding,9 and two years later, Polyak et al. introduced an RPM called dual-Hamiltonian free energy perturbation (DH-FEP) for calculating free energy profiles of chemical reactions.10 König et al. incorporated the Bennett acceptance ratio method into RPM simulations, leading to the development of the non-Boltzmann Bennett acceptance ratio (NBB) method, which demonstrated improved accuracy in free energy calculations. Dybeck et al. compared the performance of NBB and the multistate Bennett acceptance ratio (MBAR) in solvation free energy calculations and showed that the variances are marginally smaller for MBAR.11 Jia et al. showed the superiority of the BAR+TP approach for RPM-based free energy calculations and, through a comprehensive analysis, established the optimal path for accurate calculations.12 Hudson et al. incorporated energy reweighting into the chain-of-replicas method and the non-equilibrium simulation method for the computation of free energy profiles,13,14 opening up new avenues for the application of RPM. Piccini and Parrinello combined the RPM with metadynamics for the first time to study the free energy profile of an SN2 reaction.15 Giese and York integrated force matching into the molecular mechanics potential tuning process to enhance the similarity between MM and QM/MM potentials, resulting in a higher convergence rate of RPM.16 Rizzi et al. integrated the RPM with machine learning, employing normalizing flows to assist in the correction from the reference potential to the target potential, which offers a promising direction for future research in this area.17 Giese et al. extended the idea of RPM by proposing the generalized weighted thermodynamic perturbation (gwTP) method.18 This approach, which utilizes multiple reference potentials in umbrella sampling and pieces together free energy profile segments, can be used seamlessly with redundant neural network potentials from active learning, and it has the potential to significantly advance the accuracy and efficiency of free energy calculations.19,20
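The importance-sampling identity that underlies the RPM can be sketched numerically. In the toy example below, the surrogate distribution q is a unit Gaussian and the target p is a slightly shifted Gaussian. Both distributions and all function names are illustrative assumptions. The weights are self-normalized, as is usual when the densities are known only up to a constant.

```python
import math
import random

def log_gauss(x, mu, sigma):
    """Log-density of a normal distribution."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def reweighted_mean(samples, observable, log_p, log_q):
    """Self-normalized importance sampling: <O>_p from samples drawn from q."""
    log_w = [log_p(x) - log_q(x) for x in samples]
    shift = max(log_w)                      # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    norm = sum(w)
    return sum(wi * observable(x) for wi, x in zip(w, samples)) / norm

random.seed(1)
samples = [random.gauss(0.0, 1.0) for _ in range(20000)]   # drawn from q = N(0, 1)

estimate = reweighted_mean(
    samples,
    observable=lambda x: x,
    log_p=lambda x: log_gauss(x, 0.3, 1.0),                # target p = N(0.3, 1)
    log_q=lambda x: log_gauss(x, 0.0, 1.0),
)
print(f"<x>_p estimated from q-samples: {estimate:.3f} (exact 0.300)")
```

The estimate degrades quickly when the shift between p and q grows, which is the one-dimensional analogue of a reference potential that is too far from the target potential.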
In this perspective, we will first briefly introduce the statistical basis of the RPM, and then showcase some applications of the RPM conducted in our own groups. Finally, we will discuss potential future directions in RPM-based free energy simulations.
Theory
Due to the large gap in spatial-temporal scales between experimental physical/chemical processes and all-atom simulations, enhanced sampling on multiple thermodynamic states is now routinely employed to accelerate the exploration of phase space. For the study of chemical reactions, the most widely used enhanced sampling method is the umbrella sampling (US) method,21 in which a direct propagation from reactant to product is replaced by stratified windows aligned along a presumed low-dimensional reaction pathway. For the simulation in each window, a (harmonic) restraining potential is applied to keep the system in the vicinity of a prescribed region of phase space and prevent it from falling back to the reactant or product states. The resulting biased potential can be written as
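$$U^{\mathrm{b}}_{k}(\mathbf{r}) = U_{0}(\mathbf{r}) + W_{k}\big(\xi(\mathbf{r})\big) \tag{1}$$
for the kth thermodynamic state. $U_{0}$ and $W_{k}$ are the unbiased and the restraining potential energy functions, and $\xi$ is the collective variable (CV) describing the reaction process. In normal US calculations, $U_{0}$ is the Hamiltonian-of-interest. However, in the RPM, it is the reference Hamiltonian. In each sampling window, the degrees of freedom (DoF) orthogonal to $\xi$ need to be adequately sampled before convergence can be reached. However, hidden barriers may slow down the exploration of the orthogonal DoF, so long simulations are sometimes necessary.
After running $K$ window simulations using a series of biased potentials $U^{\mathrm{b}}_{k}$, each simulation contributes $N_{k}$ samples. All the samples are pooled for the calculation of the expectation of any thermodynamic property $O$ under the unbiased Hamiltonian $U_{0}$ via
$$\langle O \rangle_{0} = \frac{\sum_{l=1}^{N} w_{0}(\mathbf{r}_{l})\, O(\mathbf{r}_{l})}{\sum_{l=1}^{N} w_{0}(\mathbf{r}_{l})} \tag{2}$$
where $N = \sum_{k} N_{k}$ is the total number of samples, and $w_{0}(\mathbf{r}_{l})$ is the unnormalized weight of sample $l$ under the unbiased Hamiltonian $U_{0}$. Using the MBAR,22 the weight can be written as
$$w_{0}(\mathbf{r}_{l}) = \frac{\exp\left[-\beta U_{0}(\mathbf{r}_{l})\right]}{\sum_{k=1}^{K} N_{k} \exp\left[f_{k} - \beta U^{\mathrm{b}}_{k}(\mathbf{r}_{l})\right]} \tag{3}$$
where $\beta = 1/k_{\mathrm{B}}T$ and
$$f_{k} = -\ln \sum_{l=1}^{N} \frac{\exp\left[-\beta U^{\mathrm{b}}_{k}(\mathbf{r}_{l})\right]}{\sum_{k'=1}^{K} N_{k'} \exp\left[f_{k'} - \beta U^{\mathrm{b}}_{k'}(\mathbf{r}_{l})\right]} \tag{4}$$
is the estimated (reduced) free energy of state $k$ and must be solved iteratively. The uncertainties can be estimated using the asymptotic covariance, bootstrapping or the block average.22,23 With the estimated free energy of the unbiased state
$$f_{0} = -\ln \sum_{l=1}^{N} w_{0}(\mathbf{r}_{l}) \tag{5}$$
the normalized weight under the unbiased Hamiltonian can be written as
$$\hat{w}_{0}(\mathbf{r}_{l}) = \frac{\exp\left[f_{0} - \beta U_{0}(\mathbf{r}_{l})\right]}{\sum_{k=1}^{K} N_{k} \exp\left[f_{k} - \beta U^{\mathrm{b}}_{k}(\mathbf{r}_{l})\right]} \tag{6}$$
It can be seen from the formulation above that in US simulations we only carry out biased simulations, from which unbiased properties can be obtained. This is an extrapolation process in the Hamiltonian space, although this extrapolation is usually mild, and the calculated unbiased properties are reliable with small uncertainties. Similarly, given sufficient samples, we can apply extrapolation to any other state, for instance to the target Hamiltonian $U_{\mathrm{t}}$ in the RPM. The normalized weight under the target Hamiltonian is now written as
$$\begin{aligned} \hat{w}_{\mathrm{t}}(\mathbf{r}_{l}) &= \frac{\exp\left[f_{\mathrm{t}} - \beta U_{\mathrm{t}}(\mathbf{r}_{l})\right]}{\sum_{k=1}^{K} N_{k} \exp\left[f_{k} - \beta U^{\mathrm{b}}_{k}(\mathbf{r}_{l})\right]} \\ &= \frac{\hat{w}_{0}(\mathbf{r}_{l}) \exp\left[-\beta \Delta U(\mathbf{r}_{l})\right]}{\sum_{l'=1}^{N} \hat{w}_{0}(\mathbf{r}_{l'}) \exp\left[-\beta \Delta U(\mathbf{r}_{l'})\right]} \end{aligned} \tag{7}$$
with
$$\exp(-f_{\mathrm{t}}) = \sum_{l=1}^{N} \frac{\exp\left[-\beta U_{\mathrm{t}}(\mathbf{r}_{l})\right]}{\sum_{k=1}^{K} N_{k} \exp\left[f_{k} - \beta U^{\mathrm{b}}_{k}(\mathbf{r}_{l})\right]} \tag{8}$$
being the normalization factor. The “aggressiveness” of the extrapolation depends on the distribution width of the difference $\Delta U = U_{\mathrm{t}} - U_{0}$ between the target Hamiltonian $U_{\mathrm{t}}$ and the reference Hamiltonian $U_{0}$. The second line of Eq. 7 indicates that it can be considered as a weighted free energy perturbation. Thermodynamic properties under the target Hamiltonian can thus be calculated via
$$\langle O \rangle_{\mathrm{t}} = \sum_{l=1}^{N} \hat{w}_{\mathrm{t}}(\mathbf{r}_{l})\, O(\mathbf{r}_{l}) \tag{9}$$
The operator $O$ can, for instance, measure the bond length, the charge distribution, etc. When it is the indicator function
$$\mathbb{1}_{m}(\xi) = \begin{cases} 1, & |\xi - \xi_{m}| \le \delta\xi/2 \\ 0, & \text{otherwise} \end{cases} \tag{10}$$
for the mth bin of width $\delta\xi$ centered at $\xi_{m}$, its ensemble average yields the potential of mean force (PMF), up to an additive constant,
$$F_{\mathrm{t}}(\xi_{m}) = -\beta^{-1} \ln \left\langle \mathbb{1}_{m}(\xi) \right\rangle_{\mathrm{t}} \tag{11}$$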
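The workflow of this section (umbrella sampling on a reference potential, MBAR weights, a weighted thermodynamic perturbation to the target potential, and a binned PMF) can be sketched on a one-dimensional toy model. Everything specific here is an assumption made for illustration: the two harmonic potentials, the reduced units with beta = 1, the window layout, and the function names. The biased windows are sampled exactly because the toy potentials are harmonic. A real application would take energies from MD trajectories and would evaluate the exponentials with a log-sum-exp scheme.

```python
import math
import random

BETA = 1.0  # reduced units, an assumption of this toy model

def u_ref(x):
    """Hypothetical cheap reference potential."""
    return 0.5 * x * x

def u_target(x):
    """Hypothetical expensive target potential."""
    return 0.5 * (x - 0.5) ** 2

def restraint(x, center, k_spring):
    """Harmonic umbrella restraint of one window."""
    return 0.5 * k_spring * (x - center) ** 2

def mixture_denominators(f, counts, boltz):
    """sum_k N_k exp(f_k - beta U_k^b(x_l)) for every pooled sample l."""
    pref = [n * math.exp(fk) for n, fk in zip(counts, f)]
    return [sum(p * row[l] for p, row in zip(pref, boltz)) for l in range(len(boltz[0]))]

def solve_mbar(boltz, counts, tol=1e-7, max_iter=5000):
    """Self-consistent MBAR iteration; the first window is pinned to f = 0."""
    f = [0.0] * len(boltz)
    for _ in range(max_iter):
        denom = mixture_denominators(f, counts, boltz)
        f_new = [-math.log(sum(b / d for b, d in zip(row, denom))) for row in boltz]
        f_new = [fk - f_new[0] for fk in f_new]
        change = max(abs(a - b) for a, b in zip(f, f_new))
        f = f_new
        if change < tol:
            break
    return f

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def pmf(weights, xs, center, width=0.2):
    """-kT ln of the total normalized weight that falls into one CV bin."""
    p = sum(w for w, x in zip(weights, xs) if abs(x - center) <= 0.5 * width)
    return -math.log(p) / BETA

random.seed(2)
centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
k_spring, n_per_window = 4.0, 1000
xs = []
for c in centers:
    # reference + restraint is harmonic here, so each window is sampled exactly
    mean = k_spring * c / (1.0 + k_spring)
    sigma = (BETA * (1.0 + k_spring)) ** -0.5
    xs += [random.gauss(mean, sigma) for _ in range(n_per_window)]

counts = [n_per_window] * len(centers)
boltz = [[math.exp(-BETA * (u_ref(x) + restraint(x, c, k_spring))) for x in xs]
         for c in centers]
f = solve_mbar(boltz, counts)
denom = mixture_denominators(f, counts, boltz)

# Unbiased reference weights, then weighted perturbation to the target potential.
w_ref = normalize([math.exp(-BETA * u_ref(x)) / d for x, d in zip(xs, denom)])
w_tgt = normalize([w * math.exp(-BETA * (u_target(x) - u_ref(x)))
                   for w, x in zip(w_ref, xs)])

gap_ref = pmf(w_ref, xs, -0.5) - pmf(w_ref, xs, 0.5)  # symmetric reference: near 0
gap_tgt = pmf(w_tgt, xs, -0.5) - pmf(w_tgt, xs, 0.5)  # target: near 0.5
print(f"PMF gap at reference level {gap_ref:.3f}, at target level {gap_tgt:.3f}")
```

No sampling is ever performed on `u_target`; it is only evaluated on configurations generated with the reference potential, which is where the cost saving of the RPM comes from.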
Applications
Proton transfer between titratable groups is ubiquitous in biomolecules. As one of the simplest model systems, the tautomerization reaction of a glycine molecule in aqueous solution was studied by us using the RPM.24 The semiempirical methods PM3 and PM6 were chosen as the reference potentials, and the target potential was density functional theory with the B3LYP functional and the 6-31G(d) basis set. The free energy profiles show a significant dependence on the Hamiltonian. As shown in Fig. 1.a, at the PM3 level the free energy profile is qualitatively wrong, with a reaction free energy of nearly zero. PM6 yields a qualitatively correct free energy profile. However, it overestimates the reaction free energy compared with the result at the DFT level of theory. As shown in Fig. 1.b, after the corrections from the PM3 and PM6 levels to the DFT level, the free energy profiles show much improved agreement with the direct DFT calculation, while the computational cost of this indirect approach is only 3.4% of that of the direct approach.
With the continuous development of force fields, classical molecular modeling is becoming more and more accurate.28-35 However, there is still room for further improvement. Currently, the quality of force fields is improved either by introducing more atom types or by adding extra terms to the functional form, as in the polarizable force fields.32,36,37 Both approaches require computationally intensive benchmarking. As an alternative solution, the RPM can be applied to improve the accuracy of classical force field based simulations. For 3-hydroxypropanal, for instance, the free energy profile for the dihedral rotation along the C─C bond (shown in Fig. 1.c) at the molecular mechanics (MM) level of theory was calculated via the MBAR analysis of the umbrella sampling trajectories, and it was extrapolated using the RPM to the QM/MM level with the solute molecule in the QM region and the remaining atoms in the MM region. Figure 1.c shows the free energy profiles at the MM and B3LYP/6-31G(d)/MM levels, in which the shaded areas are the 95% confidence regions. The profiles show different preferences for the planar (dihedral angle ≈ 180°) and nonplanar (dihedral angle ≈ ±60°) structures at the MM and QM/MM levels of theory.
The microscopic explanation of the endo/exo stereoselectivity of the Diels–Alder (DA) reaction between cyclopentadiene and methyl vinyl ketone (MVK) in aqueous solution has posed a challenge to computational chemists.38-43 Quantum mechanical calculations utilizing a continuum solvent model often fail to accurately predict the reaction barrier. Therefore, sampling of the reaction at a high level of theory in an explicit solvent model is needed. With the RPM, the US was performed at the PM6/MM level, and later an extrapolation to the B3LYP/MM level was carried out.25 The statistical analysis at the B3LYP/MM level shows that the stereoselectivity mainly comes from the solvation effect. At their respective transition states, the first peak of the solvent distribution around the oxygen atom in MVK is slightly closer for the endo pathway than for the exo pathway (shown in Fig. 1.d). Although one order of magnitude smaller than the experimental measurement, the predicted endo/exo ratio is qualitatively correct. A further improvement will require a more accurate QM/MM Hamiltonian as the target potential, longer simulations, and a more rational definition of the collective variable.
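To put the "one order of magnitude" discrepancy in the endo/exo ratio into energetic terms, the sketch below converts between a selectivity ratio and the corresponding difference in activation free energy. It assumes transition-state theory with identical prefactors for the two pathways and a temperature of 298.15 K. Neither assumption is stated in the text above, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def barrier_gap_kcal(ratio, temperature=298.15):
    """Activation free energy difference (exo minus endo) implied by an endo/exo ratio."""
    return R_KCAL * temperature * math.log(ratio)

def selectivity_ratio(gap_kcal, temperature=298.15):
    """Endo/exo ratio implied by an activation free energy difference."""
    return math.exp(gap_kcal / (R_KCAL * temperature))

# A tenfold error in the ratio corresponds to roughly 1.4 kcal/mol in the barrier difference.
print(f"{barrier_gap_kcal(10.0):.2f} kcal/mol")
```

Under these assumptions, an error of one order of magnitude in the ratio amounts to an error of about 1.4 kcal/mol in the relative barrier, which is comparable to the typical accuracy of the target Hamiltonian itself.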
The accuracy of QM/MM calculations depends not only on the Hamiltonian of the QM region, but also on the partitioning scheme of the QM and MM regions. With a small QM region, one achieves higher computational efficiency, often at the expense of accuracy.44-48 In most cases, the QM region is chosen by chemical intuition, and the convergence of the calculated properties with respect to the QM size is rarely checked in actual application projects due to the steep computational cost. Moreover, QM/MM calculations may face technical difficulties when the reactive region varies over time. For instance, some solvent molecules may directly participate in the reaction beyond serving as a dielectric medium, and the exchange of water molecules between the reactive QM region and the surrounding MM region may occur on a time scale similar to the reaction time. In addition to the existing restrained QM/MM methods49-51 and adaptive QM/MM methods,52-59 RPM has been suggested as an alternative solution. The nucleophilic addition inside the 4-(dimethylamino)butanal molecule is a typical example. In this reaction, the solvent molecules stabilize the reaction product by accepting excess electrons from the aldehyde group. In the actual condensed-phase system, the solvent molecules in different solvation layers surrounding the aldehyde group may exchange, resulting in a large number of permutations. However, partitioning of the QM and MM regions with some solvent molecules included in the QM region breaks this symmetry, and once the exchange occurs, it may result in a discontinuity in the QM region. In essence, different QM/MM partitioning schemes correspond to different Hamiltonians.
In order to avoid this technical difficulty, the lowest level of the partitioning scheme, where the QM region contains only the 4-(dimethylamino)butanal molecule, was utilized in our work as the reference potential, while the target potential includes several nearest solvent molecules in the QM region.26 As shown in Fig. 1.e, by extrapolating from a semiempirical Hamiltonian to a DFT level of theory and from the minimal QM region to larger QM regions with different numbers of solvent molecules, the accuracy could be improved and the convergence with respect to the QM size could be examined with remarkably increased efficiency.
The RPM can be applied not only to classical QM/MM trajectories but also to path integral QM/MM molecular dynamics simulations for the study of quantum delocalization of light particles such as protons. As a typical example, the protonated 1,8-bis(dimethylamino)naphthalene (DMANH) molecule has a short hydrogen donor-acceptor distance.60 Therefore, the quantum tunneling effect for the proton transfer between the two nitrogen atoms can be non-negligible. Using the RPM, the simulation time was extended by us to the nanosecond scale at the PM6/MM level with 16 beads for each QM atom, and then a PM6/MM to BLYP-D3/6-31G(d)/MM extrapolation was applied. Our results showed that the lowest-free-energy structure at the PM6/MM level prefers a relatively more localized proton, while at the DFT level of theory a more diffuse proton is preferred (see Fig. 1.f). A 545-fold reduction in the total CPU time was achieved while reaching an accuracy comparable to the DFT level of theory.27
Remedies to Pitfalls
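It can be seen from Eq. 7 that the RPM is fundamentally a free energy perturbation (FEP) method, with the energy difference $\Delta U$ between the target and the reference Hamiltonians appearing in the exponent in the numerator, except that each configuration carries an unequal weight. Therefore, it naturally inherits the numerical difficulty of FEP, i.e. the width of the distribution of $\Delta U$ determines the convergence rate with respect to the sample size.61-66 Any method that can shrink the distribution width of $\Delta U$ can improve the convergence. Quantitative criteria to guide the convergence have long been desired. So far, many criteria have been proposed, such as the variance of the energy difference,67,68 the bias measure Π,63,69 and the overlap matrix.70 To characterize the reliability of the TP calculation, the “reweighting entropy”71 is introduced, which is defined as
$$\mathcal{S}_{m} = -\frac{1}{\ln N_{m}} \sum_{l=1}^{N_{m}} P_{l} \ln P_{l} \tag{12}$$
for the $N_{m}$ samples collected in the mth bin around $\xi_{m}$, and
$$P_{l} = \frac{w_{0}(\mathbf{r}_{l}) \exp\left[-\beta \Delta U(\mathbf{r}_{l})\right]}{\sum_{l'=1}^{N_{m}} w_{0}(\mathbf{r}_{l'}) \exp\left[-\beta \Delta U(\mathbf{r}_{l'})\right]} \tag{13}$$
which are normalized in the mth bin. Here $w_{0}(\mathbf{r}_{l})$ is the MBAR weight of sample $l$ under the unbiased reference Hamiltonian and $\beta = 1/k_{\mathrm{B}}T$. The reweighting entropy measures the flatness of the distribution of the weights $P_{l}$. An even distribution of $P_{l}$ leads to $\mathcal{S}_{m}$ close to 1, while a sharply peaked distribution (only a very small number of samples have a non-negligible weight) makes $\mathcal{S}_{m}$ close to 0.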
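A minimal sketch of the reweighting entropy for the samples of a single bin is given below. The function name and its arguments are illustrative assumptions: `weights` stands for the reference-level sample weights and `delta_u` for the target-minus-reference energy differences of the same samples.

```python
import math

def reweighting_entropy(weights, delta_u, beta=1.0):
    """Normalized Shannon entropy of the perturbed weights within one bin (1 = flat, 0 = one sample dominates)."""
    n = len(weights)
    if n < 2:
        raise ValueError("at least two samples are needed in the bin")
    log_p = [math.log(w) - beta * du for w, du in zip(weights, delta_u)]
    shift = max(log_p)                       # avoid overflow in the exponentials
    p = [math.exp(lp - shift) for lp in log_p]
    total = sum(p)
    p = [pi / total for pi in p]
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0) / math.log(n)

flat = reweighting_entropy([1.0] * 10, [0.0] * 10)             # every sample counts equally
peaked = reweighting_entropy([1.0] * 10, [0.0] + [50.0] * 9)   # one sample dominates
print(f"flat: {flat:.3f}, peaked: {peaked:.3e}")
```

A bin whose entropy is close to 0 is effectively represented by one or two configurations, so its free energy value should be treated with caution regardless of how many frames it nominally contains.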
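Another metric is the smoothness of the density of states (DoS). The potential of mean force under the target Hamiltonian can be rewritten as an integral in the space of the energy difference $\Delta U$ between the target and the reference Hamiltonians,
$$F_{\mathrm{t}}(\xi_{m}) = -\beta^{-1} \ln \int \Omega_{m}(\Delta U) \exp(-\beta \Delta U)\, \mathrm{d}\Delta U + C \tag{14}$$
where
$$\Omega_{m}(\Delta U) = \sum_{l=1}^{N} w_{0}(\mathbf{r}_{l})\, \delta\big(\Delta U - \Delta U(\mathbf{r}_{l})\big)\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big) \tag{15}$$
is the DoS of $\Delta U$ in the mth bin around $\xi_{m}$, $w_{0}(\mathbf{r}_{l})$ is the weight of sample $l$ under the unbiased reference Hamiltonian,
$$\mathbb{1}_{m}(\xi) = \begin{cases} 1, & |\xi - \xi_{m}| \le \delta\xi/2 \\ 0, & \text{otherwise} \end{cases} \tag{16}$$
is again the indicator function, and C is an irrelevant constant.72 With a continuous energy function, the DoS should be intrinsically continuous. However, with finite samples, the estimated DoS can be noisy, especially in the rarely sampled region. This sampling noise in the low-energy region may degrade the calculated ensemble averages. With a large number of samples, $\Omega_{m}(\Delta U)$ can be fitted to a Gaussian with the mean being
$$\mu_{m} = \frac{\sum_{l} w_{0}(\mathbf{r}_{l})\, \Delta U(\mathbf{r}_{l})\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)}{\sum_{l} w_{0}(\mathbf{r}_{l})\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)} \tag{17}$$
and the variance
$$\sigma_{m}^{2} = \frac{\sum_{l} w_{0}(\mathbf{r}_{l}) \left[\Delta U(\mathbf{r}_{l}) - \mu_{m}\right]^{2} \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)}{\sum_{l} w_{0}(\mathbf{r}_{l})\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)} \tag{18}$$
With this Gaussian-shaped DoS, the probability of falling into the small energy bin $j$ with a width of $\delta$ near $\Delta U_{j}$ is
$$P^{\mathrm{G}}_{m}(\Delta U_{j}) = \frac{\delta}{\sqrt{2\pi}\, \sigma_{m}} \exp\left[-\frac{(\Delta U_{j} - \mu_{m})^{2}}{2\sigma_{m}^{2}}\right] \tag{19}$$
whereas the sampled probability is
$$P^{\mathrm{S}}_{m}(\Delta U_{j}) = \frac{\sum_{l} w_{0}(\mathbf{r}_{l})\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)\, \mathbb{1}_{j}\big(\Delta U(\mathbf{r}_{l})\big)}{\sum_{l} w_{0}(\mathbf{r}_{l})\, \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big)} \tag{20}$$
with $\mathbb{1}_{j}$ the indicator function of energy bin $j$. By rescaling the sample weights via
$$w'_{0}(\mathbf{r}_{l}) = w_{0}(\mathbf{r}_{l})\, \frac{P^{\mathrm{G}}_{m}(\Delta U_{j})}{P^{\mathrm{S}}_{m}(\Delta U_{j})} \tag{21}$$
for each sample $l$ that falls into CV bin $m$ and energy bin $j$, the potential of mean force becomes
$$F_{\mathrm{t}}(\xi_{m}) = -\beta^{-1} \ln \sum_{l=1}^{N} w'_{0}(\mathbf{r}_{l}) \exp\left[-\beta \Delta U(\mathbf{r}_{l})\right] \mathbb{1}_{m}\big(\xi(\mathbf{r}_{l})\big) + C \tag{22}$$
The results show that with this Gaussian smoothing over the DoS, the potential of mean force becomes much less noisy, as shown in Fig. 2.a.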
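The Gaussian smoothing described above can be sketched for the samples of a single CV bin as follows. This is an illustrative reading of the procedure and not the authors' implementation: the energy-bin width, the function names, and the synthetic Gaussian energy differences are all assumptions. For Gaussian-distributed energy differences with mean mu and standard deviation sigma, the exact exponential average is known (mu minus beta times sigma squared over 2 in free energy terms), which gives a reference value for the toy data.

```python
import math
import random

def gaussian_smoothed_weights(weights, delta_u, energy_bin=0.25):
    """Rescale the weights of one CV bin so the energy-difference histogram follows the fitted Gaussian."""
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, delta_u)) / total
    var = sum(w * (d - mean) ** 2 for w, d in zip(weights, delta_u)) / total
    sigma = math.sqrt(var)
    bins = [math.floor(d / energy_bin) for d in delta_u]
    sampled = {}                                  # sampled probability of each energy bin
    for j, w in zip(bins, weights):
        sampled[j] = sampled.get(j, 0.0) + w / total
    rescaled = []
    for j, w in zip(bins, weights):
        center = (j + 0.5) * energy_bin
        p_gauss = (energy_bin / (sigma * math.sqrt(2.0 * math.pi))
                   * math.exp(-0.5 * ((center - mean) / sigma) ** 2))
        rescaled.append(w * p_gauss / sampled[j])
    return rescaled

def free_energy_term(weights, delta_u, beta=1.0):
    """-kT ln of the weighted exponential average of the energy difference."""
    return -math.log(sum(w * math.exp(-beta * d) for w, d in zip(weights, delta_u))) / beta

random.seed(3)
n = 5000
delta_u = [random.gauss(1.0, 1.5) for _ in range(n)]    # synthetic energy differences
weights = [1.0 / n] * n                                 # equal reference weights

raw = free_energy_term(weights, delta_u)
smoothed = free_energy_term(gaussian_smoothed_weights(weights, delta_u), delta_u)
exact = 1.0 - 0.5 * 1.5 ** 2
print(f"raw {raw:.3f}, smoothed {smoothed:.3f}, analytic {exact:.3f}")
```

The rescaling only redistributes weight among energy bins that contain samples, so it reduces noise but cannot recover low-energy regions that were never visited.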
Most semiempirical QM/MM Hamiltonians show limited similarity to ab initio QM/MM Hamiltonians; therefore, the important region in the phase space on a semiempirical QM/MM (free) energy surface does not necessarily cover the important region of the ab initio QM/MM one. This may lead to aggressive extrapolation if the RPM is employed, and slow convergence may degrade the calculated results. Even when the free energy properties can be restored by the correction from the reference potential to the target potential, the recovery of the geometric properties, e.g. the reaction pathway in a two-dimensional or even higher-dimensional space, can be much more difficult. By simply altering the importance of each sample, one does not gain access to the unsampled important configurations of the target Hamiltonian. In order to strengthen the similarity between the reference and the target Hamiltonians, calibration of semiempirical Hamiltonians via force matching is one of the promising approaches. In a recent work, the parameters of the standard PM3 method were optimized, while constrained to within ±5% of their original values, using the force matching method against the B3LYP/6-31G(d) level of theory for a series of reactions.73 As shown in Fig. 2.b, the reparametrized PM3 method can produce a much improved reaction pathway projected in a 2D CV space for the chorismate mutase reaction. After a correction from this newly reparametrized PM3 Hamiltonian to the DFT Hamiltonian, the free energy profile was accurately reproduced. Although such a reparametrization of the semiempirical Hamiltonian against high-level QM methods for molecules of interest can improve the convergence rate of RPM, the magnitude of improvement is usually limited due to the relatively small number of parameters available for tuning. Artificial neural networks (ANNs) provide the capability to further correct the semiempirical Hamiltonians towards higher-level Hamiltonians.
In a recent study, we trained a delta machine learning potential (ΔMLP) to reproduce the differences between the ai-QM/MM and semiempirical (se) QM/MM energies and forces. With this machine learning correction to the semi-empirical Hamiltonian, the ai-QM/MM energy and forces could be well reproduced with errors less than 1.0 kcal · mol−1 and 1.0 kcal · mol−1 · Å−1, respectively, on average for representative configurations along the reaction pathway for Menshutkin and chorismate mutase reactions.74 Thus, the free energy profiles and the reaction pathways show much improved agreement with the ground truth (at the DFT level of theory) as shown in Fig. 2.d. Such machine-learning assisted potential refinement can be greatly helpful for enhancing the applicability of RPM. It is important to note that the final results, such as the free energy barrier and reaction free energy, are highly dependent on the choice of target Hamiltonian, which we have intentionally set as density functional theory with a small basis set for the sake of convenience in presentation. However, a higher level of theory may be desirable for comparison with experimental measurements. For example, Brickel and Meuwly reported a barrier of 12.4 kcal/mol for the chorismate mutase reaction at the MP2/6-311++G(2d,2p) level of theory,75 which closely agrees with our study’s result of 12.1 kcal/mol. In contrast, Turan et al. obtained a barrier of 18.0 kcal/mol for the Menshutkin reaction at the same MP2/6-311++G(2d,2p) level of theory,76 which is 2.7 kcal/mol higher than our DFT level of theory result. The choice of a suitable target Hamiltonian or improving the density functional’s quality is beyond the scope of this perspective. Therefore, readers should exercise caution in selecting the target Hamiltonian, as the accuracy of the RPM method cannot exceed that of the chosen target Hamiltonian.
Parallel to the idea of improving the reference potential, the convergence of RPM can be facilitated by optimizing the mapping from the configurations sampled with the reference potential to those of the target potential using mathematical transformations, such as targeted free energy perturbation (TFEP)77-79 and normalizing flows.17 The basic idea of TFEP is to find an optimal auxiliary state A′ (B′), which can be mapped from the sampled state A (B) via an invertible transformation $\mathcal{M}$ and has more significant overlap with the target state B (A) than A (B) does. Therefore, the FEP from the auxiliary state to the target state converges faster than the original FEP from the sampled state to the target state.77 However, for a complex system, the optimal $\mathcal{M}$ is difficult to find. Wirnsberger et al. proposed to use a normalizing flow for the mapping and to optimize its parameters79 by minimizing the Kullback-Leibler (KL) divergence ($D_{\mathrm{KL}}(A' \,\|\, B)$ and/or $D_{\mathrm{KL}}(B' \,\|\, A)$). Rizzi et al. exploited the fact that this idea can also be used for the calculation of free energy surfaces, and they applied this method to the analysis of the samples from umbrella sampling using
(23) |
where the energy difference that enters the equation is the one between the auxiliary state mapped from the target Hamiltonian and the reference Hamiltonian.17 Here, the equation has been reformulated rather than given in the weighted FEP form of their original paper. They showed that this configuration mapping using a normalizing flow can accelerate the convergence of the RPM for an asymmetric SN2 reaction.
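The effect of a configuration mapping on FEP convergence can be illustrated with a one-dimensional toy model in which the optimal map is known analytically. The sketch below uses the standard targeted FEP estimator, in which the energy difference is evaluated on the mapped configuration and corrected by the log-Jacobian of the map. The two harmonic potentials, the affine map (a stand-in for a learned normalizing flow), reduced units with beta = 1, and all function names are assumptions made for illustration.

```python
import math
import random

MU, S = 2.0, 0.5   # hypothetical target minimum and width

def u_a(x):
    """Sampled (reference) state: unit harmonic well at the origin."""
    return 0.5 * x * x

def u_b(x):
    """Target state: narrower harmonic well centered at MU."""
    return 0.5 * ((x - MU) / S) ** 2

def fep(samples, beta=1.0):
    """Plain FEP from A to B using unmapped configurations."""
    avg = sum(math.exp(-beta * (u_b(x) - u_a(x))) for x in samples) / len(samples)
    return -math.log(avg) / beta

def targeted_fep(samples, mapping, log_jacobian, beta=1.0):
    """Targeted FEP: energy difference on mapped configurations minus kT ln|J|."""
    avg = 0.0
    for x in samples:
        phi = u_b(mapping(x)) - u_a(x) - log_jacobian(x) / beta
        avg += math.exp(-beta * phi)
    return -math.log(avg / len(samples)) / beta

random.seed(4)
samples = [random.gauss(0.0, 1.0) for _ in range(2000)]   # Boltzmann samples of state A

exact = -math.log(S)                                      # analytic free energy difference
plain = fep(samples)
mapped = targeted_fep(samples, lambda x: S * x + MU, lambda x: math.log(S))
print(f"exact {exact:.4f}, plain FEP {plain:.4f}, targeted FEP {mapped:.4f}")
```

Because the affine map transports state A exactly onto state B in this toy case, every sample contributes the same value and the targeted estimate has zero variance, whereas plain FEP depends on rare samples near the target minimum. For realistic systems the map is only approximate, so the gain is smaller.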
Outlook
In this Perspective, we have reviewed the theory and applications of reference-potential methods, while also highlighting their limitations. Despite progress in improving the robustness of these methods, it is crucial for practitioners to be aware of potential sources of error and to regularly assess the convergence of results. Looking forward, we anticipate that further advancements in quantum chemistry and machine learning, such as the emergence of new semiempirical quantum mechanical methods,80,81 transfer-learned and Δ-machine-learned potential energy functions,82-84 and optimal transport theory,85 will significantly improve the applicability of reference-potential methods in the near future. Moreover, we believe that these methods can help bridge the gap between computational studies and experimental investigations, and thereby strengthen the use of computational methods for understanding and interpreting experiments.
Acknowledgement
Y. Mei is supported by the National Natural Science Foundation of China (Grant No. 22073030). Y. Mo is supported by the National Natural Science Foundation of China (Grant No. 21973030). Y. Shao is supported by the National Institutes of Health (Grant No. R01GM135392).
Biographies
Jia-Ning Wang is a Ph.D. student in Dr. Ye Mei’s group at East China Normal University. He received his B.S. degree from the National Base for Fundamental Sciences (Mathematics & Physics) at Inner Mongolia University, China, in 2013. His research interests lie in multiscale simulations and free energy methods.
Yuanfei Xue is now pursuing her Ph.D. in Dr. Ye Mei’s group at East China Normal University. She obtained her master’s degree in chemistry from the University of Manchester, UK, in 2018. Her research interest spans from accelerated simulation methods to rational enzymatic evolution.
Dr. Pengfei Li holds a Bachelor of Science degree in Physics from Shandong Normal University, which he obtained in 2014. He completed his Ph.D. in Physics at East China Normal University in 2019, where his research focused on the development of multiscale free energy simulations for chemical reactions, solvation of small molecules, and protein-ligand binding free energy prediction. Following the completion of his Ph.D., Dr. Li joined Silicon Therapeutics LLC as a scientist, where he played a pivotal role in establishing a computational physics platform for in-silico design and optimization of small molecule drugs. He then moved on to TandemAI LLC in 2022 as a principal scientist, where he contributed to the development of computational platforms for predicting protein-ligand binding affinity. Currently, Dr. Li serves as a senior application scientist in the field of computational chemistry at Single Particle LLC, showcasing his expertise and contributions to the field.
Dr. Xiaoliang Pan received his Ph.D. in Physical Chemistry from Jilin University in 2012. He currently works as a postdoctoral researcher at the University of Oklahoma. Before that he also did research at the University of Arizona. His scientific interests are in simulating enzyme reactions and developing computational tools and algorithms to accelerate QM/MM free energy calculations. His most recent research includes combining physical-based models and machine learning techniques to model biomolecular systems.
Dr. Meiting Wang received her Ph.D. in computational biophysics from East China Normal University in 2019. She currently holds the position of postdoctoral fellow in the Department of Computational Chemistry at Lund University. Dr. Wang’s primary research interest centers around the development and application of free energy calculation methods, with a particular focus on accurately and efficiently predicting the binding free energy between proteins and small molecules using theoretical approaches. In addition, she is actively engaged in research related to computer-aided drug design, reflecting her diverse interests and expertise in the field of computational biophysics.
Dr. Yihan Shao obtained his Ph.D. in Physical Chemistry from the University of California at Berkeley in 2002. After a stint at the Q-Chem software company (as a staff, senior, and principal scientist), he joined the University of Oklahoma in 2016 as an Assistant Professor, and was promoted to Associate Professor in 2022. Lately, he got interested in exploring how chemical/photochemical/enzyme reactions and bioimaging probes work.
Dr. Yan Mo is an Associate Professor at East China Normal University. Her research mainly focuses on the theoretical and computational study of excitation energy transfer in light-harvesting systems, as well as the ultrafast dynamics of polymer and protein side chains at the air-water interface. Her expertise in these areas contributes to our understanding of complex molecular processes and their practical applications in renewable energy, materials science, and biophysics.
Dr. Ye Mei obtained his Ph.D. in Physical Chemistry from Nanjing University in 2007. He became an Associate Professor in the Department of Physics at East China Normal University in 2009 after completing his postdoctoral career at the Department of Physics of Nanjing University, and in 2012 he was promoted to Full Professor. His research mainly focuses on linear scale quantum mechanical methods for proteins and statistical methods for free energy calculations. He is also interested in rational enzymatic evolution and developing empirical interaction potentials for biomacromolecules, including force fields and machine learning potentials.
Footnotes
Supporting Information Available
- Simulation protocol for the dihedral rotation of a 3-hydroxypropanal molecule
References
- (1) Hansen N; van Gunsteren WF. Practical Aspects of Free-energy Calculations: A Review. J. Chem. Theory Comput 2014, 10, 2632–2647.
- (2) Gao J. Absolute Free Energy of Solvation from Monte Carlo Simulations Using Combined Quantum and Molecular Mechanical Potentials. J. Phys. Chem 1992, 96, 537–540.
- (3) Gao J; Xia X. A Priori Evaluation of Aqueous Polarization Effects through Monte Carlo QM-MM Simulations. Science 1992, 258, 631–635.
- (4) Muller RP; Warshel A. Ab Initio Calculations of Free Energy Barriers for Chemical Reactions in Solution. J. Phys. Chem 1995, 99, 17516–17524.
- (5) Bentzien J; Muller RP; Florián J; Warshel A. Hybrid Ab Initio Quantum Mechanics/Molecular Mechanics Calculations of Free Energy Surfaces for Enzymatic Reactions: The Nucleophilic Attack in Subtilisin. J. Phys. Chem. B 1998, 102, 2293–2301.
- (6) Warshel A; Weiss RM. An Empirical Valence Bond Approach for Comparing Reactions in Solutions and in Enzymes. J. Am. Chem. Soc 1980, 102, 6218–6226.
- (7) Dewar MJS; Zoebisch EG; Healy EF; Stewart JJP. Development and Use of Quantum Mechanical Molecular Models. 76. AM1: A New General Purpose Quantum Mechanical Molecular Model. J. Am. Chem. Soc 1985, 107, 3902–3909.
- (8) Rod TH; Ryde U. Quantum Mechanical Free Energy Barrier for an Enzymatic Reaction. Phys. Rev. Lett 2005, 94, 138302.
- (9) Beierlein FR; Michel J; Essex JW. A Simple QM/MM Approach for Capturing Polarization Effects in Protein-Ligand Binding Free Energy Calculations. J. Phys. Chem. B 2011, 115, 4911–4926.
- (10) Polyak I; Benighaus T; Boulanger E; Thiel W. Quantum Mechanics/Molecular Mechanics Dual Hamiltonian Free Energy Perturbation. J. Chem. Phys 2013, 139, 064105.
- (11) Dybeck EC; König G; Brooks BR; Shirts MR. Comparison of Methods To Reweight from Classical Molecular Simulations to QM/MM Potentials. J. Chem. Theory Comput 2016, 12, 1466–1480.
- (12) Jia X; Wang M; Shao Y; König G; Brooks BR; Zhang JZH; Mei Y. Calculations of Solvation Free Energy through Energy Reweighting from Molecular Mechanics to Quantum Mechanics. J. Chem. Theory Comput 2016, 12, 499–511.
- (13) Hudson PS; White JK; Kearns FL; Hodoscek M; Boresch S; Woodcock HL. Efficiently Computing Pathway Free Energies: New Approaches Based on Chain-of-Replica and Non-Boltzmann Bennett Reweighting Schemes. BBA Gen. Subjects 2015, 1850, 944–953.
- (14) Hudson PS; Woodcock HL; Boresch S. Use of Nonequilibrium Work Methods to Compute Free Energy Differences Between Molecular Mechanical and Quantum Mechanical Representations of Molecular Systems. J. Phys. Chem. Lett 2015, 6, 4850–4856.
- (15) Piccini G; Parrinello M. Accurate Quantum Chemical Free Energies at Affordable Cost. J. Phys. Chem. Lett 2019, 10, 3727–3731.
- (16) Giese TJ; York DM. Development of a Robust Indirect Approach for MM → QM Free Energy Calculations That Combines Force-Matched Reference Potential and Bennett’s Acceptance Ratio Methods. J. Chem. Theory Comput 2019, 15, 5543–5562.
- (17) Rizzi A; Carloni P; Parrinello M. Targeted Free Energy Perturbation Revisited: Accurate Free Energies from Mapped Reference Potentials. J. Phys. Chem. Lett 2021, 12, 9449–9454.
- (18) Giese TJ; Zeng J; York DM. Multireference Generalization of the Weighted Thermodynamic Perturbation Method. J. Phys. Chem. A 2022, 126, 8519–8533.
- (19) Zeng J; Giese TJ; Ekesan S; York DM. Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution. J. Chem. Theory Comput 2021, 17, 6993–7009.
- (20) Giese TJ; Zeng J; Ekesan S; York DM. Combined QM/MM, Machine Learning Path Integral Approach to Compute Free Energy Profiles and Kinetic Isotope Effects in RNA Cleavage Reactions. J. Chem. Theory Comput 2022, 18, 4304–4317.
- (21) Torrie GM; Valleau JP. Nonphysical Sampling Distributions in Monte Carlo Free-energy Estimation: Umbrella Sampling. J. Comput. Phys 1977, 23, 187–199.
- (22) Shirts MR; Chodera JD. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys 2008, 129, 124105.
- (23) Zhu F; Hummer G. Convergence and Error Estimation in Free Energy Calculations Using the Weighted Histogram Analysis Method. J. Comput. Chem 2012, 33, 453–465.
- (24) Li P; Jia X; Pan X; Shao Y; Mei Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semi-Empirical Reference Potential. I. Weighted Thermodynamics Perturbation. J. Chem. Theory Comput 2018, 14, 5583–5596.
- (25) Li P; Liu F; Shao Y; Mei Y. Computational Insights into Endo/Exo Selectivity of the Diels-Alder Reaction in Explicit Solvent at Ab Initio Quantum Mechanical/Molecular Mechanical Level. J. Phys. Chem. B 2019, 123, 5131–5138.
- (26) Wang J-N; Liu W; Li P; Mo Y; Hu W; Zheng J; Pan X; Shao Y; Mei Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semiempirical Reference Potential. 4. Adaptive QM/MM. J. Chem. Theory Comput 2021, 17, 1318–1325.
- (27) Xue Y; Wang J-N; Hu W; Zheng J; Li Y; Pan X; Mo Y; Shao Y; Wang L; Mei Y. Affordable Ab Initio Path Integral for Thermodynamic Properties via Molecular Dynamics Simulations Using Semiempirical Reference Potential. J. Phys. Chem. A 2021, 125, 10677–10685.
- (28) Lopes PEM; Guvench O; MacKerell AD. In Molecular Modeling of Proteins; Kukol A, Ed.; Springer New York: New York, NY, 2015; pp 47–71.
- (29) Lemkul JA; Huang J; Roux B; MacKerell ADJ. An Empirical Polarizable Force Field Based on the Classical Drude Oscillator Model: Development History and Recent Applications. Chem. Rev 2016, 116, 4983–5013.
- (30) Nerenberg PS; Head-Gordon T. New Developments in Force Fields for Biomolecular Simulations. Curr. Opin. Struct. Biol 2018, 49, 129–138.
- (31) Lemkul JA. In Computational Approaches for Understanding Dynamical Systems: Protein Folding and Assembly; Strodel B, Barz B, Eds.; Progress in Molecular Biology and Translational Science; Academic Press, 2020; Vol. 170; pp 1–71.
- (32) Jing Z; Liu C; Cheng SY; Qi R; Walker BD; Piquemal J-P; Ren P. Polarizable Force Fields for Biomolecular Simulations: Recent Advances and Applications. Annu. Rev. Biophys 2019, 48, 371–394.
- (33) Meuwly M. Machine Learning for Chemical Reactions. Chem. Rev 2021, 121, 10218–10239.
- (34) Unke OT; Chmiela S; Sauceda HE; Gastegger M; Poltavsky I; Schütt KT; Tkatchenko A; Müller K-R. Machine Learning Force Fields. Chem. Rev 2021, 121, 10142–10186.
- (35) Manzhos S; Carrington TJ. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem. Rev 2021, 121, 10187–10217.
- (36) Lu C; Wu C; Ghoreishi D; Chen W; Wang L; Damm W; Ross GA; Dahlgren MK; Russell E; Von Bargen CD; Abel R; Friesner RA; Harder ED. OPLS4: Improving Force Field Accuracy on Challenging Regimes of Chemical Space. J. Chem. Theory Comput 2021, 17, 4291–4300.
- (37) He X; Man VH; Yang W; Lee T-S; Wang J. A Fast and High-quality Charge Model for the Next Generation General AMBER Force Field. J. Chem. Phys 2020, 153.
- (38) Rideout DC; Breslow R. Hydrophobic Acceleration of Diels-Alder Reactions. J. Am. Chem. Soc 1980, 102, 7816–7817.
- (39) Jorgensen WL; Lim D; Blake JF. Ab Initio Study of Diels-Alder Reactions of Cyclopentadiene with Ethylene, Isoprene, Cyclopentadiene, Acrylonitrile, and Methyl Vinyl Ketone. J. Am. Chem. Soc 1993, 115, 2936–2942.
- (40) Chandrasekhar J; Shariffskul S; Jorgensen WL. QM/MM Simulations for Diels–Alder Reactions in Water: Contribution of Enhanced Hydrogen Bonding at the Transition State to the Solvent Effect. J. Phys. Chem. B 2002, 106, 8078–8085.
- (41) Acevedo O; Jorgensen WL. Understanding Rate Accelerations for Diels–Alder Reactions in Solution Using Enhanced QM/MM Methodology. J. Chem. Theory Comput 2007, 3, 1412–1419.
- (42) Yang Z; Doubleday C; Houk KN. QM/MM Protocol for Direct Molecular Dynamics of Chemical Reactions in Solution: The Water-Accelerated Diels–Alder Reaction. J. Chem. Theory Comput 2015, 11, 5606–5612.
- (43) Liu F; Yang Z; Mei Y; Houk KN. QM/QM’ Direct Molecular Dynamics of Water-Accelerated Diels–Alder Reaction. J. Phys. Chem. B 2016, 120, 6250–6254.
- (44) Solt I; Kulhánek P; Simon I; Winfield S; Payne MC; Csányi G; Fuxreiter M. Evaluating Boundary Dependent Errors in QM/MM Simulations. J. Phys. Chem. B 2009, 113, 5728–5735.
- (45) Sumowski CV; Ochsenfeld C. A Convergence Study of QM/MM Isomerization Energies with the Selected Size of the QM Region for Peptidic Systems. J. Phys. Chem. A 2009, 113, 11734–11741.
- (46) Liao R-Z; Thiel W. Convergence in the QM-only and QM/MM Modeling of Enzymatic Reactions: A Case Study for Acetylene Hydratase. J. Comput. Chem 2013, 34, 2389–2397.
- (47) Kulik HJ; Zhang J; Klinman JP; Martínez TJ. How Large Should the QM Region Be in QM/MM Calculations? The Case of Catechol O-Methyltransferase. J. Phys. Chem. B 2016, 120, 11381–11394.
- (48) Mehmood R; Kulik HJ. Both Configuration and QM Region Size Matter: Zinc Stability in QM/MM Models of DNA Methyltransferase. J. Chem. Theory Comput 2020, 16, 3121–3134.
- (49) Rowley CN; Roux B. The Solvation Structure of Na+ and K+ in Liquid Water Determined from High Level ab Initio Molecular Dynamics Simulations. J. Chem. Theory Comput 2012, 8, 3526–3535.
- (50) Shiga M; Masia M. Boundary Based on Exchange Symmetry Theory for Multilevel Simulations. I. Basic Theory. J. Chem. Phys 2013, 139, 044120.
- (51) Takahashi H; Kambe H; Morita A. A Simple and Effective Solution to the Constrained QM/MM Simulations. J. Chem. Phys 2018, 148, 134119.
- (52) Kerdcharoen T; Morokuma K. ONIOM-XS: An Extension of the ONIOM Method for Molecular Simulation in Condensed Phase. Chem. Phys. Lett 2002, 355, 257–262.
- (53) Heyden A; Lin H; Truhlar DG. Adaptive Partitioning in Combined Quantum Mechanical and Molecular Mechanical Calculations of Potential Energy Functions for Multiscale Simulations. J. Phys. Chem. B 2007, 111, 2231–2241.
- (54) Bulo RE; Ensing B; Sikkema J; Visscher L. Toward a Practical Method for Adaptive QM/MM Simulations. J. Chem. Theory Comput 2009, 5, 2212–2221.
- (55) Bernstein N; Várnai C; Solt I; Winfield SA; Payne MC; Simon I; Fuxreiter M; Csányi G. QM/MM Simulation of Liquid Water with an Adaptive Quantum Region. Phys. Chem. Chem. Phys 2012, 14, 646–656.
- (56) Takenaka N; Kitamura Y; Koyano Y; Nagaoka M. The Number-adaptive Multiscale QM/MM Molecular Dynamics Simulation: Application to Liquid Water. Chem. Phys. Lett 2012, 524, 56–61.
- (57) Waller MP; Kumbhar S; Yang J. A Density-Based Adaptive Quantum Mechanical/Molecular Mechanical Method. ChemPhysChem 2014, 15, 3218–3225.
- (58) Watanabe HC; Kubař T; Elstner M. Size-Consistent Multipartitioning QM/MM: A Stable and Efficient Adaptive QM/MM Method. J. Chem. Theory Comput 2014, 10, 4242–4252.
- (59) Field MJ. An Algorithm for Adaptive QC/MM Simulations. J. Chem. Theory Comput 2017, 13, 2342–2351.
- (60) Zhou S; Wang L. Symmetry and 1H NMR Chemical Shifts of Short Hydrogen Bonds: Impact of Electronic and Nuclear Quantum Effects. Phys. Chem. Chem. Phys 2020, 22, 4884–4895.
- (61) Lu N; Kofke DA. Accuracy of Free-energy Perturbation Calculations in Molecular Simulation. I. Modeling. J. Chem. Phys 2001, 114, 7303–7311.
- (62) Gore J; Ritort F; Bustamante C. Bias and Error in Estimates of Equilibrium Free-energy Differences from Nonequilibrium Measurements. Proc. Natl. Acad. Sci. U.S.A 2003, 100, 12564–12569.
- (63) Wu D; Kofke DA. Phase-space Overlap Measures. I. Fail-safe Bias Detection in Free Energies Calculated by Molecular Simulation. J. Chem. Phys 2005, 123, 054103.
- (64) Shirts MR; Pande VS. Comparison of Efficiency and Bias of Free Energies Computed by Exponential Averaging, the Bennett Acceptance Ratio, and Thermodynamic Integration. J. Chem. Phys 2005, 122, 144107.
- (65) Cave-Ayland C; Skylaris C-K; Essex JW. Direct Validation of the Single Step Classical to Quantum Free Energy Perturbation. J. Phys. Chem. B 2015, 119, 1017–1025.
- (66) Boresch S; Woodcock HL. Convergence of Single-step Free Energy Perturbation. Mol. Phys 2017, 115, 1200–1213.
- (67) Pohorille A; Jarzynski C; Chipot C. Good Practices in Free-Energy Calculations. J. Phys. Chem. B 2010, 114, 10235–10253.
- (68) Dellago C; Hummer G. Computing Equilibrium Free Energies Using Non-Equilibrium Molecular Dynamics. Entropy 2014, 16, 41–61.
- (69) Wu D; Kofke DA. Model for Small-sample Bias of Free-energy Calculations Applied to Gaussian-distributed Nonequilibrium Work Measurements. J. Chem. Phys 2004, 121, 8742–8747.
- (70) Klimovich P; Shirts M; Mobley D. Guidelines for the Analysis of Free Energy Calculations. J. Comput. Aided Mol. Des 2015, 29, 397–411.
- (71) Wang M; Li P; Jia X; Liu W; Shao Y; Hu W; Zheng J; Brooks BR; Mei Y. Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level. J. Chem. Inf. Model 2017, 57, 2476–2489.
- (72) Hu W; Li P; Wang J-N; Xue Y; Mo Y; Zheng J; Pan X; Shao Y; Mei Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semiempirical Reference Potential. 3. Gaussian Smoothing on Density-of-States. J. Chem. Theory Comput 2020, 16, 6814–6822.
- (73) Pan X; Li P; Ho J; Pu J; Mei Y; Shao Y. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanical Accuracy via a Semi-Empirical Reference Potential. II. Recalibrating Semi-Empirical Parameters with Force Matching. Phys. Chem. Chem. Phys 2019, 21, 20595–20605.
- (74) Pan X; Yang J; Van R; Epifanovsky E; Ho J; Huang J; Pu J; Mei Y; Nam K; Shao Y. Machine-Learning-Assisted Free Energy Simulation of Solution-phase and Enzyme Reactions. J. Chem. Theory Comput 2021, 17, 5745–5758.
- (75) Brickel S; Meuwly M. Molecular Determinants for Rate Acceleration in the Claisen Rearrangement Reaction. J. Phys. Chem. B 2019, 123, 448–456.
- (76) Turan HT; Brickel S; Meuwly M. Solvent Effects on the Menshutkin Reaction. J. Phys. Chem. B 2022, 126, 1951–1961.
- (77) Jarzynski C. Targeted Free Energy Perturbation. Phys. Rev. E 2002, 65, 046122.
- (78) Hahn AM; Then H. Using Bijective Maps to Improve Free-energy Estimates. Phys. Rev. E 2009, 79, 011113.
- (79) Wirnsberger P; Ballard AJ; Papamakarios G; Abercrombie S; Racanière S; Pritzel A; Jimenez Rezende D; Blundell C. Targeted Free Energy Estimation via Learned Mappings. J. Chem. Phys 2020, 153, 144112.
- (80) Bannwarth C; Ehlert S; Grimme S. GFN2-xTB—An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput 2019, 15, 1652–1671.
- (81) Řezáč J; Stewart JJP. How Well Do Semiempirical QM Methods Describe the Structure of Proteins? J. Chem. Phys 2023, 158, 044118.
- (82) Ramakrishnan R; Dral PO; Rupp M; von Lilienfeld OA. Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach. J. Chem. Theory Comput 2015, 11, 2087–2096.
- (83) Käser S; Meuwly M. Transfer Learned Potential Energy Surfaces: Accurate Anharmonic Vibrational Dynamics and Dissociation Energies for the Formic Acid Monomer and Dimer. Phys. Chem. Chem. Phys 2022, 24, 5269–5281.
- (84) Bowman JM; Qu C; Conte R; Nandi A; Houston PL; Yu Q. Δ-Machine Learned Potential Energy Surfaces and Force Fields. J. Chem. Theory Comput 2023, 19, 1–17.
- (85) Peyré G; Cuturi M. Computational Optimal Transport: With Applications to Data Science. Found. Trends Mach. Learn 2019, 11, 355–607.