J Phys Chem B. 2025 May 26;129(22):5477–5490. doi: 10.1021/acs.jpcb.5c02006

Transferability of MACE Graph Neural Network for Range Corrected Δ‑Machine Learning Potential QM/MM Applications

Timothy J Giese , Jinzhe Zeng ‡,§, Darrin M York †,*
PMCID: PMC12333372  PMID: 40418048

Abstract

We previously introduced a “range corrected” Δ-machine learning potential (ΔMLP) that used deep neural networks to improve the accuracy of combined quantum mechanical/molecular mechanical (QM/MM) simulations by correcting both the internal QM and QM/MM interaction energies and forces [J. Chem. Theory Comput. 2021, 17, 6993–7009]. The present work extends this approach to include graph neural networks. Specifically, the approach is applied to the MACE message passing neural network architecture, and a series of AM1/d + MACE models are trained to reproduce PBE0/6–31G* QM/MM energies and forces of model phosphoryl transesterification reactions. Several models are designed to test the transferability of AM1/d + MACE by varying the amount of training data and calculating free energy surfaces of reactions that were not included in the parameter refinement. The transferability is compared to AM1/d + DP models that use the DeepPot-SE (DP) deep neural network architecture. The AM1/d + MACE models are found to reproduce the target free energy surfaces even in instances where the AM1/d + DP models exhibit inaccuracies. We train “end-state” models that include data only from the reactant and product states of the 6 reactions. Unlike the uncorrected AM1/d profiles, the AM1/d + MACE method correctly reproduces a stable pentacoordinated phosphorus intermediate even though the training did not include structures with a similar bonding pattern. Furthermore, the message passing mechanism hyperparameters defining the MACE network are varied to explore their effect on the model’s accuracy and performance. The AM1/d + MACE simulations are 28% slower than AM1/d QM/MM when the ΔMLP correction is performed on a graphics processing unit. Our results suggest that the MACE architecture may lead to ΔMLP models with improved transferability.



1. Introduction

Simulation of biochemical reactions is challenging due to the broad range of spatial and temporal scales involved. Often the goal of these simulations is to identify the minimum free energy pathway that defines the mechanism and determines the reaction rate. This typically requires extensive sampling of high-dimensional free energy surfaces and the use of efficient free energy path methods. In order to model chemical bond formation and cleavage processes, a quantum mechanical/molecular mechanical (QM/MM) potential may be used. Although good accuracy can be obtained with ab initio density-functional models and a reliable basis set, their evaluation is computationally intensive. The large computational cost places practical restrictions on the number of atoms that can be treated quantum mechanically and/or the amount of sampling that can be reasonably afforded. These restrictions, in turn, place limitations on the predictive capabilities of the method.

Machine learning potentials (MLPs) have emerged as a cost-effective alternative to expensive ab initio evaluation. Pure MLPs abandon physics-based mathematical modeling in favor of neural network calculation of the total potential energy. Although pure MLPs can be trained to reproduce target data within the scope of their parametrization, they lack a treatment for long-range electrostatic interactions, which are critical to correctly simulate the condensed phase, macromolecular systems, and interfacial properties. This has led to the development of long-range corrections to supplement pure MLPs. The so-called ΔMLP approach uses the opposite strategy: the MLP is a correction to (as opposed to a replacement for) an inexpensive physics-based model. The long-range interactions are calculated by the inexpensive base model, and the MLP is a short-range correction. Therefore, the ΔMLP approach can be easily adapted for use with semiempirical quantum mechanical/molecular mechanical (QM/MM) calculations with electrostatic or mechanical embedding. Typically, the ΔMLP is parametrized to correct semiempirical QM/MM energies and forces to match ab initio QM/MM target data; however, some methods are trained to reproduce target data that also includes polarization of the nearby MM surroundings. Several QM/MM ΔMLP strategies use neural networks to explicitly correct the internal QM interactions in a manner that implicitly accounts for the MM environment. In contrast, Böselt et al. and Zeng et al. independently proposed “range corrected” extensions of the QM/MM ΔMLP strategy that use neural networks to explicitly correct the interactions between QM and nearby MM atoms in a manner that yields smooth potential energies as MM atoms drift into (or away from) the vicinity of the QM region during the course of simulation.

Applications of the range corrected ΔMLP QM/MM strategy were aided by the availability of an interface between the DeePMD-kit software and the sander molecular dynamics program available as part of AmberTools. The ΔMLPs in those works were limited to using the DeepPot-SE (DP) deep neural network potential implemented with the TensorFlow libraries. Many of the new MLPs appearing in the literature, however, are graph neural networks (GNNs) developed with PyTorch. The publication of a new neural network model is often accompanied by software that implements the method and provides basic infrastructure for training the network parameters. The differences among these ad hoc software infrastructures introduce an obstacle to consistently training different models with the same optimization algorithms, hyperparameters, and active learning strategy. The comparisons are further inconvenienced by the need to interface each method to the molecular dynamics program. We have recently extended the DeePMD-kit software with a software plugin architecture, designated DeePMD-GNN, that allows it to evaluate graph neural networks developed with PyTorch. The plugin architecture causes the new MLPs to immediately become available within sander or any other dynamics program interfaced with DeePMD-kit. Furthermore, the various MLPs can be trained with the aid of the DP-GEN software so that they may be compared using a consistent set of training algorithms.

In the present work, we describe a new range corrected ΔMLP based on the MACE message passing neural network architecture. The range corrected graph topology is designed to improve the accuracy of QM and nearby QM/MM interactions within a cutoff. We parametrize a series of ΔMLPs that correct AM1/d QM/MM to reproduce PBE0/6–31G* QM/MM target energies and forces. The accuracy of the MACE and DP ΔMLPs is tested by comparing their ability to reproduce the free energy surfaces of the 6 nonenzymatic phosphoryl transesterification reactions shown in Figure 1. The comparisons investigate the transferability of the MACE and DP architectures as one varies the amount of training data. Furthermore, we explore their transferability to reactions not included in the parametrization. The quality of the models is judged by their ability to reproduce a reference free energy surface and by their ability to serve as a reference potential to estimate the high-level surface from reweighted AM1/d ΔMLP sampling. Finally, we train a series of MACE ΔMLPs with various network hyperparameters to quantify their effect on the model’s performance and accuracy.

Figure 1. Nonenzymatic phosphate transesterification reactions examined in this work. Parts a and b illustrate the reaction with the nucleophile (Nuc) and C2′-methylated nucleophile (mNuc), respectively, where the atom labels use the ribonucleic acid numbering convention. The reactant state has a fully formed O5′-P bond and a broken P–O2′ bond. The product state has a P–O2′ covalent bond and a broken O5′-P bond. The leaving groups (RO5′) considered in this work are ethoxide (EtO), acetate (AcO), and phenoxide (PhO).

The nucleophile (Nuc) and methylated nucleophile (mNuc) shown in Figure 1a,b, respectively, are hydroxyalkyl phosphate esters that undergo phosphoryl transesterification with either an ethoxide (EtO), acetate (AcO), or phenoxide (PhO) leaving group. These are nonenzymatic models for the RNA-cleavage reactions occurring in various nucleolytic ribozymes, including the hammerhead, hairpin, HDV, VS, glmS, twister, pistol, TS, and hatchet ribozymes. Under physiological conditions, the RNA transphosphorylation mechanism includes: deprotonation of O2′ by a general base, nucleophilic attack of the scissile phosphate by the activated O2′, departure of the O5′ leaving group, and protonation of the O5′ by a general acid. In contrast, the model reactions explored in the present work are performed in basic conditions, in which case the O2′ and O5′ remain deprotonated throughout the reaction. The reaction progress is therefore characterized by the phosphoryl transfer reaction coordinate ξ = |R_O5′ − R_P| − |R_P − R_O2′|. The mNuc nucleophile shown in Figure 1b introduces a methyl group at the C2′ position to make the O2′ a secondary alkoxide, like what is found in the native enzymatic reactions. A similar series of model reactions has been studied using linear free energy relationships, and it was found that the mechanism is correlated with the leaving group pKa. The reaction was found to proceed through a concerted mechanism containing a single, early (ξ < 0) transition state when the leaving group pKa is less than 11. Alternatively, the reaction proceeds through an associative mechanism when the leaving group pKa is greater than 12. The associative mechanism contains two barriers separated by a minimum. The early transition state corresponds to partial formation of a P–O2′ bond, and the rate controlling (late) transition state is characterized by partial cleavage of the P–O5′ bond.
The pKa values of the leaving groups examined in this work are 15.5 (EtO), 4.76 (AcO), and 9.89 (PhO); therefore, one would expect only the Nuc-EtO and mNuc-EtO reactions to display a pentacoordinated phosphorus intermediate and a late rate-controlling transition state.
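The reaction coordinate defined above is a simple difference of two distances, which can be sketched directly (the helper name and the example geometry are illustrative, not taken from the published workflow):

```python
import numpy as np

def reaction_coordinate(r_o5, r_p, r_o2):
    """xi = |R_O5' - R_P| - |R_P - R_O2'|, in Angstrom.
    xi < 0 on the reactant side (O5'-P bonded), xi > 0 on the product side."""
    return float(np.linalg.norm(r_o5 - r_p) - np.linalg.norm(r_p - r_o2))

# Reactant-like collinear geometry: O5'-P bond formed (1.6 A), P-O2' broken (3.6 A)
r_o5 = np.array([0.0, 0.0, 0.0])
r_p = np.array([1.6, 0.0, 0.0])
r_o2 = np.array([5.2, 0.0, 0.0])
xi = reaction_coordinate(r_o5, r_p, r_o2)  # -2.0
```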

The remainder of the paper is organized as follows. The Methods section summarizes the main differences between the DP and MACE architectures, and we describe the changes to the MACE graph topology that are required to make the architecture suitable for use as a ΔMLP in QM/MM applications. The computational details of the umbrella sampling used to calculate the free energy surfaces are provided. AM1/d QM/MM sampling was performed to obtain free energy surfaces, and various subsets of the saved samples were used to train the ΔMLPs. The training strategy is described, and the optimization hyperparameters are provided. The Results and Discussion section begins by demonstrating that the PBE0/6–31G* free energy surfaces cannot be reliably estimated by reweighting the AM1/d QM/MM sampling. The transferability of the DP and MACE models is compared by training them to 3 of the 6 reactions. The models are used in umbrella sampling to predict the free energy surfaces of all 6 reactions. It is shown that the AM1/d + MACE potential reproduces all 6 surfaces, whereas the AM1/d + DP potential displays artifacts in the 3 untrained surfaces. We then compare DP and MACE models trained only to the reactant and product states without providing them training data between these states. It is shown that the errors in the resulting AM1/d + MACE free energy surfaces are much smaller than those in the AM1/d + DP surfaces. The AM1/d + MACE samples also agree better with the reference ab initio free energy surface when the sampling is reweighted. Finally, we train a series of MACE models that vary the network hyperparameters to observe the relative effect of each parameter on the model’s accuracy and performance. The manuscript concludes with a summary of the results.

2. Methods

We recently reported interoperable software infrastructure for next-generation QM/MM-ΔMLP force fields, where the ΔMLP is a nonelectrostatic correction applied to the semiempirical QM and short-range QM/MM interactions and parametrized to reproduce an ab initio QM/MM method. The infrastructure was originally created to develop “range-corrected” DP models for semiempirical QM/MM applications. By “range-corrected”, we mean that the MLP modifies the short-range QM/MM interactions (in addition to the semiempirical QM energy) in a manner that avoids energy discontinuities as MM atoms move to and away from the vicinity of the QM region. We recently extended the DeePMD-kit software to support GNNs, and the present work describes how the graph’s network edges are assigned to adhere to the range-corrected QM/MM framework. The approach is demonstrated by developing QM/MM-ΔMLP potentials using the MACE message passing neural network architecture, and the transferability of the MACE ΔMLP is compared to DP models trained to the same data. Below we describe the machine learning models, the simulation details, and the neural network training.

2.1. Machine Learning Potentials

The range corrected DP models use deep neural networks rather than graph neural networks to calculate the ΔMLP correction. The energy is decomposed into atomic components (site energies), and the contribution from each atom is the output of a fitting network composed of 3 hidden layers with 240 neurons per layer. The input to the fitting network is a “feature matrix” that is calculated from embedding and coordinate matrices. The embedding matrix uses the hybrid descriptor infrastructure within DeePMD-kit to define descriptors for the QM and QM/MM interactions. The present work uses the deep potential smooth edition descriptor with type embedding. The embedding matrices are calculated from deep neural networks composed of 3 layers consisting of 25, 50, and 100 neurons. Embedding networks are trained for each chemical species. The QM atom chemical species is assigned by atomic symbol (e.g., “H” for a QM hydrogen), whereas the MM atoms are assigned by the atomic symbol prefixed by the letter “m” (e.g., “mH” for a MM hydrogen). In this manner, QM and MM atoms of the same element can have different network parameters. The contribution of the MM atoms to the QM/MM “coordinate matrix” smoothly approaches 0 at a 6 Å cutoff, and the feature matrix includes 12 axis filters. The DP model’s atomic contribution to the energy includes a “bias”, a constant that persists when the atom is isolated. The bias of the MM atoms is forced to be zero to prevent discontinuous changes to the total energy as MM atoms diffuse to (or from) the QM region’s vicinity. The DP network described here is similar in size to ΔMLPs trained in previous works.
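The smooth decay of the MM contributions at the 6 Å cutoff can be illustrated with a generic C²-continuous smooth-step weight. This is a stand-in for the idea only, not the exact DeepPot-SE switching function:

```python
def smooth_switch(r, r_cut=6.0, r_smth=0.5):
    """Generic smooth-step weight for an interatomic distance r (Angstrom):
    1 for r <= r_smth, 0 for r >= r_cut, and a quintic smooth step in
    between so the weight and its first two derivatives are continuous."""
    if r <= r_smth:
        return 1.0
    if r >= r_cut:
        return 0.0
    x = (r - r_smth) / (r_cut - r_smth)
    return 1.0 - (6.0 * x**5 - 15.0 * x**4 + 10.0 * x**3)
```

Because the weight (not the raw distance) multiplies the MM contribution, an MM atom crossing the cutoff cannot introduce an energy discontinuity.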

The range corrected MACE models are message passing neural networks, a type of graph neural network, where the nodes of the graph are the atoms and the edges define the topology of the communication pattern. In the original description of MACE, the edges consist of all atom pairs within a cutoff. The MACE energy does not encounter a discontinuity as an interatomic distance exceeds the cutoff because the distance features are embedded using Bessel basis functions and a polynomial envelope that smoothly forces the feature to zero. In contrast, the edges within the range corrected MACE ΔMLP include all atom pairs within a cutoff if at least one of the atoms in the pair is a QM atom. In other words, the edges connecting MM atoms to other MM atoms are excluded from the graph because the ΔMLP only corrects the QM and QM/MM interactions. The MACE total energy includes a contribution from each site corresponding to its isolated atomic energy. One often chooses the isolated site energies from a linear regression to a collection of molecular energies. For a ΔMLP, these are instead isolated atom energy corrections (the difference between the base and target levels of theory). The isolated MM atom energy corrections must be zero to ensure the range corrected MACE energy is continuous as a MM atom traverses its cutoff with the QM atoms. Figure S1 in the Supporting Information illustrates the smoothness of the QM and QM/MM MACE corrections, and Figure S2 demonstrates energy conservation in microcanonical QM/MM simulations using an AM1/d + MACE potential. Like the DP model, the inputs to the MACE network are the chemical species and atomic coordinates. The chemical species of the QM atoms are assigned by atomic number. We reserved a portion of the periodic table to assign chemical species for the MM atoms by adding 50 to their atomic number. This mechanism allows a MM element’s network parameters to be trained differently than those of the corresponding QM element.
From the perspective of MACE, the MM atoms are treated in the same manner as QM atoms; however, pairs of MM atoms do not contribute edges and the MM atomic energy bias is restricted to be zero.
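The two bookkeeping rules above, dropping MM-MM edges and shifting MM species indices by 50, can be sketched as follows (a minimal O(N²) illustration; the production code builds neighbor lists far more efficiently):

```python
import itertools
import numpy as np

def build_edges(coords, is_qm, r_cut=6.0):
    """Edges of the range corrected graph: all atom pairs within r_cut
    that involve at least one QM atom. MM-MM pairs contribute no edge
    because the ΔMLP only corrects QM and QM/MM interactions."""
    edges = []
    for i, j in itertools.combinations(range(len(coords)), 2):
        if not (is_qm[i] or is_qm[j]):
            continue  # skip MM-MM pairs
        if np.linalg.norm(coords[i] - coords[j]) < r_cut:
            edges.append((i, j))
    return edges

def species_index(atomic_number, is_qm):
    """QM atoms keep their atomic number; MM atoms are shifted by +50 so
    a MM element can learn parameters independent of its QM counterpart."""
    return atomic_number if is_qm else atomic_number + 50
```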

A graph neural network has previously been used within a range corrected ΔMLP QM/MM model. In that work, the graph was described as consisting of nodes and edges that represent atoms and their interactions up to a cutoff radius, respectively. It was not explicitly articulated whether their graph structure excluded edges formed by MM pairs, whether the MM energy biases were forced to zero, or whether the MM atoms have network parameters different from those of the corresponding QM elements; however, our interpretation of their work and previous models leads us to believe their QM/MM ΔMLP implementation is ostensibly similar to what we have presented. Aside from the details of the QM/MM implementation, the GNN used in that work is significantly different from the MACE model employed in the present work, in part because GNN frameworks have matured at a rapid pace. Whereas that work constructs 2-body messages (messages calculated from atomic pairs) from invariant features, the Atomic Cluster Expansion (ACE) framework proposed a many-body expansion to describe the local environment. The SphereNet GNN model similarly found it beneficial to construct 3-body messages; however, the computational performance suffered from explicit enumeration of triplets. Other GNNs, such as NequIP, were shown to improve accuracy by using equivariant internal features, but many message passing layers were required. The MACE model combines the ACE framework with equivariant internal features to build many-body messages. It was found to reduce the number of message passing layers needed to achieve comparable accuracy, and the computational efficiency is maintained by building high order features from tensor products of lower order features rather than explicit enumeration of n-tuplets. The maximum body order is controlled by a hyperparameter called the “correlation”, ν, which is the body order minus 1. Values of ν = 1 and ν = 2 would construct 2-body and 3-body features, respectively.
The correlation does not refer to the “receptive field”, which is the set of nodes that ultimately influence the output of a given target node. The first layer of messages is constructed from a local environment, but their influence can extend beyond the cutoff as the messages are iteratively communicated through the graph. The number of message passing layers, T, is another hyperparameter.

Unless otherwise noted, the following hyperparameters were used to define the range corrected MACE models trained in this work. Radial features are generated using a 6 Å cutoff, 8 Bessel basis functions, and a polynomial envelope of order 5, and they are fed to a 3-layer perceptron consisting of 64 neurons/layer. The angular description of the local environment is expanded in spherical harmonics to order 3. The equivariant message passing uses spherical harmonics up to order L = 1 with N = 128 embedding channels and a correlation order of ν = 3. The MLP consists of T = 2 message passing layers. The readout of the last layer is passed through a 16 channel perceptron and gated with the sigmoid linear unit function. In some instances, we make comparisons to models that vary the values of L, ν, T, and N to observe their effect on accuracy and performance.
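For reference, the default settings above can be collected in one place. The key names below are illustrative and do not necessarily match the actual MACE training flags:

```python
# Default hyperparameters of the range corrected MACE models in this work.
# Key names are illustrative; consult the MACE documentation for real flags.
mace_defaults = {
    "r_max": 6.0,             # radial cutoff (Angstrom)
    "num_radial_basis": 8,    # Bessel basis functions
    "envelope_order": 5,      # polynomial envelope order
    "radial_mlp": [64, 64, 64],  # 3-layer perceptron, 64 neurons/layer
    "max_ell": 3,             # spherical-harmonic order of the angular expansion
    "L": 1,                   # order of the equivariant messages
    "num_channels": 128,      # embedding channels, N
    "correlation": 3,         # nu = body order minus 1
    "num_interactions": 2,    # message passing layers, T
    "readout_channels": 16,   # readout perceptron, gated with SiLU
}
```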

2.2. Computational Details

Simulations were carried out with recently developed software infrastructure in AmberTools for use with QM/MM-ΔMLP force fields. The 6 nonenzymatic reactions were prepared by solvating each system with 2200 TIP3P water molecules in a 40.8 Å cubic unit cell. The solute is the QM region (see Figure 1), and the solvent is the MM region. Hydrogen mass repartitioning was applied to the solute atoms to allow for stable QM/MM dynamics with a 2 fs time step. The system density was equilibrated with a MM potential in the isothermal–isobaric ensemble with the Langevin thermostat and Berendsen barostat for 1 ns at 298 K and 1 atm while using a 5 ps–1 collision frequency. The electrostatics were evaluated with the particle mesh Ewald method using 9 Å real-space cutoffs, a 1 Å reciprocal space grid, and tinfoil boundary conditions. The solute net charge was neutralized with a uniform background correction. The AM1/d QM/MM potential was used to extend the simulation for an additional 10 ps. The semiempirical QM/MM electrostatics were evaluated with the Mulliken charge particle mesh Ewald method. The umbrella sampling consisted of 64 values of ξ spanning −3.3 Å ≤ ξ ≤ 3.0 Å in increments of 0.1 Å. All harmonic potential force constants were set to 300 kcal mol–1 Å–2. An initial structure for each window was obtained by scanning the reaction coordinate in a sequential series of 2 ps NVT simulations departing from the reactant state. Each window was then independently equilibrated for 200 ps in the NVT ensemble at 298 K, followed by an additional 200 ps of production sampling. The positions and forces of 500 samples from each production simulation were saved. The PBE0/6–31G* QM/MM energies and forces of each saved sample were reevaluated for free energy analysis and ΔMLP training. The ab initio QM/MM electrostatics were evaluated with the ambient potential composite Ewald method.
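The umbrella window layout above can be generated programmatically. The bias form below assumes the stated force constant multiplies the squared displacement directly (i.e., U = k(ξ − ξ₀)², without a factor of 1/2); check the convention of the dynamics code before reusing:

```python
import numpy as np

def umbrella_windows(xi_min=-3.3, xi_max=3.0, dxi=0.1):
    """Centers of the umbrella windows along xi (Angstrom): 64 values
    spanning -3.3 <= xi <= 3.0 in increments of 0.1."""
    n = round((xi_max - xi_min) / dxi) + 1
    return [round(xi_min + i * dxi, 10) for i in range(n)]

def bias_energy(xi, xi0, k=300.0):
    """Harmonic umbrella bias in kcal/mol; k in kcal/mol/A^2.
    Assumed form U = k*(xi - xi0)**2 (no 1/2 prefactor)."""
    return k * (xi - xi0) ** 2
```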

The training produces 4 ΔMLP models obtained from independent stochastic optimizations initiated from different random number seeds. AM1/d ΔMLP umbrella sampling was then performed with each model. The 64 windows were re-equilibrated with the AM1/d ΔMLP for 20 ps in the canonical ensemble at 298 K. This was followed by 50 ps of production sampling, from which 250 samples were saved. Free energy surfaces were calculated from the multistate Bennett acceptance ratio (MBAR) method and a 0.1 Å histogram bin spacing using the implementation in FE-ToolKit. The optimal choice of histogram bin width is dependent upon the underlying features of the surface. A detailed analysis of the histogram bin widths on analogous nonenzymatic phosphoryl transfer reaction surfaces has been presented elsewhere. It was found that bin widths smaller than 0.1 Å unnecessarily introduce numerical noise without significantly changing the activation free energy. Some have proposed using small histogram bins and postprocessing the noisy data with Gaussian process regression or other smoothing techniques. The curves illustrated in the current work have not been smoothed; the curves are a series of straight line segments that connect the free energy values at the bin centers. The potential energy of each sample was reevaluated with the 4 AM1/d ΔMLP models and the PBE0/6–31G* QM/MM reference. These energy evaluations were used to estimate the PBE0/6–31G* QM/MM free energy surface from the AM1/d ΔMLP sampling using either the weighted thermodynamic perturbation (wTP) method or the generalized weighted thermodynamic perturbation (gwTP) method. The wTP method reweights the samples obtained from a single reference potential. There are 4 reference potentials (the 4 AM1/d ΔMLP models) resulting in 4 wTP estimates. In contrast, the gwTP method reweights the sampling obtained from multiple reference potentials to produce a single estimate of the target free energy surface.
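The single-reference reweighting underlying wTP can be sketched as a plain Zwanzig exponential average over the samples in one histogram bin. This is a schematic of the idea only; the published wTP and gwTP estimators include additional machinery beyond this minimal form:

```python
import numpy as np

def zwanzig_bin_estimate(u_ref, u_tgt, kT=0.593):
    """Free energy shift of one bin from single-reference reweighting:
    dA = -kT ln < exp(-(U_tgt - U_ref)/kT) >_ref.
    Energies in kcal/mol; kT ~ 0.593 kcal/mol at 298 K. The minimum energy
    gap is factored out to stabilize the exponentials."""
    du = np.asarray(u_tgt, float) - np.asarray(u_ref, float)
    shift = du.min()
    return shift - kT * np.log(np.mean(np.exp(-(du - shift) / kT)))
```

When the reference and target potentials differ by a constant within the bin, the estimate reduces exactly to that constant; larger, sample-dependent gaps are where the exponential average (and hence the reweighting) becomes unreliable.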
If the reference and target potentials always agreed, then the observed distribution of samples would not need to be reweighted to mimic the target potential’s expected distribution. If the reference and target distributions overlap poorly, the reweighting becomes unreliable. The presence of poor overlap is often evident when the reweighting is dominated by non-negligible weights from only a few samples. One can quantify the weight distribution within each histogram bin with a quantity called the “reweighting entropy”. The reweighting entropy is a unitless number between 0 and 1. It is 1 when the weights are uniform, and it is close to zero when the nonzero weights are dominated by a few samples. Previous studies have found that reweighting methods become unreliable when the reweighting entropy drops below 0.6. The free energy surface calculations employed the previously described density of states smoothing algorithm using a 0.2 kT energy histogram. This algorithm dampens outlier weights within a spatial bin, as opposed to smoothing the free energy values between spatial bins. Nevertheless, it was found to reduce the numerical noise and slightly increase the reweighting entropy values.
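A unitless entropy with the stated limits (1 for uniform weights, near 0 when a few samples dominate) can be computed by normalizing the Shannon entropy of the weights by ln N. This normalization is an assumption consistent with the description above, not necessarily the exact published definition:

```python
import numpy as np

def reweighting_entropy(log_weights):
    """Assumed form S = -(1/ln N) * sum_i w_i ln w_i over normalized
    weights w_i: S = 1 for uniform weights, S -> 0 when one sample
    dominates. Input is log-weights for numerical safety."""
    lw = np.asarray(log_weights, float)
    if lw.size < 2:
        return 1.0
    lw = lw - lw.max()          # stabilize the exponentials
    w = np.exp(lw)
    w /= w.sum()
    nz = w[w > 0.0]
    return float(-(nz * np.log(nz)).sum() / np.log(lw.size))
```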

2.3. Neural Network Training

The neural network training was performed with the “simplify” workflow within the DP-GEN software. In this workflow, one has a fixed database of samples that are partitioned into training and test subsets, and the neural network parameters are optimized using the Adam stochastic gradient descent method to reproduce the training data. The training is repeated 4 times using different random number seeds to produce 4 neural network parameter sets. The training subset is discarded and the 4 parameter sets are applied to the samples in the test set. If the maximum root-mean-square error of the atomic force vectors is less than 0.08 eV/Å, then the sample is discarded from the test set. The whole process repeats by treating the revised test set as the new database which is repartitioned into training and test subsets. The optimization is restarted with the new training set. The process terminates when all of the database samples have been exhausted. In the present work, the first training set contains 50% of the database samples selected at random. In the second and subsequent iterations, the training set contains no more than 25% of the original database size. In this manner, the process is guaranteed to terminate after 3 rounds of optimizations; however, it often finishes after 2 rounds because the first set of parameters is usually accurate enough to discard most of the test set.
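The iterative selection loop described above can be sketched schematically. The two callables stand in for the real training and evaluation machinery, and the loop structure is an illustrative reading of the DP-GEN "simplify" workflow, not its actual implementation:

```python
import numpy as np

def simplify(database, train_models, force_rmse, tol=0.08,
             f_first=0.50, f_next=0.25, seed=0):
    """Sketch of the 'simplify' data-selection loop.
    train_models(train) -> list of trained models (placeholder callable);
    force_rmse(model, sample) -> atomic-force RMSE in eV/A (placeholder).
    Test samples whose worst-model RMSE is below tol are retired; the
    survivors seed the next round. Training-set sizes are fractions of
    the full database (50% first, at most 25% thereafter)."""
    rng = np.random.default_rng(seed)
    pool, frac, models = list(database), f_first, []
    while pool:
        rng.shuffle(pool)
        n_train = min(len(pool), max(1, int(frac * len(database))))
        train, test = pool[:n_train], pool[n_train:]
        models = train_models(train)
        # keep only test samples the current models still predict poorly
        pool = [s for s in test
                if max(force_rmse(m, s) for m in models) >= tol]
        frac = f_next
    return models
```

Because each round removes at least the training subset from the pool, the loop is guaranteed to terminate, mirroring the 2-3 round behavior described above.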

Each Adam optimization was performed with an exponential learning rate that decays from 10–3 to 10–5 over the course of 400,000 steps. The loss function is a weighted sum of squared differences in the predicted and target energy and force corrections. The weight on the energy errors exponentially increases from 1 eV–2 to 100 eV–2 during the optimization, whereas the weight on the force errors remains fixed at 100 Å2·eV–2.
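The exponential schedules and the loss can be written compactly. The per-atom normalizations used by the actual training code are omitted, so this is a schematic of the weighting scheme rather than the exact DeePMD-kit loss:

```python
import numpy as np

def lr_schedule(step, n_steps=400_000, lr0=1e-3, lr1=1e-5):
    """Exponential decay of the learning rate from 1e-3 to 1e-5."""
    return lr0 * (lr1 / lr0) ** (step / n_steps)

def energy_weight(step, n_steps=400_000, w0=1.0, w1=100.0):
    """Energy-error weight grows exponentially from 1 to 100 eV^-2."""
    return w0 * (w1 / w0) ** (step / n_steps)

def loss(dE, dF, step, force_weight=100.0):
    """Weighted sum of squared energy (eV) and force (eV/A) errors;
    the force weight stays fixed at 100 A^2 eV^-2."""
    return energy_weight(step) * dE**2 + force_weight * np.sum(np.asarray(dF) ** 2)
```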

The ΔMLPs generated in this work were prepared by first parametrizing an “end-state model”. The end-state models are trained to the AM1/d QM/MM sampling of the reactant and product states from all 6 reactions. Specifically, the training data consisted of the 500 samples saved from each of the 12 simulations corresponding to the ξ = −2.0 Å and ξ = 3.0 Å states of the 6 reactions. The amount of training data was extended by 10% by selecting 50 samples from each ensemble and making small, random displacements to the QM atomic positions. Each QM heavy atom was displaced by up to 0.15 Å in a random direction, and each hydrogen was displaced along its covalent bond to change its length by a random amount in the range 0.7 Å to 1.2 Å. Every DP and MACE model was parametrized to the same end-state training data.
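The augmentation step can be sketched as below. This is one plausible reading of the protocol: the new X-H bond length is drawn uniformly from [0.7, 1.2] Å, and displacements are measured relative to the original positions. The helper name and arguments are hypothetical:

```python
import numpy as np

def augment(coords, heavy_mask, h_partner, rng, d_heavy=0.15, lo=0.7, hi=1.2):
    """Randomly perturb QM positions for data augmentation (illustrative).
    Heavy atoms move up to d_heavy Angstrom in a random direction; each H
    is placed along its original covalent bond direction at a new bond
    length drawn from [lo, hi] Angstrom (an assumed interpretation).
    h_partner maps each H index to its bonded heavy-atom index."""
    out = coords.copy()
    for i in range(len(coords)):
        if heavy_mask[i]:
            v = rng.normal(size=3)
            out[i] = coords[i] + rng.uniform(0.0, d_heavy) * v / np.linalg.norm(v)
        else:
            j = h_partner[i]
            b = coords[i] - coords[j]              # original bond vector
            out[i] = coords[j] + rng.uniform(lo, hi) * b / np.linalg.norm(b)
    return out
```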

The end-state models can be directly used in production sampling to calculate free energy surfaces, or they can be used to restart the optimization to create a “fine-tuned” model that is trained to additional data. All of the ΔMLPs discussed in the present work are end-state models except those referred to as S:1, S:2, S:4, and S:8, whose training was restarted to include umbrella sampling from other ξ values. Each S:n ΔMLP includes every nth window from the full series of 64 windows for each of the mNuc-EtO, Nuc-AcO, and Nuc-PhO reactions: S:1 uses all 64 windows (−3.3 Å ≤ ξ ≤ 3.0 Å in steps of 0.1 Å), S:2 uses 32 windows (−3.3 Å ≤ ξ ≤ 2.9 Å in steps of 0.2 Å), S:4 uses 16 windows (−3.3 Å ≤ ξ ≤ 2.7 Å in steps of 0.4 Å), and S:8 uses 8 windows (−3.3 Å ≤ ξ ≤ 2.3 Å in steps of 0.8 Å). None of the fine-tuned models are provided additional data from the mNuc-AcO, mNuc-PhO, or Nuc-EtO reactions. We will show that the S:1, S:2, S:4, and S:8 MACE ΔMLPs produce nearly indistinguishable gwTP estimates of the PBE0/6–31G* free energy surfaces. Furthermore, their reweighting entropies exceed 0.8, which suggests the estimates are reliable. The “reference” PBE0/6–31G* free energy surfaces appearing in the comparisons are the average of these 4 gwTP estimates. Each reference surface is produced from an aggregate of 76.8 ns of QM/MM sampling.
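The S:n window subsets follow directly from striding the full series of 64 windows, which also reproduces the window counts and end points quoted above:

```python
def window_subset(stride, xi_min=-3.3, xi_max=3.0, dxi=0.1):
    """xi values retained by an S:n stride through the 64 umbrella
    windows; e.g. stride=2 keeps 32 windows ending at xi = 2.9 A."""
    n = round((xi_max - xi_min) / dxi) + 1     # 64 windows total
    return [round(xi_min + i * dxi, 10) for i in range(0, n, stride)]
```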

3. Results and Discussion

3.1. AM1/d QM/MM Free Energy Profiles

All of the ΔMLPs explored in the present work are corrections to an AM1/d QM/MM potential that are parametrized to reproduce PBE0/6–31G* QM/MM energies and forces. Figure 2 compares the AM1/d and PBE0/6–31G* reference free energy surfaces to appreciate the magnitude of the desired corrections. The AM1/d rate limiting transition states are 1 to 5 kcal/mol higher than PBE0/6–31G*, and the locations of the AM1/d reactant state minima are shifted by 0.4 Å in the +ξ direction. The PBE0/6–31G* mNuc-EtO and Nuc-EtO profiles exhibit two barriers separated by an intermediate, whereas the AM1/d profiles have only one barrier. Figure 2 also shows the PBE0/6–31G* surfaces estimated from wTP reweighting of the AM1/d sampling. The uncorrected AM1/d method is not an adequate reference potential; it produces target free energy surfaces that exhibit large amounts of numerical noise. The lack of phase space overlap between AM1/d and PBE0/6–31G* is also reflected in the very low reweighting entropy values, which range from 0.12 to 0.24.

Figure 2. PBE0/6–31G* free energy profile estimated from wTP analysis (green line) of the AM1/d (red line) sampling. The reference curve is the average of the 4 gwTP estimates shown in Figure 3. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

3.2. Comparison of DP and MACE Transferability

The sensitivity and transferability of the MACE and DP ΔMLPs are examined in Figures 3 and 4, respectively. The models shown in these 2 figures refine the end-state parametrization by restarting the network optimization with umbrella sampling taken from the mNuc-EtO, Nuc-AcO, and Nuc-PhO systems. The training data includes umbrella windows extracted with a stride S:n, where n is the integer stride through the sequence of 64 windows. In other words, the S:8 models contain large gaps in the training data relative to the S:1 models. Sampling from the Nuc-EtO, mNuc-AcO, and mNuc-PhO reactions was not included in the model refinement.

Figure 3. Estimates of the PBE0/6–31G* free energy profile made from gwTP analysis of several MACE parametrizations. The S:1 models were parametrized to the mNuc-EtO, Nuc-AcO, and Nuc-PhO reactions using all 64 umbrella windows. The other S:n models were parametrized with fewer samples; n is the stride in the ξ series of umbrella windows used to train the model. The S:2 parametrization used every other window (32 windows/reaction). The S:4 parametrization used every fourth window (16 windows/reaction). The S:8 parametrization used 8 windows/reaction. The reference curve is the average of the 4 surfaces. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

Figure 4. Estimates of the PBE0/6–31G* free energy profile made from gwTP analysis of several DP parametrizations. The S:1 models were parametrized to the mNuc-EtO, Nuc-AcO, and Nuc-PhO reactions using all 64 umbrella windows. The other S:n models were parametrized with fewer samples; n is the stride in the ξ series of umbrella windows used to train the model. The S:2 parametrization used every other window (32 windows/reaction), the S:4 parametrization every fourth window (16 windows/reaction), and the S:8 parametrization 8 windows/reaction. The reference curve is the average of the 4 surfaces. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

The gwTP-estimated PBE0/6–31G* surfaces obtained from the MACE ΔMLP models (see Figure 3) are not sensitive to the gaps in the training data, and the models serve as excellent reference potentials for all 6 reactions, including the 3 reactions that were not included in the parameter refinement. The large reweighting entropies and the striking agreement among the S:1, S:2, S:4, and S:8 MACE models motivated us to use their average as the reference PBE0/6–31G* surface in the comparisons, because the amount of sampling performed with AM1/d + MACE far exceeds what can reasonably be afforded from explicit ab initio QM/MM simulation. In contrast, the DP ΔMLP estimates of the PBE0/6–31G* surfaces (see Figure 4) exhibit lower reweighting entropies and more numerical noise. The reweighting entropies are summarized in Table 1. The DP average reweighting entropies decrease as the gaps in the training samples increase from S:1 to S:8; however, the differences are smaller than their standard deviations. The DP models are transferable to the Nuc-EtO reaction, but less so to the mNuc-AcO and mNuc-PhO reactions, which systematically have the lowest reweighting entropies among the 6 reactions. The mNuc-PhO surface, in particular, is poorly reproduced; the 4 models contain significant numerical noise, and the reweighting entropies range from only 0.39 to 0.46.

Table 1. Generalized Weighted Thermodynamic Perturbation Reweighting Entropies for Several Parametrizations (Param.) of the MACE and DP Architectures (Arch.).

    Reaction
param. arch. mNuc-EtO mNuc-AcO mNuc-PhO Nuc-EtO Nuc-AcO Nuc-PhO
S:1 MACE 0.87 ± 0.03 0.87 ± 0.02 0.84 ± 0.04 0.86 ± 0.03 0.85 ± 0.04 0.83 ± 0.04
  DP 0.74 ± 0.07 0.66 ± 0.11 0.45 ± 0.14 0.72 ± 0.10 0.72 ± 0.08 0.71 ± 0.10
S:2 MACE 0.87 ± 0.04 0.87 ± 0.02 0.84 ± 0.03 0.86 ± 0.04 0.86 ± 0.03 0.84 ± 0.04
  DP 0.73 ± 0.08 0.66 ± 0.09 0.40 ± 0.15 0.73 ± 0.10 0.72 ± 0.09 0.71 ± 0.09
S:4 MACE 0.87 ± 0.03 0.86 ± 0.04 0.84 ± 0.04 0.86 ± 0.04 0.85 ± 0.04 0.83 ± 0.05
  DP 0.72 ± 0.08 0.66 ± 0.11 0.46 ± 0.13 0.73 ± 0.08 0.72 ± 0.08 0.70 ± 0.11
S:8 MACE 0.87 ± 0.04 0.86 ± 0.03 0.84 ± 0.04 0.86 ± 0.04 0.85 ± 0.05 0.83 ± 0.05
  DP 0.70 ± 0.10 0.61 ± 0.11 0.39 ± 0.14 0.68 ± 0.09 0.69 ± 0.08 0.66 ± 0.10
a The reweighting corresponds to the prediction of the PBE0/6-31G* free energy from the AM1/d + ΔMLP sampling produced by 4 network parameter sets. This table summarizes the average and standard deviation of the RE values illustrated in Figures 3 and 4.

The AM1/d + MACE and AM1/d + DP free energy surfaces (as opposed to the PBE0/6–31G* surfaces estimated from sample reweighting) can be found in the Supporting Information. In brief, the AM1/d + MACE surfaces are nearly indistinguishable from each other and from the PBE0/6–31G* target, whereas the AM1/d + DP surfaces exhibit errors, especially for the reactions not included in the refinement.

Figures 5 and 6 and Table 2 extend the comparisons by examining the transferability of the MACE and DP end-state models. The data used to train these models were limited to the ξ = 3 Å and ξ = −2 Å umbrella samples from each reaction. Figure 5 illustrates the free energies of the AM1/d + ΔMLP models, and Figure 6 compares their use as reference potentials to estimate the PBE0/6–31G* surface from wTP analysis. Each plot contains 4 MACE and 4 DP surfaces corresponding to the 4 end-state parametrizations initiated from different random number seeds. Figure 5 shows that there is significant variation among the 4 DP end-state models near ξ ≈ 0, the region furthest from the samples used to train the models. In this region, the DP surfaces qualitatively resemble AM1/d more than the PBE0/6–31G* target: their rate limiting barriers are similar to AM1/d, and their mNuc-EtO and Nuc-EtO profiles do not convincingly exhibit early transition states. When used as reference potentials to estimate the PBE0/6–31G* surface (see Figure 6), the DP models continue to show significant variation near ξ ≈ 0. The numerical noise in their wTP estimates is large, and the reweighting entropies are low (0.38 to 0.42). In comparison, the 4 MACE end-state models in Figure 5 show greater agreement with each other and with the PBE0/6–31G* target. Furthermore, the MACE end-state models clearly produce intermediates in the mNuc-EtO and Nuc-EtO profiles; however, the intermediate wells are 2 to 3 kcal/mol too shallow relative to the reference profile. We found this observation surprising because the intermediate state corresponds to a pentacoordinate phosphorus structure, whereas the training data included only structures in which either the P–O2′ or P–O5′ bond was fully broken. When wTP analysis is performed on the MACE sampling (see Figure 6), the errors in the mNuc-EtO and Nuc-EtO intermediate state well depths are reduced to 1.5 kcal/mol. There is good agreement among the 4 wTP MACE estimates and qualitative agreement with the ab initio reference, but there is noticeable numerical noise in the predicted surfaces. This is consistent with the mediocre reweighting entropy values, which range from 0.63 to 0.70.
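The wTP and gwTP estimates discussed above are generalizations of single-reference thermodynamic perturbation. As a point of orientation, the basic (non-generalized) reweighting relation can be written as follows; the notation is assumed for illustration and is not the authors' exact estimator:

```latex
% Target-level free energy along xi estimated by exponential reweighting
% of samples drawn on the reference potential:
A_{\mathrm{PBE0}}(\xi) \;\approx\; A_{\mathrm{ref}}(\xi)
  \;-\; \frac{1}{\beta}\,
  \ln\Big\langle e^{-\beta\left[U_{\mathrm{PBE0}} - U_{\mathrm{ref}}\right]}
  \Big\rangle_{\mathrm{ref},\,\xi},
\qquad
U_{\mathrm{ref}} \;=\; U_{\mathrm{AM1/d}} + \Delta U_{\mathrm{MLP}}
```

The estimate degrades when a few samples dominate the exponential average, which is precisely the situation diagnosed by low reweighting entropies.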

Figure 5. Comparison of AM1/d + ΔMLP free energy profiles, where the ΔMLP is calculated with DP or MACE networks. The ΔMLP corrections were parametrized to the ξ = 3 Å and ξ = −2 Å umbrella samples of each reaction. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

Figure 6. Comparison of PBE0/6–31G* free energy profiles estimated from wTP analysis of the AM1/d + ΔMLP sampling, where the ΔMLP is calculated with DP or MACE networks. The ΔMLP corrections were parametrized to the ξ = 3 Å and ξ = −2 Å umbrella samples of each reaction. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

Table 2. Average Reweighting Entropies of the MACE and DP Architecture (Arch.) End-State Models.

    reaction
weights arch. mNuc-EtO mNuc-AcO mNuc-PhO Nuc-EtO Nuc-AcO Nuc-PhO
MBAR MACE 0.98 ± 0.02 0.98 ± 0.02 0.98 ± 0.01 0.98 ± 0.02 0.98 ± 0.02 0.98 ± 0.01
  DP 0.98 ± 0.02 0.98 ± 0.03 0.98 ± 0.02 0.98 ± 0.01 0.98 ± 0.02 0.98 ± 0.02
wTP MACE 0.70 ± 0.07 0.65 ± 0.11 0.67 ± 0.08 0.66 ± 0.09 0.64 ± 0.10 0.63 ± 0.11
  DP 0.42 ± 0.10 0.40 ± 0.10 0.38 ± 0.10 0.44 ± 0.11 0.40 ± 0.08 0.40 ± 0.09
a The values measure the distribution of the unbiased AM1/d + ΔMLP weights (MBAR) or the wTP weights used to estimate the PBE0/6-31G* free energy. This table summarizes the average and standard deviation of the RE values illustrated in Figures 5 and 6.

One may question whether the differences between AM1/d + DP and AM1/d + MACE shown in Figures 5 and 6 are a direct consequence of the ΔMLP energies or a secondary effect caused by structural inconsistencies in the simulations. To address this possibility, we have included a comparison of heavy atom coordinate root-mean-square deviations (RMSD) in the Supporting Information. In summary, we performed expensive PBE0/6–31G* QM/MM simulations of the rate limiting transition states to calculate an ensemble averaged ab initio solute structure. The configurations saved from the AM1/d + DP and AM1/d + MACE trajectories were aligned to the ab initio structure to minimize the RMSD. The average and standard deviation of the RMSD values were tabulated, and the difference between the AM1/d + DP and AM1/d + MACE averages was found to be less than the sum of their standard deviations. This suggests that the observations illustrated in Figures 5 and 6 are likely a direct outcome of the ΔMLP corrections rather than a structural effect.
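The alignment step in this analysis is conventionally done with the Kabsch algorithm (optimal rigid-body superposition before computing the RMSD). A self-contained sketch follows; it is illustrative only, and the paper's atom selection and weighting may differ.

```python
import numpy as np

def aligned_rmsd(P, Q):
    """RMSD of coordinates P (N,3) against reference Q (N,3) after optimal
    rigid-body superposition (Kabsch algorithm).  Both structures are
    centered, P is rotated onto Q, then the residual RMSD is returned."""
    P = P - P.mean(axis=0)            # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                       # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])        # guard against improper rotation
    R = Vt.T @ D @ U.T                # optimal rotation
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))
```

For a structure that differs from the reference only by a rotation and a translation, the aligned RMSD is numerically zero, so nonzero values isolate genuine conformational differences.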

3.3. Effect of MACE Network Parameters on Accuracy and Performance

Figure 7 and Table 3 compare a series of AM1/d + MACE end-state models. The models differ in the network parameters that control the message passing mechanism: the equivariant feature maximum angular momentum (L), the number of message passing layers (T), the correlation order (ν), and the number of channels (N). The MACE models in Figures 3, 5, and 6 correspond to the hyperparameters L = 1, T = 2, ν = 3, and N = 128. The variations shown in Figure 7 independently adjust each hyperparameter to reduce the complexity of the network. The free energy surfaces produced by these models are largely invariant to adjustment of the network parameters. There is a slight decrease in the PBE0/6–31G* wTP reweighting entropy averages when L is reduced from 1 to 0 or T is reduced from 2 to 1; however, these differences are smaller than the standard deviations. The reweighting entropies are most sensitive to a reduction of the correlation order from 3 to 1, which causes the entropies to decrease from 0.65 to 0.43.
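The hyperparameter notation above maps onto the options exposed by typical MACE training scripts. The sketch below records the default model and the one-at-a-time ablations as plain data; the option names follow the conventions of the open-source mace-torch package and should be treated as assumptions, not as the authors' exact input files.

```python
# Hedged mapping of the paper's hyperparameter notation onto typical MACE
# training options (names assumed from mace-torch conventions):
default_model = {
    "max_L": 1,             # L: max angular momentum of equivariant features
    "num_interactions": 2,  # T: number of message passing layers
    "correlation": 3,       # nu: correlation (body) order of each message
    "num_channels": 128,    # N: number of feature channels
}

# The ablations compared in this section reduce one setting at a time:
ablations = [
    dict(default_model, max_L=0),
    dict(default_model, num_interactions=1),
    dict(default_model, correlation=1),      # the most damaging reduction
    dict(default_model, num_channels=64),
]
```

Recording the ablations this way makes explicit that each variant changes exactly one hyperparameter relative to the default.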

Figure 7. Comparison of AM1/d + MACE end-state model free energy profiles. The reweighting entropies analyze the wTP estimate of the PBE0/6–31G* distributions. The default model hyperparameters are L = 1, T = 2, ν = 3, and N = 128. The other variations differ by reducing the value of a single hyperparameter, as indicated in the legend. Parts a–f and g–l are the reactions involving the mNuc and Nuc nucleophiles, respectively.

Table 3. Average Reweighting Entropies of Several AM1/d + MACE End-State Models Calculated from PBE0/6-31G* Target Potential wTP Analysis. L, T, ν, and N Are the Equivariant Feature Maximum Angular Momentum, the Number of Message Passing Layers, the Correlation Order, and the Number of Channels, Respectively.

        reaction
L T ν N mNuc-EtO mNuc-AcO mNuc-PhO Nuc-EtO Nuc-AcO Nuc-PhO
1 2 3 128 0.70 ± 0.07 0.65 ± 0.11 0.67 ± 0.08 0.66 ± 0.09 0.64 ± 0.10 0.63 ± 0.11
0 2 3 128 0.64 ± 0.08 0.63 ± 0.10 0.61 ± 0.09 0.64 ± 0.08 0.59 ± 0.10 0.59 ± 0.08
1 1 3 128 0.64 ± 0.08 0.60 ± 0.10 0.60 ± 0.08 0.60 ± 0.10 0.57 ± 0.11 0.57 ± 0.09
1 2 1 128 0.43 ± 0.09 0.43 ± 0.11 0.43 ± 0.10 0.43 ± 0.11 0.43 ± 0.10 0.47 ± 0.08
1 2 3 64 0.68 ± 0.07 0.66 ± 0.08 0.66 ± 0.09 0.63 ± 0.08 0.64 ± 0.09 0.63 ± 0.07
a This table summarizes the average and standard deviation of the RE values illustrated in Figure 7.

The observations made in Figure 7 offer some insight into why the AM1/d + MACE models are more accurate and transferable than AM1/d + DP. The DP ΔMLP correction is a type of feed-forward neural network evaluated with 2-body embedding descriptors, whereas most of the MACE ΔMLP potentials appearing in this work were parametrized with a correlation order of 3 (4-body descriptors). One may question whether the increased body order of the MACE descriptors is the source of the differences between the DP and MACE results. This hypothesis is partially supported by Figure 7, which shows that the quality of the AM1/d + MACE reference potential is reduced when the correction is limited to 2-body descriptors.

Table 4 shows the simulation performance for the mNuc-AcO system measured on a single core of an Intel Xeon 8358 2.60 GHz central processing unit (CPU) using AM1/d, AM1/d + DP, and several AM1/d + MACE QM/MM models. The performance is quantified by the average number of milliseconds needed to complete a molecular dynamics time step (smaller values indicate better performance). The measurements were repeated using an NVidia V100 graphics processing unit (GPU) to evaluate the ΔMLP component of the energy. When calculated on a CPU, the AM1/d + DP method is twice as slow as a standard AM1/d QM/MM potential; however, the AM1/d + MACE models are 14 to 29 times more expensive, which makes them impractical to use. Furthermore, reducing the values of the MACE network hyperparameters has a significant effect on the model's CPU performance: reducing L or T roughly halves the CPU time per step. When the GPU is used as a coprocessor to evaluate the ΔMLP, the AM1/d + DP method is only 8% slower than AM1/d QM/MM. The AM1/d + MACE models achieve similar performance; they are 18% to 28% slower than AM1/d QM/MM. The large difference in CPU performance is likely a consequence of how the two MLP architectures are implemented: the DP architecture is implemented with the TensorFlow libraries, whereas the MACE method is based on PyTorch, whose performance is tuned for GPUs. The AM1/d + MACE GPU timings are not as sensitive to changes in the model hyperparameters as the CPU timings. This could be a consequence of the small size of this application; the GPU calculation may be bottlenecked by memory transfers rather than floating point evaluation.

Table 4. Performance of AM1/d and AM1/d + ΔMLP QM/MM Simulation of the mNuc-AcO Reaction.

          time (ms/step)
model L T ν N CPU CPU + GPU
AM1/d ··· ··· ··· ··· 114 ···
AM1/d + DP ··· ··· ··· ··· 250 124
AM1/d + MACE 1 2 3 128 3277 146
  0 2 3 128 1636 136
  1 1 3 128 1558 135
  1 2 1 128 2334 137
  1 2 3 64 1631 133
a The performance metric is the wall-clock time per molecular dynamics step (ms/step, lower is better) evaluated with an Intel Xeon 8358 2.60 GHz CPU and an NVidia V100 GPU. L, T, ν, and N are the equivariant feature maximum angular momentum, the number of message passing layers, the correlation order, and the number of channels, respectively.
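The relative costs quoted in the text follow directly from the Table 4 timings; a quick arithmetic check (timing values copied from the table, AM1/d CPU baseline of 114 ms/step):

```python
# Relative costs implied by Table 4 (wall-clock ms/step).
baseline = 114.0   # AM1/d QM/MM on one CPU core

def slowdown(t_ms, ref=baseline):
    """Fractional slowdown of a timing relative to the AM1/d baseline."""
    return (t_ms - ref) / ref

assert abs(slowdown(146.0) - 0.28) < 0.01   # default MACE + GPU: ~28% slower
assert 28.0 < 3277.0 / baseline < 29.0      # default MACE, CPU only: ~29x cost
assert 13.0 < 1558.0 / baseline < 14.0      # T = 1 MACE, CPU only: ~14x cost
```

These ratios reproduce the 14- to 29-fold CPU penalty and the 28% GPU penalty stated above.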

4. Conclusions

We described the adaptation of the MACE message passing neural network for use as a range corrected ΔMLP to improve semiempirical QM and QM/MM interactions. The key modifications include changing the graph network to avoid direct communication between pairs of MM atoms, forcing the isolated atom MM energy corrections to be zero, and allowing MM and QM atoms of the same element to be trained with different neural network parameters. These changes ensure that the energy is conserved when MM atoms enter or leave the vicinity of the QM region during the course of dynamics.
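Schematically, the range corrected ΔMLP decomposition described above can be written as follows; this is a sketch in assumed notation rather than the authors' exact equations:

```latex
% Total potential: low-level QM/MM energy plus a learned correction that
% depends only on the QM atoms and the MM atoms near the QM region:
E_{\mathrm{AM1/d+MACE}}
  \;=\; E^{\mathrm{QM/MM}}_{\mathrm{AM1/d}}
  \;+\; \Delta E_{\mathrm{MACE}}\!\left(\mathbf{R}_{\mathrm{QM}},
      \mathbf{R}_{\mathrm{MM\ near\ QM}}\right),
\qquad
\Delta E_{\mathrm{MACE}} \;\approx\;
  E^{\mathrm{QM/MM}}_{\mathrm{PBE0/6\text{-}31G^{*}}}
  \;-\; E^{\mathrm{QM/MM}}_{\mathrm{AM1/d}}
```

Because the correction for an isolated MM atom is forced to zero, the total energy remains well defined as MM atoms diffuse into and out of the corrected region.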

We trained a series of AM1/d + ΔMLP models to reproduce PBE0/6–31G* QM/MM energies and forces. The models were used to investigate the transferability of the MACE corrections in umbrella sampling free energy applications of 6 nonenzymatic reactions that mimic the RNA transphosphorylation mechanism. We repeated the training and QM/MM umbrella sampling to make comparisons with range corrected DeepPot-SE deep neural network ΔMLP models, referred to as AM1/d + DP. We were able to train the different architectures in a consistent manner by utilizing the graph neural network software plugin infrastructure recently incorporated into the DeePMD-kit package. This allowed us to parametrize the methods using the algorithms and optimization hyperparameters interfaced through the DP-GEN software, which also provides an easy-to-use framework for query-by-committee active learning.

We parametrized “end-state” models that were trained to the reactant and product state structures from all 6 reactions, and these models were refined several times by including varying amounts of umbrella sampling from 3 of the 6 reactions. When the refined models were applied as reference potentials to estimate the PBE0/6–31G* surfaces of all 6 reactions, the AM1/d + MACE method was found to consistently reproduce the target surfaces well, whereas the AM1/d + DP models exhibited substantially larger numerical noise, especially for those reactions not included in the refinement training. The free energy surfaces were also calculated using several end-state models that differed only in the random number seed used to initiate the neural network parameter optimization. There were large variations among the AM1/d + DP models near the transition states, and the surfaces showed greater resemblance to the AM1/d base model than to the PBE0/6–31G* target because their barriers were too large and they failed to reliably produce the expected pentacoordinated phosphorus intermediate states. When used as reference potentials, the target surfaces estimated from the AM1/d + DP sampling suffered from numerical noise and low reweighting entropies, indicating that the accuracy of the ΔMLP correction is insufficient. In contrast, the AM1/d + MACE surfaces agreed reasonably well with each other and with PBE0/6–31G*. The AM1/d + MACE models correctly predicted stable pentacoordinated phosphorus intermediate states even though the training did not include structures with a similar bonding pattern. When used as reference potentials, the AM1/d + MACE models produced less numerical noise and larger reweighting entropies than the AM1/d + DP models.

Several AM1/d + MACE end-state models were compared to investigate the effect of the message passing hyperparameters on the model's accuracy and performance. Individually reducing the equivariant feature maximum angular momentum and the number of message passing layers had a minimal impact on the free energy surfaces. The largest effect occurred upon reducing the correlation order from 3 to 1, which decreased the wTP reweighting entropies from 0.65 to 0.43. The PyTorch implementation of the MACE potential is not well-optimized for inference on CPUs. The AM1/d + MACE simulations are 2900% slower than a standard AM1/d QM/MM calculation when performed on a CPU; however, when the ΔMLP is evaluated with an NVidia V100 GPU, the AM1/d + MACE simulations are only 28% slower. This performance penalty is only slightly worse than the 8% penalty observed with the AM1/d + DP models.

In conclusion, the MACE architecture shows potential as a ΔMLP that may be more transferable than deep neural network models. At the present time, it is not feasible to perform MACE ΔMLP simulations on CPUs; however, its GPU-accelerated performance is sufficiently fast to obtain the sampling needed to calculate free energy surfaces of chemical reactions.

Supplementary Material

jp5c02006_si_001.pdf (159.3KB, pdf)

Acknowledgments

The authors are grateful for financial support provided by the National Institutes of Health (No. GM62248) and the National Science Foundation (CSSI Frameworks Grant No. 2209718). Computational resources were provided by the Office of Advanced Research Computing (OARC) at Rutgers, The State University of New Jersey; the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296 (supercomputer Expanse at SDSC through allocation CHE190067); and the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, URL: http://www.tacc.utexas.edu (supercomputer Frontera through allocation CHE20002).

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.5c02006.

  • Figures that illustrate the smoothness of the QM and QM/MM MACE corrections, the conservation of energy within AM1/d + MACE NVE simulations, energy surfaces analyzed from the AM1/d + MACE and AM1/d + DP S:n refined models, and geometrical comparisons (PDF)

The authors declare no competing financial interest.

Published as part of The Journal of Physical Chemistry B special issue “Applications of Free-Energy Calculations to Biomolecular Processes”.

References

  1. Warshel A., Levitt M.. Theoretical studies of enzymic reactions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. J. Mol. Biol. 1976;103:227–249. doi: 10.1016/0022-2836(76)90311-9. [DOI] [PubMed] [Google Scholar]
  2. Karplus M.. Molecular dynamics simulations of biomolecules. Acc. Chem. Res. 2002;35:321–323. doi: 10.1021/ar020082r. [DOI] [PubMed] [Google Scholar]
  3. Warshel A.. Molecular dynamics simulations of biological reactions. Acc. Chem. Res. 2002;35:385–395. doi: 10.1021/ar010033z. [DOI] [PubMed] [Google Scholar]
  4. Gao J., Truhlar D. G.. Quantum Mechanical Methods for Enzyme Kinetics. Annu. Rev. Phys. Chem. 2002;53:467–505. doi: 10.1146/annurev.physchem.53.091301.150114. [DOI] [PubMed] [Google Scholar]
  5. Garcia-Viloca M., Gao J., Karplus M., Truhlar D. G.. How enzymes work: Analysis by modern rate theory and computer simulations. Science. 2004;303:186–195. doi: 10.1126/science.1088172. [DOI] [PubMed] [Google Scholar]
  6. Giese T. J., Ekesan S., York D. M.. Extension of the Variational Free Energy Profile and Multistate Bennett Acceptance Ratio Methods for High-Dimensional Potential of Mean Force Profile Analysis. J. Phys. Chem. A. 2021;125:4216–4232. doi: 10.1021/acs.jpca.1c00736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Giese T. J., Ekesan S., McCarthy E., Tao Y., York D. M.. Surface-Accelerated String Method for Locating Minimum Free Energy Paths. J. Chem. Theory Comput. 2024;20:2058–2073. doi: 10.1021/acs.jctc.3c01401. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Nam K., Gao J., York D. M.. An efficient linear-scaling Ewald method for long-range electrostatic interactions in combined QM/MM calculations. J. Chem. Theory Comput. 2005;1:2–13. doi: 10.1021/ct049941i. [DOI] [PubMed] [Google Scholar]
  9. Senn H. M., Thiel W.. QM/MM methods for biomolecular systems. Angew. Chem., Int. Ed. 2009;48:1198–1229. doi: 10.1002/anie.200802019. [DOI] [PubMed] [Google Scholar]
  10. Behler J.. Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 2016;145:170901. doi: 10.1063/1.4966192. [DOI] [PubMed] [Google Scholar]
  11. Butler K. T., Davies D. W., Cartwright H., Isayev O., Walsh A.. Machine learning for molecular and materials science. Nature. 2018;559:547–555. doi: 10.1038/s41586-018-0337-2. [DOI] [PubMed] [Google Scholar]
  12. Noé F., Tkatchenko A., Müller K.-R., Clementi C.. Machine Learning for Molecular Simulation. Annu. Rev. Phys. Chem. 2020;71:361–390. doi: 10.1146/annurev-physchem-042018-052331. [DOI] [PubMed] [Google Scholar]
  13. Pinheiro Jr M., Ge F., Ferré N., Dral P. O., Barbatti M.. Choosing the right molecular machine learning potential. Chem. Sci. 2021;12:14396–14413. doi: 10.1039/D1SC03564A. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Manzhos S., Carrington Jr. T.. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem. Rev. 2021;121:10187–10217. doi: 10.1021/acs.chemrev.0c00665. [DOI] [PubMed] [Google Scholar]
  15. Zeng, J. ; Cao, L. ; Zhu, T. . Neural network potentials. In Quantum Chemistry in the Age of Machine Learning; Dral, P. O. , Ed.; Elsevier, 2022; pp 279–294. [Google Scholar]
  16. Liu Z., Zubatiuk T., Roitberg A., Isayev O.. Auto3D: Automatic Generation of the Low-Energy 3D Structures with ANI Neural Network Potentials. J. Chem. Inf. Model. 2022;62:5373–5382. doi: 10.1021/acs.jcim.2c00817. [DOI] [PubMed] [Google Scholar]
  17. Yuan Y., Cui Q.. Accurate and Efficient Multilevel Free Energy Simulations with Neural Network-Assisted Enhanced Sampling. J. Chem. Theory Comput. 2023;19:5394–5406. doi: 10.1021/acs.jctc.3c00591. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Smith J. S., Isayev O., Roitberg A. E.. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 2017;8:3192–3203. doi: 10.1039/C6SC05720A. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Gao X., Ramezanghorbani F., Isayev O., Smith J. S., Roitberg A. E.. TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials. J. Chem. Inf. Model. 2020;60:3408–3415. doi: 10.1021/acs.jcim.0c00451. [DOI] [PubMed] [Google Scholar]
  20. Xu L., Shao W., Jin H., Wang Q.. Data Efficient and Stability Indicated Sampling for Developing Reactive Machine Learning Potential to Achieve Ultralong Simulation in Lithium-Metal Batteries. J. Phys. Chem. C. 2023;127:24106–24117. doi: 10.1021/acs.jpcc.3c05522. [DOI] [Google Scholar]
  21. York D. M., Darden T., Pedersen L. G.. The effect of long-range electrostatic interactions in simulations of macromolecular crystals: a comparison of the Ewald and truncated list methods. J. Chem. Phys. 1993;99:8345–8348. doi: 10.1063/1.465608. [DOI] [Google Scholar]
  22. Yue S., Muniz M. C., Calegari Andrade M. F., Zhang L., Car R., Panagiotopoulos A. Z.. When do short-range atomistic machine-learning models fall short? J. Chem. Phys. A. 2021;154:034111. doi: 10.1063/5.0031215. [DOI] [PubMed] [Google Scholar]
  23. Parsaeifard B., De D. S., Finkler J. A., Goedecker S.. Fingerprint-Based Detection of Non-Local Effects in the Electronic Structure of a Simple Single Component Covalent System. Condens. Matter. 2021;6:9. doi: 10.3390/condmat6010009. [DOI] [Google Scholar]
  24. Niblett S. P., Galib M., Limmer D. T.. Learning intermolecular forces at liquid–vapor interfaces. J. Chem. Phys. 2021;155:164101. doi: 10.1063/5.0067565. [DOI] [PubMed] [Google Scholar]
  25. Gao A., Remsing R. C.. Self-consistent determination of long-range electrostatics in neural network potentials. Nat. Commun. 2022;13:1572. doi: 10.1038/s41467-022-29243-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Anstine D. M., Isayev O.. Machine Learning Interatomic Potentials and Long-Range Physics. J. Phys. Chem. A. 2023;127:2417–2431. doi: 10.1021/acs.jpca.2c06778. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Zinovjev K.. Electrostatic Embedding of Machine Learning Potentials. J. Chem. Theory Comput. 2023;19:1888–1897. doi: 10.1021/acs.jctc.2c00914. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Zinovjev K., Hedges L., Montagud Andreu R., Woods C., Tuñón I., van der Kamp M. W.. emle-engine: A Flexible Electrostatic Machine Learning Embedding Package for Multiscale Molecular Dynamics Simulations. J. Chem. Theory Comput. 2024;20:4514–4522. doi: 10.1021/acs.jctc.4c00248. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Gubler M., Finkler J. A., Schäfer M. R., Behler J., Goedecker S.. Accelerating Fourth-Generation Machine Learning Potentials Using Quasi-Linear Scaling Particle Mesh Charge Equilibration. J. Chem. Theory Comput. 2024;20:7264–7271. doi: 10.1021/acs.jctc.4c00334. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Thomas, J. ; Baldwin, W. J. ; Csányi, G. ; Ortner, C. . Self-consistent Coulomb interactions for machine learning interatomic potentials. 2024, arxiv:2406.10915. [Google Scholar]
  31. Ramakrishnan R., Dral P. O., Rupp M., von Lilienfeld O. A.. Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach. J. Chem. Theory Comput. 2015;11:2087–2096. doi: 10.1021/acs.jctc.5b00099. [DOI] [PubMed] [Google Scholar]
  32. Zaspel P., Huang B., Harbrecht H., von Lilienfeld O. A.. Boosting Quantum Machine Learning Models with a Multilevel Combination Technique: Pople Diagrams Revisited. J. Chem. Theory Comput. 2019;15:1546–1550. doi: 10.1021/acs.jctc.8b00832. [DOI] [PubMed] [Google Scholar]
  33. Pan X., Yang J., Van R., Epifanovsky E., Ho J., Huang J., Pu J., Mei Y., Nam K., Shao Y.. Machine-Learning-Assisted Free Energy Simulation of Solution-Phase and Enzyme Reactions. J. Chem. Theory Comput. 2021;17:5745–5758. doi: 10.1021/acs.jctc.1c00565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Zeng J., Giese T. J., Ekesan S. ¸., York D. M.. Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution. J. Chem. Theory Comput. 2021;17:6993–7009. doi: 10.1021/acs.jctc.1c00201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Nandi A., Qu C., Houston P. L., Conte R., Bowman J. M.. Δ -machine learning for potential energy surfaces: A PIP approach to bring a DFT-based PES to CCSD­(T) level of theory. J. Chem. Phys. 2021;154:051102. doi: 10.1063/5.0038301. [DOI] [PubMed] [Google Scholar]
  36. Liu Y., Li J.. Permutation-Invariant-Polynomial Neural-Network-Based Δ-Machine Learning Approach: A Case for the HO2 Self-Reaction and Its Dynamics Study. J. Phys. Chem. Lett. 2022;13:4729–4738. doi: 10.1021/acs.jpclett.2c01064. [DOI] [PubMed] [Google Scholar]
  37. Ding Y., Huang J. D. P. /M. M.. DP/MM: A Hybrid Model for Zinc–Protein Interactions in Molecular Dynamics. J. Phys. Chem. Lett. 2024;15:616–627. doi: 10.1021/acs.jpclett.3c03158. [DOI] [PubMed] [Google Scholar]
  38. Yao S., Van R., Pan X., Park J. H., Mao Y., Pu J., Mei Y., Shao Y.. Machine learning based implicit solvent model for aqueous-solution alanine dipeptide molecular dynamics simulations. RSC Adv. 2023;13:4565–4577. doi: 10.1039/D2RA08180F. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Giese T. J., Zeng J., Ekesan S., York D. M.. Combined QM/MM, Machine Learning Path Integral Approach to Compute Free Energy Profiles and Kinetic Isotope Effects in RNA Cleavage Reactions. J. Chem. Theory Comput. 2022;18:4304–4317. doi: 10.1021/acs.jctc.2c00151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Giese T. J., Zeng J., York D. M.. Multireference Generalization of the Weighted Thermodynamic Perturbation Method. J. Phys. Chem. A. 2022;126:8519–8533. doi: 10.1021/acs.jpca.2c06201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Giese T. J., York D. M.. Estimation of frequency factors for the calculation of kinetic isotope effects from classical and path integral free energy simulations. J. Chem. Phys. 2023;158:174105. doi: 10.1063/5.0147218. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Zhou B., Zhou Y., Xie D.. Accelerated Quantum Mechanics/Molecular Mechanics Simulations via Neural Networks Incorporated with Mechanical Embedding Scheme. J. Chem. Theory Comput. 2023;19:1157–1169. doi: 10.1021/acs.jctc.2c01131. [DOI] [PubMed] [Google Scholar]
  43. Lier B., Poliak P., Marquetand P., Westermayr J., Oostenbrink C.. BuRNN: Buffer Region Neural Network Approach for Polarizable-Embedding Neural Network/Molecular Mechanics Simulations. J. Phys. Chem. Lett. 2022;13:3812–3818. doi: 10.1021/acs.jpclett.2c00654. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Shen L., Wu J., Yang W.. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput. 2016;12:4934–4946. doi: 10.1021/acs.jctc.6b00663. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Wu J., Shen L., Yang W.. Internal force corrections with machine learning for quantum mechanics/molecular mechanics simulations. J. Chem. Phys. 2017;147:161732. doi: 10.1063/1.5006882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Snyder R., Kim B., Pan X., Shao Y., Pu J.. Facilitating ab initio QM/MM free energy simulations by Gaussian process regression with derivative observations. Phys. Chem. Chem. Phys. 2022;24:25134–25143. doi: 10.1039/D2CP02820D. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Snyder R., Kim B., Pan X., Shao Y., Pu J.. Bridging semiempirical and ab initio QM/MM potentials by Gaussian process regression and its sparse variants for free energy simulation. J. Chem. Phys. A. 2023;159:054107. doi: 10.1063/5.0156327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Gómez-Flores C. L., Maag D., Kansari M., Vuong V.-Q., Irle S., Gräter F., Kubař T., Elstner M.. Accurate Free Energies for Complex Condensed-Phase Reactions Using an Artificial Neural Network Corrected DFTB/MM Methodology. J. Chem. Theory Comput. 2022;18:1213–1226. doi: 10.1021/acs.jctc.1c00811. [DOI] [PubMed] [Google Scholar]
  49. Böselt L., Thürlemann M., Riniker S.. Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed- Phase Systems. J. Chem. Theory Comput. 2021;17:2641–2658. doi: 10.1021/acs.jctc.0c01112. [DOI] [PubMed] [Google Scholar]
  50. Liang, W. ; Zeng, J. ; York, D. M. ; Zhang, L. ; Wang, H. . Learning DeePMD-Kit: A Guide to Building Deep Potential Models. In A Practical Guide to Recent Advances in Multiscale Modeling and Simulation of Biomolecules; Wang, Y. , Zhou, R. , Eds.; AIP Publishing, 2023; pp 1–20. [Google Scholar]
  51. Giese T. J., Zeng J., Lerew L., McCarthy E., Tao Y., Ekesan Ş., York D. M.. Software Infrastructure for Next-Generation QM/MM-ΔMLP Force Fields. J. Phys. Chem. B. 2024;128:6257–6271. doi: 10.1021/acs.jpcb.4c01466. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Tao Y., Giese T. J., Ekesan Ş., Zeng J., Aradi B., Hourahine B., Aktulga H. M., Götz A. W., Merz K. M. Jr., York D. M.. Amber free energy tools: Interoperable software for free energy simulations using generalized quantum mechanical/molecular mechanical and machine learning potentials. J. Chem. Phys. 2024;160:224104. doi: 10.1063/5.0211276. [DOI] [PubMed] [Google Scholar]
  53. Tao Y., Giese T. J., York D. M.. Electronic and Nuclear Quantum Effects on Proton Transfer Reactions of Guanine–Thymine (G-T) Mispairs Using Combined Quantum Mechanical/Molecular Mechanical and Machine Learning Potentials. Molecules. 2024;29:2703. doi: 10.3390/molecules29112703. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Wilson T. J., McCarthy E., Ekesan Ş., Giese T. J., Li N.-S., Huang L., Piccirilli J. A., York D. M., Lilley D. M. J.. The Role of General Acid Catalysis in the Mechanism of an Alkyl Transferase Ribozyme. ACS Catal. 2024;14:15294–15305. doi: 10.1021/acscatal.4c04571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Xiong M., Nie T., Li Z., Hu M., Su H., Hu H., Xu Y., Shao Q.. Potency Prediction of Covalent Inhibitors against SARS-CoV-2 3CL-like Protease and Multiple Mutants by Multiscale Simulations. J. Chem. Inf. Model. 2024;64(24):9501–9516. doi: 10.1021/acs.jcim.4c01594. [DOI] [PubMed] [Google Scholar]
  56. Yang J., Cong Y., Li Y., Li H.. Machine Learning Approach Based on a Range-Corrected Deep Potential Model for Efficient Vibrational Frequency Computation. J. Chem. Theory Comput. 2023;19:6366–6374. doi: 10.1021/acs.jctc.3c00386. [DOI] [PubMed] [Google Scholar]
  57. Zeng J., Zhang D., Lu D., Mo P., Li Z., Chen Y., Rynik M., Huang L., Li Z., Shi S., Wang Y., Ye H., Tuo P., Yang J., Ding Y., Li Y., Tisi D., Zeng Q., Bao H., Xia Y., Huang J., Muraoka K., Wang Y., Chang J., Yuan F., Bore S. L., Cai C., Lin Y., Wang B., Xu J., Zhu J.-X., Luo C., Zhang Y., Goodall R. E. A., Liang W., Singh A. K., Yao S., Zhang J., Wentzcovitch R., Han J., Liu J., Jia W., York D. M., Weinan E., Car R., Zhang L., Wang H.. DeePMD-kit v2: A software package for deep potential models. J. Chem. Phys. 2023;159:054801. doi: 10.1063/5.0155600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Wang H., Zhang L., Han J., E W.. DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Commun. 2018;228:178–184. doi: 10.1016/j.cpc.2018.03.016. [DOI] [Google Scholar]
  59. Zeng J., Zhang D., Peng A., Zhang X., He S., Wang Y., Liu X., Bi H., Li Y., Cai C., Zhang C., Du Y., Zhu J.-X., Mo P., Huang Z., Zeng Q., Shi S., Qin X., Yu Z., Luo C., Ding Y., Liu Y.-P., Shi R., Wang Z., Bore S. L., Chang J., Deng Z., Ding Z., Han S., Jiang W., Ke G., Liu Z., Lu D., Muraoka K., Oliaei H., Singh A. K., Que H., Xu W., Xu Z., Zhuang Y.-B., Dai J., Giese T. J., Jia W., Xu B., York D. M., Zhang L., Wang H.. DeePMD-kit v3: A Multiple-Backend Framework for Machine Learning Potentials. J. Chem. Theory Comput. 2025;21(9):4375–4385. doi: 10.1021/acs.jctc.5c00340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Case D. A., Aktulga H. M., Belfon K., Cerutti D. S., Cisneros G. A., Cruzeiro V. W. D., Forouzesh N., Giese T. J., Götz A. W., Gohlke H., Izadi S., Kasavajhala K., Kaymak M. C., King E., Kurtzman T., Lee T.-S., Li P., Liu J., Luchko T., Luo R., Manathunga M., Machado M. R., Nguyen H. M., O’Hearn K. A., Onufriev A. V., Pan F., Pantano S., Qi R., Rahnamoun A., Risheh A., Schott-Verdugo S., Shajan A., Swails J., Wang J., Wei H., Wu X., Wu Y., Zhang S., Zhao S., Zhu Q., Cheatham T. E. 3rd, Roe D. R., Roitberg A., Simmerling C., York D. M., Nagan M. C., Merz K. M. Jr.. AmberTools. J. Chem. Inf. Model. 2023;63:6183–6191. doi: 10.1021/acs.jcim.3c01153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Abadi, M. ; Agarwal, A. ; Barham, P. ; Brevdo, E. ; Chen, Z. ; Citro, C. ; Corrado, G. S. ; Davis, A. ; Dean, J. ; Devin, M. ; Ghemawat, S. ; Goodfellow, I. ; Harp, A. ; Irving, G. ; Isard, M. ; Jia, Y. ; Jozefowicz, R. ; Kaiser, L. ; Kudlur, M. ; Levenberg, J. ; Mané, D. ; Monga, R. ; Moore, S. ; Murray, D. ; Olah, C. ; Schuster, M. ; Shlens, J. ; Steiner, B. ; Sutskever, I. ; Talwar, K. ; Tucker, P. ; Vanhoucke, V. ; Vasudevan, V. ; Viégas, F. ; Vinyals, O. ; Warden, P. ; Wattenberg, M. ; Wicke, M. ; Yu, Y. ; Zheng, X. . TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015, arXiv:1603.04467. [Google Scholar]
  62. Paszke, A. , Gross, S. , Massa, F. , Lerer, A. , Bradbury, J. , Chanan, G. , Killeen, T. , Lin, Z. , Gimelshein, N. , Antiga, L. , Desmaison, A. , Köpf, A. , Yang, E. , DeVito, Z. , Raison, M. , Tejani, A. , Chilamkurthy, S. , Steiner, B. , Fang, L. , Bai, J. , Chintala, S. . PyTorch: An Imperative Style, High-Performance Deep Learning Library. 2019, arXiv:1912.01703. [Google Scholar]
  63. Zeng J., Giese T. J., Zhang D., Wang H., York D. M.. DeePMD-GNN: A DeePMD-kit Plugin for External Graph Neural Network Potentials. J. Chem. Inf. Model. 2025;65:3154–3160. doi: 10.1021/acs.jcim.4c02441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Zhang Y., Wang H., Chen W., Zeng J., Zhang L., Wang H., Weinan E.. DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models. Comput. Phys. Commun. 2020;253:107206. doi: 10.1016/j.cpc.2020.107206. [DOI] [Google Scholar]
  65. Batatia, I. ; Kovács, D. P. ; Simm, G. N. C. ; Ortner, C. ; Csányi, G. . MACE: higher order equivariant message passing neural networks for fast and accurate force fields. 2022, arXiv:2206.07697. [Google Scholar]
  66. Kovács D. P., Batatia I., Arany E. S., Csányi G.. Evaluation of the MACE force field architecture: From medicinal chemistry to materials science. J. Chem. Phys. 2023;159:044118. doi: 10.1063/5.0155322. [DOI] [PubMed] [Google Scholar]
  67. Lopez X., York D. M.. Parameterization of semiempirical methods to treat nucleophilic attacks to biological phosphates: AM1/d parameters for phosphorus. Theor. Chem. Acc. 2003;109:149–159. doi: 10.1007/s00214-002-0422-2. [DOI] [Google Scholar]
  68. Nam K., Cui Q., Gao J., York D. M.. Specific reaction parametrization of the AM1/d Hamiltonian for phosphoryl transfer reactions: H, O, and P atoms. J. Chem. Theory Comput. 2007;3:486–504. doi: 10.1021/ct6002466. [DOI] [PubMed] [Google Scholar]
  69. Bevilacqua P. C., Harris M. E., Piccirilli J. A., Gaines C., Ganguly A., Kostenbader K., Ekesan Ş., York D. M.. An Ontology for Facilitating Discussion of Catalytic Strategies of RNA-Cleaving Enzymes. ACS Chem. Biol. 2019;14:1068–1076. doi: 10.1021/acschembio.9b00202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Gaines C. S., Piccirilli J. A., York D. M.. The L-platform/L-scaffold framework: a blueprint for RNA-cleaving nucleic acid enzyme design. RNA. 2020;26:111–125. doi: 10.1261/rna.071894.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Prody G. A., Bakos J. T., Buzayan J. M., Schneider I. R., Bruening G.. Autolytic processing of dimeric plant virus satellite RNA. Science. 1986;231:1577–1580. doi: 10.1126/science.231.4745.1577. [DOI] [PubMed] [Google Scholar]
  72. Pley H. W., Flaherty K. M., McKay D. B.. Three-dimensional structure of a hammerhead ribozyme. Nature. 1994;372:68–74. doi: 10.1038/372068a0. [DOI] [PubMed] [Google Scholar]
  73. Scott W. G., Murray J. B., Arnold J. R. P., Stoddard B. L., Klug A.. Capturing the structure of a catalytic RNA intermediate: The Hammerhead Ribozyme. Science. 1996;274:2065–2069. doi: 10.1126/science.274.5295.2065. [DOI] [PubMed] [Google Scholar]
  74. Martick M., Lee T.-S., York D. M., Scott W. G.. Solvent structure and hammerhead ribozyme catalysis. Chem. Biol. 2008;15:332–342. doi: 10.1016/j.chembiol.2008.03.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Wong K.-Y., Lee T.-S., York D. M.. Active participation of the Mg2+ ion in the reaction coordinate of RNA self-cleavage catalyzed by the hammerhead ribozyme. J. Chem. Theory Comput. 2011;7:1–3. doi: 10.1021/ct100467t. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Buzayan J. M., Gerlach W. L., Bruening G.. Nonenzymatic cleavage and ligation of RNAs complementary to a plant virus satellite RNA. Nature. 1986;323:349–353. doi: 10.1038/323349a0. [DOI] [Google Scholar]
  77. Rupert P. B., Massey A. P., Sigurdsson S. T., Ferré-D’Amaré A. R.. Transition State Stabilization by a Catalytic RNA. Science. 2002;298:1421–1424. doi: 10.1126/science.1076093. [DOI] [PubMed] [Google Scholar]
  78. Nam K., Gao J., York D. M.. Quantum mechanical/molecular mechanical simulation study of the mechanism of hairpin ribozyme catalysis. J. Am. Chem. Soc. 2008;130:4680–4691. doi: 10.1021/ja0759141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Heldenbrand H., Janowski P. A., Giambaşu G., Giese T. J., Wedekind J. E., York D. M.. Evidence for the role of active site residues in the hairpin ribozyme from molecular simulations along the reaction path. J. Am. Chem. Soc. 2014;136:7789–7792. doi: 10.1021/ja500180q. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Sharmeen L., Kuo M. Y., Dinter-Gottlieb G., Taylor J.. Antigenomic RNA of human hepatitis delta virus can undergo self-cleavage. J. Virol. 1988;62:2674–2679. doi: 10.1128/jvi.62.8.2674-2679.1988. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Ferré-D’Amaré A. R., Zhou K., Doudna J. A.. Crystal structure of a hepatitis delta virus ribozyme. Nature. 1998;395:567–574. doi: 10.1038/26912. [DOI] [PubMed] [Google Scholar]
  82. Weissman B., Ekesan Ş., Lin H.-C., Gardezi S., Li N.-S., Giese T. J., McCarthy E., Harris M. E., York D. M., Piccirilli J. A.. Dissociative Transition State in Hepatitis Delta Virus Ribozyme Catalysis. J. Am. Chem. Soc. 2023;145:2830–2839. doi: 10.1021/jacs.2c10079. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Saville B. J., Collins R. A.. A site-specific self-cleavage reaction performed by a novel RNA in neurospora mitochondria. Cell. 1990;61:685–696. doi: 10.1016/0092-8674(90)90480-3. [DOI] [PubMed] [Google Scholar]
  84. Suslov N. B., DasGupta S., Huang H., Fuller J. R., Lilley D. M. J., Rice P. A., Piccirilli J. A.. Crystal Structure of the Varkud Satellite Ribozyme. Nat. Chem. Biol. 2015;11:840–846. doi: 10.1038/nchembio.1929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Ganguly A., Weissman B. P., Giese T. J., Li N.-S., Hoshika S., Rao S., Benner S. A., Piccirilli J. A., York D. M.. Confluence of theory and experiment reveals the catalytic mechanism of the Varkud satellite ribozyme. Nat. Chem. 2020;12:193–201. doi: 10.1038/s41557-019-0391-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Winkler W. C., Nahvi A., Roth A., Collins J. A., Breaker R. R.. Control of gene expression by a natural metabolite-responsive ribozyme. Nature. 2004;428:281–286. doi: 10.1038/nature02362. [DOI] [PubMed] [Google Scholar]
  87. Klein D. J., Ferré-D’Amaré A. R.. Structural basis of glmS ribozyme activation by glucosamine-6-phosphate. Science. 2006;313:1752–1756. doi: 10.1126/science.1129666. [DOI] [PubMed] [Google Scholar]
  88. Liu Y., Wilson T. J., McPhee S. A., Lilley D. M. J.. Crystal structure and mechanistic investigation of the twister ribozyme. Nat. Chem. Biol. 2014;10:739–744. doi: 10.1038/nchembio.1587. [DOI] [PubMed] [Google Scholar]
  89. Roth A., Weinberg Z., Chen A. G., Kim P. B., Ames T. D., Breaker R. R.. A widespread self-cleaving ribozyme class is revealed by bioinformatics. Nat. Chem. Biol. 2014;10:56–62. doi: 10.1038/nchembio.1386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Gaines C. S., York D. M.. Ribozyme Catalysis with a Twist: Active State of the Twister Ribozyme in Solution Predicted from Molecular Simulation. J. Am. Chem. Soc. 2016;138:3058–3065. doi: 10.1021/jacs.5b12061. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Gaines C. S., Giese T. J., York D. M.. Cleaning Up Mechanistic Debris Generated by Twister Ribozymes Using Computational RNA Enzymology. ACS Catal. 2019;9:5803–5815. doi: 10.1021/acscatal.9b01155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Ren A., Vusurovic N., Gebetsberger J., Gao P., Juen M., Kreutz C., Micura R., Patel D.. Pistol Ribozyme Adopts a Pseudoknot Fold Facilitating Site-specific In-line Cleavage. Nat. Chem. Biol. 2016;12:702–708. doi: 10.1038/nchembio.2125. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Weinberg Z., Kim P. B., Chen T. H., Li S., Harris K. A., Lünse C. E., Breaker R. R.. New classes of self-cleaving ribozymes revealed by comparative genomics analysis. Nat. Chem. Biol. 2015;11:606–610. doi: 10.1038/nchembio.1846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Wilson T. J., Liu Y., Li N. S., Dai Q., Piccirilli J. A., Lilley D. M.. Comparison of the structures and mechanisms of the pistol and hammerhead ribozymes. J. Am. Chem. Soc. 2019;141:7865–7875. doi: 10.1021/jacs.9b02141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Kostenbader K., York D. M.. Molecular simulations of the pistol ribozyme: unifying the interpretation of experimental data and establishing functional links with the hammerhead ribozyme. RNA. 2019;25:1439–1456. doi: 10.1261/rna.071944.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Ekesan Ş., York D. M.. Who stole the proton? Suspect general base guanine found with a smoking gun in the pistol ribozyme. Org. Biomol. Chem. 2022;20:6219–6230. doi: 10.1039/D2OB00234E. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Liu Y., Wilson T. J., Lilley D. M. J.. The structure of a nucleolytic ribozyme that employs a catalytic metal ion. Nat. Chem. Biol. 2017;13:508–513. doi: 10.1038/nchembio.2333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Gaines C. S., York D. M.. Model for the Functional Active State of the TS Ribozyme from Molecular Simulation. Angew. Chem., Int. Ed. 2017;56:13392–13395. doi: 10.1002/anie.201705608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Huang M., York D. M.. Linear free energy relationships in RNA transesterification: theoretical models to aid experimental interpretations. Phys. Chem. Chem. Phys. 2014;16:15846–15855. doi: 10.1039/C4CP01050G. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Chen H., Giese T. J., Huang M., Wong K.-Y., Harris M. E., York D. M.. Mechanistic Insights into RNA Transphosphorylation from Kinetic Isotope Effects and Linear Free Energy Relationships of Model Reactions. Chem. Eur. J. 2014;20:14336–14343. doi: 10.1002/chem.201403862. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Zhang, L. ; Han, J. ; Wang, H. ; Saidi, W. ; Car, R. ; E, W. . End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems. In Advances in Neural Information Processing Systems 31; Bengio, S. , Wallach, H. , Larochelle, H. , Grauman, K. , Cesa-Bianchi, N. , Garnett, R. , Eds.; Curran Associates, Inc., 2018; pp 4436–4446. [Google Scholar]
  102. Zhang D., Bi H., Dai F.-Z., Jiang W., Liu X., Zhang L., Wang H.. Pretraining of attention-based deep learning potential model for molecular simulation. Npj Comput. Mater. 2024;10:94. doi: 10.1038/s41524-024-01278-7. [DOI] [Google Scholar]
  103. Zeng J., Tao Y., Giese T. J., York D. M.. QDπ: A Quantum Deep Potential Interaction Model for Drug Discovery. J. Chem. Theory Comput. 2023;19:1261–1275. doi: 10.1021/acs.jctc.2c01172. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Gilmer, J. ; Schoenholz, S. S. ; Riley, P. F. ; Vinyals, O. ; Dahl, G. E. . Neural Message Passing for Quantum Chemistry. 2017, arXiv:1704.01212. [Google Scholar]
  105. Hofstetter A., Böselt L., Riniker S.. Graph-convolutional neural networks for (QM)ML/MM molecular dynamics simulations. Phys. Chem. Chem. Phys. 2022;24:22497–22512. doi: 10.1039/d2cp02931f. [DOI] [PubMed] [Google Scholar]
  106. Batatia, I. ; Batzner, S. ; Kovács, D. P. ; Musaelian, A. ; Simm, G. N. C. ; Drautz, R. ; Ortner, C. ; Kozinsky, B. ; Csányi, G. . The Design Space of E(3)-Equivariant Atom-Centered Interatomic Potentials. 2022, arXiv:2205.06643. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Drautz R.. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B. 2019;99:014104. doi: 10.1103/PhysRevB.99.014104. [DOI] [Google Scholar]
  108. Liu, Y. ; Wang, L. ; Liu, M. ; Zhang, X. ; Oztekin, B. ; Ji, S. . Spherical Message Passing for 3D Graph Networks. 2022, arXiv:2102.05013. [Google Scholar]
  109. Batzner S., Musaelian A., Sun L., Geiger M., Mailoa J. P., Kornbluth M., Molinari N., Smidt T. E., Kozinsky B.. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 2022;13:2453. doi: 10.1038/s41467-022-29939-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Hopkins C. W., Le Grand S., Walker R. C., Roitberg A. E.. Long-Time-Step Molecular Dynamics through Hydrogen Mass Repartitioning. J. Chem. Theory Comput. 2015;11:1864–1874. doi: 10.1021/ct5010406. [DOI] [PubMed] [Google Scholar]
  111. Berendsen H. J. C., Postma J. P. M., van Gunsteren W. F., Dinola A., Haak J. R.. Molecular dynamics with coupling to an external bath. J. Chem. Phys. 1984;81:3684–3690. doi: 10.1063/1.448118. [DOI] [Google Scholar]
  112. Darden T., York D., Pedersen L.. Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993;98:10089–10092. doi: 10.1063/1.464397. [DOI] [Google Scholar]
  113. Giese T. J., Panteva M. T., Chen H., York D. M.. Multipolar Ewald methods, 1: Theory, accuracy, and performance. J. Chem. Theory Comput. 2015;11:436–450. doi: 10.1021/ct5007983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Bogusz S., Cheatham III T. E., Brooks B. R.. Removal of pressure and free energy artifacts in charged periodic systems via net charge corrections to the Ewald potential. J. Chem. Phys. 1998;108:7070–7084. doi: 10.1063/1.476320. [DOI] [Google Scholar]
  115. Giese T. J., York D. M.. Ambient-Potential Composite Ewald Method for ab Initio Quantum Mechanical/Molecular Mechanical Molecular Dynamics Simulation. J. Chem. Theory Comput. 2016;12:2611–2632. doi: 10.1021/acs.jctc.6b00198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Shirts M. R., Chodera J. D.. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 2008;129:124105. doi: 10.1063/1.2978177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Li P., Jia X., Pan X., Shao Y., Mei Y.. Accelerated Computation of Free Energy Profile at ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semi-Empirical Reference Potential. I. Weighted Thermodynamics Perturbation. J. Chem. Theory Comput. 2018;14:5583–5596. doi: 10.1021/acs.jctc.8b00571. [DOI] [PubMed] [Google Scholar]
  118. Hu W., Li P., Wang J.-N., Xue Y., Mo Y., Zheng J., Pan X., Shao Y., Mei Y.. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semiempirical Reference Potential. 3. Gaussian Smoothing on Density-of-States. J. Chem. Theory Comput. 2020;16:6814–6822. doi: 10.1021/acs.jctc.0c00794. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Wang J.-N., Xue Y., Li P., Pan X., Wang M., Shao Y., Mo Y., Mei Y.. Perspective: Reference-Potential Methods for the Study of Thermodynamic Properties in Chemical Processes: Theory, Applications, and Pitfalls. J. Phys. Chem. Lett. 2023;14:4866–4875. doi: 10.1021/acs.jpclett.3c00671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  120. Xue Y., Wang J.-N., Hu W., Zheng J., Li Y., Pan X., Mo Y., Shao Y., Wang L., Mei Y.. Affordable Ab Initio Path Integral for Thermodynamic Properties via Molecular Dynamics Simulations Using Semiempirical Reference Potential. J. Phys. Chem. A. 2021;125:10677–10685. doi: 10.1021/acs.jpca.1c07727. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Wang J.-N., Liu W., Li P., Mo Y., Hu W., Zheng J., Pan X., Shao Y., Mei Y.. Accelerated Computation of Free Energy Profile at Ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semiempirical Reference Potential. 4. Adaptive QM/MM. J. Chem. Theory Comput. 2021;17:1318–1325. doi: 10.1021/acs.jctc.0c01149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  122. Wang M., Li P., Jia X., Liu W., Shao Y., Hu W., Zheng J., Brooks B. R., Mei Y.. Efficient Strategy for the Calculation of Solvation Free Energies in Water and Chloroform at the Quantum Mechanical/Molecular Mechanical Level. J. Chem. Inf. Model. 2017;57:2476–2489. doi: 10.1021/acs.jcim.7b00001. [DOI] [PubMed] [Google Scholar]
  123. Kingma, D. P. ; Ba, J. . Adam: A Method for Stochastic Optimization. 2017, arXiv:1412.6980. [Google Scholar]
  124. Zeng J., Cao L., Xu M., Zhu T., Zhang J. Z. H.. Complex reaction processes in combustion unraveled by neural network-based molecular dynamics simulation. Nat. Commun. 2020;11:5713. doi: 10.1038/s41467-020-19497-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Zeng J., Zhang L., Wang H., Zhu T.. Exploring the Chemical Space of Linear Alkane Pyrolysis via Deep Potential GENerator. Energy Fuels. 2021;35:762–769. doi: 10.1021/acs.energyfuels.0c03211. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

jp5c02006_si_001.pdf (159.3KB, pdf)

Articles from The Journal of Physical Chemistry B are provided here courtesy of American Chemical Society