. 2025 Apr 29;65(9):4367–4380. doi: 10.1021/acs.jcim.5c00341

Including Physics-Informed Atomization Constraints in Neural Networks for Reactive Chemistry

Shuhao Zhang †,*, Michael Chigaev ‡,§, Olexandr Isayev , Richard A Messerly ‡,, Nicholas Lubbers ⊥,*
PMCID: PMC12076496  PMID: 40298943

Abstract


Machine learning interatomic potentials (MLIPs) have emerged as powerful tools for investigating atomistic systems with high accuracy and relatively low computational cost. However, a common and unaddressed challenge with many current neural network (NN) MLIP models is their limited ability to accurately predict the relative energies of systems containing isolated or nearly isolated atoms, which appear in various reactive processes. To address this limitation, we present a mathematical technique for modifying any existing atom-centered NN architecture to account for the energies of isolated atoms. The result is a consistent prediction of the atomization energy (AE) of a system using minimal constraints on the model. Using this technique, we build a model architecture that we call the hierarchically interacting particle neural network (HIP-NN)-AE, an AE-constrained version of HIP-NN, as well as ANI-AE, an AE-constrained version of the accurate NN engine for molecular energies (ANI). Our results demonstrate the AE consistency of the AE-constrained models, which drastically improves their AE predictions. We compare the AE-constrained approach to unconstrained models, as well as models from the literature, in other scenarios, such as bond dissociation energies, bond dissociation pathways, and extensibility tests. These results show that the constraints improve model performance on some of these tasks and do not negatively affect performance on any task. The AE constraint approach thus offers a robust solution to the challenges posed by isolated atoms in energy prediction tasks.

Introduction

In computational studies of chemical reactions, the concept of the potential energy surface (PES), grounded in the Born–Oppenheimer approximation, allows for the prediction of system energy solely on the basis of nuclear coordinates. This makes the PES one of the most powerful tools for investigating reactions in silico. Quantum mechanical (QM) methods, which are based on solving the Schrödinger equation, provide an accurate means of modeling the PES of a given system. However, QM calculations are computationally expensive and suffer from unfavorable scaling with system size, making them impractical for many systems of interest. Reactive force fields, while more computationally efficient, are considerably less accurate and less transferable.1–3 Therefore, developing a method that can rapidly and accurately describe the PES would be highly beneficial to a wide range of research fields, including chemistry, materials science, drug discovery, and physics.

Machine learning (ML) methods have demonstrated their transformative potential across numerous scientific and engineering disciplines.4–6 For example, atom-centered ML models are capable of accurately predicting a wide range of chemical properties.7–10 ML has also emerged as a powerful tool for constructing surrogate models that can describe the PES, both rapidly and accurately.11,12 While a variety of algorithms, such as kernel ridge regression, Gaussian process regression, and polynomial approximations, have been applied to model the PES,13–15 neural networks (NNs) have become the most successful and popular ML approach in this field.16 Since the breakthrough of the Behler–Parrinello network in modeling bulk silicon,17 researchers have developed several generations of NN-based ML interatomic potentials (MLIPs) with diverse architectures. These include atomic environment vector18-based deep learning models like the ANI network12 and DeepMD,19 as well as message-passing architectures such as AIMNet,20,21 HIP-NN,22 and MACE.23,24 These pioneering models have demonstrated strong capabilities in predicting energies and forces for equilibrium and near-equilibrium conformers and have been successfully applied to multiple downstream tasks.25 However, MLIPs trained solely on near-equilibrium data can produce significant errors26 or even nonphysical predictions when applied to reactive systems.

Training a reactive MLIP across a diverse range of conformers involved in a chemical reaction requires a training set filled with nonequilibrium structures and their corresponding energy and force labels.27–29 However, adequately and efficiently sampling the reactive chemical space is a challenging task.26,30–32 Reactive systems, by nature, exhibit a vast diversity of molecular configurations, making it difficult to capture all relevant structures.33,34 Furthermore, the computational cost of labeling a sufficiently large number of reactive conformers using high-level QM methods is intractable.35 Labeling a reactive data set is especially challenging, as structures with partially or fully broken bonds require unrestricted or multireference QM calculations. Unrestricted calculations are prone to spin contamination,36 while multireference calculations require system-specific user decisions regarding active space, rendering such methods difficult to implement in a self-consistent, fully automated fashion.37 Although methods like active learning38–41 have been developed to improve the efficiency of data set sampling, there is growing interest in developing approaches that embed fundamental physical insights into MLIPs directly, rather than simply expanding the data set.42,43

A key deficiency of MLIPs in modeling reactive systems stems from the lack of inherent physical principles. Physics-informed ML, an emerging methodology in ML model development, has shown significant success in enhancing ML research across various fields. Based on previous studies,44 both observational biases (data-driven) and inductive biases (model-driven) have proven to be effective ways of incorporating physical knowledge into ML models.45 In other words, embedding physical principles, such as conservation laws and constraints, directly into the model architecture can not only improve the model accuracy but also provide additional guidance during the training process.46 This approach allows the model to learn from both the data and the underlying physical laws, leading to more reliable and interpretable predictions, particularly in complex systems such as chemical reactions. The premise of this study is that enforcing a physics-informed constraint on the MLIP atomization energy (AE) will also improve reaction energy predictions.

While reaction energies are typically computed based on the difference in single-point energy (SPE), multiple state functions can be used for the same purpose. For example, the reaction energy can also be computed based on AE, as illustrated in Figure 1. Therefore, although most MLIPs are trained to predict SPE values, an MLIP trained to predict AE values would also result in reliable reaction energy predictions. This insight is especially important when considering that, compared to SPE, AE is inherently subject to clearer physics-based constraints. For instance, the AE of a single isolated atom must be exactly zero. As such, the AE is an attractive target for incorporation into MLIP architectures using a physics-informed learning approach.

Figure 1.

Methods of computing energy differences between conformations. The energy of a chemical process can be evaluated in several ways. First, the reaction pathway (middle) describes how a chemical process takes place in nature. Ab initio calculations (bottom route, blue) define single-point energies in terms of the energy of nuclei and electrons. Alternatively, the atomization process (top route, pink) can be used to link two conformations by atomizing the system to a reference energy state in which the energy of interaction is exactly zero. The latter forms an elegant basis for mathematically constraining ML models.

Related work has considered constraining the AE of MLIPs. SNAP47 first implemented a constraint on the formation energy by modifying its descriptors with constant offsets based on a vacuum atom. This scheme is readily extendable to any linear model and was also used by linear ACE.48 However, the question is more complex for nonlinear and NN models because atomization constraints cannot be achieved by preprocessing the features; effective per-species contributions shift during training. Body ordering has been considered in BotNet49 and its descendant, MACE,23 by carefully constraining the interaction functions so that contributions from individual atoms, pairs, triples, etc., can be identified. Alternatively, SWANI50 modified each individual NN layer with a bias-free approach51 from the image processing literature, although this approach alone exhibited drawbacks and required further architectural modifications for improved performance.

In this article, we introduce a generic physics-informed AE constraint procedure that produces constraints on the predictions of isolated atoms. Compared with prior approaches, the constraint is implemented as a minimal modification of the network architecture, and this procedure can be applied to any atom-centered NN architecture. We use this procedure to produce the hierarchically interacting particle neural network (HIP-NN) with AE (HIP-NN-AE), a modified version of the HIP-NN model.22 While this additional constraint is designed to ensure accurate AE predictions for isolated atoms, we observe substantial improvements in AE predictions for a wide range of molecules. As a result, compared to the unconstrained HIP-NN architecture, HIP-NN-AE demonstrates superior performance in predicting reaction-related properties, such as full AE and bond dissociation energy, and no deficit in the prediction of other properties, such as conformational energy variations. We also validate the approach on an additional MLIP architecture, namely, accurate NN engINe for molecular energies (ANAKIN-ME, aka ANI). The modified version of ANI with an AE constraint, termed ANI-AE, shows similar trends in performance improvements.

Methods

Background on Energy Normalization

A PES function E(z, R), which describes the interactions between atoms of chemical species z at positions R, has an important invariance to shifts of the form

$$E(z, R) \;\rightarrow\; E(z, R) + C \tag{1}$$

where C is constant with respect to positions. In particular, it is natural to consider chemical energy shifts (CESs) of the form

$$C = \sum_i E_{z_i} \tag{2}$$

where z_i denotes the species of atom i, and so E_z gives a constant per-atom contribution for each species z; this parametrizes the possible energy shifts compatible with an extensive potential energy function. Applying a CES has no effect on experimentally observable phenomena so long as the process in consideration does not affect the identity of the nuclei at hand. As such, the overall CES has not played an important role in the parametrization of quantum-mechanical or classical potential energy functions for chemical physics; it is predominantly bookkeeping, which must be accounted for in comparing energies between methods.

However, MLIPs have reintroduced the consideration of energy normalization, as normalization can play a role in training dynamics in addition to comparisons between methods. SNAP52 and linear ACE48 assigned the E_z contributions exactly in order to match the CES associated with the ab initio theory. The latter examined the performance difference between using the AE baseline (which they term the E0 normalization) and fitting the CES parameters empirically (that is, fitting the training target to the average data set energy, which they term the AVG normalization) and showed that setting the CES ahead of time with the E0 normalization was preferable. This is simple to do in a linear model: it can be accomplished by preprocessing a constant shift to the features so that an atom in a vacuum state has features that are identically zero, and setting the bias term of the linear regression to the CES required by the ab initio theory.
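The empirical AVG normalization described above amounts to an ordinary least-squares fit of per-species energies E_z against the species counts of each structure. A minimal sketch (the function name and data layout are illustrative, not from the paper):

```python
import numpy as np

def fit_species_energies(compositions, energies, species):
    """Fit per-species reference energies E_z by least squares,
    so that E_total ~ sum_i E_{z_i} (the empirical AVG-style CES).
    compositions: list of lists of element symbols per structure."""
    # Design matrix: counts of each species in each structure.
    counts = np.array(
        [[comp.count(z) for z in species] for comp in compositions], dtype=float
    )
    e_z, *_ = np.linalg.lstsq(counts, np.asarray(energies, dtype=float), rcond=None)
    return dict(zip(species, e_z))
```

Subtracting the fitted per-species sums from the total energies yields the AVG-normalized training targets; the E0 normalization instead fixes each E_z to the ab initio isolated-atom energy.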

However, CESs are subtler in nonlinear models, such as NNs. In these models, a shift of the input features does not produce a linear change in the output energy, and during training, a change in one parameter will generally interact with the values of other parameters to produce an effective CES. It is thus not possible to preprocess the CES out of an NN model. However, the effect of the energy normalization applied to the data set on the training of NNs has been examined.49 There, it was shown that training networks to reproduce atomization energies by applying a CES to the training data improves their generalization performance compared to training with the standard machine-learning practice of preprocessing to produce a zero-mean, unit-variance training target.

In this work, we improve upon these ideas by implementing a CES that is dynamic during training, as described in the following section.

Review of the HIP-NN Architecture

First, we briefly review the HIP-NN model architecture.22,53,54 HIP-NN can be thought of variously as a message-passing NN,55 a continuous filter convolutional network,56 or a graph NN.57 Particles indexed by i are fed into the network, including the positions r_i and species z_i, which are encoded as a one-hot variable z^0_i. The network generates new features for each atom, z_{n+1}, from features z_n, where n indexes layers, using a standard NN update, z_{n+1} = f(z_n), where f is an activation function. HIP-NN uses two types of layers: on-site layers and interaction layers, where the former is a subset of the latter restricted only to variables within a single particle i. The interaction term, which provides the crucial representation of particle configuration to the network, is given by the equation

$$z^{a}_{n+1,i} = \sum_{\nu, b} V^{ab}_{\nu} \sum_{j} s_{\nu}(r_{ij})\, z^{b}_{n,j} \tag{3}$$

where r_ij gives the interparticle distance, j indexes neighbors of particle i, and the sensitivity functions s_ν(r_ij), indexed by ν, encode interparticle distances as a feature vector of length N_ν. The sensitivity functions are constructed to decay to 0 for some cutoff, r_ij > r_cut, so that eq 3 specifies a local interaction. The learnable interaction tensor V^ab_ν specifies how to combine the sensitivities and features of neighboring particles to generate new features.
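As a concrete illustration, eq 3 can be written as a few lines of NumPy. This is an illustrative sketch with a naive O(N²) neighbor loop, not the reference HIP-NN implementation:

```python
import numpy as np

def interaction_layer(z, r, V, sensitivities, r_cut=5.0):
    """Sketch of the eq 3 interaction: new features for atom i from
    neighbor features z[j], weighted by distance sensitivities.
    z: (N_atoms, N_feat) features; r: (N_atoms, 3) positions;
    V: (N_nu, N_out, N_feat) learnable interaction tensor;
    sensitivities: list of N_nu callables s_nu(r_ij)."""
    n = len(r)
    rij = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    out = np.zeros((n, V.shape[1]))
    for i in range(n):
        for j in range(n):
            if i == j or rij[i, j] >= r_cut:
                continue  # s_nu decays to 0 beyond r_cut -> local interaction
            s = np.array([s_nu(rij[i, j]) for s_nu in sensitivities])  # (N_nu,)
            # sum over nu and input feature b: V[nu, a, b] * s[nu] * z[j, b]
            out[i] += np.einsum("nab,n,b->a", V, s, z[j])
    return out
```

Because the output for atom i depends only on distances to its neighbors, the layer inherits the translational, rotational, and permutational symmetries discussed in the text.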

This form of interaction is completely local in nature and symmetric under rotation and translation of the system at hand. None of the interactions depend explicitly on the index values i, and so the activations z_{n,i} are equivariant with respect to permutation symmetry. The tensor sensitivity variation of HIP-NN53 (HIP-NN-TS) extends eq 3 to account for angles between neighbors, which improves the accuracy of the network for energy regression while retaining the invariance with respect to rotation. The HIP-HOP-NN variation utilizes high-order polynomials, which directly capture many-body features in the environment in an invariant way.54 We employ the HIP-NN-TS variation in this study but write HIP-NN for brevity.

Modeled energies Ê in HIP-NN are constructed using linear regression from the features arising from a set of layers throughout the network, with contributions E_i for each atom, and E^n_i further specifying the contribution from each layer n to the energy

$$\hat{E} = \sum_i E_i = \sum_i \sum_n E^{n}_i, \qquad E^{n}_i = \sum_a w^{a}_n z^{a}_{n,i} + b_n \tag{4}$$

where w^a_n are learnable weights and b_n is a learnable bias. Due to the one-hot encoding representation for z^0_i, the first term, E^0_i, specifies a per-species contribution and thus constitutes a form of CES given by the weights w^a_0, where a indexes features that, in this case, are in direct correspondence with species. This gives it a special role with regard to our atomization-constrained variant model, which we now discuss.
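The readout of eq 4 is a per-layer linear regression, summed over layers and atoms. A minimal sketch, with an assumed list-of-arrays layout for the layer features:

```python
import numpy as np

def hierarchical_energy(layer_features, weights, biases):
    """Sketch of the eq 4 readout: E = sum_i sum_n (w_n . z_{n,i} + b_n).
    layer_features: list over layers n of (N_atoms, N_feat_n) arrays;
    weights: list of (N_feat_n,) arrays; biases: list of scalars.
    Returns the total energy and the per-atom contributions E_i."""
    per_atom = np.zeros(layer_features[0].shape[0])
    for z_n, w_n, b_n in zip(layer_features, weights, biases):
        per_atom += z_n @ w_n + b_n  # E^n_i contribution from layer n
    return per_atom.sum(), per_atom
```

With one-hot input features at layer 0, the first term of this sum reduces to a lookup of per-species constants, which is exactly the CES role of E^0_i described above.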

Consistent AE Constraint

In order to improve AE prediction, we focus on the commonly used setting where energy is predicted as a sum over atom-centered contributions, that is

$$E = \sum_i E_i \tag{5}$$

This setting is common to a large number of NN MLIPs, including both the Behler–Parrinello and message-passing NN families.16 We will show in the section Results and Discussion that these architectures do not produce accurate atomization energies, even when trained on data that uses an AE baseline.

The principal inconsistency begins with the fact that, although in the atomization baseline evaluating the energy of an isolated system should return identically zero, NN models trained on this baseline produce nonzero energy predictions. This is in spite of the fact that, near the training data, the energy predictions match the baseline. This creates a contradiction between the AE as viewed by the model’s single-point output and as viewed as the difference between single-point and atomized states. When NN MLIPs are applied to atomized states, they predict nonzero values, sometimes off by up to hundreds of kcal/mol. The difference between the state energies does not correspond to the prediction of the model when one of the states is the reference state for the energy baseline. In other words, there is a residual per-species contribution to the energy that remains even when an atom has no neighbors within the cutoff radius of the MLIP. Many methods incorporate per-atom energy shifts in their model forms, including but not limited to HIP-NN,22,53 ANI,12,38,58 PhysNet,59 AIMNet,60 the transformer interatomic potential (TrIP),61 and the neural equivariant interatomic potential (NequIP).62 This can occur as part of the data normalization, such as subtracting best-fit per-species energies from all data points. It can also occur as an explicit component of the learnable model parameters (e.g., the bias parameter in the final layers of Behler–Parrinello models with per-species networks) as well as through indirect contributions (that is, the net effect of other weights, biases, and nonlinearities on the model output), both of which will vary over the course of training.

Some studies have attempted to properly model free atoms by including in the training data set diatomic systems where the two atoms are separated by very large distances.27,63 While previous work added over 1000 diatomic systems,63 the number of data points needed to reach convergence of the AE is unknown and likely a function of other variables, such as the network architecture and the training data set. Adding too many diatomic systems risks lowering the overall quality and diversity of the training data set. By contrast, imposing a direct constraint on the model does not require augmenting the data set with a somewhat arbitrary number of distant diatomics.

This inconsistency can be rectified by using the AE baseline for both the model and the data in tandem. For a given input conformer, we first follow the standard training workflow, calculating the chemical environment embedding of each atom and feeding it into the NN to obtain the per-atom energy prediction. Next, we generate an in-vacuum embedding for each atom using an isolated atomic system and feed these embeddings into the same NN to obtain the in-vacuum energy prediction for each atom. As shown in Figure 2, the final molecular energy prediction is the sum of the per-atom energy predictions minus the sum of the in-vacuum energy predictions. This approach ensures that the AE prediction for a single atom is correct because, in that case, the chemical environment embedding and the in-vacuum embedding are identical. As a result, the final prediction for the single-atom conformer is zero, which aligns with the physical definition of the AE.
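The workflow above can be sketched generically: evaluate the raw per-atom energies with the same network both in the molecular environment and in vacuum, and subtract. Here `per_atom_energy` is a hypothetical stand-in for the full embedding-plus-NN pipeline, with `env=None` denoting an isolated atom:

```python
def constrained_ae(per_atom_energy, species, environments):
    """Sketch of the AE constraint. per_atom_energy(z, env) is a
    hypothetical callable returning the raw network's energy for an
    atom of species z in a given chemical environment; env=None
    denotes the in-vacuum (isolated-atom) evaluation."""
    # Per-atom predictions in the molecular environment.
    e_env = sum(per_atom_energy(z, env) for z, env in zip(species, environments))
    # In-vacuum predictions for the same atoms, from the same network.
    e_vac = sum(per_atom_energy(z, None) for z in species)
    # For a single isolated atom the two terms are identical, so the
    # AE prediction is exactly zero by construction.
    return e_env - e_vac
```

Note that the vacuum term uses the same parameters as the environment term, which is what removes both the explicit per-species biases and the indirect baseline shifts from the prediction.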

Figure 2.

Illustration of the constrained model in the process of predicting the AE of a water molecule. The three colored bars on top of the atoms in the water molecule represent the chemical environment embedding vectors of each atom, while the two colored bars on top of the isolated atoms on the right represent the in-vacuum embedding vectors. After passing through the same NN, we obtain both the per-atom energy predictions and the in-vacuum atomic energy predictions. The final molecular AE prediction is the difference between the sums of these two sets of predictions.

In equations, this procedure can be written as follows. The atomization-constrained NN predicts the AE, Ê^AE, in terms of the per-atom energies E_i of the raw, unconstrained network and their vacuum responses E_i^vac, as

$$\hat{E}^{\mathrm{AE}} = \sum_i \left( E_i - E^{\mathrm{vac}}_i \right) \tag{6}$$

Note that the above equation does more than remove the terms E^0_i, which are particular to some architectures, from the prediction. This equation also removes the indirect changes to the energy baseline of the model, which are present generically across NN models. The indirect effect is caused by the combined effect of all species-based inputs, weights, and bias terms in each layer of the network, all of which are included in E_i^vac. Thus, our scheme differs greatly from bias-free NN methods,50 which constrain each layer to produce an output of zero for the vacuum state. In contrast, the atomization scheme here constrains the network with exactly one constraint for each species in the training set. This scheme is thus a minimal constraint to effect an isolated atom energy of zero; because of the locality of eq 3, when all atoms have separated to a large enough distance, the NN activations in sum produce E_i → E_i^vac, and so Ê^AE → 0.

For polyatomic systems, the difference between E_i and E_i^vac can be interpreted as the per-atom contribution to the total AE of the molecule. This enables consistent training to AEs calculated using QM methods. Additionally, this constraint does not directly affect how the forces on each atom are computed; they are still obtained from the derivative of the final energy prediction with respect to the input coordinates.

During training, E_i^vac is not constant; it is a function of the NN parameters. Thus, it is important that automatic differentiation be used to calculate parameter gradients through the entire Ê^AE prediction. The NN parameters are then updated in such a way that the AE constraint is exactly obeyed by the network as it evolves through training, and per-species constant contributions cannot be used by the network to explain (in the statistical sense of the word) the QM energy. Whatever CES is produced in the raw network is eliminated in Ê^AE, so it only remains to train Ê^AE to match the reference atomization energies in a training data set. Thus, by using differentiable E_i^vac calculations, we train the AE of the model to match the AE of the data set, producing a model that is consistent with respect to the atomization process.

This modification does incur some additional computational cost during training. In practice, however, the increase is minimal: the vacuum contributions need only be evaluated once per species in the batch and can then be broadcast to the appropriate atoms in the training batch. During inference, the parameters of the NN are fixed, and therefore the E_i^vac are constant terms. Thus, after training, the E^0_i terms in HIP-NN, which are completely canceled via eq 6, can be adjusted to include the vacuum term E_i^vac directly, with zero computational cost over the original HIP-NN model form. A similar approach can be applied to ANI by adjusting the bias terms of the final layers of the NNs for each species of atom. In summary, our AE-constraint scheme increases training costs by a small amount but does not incur any additional computational cost at inference time.
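The inference-time simplification amounts to folding each species' now-constant vacuum energy into its per-species bias. A minimal sketch; the dict-based parameter layout is an assumption for illustration, not the actual HIP-NN or ANI parameter storage:

```python
def fold_vacuum_into_bias(species_bias, vacuum_energy):
    """After training, subtract each species' (now constant) vacuum
    energy E_z^vac from its per-species bias term, so that inference
    needs no extra vacuum evaluation. Both arguments are assumed to be
    dicts mapping species symbol -> scalar."""
    return {z: b - vacuum_energy[z] for z, b in species_bias.items()}
```

After this one-time adjustment, the constrained model evaluates with exactly the same cost and code path as the unconstrained model form.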

We applied this same scheme to HIP-NN and ANI architectures. We refer to HIP-NN with our AE constraint as HIP-NN-AE and to ANI with the AE constraint as ANI-AE, whereas HIP-NN denotes the unconstrained model. However, we reiterate that this approach does not depend at all on the details of HIP-NN or ANI, but only on the existence of atom-centered contributions to energy. Thus, the AE constraint could be easily applied to many other atom-centered NN MLIPs. If an explicit model parameter for a CES does not exist, then the architecture would need to be modified to account for this, but this is a trivial addition compared with the complexity of MLIP architectures.

Training Data Set

In this study, we train on data from the ANI-1x data set.64 Although the complete data set contains approximately 5 M total structures computed at varying levels of theory, we follow recent works65–68 in training our potentials to a subset of approximately 500k structures with their corresponding ωB97/6-31G(d) DFT energy and forces. Similar to Kovács et al.,65 we use the subset selected via active learning,25,38 which neatly provides a smaller but still extensive data set on which a transferable potential can be trained. To compare the constrained HIP-NN-AE model with the unconstrained HIP-NN, we trained both architectures on the same data set and designed multiple test cases to evaluate different aspects of their performance. Please refer to the Supporting Information for details on model training. The test results are presented below.

Model Training

We performed a hyperparameter search for both HIP-NN and HIP-NN-AE (see Supporting Information for details of the hyperparameter search). Once the hyperparameters were decided, we trained six replica models for both types of models and removed the model with the largest validation error at the end of the training process, yielding five models for each approach to measure accuracy and quantify the uncertainty of that accuracy. The standard deviations reported in the results below are the standard deviations among the five replica models of both the HIP-NN and HIP-NN-AE procedures, and the visualized results show the output from a single model. To avoid biasing our decision, we simply used the first model trained. No hyperparameter search was performed for ANI-AE. Instead, to allow for a meaningful comparison between ANI-AE and the published ANI-1x potential, the ANI-AE architecture, hyperparameters, and learning profile are the same as for ANI-1x.64

Results and Discussion

AE Prediction

Here, we validate that HIP-NN-AE provides physically meaningful predictions for the fully atomized system, i.e., when all atoms are moved infinitely apart. For this purpose, we performed full-atomization scans for four different molecules, i.e., where the distances of all atoms are scaled relative to the minimum-energy distances. Figure 3 compares the full-atomization scans predicted by HIP-NN, HIP-NN-AE, and ANI-AE, all when trained on the subset of ANI-1x, a nonreactive data set that does not include the breaking of chemical bonds. Also included in Figure 3 are the full-atomization scans for the ANI-1x model,38 the TrIP,61 a version of NequIP,62,69 AIMNet2,20,21 and MACE-OFF23.70 All models were trained on the ANI-1x data set (either the subset or the complete data set), except AIMNet2 and MACE-OFF23, which were trained on data labeled with a similar DFT protocol to ANI-1x. These literature MLIPs were chosen to represent a diverse set of model constructs. ANI-1x is a feed-forward NN based on modified Behler–Parrinello symmetry functions, AIMNet utilizes learnable atomic feature vectors, TrIP is based on the SE(3)-transformer,71 NequIP is a fully equivariant message-passing NN, and MACE includes higher-order hierarchical many-body terms.23 For each of these models, the atomization baseline energies of the DFT training set are used to determine the offset self-energy. Five of the unconstrained models (HIP-NN, ANI-1x, TrIP, NequIP, and AIMNet2) incorrectly converge to arbitrary energies at full atomization. Thus, while these models perform well near equilibrium, they fail drastically to reproduce the AE. By design, HIP-NN-AE and ANI-AE correctly converge to zero when the molecule is completely atomized.
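The full-atomization scans described here scale all interatomic distances by a common factor relative to the minimum-energy geometry. A minimal sketch of generating such dilated geometries; dilating about the centroid is an assumption, since scaling about any fixed point produces the same interatomic distances:

```python
import numpy as np

def atomization_scan(coords, scale_factors, center=None):
    """Generate dilated geometries for a full-atomization scan:
    every interatomic distance r is scaled to s * r relative to the
    input (minimum-energy) geometry, for each scale factor s."""
    coords = np.asarray(coords, dtype=float)
    if center is None:
        center = coords.mean(axis=0)  # dilate about the centroid
    return [center + s * (coords - center) for s in scale_factors]
```

Evaluating a model on such a sequence of geometries and subtracting the isolated-atom reference energies traces out the curves shown in Figure 3; a consistent model must approach zero AE as the scale factor grows.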

Figure 3.

Complete atomization of several molecular structures (aspirin, d-glucose, toluene, nitroglycerin). Comparison between the unconstrained HIP-NN model, the atomization-energy constrained HIP-NN model (HIP-NN-AE), and the atomization-energy constrained ANI model (ANI-AE), trained on a subset of the ANI-1x data set. Also included are several models from the literature: ANI-1x, TrIP, NequIP, AIMNet2, and MACE-OFF23 (medium). All models were trained on the ANI-1x data set, except AIMNet2 and MACE-OFF23, which were trained on data labeled with a similar DFT protocol to ANI-1x. The self-energy, Eself, is defined as the sum of the DFT energies for the isolated C, H, N, and O atoms. The scale factor is defined as the ratio between the interatomic distances r and the minimum-energy interatomic distances r0.

This phenomenon is not data set-dependent and can be reproduced on arbitrary data sets. For example, HIP-NN and HIP-NN-AE show similar results when trained on QM7-x72 (see Figure S1). The improvement achieved by including the AE constraint with ANI also demonstrates that the impact of our methodology is not limited to the HIP-NN architecture. However, not all existing models have this difficulty. In particular, the MACE architecture23 is built around a detailed notion of body order for all energy contributions and thus supplies consistent AE curves (see Supporting Information for full atomization scans with additional MACE models).

The agreement between all models near equilibrium demonstrates on a qualitative level that the AE constraint does not significantly alter the model’s ability to fit the near-equilibrium training data. Upon quantitative examination, the testing errors of HIP-NN and HIP-NN-AE were very similar. Fluctuations between different random seeds of approximately ±10% were definitively larger than any difference between the model architectures, both of which achieved a mean absolute error (MAE) of approximately 0.75 kcal/mol. The training MAE for HIP-NN-AE models was approximately 0.5 kcal/mol, approximately twice that of the unmodified HIP-NN, which to some degree indicates that the AE approach mitigates the overfitting gap between training and testing, thus preventing the model from learning spurious values from the training data.

In the intermediate region of scale factors, around 1.5–4, Figure 3 shows that all models exhibit spurious behaviors, i.e., nonphysical oscillations, small peaks, and/or nonsmooth convergence. Similar undesirable behaviors are shown in the Supporting Information for MACE literature models24,70,73 and for HIP-NN and HIP-NN-AE trained on QM7-x. The strong variance in intermediate-scale behavior across all models underscores the lack of data in this regime. This can be argued to be due in part to the lack of physical significance of dilated molecules; however, it is also difficult to obtain meaningful energies for any kind of bond-breaking data without resorting to multireference QM.

Bond Dissociation Energy Prediction

To further validate our hypothesis that improvement in AE predictions can facilitate other types of relative energy prediction tasks, we investigate the bond dissociation energies (BDE) of bond-breaking reactions. Although atomization energies and BDEs are related properties, BDEs serve as an out-of-sample test case for our models that were trained solely on atomization energies. Furthermore, BDEs serve as fundamental and practical properties for modeling a variety of complicated reactions. Moreover, breaking a single bond forms fragments in a wide range of sizes, making it possible to investigate the correlation between model performance and fragment size.

We test HIP-NN-AE, HIP-NN, ANI-AE, and various literature MLIPs against the BDE database (called BDE-db),74,75 which contains ≈290k unique single-bond dissociation reactions of small organic molecules and their corresponding BDEs calculated by the M06-2x/def2-TZVP method. This DFT protocol was selected for achieving near coupled-cluster accuracy on BDEs at a fraction of the computational cost.74 All molecules in BDE-db contain H, C, N, and O only, and the distribution of the molecules is similar to our training set. All of the reactions in the BDE-db data set start from a closed-shell molecule that dissociates into two open-shell fragments. Thus, the BDE is computed by subtracting the reactant energy (either SPE or AE) from the sum of the fragment energies. As the spin states for closed-shell molecules and open-shell radicals are well-defined, single-determinant DFT is sufficient to model these systems. Our computed BDEs are compared with the electronic BDEs in BDE-db, rather than the enthalpy BDEs. Because our training set (ANI-1x) was computed with a different level of theory [ωB97x/6-31G(d)], we must exercise some caution when comparing our models with BDE-db. Since BDEs are energy differences rather than absolute energies, a comparison is still meaningful. However, none of the models considered (HIP-NN-AE, HIP-NN, ANI-AE, and the literature MLIPs) were trained on radical structures or bond dissociations. Because comparing BDEs can be ill-conceived if the molecular geometry changes significantly between DFT and the MLIP,48 we utilize the DFT-optimized reactant and fragment geometries offered by BDE-db.
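The BDE evaluation described above reduces to a simple energy difference, provided the reactant and fragment energies come from the same baseline (SPE or AE, used consistently):

```python
def bond_dissociation_energy(e_reactant, e_fragment1, e_fragment2):
    """BDE for A-B -> A + B, as described in the text: the reactant
    energy is subtracted from the sum of the two fragment energies.
    All three energies must share one baseline (SPE or AE)."""
    return (e_fragment1 + e_fragment2) - e_reactant
```

Any CES shared consistently between reactant and fragments cancels in this difference, which is why cross-method BDE comparisons remain meaningful despite differing absolute energy scales.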

Table 1 provides the MAE and mean signed error (bias) between the BDE-db DFT values and the predicted BDE values for each MLIP trained in this work (HIP-NN-AE, HIP-NN, and ANI-AE) and select literature MLIPs (ANI-1x, TrIP, NequIP, and MACE-OFF23). In addition, Figure 4 presents a 2D histogram comparing the DFT BDEs and the predicted BDEs for the HIP-NN and HIP-NN-AE models.

Table 1. Performance Comparisons on the Bond Dissociation Energy Dataset (BDE-db)a.

  MAE (kcal/mol) bias (kcal/mol)
HIP-NN-AE 9.2 ± 1.1 3.1 ± 1.5
HIP-NN 28.8 ± 6.2 23.1 ± 5.8
ANI-AE 24.8 –24.7
ANI-1x 57.2 –56.7
TrIP 20.0 15.9
NequIP 47.2 46.7
MACE-OFF23 8.9 2.0
a Values reported are MAE and mean signed error (bias) in kcal/mol. Comparison between the MLIPs trained in this work (HIP-NN-AE, HIP-NN, and ANI-AE) and select literature MLIPs (ANI-1x, TrIP, NequIP, and MACE-OFF23) tested against the BDE-db DFT values. Errors for HIP-NN models are given as the standard deviation over five trained models.

Figure 4.

2D histogram comparing predicted BDEs for HIP-NN and HIP-NN-AE with DFT BDE in the BDE-db data set. The AE constraint demonstrates remarkable improvements in BDE, improving the bias by a factor of around eight and improving the overall error by a factor of more than three.

Clearly, the MAEs for HIP-NN-AE (9.2 kcal/mol) and ANI-AE (24.8 kcal/mol) are too large to recommend either model as a quantitative BDE prediction tool. The discrepancy in DFT methodologies between the training data set, ANI-1x, and BDE-db likely accounts for approximately 3–7 kcal/mol error.74 In addition, recall that the ANI-1x training data set does not include any radical structures or dissociated molecules. Considering these two key features of the training data set, the MAE for HIP-NN-AE of approximately 9 kcal/mol is actually quite remarkable. In fact, the MAE for HIP-NN-AE is nearly equal to the MAE for the MACE-OFF23 model (8.9 kcal/mol, although MACE-OFF23 was not trained on the same data set). By contrast, the errors for the other unconstrained models (HIP-NN, ANI-1x, TrIP, and NequIP) are far larger. These large errors are not a surprising result, as none of these models were intended for BDE prediction.

More importantly, the atomization-energy constraint greatly reduces the error compared with the unconstrained model with the same architecture. For example, the MAE for HIP-NN-AE is approximately one-third as large as the MAE for HIP-NN (29 kcal/mol), while the MAE for ANI-AE is approximately half as large as the MAE for ANI-1x (57 kcal/mol). However, our objective is not to provide HIP-NN-AE or ANI-AE as a BDE prediction tool but rather to demonstrate that the AE constraint has clear advantages for modeling bond dissociations in the absence of reactive data. Developing large reactive data sets that adequately cover all dissociation pathways and label structures with multireference QM energies and forces is extremely challenging. While training to such a reactive data set would almost certainly improve the prediction of BDEs for both unconstrained and AE-constrained models, this result shows how physical constraints can alleviate data set deficiencies. Since a more rigorous training approach was applied for HIP-NN-AE (i.e., a hyperparameter search and ensemble training) than for ANI-AE, the remaining results and discussion focus on HIP-NN-AE.

The improvement with HIP-NN-AE compared to HIP-NN is even more evident when considering the tendency to over- or underpredict the true BDE. A useful indicator for this purpose is the bias, i.e., the mean signed error. The unconstrained HIP-NN model significantly overpredicts the BDE, similar to TrIP and NequIP, as evidenced by a positive bias of approximately 23 kcal/mol. By contrast, HIP-NN-AE has a bias of only 3 kcal/mol, similar to the bias of MACE-OFF23 (2 kcal/mol), demonstrating that the AE constraint effectively removes the general tendency to overpredict BDEs. Furthermore, HIP-NN overestimates the BDEs for approximately 76% of BDE-db, while only 56% of the HIP-NN-AE predictions are overestimates.
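The two summary statistics used here can be sketched as follows (illustrative values, not BDE-db data):

```python
def bias_and_overestimate_fraction(predicted, reference):
    """Mean signed error (bias) and the fraction of overpredictions.

    A positive bias indicates that the model overpredicts on average;
    the fraction counts how many individual predictions are too high.
    """
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = sum(errors) / len(errors)
    frac_over = sum(e > 0 for e in errors) / len(errors)
    return bias, frac_over
```

Note that a near-zero bias can still coexist with a large MAE, which is why both metrics are reported in Table 1.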

In addition, HIP-NN predicts negative BDEs for 31 bonds, while HIP-NN-AE predicts only 8 negative BDEs. Therefore, the AE constraint significantly reduces the number of nonphysical negative BDEs. HIP-NN-AE also has fewer large outliers than HIP-NN, further demonstrating that the AE constraint yields BDEs that are more physically reasonable overall.

A deeper understanding of the improvement with the AE constraint is gained by comparing the error distributions for HIP-NN and HIP-NN-AE categorized by bond type, as shown in Figure 5. Indeed, prediction errors for all types of H–X bonds improve significantly (by 28–32 kcal/mol) for the constrained HIP-NN-AE model compared with the unconstrained HIP-NN model. The MAE error bars show not only that HIP-NN-AE performs better for H–X bonds but also that its MAE is far more consistent across random training trials. By contrast, the unconstrained HIP-NN model has extremely high MAEs for H–X bonds, and these MAEs are also highly variable. Improvement for other bond types is typically smaller than the combined uncertainties, although the error distributions for HIP-NN-AE appear to be centered closer to zero than those for HIP-NN. Every H–X bond dissociation produces a free H atom, which is well represented by HIP-NN-AE. Although HIP-NN-AE also properly represents free C, N, and O atoms, the BDE-db data set does not include any bond dissociations that produce free C, N, or O atoms. However, we now demonstrate that the improved BDEs for HIP-NN-AE are not strictly limited to bonds that form free atoms.

Figure 5.

Comparison of BDE prediction error distribution on each type of bond. MAEs are shown for each model, as well as error bars which show the standard deviation of this error across five independently trained models.

The improvement in BDEs for the constrained HIP-NN-AE model is especially evident for smaller inputs. To quantify this size effect, we propose an indicator that depends on the sizes of all reactants and products involved in the reaction. In this case, where the fragment sizes sum to the size of the reactant, the indicator need only grow with the reactant size Sm and with the ratio of the geometric mean to the arithmetic mean of the sizes of fragment 1, Sf1, and fragment 2, Sf2. The logarithm of the product of the two fragment sizes satisfies both requirements, so we use the following indicator of reaction size.

Isize = ln(Sf1 · Sf2)
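Assuming the indicator takes the form described in the text, ln(Sf1 · Sf2) with fragment sizes taken as atom counts, a short sketch confirms both monotonicity requirements:

```python
import math

def reaction_size_indicator(s_f1, s_f2):
    """ln(S_f1 * S_f2) for fragment sizes S_f1 and S_f2 (atom counts).

    Grows with the reactant size S_m = S_f1 + S_f2 at a fixed
    fragment-size ratio and, at fixed S_m, grows with the ratio of the
    geometric mean to the arithmetic mean of the fragment sizes.
    """
    return math.log(s_f1 * s_f2)

# Doubling the reactant at a fixed 1:12 split increases the indicator:
small = reaction_size_indicator(1, 12)
large = reaction_size_indicator(2, 24)
# At fixed S_m = 13, a more even split (higher GM/AM ratio) also increases it:
even = reaction_size_indicator(6, 7)
```

With fragments of roughly 1–25 atoms, this indicator spans approximately the range 2.5–5 discussed for Figure 6.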

Figure 6 demonstrates the improvement in BDE predictions made by the AE-constrained model compared to the unconstrained model. Looking closely at the distribution of prediction errors with respect to the reaction size indicator, i.e., the green and yellow contours, the errors made by the unconstrained model separate into two peaks. Reactions with a smaller size indicator (dark green) have significantly higher errors than those with larger indicators (bright yellow), indicating that the unconstrained model has a systematic error for small inputs. The main source of this systematic error in the unconstrained model is the error in the AE prediction of isolated hydrogen atoms, which are involved in more than half of the reactions in BDE-db. However, the projected contours on the YZ plane show that more than one peak exhibits such a systematic shift, indicating that isolated hydrogen is not the only cause. By contrast, for HIP-NN-AE, the error distributions (yellow and green contours on the YZ plane) for different reaction size indicators are centered at similar values, making the overall error distribution across reaction size indicators closer to a normal distribution. While the overall error distribution is closer to zero for HIP-NN-AE, the small-reaction-size peak shifts the most relative to unconstrained HIP-NN. This shift helps validate our assumption that the AE constraint improves BDE predictions, especially when small fragments are involved. Therefore, the improvements in HIP-NN-AE are not limited to isolated atoms. Moreover, as explained above, the MAE of HIP-NN-AE (approximately 9 kcal/mol) is partially attributable to the inherent difference between BDEs computed with two different DFT functionals and basis sets.

Figure 6.

3D distributions of prediction errors and reaction size indicators and projected contour plots to show the tendencies with fixed variables. The color of the contour plot indicates the magnitude of the fixed variable, i.e., brighter colors correspond to larger values for the fixed variable. For example, in the HIP-NN subplot, each contour plot on the YZ plane shows the distribution with a fixed reaction size. Specifically, the dark green peak is the average distribution with a reaction size indicator near 2.5, and the light yellow peak is the average distribution with a reaction size indicator near 5. Similarly, in the HIP-NN subplot, the light yellow peak on the XZ plane is the average distribution with BDE error near 30 kcal/mol, and the dark purple peak is the average distribution with BDE error near −30 kcal/mol.

While the AE-constrained models demonstrate clear improvements in the BDE values themselves, it is also important to investigate the bond dissociation pathways. Figure 7 examines the dissociation pathways for the methane C–H bond and the ethanol O–H bond. These bonds were chosen to allow comparison with two literature studies that report reactive ML potentials, namely, ANI-1xnr,39 and linear ACE.48 For a fair comparison with these studies, a rigid scan is performed in which the positions of all atoms are fixed except for the two atoms involved in the bond dissociation.
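A rigid scan of this kind can be sketched as follows (pure Python with illustrative coordinates; this is a generic sketch, not the hippynn batch geometry optimizer):

```python
import math

def rigid_bond_scan(coords, i, j, bond_lengths):
    """Generate rigid-scan geometries for the i-j bond.

    All atoms stay fixed except atom j, which is placed along the
    original i->j bond axis at each requested bond length.
    """
    ax = [coords[j][k] - coords[i][k] for k in range(3)]
    norm = math.sqrt(sum(c * c for c in ax))
    ax = [c / norm for c in ax]  # unit vector along the bond
    frames = []
    for r in bond_lengths:
        frame = [list(p) for p in coords]  # copy all fixed atoms
        frame[j] = [coords[i][k] + r * ax[k] for k in range(3)]
        frames.append(frame)
    return frames

# Stretch a diatomic placed along z (made-up coordinates, in angstroms):
frames = rigid_bond_scan([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                         i=0, j=1, bond_lengths=[1.0, 2.0, 3.0, 4.0, 5.0])
```

For methane or ethanol, the same routine would move only the departing H atom while the remaining heavy-atom framework stays frozen.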

Figure 7.

Comparison of rigid bond dissociation pathways for the (left) C–H bond in methane and (right) O–H bond in ethanol. HIP-NN unconstrained and atomization-energy-constrained (HIP-NN-AE and ANI-AE) models are compared with DFT calculations for the singlet and triplet states and with literature ML potentials. Shaded bands show the standard deviation over the five ensemble members for each HIP-NN variant. In red are singlet and triplet calculations using ωB97x, the level of theory used in the ANI-1x training data set. Also shown are results from linear ACE48 models and their accompanying DFT calculations, the ANI-1x model, and ANI-1xnr.39 Below, the distribution of distances in the training data set is shown for C–H, along with a vertical indicator of the maximum of this distribution.

From these scans, it is clear that certain MLIPs, such as ANI-1x, are not appropriate for predicting BDEs. ANI-1x was trained using the average energy normalization,38 which is known to produce models that revert toward the mean energy in unknown regions.48 Although the nonphysical BDEs of ANI-1x are attributed to the model being trained only on near-equilibrium data, it is noteworthy that the unconstrained standard HIP-NN model trained on a subset of the ANI-1x data set using the E0 normalization approach already yields more physically reasonable BDEs. In fact, the unconstrained HIP-NN model overpredicts the BDE for ethanol to a similar extent as the ACE potential using the same E0 normalization. Most importantly, HIP-NN-AE and ANI-AE show that the AE constraint dramatically improves agreement with the fully dissociated fragments in the triplet spin state, despite the models being trained under near-equilibrium conditions using a DFT singlet PES.

As observed in Figure 7, for bond lengths shorter than 2 Å, the HIP-NN-AE, HIP-NN, ANI-AE, and ANI-1x models agree closely with the DFT singlet data. Notably, HIP-NN-AE exhibits a nonphysical maximum between 2 and 3 Å. This maximum occurs at approximately the same bond length where HIP-NN, HIP-NN-AE, ANI-AE, and ANI-1x diverge. This divergence is also near the maximum bond length present in the ANI-1x data set, approximately 2.5 Å, as shown by the vertical dotted line. The bottom panels in Figure 7 show that a non-negligible number of structures in the ANI-1x data set correspond to partially dissociated C–H or O–H bonds, with bond lengths between 2 and 3 Å. However, for elongated bond lengths, the singlet is no longer the correct spin multiplicity. For bond lengths greater than approximately 3 Å, the triplet becomes the correct spin multiplicity. For systems with partial bond dissociations, single-determinant DFT suffers from a significant amount of spin contamination. Thus, neither the singlet nor the triplet can adequately represent bond lengths between 2 and 3 Å. More complicated and expensive multireference QM methods are required for these partially dissociated structures. Therefore, the nonphysical maximum in HIP-NN-AE between 2 and 3 Å is primarily a result of the ANI-1x training data set containing some structures with elongated bond lengths that are improperly labeled as singlet multiplicity. Although HIP-NN-AE follows the singlet training data through partially dissociated bond lengths, the AE constraint eventually causes HIP-NN-AE to correct toward the triplet-state energy, despite the training data set not containing any triplet data or radical structures. Similarly, ANI-AE agrees closely with ANI-1x until this divergence bond length, corresponding to the maximum in ANI-1x. Subsequently, the ANI-AE energy remains fairly constant near the correct BDE value. By contrast, the ANI-1x model decreases rapidly to an extremely small BDE value. The unconstrained HIP-NN model plateaus at an elevated BDE value, corresponding approximately to the singlet energy at the longest bond length in the training data set.

We also compare to the ANI-1xnr potential,39 which was trained using active learning to model high-temperature and high-pressure reactive conditions for organic molecules, using condensed-phase, plane-wave DFT simulations that contain many reactive events. These data were calculated using singlet multiplicity; however, the authors note their assumption that a condensed-phase system does not accumulate a significant amount of excess spin. The ANI-1xnr potential is more accurate in the transition region; however, HIP-NN-AE and ANI-1xnr show similar accuracy in the dissociated limit. This demonstrates that the AE constraint improves the BDE prediction as much as the presence of bond-breaking events and radical fragments in the condensed-phase ANI-1xnr training set.

Conformational Energy Variation Prediction

To evaluate the performance of HIP-NN-AE on other types of relative energy predictions, we compared HIP-NN-AE with the HIP-NN and ANI-1x models12 using selected subsets from the COMP6 data set.38 In the COMP6 data set, each molecule has multiple sampled geometries, with each geometry labeled with its corresponding QM-calculated energy and forces. This allows us to easily generate sets of conformational variation pairs and their respective relative energies. Specifically, we focus on the energy differences between each conformer and the first available conformer of the same molecule.
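The conformer bookkeeping described above can be sketched as follows (illustrative molecule names and energies; the data structures are assumptions, not the COMP6 file format):

```python
def conformer_relative_energies(energies_by_molecule):
    """Energy of each conformer relative to the first available
    conformer of the same molecule (in the input's energy units)."""
    return {
        mol: [e - energies[0] for e in energies]
        for mol, energies in energies_by_molecule.items()
    }

# Illustrative (made-up) conformer energies in kcal/mol for one molecule:
rel = conformer_relative_energies({"mol_A": [-10.0, -9.5, -8.0]})
# rel["mol_A"] == [0.0, 0.5, 2.0]
```

Because the reference conformer is subtracted out, any constant offset in a model's absolute energies cancels, isolating the conformational energy variation.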

From Table 2, we can see that while both HIP-NN and HIP-NN-AE outperform the ANI-1x model in predicting conformational variation energies, the performance difference between HIP-NN and HIP-NN-AE is not substantial. Notably, molecules in subsets such as tripeptide and DrugBank are significantly larger than those in the training set of HIP-NN and HIP-NN-AE. Considering that isolated or nearly isolated atoms are rarely present in the conformers provided by the COMP6 data set, we conclude that adding the AE constraint does not negatively impact the performance of HIP-NN in predicting relative energies. We also tested absolute energies, shown in the Supporting Information, which exhibit essentially the same trends.

Table 2. Performance Comparisons on Each Subset of COMP6a.

  drugbank gdb11 07 gdb11 08 gdb11 09 gdb11 10
HIP-NN-AE 1.427 ± 0.054 0.485 ± 0.029 0.562 ± 0.032 0.644 ± 0.027 0.737 ± 0.027
HIP-NN 1.498 ± 0.061 0.495 ± 0.025 0.579 ± 0.028 0.668 ± 0.024 0.772 ± 0.021
ANI-1x 2.336 1.203 1.342 1.434 1.532
  gdb11 11 gdb13 12 gdb13 13 s66x8 tripeptide full
HIP-NN-AE 1.719 ± 0.057 1.830 ± 0.054 2.084 ± 0.063 1.218 ± 0.455 1.030 ± 0.071
HIP-NN 1.794 ± 0.065 1.931 ± 0.043 2.170 ± 0.043 0.954 ± 0.130 1.064 ± 0.054
ANI-1x 3.161 3.453 3.749 2.246 1.804
a Values reported are MAEs in kcal/mol. Comparison between the MLIPs trained in this work (HIP-NN-AE, HIP-NN) and the literature MLIP ANI-1x tested against the COMP6 DFT values.

Conclusions

In this work, we introduced the concept of incorporating a minimal AE consistency constraint into NN-based MLIP architectures to drastically improve AE predictions. The constraint is minimal in that its dimensionality is precisely one per atomic species; this is the number of degrees of freedom required to shift between SPE and AE. This constraint has a small effect on the training cost, and after training, there is zero additional cost in comparison to the unconstrained architecture. This constraint can be applied to any NN MLIP architecture that provides per-atom contributions to the model energy and trains via stochastic gradient descent. The training preprocessing does require the computation of single-atom energies at the same level of theory as the training data set, which is a very modest expense. The constraint is exact (also called a hard constraint, rather than a soft constraint, regularization, or penalty term) and contains zero hyperparameters to tune. This shows distinct advantages over augmenting the data set, a soft-constraint approach which only approximates the correct behavior and requires specifying the amount of data augmentation to provide.
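As an illustrative sketch of this one-parameter-per-species bookkeeping (the isolated-atom energies below are made-up placeholders and the sign convention is an assumption, not the hippynn implementation):

```python
from collections import Counter

# Hypothetical isolated-atom energies (one value per species), computed
# once at the same level of theory as the training data:
E_ATOM = {"H": -0.500, "C": -37.800, "N": -54.500, "O": -75.000}

def spe_to_ae(spe, species):
    """Shift a single-point energy to an atomization energy.

    One reference value per atomic species is the only extra
    information required, matching the one-degree-of-freedom-per-
    species dimensionality of the constraint. Sign convention here:
    AE = SPE - sum of isolated-atom energies.
    """
    counts = Counter(species)
    return spe - sum(n * E_ATOM[s] for s, n in counts.items())

# Water (made-up SPE in hartree): two H atoms and one O atom.
ae = spe_to_ae(-76.3, ["H", "H", "O"])
```

The shift is a fixed linear function of the atomic composition, which is why applying it as a hard constraint adds no inference-time cost.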

We applied the AE constraint to the HIP-NN architecture, yielding a model we refer to as HIP-NN-AE, and to the ANI architecture, yielding ANI-AE. We then trained the models on near-equilibrium data sets. Training with the AE constraint had very little effect on the overall train/test error; this is fortunate because, intuitively, a constrained model might be more difficult to train than an unconstrained one. Furthermore, the PES of HIP-NN-AE and ANI-AE through the full atomization process aligns more closely with physical principles than those of unconstrained HIP-NN and ANI-1x.

The HIP-NN-AE model provides more accurate BDE predictions than HIP-NN while almost completely eliminating both the tendency to overpredict BDEs and the occurrence of nonphysical negative BDEs. Improvements in BDEs are particularly pronounced for bond dissociation processes that produce small fragments. ANI-AE showed similar improvement in BDE predictions relative to the unconstrained ANI-1x model. HIP-NN-AE also outperforms the HIP-NN model in other relative energy prediction tasks, such as predicting conformational variation energies in the COMP6 data set. Although the improvements to conformational energies are not drastic, we conclude that adding the AE constraint does not negatively affect model performance. We also compared our approach to models from the literature on bond stretches for methane and ethanol. Due to limitations we identify in the training data set associated with the assumption of a singlet spin state, we find that intermediate bond-breaking distances are not well modeled by either approach, although a dramatic increase in the ensemble uncertainty shows that this kind of error can be identified at test time. In the dissociated limit, we demonstrate that unconstrained HIP-NN compares similarly to (unconstrained) linear ACE, while the constrained HIP-NN-AE performs similarly to ANI-1xnr, which was specifically trained to capture high-energy, reactive conditions containing radicals. Our studies also underscore the need for careful attention to spin states and distance distributions in data sets in order to successfully model dissociation with MLIPs.

We briefly contrast our results with related techniques in the literature. The bias-free NN method used in SWANI50 accomplishes a similar goal but through a more severe, nonminimal constraint. The performance of the bias-free method alone was actually worse than that of the original ANI method, and it had to be combined with other improvements (more potential contributions and wider NNs) to yield the full SWANI model. While an explicit performance comparison was not performed, BOTNET49 and MACE23 introduced a similar type of bias-free architectural design involving constraining the body order of all energy contributions in the network. However, these models include a residual interaction term to account for higher body orders; this term is not constrained and can produce one-body, one-atom contributions, so the constraint is not exact. Thus, we provide an alternative formulation based on minimally constraining the model, which in our tests shows no drawbacks.

In addition to developing the HIP-NN-AE model, two extensions to the hippynn open-source package were created during this project: a HIP-NN hyper-parameter tuning tool using Optuna76 that can automatically find hyper-parameters for all available HIP-NN variations, and a HIP-NN batch geometry optimizer that enables highly efficient molecular geometry optimizations across batches of molecules on GPUs.

Future work might investigate the impact of the AE constraint when training HIP-NN-AE to reactive data sets, i.e., data sets containing bond dissociation processes, reactive dynamics39 and/or transition states.77 Although we already tested this technique with ANI, it would be interesting to examine a larger variety of NN architectures to see whether the performance benefits are consistent across models. Furthermore, the AE constraint may prove beneficial beyond the context of MLIPs.10 Thus, we recommend future studies consider including the AE constraint when predicting chemical properties with an atom-centered NN.

Acknowledgments

The authors thank Benjamin Nebgen, Justin S. Smith, Aidan Thompson, and Mitchell Wood for helpful discussions. S.Z., M.C., R.A.M., and N.L. acknowledge support from the US Department of Energy, Office of Science, Basic Energy Sciences, Chemical Sciences, Geosciences, and Biosciences Division under Triad National Security, LLC (“Triad”) contract grant 89233218CNA000001 (FWP: LANLE3F2). S.Z. gratefully acknowledges the resources of the Los Alamos National Laboratory (LANL) Computational Science summer student program. The work at LANL was supported by the LANL Laboratory Directed Research and Development (LDRD) Projects 20230290ER and 20230435ECR. Work at LANL was performed in part at the Center for Nonlinear Studies and the Center for Integrated Nanotechnologies, a US Department of Energy Office of Science user facility at LANL. This research used resources provided by the LANL Institutional Computing Program. This research used resources provided by the Darwin testbed at LANL, which is funded by the Computational Systems and Software Environments subprogram of LANL’s Advanced Simulation and Computing program (NNSA/DOE). S.Z. and O.I. acknowledge support from the Office of Naval Research through the Energetic Materials Program (MURI grant number N00014-21-1-2476). This work used supercomputing resources through allocation CHE200122 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by NSF grants #2138259, #2138286, #2138307, #2137603, and #2138296. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC05-00OR22725.

Data Availability Statement

All data used in this work were produced in prior studies and can be found by following the appropriate references; no new data were created in this work. The software used for this work is hippynn, which is open-source and available at https://github.com/lanl/hippynn. Also included is an example script demonstrating how to train with the AE-constrained method for HIP-NN. Code for the ANI-AE model, based on TorchANI,78 is available at https://github.com/amateurcat/ANI-AE.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.5c00341.

  • Details of data preprocessing, model training and hyperparameter optimization, and additional results on COMP6 performed in this project (PDF)

Author Contributions

Conceptualization was provided by N.L. and R.A.M.; data curation was performed by S.Z., M.C., and R.A.M.; funding acquisition was performed by N.L., R.A.M., and O.I.; investigation was performed by S.Z., M.C., R.A.M., and N.L.; methodology was by S.Z., M.C., R.A.M., and N.L.; software was written by S.Z. and N.L.; resources were acquired by R.A.M. and N.L.; supervision was performed by O.I., R.A.M., and N.L.; visualizations were created by S.Z., M.C., R.A.M., and N.L.; writing of the original draft was done by S.Z., R.A.M., and N.L.; and reviewing and editing were done by S.Z., O.I., R.A.M., and N.L.

The authors declare no competing financial interest.

Special Issue

Published as part of the Journal of Chemical Information and Modeling special issue “Modeling Reactions from Chemical Theories to Machine Learning”.

Supplementary Material

ci5c00341_si_001.pdf (267.7KB, pdf)

References

  1. Senftle T. P.; Hong S.; Islam M. M.; Kylasa S. B.; Zheng Y.; Shin Y. K.; Junkermeier C.; Engel-Herbert R.; Janik M. J.; Aktulga H. M. The ReaxFF reactive force-field: development, applications and future directions. npj Comput. Mater. 2016, 2, 15011. 10.1038/npjcompumats.2015.11. [DOI] [Google Scholar]
  2. Van Duin A. C.; Dasgupta S.; Lorant F.; Goddard W. A. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 2001, 105, 9396–9409. 10.1021/jp004368u. [DOI] [Google Scholar]
  3. Liang T.; Shin Y. K.; Cheng Y.-T.; Yilmaz D. E.; Vishnu K. G.; Verners O.; Zou C.; Phillpot S. R.; Sinnott S. B.; Van Duin A. C. Reactive potentials for advanced atomistic simulations. Annu. Rev. Mater. Res. 2013, 43, 109–129. 10.1146/annurev-matsci-071312-121610. [DOI] [Google Scholar]
  4. Artrith N.; Butler K. T.; Coudert F.-X.; Han S.; Isayev O.; Jain A.; Walsh A. Best practices in machine learning for chemistry. Nat. Chem. 2021, 13, 505–508. 10.1038/s41557-021-00716-z. [DOI] [PubMed] [Google Scholar]
  5. Dral P. O. Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett. 2020, 11, 2336–2347. 10.1021/acs.jpclett.9b03664. [DOI] [PubMed] [Google Scholar]
  6. Meuwly M. Machine learning for chemical reactions. Chem. Rev. 2021, 121, 10218–10239. 10.1021/acs.chemrev.1c00033. [DOI] [PubMed] [Google Scholar]
  7. Lupo Pasini M.; Karabin M.; Eisenbach M. Transferring predictions of formation energy across lattices of increasing size. Mach. Learn.: Sci. Technol. 2024, 5, 025015. 10.1088/2632-2153/ad3d2c. [DOI] [Google Scholar]
  8. Chen L.-Y.; Hsu T.-W.; Hsiung T.-C.; Li Y.-P. Deep Learning-Based Increment Theory for Formation Enthalpy Predictions. J. Phys. Chem. A 2022, 126, 7548–7556. 10.1021/acs.jpca.2c04848. [DOI] [PubMed] [Google Scholar]
  9. Ward L.; Dandu N.; Blaiszik B.; Narayanan B.; Assary R. S.; Redfern P. C.; Foster I.; Curtiss L. A. Graph-Based Approaches for Predicting Solvation Energy in Multiple Solvents: Open Datasets and Machine Learning Models. J. Phys. Chem. A 2021, 125, 5990–5998. 10.1021/acs.jpca.1c01960. [DOI] [PubMed] [Google Scholar]
  10. Fedik N.; Zubatyuk R.; Kulichenko M.; Lubbers N.; Smith J. S.; Nebgen B.; Messerly R.; Li Y. W.; Boldyrev A. I.; Barros K.; Isayev O.; Tretiak S. Extending machine learning beyond interatomic potentials for predicting molecular properties. Nat. Rev. Chem 2022, 6, 653–672. 10.1038/s41570-022-00416-3. [DOI] [PubMed] [Google Scholar]
  11. Unke O. T.; Chmiela S.; Sauceda H. E.; Gastegger M.; Poltavsky I.; Schütt K. T.; Tkatchenko A.; Müller K.-R. Machine learning force fields. Chem. Rev. 2021, 121, 10142–10186. 10.1021/acs.chemrev.0c01111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Smith J. S.; Isayev O.; Roitberg A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 2017, 8, 3192–3203. 10.1039/C6SC05720A. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Deringer V. L.; Csányi G. Machine learning based interatomic potential for amorphous carbon. Phys. Rev. B 2017, 95, 094203. 10.1103/PhysRevB.95.094203. [DOI] [Google Scholar]
  14. Jiang B.; Guo H. Permutation invariant polynomial neural network approach to fitting potential energy surfaces. J. Chem. Phys. 2013, 139, 054112. 10.1063/1.4817187. [DOI] [PubMed] [Google Scholar]
  15. Pham C. H.; Lindsey R. K.; Fried L. E.; Goldman N. High-accuracy semiempirical quantum models based on a minimal training set. J. Phys. Chem. Lett. 2022, 13, 2934–2942. 10.1021/acs.jpclett.2c00453. [DOI] [PubMed] [Google Scholar]
  16. Kulichenko M.; Smith J. S.; Nebgen B.; Li Y. W.; Fedik N.; Boldyrev A. I.; Lubbers N.; Barros K.; Tretiak S. The rise of neural networks for materials and chemical dynamics. J. Phys. Chem. Lett. 2021, 12, 6227–6243. 10.1021/acs.jpclett.1c01357. [DOI] [PubMed] [Google Scholar]
  17. Behler J.; Parrinello M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 2007, 98, 146401. 10.1103/PhysRevLett.98.146401. [DOI] [PubMed] [Google Scholar]
  18. Behler J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 2011, 134, 074106. 10.1063/1.3553717. [DOI] [PubMed] [Google Scholar]
  19. Zhang L.; Han J.; Wang H.; Car R.; E W. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 2018, 120, 143001. 10.1103/PhysRevLett.120.143001. [DOI] [PubMed] [Google Scholar]
  20. Zubatyuk R.; Smith J. S.; Leszczynski J.; Isayev O. Accurate and transferable multitask prediction of chemical properties with an atoms-in-molecules neural network. Sci. Adv. 2019, 5, eaav6490 10.1126/sciadv.aav6490. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Anstine D.; Zubatyuk R.; Isayev O. AIMNet2: a neural network potential to meet your neutral, charged, organic, and elemental-organic needs. ChemRxiv 2023, 10.26434/chemrxiv-2023-296ch. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Lubbers N.; Smith J. S.; Barros K. Hierarchical modeling of molecular energies using a deep neural network. J. Chem. Phys. 2018, 148, 241715. 10.1063/1.5011181. [DOI] [PubMed] [Google Scholar]
  23. Batatia I.; Kovacs D. P.; Simm G.; Ortner C.; Csányi G.. MACE: Higher order equivariant message passing neural networks for fast and accurate force fields. Advances in Neural Information Processing Systems, 2022; Vol. 35; pp 11423–11436.
  24. Gelzinyte E.; Öeren M.; Segall M. D.; Csányi G. Transferable machine learning interatomic potential for bond dissociation energy prediction of drug-like molecules. J. Chem. Theory Comput. 2024, 20, 164–177. 10.1021/acs.jctc.3c00710.
  25. Smith J. S.; Nebgen B. T.; Zubatyuk R.; Lubbers N.; Devereux C.; Barros K.; Tretiak S.; Isayev O.; Roitberg A. E. Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning. Nat. Commun. 2019, 10, 2903. 10.1038/s41467-019-10827-4.
  26. Yang Y.; Zhang S.; Ranasinghe K. D.; Isayev O.; Roitberg A. E. Machine Learning of Reactive Potentials. Annu. Rev. Phys. Chem. 2024, 75, 371–395. 10.1146/annurev-physchem-062123-024417.
  27. Yoo P.; Sakano M.; Desai S.; Islam M. M.; Liao P.; Strachan A. Neural network reactive force field for C, H, N, and O systems. npj Comput. Mater. 2021, 7, 9. 10.1038/s41524-020-00484-3.
  28. Hamilton B. W.; Yoo P.; Sakano M. N.; Islam M. M.; Strachan A. High-pressure and temperature neural network reactive force field for energetic materials. J. Chem. Phys. 2023, 158, 144117. 10.1063/5.0146055.
  29. Zhang S.; Zubatyuk R.; Yang Y.; Roitberg A.; Isayev O. ANI-1xBB: An ANI-Based Reactive Potential for Small Organic Molecules. J. Chem. Theory Comput. 2025, 10.1021/acs.jctc.5c00347.
  30. Guan X.; Heindel J. P.; Ko T.; Yang C.; Head-Gordon T. Using machine learning to go beyond potential energy surface benchmarking for chemical reactivity. Nat. Comput. Sci. 2023, 3, 965–974. 10.1038/s43588-023-00549-5.
  31. Hu Q.; Gordon A.; Johanessen A.; Tan L.; Goodpaster J. Training Transferable Interatomic Neural Network Potentials for Reactive Chemistry: Improved Chemical Space Sampling. ChemRxiv 2024, 10.26434/chemrxiv-2024-c375f.
  32. Zhao Q.; Anstine D. M.; Isayev O.; Savoie B. M. Δ2 machine learning for reaction property prediction. Chem. Sci. 2023, 14, 13392–13401. 10.1039/D3SC02408C.
  33. Behler J. First principles neural network potentials for reactive simulations of large molecular and condensed systems. Angew. Chem., Int. Ed. 2017, 56, 12828–12840. 10.1002/anie.201703114.
  34. Stocker S.; Csanyi G.; Reuter K.; Margraf J. T. Machine learning in chemical reaction space. Nat. Commun. 2020, 11, 5505. 10.1038/s41467-020-19267-x.
  35. Hu Q. H.; Johannesen A. M.; Graham D. S.; Goodpaster J. D. Neural network potentials for reactive chemistry: CASPT2 quality potential energy surfaces for bond breaking. Digital Discovery 2023, 2, 1058–1069. 10.1039/D3DD00051F.
  36. Menon A. S.; Radom L. Consequences of spin contamination in unrestricted calculations on open-shell species: Effect of Hartree–Fock and Møller–Plesset contributions in hybrid and double-hybrid density functional theory approaches. J. Phys. Chem. A 2008, 112, 13225–13230. 10.1021/jp803064k.
  37. Jeong W.; Stoneburner S. J.; King D.; Li R.; Walker A.; Lindh R.; Gagliardi L. Automation of active space selection for multireference methods via machine learning on chemical bond dissociation. J. Chem. Theory Comput. 2020, 16, 2389–2399. 10.1021/acs.jctc.9b01297.
  38. Smith J. S.; Nebgen B.; Lubbers N.; Isayev O.; Roitberg A. E. Less is more: Sampling chemical space with active learning. J. Chem. Phys. 2018, 148, 241733. 10.1063/1.5023802.
  39. Zhang S.; Makoś M. Z.; Jadrich R. B.; Kraka E.; Barros K.; Nebgen B. T.; Tretiak S.; Isayev O.; Lubbers N.; Messerly R. A.; Smith J. S. Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential. Nat. Chem. 2024, 16, 727–734. 10.1038/s41557-023-01427-3.
  40. Young T. A.; Johnston-Wood T.; Deringer V. L.; Duarte F. A transferable active-learning strategy for reactive molecular force fields. Chem. Sci. 2021, 12, 10944–10955. 10.1039/D1SC01825F.
  41. Wardzala J. J.; King D. S.; Ogunfowora L.; Savoie B.; Gagliardi L. Organic Reactivity Made Easy and Accurate with Automated Multireference Calculations. ACS Cent. Sci. 2024, 10, 833–841. 10.1021/acscentsci.3c01559.
  42. Xue L.-Y.; Guo F.; Wen Y.-S.; Feng S.-Q.; Huang X.-N.; Guo L.; Li H.-S.; Cui S.-X.; Zhang G.-Q.; Wang Q.-L. ReaxFF-MPNN machine learning potential: a combination of reactive force field and message passing neural networks. Phys. Chem. Chem. Phys. 2021, 23, 19457–19464. 10.1039/D1CP01656C.
  43. Pun G. P.; Batra R.; Ramprasad R.; Mishin Y. Physically informed artificial neural networks for atomistic modeling of materials. Nat. Commun. 2019, 10, 2339. 10.1038/s41467-019-10343-5.
  44. Karniadakis G. E.; Kevrekidis I. G.; Lu L.; Perdikaris P.; Wang S.; Yang L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440. 10.1038/s42254-021-00314-5.
  45. Zubatiuk T.; Isayev O. Development of multimodal machine learning potentials: toward a physics-aware artificial intelligence. Acc. Chem. Res. 2021, 54, 1575–1585. 10.1021/acs.accounts.0c00868.
  46. Specht T.; Nagda M.; Fellenz S.; Mandt S.; Hasse H.; Jirasek F. HANNA: Hard-constraint neural network for consistent activity coefficient prediction. Chem. Sci. 2024, 15, 19777–19786. 10.1039/D4SC05115G.
  47. Thompson A. P.; Swiler L. P.; Trott C. R.; Foiles S. M.; Tucker G. J. Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 2015, 285, 316–330. 10.1016/j.jcp.2014.12.018.
  48. Kovács D. P.; Oord C. v. d.; Kucera J.; Allen A. E.; Cole D. J.; Ortner C.; Csányi G. Linear atomic cluster expansion force fields for organic molecules: beyond RMSE. J. Chem. Theory Comput. 2021, 17, 7696–7711. 10.1021/acs.jctc.1c00647.
  49. Batatia I.; Batzner S.; Péter Kovács D.; Musaelian A.; Simm G. N. C.; Drautz R.; Ortner C.; Kozinsky B.; Csányi G. The Design Space of E(3)-Equivariant Atom-Centered Interatomic Potentials. arXiv 2022, arXiv:2205.06643. 10.48550/arXiv.2205.06643.
  50. Fu W.; Mo Y.; Xiao Y.; Liu C.; Zhou F.; Wang Y.; Zhou J.; Zhang Y. J. Enhancing Molecular Energy Predictions with Physically Constrained Modifications to the Neural Network Potential. J. Chem. Theory Comput. 2024, 20, 4533–4544. 10.1021/acs.jctc.3c01181.
  51. Mohan S.; Kadkhodaie Z.; Simoncelli E. P.; Fernandez-Granda C.. Robust and interpretable blind image denoising via bias-free convolutional neural networks. 8th International Conference on Learning Representations, ICLR 2020, 2020.
  52. Wood M. A.; Thompson A. P. Extending the accuracy of the SNAP interatomic potential form. J. Chem. Phys. 2018, 148, 241721. 10.1063/1.5017641.
  53. Chigaev M.; Smith J. S.; Anaya S.; Nebgen B.; Bettencourt M.; Barros K.; Lubbers N. Lightweight and effective tensor sensitivity for atomistic neural networks. J. Chem. Phys. 2023, 158, 184108. 10.1063/5.0142127.
  54. Allen A. E. A.; Shinkle E.; Bujack R.; Lubbers N. Optimal Invariant Bases for Atomistic Machine Learning. arXiv 2025, arXiv:2503.23515. 10.48550/arXiv.2503.23515.
  55. Gilmer J.; Schoenholz S. S.; Riley P. F.; Vinyals O.; Dahl G. E.. Neural Message Passing for Quantum Chemistry. Proceedings of the 34th International Conference on Machine Learning, 2017; pp 1263–1272.
  56. Schütt K.; Kindermans P.-J.; Sauceda Felix H. E.; Chmiela S.; Tkatchenko A.; Müller K.-R.. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017; Vol. 1; pp 1–10.
  57. Wu Z.; Pan S.; Chen F.; Long G.; Zhang C.; Yu P. S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Networks Learn. Syst. 2020, 32, 4–24. 10.1109/tnnls.2020.2978386.
  58. Devereux C.; Smith J.; Davis K.; Barros K.; Zubatyuk R.; Isayev O.; Roitberg A. Extending the applicability of the ANI deep learning molecular potential to Sulfur and Halogens. ChemRxiv 2020, 10.26434/chemrxiv.118.
  59. Unke O. T.; Meuwly M. PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 2019, 15, 3678–3693. 10.1021/acs.jctc.9b00181.
  60. Zubatyuk R.; Smith J. S.; Leszczynski J.; Isayev O. Accurate and transferable multitask prediction of chemical properties with an atoms-in-molecules neural network. Sci. Adv. 2019, 5, eaav6490. 10.1126/sciadv.aav6490.
  61. Hedelius B. E.; Tingey D.; Della Corte D. TrIP–Transformer Interatomic Potential Predicts Realistic Energy Surface Using Physical Bias. J. Chem. Theory Comput. 2024, 20, 199–211. 10.1021/acs.jctc.3c00936.
  62. Batzner S.; Musaelian A.; Sun L.; Geiger M.; Mailoa J. P.; Kornbluth M.; Molinari N.; Smidt T. E.; Kozinsky B. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 2022, 13, 2453. 10.1038/s41467-022-29939-5.
  63. Lu J.; Wang C.; Zhang Y. Predicting Molecular Energy Using Force-Field Optimized Geometries and Atomic Vector Representations Learned from an Improved Deep Tensor Neural Network. J. Chem. Theory Comput. 2019, 15, 4113–4121. 10.1021/acs.jctc.9b00001.
  64. Smith J. S.; Zubatyuk R.; Nebgen B.; Lubbers N.; Barros K.; Roitberg A. E.; Isayev O.; Tretiak S. The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data 2020, 7, 134. 10.1038/s41597-020-0473-z.
  65. Kovács D. P.; Batatia I.; Arany E. S.; Csányi G. Evaluation of the MACE force field architecture: From medicinal chemistry to materials science. J. Chem. Phys. 2023, 159, 044118. 10.1063/5.0155322.
  66. Haghighatlari M.; Li J.; Guan X.; Zhang O.; Das A.; Stein C. J.; Heidar-Zadeh F.; Liu M.; Head-Gordon M.; Bertels L.; Hao H.; Leven I.; Head-Gordon T. NewtonNet: a Newtonian message passing network for deep learning of interatomic potentials and forces. Digital Discovery 2022, 1, 333–343. 10.1039/D2DD00008C.
  67. Simeon G.; De Fabritiis G.. Tensornet: Cartesian tensor representations for efficient learning of molecular potentials. NIPS ’23: Proceedings of the 37th International Conference on Neural Information Processing Systems, 2024; Vol. 36.
  68. Zaverkin V.; Holzmüller D.; Bonfirraro L.; Kästner J. Transfer learning for chemically accurate interatomic neural network potentials. Phys. Chem. Chem. Phys. 2023, 25, 5383–5396. 10.1039/D2CP05793J.
  69. Žugec I.; Geilhufe R. M.; Lončarić I. Global machine learning potentials for molecular crystals. J. Chem. Phys. 2024, 160, 154106. 10.1063/5.0196232.
  70. Kovács D. P.; Moore J. H.; Browning N. J.; Batatia I.; Horton J. T.; Pu Y.; Kapil V.; Witt W. C.; Magdău I.-B.; Cole D. J.; Csányi G. MACE-OFF: Transferable Short Range Machine Learning Force Fields for Organic Molecules. arXiv 2023, arXiv:2312.15211. 10.48550/arXiv.2312.15211.
  71. Fuchs F. B.; Worrall D. E.; Fischer V.; Welling M.. Se(3)-transformers: 3d roto-translation equivariant attention networks. Advances in Neural Information Processing Systems, 2020; Vol. 33; pp 1970–1981.
  72. Hoja J.; Medrano Sandonas L.; Ernst B. G.; Vazquez-Mayagoitia A.; DiStasio R. A.; Tkatchenko A. QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules. Sci. Data 2021, 8, 43. 10.1038/s41597-021-00812-2.
  73. Kovács D. P.; Batatia I.; Arany E. S.; Csányi G. Evaluation of the MACE force field architecture: From medicinal chemistry to materials science. J. Chem. Phys. 2023, 159, 044118. 10.1063/5.0155322.
  74. St. John P. C.; Guan Y.; Kim Y.; Kim S.; Paton R. S. Prediction of organic homolytic bond dissociation enthalpies at near chemical accuracy with sub-second computational cost. Nat. Commun. 2020, 11, 2328. 10.1038/s41467-020-16201-z.
  75. St. John P. C.; Guan Y.; Kim Y.; Etz B. D.; Kim S.; Paton R. S. Quantum chemical calculations for over 200,000 organic radical species and 40,000 associated closed-shell molecules. Sci. Data 2020, 7, 244. 10.1038/s41597-020-00588-x.
  76. Akiba T.; Sano S.; Yanase T.; Ohta T.; Koyama M.. Optuna: A Next-generation Hyperparameter Optimization Framework. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.
  77. Schreiner M.; Bhowmik A.; Vegge T.; Busk J.; Winther O. Transition1x-a dataset for building generalizable reactive machine learning potentials. Sci. Data 2022, 9, 779. 10.1038/s41597-022-01870-w.
  78. Gao X.; Ramezanghorbani F.; Isayev O.; Smith J. S.; Roitberg A. E. TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials. J. Chem. Inf. Model. 2020, 60, 3408–3415. 10.1021/acs.jcim.0c00451.

Associated Data


Supplementary Materials

ci5c00341_si_001.pdf (267.7KB, pdf)

Data Availability Statement

All data used in this work were produced in prior studies and can be found by following the appropriate references; no new data were created in this work. The software used for this work is hippynn, which is open-source and available at https://github.com/lanl/hippynn. The hippynn repository also includes an example script demonstrating how to train with the AE-constrained method for HIP-NN. Code for the ANI-AE model, based on TorchANI,78 is available at https://github.com/amateurcat/ANI-AE.


Articles from Journal of Chemical Information and Modeling are provided here courtesy of American Chemical Society
