Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field — the training of system-specific MLPs for reactive systems — with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect will also be useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based models (using Gaussian process regression, GPR) by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features are fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
I. INTRODUCTION
In recent years, there has been significant progress in the development of machine learning potentials (MLPs) for generating high-quality potential energy surfaces for chemical systems.1–16 In general, for a molecule with N atoms, an MLP model takes as input the Cartesian coordinates of the atoms along with their atom types, and returns the potential energy of the system. The forces on the atoms, i.e., the negative derivatives of the potential energy with respect to the coordinates, are often calculated through automatic differentiation. These MLPs can be categorized into two main groups based on their architecture:8 descriptor-based models and graph neural network (GNN)-based models.
For descriptor-based models, the system’s coordinates are first transformed into descriptor vectors, which must adhere to translational, rotational, and permutational symmetries. In the Behler-Parrinello neural network (BPNN)17 and its ANI variants,18–20 for instance, symmetry functions are used to encode the local environment of each atom into a descriptor called an atomic environment vector (AEV). In the DeepPot-SE models,21–25 on the other hand, embedding neural networks are used to transform the coordinates into descriptors. These and other descriptors (such as the internal coordinates,26 Coulomb matrix,27 permutation invariant polynomials,5,14,28,29 bag of bonds,30 normalized inverted internuclear distances,31 FCHL representation,32 and weighted symmetry functions33) are then used as inputs to a regressor, such as a neural network or a kernel-based regressor, to predict the target molecular energy and the corresponding atomic forces.
In an alternative approach, GNN-based models treat the molecular system as a dense graph, with each atom represented by a node and two-body interactions represented by edges between the nodes. Unlike descriptor-based models, where the descriptors are calculated from the atomic coordinates in one pass, in GNN-based models the description of each atom’s local environment is updated iteratively through multiple rounds of refinement. Examples in this category include DTNN,34 SchNet,35,36 PhysNet,37 NequIP,38 etc.
In this tutorial on basic MLPs for reactive systems, which we prepared in the last year for training new members in our labs, we primarily focused on descriptor-based models, specifically the atom-centered symmetry functions (including the ANI variant) and the DeepPot-SE descriptors. We then employed these descriptors in combination with two types of regressors — neural network models and Gaussian process regression (GPR)39,40 based kernel models — to train the MLPs for model systems.
This tutorial is organized as follows. In Section II, we will briefly introduce the underlying methods for feature extraction (symmetry functions and DeepPot-SE) and for data regression (neural networks and GPR). Seven tutorial lessons will be briefly outlined in Section III. A discussion is presented in Section IV on the utilization and extension of these MLPs. Concluding remarks are made in Section V.
II. METHODS
A. Feature Extraction
1. Symmetry Functions
Atomic feature vectors G_i, also known as symmetry functions, describe the organization of the environment surrounding each atom, and are usually decomposed into two-body and three-body terms. The two-body terms, which are called the radial functions following the nomenclature of Behler and Parrinello,17 for the i-th atom are defined as

G_i^{rad} = \sum_{j \neq i} \exp\left[ -\eta \left( R_{ij} - R_s \right)^2 \right] f_c(R_{ij})        (1)

summing up the contributions from all atoms other than the i-th atom itself. Here, f_c(R_ij) is a damping function of the interatomic distance R_ij with a cutoff R_c, defined as

f_c(R_{ij}) = \begin{cases} \frac{1}{2} \left[ \cos\left( \frac{\pi R_{ij}}{R_c} \right) + 1 \right], & R_{ij} \leq R_c \\ 0, & R_{ij} > R_c \end{cases}        (2)

Note that η and R_s in Eq. 1 as well as R_c in Eq. 2 are all predetermined hyperparameters. The three-body terms, or the angular functions, for the i-th atom are defined as

G_i^{ang} = 2^{1-\zeta} \sum_{j,k \neq i} \left( 1 + \lambda \cos\theta_{ijk} \right)^{\zeta} \exp\left[ -\eta \left( R_{ij}^2 + R_{ik}^2 + R_{jk}^2 \right) \right] f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk})        (3)

with the angle θ_ijk being

\cos\theta_{ijk} = \frac{\mathbf{R}_{ij} \cdot \mathbf{R}_{ik}}{R_{ij} R_{ik}}        (4)

where R_ij is a vector pointing from atom i with coordinate R_i to atom j with coordinate R_j, i.e.,

\mathbf{R}_{ij} = \mathbf{R}_j - \mathbf{R}_i        (5)

Here ζ and η are hyperparameters, and λ = +1 or −1. In the ANI18 implementation of BPNN, the angular function is replaced with

G_i^{ang,ANI} = 2^{1-\zeta} \sum_{j,k \neq i} \left[ 1 + \cos\left( \theta_{ijk} - \theta_s \right) \right]^{\zeta} \exp\left[ -\eta \left( \frac{R_{ij} + R_{ik}}{2} - R_s \right)^2 \right] f_c(R_{ij}) f_c(R_{ik})        (6)

where R_s is the shifting hyperparameter defining the center of the Gaussian and θ_s analogously shifts the angular term. With different combinations of the hyperparameters, a series of symmetry functions G_i^{rad} and G_i^{ang} can be defined, enhancing the capability of characterizing the inhomogeneous environment.
With the cutoff distance R_c, the BPNN potential becomes short-ranged. For systems where the long-range interaction is non-negligible, for instance for molecules in the condensed phase, the Coulomb interaction beyond the cutoff distance can still be substantial. For these kinds of systems, one extra feature representing the electrostatic potential embedding the atom can be appended. It can be seen that the atomic feature vectors do not depend on the absolute positions of the atoms but only on the relative positions among the atoms; therefore, the mandatory translational and rotational invariances are satisfied. Behler and Parrinello used a fully-connected feedforward neural network to map the atomic features to atomic energies. Instead of individual neural networks for each atom, atoms of the same element share the same neural network. More generally, atoms of the same atom type share the same neural network. In other words, the neural network is not atom-wise, but element-wise or atom-type-wise. In this way, the condition of permutational invariance is also met.
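To make the construction above concrete, the following Python/NumPy sketch evaluates the cutoff function of Eq. 2 and a radial symmetry function of Eq. 1 for a small set of Cartesian coordinates. The function names and the particular hyperparameter values (η, R_s, R_c) are ours, chosen only for illustration.

```python
import numpy as np

def cutoff(r, r_c):
    """Cosine damping function f_c(R_ij) of Eq. 2."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_sf(coords, eta, r_s, r_c):
    """Radial symmetry function of Eq. 1 for every atom (single element assumed)."""
    n = len(coords)
    g = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            r_ij = np.linalg.norm(coords[j] - coords[i])
            g[i] += np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return g

# Toy example: four atoms with illustrative hyperparameters
coords = np.array([[0.0, 0.0, 0.0],
                   [1.5, 0.0, 0.0],
                   [0.0, 1.5, 0.0],
                   [1.0, 1.0, 1.0]])
print(radial_sf(coords, eta=4.0, r_s=0.5, r_c=5.2))
```

In practice, a full AEV concatenates many such radial (and angular) functions with different hyperparameter combinations, computed separately for each neighboring element type.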
2. DeepPot-SE Representation
Similar to the symmetry functions, in Deep Potential - Smooth Edition (DeepPot-SE),21 for a system consisting of N atoms, each atom i is first represented by its local environment matrix R^i, i.e., the relative coordinates between atom i and each of its neighbor atoms j,

(\mathbf{R}^i)_j = \mathbf{R}_{ji} = \mathbf{R}_j - \mathbf{R}_i = \left( x_{ji}, y_{ji}, z_{ji} \right)        (7)

Next, the local environment matrix R^i is transformed to the generalized local environment matrix \tilde{R}^i,

(\tilde{\mathbf{R}}^i)_j = s(R_{ji}) \left( 1, \frac{x_{ji}}{R_{ji}}, \frac{y_{ji}}{R_{ji}}, \frac{z_{ji}}{R_{ji}} \right)        (8)

where R_ji = |R_ji| and

s(R_{ji}) = \begin{cases} \frac{1}{R_{ji}}, & R_{ji} < R_{cs} \\ \frac{1}{R_{ji}} \left\{ \frac{1}{2} \cos\left[ \pi \frac{R_{ji} - R_{cs}}{R_c - R_{cs}} \right] + \frac{1}{2} \right\}, & R_{cs} \leq R_{ji} < R_c \\ 0, & R_{ji} \geq R_c \end{cases}        (9)

Here R_cs is the switching distance from which the components in \tilde{R}^i smoothly decay to zero at the cutoff distance R_c. In this tutorial, we focused on relatively small molecular systems, and no cutoff was applied for the interatomic interactions, i.e., s(R_ji) = 1/R_ji for any R_ji values.

In the next step of feature abstraction, an embedding neural network (ENN) is used to map each s(R_ji) value through multiple hidden layers of neurons into M_1 outputs, which form the j-th row of the embedding matrix G^i,

(\mathbf{G}^i)_j = \left( G_1\left[ s(R_{ji}) \right], G_2\left[ s(R_{ji}) \right], \ldots, G_{M_1}\left[ s(R_{ji}) \right] \right)        (10)

It should be noted that a separate embedding neural network needs to be trained for each pair of atom element types (Z_i, Z_j). Lastly, a feature matrix D^i of size M_1 by M_2 is computed,

\mathbf{D}^i = (\mathbf{G}^{i1})^{T} \, \tilde{\mathbf{R}}^i \, (\tilde{\mathbf{R}}^i)^{T} \, \mathbf{G}^{i2}        (11)

where G^{i1} is the same as G^i (in Eq. 10) and a submatrix G^{i2} contains the first M_2 columns of G^{i1} (i.e., M_2 < M_1). Both M_1 and M_2 are additional hyperparameters of the DeepPot-SE representation, besides the number of hidden layers and the number of neurons in each layer of the embedding networks.
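A minimal NumPy sketch of this feature-extraction pipeline is given below. It builds the generalized environment matrix of Eq. 8 (with no cutoff, so s(R) = 1/R), uses a random, untrained two-layer "embedding network" in place of a trained ENN, and forms the feature matrix of Eq. 11. All array sizes and the tanh embedding layers are illustrative assumptions, not the DeepPot-SE defaults.

```python
import numpy as np

rng = np.random.default_rng(0)

def generalized_environment(coords, i):
    """Rows of the generalized local environment matrix (Eq. 8), with s(R) = 1/R (no cutoff)."""
    rows = []
    for j in range(len(coords)):
        if j == i:
            continue
        r_vec = coords[j] - coords[i]
        r = np.linalg.norm(r_vec)
        s = 1.0 / r
        rows.append(np.concatenate(([s], s * r_vec / r)))
    return np.array(rows)                           # shape (N_i, 4)

def toy_embedding(s_values, m1=8):
    """Random, untrained stand-in for the embedding network of Eq. 10."""
    w1 = rng.normal(size=(1, 16)); b1 = rng.normal(size=16)
    w2 = rng.normal(size=(16, m1)); b2 = rng.normal(size=m1)
    h = np.tanh(s_values[:, None] @ w1 + b1)
    return np.tanh(h @ w2 + b2)                     # shape (N_i, m1)

def deeppot_features(coords, i, m1=8, m2=4):
    """Feature matrix D^i of Eq. 11, flattened to a vector."""
    r_tilde = generalized_environment(coords, i)    # (N_i, 4)
    g = toy_embedding(r_tilde[:, 0], m1)            # (N_i, m1)
    d = g.T @ r_tilde @ r_tilde.T @ g[:, :m2]       # (m1, m2)
    return d.reshape(-1)

coords = rng.normal(size=(5, 3))                    # five-atom toy configuration
print(deeppot_features(coords, i=0).shape)          # (32,) for m1=8, m2=4
```

Note that D^i has a fixed size (M_1 × M_2) regardless of the number of neighbors, which is what makes it a convenient input for the fitting network described next.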
B. Feedforward Neural Networks
BPNN, named after Behler and Parrinello, was proposed in 2007 to deal with the difficulty of handling a varying number of atoms in molecules and of ensuring permutational invariance.3,17,41 The basic idea of BPNN is to decompose the molecular energy into atomic contributions,

E = \sum_{i=1}^{N} E_i        (12)

where N is the total number of atoms in the molecule, and E_i is the energy of the i-th atom as the output of a trained neural network. The input to the neural network is the atomic feature vector denoted as G_i, like those defined in Eqs. 1 and 3, instead of the original molecular coordinates. The workflow of BPNN is illustrated in Figure 1, where an element-dependent feedforward neural network maps the atomic feature vector G_i of the i-th atom into its atomic energy E_i.
FIG. 1: The neural network proposed by Behler and Parrinello (Fig. 2 in Ref. 17).
The fitting networks for DeepPot-SE are similar to the neural networks in BPNN; the atomic feature vectors G_i are replaced with vectors that are reshaped from the feature matrix D^i of each atom in Eq. 11.
The standard structure of the neural network can be found in many books and articles.3,42 A simple example of a NN with only one hidden layer is shown in Fig. 2. With this NN, the atomic contribution to the molecular potential energy can be expressed as

E_i = b^{(2)} + \sum_{k=1}^{N_h} w^{(2)}_{k} \, f\!\left( b^{(1)}_{k} + \sum_{l=1}^{N_{in}} w^{(1)}_{kl} \, G_{i,l} \right)        (13)

where N_h and N_in are the numbers of nodes in the hidden layer and input layer, respectively. G_i is the concatenated feature vector of the i-th atom from Eqs. 1 and 3 (or 6), w^(1)_kl is the weight connecting node l in the input layer and node k in the hidden layer, and b^(1)_k is the bias of node k in the hidden layer. Similarly, w^(2)_k is the weight that connects node k in the hidden layer and the output layer (only one node), and b^(2) is the bias of the output layer. The activation function f can be an arbitrary nonlinear function but must be differentiable, such as a sigmoid function, a hyperbolic tangent function, a trigonometric function, or an exponential function. The nonlinearity gives the NN its expressive power, and the differentiability ensures that the parameters of the model can be optimized by gradient descent. The second derivatives of the activation functions should also be available if the forces are used for the NN training.43–45 The initial values of the weight and bias parameters can be set randomly and are optimized during a training process using back-propagation.
FIG. 2: A fully-connected feedforward neural network with one hidden layer.
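As an illustration of Eq. 13, the PyTorch sketch below builds a one-hidden-layer atomic network and sums its outputs over atoms as in Eq. 12. The layer sizes, the tanh activation, and the random feature vectors are illustrative choices, not the settings used in the tutorial notebooks.

```python
import torch
import torch.nn as nn

class AtomicNet(nn.Module):
    """One-hidden-layer network mapping an atomic feature vector G_i to E_i (Eq. 13)."""
    def __init__(self, n_in, n_hidden=32):
        super().__init__()
        self.hidden = nn.Linear(n_in, n_hidden)   # w(1), b(1)
        self.out = nn.Linear(n_hidden, 1)         # w(2), b(2)

    def forward(self, g):
        return self.out(torch.tanh(self.hidden(g)))

n_atoms, n_features = 14, 64                      # e.g., a butane-sized toy problem
net = AtomicNet(n_features)
g = torch.randn(n_atoms, n_features)              # one feature vector per atom
e_atoms = net(g)                                  # atomic energies E_i
e_total = e_atoms.sum()                           # molecular energy, Eq. 12
print(e_total.item())
```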
The loss function is defined as the mean squared error (MSE) of the predicted molecular energies with respect to those from reference quantum mechanical calculations,

L = \frac{1}{M} \sum_{m=1}^{M} \left( E_m^{NN} - E_m^{QM} \right)^2        (14)

where M is the batch size, and E_m^NN and E_m^QM are the potential energy predicted by the neural network and the reference electronic energy from a quantum mechanical calculation (ground truth) for the m-th structure, respectively. In the training of machine-learning potentials for driving molecular dynamics simulations, the loss function in Eq. 14 is often augmented by the error in the predicted atomic forces.
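A minimal training-step sketch is shown here. For simplicity the flattened Cartesian coordinates themselves serve as the input features, so that autograd yields forces directly; a real MLP would insert a descriptor layer in between. The weight on the force term, the optimizer settings, and the random reference data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

n_atoms = 14
model = nn.Sequential(nn.Linear(3 * n_atoms, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_f = 0.1                                    # weight of the force loss (illustrative)

# Random data standing in for QM energies/forces of M structures
M = 8
coords = torch.randn(M, 3 * n_atoms, requires_grad=True)
e_ref = torch.randn(M, 1)
f_ref = torch.randn(M, 3 * n_atoms)

for step in range(100):
    optimizer.zero_grad()
    e_pred = model(coords)
    # Forces = -dE/dR, obtained with autograd while keeping the graph for backprop
    f_pred = -torch.autograd.grad(e_pred.sum(), coords, create_graph=True)[0]
    loss = ((e_pred - e_ref) ** 2).mean() + lambda_f * ((f_pred - f_ref) ** 2).mean()
    loss.backward()
    optimizer.step()
print(float(loss))
```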
C. Gaussian Process Regression
Gaussian process regression (GPR) offers an alternative approach to modeling the relationship between molecular descriptors and the potential energy surface (PES).46–55 The developments and applications of GPR for materials and molecules have recently been reviewed by Deringer et al.40 Below, we will provide only a brief review of the GPR formalism that is relevant to this tutorial.
GPR is a non-parametric, kernel-based, stochastic-inference machine learning method.39 Unlike NNs, which are parametric models optimized by minimizing a loss function for a predefined functional form and network architecture, GPR maximizes the likelihood of the observations (such as molecular energies) over a distribution of Gaussian-correlated latent functions. To begin, a prior distribution is assumed as follows:

f(X) \sim \mathcal{N}\left( m(X), K(X, X) \right)        (15)

where X = {x_n} is a set of D-dimensional input vectors, m(X) is the mean of the functions, and K(X, X) is the covariance kernel matrix of the training data set based on a given covariance kernel function that defines the similarity between the two input vectors involved:39

\left[ K(X, X) \right]_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)        (16)

In this tutorial, the covariance function, k(x_n, x_m), in use is the radial basis function:

k(\mathbf{x}_n, \mathbf{x}_m) = \sigma_f^2 \exp\left( -\frac{\left\| \mathbf{x}_n - \mathbf{x}_m \right\|^2}{2 l^2} \right)        (17)

where σ_f is the vertical variation parameter, l is the length scale parameter, and ||x_n − x_m|| is the Euclidean distance between the two input vectors x_n and x_m. A third parameter, σ_n, is introduced to account for a certain level of noise in the observations, which modifies the covariance kernel matrix of the training data as:

K_y = K(X, X) + \sigma_n^2 I        (18)

where I is the identity matrix.39 The hyperparameters, θ = {σ_f, l, σ_n}, are trained by maximizing the log marginal likelihood (written here for a zero prior mean):

\log p(\mathbf{y} \,|\, X, \theta) = -\frac{1}{2} \mathbf{y}^{T} K_y^{-1} \mathbf{y} - \frac{1}{2} \log \left| K_y \right| - \frac{N}{2} \log 2\pi        (19)

where N is the number of training observations.
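The following NumPy sketch assembles the RBF kernel of Eq. 17, the noisy covariance of Eq. 18, and the log marginal likelihood of Eq. 19 for a small random data set; the hyperparameter values are arbitrary and no maximization is performed here.

```python
import numpy as np

def rbf_kernel(x1, x2, sigma_f, length):
    """RBF covariance (Eq. 17) between two sets of input vectors."""
    d2 = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

def log_marginal_likelihood(x, y, sigma_f, length, sigma_n):
    """Log marginal likelihood (Eq. 19) with the noise term of Eq. 18."""
    k_y = rbf_kernel(x, x, sigma_f, length) + sigma_n**2 * np.eye(len(x))
    alpha = np.linalg.solve(k_y, y)
    sign, logdet = np.linalg.slogdet(k_y)
    return -0.5 * y @ alpha - 0.5 * logdet - 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(20, 2))              # 20 two-dimensional inputs
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=20)   # noisy scalar observations
print(log_marginal_likelihood(x, y, sigma_f=1.0, length=0.5, sigma_n=0.1))
```

In practice, this quantity would be handed to an optimizer (e.g., L-BFGS on log-transformed hyperparameters) to determine σ_f, l, and σ_n.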
The expected (E) energy at a new configuration x_* can be predicted by GPR as follows:

\bar{E}(\mathbf{x}_*) = \mathbf{k}_*^{T} K_y^{-1} \mathbf{y}        (20)

where k_* = K(X, x_*). The variance of the predictive distribution can also be determined as:

\mathbb{V}\left[ E(\mathbf{x}_*) \right] = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^{T} K_y^{-1} \mathbf{k}_*        (21)

The forces are then calculated following the analytical gradient of the energy with respect to the Cartesian coordinates:

F_{a\alpha} = -\frac{\partial \bar{E}(\mathbf{x}_*)}{\partial r_{a\alpha}} = -\sum_{d} \frac{\partial \bar{E}(\mathbf{x}_*)}{\partial x_{*,d}} \frac{\partial x_{*,d}}{\partial r_{a\alpha}}        (22)

Here, \bar{E}(x_*) is the mean of the predictive distribution for the energy, x_{*,d} is the d-th component of x_*, and r_{aα} corresponds to the Cartesian direction α on atom a.
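Continuing the sketch above, energy and variance predictions at new points (Eqs. 20 and 21) can be obtained as follows; the data and hyperparameters remain the illustrative ones from before, and forces are omitted here because they additionally require the descriptor Jacobian of Eq. 22.

```python
import numpy as np

def gpr_predict(x_train, y_train, x_new, sigma_f, length, sigma_n):
    """Predictive mean (Eq. 20) and variance (Eq. 21) of a zero-mean GPR model."""
    def rbf(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

    k_y = rbf(x_train, x_train) + sigma_n**2 * np.eye(len(x_train))
    k_star = rbf(x_train, x_new)                       # K(X, x_*)
    alpha = np.linalg.solve(k_y, y_train)
    mean = k_star.T @ alpha                            # Eq. 20
    var = sigma_f**2 - np.sum(k_star * np.linalg.solve(k_y, k_star), axis=0)  # Eq. 21
    return mean, var

rng = np.random.default_rng(2)
x_train = rng.uniform(-1, 1, size=(30, 2))
y_train = np.sin(x_train[:, 0]) * np.cos(x_train[:, 1])
x_new = rng.uniform(-1, 1, size=(5, 2))
mean, var = gpr_predict(x_train, y_train, x_new, sigma_f=1.0, length=0.5, sigma_n=1e-3)
print(mean, var)
```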
Similar to BPNN, permutational invariance can be introduced by adopting the Gaussian approximation potential (GAP) formalism developed by Bartók and co-workers.47 This formalism utilizes a set of linear combination (summation) matrices, L, to combine atomic contributions to the potential energy,

\mathbf{E} = L \, \boldsymbol{\varepsilon}, \qquad \left[ K \right]_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)        (23)

where x_i corresponds to the concatenated feature vector of atom i, ε collects the atomic energy contributions ε(x_i), and K is now a covariance matrix comparing each individual atomic environment. The covariance between total molecular energies is then L K L^T, which takes the place of K(X, X) in Eqs. 18–21.
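To illustrate the role of the summation matrix, the sketch below builds L for a few toy configurations and contracts an atomic-environment kernel into a configuration-level covariance, L K L^T. The per-atom descriptors and the kernel hyperparameters are random stand-ins, and this only shows the bookkeeping, not the GAP implementation of Ref. 47.

```python
import numpy as np

rng = np.random.default_rng(3)

# Three toy configurations with 3, 4, and 2 atoms; each atom has a 5-dim descriptor
atoms_per_config = [3, 4, 2]
descriptors = [rng.normal(size=(n, 5)) for n in atoms_per_config]
x_all = np.vstack(descriptors)                    # all atomic environments stacked

# Atomic-environment kernel K (RBF with unit hyperparameters, illustrative)
d2 = ((x_all[:, None, :] - x_all[None, :, :]) ** 2).sum(-1)
k_atom = np.exp(-0.5 * d2)

# Summation matrix L: L[m, i] = 1 if atomic environment i belongs to configuration m
n_cfg, n_env = len(atoms_per_config), x_all.shape[0]
L = np.zeros((n_cfg, n_env))
start = 0
for m, n in enumerate(atoms_per_config):
    L[m, start:start + n] = 1.0
    start += n

k_config = L @ k_atom @ L.T      # covariance between total (per-configuration) energies
print(k_config.shape)            # (3, 3)
```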
In addition to being trained on energy-only observations y,54 the GPR model can be improved by including force observations in the training, as demonstrated in our recent QM/MM work.55 Because the derivatives of Gaussian processes are also Gaussian processes, the observation set can be extended to include a set of derivative observations.56 Here, we use the nuclear gradient components, g = {∂E_m/∂r_{aα}}, as our observable derivatives, and include them in an extended observation vector y_ext:

\mathbf{y}_{\mathrm{ext}} = \begin{pmatrix} \mathbf{y} \\ \mathbf{g} \end{pmatrix}        (24)

The kernel is similarly extended, following the formalism introduced by Meyer and Hauser,57 to account for the transformation between the Cartesian and internal (descriptor) input spaces:

K_{\mathrm{ext}} = \begin{pmatrix} K & \partial K / \partial \mathbf{r}'^{\,T} \\ \partial K / \partial \mathbf{r} & \partial^2 K / \partial \mathbf{r} \, \partial \mathbf{r}' \end{pmatrix}        (25)

where the kernel derivatives with respect to the Cartesian coordinates are obtained from the derivatives with respect to the descriptors through the chain rule, e.g., ∂k(x_n, x_m)/∂r_{aα} = Σ_d (∂k/∂x_{m,d}) (∂x_{m,d}/∂r_{aα}).
After the model is optimized, the expected energy at a new configuration x_* can be predicted by GPR according to:

\bar{E}(\mathbf{x}_*) = \mathbf{k}_{*,\mathrm{ext}}^{T} \, K_{y,\mathrm{ext}}^{-1} \, \mathbf{y}_{\mathrm{ext}}        (26)

where k_{*,ext} collects the covariances (and covariance derivatives) between x_* and all training energies and gradients, and K_{y,ext} = K_ext + σ_n^2 I. The associated predictive variance is given by:

\mathbb{V}\left[ E(\mathbf{x}_*) \right] = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_{*,\mathrm{ext}}^{T} \, K_{y,\mathrm{ext}}^{-1} \, \mathbf{k}_{*,\mathrm{ext}}        (27)

The prediction of the expected gradient is given by:

\frac{\partial \bar{E}(\mathbf{x}_*)}{\partial r_{*,a\alpha}} = \left( \frac{\partial \mathbf{k}_{*,\mathrm{ext}}}{\partial r_{*,a\alpha}} \right)^{T} K_{y,\mathrm{ext}}^{-1} \, \mathbf{y}_{\mathrm{ext}}        (28)

with the associated variance being:

\mathbb{V}\left[ \frac{\partial E(\mathbf{x}_*)}{\partial r_{*,a\alpha}} \right] = \frac{\partial^2 k(\mathbf{x}_*, \mathbf{x}_*)}{\partial r_{*,a\alpha} \, \partial r_{*,a\alpha}'} - \left( \frac{\partial \mathbf{k}_{*,\mathrm{ext}}}{\partial r_{*,a\alpha}} \right)^{T} K_{y,\mathrm{ext}}^{-1} \left( \frac{\partial \mathbf{k}_{*,\mathrm{ext}}}{\partial r_{*,a\alpha}} \right)        (29)
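A compact sketch of energy-plus-gradient GPR in the spirit of Eqs. 24–26 is given below for the simplest case in which the descriptors are the coordinates themselves (so the Jacobian in Eq. 25 is the identity). The RBF derivatives are written out analytically; the test function, hyperparameters, and noise level are illustrative assumptions, and the variances of Eqs. 27 and 29 are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma_f, ell, noise = 1.0, 1.0, 1e-4               # illustrative hyperparameters

def rbf(a, b):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

# Training data: a smooth 2D test surface with analytic gradients
x = rng.uniform(-2, 2, size=(25, 2))
y = np.sin(x[:, 0]) * np.cos(x[:, 1])
g = np.stack([np.cos(x[:, 0]) * np.cos(x[:, 1]),
              -np.sin(x[:, 0]) * np.sin(x[:, 1])], axis=1)
n, d = x.shape

# Extended observation vector (Eq. 24) and block kernel (Eq. 25)
y_ext = np.concatenate([y, g.reshape(-1)])
diff = (x[:, None, :] - x[None, :, :]) / ell**2                     # (n, n, d)
k_ff = rbf(x, x)
k_fg = (k_ff[:, :, None] * diff).reshape(n, n * d)                  # cov(E_n, gradients of m)
k_gg = (k_ff[:, :, None, None]
        * (np.eye(d)[None, None, :, :] / ell**2
           - diff[:, :, :, None] * diff[:, :, None, :]))
k_gg = k_gg.transpose(0, 2, 1, 3).reshape(n * d, n * d)             # cov(grad_n, grad_m)
k_ext = np.block([[k_ff, k_fg], [k_fg.T, k_gg]]) + noise * np.eye(n + n * d)

# Predict the energy at new configurations (Eq. 26)
x_new = rng.uniform(-2, 2, size=(5, 2))
k_star_f = rbf(x, x_new)                                            # cov(E_n, E_*)
diff_star = (x_new[None, :, :] - x[:, None, :]) / ell**2            # (n, n_new, d)
k_star_g = (k_star_f[:, :, None] * diff_star).transpose(0, 2, 1).reshape(n * d, -1)
k_star_ext = np.vstack([k_star_f, k_star_g])
e_pred = k_star_ext.T @ np.linalg.solve(k_ext, y_ext)
print(np.abs(e_pred - np.sin(x_new[:, 0]) * np.cos(x_new[:, 1])).max())
```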
III. LESSONS ON THE MLP TRAINING FOR MODEL SYSTEMS
The tutorials developed here are based on Jupyter notebooks and Google Colaboratory. Jupyter notebooks are a powerful and versatile tool popular in both research and education. They allow users to combine code, text, equations, and visualizations in a single interactive document, making them a great tool for exploring and understanding complex concepts. The interactive nature of Jupyter notebooks makes them particularly useful in education, as they allow students to experiment with code and see the results of their work in real time. Many universities and other educational institutions use Jupyter notebooks in their courses. Google Colaboratory, or Colab for short, is a popular hosted version of Jupyter notebooks that allows users to access powerful computing resources without having to set up and maintain their own infrastructure. Overall, Jupyter notebooks and Colab are valuable tools that can make learning more engaging and effective.
The tutorials will cover several important topics in machine learning and molecular modeling. In Lessons 1 and 2, we will introduce the concepts of neural networks and GPR and use these models to reproduce the two-dimensional Müller-Brown potential energy surface. In Lessons 3 and 4, we will introduce two molecular representations, the Behler-Parrinello symmetry functions and the Deep Potential, and explore their properties using the butane molecule as a test case. In the final three lessons (Lessons 5–7), we will combine these machine learning models and molecular representations to train several machine learning potentials that can accurately model the Claisen rearrangement reaction in the gas phase. Throughout the tutorials, we will provide hands-on examples that will allow students to apply what they have learned and gain practical experience with these important tools.
Lesson 1: Basic Feedforward Neural Network Models
We introduce the Müller-Brown potential energy surface and feedforward neural networks. A feedforward neural network is trained using grid points uniformly sampled from the Müller-Brown potential energy surface (Fig. 3a). FNN models are trained using energies only (Fig. 3b) and using energies together with gradients58 (Fig. 3c).
FIG. 3: (a) Reference Müller-Brown PES, (b) predicted PES from the FNN trained with energies only, and (c) predicted PES from the FNN trained with energies and gradients.
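For reference, the Müller-Brown surface used in Lessons 1 and 2 can be evaluated with the short function below, which uses the standard parameterization of the potential; the training grids and ranges in the notebooks may of course differ.

```python
import numpy as np

def muller_brown(x, y):
    """Müller-Brown potential V(x, y) with its standard parameters."""
    A  = np.array([-200.0, -100.0, -170.0, 15.0])
    a  = np.array([-1.0, -1.0, -6.5, 0.7])
    b  = np.array([0.0, 0.0, 11.0, 0.6])
    c  = np.array([-10.0, -10.0, -6.5, 0.7])
    x0 = np.array([1.0, 0.0, -0.5, -1.0])
    y0 = np.array([0.0, 0.5, 1.5, 1.0])
    return np.sum(A * np.exp(a * (x - x0) ** 2
                             + b * (x - x0) * (y - y0)
                             + c * (y - y0) ** 2))

# Value near the deepest minimum (approximately (-0.558, 1.442))
print(muller_brown(-0.558, 1.442))
```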
Lesson 2: Basic Gaussian Process Regression Models
A GPR model is used to construct a predicted surface (Fig. 4) for the Müller-Brown potential energy surface, similar to Lesson 1. Emphasis is placed on the GPR hyperparameters, the marginal likelihood, and the deviation and predictive variance of the model relative to the analytical surface. Additional sections show how gradients of a surface can be predicted using GPR.
FIG. 4: (a) Reference PES, (b) GPR predicted PES, (c) difference between the reference and GPR predicted surfaces, and (d) predicted variance for the Müller-Brown PES.
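As an aside, the same kind of fit can be reproduced with an off-the-shelf GPR implementation. The snippet below uses scikit-learn (not the tutorial's own code) with an RBF-plus-white-noise kernel on points sampled from the Müller-Brown function; the sampling domain, kernel initial values, and sample counts are arbitrary.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# Compact Müller-Brown evaluator (same parameters as in the Lesson 1 sketch)
A, a, b, c = [-200, -100, -170, 15], [-1, -1, -6.5, 0.7], [0, 0, 11, 0.6], [-10, -10, -6.5, 0.7]
x0, y0 = [1, 0, -0.5, -1], [0, 0.5, 1.5, 1]
def mb(x, y):
    return sum(A[k] * np.exp(a[k] * (x - x0[k])**2 + b[k] * (x - x0[k]) * (y - y0[k])
                             + c[k] * (y - y0[k])**2) for k in range(4))

rng = np.random.default_rng(5)
xy = rng.uniform([-1.5, -0.5], [1.0, 2.0], size=(200, 2))   # sample the 2D domain
energies = np.array([mb(p[0], p[1]) for p in xy])

kernel = ConstantKernel(1.0) * RBF(length_scale=0.3) + WhiteKernel(noise_level=1e-3)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(xy, energies)
mean, std = gpr.predict(np.array([[0.0, 0.5], [-0.5, 1.5]]), return_std=True)
print(mean, std)
```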
Lesson 3: Behler-Parrinello Symmetry Functions for Feature Extraction
We introduce Behler-Parrinello17 and ANI methods18 for feature extraction using symmetry functions. This lesson discusses the importance of symmetry functions (Fig. 5) for ensuring the energy of a molecule described by a neural network is rotationally and translationally invariant. BP and ANI are then used for feature extraction of a butane molecule. The parameters are set to values used in the ANI model.18
FIG. 5: The BP angular symmetry functions (left) compared to the ANI angular symmetry functions (right).
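The invariance property emphasized in this lesson can be checked numerically: the sketch below computes a simple radial symmetry function for a random four-atom geometry, applies a random orthogonal transformation plus translation, and verifies that the descriptors are unchanged. The symmetry-function hyperparameters are arbitrary, and any of the descriptors from Lessons 3 and 4 could be tested in the same way.

```python
import numpy as np

rng = np.random.default_rng(6)

def radial_sf(coords, eta=4.0, r_s=0.5, r_c=5.2):
    """Simple radial symmetry function (Eq. 1) with the cosine cutoff (Eq. 2)."""
    g = np.zeros(len(coords))
    for i, ri in enumerate(coords):
        for j, rj in enumerate(coords):
            if i == j:
                continue
            r = np.linalg.norm(rj - ri)
            fc = 0.5 * (np.cos(np.pi * r / r_c) + 1.0) if r <= r_c else 0.0
            g[i] += np.exp(-eta * (r - r_s) ** 2) * fc
    return g

coords = rng.normal(size=(4, 3))
# Random orthogonal transformation (QR of a random matrix) and translation
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = coords @ q.T + rng.normal(size=3)

print(np.allclose(radial_sf(coords), radial_sf(moved)))   # True: descriptors unchanged
```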
Lesson 4: DeepPot Representation for Feature Extraction
We provide an overview of the DeepPot MLP training workflow (Fig. 6) and discuss the significance of the embedding matrices for feature extraction in an embedding neural network. DeepPot is then used for feature extraction on the butane molecular configurations from Lesson 3.
FIG. 6: Schematic of the DeepPot-SE feature extraction process for the i-th atom.
Lesson 5: BP-FNN Models for the Claisen Rearrangement
We combine the Behler-Parrinello symmetry functions with a feedforward neural network to construct a neural network potential that is rotationally and translationally invariant. The BP-FNN MLP is trained using geometries relevant to a Claisen rearrangement reaction (Fig. 7). Following the training, the model predictions are compared to reference values calculated using density functional theory (DFT) with the B3LYP functional and the 6-31+G* basis set.
FIG. 7: Claisen rearrangement modeled in Lessons 5–7.
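The lesson's BP-FNN model follows the element-wise architecture of Eq. 12 and Fig. 1; a compressed PyTorch sketch of that structure is shown below, with one subnetwork per element and atomic energies summed into the molecular energy. The network sizes, the element list, and the random AEVs stand in for the actual tutorial settings.

```python
import torch
import torch.nn as nn

class BPNN(nn.Module):
    """Element-wise atomic networks whose outputs are summed as in Eq. 12."""
    def __init__(self, n_aev, elements=("H", "C", "O"), n_hidden=64):
        super().__init__()
        self.nets = nn.ModuleDict({
            el: nn.Sequential(nn.Linear(n_aev, n_hidden), nn.Tanh(),
                              nn.Linear(n_hidden, 1))
            for el in elements
        })

    def forward(self, species, aevs):
        # species: list of element symbols; aevs: (n_atoms, n_aev) feature vectors
        atomic_energies = [self.nets[el](aev) for el, aev in zip(species, aevs)]
        return torch.stack(atomic_energies).sum()

# Allyl vinyl ether-sized toy input: 14 atoms (5 C, 1 O, 8 H) with random 128-dim AEVs
species = ["C"] * 5 + ["O"] + ["H"] * 8
aevs = torch.randn(14, 128)
model = BPNN(n_aev=128)
print(model(species, aevs).item())
```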
Lesson 6: DeepPot-FNN Models for the Claisen Rearrangement
We combine DeepPot with an FNN to describe the molecular configurations along the Claisen rearrangement reaction pathway from Lesson 5. The predictions made by the DeepPot-FNN MLP model we train are again compared to the DFT reference results.
Lesson 7: BP-GPR Models for the Claisen Rearrangement
Our final lesson combines the BP and ANI symmetry functions with GPR to create an MLP model. The BP-GPR MLP model is trained and tested using the same reactive system as in Lessons 5 and 6.
IV. DISCUSSION
This tutorial focuses on the training of MLPs for describing the ground-state potential energy surface of a reactive system. It should be noted that our focus is placed on the readability of the code implementation, rather than on software modularity or run-time efficiency. After learning the basics through this tutorial, readers can adopt advanced software platforms, such as DeePMD-kit,22,59,60 ænet,61,62 AMP,63 MLatom, PhysNet,37 SchNetPack,36 sGDML,64 TorchANI,20 and TorchMD-NET65 for their own machine learning model development. It should also be noted that there are several other areas of research that are not covered. These include:
MLPs for describing electronic excited states. A comprehensive review on this topic can be found in Ref. 11. In general, it would require the training of several MLPs, one for each adiabatic or diabatic electronic surface, as well as, in the former case, the training of ML models for the non-adiabatic coupling.66–72
Active learning/adaptive sampling schemes for training MLPs for molecular dynamics simulations. This can involve (a) the estimation of prediction uncertainty using query-by-committee19,69,71,73 and other approaches74 and (b) the use of uncertainty estimates, as in hyperactive learning, to bias sampling towards high-uncertainty regions in the generation of a training set.75,76
Efficient protocols for generating MLPs for QM/MM simulations. It is not practical to incorporate all MM atoms (in addition to the QM atoms) in the training of these potentials, as this would lead to an explosively large array of descriptors. The most straightforward remedy is to include only MM atoms within a distance cutoff from the QM region in the MLP training.77–79 In general, a smooth distance cutoff is necessary to ensure a smooth potential energy surface.77,80,81 Alternatively, one can adopt an implicit description of the MM environment through the use of MM-perturbed semiempirical QM charges,82,83 the MM electrostatic potential or field at the QM atom positions,84,85 or polarizable embedding.86 One can also use both the MM electrostatic potential and field in the training of QM/MM MLPs,81,87 using our QM/MM-AC scheme80 for separating inner and outer MM atoms and projecting outer MM charges onto inner MM atom positions.80,88,89
These topics will be covered in future advanced tutorials on MLPs.
V. CONCLUSIONS
A Colab tutorial was developed to showcase the implementation of basic machine learning models (neural networks and Gaussian process regression) for reactive systems (using the Claisen rearrangement as a model system). We hope this tutorial will make it easier for undergraduate and graduate students to become familiar with the basics of machine learning techniques in the context of atomistic modeling.
ACKNOWLEDGEMENTS
We would like to dedicate this tutorial to Prof. Elfi Kraka on the occasion of her 70th birthday. Her dedication to theoretical and computational chemistry research and education has inspired us for years. We acknowledge the support from the National Institutes of Health through grants R01GM135392 (Shao and Pu), R44GM133270, and P30GM145423 (Shao). We also acknowledge the support from the National Science Foundation through grant CHE-2102071 (Shao) and the National Natural Science Foundation of China through grant 22073030 (Mei). Mao acknowledges the support from San Diego State University Startup Fund. Shao acknowledges the support from the OU Data Institute for the Societal Challenges monthly seed funding program. The authors thank the OU Supercomputing Center for Education & Research (OSCER) for the computational resources. Pu acknowledges the Indiana University Pervasive Technology Institute (supported in part by Lilly Endowment, Inc.) for providing supercomputing resources on BigRed3 and Carbonate that have facilitated the research reported in this paper.
References
- 1.Behler J, “Neural Network Potential-energy Surfaces in Chemistry: A Tool for Large-scale Simulations,” Phys. Chem. Chem. Phys 13, 17930–17955 (2011). [DOI] [PubMed] [Google Scholar]
- 2.Manzhos S, Dawes R, and Carrington T, “Neural Network-Based Approaches for Building High Dimensional and Quantum Dynamics-Friendly Potential Energy Surfaces,” Int. J. Quantum Chem 115, 1012–1020 (2015). [Google Scholar]
- 3.Behler J, “Constructing High-dimensional Neural Network Potentials: A Tutorial Review,” Int. J. Quantum Chem 115, 1032–1050 (2015). [Google Scholar]
- 4.Behler J, “Perspective: Machine Learning Potentials for Atomistic Simulations,” J. Chem. Phys 145, 170901 (2016). [DOI] [PubMed] [Google Scholar]
- 5.Jiang B, Li J, and Guo H, “Potential Energy Surfaces from High Fidelity Fitting of ab initio Points: the Permutation Invariant Polynomial - Neural Network Approach,” Int. Rev. Phys. Chem 35, 479–506 (2016). [Google Scholar]
- 6.Butler KT, Davies DW, Cartwright H, Isayev O, and Walsh A, “Machine Learning for Molecular and Materials Science,” Nature 559, 547–555 (2018). [DOI] [PubMed] [Google Scholar]
- 7.Dral PO, “Quantum Chemistry in the Age of Machine Learning,” J. Phys. Chem. Lett 11, 2336–2347 (2020). [DOI] [PubMed] [Google Scholar]
- 8.Zhang J, Lei Y-K, Zhang Z, Chang J, Li M, Han X, Yang L, Yang YI, and Gao YQ, “A Perspective on Deep Learning for Molecular Modeling and Simulations,” J. Phys. Chem. A 124, 6745–6763 (2020). [DOI] [PubMed] [Google Scholar]
- 9.Pinheiro M, Ge F, Ferré N, Dral PO, and Barbatti M, “Choosing the Right Molecular Machine Learning Potential,” Chem. Sci 12, 14396–14413 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Westermayr J, Gastegger M, Schütt KT, and Maurer RJ, “Perspective on Integrating Machine Learning into Computational Chemistry and Materials Science,” J. Chem. Phys 154, 230903 (2021). [DOI] [PubMed] [Google Scholar]
- 11.Westermayr J and Marquetand P, “Machine Learning for Electronically Excited States of Molecules,” Chem. Rev 121, 9873–9926 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Kulik HJ, Hammerschmidt T, Schmidt J, Botti S, Marques MAL, Boley M, Scheffler M, Todorović M, Rinke P, Oses C, Smolyanyuk A, Curtarolo S, Tkatchenko A, Bartók AP, Manzhos S, Ihara M, Carrington T, Behler J, Isayev O, Veit M, Grisafi A, Nigam J, Ceriotti M, Schütt KT, Westermayr J, Gastegger M, Maurer RJ, Kalita B, Burke K, Nagai R, Akashi R, Sugino O, Hermann J, Noé F, Pilati S, Draxl C, Kuban M, Rigamonti S, Scheidgen M, Esters M, Hicks D, Toher C, Balachandran PV, Tamblyn I, Whitelam S, Bellinger C, and Ghiringhelli LM, “Roadmap on Machine learning in Electronic Structure,” Electron. Struct 4, 023004 (2022). [Google Scholar]
- 13.Gokcan H and Isayev O, “Learning Molecular Potentials with Neural Networks,” WIREs Comput. Mol. Sci 12 (2022), 10.1002/wcms.1564. [DOI] [Google Scholar]
- 14.Bowman JM, Qu C, Conte R, Nandi A, Houston PL, and Yu Q, “Delta-Machine Learned Potential Energy Surfaces and Force Fields,” J. Chem. Theory Comput 19, 1–17 (2023). [DOI] [PubMed] [Google Scholar]
- 15.Biswas R, Lourderaj U, and Sathyamurthy N, “Artificial Neural Networks and Their Utility in Fitting Potential Energy Curves and Surfaces and Related Problems,” J. Chem. Sci 135, 22 (2023). [Google Scholar]
- 16.Bhatia H, Aydin F, Carpenter TS, Lightstone FC, Bremer P-T, Ingólfsson HI, Nissley DV, and Streitz FH, “The Confluence of Machine Learning and Multiscale Simulations,” Curr. Op. Struct. Biol 80, 102569 (2023). [DOI] [PubMed] [Google Scholar]
- 17.Behler J and Parrinello M, “Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces,” Physical Review Letters 98, 146401 (2007). [DOI] [PubMed] [Google Scholar]
- 18.Smith JS, Isayev O, and Roitberg AE, “ANI-1: An Extensible Neural Network Potential with DFT Accuracy at Force Field Computational Cost,” Chem. Sci 8, 3192–3203 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Smith JS, Nebgen BT, Zubatyuk R, Lubbers N, Devereux C, Barros K, Tretiak S, Isayev O, and Roitberg AE, “Approaching Coupled Cluster Accuracy with a General-purpose Neural Network Potential Through Transfer Learning,” Nat. Commun 10, 2903 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Gao X, Ramezanghorbani F, Isayev O, Smith JS, and Roitberg AE, “TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials,” J. Chem. Inf. Model 60, 3408–3415 (2020). [DOI] [PubMed] [Google Scholar]
- 21.Zhang L, Han J, Wang H, Saidi WA, Car R, and E W, “End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems,” in NeurIPS (2018). [Google Scholar]
- 22.Wang H, Zhang L, Han J, and E W, “DeePMD-kit: A Deep Learning Package for Many-Body Potential Energy Representation and Molecular Dynamics,” Comput. Phys. Commun 228, 178–184 (2018). [Google Scholar]
- 23.Zhang Y, Wang H, Chen W, Zeng J, Zhang L, Wang H, and E W, “DP-GEN: A Concurrent Learning Platform for the Generation of Reliable Deep Learning Based Potential Energy Models,” Comput. Phys. Commun 253, 107206 (2020). [Google Scholar]
- 24.Zeng J, Tao Y, Giese TJ, and York DM, “QDπ: A Quantum Deep Potential Interaction Model for Drug Discovery,” J. Chem. Theory Comput 19, 1261–1275 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Zeng J, Tao Y, Giese TJ, and York DM, “Modern semiempirical electronic structure methods and machine learning potentials for drug discovery: Conformers, tautomers, and protonation states,” J. Chem. Phys 158, 124110 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Manzhos S, Wang X, Dawes R, and Carrington T, “A Nested Molecule-Independent Neural Network Approach for High-Quality Potential Fits,” J. Phys. Chem. A 110, 5295–5304 (2006). [DOI] [PubMed] [Google Scholar]
- 27.Rupp M, Tkatchenko A, Müller K-R, and von Lilienfeld OA, “Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning,” Phys. Rev. Lett 108, 058301 (2012). [DOI] [PubMed] [Google Scholar]
- 28.Jiang B and Guo H, “Permutation Invariant Polynomial Neural Network Approach to Fitting Potential Energy Surfaces,” J. Chem. Phys 139, 054112 (2013). [DOI] [PubMed] [Google Scholar]
- 29.Qu C, Houston PL, Conte R, Nandi A, and Bowman JM, “Breaking the Coupled Cluster Barrier for Machine-Learned Potentials of Large Molecules: The Case of 15-Atom Acetylacetone,” J. Phys. Chem. Lett 12, 4902–4909 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Hansen K, Biegler F, Ramakrishnan R, Pronobis W, von Lilienfeld OA, Müller K-R, and Tkatchenko A, “Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space,” J. Phys. Chem. Lett 6, 2326–2331 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Dral PO, Owens A, Yurchenko SN, and Thiel W, “Structure-Based Sampling and Self-Correcting Machine Learning for Accurate Calculations of Potential Energy Surfaces and Vibrational Levels,” J. Chem. Phys 146, 244108 (2017). [DOI] [PubMed] [Google Scholar]
- 32.Faber FA, Christensen AS, Huang B, and von Lilienfeld OA, “Alchemical and Structural Distribution Based Representation for Universal Quantum Machine Learning,” J. Chem. Phys 148, 241717 (2018). [DOI] [PubMed] [Google Scholar]
- 33.Gastegger M, Schwiedrzik L, Bittermann M, Berzsenyi F, and Marquetand P, “wACSF-Weighted Atom-centered Symmetry Functions as Descriptors in Machine Learning Potentials,” J. Chem. Phys 148, 241709 (2018). [DOI] [PubMed] [Google Scholar]
- 34.Schütt KT, Arbabzadah F, Chmiela S, Müller KR, and Tkatchenko A, “Quantum-chemical Insights from Deep Tensor Neural Networks,” Nat. Commun 8, 13890 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Schütt KT, Sauceda HE, Kindermans P-J, Tkatchenko A, and Müller K-R, “SchNet - A Deep Learning Architecture for Molecules and Materials,” J. Chem. Phys 148, 241722 (2018). [DOI] [PubMed] [Google Scholar]
- 36.Schütt KT, Kessel P, Gastegger M, Nicoli KA, Tkatchenko A, and Müller K-R, “SchNetPack: A Deep Learning Toolbox For Atomistic Systems,” J. Chem. Theory Comput 15, 448–455 (2019). [DOI] [PubMed] [Google Scholar]
- 37.Unke OT and Meuwly M, “PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments, and Partial Charges,” J. Chem. Theory Comput 15, 3678–3693 (2019). [DOI] [PubMed] [Google Scholar]
- 38.Batzner S, Musaelian A, Sun L, Geiger M, Mailoa JP, Kornbluth M, Molinari N, Smidt TE, and Kozinsky B, “E (3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials,” Nat. Commun 13, 2453 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Rasmussen CE and Williams CK, Gaussian Processes for Machine Learning (MIT press; Cambridge, 2006). [Google Scholar]
- 40.Deringer VL, Bartok AP, Bernstein N, Wilkins DM, Ceriotti M, and Csanyi G, “Gaussian Process Regression for Materials and Molecules,” Chem. Rev 121, 10073–10141 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Behler J, “Atom-centered Symmetry Functions for Constructing High-dimensional Neural Network Potentials,” J. Chem. Phys 134, 074106 (2011). [DOI] [PubMed] [Google Scholar]
- 42.Hastie T, Tibshirani R, and Friedman J, The Elements of Statistical Learning (Springer; New York, 2009). [Google Scholar]
- 43.Artrith N and Behler J, “High-dimensional Neural Network Potentials for Metal Surfaces: A Prototype Study for Copper,” Phys. Rev. B 85, 045439 (2012). [Google Scholar]
- 44.Witkoskie JB and Doren DJ, “Neural Network Models of Potential Energy Surfaces: Prototypical Examples,” J. Chem. Theory Comput 1, 14–23 (2005). [DOI] [PubMed] [Google Scholar]
- 45.Pukrittayakamee A, Malshe M, Hagan M, Raff LM, Narulkar R, Bukkapatnam S, and Komanduri R, “Simultaneous Fitting of a Potential-energy Surface and Its Corresponding Force Fields Using Feedforward Neural Networks,” J. Chem. Phys 130, 134101 (2009). [DOI] [PubMed] [Google Scholar]
- 46.Kolb A, Marshall P, Zhao B, Jiang B, and Guo H, “Representing Global Reactive Potential Energy Surfaces Using Gaussian Processes,” J. Phys. Chem. A 121, 2552–2557 (2017). [DOI] [PubMed] [Google Scholar]
- 47.Bartók AP, Payne MC, Kondor R, and Csányi G, “Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons,” Phys. Rev. Lett 104, 136403 (2010). [DOI] [PubMed] [Google Scholar]
- 48.De S, Bartók AP, Csányi G, and Ceriotti M, “Comparing Molecules and Solids Across Structural and Alchemical Space,” Phys. Chem. Chem. Phys 18, 13754–13769 (2016). [DOI] [PubMed] [Google Scholar]
- 49.Uteva E, Graham RS, Wilkinson RD, and Wheatley RJ, “Active Learning in Gaussian Process Interpolation of Potential Energy Surfaces,” J. Chem. Phys 149, 174114 (2018). [DOI] [PubMed] [Google Scholar]
- 50.Rossi K, Jurásková V, Wischert R, Garel L, Corminbœuf C, and Ceriotti M, “Simulating Solvation and Acidity in Complex Mixtures with First-Principles Accuracy: The Case of CH3SO3H and H2O2 in Phenol,” J. Chem. Theory Comput 16, 5139–5149 (2020). [DOI] [PubMed] [Google Scholar]
- 51.Christensen AS, Bratholm LA, Faber FA, and Anatole von Lilienfeld O, “FCHL Revisited: Faster and More Accurate Quantum Machine Learning,” J. Chem. Phys 152, 044107 (2020). [DOI] [PubMed] [Google Scholar]
- 52.Vandermause J, Torrisi SB, Batzner S, Xie Y, Sun L, Kolpak AM, and Kozinsky B, “On-the-fly Active Learning of Interpretable Bayesian Force Fields for Atomistic Rare Events,” Npj Comput. Mater 6, 20 (2020). [Google Scholar]
- 53.Symons BCB, Bane MK, and Popelier PLA, “DL_FFLUX: A Parallel Quantum Chemical Topology Force Field,” J. Chem. Theory Comput 17, 7043–7055 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Snyder R, Kim B, Pan X, Shao Y, and Pu J, “Bridging Semiempirical and Ab Initio QM/MM Potentials by Gaussian Process Regression and Its Sparse Variants for Free Energy Simulation,” J. Chem. Phys 159, 054107 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Snyder R, Kim B, Pan X, Shao Y, and Pu J, “Facilitating Ab initio QM/MM Free Energy Simulations by Gaussian Process Regression with Derivative Observations,” Phys. Chem. Chem. Phys 24, 25134–25143 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Solak E, Murray-Smith R, Leithead WE, Leith DJ, and Rasmussen CE, “Derivative Observations in Gaussian Process Models of Dynamic Systems,” NIPS 15, 1033–1040 (2003). [Google Scholar]
- 57.Meyer R and Hauser AW, “Geometry Optimization Using Gaussian Process Regression in Internal Coordinate Systems,” J. Chem. Phys 152, 084112 (2020). [DOI] [PubMed] [Google Scholar]
- 58.Smith JS, Lubbers N, Thompson AP, and Barros K, “Simple and efficient algorithms for training machine learning potentials to force data,” (2020), 10.48550/ARXIV.2006.05475, publisher: arXiv Version Number: 1. [DOI] [Google Scholar]
- 59.Liang W, Zeng J, York DM, Zhang L, and Wang H, “Learning DeePMD-Kit: A Guide to Building Deep Potential Models,” in A Practical Guide to Recent Advances in Multiscale Modeling and Simulation of Biomolecules, edited by Wang Y and Zhou R (AIP Publishing; LLCMelville, New York, 2023) pp. 6–1–6–20. [Google Scholar]
- 60.Zeng J, Zhang D, Lu D, Mo P, Li Z, Chen Y, Rynik M, Huang L, Li Z, Shi S, Wang Y, Ye H, Tuo P, Yang J, Ding Y, Li Y, Tisi D, Zeng Q, Bao H, Xia Y, Huang J, Muraoka K, Wang Y, Chang J, Yuan F, Bore SL, Cai C, Lin Y, Wang B, Xu J, Zhu J-X, Luo C, Zhang Y, Goodall REA, Liang W, Singh AK, Yao S, Zhang J, Wentzcovitch R, Han J, Liu J, Jia W, York DM, E W, Car R, Zhang L, and Wang H, “DeePMD-kit v2: A Software Package for Deep Potential Models,” J. Chem. Phys 159, 054801 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Artrith N and Urban A, “An Implementation of Artificial Neural-network Potentials for Atomistic Materials Simulations: Performance for TiO2,” Comput. Mat. Sci 114, 135–150 (2016). [Google Scholar]
- 62.López-Zorrilla J, Aretxabaleta XM, Yeu IW, Etxebarria I, Manzano H, and Artrith N, “ænet-PyTorch: A GPU-supported implementation for machine learning atomic potentials training,” J. Chem. Phys 158, 164105 (2023). [DOI] [PubMed] [Google Scholar]
- 63.Khorshidi A and Peterson AA, “AMP: A Modular Approach to Machine Learning in Atomistic Simulations,” Comput. Phys. Commun 207, 310–324 (2016). [Google Scholar]
- 64.Chmiela S, Sauceda HE, Poltavsky I, Müller K-R, and Tkatchenko A, “sGDML: Constructing Accurate and Data Efficient Molecular Force Fields Using Machine Learning,” Comput. Phys. Commun 240, 38–45 (2019). [Google Scholar]
- 65.Doerr S, Majewski M, Pérez A, Krämer A, Clementi C, Noé F, Giorgino T, and Fabritiis GD, “TorchMD: A Deep Learning Framework for Molecular Simulations,” (2020), arXiv:2012.12106 [physics.chem-ph]. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Chen W-K, Liu X-Y, Fang W-H, Dral PO, and Cui G, “Deep Learning for Nonadiabatic Excited-State Dynamics,” J. Phys. Chem. Lett 9, 6702–6708 (2018). [DOI] [PubMed] [Google Scholar]
- 67.Hu D, Xie Y, Li X, Li L, and Lan Z, “Inclusion of Machine Learning Kernel Ridge Regression Potential Energy Surfaces in On-the-Fly Nonadiabatic Molecular Dynamics Simulation,” J. Phys. Chem. Lett 9, 2725–2732 (2018). [DOI] [PubMed] [Google Scholar]
- 68.Dral PO, Barbatti M, and Thiel W, “Nonadiabatic Excited-State Dynamics with Machine Learning,” J. Phys. Chem. Lett 9, 5660–5663 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Westermayr J, Gastegger M, Menger MFSJ, Mai S, González L, and Marquetand P, “Machine Learning Enables Long Time Scale Molecular Photodynamics Simulations,” Chem. Sci 10, 8100–8107 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Shen Y and Yarkony DR, “Construction of Quasi-diabatic Hamiltonians That Accurately Represent ab Initio Determined Adiabatic Electronic States Coupled by Conical Intersections for Systems on the Order of 15 Atoms. Application to Cyclopentoxide Photoelectron Detachment in the Full 39 Degrees of Freedom,” J. Phys. Chem. A 124, 4539–4548 (2020). [DOI] [PubMed] [Google Scholar]
- 71.Li J, Reiser P, Boswell BR, Eberhard A, Burns NZ, Friederich P, and Lopez SA, “Automatic Discovery of Photoisomerization Mechanisms with Nanosecond Machine Learning Photodynamics Simulations,” Chem. Sci 12, 5302–5314 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Ha J-K, Kim K, and Min SK, “Machine Learning-Assisted Excited State Molecular Dynamics with the State-Interaction State-Averaged Spin-Restricted Ensemble-Referenced Kohn-Sham Approach,” J. Chem. Theory Comput 17, 694–702 (2021). [DOI] [PubMed] [Google Scholar]
- 73.Seung HS, Opper M, and Sompolinsky H, “Query by Committee,” in Proceedings of the fifth annual workshop on Computational learning theory (ACM, Pittsburgh Pennsylvania USA, 1992) pp. 287–294. [Google Scholar]
- 74.Tan AR, Urata S, Goldman S, Dietschreit JCB, and Gómez-Bombarelli R, “Single-model Uncertainty Quantification in Neural Network Potentials Does not Consistently Outperform Model Ensembles,” (2023), 10.48550/ARXIV.2305.01754, publisher: arXiv Version Number: 1. [DOI] [Google Scholar]
- 75.van der Oord C, Sachs M, Kovács DP, Ortner C, and Csányi G, “Hyperactive Learning (HAL) for Data-Driven Interatomic Potentials,” (2022), 10.48550/ARXIV.2210.04225, publisher: arXiv Version Number: 2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Kulichenko M, Barros K, Lubbers N, Li YW, Messerly R, Tretiak S, Smith JS, and Nebgen B, “Uncertainty-Driven Dynamics for Active Learning of Interatomic Potentials,” Nat. Comput. Sci 3, 230–239 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Zeng J, Giese TJ, Ekesan S, and York DM, “Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution,” J. Chem. Theory Comput 17, 6993–7009 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Böselt L, Thürlemann M, and Riniker S, “Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed-Phase Systems,” J. Chem. Theory Comput 17, 2641–2658 (2021). [DOI] [PubMed] [Google Scholar]
- 79.Lier B, Poliak P, Marquetand P, Westermayr J, and Oostenbrink C, “BuRNN: Buffer Region Neural Network Approach for Polarizable-Embedding Neural Network/Molecular Mechanics Simulations,” J. Phys. Chem. Lett 13, 3812–3818 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Pan X, Nam K, Epifanovsky E, Simmonett AC, Rosta E, and Shao Y, “A Simplified Charge Projection Scheme for Long-Range Electrostatics in ab initio QM/MM Calculations,” J. Chem. Phys 154, 024115 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Pan X, Yang J, Van R, Epifanovsky E, Ho J, Huang J, Pu J, Mei Y, Nam K, and Shao Y, “Machine-Learning-Assisted Free Energy Simulation of Solution-Phase and Enzyme Reactions,” J. Chem. Theory Comput 17, 5745–5758 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Shen L, Wu J, and Yang W, “Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks,” J. Chem. Theory Comput 12, 4934–4946 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Wu J, Shen L, and Yang W, “Internal Force Corrections with Machine Learning for Quantum Mechanics/Molecular Mechanics Simulations,” J. Chem. Phys 147, 161732 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Shen L and Yang W, “Molecular Dynamics Simulations with Quantum Mechanics/Molecular Mechanics and Adaptive Neural Networks,” J. Chem. Theory Comput 14, 1442–1455 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Gastegger M, Schütt KT, and Müller K-R, “Machine Learning of Solvent Effects on Molecular Spectra and Reactions,” Chem. Sci 12, 11473–11483 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Zinovjev K, “Electrostatic Embedding of Machine Learning Potentials,” J. Chem. Theory Comput 19, 1888–1897 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Yao S, Van R, Pan X, Park JH, Mao Y, Pu J, Mei Y, and Shao Y, “Machine Learning Based Implicit Solvent Model for Aqueous-Solution Alanine Dipeptide Molecular Dynamics Simulations,” RSC Adv. 13, 4565–4577 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Gregersen BA and York DM, “Variational Electrostatic Projection (VEP) Methods for Efficient Modeling of the Macromolecular Electrostatic and Solvation Environment in Activated Dynamics Simulations,” J. Phys. Chem. B 109, 536–556 (2005). [DOI] [PubMed] [Google Scholar]
- 89.Gregersen BA and York DM, “A Charge-Scaling Implementation of the Variational Electrostatic Projection Method,” J. Comput. Chem 27, 103–115 (2006). [DOI] [PubMed] [Google Scholar]
