Abstract
Many-Body eXpansion (MBX) is a C++ library that implements many-body potential energy functions (PEFs) within the “many-body energy” (MB-nrg) formalism. MB-nrg PEFs integrate an underlying polarizable model with explicit machine-learned representations of many-body interactions to achieve chemical accuracy from the gas to the condensed phases. MBX can be employed either as a stand-alone package or as an energy/force engine that can be integrated with generic software for molecular dynamics and Monte Carlo simulations. MBX is parallelized internally using Open Multi-Processing and can utilize Message Passing Interface when available in interfaced molecular simulation software. MBX enables classical and quantum molecular simulations with MB-nrg PEFs, as well as hybrid simulations that combine conventional force fields and MB-nrg PEFs, for diverse systems ranging from small gas-phase clusters to aqueous solutions and molecular fluids to biomolecular systems and metal-organic frameworks.
I. INTRODUCTION
Molecular dynamics (MD) and Monte Carlo (MC) simulations1,2 have been widely used for understanding and characterizing structural, thermodynamic, and dynamical properties of molecular systems, from small gas-phase clusters to extended materials and biomolecular systems.3–8 The potential energy function (PEF) used to represent the multidimensional potential energy surface associated with the molecular system being studied directly determines the level of realism as well as the predictive power of any MD and MC simulation.
In the early days of molecular simulations, due to limited computational resources, the only viable options for PEFs were empirically parameterized force fields (FFs) that use relatively simple expressions to describe intramolecular distortions and pairwise-additive functions to describe intermolecular interactions.9,10 Although more advanced (nonpolarizable and polarizable) FFs developed over the past five decades11–15 remain the most commonly used PEFs in MD and MC simulations, machine-learning (ML) models trained on electronic structure data have become increasingly popular, promising higher accuracy than conventional FFs.16–19 Some examples of ML PEFs include neural network potentials (NNPs),20–29 equivariant graph neural network potentials,30 Gaussian approximation potentials (GAPs),31 moment tensor potentials (MTPs),32 and spectral neighbor analysis potentials (SNAPs),33 as well as PEFs based on the atomic cluster expansion,34 kernel ridge regression methods,35 gradient-domain machine learning (GDML),36 and support vector machines (SVM).37 Permutationally invariant polynomials (PIPs) have also been used, either as standalone fitting functions38–62 or in combination with neural networks (PIP-NNs).63–66 Many ML PEFs are, however, limited in their transferability—those designed to mimic gas phase properties perform well under those conditions but may not be as accurate when applied to condensed-phase systems,67–69 and models that are trained to reproduce condensed-phase properties may not perform as well in the gas phase or at interfaces.70,71
Ten years ago, Babin, Medders, and Paesani introduced MB-pol, a data-driven many-body PEF for water rigorously derived from “first principles.”72–74 MB-pol combines physics-based many-body models with data-driven machine-learned representations of individual many-body interactions that are expressed in terms of multidimensional PIPs. These machine-learned PIPs were shown to account for limitations in classical representations of molecular interactions that arise when overlapping electron densities lead to quantum-mechanical effects that do not have a classical counterpart, such as exchange-repulsion, charge transfer, and charge penetration.75–77 The PIPs of MB-pol were trained on large datasets of many-body energies calculated at the coupled cluster level of theory, including single, double, and perturbative triple excitations, i.e., CCSD(T), the current “gold standard” for chemical accuracy.78 By construction, MB-pol is fully transferable across all phases,79,80 accurately reproducing the properties of small gas-phase clusters,81–92 liquid water,93–99 the air/water interface,100–104 and ice.105–110 Remarkably, MB-pol was shown to be the first and, currently, only water PEF able to correctly predict the phase diagram of water.111 More recently, an updated version of MB-pol, MB-pol(2023), which was trained on larger training sets of many-body interactions, was shown to achieve even higher accuracy for simulations of water in both gas and liquid phases.112
Building on the accuracy and predictive power of MB-pol, many-body PEFs for various molecular systems were developed, including halide113–119 and alkali-metal120–124 ions in water, molecular fluids,125–128 small molecules in water,129,130 and generic covalently-bonded molecules in the gas phase.131 These many-body PEFs were developed within the many-body energy (MB-nrg) theoretical/computational framework,113,120 which effectively generalizes the MB-pol framework to arbitrary molecules. Briefly, the MB-nrg PEF of a system is built upon a baseline physics-based model describing permanent electrostatics, London dispersion forces, and many-body polarization, which is supplemented by explicit machine-learned n-body PIPs. As in MB-pol, the MB-nrg PIPs effectively represent quantum-mechanical many-body interactions arising from the overlap of the electron densities of individual monomers.113,120
Here, we introduce MBX (Many-Body eXpansion),132 a modular C++ library that can be used either as standalone software for calculating MB-nrg energies and forces or as an engine interfaced with external MD and MC codes to perform classical and quantum simulations across different thermodynamic states and phases, under both periodic and non-periodic conditions, using the corresponding MB-nrg PEFs. Importantly, MBX is interfaced with MB-Fit,133 a Python software infrastructure that provides an integrated suite of codes for the automated development of MB-nrg PEFs for generic molecules, from training set generation to PEF fitting and implementation.134
II. THEORY: MB-NRG POTENTIAL ENERGY FUNCTIONS
The energy of a system containing N (atomic or molecular) monomers (hereafter referred to as 1-mers) can be rigorously expressed as a sum of n-body energy contributions (1 ≤ n ≤ N) according to the many-body expansion (MBE) of the energy,135
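$$E_N(1,\dots,N) = \sum_{i=1}^{N} \varepsilon^{1B}(i) + \sum_{i<j}^{N} \varepsilon^{2B}(i,j) + \sum_{i<j<k}^{N} \varepsilon^{3B}(i,j,k) + \cdots + \varepsilon^{NB}(1,\dots,N) \tag{1}$$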
where each 1-body energy, ɛ1B(i), is the energy of the isolated ith 1-mer, E1(i). For n ≥ 2, the n-body energies, ɛnB are defined recursively according to the following expression:
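$$\varepsilon^{nB}(1,\dots,n) = E_n(1,\dots,n) - \sum_{i} \varepsilon^{1B}(i) - \sum_{i<j} \varepsilon^{2B}(i,j) - \cdots - \sum_{i_1<\cdots<i_{n-1}} \varepsilon^{(n-1)B}(i_1,\dots,i_{n-1}) \tag{2}$$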
It should be noted that within the MB-nrg theoretical/computational framework the reference zero for the energy scale (where EN = 0) corresponds to the molecular configuration in which all N 1-mers are separated by infinite distances and each 1-mer is in its minimum-energy geometry. As a consequence, ɛ1B(i) corresponds to the distortion energy of the ith 1-mer relative to its minimum-energy geometry. Since the MBE converges quickly for molecular systems with localized electron densities, i.e., molecular systems with large electronic band gaps,136–139 the MBE provides a rigorous and efficient theoretical/computational framework for the development of many-body PEFs where each n-body term of Eq. (1) is fitted to reproduce the corresponding n-body reference energies calculated from “first principles.”
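As an illustration of the recursive definition of the n-body energies, the following minimal sketch decomposes the energy of a small cluster into its n-body contributions. The function `toy_energy` and the one-dimensional "monomer" coordinates are hypothetical stand-ins for a real electronic structure calculation; only the bookkeeping of the expansion is meant to be representative.

```python
from itertools import combinations


def many_body_terms(energy, monomers):
    """Return {index tuple: n-body energy} for every sub-cluster of `monomers`.

    `energy` maps a list of monomers to the total energy of that cluster.
    Each n-body term is the cluster energy minus all lower-order terms
    of its proper sub-clusters.
    """
    eps = {}
    n_total = len(monomers)
    for n in range(1, n_total + 1):
        for subset in combinations(range(n_total), n):
            e = energy([monomers[i] for i in subset])
            for m in range(1, n):
                for sub in combinations(subset, m):
                    e -= eps[sub]
            eps[subset] = e
    return eps


def toy_energy(cluster):
    """Hypothetical model energy: pairwise attraction plus a 3-body correction."""
    pair = sum(-1.0 / abs(a - b) for a, b in combinations(cluster, 2))
    triple = sum(
        0.1 / (abs(a - b) * abs(b - c) * abs(a - c))
        for a, b, c in combinations(cluster, 3)
    )
    return pair + triple


monomers = [0.0, 1.0, 2.5, 4.0]  # hypothetical 1D positions of four 1-mers
eps = many_body_terms(toy_energy, monomers)
total = sum(eps.values())  # summing all n-body terms recovers the full energy
```

Because the toy model contains no interactions beyond three bodies, the single 4-body term vanishes to within round-off, which mirrors the rapid convergence of the expansion that motivates truncating it at low order.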
As in MB-pol,72–74 the MB-nrg PEFs integrate physics-based many-body terms, representing contributions to molecular interactions that can be accurately represented by classical expressions (e.g., permanent electrostatics and polarization), with explicit machine-learned representations of individual n-body terms in the MBE, which effectively recover quantum-mechanical interactions arising from the overlap of 1-mer’s electron densities (e.g., exchange-repulsion, charge transfer, and charge penetration) that cannot be represented by classical expressions.140 Specifically, the MB-nrg theoretical/computational framework approximates the MBE defined in Eq. (1) as
| (3) |
where n ≤ N and N is the total number of 1-mers in the system.
Each of the VnB terms of an MB-nrg PEF includes an n-body machine-learned term for each n-mer. Each machine-learned term is expressed as the product of a switching function (snB) and a PIP. The switching function ensures that the contribution from the associated term goes to zero as any subset of the 1-mers in an n-mer is separated from the rest.
Following the original MB-pol PEF,72,73 a given n-body PIP takes the following form:
| (4) |
where M1, M2, …, Mn are n 1-mers that compose an n-mer of type ν(M1, M2, …, Mn), L is the number of linear parameters, cl are the linear parameters, ηl are the symmetrized monomials built from the variables, ξ1−λ, each of which is an exponential of an interatomic distance with one of the following forms:
| (5a) |
| (5b) |
| (5c) |
| (5d) |
where m and n are the indices for the physical atoms or fictitious sites defined by the n-mer’s geometry, and Rmn is the distance between two atoms/sites. τ(mn) maps the pair of atoms/sites into distinct classes, such that all atom/site pairs within the same class share the same nonlinear fitting parameters kτ(mn) and d0,τ(mn). There is one unique set of monomials (ηl), linear fitting parameters (cl), and non-linear fitting parameters [kτ(mn), d0,τ(mn)] for each unique n-mer type [ν(M1, M2, …, Mn)].
In Eq. (3), V1B is the total 1-body energy given by
| (6) |
Because a switching function is not used for the 1-body term, the 1-body machine-learned term is simply a PIP representing the 1-body energy of the ith 1-mer with functional form as in Eq. (4),
| (7) |
The remaining term in Eq. (6) represents the 1-body dispersion energy, expressed as a sum of interatomic pairwise contributions,
| (8) |
where Rkl is the distance between atoms k and l located on 1-mer Mi, C6,kl is the corresponding dispersion coefficient, and Δkl = 0 if the atom pair is excluded or 1 otherwise. f(bklRkl) is the Tang–Toennies damping function,141
$$f(b_{kl}R_{kl}) = 1 - \exp(-b_{kl}R_{kl}) \sum_{n=0}^{6} \frac{(b_{kl}R_{kl})^{n}}{n!} \tag{9}$$
where bkl is a fitting parameter. By convention, all atom pairs that participate in a bond, angle, or dihedral angle are excluded (Δkl = 0). Thus, for most 1-mers, all atom pairs are excluded and the 1-body dispersion energy vanishes. However, for large 1-mers, this may not be the case.
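The damping function of Eq. (9) can be sketched in a few lines. The example below assumes the standard sixth-order Tang–Toennies form and a pair energy of the form −f(bR) C6/R^6; the function names and the numerical values of C6 and b are illustrative only and are not taken from MBX.

```python
import math


def tang_toennies(x, order=6):
    """Tang-Toennies damping: f(x) = 1 - exp(-x) * sum_{n=0}^{order} x^n / n!."""
    partial = sum(x**n / math.factorial(n) for n in range(order + 1))
    return 1.0 - math.exp(-x) * partial


def damped_dispersion(r, c6, b):
    """Damped pair dispersion energy, assumed here to be -f(b r) * C6 / r^6."""
    return -tang_toennies(b * r) * c6 / r**6


# Hypothetical parameters: the damping switches the interaction off at short
# range and leaves the bare -C6/r^6 tail untouched at long range.
short_range = damped_dispersion(1.0, 100.0, 3.0)
long_range = damped_dispersion(10.0, 100.0, 3.0)
```

Evaluating the closed form directly suffers from cancellation when bR is very small, so a production implementation would likely switch to a series expansion in that regime.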
The explicit 2-body term of an MB-nrg PEF, V2B in Eq. (3), is expressed as
| (10) |
where the machine-learned term represents the 2-body energy of the 2-mer composed of the ith and jth 1-mers and is constructed as a product of a switching function and a PIP with functional form as in Eq. (4),
| (11) |
The dispersion term in Eq. (10) is the total 2-body dispersion energy, calculated as a sum of pairwise-additive contributions associated with each pair of atoms located on the two 1-mers in a 2-mer,140
| (12) |
where Rkl is the distance between atoms k and l located on 1-mers Mi and Mj, respectively, C6,kl is the corresponding dispersion coefficient, and f(bklRkl) is the Tang–Toennies damping function [Eq. (9)]. In both Eqs. (8) and (12), the dispersion coefficients are calculated using the exchange-hole dipole moment (XDM) model.142–144
All other explicit many-body terms (VnB) in Eq. (3) take the following form:
| (13) |
where each machine-learned n-body term is built as the product of a switching function and a PIP with functional form as in Eq. (4),
| (14) |
Explicit n-body terms may be retained up to an arbitrary n-body level. Generally, it is sufficient to truncate these terms at the n = 3 or n = 4 level, depending on the system being studied. Specific details about the switching functions (s2B, s3B, and s4B), including functional forms used by the MB-nrg PEFs available in MBX, are discussed in the supplementary material.
Finally, the electrostatics term, Velec, in Eq. (3) is based on a modified version of the Thole model145 introduced in Ref. 146 and further refined for the MB-pol PEF.72,73 Velec represents permanent electrostatics by a sum of Coulomb interactions between smeared partial charges located on each 1-mer as well as induced electrostatics (up to dipoles) by an implicit many-body polarization term. Within the MB-nrg theoretical/computational framework, the partial charges, which can have fixed or geometry-dependent values, are obtained by fitting the multipole moments calculated from “first principles” for each isolated 1-mer and can be placed on both physical atoms and fictitious sites.
In MBX, Velec is represented by four terms describing charge-charge interactions (Vqq), charge-dipole interactions (Vqμ), dipole-dipole interactions (Vμμ), and the polarization energy (Vpol), respectively. Each of these terms is defined as follows:
| (15a) |
| (15b) |
| (15c) |
| (15d) |
where the Einstein notation is used for repeated Greek letters (e.g., is a condensed form of ) In Eqs. (15a)–(15d), N is the total number of electrostatic sites in the system, qi is the charge of site i, μi is the dipole moment of site i, is the polarizability of site i ( becomes a scalar if it is isotropic), and , , and are the electrostatic tensors defined as follows:
| (16a) |
| (16b) |
| (16c) |
| (16d) |
where α, β, γ define any of the Cartesian directions (x, y, or z), Rij is the distance between atoms i and j, and δ is the Kronecker delta. The functions Si(r) are the screening functions designed to smear the charges over space, which can be recursively derived from Eq. (18a) as
| (17) |
As in MB-pol,72,73 the screening functions for the MB-nrg PEFs are given by
| (18a) |
| (18b) |
| (18c) |
| (18d) |
where a is the Thole damping, which can be different for charge–charge, charge–dipole, and dipole–dipole interactions, , with i and j being the two sites involved, r = Rij, and α is the polarizability factor that is usually set to be the same as the polarizability. The interested reader is referred to Ref. 147 for specific details about the derivation of Eqs. (14)–(17).
III. SOFTWARE STRUCTURE
The C++ source code of MBX is organized into four modules, each of which handles specific functions: building block is responsible for maintaining the state of the system; potential evaluates the various components of the MB-nrg PEFs; I/O manages inputs, outputs, and interfaces with MD drivers; and utilities contains functions to execute miscellaneous support tasks. The potential module is further divided into sub-modules to calculate each of the energy contributions described in Eq. (3): n-body PIPs, 2-body dispersion, permanent electrostatics, and many-body polarization. The general workflow for an energy calculation step performed by MBX is shown in Fig. 1.
FIG. 1.
General workflow for an energy and force calculation step in MBX.
A. Input
Since all PEF parameters are directly compiled into MBX, the user only needs to provide minimal information that is passed to MBX through two files: the NRG file, which contains information about all 1-mers in the system and their initial coordinates, and the JSON file, which specifies details about the calculation to be performed, such as enabling or disabling n-body terms for certain n-mers, and assorted settings, such as the algorithm for the calculation of many-body polarization and the convergence threshold for the induced dipole moments. More information about the format and contents of the NRG and JSON files is provided in the supplementary material.
B. Building block
The building block module contains the System class, which stores all the information about the 1-mers in the system. MBX provides a function that reads the NRG file, and creates and initializes a System instance. The initialized System can then be configured using the JSON parameters that control the energy calculation. Initializing a System object requires multiple memory allocation calls and is therefore not instantaneous, but it is performed only once per system of interest.
While MD and MC software implementations for force fields typically treat atoms as the smallest unit, MBX considers 1-mers, each consisting of a few atoms, as the smallest unit of the system since the n-body PIPs, which form the backbone of MB-nrg PEFs, are evaluated on these 1-mers. For this reason, the System class stores data on a per-1-mer basis.
For each 1-mer type defined in an MB-nrg PEF, the parameters defining all relevant atomic quantities (e.g., charges, polarizabilities, and dispersion coefficients) are automatically compiled into MBX. Because the n-body PIPs of a given MB-nrg PEF are fitted over the underlying representation of electrostatics and dispersion, the parameters entering the expressions for the 2-body dispersion energy [Eq. (10)] and Velec [Eqs. (15a)–(15d)] are intertwined with each MB-nrg PEF. As a consequence, if the user wishes to adopt a different set of electrostatic or dispersion parameters, all n-body PIPs will need to be refitted using MB-Fit.133
The System class oversees the calculations of each contribution to the total energy by delegating to the appropriate functions within the potential module (see below). Each function returns the energy and associated gradients of a particular energy contribution with respect to the coordinates of the atoms. Once the System object is initialized, it is not possible in the current version of MBX to add new 1-mers or change the type of existing ones without rebuilding the System instance. The atomic coordinates can be updated at any time as long as they are in the same order as the initial set of coordinates. Similarly, any parameters specifying the type of calculation to be performed (e.g., algorithm for many-body polarization, convergence threshold for the induced dipoles, and box size and shape for calculations in periodic boundary conditions) can be changed at any time.
MBX initializes a System object through the following steps:
1. Create a new System object with default parameters corresponding to those used for a gas-phase calculation.
2. Add 1-mers to the System using the AddMonomer member function. The coordinates, atom labels, and 1-mer type for each 1-mer are stored in the System object.
3. After all 1-mers have been added, initialize the System. This involves storing the properties for each 1-mer and reordering the 1-mers for optimization of parallelization. The reordering process groups 1-mers of the same type together and orders the types by increasing number of 1-mers. For example, the input for a system of 250 CO2 molecules (i.e., 1-mers of type CO2) and 300 H2O molecules (i.e., 1-mers of type H2O) can be provided in any order, but MBX will reorder it such that the CO2 molecules come before the H2O molecules.
4. Set the physical properties of the atoms, such as charges, polarizabilities, and dispersion coefficients, using helper functions.
5. Parse the JSON file containing information about box size and shape for calculations in periodic boundary conditions, cutoffs, and type of MB-nrg PEFs, as well as other options that control and determine the type of calculation and energy calls to be performed. If a JSON file is not provided or cannot be found, the defaults are used.
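The reordering performed during initialization (step 3) can be illustrated with a short sketch. The function below is not the MBX implementation; it only reproduces the stated rule (group 1-mers by type and order the types by increasing population). Breaking ties between equally populated types by order of first appearance is an assumption made here.

```python
from collections import Counter


def reorder_monomers(types):
    """Return the permutation of 1-mer indices after grouping by type.

    Types are ordered by increasing number of 1-mers; ties are broken by
    first appearance in the input (an assumption of this sketch), and the
    input order is preserved within each type.
    """
    counts = Counter(types)
    first_seen = {}
    for index, mon_type in enumerate(types):
        first_seen.setdefault(mon_type, index)
    return sorted(
        range(len(types)),
        key=lambda i: (counts[types[i]], first_seen[types[i]], i),
    )


# Three H2O and two CO2 given in mixed order: the CO2 1-mers move to the front.
input_types = ["h2o", "co2", "h2o", "h2o", "co2"]
order = reorder_monomers(input_types)
reordered = [input_types[i] for i in order]
```

Keeping the permutation, rather than only the reordered list, makes it possible to map gradients back to the original input order expected by an external driver.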
C. Machine-learned 1-body term:
Since different MD and MC engines have different conventions regarding storage of atom coordinates, MBX first translates the atoms in the 1-mer to obey the minimum-image convention before evaluating the 1-body PIPs. This is only necessary when performing a calculation in periodic boundary conditions. MBX selects the first atom of each 1-mer as the reference atom to identify the minimum images of the other atoms in the 1-mer. This reference atom is then placed in the principal box and the closest images of other atoms in the 1-mer are selected through an algorithm that operates in fractional coordinates.
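A minimal sketch of this re-imaging step is given below. It assumes an orthorhombic box so that fractional coordinates are obtained by dividing by the box lengths; the function name and the example coordinates are hypothetical, and the treatment of general triclinic cells in MBX is not reproduced here.

```python
import math


def wrap_monomer(coords, box):
    """Re-image one 1-mer so that it is whole under periodic boundaries.

    coords: list of (x, y, z) atom positions; box: (Lx, Ly, Lz).
    The first atom is moved into the principal box and every other atom
    is placed at its closest periodic image relative to that atom.
    """
    frac = [[c / length for c, length in zip(atom, box)] for atom in coords]
    ref = [f - math.floor(f) for f in frac[0]]  # reference atom in [0, 1)
    wrapped = [ref]
    for atom in frac[1:]:
        delta = [a - r for a, r in zip(atom, frac[0])]
        delta = [d - round(d) for d in delta]  # nearest-image displacement
        wrapped.append([r + d for r, d in zip(ref, delta)])
    return [[f * length for f, length in zip(atom, box)] for atom in wrapped]


# A water-like 1-mer split across the box boundary by the MD engine.
box = (10.0, 10.0, 10.0)
split = [(9.8, 0.1, -0.2), (0.5, 0.1, -0.2), (9.6, 0.9, 9.7)]
whole = wrap_monomer(split, box)
```

After the call, the second atom sits at x = 10.5, just outside the principal box, which is intended: the PIPs need the internal geometry of the 1-mer, not wrapped coordinates.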
D. Machine-learned n-body terms (n > 1):
MBX supports n-body PIPs with arbitrary values for n, which can be generated with MB-Fit,133,134 and currently already provides functions to evaluate 2-body, 3-body, and 4-body PIPs. Adding n-body PIPs with larger values of n is trivial and does not require any significant refactoring of the source code. In order to efficiently evaluate all terms [Eq. (4)], MBX first identifies all n-mers for which it is possible that the associated n-body switching function (snB) is non-zero. An n-mer is accepted and passed to the polynomial evaluation if and only if some 1-mer within the n-mer is within a predefined n-body cutoff of all other 1-mers in the n-mer. In other words, there must be a “central” 1-mer, and all other 1-mers must be within the n-body cutoff of the “central” 1-mer. This idea is formalized in the following criterion:
center-neighbor criterion: Using the first atom of each 1-mer to define the position of the 1-mer, the center-neighbor criterion for a given n-mer is satisfied if and only if there exists at least one 1-mer (“center”) such that the distances between the “center” 1-mer and all other n-1 1-mers (“neighbors”) in the n-mer are smaller than the n-body cutoff.
The value of each n-body cutoff used by the center-neighbor criterion is specified by the user in the JSON file. As a consequence, MBX only needs to collect n-mers for which the center-neighbor criterion is satisfied and pass this information to the PIP evaluator. The rules for setting appropriate cutoff values are discussed in the supplementary material.
MBX uses a K-D Tree to search for n-mers that satisfy the center-neighbor criterion, after which the evaluation of the n-body PIPs with n > 1 is effectively the same as for the 1-body PIPs. Using a K-D Tree allows MBX to quickly identify relevant n-mers and avoid a double or triple loop over all 1-mers, whose cost grows quadratically or cubically with the number of 1-mers. The Nanoflann library148 is used to implement the K-D Tree and perform the radial search. It should be noted that, although not negligible, the Central Processing Unit (CPU) time required to create the tree and perform the search is still a small fraction of the CPU time required to calculate the n-body PIP contributions. The K-D Tree implementation in MBX is as follows. First, a tree is built using the first atom of each 1-mer as a point in the tree. Once the tree containing all the 1-mers is completed, MBX loops over all the 1-mers, each of which acts as the candidate “center” 1-mer in the current iteration, and performs a radial search for all other 1-mers that are within the n-body cutoff, which become the candidate “neighbor” 1-mers. Then, n-mers are constructed from the “center” 1-mer and each combination of n-1 “neighbors.” By construction, each of the constructed n-mers necessarily satisfies the center-neighbor criterion. However, the same n-mer can be selected several times, with different 1-mers acting as the “center.” To avoid double counting, the candidate n-mer is considered valid only if the 1-mer index of the “center” is the smallest among all valid “center” 1-mers.
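The selection and de-duplication logic described above can be sketched as follows. For brevity, this example replaces the K-D Tree radial search with a brute-force distance loop, so it illustrates only the center-neighbor criterion and the smallest-valid-center rule, not the performance characteristics of MBX; the function name and test geometry are hypothetical.

```python
import math
from itertools import combinations


def find_nmers(positions, n, cutoff):
    """List each n-mer that satisfies the center-neighbor criterion exactly once.

    positions: coordinates of the first atom of each 1-mer.
    Returns a sorted list of sorted index tuples.
    """

    def dist(i, j):
        return math.dist(positions[i], positions[j])

    nmers = []
    for center in range(len(positions)):
        # Brute-force stand-in for the radial search around the center.
        neighbors = [
            j for j in range(len(positions))
            if j != center and dist(center, j) < cutoff
        ]
        for combo in combinations(neighbors, n - 1):
            members = (center,) + combo
            # Every member that could itself act as the center of this n-mer.
            valid_centers = [
                c for c in members
                if all(dist(c, other) < cutoff for other in members if other != c)
            ]
            # Accept the n-mer only from its smallest valid center.
            if center == min(valid_centers):
                nmers.append(tuple(sorted(members)))
    return sorted(nmers)


# Four 1-mers on a line: three close together and one far away.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
dimers = find_nmers(positions, 2, 1.5)
trimers = find_nmers(positions, 3, 1.5)
```

With a cutoff of 1.5, the trimer (0, 1, 2) is found only through 1-mer 1, because the two end 1-mers are 2.0 apart and therefore cannot act as centers.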
Although K-D Trees were not originally designed for use in periodic boundary conditions, MBX has implemented a patch that allows for their use in such cases by replicating the box in space. This implies that instead of building a tree for a single copy of the system as done in the gas phase, MBX builds a tree for 27 copies: the original one and the 26 adjacent boxes. Only images within the main simulation box are eligible “center” 1-mers. Future versions of MBX will implement more advanced solutions to address the potential memory cost of this process when the target number of 1-mers is large. After obtaining the lists of n-mers, MBX sends batches of multiple n-mers of the same type to the PIP functions, which then transform the coordinates into PIP variables and calculate the corresponding PIP values.
E. Physics-based terms
MBX defines two distinct classes that are dedicated to calculating the following non-bonded interactions: dispersion (Dispersion class) and permanent and induced electrostatics (Electrostatics class). As in conventional force fields and discussed in Sec. II, MBX excludes these non-bonded interactions for atom pairs that are part of a bond, angle, or dihedral angle. However, MBX does not scale these interactions as common force fields do—for a particular atom pair, they are either entirely enabled or entirely disabled [hence, Δkl in Eq. (8)]. Generally, all atom pairs within a 1-mer are excluded, but in the event that a 1-mer contains non-excluded pairs both classes calculate the contributions from 1-mer dispersion and electrostatics [i.e., dispersion energy as in Eq. (8) and 1-body contributions to Velec in Eq. (3), respectively] in a first step, ignoring any pair in the excluded pairs list. Then, the intermolecular contributions are calculated in a double-loop over the 1-mer types. For each pair of 1-mer types, the contributions to the dispersion and electrostatics energies are calculated. Before evaluation, the coordinates and associated properties (e.g., atomic charges and polarizabilities, dispersion coefficients, etc.) are reordered to maximize speedup from vectorization through single-instruction multiple-data (SIMD) operations.
1. Dispersion
As shown in Eq. (12), the dispersion energy of an MB-nrg PEF is calculated in real space as a pairwise-additive potential using pair-defined dispersion coefficients (C6,kl) that are calculated using the XDM model.142–144 If the molecular system of interest is in periodic boundary conditions, the long-range contribution to the dispersion energy is calculated in reciprocal space using the particle mesh Ewald (PME) algorithm as implemented in the helPME library.149,150 PME uses atom-defined C6 coefficients that are combined using the usual geometric mean combination rule to obtain pair coefficients [i.e., C6,kl = √(C6,k C6,l)]. A discontinuity in the energy and its gradients can occur if the C6,kl pair coefficients used to calculate the dispersion energy in real space are abruptly changed to the values used by the PME algorithm at the cutoff distance. To avoid this discontinuity, MBX applies a switching function of the same form as that used for the 2-body PIP switching function (see the supplementary material), enabling a smooth transition from the C6,kl used in real space to the C6,kl used in the PME calculation.
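The transition between the two sets of pair coefficients can be illustrated with the sketch below. The cosine switching function, the linear blending of the two C6 values, and all numerical parameters are assumptions made for illustration; the actual switching function used by MBX is defined in the supplementary material and is not reproduced here.

```python
import math


def geometric_mean_c6(c6_k, c6_l):
    """Pair coefficient from atom-defined coefficients (geometric mean rule)."""
    return math.sqrt(c6_k * c6_l)


def cosine_switch(r, r_in, r_out):
    """Assumed switch: 1 below r_in, 0 above r_out, smooth cosine in between."""
    if r <= r_in:
        return 1.0
    if r >= r_out:
        return 0.0
    t = (r - r_in) / (r_out - r_in)
    return 0.5 * (1.0 + math.cos(math.pi * t))


def blended_c6(r, c6_pair, c6_k, c6_l, r_in, r_out):
    """Effective C6 that moves smoothly from the pair value to the PME value."""
    s = cosine_switch(r, r_in, r_out)
    return s * c6_pair + (1.0 - s) * geometric_mean_c6(c6_k, c6_l)


# Hypothetical values: pair-defined C6 = 7.0, atom-defined C6 = 4.0 and 9.0,
# with the switching region between 8.0 and 9.0 (arbitrary units).
inside = blended_c6(5.0, 7.0, 4.0, 9.0, 8.0, 9.0)
midpoint = blended_c6(8.5, 7.0, 4.0, 9.0, 8.0, 9.0)
outside = blended_c6(9.5, 7.0, 4.0, 9.0, 8.0, 9.0)
```

The point of the blend is that the effective coefficient, and hence the energy and its gradient, has no jump at the real-space cutoff.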
2. Electrostatics
The electrostatics calculation involves several steps, including the computation of the permanent electric field, the calculation of the long-range electric field using the PME algorithm as implemented in the helPME library,149,150 and the determination of the induced dipoles using one of three algorithms: iterative, conjugate gradient, or always stable predictor-corrector.151 The permanent contribution to the electrostatic energy is straightforward to calculate and relatively fast. However, the bottleneck of the electrostatics calculation is to obtain the induced dipoles on each site. While the analytical solution of the induced dipole moments is possible, it is not efficient for large systems,147 and it has not been implemented in MBX. A detailed description of the possible methods to solve for the induced dipole moments can be found in Ref. 147.
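To make the iterative option concrete, the sketch below solves the induced-dipole equations by fixed-point iteration for a small set of polarizable sites in a given permanent field. It uses the undamped point-dipole interaction tensor and Gaussian-type units, which is a simplification: the Thole screening functions, the PME long-range field, and the exclusion rules used by MBX are all omitted, and the function names are hypothetical.

```python
def dipole_tensor(ri, rj):
    """Undamped dipole field tensor T = (3 r r^T - r^2 I) / r^5, with r = ri - rj."""
    r = [a - b for a, b in zip(ri, rj)]
    r2 = sum(x * x for x in r)
    r5 = r2 ** 2.5
    return [
        [(3.0 * r[a] * r[b] - (r2 if a == b else 0.0)) / r5 for b in range(3)]
        for a in range(3)
    ]


def solve_induced_dipoles(pos, alpha, field, tol=1e-12, max_iter=500):
    """Iterate mu_i = alpha_i * (E_i + sum_j T_ij mu_j) until self-consistent."""
    n = len(pos)
    mu = [[alpha[i] * field[i][a] for a in range(3)] for i in range(n)]
    for iteration in range(max_iter):
        new_mu = []
        for i in range(n):
            total = list(field[i])  # start from the permanent field at site i
            for j in range(n):
                if j == i:
                    continue
                t = dipole_tensor(pos[i], pos[j])
                for a in range(3):
                    total[a] += sum(t[a][b] * mu[j][b] for b in range(3))
            new_mu.append([alpha[i] * x for x in total])
        change = max(
            abs(new_mu[i][a] - mu[i][a]) for i in range(n) for a in range(3)
        )
        mu = new_mu
        if change < tol:
            return mu, iteration + 1
    raise RuntimeError("induced dipoles did not converge")


# Two identical sites 3 units apart in a uniform field along the bond axis.
pos = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
mu, n_iter = solve_induced_dipoles(pos, [1.0, 1.0], [(1.0, 0.0, 0.0)] * 2)
```

For this symmetric test case the self-consistent dipole is αE/(1 − 2α/r³) = 1.08 along the axis, which gives a simple check on the iteration. Without damping, the plain iteration can diverge for closely spaced, highly polarizable sites, which is one reason screened interactions and more robust solvers are used in practice.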
F. Output
Once all energy and gradient contributions have been calculated, they are summed and stored in the System object, ready to be retrieved by the user or an MD/MC driver. After this step is completed, external modifications to the coordinates of the system, such as progression to the next MD/MC step, can be performed. The new coordinates are set in the same System instance, which can then be used to perform another energy/force calculation.
While energies and forces are the most commonly retrieved information by MD and MC drivers, MBX provides interfaces to retrieve any of the system’s properties, including but not limited to, charges, permanent and induced dipole moments, and the virial tensor.
IV. DRIVERS
MBX has three built-in drivers to perform single-point calculations, geometry optimizations, and normal-mode analyses, all written in C++. A simple example of how to use MBX to read an NRG file and set up the system with a JSON file is shown in Fig. 2.
FIG. 2.
Example of a C++ main function that uses the MBX library with an NRG file and a JSON file.
Besides the internal drivers discussed above, the current version of MBX also provides efficient interfaces to two popular software packages, the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)152 and i-PI,153 for both classical MD and quantum path-integral molecular dynamics (PIMD) simulations.2 MBX acts as a client that returns MB-nrg energies and forces, while the actual MD steps are controlled by the MD engine. In the case of i-PI, the communication between MBX and i-PI can be established in two ways: Internet and Unix domain sockets. For LAMMPS, MBX is connected through the combination of specific FIX and PAIR_STYLE commands in the LAMMPS input. The MBX/LAMMPS and MBX/i-PI interfaces have already been used to study the water vapor/liquid equilibrium,104 CH4/H2O126,128 and CO2/H2O125,127 mixtures, and ions in solution.118,119,122 In the current version of MBX, all of the computationally expensive functions are parallelized using Open Multi-Processing (OpenMP) to maximize the use of large many-core compute nodes. This design readily enables other “driver” codes, serial or parallel, to couple with MBX and perform advanced calculations, such as MD and PIMD simulations using LAMMPS or i-PI.
The pure driver-only nature of i-PI makes the interface with MBX very simple. A single driver code that communicates with the i-PI socket is enough to allow both packages to communicate. The driver code receives the coordinates and the simulation cell from i-PI through a socket, sets them into MBX, and performs the energy calculation for those coordinates. Gradients and energies are then retrieved from MBX and sent through the socket to i-PI that performs the time evolution for each time step, updating both atom coordinates and simulation cell, which are then sent back to the driver.
In the case of LAMMPS, MBX is tightly coupled to enable large-scale parallel simulations with minimal overhead. LAMMPS is parallelized using a spatial domain decomposition algorithm whereby the simulation is partitioned into sub-domains and individual Message Passing Interface (MPI) ranks are responsible for computing all tasks within the sub-domain to which they have been assigned. In MBX, minimal changes were necessary to enable the calculation of the real-space interactions within each LAMMPS sub-domain containing local and ghost particles. Local particles are contained within the sub-domain owned by an MPI rank and ghost particles are replicated from neighboring sub-domains owned by other MPI ranks. For performance reasons, the iterative electrostatic solver in MBX was enabled with MPI and does not need to interact with LAMMPS during intermediate steps. In current CPU-only data-driven many-body simulations with MBX+LAMMPS, the performance bottleneck functions include evaluation of the n-body PIP terms, and calculation of the long-range portion of the electrostatic and dispersion interactions that include evaluation of distributed 3D Fast Fourier Transforms (FFTs). The electrostatic solver involves an iterative calculation of induced dipole moments requiring repeated communication with neighboring MPI ranks and evaluation of multiple 3D FFTs. These terms of the MB-nrg PEF along with all the others can be evaluated independently of one another and in arbitrary order.
The LAMMPS interface also enables hybrid FF/MB-nrg simulations where some interactions are described by conventional force fields [e.g., Assisted Model Building with Energy Refinement (AMBER),154 Chemistry at Harvard Molecular Mechanics (CHARMM),155 and Optimized Potentials for Liquid Simulations (OPLS)156] and other interactions are described by MB-nrg PEFs. In these hybrid simulations, the electrostatic energy is exclusively computed by MBX, while the remaining non-bonded interactions between FF and MB-nrg molecules are represented by Lennard-Jones potentials that can be derived using standard Lorentz-Berthelot mixing rules. In the case of FF molecules solvated in MB-pol water, the recommended effective Lennard-Jones parameters for MB-pol are listed in Table I.
TABLE I.
Effective Lennard-Jones parameters for MB-pol water.
| Atom | σ (Å) | ɛ (kcal/mol) |
|---|---|---|
| O | 3.26393 | 0.26948 |
| H | 2.68354 | $3.7 \times 10^{-10}$ |
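As a minimal sketch of how the Lorentz-Berthelot mixing rules combine the parameters of Table I with those of a force-field atom type, the snippet below uses the arithmetic mean for σ and the geometric mean for ɛ. The function name `lorentz_berthelot` and the force-field parameters (`ff_sigma`, `ff_epsilon`) are hypothetical and chosen only for illustration; they are not taken from AMBER, CHARMM, or OPLS.

```python
import math

# Effective Lennard-Jones parameters for MB-pol water (Table I)
# sigma in Angstrom, epsilon in kcal/mol
MBPOL_LJ = {
    "O": {"sigma": 3.26393, "epsilon": 0.26948},
    "H": {"sigma": 2.68354, "epsilon": 3.7e-10},
}


def lorentz_berthelot(sigma_i, epsilon_i, sigma_j, epsilon_j):
    """Return (sigma_ij, epsilon_ij) from the Lorentz-Berthelot rules."""
    sigma_ij = 0.5 * (sigma_i + sigma_j)            # Lorentz: arithmetic mean
    epsilon_ij = math.sqrt(epsilon_i * epsilon_j)   # Berthelot: geometric mean
    return sigma_ij, epsilon_ij


# Hypothetical force-field atom type (illustrative values only)
ff_sigma, ff_epsilon = 3.4, 0.1

sigma_O_ff, epsilon_O_ff = lorentz_berthelot(
    MBPOL_LJ["O"]["sigma"], MBPOL_LJ["O"]["epsilon"], ff_sigma, ff_epsilon
)
```

Because ɛ for H is very small, the geometric mean makes the cross interactions involving MB-pol hydrogen atoms correspondingly weak.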
Importantly, given its modularity and portability, MBX can be used in combination with any software package (e.g., in-house software developed within a research group) that supplies atom coordinates and expects energies and forces. MBX modules and sub-modules can be included by other C++ codes, and System objects can be instantiated and used like any other C++ class. MBX also provides wrapper interfaces in C, Fortran, and Python. The System class by itself is too big to be automatically adapted to other languages. However, for each of the main System member functions, there is a wrapper that enables calls from other programming languages. While not all of the member functions are wrapped, implementing a wrapper to retrieve a property that is currently not available is a simple and straightforward process.
V. PARALLELIZATION
In order to perform calculations on large systems, it is necessary to parallelize the evaluation of the various contributions to the total potential energy and forces. MBX exploits two sources of parallelization. Internally, MBX parallelizes the calculation of the various PEF contributions using OpenMP. Externally, MBX can exploit MPI parallelization schemes implementing domain decomposition which may be available in the interfaced molecular simulation software. For example, since LAMMPS is able to partition the simulation box into sub-domains overseen by individual MPI ranks, the MBX/LAMMPS interface allows each LAMMPS MPI rank to use one or more MBX OpenMP threads. This implies that both sources of parallelization (OpenMP in MBX and MPI in LAMMPS or other software) can be used together.
As a showcase of the OpenMP parallelization, Fig. 3 reports the mean runtime of an energy calculation for a box of 2048 water molecules as a function of the number of cores. The timings observed suggest that the OpenMP parallelization is efficient up to about 16 threads, after which MBX is not currently able to take full advantage of further parallelization through OpenMP. Also shown in Fig. 3 is the runtime when the calculations are performed within LAMMPS using a single MPI rank (and the indicated number of OpenMP threads). As expected, the scaling for both MBX as a standalone code and when interfaced with LAMMPS using a single MPI rank is essentially identical, since the OpenMP parallelization is internal to MBX. It should be noted here that, as is generally the case, the electrostatics represents the most expensive energy contribution to calculate. Since the i-PI interface utilizes no additional source of parallelization, the relative times profile of MBX in i-PI is essentially identical to that obtained when MBX is interfaced with LAMMPS in Fig. 3.
FIG. 3.
Relative time to calculate all energies and gradients with MBX for a cubic box of 2048 water molecules in periodic boundary conditions. Calculations were each performed 100 times, and the average was taken. The relative times are presented as a function of the number of OpenMP threads used with MBX as a standalone code (a) and with LAMMPS using a single MPI rank (b), with the average time obtained using 1 OpenMP thread taken as the reference. All the calculations were performed on a compute node with two sockets each with 64 2.6 GHz AMD 7H12 Rome processors.
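Relative times such as those in Fig. 3 can be converted into speedup and parallel efficiency with a few lines of code. The sketch below assumes that each relative time is defined as t(n)/t(1); the function name `speedup_and_efficiency` is hypothetical, and the input values are synthetic placeholders, not the timings measured for MBX.

```python
def speedup_and_efficiency(relative_times):
    """Map thread count -> (speedup, parallel efficiency).

    `relative_times` maps the number of threads n to t(n) / t(1),
    so the speedup is 1 / relative time and the efficiency is speedup / n.
    """
    results = {}
    for n_threads, rel_time in sorted(relative_times.items()):
        speedup = 1.0 / rel_time
        results[n_threads] = (speedup, speedup / n_threads)
    return results


# Synthetic placeholder values (NOT the data shown in Fig. 3)
example = {1: 1.0, 2: 0.5, 4: 0.25, 8: 0.2}
scaling = speedup_and_efficiency(example)
```

An efficiency that drops well below 1 as the thread count grows is the signature of the saturation discussed above for larger numbers of OpenMP threads.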
When the simulations are driven by LAMMPS, MBX can also take advantage of parallelization over MPI ranks. Figure 4 shows the relative times associated with the MBX energy and gradient calculations when interfaced with LAMMPS, utilizing several different combinations of MPI ranks and OpenMP threads. Comparing columns [1, 2] and [1, 4] with columns [2, 1] and [4, 1], it is clear that the OpenMP parallelization is more effective when the total number of available threads is small. However, as nOMP gets larger and approaches the parallelization limit observed in Fig. 3, the use of MPI ranks is more effective in achieving the best performance. The optimal combination of OpenMP threads and MPI ranks depends on various factors, including the system’s size and topology (i.e., cluster, bulk, or interface). It should be noted that the evaluation of all individual contributions to the energy scales relatively well with both MPI and OpenMP parallelization, with the exception of the PME part of the electrostatics, which will be the focus of further optimizations in the subsequent releases of MBX. The actual timings associated with the MBX energy and gradient calculations shown in Figs. 3 and 4 are reported in the supplementary material.
FIG. 4.
Relative time to calculate all energies and gradients for a cubic box of 2048 water molecules in periodic boundary conditions using MBX interfaced with LAMMPS. Calculations were each performed 100 times, and the average was taken. The relative times are presented as a function of the number of OpenMP threads (nOMP) per MPI rank and the number of MPI ranks (nMPI), with the time corresponding to 1 OpenMP thread and 1 MPI rank taken as the reference. Calculations were performed on a compute node with two sockets each with 64 2.6 GHz AMD 7H12 Rome processors.
All timings reported in Figs. 3 and 4 were obtained for simulations of 2048 water molecules in a periodic cubic box carried out on a compute node with two sockets each with 64 2.6 GHz AMD 7H12 Rome processors. The convergence threshold (ɛ) for the atomic induced dipole moments was set to $10^{-16}$, which corresponds to each component of the induced dipole moment of each atom being converged up to the eighth decimal digit. The convergence criterion is met when the squared difference between successive iterations (k and k + 1) of each induced dipole moment component (α) for each atom i, $\mu_{i,\alpha}$, is smaller than the tolerance ɛ,
$$\left(\mu_{i,\alpha}^{(k+1)} - \mu_{i,\alpha}^{(k)}\right)^{2} < \epsilon. \tag{19}$$
A threshold $\epsilon = 10^{-16}$ corresponds to a conservative and safe convergence criterion for all systems that we have simulated with our MB-nrg PEFs to date. However, it is worth noting that larger values up to $\epsilon = 10^{-8}$ are sufficient for systems with weaker responses to electric fields (e.g., neat H2O, CO2, CH4 solutions). A systematic analysis of the energy conservation and associated energy fluctuations for simulations of 2048 water molecules in a periodic cubic box carried out in the microcanonical (NVE = constant number of molecules, volume, and energy) ensemble as a function of the convergence tolerance is reported in the supplementary material.
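The convergence test described above can be written compactly as a check on the squared change of every dipole component between two successive iterations. The following is a minimal sketch rather than the MBX implementation; the function name `dipoles_converged` and the representation of the dipoles as nested lists of per-atom (x, y, z) components are assumptions.

```python
def dipoles_converged(mu_old, mu_new, tol=1e-16):
    """Return True if every dipole component has converged.

    mu_old, mu_new: per-atom lists of (x, y, z) induced dipole components
    from iterations k and k + 1. A component is considered converged when
    its squared change is smaller than `tol`.
    """
    for old_vec, new_vec in zip(mu_old, mu_new):
        for old, new in zip(old_vec, new_vec):
            if (new - old) ** 2 >= tol:
                return False
    return True


# Toy values: a change of 1e-9 per component gives a squared change of
# about 1e-18, while a change of 1e-7 gives about 1e-14.
mu_k = [[0.1, 0.2, 0.3]]
mu_tight = [[0.1 + 1e-9, 0.2, 0.3]]
mu_loose = [[0.1 + 1e-7, 0.2, 0.3]]

tight_ok = dipoles_converged(mu_k, mu_tight)             # default tol = 1e-16
loose_ok = dipoles_converged(mu_k, mu_loose)             # default tol = 1e-16
loose_ok_relaxed = dipoles_converged(mu_k, mu_loose, tol=1e-8)
```

Because the test is applied to the squared difference, a tolerance of $10^{-16}$ bounds the change in each component by $10^{-8}$, which is the eighth-decimal-digit convergence mentioned in the text.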
VI. CONCLUSIONS
Over the past decade, data-driven many-body MB-nrg PEFs have been shown to accurately predict the properties of various molecular systems from the gas to the condensed phase. By integrating an underlying many-body polarizable model with explicit machine-learned representations of individual n-body interactions, MB-nrg PEFs achieve chemical accuracy in the representation of molecular interactions at both short and long range, and at all n-body orders.
In this work, we introduced MBX, a modular C++ library that enables MB-nrg energy and force calculations. MBX is divided into modules responsible for particular tasks. The potential module is divided into sub-modules, each handling one specific energy contribution: n-body PIPs, dispersion energy, and electrostatics. Other modules are responsible for input/output, interfacing with drivers (e.g., software for MD and MC simulations), and constructing the System class that stores the state of the molecular system.
While MBX can be used as standalone software, it also provides interfaces to common MD packages such as i-PI and LAMMPS, along with interfaces written in Fortran and Python that can be seamlessly used in combination with third-party software (e.g., in-house software developed by a research group). Both interfaces have already been used to study various molecular systems, including liquid water, CO2/H2O mixtures, CH4/H2O mixtures, hydrated alkali-metal ion clusters, and ionic solutions.
MBX includes an internal OpenMP parallelization that is more efficient when the number of threads is small. When interfaced with external software that provides its own MPI parallelization (e.g., LAMMPS), MBX enables efficient MB-nrg energy and force calculations that take advantage of both OpenMP and MPI parallelizations. Future versions of MBX will include improved parallelization schemes as well as the implementation of the extended MB-nrg framework introduced in Ref. 131 for covalently bonded molecules, with the goal of enabling fast MB-nrg energy/force calculations that, in turn, will enable chemically accurate large-scale computer simulations of generic molecular systems.
SUPPLEMENTARY MATERIAL
Description of the MBX input file formats and functional form of the switching functions for the MB-nrg PEFs.
ACKNOWLEDGMENTS
Different aspects of this work were supported by the National Science Foundation (Grant Nos. CHE-1954895 and CHE-2102309) (overall software development and implementation) and the Air Force Office of Scientific Research (Grant No. FA9550-16-1-0327) (PIP optimization). M.R. was partially supported by a Software Fellowship from the Molecular Sciences Software Institute (MolSSI), which was initially funded by the National Science Foundation (Grant No. ACI-1547580). D.G.A.S. was supported by the Molecular Sciences Software Institute (MolSSI), which is funded by the National Science Foundation (Grant No. CHE-2136142). A.C.S. was supported by the intramural research program of the National Heart, Lung, and Blood Institute. C.K. was supported by the Office of Science, U.S. Department of Energy (Contract No. DE-AC02-06CH11357). This research used resources of the Extreme Science and Engineering Discovery Environment (XSEDE), which was supported by the National Science Foundation (Grant No. ACI-1548562), the Department of Defense High Performance Computing Modernization Program (HPCMP), and the Triton Shared Computing Cluster (TSCC) at the San Diego Supercomputer Center.
The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan. http://energy.gov/downloads/doe-public-access-plan.
Contributor Information
Marc Riera, Email: mrierari@ucsd.edu.
Christopher Knight, Email: knightc@anl.gov.
Francesco Paesani, Email: fpaesani@ucsd.edu.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Marc Riera: Conceptualization (equal); Investigation (equal); Methodology (equal); Software (lead); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Christopher Knight: Funding acquisition (supporting); Investigation (equal); Software (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Ethan F. Bull-Vulpe: Investigation (equal); Methodology (equal); Software (supporting); Validation (supporting); Visualization (supporting); Writing – original draft (equal); Writing – review & editing (equal). Xuanyu Zhu: Investigation (equal); Methodology (equal); Software (supporting); Validation (supporting); Visualization (supporting); Writing – original draft (equal); Writing – review & editing (equal). Henry Agnew: Investigation (supporting); Methodology (supporting); Software (supporting); Validation (supporting); Visualization (supporting); Writing – review & editing (supporting). Daniel G. A. Smith: Funding acquisition (supporting); Software (supporting); Writing – review & editing (supporting). Andrew C. Simmonett: Funding acquisition (supporting); Software (supporting); Writing – review & editing (supporting). Francesco Paesani: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request. MBX can be downloaded from https://github.com/paesanilab/MBX.
REFERENCES
- 1. Frenkel D. and Smit B., Understanding Molecular Simulation: From Algorithms to Applications (Elsevier, 2001), Vol. 1.
- 2. Tuckerman M., Statistical Mechanics: Theory and Molecular Simulation (Oxford University Press, 2010).
- 3. van Gunsteren W. F. and Berendsen H. J. C., “Computer simulation of molecular dynamics: Methodology, applications, and perspectives in chemistry,” Angew. Chem., Int. Ed. 29, 992–1023 (1990). 10.1002/anie.199009921
- 4. Binder K., Monte Carlo and Molecular Dynamics Simulations in Polymer Science (Oxford University Press, 1995).
- 5. Warshel A., “Computer simulations of enzyme catalysis: Methods, progress, and insights,” Annu. Rev. Biophys. Biomol. Struct. 32, 425–443 (2003). 10.1146/annurev.biophys.32.110601.141807
- 6. Karplus M. and Kuriyan J., “Molecular dynamics and protein function,” Proc. Natl. Acad. Sci. U. S. A. 102, 6679–6685 (2005). 10.1073/pnas.0408930102
- 7. Durrant J. D. and McCammon J. A., “Molecular dynamics simulations and drug discovery,” BMC Biol. 9, 71 (2011). 10.1186/1741-7007-9-71
- 8. Ohno K., Esfarjani K., and Kawazoe Y., Computational Materials Science: From Ab Initio to Monte Carlo Methods (Springer, 2018).
- 9. Lifson S. and Warshel A., “Consistent force field for calculations of conformations, vibrational spectra, and enthalpies of cycloalkane and n-alkane molecules,” J. Chem. Phys. 49, 5116–5129 (1968). 10.1063/1.1670007
- 10. Warshel A. and Lifson S., “Consistent force field calculations. II. Crystal structures, sublimation energies, molecular and lattice vibrations, molecular conformations, and enthalpies of alkanes,” J. Chem. Phys. 53, 582–594 (1970). 10.1063/1.1674031
- 11. Rappé A. K., Casewit C. J., Colwell K. S., Goddard W. A. III, and Skiff W. M., “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations,” J. Am. Chem. Soc. 114, 10024–10035 (1992). 10.1021/ja00051a040
- 12. Halgren T. A. and Damm W., “Polarizable force fields,” Curr. Opin. Struct. Biol. 11, 236–242 (2001). 10.1016/s0959-440x(00)00196-2
- 13. MacKerell A. D., Jr., “Empirical force fields for biological macromolecules: Overview and issues,” J. Comput. Chem. 25, 1584–1604 (2004). 10.1002/jcc.20082
- 14. Nerenberg P. S. and Head-Gordon T., “New developments in force fields for biomolecular simulations,” Curr. Opin. Struct. Biol. 49, 129–138 (2018). 10.1016/j.sbi.2018.02.002
- 15. Harrison J. A., Schall J. D., Maskey S., Mikulski P. T., Knippenberg M. T., and Morrow B. H., “Review of force fields and intermolecular potentials used in atomistic computational materials research,” Appl. Phys. Rev. 5, 031104 (2018). 10.1063/1.5020808
- 16. Behler J., “Perspective: Machine learning potentials for atomistic simulations,” J. Chem. Phys. 145, 170901 (2016). 10.1063/1.4966192
- 17. Deringer V. L., Caro M. A., and Csányi G., “Machine learning interatomic potentials as emerging tools for materials science,” Adv. Mater. 31, 1902765 (2019). 10.1002/adma.201902765
- 18. Noé F., Tkatchenko A., Müller K.-R., and Clementi C., “Machine learning for molecular simulation,” Annu. Rev. Phys. Chem. 71, 361–390 (2020). 10.1146/annurev-physchem-042018-052331
- 19. Mueller T., Hernandez A., and Wang C., “Machine learning for interatomic potential models,” J. Chem. Phys. 152, 050902 (2020). 10.1063/1.5126336
- 20. Blank T. B., Brown S. D., Calhoun A. W., and Doren D. J., “Neural network models of potential energy surfaces,” J. Chem. Phys. 103, 4129–4137 (1995). 10.1063/1.469597
- 21. Gassner H., Probst M., Lauenstein A., and Hermansson K., “Representation of intermolecular potential functions by neural networks,” J. Phys. Chem. A 102, 4596–4605 (1998). 10.1021/jp972209d
- 22. Lorenz S., Groß A., and Scheffler M., “Representing high-dimensional potential-energy surfaces for reactions at surfaces by neural networks,” Chem. Phys. Lett. 395, 210–215 (2004). 10.1016/j.cplett.2004.07.076
- 23. Manzhos S. and Carrington T., Jr., “Using neural networks to represent potential surfaces as sums of products,” J. Chem. Phys. 125, 194105 (2006). 10.1063/1.2387950
- 24. Behler J. and Parrinello M., “Generalized neural-network representation of high-dimensional potential-energy surfaces,” Phys. Rev. Lett. 98, 146401 (2007). 10.1103/physrevlett.98.146401
- 25. Ghasemi S. A., Hofstetter A., Saha S., and Goedecker S., “Interatomic potentials for ionic systems with density functional accuracy based on charge densities obtained by a neural network,” Phys. Rev. B 92, 045131 (2015). 10.1103/physrevb.92.045131
- 26. Smith J. S., Isayev O., and Roitberg A. E., “ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost,” Chem. Sci. 8, 3192–3203 (2017). 10.1039/c6sc05720a
- 27. Schütt K. T., Sauceda H. E., Kindermans P.-J., Tkatchenko A., and Müller K.-R., “SchNet—A deep learning architecture for molecules and materials,” J. Chem. Phys. 148, 241722 (2018). 10.1063/1.5019779
- 28. Zhang L., Han J., Wang H., Car R., and E W., “Deep potential molecular dynamics: A scalable model with the accuracy of quantum mechanics,” Phys. Rev. Lett. 120, 143001 (2018). 10.1103/physrevlett.120.143001
- 29. Unke O. T. and Meuwly M., “PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges,” J. Chem. Theory Comput. 15, 3678–3693 (2019). 10.1021/acs.jctc.9b00181
- 30. Batzner S., Musaelian A., Sun L., Geiger M., Mailoa J. P., Kornbluth M., Molinari N., Smidt T. E., and Kozinsky B., “E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials,” Nat. Commun. 13, 2453 (2022). 10.1038/s41467-022-29939-5
- 31. Bartók A. P., Payne M. C., Kondor R., and Csányi G., “Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons,” Phys. Rev. Lett. 104, 136403 (2010). 10.1103/physrevlett.104.136403
- 32. Shapeev A. V., “Moment tensor potentials: A class of systematically improvable interatomic potentials,” Multiscale Model. Simul. 14, 1153–1173 (2016). 10.1137/15m1054183
- 33. Thompson A. P., Swiler L. P., Trott C. R., Foiles S. M., and Tucker G. J., “Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials,” J. Comput. Phys. 285, 316–330 (2015). 10.1016/j.jcp.2014.12.018
- 34. Drautz R., “Atomic cluster expansion for accurate and transferable interatomic potentials,” Phys. Rev. B 99, 014104 (2019). 10.1103/physrevb.99.014104
- 35. Rupp M., Tkatchenko A., Müller K.-R., and von Lilienfeld O. A., “Fast and accurate modeling of molecular atomization energies with machine learning,” Phys. Rev. Lett. 108, 058301 (2012). 10.1103/PhysRevLett.108.058301
- 36. Chmiela S., Tkatchenko A., Sauceda H. E., Poltavsky I., Schütt K. T., and Müller K.-R., “Machine learning of accurate energy-conserving molecular force fields,” Sci. Adv. 3, e1603015 (2017). 10.1126/sciadv.1603015
- 37. Vitek A., Stachon M., Krömer P., and Snášel V., “Towards the modeling of atomic and molecular clusters energy by support vector regression,” in 2013 5th International Conference on Intelligent Networking and Collaborative Systems (IEEE, 2013), pp. 121–126.
- 38. Braams B. J. and Bowman J. M., “Permutationally invariant potential energy surfaces in high dimensionality,” Int. Rev. Phys. Chem. 28, 577–606 (2009). 10.1080/01442350903234923
- 39. Xie Z. and Bowman J. M., “Permutationally invariant polynomial basis for molecular energy surface fitting via monomial symmetrization,” J. Chem. Theory Comput. 6, 26–34 (2010). 10.1021/ct9004917
- 40. Wang Y., Huang X., Shepler B. C., Braams B. J., and Bowman J. M., “Flexible, ab initio potential, and dipole moment surfaces for water. I. Tests and applications for clusters up to the 22-mer,” J. Chem. Phys. 134, 094509 (2011). 10.1063/1.3554905
- 41. Wang Y. and Bowman J. M., “Ab initio potential and dipole moment surfaces for water. II. Local-monomer calculations of the infrared spectra of water clusters,” J. Chem. Phys. 134, 154510 (2011). 10.1063/1.3579995
- 42. Barragán P., Prosmiti R., Wang Y., and Bowman J. M., “Full-dimensional (15-dimensional) ab initio analytical potential energy surface for the cluster,” J. Chem. Phys. 136, 224302 (2012). 10.1063/1.4726126
- 43. Mancini J. S. and Bowman J. M., “Communication: A new ab initio potential energy surface for HCl–H2O, diffusion Monte Carlo calculations of D0 and a delocalized zero-point wavefunction,” J. Chem. Phys. 138, 121102 (2013). 10.1063/1.4799231
- 44. Kamarchik E., Toffoli D., Christiansen O., and Bowman J. M., “Ab initio potential energy and dipole moment surfaces of the F−(H2O) complex,” Spectrochim. Acta, Part A 119, 59–62 (2014). 10.1016/j.saa.2013.04.076
- 45. Conte R., Houston P. L., and Bowman J. M., “Communication: A benchmark-quality, full-dimensional ab initio potential energy surface for Ar-HOCO,” J. Chem. Phys. 140, 151101 (2014). 10.1063/1.4871371
- 46. Mancini J. S. and Bowman J. M., “A new many-body potential energy surface for HCl clusters and its application to anharmonic spectroscopy and vibration–vibration energy transfer in the HCl trimer,” J. Phys. Chem. A 118, 7367–7374 (2014). 10.1021/jp412264t
- 47. Qu C., Conte R., Houston P. L., and Bowman J. M., “‘Plug and play’ full-dimensional ab initio potential energy and dipole moment surfaces and anharmonic vibrational analysis for CH4–H2O,” Phys. Chem. Chem. Phys. 17, 8172–8181 (2015). 10.1039/c4cp05913a
- 48. Conte R., Qu C., and Bowman J. M., “Permutationally invariant fitting of many-body, non-covalent interactions with application to three-body methane–water–water,” J. Chem. Theory Comput. 11, 1631–1638 (2015). 10.1021/acs.jctc.5b00091
- 49. Homayoon Z., Conte R., Qu C., and Bowman J. M., “Full-dimensional, high-level ab initio potential energy surfaces for H2(H2O) and H2(H2O)2 with application to hydrogen clathrate hydrates,” J. Chem. Phys. 143, 084302 (2015). 10.1063/1.4929338
- 50. Qu C. and Bowman J. M., “An ab initio potential energy surface for the formic acid dimer: Zero-point energy, selected anharmonic fundamental energies, and ground-state tunneling splitting calculated in relaxed 1–4-mode subspaces,” Phys. Chem. Chem. Phys. 18, 24835–24840 (2016). 10.1039/c6cp03073d
- 51. Wang Y., Bowman J. M., and Kamarchik E., “Five ab initio potential energy and dipole moment surfaces for hydrated NaCl and NaF. I. Two-body interactions,” J. Chem. Phys. 144, 114311 (2016). 10.1063/1.4943580
- 52. Yu Q. and Bowman J. M., “Ab initio potential for H3O+ → H+ + H2O: A step to a many-body representation of the hydrated proton?,” J. Chem. Theory Comput. 12, 5284–5292 (2016). 10.1021/acs.jctc.6b00765
- 53. Wang Q. and Bowman J. M., “Two-component, ab initio potential energy surface for CO2–H2O, extension to the hydrate clathrate, CO2@(H2O)20, and VSCF/VCI vibrational analyses of both,” J. Chem. Phys. 147, 161714 (2017). 10.1063/1.4994543
- 54. Qu C., Yu Q., and Bowman J. M., “Permutationally invariant potential energy surfaces,” Annu. Rev. Phys. Chem. 69, 151–175 (2018). 10.1146/annurev-physchem-050317-021139
- 55. Qu C. and Bowman J. M., “IR spectra of (HCOOH)2 and (DCOOH)2: Experiment, VSCF/VCI, and ab initio molecular dynamics calculations using full-dimensional potential and dipole moment surfaces,” J. Phys. Chem. Lett. 9, 2604–2610 (2018). 10.1021/acs.jpclett.8b00447
- 56. Qu C. and Bowman J. M., “High-dimensional fitting of sparse datasets of CCSD(T) electronic energies and MP2 dipole moments, illustrated for the formic acid dimer and its complex IR spectrum,” J. Chem. Phys. 148, 241713 (2018). 10.1063/1.5017495
- 57. Qu C. and Bowman J. M., “Assessing the importance of the H2–H2O–H2O three-body interaction on the vibrational frequency shift of H2 in the sII clathrate hydrate and comparison with experiment,” J. Phys. Chem. A 123, 329–335 (2018). 10.1021/acs.jpca.8b11675
- 58. Nandi A., Qu C., and Bowman J. M., “Full and fragmented permutationally invariant polynomial potential energy surfaces for trans and cis N-methyl acetamide and isomerization saddle points,” J. Chem. Phys. 151, 084306 (2019). 10.1063/1.5119348
- 59. Qu C. and Bowman J. M., “A fragmented, permutationally invariant polynomial approach for potential energy surfaces of large molecules: Application to N-methyl acetamide,” J. Chem. Phys. 150, 141101 (2019). 10.1063/1.5092794
- 60. Nandi A., Qu C., and Bowman J. M., “Using gradients in permutationally invariant polynomial potential fitting: A demonstration for CH4 using as few as 100 configurations,” J. Chem. Theory Comput. 15, 2826–2835 (2019). 10.1021/acs.jctc.9b00043
- 61. Nandi A., Qu C., Houston P. L., Conte R., Yu Q., and Bowman J. M., “A CCSD(T)-based 4-body potential for water,” J. Phys. Chem. Lett. 12, 10318–10324 (2021). 10.1021/acs.jpclett.1c03152
- 62. Nandi A., Qu C., Houston P. L., Conte R., and Bowman J. M., “Δ-machine learning for potential energy surfaces: A PIP approach to bring a DFT-based PES to CCSD(T) level of theory,” J. Chem. Phys. 154, 051102 (2021). 10.1063/5.0038301
- 63.Jiang B. and Guo H., “Permutation invariant polynomial neural network approach to fitting potential energy surfaces,” J. Chem. Phys. 139, 054112 (2013). 10.1063/1.4817187 [DOI] [PubMed] [Google Scholar]
- 64.Li J., Jiang B., and Guo H., “Permutation invariant polynomial neural network approach to fitting potential energy surfaces. II. Four-atom systems,” J. Chem. Phys. 139, 204103 (2013). 10.1063/1.4832697 [DOI] [PubMed] [Google Scholar]
- 65.Jiang B. and Guo H., “Permutation invariant polynomial neural network approach to fitting potential energy surfaces. III. Molecule-surface interactions,” J. Chem. Phys. 141, 034109 (2014). 10.1063/1.4887363 [DOI] [PubMed] [Google Scholar]
- 66.Xie C., Zhu X., Yarkony D. R., and Guo H., “Permutation invariant polynomial neural network approach to fitting potential energy surfaces. IV. Coupled diabatic potential energy matrices,” J. Chem. Phys. 149, 144107 (2018). 10.1063/1.5054310 [DOI] [PubMed] [Google Scholar]
- 67.Morawietz T. and Behler J., “A density-functional theory-based neural network potential for water clusters including van der Waals corrections,” J. Phys. Chem. A 117, 7356–7366 (2013). 10.1021/jp401225b [DOI] [PubMed] [Google Scholar]
- 68.Schran C., Behler J., and Marx D., “Automated fitting of neural network potentials at coupled cluster accuracy: Protonated water clusters as testing ground,” J. Chem. Theory Comput. 16, 88–99 (2019). 10.1021/acs.jctc.9b00805 [DOI] [PubMed] [Google Scholar]
- 69.Rosenberger D., Smith J. S., and Garcia A. E., “Modeling of peptides with classical and novel machine learning force fields: A comparison,” J. Phys. Chem. B 125, 3598–3612 (2021). 10.1021/acs.jpcb.0c10401 [DOI] [PubMed] [Google Scholar]
- 70.Yue S., Muniz M. C., Calegari Andrade M. F., Zhang L., Car R., and Panagiotopoulos A. Z., “When do short-range atomistic machine-learning models fall short?,” J. Chem. Phys. 154, 034111 (2021). 10.1063/5.0031215 [DOI] [PubMed] [Google Scholar]
- 71.Zhai Y., Caruso A., Bore S. L., Luo Z., and Paesani F., “A ‘short blanket’ dilemma for a state-of-the-art neural network potential for water: Reproducing experimental properties or the physics of the underlying many-body interactions?,” J. Chem. Phys. 158, 084111 (2023). 10.1063/5.0142843 [DOI] [PubMed] [Google Scholar]
- 72.Babin V., Leforestier C., and Paesani F., “Development of a ‘first principles’ water potential with flexible monomers: Dimer potential energy surface, VRT spectrum, and second virial coefficient,” J. Chem. Theory Comput. 9, 5395–5403 (2013). 10.1021/ct400863t
- 73.Babin V., Medders G. R., and Paesani F., “Development of a ‘first principles’ water potential with flexible monomers. II: Trimer potential energy surface, third virial coefficient, and small clusters,” J. Chem. Theory Comput. 10, 1599–1607 (2014). 10.1021/ct500079y
- 74.Medders G. R., Babin V., and Paesani F., “Development of a ‘first-principles’ water potential with flexible monomers. III. Liquid phase properties,” J. Chem. Theory Comput. 10, 2906–2910 (2014). 10.1021/ct5004115
- 75.Bizzarro B. B., Egan C. K., and Paesani F., “Nature of halide–water interactions: Insights from many-body representations and density functional theory,” J. Chem. Theory Comput. 15, 2983–2995 (2019). 10.1021/acs.jctc.9b00064
- 76.Egan C. K., Bizzarro B. B., Riera M., and Paesani F., “Nature of alkali ion–water interactions: Insights from many-body representations and density functional theory. II,” J. Chem. Theory Comput. 16, 3055–3072 (2020). 10.1021/acs.jctc.0c00082
- 77.Paesani F., “Water: Many-body potential from first principles (from the gas to the liquid phase),” in Handbook of Materials Modeling: Methods: Theory and Modeling (Springer, 2020), pp. 635–660.
- 78.Rezac J. and Hobza P., “Benchmark calculations of interaction energies in noncovalent complexes and their applications,” Chem. Rev. 116, 5038–5071 (2016). 10.1021/acs.chemrev.5b00526
- 79.Reddy S. K., Straight S. C., Bajaj P., Huy Pham C., Riera M., Moberg D. R., Morales M. A., Knight C., Götz A. W., and Paesani F., “On the accuracy of the MB-pol many-body potential for water: Interaction energies, vibrational frequencies, and classical thermodynamic and dynamical properties from clusters to liquid water and ice,” J. Chem. Phys. 145, 194504 (2016). 10.1063/1.4967719
- 80.Paesani F., “Getting the right answers for the right reasons: Toward predictive molecular simulations of water with many-body potential energy functions,” Acc. Chem. Res. 49, 1844–1851 (2016). 10.1021/acs.accounts.6b00285
- 81.Richardson J. O., Pérez C., Lobsiger S., Reid A. A., Temelso B., Shields G. C., Kisiel Z., Wales D. J., Pate B. H., and Althorpe S. C., “Concerted hydrogen-bond breaking by quantum tunneling in the water hexamer prism,” Science 351, 1310–1313 (2016). 10.1126/science.aae0012
- 82.Cole W. T., Farrell J. D., Wales D. J., and Saykally R. J., “Structure and torsional dynamics of the water octamer from THz laser spectroscopy near 215 μm,” Science 352, 1194–1197 (2016). 10.1126/science.aad8625
- 83.Mallory J. D. and Mandelshtam V. A., “Diffusion Monte Carlo studies of MB-pol (H2O)2–6 and (D2O)2–6 clusters: Structures and binding energies,” J. Chem. Phys. 145, 064308 (2016). 10.1063/1.4960610
- 84.Videla P. E., Rossky P. J., and Laria D., “Communication: Isotopic effects on tunneling motions in the water trimer,” J. Chem. Phys. 144, 061101 (2016). 10.1063/1.4941701
- 85.Brown S. E., Götz A. W., Cheng X., Steele R. P., Mandelshtam V. A., and Paesani F., “Monitoring water clusters ‘melt’ through vibrational spectroscopy,” J. Am. Chem. Soc. 139, 7082–7088 (2017). 10.1021/jacs.7b03143
- 86.Vaillant C. L. and Cvitaš M. T., “Rotation-tunneling spectrum of the water dimer from instanton theory,” Phys. Chem. Chem. Phys. 20, 26809–26813 (2018). 10.1039/c8cp04991b
- 87.Vaillant C., Wales D., and Althorpe S., “Tunneling splittings from path-integral molecular dynamics using a Langevin thermostat,” J. Chem. Phys. 148, 234102 (2018). 10.1063/1.5029258
- 88.Schmidt M. and Roy P.-N., “Path integral molecular dynamic simulation of flexible molecular systems in their ground state: Application to the water dimer,” J. Chem. Phys. 148, 124116 (2018). 10.1063/1.5017532
- 89.Bishop K. P. and Roy P.-N., “Quantum mechanical free energy profiles with post-quantization restraints: Binding free energy of the water dimer over a broad range of temperatures,” J. Chem. Phys. 148, 102303 (2018). 10.1063/1.4986915
- 90.Videla P. E., Rossky P. J., and Laria D., “Isotopic equilibria in aqueous clusters at low temperatures: Insights from the MB-pol many-body potential,” J. Chem. Phys. 148, 084303 (2018). 10.1063/1.5019377
- 91.Samala N. R. and Agmon N., “Temperature dependence of intramolecular vibrational bands in small water clusters,” J. Phys. Chem. B 123, 9428–9442 (2019). 10.1021/acs.jpcb.9b07777
- 92.Cvitaš M. T. and Richardson J. O., “Quantum tunnelling pathways of the water pentamer,” Phys. Chem. Chem. Phys. 22, 1035–1044 (2020). 10.1039/C9CP05561D
- 93.Medders G. R. and Paesani F., “Infrared and Raman spectroscopy of liquid water through ‘first-principles’ many-body molecular dynamics,” J. Chem. Theory Comput. 11, 1145–1154 (2015). 10.1021/ct501131j
- 94.Straight S. C. and Paesani F., “Exploring electrostatic effects on the hydrogen bond network of liquid water through many-body molecular dynamics,” J. Phys. Chem. B 120, 8539–8546 (2016). 10.1021/acs.jpcb.6b02366
- 95.Reddy S. K., Moberg D. R., Straight S. C., and Paesani F., “Temperature-dependent vibrational spectra and structure of liquid water from classical and quantum simulations with the MB-pol potential energy function,” J. Chem. Phys. 147, 244504 (2017). 10.1063/1.5006480
- 96.Hunter K. M., Shakib F. A., and Paesani F., “Disentangling coupling effects in the infrared spectra of liquid water,” J. Phys. Chem. B 122, 10754–10761 (2018). 10.1021/acs.jpcb.8b09910
- 97.Sun Z., Zheng L., Chen M., Klein M. L., Paesani F., and Wu X., “Electron-hole theory of the effect of quantum nuclei on the X-ray absorption spectra of liquid water,” Phys. Rev. Lett. 121, 137401 (2018). 10.1103/physrevlett.121.137401
- 98.Gaiduk A. P., Pham T. A., Govoni M., Paesani F., and Galli G., “Electron affinity of liquid water,” Nat. Commun. 9, 247 (2018). 10.1038/s41467-017-02673-z
- 99.Cruzeiro V., Wildman A., Li X., and Paesani F., “Relationship between hydrogen-bonding motifs and the 1b1 splitting in the X-ray emission spectrum of liquid water,” J. Phys. Chem. Lett. 12, 3996–4002 (2021). 10.1021/acs.jpclett.1c00486
- 100.Medders G. R. and Paesani F., “Dissecting the molecular structure of the air/water interface from quantum simulations of the sum-frequency generation spectrum,” J. Am. Chem. Soc. 138, 3912–3919 (2016). 10.1021/jacs.6b00893
- 101.Moberg D. R., Straight S. C., and Paesani F., “Temperature dependence of the air/water interface revealed by polarization sensitive sum-frequency generation spectroscopy,” J. Phys. Chem. B 122, 4356–4365 (2018). 10.1021/acs.jpcb.8b01726
- 102.Sun S., Tang F., Imoto S., Moberg D. R., Ohto T., Paesani F., Bonn M., Backus E. H., and Nagata Y., “Orientational distribution of free O–H groups of interfacial water is exponential,” Phys. Rev. Lett. 121, 246101 (2018). 10.1103/physrevlett.121.246101
- 103.Sengupta S., Moberg D. R., Paesani F., and Tyrode E., “Neat water–vapor interface: Proton continuum and the nonresonant background,” J. Phys. Chem. Lett. 9, 6744–6749 (2018). 10.1021/acs.jpclett.8b03069
- 104.Muniz M. C., Gartner T. E. III, Riera M., Knight C., Yue S., Paesani F., and Panagiotopoulos A. Z., “Vapor-liquid equilibrium of water with the MB-pol many-body potential,” J. Chem. Phys. 154, 211103 (2021). 10.1063/5.0050068
- 105.Pham C. H., Reddy S. K., Chen K., Knight C., and Paesani F., “Many-body interactions in ice,” J. Chem. Theory Comput. 13, 1778–1784 (2017). 10.1021/acs.jctc.6b01248
- 106.Moberg D. R., Straight S. C., Knight C., and Paesani F., “Molecular origin of the vibrational structure of ice Ih,” J. Phys. Chem. Lett. 8, 2579–2583 (2017). 10.1021/acs.jpclett.7b01106
- 107.Moberg D. R., Sharp P. J., and Paesani F., “Molecular-level interpretation of vibrational spectra of ordered ice phases,” J. Phys. Chem. B 122, 10572–10581 (2018). 10.1021/acs.jpcb.8b08380
- 108.Moberg D. R., Becker D., Dierking C. W., Zurheide F., Bandow B., Buck U., Hudait A., Molinero V., Paesani F., and Zeuch T., “The end of ice I,” Proc. Natl. Acad. Sci. U. S. A. 116, 24413–24419 (2019). 10.1073/pnas.1914254116
- 109.del Rosso L., Celli M., Colognesi D., Rudic S., English N. J., and Ulivi L., “Density of phonon states in cubic ice Ic,” J. Phys. Chem. C 125, 23533–23538 (2021). 10.1021/acs.jpcc.1c07647
- 110.Rasti S., Jónsson E. Ö., Jónsson H., and Meyer J., “New insights into the volume isotope effect of ice Ih from polarizable many-body potentials,” J. Phys. Chem. Lett. 13, 11831–11836 (2022). 10.1021/acs.jpclett.2c03212
- 111.Bore S. L. and Paesani F., “Realistic phase diagram of water from ‘first principles’ data-driven quantum simulations,” Nat. Commun. 14, 3349 (2023). 10.1038/s41467-023-38855-1
- 112.Zhu X., Riera M., Bull-Vulpe E. F., and Paesani F., “MB-pol(2023): Sub-chemical accuracy for water simulations from the gas to the liquid phase,” J. Chem. Theory Comput. 19, 3551–3566 (2023). 10.1021/acs.jctc.3c00326
- 113.Bajaj P., Götz A. W., and Paesani F., “Toward chemical accuracy in the description of ion–water interactions through many-body representations. I. Halide–water dimer potential energy surfaces,” J. Chem. Theory Comput. 12, 2698–2705 (2016). 10.1021/acs.jctc.6b00302
- 114.Bajaj P., Wang X.-G., Carrington T., Jr., and Paesani F., “Vibrational spectra of halide–water dimers: Insights on ion hydration from full-dimensional quantum calculations on many-body potential energy surfaces,” J. Chem. Phys. 148, 102321 (2018). 10.1063/1.5005540
- 115.Bajaj P., Richardson J. O., and Paesani F., “Ion-mediated hydrogen-bond rearrangement through tunnelling in the iodide–dihydrate complex,” Nat. Chem. 11, 367 (2019). 10.1038/s41557-019-0220-2
- 116.Bajaj P., Zhuang D., and Paesani F., “Specific ion effects on hydrogen-bond rearrangements in the halide–dihydrate complexes,” J. Phys. Chem. Lett. 10, 2823–2828 (2019). 10.1021/acs.jpclett.9b00899
- 117.Bajaj P., Riera M., Lin J. K., Mendoza Montijo Y. E., Gazca J., and Paesani F., “Halide ion microhydration: Structure, energetics, and spectroscopy of small halide–water clusters,” J. Phys. Chem. A 123, 2843–2852 (2019). 10.1021/acs.jpca.9b00816
- 118.Caruso A. and Paesani F., “Data-driven many-body models enable a quantitative description of chloride hydration from clusters to bulk,” J. Chem. Phys. 155, 064502 (2021). 10.1063/5.0059445
- 119.Caruso A., Zhu X., Fulton J. L., and Paesani F., “Accurate modeling of bromide and iodide hydration with data-driven many-body potentials,” J. Phys. Chem. B 126, 8266–8278 (2022). 10.1021/acs.jpcb.2c04698
- 120.Riera M., Mardirossian N., Bajaj P., Götz A. W., and Paesani F., “Toward chemical accuracy in the description of ion–water interactions through many-body representations. Alkali-water dimer potential energy surfaces,” J. Chem. Phys. 147, 161715 (2017). 10.1063/1.4993213
- 121.Riera M., Brown S. E., and Paesani F., “Isomeric equilibria, nuclear quantum effects, and vibrational spectra of M+(H2O)n=1–3 clusters, with M = Li, Na, K, Rb, and Cs, through many-body representations,” J. Phys. Chem. A 122, 5811–5821 (2018). 10.1021/acs.jpca.8b04106
- 122.Zhuang D., Riera M., Schenter G. K., Fulton J. L., and Paesani F., “Many-body effects determine the local hydration structure of Cs+ in solution,” J. Phys. Chem. Lett. 10, 406–412 (2019). 10.1021/acs.jpclett.8b03829
- 123.Riera M., Talbot J. J., Steele R. P., and Paesani F., “Infrared signatures of isomer selectivity and symmetry breaking in the Cs+(H2O)3 complex using many-body potential energy functions,” J. Chem. Phys. 153, 044306 (2020). 10.1063/5.0013101
- 124.Zhuang D., Riera M., Zhou R., Deary A., and Paesani F., “Hydration structure of Na+ and K+ ions in solution predicted by data-driven many-body potentials,” J. Phys. Chem. B 126, 9349–9360 (2022). 10.1021/acs.jpcb.2c05674
- 125.Riera M., Yeh E. P., and Paesani F., “Data-driven many-body models for molecular fluids: CO2/H2O mixtures as a case study,” J. Chem. Theory Comput. 16, 2246–2257 (2020). 10.1021/acs.jctc.9b01175
- 126.Riera M., Hirales A., Ghosh R., and Paesani F., “Data-driven many-body models with chemical accuracy for CH4/H2O mixtures,” J. Phys. Chem. B 124, 11207–11221 (2020). 10.1021/acs.jpcb.0c08728
- 127.Yue S., Riera M., Ghosh R., Panagiotopoulos A. Z., and Paesani F., “Transferability of data-driven, many-body models for CO2 simulations in the vapor and liquid phases,” J. Chem. Phys. 156, 104503 (2022). 10.1063/5.0080061
- 128.Robinson V. N., Ghosh R., Egan C. K., Riera M., Knight C., Paesani F., and Hassanali A., “The behavior of methane–water mixtures under elevated pressures from simulations using many-body potentials,” J. Chem. Phys. 156, 194504 (2022). 10.1063/5.0089773
- 129.Cruzeiro V. W. D., Lambros E., Riera M., Roy R., Paesani F., and Götz A. W., “Highly accurate many-body potentials for simulations of N2O5 in water: Benchmarks, development, and validation,” J. Chem. Theory Comput. 17, 3931–3945 (2021). 10.1021/acs.jctc.1c00069
- 130.Zhou R., Riera M., and Paesani F., “Towards data-driven many-body simulations of biomolecules in solution: N-methyl acetamide as a proxy for the protein backbone,” J. Chem. Theory Comput. 19, 4308–4321 (2023). 10.1021/acs.jctc.3c00271
- 131.Bull-Vulpe E. F., Riera M., Bore S. L., and Paesani F., “Data-driven many-body potential energy functions for generic molecules: Linear alkanes as a proof-of-concept application,” J. Chem. Theory Comput. 19(14), 4494–4509 (2023). 10.1021/acs.jctc.2c00645
- 132.MBX: An energy and force calculator for data-driven many-body potential energy functions, http://paesanigroup.ucsd.edu/software/mbx.html, 2019.
- 133.Bull-Vulpe E. F., Riera M., Götz A. W., and Paesani F., “MB-Fit: Software infrastructure for data-driven many-body potential energy functions,” J. Chem. Phys. 155, 124801 (2021). 10.1063/5.0063198
- 134.MB-Fit: Software infrastructure for data-driven many-body potential energy functions, https://github.com/paesanilab/MB-Fit, 2021.
- 135.Nesbet R. K., “Atomic Bethe-Goldstone equations,” in Advances in Chemical Physics (John Wiley & Sons, Ltd, 1969), pp. 1–34.
- 136.Hankins D., Moskowitz J. W., and Stillinger F. H., “Water molecule interactions,” J. Chem. Phys. 53, 4544–4554 (1970). 10.1063/1.1673986
- 137.Stoll H., “Correlation energy of diamond,” Phys. Rev. B 46, 6700 (1992). 10.1103/physrevb.46.6700
- 138.Stoll H., “On the correlation energy of graphite,” J. Chem. Phys. 97, 8449–8454 (1992). 10.1063/1.463415
- 139.Stoll H., “The correlation energy of crystalline silicon,” Chem. Phys. Lett. 191, 548–552 (1992). 10.1016/0009-2614(92)85587-z
- 140.Stone A. J., The Theory of Intermolecular Forces (Oxford University Press, Oxford, 2013).
- 141.Tang K. T. and Toennies J. P., “An improved simple model for the van der Waals potential based on universal damping functions for the dispersion coefficients,” J. Chem. Phys. 80, 3726–3741 (1984). 10.1063/1.447150
- 142.Becke A. D. and Johnson E. R., “Exchange-hole dipole moment and the dispersion interaction,” J. Chem. Phys. 122, 154104 (2005). 10.1063/1.1884601
- 143.Johnson E. R. and Becke A. D., “A post-Hartree–Fock model of intermolecular interactions,” J. Chem. Phys. 123, 024101 (2005). 10.1063/1.1949201
- 144.Johnson E. R. and Becke A. D., “A post-Hartree-Fock model of intermolecular interactions: Inclusion of higher-order corrections,” J. Chem. Phys. 124, 174104 (2006). 10.1063/1.2190220
- 145.Thole B. T., “Molecular polarizabilities calculated with a modified dipole interaction,” Chem. Phys. 59, 341–350 (1981). 10.1016/0301-0104(81)85176-2
- 146.Burnham C. J., Anick D. J., Mankoo P. K., and Reiter G. F., “The vibrational proton potential in bulk liquid water and ice,” J. Chem. Phys. 128, 154519 (2008). 10.1063/1.2895750
- 147.Sala J., Guardia E., and Masia M., “The polarizable point dipoles method with electrostatic damping: Implementation on a model system,” J. Chem. Phys. 133, 234101 (2010). 10.1063/1.3511713
- 148.Blanco J. L. and Rai P. K., Nanoflann: A C++ header-only fork of FLANN, a library for nearest neighbor (NN) with KD-trees, https://github.com/jlblancoc/nanoflann, 2014.
- 149.Simmonett A. C. and Brooks B. R., “Analytical Hessians for Ewald and particle mesh Ewald electrostatics,” J. Chem. Phys. 154, 104101 (2021). 10.1063/5.0044166
- 150.Simmonett A. C. and Brooks B. R., “A compression strategy for particle mesh Ewald theory,” J. Chem. Phys. 154, 054112 (2021). 10.1063/5.0040966
- 151.Kolafa J., “Time-reversible always stable predictor–corrector method for molecular dynamics of polarizable molecules,” J. Comput. Chem. 25, 335–342 (2004). 10.1002/jcc.10385
- 152.Thompson A. P., Aktulga H. M., Berger R., Bolintineanu D. S., Brown W. M., Crozier P. S., in’t Veld P. J., Kohlmeyer A., Moore S. G., Nguyen T. D., Shan R., Stevens M. J., Tranchida J., Trott C., and Plimpton S. J., “LAMMPS—A flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales,” Comput. Phys. Commun. 271, 108171 (2022). 10.1016/j.cpc.2021.108171
- 153.Kapil V., Rossi M., Marsalek O., Petraglia R., Litman Y., Spura T., Cheng B., Cuzzocrea A., Meißner R. H., Wilkins D. M., Helfrecht B. A., Juda P., Bienvenue S. P., Fang W., Kessler J., Poltavsky I., Vandenbrande S., Wieme J., Corminboeuf C., Kühne T. D., Manolopoulos D. E., Markland T. E., Richardson J. O., Tkatchenko A., Tribello G. A., Van Speybroeck V., and Ceriotti M., “i-PI 2.0: A universal force engine for advanced molecular simulations,” Comput. Phys. Commun. 236, 214–223 (2019). 10.1016/j.cpc.2018.09.020
- 154.Cornell W. D., Cieplak P., Bayly C. I., Gould I. R., Merz K. M., Ferguson D. M., Spellmeyer D. C., Fox T., Caldwell J. W., and Kollman P. A., “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules,” J. Am. Chem. Soc. 117, 5179–5197 (1995). 10.1021/ja00124a002
- 155.Brooks B. R., Bruccoleri R. E., Olafson B. D., States D. J., Swaminathan S., and Karplus M., “CHARMM: A program for macromolecular energy, minimization, and dynamics calculations,” J. Comput. Chem. 4, 187–217 (1983). 10.1002/jcc.540040211
- 156.Jorgensen W. L., Maxwell D. S., and Tirado-Rives J., “Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids,” J. Am. Chem. Soc. 118, 11225–11236 (1996). 10.1021/ja9621760
Supplementary Materials
Description of the MBX input file formats and functional form of the switching functions for the MB-nrg PEFs.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request. MBX can be downloaded from https://github.com/paesanilab/MBX.




