Abstract
We develop a framework for on-the-fly machine-learned force field molecular dynamics simulations based on a multipole featurization scheme that overcomes the scaling bottleneck associated with the number of chemical elements. Considering bulk systems with up to six elements, we demonstrate that the number of density functional theory calls remains approximately independent of the number of chemical elements, in contrast to the increase observed for the smooth overlap of atomic positions scheme.
1. Main
A multitude of machine learning (ML) models and algorithms have been developed in the past decade to replace and improve first-principles or semiempirical predictions of energies and forces at the atomic scale.1−6 The most accurate of these models rely on graph neural networks or transformer architectures trained on large and carefully curated data sets to learn a map directly from structures to energies or forces.5−12 These models are designed to act as general purpose force fields and have been used successfully to predict the properties of a wide range of materials.11,13−16 However, due to the finite nature of the training sets of these models, edge cases often arise during inference, resulting in unpredictable performance or unstable simulations.17 To address this shortcoming, active learning and fine-tuning have been used to dynamically update models.18−21 These schemes utilize uncertainty quantification (UQ) methods, such as Bayesian estimates, to determine when a model needs to be updated. This procedure requires interfacing with electronic structure theory software packages to generate additional reference data. UQ methods have also been used to train smaller, data-adaptive models “on-the-fly” during a simulation.3,21 Implementations of this approach have been built into popular quantum chemistry packages such as VASP3 and CASTEP,22 improving the reproducibility of the training procedure and portability of the resulting models. These methods rely on a combination of Bayesian error estimates and heuristics to automatically form training sets, eliminating the burden of manually generating the reference data. In principle, this approach can be used for any chemical system. Unfortunately, the feature-based regression and UQ models used in these schemes cause the cost of both training and inference to increase dramatically with the size of the training set and the number of elements present. 
Furthermore, the robustness of the on-the-fly training procedure is not well established for complex chemistries.
Due to the inherent trade-offs among the various approaches to machine learning force fields (MLFF), researchers must select from a diverse range of options to identify the model best suited to a given problem. The large, general purpose pretrained models offer a straightforward option since they do not require training and are often designed to work with common scientific packages such as the Atomic Simulation Environment23 or even have web-based interfaces.10 Beyond accessibility, it is also crucial to evaluate the validity of model predictions with a first-principles result. This validation is straightforward in the case of large-scale computational screening, where single-point density functional theory (DFT) calculations can be used to validate a subset of candidate materials based on the results of the ML accelerated screening. However, first-principles validation quickly becomes intractable as the quantity of interest moves from DFT energies to properties based on ensemble averages, such as diffusion coefficients or free energies from enhanced sampling.24−26 This is particularly concerning, as atomistic ML models are known to behave unreliably when making out-of-domain predictions or during extended simulations under reaction conditions.17,27 As a result, utilizing on-the-fly potentials that quantify uncertainty and dynamically validate and update becomes a favorable strategy for obtaining reliable estimates of properties that require long molecular dynamics (MD) or Monte Carlo simulations.
On-the-fly ML algorithms typically require a close coupling between the ML and DFT codes involved. This can be done by directly integrating the ML with the DFT code, which restricts the complexity of the ML algorithms available due to the incompatibility between the lower-level compiled languages used to write DFT codes and the interpreted languages commonly used for cutting-edge ML packages.28−30 Alternatively, it is possible to couple DFT codes to advanced ML frameworks through socket interfaces or hybrid language extensions, allowing fine-tuning of pretrained models on-the-fly.18,31,32 However, this strategy has only emerged recently and substantially increases the complexity of software installation and maintenance. The most prominent on-the-fly implementations are based on legacy smooth overlap of atomic positions (SOAP) chemical descriptors33 and Bayesian linear regression.34 Existing implementations can be found in the VASP,3 CASTEP,22 and SPARC35 codes. However, the generality of these models is not well established, and the SOAP framework is known to scale poorly to systems with many unique chemical elements due to the increasing computational cost of descriptor computation and inference, as well as the inclusion of redundant information in the descriptor vector.36−38
We aim to reduce the gap between state-of-the-art ML models that work for an arbitrary number of elements but are not always portable or transferable, and the existing SOAP-based on-the-fly force fields that are straightforward to use within DFT codes and work well for simple systems but struggle to scale to systems with many chemical elements. To achieve this goal, we introduce a modified workflow based on the normalized Gaussian multipole (GMP) descriptor,39 which shows improved efficiency without compromising performance. See the SI for details on the normalization factor computation. The GMP scheme differs from SOAP in the dependence of the size of the feature vector on the number of unique elements and the formulation of the design matrices. The size of the feature vector for the SOAP chemical descriptor scales quadratically with the number of unique chemical elements, requiring additional computational resources and sometimes causing poor conditioning of the resulting design matrices. Several studies have addressed this issue using compression schemes.37,40 However, the dimension of the resulting feature vector scales linearly with the number of unique elements, and the procedure to calculate the compressed feature vector increases computational cost and complexity. The GMP-based models overcome this scaling issue by implicitly embedding elemental identity through a Gaussian representation of atomic valence densities, leading to a fixed vector size independent of the number of chemical elements in the system.39,41,42 This lack of explicit elemental dependence leads to denser representations of chemical environments, and allows the use of pooled design matrices that combine the chemical information from all element types via the kernel evaluation. 
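To make this scaling difference concrete, the sketch below counts per-atom feature-vector components for each scheme under illustrative hyperparameters. The basis sizes used here are hypothetical and exact counts are implementation dependent; the point is the quadratic-versus-constant trend in the number of elements.

```python
def soap_feature_size(n_elements, n_radial=8, l_max=6):
    """Length of a per-atom SOAP power spectrum: quadratic in n_elements.

    Counts unordered (species pair) x unordered (n, n') x l combinations.
    Illustrative only -- exact counts depend on the implementation.
    """
    n_species_pairs = n_elements * (n_elements + 1) // 2
    n_radial_pairs = n_radial * (n_radial + 1) // 2
    return n_species_pairs * n_radial_pairs * (l_max + 1)


def gmp_feature_size(n_orders=4, n_sigmas=8):
    """Length of a per-atom GMP vector: a fixed (multipole order, Gaussian
    width) grid, independent of how many chemical elements are present."""
    return n_orders * n_sigmas
```

With these hypothetical basis sizes, going from one to six elements grows the SOAP vector from 252 to 5292 components, while the GMP vector stays at 32.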
To ensure portability and ease of use, we implemented the GMP-based on-the-fly potentials in the SPARC DFT code.43 The SPARC code has minimal dependencies (MPI,44 BLAS,45 LAPACK,46 or MKL), ensuring ease of compilation, and uses a real-space formalism that enables mixed boundary conditions and short wall times, allowing rapid generation of on-the-fly MLFF training data for systems of arbitrary chemical complexity.
To test the implementation, we used a series of bulk metals and alloys with up to six elements. The systems include bulk Al, Ag, Au, Ir, Pd, Pt, and Rh; and alloys of Pt, Ag, Au, Ir, Pd, and Rh, each with 32 atoms in the unit cell. Aluminum was included for benchmarking, as it is a common reference system. The remainder of the elements are of interest in heterogeneous catalysis.47−51 Single-element systems provide both a benchmark for the performance of the ML algorithms and allow us to compare energies of formation of the complex alloys. All MD simulations were carried out in the isokinetic (NVK) ensemble. The ML formalism is extensible to other ensembles, but the choice of training ensemble determines the range of applicability for the resulting models. Each simulation was run for 10,000 steps (20 ps) with a corresponding ab initio MD simulation for each system as a ground truth, and we evaluate the models by comparing the total variation distance (TVD)52 of the pair correlation functions (PCFs). Conceptually, the TVD represents the degree of overlap between the PCFs computed using the MLFF and DFT. Additional details on the TVD calculation are provided in the SI. The use of TVD is inspired by previous work revealing that analysis of PCFs provides the best metric for the stability of machine-learned force fields used for molecular dynamics simulations.27 We assess the average TVD over the full trajectory, as well as time-resolved TVDs that provide further insight into stability. We also provide a comparison of the free energies of formation for the alloys to demonstrate a potential application of the GMP MLFF. We compare the common finite displacement (FD) method53,54 to MD methods55−57 to obtain thermodynamic corrections to electronic energies.
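As a concrete illustration of the metric, the TVD between two binned PCFs on a shared radial grid can be computed as follows. This is a generic sketch; the paper's exact binning and normalization are detailed in its SI.

```python
import numpy as np


def total_variation_distance(g_ref, g_model):
    """TVD between two pair correlation functions on a shared r-grid.

    Each PCF is first normalized to a discrete probability distribution,
    so TVD = 0 means identical shapes and TVD = 1 means disjoint support.
    """
    p = np.asarray(g_ref, dtype=float)
    q = np.asarray(g_model, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # half the L1 distance between the two normalized distributions
    return 0.5 * np.abs(p - q).sum()
```

A time-resolved variant is obtained by evaluating the same quantity on PCFs accumulated over a sliding window of the trajectory.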
Figure 1 presents the key accuracy metrics for the AIMD and MLFF models, as well as an illustrative PCF plot to contextualize the TVD metric. We computed the PCFs for all pairwise interactions in each alloy, so the number of distributions scales quadratically with the number of elements. The box and whisker plot in Figure 1 a) provides a visualization of the distribution of TVD means broken down by pairwise interactions (for alloys) or by element type (for pure metals). There is a similar increase in the spread of TVD for the GMP and SOAP models associated with an increase in the number of chemical elements, most notably as the number of elements becomes greater than two. We hypothesize that this increasing variance with chemical complexity arises because the total amount of force data per structure remains fixed while the number of unique chemical interactions that must be modeled increases quadratically. It is interesting that the magnitude of this trend is comparable for both the GMP and SOAP models, since the SOAP models explicitly differentiate between unique chemical elements, whereas the GMP models do not. Despite the increase in error for alloys, the median performance for both models appears to stabilize after 4 unique chemical elements. To provide visual context for the TVD metric, we plot the PCFs for two outliers of the six-element system in Figure 1 b), which correspond to the same system for both GMP and SOAP. Visual inspection of the PCF shows that the ML models accurately reproduce the locations of both major peaks for the system, indicating that despite the relatively large TVD, neither model fails catastrophically. Finally, Figure 1 c) shows the time-resolved TVD corresponding to the same Ir–Ir outlier for the 6-element alloy. Although oscillations are clearly present, they do not increase with time and generally remain below a TVD of 0.25 for both models. These findings suggest that the models are stable even for the least accurate systems.
We further confirmed the stability of the corresponding GMP models by simulating an additional 100,000 steps (200 ps) for all systems using the MLFFs in inference-only mode and observed that the formation energies for the alloys shifted by no more than 70 meV relative to the DFT MD values (see convergence table in the SI). Overall, these results indicate that the MLFFs are similarly accurate and robust for both SOAP- and GMP-based models.
Figure 1.
Performance summary for SOAP and GMP models trained on alloys of 1 to 6 elements. a.) Mean TVD for all alloys broken down by pairwise interaction for GMP and SOAP in a standard box and whisker plot. The distribution in the case of single-element systems comes from the distribution of all 6 individual single-element systems; in all other cases it corresponds to multielement pair correlation functions. Open circles correspond to outliers. b.) Representative partial PCF. Plots correspond to the outliers highlighted for the 6 element alloy, which is the Ir–Ir distribution for both SOAP and GMP. c.) Time-resolved TVD for SOAP and GMP, again for the highlighted outliers. Time resolution was achieved by computing the TVD for a 2 ps interval incremented by 2 fs across the entire trajectory.
Although the accuracy of SOAP- and GMP-based models is similar, their computational cost differs significantly. In appropriately optimized on-the-fly MLFF codes, the computational cost is dominated by the generation of DFT training data. Figure 2 shows the number of DFT calls and the CPU time as a function of the number of chemical elements for both the SOAP and GMP models. Additional efficiency metrics for both models are tabulated in the SI. For SOAP-based models, the number of DFT calls increases sharply when moving from pure metals to alloys and continues to increase as more unique elements are included. The CPU times follow a similar trend and increase steadily as a function of the number of elements. Both observations are attributed to the increase in the size of the SOAP feature vectors with the number of elements. The amount of data needed to train a reliable model generally increases with the dimension of the feature vector due to the curse of dimensionality,58,59 and the CPU time increases because the larger descriptor vector adds to the compute and memory requirements for both the calculation of the features and the evaluation of the kernel. In contrast, both the number of DFT calls and the CPU time required for GMP-based models are approximately constant regardless of the number of elements present. This shows that GMP-based on-the-fly models can overcome the elemental scaling bottleneck that will cause SOAP-based models to become unwieldy as the number of elements increases. We expect to see further improvements as we optimize our implementations of the ML code and extend parallelization of the ML operations. Furthermore, we note that versions of the SOAP descriptor with better elemental scaling37,40 may lead to improvements similar to the GMP results, though further implementation and testing is needed.
Figure 2.
Number of DFT calls (blue, left axis) and CPU times (red, right axis) required to train GMP and SOAP models for all systems. The CPU times include all operations necessary to compute energies and forces using both DFT and the MLFF, as well as time spent training the MLFFs. The error bars for the single element GMP and SOAP systems correspond to the range in number of DFT calls and CPU times among the different pure systems.
To demonstrate the utility of the on-the-fly ML models, we use GMP-based models to compute the Helmholtz free energies of formation for each of the five alloys, including internal, vibrational, and configurational contributions. Here, we compare the results obtained using the FD and MD approaches to obtain the phonon density of states. The FD approach is a well-established method that relies on perturbing atoms corresponding to a primitive cell within a supercell structure to obtain an approximate Hessian of a potential energy surface. This Hessian is used to construct the dynamical matrix, from which phonon modes can be extracted.60 The phonon density of states can also be extracted directly from an MD simulation using a Fourier transform of the velocity autocorrelation function.56,57 We carry out both procedures at the MLFF and DFT levels of theory. The total free energy for each material was computed from the phonon density of states using ASE’s CrystalThermo package. Additional details on these computations and tabulated values can be found in the SI. The results in Figure 3 show excellent agreement between the DFT and ML formation energies computed using the MD method, while the differences between the ML and DFT FD approximations are substantial (∼0.5 eV) in some cases. However, the FD approximation is plagued by numerical uncertainty associated with the selection of phonon broadening and the handling of low-frequency modes. The default options were used for the DFT and ML comparison, which include a broadening of 10⁻³ and include the contributions from imaginary modes by taking the negative square root of the eigenvalue of the Hessian. Alternatively, the black dashed lines in Figure 3 show the results using the DFT data with slightly different options (broadening of 10⁻⁴ and a low-frequency cutoff equivalent to the largest imaginary mode).
The results illustrate that the difference between the DFT and ML models is lower than the numerical uncertainty of the DFT model and highlight the advantage of the MD approach, where these numerical ambiguities are avoided.
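The MD route to the phonon density of states can be sketched compactly: by the Wiener-Khinchin theorem, the power spectrum of the atomic velocities equals the Fourier transform of the velocity autocorrelation function. The minimal single-species version below omits mass weighting and windowing; production calculations in this work use the pwtools package.

```python
import numpy as np


def phonon_dos_from_md(velocities, dt):
    """Vibrational density of states from an MD velocity trajectory.

    velocities: array of shape (n_steps, n_atoms, 3); dt: timestep.
    Returns the frequency grid and a DOS normalized to unit total weight.
    """
    v = np.asarray(velocities, dtype=float)
    n_steps = v.shape[0]
    flat = v.reshape(n_steps, -1)
    # power spectrum per atom/component, summed; equivalent to the
    # Fourier transform of the VACF (Wiener-Khinchin theorem)
    spectrum = np.abs(np.fft.rfft(flat, axis=0)) ** 2
    dos = spectrum.sum(axis=1)
    dos = dos / dos.sum()
    freqs = np.fft.rfftfreq(n_steps, d=dt)
    return freqs, dos
```

For a trajectory dominated by a single vibrational mode, the returned DOS peaks at that mode's frequency.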
Figure 3.
Thermodynamic formation energies for each of the five alloys from the pure bulk components. Each figure contains data for energies computed using phonons extracted directly from MD simulations and the FD method at both the DFT and MLFF levels of theory. The dashed black line in each figure corresponds to the upper bound of the uncertainty associated with low frequency modes and broadening of the PDOS for FD free energy estimates (details in SI). a) Helmholtz free energies of formation with electronic energy included. b) Thermal corrections including vibrational and configurational entropy as well as internal energy.
Furthermore, it has been shown that the MD method implicitly captures some degree of anharmonicity, which has a large impact on the vibrational contribution to entropy.61,62 This effect is not present in pure metal systems, but it plays a significant role in the vibrational energy of the alloys. The discrepancy between the formation energies computed using the FD and MD methods is dominated by the vibrational term, accounting for ∼0.5–1.5 eV of the contribution to the formation energy. Figure 3 b) shows that the difference between the thermal corrections of the FD and MD methods can be as large as ∼2 eV. Beyond accounting for anharmonicity, the MD method also provides a straightforward and potentially more efficient procedure for treating thermal corrections in high-entropy alloys (HEAs). The number of DFT calls necessary to use the FD method for the 32-atom cell (192) is already greater than the number of DFT calls needed for the MLFF-accelerated MD simulation. Due to the random nature of HEAs, even larger cell sizes may be required for rigorous convergence. The number of DFT calls scales linearly with the number of atoms for the FD approach, while we have demonstrated that the number of DFT calls necessary to train a robust GMP-based MLFF model is approximately independent of the chemical complexity. Thus, the on-the-fly MLFF MD approach will become increasingly more efficient as the cell size increases.
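Once a phonon density of states is in hand (from either route), the harmonic vibrational free energy follows from the standard expression that packages such as ASE's CrystalThermo evaluate. The sketch below implements that formula directly, with energies in eV and the DOS normalized to the number of modes; it is a generic illustration, not the paper's production code.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def helmholtz_vibrational(energies, dos, T):
    """Harmonic vibrational Helmholtz free energy from a phonon DOS.

    energies: phonon energies in eV (strictly positive grid);
    dos: states per eV, normalized to 3N modes; T: temperature in K.
    Evaluates F_vib = integral g(E) [E/2 + kT ln(1 - exp(-E/kT))] dE.
    """
    E = np.asarray(energies, dtype=float)
    g = np.asarray(dos, dtype=float)
    zpe = 0.5 * E                                          # zero-point term
    thermal = K_B * T * np.log1p(-np.exp(-E / (K_B * T)))  # thermal term
    integrand = g * (zpe + thermal)
    # trapezoidal integration over the energy grid
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(E)))
```

For a DOS sharply peaked at a single energy E0, this reduces to the textbook single-oscillator result E0/2 + kT ln(1 - exp(-E0/kT)).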
The results presented here suggest that on-the-fly potentials with GMP features are a promising strategy for complex chemical systems with many elements. However, some challenges remain. Both SOAP- and GMP-based on-the-fly models require many hyperparameters, some related to how the features are constructed and others related to the uncertainty quantification and the active learning loop. Although it is possible to set reasonable defaults for some of these hyperparameters, others require systematic optimization63−65 or heuristic tuning. We found that the hyperparameters of the ML algorithm shared by GMP and SOAP, such as the initial training set size and regularization strength, dominated the outcomes of on-the-fly runs. We selected the optimal parameters via grid search on the six-element alloy and pure Pt systems. Tuning the descriptor hyperparameters further improved the MLFF performance, although less dramatically. We found that the GMP models improved when we increased the maximum order of the spherical harmonic angular probe and increased the density of radial probes near atom centers. There is a well-established literature regarding the SOAP descriptor hyperparameters, and here we selected parameters conservatively to ensure adequate accuracy. Interestingly, we discovered that using data taken from an AIMD simulation to tune the hyperparameters was not an effective strategy. We attempted to evaluate the model hyperparameters by decoupling the DFT data generation from the training procedure. That is, we utilized the same algorithm to determine when the model needed to be updated, but we drew the first-principles configuration, energy, and force data from a fixed AIMD trajectory. This allowed us to evaluate a far greater set of hyperparameters without requiring additional DFT calls for each run.
However, even when the hyperparameters were optimized in this way, the models trained on-the-fly often resulted in unstable simulations, as indicated in Figure 4 a), where the black curve depicts force errors for the above “offline” training procedure and the red curve corresponds to the true on-the-fly run. One strategy that was effective was to focus on the most complex systems, since hyperparameters that worked for the 6-element system tended to work well for alloys with fewer elements as well. Nevertheless, hyperparameter optimization required substantial computational and human effort, so establishing more systematic approaches to identifying hyperparameters of on-the-fly ML models for complex chemical systems is an important step to make these approaches more accessible and efficient.
Figure 4.
a.) Maximum error of an MD trajectory for the 6 element system encountered during an on-the-fly run (red) as compared to training from a fixed AIMD trajectory (black). Identical hyperparameters were used in both cases. b.) Scatter plots showing the lack of correlation between the maximum predicted error and the actual RMSE in the on-the-fly simulation for the 5 element alloy for GMP (top) and for Au for SOAP (bottom). The lines represent linear fits between the true and predicted error.
A related challenge is the robustness of on-the-fly models without properly selected hyperparameters. Ideally, the strength of an on-the-fly model is its ability to access the underlying DFT method to ensure reliability. Thus, a user might expect that improper selection of hyperparameters causes a model to be inefficient (i.e., call DFT more often than needed), but should not cause it to yield unphysical results. Unfortunately, this is not the case for the current class of on-the-fly models, where improper hyperparameter selection leads to exploration of highly unphysical portions of phase space resulting in uncontrolled errors, as illustrated in Figure 4 a). This failure mode is related to the uncertainty quantification and structure of the active learning loop. Typically, the error estimates from the Bayesian regression are assumed to be correlated with the actual root mean squared error,3,66−69 although the error estimates are known to be poorly calibrated. However, we observe that in some cases the error estimates are not even correlated with the true error. Figure 4 b) shows the lack of correlation between the Bayesian error estimate and the actual errors for the points where DFT was performed during stable on-the-fly simulations for two different systems with poor correlation (the 5 element system for GMP and Au for SOAP). Pearson’s correlation coefficients between the predicted and actual error for all systems can be found in the SI, and in general the correlation is lower for GMP than for SOAP. Despite the poor correlation, these simulations did not result in catastrophic failure, highlighting the lack of direct connection between the quality of UQ estimates and the stability of the simulation.
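The calibration diagnostic behind Figure 4 b) amounts to the Pearson correlation and a linear fit between the Bayesian estimates and the observed errors at the points where DFT was called. A minimal sketch (generic names, not the SPARC implementation's API):

```python
import numpy as np


def uq_calibration(predicted_errors, true_errors):
    """Pearson correlation and linear-fit parameters between UQ estimates
    and observed errors at the DFT-call points. Generic diagnostic sketch."""
    p = np.asarray(predicted_errors, dtype=float)
    t = np.asarray(true_errors, dtype=float)
    r = float(np.corrcoef(p, t)[0, 1])        # Pearson correlation
    slope, intercept = np.polyfit(p, t, 1)    # least-squares line t ~ p
    return r, float(slope), float(intercept)
```

A well-calibrated estimator would give r near 1 and a slope near unity; a slope far from 1 with high r indicates miscalibration, while low r indicates the estimates carry little information about the true error.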
The issue of unreliable uncertainty estimates is compounded by the dynamically updated threshold that is often used to overcome the lack of calibration in error estimates.3,70 Once a large error is observed for a training point, the threshold increases, and the model is unable to recover because the DFT calculations are no longer triggered. This can be improved by using heuristics, such as periodically forcing DFT calculations,22 but this requires additional hyperparameters and can still lead to catastrophic failure between checks. Avoiding catastrophic failures will require the integration of well-calibrated UQ estimates with more robust active learning loops, although the small number of data points and highly correlated nature of MD data make statistically rigorous UQ challenging in the case of on-the-fly MLFF.
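The threshold lockout described above can be made concrete with a schematic single step of the active learning loop. All function names here are hypothetical placeholders standing in for the ML predictor, the DFT call, and the model update; this is a sketch of the general scheme, not the SPARC implementation.

```python
def on_the_fly_step(predict, dft_call, update_model, state, structure,
                    threshold_factor=1.2):
    """One schematic step of a Bayesian on-the-fly active-learning loop.

    predict(structure) -> (forces, error_estimate)
    dft_call(structure) -> reference forces
    update_model(structure, forces) retrains the regressor.
    Returns (forces_used, dft_was_called).
    """
    forces, err = predict(structure)
    if err > state["threshold"]:
        ref_forces = dft_call(structure)          # trigger first-principles call
        update_model(structure, ref_forces)
        # Raising the threshold after a large error is what can lock out
        # future DFT calls -- the failure mode described in the text.
        state["threshold"] = max(state["threshold"], threshold_factor * err)
        return ref_forces, True
    return forces, False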
Despite these challenges, the results presented here indicate that both SOAP and GMP featurization schemes can be used to construct accurate on-the-fly potentials for systems with up to six unique elements, and the GMP featurization scheme enables development of on-the-fly potentials that are more efficient for many-element systems. We demonstrate that the MLFFs generated on-the-fly are robust enough to be used for simulations at length scales not accessible via AIMD, and illustrate their utility in computing free energies for complex HEAs. The GMP potentials exhibit favorable scaling in terms of the number of DFT calls necessary to train robust potentials, whereas SOAP models require more DFT data as the number of unique chemical elements increases. The models are implemented in the open-source, portable, and highly parallelized real-space DFT code, SPARC,71−73 so that they can be used by the community and applied to large systems with or without periodic boundary conditions. Future work will focus on improving the efficiency of ML operations, the automation of hyperparameter selection, and the development of more reliable uncertainty quantification and active learning approaches. We expect that this platform will enable wider adoption and application of on-the-fly MLFFs for a wide range of systems relevant to chemistry, materials science, and chemical engineering.
2. Computational Methods
We previously adapted the on-the-fly ML algorithm developed by Jinnouchi et al.3 and inspired by the Gaussian approximation potentials2 to correct orbital-free DFT calculations to Kohn–Sham accuracy in the SPARC electronic structure code.43,70 This implementation has been extended to be compatible with the full Kohn–Sham formalism, with the option to use the SOAP35 or GMP descriptors; all on-the-fly functionality is implemented in a development branch. The initial training set size and regularization strength for the ML models were systematically optimized using a grid search routine on the 6 element alloy and pure Pt. Descriptor parameters were either taken from the literature or tweaked heuristically to improve accuracy. We used 2 × 2 × 2 bulk supercells with 32 atoms for all systems. We utilized the PBE exchange-correlation functional with the D3 correction scheme of Grimme.74 The k-point density and mesh spacing were adjusted until the DFT energy was converged to at least 10⁻³ Ha/atom. The SPMS ONCV pseudopotentials with nonlinear core corrections were used to treat core electrons.75,76 The calculations were performed with periodic boundary conditions in all principal directions. All systems were treated as spin-paired since none of the metals considered are magnetic.77 We validated the accuracy of our AIMD simulations using the blocking method to quantify the blocked standard error for the distance associated with the dominant peak in each all-atom PCF.78,79 The standard error did not exceed 10⁻² Bohr for any of the simulations, indicating that the trajectories were sufficiently converged. Free energies were computed using the ASE CrystalThermo package for both the FD and MD methods. The phonon density of states was extracted using the ASE Phonons module for the FD method with both DFT and ML calculators for force estimations. The phonon density of states was extracted from MD simulations using the pwtools Python package. Additional details are provided in the SI.
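The blocking analysis used to validate the AIMD peak positions can be sketched compactly: the correlated time series is averaged over contiguous blocks, and the scatter of the block means gives a standard error that accounts for serial correlation. This is a minimal single-block-size version of the Flyvbjerg–Petersen procedure, not the exact analysis script used here.

```python
import numpy as np


def blocked_standard_error(x, n_blocks=32):
    """Standard error of the mean of a correlated series via blocking.

    Splits the series into n_blocks contiguous blocks (dropping any
    ragged tail) and uses the sample scatter of the block means.
    """
    x = np.asarray(x, dtype=float)
    n = (len(x) // n_blocks) * n_blocks
    block_means = x[:n].reshape(n_blocks, -1).mean(axis=1)
    return float(block_means.std(ddof=1) / np.sqrt(n_blocks))
```

For uncorrelated data this recovers the usual sigma/sqrt(N) estimate; for correlated MD data it grows with the block size until the blocks are effectively independent.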
Acknowledgments
The authors gratefully acknowledge the support of the U.S. Department of Energy, Office of Science under grant DE-SC0023445. This research was also supported by the supercomputing infrastructure provided by Partnership for an Advanced Computing Environment (PACE) through its Hive (U.S. National Science Foundation through grant MRI-1828187) and Phoenix clusters at Georgia Institute of Technology, Atlanta, Georgia.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.4c00474.
Additional computational details, tabulated values corresponding to manuscript figures, and a heatmap of Pearson’s correlation values for uncertainty quantification and true error (PDF)
The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Department of Energy, or the U.S. Government.
The authors declare no competing financial interest.
References
- Behler J.; Parrinello M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 2007, 98, 146401. 10.1103/PhysRevLett.98.146401. [DOI] [PubMed] [Google Scholar]
- Bartók A. P.; Payne M. C.; Kondor R.; Csányi G. Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons. Phys. Rev. Lett. 2010, 104, 136403. 10.1103/PhysRevLett.104.136403. [DOI] [PubMed] [Google Scholar]
- Jinnouchi R.; Karsai F.; Kresse G. On-the-fly machine learning force field generation: Application to melting points. Phys. Rev. B 2019, 100, 014105. 10.1103/PhysRevB.100.014105. [DOI] [PubMed] [Google Scholar]
- Drautz R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 2019, 99, 014104. 10.1103/PhysRevB.99.014104. [DOI] [Google Scholar]
- Batatia I.; Benner P.; Chiang Y.; Elena A. M.; Kovács D. P.; Riebesell J.; Advincula X. R.; Asta M.; Baldwin W. J.; Bernstein N.. et al. A foundation model for atomistic materials chemistry. arXiv e-prints 2023, arXiv:2401.00096.
- Passaro S.; Zitnick C. L.; Reducing S. O.. Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs. arXiv e-prints 2023, arXiv:2302.03655.
- Chen C.; Ye W.; Zuo Y.; Zheng C.; Ong S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019, 31, 3564–3572. 10.1021/acs.chemmater.9b01294. [DOI] [Google Scholar]
- Chanussot L.; Das A.; Goyal S.; Lavril T.; Shuaibi M.; Riviere M.; Tran K.; Heras-Domingo J.; Ho C.; Hu W.; et al. Open Catalyst 2020 (OC20) Dataset and Community Challenges. ACS Catal. 2021, 11, 6059–6072. 10.1021/acscatal.0c04525. [DOI] [Google Scholar]
- Merchant A.; Batzner S.; Schoenholz S. S.; Aykol M.; Cheon G.; Cubuk E. D. Scaling deep learning for materials discovery. Nature 2023, 624, 80–85. 10.1038/s41586-023-06735-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tran R.; Lan J.; Shuaibi M.; Wood B. M.; Goyal S.; Das A.; Heras-Domingo J.; Kolluru A.; Rizvi A.; Shoghi N.; et al. The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts. ACS Catal. 2023, 13, 3066–3084. 10.1021/acscatal.2c05426. [DOI] [Google Scholar]
- Batatia I.; Péter Kovács D.; Simm G. N. C.; Ortner C.; Csányi G.. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields. arXiv e-prints 2022, arXiv:2206.07697.
- Liao Y.-L.; Wood B.; Das A.; Smidt T.. EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations. arXiv e-prints 2023, arXiv:2306.12059.
- Gasteiger J.; Shuaibi M.; Sriram A.; Günnemann S.; Ulissi Z.; Zitnick C. L.; Das A. GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets. Transactions on Machine Learning Research 2022, 1–22.
- Lan J.; Palizhati A.; Shuaibi M.; Wood B. M.; Wander B.; Das A.; Uyttendaele M.; Zitnick C. L.; Ulissi Z. W. AdsorbML: Accelerating Adsorption Energy Calculations with Machine Learning. 2022, 1–20.
- Sunshine E. M.; Shuaibi M.; Ulissi Z. W.; Kitchin J. R. Chemical Properties from Graph Neural Network-Predicted Electron Densities. J. Phys. Chem. C 2023, 127, 23459–23466. 10.1021/acs.jpcc.3c06157.
- Kovács D. P.; Moore J. H.; Browning N. J.; Batatia I.; Horton J. T.; Kapil V.; Witt W. C.; Magdău I.-B.; Cole D. J.; Csányi G. MACE-OFF23: Transferable Machine Learning Force Fields for Organic Molecules. arXiv e-prints 2023, arXiv:2312.15211.
- Stocker S.; Gasteiger J.; Becker F.; Günnemann S.; Margraf J. T. How Robust are Modern Graph Neural Network Potentials in Long and Hot Molecular Dynamics Simulations? Mach. Learn.: Sci. Technol. 2022, 3, 045010. 10.1088/2632-2153/ac9955.
- Musielewicz J.; Wang X.; Tian T.; Ulissi Z. FINETUNA: fine-tuning accelerated molecular simulations. Mach. Learn.: Sci. Technol. 2022, 3, 03LT01. 10.1088/2632-2153/ac8fe0.
- Vandermause J.; Kozinsky B.; Owen C. J. Active learning of reactive Bayesian force fields applied to heterogeneous catalysis dynamics of H/Pt. Nat. Commun. 2022, 13, 5183. 10.1038/s41467-022-32294-0.
- Podryabinkin E. V.; Shapeev A. V. Active learning of linearly parametrized interatomic potentials. Comput. Mater. Sci. 2017, 140, 171–180. 10.1016/j.commatsci.2017.08.031.
- Lysogorskiy Y.; Bochkarev A.; Mrovec M.; Drautz R. Active learning strategies for atomic cluster expansion models. Phys. Rev. Materials 2023, 7, 043801. 10.1103/PhysRevMaterials.7.043801.
- Stenczel T. K.; El-Machachi Z.; Liepuoniute G.; Morrow J. D.; Bartók A. P.; Probert M. I. J.; Csányi G.; Deringer V. L. Machine-learned acceleration for molecular dynamics in CASTEP. J. Chem. Phys. 2023, 159, 044803. 10.1063/5.0155621.
- Hjorth Larsen A.; Jørgen Mortensen J.; Blomqvist J.; Castelli I. E.; Christensen R.; Dułak M.; Friis J.; Groves M. N.; Hammer B.; Hargus C. The atomic simulation environment - A Python library for working with atoms. J. Phys.: Condens. Matter 2017, 29, 273002. 10.1088/1361-648X/aa680e.
- Jakse N.; Pasturel A. Liquid aluminum: Atomic diffusion and viscosity from ab initio molecular dynamics. Sci. Rep. 2013, 3, 3135. 10.1038/srep03135.
- Bailleul S.; Dedecker K.; Cnudde P.; Vanduyfhuys L.; Waroquier M.; Van Speybroeck V. Ab initio enhanced sampling kinetic study on MTO ethene methylation reaction. J. Catal. 2020, 388, 38–51. 10.1016/j.jcat.2020.04.015.
- Piccini G.; Lee M. S.; Yuk S. F.; Zhang D.; Collinge G.; Kollias L.; Nguyen M. T.; Glezakou V. A.; Rousseau R. Ab initio molecular dynamics with enhanced sampling in heterogeneous catalysis. Catal. Sci. Technol. 2022, 12, 12–37. 10.1039/D1CY01329G.
- Fu X.; Wu Z.; Wang W.; Xie T.; Keten S.; Gomez-Bombarelli R.; Jaakkola T. Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations. 2022, 1–25.
- Paszke A.; Gross S.; Massa F.; Lerer A.; Bradbury J.; Chanan G.; Killeen T.; Lin Z.; Gimelshein N.; Antiga L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv e-prints 2019.
- Bezanson J.; Edelman A.; Karpinski S.; Shah V. B. Julia: A Fresh Approach to Numerical Computing. 2014; https://ui.adsabs.harvard.edu/abs/2014arXiv1411.1607B.
- Witt W. C.; van der Oord C.; Gelžinytė E.; Järvinen T.; Ross A.; Darby J. P.; Ho C. H.; Baldwin W. J.; Sachs M.; Kermode J.; et al. ACEpotentials.jl: A Julia implementation of the atomic cluster expansion. J. Chem. Phys. 2023, 159, 164101. 10.1063/5.0158783.
- Tran K.; Ulissi Z. W. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nature Catalysis 2018, 1, 696–703. 10.1038/s41929-018-0142-1.
- Shuaibi M.; Sivakumar S.; Chen R. Q.; Ulissi Z. W. Enabling robust offline active learning for machine learning potentials using simple physics-based priors. Mach. Learn.: Sci. Technol. 2021, 2, 025007. 10.1088/2632-2153/abcc44.
- Bartók A. P.; Kondor R.; Csányi G. On representing chemical environments. Phys. Rev. B 2013, 87, 184115. 10.1103/PhysRevB.87.184115.
- Jinnouchi R.; Asahi R. Predicting Catalytic Activity of Nanoparticles by a DFT-Aided Machine-Learning Algorithm. J. Phys. Chem. Lett. 2017, 8, 4279–4283. 10.1021/acs.jpclett.7b02010.
- Kumar S.; Jing X.; Pask J. E.; Suryanarayana P. On-the-fly machine learned force fields for the study of warm dense matter: application to diffusion and viscosity of CH. Phys. Plasmas 2024, 31 (4), 043905. 10.1063/5.0204229.
- Byggmästar J.; Nordlund K.; Djurabekova F. Simple machine-learned interatomic potentials for complex alloys. Phys. Rev. Materials 2022, 6, 083801. 10.1103/PhysRevMaterials.6.083801.
- Darby J. P.; Kermode J. R.; Csányi G. Compressing local atomic neighbourhood descriptors. npj Computational Materials 2022, 8, 166. 10.1038/s41524-022-00847-y.
- Byggmästar J.; Nordlund K.; Djurabekova F. Modeling refractory high-entropy alloys with efficient machine-learned interatomic potentials: Defects and segregation. Phys. Rev. B 2021, 104, 104101. 10.1103/PhysRevB.104.104101.
- Lei X.; Medford A. J. A Universal Framework for Featurization of Atomistic Systems. J. Phys. Chem. Lett. 2022, 13, 7911–7919. 10.1021/acs.jpclett.2c02100.
- Darby J. P.; Kovács D. P.; Batatia I.; Caro M. A.; Hart G. L. W.; Ortner C.; Csányi G. Tensor-Reduced Atomic Density Representations. Phys. Rev. Lett. 2023, 131, 028001. 10.1103/PhysRevLett.131.028001.
- Liu M.; Kitchin J. R. SingleNN: Modified Behler-Parrinello Neural Network with Shared Weights for Atomistic Simulations with Transferability. J. Phys. Chem. C 2020, 124, 17811–17818. 10.1021/acs.jpcc.0c04225.
- Shuaibi M.; Hu Y.; Lei X.; Comer B. M.; Adams M.; Paras J.; Chen R. Q.; Musa E.; Musielewicz J.; Peterson A. A.; et al. AmpTorch: A Python package for scalable fingerprint-based neural network training on multi-element systems with integrated uncertainty quantification. Journal of Open Source Software 2023, 8, 5035. 10.21105/joss.05035.
- Zhang B.; Jing X.; Xu Q.; Kumar S.; Sharma A.; Erlandson L.; Sahoo S. J.; Chow E.; Medford A. J.; Pask J. E.; et al. SPARC v2.0.0: Spin-orbit coupling, dispersion interactions, and advanced exchange-correlation functionals. Software Impacts 2024, 20, 100649. 10.1016/j.simpa.2024.100649.
- Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. 1994.
- Blackford L. S. An updated set of basic linear algebra subprograms (BLAS). ACM Trans. Math. Softw. 2002, 28, 135–151. 10.1145/567806.567807.
- Anderson E.; Bai Z.; Bischof C.; Blackford L. S.; Demmel J.; Dongarra J.; Du Croz J.; Greenbaum A.; Hammarling S.; McKenney A.; et al. LAPACK Users' Guide, 3rd ed.; Society for Industrial and Applied Mathematics, 1999.
- Greeley J.; Stephens I. E.; Bondarenko A. S.; Johansson T. P.; Hansen H. A.; Jaramillo T. F.; Rossmeisl J.; Chorkendorff I.; Nørskov J. K. Alloys of platinum and early transition metals as oxygen reduction electrocatalysts. Nat. Chem. 2009, 1, 552–556. 10.1038/nchem.367.
- Batchelor T. A.; Pedersen J. K.; Winther S. H.; Castelli I. E.; Jacobsen K. W.; Rossmeisl J. High-Entropy Alloys as a Discovery Platform for Electrocatalysis. Joule 2019, 3, 834–845. 10.1016/j.joule.2018.12.015.
- Pedersen J. K.; Batchelor T. A. A.; Bagger A.; Rossmeisl J. High-Entropy Alloys as Catalysts for the CO2 and CO Reduction Reactions. ACS Catal. 2020, 10, 2169–2176. 10.1021/acscatal.9b04343.
- Greeley J.; Mavrikakis M. Alloy catalysts designed from first principles. Nat. Mater. 2004, 3, 810–815. 10.1038/nmat1223.
- Stephens I. E. L.; Bondarenko A. S.; Grønbjerg U.; Rossmeisl J.; Chorkendorff I. Understanding the electrocatalysis of oxygen reduction on platinum and its alloys. Energy Environ. Sci. 2012, 5, 6744–6762. 10.1039/c2ee03590a.
- Gibbs A. L.; Su F. E. On Choosing and Bounding Probability Metrics. International Statistical Review 2002, 70, 419–435. 10.1111/j.1751-5823.2002.tb00178.x.
- Parlinski K.; Li Z. Q.; Kawazoe Y. First-principles determination of the soft mode in cubic ZrO2. Phys. Rev. Lett. 1997, 78, 4063–4066. 10.1103/PhysRevLett.78.4063.
- Kresse G.; Furthmüller J.; Hafner J. Ab initio Force Constant Approach to Phonon Dispersion Relations of Diamond and Graphite. Europhys. Lett. 1995, 32, 729. 10.1209/0295-5075/32/9/005.
- Schmerler S. elcorto/pwtools. 2021; 10.5281/zenodo.5640089, accessed 05/15/2024.
- Lin S.-T.; Blanco M.; Goddard W. A., III. The two-phase model for calculating thermodynamic properties of liquids from molecular dynamics: Validation for the phase diagram of Lennard-Jones fluids. J. Chem. Phys. 2003, 119, 11792–11805. 10.1063/1.1624057.
- Minakov D. V.; Levashov P. R.; Fokin V. B. Vibrational spectrum and entropy in simulation of melting. Comput. Mater. Sci. 2017, 127, 42–47. 10.1016/j.commatsci.2016.10.023.
- Watt J.; Borhani R.; Katsaggelos A. K. Machine Learning Refined: Foundations, Algorithms, and Applications; Cambridge University Press, 2016; pp 1–286. 10.1017/CBO9781316402276.002.
- Bellman R. Dynamic Programming; Princeton University Press, 1957.
- Alfè D. PHON: A program to calculate phonons using the small displacement method. Comput. Phys. Commun. 2009, 180, 2622–2633. 10.1016/j.cpc.2009.03.010.
- Ma D.; Grabowski B.; Körmann F.; Neugebauer J.; Raabe D. Ab initio thermodynamics of the CoCrFeMnNi high entropy alloy: Importance of entropy contributions beyond the configurational one. Acta Mater. 2015, 100, 90–97. 10.1016/j.actamat.2015.08.050.
- Grabowski B.; Ikeda Y.; Srinivasan P.; Körmann F.; Freysoldt C.; Duff A. I.; Shapeev A.; Neugebauer J. Ab initio vibrational free energies including anharmonicity for multicomponent alloys. npj Computational Materials 2019, 5, 80. 10.1038/s41524-019-0218-8.
- Lones M. A. How to avoid machine learning pitfalls: a guide for academic researchers. arXiv e-prints 2021, arXiv:2108.02497.
- Yang L.; Shami A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316. 10.1016/j.neucom.2020.07.061.
- Bischl B.; Binder M.; Lang M.; Pielok T.; Richter J.; Coors S.; Thomas J.; Ullmann T.; Becker M.; Boulesteix A.-L.; et al. Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges. arXiv e-prints 2021, arXiv:2107.05847.
- Zhang G.; Liu W.; Hu T.; Shuai S.; Chen C.; Xu S.; Ren W.; Wang J.; Ren Z. On-the-fly machine learning force field study of liquid-Al/α-Al2O3 interface. Appl. Surf. Sci. 2023, 638, 158141. 10.1016/j.apsusc.2023.158141.
- Liu X.; Wang W.; Pérez-Ríos J. Molecular dynamics-driven global potential energy surfaces: Application to the AlF dimer. J. Chem. Phys. 2023, 159, 144103. 10.1063/5.0169080.
- Jaykhedkar N.; Bystrický R.; Sýkora M.; Bučko T. How the Temperature and Composition Govern the Structure and Band Gap of Zr-Based Chalcogenide Perovskites: Insights from ML Accelerated AIMD. Inorg. Chem. 2023, 62, 12480–12492. 10.1021/acs.inorgchem.3c01696.
- Lama B.; Smirnova A. L.; Paudel T. R. Enhanced Li-Ion Diffusivity of LiFePO4 by Ru Doping: Ab Initio and Machine Learning Force Field Results. ACS Applied Energy Materials 2023, 6, 10424–10431. 10.1021/acsaem.3c01429.
- Kumar S.; Jing X.; Pask J. E.; Medford A. J.; Suryanarayana P. Kohn-Sham accuracy from orbital-free density functional theory via Δ-machine learning. J. Chem. Phys. 2023, 159, 244106. 10.1063/5.0180541.
- Xu Q.; Sharma A.; Comer B.; Huang H.; Chow E.; Medford A. J.; Pask J. E.; Suryanarayana P. SPARC: Simulation Package for Ab-initio Real-space Calculations. SoftwareX 2021, 15, 100709. 10.1016/j.softx.2021.100709.
- Ghosh S.; Suryanarayana P. SPARC: Accurate and efficient finite-difference formulation and parallel implementation of Density Functional Theory: Isolated clusters. Comput. Phys. Commun. 2017, 212, 189–204. 10.1016/j.cpc.2016.09.020.
- Ghosh S.; Suryanarayana P. SPARC: Accurate and efficient finite-difference formulation and parallel implementation of Density Functional Theory: Extended systems. Comput. Phys. Commun. 2017, 216, 109–125. 10.1016/j.cpc.2017.02.019.
- Grimme S.; Antony J.; Ehrlich S.; Krieg H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104. 10.1063/1.3382344.
- Shojaei M. F.; Pask J. E.; Medford A. J.; Suryanarayana P. Soft and transferable pseudopotentials from multi-objective optimization. Comput. Phys. Commun. 2023, 283, 108594. 10.1016/j.cpc.2022.108594.
- Hamann D. R. Optimized norm-conserving Vanderbilt pseudopotentials. Phys. Rev. B 2013, 88, 085117. 10.1103/PhysRevB.88.085117.
- Lopanitsyna N.; Fraux G.; Springer M. A.; De S.; Ceriotti M. Modeling high-entropy transition-metal alloys with alchemical compression. Phys. Rev. Materials 2023, 7, 045802. 10.1103/PhysRevMaterials.7.045802.
- Flyvbjerg H.; Petersen H. G. Error estimates on averages of correlated data. J. Chem. Phys. 1989, 91, 461–466. 10.1063/1.457480.
- Grossfield A.; Zuckerman D. M. In Annual Reports in Computational Chemistry; Wheeler R. A., Ed.; Elsevier, 2009; Vol. 5; pp 23–48.