Abstract
Small-molecule adsorption energies correlate with energy barriers of catalyzed intermediate reaction steps, determining the dominant microkinetic mechanism. Straining the catalyst can alter adsorption energies and break scaling relationships that inhibit reaction engineering, but identifying desirable strain patterns using density functional theory is intractable because of the high-dimensional search space. We train a graph neural network to predict the adsorption energy response of a catalyst/adsorbate system under a proposed surface strain pattern. The training data are generated by randomly straining and relaxing Cu-based binary alloy catalyst complexes taken from the Open Catalyst Project. The trained model successfully predicts the adsorption energy response for 85% of strains in unseen test data, outperforming ensemble linear baselines. Using ammonia synthesis as an example, we identify Cu-S alloy catalysts as promising candidates for strain engineering. Our approach can locate strain patterns that break adsorption energy scaling relations to improve catalyst performance.
Structure-aware machine learning captures strain impact on molecule-surface interactions for rapid catalyst evaluation.
INTRODUCTION
Structure-property relationships form the core of rational materials design; understanding how changes in atomic structure affect emergent material properties is a primary goal of computational materials modeling (1, 2). The symmetric elastic strain tensor ε quantifies the change of a material’s periodic unit cell from an initial reference state: the bulk ground state crystal structure that minimizes the free energy of formation at zero stress. At a material surface, disruption of bulk bonding changes the electron distribution at the surface and induces surface stress, which can be alleviated by shifts in the atomic positions corresponding to surface strain (3–5). These concepts extend to the rearrangement of surface atoms under any mechanical force. Surface structural changes can dominate the structure-property relationships at the nanoscale, where a substantial portion of the atoms are located near the surface, and mechanical forces can originate from epitaxial mismatch, bending, or other mechanically coupled effects such as piezoelectricity (6, 7). Assuming a coordinate system with a surface normal z component, surface strains are described by three of the Voigt dimensions that are parallel to the plane of the surface ε1,2,6, but surface atoms can also relax in the out-of-plane direction where periodicity is broken. Analysis of this continuous three-dimensional (3D) surface strain space is typically limited to single-element structures with low-index surfaces and high-symmetry deformations (uniaxial or biaxial). This is because the search space is vast and low-index surfaces typically form spontaneously under bulk cleavage or epitaxial growth (8–12).
In small-molecule reactions such as ammonia synthesis, carbon dioxide reduction, or nitrogen dioxide reduction, an effective heterogeneous catalyst reduces the energy of transition states in bond-breaking or bond-building reactions, lowering the activation energy barrier and increasing the likelihood that the reaction proceeds in the desired direction (13). While these energy barriers are difficult to characterize directly, the adsorption energy of a molecular structure on a surface has been successfully used as a proxy to describe catalyst activity and assist in catalyst design (14). Linear scaling relationships identified for adsorption energies of different molecules across different surfaces reflect the similar bonding configurations of many small molecules on valence d-band materials (15, 16). However, these relationships imply that it is difficult to improve catalytic activity by simply changing the catalyst material because the relative adsorption energies of neighboring molecular intermediates in the reaction will not change (17, 18). Strain has been suggested as a promising strategy to break these scaling relationships by changing the surface bonding environment (19, 20), and there are multiple experimental observations indicating that strain can effectively manipulate catalyst-adsorbate interactions and modify catalyst activity across different reactions (9, 11, 21–26). This is especially promising given recent advances in core-shell nanoparticle synthesis and nano-heterostructure synthesis through deposition, allowing for strain control to be achieved in high–surface area systems ideal for catalytic applications (27–30). By breaking these scaling relations, including strain as a degree of freedom in catalyst design greatly increases the complexity of an already high-dimensional search space that includes the catalyst structure and composition, the surface facet, the adsorption site, and the adsorbate composition. 
Nanoparticles frequently contain high-index surfaces that hold high activity potential but are relatively understudied compared to conventional epitaxial surfaces of metals (31).
Supervised machine learning models can learn nonlinear functions in high-dimensional spaces from a relatively small subset of representative training data. The success of machine learning approaches depends on the combination of the selected model and the featurization of the data, which is the process of preparing and filtering data before it passes into the model. Recently, neural networks using strain tensors as inputs were applied to predict the strain response of the electronic structures of diamond and silicon using a training set of density functional theory (DFT) calculations; equivalent results using DFT alone would have required more than 100 million additional calculations, which is several orders of magnitude beyond what is achievable with current computational capability (32, 33). These models enable deep elastic strain engineering by learning the relationship between the target property (either the bandgap or full band structure) and the strain tensor from a small amount of randomly dispersed training data, covering a region far outside the conventional small-strain linear elastic approximation. However, because the training data contain only one single-element material, extending these predictions to a new material requires additional training data and a new model. Generating large, accurate, and statistically representative training datasets is a substantial bottleneck in applying machine learning to crystalline materials (34); recently, the Open Catalyst Project (OCP) released a dataset of more than 1.2 million DFT-relaxed catalyst-adsorbate structures to address this challenge and facilitate prediction of adsorption energies for catalyst discovery and optimization (35). The dataset spans the critical contributing factors to adsorption energy: bulk composition and structure, surface facet, adsorbate site, and adsorbate composition. 
Modeling the adsorption energy in a machine learning model requires fine-grained featurizations that contain information about the specific atomic positions around the coordination site. One strategy is to construct these features explicitly, which improves model interpretation but can require substantial precomputation for complex systems (36). A different strategy is to use the atomic positions directly and allow the model to learn from the atomic structures. Graph neural networks (GNNs) are a candidate model class for this problem because they operate on atomic structures represented as graphs, which preserve distance and neighbor information for all the atoms in a structure. Several GNN architectures have been developed and applied to predict molecular and crystalline properties, including the adsorption energy of a handful of molecules on bulk structure bimetallic alloy surfaces (37–42). While using these models in the physical sciences remains an active research area, the possibility of generalizing over compositional and structural degrees of freedom is promising for materials applications.
In this work, we synthesize the discussed approaches to investigate the effect of general surface strain engineering on adsorption energies of 27 important small-molecule adsorbates over a range of Cu-binary alloy surfaces taken from the OCP dataset. Cu alloys have generated recent broad interest in catalysis due to recent success in identifying high-activity Cu-based catalysts for carbon dioxide reduction using a combination of machine learning and experiments (43). High-throughput DFT calculations generate a strained training set by randomly applying strains to Cu-alloy catalyst-adsorbate complexes. We find that a DimeNet++ GNN architecture combined with an additional neural network to include strain information succeeds on classification and regression tasks to determine the adsorption energy response to strain. Extrapolating the model to predict the strain response of brand-new surface and adsorbate combinations is more difficult, but our results enable surface strain to be efficiently considered as a continuous engineering parameter in catalyst design.
RESULTS
Dataset generation and machine learning workflow
For a particular catalyst-adsorbate complex structure (Fig. 1A, Cat + Ads), we specify the compositions of the catalyst and the adsorbate and the surface facet with the shorthand X(hkl) : Y*, where X is the surface composition, hkl are the Miller indices of the surface facet, and Y* is the adsorbate with * specifying the initial adsorbing atom. X(hkl)ε denotes a strained catalyst. The adsorption energy Eads is defined as the difference between the energy of the catalyst-adsorbate complex and the energies of the individual, separated catalyst (Cat) and adsorbate (Ads; Fig. 1A). In vacuum, Eads depends on the structure and composition of the surface, the structure and composition of the molecule, and the coordination site (location on the surface where the molecule interacts). The strained adsorption energy is similarly defined as the difference between the energy of the adsorbate-strained surface complex X(hkl)ε : Y* and the energies of the individual strained surface X(hkl)ε and adsorbate. We seek to predict the change in adsorption energy ΔEads of an adsorbate on a surface due to a rotation-free applied strain in the plane of the surface described by the strain tensor ε, with uniaxial components ε11 (ε1) and ε22 (ε2) and shear component ε12 (ε6). The sign and magnitude of ΔEads have been shown to depend on the surface composition, facet, adsorbate composition, and nature of the deformation even in relatively simple systems such as Pt(111)ε: N (19, 44).
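The definitions above can be written compactly as follows. The bracket notation E[·] for a relaxed total energy and the superscript ε for strained quantities are introduced here for illustration only and are not taken from the original text.

```latex
\begin{align}
E_{\mathrm{ads}} &= E[\mathrm{Cat{+}Ads}] - E[\mathrm{Cat}] - E[\mathrm{Ads}] \\
E_{\mathrm{ads}}^{\varepsilon} &= E[\mathrm{Cat{+}Ads}]^{\varepsilon} - E[\mathrm{Cat}]^{\varepsilon} - E[\mathrm{Ads}] \\
\Delta E_{\mathrm{ads}}(\varepsilon) &= E_{\mathrm{ads}}^{\varepsilon} - E_{\mathrm{ads}}
  = \bigl(E[\mathrm{Cat{+}Ads}]^{\varepsilon} - E[\mathrm{Cat}]^{\varepsilon}\bigr)
  - \bigl(E[\mathrm{Cat{+}Ads}] - E[\mathrm{Cat}]\bigr)
\end{align}
```

Written this way, the isolated-adsorbate energy cancels in ΔEads, so each label depends only on the pair of strained structures and the pair of unstrained structures.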
Fig. 1. Modeling workflow for data generation, training, and inference.
(A) Atomistic overview of the structures used to calculate the change in adsorption energy with surface strain. Blue rectangles indicate inputs to the machine learning model; green arrows indicate high-throughput DFT ionic relaxations. (B) Workflow for dataset curation, assembly, and model training. From the Open Catalyst dataset, a subset of binary Cu alloy catalysts with adsorbates is selected (see Supplementary Materials). Random strains are generated for each alloy catalyst, and ΔEads is calculated to form the targets for the training set. Green boxes indicate datasets. Orange boxes are model outputs. Gray boxes are trainable models, and pink arrows show model training. (C) After successful model training, inference can be performed over strain space and surface-adsorbate combinations. Under one applied strain, the adsorption energy of initial states (IS), transition states (TS), and final states (FS), which scales with transition states, can shift in opposite directions, fundamentally changing reaction energy barriers.
The dataset development and model training workflow are summarized in Fig. 1B. To develop a model to approximate ΔEads(ε), we assemble a training dataset of strained Cat + Ads complexes from first-principles DFT calculations, using the public OCP dataset as a starting point (Fig. 1B) (35). Recently, numerous copper alloy surfaces were identified as high-activity catalysts in CO2 reduction on the basis of their combined adsorption energies for *H and *CO (43, 45). Building on these results, we extract a compositional subset of the OCP dataset consisting of binary copper alloy catalysts (CuxM1-x, where M is an alloying element) and 27 adsorbates as the scope for our strain investigation. A list of catalyst alloy elements and adsorbates in the training dataset is given in table S1. For each Cat + Ads complex in the filtered dataset, six unique strain tensors were randomly generated by selecting uniform random values for ε1, ε2, and 2ε6 between −3% and 3%. The bulk lattice structure (no strain applied) was also included for each Cat + Ads complex to provide Eads. Each random strain tensor was applied to both the Cat structure and the Cat + Ads complex, generating a pair of structures that were relaxed to enable calculation of the strained adsorption energy. Taking the difference between the strained adsorption energy and Eads gives the final training labels ΔEads. The original OCP catalyst-adsorbate supercells are large enough to minimize multi-adsorbate interactions across the periodic boundary conditions of the unit cell (35). Details of the structure formation and DFT relaxations unique to this work are given in Materials and Methods. Six strains were chosen to keep the energy calculations computationally tractable while including a diverse set of surfaces and molecules; random sampling was selected to generate a uniform distribution of strain orientations relative to the randomly distributed orientations of the adsorbate coordinate environments.
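The random strain sampling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, the sampled shear value is used directly as the off-diagonal tensor component (the shear convention is an assumption), and the deformation is applied only to the two in-plane lattice vectors.

```python
import numpy as np

def sample_strain_tensors(n, max_strain=0.03, seed=0):
    """Draw n symmetric in-plane strain tensors with components in [-max_strain, max_strain]."""
    rng = np.random.default_rng(seed)
    e1, e2, e6 = rng.uniform(-max_strain, max_strain, size=(3, n))
    eps = np.zeros((n, 2, 2))
    eps[:, 0, 0] = e1
    eps[:, 1, 1] = e2
    # Assumption: the sampled shear value is the tensor component eps_12.
    # Keeping the tensor symmetric makes the deformation rotation-free.
    eps[:, 0, 1] = e6
    eps[:, 1, 0] = e6
    return eps

def apply_strain(cell_2d, eps):
    """Deform in-plane lattice vectors (rows of cell_2d) by (I + eps)."""
    return np.asarray(cell_2d, dtype=float) @ (np.eye(2) + eps).T

# Six random strains for one hypothetical surface cell (lattice vectors in angstroms).
cell = np.array([[3.6, 0.0], [1.8, 3.1]])
strained_cells = [apply_strain(cell, e) for e in sample_strain_tensors(6)]
```

Each strained cell would then be used for both the Cat and the Cat + Ads structure before relaxation, so that the pair shares the same deformation.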
The inputs to the machine learning model training are the relaxed zero-strain Cat + Ads structure (represented as a graph; details in Materials and Methods) and the strain tensor, and the output is the change in adsorption energy after strain and relaxation. Therefore, only the Cat + Ads structure relaxed at the bulk lattice parameters is required to make predictions about the adsorption energy response across 3D surface strain space. Further discussion of the model architecture choices is presented in the “Dataset inspection and model selection to incorporate strain” section and the text in the Supplementary Materials. With a successfully trained model, this strain space can be efficiently explored across different catalyst-adsorbate systems to engineer reaction energy diagrams (Fig. 1C). For comparison, evaluating the adsorption energy with 0.5% resolution in the 3D strain space of −3 to 3% over ε1, ε2, and ε6 would require 2200 grid points in strain space for each catalyst-adsorbate complex and more than 6.5 million total structures. Using our calculation cost for training data generation of ~18 hours per pair of Cat and Cat + Ads structures, this effort would consume 72,000 central processing unit (CPU)-years of DFT calculations versus 120 combined CPU-hours and graphics processing unit (GPU)-hours required to train our model.
Dataset inspection and model selection to incorporate strain
After the high-throughput DFT dataset generation, we inspect the resulting distribution of ΔEads to evaluate whether machine learning is warranted; a detailed discussion and analysis of the dataset distribution is given in section S1 and fig. S1. The training dataset shows no obvious trend in ΔEads across several catalyst and adsorbate degrees of freedom that a simple physics-based model can capture, supporting the hypothesis that machine learning could be useful. On the basis of this result, we seek to construct both classification and regression tasks for our model. A summarized target ΔEads distribution is plotted in Fig. 2 (A and B), where Fig. 2B zooms into the central histogram bar in Fig. 2A. The data distribution contains long tails on both the positive and negative sides that span several orders of magnitude, and there is a high concentration of values near zero corresponding to essentially no strain effect. For catalyst design by strain engineering, we are primarily interested in determining whether a strain will significantly increase, decrease, or have no effect on the adsorption energy of a particular adsorbate. Therefore, we bin the dataset into three categories: ΔEads < −25 meV (class −Δ, blue), ∣ΔEads∣ < 25 meV (class Z, gray), and ΔEads > 25 meV (class +Δ, pink) to define a classification task for our model. Twenty-five milli-electron volts (kBT evaluated at T = 300 K) is chosen as the threshold to classify a significant strain response, and class Z is short for zero effect. We verify that each of these classes contains a representative distribution of the different Cat + Ads structures. Figure 2C shows the histogram of the fraction of member training examples that originate from a particular Cat + Ads complex in each class. For example, consider the Cu3Sb(210):*CHOH complex shown in Fig. 1A. If we apply five random surface strains to this structure and two of them result in ΔEads < −25 meV (Fig. 2D), then they will contribute to the 0.4 bar of the −Δ histogram in Fig. 2C. Likewise, strains with ΔEads > 25 meV contribute to the +Δ histogram, and ∣ΔEads∣ < 25 meV strains (including all ground state structures and identity matrix strains by definition) contribute to Z. From these histograms, we conclude that most Cat + Ads complexes appear in multiple classes; therefore, accurate classification cannot be achieved on structural or compositional information alone. Only Z contains some complexes with 100% membership, and this is reasonable because we expect that certain Cat + Ads complexes will be relatively immune to surface strain. The total class splits in the training set are given in table S2. The class distribution analysis confirms that classifying the strain response into broad buckets still requires both the structure and the specific strain pattern to successfully predict. For the regression task, we simply normalize the target ΔEads distribution to zero mean and unit SD and calculate the mean absolute error (MAE) of the predicted values against the true values (additional details in Materials and Methods).
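The three-way binning and the per-complex class fractions behind Fig. 2C can be illustrated with a short sketch. The integer class encoding (−1, 0, +1), the function names, and the example energies are hypothetical; values exactly at ±25 meV are placed in class Z here, which is an assumption because the text defines all three bins with strict inequalities.

```python
import numpy as np

THRESHOLD_EV = 0.025  # 25 meV, roughly k_B*T at 300 K

def classify(delta_e_ads):
    """Map ΔE_ads values (eV) to -1 (class -Δ), 0 (class Z), or +1 (class +Δ)."""
    d = np.asarray(delta_e_ads, dtype=float)
    return np.where(d < -THRESHOLD_EV, -1, np.where(d > THRESHOLD_EV, 1, 0))

def class_fractions(delta_e_ads):
    """Fraction of one complex's strains that fall in each class."""
    labels = classify(delta_e_ads)
    return {c: float(np.mean(labels == c)) for c in (-1, 0, 1)}

# Five hypothetical strains on one Cat + Ads complex:
# two in -Δ, one in Z, two in +Δ, matching the example in Fig. 2D.
example = [-0.040, -0.031, 0.004, 0.052, 0.120]
fractions = class_fractions(example)
```

For this example the complex contributes to the 0.4 bar of the −Δ histogram, the 0.2 bar of Z, and the 0.4 bar of +Δ.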
Fig. 2. Data distribution across strain pattern and composition supports classification task, plus ensemble linear classifier performance.
(A) Total histogram and (B) zoomed histogram of ΔEads in the training dataset and assigned classes. Class −Δ (ΔEads < −25 meV) is blue. Class Z (∣ΔEads∣ < 25 meV) is gray. Class +Δ (ΔEads > 25 meV) is pink. (C) Histograms of fractional class membership grouped by Cat + Ads structure show even distribution of Cat + Ads structures across the three assigned classes. (D) Example of histogram generation; of five hypothetical strains for Cu3Sb:*CHOH, two fall in −Δ, one falls in Z, and two fall in +Δ [highlighted histogram bars in (C)]. (E) Confusion matrix for the ensemble linear regression baseline model on test data. The x axis gives the model predicted classes, and the y axis gives the true values; the diagonal gives the frequency of correct predictions within each class.
To establish a performance baseline and justify adding model complexity, we test an ensemble linear baseline model by fitting a separate linear regression to each unique combination of catalyst alloy element and adsorbate in the training data (80% of the dataset). We then use each individual regression model to predict the class of any matching alloy-element + adsorbate structures in the test data (10% of the dataset held out from training; see Materials and Methods). We ask the baseline model to do some generalization over the specific catalyst composition and structure because this is a potential feature of the GNN that expands predictive capability. Figure 2E gives the normalized confusion matrix for this baseline classifier, which shows the fraction of true samples predicted to fall in each class by the model; each row sums to 1, and the correct model predictions appear along the matrix diagonal. This model performs better than random guessing but still misidentifies the class of ~45% of the test data, with an F1 score of 0.58 for the classification task. The average MAE across each ensemble linear model for the regression task is 0.17 eV, above the typical threshold in catalysis of 0.1 eV. Additional classification and regression metrics for the ensemble linear baseline are given in table S3. Because neither the classification nor the regression baseline performance is sufficient to be practically useful, we proceed to training and testing the GNN hypothesis.
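A baseline of this kind can be sketched as one least-squares fit per (alloy element, adsorbate) group. The text does not state which input features the linear regressions use; the sketch below assumes the three strain components plus an intercept, and all function names are hypothetical.

```python
import numpy as np
from collections import defaultdict

def fit_ensemble_linear(groups, strains, targets):
    """Fit one linear model per group key.

    groups  : list of hashable keys, e.g. (alloy_element, adsorbate)
    strains : (n, 3) array of (e1, e2, e6)   [assumed feature set]
    targets : (n,) array of ΔE_ads values
    """
    rows_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        rows_by_group[g].append(i)
    # Append a column of ones so each model has an intercept term.
    X = np.hstack([np.asarray(strains, float), np.ones((len(groups), 1))])
    y = np.asarray(targets, float)
    models = {}
    for g, rows in rows_by_group.items():
        coef, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
        models[g] = coef
    return models

def predict(models, group, strain):
    """Predict ΔE_ads for one strain using the model fitted for its group."""
    return float(np.append(np.asarray(strain, float), 1.0) @ models[group])
```

The regression output can then be binned with the same 25 meV threshold to obtain a class prediction for comparison with the confusion matrix.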
Model selection, training, and performance
From the training set analysis, we recognize that we need a model that can generalize over both structural and compositional degrees of freedom. GNNs are a promising candidate for this application because differentiating the strain response across different surfaces and molecules requires incorporating detailed structural information into the input. We adapted and modified the DimeNet++ model architecture, first introduced by Klicpera et al. and used in the Open Catalyst challenge, to predict adsorption energies from initial structure (35, 39, 40). The model architecture is shown in Fig. 3A. The graph represents atoms as nodes and the interactions between atomic pairs as edges within a cutoff radius, chosen to be 7 Å with a maximum of 60 nearest neighbors (based on original hyperparameters in the Open Catalyst Dataset) (35). The network embeds each node (atom) of the graph as a set of directional pairwise interactions, and the edges are embedded using a set of spherical basis functions that incorporate bond angle information. The basis set choice and embedding strategy provides rotational invariance to the model; more details are available in (39, 40). After the graph representation of the Cat + Ads complex is passed through the standard DimeNet++ model, we pad the node level output with zeros to the size of the largest structure in the dataset and append the normalized strain tensor, injecting the second component of the input data. The combination of the DimeNet++ output and the strain tensor is lastly passed through a small fully connected neural network before the output is summed to give the final prediction. Adapting a GNN architecture originally designed for molecules to our low-symmetry Cat + Ads structures and strain inputs motivated specific architecture choices, data augmentation strategies, and regularization schemes to obtain sufficient model performance. Details and rationale behind these decisions are discussed in sections S2 and S3.
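The step that combines the graph network output with the strain tensor can be illustrated with a plain NumPy forward pass. This is only a sketch of the pad, append, and sum sequence described above: the per-atom scalar output stands in for the DimeNet++ node-level output, and the layer sizes, the tanh activation, and the function name are assumptions rather than details of the published architecture.

```python
import numpy as np

def strain_block_forward(node_out, strain, max_atoms, weights):
    """Pad node-level output, append the strain, apply a small dense network, and sum.

    node_out  : (n_atoms,) per-atom outputs from the GNN (stand-in values here)
    strain    : (3,) normalized strain components (e1, e2, e6)
    max_atoms : size of the largest structure in the dataset
    weights   : (W1, b1, W2, b2) for a two-layer fully connected network
    """
    padded = np.zeros(max_atoms)
    padded[: len(node_out)] = node_out           # zero-pad to a fixed length
    x = np.concatenate([padded, strain])         # inject the strain information
    W1, b1, W2, b2 = weights
    hidden = np.tanh(W1 @ x + b1)                # assumed activation
    out = W2 @ hidden + b2
    return float(out.sum())                      # summed to give one prediction

rng = np.random.default_rng(0)
max_atoms, n_hidden = 8, 4
weights = (
    rng.normal(size=(n_hidden, max_atoms + 3)), np.zeros(n_hidden),
    rng.normal(size=(2, n_hidden)), np.zeros(2),
)
prediction = strain_block_forward(
    np.array([0.1, -0.2, 0.3]), np.array([0.5, -0.5, 0.0]), max_atoms, weights
)
```

Padding to a fixed length is what lets a dense layer accept structures with different numbers of atoms alongside the fixed-size strain input.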
Fig. 3. Model architecture, task definition, and task results on test data.
(A) Model architecture used for classification and regression tasks. The relaxed zero-strain Cat + Ads structure is input to DimeNet++. The strain tensor is appended to the padded DimeNet++ output and passed through a fully connected neural network (StrainBlock). Regularization is performed on node-level output by classifying nodes as adsorbate, surface, or bulk. ΔEads classification and regression are graph-level tasks. (B) Normalized confusion matrix for the GNN + strain model on test data. Each row matches a different true category, while each column matches a predicted category; the diagonal boxes give the percentage of correct predictions for each class. (C) Results from the GNN regression task, zoomed in bottom. Graph background colors give the true class, while point colors give the predicted class based on the regression. (D) Error analysis in the test data as a function of adsorbate composition and training representation. The x axis gives the error rate within each adsorbate, while the y axis gives the training data representation; *CN stands out as the outlying adsorbate. (E) Same as (D) but grouped by alloy element; no significant outliers are observed.
The performance metrics for the GNN + Strain classifier and the GNN + Strain regressor on test data (10% randomly withheld; details in Materials and Methods) from the training procedure are summarized in Fig. 3 (B and C). As in Fig. 2E, the normalized confusion matrix in Fig. 3B gives the fraction of true samples in the test data that were predicted to fall in each class by the classifier; each row sums to 1, and the correct model predictions appear along the matrix diagonal. On the same set of training and testing data as the linear baseline, the GNN + Strain classifier outperforms the ensemble linear baseline by at least 20% in every category. In addition, the error rate misidentifying −Δ and +Δ classes [thereby confusing a large positive (negative) ΔEads with a large negative (positive) ΔEads] is, on average, one-third of the same linear baseline error, and this is the costliest error to make when evaluating the impact of strain on a reaction diagram. The regression results are shown over the full test dataset in Fig. 3C (top) [zoomed in Fig. 3C (bottom)]; the MAE for the regression model is 0.08 eV, which is within the target range for machine learning approximators in catalysis (35). The points are colored according to the class predicted by the regressor, such that any points in a shaded region of a different color indicate a misclassification by the regression model, while matching points indicate a success. As expected, this model has more difficulty distinguishing positive and negative ΔEads near the MAE, which is where many of the samples lie. Overall, the regularized model architecture (detailed in table S4) performs well on both the classification and regression tasks. Supplied with a larger training dataset, performance may further improve if these tasks are combined, for example, training a separate regressor model within each predicted class. 
Model performance decreased when the test data were constructed of new Cat + Ads compositions completely unseen in the training data; this type of extrapolation is a goal for the field of physical GNNs but requires larger datasets than the one generated in this work.
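The row-normalized confusion matrices in Figs. 2E and 3B, and the sign-confusion error discussed above, can be computed as in the following sketch. The integer class encoding (−1 for −Δ, 0 for Z, +1 for +Δ), the function names, and the toy labels are illustrative assumptions.

```python
import numpy as np

def normalized_confusion(y_true, y_pred, classes=(-1, 0, 1)):
    """Rows are true classes, columns are predicted classes; each nonempty row sums to 1."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)))
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    row_sums = cm.sum(axis=1, keepdims=True)
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)

def sign_flip_rate(cm):
    """Average rate of predicting +Δ for a true -Δ sample and vice versa."""
    return 0.5 * (cm[0, -1] + cm[-1, 0])

y_true = [-1, -1, 0, 0, 1, 1, 1, 1]
y_pred = [-1, 1, 0, 0, 1, 1, 1, -1]
cm = normalized_confusion(y_true, y_pred)
flip = sign_flip_rate(cm)
```

The diagonal of `cm` gives the per-class fraction of correct predictions, and the two corner entries isolate the costliest error, in which the sign of the strain response is predicted incorrectly.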
Incorrect predictions in the test data are further analyzed in Fig. 3D to assess the variance in the correct model predictions across compositional degrees of freedom. The x axis in Fig. 3D gives the percent predicted incorrectly within each adsorbate subgroup that appears in the randomly selected test data. All adsorbates fall within 10% of the average error rate except for *CN, and *CN is one of the least represented adsorbates in the total strain dataset. The triple bond of *CN is distinct from the bonding of the other adsorbates considered; we anticipate that this can make *CN an outlier in terms of strain-adsorption response and that the performance on this adsorbate would improve with additional training data examples. There are no other immediately discernible trends in the error rate with respect to adsorbate composition, which means that the model is generalizing across the strain response of different adsorbates well. The adsorbate composition showed the largest error variance within the test data. Figure 3E gives the same analysis as Fig. 3D but split by catalyst alloy element, and there is no outlying high-error element despite large differences in element representation across the overall dataset representation highlighted in fig. S1C. Looking beyond composition, additional error analysis is given by calculating the Pearson correlation coefficients across several different interpretable features in fig. S2. These correlation coefficients measure the quality of a linear fit between the prediction error on the test data and the features of the test data Cat + Ads structures, giving an indication of feature importance. The only feature that we identified with a correlation coefficient magnitude greater than 0.15 is the cumulative absolute displacement of the adsorbate atoms under strain. 
This is physically reasonable, as strains that induce large changes in the adsorbate configuration both have outsized impacts on the adsorption energy and, more importantly, are underrepresented in the training dataset. Given additional training data and/or training data filtered by adsorbate atom displacement, our model framework can be adapted to obtain higher accuracy on these large deformation strain patterns, which could be a better pool of structural candidates for adsorbate strain engineering.
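The Pearson correlation used in this error analysis can be computed directly from its definition. In the sketch below, the feature and error arrays are placeholder values and the function name is hypothetical; the feature could be, for example, the cumulative absolute displacement of the adsorbate atoms.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Placeholder data: one interpretable feature and the absolute prediction error.
feature = [0.05, 0.10, 0.40, 0.80, 1.20]
abs_error = [0.02, 0.03, 0.05, 0.04, 0.11]
r = pearson_r(feature, abs_error)
```

A coefficient near zero indicates that the feature has little linear relationship with the error, which is how the features in fig. S2 are screened.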
Inference identifies alloy compositions suitable for surface strain engineering
Recall that the dataset used for training and testing the model contained six random strains for each Cat + Ads structure plus an additional zero-strain structure matching the bulk lattice constants. Considering that the inclusive 3D strain space between −3 and 3% at 0.5% resolution requires 13 grid points in each direction or 2197 total DFT calculations per Cat + Ads structure, this training set covers 0.3% of the total strain space for each Cat + Ads structure. For inference, we generate 500 random strains in this 3D strain space (22% of the total space at the same grid resolution) for each Cat + Ads structure in the dataset (~445,000 total strain + structure combinations) and use the trained classifier model to predict the category for ΔEads for each strain. Inference across all points in the dataset takes ~6 hours on 1 GPU; comparable DFT calculations would require more than 15,000 CPU-years of computational effort.
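The grid-counting argument above is simple arithmetic and can be checked with a few lines. The function name is hypothetical, and because the random strains are continuous samples rather than grid points, the fractions are nominal count ratios only.

```python
def points_per_axis(lo, hi, step):
    """Number of inclusive grid points between lo and hi at spacing step."""
    return round((hi - lo) / step) + 1

n_axis = points_per_axis(-3.0, 3.0, 0.5)   # strain values in percent
n_grid = n_axis ** 3                       # three strain components

train_fraction = 6 / n_grid      # six random strains per Cat + Ads structure
infer_fraction = 500 / n_grid    # 500 random strains per structure at inference

print(n_axis, n_grid)
print(f"{train_fraction:.2%} {infer_fraction:.1%}")
```

The inclusive range gives 13 points per axis and 13 cubed, or 2197, grid points in total, so six training strains correspond to roughly 0.3% of the grid and 500 inference strains to a little under a quarter of it.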
The ammonia synthesis reaction N2 + 3H2 → 2NH3 is one of the most important industrial chemical reactions in the world and one of the most highly studied in catalysis (46). The overall reaction is exergonic, but on many catalysts, the reaction pathway begins as exergonic and ends as endergonic because of the presence of stable adsorbed intermediates (47, 48). The rate determining step of the most-studied dissociative pathway in Haber-Bosch conditions can be one of several intermediate steps including dissociation of N2 and various H + NHx → NHx+1 steps depending on the catalyst and the catalytic environment. A general guiding principle toward improving ammonia synthesis catalyst performance is reducing the cumulative magnitude of the endergonic steps within the reaction pathway (48–50). Cu-based catalysts have been a recent focus of electrocatalytic nitrogen and nitrate reduction studies, which introduces the additional complexity of competing reactions such as hydrogen evolution (51–53). While many features of the reaction conditions ultimately contribute to the ammonia synthesis rate, the adsorption energy describes the foundational interaction between the catalyst and relevant intermediates from which further microkinetic analysis can be conducted (54). We choose the intermediate reaction *H + *N → *NH as an illustrative example for identifying catalyst candidates with high-strain engineering potential. Figure 4A plots an average of the ground state energy of the reactants *H + *N (black lines) and product *NH (red lines) grouped by catalyst alloy composition. This gives an indication of the relative adsorption energies between surface compositions in the strain-free case. All the intermediate energies are exergonic relative to the formation energies of both N2 and NH3, so raising the adsorption energy of these three intermediates reduces the gross endergonic energy of the dissociative mechanism (55).
Fig. 4. Inference results grouped across different catalysts and adsorbates identify Cu-S alloy surfaces as ideal strain engineering candidates.
(A) Reaction enthalpies averaged over zero-strain Cat + Ads structures for *H + *N ➔ *NH. Black lines represent reactant energies. Red lines represent product energies; the formation energy of NH is included in the product enthalpy. (B) Normalized histogram of inferred strain response classes for each Cat + Ads structure containing *H, grouped by catalyst alloy element. (C) Same as (B) but for *N as the adsorbate; (D) same as (B) for *NH as the adsorbate.
Figure 4 (B to D) plots summary inference results for all the strained Cat + Ads structures containing *H (Fig. 4B), *N (Fig. 4C), and *NH (Fig. 4D) in the inference dataset. For each adsorbate, we plot a histogram of the inference results over strain space, grouped by the alloy composition of each catalyst surface (x axis) and the predicted class (bar color). We group by alloy element because catalyst composition is practically one of the first decisions made in catalyst selection and it has a relatively high correlation coefficient compared to other independent variables such as Cu composition and surface plane (fig. S3). As an example, Fig. 4B indicates that the adsorption energy of *H on Cu-Pd surfaces is relatively unresponsive to strain because nearly all strains in the inference set fall in the gray class Z. On Cu-Sb surfaces, strain tends to increase the *H adsorption energy (less favorable interaction), with a strong bias toward pink class +Δ over class Z and class −Δ. For *N in Fig. 4C and *NH in Fig. 4D, the distributions differ substantially from the *H graph, reflecting the fundamental change in the adsorbate coordination from *H to *N; for example, Cu-Sb alloys bias toward −Δ for *N and *NH, indicating that strain tends to decrease the adsorption energy (more favorable interaction). Cu-S alloys exhibit a large number of strains that raise the adsorption energy of both *N and *NH, and Fig. 4A shows that the ground state adsorption energy is also more positive for Cu-S alloys relative to the other compositions. Raising the adsorption energy of *NH with strain is particularly desirable because the average zero-strain reaction enthalpy on Cu-S surfaces is −1.29 eV. This indicates that the Cu-S alloys are suitable targets for our goal of raising the adsorption energy of the *H + *N → *NH intermediates to reduce the magnitude of endergonic steps in the ammonia synthesis reaction.
Fig. 5. Inferred strain phase diagram reflects changes in surface structure response to strain.
(A) Surface strain phase diagram resulting from model inference for Cu8S4(201):*NH. Color scale indicates the predicted class of adsorption energy response corresponding to the classes in Fig. 2. (B) Same as (A) for Cu4S2(110):*NH; there are two distinct regions of inferred strain responses, but most of the surface strain patterns are predicted to increase the adsorption energy of *NH. (C) The Cat + Ads zero-strain atomistic structure corresponding to (A); the threefold coordination site (purple circles) includes 2 Cu atoms and 1 S atom in the plane of the surface. (D) The Cat + Ads zero-strain atomistic structure corresponding to (B); the coordination site (purple circles) is similar, but the surface structure is much more dense than that in (A).
Phase diagrams of strain-adsorption energy capture subtle structural effects
High-level analysis of the inference results in aggregate identified Cu-S alloys as candidates to increase the adsorption energy of *NH. Copper sulfide catalysts of varying compositions have been recently studied for ammonia synthesis via the electrochemical nitrogen reduction reaction, which has been suggested to occur at least partially through a dissociative mechanism (56, 57). To further examine the nature of *NH strain response, Fig. 5 plots phase diagrams of the inference results as a function of strain for two different catalyst compositions and surface planes in the Cu-S family. The uniaxial norm and the shear component ε6 are chosen as the pseudo-order parameters because they capture most of the variation within strain space while retaining convenient 2D visualization. Empirically, despite combining ε1 and ε2 together, we find that these quantities generally give well-defined regions in strain space corresponding to one class of predictions. The color scale gives the classifier model prediction for each point in the inference dataset.
Fig. 6. Regressor-predicted strained reaction diagram for single-molecule NH3 synthesis on Cu4S2 (110).
Horizontal lines give the energy of the adsorbate-surface system at each step of the ammonia synthesis reaction. Black lines correspond to the ground state, zero-strain surface. Blue (red) lines give the minimum (maximum) strained adsorption energies for each system across all strain patterns in the inference dataset, predicted using the trained regressor model. The green lines give an example of a strain pattern that breaks scaling relations for *N + 3 *H ➔ *NH + 2 *H; the strain pattern is taken from the phase diagram of the regressor inference results (insets). The compressive surface strain (both uniaxial and shear) raises the adsorption energy of *N + *H but lowers that of *NH + *H, reducing the reaction enthalpy.
Figure 5A shows 500 inference points for Cu8S4(201):*NH; as indicated by the histograms in Fig. 4, most strains are labeled as class +Δ and predicted to induce a positive change in the adsorption energy greater than 25 meV. The *NH adsorbate has a threefold coordination site of surface atoms (purple circles), consisting of 2 Cu atoms and 1 S atom, that lies nearly parallel to the surface plane (Fig. 5C). At low shear strains, compressive uniaxial strain is predicted to reduce the adsorption energy and tensile uniaxial strain is predicted to increase the adsorption energy. This reflects that expanding the coordination environment lengthens the bonds between the surface atoms and the adsorbate, and compressing the coordination environment reduces the bond length, favoring increased covalent interaction. At small uniaxial strains, positive shear strain is predicted to have little impact on adsorption energy, but negative shear strain is generally predicted to increase the adsorption energy. The qualitative difference in the shear predictions reflects the asymmetry of the surface structure, as different directions of strain are interpreted by the model to result in different adsorption energy changes given the same input graph of the zero-strain structure. Figure 5B shows the inferred strain phase diagram for Cu4S2(110):*NH; this surface originates from a different bulk crystal structure, with a calculated formation energy similar to that of the structure in Fig. 5A, but has the same elemental composition. The coordination environment for *NH appears qualitatively similar to that in Fig. 5A, a threefold site with 2 Cu atoms and 1 S atom that centers the adsorbing nitrogen. However, the predicted strain response is quite different; nearly all uniaxial strains increase the adsorption energy, and only a combination of compressive shear and uniaxial strain leads to no effect on the adsorption energy. 
We attribute this to subtle differences in the ground state coordination environment that reflect the different surface structures; in Fig. 5B, both the ground-state coordinating Cu-N bond (2.07 Å) and the N-S bond (1.68 Å) are nearly identical to their bulk ground-state counterparts in Cu2N (2.06 Å) and molecular S3N (1.6 Å). Therefore, any strains that disrupt the ability of the surface to preferentially relax into this same coordinating geometry will destabilize the adsorbate relative to the ground state. Because the zero-strain structures are included in the training data, this relative bond length information from the surface can be taken up by the model during training. These results emphasize that the structural information unique to a GNN approach is required to get the correct strain response for adsorption energy in otherwise chemically similar systems.
Identifying strains that break scaling relations using regressor predictions
Following identification of Cu-S surfaces as candidates for strain engineering in the context of ammonia synthesis, we apply the trained regressor model to the Cu4S2 (110) surface across all the adsorbate intermediates in the ammonia synthesis reaction. The strain-aware reaction diagram for this system is plotted in Fig. 6; the adsorption energies for multi-adsorbate systems are still calculated in the dilute limit and simply summed to give the energy of the intermediate state. The adsorption sites are randomly chosen for each intermediate to simulate dilute adsorption. The black reaction diagram lines give the ground state adsorption energies of each intermediate system in the reaction at zero surface strain, while the blue and red lines give the respective minimum and maximum predicted strain adjustment to Eads across all predicted strain patterns. Phase diagrams for the regression inference results are inset for two intermediate states: *N + 3 *H and *NH + 2 *H. We note that while the regressor inference diagram shown here and the classifier inference diagram shown in Fig. 5 do not perfectly agree, the general prediction trends and model inference diagrams match well. The green horizontal lines (selected ε in the diagram) correspond to the same strain pattern across reaction intermediates, identified by the small green box in the phase diagram insets. This compressive uniaxial and shear strain pattern breaks the linear scaling relation between *N and *NH on the Cu4S2 (110) surface; the *N adsorption energy is increased under this strain pattern by 0.2 eV, while the *NH adsorption energy is decreased by 0.12 eV, reducing the overall uphill reaction enthalpy. These predictions do not contain any information about the transition state energy, and therefore, we cannot precisely determine whether the forward elementary reaction energy barrier increases or decreases under this compressive strain pattern. 
However, under the assumption that the transition state will be similar to either the products or the reactants, breaking a scaling relation between the reactants and products gives a one-sixth chance that the forward reaction barrier will decrease (Fig. 1C), increasing the desired reaction rate. This probability is much higher than the probability of identifying a similar strain pattern through intuition or brute force search through the strain space of a given system. This illustrates the capability of the machine learning methodology as a screening tool that rapidly identifies regions of interest in strain space where scaling relations break and reaction barriers may be lowered. Follow-on studies using nudged elastic band first-principles (58) or machine learning (59) methods to calculate transition state energies on candidate strained surfaces will therefore have a much higher success rate when using our machine learning models as system screening tools.
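The screening step described above (summing per-adsorbate predictions in the dilute limit, then looking for strain patterns that raise the reactant state while lowering the product state) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names, the strain tuples (ε1, ε2, ε6), and the *H values are hypothetical, and only the +0.20 eV (*N) and −0.12 eV (*NH) shifts are taken from the text.

```python
def state_shift(counts, delta_e_ads):
    """Strain-induced energy shift (eV) of an intermediate state.

    Per-adsorbate shifts are summed in the dilute limit, i.e. without any
    adsorbate-adsorbate interaction term.
    """
    return sum(n * delta_e_ads[ads] for ads, n in counts.items())


def screen_strains(predictions, reactants, products):
    """Keep strains that raise the reactant state and lower the product state.

    predictions: {strain_tuple: {adsorbate: predicted delta E_ads in eV}}
    Returns (strain, change in reaction enthalpy) pairs, most negative first.
    """
    hits = []
    for strain, delta_e_ads in predictions.items():
        d_react = state_shift(reactants, delta_e_ads)
        d_prod = state_shift(products, delta_e_ads)
        if d_react > 0.0 and d_prod < 0.0:
            hits.append((strain, d_prod - d_react))
    return sorted(hits, key=lambda item: item[1])


# *N + 3 *H -> *NH + 2 *H, written as adsorbate counts per state.
reactants = {"N": 1, "H": 3}
products = {"NH": 1, "H": 2}

# Hypothetical regressor outputs keyed by (eps1, eps2, eps6).
predictions = {
    (-0.02, -0.02, -0.01): {"N": 0.20, "NH": -0.12, "H": 0.0},
    (0.02, 0.02, 0.00): {"N": 0.10, "NH": 0.15, "H": 0.0},
    (0.00, 0.00, 0.02): {"N": -0.05, "NH": -0.08, "H": 0.0},
}
hits = screen_strains(predictions, reactants, products)
```

Because the two states contain different numbers of *H, the *H shift does not cancel in general; the sketch sets it to zero only to keep the arithmetic easy to follow.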
In addition to screening for strain patterns within a given catalyst system that break scaling relations, it is important to identify which Cat + Ads complexes are most influenced by strain. Figure S4A gives 15 Cat + Ads complexes containing important ammonia synthesis adsorbates (*N2, *N, and *NH), which demonstrate the largest predicted range of ΔEads responses. We highlight the predicted regression strain-phase diagram for ΔEads and the ground state coordination site for several of these structures in fig. S4B. Zr4Cu2 (10-2): *N2 and Al4Cu2 (112): *N2 are interesting because of the high predicted strain control over N2 adsorption, particularly driving the adsorption to be more favorable. While hydrogenation of nitrogen is an important rate-determining step in ammonia synthesis, nitrogen adsorption and dissociation is rate-limiting for many catalysts and difficult to engineer because of the inertness of N2. The coordination site analysis for the Zr structure indicates that both N atoms are near the surface, such that tensile strain will separate them and assist with dissociation. Detailed individual study on these high-potential surface, adsorbate, and strain combinations is warranted to develop the relationship between transition state energies and strain when scaling relations are broken.
Strain-adsorption energy phase diagrams and catalyst degradation mechanisms
In computational catalysis design, the surface itself can be easily overlooked as a dynamic reaction participant, particularly when calculations are conducted in the dilute adsorbate limit. However, both single- and multi-atom adsorbates can interact with and substantially modify the surface structure, indicating possible catalyst degradation (modification of the desired active sites) or poisoning (blocking of active sites) mechanisms (60). The strain-adsorption energy phase diagrams generated by our model inference can be used to identify catalyst surfaces where strain induces very large adsorbate-specific adsorption energy changes, indicating potential surface reconstructions. Figure 7 plots inferred ΔEads strain response phase diagrams for the same surface HfCu3(100) with two adsorbates with the same coordinating atom, *N and *NO2, located at the same adsorption site. The predicted strain response is nearly the exact inverse for the two complexes; for nearly all strain configurations, the adsorption energy is predicted to increase for *N and decrease for *NO2. To investigate this difference, we select a strain profile that falls within a region of the strain diagram exhibiting a different strain response for the two adsorbates (black circle); because these strains are not in the training or testing dataset, we run two DFT calculations to get the relaxed atomistic structures under strain and verify the model predictions. The zero-strain Cat, zero-strain Cat + Ads, and strained Cat + Ads complexes are shown from top to bottom in Fig. 7C for *N and Fig. 7D for *NO2. The DFT results confirm the model inference predictions: Under the same applied strain, the adsorption energy of HfCu3(100):*N increases by 30 meV, while for HfCu3(100):*NO2, the adsorption energy decreases by 180 meV. 
At the bulk lattice constant for HfCu3, the adsorption of N shifts the Hf surface atom position by a very small amount to coordinate tightly with N, increasing the Hf-Cu surface bond length by 0.02 Å. The same adsorption process for NO2 leads to a substantial surface reconstruction, increasing the Hf-Cu surface bond length by 0.61 Å and nearly decomposing *NO2 into *NO and *O. This change in both the coordination environment of the adsorbate and the surface structure leads to opposing responses to the same applied strain. When the surface is strained away from the bulk lattice constant, the adsorption becomes less favorable for *N, as the Hf-Cu bond is further stretched by 0.02 Å from the equilibrium value of the zero-strain surface. *NO2 adsorption causes relaxation to the same coordination environment as on the zero-strain surface without distorting the Hf-Cu surface bonds as much, leading to the 180-meV decrease in the adsorption energy. This example shows that under the same surface strain, different adsorbates can induce local reconstruction that raises or reduces their interaction with the surface relative to the bulk surface. Regions of the strain phase diagram for a given catalytic surface that show strongly opposing effects for related adsorbates can be highly promising for engineering reaction barriers but can also be further screened to check for surface reconstructions that may lead to catalyst degradation or corrosion over time. For reaction systems where catalyst degradation or surface poisoning is a major issue, finding surfaces that show little change in adsorption energy (Z class) with strain may indicate surface stability with respect to mechanical deformations and indicate a more robust catalyst. 
Last, the model predictions of systems outside the training/testing datasets are verified by independent first-principles calculation, validating the concept of the strain adsorption energy phase diagram to provide guidance for practical strain engineering of heterogeneous catalytic reactions.
Fig. 7. Comparison of inferred strain phase diagrams for HfCu3(100) with *N and *NO2 adsorbed validated by DFT.
(A) Strain phase diagram for HfCu3(100):*N shows most of the strains predicted to increase the adsorption energy. (B) Strain phase diagram for HfCu3(100):*NO2 shows most of the strains predicted to decrease the adsorption energy. Black circles in (A) and (B) correspond to the strain studied in (C) and (D). (C) Atomistic structure of (from top to bottom) the zero-strain Cat, the zero-strain Cat + Ads, and the strained Cat + Ads structures corresponding to the deformation shown in (A). The strain increases the Hf-Cu surface bond length from the zero-strain case, increasing the adsorption energy. (D) Same as (C) for HfCu3(100):*NO2; the strain decreases the Hf-Cu bond length back toward the zero-strain surface value with no adsorbate, enabling surface relaxation and making adsorption more energetically favorable.
DISCUSSION
Strain is a fundamental property of material surfaces and interfaces that plays an outsize role on the nanoscale, where interfacial properties dominate over bulk properties. A key challenge of rational catalyst design is bridging the wide gap between pristine in silico structures and experimentally realized structures in nanoparticles or surfaces. Nanomaterial catalysts are especially desirable because of the extremely high ratio of potentially active surface area to material volume, yet this is also where strain introduces the largest deviations in expected structure and function from the bulk. Unfortunately, accounting for so many structural degrees of freedom results in a search space that is computationally intractable with physics-based modeling alone. Machine learning models can work in tandem with conventional simulation to interpolate structure-property relationships from a relatively small training set across these vast search spaces within computationally practical time scales. GNNs are early in their application to physical systems and do not yet regularly outperform simpler models. However, they offer a high potential performance ceiling because they can directly ingest structural information at a high level of detail, which otherwise must be interpreted, reduced, and converted to features manually. These models have the potential to generalize more effectively across composition and structure with larger training datasets. In addition, they may be able to generalize to defect structures much more easily than conventional machine learning models because the representation changes with any structural change. Defect sites are especially interesting for catalyst design because defects are charge active and sensitive to strain (61).
In this work, we sought to develop a model for the relationship between applied or intrinsic strain at a surface and the subsequent change in the adsorbate-surface interaction, which is fundamental to the microkinetic mechanism. To do so, we applied recent advances in symmetry-aware GNNs, synthesizing prior independent efforts to use machine learning for elastic strain engineering and adsorption energy prediction. We improved the model performance on small training datasets by introducing a regularization scheme that incorporates prior physical knowledge across subsections of the graph. From our successful classification and regression task training, we identify Cu-S alloys as promising platforms for strain engineering of nitrogen-containing adsorbates and generate phase diagrams of predicted strain response for several catalyst-adsorbate complexes. We validate several inference predictions on strain patterns outside the training domain with independent DFT calculations and identify subtle structural surface changes illustrating different ways that strain affects adsorption energy. This demonstrates how the model predictions can identify surface-adsorbate combinations that are susceptible to reconstruction under mechanical fluctuations, leading to catalyst degradation or surface poisoning. In poisoning-susceptible situations, strain-insensitive surfaces (gray regions of the phase diagrams) would be desirable because of the structural variance present in practical systems. These case studies show that the model is sensitive enough to distinguish the strain response of the same adsorbate on compositionally identical but structurally different surfaces and of different adsorbates on the same exact surface.
Applying these predictions in catalyst synthesis requires further analysis of reasonably achievable strain patterns in a synthesized material or core-shell nanoparticle. Some strain patterns occur spontaneously if they reduce the penalizing surface energy term, while others can be induced through epitaxial stress (5). A natural follow-on to this work would be training a similar GNN to predict the change in the surface energy of a slab under a particular strain without the adsorbate. With the two models together, strains that optimize the adsorption energies for a particular reaction can be filtered by their predicted effect on the surface energy; strains that reduce the surface energy would be more likely to spontaneously form in a nanoparticle or ultrathin epitaxially grown surface. After a machine learning–driven screening analysis on the reactant and product complexes of an elementary reaction, nudged elastic band calculations can verify the impact on the transition state energy imposed by the target strain pattern. Last, improvements in the precision of epitaxial material growth and core-shell nanoparticle synthesis by bottom-up and top-down approaches have enabled finer control over material structure for a given composition (62). Combining these experimental advances with a model that generalizes over different surface facets, strain states, and compositions will enable comparisons of different intermediates and reaction pathways on a particular surface using one model. We anticipate that flexible, structure-aware model architectures such as GNNs will improve catalyst design by bridging the gap between accurate but expensive first-principles simulations and experimentally relevant high dimensional spaces such as strain.
MATERIALS AND METHODS
DFT calculations
First-principles DFT simulations were carried out using the Vienna ab initio simulation package (63, 64). Projector-augmented wave pseudo-potentials (65) are used with a cutoff energy of 400 eV for plane-wave expansions (66). The exchange correlation is treated using the Perdew-Burke-Ernzerhof–generalized gradient approximations. The atomistic structures of catalyst and catalyst + adsorbate slabs were relaxed using Γ-centered k-point meshes of 40/a × 40/b × 1 rounded to the nearest integer, where a and b are the lattice constants of the slab supercell. For structural relaxations, the atomic positions of all unit and supercells are optimized until the force components on each atom are less than 0.03 eV/Å, and the electronic energy is converged within 10⁻⁴ eV. A vacuum spacing of 20 Å was added to slab calculations to prevent interactions between periodic images. Following the OCP dataset generation, atoms further than 2 Å from the surface are fixed in their relaxed bulk positions during slab relaxation to simulate the bulk lattice structure, while surface and adsorbate atoms are free to relax (35). Long-range van der Waals dispersion interactions were treated using the DFT-D3 method developed by Grimme et al. (67, 68); these corrections are not part of the original Open Catalyst calculation parameters but we found that including them changed the distribution of adsorption energies. Individual molecules are relaxed in a 12 Å cubic unit cell using the same calculation parameters.
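The k-point rule stated above can be written as a small helper. This is a sketch under stated assumptions: lattice constants are taken to be in Å, exact halves are rounded up (the text does not specify tie-breaking), and the floor of one k-point per direction is a guard added here, not something described in the text. The function name `kpoint_mesh` is hypothetical.

```python
import math


def kpoint_mesh(a, b, density=40.0):
    """Γ-centered mesh of density/a x density/b x 1 for a slab supercell.

    a, b: in-plane lattice constants (assumed to be in Å).
    """
    def nearest(x):
        # Round half up; Python's built-in round() would round 2.5 to 2.
        # max(1, ...) guarantees at least one k-point for very large cells.
        return max(1, math.floor(x + 0.5))

    return (nearest(density / a), nearest(density / b), 1)


# Hypothetical slab with a = 7.3 Å and b = 10.2 Å.
mesh = kpoint_mesh(7.3, 10.2)
```

The single k-point along the third direction reflects the vacuum spacing, across which periodic images are not meant to interact.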
Dataset preparation
Zero-strain structures are converted to graphs for model input using the same graph generation procedure as the OCP. Atoms are nodes; edges are labeled with the distance between two atoms, and the neighbor distances are calculated taking periodic boundary conditions into account. The number of neighbors for each atom is capped at 60 and the cutoff radius for a neighbor interaction is 7 Å (35). Subsurface, surface, and adsorbate atom tags are included in the dataset to be used for node-level regularization. For data normalization, the input strain tensors and all energies are normalized to zero-mean, unit SD before model training; the normalization parameters are calculated independently for ε1, ε2, ε6, and ΔEads, and these normalization parameters are included with the publicly available datasets.
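The per-quantity normalization described above can be sketched with the standard library. This is an illustration only: the helper names and the example values are hypothetical, and the use of the population SD (rather than the sample SD) is an assumption, since the text does not say which was used.

```python
from statistics import fmean, pstdev


def fit_normalizer(values):
    """Return (mean, SD) for one quantity; population SD is an assumption."""
    mean = fmean(values)
    std = pstdev(values, mu=mean)
    if std == 0.0:
        raise ValueError("cannot normalize a constant column")
    return mean, std


def normalize(values, mean, std):
    """Shift and scale to zero mean and unit SD."""
    return [(v - mean) / std for v in values]


def denormalize(values, mean, std):
    """Map normalized model outputs back to physical units."""
    return [v * std + mean for v in values]


# Made-up columns; parameters are fitted independently for each quantity.
columns = {
    "eps1": [-0.02, 0.00, 0.01, 0.03],
    "eps2": [0.01, -0.01, 0.02, 0.00],
    "eps6": [0.00, 0.02, -0.02, 0.01],
    "dE_ads": [0.05, -0.10, 0.20, -0.03],
}
params = {name: fit_normalizer(vals) for name, vals in columns.items()}
normalized = {name: normalize(vals, *params[name]) for name, vals in columns.items()}
```

Storing `params` alongside the data is what makes it possible to convert a predicted, normalized ΔEads back to eV at inference time.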
Model training
SchNet, Crystal Graph Convolutional Neural Networks (CGCNN), and DimeNet++ architectures were all tested for the classification and regression tasks; DimeNet++ consistently outperformed the other model architectures. All models are implemented using the PyTorch framework. Hyperparameter optimization was performed for all model parameters and training procedures on the classification task using an Asynchronous Successive Halving Algorithm implemented in the Ray software package. The final model hyperparameters are included in table S4. To prevent overfitting, the model size was reduced until the training loss and the validation loss were similar at the end of training. Train, validation, and test splits were randomly generated using 80, 10, and 10% of the total dataset, respectively. Weighted sampling was used during training of the classifier to adjust for class imbalance between the −Δ, Z, and +Δ classes.
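The random 80/10/10 split and the class-balanced sampling described above can be sketched as follows. This is a standard-library illustration, not the training code: the inverse-class-frequency weighting is an assumed scheme (the text states only that weighted sampling was used), and the function names, seed, and label counts are hypothetical.

```python
import random
from collections import Counter


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly split range(n) into train, validation, and test indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def sample_weights(labels):
    """Inverse-class-frequency weight per sample (an assumed scheme).

    With these weights, every class has the same total sampling probability.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Made-up, imbalanced labels standing in for the -Δ, Z, and +Δ classes.
labels = ["zero"] * 70 + ["pos"] * 20 + ["neg"] * 10
train, val, test = split_indices(len(labels))

train_labels = [labels[i] for i in train]
weights = sample_weights(train_labels)

# Draw one epoch of training indices with replacement, balanced across classes.
epoch = random.Random(1).choices(train, weights=weights, k=len(train))
```

In a PyTorch training loop, the same per-sample weights could be passed to `torch.utils.data.WeightedRandomSampler`, which draws indices in proportion to the supplied weights.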
Acknowledgments
Funding: This work was supported by NSF grant MRSEC/DMR-1720530 (V.B.S.).
Author contributions: Conceptualization: C.C.P., A.S., N.C.F., and V.B.S. Data curation: C.C.P. and A.S. Methodology: C.C.P. and N.C.F. Investigation: C.C.P. Validation: C.C.P. Visualization: C.C.P. and A.S. Supervision: V.B.S. Writing (original draft): C.C.P. Writing (review and editing): C.C.P., A.S., N.C.F., and V.B.S.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All model training and dataset generation code, training data, train/test splits, normalization values, and inference results are provided as a zip file hosted here: https://figshare.com/articles/dataset/strain_adsorption_data_GNN_tar_gz/19158425.
Supplementary Materials
This PDF file includes:
Supplementary Text
Tables S1 to S4
Figs. S1 to S4
References
REFERENCES AND NOTES
- 1.Le T., Epa V. C., Burden F. R., Winkler D. A., Quantitative structure-property relationship modeling of diverse materials properties. Chem. Rev. 112, 2889–2919 (2012). [DOI] [PubMed] [Google Scholar]
- 2.Olson G. B., Computational design of hierarchically structured materials. Science 277, 1237–1242 (1997). [Google Scholar]
- 3.Ibach H., The role of surface stress in reconstruction, epitaxial growth and stabilization of mesoscopic structures. Surf. Sci. Rep. 29, 195–263 (1997). [Google Scholar]
- 4.Cammarata R. C., Surface and interface stress effects in thin films. Prog. Surf. Sci. 46, 1–38 (1994). [Google Scholar]
- 5.Müller P., Saùl A., Leroy F., Simple views on surface stress and surface energy concepts. Adv. Nat. Sci. Nanosci. Nanotechnol. 5, 013002 (2014). [Google Scholar]
- 6.Banerjee A., Bernoulli D., Zhang H., Yuen M.-F., Liu J., Dong J., Ding F., Lu J., Dao M., Zhang W., Lu Y., Suresh S., Ultralarge elastic deformation of nanoscale diamond. Science 360, 300–302 (2018). [DOI] [PubMed] [Google Scholar]
- 7.Baughman R. H., Cui C., Zakhidov A. A., Iqbal Z., Barisci J. N., Spinks G. M., Wallace G. G., Mazzoldi A., de Rossi D., Rinzler A. G., Jaschinski O., Roth S., Kertesz M., Carbon nanotube actuators. Science 284, 1340–1344 (1999). [DOI] [PubMed] [Google Scholar]
- 8.Dodson B. W., Many-body surface strain and surface reconstructions in fcc transition metals. Phys. Rev. Lett. 60, 2288–2291 (1988). [DOI] [PubMed] [Google Scholar]
- 9.Xia Z., Guo S., Strain engineering of metal-based nanomaterials for energy electrocatalysis. Chem. Soc. Rev. 48, 3265–3278 (2019). [DOI] [PubMed] [Google Scholar]
- 10.Wang L., Zeng Z., Gao W., Maxson T., Raciti D., Giroux M., Pan X., Wang C., Greeley J., Tunable intrinsic strain in two-dimensional transition metal electrocatalysts. Science 363, 870–874 (2019). [DOI] [PubMed] [Google Scholar]
- 11.He T., Wang W., Shi F., Yang X., Li X., Wu J., Yin Y., Jin M., Mastering the surface strain of platinum catalysts for efficient electrocatalysis. Nature 598, 76–81 (2021). [DOI] [PubMed] [Google Scholar]
- 12.Mavrikakis M., Hammer B., Nørskov J. K., Effect of strain on the reactivity of metal surfaces. Phys. Rev. Lett. 81, 2819–2822 (1998). [Google Scholar]
- 13.Friend C. M., Xu B., Heterogeneous catalysis: A central science for a sustainable future. Acc. Chem. Res. 50, 517–521 (2017). [DOI] [PubMed] [Google Scholar]
- 14.Medford A. J., Vojvodic A., Hummelshøj J. S., Voss J., Abild-Pedersen F., Studt F., Bligaard T., Nilsson A., Nørskov J. K., From the Sabatier principle to a predictive theory of transition-metal heterogeneous catalysis. J. Catal. 328, 36–42 (2015). [Google Scholar]
- 15.Zhao Z. J., Liu S., Zha S., Cheng D., Studt F., Henkelman G., Gong J., Theory-guided design of catalytic materials using scaling relationships and reactivity descriptors. Nat. Rev. Mater. 4, 792–804 (2019). [Google Scholar]
- 16.Wang S., Petzold V., Tripkovic V., Kleis J., Howalt J. G., Skúlason E., Fernández E. M., Hvolbæk B., Jones G., Toftelund A., Falsig H., Björketun M., Studt F., Abild-Pedersen F., Rossmeisl J., Nørskov J. K., Bligaard T., Universal transition state scaling relations for (de)hydrogenation over transition metals. Phys. Chem. Chem. Phys. 13, 20760–20765 (2011). [DOI] [PubMed] [Google Scholar]
- 17.Greeley J., Theoretical heterogeneous catalysis: Scaling relationships and computational catalyst design. Annu. Rev. Chem. Biomol. Eng. 7, 605–635 (2016). [DOI] [PubMed] [Google Scholar]
- 18.Montemore M. M., Medlin J. W., Scaling relations between adsorption energies for computational screening and design of catalysts. Cat. Sci. Technol. 4, 3748–3761 (2014). [Google Scholar]
- 19.Khorshidi A., Violet J., Hashemi J., Peterson A. A., How strain can break the scaling relations of catalysis. Nat. Catal. 1, 263–268 (2018). [Google Scholar]
- 20.Pérez-Ramírez J., López N., Strategies to break linear scaling relationships. Nat. Catal. 2, 971–976 (2019). [Google Scholar]
- 21.Zhang S., Zhang X., Jiang G., Zhu H., Guo S., Su D., Lu G., Sun S., Tuning nanoparticle structure and surface strain for catalysis optimization. J. Am. Chem. Soc. 136, 7734–7739 (2014). [DOI] [PubMed] [Google Scholar]
- 22.Collins G., Holmes J. D., Engineering metallic nanoparticles for enhancing and probing catalytic reactions. Adv. Mater. 28, 5689–5695 (2016). [DOI] [PubMed] [Google Scholar]
- 23.Kibler L. A., El-Aziz A. M., Hoyer R., Kolb D. M., Tuning reaction rates by lateral strain in a palladium monolayer. Angew. Chem. Int. Ed. 44, 2080–2084 (2005). [DOI] [PubMed] [Google Scholar]
- 24.Luo M., Guo S., Strain-controlled electrocatalysis on multimetallic nanomaterials. Nat. Rev. Mater. 2, 17059 (2017). [Google Scholar]
- 25.Jiang K., Luo M., Liu Z., Peng M., Chen D., Lu Y.-R., Chan T.-S., de Groot F. M. F., Tan Y., Rational strain engineering of single-atom ruthenium on nanoporous MoS2 for highly efficient hydrogen evolution. Nat. Commun. 12, 1687 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Huang H., Jia H., Liu Z., Gao P., Zhao J., Luo Z., Yang J., Zeng J., Understanding of strain effects in the electrochemical reduction of CO2: Using Pd nanostructures as an ideal platform. Angew. Chem. Int. Ed. Engl. 56, 3594–3598 (2017). [DOI] [PubMed] [Google Scholar]
- 27.Sneed B. T., Young A. P., Tsung C.-K., Building up strain in colloidal metal nanoparticle catalysts. Nanoscale 7, 12248–12265 (2015). [DOI] [PubMed] [Google Scholar]
- 28.Vollath D., Fischer F. D., Holec D., Surface energy of nanoparticles—Influence of particle size and structure. Beilstein J. Nanotechnol. 9, 2265–2276 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Wu J., Qi L., You H., Gross A., Li J., Yang H., Icosahedral platinum alloy nanocrystals with enhanced electrocatalytic activities. J. Am. Chem. Soc. 134, 11880–11883 (2012).
- 30. Gan L., Heggen M., Rudi S., Strasser P., Core-shell compositional fine structures of dealloyed PtxNi1–x nanoparticles and their impact on oxygen reduction catalysis. Nano Lett. 12, 5423–5430 (2012).
- 31. Wu T., Sun M., Huang B., Atomic-strain mapping of high-index facets in late-transition-metal nanoparticles for electrocatalysis. Angew. Chem. Int. Ed. Engl. 60, 22996–23001 (2021).
- 32. Shi Z., Tsymbalov E., Dao M., Suresh S., Shapeev A., Li J., Deep elastic strain engineering of bandgap through machine learning. Proc. Natl. Acad. Sci. U.S.A. 116, 4117–4122 (2019).
- 33. Tsymbalov E., Shi Z., Dao M., Suresh S., Li J., Shapeev A., Machine learning for deep elastic strain engineering of semiconductor electronic band structure and effective mass. npj Comput. Mat. 7, 76 (2021).
- 34. Mamun O., Winther K. T., Boes J. R., Bligaard T., High-throughput calculations of catalytic properties of bimetallic alloy surfaces. Sci. Data 6, 76 (2019).
- 35. Chanussot L., Das A., Goyal S., Lavril T., Shuaibi M., Riviere M., Tran K., Heras-Domingo J., Ho C., Hu W., Palizhati A., Sriram A., Wood B., Yoon J., Parikh D., Zitnick C. L., Ulissi Z., Open catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072 (2021).
- 36. Esterhuizen J. A., Goldsmith B. R., Linic S., Theory-guided machine learning finds geometric structure-property relationships for chemisorption on subsurface alloys. Chem 6, 3100–3117 (2020).
- 37. Chen C., Ye W., Zuo Y., Zheng C., Ong S. P., Graph networks as a universal machine learning framework for molecules and crystals. Chem. Mater. 31, 3564–3572 (2019).
- 38. Xie T., Grossman J. C., Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).
- 39. J. Klicpera, J. Groß, S. Günnemann, Directional message passing for molecular graphs. arXiv:2003.03123 [cs.LG] (2020).
- 40. J. Klicpera, S. Giri, J. T. Margraf, S. Günnemann, Fast and uncertainty-aware directional message passing for non-equilibrium molecules. arXiv:2011.14115 [cs.LG] (2020).
- 41. Qiao Z., Welborn M., Anandkumar A., Manby F. R., Miller T. F. III, OrbNet: Deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 153, 124111 (2020).
- 42. Fung V., Zhang J., Juarez E., Sumpter B. G., Benchmarking graph neural networks for materials chemistry. npj Comput. Mat. 7, 48 (2021).
- 43. Zhong M., Tran K., Min Y., Wang C., Wang Z., Dinh C.-T., de Luna P., Yu Z., Rasouli A. S., Brodersen P., Sun S., Voznyy O., Tan C.-S., Askerka M., Che F., Liu M., Seifitokaldani A., Pang Y., Lo S.-C., Ip A., Ulissi Z., Sargent E. H., Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature 581, 178–183 (2020).
- 44. Pala R. G. S., Liu F., Determining the adsorptive and catalytic properties of strained metal surfaces using adsorption-induced stress. J. Chem. Phys. 120, 7720–7724 (2004).
- 45. Zeng S., Shan S., Lu A., Wang S., Caracciolo D. T., Robinson R. J., Shang G., Xue L., Zhao Y., Zhang A., Liu Y., Liu S., Liu Z., Bai F., Wu J., Wang H., Zhong C.-J., Copper-alloy catalysts: Structural characterization and catalytic synergies. Cat. Sci. Technol. 11, 5712–5733 (2021).
- 46. Humphreys J., Lan R., Tao S., Development and recent progress on ammonia synthesis catalysts for Haber–Bosch process. Adv. Energy Sustain. Res. 2, 2000043 (2021).
- 47. Hattori M., Iijima S., Nakao T., Hosono H., Hara M., Solid solution for catalytic ammonia synthesis from nitrogen and hydrogen gases at 50 °C. Nat. Commun. 11, 2001 (2020).
- 48. Johnson L. R., Sridhar S., Zhang L., Fredrickson K. D., Raman A. S., Jang J., Leach C., Padmanabhan A., Price C. C., Frey N. C., Raizada A., Rajaraman V., Saiprasad S. A., Tang X., Vojvodic A., MXene materials for the electrochemical nitrogen reduction–functionalized or not? ACS Catal. 10, 253–264 (2020).
- 49. Kobayashi Y., Kitano M., Kawamura S., Yokoyama T., Hosono H., Kinetic evidence: The rate-determining step for ammonia synthesis over electride-supported Ru catalysts is no longer the nitrogen dissociation step. Cat. Sci. Technol. 7, 47–50 (2017).
- 50. Suryanto B. H. R., Du H.-L., Wang D., Chen J., Simonov A. N., MacFarlane D. R., Challenges and prospects in the catalysis of electroreduction of nitrogen to ammonia. Nat. Catal. 2, 290–296 (2019).
- 51. Wang Y., Xu A., Wang Z., Huang L., Li J., Li F., Wicks J., Luo M., Nam D.-H., Tan C. S., Ding Y., Wu J., Lum Y., Dinh C.-T., Sinton D., Zheng G., Sargent E. H., Enhanced nitrate-to-ammonia activity on copper-nickel alloys via tuning of intermediate adsorption. J. Am. Chem. Soc. 142, 5702–5708 (2020).
- 52. Li J., Gao J., Feng T., Zhang H. H., Liu D., Zhang C., Huang S., Wang C., Du F., Li C., Guo C., Effect of supporting matrixes on performance of copper catalysts in electrochemical nitrate reduction to ammonia. J. Power Sources 511, 230463 (2021).
- 53. Zhou H., Xiong B., Chen L., Shi J., Modulation strategies of Cu-based electrocatalysts for efficient nitrogen reduction. J. Mater. Chem. A 8, 20286–20293 (2020).
- 54. Motagamwala A. H., Dumesic J. A., Microkinetic modeling: A tool for rational catalyst design. Chem. Rev. 121, 1049–1076 (2020).
- 55. Honkala K., Hellman A., Remediakis I. N., Logadottir A., Carlsson A., Dahl S., Christensen C. H., Nørskov J. K., Ammonia synthesis from first-principles calculations. Science 307, 555–558 (2005).
- 56. Kim H. S., Choi J., Kong J., Kim H., Yoo S. J., Park H. S., Regenerative electrocatalytic redox cycle of copper sulfide for sustainable NH3 production under ambient conditions. ACS Catal. 11, 435–445 (2021).
- 57. Kong J., Kim M.-S., Akbar R., Park H. Y., Jang J. H., Kim H., Hur K., Park H. S., Electrochemical nitrogen reduction kinetics on a copper sulfide catalyst for NH3 synthesis at low temperature and atmospheric pressure. ACS Appl. Mater. Interfaces 13, 24593–24603 (2021).
- 58. Sheppard D., Terrell R., Henkelman G., Optimization methods for finding minimum energy paths. J. Chem. Phys. 128, 134106 (2008).
- 59. M. Schreiner, A. Bhowmik, T. Vegge, P. B. Jørgensen, O. Winther, NeuralNEB—Neural Networks can find reaction paths fast. 10.48550/arxiv.2207.09971 [physics.comp-ph] (2022).
- 60. Argyle M. D., Bartholomew C. H., Heterogeneous catalyst deactivation and regeneration: A review. Catal. 5, 145–269 (2015).
- 61. Er D., Ye H., Frey N. C., Kumar H., Lou J., Shenoy V. B., Prediction of enhanced catalytic activity for hydrogen evolution reaction in Janus transition metal dichalcogenides. Nano Lett. 18, 3943–3949 (2018).
- 62. Yang X., Wang Y., Tong X., Yang N., Strain engineering in electrocatalysts: Fundamentals, progress, and perspectives. Adv. Energy Mat. 12, 2102261 (2021).
- 63. Kresse G., Furthmüller J., Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
- 64. Kresse G., Hafner J., Ab initio molecular-dynamics simulation of the liquid-metal–amorphous-semiconductor transition in germanium. Phys. Rev. B 49, 14251–14269 (1994).
- 65. Blöchl P. E., Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
- 66. Perdew J. P., Burke K., Ernzerhof M., Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
- 67. Grimme S., Antony J., Ehrlich S., Krieg H., A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
- 68. Grimme S., Ehrlich S., Goerigk L., Effect of the damping function in dispersion corrected density functional theory. J. Comput. Chem. 32, 1456–1465 (2011).
- 69. L. Perez, J. Wang, The effectiveness of data augmentation in image classification using deep learning. arXiv:1712.04621 [cs.CV] (2017).
- 70. Frey N. C., Akinwande D., Jariwala D., Shenoy V. B., Machine learning-enabled design of point defects in 2D materials for quantum and neuromorphic information processing. ACS Nano 14, 13406–13417 (2020).
- 71. J. Karaguesian, J. R. Lunger, Y. Shao-Horn, R. Gomez-Bombarelli, Crystal graph convolutional neural networks for per-site property prediction, in Fourth Workshop on Machine Learning and the Physical Sciences (NeurIPS, 2021).
Associated Data
Supplementary Materials
Supplementary Text
Tables S1 to S4
Figs. S1 to S4
References







