Abstract

In the search for novel intermetallic ternary alloys, much of the effort goes into performing a large number of ab initio calculations covering a wide range of compositions and structures. These are essential to building a reliable convex hull diagram. While density functional theory (DFT) provides accurate predictions for many systems, its computational overheads set a throughput limit on the number of hypothetical phases that can be probed. Here, we demonstrate how an ensemble of machine-learning (ML) spectral neighbor-analysis potentials (SNAPs) can be integrated into a workflow for the construction of accurate ternary convex hull diagrams, highlighting regions that are fertile for materials discovery. Our workflow relies on using available binary-alloy data both to train the SNAP models and to create prototypes for ternary phases. From the prototype structures, all unique ternary decorations are created and used to form a pool of candidate compounds. The SNAPs ensemble is then used to prerelax the structures and screen the most favorable prototypes before using DFT to build the final phase diagram. As constructed, the proposed workflow relies on no extra first-principles data to train the ML surrogate model and yields a DFT-level accurate convex hull. We demonstrate its efficacy by investigating the Cu–Ag–Au and Mo–Ta–W ternary systems.
1. Introduction
Systematic materials design aims to develop methods that can help accelerate the discovery of compounds with tailor-made properties, fit for certain applications. The large investment in the area, not least through the materials genome initiative,1 underpins the importance of searching for novel compounds to bolster technological progress. Atomistic simulations provide a suitable pathway to achieve this goal, since the search can be performed systematically, at low cost, and with complete control over structure and composition. Density functional theory (DFT) calculations are notably used to predict material properties in silico, such as material stability or elastic responses. By performing property predictions across a large range of prototype structures in the form of high-throughput studies,2 novel magnetic,3 high hardness,4 and battery materials5 have been discovered. Extensive databases, grouping large numbers of such calculations, have been created and are open to the community. These include AFLOWlib,6 Materials Project,7 OQMD,8 and NOMAD.9 While such studies remain faster than experimental investigations, the compositional and structural spaces to be searched are incredibly large, limiting the scope of the application of pure DFT workflows. Importantly, such a limitation in sampling capacity becomes increasingly critical as the number of elements per compound grows, despite the anticipation that a majority of future compound discoveries will be highly multielemental.10 In order to address this issue and harness the data available from existing ab initio calculations, machine learning (ML) has proven to be a very powerful tool as it typically comes at a fraction of the DFT computational cost.
The first step in high-throughput computational studies consists of identifying stable compounds by finding a stoichiometry and an associated structure that can be formed. In order to assess the stability of a given structure, the appropriate convex hull diagram must be calculated. The proximity between a compound’s enthalpy of formation, ΔHf, and the closest tie-line on the convex hull serves as a criterion for evaluating its stability. Lower values indicate a higher likelihood of stability. Threshold values, typically up to ∼100 meV/atom, are used as stability cut-offs.11 In order to speed up electronic structure methods such as DFT, one possibility is to predict this quantity directly by using ML models, where compounds’ compositional and structural information is encoded and mapped directly onto ΔHf. This is otherwise known as composition prediction as it is used to identify which stoichiometries are stable by fixing structural variations. The ML models typically used include neural networks, kernel ridge regression, and random trees, while the training data are often taken from the OQMD, AFLOWlib, or Materials Project. 
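For concreteness, the two quantities discussed above can be written in the standard textbook form below. This is a sketch of the usual definitions, not a formula taken from this work; the symbols $E_{\mathrm{tot}}$, $N$, $x_i$, $\mu_i$, and $E_{\mathrm{hull}}$ are notation assumed here for illustration.

```latex
% Formation enthalpy per atom of a compound and its distance from the convex hull.
% E_tot : total energy of a cell containing N atoms
% x_i   : atomic fraction of element i in the compound
% mu_i  : energy per atom of the elemental reference phase of element i
% E_hull: value of the convex hull (tie-line or tie-plane) at the composition {x_i}
\Delta H_f = \frac{E_{\mathrm{tot}}}{N} - \sum_i x_i \, \mu_i ,
\qquad
\delta = \Delta H_f - E_{\mathrm{hull}}(\{x_i\})
```

With this convention, a compound on the hull has $\delta = 0$, and a stability cut-off corresponds to accepting compounds with $\delta$ below the chosen threshold.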
For instance, models where the feature vector is only based on compositional information have been used to predict the stability of compounds forming a set of prototype structures (elpasolites, perovskites, Heuslers, etc.), which is fixed for the compounds in the training set.11−14 Including structural information in the definition of a model mainly improves the predictions if large training datasets (>10^5 data points) are used.15 Graph convolutional neural networks16−18 have notably been used to predict convex hull distances accurately and benefit greatly from structural features.19,20 Note that these can also be constructed with compositional information only.21,22 One downside to the inclusion of structural information in the models is that the optimized structure is not known prior to the search, so data for unrelaxed structures has to be used. This can notably be corrected by using ML interatomic potentials (MLIAPs), which are capable of performing relaxations.
MLIAPs combine atomic fingerprints, representing individual atomic environments in the form of feature vectors, with ML algorithms and effectively map the potential energy surface (PES) of a collection of atoms.23 The past decade has seen an immense expansion of the development and application of such potentials.24−29 When trained using active learning, MLIAPs have most notably been able to extend the length and time scales of ab initio molecular dynamic simulations by several orders of magnitude.30−34 Such potentials have been successfully applied to predict the energy and forces of alloys35 and have been used to accelerate and assist the construction or further exploration of binary and ternary convex hulls. Workflows built on these potentials use MLIAPs as surrogate models to first relax and then make energy predictions on a large library of prototype structures. The lowest energy structures are then compared to reference convex hulls obtained from DFT calculations. This process allows one to improve the reference convex hull diagram by identifying structures lying below it. The training of such potentials is crucial for adequate performance, and studies insist on using high-energy structures for relaxations to be reliable.
Work in this area has broadly been split into two categories. In the first, specific MLIAPs are trained for a given system,36−42 typically using active learning. In the second, MLIAPs are trained on large generic databases and are used to scan over many phases.43,44 The former is more accurate than the latter, but it is not transferable to other phases. Due to their higher accuracy, phase-specific MLIAPs can also be regarded as global structure optimizers, in that not only can they be used to identify specific stable compositions but they can also accurately predict their structure. Many other ML global structure optimizers exist, either in the form of novel workflows45−48 or by inserting MLIAPs into the pre-existing state-of-the-art global structure optimizers.49−51
In this work, we demonstrate how an MLIAP can be trained on data readily available on a mainstream repository, such as AFLOWlib,6 and used to screen a library of ternary-alloy prototypes constructed from their associated binary systems. Recently, we have shown52 that an ensemble of spectral neighbor-analysis potential (SNAP)26 models, trained on the energy data of the three binary subsystems associated with a ternary one, was able to predict the energies of ternary compounds with a mean absolute error (MAE) of ∼30 meV/atom, as long as the structures were fully relaxed. This not only provides a fast energy-screening tool for ternary compounds, which only requires existing ab initio data on binary structures, but also gives valuable insight into the fact that chemical environments within binary and ternary transition-metal alloys are similar. This observation is at the heart of the workflow introduced here. Binary structures lying close to their respective convex hull tie-lines are selected as templates for ternary alloys. In a high-throughput setup, these are screened using an ensemble of SNAP models trained on binaries. The lowest-energy compounds are then selected as the most promising candidates, and their energies are calculated using high-fidelity DFT. The ternary convex hull is thus updated.
What makes this workflow different to tailor-made MLIAPs used for convex hull construction is that all the data, both for prototype generation and for training the SNAPs, is taken from the relevant binary phases of the AFLOWlib database.6 In other words, there is no need to generate any new data for the purpose of training the MLIAPs. Despite its training database not being specifically made, either by including important configurations through physical intuition or through active learning, it still has a low enough error on energy predictions to enable a high-throughput search of novel alloys. This is because stable binary and ternary phases, at least for the materials class of transition-metal intermetallics investigated here, share similar local atomic environments. In some sense, the workflow enables an interpolation of the data already available in AFLOWlib to scan ternary convex hulls and identify stable compositions. Since only a few high-energy structures and no out-of-equilibrium configurations are included in the SNAP training dataset, additional features are introduced in the workflow to increase the robustness of the predictions. These include constraints on the SNAP-driven relaxation (constant volume and the inclusion of a maximum number of steps) as well as using an ensemble of models.
In this paper, the workflow used to generate novel ternary compounds is presented. Section 2 (Methods) details how ternary prototype structures are built from their binary counterparts and how binary compound data from AFLOWlib6 is used to train an ensemble of SNAP26 models. Such SNAPs are used to relax and screen ternary prototypes. Section 3 (Results) then presents how the workflow is used to find stable phases for the Cu–Ag–Au and Mo–Ta–W ternary systems. The so-constructed convex hull diagrams are subsequently compared with their available AFLOWlib counterparts, and conclusions are drawn in Section 4. Finally, Section 5 presents the computational methods employed.
2. Methods
The general methodology of our workflow, schematically introduced in Figure 1, is described in detail below. From the AFLOWlib database of binary compounds and their associated DFT-computed energies, an ensemble of SNAP models is trained. A subset of these structures, the ones with the lowest enthalpy of formation, is also used as parent structures and forms a library of prototypes. Note that here, as is standard practice in many DFT-based convex hull constructions, a compound’s enthalpy of formation is approximated solely by its ground-state DFT energy. As such, the terms enthalpy and energy will be considered equivalent throughout the rest of the study. Candidate ternaries are then created by generating all possible and unique derivative structures of such prototypes53 at a fixed composition,54 up to a maximum total number of atoms. These two parts are then combined as the ensemble of SNAP models is used to relax the candidate ternaries. The final energies are predicted through cross-validation (CV) within the ensemble of models, while the standard deviation of the predictions is also used to detect and remove geometries for which the relaxation has failed. The resulting structures with the lowest energy and standard deviation are selected as the best candidates (closest to the convex hull). Finally, full ab initio relaxation is performed for these. The Cu–Ag–Au ternary system is used to develop the methodology and is employed here as an example in each subsection.
Figure 1.
Diagram of the full stable ternary compound search workflow implemented in this work. Data available from the three binary subsystems associated with a ternary one (box at the top) is used for two tasks: (i) the training of an ensemble of SNAP models and (ii) the construction of a library of parent prototype structures. Derivative structures of the latter are created and all possible ternary decorations of these are produced. Each of them is then relaxed with the SNAPs model and the lowest-enthalpy structures are screened.
2.1. Generating Prototypes
The first step of the workflow consists of creating a suitable library of ternary prototype materials. The driving idea of this work is that the local atomic environments seen in binary intermetallic alloys are similar to those in the associated ternaries, especially for structures close to equilibrium.52 In the context of the current work, this insight leads to choosing the binary structures as prototypes for the ternaries. More specifically, the bottoms of the three binary convex hulls associated with a ternary system (in our example, Ag–Au, Cu–Ag, and Cu–Au for Cu–Ag–Au) are scanned to select the lowest-enthalpy compounds. Those within a certain energy range from the convex hull are then selected. All binary structures considered here are taken from the AFLOWlib database.6 The threshold energy selected differs depending on the system at hand to ensure that roughly the same number of structures is taken from each binary diagram. For instance, in our test system, Cu and Ag are immiscible.55 Therefore, all binaries have a positive enthalpy of formation and lie far from the Cu–Ag tie-line between the two elemental phases (fcc Cu and Ag). As a consequence, the energy window above the hull for this binary is larger than that of the other two. The convex hulls of Ag–Au and Cu–Ag are compared in Figure 2, and Table 1 gives the energy window used, as well as the number of structures selected for each binary system.
Figure 2.
Convex hull diagrams of the Ag–Au (top) and Cu–Ag (bottom) binary systems. Each convex hull is defined by the lower black tie-lines, while the green-shadowed regions up to the higher full lines show the energy windows chosen to select the binary structures.
Table 1. Number of Structures, Nstruct, Selected from Each Binary System, X–Y, to Construct the Ternary Prototypes (a)
| X–Y | Nstruct | ΔE (meV) |
|---|---|---|
| Ag–Au | 24 | 1.7 |
| Cu–Ag | 25 | 65.4 |
| Cu–Au | 25 | 6.2 |
(a) Here, we also report the energy window, ΔE, above the convex hull used for the selection.
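The selection of binary structures within an energy window above the hull can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this work: the function names (`lower_hull`, `hull_energy`, `select_within_window`) and the example energies are hypothetical, and each entry is assumed to be a pair (atomic fraction of Y, formation enthalpy in meV/atom) that already includes the two elemental end points at zero enthalpy.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (atomic fraction of Y, formation enthalpy in meV/atom)


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of binary entries (Andrew's monotone chain)."""
    # Keep only the lowest-enthalpy entry at each composition.
    lowest = {}
    for x, e in points:
        lowest[x] = min(e, lowest.get(x, e))
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull: List[Point], x: float) -> float:
    """Linearly interpolate the hull (tie-line) energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the range spanned by the hull")


def select_within_window(points: List[Point], window: float) -> List[Point]:
    """Return the entries lying at most `window` meV/atom above the hull."""
    hull = lower_hull(points)
    return [(x, e) for x, e in points if e - hull_energy(hull, x) <= window]


# Hypothetical entries for illustration only (not data from this work).
entries = [(0.0, 0.0), (1.0, 0.0), (0.5, -40.0), (0.25, -15.0),
           (0.75, -10.0), (0.5, -38.5), (0.5, 20.0)]
selected = select_within_window(entries, window=6.0)
```

For an immiscible pair, every compound has a positive formation enthalpy, so the hull reduces to the tie-line between the two elements and a wider window is needed to collect a comparable number of structures.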
Once the prototypes are selected, the constituent atoms are stripped of their chemical identity, and all structures are compared using the AFLOWlib symmetry tool56 in order to remove redundancies. This is necessary since certain structure types (such as fcc or bcc) may be present several times in the collected database but may be “decorated” in several different ways for different stoichiometries and binaries. At this stage, all structures are reduced to their primitive cells. Note that single-element structures are also included in this analysis. This leads to a library of unique, undecorated prototypes, taken from the binary convex hulls. For the Cu–Ag–Au system, this results in 40 prototypes. Information on the prototype structures is provided in the Supporting Information.
From this set of prototypes, ternary alloys are generated. This task is performed at a fixed stoichiometry and for cells up to a maximum number of atoms, Nmax. For all the prototypes with a number of atoms compatible with the fixed stoichiometry, the set of all unique derivative structures is created by following the procedure introduced in refs (53), (54), and (57) and by using the associated open source ENUMLIB code. The initial implementation of the algorithm begins from a parent lattice and uses group theory to efficiently enumerate all the unique ways to occupy the sites of supercells constructed from that lattice.53 Further modifications of the scheme allow for the starting structures to be defined by a lattice and an atomic basis (multilattices)57 and for the generation of derivative geometries at a fixed stoichiometry.54 This completes the first step of the workflow (blocks in the top right corner of Figure 1) and leads to a set of unique ternary compounds inspired by the structures of the binaries. The energy of these is then screened using an MLIAP.
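The idea of decorating a prototype at fixed stoichiometry can be illustrated with the short sketch below. It is a deliberately simplified stand-in and not the ENUMLIB algorithm: it lists every distinct assignment of species to the sites of a single cell and performs neither the supercell construction nor the symmetry reduction that ENUMLIB applies. The function names `site_counts` and `decorations` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def site_counts(n_sites: int, ratio: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Number of sites per species, or None if the cell size is incompatible."""
    total = sum(ratio.values())
    if n_sites % total != 0:
        return None
    multiplier = n_sites // total
    return {species: r * multiplier for species, r in ratio.items()}


def decorations(n_sites: int, ratio: Dict[str, int]) -> List[Tuple[str, ...]]:
    """All distinct species-to-site assignments at the given stoichiometry.

    Symmetry-equivalent assignments are NOT removed in this sketch.
    """
    counts = site_counts(n_sites, ratio)
    if counts is None:
        return []
    results: List[Tuple[str, ...]] = []

    def fill(prefix: List[str], remaining: Dict[str, int]) -> None:
        if len(prefix) == n_sites:
            results.append(tuple(prefix))
            return
        for species in sorted(remaining):
            if remaining[species] > 0:
                remaining[species] -= 1
                prefix.append(species)
                fill(prefix, remaining)
                prefix.pop()
                remaining[species] += 1

    fill([], dict(counts))
    return results


# A 4-site cell at 1:1:2 stoichiometry admits 4!/2! = 12 labelings.
four_site = decorations(4, {"Cu": 1, "Ag": 1, "Au": 2})
```

A 3-site prototype is skipped at this stoichiometry because its site count is not a multiple of four, which mirrors the compatibility condition stated above.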
2.2. Ensemble of SNAP Models
MLIAPs typically assume that the total energy, E, of an N-atom system defined by coordinates $\mathbf{r}^N$ can be written as a sum of atomic energies $E_i$

$$E(\mathbf{r}^N) = \sum_{i=1}^{N} E_i \tag{1}$$

Such a partition, first proposed by Behler and Parrinello,25 is based on the principle of near-sightedness.58,59 The MLIAP of choice for this work is SNAP,26 which has proved to perform well regardless of the nature of the chemical bond.60 In this model, the total energy of a compound is written as a sum of linear combinations of the feature vectors describing the chemical environments of each atom i of the type $\alpha_i$ in the system. SNAP then takes the bispectrum components, $\mathbf{B}_i$, as feature vectors. The function, $E_{\mathrm{SNAP}}$, that returns the SNAP-predicted total energy is thus defined as

$$E_{\mathrm{SNAP}}(\mathbf{r}^N) = \sum_{i=1}^{N} \left( \beta_{0}^{\alpha_i} + \boldsymbol{\beta}^{\alpha_i} \cdot \mathbf{B}_i \right) \tag{2}$$

where $\beta_{0}^{\alpha_i}$ and $\boldsymbol{\beta}^{\alpha_i}$ are the species-dependent linear coefficients of the ML model. Further details on this potential can be found in Section 5. SNAP’s linear form allows one to obtain good performance with a small number of features, 56 per species in our case, and when trained on small datasets (≤10^3 data points).61 Furthermore, the SNAP hyperparameters are easy to optimize since the range of optimal values for Jmax (the maximum angular momentum of the bispectrum) and rcut (the cutoff radius) is wide and consistent for accurate performance.26,35,62 In our experience, the optimization of the atomic weights, although generally useful, only leads to modest improvements.52

As for our previous study, an ensemble of SNAP models is used to increase the robustness of the predictions. Furthermore, this provides a means of estimating the prediction uncertainty.52 The ensemble is defined as a set of K functions, $\{E_k^{\mathrm{SNAP}}\}_{k=1}^{K}$, where each SNAP model, defined by $E_k^{\mathrm{SNAP}}$, is trained differently and hence has different linear coefficients. The predicted energy of a new system with atomic positions, $\mathbf{r}^N$, is defined as the mean prediction of the models, $\bar{E}$, and its uncertainty is estimated from the standard deviation, $\sigma$, of the individual predictions, namely

$$\bar{E}(\mathbf{r}^N) = \frac{1}{K} \sum_{k=1}^{K} E_k^{\mathrm{SNAP}}(\mathbf{r}^N) \tag{3}$$

$$\sigma(\mathbf{r}^N) = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left[ E_k^{\mathrm{SNAP}}(\mathbf{r}^N) - \bar{E}(\mathbf{r}^N) \right]^2} \tag{4}$$
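The ensemble mean and standard deviation can be sketched in a few lines of Python. The snippet below is a toy illustration under stated assumptions, not the SNAP implementation: each model is reduced to a per-species offset and coefficient vector, the same descriptor vectors are reused for every model (in this work each model has its own atomic weights and hence its own bispectrum components), and the standard deviation is taken as the population value, dividing by K. The names `linear_energy` and `ensemble_predict` and all numbers are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Sequence, Tuple

Atom = Tuple[str, Sequence[float]]                 # (species, descriptor vector)
Coeffs = Dict[str, Tuple[float, Sequence[float]]]  # species -> (offset, coefficients)


def linear_energy(atoms: List[Atom], coeffs: Coeffs) -> float:
    """Total energy as a sum over atoms of offset + coefficients . descriptors."""
    total = 0.0
    for species, descriptors in atoms:
        offset, beta = coeffs[species]
        total += offset + sum(b * d for b, d in zip(beta, descriptors))
    return total


def ensemble_predict(atoms: List[Atom], ensemble: List[Coeffs]) -> Tuple[float, float]:
    """Mean prediction and population standard deviation over the ensemble."""
    energies = [linear_energy(atoms, coeffs) for coeffs in ensemble]
    return fmean(energies), pstdev(energies)


# Toy two-atom structure and two toy models (illustrative numbers only).
atoms = [("Cu", [1.0, 2.0]), ("Au", [0.5, 0.0])]
ensemble = [
    {"Cu": (-1.0, [0.1, 0.2]), "Au": (-2.0, [0.4, 0.0])},
    {"Cu": (-1.0, [0.2, 0.2]), "Au": (-2.0, [0.0, 0.0])},
]
mean_energy, sigma = ensemble_predict(atoms, ensemble)
```

A large spread among the models signals that the structure lies outside the region where they agree, which is how the standard deviation is used as an uncertainty estimate.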
The training data only consists of binary alloys obtained from the AFLOWlib database. In the case of the Cu–Ag–Au system, the total energies have been recomputed for consistency by single-point DFT calculations (no further relaxation is performed). Unlike in the prototype generation, all binaries, no matter their distance from the convex hull, are included in the training dataset. The same workflow is also used for Mo–Ta–W (the results are described in Section 3), for which we demonstrate that the energy values taken directly from AFLOWlib are suitable for training the SNAP models. The full details on the Cu–Ag–Au binary subsystems can be found in ref (52).
Previously, 10 different SNAP models within the ensemble were obtained by training on different subsets of the same size of the binary-alloy database.52 For this work, 5 models are trained on the full database, but for each one, a different set of atomic weights for Cu, Ag, and Au is used to compute the bispectrum components. This difference is motivated by the need to distinguish compounds with identical site positions in their structure (e.g., the sites of a bcc supercell) but different atomic site occupations. If the atomic weights for all species are identical, for some of these structures, notably the high-symmetry ones, SNAP will predict identical energies for different compounds. This is illustrated in Figure 3 for two distinct compounds obtained as bcc derivative structures with the composition Cu1Ag1Au2. Prototypes A and B only differ by a permutation of the Ag and Cu atoms. Hence, SNAP models using identical weights for these two atomic types will fail to predict different energies for the compounds. Therefore, by construction, in the ensemble created, the two elements in each pair of atomic types (e.g., Cu and Ag in Cu–Ag) have different weights in at least one model.
Figure 3.
SNAP performance was assessed for two structurally identical prototypes. The upper two panels show two different possible site occupations for a 3 × 1 × 1 bcc derivative structure with Cu1Ag1Au2 stoichiometry. Here, we show the z axis view, with bronze, silver, and gold spheres representing Cu, Ag, and Au, respectively. The table shows the SNAP-predicted energies for the two prototypes when the SNAP is trained with different atomic weights, {wα}, as indicated in the first column. Note that when the Cu and Ag weights are identical, the two energies coincide by construction. All the crystal structure visualizations are generated with the use of the atomic simulation environment (ASE).63
Before selecting the values for the atomic weights, rcut and Jmax are optimized manually and independently by using 10-fold Monte Carlo CV with fixed identical weights, {1, 1, 1}, which gives rcut = 3.5 Å and Jmax = 4. For these values, the optimal atomic weights are determined by performing a grid search, with the same CV method, where all three atomic weights are varied from −5 to 5 in steps of 1 (omitting 0). Within this search space, the sets of weights used for the SNAP models in the ensemble are chosen to minimize the CV root-mean-squared error (RMSE). The training and CV errors for each model of the ensemble are given in Table 2. The ensemble is then used to predict which of the prototype structures has the lowest enthalpy.
Table 2. Training and CV Errors for the 5 Models, Defined by Different Atomic Weights {wα}, of the Ensemble (a)
| wCu | wAg | wAu | training MAE | training RMSE | CV MAE | CV RMSE |
|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 8.0 | 13.4 | 27.1 | 83.5 |
| 1 | 2 | 2 | 8.7 | 13.5 | 24.8 | 64.7 |
| −1 | −2 | −1 | 9.7 | 16.4 | 30.6 | 86.4 |
| −1 | −2 | −2 | 8.5 | 13.2 | 23.5 | 64.3 |
| −1 | −1 | −2 | 7.7 | 13.1 | 25.6 | 75.0 |
(a) Here, we report the MAE and the RMSE. All values are given in meV/atom.
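The search space and the pairwise-distinction requirement on the ensemble can be made concrete with a short sketch. The code below is illustrative only: the helper names (`weight_grid`, `pairs_distinguished`, `monte_carlo_splits`) are hypothetical, and the test fraction used for the Monte Carlo CV splits is an assumption, since it is not stated here. The five weight sets are the ones listed in Table 2.

```python
import random
from itertools import product
from typing import Iterator, List, Sequence, Tuple


def weight_grid(lo: int = -5, hi: int = 5) -> List[Tuple[int, int, int]]:
    """All (wCu, wAg, wAu) triples with integer weights in [lo, hi], omitting 0."""
    values = [w for w in range(lo, hi + 1) if w != 0]
    return list(product(values, repeat=3))


def pairs_distinguished(ensemble_weights: Sequence[Sequence[int]]) -> bool:
    """True if every pair of species has different weights in at least one model."""
    n_species = 3
    return all(
        any(w[a] != w[b] for w in ensemble_weights)
        for a in range(n_species)
        for b in range(a + 1, n_species)
    )


def monte_carlo_splits(n_samples: int, n_splits: int = 10,
                       test_fraction: float = 0.1,
                       seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Repeated random train/test index splits (test_fraction is an assumption)."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


grid = weight_grid()  # 10 values per species, hence 10**3 candidate weight sets
table2_weights = [(1, 1, 2), (1, 2, 2), (-1, -2, -1), (-1, -2, -2), (-1, -1, -2)]
ensemble_ok = pairs_distinguished(table2_weights)
```

In a full search, each triple in `grid` would be scored by its CV RMSE over the splits, and the best-scoring sets satisfying `pairs_distinguished` would form the ensemble.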
2.3. Energy Screening
The final aim of the workflow is to suggest low-energy ternary structures with a given stoichiometry. Since many compounds with a large energy spread are screened, the suggestions made need to be accurate (must include low-energy structures) and reliable (must not include high-energy and unphysical systems). While the energy error of the SNAP surrogate model is low, it is still prone to making poor predictions about new systems that do not resemble the structures seen in training. As a result, the construction of the workflow focuses on the robustness of the final predictions made. Note that choosing parent prototypes from binary compounds increases the reliability of the predictions.
The first step in the energy-screening process consists of setting the compounds’ unit-cell volume. This is chosen by taking the weighted average of the elemental volumes of the constituent atoms, an approximation that reproduces the results from ab initio-relaxed compounds quite well, as is illustrated in Figure 4. Then, the volume and all lattice parameters are kept fixed during any relaxation driven by the SNAP models. This is because, while the training database includes a diverse set of structures, they are all at equilibrium, namely, their forces and stress-tensor elements are close to zero. Therefore, no configurations are strongly compressed or expanded, a fact that causes the SNAP models to perform poorly on the prediction of equilibrium volumes and lattice parameters. The volume is allowed to change only for the final, most promising structures selected for the DFT relaxation.
Figure 4.
Plot showing the initial unrelaxed volumes, Vunrelax, and relaxed equilibrium ones, Vrelax, of a set of ternary prototypes at the stoichiometry Cu2Ag1Au1. Unrelaxed volumes are chosen from the volume of the binary associated with each ternary compound’s structure. The dashed line indicates the mean equilibrium volume for these compounds, while the full line shows the volume predicted by the weighted average of the elemental volumes, Vpred.
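The volume rule used to initialize each candidate can be written as a two-line calculation. The sketch below assumes that the weighted average is taken over atomic fractions, so that the cell volume is the sum of the elemental volumes per atom; the function names and the elemental volumes are hypothetical placeholders, not values from this work.

```python
from typing import Dict


def predicted_cell_volume(counts: Dict[str, int],
                          elemental_volumes: Dict[str, float]) -> float:
    """Cell volume as the composition-weighted sum of elemental volumes per atom."""
    return sum(n * elemental_volumes[species] for species, n in counts.items())


def lattice_scale(current_volume: float, target_volume: float) -> float:
    """Isotropic factor applied to all lattice vectors to reach the target volume."""
    return (target_volume / current_volume) ** (1.0 / 3.0)


# Placeholder volumes per atom in cubic angstroms (illustrative, not from this work).
volumes = {"Cu": 12.0, "Ag": 17.0, "Au": 17.0}
v_pred = predicted_cell_volume({"Cu": 2, "Ag": 1, "Au": 1}, volumes)
scale = lattice_scale(current_volume=64.0, target_volume=v_pred)
```

Rescaling the lattice vectors isotropically keeps the cell shape of the parent prototype, which is consistent with holding the lattice parameters fixed during the SNAP-driven relaxation.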
Each of the K SNAP models available is used to drive an ionic relaxation, with a maximum of Ns steps for all of the prototypes, leading to K differently relaxed structures per prototype. A “cross-validated” energy prediction is given for each relaxed structure. Given a candidate obtained by relaxing a prototype with the k-th SNAP model, the energy prediction is made with K – 1 models, namely, all SNAPs bar the one used for the relaxation of the candidate at hand. The mean and standard deviation of the energy predictions for the K – 1 models are then saved. For every prototype, one of the K relaxed structures obtained is selected, namely, the one with the lowest “cross-validated” standard deviation. This is the structure whose final total energy has received the largest consensus among the SNAP models. Therefore, there is only one relaxed structure per prototype. This procedure is illustrated in the flowchart in Figure 1.
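The selection among the K relaxed structures of one prototype can be sketched as follows. This is a minimal illustration, assuming the energies are already available as a K × K table in which entry [j][k] is the energy that model k predicts for the structure relaxed by model j; the function names and the numbers are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev
from typing import List, Tuple


def cross_validated_stats(energies: List[List[float]]) -> List[Tuple[float, float]]:
    """Mean and standard deviation over the K - 1 models that did not drive the relaxation."""
    stats = []
    for j, row in enumerate(energies):
        others = [e for k, e in enumerate(row) if k != j]  # leave out the driving model
        stats.append((fmean(others), pstdev(others)))
    return stats


def select_relaxed(energies: List[List[float]]) -> Tuple[int, float, float]:
    """Index, mean, and standard deviation of the structure with the lowest spread."""
    stats = cross_validated_stats(energies)
    best = min(range(len(stats)), key=lambda j: stats[j][1])
    return best, stats[best][0], stats[best][1]


# Toy 3 x 3 table: the diagonal entries (driving model) are ignored by construction.
energies = [
    [-9.00, -3.00, -3.20],
    [-3.50, -9.50, -2.50],
    [-3.30, -3.32, -8.00],
]
best_index, best_mean, best_sigma = select_relaxed(energies)
```

In the toy table the driving models predict unrealistically low energies for their own relaxed structures, and leaving them out removes that bias from both the mean and the spread.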
The reason why this process is not a single ionic relaxation stems from the drive toward robustness of the predictions. Without the inclusion of the Ns iteration cutoff, some of the relaxations would lead to structures that are trapped in unphysical local minima of the PES of the driving SNAP model. By stopping the relaxation process at a low number of steps (Ns = 10 in this study), this effect is mostly avoided as the structures cannot change too drastically. For the relaxations that are accurately driven by SNAP, the largest drop in energy typically occurs during the first few steps of the relaxation process. While accurate relaxations are also cut before convergence, as they are not distinguished from the inaccurate ones, the final structures are still lower in energy than the initial prototypes. This reduces the likelihood of obtaining high-energy structures and the total run time of the workflow remains modest.
Using “cross-validated” energy predictions of the relaxed structures helps to remove the bias of specific SNAP models. The SNAP driving the relaxation typically predicts the final structure to have a lower energy than the initial one since it moves the geometric configuration toward an (at least) local minimum of its particular PES. If the relaxation is inaccurate, the resulting structure will be, in fact, high in energy (as predicted by DFT). The relaxation-driving SNAP model is, therefore, not used for energy predictions. Instead, the other models of the ensemble are used, since they are less likely to present a bias toward that relaxed structure and therefore to predict it being low in energy. Note that since the compounds used as training data are the same for all of the K models, they could also lead to biases for the same structure. This is accounted for by using the “cross-validated” standard deviation, rather than the mean energy, to select the “best” structure out of the K relaxed ones. Indeed, even if several/all SNAP models are biased toward a particular structure and collectively predict it to have a low energy, the inaccurate predictions of each model will be different. This is because they are inaccurate, extrapolated predictions. It has indeed been shown that the standard deviation of SNAPs correlates with the error of the ensemble’s prediction.52 Hence, while the mean prediction of the ensemble may give a low energy value, the standard deviation will be large.
Finally, a structure is considered for the final energy screening only if its “cross-validated” standard deviation is lower than a cutoff value, σcut. This typically excludes structures with low SNAP-predicted energy that DFT finds to be high in energy, as well as structures with high SNAP-predicted energy. From the structures that pass this filter, the ones with the lowest “cross-validated” mean energy are chosen and relaxed with DFT. In this study, 15 structures per stoichiometry are selected through such a process.
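The final filter and ranking step amounts to a threshold followed by a sort. The sketch below is a minimal illustration: the `Candidate` record, the `screen` function, and the numerical values (including the cutoff) are hypothetical, while the default of 15 selected structures follows the number quoted above.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    energy: float  # cross-validated mean energy
    sigma: float   # cross-validated standard deviation


def screen(candidates: List[Candidate], sigma_cut: float,
           n_select: int = 15) -> List[Candidate]:
    """Keep candidates with sigma below the cutoff, then take the lowest energies."""
    trusted = [c for c in candidates if c.sigma < sigma_cut]
    return sorted(trusted, key=lambda c: c.energy)[:n_select]


# Toy candidates (illustrative numbers only): "B" has the lowest energy but a large spread.
pool = [
    Candidate("A", -50.0, 5.0),
    Candidate("B", -80.0, 40.0),
    Candidate("C", -30.0, 3.0),
    Candidate("D", 10.0, 2.0),
]
shortlist = screen(pool, sigma_cut=10.0, n_select=2)
```

Candidate "B" is discarded despite its low predicted energy because the models disagree on it, which is the failure mode the cutoff is meant to catch.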
In summary, the workflow described creates a set of prototypes and uses an ensemble of ML potentials to relax them and to screen for the structures that are most likely to have low energy. This is done iteratively at fixed stoichiometries. The final selected compounds are then recomputed with full DFT relaxation. The workflow, therefore, allows one to perform all of the computationally intensive DFT calculations only on the most promising candidates. In the following sections, this workflow will be used to reconstruct the ternary-alloy convex hulls of Cu–Ag–Au and Mo–Ta–W.
3. Results
This section, which is separated into two subsections, presents the key outcomes of our method. First, we examine the performance of our workflow against the well-established and extensively studied Cu–Ag–Au phase diagram.64 Then, we provide a comparison between our results and those of one of the better-characterized phase diagrams available in AFLOWlib, specifically Mo–Ta–W. By benchmarking the phase diagrams predicted by the workflow against the DFT-created ones available in AFLOWlib, we gain valuable insights into the effectiveness of our approach.
In order to accurately evaluate the stability of our predicted prototypes and ensure consistency in our analysis, we have used the QHull65 library to calculate the convex hulls presented in this work. The stable, ground-state compounds used to construct the reference convex hull (a subset of the full database) were downloaded from AFLOWlib. For both ternary systems studied, we have recalculated the energies of these compounds with the Vienna Ab initio Simulation Package (VASP)66 to guarantee consistency. Throughout the entire process, we have strictly followed the AFLOWlib standards as outlined in ref (67), with an energy cutoff of 600 eV to ensure tight convergence. More information regarding the DFT calculations can be found in Section 5.1. It should be noted that, for the Mo–Ta–W ternary system, we directly use the AFLOWlib precomputed energies for the training of the SNAPs ensemble. For the Cu–Ag–Au system, energy data are taken from a previous project,52 where they were recalculated with VASP.
3.1. Cu–Ag–Au Ternary Convex Hull
In order to evaluate the performance of our workflow, it is essential to select a well-studied phase diagram that meets specific requirements. Another key consideration is the availability of sufficient data to train an accurate MLIAP. To facilitate the identification and correction of any errors during the development of the workflow, it is also beneficial to choose a relatively simple phase diagram. With these criteria in mind, we chose the Cu–Ag–Au ternary system, a choice further supported by the fact that the MLIAPs for this phase diagram have already been optimized and trained in our previous work.52
As a proof of concept, we have focused on the equiatomic Cu1Ag1Au1 ternary phase as well as phases with stoichiometric ratios of 2–1–1 and 2–2–1. The reason for this choice is that data at these stoichiometries are available in AFLOWlib for comparison. The results of the workflow are presented in Figure 5 and Table 3. In order to quantitatively assess the stability of the structures proposed by the workflow, we use δ, the distance from the reference convex hull (AFLOWlib). A negative value indicates that the predicted structure lies below the calculated convex hull, establishing its stability as an intermetallic compound. In that case, the convex hull needs to be recalculated and corrected by taking into account the newly predicted stable structure. In contrast, a positive distance from the convex hull provides a criterion for assessing whether the structure is metastable or unstable. In Table 3, values predicted by the workflow (AFLOWlib) are labeled as δWP (δAFLOW).
Figure 5.

Workflow predictions for the Cu–Ag–Au ternary system across different stoichiometries. Graph presents the different compositions and their corresponding enthalpies of formation, ΔHf. The blue points are associated with the predictions from the proposed workflow, whereas the orange ones represent the lowest-energy AFLOWlib points. Dashed line (CH) marks the tie-plane position of the convex hull. Proposed workflow manages to identify one stable intermetallic phase among these, namely, Cu1Ag1Au2. Furthermore, it manages to outperform the AFLOW dictionary method in all of the presented cases. Unit cell of the newly discovered crystal structure on the convex hull is presented as well. Here, Au atoms are in gold, Ag in silver, and Cu in bronze.
Table 3. Workflow Predictions for the Cu–Ag–Au Ternary System with 2–2–1 and 3–1–1 Compositionsa.
| stoichiometry | δAFLOW (meV/atom) | δWP (meV/atom) |
|---|---|---|
| Cu2Ag2Au1 | 208.95 | 25.99 |
| Cu2Ag1Au2 | 205.69 | 37.45 |
| Cu1Ag2Au2 | 90.27 | 17.21 |
| Cu3Ag1Au1 | – | 20.35 |
| Cu1Ag3Au1 | – | 31.05 |
| Cu1Ag1Au3 | – | −0.02 |
The stoichiometries and their corresponding distances from the convex hull, δWP, are presented. For the 2–2–1 compounds, the distances from the convex hull of the phases available in the AFLOWlib database, δAFLOW, are also given. Note that for all materials, the distance from the AFLOWlib convex hull tie-plane is used as a reference. A new gold-heavy intermetallic, namely, Cu1Ag1Au3, is predicted to be stable (see Figure 6).
The scalability and speed of the algorithm allow us, in principle, to investigate more regions of the phase diagram in a single study than a pure DFT phase diagram construction scheme. This is exemplified by using the proposed workflow to predict structures that are not in AFLOWlib’s database, namely, compounds with 3–1–1 stoichiometry. The results of the benchmark are presented in Table 3, alongside the crystal structure of the new stable phase, Cu1Ag1Au3, in Figure 6.
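As a small illustration of the stoichiometry notation used here, where a ratio such as 3–1–1 stands for all of its permutations over the three elements, the distinct compositions can be enumerated as follows. The helper name `compositions` is hypothetical.

```python
from itertools import permutations


def compositions(ratio, elements):
    """Distinct compositions obtained by permuting `ratio` over `elements`."""
    distinct = sorted(set(permutations(ratio)), reverse=True)
    return [
        "".join(f"{element}{count}" for element, count in zip(elements, perm))
        for perm in distinct
    ]


print(compositions((3, 1, 1), ("Cu", "Ag", "Au")))
# ['Cu3Ag1Au1', 'Cu1Ag3Au1', 'Cu1Ag1Au3']
```

A ratio with two equal entries yields three compositions, a ratio with three different entries yields six, and the equiatomic ratio yields a single one.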
Figure 6.

The unit cell of the crystal structure found on the convex hull at the composition Cu1Ag1Au3 is presented in a top view with respect to the z axis (left), a side view along the x axis (middle), and a tilted view (right). In this structure, Au atoms are colored in gold, Ag atoms in silver, and Cu atoms in bronze.
Our approach outperforms the AFLOW dictionary method in all cases, demonstrating a better predictive capability, which arises from the exploration of a larger pool of prototypes. Interestingly, the structures predicted by the proposed workflow are consistently closer to the convex hull than those predicted by the AFLOW dictionary method. This is to be expected since the workflow effectively selects the relevant structures for creating the pool of ternary candidates. Furthermore, our model consistently predicts structures with a negative or almost negative (<10 meV/atom) enthalpy of formation, a fact that gives us confidence in the reliability of the predicted structures. Notably, we have been able to identify two new gold-heavy stable phases, namely, Cu1Ag1Au2 and Cu1Ag1Au3. This indicates that stable intermetallic phases may exist on the gold side of the phase diagram. We have confidence in our prediction, given the fact that the dictionary method structure for Cu1Ag1Au2 is within 3 meV/atom of the convex hull, suggesting the possibility of the existence of a stable phase. This is consistent with the formation of the solid solutions in the gold-rich region of the experimental phase diagram.64 The rest of the structures are considered to be potentially metastable, with an average distance from the convex hull of around 30 meV/atom.68 Overall, our analysis demonstrates the ability of the workflow introduced here to predict structures closer to the convex hull than those from the state-of-the-art dictionary method and possibly uncover novel phases should they exist.
3.2. Mo–Ta–W Ternary Convex Hull
As a second benchmark, we explore a phase diagram that exhibits a variety of stable phases. Thus, the main criterion for our selection, among all of the possible transition-metal ternary combinations, is the total number of stable compounds. The Mo–Ta–W ternary system emerged as a good candidate based on a search run with the AFLOW REST-API.69 In fact, it exhibits the highest number of stable ternary phases of the entire database of transition metal alloys. In order to compare our proposed workflow with the dictionary method, we have made predictions corresponding to the same stoichiometries presented in the previous section. Furthermore, we used our method to explore areas of the phase diagram poorly covered in AFLOWlib.
We now perform a similar analysis to that described in the previous section. The structure prototypes used for element decoration are extracted from those of the binaries closest to their respective convex hulls. Information on the prototype structures is provided in Table S2 in the Supporting Information. Then, an ensemble of ML models relaxes the created structures and orders them based on their predicted energy. A set of 15 structures for each stoichiometry, corresponding to those with the lowest predicted energies, is selected and passed on to the next stage. The latter consists of performing a DFT relaxation and a static calculation for each one of these predictions. A significant difference with respect to the Cu–Ag–Au system is that we now use AFLOWlib’s database to train the models without further recalculation. The AFLOW REST-API is used to download the energies and the crystal structures for the three binary convex hulls (Mo–W, Ta–W, and Mo–Ta). The models are trained as explained in the Methods Section (see Section 2). Recycling data already available on AFLOWlib allows us to avoid about 1500 DFT relaxation calculations, some of them for cells up to 46 atoms, just for the training of the model.
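The screening step described above, in which the 15 lowest-energy structures per stoichiometry are retained, can be sketched as follows. This is an illustration and not the code used in this work: the function name `shortlist`, the dictionary layout, and the use of the ensemble mean as the ranking score are all assumptions.

```python
from statistics import mean


def shortlist(candidates, n_keep=15):
    """Keep the n_keep lowest-energy candidates for each stoichiometry.

    candidates: list of dicts with the keys
      'id'            unique label of the decorated structure
      'stoichiometry' e.g. 'Mo1Ta2W1'
      'energies'      per-atom energies predicted by each model of the ensemble
    """
    by_stoichiometry = {}
    for candidate in candidates:
        # Assumption: the ensemble prediction is the mean over its models.
        score = mean(candidate["energies"])
        by_stoichiometry.setdefault(candidate["stoichiometry"], []).append(
            (score, candidate["id"])
        )
    return {
        stoichiometry: [label for _, label in sorted(scored)[:n_keep]]
        for stoichiometry, scored in by_stoichiometry.items()
    }


# Toy pool with made-up energies (eV/atom) from a three-model ensemble.
pool = [
    {"id": "s1", "stoichiometry": "Mo1Ta2W1", "energies": [-11.20, -11.22, -11.21]},
    {"id": "s2", "stoichiometry": "Mo1Ta2W1", "energies": [-11.30, -11.28, -11.29]},
    {"id": "s3", "stoichiometry": "Mo1Ta2W1", "energies": [-11.10, -11.12, -11.11]},
    {"id": "s4", "stoichiometry": "Mo2Ta1W1", "energies": [-11.05, -11.07, -11.06]},
]
selected = shortlist(pool, n_keep=2)
```

Only the structures returned by such a selection would be passed on to the DFT relaxation and static calculation.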
The results for the 1–1–1 and 2–1–1 compositions, those with stable phases in AFLOWlib, are presented first in Figure 7. In this case as well, we predict a new stable intermetallic phase, Mo1Ta2W1. However, this time, our workflow does not consistently outperform the dictionary method. In fact, for two out of the four stoichiometries investigated in Figure 7, we obtain compounds with energies similar to the ones already present in AFLOWlib, while for one, Mo2Ta1W1, our search delivers a compound with a higher energy. Interestingly, in this last case, our newly found structure and the original one, contained in AFLOWlib, belong to different space groups. The AFLOW-predicted one has space group 107 (tetragonal), while our scheme finds a low-symmetry monoclinic crystal structure with space group 9. The final geometries are not equivalent as determined by the AFLOW-SYM tool.56 Nevertheless, the compound discovered with the workflow only has an enthalpy of formation 14.91 meV/atom higher than that of the AFLOWlib compound.
Figure 7.

Workflow predictions for the Mo–Ta–W ternary system across different stoichiometries, 1–1–1 and 2–1–1. Graph presents the different compositions and their corresponding enthalpy of formation, ΔHf. The blue points are associated with the predictions from the proposed workflow, whereas the orange ones represent the lowest-energy AFLOWlib data. Dashed line marks the tie-plane position of the convex hull (CH). The proposed workflow has managed to identify one previously unknown intermetallic phase, namely, Mo1Ta2W1, whose unit cell is shown as an inset. Here, Mo atoms are in purple, Ta in gold, and W in silver.
As a force field-based approach, our workflow becomes more reliable as the accuracy of the MLIAP improves. In this case, we have extracted the data used to train the SNAPs from the AFLOWlib repository, a detail that led to a force field less accurate than that used for the Cu–Ag–Au system. In fact, minor inconsistencies in the energy data, due to unconverged results, may generate errors in the force field.70,71 That being understood, we have still demonstrated that new phases can be predicted by an almost DFT-free workflow since our initial data for model training are readily available in the AFLOWlib database. The workflow systematically assesses a wide range of compositions and potential compounds. Specifically, it involves the evaluation of 331,734 ternaries based on their calculated SNAP energies. Following this, the 15 lowest-enthalpy structures, for each stoichiometry, undergo relaxation with DFT. Interestingly, the DFT analysis reveals that, on average, the most stable compound ranks seventh among the suggested options. Additionally, the ab initio computations are shortened since all compounds move closer to their equilibrium geometry after the SNAP-guided relaxation, in contrast to their fully unrelaxed counterparts.
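The ranking statistic quoted above can be obtained with a few lines of code. The sketch below, with the hypothetical function name `rank_of_dft_minimum` and made-up energies, returns the position that the DFT ground state occupies in the SNAP-ordered shortlist of one stoichiometry; averaging this quantity over all stoichiometries gives the mean rank.

```python
def rank_of_dft_minimum(snap_energies, dft_energies):
    """1-based rank, in the SNAP ordering, of the structure with the lowest DFT energy.

    Both arguments map a structure label to an energy per atom; only the
    structures that were recomputed with DFT are ranked.
    """
    snap_order = sorted(dft_energies, key=lambda label: snap_energies[label])
    dft_best = min(dft_energies, key=dft_energies.get)
    return snap_order.index(dft_best) + 1


# Made-up energies (eV/atom) for a shortlist of four structures.
snap = {"s1": -11.30, "s2": -11.28, "s3": -11.25, "s4": -11.20}
dft = {"s1": -11.31, "s2": -11.27, "s3": -11.35, "s4": -11.22}
rank = rank_of_dft_minimum(snap, dft)
```

A rank well inside the shortlist indicates that the chosen shortlist length is sufficient to capture the DFT minimum, whereas a rank close to the last position would suggest that more structures should be retained.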
Perhaps a more accurate force field would also be able to find the AFLOWlib minimum for Mo2Ta1W1 (see Figure 7). Nevertheless, our workflow is already able to identify the majority of the structures close to the convex hull. It should also be noted that this is the phase diagram for which AFLOWlib’s dictionary method works best, as it is able to detect four intermetallic phases, more than for any other transition metal alloy phase diagram.
Then, we move on to analyze stoichiometries poorly explored in AFLOWlib, namely, 2–2–1 and 4–1–1. In Table 4, we provide a comparison of the distance from the convex hull for the structures predicted with our method, δWP, and the ones from AFLOWlib, δAFLOW. For these compositions, the AFLOWlib compounds are unstable, as they all have a positive enthalpy of formation. In contrast, those found by our workflow all have a negative enthalpy of formation and are found near or at the convex hull. These results provide a comparison between our method and that of AFLOWlib for structures predicted to be unstable by the latter.
Table 4. Workflow Predictions for the Mo–Ta–W Ternary System with 2–2–1 and 4–1–1 Compositionsa.
| stoichiometry | δAFLOW (meV/atom) | δWP (meV/atom) |
|---|---|---|
| Mo2Ta2W1 | 880.90 | 0.00 |
| Mo1Ta2W2 | 962.84 | 0.00 |
| Mo2Ta1W2 | 1032.50 | 8.50 |
| Mo4Ta1W1 | 320.95 | 46.56 |
| Mo1Ta4W1 | 516.30 | 3.25 |
| Mo1Ta1W4 | 334.16 | 0.00 |
The stoichiometries and their corresponding distance from the convex hull, δ, are presented (δWP is for compounds generated by our workflow, while δAFLOW is for the AFLOWlib compounds). Three intermetallic phases are predicted to be stable and two others are metastable. Surprisingly, our algorithm is able to find structures with energies up to 1 eV/atom lower than those identified by the dictionary method of AFLOWlib.
The ability of our workflow to consistently predict structures that (i) are close to the convex hull and (ii) have a negative enthalpy of formation is thus demonstrated. The former point means that we have an effective algorithm to use for the structure search in regions of interest. The latter validates our physical intuition behind the assumption that the crystal structures of the binary alloys close to the convex hull can be used as a template for atomic decoration in the search for ternary phases. This approach has allowed us to identify three new intermetallic compounds (see Table 4), namely, Mo2Ta2W1, Mo1Ta2W2, and Mo1Ta1W4. Such positive results demonstrate the value of the enhanced freedom in the structure search provided by our algorithm with respect to the dictionary methods.
Finally, following the same spirit as that for the analysis of the Cu–Ag–Au system, we now turn our attention to previously unexplored areas of the ternary convex hull. Our results for the Mo1Ta2W3 and 3–1–1 compositions are shown in Figure 8.
Figure 8.

Workflow predictions (blue points) of the enthalpy of formation for the Mo–Ta–W ternary system across the Mo1Ta2W3 and 3–1–1 compositions. Enthalpy of formation for each composition at the appropriate convex hull tie-plane (CH) is shown as a dashed line. Unit cells of the crystal structures found on the convex hull are presented as well. Here, Mo atoms are colored purple, Ta gold, and W silver. Two new intermetallic alloys have been identified, namely, Mo1Ta2W3 and Mo1Ta3W1.
As one can observe, together with structures away from the tie-plane, we also find two new stable compounds, namely, Mo1Ta2W3 and Mo1Ta3W1. Such new phases, together with the low-energy ones previously discussed, call for a modification of the ternary convex hull that exists in AFLOWlib. The new diagram is presented in the top panel of Figure 9. In order to facilitate the comparison, the lower panel of the same figure shows the difference between the AFLOW- and our workflow-predicted convex hulls (positive values mean that our predicted convex hull is lower in energy than the original AFLOWlib one).
Figure 9.

Workflow-computed convex hull for the Mo–Ta–W system (upper panel). Color heat map corresponds to the calculated enthalpy of formation at a given stoichiometry. In the lower panel, we present the difference between the convex hull of AFLOWlib (reference) and that computed by our workflow. Black crosses for the ternary region symbolize the newly predicted intermetallic phases, and the red cross denotes the only stable intermetallic that was originally predicted by the AFLOW dictionary method.
The new convex hull returns a picture where most of the stable ternary structures identified belong to the Ta–W heavy area and only one intermetallic alloy exists in the Mo-rich region of the compositional space. The latter is the compound found on AFLOWlib. Interestingly, the new phases predicted by our workflow undercut Mo1Ta1W1, Mo1Ta2W1, and Mo1Ta1W2, the other intermetallic alloys initially predicted as stable by AFLOWlib. These are now 2.59, 10.95, and 1.31 meV/atom, respectively, above their associated tie-planes and have to be considered metastable. Experimentally, there is evidence that the Mo–Ta–W system forms a ternary solid solution72 at finite temperature across the entire phase diagram. It should be noted that the Mo–Ta binary space is far better sampled by AFLOWlib than the Ta–W and Mo–W ones. This could imply that it is more difficult to reach the convex hull close to such a facet of the diagram. In contrast, the Mo–W system only displays small enthalpies of formation for the stable binary phases, implying that both Mo and W form more stable phases with Ta than among themselves. These two reasons could explain why it is more difficult to find stable intermetallic phases in the Mo-rich part of the composition space.
In summary, the workflow developed provides a comprehensive scan across the ternary composition space, enabling the construction of a convex hull with DFT-accuracy and identifying areas prone to alloy stability through the discovery of new hull points. This is accomplished by screening on the order of 10⁵ candidate compounds with the SNAPs ensemble. In the case of the Cu–Ag–Au test system, two stable Au-rich compounds are found, which are not present on the AFLOWlib convex hull. The new Cu1Ag1Au2 (Figure 5) and Cu1Ag1Au3 (Figure 6) ternary structures have space groups 123 and 63, respectively. They resemble distorted bcc and hcp structures. The workflow hence highlights the region in which Cu–Ag–Au alloys are likely to form. The experimental structures for these phases are fcc solid-state solutions.55,64,73 For the Mo–Ta–W system, six novel compounds are identified, undercutting three of the AFLOWlib compounds and leaving Mo2Ta1W1 as the only AFLOWlib intermetallic on the convex hull. Experimentally, bcc solid-state solutions form across the full compositional range.72 One of the phases discovered, Mo1Ta1W4, possesses a tetragonally distorted bcc structure, while the others have space groups 74 (Mo1Ta2W1), 12 (Mo1Ta2W2 and Mo1Ta3W1), and 38 (Mo1Ta2W3 and Mo2Ta2W1). The results found here suggest that at low temperatures, the Mo-rich corner is dominated by an intermetallic phase, Mo2Ta1W1. However, the regions with lower Mo concentrations form many more different intermetallics that lie close to each other in energy. This region is thus more susceptible to the formation of solid-state phases at finite temperatures. The convex hulls obtained with this workflow suggest promising regions of material stability, notably in the form of solid-state solutions for the two systems studied. This could help guide experimental studies of the synthesis of stable ternary alloys.
Our strategy resembles the MatLearn74 approach in spirit but is a more accurate method as it provides DFT-level convex hulls. This is illustrated by the fact that no novel ternary phases are predicted by MatLearn for Cu–Ag–Au and Mo–Ta–W, and for the former, the only “known” phase (from DFT) is in the Cu-rich region, which differs from the experimental phase diagram.
4. Conclusions
We developed a workflow that predicts the crystal structure and assesses the stability of ternary compounds of a particular stoichiometry. A library of prototype structures is formed from the lowest-enthalpy alloys of the associated binary subsystems. From this database, derivative ternary structures are generated by site decoration. Then, an ensemble of SNAP force fields is used to select the most promising structures among them, bypassing the majority of the ab initio calculations. Therefore, the proposed workflow greatly increases the throughput in the search for stable ternary compounds without compromising the quality of the predictions. This is used here to map the ternary convex hull of the transition-metal alloys. The crucial aspect of the proposed scheme is that both the training of the force fields and the creation of the prototype ternary structures are based solely on knowledge of the binary phases. As such, no additional DFT calculations are required since both the structures and their corresponding energies are readily available on the AFLOWlib database. Employing ab initio calculations solely in the final stage of the workflow and focusing them on the most promising candidates allows us to perform a comprehensive exploration of the phase diagram of a ternary system with only a few hundred DFT calculations. This enables us to map previously unexplored portions of the ternary space and identify regions of interest, thus driving the discovery of novel compounds.
We have demonstrated that the proposed workflow is able to predict crystal structures with negative enthalpy of formation and effectively identify the stable intermetallics should they exist. In particular, we used the Cu–Ag–Au and Mo–Ta–W ternary systems as examples. In the first case, we have predicted several new phases that, although not all thermodynamically stable, have an enthalpy of formation lower than those found by the AFLOW dictionary method. In addition, we have identified a Au-rich composition region where stable intermetallic phases are expected, in accordance with the location of solid solutions in the experimental phase diagram.64 Interestingly, in the case of Mo–Ta–W, one of the ternary systems with the largest number of stable intermetallics in AFLOWlib, our method is capable of identifying a plethora of new phases, resulting in the correction of the original DFT-calculated convex hull proposed by AFLOW.
In summary, we have developed a novel way to integrate ML with a DFT workflow. Although the ML model introduced here does not perform as well as force fields with tailor-made databases, its construction requires no new DFT calculations and simply recycles pre-existing results already present in large-scale databases. This represents an example of how ML interatomic potentials can be seamlessly integrated into a materials design pipeline without the need to generate ad hoc large training sets.
5. Computational Methods
The details of the computational methods are presented in this section. The parameters used for the DFT calculations run with VASP66 are first discussed. A brief presentation of the SNAP26 is then given, along with details of the implementations used for the current work.
5.1. DFT Calculations
All DFT calculations are performed using VASP,66 version 5.4.4. Projector augmented wave (PAW) pseudopotentials are used for each element together with the Perdew–Burke–Ernzerhof (PBE) functional.75 A plane wave cutoff of 600 eV is used for all calculations. The energy convergence criterion for each self-consistent cycle is 10⁻⁴ eV. Full atomic relaxations are performed (update of atomic positions, cell volume, and lattice parameters) with a stopping criterion on the forces of 10⁻³ eV/Å. A Fermi–Dirac smearing of 0.2 eV is chosen for all calculations.
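For readers who wish to reproduce a comparable relaxation setup, the parameters listed above can be collected into a set of VASP INCAR tags. The mapping below is our reading of the stated settings and not the input files used in this work; tags that are not specified in the text (for instance IBRION, NSW, or PREC) are deliberately omitted, and the helper `render_incar` is hypothetical.

```python
# INCAR tags corresponding to the relaxation settings described in the text.
relaxation_tags = {
    "ENCUT": 600,     # plane-wave cutoff (eV)
    "EDIFF": 1e-4,    # energy convergence of each self-consistent cycle (eV)
    "EDIFFG": -1e-3,  # negative value: stop when all forces fall below 1e-3 eV/Å
    "ISMEAR": -1,     # Fermi-Dirac smearing
    "SIGMA": 0.2,     # smearing width (eV)
    "ISIF": 3,        # relax atomic positions, cell shape, and cell volume
}


def render_incar(tags):
    """Format a dictionary of tags as INCAR-style 'KEY = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items())


incar_text = render_incar(relaxation_tags)
```

For a static calculation, the tags that control the ionic relaxation (EDIFFG and ISIF in this sketch) would not apply.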
For the k-point sampling, a gamma-centered mesh is employed for all calculations. The density of the mesh and the spacing between k-points are chosen based on AFLOWlib’s convergence criteria.67 The mesh is system-specific and is determined from the NKPPRA (number of k-points per reciprocal atom). The number of sampling points along each direction is proportional to the norm of the corresponding reciprocal lattice vector. The total number of sampling points per reciprocal atom is then minimized and NKPPRA is used as a lower bound. Values of 10 × 10³ and 6 × 10³ are used for static calculations and relaxations, respectively.
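The construction of the mesh from NKPPRA can be illustrated with a short sketch. The greedy scheme below is an assumption made for illustration and is not AFLOW's own routine: starting from a 1 × 1 × 1 mesh, it repeatedly refines the direction that is sampled most coarsely relative to the length of its reciprocal lattice vector, and stops as soon as the number of k-points multiplied by the number of atoms reaches NKPPRA. The function names are hypothetical.

```python
import math


def reciprocal_norms(cell):
    """Norms of the reciprocal lattice vectors for a cell given as three row vectors."""
    def cross(u, v):
        return (
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0],
        )

    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

    a1, a2, a3 = cell
    volume = abs(dot(a1, cross(a2, a3)))
    return [
        2.0 * math.pi * math.sqrt(dot(c, c)) / volume
        for c in (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    ]


def kmesh_from_nkppra(cell, n_atoms, nkppra):
    """Smallest roughly proportional mesh with N1 * N2 * N3 * n_atoms >= nkppra."""
    norms = reciprocal_norms(cell)
    mesh = [1, 1, 1]
    while mesh[0] * mesh[1] * mesh[2] * n_atoms < nkppra:
        # Refine the direction with the fewest points per unit reciprocal length.
        coarsest = min(range(3), key=lambda k: mesh[k] / norms[k])
        mesh[coarsest] += 1
    return tuple(mesh)


# Examples: a one-atom cubic cell and a two-atom tetragonal cell (lengths in Å).
cubic = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
tetragonal = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 4.0)]
cubic_mesh = kmesh_from_nkppra(cubic, n_atoms=1, nkppra=6000)
tetragonal_mesh = kmesh_from_nkppra(tetragonal, n_atoms=2, nkppra=6000)
```

Because the bound is expressed per reciprocal atom, larger cells automatically receive coarser meshes, which keeps the sampling density comparable across prototypes of different sizes.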
5.2. Spectral Neighbor Analysis Potential
SNAP26 is used as an energy predictor. As described in Section II B, an ensemble of models is employed for predictions. Equation 2 defines the expression of the function, $E_{\mathrm{SNAP}}$, and combined with eq 1, gives the energy of a system with $N$ atoms. The atomic fingerprints that define the chemical environment of each atom $i$ in the system, belonging to species $\alpha_i$, are the bispectrum components.23 These are used to represent configurations instead of seemingly more obvious choices (e.g., atomic Cartesian coordinates), as they are invariant upon rotation and permutations of identical atoms. Note that invariance with respect to translations is guaranteed by eq 1. For each atom, the vector $\mathbf{B}_i^{\alpha_i}$, which collects the first components up to a maximum index, is taken as a feature for the ML model (ridge regression in the case of SNAP). A short description of the bispectrum components is given below.
The neighborhood of an atom $i$ can be described by a density function, $\rho_i$, centered at that atom, with delta functions at the sites of surrounding atoms, within a sphere of radius $r_{\mathrm{cut}}$. It is defined in three dimensions as

$$\rho_i(\mathbf{r}) = \delta(\mathbf{r}-\mathbf{r}_i) + \sum_{r_{ij}<r_{\mathrm{cut}}} f_c(r_{ij})\, w_{\alpha_j}\, \delta(\mathbf{r}-\mathbf{r}_j) \qquad (5)$$

where the sum is over all atoms within $r_{\mathrm{cut}}$ from the central atom. Here, $\mathbf{r}_i$ is the position of atom $i$, $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$, $w_{\alpha_j}$ is the specie-specific weight of atom $j$, and $f_c$ is a cutoff function that smoothly runs to zero as $r_{ij}$ approaches $r_{\mathrm{cut}}$, as defined in ref (25). In order to represent this density distribution as a vector, it is expanded on a suitable basis. Atomic positions are first mapped onto the 4D sphere, by switching to polar coordinates $(\theta,\phi,r)$ and by defining a third polar angle, $\theta_0$, from the radial coordinate (see ref (23) for details). The density function is then expanded in terms of hyperspherical harmonics $U^{J}_{m,m'}(\theta_0,\theta,\phi)$, the natural basis for expansion on the 4D sphere. Dropping the atomic index, $\rho$ is written as

$$\rho(\mathbf{r}) = \sum_{J=0,\frac{1}{2},\ldots}^{\infty}\ \sum_{m=-J}^{J}\ \sum_{m'=-J}^{J} u^{J}_{m,m'}\, U^{J}_{m,m'}(\theta_0,\theta,\phi) \qquad (6)$$

The hyperspherical harmonic index, $J$, runs in half-integer steps, while $m$ and $m'$ run between $-J$ and $J$ in integer steps. The outer sum is truncated in practice at a value of $J_{\mathrm{max}}$, treated as a hyperparameter. The expansion coefficients, $u^{J}_{m,m'}$, cannot be used as descriptors since they are complex and are not invariant under system rotation. From them, however, the rotationally invariant and real-valued bispectrum components are constructed

$$B_{J_1,J_2,J} = \sum_{m_1,m_1'=-J_1}^{J_1}\ \sum_{m_2,m_2'=-J_2}^{J_2}\ \sum_{m,m'=-J}^{J} \left(u^{J}_{m,m'}\right)^{*} C^{Jm}_{J_1 m_1 J_2 m_2}\, C^{Jm'}_{J_1 m_1' J_2 m_2'}\, u^{J_1}_{m_1,m_1'}\, u^{J_2}_{m_2,m_2'} \qquad (7)$$

Here, $C^{Jm}_{J_1 m_1 J_2 m_2}$ and $C^{Jm'}_{J_1 m_1' J_2 m_2'}$ are the Clebsch–Gordan coefficients, which possess the same symmetry invariances as the system. After taking the nonzero and distinct components, the bispectrum vector is formed, denoted $\mathbf{B}_i^{\alpha_i}$, with atomic and specie indices. The bispectrum components are a highly nonlinear representation of the local atomic coordinates and account for up to four-body interactions. Their complexity is what makes it possible for them to be effectively used together with a simple regressor in SNAP to accurately map structures to energies.
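The length of the bispectrum vector is fixed by the truncation of the angular-momentum index. The sketch below enumerates the distinct index triples using doubled (integer) indices and the ordering J2 ≤ J1 ≤ J, which is the convention we understand LAMMPS to follow; the function name `bispectrum_indices` is hypothetical, and the sketch only counts the components without evaluating them.

```python
def bispectrum_indices(twojmax):
    """Distinct (J1, J2, J) triples for a truncation at Jmax = twojmax / 2.

    Doubled indices are used internally so that half-integer values of the
    angular momentum are handled with integers.
    """
    triples = []
    for j1 in range(twojmax + 1):
        for j2 in range(j1 + 1):
            # Triangle rule: |j1 - j2| <= j <= j1 + j2, in steps of 2 (doubled units).
            for j in range(j1 - j2, min(twojmax, j1 + j2) + 1, 2):
                if j >= j1:  # keep one representative of each symmetric set
                    triples.append((j1 / 2, j2 / 2, j / 2))
    return triples


# Number of features per atom for a few truncations.
feature_counts = {twojmax: len(bispectrum_indices(twojmax)) for twojmax in (2, 4, 6, 8)}
```

The number of features grows quickly with the truncation (5, 14, 30, and 55 components for doubled indices of 2, 4, 6, and 8), which is why the truncation is treated as a hyperparameter that balances descriptor resolution against the risk of overfitting a linear model.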
The fitting, testing, and predictions of the SNAP models used are performed using an in-house Python library built with SCIKIT-LEARN76 and the ASE63 Python libraries. The bispectrum components are computed using LAMMPS.77 The pipeline is built in Python to perform the API download of binary structures and energies from the AFLOWlib78 database and to generate derivative structures from the prototypes using ENUMLIB.53 DFT calculations are managed by using a combination of ASE63 and PYMATGEN.79
Acknowledgments
This work has been supported by the Irish Research Council Advanced Laureate Award (IRCLA/2019/127) and by the Irish Research Council postgraduate program (M.C.). We acknowledge the DJEI/DES/SFI/HEA Irish Centre for High-End Computing (ICHEC) and Trinity Centre for High Performance Computing (TCHPC) for the provision of computational resources. We would like to acknowledge Eve Gilligan for creating the cover illustration.
Data Availability Statement
Data associated with this project, including the AFLOWlib auid for the training data, the AFLOWlib labels of the structures used as prototypes, the parameters of the SNAP models, and the lowest values of the enthalpy of formation found at each composition, are available on the GitHub repository (https://github.com/HugoRossignol/Workflow_Ternary_ConvexHull).
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.3c01391.
Information on the prototype structures: AFLOWlib auid, space group of the parent binary compound, and space group of the prototype structure for the Cu–Ag–Au and Mo–Ta–W systems (PDF)
Author Contributions
This section is written according to the CRediT system. H.R. and M.M. contributed equally to this work. H.R. and M.M. contributed to conceptualization, methodology, software, data curation, formal analysis, investigation, validation, visualization, writing the original draft, as well as reviewing and editing the manuscript. M.C. contributed to conceptualization, software, as well as reviewing and editing the manuscript. S.S. contributed to conceptualization, funding acquisition, project administration, resources, supervision, as well as reviewing and editing the manuscript.
The authors declare no competing financial interest.
Footnotes
By this, we mean all permutations of a stoichiometric ratio. For an X–Y–Z ternary system, 2–1–1 refers to three compositions: X2Y1Z1, X1Y2Z1, and X1Y1Z2.
References
- Materials Genome Initiative. https://www.mgi.govweb/. Accessed on 01/09/2023.
- Curtarolo S.; Hart G. L. W.; Nardelli M. B.; Mingo N.; Sanvito S.; Levy O. The high-throughput highway to computational materials design. Nat. Mater. 2013, 12, 191–201. 10.1038/nmat3568. [DOI] [PubMed] [Google Scholar]
- Sanvito S.; Oses C.; Xue J.; Tiwari A.; Zic M.; Archer T.; Tozman P.; Venkatesan M.; Coey M.; Curtarolo S. Accelerated discovery of new magnets in the Heusler alloy family. Sci. Adv. 2017, 3 (4), e1602241 10.1126/sciadv.1602241. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sarker P.; Harrington T.; Toher C.; Oses C.; Samiee M.; Maria J.-P.; Brenner D. W.; Vecchio K. S.; Curtarolo S. High-entropy high-hardness metal carbides discovered by entropy descriptors. Nat. Commun. 2018, 9 (1), 4980. 10.1038/s41467-018-07160-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kim J. C.; Li X.; Moore C. J.; Bo S.-H.; Khalifah P. G.; Grey C. P.; Ceder G. Analysis of charged state stability for monoclinic LiMnBO3 cathode. Chem. Mater. 2014, 26 (14), 4200–4206. 10.1021/cm5014174. [DOI] [Google Scholar]
- Curtarolo S.; Setyawan W.; Wang S.; Xue J.; Yang K.; Taylor R. H.; Nelson L. J.; Hart G. L.; Sanvito S.; Buongiorno-Nardelli M.; Mingo N.; Levy O. AFLOWLIB.org: A distributed materials properties repository from high-throughput ab initio calculations. Comput. Mater. Sci. 2012, 58, 227–235. 10.1016/j.commatsci.2012.02.002. [DOI] [Google Scholar]
- Jain A.; Ong S. P.; Hautier G.; Chen W.; Richards W. D.; Dacek S.; Cholia S.; Gunter D.; Skinner D.; Ceder G.; Persson K. A. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 2013, 1 (1), 011002. 10.1063/1.4812323. [DOI] [Google Scholar]
- Saal J. E.; Kirklin S.; Aykol M.; Meredig B.; Wolverton C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 2013, 65, 1501–1509. 10.1007/s11837-013-0755-4. [DOI] [Google Scholar]
- Draxl C.; Scheffler M. The NOMAD laboratory: from data sharing to artificial intelligence. J. Phys. Mater. 2019, 2 (3), 036001. 10.1088/2515-7639/ab13bb. [DOI] [Google Scholar]
- Toher C.; Oses C.; Hicks D.; Curtarolo S. Unavoidable disorder and entropy in multi-component systems. npj Comput. Mater. 2019, 5, 69. 10.1038/s41524-019-0206-z. [DOI] [Google Scholar]
- Kim K.; Ward L.; He J.; Krishna A.; Agrawal A.; Wolverton C. Machine-learning-accelerated high-throughput materials screening: Discovery of novel quaternary Heusler compounds. Phys. Rev. Mater. 2018, 2 (12), 123801. 10.1103/PhysRevMaterials.2.123801. [DOI] [Google Scholar]
- Faber F.; Lindmaa A.; Von Lilienfeld O. A.; Armiento R. Crystal structure representations for machine learning models of formation energies. Int. J. Quantum Chem. 2015, 115 (16), 1094–1101. 10.1002/qua.24917. [DOI] [Google Scholar]
- Faber F. A.; Lindmaa A.; Von Lilienfeld O. A.; Armiento R. Machine learning energies of 2 million elpasolite (ABC2D6) crystals. Phys. Rev. Lett. 2016, 117 (13), 135502. 10.1103/PhysRevLett.117.135502. [DOI] [PubMed] [Google Scholar]
- Schmidt J.; Shi J.; Borlido P.; Chen L.; Botti S.; Marques M. A. Predicting the thermodynamic stability of solids combining density functional theory and machine learning. Chem. Mater. 2017, 29 (12), 5090–5103. 10.1021/acs.chemmater.7b00156. [DOI] [Google Scholar]
- Ward L.; Liu R.; Krishna A.; Hegde V. I.; Agrawal A.; Choudhary A.; Wolverton C. Including crystal structure attributes in machine learning models of formation energies via Voronoi tessellations. Phys. Rev. B 2017, 96 (2), 024104. 10.1103/PhysRevB.96.024104.
- Xie T.; Grossman J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 2018, 120 (14), 145301. 10.1103/PhysRevLett.120.145301.
- Park C. W.; Wolverton C. Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. Phys. Rev. Mater. 2020, 4 (6), 063801. 10.1103/PhysRevMaterials.4.063801.
- Schmidt J.; Hoffmann N.; Wang H.-C.; Borlido P.; Carriço P. J. M. A.; Cerqueira T. F.; Botti S.; Marques M. A. Machine-Learning-Assisted Determination of the Global Zero-Temperature Phase Diagram of Materials. Adv. Mater. 2023, 35 (22), 2210788. 10.1002/adma.202210788.
- Bartel C. J.; Trewartha A.; Wang Q.; Dunn A.; Jain A.; Ceder G. A critical examination of compound stability predictions from machine-learned formation energies. npj Comput. Mater. 2020, 6 (1), 97. 10.1038/s41524-020-00362-y.
- Pandey S.; Qu J.; Stevanović V.; St John P.; Gorai P. Predicting energy and stability of known and hypothetical crystals using graph neural network. Patterns 2021, 2 (11), 100361. 10.1016/j.patter.2021.100361.
- Goodall R. E. A.; Lee A. A. Predicting materials properties without crystal structure: Deep representation learning from stoichiometry. Nat. Commun. 2020, 11 (1), 6280. 10.1038/s41467-020-19964-7.
- Jha D.; Ward L.; Paul A.; Liao W.-k.; Choudhary A.; Wolverton C.; Agrawal A. Elemnet: Deep learning the chemistry of materials from only elemental composition. Sci. Rep. 2018, 8 (1), 17593–17613. 10.1038/s41598-018-35934-y.
- Bartók A. P.; Kondor R.; Csányi G. On representing chemical environments. Phys. Rev. B: Condens. Matter Mater. Phys. 2013, 87 (18), 184115. 10.1103/PhysRevB.87.184115.
- Bartók A. P.; Payne M. C.; Kondor R.; Csányi G. Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 2010, 104 (13), 136403. 10.1103/PhysRevLett.104.136403.
- Behler J.; Parrinello M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 2007, 98 (14), 146401. 10.1103/PhysRevLett.98.146401.
- Thompson A. P.; Swiler L. P.; Trott C. R.; Foiles S. M.; Tucker G. J. Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 2015, 285, 316–330. 10.1016/j.jcp.2014.12.018.
- Shapeev A. V. Moment tensor potentials: A class of systematically improvable interatomic potentials. Multiscale Model. Simul. 2016, 14 (3), 1153–1173. 10.1137/15M1054183.
- Drautz R. Atomic cluster expansion for accurate and transferable interatomic potentials. Phys. Rev. B 2019, 99 (1), 014104. 10.1103/PhysRevB.99.014104.
- Domina M.; Patil U.; Cobelli M.; Sanvito S. Cluster expansion constructed over Jacobi-Legendre polynomials for accurate force fields. Phys. Rev. B 2023, 108 (9), 094102. 10.1103/PhysRevB.108.094102.
- Deringer V. L.; Csányi G. Machine learning based interatomic potential for amorphous carbon. Phys. Rev. B 2017, 95 (9), 094203. 10.1103/PhysRevB.95.094203.
- Caro M. A.; Csányi G.; Laurila T.; Deringer V. L. Machine learning driven simulated deposition of carbon films: From low-density to diamondlike amorphous carbon. Phys. Rev. B 2020, 102 (17), 174201. 10.1103/PhysRevB.102.174201.
- Jinnouchi R.; Karsai F.; Kresse G. On-the-fly machine learning force field generation: Application to melting points. Phys. Rev. B 2019, 100 (1), 014105. 10.1103/PhysRevB.100.014105.
- Mortazavi B.; Javvaji B.; Shojaei F.; Rabczuk T.; Shapeev A. V.; Zhuang X. Exceptional piezoelectricity, high thermal conductivity and stiffness and promising photocatalysis in two-dimensional MoSi2N4 family confirmed by first-principles. Nano Energy 2021, 82, 105716. 10.1016/j.nanoen.2020.105716.
- Mortazavi B.; Silani M.; Podryabinkin E. V.; Rabczuk T.; Zhuang X.; Shapeev A. V. First-principles multiscale modeling of mechanical properties in graphene/borophene heterostructures empowered by machine-learning interatomic potentials. Adv. Mater. 2021, 33 (35), 2102807. 10.1002/adma.202102807.
- Chen C.; Deng Z.; Tran R.; Tang H.; Chu I.-H.; Ong S. P. Accurate force field for molybdenum by machine learning large materials data. Phys. Rev. Mater. 2017, 1 (4), 043603. 10.1103/PhysRevMaterials.1.043603.
- Gubaev K.; Podryabinkin E. V.; Hart G. L.; Shapeev A. V. Accelerating high-throughput searches for new alloys with active learning of interatomic potentials. Comput. Mater. Sci. 2019, 156, 148–156. 10.1016/j.commatsci.2018.09.031.
- Bernstein N.; Csányi G.; Deringer V. L. De novo exploration and self-guided learning of potential-energy surfaces. npj Comput. Mater. 2019, 5 (1), 99. 10.1038/s41524-019-0236-6.
- Artrith N.; Urban A.; Ceder G. Constructing first-principles phase diagrams of amorphous LixSi using machine-learning-assisted sampling with an evolutionary algorithm. J. Chem. Phys. 2018, 148 (24), 241711. 10.1063/1.5017661.
- Kharabadze S.; Thorn A.; Koulakova E. A.; Kolmogorov A. N. Prediction of stable Li-Sn compounds: boosting ab initio searches with neural network potentials. npj Comput. Mater. 2022, 8 (1), 136. 10.1038/s41524-022-00825-4.
- Seko A. Machine learning potentials for multicomponent systems: The Ti-Al binary system. Phys. Rev. B 2020, 102 (17), 174104. 10.1103/PhysRevB.102.174104.
- Ekström Kelvinius F.; Armiento R.; Lindsten F. Graph-based machine learning beyond stable materials and relaxed crystal structures. Phys. Rev. Mater. 2022, 6 (3), 033801. 10.1103/physrevmaterials.6.033801.
- Wang R.; Xia W.; Slade T. J.; Fan X.; Dong H.; Ho K.-M.; Canfield P. C.; Wang C. Z. Machine learning guided discovery of ternary compounds involving La and immiscible Co and Pb elements. npj Comput. Mater. 2022, 8 (1), 258. 10.1038/s41524-022-00950-0.
- Law J. N.; Pandey S.; Gorai P.; St John P. C. Upper-Bound Energy Minimization to Search for Stable Functional Materials with Graph Neural Networks. JACS Au 2022, 3, 113–123. 10.1021/jacsau.2c00540.
- Chen C.; Ong S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2022, 2 (11), 718–728. 10.1038/s43588-022-00349-3.
- Bisbo M. K.; Hammer B. Global optimization of atomic structure enhanced by machine learning. Phys. Rev. B 2022, 105 (24), 245404. 10.1103/PhysRevB.105.245404.
- Paleico M. L.; Behler J. Global optimization of copper clusters at the ZnO(101̅0) surface using a DFT-based neural network potential and genetic algorithms. J. Chem. Phys. 2020, 153 (5), 054704. 10.1063/5.0014876.
- Yamashita T.; Sato N.; Kino H.; Miyake T.; Tsuda K.; Oguchi T. Crystal structure prediction accelerated by Bayesian optimization. Phys. Rev. Mater. 2018, 2 (1), 013803. 10.1103/PhysRevMaterials.2.013803.
- Podryabinkin E. V.; Tikhonov E. V.; Shapeev A. V.; Oganov A. R. Accelerating crystal structure prediction by machine-learning interatomic potentials with active learning. Phys. Rev. B 2019, 99 (6), 064114. 10.1103/PhysRevB.99.064114.
- Deringer V. L.; Pickard C. J.; Csányi G. Data-driven learning of total and local energies in elemental boron. Phys. Rev. Lett. 2018, 120 (15), 156001. 10.1103/PhysRevLett.120.156001.
- Pickard C. J. Ephemeral data derived potentials for random structure search. Phys. Rev. B 2022, 106 (1), 014102. 10.1103/PhysRevB.106.014102.
- Tong Q.; Xue L.; Lv J.; Wang Y.; Ma Y. Accelerating CALYPSO structure prediction by data-driven learning of a potential energy surface. Faraday Discuss. 2018, 211, 31–43. 10.1039/C8FD00055G.
- Minotakis M.; Rossignol H.; Cobelli M.; Sanvito S. Machine-learning surrogate model for accelerating the search of stable ternary alloys. Phys. Rev. Mater. 2023, 7 (9), 093802. 10.1103/PhysRevMaterials.7.093802.
- Hart G. L.; Forcade R. W. Algorithm for generating derivative structures. Phys. Rev. B: Condens. Matter Mater. Phys. 2008, 77 (22), 224115. 10.1103/PhysRevB.77.224115.
- Hart G. L.; Nelson L. J.; Forcade R. W. Generating derivative structures at a fixed concentration. Comput. Mater. Sci. 2012, 59, 101–107. 10.1016/j.commatsci.2012.02.015.
- Kusoffsky A. Thermodynamic evaluation of the ternary Ag–Au–Cu system—including a short range order description. Acta Mater. 2002, 50 (20), 5139–5145. 10.1016/S1359-6454(02)00382-8.
- Hicks D.; Oses C.; Gossett E.; Gomez G.; Taylor R. H.; Toher C.; Mehl M. J.; Levy O.; Curtarolo S. AFLOW-SYM: platform for the complete, automatic and self-consistent symmetry analysis of crystals. Acta Crystallogr., Sect. A: Found. Adv. 2018, 74 (3), 184–203. 10.1107/S2053273318003066.
- Hart G. L. W.; Forcade R. W. Generating derivative structures from multilattices: Algorithm and application to hcp alloys. Phys. Rev. B: Condens. Matter Mater. Phys. 2009, 80 (1), 014120. 10.1103/PhysRevB.80.014120.
- Kohn W. Density functional and density matrix method scaling linearly with the number of atoms. Phys. Rev. Lett. 1996, 76 (17), 3168–3171. 10.1103/PhysRevLett.76.3168.
- Prodan E.; Kohn W. Nearsightedness of electronic matter. Proc. Natl. Acad. Sci. U.S.A. 2005, 102 (33), 11635–11638. 10.1073/pnas.0505436102.
- Lunghi A.; Sanvito S. A unified picture of the covalent bond within quantum-accurate force fields: From organic molecules to metallic complexes’ reactivity. Sci. Adv. 2019, 5, eaaw2210. 10.1126/sciadv.aaw2210.
- Zuo Y.; Chen C.; Li X.; Deng Z.; Chen Y.; Behler J.; Csányi G.; Shapeev A. V.; Thompson A. P.; Wood M. A.; Ong S. P. Performance and cost assessment of machine learning interatomic potentials. J. Phys. Chem. A 2020, 124 (4), 731–745. 10.1021/acs.jpca.9b08723.
- Li X.-G.; Hu C.; Chen C.; Deng Z.; Luo J.; Ong S. P. Quantum-accurate spectral neighbor analysis potential models for Ni-Mo binary alloys and fcc metals. Phys. Rev. B 2018, 98 (9), 094104. 10.1103/PhysRevB.98.094104.
- Hjorth Larsen A.; Jørgen Mortensen J.; Blomqvist J.; Castelli I. E.; Christensen R.; Dułak M.; Friis J.; Groves M. N.; Hammer B.; Hargus C.; Hermes E. D.; Jennings P. C.; Bjerre Jensen P.; Kermode J.; Kitchin J. R.; Leonhard Kolsbjerg E.; Kubal J.; Kaasbjerg K.; Lysgaard S.; Bergmann Maronsson J.; Maxson T.; Olsen T.; Pastewka L.; Peterson A.; Rostgaard C.; Schiøtz J.; Schütt O.; Strange M.; Thygesen K. S.; Vegge T.; Vilhelmsen L.; Walter M.; Zeng Z.; Jacobsen K. W. The atomic simulation environment—a Python library for working with atoms. J. Phys.: Condens. Matter 2017, 29 (27), 273002. 10.1088/1361-648x/aa680e.
- Prince A. Phase Diagrams of Ternary Gold Alloys; Institute of Metals, 1990; pp 7–42.
- Barber C. B.; Dobkin D. P.; Huhdanpaa H. The quickhull algorithm for convex hulls. ACM Trans. Math. Software 1996, 22 (4), 469–483. 10.1145/235815.235821.
- Kresse G.; Furthmüller J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 1996, 6 (1), 15–50. 10.1016/0927-0256(96)00008-0.
- Calderon C. E.; Plata J. J.; Toher C.; Oses C.; Levy O.; Fornari M.; Natan A.; Mehl M. J.; Hart G.; Buongiorno Nardelli M.; Curtarolo S. The AFLOW standard for high-throughput materials science calculations. Comput. Mater. Sci. 2015, 108, 233–238. 10.1016/j.commatsci.2015.07.019.
- Sun W.; Dacek S. T.; Ong S. P.; Hautier G.; Jain A.; Richards W. D.; Gamst A. C.; Persson K. A.; Ceder G. The thermodynamic scale of inorganic crystalline metastability. Sci. Adv. 2016, 2, e1600225. 10.1126/sciadv.1600225.
- Taylor R. H.; Rose F.; Toher C.; Levy O.; Yang K.; Buongiorno Nardelli M.; Curtarolo S. A RESTful API for exchanging materials data in the AFLOWLIB.org consortium. Comput. Mater. Sci. 2014, 93, 178–192. 10.1016/j.commatsci.2014.05.014.
- Deringer V. L.; Bartók A. P.; Bernstein N.; Wilkins D. M.; Ceriotti M.; Csányi G. Gaussian process regression for materials and molecules. Chem. Rev. 2021, 121 (16), 10073–10141. 10.1021/acs.chemrev.1c00022.
- Bayerl D.; Andolina C. M.; Dwaraknath S.; Saidi W. A. Convergence acceleration in machine learning potentials for atomistic simulations. Digital Discovery 2022, 1 (1), 61–69. 10.1039/D1DD00005E.
- Rostoker W. A Study of Ternary Phase Diagrams of Tungsten and Tantalum; Battelle Memorial Inst. Defense Metals Information Center: Columbus, OH, 1963.
- Cao W.; Chang Y.; Zhu J.; Chen S.; Oates W. Thermodynamic modeling of the Cu–Ag–Au system using the cluster/site approximation. Intermetallics 2007, 15 (11), 1438–1446. 10.1016/j.intermet.2007.05.003.
- Peterson G. G.; Brgoch J. Materials discovery through machine learning formation energy. J. Phys.: Energy 2021, 3 (2), 022002. 10.1088/2515-7655/abe425.
- Perdew J. P.; Burke K.; Ernzerhof M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865–3868. 10.1103/PhysRevLett.77.3865.
- Pedregosa F.; Varoquaux G.; Gramfort A.; Michel V.; Thirion B.; Grisel O.; Blondel M.; Prettenhofer P.; Weiss R.; Dubourg V.; Vanderplas J.; Passos A.; Cournapeau D.; Brucher M.; Perrot M.; Duchesnay E. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Plimpton S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 1995, 117 (1), 1–19. 10.1006/jcph.1995.1039.
- Oses C.; Gossett E.; Hicks D.; Rose F.; Mehl M. J.; Perim E.; Takeuchi I.; Sanvito S.; Scheffler M.; Lederer Y.; Levy O.; Toher C.; Curtarolo S. AFLOW-CHULL: cloud-oriented platform for autonomous phase stability analysis. J. Chem. Inf. Model. 2018, 58 (12), 2477–2490. 10.1021/acs.jcim.8b00393.
- Ong S. P.; Richards W. D.; Jain A.; Hautier G.; Kocher M.; Cholia S.; Gunter D.; Chevrier V. L.; Persson K. A.; Ceder G. Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis. Comput. Mater. Sci. 2013, 68, 314–319. 10.1016/j.commatsci.2012.10.028.
Data Availability Statement
Data associated with this project, including the AFLOWlib auid for the training data, the AFLOWlib labels of the structures used as prototypes, the parameters of the SNAP models, and the lowest values of the enthalpy of formation found at each composition, are available in the GitHub repository (https://github.com/HugoRossignol/Workflow_Ternary_ConvexHull).




