Skip to main content
Scientific Data logoLink to Scientific Data
. 2022 Mar 2;9:64. doi: 10.1038/s41597-022-01177-w

A dataset of 175k stable and metastable materials calculated with the PBEsol and SCAN functionals

Jonathan Schmidt 1, Hai-Chen Wang 1, Tiago F T Cerqueira 2, Silvana Botti 3, Miguel A L Marques 1,
PMCID: PMC8891291  PMID: 35236866

Abstract

In the past decade we have witnessed the appearance of large databases of calculated material properties. These are most often obtained with the Perdew-Burke-Ernzerhof (PBE) functional of density-functional theory, a well established and reliable technique that is by now the standard in materials science. However, there have been recent theoretical developments that allow for increased accuracy in the calculations. Here, we present a dataset of calculations for 175k crystalline materials obtained with two functionals: geometry optimizations are performed with PBE for solids (PBEsol) that yields consistently better geometries than the PBE functional, and energies are obtained from PBEsol and from SCAN single-point calculations at the PBEsol geometry. Our results provide an accurate overview of the landscape of stable (and nearly stable) materials, and as such can be used for reliable predictions of novel compounds. They can also be used for training machine learning models, or even for the comparison and benchmark of PBE, PBEsol, and SCAN.

Subject terms: Theory and computation, Condensed-matter physics


Measurement(s) optimized geometry (PBESol) • total energy (PBESol, Scan) • bandgap (PBESol, Scan)
Technology Type(s) Density functional theory (VASP)
Factor Type(s) Exchange correlation functional • Crystal structure

Background & Summary

The search for new materials remains one of the most important quests but, unfortunately, also one of the great challenges of materials science. Nowadays, data-driven searching strategies have become the most cost-effective methods to tackle this problem, and the fastest way of finding new materials or study their properties are computational high-throughput searches. After years of data accumulation, there are millions of calculations of materials available in open databases1,2 that are used as an invaluable reservoir to select and filter promising candidates for further experimental synthesis and characterization.

These high-throughput studies in solid-state material science36 have broadened the exploration of the vast chemical space, while plenty of works have successfully found and predicted promising materials for technological applications. However, nearly all high throughput searches rely on the use of density functional theory (DFT) within the Perdew-Burke-Ernzerhof (PBE) approximation to the exchange-correlation functional7. This is a well established and reliable approach that earned its place as the standard technique in solid-state research. However, the PBE functional is now over 25 years old, and more recent (and accurate) functional have by now been proposed in the literature. For example, the Armiento-Mattson 20058 or the PBE for solids9 functionals consistently lead to superior geometries10,11, while the SCAN meta-generalized gradient approximation12 yields formation energies that are on average better by a factor of two than the PBE13. Unfortunately, and in stark contrast with the abundance of PBE data, there are no available comparable large scale datasets calculated with these improved functionals.

There are a number of other databases that use either higher accuracy methods, like G0 W0, or apply density-functionals different from PBE. For example, we can mention the Computational 2D Materials Database14,15 that provides a dataset of 4000 2D materials calculated with HSE, G0 W0, RPA and the Bethe-Salpeter equation, while JARVIS contains a large number of calculations with vdW corrections using OptB88vdW16,17 and performed with the modified Becke-Johnson potential1820.

In a previous work21 we combined data from the AFLOW database1, the Materials Project2 and from our own group to create a rather complete convex hull of thermodynamic stability at the PBE level. The details of the selection of the dataset can be found in ref. 21. Specifically, we selected all materials that were calculated with the same functional, pseudopotential, as well as U-parameters used in the Materials Project. We removed duplicates, i.e. entries with the same space group, composition and total energy (rounded to the 4th digit). Here we determined the space group using pymatgen with the “symprec” keyword set to 0.1. From the AFLOW database we further removed all prototypes labeled “_DEVIL_PROTOTYPES_” and all other combinations of prototypes and pseudopotentials that are noted as ill-converged in the code of ref. 22. As AFLOW is still known to contain outliers22, we also removed them from the calculation of the hull following a strategy similar to the one explained in ref. 22. We used the total energies of all the remaining structures to calculate the convex hulls applying the corrections from the materials project workflow to the energies.

From this dataset23 we selected around 175k compounds that were either stable (i.e., on the convex hull), or close to stable (within 100 meV/atom of the hull). These were then reoptimized with PBEsol9. Finally, the total energies were reevaluated with SCAN12 to create highly accurate formation energies and convex hulls.

Methods

Our starting point was the dataset used in the machine learning study of ref. 21. This included PBE calculations stemming from the Materials Project database2, AFLOW1, and our own calculations. These were then filtered to obtain a homogeneous set in what regards the calculation parameters, leading to a dataset containing more than two million compounds. We then constructed the convex hull of thermodynamic stability and extracted entries that were either on the hull or within 0.1 eV/atom24,25. The reason for the choice of this cutoff was twofold. First, its value is still below the average error in the formation energies calculated with PBE26,27, but is larger than the estimated error in the distances to the convex hull28,29. As such, the compounds that were misidentified by the PBE as thermodynamically unstable are likely to be included in the set. Second, there are a number of materials that are metastable, but experimentally accessible (for example for compositions that have more than one polymorph). We can reasonably expect that the cutoff allows for the inclusion of the majority of these cases. We also eliminated materials with unit cells that were too large for our computational resources, leading to a final amount of ~175k compounds.

All calculations were performed using density-functional theory, within the projector augmented wave method (PAW)30 as implemented in the Vienna ab initio simulation package (VASP)31,32. We used the PAW setups shipped with version 5.4 of vasp that include information on the kinetic energy density of the core electrons. We mostly followed the recommendations of the Materials Project for the choice of the pseudopotentials. The exception was Cs, for which we used an improved pseudopotential generated by the vasp developers, as the stock PAW setup often led to negative densities during the self-consistent cycle, crashing the calculation. All calculations were performed taking into account spin-polarization, and started from a ferromagnetic configuration (as in the large majority of high-throughput studies). This most likely leads to an incorrect spin configuration for antiferromagnetic systems, resulting in an energy higher than the true groud-state. However, it is well known that in most cases magnetic exchange energies are rather small, so the error in the total energy is limited to a few tens of meV/atom. Methfessel-Paxton order one smearing with a width of 0.2 eV was applied in the integration of the Brillouin zone.

The structures were optimized using the PBEsol9 approximation, the “High” precision keyword of vasp, and a Γ-centered k-point grid with 2000 k-points per reciprocal atom, until the forces on the atoms were below 5 meV/Å. To detect the structures that required re-optimization we checked that the the absolute value of the forces (in meV/Å) and the individual components of the stress tensor (in meV/Å3), calculated with PBEsol with stricter convergence criteria (520 eV for the wave-function cutoff and 8000 k-points per reciprocal atom), remained smaller than 0.05. If it was not the case, we increased both the energy cutoff and the number of k-points and performed again the geometry optimization. Around 3% of all structures required this further calculation step. We note that as we had already a reasonable starting point, specifically the PBE geometry, the geometry optimization required a relatively small number of steps in most cases.

A final energy evaluation with the SCAN meta-GGA functional was performed with a cutoff of 520 eV, 8000 k-points per reciprocal atom, and including the non-spherical contributions from the gradient corrections inside the PAW spheres. As expected, SCAN calculations were much more unstable than PBEsol, due to the well known numerical instabilities of this functional33,34, which led to a much lower average convergence rate. In any case, we succeeded in converging nearly all calculations. In total, the calculations presented here required around 10 million CPU hours.

Data Records

The output files of vasp were collected and processed with the pymatgen library35. Each final data record consisted of a ComputedStructureEntry that included the chemical composition, the total energy, and the detailed crystal structure of the entry. Two files, containing all entries for each functional, can be freely downloaded from the Materials Cloud repository36 and can be loaded trivially using the json module of python. For convenience, we also provide a summary of the data in tabulated form, that include the fields listed in Table 1.

Table 1.

Fields included in the summary of the data in tabulated form.

Composition Chemical composition of each material.
Number of sites Number of atoms in the unit cell.
EPBE Total energy per atom calculated with PBE (extracted from the primary dataset)
EPBEsol Total energy per atom calculated with PBEsol.
ESCAN Total energy per atom calculated with SCAN.
VPBE Volume per atom of the PBE unit cell (extracted from the primary dataset).
VPBEsol Volume per atom of the PBEsol unit cell.
GapPBEsol Band gap calculated with PBEsol.
GapSCAN Band gap calculated with SCAN.
MPBEsol Total magnetic moment per atom calculated with PBEsol.
MSCAN Total magnetic moment per atom calculated with SCAN.
SSCAN The diagonal elements of the stress tensor calculated with SCAN.

In panel a of Fig. 1 we plot the histogram of the number of different chemical elements in our materials. We can clearly see that the dataset is dominated by ternary compounds, followed by binary and quaternary. Relatively few multinary materials with more than 4 different chemical elements are present. We can understand this distribution by considering that the number of permutations increases very rapidly with the number of chemical elements, explaining, e.g., why we have many more binaries than elementary substances or ternary than binary compounds. However, we have to keep in mind that the complexity of the unit cells also increases and that most of the systematic high-throughput DFT studies were performed for ternary systems3742. This can also be seen in panel (b) of Fig. 1 where we show a histogram of the number of atoms in the primitive unit cell. The distribution is dominated by a peak centered around five atoms per unit cell, arising from the materials stemming from the high-throughput studies of AFLOW1 and from our group3,37,38. Obviously, we do not expect that the true distribution of all possible thermodynamically stable and metastable materials follows the behavior depicted in Fig. 1. Finally, in panels (c) and (d) of the same figure, we plot an histogram of the number of materials as a function of the space group index and of the crystal system. We see that the most represented systems are the trigonal and orthorhombic, while the tetragonal and triclinic are the least represented.

Fig. 1.

Fig. 1

Distribution of (a) number of different chemical elements per unit cell, (b) number of atoms per unit cell, (c) index of space groups, and (d) crystal systems for all the materials in our dataset.

In Fig. 2 we show the distribution of chemical elements for the calculated structures. We considered all elements except noble gases up to bismuth as well as most lanthanides, and actinides up to plutonium. We can observe some obvious trends. Not surprisingly, the most common element is oxygen due to the abundance of oxides in our planet and their stability in our oxygen-rich atmosphere. The remaining chalcogens (S, Se, Te) are all equally represented, which is probably an indication of their chemical similarity. For the halogen family we see a decreasing number of materials following the decrease of the electronegativity, with iodine compounds being half as abundant as fluorides. For the pnictogens we witness exactly the opposite behavior, with much fewer nitrides than compounds containing antimony. This can be understood from the high chemical stability of the N2 molecule and the rather high nominal oxidation state of nitrogen (−3) that hinders the synthesis of nitrides. Finally carbon-containing materials are rather scarce due to the absence of organic compounds in our dataset.

Fig. 2.

Fig. 2

Periodic table depicting the chemical elements present in our dataset. The number beneath the chemical symbol is the number of materials present in the dataset that contain the given element.

From the metals, the two with the highest number of compounds are aluminum and lithium. The former is within a cluster of highly represented chemical elements (such as nickel, copper, zinc, or indium). The exception in this region of the periodic table is gallium, in spite of its importance in many semiconductors used in electronics and optoelectronics. Lithium compounds, on the other hand, are much more common in our dataset than materials containing any other alkali element, which might be explained by the popularity of studies in lithium compounds in view of their application in battery technologies. Interestingly, in contrast to the other alkali earth elements, beryllium appears in relatively few compounds, probably due to its high toxicity. Another region that exhibits relatively few compounds is the one centered in vanadium and that includes rhenium, hafnium, niobium, etc. We can observe a continuity in the values in this region that might indicate that these chemical elements, often exhibiting the very high oxidation numbers of +4, +5 or +6, have more difficulty producing stable compounds than other metals with lower oxidation numbers. Finally, the least represented groups are unsurprisingly the lanthanides and the actinides, showing how little we know the chemistry of these chemical elements that are essential in a multitude of technologies, such as in hard magnets or in the storage of nuclear waste.

It is also interesting to look at the structural diversity in our dataset. With that objective, we used pymatgen to divide our structures into groups according to structural similarity. In total, our data turned out to contain 24706 prototypes of 4557 different generic compositions. However, and as expected, the distribution of compounds among these prototypes was rather unbalanced. For example, 15399 prototypes only appeared once in the dataset and 2412 only twice. On the other hand, most materials belong to just a few prototypes. The most common was by far the Heusler family of compounds, with almost 12 000 compositions, followed by the double perovskite family with more than 5000 elements. Of course, these numbers reflect not only the chemical stability and size of each one of the families, but also the interest of the community for these compounds.

Technical Validation

In the left panel of Fig. 3 we plot the distribution of the volumes per atom with the PBE (obtained from the primary data) and PBEsol. Both curves look similar, with a peak at around 15 Å3 and skewed towards larger volumes. We can also observe that the PBE data is shifted toward larger volumes with respect to PBEsol. This is due to the well known underbinding of the PBE43 that leads to lattice constants that are larger than their experimental values by 2–3% (and consequently volumes that are overestimated by ~10%). This underbinding is almost totally corrected by PBEsol, yielding smaller volumes in much better agreement with experiment. In the right panel of the same figure we plot a histogram of the three diagonal components of the stress tensor calculated with SCAN at the PBEsol geometry. We can see that the calculations yield rather small stresses, showing that the structures are close to mechanical equilibrium. This is expected because SCAN, like PBEsol, yields high quality geometries in good agreement with experiment10,44. This also validates our pragmatic approach of using a single point SCAN calculation at the PBEsol geometries.

Fig. 3.

Fig. 3

Left: Distribution and scatter plots of volumes per atom calculated with PBE (from the primary data) and PBEsol functionals. The width of the bins is 0.35 Å3/atom. Right: distribution and scatter plots of the diagonal components of the stress tensor calculated with PBEsol and SCAN at PBEsol geometries. The width of the bins is 0.6 meV/Å3.

In Fig. 4 we show the distribution of (indirect) band gaps calculated with both PBEsol and SCAN. As expected, the curves decay monotonically with the value of the gap, and exhibit a fat tail that extends beyond 10 eV. It is known that PBEsol, as well as PBE, strongly underestimate the value of the band gaps by essentially a factor of two, leading to mean absolute percentage errors bordering the 50%45,46 with respect to experiment. This is to some extent corrected by SCAN, that increases consistently the band gaps leading to a mean absolute percentage error of around 40%45,46. This increase is evident from Fig. 4, where the SCAN band gap distribution is shifted to the right with respect to PBEsol.

Fig. 4.

Fig. 4

Left: Distribution and scatter plots of the (in)direct band gaps calculated with PBEsol and SCAN at the PBEsol geometry. The width of the bins is 0.1 eV. Right: Distribution and scatter plots of the distances to the convex hull calculated with PBEsol and SCAN at PBEsol geometries. The corresponding hulls contain 40246 and 38692 materials. The width of the bins is 2 meV/atom.

Finally, we performed an analysis of the convex hulls obtained with PBE, PBEsol, and SCAN. The hulls contained 36985, 40246 and 38692 materials respectively. These differences are expected, and mostly stem from relatively small changes in the formation energy for compounds that were close to the hull. The distributions of the distances to the convex hull of thermodynamic stability are plotted in the right panel of Fig. 4. For this plot, we always removed the compound under study from the hull, allowing therefore for negative distances. Although this allows for a better interpretation of the results, we should remember that technically speaking all compounds with negative distances should be placed strictly at zero. Both distributions grow fast for negative distances to the hull, having a marked peak at zero. The main contributors to this behavior are the experimental compounds. The number of materials with positive distance to the hull is relatively constant, until reaching our cutoff at ±0.2 eV/atom. This cutoff is obviously smeared for the PBEsol and SCAN calculations.

As an illustration, we plot in Fig. 5 the ternary phase diagrams of Li–Al–Cu and Mg–Sc–Zn. We see that the three functionals agree to a large extent on which are the stable materials. However, some differences are also clear. For example, MgZn is stable with the PBE functional but not with PBEsol or SCAN, or LiAl3 is unstable with SCAN but stable with the other two functionals. In any case, compounds that are stable with one functional do appear stable, or at least metastable with the other functionals. Of course, SCAN is the most accurate of the three functionals in what concerns formation energies and distances to the convex hull, so the SCAN diagrams should on average have the highest accuracy.

Fig. 5.

Fig. 5

Ternary phase diagrams of the Li–Al–Cu (upper panel) and Mg–Sc–Zn (lower panel) systems, calculated with the PBE (left), PBEsol (middle) and SCAN (right). The blue points indicate compositions on the convex hull, while red points denote materials that are within 50 meV/atom from the hull.

Usage Notes

The data can be downloaded from the Materials Cloud repository36. The energies, compositions, and structures for each material are formatted as ComputedStructureEntries and stored as compressed json files. They can therefore be trivially loaded in python and analyzed with pymatgen. We note that we used version v2019.10.2 of pymatgen, but the data should be compatible with other versions.

Acknowledgements

The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer SuperMUC-NG at the Leibniz Supercomputing Centre under the project pn68wa. We would also like to thank Georg Kresse for providing us with an improved PAW setup for Cs.

Author contributions

J.S., H.W. and M.A.L.M. performed the calculations. All authors contributed to the analysis of the results and to the writing the manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Code availability

All data can be easily processed with publicly available tools such as json and pymatgen35. An example usage is provided with the data. The dataset was generated with VASP, the bash and python scripts to generate input files or manage the output files can be downloaded from github repository: https://github.com/hyllios/utils/tree/main/ht_pd_scan.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Curtarolo S, et al. AFLOW: An automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 2012;58:218–226. doi: 10.1016/j.commatsci.2012.02.005. [DOI] [Google Scholar]
  • 2.Jain A, et al. Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Mater. 2013;1:011002. doi: 10.1063/1.4812323. [DOI] [Google Scholar]
  • 3.Körbel S, Marques MAL, Botti S. Stable hybrid organic–inorganic halide perovskites for photovoltaics from ab initio high-throughput calculations. J. Mater. Chem. A. 2018;6:6463–6475. doi: 10.1039/c7ta08992a. [DOI] [Google Scholar]
  • 4.Graužinytė M, Botti S, Marques MAL, Goedecker S, Flores-Livas JA. Computational acceleration of prospective dopant discovery in cuprous iodide. Phys. Chem. Chem. Phys. 2019;21:18839–18849. doi: 10.1039/c9cp02711d. [DOI] [PubMed] [Google Scholar]
  • 5.Flores-Livas JA, Sarmiento-Pérez R, Botti S, Goedecker S, Marques MAL. Rare-earth magnetic nitride perovskites. JPhys Mater. 2019;2:025003. doi: 10.1088/2515-7639/ab083e. [DOI] [Google Scholar]
  • 6.Wang H-C, Pistor P, Marques MAL, Botti S. Double perovskites as p-type conducting transparent semiconductors: a high-throughput search. J. Mater. Chem. A. 2019;7:14705–14711. doi: 10.1039/c9ta01456j. [DOI] [Google Scholar]
  • 7.Perdew JP, Burke K, Ernzerhof M. Generalized gradient approximation made simple. Phys. Rev. Lett. 1996;77:3865. doi: 10.1103/PhysRevLett.77.3865. [DOI] [PubMed] [Google Scholar]
  • 8.Armiento R, Mattsson AE. Functional designed to include surface effects in self-consistent density functional theory. Phys. Rev. B. 2005;72:085108. doi: 10.1103/PhysRevB.72.085108. [DOI] [Google Scholar]
  • 9.Perdew JP, et al. Restoring the density-gradient expansion for exchange in solids and surfaces. Phys. Rev. Lett. 2008;100:136406. doi: 10.1103/PhysRevLett.100.136406. [DOI] [PubMed] [Google Scholar]
  • 10.Csonka GI, et al. Assessing the performance of recent density functionals for bulk solids. Phys. Rev. B. 2009;79:155107. doi: 10.1103/PhysRevB.79.155107. [DOI] [Google Scholar]
  • 11.Tran F, Stelzl J, Blaha P. Rungs 1 to 4 of DFT Jacob’s ladder: Extensive test on the lattice constant, bulk modulus, and cohesive energy of solids. J. Chem. Phys. 2016;144:204120. doi: 10.1063/1.4948636. [DOI] [PubMed] [Google Scholar]
  • 12.Sun J, Ruzsinszky A, Perdew JP. Strongly constrained and appropriately normed semilocal density functional. Phys. Rev. Lett. 2015;115:036402. doi: 10.1103/PhysRevLett.115.036402. [DOI] [PubMed] [Google Scholar]
  • 13.Zhang, Y. et al. Efficient first-principles prediction of solid stability: Towards chemical accuracy. Npj Comput. Mater. 4, 10.1038/s41524-018-0065-z (2018).
  • 14.Haastrup S, et al. The computational 2D materials database: high-throughput modeling and discovery of atomically thin crystals. 2D Materials. 2018;5:042002. doi: 10.1088/2053-1583/aacfc1. [DOI] [Google Scholar]
  • 15.Gjerding MN, et al. Recent progress of the computational 2d materials database (c2db) 2D Materials. 2021;8:044002. doi: 10.1088/2053-1583/ac1059. [DOI] [Google Scholar]
  • 16.Thonhauser T, et al. Van der waals density functional: Self-consistent potential and the nature of the van der waals bond. Phys. Rev. B. 2007;76:125112. doi: 10.1103/PhysRevB.76.125112. [DOI] [Google Scholar]
  • 17.Klimeš JCV, Bowler DR, Michaelides A. Van der waals density functionals applied to solids. Phys. Rev. B. 2011;83:195131. doi: 10.1103/PhysRevB.83.195131. [DOI] [Google Scholar]
  • 18.Choudhary, K. et al. The joint automated repository for various integrated simulations (JARVIS) for data-driven materials design. npj Computational Materials6, 10.1038/s41524-020-00440-1 (2020).
  • 19.Choudhary, K. et al. Computational screening of high-performance optoelectronic materials using OptB88vdw and TB-mBJ formalisms. Scientific Data5, 10.1038/sdata.2018.82 (2018). [DOI] [PMC free article] [PubMed]
  • 20.Choudhary K, Cheon G, Reed E, Tavazza F. Elastic properties of bulk and low-dimensional materials using van der Waals density functional. Phys. Rev. B. 2018;98:014107. doi: 10.1103/PhysRevB.98.014107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Schmidt J, Pettersson L, Verdozzi C, Botti S, Marques MAL. Crystal graph attention networks for the prediction of stable materials. Science Advances. 2021;7:eabi7948. doi: 10.1126/sciadv.abi7948. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Oses C, et al. Aflow-chull: Cloud-oriented platform for autonomous phase stability analysis. J. Chem. Inf. Model. 2018;58(12):2477–2490. doi: 10.1021/acs.jcim.8b00393. [DOI] [PubMed] [Google Scholar]
  • 23.Schmidt J, Pettersson L. 2021. Crystal-graph attention networks for the prediction of stable materials. Materials Cloud. https://archive.materialscloud.org/record/2021.222 [DOI] [PMC free article] [PubMed]
  • 24.Emery, A. A. & Wolverton, C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci. Data4, 10.1038/sdata.2017.153 (2017). [DOI] [PMC free article] [PubMed]
  • 25.Sun W, et al. The thermodynamic scale of inorganic crystalline metastability. Sci. Adv. 2016;2:e1600225. doi: 10.1126/sciadv.1600225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Sarmiento-Pérez R, Botti S, Marques MAL. Optimized exchange and correlation semilocal functional for the calculation of energies of formation. J. Chem. Theory Comput. 2015;11:3844–3850. doi: 10.1021/acs.jctc.5b00529. [DOI] [PubMed] [Google Scholar]
  • 27.Stevanović V, Lany S, Zhang X, Zunger A. Correcting density functional theory for accurate predictions of compound enthalpies of formation: Fitted elemental-phase reference energies. Phys. Rev. B. 2012;85:115104. doi: 10.1103/PhysRevB.85.115104. [DOI] [Google Scholar]
  • 28.Hautier G, Ong SP, Jain A, Moore CJ, Ceder G. Accuracy of density functional theory in predicting formation energies of ternary oxides from binary oxides and its implication on phase stability. Phys. Rev. B. 2012;85:155208. doi: 10.1103/PhysRevB.85.155208. [DOI] [Google Scholar]
  • 29.Bartel CJ, Weimer AW, Lany S, Musgrave CB, Holder AM. The role of decomposition reactions in assessing first-principles predictions of solid stability. Npj Comput. Mater. 2019;5:1–9. doi: 10.1038/s41524-018-0143-2. [DOI] [Google Scholar]
  • 30.Blöchl PE. Projector augmented-wave method. Phys. Rev. B. 1994;50:17953–17979. doi: 10.1103/PhysRevB.50.17953. [DOI] [PubMed] [Google Scholar]
  • 31.Kresse G, Furthmüller J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 1996;6:15–50. doi: 10.1016/0927-0256(96)00008-0. [DOI] [PubMed] [Google Scholar]
  • 32.Kresse G, Furthmüller J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B. 1996;54:11169–11186. doi: 10.1103/PhysRevB.54.11169. [DOI] [PubMed] [Google Scholar]
  • 33.Bartók AP, Yates JR. Regularized SCAN functional. J. Chem. Phys. 2019;150:161101. doi: 10.1063/1.5094646. [DOI] [PubMed] [Google Scholar]
  • 34.Furness JW, Kaplan AD, Ning J, Perdew JP, Sun J. Accurate and numerically efficient r2SCAN meta-generalized gradient approximation. J. Phys. Chem. Lett. 2020;11:8208–8215. doi: 10.1021/acs.jpclett.0c02405. [DOI] [PubMed] [Google Scholar]
  • 35.Ong SP, et al. Python materials genomics (pymatgen): A robust, open-source python library for materials analysis. Comput. Mater. Sci. 2013;68:314–319. doi: 10.1016/j.commatsci.2012.10.028. [DOI] [Google Scholar]
  • 36.Schmidt J, Wang H -C, Cerqueira T F T, Botti S, Marques M A L. 2021. A new dataset of 175k stable and metastable materials calculated with the PBEsol and SCAN functionals. Materials Cloud. https://archive.materialscloud.org/record/2021.164 [DOI] [PMC free article] [PubMed]
  • 37.Schmidt J, et al. Predicting the thermodynamic stability of solids combining density functional theory and machine learning. Chem. Mater. 2017;29:5090–5103. doi: 10.1021/acs.chemmater.7b00156. [DOI] [Google Scholar]
  • 38.Schmidt J, Chen L, Botti S, Marques MAL. Predicting the stability of ternary intermetallics with density functional theory and machine learning. J. Chem. Phys. 2018;148:241728. doi: 10.1063/1.5020223. [DOI] [PubMed] [Google Scholar]
  • 39.Saal JE, Kirklin S, Aykol M, Meredig B, Wolverton C. Materials design and discovery with high-throughput density functional theory: The open quantum materials database (OQMD) JOM. 2013;65:1501–1509. doi: 10.1007/s11837-013-0755-4. [DOI] [Google Scholar]
  • 40.Hautier G, Fischer CC, Jain A, Mueller T, Ceder G. Finding nature’s missing ternary oxide compounds using machine learning and density functional theory. Chem. Mater. 2010;22:3762–3767. doi: 10.1021/cm100795d. [DOI] [Google Scholar]
  • 41.Shi, J. et al. High-throughput search of ternary chalcogenides for p-type transparent electrodes. Sci. Rep. 7, 10.1038/srep43179 (2017). [DOI] [PMC free article] [PubMed]
  • 42.Oliynyk AO, et al. High-throughput machine-learning-driven synthesis of full-heusler compounds. Chem. Mater. 2016;28:7324–7331. doi: 10.1021/acs.chemmater.6b02724. [DOI] [Google Scholar]
  • 43.Haas, P., Tran, F. & Blaha, P. Calculation of the lattice constant of solids with semilocal functionals. Phys. Rev. B79, 10.1103/physrevb.79.085104 (2009).
  • 44.Isaacs EB, Wolverton C. Performance of the strongly constrained and appropriately normed density functional for solid-state materials. Phys. Rev. Mater. 2018;2:063801. doi: 10.1103/PhysRevMaterials.2.063801. [DOI] [Google Scholar]
  • 45.Borlido, P. et al. Exchange-correlation functionals for band gaps of solids: benchmark, reparametrization and machine learning. Npj Comput. Mater. 6, 10.1038/s41524-020-00360-0 (2020).
  • 46.Borlido P, et al. Large-scale benchmark of exchange-correlation functionals for the determination of electronic band gaps of solids. J. Chem. Theory Comput. 2019;15:5069–5079. doi: 10.1021/acs.jctc.9b00322. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Citations

  1. Schmidt J, Pettersson L. 2021. Crystal-graph attention networks for the prediction of stable materials. Materials Cloud. https://archive.materialscloud.org/record/2021.222 [DOI] [PMC free article] [PubMed]
  2. Schmidt J, Wang H -C, Cerqueira T F T, Botti S, Marques M A L. 2021. A new dataset of 175k stable and metastable materials calculated with the PBEsol and SCAN functionals. Materials Cloud. https://archive.materialscloud.org/record/2021.164 [DOI] [PMC free article] [PubMed]

Data Availability Statement

All data can be easily processed with publicly available tools such as json and pymatgen35. An example usage is provided with the data. The dataset was generated with VASP, the bash and python scripts to generate input files or manage the output files can be downloaded from github repository: https://github.com/hyllios/utils/tree/main/ht_pd_scan.


Articles from Scientific Data are provided here courtesy of Nature Publishing Group

RESOURCES