New tolerance factor to predict the stability of perovskite oxides and halides

Christopher J Bartel; Christopher Sutton; Bryan R Goldsmith; Runhai Ouyang; Charles B Musgrave; Luca M Ghiringhelli; Matthias Scheffler

doi:10.1126/sciadv.aav0693

2019 Feb 8;5(2):eaav0693.

PMCID: PMC6368436 PMID: 30783625

Simple and interpretable data-driven descriptor accurately predicts the synthesizability of single and double perovskites.

Abstract

Predicting the stability of the perovskite structure remains a long-standing challenge for the discovery of new functional materials for many applications including photovoltaics and electrocatalysts. We developed an accurate, physically interpretable, and one-dimensional tolerance factor, τ, that correctly predicts 92% of compounds as perovskite or nonperovskite for an experimental dataset of 576 ABX₃ materials (X = O²⁻, F⁻, Cl⁻, Br⁻, I⁻) using a novel data analytics approach based on SISSO (sure independence screening and sparsifying operator). τ is shown to generalize outside the training set for 1034 experimentally realized single and double perovskites (91% accuracy) and is applied to identify 23,314 new double perovskites (A₂BB′X₆) ranked by their probability of being stable as perovskite. This work guides experimentalists and theorists toward which perovskites are most likely to be successfully synthesized and demonstrates an approach to descriptor identification that can be extended to arbitrary applications beyond perovskite stability predictions.

INTRODUCTION

Crystal structure prediction from chemical composition continues as a persistent challenge to accelerated materials discovery (1, 2). Most approaches capable of addressing this challenge require several computationally demanding electronic-structure calculations for each material composition, limiting their use to a small set of materials (3–6). Alternatively, descriptor-based approaches enable high-throughput screening applications because they provide rapid estimates of material properties (7, 8). Notably, the Goldschmidt tolerance factor, t (9), has been used extensively to predict the stability of the perovskite structure based only on the chemical formula, ABX₃, and the ionic radii, r_i, of each ion (A, B, X)

t = \frac{r_{A} + r_{X}}{\sqrt{2} \, (r_{B} + r_{X})} \qquad (1)

The perovskite crystal structure, as shown in Fig. 1A, is defined as any ABX₃ compound with a network of corner-sharing BX₆ octahedra surrounding a larger A-site cation (r_A > r_B), where the cations, A and B, can span the periodic table and the anion, X, is typically a chalcogen or halogen. Distortions from the cubic structure can arise from size mismatch of the cations and anion, which results in additional perovskite structures and nonperovskite structures. The B cation can also be replaced by two different ions, resulting in the double perovskite formula, A₂BB′X₆ (Fig. 1B). Single and double perovskite materials have exceptional properties for a variety of applications such as electrocatalysis (10), proton conduction (11), ferroelectrics (12) (using oxides, X = O²⁻), battery materials (13) (using fluorides, X = F⁻), as well as photovoltaics (14) and optoelectronics (15) (using the heavier halides, X = Cl⁻, Br⁻, I⁻).

The first step in designing new perovskites for these applications is typically the assessment of stability using t, which has informed the design of perovskites for over 90 years. However, as reported in recent studies, its accuracy is often insufficient (16). Considering 576 ABX₃ solids experimentally characterized at ambient conditions and reported in (17–19) (see Fig. 1C for the A, B, and X elements in this set), t correctly distinguishes between perovskite and nonperovskite for only 74% of materials and performs considerably worse for compounds containing heavier halides [chlorides (51% accuracy), bromides (56%), and iodides (33%)] than for oxides (83%) and fluorides (83%) (Fig. 2A, fig. S1, and table S1). This deficiency in generalization to halide perovskites severely limits the applicability of t for materials discovery.

Fig. 2 — (A) A decision tree classifier determines that the optimal bounds for perovskite formability using the Goldschmidt tolerance factor (t) are 0.825 < t < 1.059, which yields a classification accuracy of 74% for 576 experimentally characterized ABX₃ solids. (B) τ achieves a classification accuracy of 92% on the set of 576 ABX₃ solids based on perovskite classification for τ < 4.18, with this decision boundary identified using a one-node decision tree. All classifications made by t and τ on the experimental dataset are provided in table S1. The largest value of τ in the experimental set of 576 compounds is 181.5; however, all points with τ > 13 are correctly labeled as nonperovskite and are not shown to highlight the decision boundary. The outlying compounds at τ > 10 that are labeled perovskite yet have large τ are PuVO₃, AmVO₃, and PuCrO₃, which may indicate poorly defined radii or incorrect experimental characterization. (C) Comparison of Platt-scaled classification probabilities, P(τ), versus t. LaAlO₃ and NaBeCl₃ are labeled to highlight the variation in P(τ) at nearly constant t. (D) Comparison between P(τ) and the decomposition enthalpy (ΔH_d) for 36 double perovskite halides calculated using density functional theory (DFT) in the $Fm\bar{3}m$ structure in (32) and 37 single and double perovskite chalcogenides and halides in the $Pm\bar{3}m$ structure in (33). The legend corresponds with the anion, X. Positive decomposition enthalpy (ΔH_d > 0) indicates that the structure is stable with respect to decomposition into competing compounds. The green and white shaded regions correspond with agreement and disagreement between the calculated ΔH_d and the classification by τ. Points of disagreement are outlined in red. CaZrO₃ and CaHfO₃ are labeled because they are known to be stable in the perovskite structure, although they are unstable in the cubic structure (34, 35). For this reason, the best-fit line for the chalcogenides (X = O²⁻, S²⁻, Se²⁻) excludes these two points.

In this work, we present a new tolerance factor (τ), which has the form

τ = \frac{r_{X}}{r_{B}} - n_{A} \left( n_{A} - \frac{r_{A} / r_{B}}{\ln (r_{A} / r_{B})} \right) \qquad (2)

where n_A is the oxidation state of A, r_i is the ionic radius of ion i, r_A > r_B by definition, and τ < 4.18 indicates perovskite. A high overall accuracy of 92% for the experimental set (94% for a randomly chosen test set of 116 compounds) and nearly uniform performance across the five anions evaluated [oxides (92% accuracy), fluorides (92%), chlorides (90%), bromides (93%), and iodides (91%)] is achieved with τ (Fig. 2B, fig. S1, and table S1). Like t, the prediction of perovskite stability using τ requires only the chemical composition, allowing the tolerance factor to be agnostic to the many structures that are considered perovskite. In addition to predicting if a material is stable as perovskite, τ also provides a monotonic estimate of the probability that a material is stable in the perovskite structure. The accurate and probabilistic nature of τ, as well as its generalizability over a broad range of single and double perovskites, allows new physical insights into the stability of the perovskite structure and the prediction of thousands of new double perovskite oxides and halides, 23,314 of which are provided here and ranked by their probability of being stable in the perovskite structure.
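As a minimal sketch of Eq. 2 and the τ < 4.18 rule, the function below evaluates τ for a single ABX₃ composition. The function names are hypothetical, and the SrTiO₃ radii (Shannon values in Å, taken here as Sr²⁺ = 1.44, Ti⁴⁺ = 0.605, O²⁻ = 1.40) are assumptions that should be checked against Shannon's table (20) before any real use.

```python
import math

TAU_THRESHOLD = 4.18  # decision boundary quoted in the text (tau < 4.18 -> perovskite)


def tau_factor(r_a, r_b, r_x, n_a):
    """Evaluate Eq. 2 from ionic radii (same units) and the A-site oxidation state."""
    if r_a <= r_b:
        # Eq. 2 assumes r_A > r_B; ln(r_A / r_B) is zero when the radii are equal.
        raise ValueError("tau requires r_A > r_B")
    ratio = r_a / r_b
    return r_x / r_b - n_a * (n_a - ratio / math.log(ratio))


def is_perovskite(tau):
    """Apply the one-dimensional decision rule from the text."""
    return tau < TAU_THRESHOLD


# Assumed Shannon radii in angstroms (illustrative values, not from the source).
tau_srtio3 = tau_factor(r_a=1.44, r_b=0.605, r_x=1.40, n_a=2)
print(f"SrTiO3: tau = {tau_srtio3:.2f}, perovskite = {is_perovskite(tau_srtio3)}")
```

With these assumed radii, τ falls below the 4.18 boundary, which is consistent with SrTiO₃ being a well-known perovskite.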

RESULTS AND DISCUSSION

Finding an improved tolerance factor to predict perovskite stability

One key aspect of the performance of t is how well the sum of ionic radii estimates the interatomic bond distances for a given structure. Shannon’s revised effective ionic radii (20) based on a systematic empirical assessment of interatomic distances in nearly 1000 compounds are the typical choice for radii because they provide ionic radius as a function of ion, oxidation state, and coordination number for the majority of elements. Most efforts to improve t have focused on refining the input radii (17, 19, 21, 22) or increasing the dimensionality of the descriptor through two-dimensional (2D) structure maps (18, 23, 24) or high-dimensional machine-learned models (25–27). However, all hitherto applied approaches for improving the Goldschmidt tolerance factor are only effective over a limited range of ABX₃ compositions. Despite its modest classification accuracy, t remains the primary descriptor used by experimentalists and theorists to predict the stability of perovskites.

The SISSO (sure independence screening and sparsifying operator) approach (28) was used to identify an improved tolerance factor for predicting whether a given compound is perovskite [determined by experimental realization of any structure with corner-sharing BX₆ octahedra (21) at ambient conditions] or nonperovskite [determined by experimental realization of any structure(s) without corner-sharing BX₆ octahedra, including, in some cases, failed synthesis of any ABX₃ compound]. Of the 576 experimentally characterized ABX₃ solids, 80% were used to train and 20% were used to test the SISSO-learned descriptor. Several alternative atomic properties were considered as candidate features, and among them, SISSO determined that the best performing descriptor, τ (Eq. 2 and Fig. 2B), depends only on oxidation states and Shannon ionic radii (see Materials and Methods for an explanation of the approach used for descriptor identification and a discussion of alternative approaches). For the set of 576 ABX₃ compositions, τ correctly labels 94% of the perovskites and 89% of the nonperovskites compared with 94 and 49%, respectively, using t. The primary advantage of τ over t is the remarkable reduction in compounds that are predicted to be perovskite but are not experimentally identified as stable perovskites, with false-positive rates for τ and t of 11 and 51%, respectively. Full confusion matrices along with additional performance metrics for τ and t are provided in table S2. The large decrease in false-positive rate (from 51% to 11%) while substantially increasing the overall classification accuracy (from 74% to 92%) demonstrates that τ improves significantly upon t as a reliable tool to guide experimentalists toward which compounds can be synthesized in perovskite structures.
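The caption of Fig. 2B states that the τ < 4.18 boundary was identified with a one-node decision tree. The sketch below shows what such a single-threshold fit could look like in plain Python; the function name `fit_stump` and the toy data are hypothetical, and the authors' actual implementation may differ (for example, in the splitting criterion used by their decision tree software).

```python
def fit_stump(values, labels):
    """Find the threshold maximizing accuracy for the rule: class 1 if value < threshold.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns (threshold, accuracy); ties keep the smallest threshold.
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    candidates = [(lo + hi) / 2 for lo, hi in zip(xs, xs[1:]) if lo != hi]
    best_thr, best_acc = None, -1.0
    for thr in candidates:
        correct = sum((v < thr) == bool(y) for v, y in pairs)
        acc = correct / len(pairs)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr, best_acc


# Hypothetical descriptor values; 1 = perovskite, 0 = nonperovskite.
# The last point mimics an outlier with large tau that is labeled perovskite.
toy_tau = [2.1, 2.9, 3.5, 3.9, 4.4, 4.6, 5.2, 7.0, 11.0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 1]

threshold, accuracy = fit_stump(toy_tau, toy_labels)
print(f"threshold = {threshold:.2f}, accuracy = {accuracy:.2f}")
```

Accuracy is used as the selection criterion here only for readability; an impurity-based criterion would be an equally reasonable choice for a sketch of this kind.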

Beyond the improved accuracy, a crucial advantage of τ is the monotonic (continuous) dependence of perovskite stability on τ. As τ decreases, the τ-based probability of being perovskite, P(τ), increases, where perovskites are expected for an empirically determined range of τ < 4.18 (Fig. 2B; see Materials and Methods for details). Probabilities are obtained using Platt’s scaling (29), where the binary classification of perovskite/nonperovskite is transformed into a continuous probability estimate of perovskite stability, P(τ), by training a logistic regression model on the τ-derived binary classification. Probabilities cannot similarly be obtained with t because the stability of the perovskite structure does not increase or decrease monotonically with t; instead, only the window 0.825 < t < 1.059 is classified as perovskite (this range maximizes the classification accuracy of t on the set of 576 compounds). While P(τ) is sigmoidal with respect to τ because of the logistic fit (fig. S2), a bell-shaped behavior of P(τ) with respect to t is observed because of the multiple decision boundaries required for t (Fig. 2C). As a result, P(τ) (i.e., the probability of perovskite stability using τ) increases with t up to t ~ 0.9. Beyond this value, the probabilities level out or decrease as t increases further.
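The probability calibration described above can be sketched as a one-variable logistic fit of the τ-derived class labels against τ. The example below is an assumption-laden illustration: the τ values are invented, the fit uses plain gradient descent without the target smoothing of Platt's original method, and the fitted coefficients are not those used to produce P(τ) in this work.

```python
import math

TAU_THRESHOLD = 4.18  # decision boundary quoted in the text


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(scores, labels, lr=0.1, n_iter=5000):
    """Fit P(y=1 | s) = sigmoid(a*s + b) by batch gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical tau values; labels are the tau-derived classes (1 = perovskite).
taus = [1.5, 2.5, 3.2, 3.8, 4.0, 4.4, 4.9, 5.5, 7.0, 12.0]
labels = [1 if t < TAU_THRESHOLD else 0 for t in taus]

# Center tau on the decision boundary so the fit is better conditioned.
a, b = fit_logistic([t - TAU_THRESHOLD for t in taus], labels)


def p_tau(tau):
    """Calibrated probability of the perovskite class for a given tau."""
    return sigmoid(a * (tau - TAU_THRESHOLD) + b)


print(f"P(3.0) = {p_tau(3.0):.3f}, P(6.0) = {p_tau(6.0):.3f}")
```

Because the labels are derived from a threshold on τ, they are perfectly separable, so the steepness of an unregularized fit depends on the number of iterations. Only the sigmoidal, monotonically decreasing shape should be read from this sketch, not specific probability values.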

The disparity between the τ-derived perovskite probability, P(τ), and the assignment by t can be significant, especially in the range where t predicts a stable perovskite (0.825 < t < 1.059). A comparison of the perovskite (LaAlO₃) and the nonperovskite (NaBeCl₃) illustrates the discrepancy between these two approaches. t incorrectly predicts both compounds to be perovskite (t = 1.0), whereas P(τ) varies from <10% for NaBeCl₃ to >97% for LaAlO₃, in agreement with the experimental results. For NaBeCl₃, instability in the perovskite structure arises from an insufficiently large Be²⁺ cation on the B site, which leads to unstable BeCl₆ octahedra. This contribution to perovskite stability is accounted for in the first term of τ (Eq. 2, r_X/r_B = μ⁻¹, where μ is the octahedral factor).
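The LaAlO₃/NaBeCl₃ comparison can be reproduced approximately with a few lines of code. The sketch below evaluates Eq. 1 and Eq. 2 for both compounds; the Shannon radii (Å) are assumed values (La³⁺ = 1.36, Al³⁺ = 0.535, O²⁻ = 1.40, Na⁺ = 1.39, Be²⁺ = 0.45, Cl⁻ = 1.81) and may differ slightly from those assigned in table S1.

```python
import math


def goldschmidt_t(r_a, r_b, r_x):
    """Goldschmidt tolerance factor, Eq. 1."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def tau_factor(r_a, r_b, r_x, n_a):
    """New tolerance factor, Eq. 2 (requires r_a > r_b)."""
    ratio = r_a / r_b
    return r_x / r_b - n_a * (n_a - ratio / math.log(ratio))


# Assumed Shannon radii in angstroms; n_a is the A-site oxidation state.
compounds = {
    "LaAlO3": {"r_a": 1.36, "r_b": 0.535, "r_x": 1.40, "n_a": 3},
    "NaBeCl3": {"r_a": 1.39, "r_b": 0.45, "r_x": 1.81, "n_a": 1},
}

results = {}
for name, p in compounds.items():
    t = goldschmidt_t(p["r_a"], p["r_b"], p["r_x"])
    tau = tau_factor(**p)
    results[name] = (t, tau)
    print(f"{name}: t = {t:.3f}, tau = {tau:.2f}, tau < 4.18: {tau < 4.18}")
```

Under these assumptions both compounds give t close to 1.0, whereas τ lands on opposite sides of the 4.18 boundary, with the large r_X/r_B term dominating for NaBeCl₃.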

μ is the typical choice for a second feature used in combination with t (18, 19, 23) and was recently used to assess the predictive accuracy of Goldschmidt’s “no-rattling” principle. In this analysis, six inequalities dependent on t and μ were derived and used to predict the formability of single and double perovskites with a reported accuracy of ~80% (30). Notably, training a decision tree algorithm on the bounds of t and μ that optimally separate perovskite from nonperovskite leads to a classification accuracy of 85% for this dataset (fig. S3). In contrast to these 2D descriptors based on (t, μ), τ incorporates μ as a 1D descriptor yet still achieves a higher accuracy of 92%, demonstrating the capability of the SISSO algorithm to identify a highly accurate tolerance factor composed of intuitively meaningful parameters.

The nature of geometrical descriptors, such as t or μ, is fundamentally different than that of data-driven descriptors, such as τ. t and μ are derived from geometric constraints that indicate when the perovskite structure is a possible structure that can form. However, these constraints do not necessarily indicate when the perovskite structure is the ground-state structure and does form. For instance, if t = 1 and the ionic limit on which t was derived is applicable (the interatomic distances are sums of the ionic radii), these criteria do not suggest that perovskite is the ground-state structure, only that the interatomic distances are such that the lattice constants in the A-X and B-X directions can be commensurate with the perovskite structure. The fact that t does not guarantee the formation of the perovskite structure is evident by the high false-positive rate (51%) in the region of t where perovskite is expected (0.825 < t < 1.059). Similarly, although μ may fall within the range where BX₆ octahedra are expected based on geometric considerations (0.414 < μ < 0.732), the octahedra that form may be edge or face sharing, and therefore, the observed structure is nonperovskite. In this work, SISSO searches a massive space of potential descriptors to identify the one that most successfully detects when a given chemical formula will or will not crystallize in the perovskite structure, and because this is the target property, τ emerges as a much more predictive descriptor than t or μ.

Although the classification by τ disagrees with the experimental label for 8% of the 576 compounds, the agreement increases to 99% outside the range 3.31 < τ < 5.92 (200 compounds) and 100% outside the range 3.31 < τ < 12.08 (152 compounds). The experimental dataset may also be imperfect as compounds can manifest different crystal structures as a function of the synthesis conditions due to, e.g., defects in the experimental samples (impurities, vacancies, etc.). These considerations emphasize the usefulness of τ-derived probabilities, in addition to the binary classification of perovskite/nonperovskite, which address these uncertainties in the experimental data and corresponding classification by τ.

Comparing τ to calculated perovskite stabilities

The precise and probabilistic nature of τ, as well as its simple functional form, which depends only on widely available Shannon radii (and the oxidation states required to determine the radii), enables the rapid search across composition space for stable perovskite materials. Before attempting synthesis, it is common for new materials to be examined using computational approaches; therefore, it is useful to compare the predictions from τ with those obtained using density functional theory (DFT). The stabilities (decomposition enthalpies, ΔH_d) of 73 single and double perovskite chalcogenides and halides were recently examined with DFT using the Perdew-Burke-Ernzerhof (31) exchange-correlation functional (32, 33). τ is found to agree with the calculated stability for 64 of 73 calculated materials. Importantly, the probabilities that result from classification with τ linearly correlate with ΔH_d, demonstrating the value of the monotonic behavior of τ and P(τ) (Fig. 2D and table S3).

Although τ appears to disagree with these DFT calculations for nine compounds, six disagreements lie near the decision boundaries [P(τ) = 0.5, ΔH_d = 0 meV/atom], suggesting that they cannot be confidently classified as stable or unstable perovskites using τ or DFT calculations of the cubic structure. Of the remaining disagreements, CaZrO₃ and CaHfO₃ reveal the power of τ compared with DFT calculations of the cubic structure, as these two oxides are known to be isostructural with the orthorhombic perovskite CaTiO₃, from which the name perovskite originates (34, 35). ΔH_d < −90 meV/atom for these two compounds in the cubic structure, indicating that they are nonperovskites. In contrast, τ predicts both compounds to be stable perovskites with ~65% probability, which agrees with the experimental results. These results show that a key challenge in the prediction of perovskite stability from quantum chemical calculations is the requirement of a specific structure as an input, as there are more than a dozen unique structures classified as perovskite (i.e., those having corner-sharing BX₆ octahedra) and many more that are nonperovskite.

Several recent machine-learned descriptors for perovskite stability have been trained or tested on DFT-calculated stabilities of only the cubic perovskite structure (33, 36–38). However, less than 10% of perovskites are observed experimentally in this structure (21), leading to an inherent disagreement between the descriptor predictions and experimental observations. Recently, it was shown that of 254 synthesized perovskite oxides (ABO₃), DFT calculations in the Open Quantum Materials Database (39) predict only 186 (70%) to be stable or even moderately unstable (within 100 meV/atom of the convex hull) (27). The discrepancy is likely associated with the difference in energy between the true perovskite ground state and the calculated high-symmetry structure(s). Because τ was trained exclusively on the experimental characterization of ABX₃ compounds, τ is informed by the true ground-state (or metastable but observed) structure of each ABX₃ and the potential for these compounds to decompose into any compound(s) in the A-B-X composition space. A principal advantage of τ over many existing descriptors is that its identification and validation were based on experimentally observed stability or instability of a structurally diverse dataset.

Extension to double perovskite oxides and halides

Double perovskites are particularly intriguing as an emerging class of semiconductors that offer a lead-free alternative to traditional perovskite photoabsorbers and an increased compositional tunability for enhancing desired properties such as catalytic activity (10, 16, 40). Still, the experimentally realized composition space of double perovskites is relatively unexplored compared with the number of possible A, B, B′, and X combinations that can form A₂BB′X₆ compounds. The set of 576 compounds used for training and testing τ is composed of 49 A cations, 67 B cations, and 5 X anions, from which >500,000 double perovskite formulas, A₂BB′X₆, can be constructed. Comparison with the Inorganic Crystal Structure Database (ICSD) (30, 41) reveals only 918 compounds (<0.2%) with known crystal structures, 868 of which are perovskite.

Although τ was only trained on ABX₃ compounds, it is readily adaptable to double perovskites because it depends only on composition and not structure. To extend τ to A₂BB′X₆ formulas, r_B is approximated as the arithmetic mean of the two B-site radii (r_B, r_B′). τ correctly classifies 91% of these 918 A₂BB′X₆ compounds in the ICSD (compared with 92% on 576 ABX₃ compounds), recovering 806 of 868 known double perovskites (table S4). The geometric mean has also been used to approximate the radius of a site with two ions (42). We find that this has little effect on classification with τ, as 91% of the 918 A₂BB′X₆ compounds are also correctly classified using the geometric mean for r_B, and the classification label differs for only 14 of 918 compounds using the arithmetic or geometric mean. Although τ was identified using 460 ABX₃ compounds, the agreement with experiment on these compounds (92%) is comparable to that on the 1034 compounds (91%) that span ABX₃ (116 compounds) and A₂BB′X₆ (918 compounds) formulas and were completely excluded from the development of τ (i.e., test set compounds). This result indicates pronounced generalizability to predicting experimental realization for single and double perovskites that are yet to be discovered. With τ thoroughly validated as being predictive of experimental stability, the space of yet-undiscovered double perovskites was explored to identify 23,314 charge-balanced double perovskites that τ predicts to be stable at ambient conditions (of >500,000 candidates). These compounds are provided in table S4 including assigned oxidation states and radii along with t and τ, predictions made using each tolerance factor, and classification in the ICSD where available. There are thousands of additional compounds with substitutions on the A and/or X sites, AA′BB′(XX′)₃, that are expected to be similarly rich in yet-undiscovered perovskite compounds.
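The extension to A₂BB′X₆ formulas amounts to replacing r_B in Eq. 2 with a mean of the two B-site radii. The sketch below illustrates both the arithmetic and geometric mean for Cs₂AgBiCl₆; the function names are hypothetical, and the Shannon radii (Å) are assumed values (Cs⁺ = 1.88, Ag⁺ = 1.15, Bi³⁺ = 1.03, Cl⁻ = 1.81) rather than the entries of table S4.

```python
import math


def tau_factor(r_a, r_b, r_x, n_a):
    """New tolerance factor, Eq. 2 (requires r_a > r_b)."""
    ratio = r_a / r_b
    return r_x / r_b - n_a * (n_a - ratio / math.log(ratio))


def tau_double(r_a, r_b1, r_b2, r_x, n_a, mean="arithmetic"):
    """Eq. 2 for A2BB'X6, with r_B taken as a mean of the two B-site radii."""
    if mean == "arithmetic":
        r_b = 0.5 * (r_b1 + r_b2)
    elif mean == "geometric":
        r_b = math.sqrt(r_b1 * r_b2)
    else:
        raise ValueError("mean must be 'arithmetic' or 'geometric'")
    return tau_factor(r_a, r_b, r_x, n_a)


# Assumed Shannon radii in angstroms for Cs2AgBiCl6 (illustrative values).
radii = {"r_a": 1.88, "r_b1": 1.15, "r_b2": 1.03, "r_x": 1.81, "n_a": 1}
tau_arith = tau_double(**radii, mean="arithmetic")
tau_geom = tau_double(**radii, mean="geometric")
print(f"arithmetic: {tau_arith:.3f}, geometric: {tau_geom:.3f}")
```

When the two B-site radii are similar, as in this example, the two means nearly coincide, which is in line with the small number of compounds whose label changes between the two choices.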

Two particularly attractive classes of materials within this set of A₂BB′X₆ compounds are double perovskites with A = Cs⁺, X = Cl⁻ and A = La³⁺, X = O²⁻, which have garnered substantial interest in a number of applications including photovoltaics, electrocatalysis, and ferroelectricity. The ICSD contains 45 compounds (42 perovskites) with the formula Cs₂BB′Cl₆, 43 of which are correctly classified as perovskite or nonperovskite by τ. From the high-throughput analysis using τ, we predict an additional 420 perovskites to be stable, with 164 having at least the same probability of perovskite formation as the recently synthesized perovskite, Cs₂AgBiCl₆ [P(τ) = 69.6%] (43). A map of perovskite probabilities for charge-balanced Cs₂BB′Cl₆ compounds is shown in Fig. 3 (lower triangle). Within this set of 164 probable perovskites, there is an opportunity to synthesize double perovskite chlorides that contain 3d transition metals substituted on one or both B sites, as 83 new compounds of this type are predicted to be stable as perovskite with high probability.

Fig. 3 — Lower triangle: Probability of forming a stable perovskite with the formula Cs₂BB′Cl₆ as predicted by τ. Upper triangle: Probability of forming a stable perovskite with the formula La₂BB′O₆ as predicted by τ. White spaces indicate B/B′ combinations that do not result in charge-balanced compounds with r_A > r_B. The colors indicate the Platt-scaled classification probabilities, P(τ), with higher P(τ) indicating a higher probability of forming a stable perovskite. B/B′ sites are restricted to ions that are labeled as B sites in the experimental set of 576 ABX₃ compounds.

While double perovskite oxides have been explored extensively for a number of applications, the small radius and favorable charge of O²⁻ yield a massive design space for the discovery of new compounds. For La₂BB′O₆, ~63% of candidate compositions are found to be charge-balanced compared with only ~24% of candidate Cs₂BB′Cl₆ compounds. The ICSD contains 85 La₂BB′O₆ compounds, all of which are predicted to be perovskite by τ in agreement with the experiment. We predict an additional 1128 perovskites to be discoverable in this space, with a remarkable 990 having P(τ) ≥ 85% (Fig. 3, upper triangle). All 128 ABX₃ compounds in the experimental set that meet this threshold are experimentally realized as perovskite, suggesting that there is ample opportunity for perovskite discovery in lanthanum oxides.

Compositional mapping of perovskite stability

In addition to enabling the rapid exploration of stoichiometric perovskite compositions, τ provides the probability of perovskite stability, P(τ), for an arbitrary combination of n_A, r_A, r_B, and r_X, which is shown in Fig. 4. For each grouping shown in Fig. 4, experimentally realized perovskites and nonperovskites are shown as single points to compare with the range of values in the predictions made from τ. Doping at various concentrations presents a nearly infinite number of $A_{1-x}A'_{x}B_{1-y}B'_{y}(X_{1-z}X'_{z})_{3}$ compositions that allows the tuning of technologically useful properties. τ suggests the size and concentration of dopants on the A, B, or X sites that likely lead to improved stability in the perovskite structure. Conversely, compounds that lie in the high-probability region are likely amenable to ionic substitutions that decrease the probability of forming a perovskite but may improve a desired property for another application. For example, LaCoO₃, with P(τ) = 98.9%, should accommodate reasonable ionic substitutions (i.e., A sites of comparable size to La or B sites of comparable size to Co) and was recently shown to have enhanced oxygen exchange capacity and nitric oxide oxidation kinetics with stable substitutions of Sr on the A site (44).

The probability maps in Fig. 4 arise from the functional form of τ (Eq. 2) and provide insights into the stability of the perovskite structure as the size of each ion is varied. The perovskite structure requires that the A and B cations occupy distinct sites in the ABX₃ lattice, with A 12-fold and B 6-fold coordinated by X. When r_A and r_B are too similar, nonperovskite lattices that have similarly coordinated A and B sites, such as cubic bixbyite, become preferred over the perovskite structure. On the basis of the construct of τ, as r_A/r_B → 1, P(τ) → 0, which arises from the +x/ln(x) (x = r_A/r_B) term, where $\lim_{x \to 1^{+}} \frac{x}{\ln (x)} = +\infty$ and larger values of τ lead to lower probabilities of forming perovskites. When r_A = r_B, τ is undefined, yet compounds where A and B have identical radii are rare and not expected to adopt perovskite structures (t = 0.71).
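The divergence of the x/ln(x) term as the cation radii become similar can be checked numerically. The short sketch below evaluates the term for a few hypothetical ratios x = r_A/r_B approaching 1 from above; the sample values are chosen only for illustration.

```python
import math


def ratio_term(x):
    """The x / ln(x) contribution to tau, with x = r_A / r_B > 1."""
    return x / math.log(x)


# Hypothetical radius ratios approaching 1 from above.
xs = [2.0, 1.5, 1.1, 1.01, 1.001]
terms = [ratio_term(x) for x in xs]
for x, term in zip(xs, terms):
    print(f"x = {x:<6} x/ln(x) = {term:.1f}")
```

For n_A > 0 this term enters τ with a positive sign, so its growth near x = 1 pushes τ well above the 4.18 boundary regardless of the other terms.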

The octahedral term in τ (r_X/r_B) also manifests itself in the probability maps, particularly in the lower bound on r_B where perovskites are expected as r_X is varied. As r_X increases, r_B must similarly increase to enable the formation of stable BX₆ octahedra. This effect is noticeable when separately comparing compounds containing Cl⁻ (left), Br⁻ (center), and I⁻ (right) (bottom row of Fig. 4), where the range of allowed cation radii decreases as the anion radius increases. For r_B << r_X, r_X/r_B becomes large, which increases τ and therefore decreases the probability of stability in the perovskite structure. This accounts for the inability of small B-site ions to sufficiently separate X anions in BX₆ octahedra, where geometric arguments suggest that B is sufficiently large to form BX₆ octahedra only for r_B/r_X > 0.414. Because the cation radii ratios strongly affect the probability of perovskite, as discussed in the context of x/ln(x), r_X also has a noticeable indirect effect on the lower bound of r_A, which increases as r_X increases.

The role of n_A in τ is more difficult to parse, but its placement dictates two effects on stability: as A is more oxidized (increasing n_A), −n_A² increases the probability of forming the perovskite structure, but n_A also magnifies the effect of the x/ln(x) term, increasing the importance of the cation radii ratio. Notably, n_A = 1 for most halides and some oxides (245 of the 576 compounds in our set), and in these cases, $τ = \frac{r_{X}}{r_{B}} + \frac{r_{A} / r_{B}}{\ln (r_{A} / r_{B})} - 1$ for all combinations of A, B, and X and n_A plays no role as the composition is varied.

This analysis illustrates how data-driven approaches not only can be used to maximize the predictive accuracy of new descriptors but also can be leveraged to understand the actuating mechanisms of a target property—in this case, perovskite stability. This attribute distinguishes τ from other descriptors for perovskite stability that have emerged in recent years. For instance, three recent works have shown that the experimental formability of perovskite oxides and halides can be separately predicted with high accuracy using kernel support vector machines (26), gradient boosted decision trees (25), or a random forest of decision trees (27). While these approaches can yield highly accurate models, the resulting descriptors are not documented analytically, and therefore, the mechanism by which they make the perovskite/nonperovskite classification is opaque.

CONCLUSIONS

We report a new tolerance factor, τ, that enables the prediction of experimentally observed perovskite stability significantly better than the widely used Goldschmidt tolerance factor, t, and the 2D structure map using t and the octahedral factor, μ. For 576 ABX₃ and 918 A₂BB′X₆ compounds, the prediction by τ agrees with the experimentally observed stability for >90% of compounds, with >1000 of these compounds reserved for testing generalizability (prediction accuracy). The deficiency of t arises from its functional form and not the input features, as the calculation of τ requires the same inputs as t (composition, oxidation states, and Shannon ionic radii). Thus, τ enables a superior prediction of perovskite stability with negligible computational cost. The monotonic and 1D nature of τ allows the determination of perovskite probability as a continuous function of the radii and oxidation states of A, B, and X. These probabilities are shown to linearly correlate with DFT-computed decomposition enthalpies and help clarify how chemical substitutions at each of the sites modulate the tendency for perovskite formation. Using τ, we predict the probability of double perovskite formation for thousands of unexplored compounds, resulting in a library of stable perovskites ordered by their likelihood of forming perovskites. Because of the simplicity and accuracy of τ, we expect its use to accelerate the discovery and design of state-of-the-art perovskite materials for applications ranging from photovoltaics to electrocatalysis.

MATERIALS AND METHODS

Radii assignment

To develop a descriptor that takes as input the chemical composition and outputs a prediction of perovskite stability, the features that comprise the descriptor must also be based only on composition. However, it is not known a priori which cation will occupy the A or B site given only a chemical composition, CC′X₃ (C and C′ being cations). Therefore, we developed a systematic method for determining which cation is A or B to enable τ to be applied to an arbitrary new material. First, a list of allowed oxidation states is defined for each cation based on Shannon’s radii (20). All pairs of oxidation states for C and C′ that charge-balance X₃ are considered. If more than one charge-balanced pair exists, a single pair is chosen on the basis of the electronegativity ratio of the two cations (χ_C/χ_C′). If 0.9 < χ_C/χ_C′ < 1.1, the pair that minimizes |n_C − n_C′| is chosen, where n_C is the oxidation state for C. Otherwise, the pair that maximizes |n_C − n_C′| is chosen. With the oxidation states of C and C′ assigned, the Shannon radii for the cations occupying the A and B sites are taken at the available coordination numbers closest to 12 and 6, respectively, which are consistent with the coordination environments of the A and B cations in the perovskite structure. Last, the radii of the C and C′ cations are compared, and the larger cation is assigned as the A-site cation. This strategy reproduced the assignment of the A and B cations for 100% of 313 experimentally labeled perovskites.
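The oxidation-state and site-assignment rules above can be sketched as two small functions. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, the lookup of Shannon radii by coordination number is omitted, and ties between equally ranked pairs (which the text does not address) are resolved by taking the first pair found.

```python
def choose_oxidation_states(states_c, states_c2, n_x, chi_c, chi_c2):
    """Select (n_C, n_C') for CC'X3 following the rules described in the text.

    Returns None when no pair of allowed oxidation states balances X3.
    """
    target = -3 * n_x  # total cation charge needed to balance three anions
    pairs = [(a, b) for a in states_c for b in states_c2 if a + b == target]
    if not pairs:
        return None
    if len(pairs) == 1:
        return pairs[0]

    def spread(pair):
        return abs(pair[0] - pair[1])

    if 0.9 < chi_c / chi_c2 < 1.1:
        # Similar electronegativities: prefer the most similar oxidation states.
        return min(pairs, key=spread)
    # Dissimilar electronegativities: prefer the most different oxidation states.
    return max(pairs, key=spread)


def assign_sites(cation_c, radius_c, cation_c2, radius_c2):
    """Return (A, B): the cation with the larger radius takes the A site."""
    if radius_c >= radius_c2:
        return cation_c, cation_c2
    return cation_c2, cation_c


# Hypothetical cations with several charge-balanced options for an oxide (n_X = -2).
similar = choose_oxidation_states([1, 2, 3], [3, 4, 5], -2, chi_c=1.0, chi_c2=1.05)
dissimilar = choose_oxidation_states([1, 2, 3], [3, 4, 5], -2, chi_c=0.8, chi_c2=2.0)

# Assumed radii in angstroms, used only to show the site assignment.
sites = assign_sites("Ti", 0.605, "Sr", 1.44)
print(similar, dissimilar, sites)
```

In a full implementation the radii passed to `assign_sites` would come from Shannon's table for the chosen oxidation states and coordination numbers.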

Selection of τ

For the identification of τ among the offered candidates, the oxidation states (n_A, n_B, n_X), ionic radii (r_A, r_B, r_X), and radii ratios (r_A/r_B, r_A/r_X, r_B/r_X) comprise the primary features, Φ₀, where Φ_n refers to the descriptor space with n iterations of complexity as defined in (28). For example, Φ₁ refers to the primary features (Φ₀), together with one iteration of algebraic/functional operations applied to each feature in Φ₀. Φ₂ then refers to the application of algebraic/functional operations to all potential descriptors in Φ₁, and so forth. Note that Φ_m contains all potential descriptors within Φ_n for every n < m, with a filter to remove redundant potential descriptors. For the discovery of τ, complexity up to Φ₃ is considered, yielding ~3 × 10⁹ potential descriptors. An alternative would be to exclude the radii ratios from Φ₀ and construct potential descriptors with complexity up to Φ₄. However, given the minimal Φ₀ = [n_A, n_B, n_X, r_A, r_B, r_X], there are ~10⁸ potential descriptors in Φ₃, so ~10¹⁶ potential descriptors would be expected in Φ₄ (based on ~10² being present in Φ₁ and ~10⁴ in Φ₂), and this number is impractical to screen using available computing resources.
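The extrapolation to Φ₄ can be written out explicitly. As a rough approximation (an assumption made here for illustration), binary operations dominate the construction, so each iteration approximately squares the number of candidates obtained from the minimal Φ₀:

$$|\Phi_{n+1}| \sim |\Phi_{n}|^{2} \quad\Rightarrow\quad |\Phi_{1}| \sim 10^{2},\; |\Phi_{2}| \sim 10^{4},\; |\Phi_{3}| \sim 10^{8},\; |\Phi_{4}| \sim 10^{16}$$

Adding the three radii ratios to Φ₀ raises the Φ₃ count only to ~3 × 10⁹, which is why this route was preferred over a fourth iteration.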

The dataset of 576 ABX₃ compositions was partitioned randomly into an 80% training set for identifying candidate descriptors and a 20% test set for analyzing the predictive ability of each descriptor. The top 100,000 potential descriptors most applicable to the perovskite classification problem were identified using one iteration of SISSO with a subspace size of 100,000. Each descriptor in the set of ~3 × 10⁹ was ranked according to domain overlap, as described by Ouyang et al. (28). To identify a decision boundary for classification, a decision tree classifier with a maximum depth of two was fit to the top 100,000 candidate descriptors ranked based on domain overlap. Domain overlap (and not decision tree performance) was used as the SISSO ranking metric because of the much lower computational expense associated with applying this metric. Notably, τ was the 14,467th highest ranked descriptor by SISSO using the domain overlap metric, and hence, this defines the minimum subspace required to identify τ using this approach. Without evaluating a decision tree model for each descriptor in the set of ~3 × 10⁹ potential descriptors, we cannot be certain that a subspace size of 100,000 is sufficient to find the best descriptor. However, the identification of τ within a subspace as small as 15,000 suggests that a subspace size of 100,000 is sufficiently large to efficiently screen the much larger descriptor space. We have also conducted a test on this primary feature space (Φ₀ = [n_A, n_B, n_X, r_A, r_B, r_X, r_A/r_B, r_A/r_X, r_B/r_X]) with a subspace size of 500,000. Even after increasing the subspace size by 5×, τ remains the highest performing descriptor (a classification accuracy of 92% on the 576-compound set).

An important distinction between the SISSO approach described here and by Ouyang et al. (28) is the choice of sparsifying operator (SO). In this work, domain overlap was used to rank the features in SISSO, but a decision tree with a maximum depth of two was used as the SO (instead of domain overlap) to identify the best descriptor of those selected by SISSO. This alternative SO was used to decrease the leverage of individual data points, as the experimental labeling of perovskite/nonperovskite is prone to some ambiguity based on synthesis conditions, defects, and other experimental considerations.
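The screening step can be illustrated with a simplified 1D version of the domain-overlap ranking. This is only a sketch under stated assumptions: overlap is taken here as the number of samples that fall inside the interval shared by the two classes' value ranges, and the function names and toy data are hypothetical; the definition used in practice is given in (28).

```python
def domain_overlap(values, labels):
    """Count samples inside the interval where the two classes' ranges overlap."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    lo = max(min(pos), min(neg))
    hi = min(max(pos), max(neg))
    if lo > hi:
        return 0  # the class ranges are disjoint
    return sum(lo <= v <= hi for v in values)


def rank_by_overlap(candidates, labels, subspace_size):
    """Return the names of the candidates with the smallest domain overlap.

    candidates maps a descriptor name to its values over the training set.
    """
    ranked = sorted(candidates, key=lambda name: domain_overlap(candidates[name], labels))
    return ranked[:subspace_size]


# Toy training set: 1 = perovskite, 0 = nonperovskite.
labels = [1, 1, 1, 0, 0, 0]
candidates = {
    "separating": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "mixed": [1.0, 4.0, 6.0, 2.0, 3.0, 5.0],
}
selected = rank_by_overlap(candidates, labels, subspace_size=1)
```

Because this metric needs only minima, maxima, and a count per candidate, it is much cheaper than fitting a classifier to each of the ~3 × 10⁹ descriptors, which is the motivation given above for using it as the ranking step.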

The benefit of including the radii ratios in Φ₀ was made clear by comparing the performance of τ to the best descriptor obtained using the minimal primary feature space with Φ₀ = [n_A, n_B, n_X, r_A, r_B, r_X]. Repeating the procedure used to identify τ yields a Φ₃ with ~1 × 10⁸ potential descriptors. The best 1D descriptor was found to be $\frac{r_{B}}{n_{X} (r_{A} - r_{B})} + \frac{r_{B}}{r_{A}} - \frac{r_{X}}{r_{B}}$, with a classification accuracy of 89%.

Alternative features

We also considered the effects of including properties outside of those required to compute t or τ. We began with Φ₀ = [n_A, n_B, n_X, r_A, r_B, r_X, r_cov,A, r_cov,B, r_cov,X, IE_A, IE_B, IE_X, χ_A, χ_B, χ_X], where r_cov,i is the empirical covalent radius of neutral element i, IE_i is the empirical first ionization energy of neutral element i, and χ_i is the Pauling electronegativity of element i. All values were taken from WebElements (45), which aggregates a number of references that are listed therein. Repeating the procedure used to identify τ results in ~6 × 10¹⁰ potential descriptors in Φ₃. The best performing 1D descriptor was found to be $\frac{r_{A} / r_{B} - \sqrt{\chi_{X}}}{r_{cov, X} / r_{B} - r_{cov, A} / r_{cov, X}}$, with a classification accuracy of 90%. This is lower than the accuracy of τ, which makes use of only the oxidation states and ionic radii, and only slightly higher than the accuracy of the descriptor obtained using the minimal feature set.

Increasing dimensionality

To assess the performance of descriptors with increased dimensionality, we followed the approach to higher dimensional descriptor identification using SISSO described in (28): the residuals from classification by τ (those misclassified by the decision tree, Fig. 2B) were used as the target property in the search for a second dimension to include with τ. From the same set of ~3 × 10⁹ potential descriptors constructed to identify τ, the 100,000 1D descriptors that best classify the 41 training set compounds misclassified by τ were identified on the basis of domain overlap. Each of these 100,000 descriptors was paired with τ, and the performance of each 2D descriptor was assessed using a decision tree with a maximum depth of two. The best performing 2D descriptor was found to be $(\tau, \frac{| r_{A} r_{X} / r_{B}^{2} - n_{B} r_{A} / r_{B} |}{| r_{A} r_{B} / r_{X}^{2} - r_{A} / r_{B} + n_{B} |})$, with a classification accuracy of 95% on the 576-compound set. Improvements are expected to diminish as the dimensionality increases further due to the iterative nature of SISSO and the higher-order residuals used for subspace selection. Although the second dimension leads to slightly improved classification performance on the experimental set compared with τ, the simplicity and monotonicity of τ, which enable physical interpretation and the extraction of meaningful probabilities, support its selection instead of the more complex 2D descriptor. The benefits and capabilities of having a meaningfully probabilistic 1D tolerance factor, such as τ, are described in detail within the main text.

Potential for overfitting

The SISSO algorithm as implemented here selects τ from a space of ~3 × 10⁹ candidate descriptors, and the only parameter that is fit is the optimum value of τ that defines the decision boundary for classification as perovskite or nonperovskite, τ = 4.18. This decision boundary was optimized using a decision tree to maximize the classification accuracy on the training set of 460 compounds. In this case, Gini impurity was minimized to optimize the decision boundary, but alternative cost functions based on Kullback-Leibler divergence or classification accuracy (e.g., l₂) would find the same decision boundary. The SISSO descriptor identification is done from billions of candidates, but these functions comprise a discrete set, i.e., they form a basis in a high-dimensional space where the number of training points is the dimensionality of the space, which is not densely covered by the functions. Therefore, the selection of only one function, τ, cannot overfit the data. If some physical mechanism determining the stability of perovskites is not represented in the training set, it might be missed by the learned formula (here, τ), and the generalizability of the model would then be hampered. Nevertheless, the 94% accuracy achieved by τ on the excluded set of 116 compounds shows that τ can generalize outside of the training data.
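Fitting the single parameter can be sketched as a one-split search that minimizes the weighted Gini impurity over candidate thresholds. This is a minimal stand-in for the depth-limited decision tree, not the fitting code used in this work; the function names and the toy τ values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return the cut on a 1D descriptor that minimizes weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    best_score, best_cut = float("inf"), None
    for i in range(1, len(pairs)):
        if xs[i] == xs[i - 1]:
            continue  # no valid cut between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            # Place the cut midway between neighbouring values.
            best_score, best_cut = score, 0.5 * (xs[i - 1] + xs[i])
    return best_cut


# Toy data: 1 = perovskite, 0 = nonperovskite.
toy_tau = [3.0, 3.5, 4.0, 4.4, 5.0, 6.0]
toy_labels = [1, 1, 1, 0, 0, 0]
cut = best_threshold(toy_tau, toy_labels)
```

Only the position of the cut is learned from the labels, which illustrates why the fitted model has a single adjustable parameter.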

Alternative radii for more covalent compounds

Ionic radii are required inputs for τ (and t), and although the Shannon effective ionic radii are ubiquitous in solid-state materials research, a new set of B²⁺ radii was recently proposed for 18 cations to account for how their effective cationic radii vary as a function of increased covalency with the heavier halides (19). These revised radii apply to 129 of the 576 experimentally characterized compounds compiled in this dataset (62% of halides). Using these revised radii lowers the accuracy of τ for these 129 compounds by 5 percentage points, to 86%, compared with a classification accuracy of 91% using the Shannon radii for the same compounds. The application of τ using Shannon radii for presumably covalent compounds was further validated by noting that τ correctly classifies 37 of 40 compounds that contain Sn or Pb and achieves an accuracy of 91% for 141 compounds with X = Cl⁻, Br⁻, or I⁻. In addition to the higher accuracy achieved by τ when using Shannon radii, we note that the Shannon radii are more comprehensive than the revised radii in (19), applying to more ions, oxidation states, and coordination environments, and are thus recommended for the calculation of τ.

Computer packages used

The SISSO calculations were performed with a Fortran 90 implementation. Platt scaling (29) was used to extract classification probabilities for τ by fitting a logistic regression model on the decision tree classifications using threefold cross-validation. Decision tree fitting and Platt scaling were performed within the Python package scikit-learn. Data visualizations were generated within the Python packages Matplotlib and Seaborn.
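The idea behind the probability extraction can be illustrated with a simplified, dependency-free sketch. It is not the scikit-learn workflow described above: as a stated simplification, it fits a Platt-style sigmoid directly to the offset τ − 4.18 by gradient descent, without cross-validation, and the function names and toy data are hypothetical.

```python
import math


def fit_sigmoid(scores, labels, lr=0.1, steps=5000):
    """Fit P(y=1 | s) = 1 / (1 + exp(a*s + b)) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # With this sign convention, d(log loss)/d(a*s + b) = y - p.
            grad_a += (y - p) * s
            grad_b += y - p
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def perovskite_probability(tau_value, a, b, boundary=4.18):
    """Probability of perovskite formation for a given tau under the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(a * (tau_value - boundary) + b))


# Toy data with some label noise near the boundary: 1 = perovskite.
toy_tau = [3.2, 3.6, 3.9, 4.1, 4.3, 4.0, 4.25, 4.5, 5.0, 5.5]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
a, b = fit_sigmoid([t - 4.18 for t in toy_tau], toy_labels)
p_low = perovskite_probability(3.5, a, b)
p_high = perovskite_probability(5.0, a, b)
```

Because τ is one-dimensional and the fitted curve is monotonic, every value of τ maps to a single probability, consistent with the sigmoidal relationship between P(τ) and τ shown in fig. S2.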

Supplementary Material

http://advances.sciencemag.org/cgi/content/full/5/2/eaav0693/DC1


Acknowledgments

We thank A. Holder for helpful discussions regarding the manuscript. Funding: This project has received funding from the European Union’s Horizon 2020 research and innovation program (#676580: The NOMAD Laboratory—A European Center of Excellence and #740233: TEC1p), the Berlin Big-Data Center (BBDC, #01IS14013E), and BiGmax, the Max Planck Society’s Research Network on Big-Data-Driven Materials-Science. C.J.B. acknowledges support from a U.S. Department of Education Graduate Assistantship in Areas of National Need. C.S. acknowledges funding by the Alexander von Humboldt Foundation. C.B.M. acknowledges support from NSF award CBET-1433521, which was cosponsored by the NSF and the U.S. Department of Energy (DOE), Office of Energy Efficiency and Renewable Energy (EERE), Fuel Cell Technologies Office and from DOE award EERE DE-EE0008088. Part of this research was performed using computational resources sponsored by the U.S. DOE, Office of EERE and located at the National Renewable Energy Laboratory. Author contributions: M.S. and C.J.B. conceived the idea. C.J.B., C.S., and B.R.G. designed the studies. C.J.B. performed the studies. C.J.B., C.S., and B.R.G. analyzed the results and wrote the manuscript. R.O. provided the SISSO algorithm and facilitated its implementation. C.B.M., L.M.G., and M.S. supervised the project. All the authors discussed the results and implications and edited the manuscript. Competing interests: The authors declare that they have no competing financial interests. Data and materials availability: A repository containing all files necessary for classifying ABX₃ and AA′BB′(XX′)₃ compositions as perovskite or nonperovskite using τ is available at https://github.com/CJBartel/perovskite-stability. A graphical interface allowing users to classify compounds with τ is also available at https://analytics-toolkit.nomad-coe.eu. The classification of all compounds shown in the manuscript is available in the Supplementary Materials. 
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/2/eaav0693/DC1

Table S1. The 576 ABX₃ used for training and testing τ.

Table S2. Confusion matrices for τ (above) and t (below).

Table S3. Additional information associated with Fig. 2D.

Table S4. Double perovskite oxides and halides.

Fig. S1. Comparing the performance of t and τ by composition.

Fig. S2. Sigmoidal relationship between P(τ) and τ.

Fig. S3. (t, μ) structure map for 576 ABX₃ solids.

REFERENCES AND NOTES

1. Pauling L., The principles determining the structure of complex ionic crystals. J. Am. Chem. Soc. 51, 1010–1026 (1929).
2. Woodley S. M., Catlow R., Crystal structure prediction from first principles. Nat. Mater. 7, 937–946 (2008).
3. Kirkpatrick S., Gelatt C. D. Jr., Vecchi M. P., Optimization by simulated annealing. Science 220, 671–680 (1983).
4. Doye J. P. K., Wales D. J., Thermodynamics of global optimization. Phys. Rev. Lett. 80, 1357–1360 (1998).
5. Goedecker S., Minima hopping: An efficient search method for the global minimum of the potential energy surface of complex molecular systems. J. Chem. Phys. 120, 9911–9917 (2004).
6. Oganov A. R., Lyakhov A. O., Valle M., How evolutionary crystal structure prediction works—And why. Acc. Chem. Res. 44, 227–237 (2011).
7. Curtarolo S., Hart G. L. W., Buongiorno Nardelli M., Mingo N., Sanvito S., Levy O., The high-throughput highway to computational materials design. Nat. Mater. 12, 191–201 (2013).
8. Ghiringhelli L. M., Vybiral J., Levchenko S. V., Draxl C., Scheffler M., Big data of materials science: Critical role of the descriptor. Phys. Rev. Lett. 114, 105503 (2015).
9. Goldschmidt V. M., Die gesetze der krystallochemie. Naturwissenschaften 14, 477–485 (1926).
10. Hwang J., Rao R. R., Giordano L., Katayama Y., Yu Y., Shao-Horn Y., Perovskites in catalysis and electrocatalysis. Science 358, 751–756 (2017).
11. Duan C., Tong J., Shang M., Nikodemski S., Sanders M., Ricote S., Almansoori A., O’Hayre R., Readily processed protonic ceramic fuel cells with high performance at low temperatures. Science 349, 1321–1326 (2015).
12. Cohen R. E., Origin of ferroelectricity in perovskite oxides. Nature 358, 136–138 (1992).
13. Yi T., Chen W., Cheng L., Bayliss R. D., Lin F., Plews M. R., Nordlund D., Doeff M. M., Persson K. A., Cabana J., Investigating the intercalation chemistry of alkali ions in fluoride perovskites. Chem. Mater. 29, 1561–1568 (2017).
14. Correa-Baena J.-P., Saliba M., Buonassisi T., Grätzel M., Abate A., Tress W., Hagfeldt A., Promises and challenges of perovskite solar cells. Science 358, 739–744 (2017).
15. Kovalenko M. V., Protesescu L., Bodnarchuk M. I., Properties and potential optoelectronic applications of lead halide perovskite nanocrystals. Science 358, 745–750 (2017).
16. Li W., Wang Z., Deschler F., Gao S., Friend R. H., Cheetham A. K., Chemically diverse and multifunctional hybrid organic–inorganic perovskites. Nat. Rev. Mater. 2, 16099 (2017).
17. Zhang H., Li N., Li K., Xue D., Structural stability and formability of ABO₃-type perovskite compounds. Acta Crystallogr. B 63, 812–818 (2007).
18. Li C., Lu X., Ding W., Feng L., Gao Y., Guo Z., Formability of ABX₃ (X = F, Cl, Br, I) halide perovskites. Acta Crystallogr. B 64, 702–707 (2008).
19. Travis W., Glover E. N. K., Bronstein H., Scanlon D. O., Palgrave R. G., On the application of the tolerance factor to inorganic and hybrid halide perovskites: A revised system. Chem. Sci. 7, 4548–4556 (2016).
20. Shannon R. D., Revised effective ionic radii and systematic studies of interatomic distances in halides and chalcogenides. Acta Crystallogr. A 32, 751–767 (1976).
21. Lufaso M. W., Woodward P. M., Prediction of the crystal structures of perovskites using the software program SPuDS. Acta Crystallogr. B 57, 725–738 (2001).
22. Kieslich G., Sun S., Cheetham A. K., Solid-state principles applied to organic–inorganic perovskites: New tricks for an old dog. Chem. Sci. 5, 4712–4715 (2014).
23. Li C., Soh K. C. K., Wu P., Formability of ABO₃ perovskites. J. Alloys Compd. 372, 40–48 (2004).
24. Becker M., Klüner T., Wark M., Formation of hybrid ABX₃ perovskite compounds for solar cell application: First-principles calculations of effective ionic radii and determination of tolerance factors. Dalton Trans. 46, 3500–3509 (2017).
25. Pilania G., Balachandran P. V., Gubernatis J. E., Lookman T., Classification of ABO₃ perovskite solids: A machine learning study. Acta Crystallogr. B 71, 507–513 (2015).
26. Pilania G., Balachandran P. V., Kim C., Lookman T., Finding new perovskite halides via machine learning. Front. Mater. 3, 19 (2016).
27. Balachandran P. V., Emery A. A., Gubernatis J. E., Lookman T., Wolverton C., Zunger A., Predictions of new ABO₃ perovskite compounds by combining machine learning and density functional theory. Phys. Rev. Mater. 2, 043802 (2018).
28. Ouyang R., Curtarolo S., Ahmetcik E., Scheffler M., Ghiringhelli L. M., SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2, 083802 (2018).
29. Platt J. C., Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, in Advances in Large Margin Classifiers, A. J. Smola, P. B. Bartlett, B. Schölkopf, D. Schuurmans, Eds. (MIT Press, 1999), vol. 10, 61–74.
30. Filip M. R., Giustino F., The geometric blueprint of perovskites. Proc. Natl. Acad. Sci. U.S.A. 115, 5397–5402 (2018).
31. Perdew J. P., Burke K., Ernzerhof M., Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
32. Zhao X.-G., Yang D., Sun Y., Li T., Zhang L., Yu L., Zunger A., Cu–In halide perovskite solar absorbers. J. Am. Chem. Soc. 139, 6718–6725 (2017).
33. Sun Q., Yin W.-J., Thermodynamic stability trend of cubic perovskites. J. Am. Chem. Soc. 139, 14905–14908 (2017).
34. Megaw H. D., Crystal structure of double oxides of the perovskite type. Proc. Phys. Soc. 58, 133 (1946).
35. Feteira A., Sinclair D. C., Rajab K. Z., Lanagan M. T., Crystal structure and microwave dielectric properties of alkaline-earth hafnates, AHfO₃ (A=Ba, Sr, Ca). J. Am. Ceram. Soc. 91, 893–901 (2008).
36. Schmidt J., Shi J., Borlido P., Chen L., Botti S., Marques M. A. L., Predicting the thermodynamic stability of solids combining density functional theory and machine learning. Chem. Mater. 29, 5090–5103 (2017).
37. Faber F. A., Lindmaa A., von Lilienfeld O. A., Armiento R., Machine learning energies of 2 million elpasolite (ABC₂D₆) crystals. Phys. Rev. Lett. 117, 135502 (2016).
38. Xie T., Grossman J. C., Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).
39. Kirklin S., Saal J. E., Meredig B., Thompson A., Doak J. W., Aykol M., Rühl S., Wolverton C., The open quantum materials database (OQMD): Assessing the accuracy of DFT formation energies. npj Comput. Mater. 1, 15010 (2015).
40. Kamat P. V., Bisquert J., Buriak J., Lead-free perovskite solar cells. ACS Energy Lett. 2, 904–905 (2017).
41. Hellenbrandt M., The Inorganic Crystal Structure Database (ICSD)—Present and future. Crystallogr. Rev. 10, 17–22 (2004).
42. Li W., Ionescu E., Riedel R., Gurlo A., Can we predict the formability of perovskite oxynitrides from tolerance and octahedral factors? J. Mater. Chem. A 1, 12239–12245 (2013).
43. McClure E. T., Ball M. R., Windl W., Woodward P. M., Cs₂AgBiX₆ (X = Br, Cl): New visible light absorbing, lead-free halide perovskite semiconductors. Chem. Mater. 28, 1348–1354 (2016).
44. Choi S. O., Penninger M., Kim C. H., Schneider W. F., Thompson L. T., Experimental and computational investigation of effect of Sr on NO oxidation and oxygen exchange for La₁₋ₓSrₓCoO₃ perovskite catalysts. ACS Catal. 3, 2719–2728 (2013).
45. WebElements, www.webelements.com/.


