2026 Mar 19;22(7):3761–3771. doi: 10.1021/acs.jctc.6c00044

PNcsp+: A Periodic Number-Based Crystal Structure Prediction Method Enhanced by Machine Learning

Cem Oran, Riccarda Caputo, Pierre Villars, Adem Tekin
PMCID: PMC13085239  PMID: 41854229

Abstract

Crystal structure prediction (CSP) is central to materials discovery, yet its efficiency and interpretability remain limited by the vast configurational space and reliance on costly local optimizations. Although template-based and machine-learning (ML) approaches have improved exploration, many still require large data sets, complex similarity metrics, or opaque generative pipelines. In this work, we introduce PNcsp+, an enhanced and chemically interpretable CSP framework that uses the Mendeleev Periodic Number (PN) as a transparent descriptor of elemental similarity. PNcsp+ expands the original implementation through a larger prototype library, an improved data management strategy, and ML-assisted prototype scoring that combines cutting-edge neural network models such as MACE, M3GNet, and ALIGNN-FF. Despite its simplicity, PNcsp+ reaches state-of-the-art performance. In evaluations on the CSPBench data set, a curated set of 180 benchmark crystal structures for assessing CSP methods, our approach surpasses alternative methods by achieving 86.1% space group accuracy and 85.0% structure matching accuracy within the Top-5 predictions, all without structure relaxations. Moreover, our case study on several hybrid systems, including ammonium and methylammonium cations, demonstrated that molecular components emerge autonomously in the predicted lattices, guided solely by PN-derived similarity relationships. Overall, PNcsp+ shows that fundamental periodic trends, combined with targeted ML-based evaluation, offer an efficient, scalable, and interpretable CSP framework, enabling accelerated discovery across both inorganic and hybrid chemical spaces.



Introduction

Recent advances in theoretical and computational methodologies have made it possible to systematically investigate an increasingly broad class of compounds. The primary motivation behind these efforts is the discovery of novel materials with properties precisely tailored for targeted applications, enabling technological breakthroughs. In this regard, crystal structure prediction (CSP) constitutes a central research area in materials science, aiming at identifying stable solid-state phases whose crystal structures may be difficult to determine experimentally. Predicting crystal structures directly through ab initio geometry optimization using widely employed density functional theory (DFT) packages such as VASP and CASTEP can deliver highly accurate results; however, because these calculations rely on local optimization schemes, the final structures depend strongly on the quality of the initial guess. Moreover, the computational cost of DFT calculations makes the exploration of configuration space impractical for complex systems. The integration of machine learning (ML) algorithms with high-performance computing infrastructures has significantly enhanced this process, allowing more accurate and efficient analyses of large structural and compositional spaces. However, relying entirely on “black-box” ML models has its own drawbacks, chiefly limited interpretability, which hinders the extraction of meaningful chemical insight.

A number of studies have proposed approaches that incorporate empirical chemical knowledge into the search process. For phases with unknown crystal structure representation, the most effective strategy is to explore phase space using a set of prototype structures and to evaluate the chemical similarities between constituent elements. A recent study introduced a comprehensive benchmark data set consisting of 180 test systems (the CSPBench set) and evaluated the performance of 13 CSP programs. The results demonstrated that template-based approaches achieve the most consistent and reliable predictive performance. Complementing this, another recent study evaluated the performance of the template-based method TCSP against two of the most prominent alternatives, CSPML and EquiCSP, on the same CSPBench data set. This study provided a unified perspective on the strengths and limitations of both template-driven and ML-based methods in CSP.

CSPML is a template-based algorithm that employs metric learning to guide the selection of prototype structures whose compositions are chemically “replaceable” with the target composition. Once promising templates are selected, CSPML performs elemental substitution and then carries out local relaxation on the substituted structures. This design situates CSPML as an ML extension of the traditional template-driven discovery pipelines. In contrast, EquiCSP is an equivariant diffusion-based generative model for CSP that builds crystal symmetries directly into its design. It ensures that the generated structures always respect the fundamental symmetry constraints, such as how atoms repeat across the lattice and how the crystal translates periodically. By doing so, EquiCSP produces more realistic crystal candidates and learns faster than earlier diffusion-based methods.

In parallel, TCSP 2.0 is an enhanced template-based CSP approach that upgrades its predecessor by replacing traditional oxidation state assignment with the deep learning BERTOS model, and by introducing element embedding distance metrics and a majority-voting scheme for the space group selection. These innovations, together with an expanded template library and CHGNet-based structure relaxation, enable TCSP 2.0 to achieve substantially higher accuracy in both structure matching and space group prediction compared to the previous CSP approaches.

In all these approaches, whether template-based or ML-driven, the ability to distinguish between chemically similar and dissimilar systems remains a central challenge. In both Wei et al. and Wang et al., such chemical similarity metrics were constructed through complex analyses of large data sets and intricate structural correlations. Recently, we have also developed a template-based CSP framework, PNcsp, which employs a similarity metric based on the Mendeleev Periodic Number (PN). In its current enhanced implementation, PNcsp+, the method integrates state-of-the-art ML models dedicated to prototype evaluation and an improved data-reduction pipeline. Moreover, the prototype inventory has been substantially expanded, and the overall data-management strategy has been improved.

In contrast to alternative approaches, PNcsp+ captures similarity relationships in a remarkably simple and interpretable manner. This conceptual simplicity constitutes the principal advantage of our method over existing alternatives. Whereas many existing methods require generating large numbers of potentially “similar” prototype systems and subsequently applying atomic substitutions, the PN-based approach drastically reduces the number of candidates that must be considered. Despite this streamlined process, PNcsp+ still surpasses alternative methods in accurately reproducing stable crystal structure configurations. Our framework was applied to 180 test systems included in the CSPBench set, and its performance was compared with the most successful CSP approaches, such as TCSP 2.0, CSPML, and EquiCSP. Furthermore, the method was tested on a set of challenging hybrid organic–inorganic compounds containing ammonium and methylammonium cations. Remarkably, the molecular cations emerged naturally within the predicted lattice frameworks based solely on PN-derived similarity relationships, highlighting the framework’s adaptability beyond purely inorganic chemistry.

Methodology

PN Concept and Overview of the PNcsp Approach

The Mendeleev PN is an ordering index assigned to chemical elements such that chemically similar elements appear consecutively. Although the atomic number is fundamental, it does not fully satisfy this objective, as it progresses period by period rather than grouping elements according to chemical similarity. The first formally defined periodic enumeration of this kind was introduced by Pettifor in 1984 through a phenomenological optimization aimed at separating binary AB compounds into distinct structure types. This ordering effectively traverses the periodic table group by group, leading to characteristic rearrangements, such as positioning Eu and Yb after Ca and Pb after Ga.

The Mendeleev PN enumeration, developed by Pierre Villars and coworkers and employed in this study (hereafter referred to simply as PN), is more directly derived from periodic trends. In this representation, elements are ordered group-wise from top to bottom and sequentially from left to right across the periodic table. Compared with conventional layouts, H is placed above the halogen group, while Be and Mg are positioned above group 12 (Zn, Cd, Hg) to better reflect chemical similarity.

In fact, the PN incorporates the principal quantum number (the period of the Periodic Table) and the valence shell configuration, while moving down along the group. In this scheme, elements with similar PN values exhibit closely related chemical properties. A detailed description of the systematic enumeration of all 118 chemical elements within the PN space is provided in the Supporting Information (SI).

By using the PN of the constituent elements of chemical systems of any order, we can draw a phase map, the PN-representation of the phase space, and identify crystal structure similarities and physical property trends in a straightforward way. Recently, we have demonstrated that the PN representation of the phase map of binary and higher-order chemical systems clearly reveals forming regions, where stable phases are observed, and nonforming regions, where no known phases form except for a small number of accepted violations. This representation, extendable from binary to higher-order systems, not only delineates stability domains with clarity but also reveals chemical trends and structural similarities across neighboring systems, including correlations in enthalpy of formation, mechanical properties, and prototypes.

The PN was first implemented as a similarity metric in our initial PNcsp algorithm, where it was used to predict the crystal structures of binary phases that are identified in binary phase diagrams but whose crystal structures have not yet been determined experimentally. Unlike conventional approaches, the PNcsp methodology exploits this ordering to formalize the notion of chemical interchangeability. Specifically, structural prototypes are generated by systematically replacing atoms within known crystalline motifs with alternative elements whose PN values indicate comparable elemental properties. Consequently, structural similarity assessment is (i) simple, relying on a single descriptor, and (ii) interpretable, being rooted in an explicit chemical rationale rather than in latent learned representations. This quantitative basis for substitution represents the central conceptual distinction between our approach and those reported previously by other groups.

The exploration of substitution pathways proceeds in an iterative manner. The search begins with the first-order nearest neighbors in the PN representation of the phase map and gradually extends to higher orders (second, third, and so on), thereby constructing a controlled expansion of candidate systems. The distance metric in PN space is defined as the absolute difference between the PNs of the constituent elements, $|\mathrm{PN}_1 - \mathrm{PN}_2|$. To restrict the chemical space to the most likely stable phases, only the phases with negative formation enthalpy constitute the pool of neighbor phases of a target phase. Once the potential similar phases are identified, the corresponding structure types are considered for the elemental substitution process.
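The order-by-order expansion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the PNcsp+ implementation: the function name `neighbor_shells` is hypothetical, the PN table is passed in by the caller, and it is an assumption here that a neighbor system of order n is one in which the largest per-element PN difference equals n.

```python
from itertools import product


def neighbor_shells(target, pn, max_order):
    """Yield (order, system) pairs around `target` in PN space.

    target    : tuple of element symbols, e.g. ("A", "B")
    pn        : dict mapping element symbol -> periodic number (caller-supplied)
    max_order : largest neighbor order to explore

    Assumption: a system belongs to order n when the largest |PN difference|
    over its constituent elements is exactly n.
    """
    by_pn = {value: symbol for symbol, value in pn.items()}
    target_pn = [pn[element] for element in target]
    for order in range(1, max_order + 1):
        offsets = range(-order, order + 1)
        for deltas in product(offsets, repeat=len(target)):
            if max(abs(d) for d in deltas) != order:
                continue  # belongs to a lower-order shell, already visited
            values = [p + d for p, d in zip(target_pn, deltas)]
            if any(v not in by_pn for v in values):
                continue  # stepped outside the PN table
            system = tuple(by_pn[v] for v in values)
            if len(set(system)) < len(system):
                continue  # two sites collapsed onto the same element
            yield order, system


# Toy PN table with placeholder symbols (not real PN values).
toy_pn = {f"E{i}": i for i in range(1, 7)}
first_order = [s for order, s in neighbor_shells(("E2", "E5"), toy_pn, 1)]
```

In a real workflow, each yielded system would then be looked up in the phase database, and only phases with negative formation enthalpy would be kept as template sources.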

Enhancements and New Functionalities in PNcsp+

As a major enhancement over the original implementation, PNcsp+ integrates an AI-based pre-evaluation module that markedly lowers the computational cost of traditional DFT-based screening and optimization steps. A review of the literature reveals that graph neural networks (GNNs), owing to their natural ability to capture structure–property relationships in molecular and covalently bonded systems, are among the most widely adopted and accurate ML approaches for predicting the energies and, consequently, the thermodynamic stabilities of crystalline materials. Among these, MEGNet and its extended variant M3GNet, which accounts for three-body interactions, have been successfully employed as surrogates for DFT calculations. Moreover, alternative models with distinct architectures, such as MACE and ALIGNN-FF, representing both equivariant and nonequivariant GNN designs, have also demonstrated high accuracy in crystal structure energy prediction tasks. Building upon these advances, the pre-evaluation module in PNcsp+ uses three of these state-of-the-art GNN models (MACE, ALIGNN-FF, and M3GNet) to perform rapid single-point energy predictions for the generated structural prototypes.

Given that publicly available pretrained versions of these models have consistently demonstrated near-DFT accuracy in reproducing energies and forces across diverse materials data sets, they are incorporated directly into the PNcsp+ workflow without additional fine-tuning. During the final selection stage, the predictions from the three GNN models can be combined through an ensemble averaging strategy to improve robustness and mitigate model-specific biases. In this approach, the predicted energies from the selected models (all three or a subset thereof) are averaged to obtain a consensus estimate, thereby stabilizing the ranking of candidate structures.
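A minimal sketch of the ensemble averaging step is given below. It assumes that each model has already produced one energy per candidate, expressed in eV/atom so that candidates with different cell sizes are comparable; the function name `ensemble_rank` and the input layout are hypothetical and are not taken from the PNcsp+ code.

```python
def ensemble_rank(energies_by_model, models=None):
    """Average per-candidate energies over the chosen models and rank candidates.

    energies_by_model : dict mapping model name -> list of energies (eV/atom),
                        with the same candidate order in every list
    models            : optional subset of model names to include

    Returns (consensus, order): consensus[i] is the mean energy of candidate i,
    and order lists candidate indices from lowest to highest mean energy.
    """
    chosen = list(models) if models is not None else list(energies_by_model)
    if not chosen:
        raise ValueError("at least one model is required")
    n_candidates = len(energies_by_model[chosen[0]])
    if any(len(energies_by_model[m]) != n_candidates for m in chosen):
        raise ValueError("all models must score the same candidates")
    consensus = [
        sum(energies_by_model[m][i] for m in chosen) / len(chosen)
        for i in range(n_candidates)
    ]
    order = sorted(range(n_candidates), key=lambda i: consensus[i])
    return consensus, order


# Made-up numbers, for illustration only.
scores = {
    "MACE": [-1.0, -2.0, -1.5],
    "M3GNet": [-1.2, -1.8, -1.9],
}
consensus, order = ensemble_rank(scores)
```

Passing `models=["MACE"]` reduces the function to single-model ranking, which mirrors the option of using all three models or a subset of them.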

Beyond single-point energy prediction, these models are also integrated into an optional structure relaxation pipeline, providing a scalable alternative to the DFT-based relaxation procedure employed in the original implementation. As an additional refinement aimed at reducing computational cost during the structure relaxation step, an optional redundant-prototype elimination process is incorporated through pymatgen’s StructureMatcher module prior to GNN-based assessment. This procedure effectively filters out structurally similar prototypes, ensuring that only unique candidates are passed forward for subsequent structure relaxation. This process enhances the diversity of the candidate pool, allowing the GNN models to focus on genuinely distinct structural configurations.
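The redundant-prototype elimination can be pictured as a greedy filter that keeps a candidate only if it is not similar to any candidate already kept. The sketch below is written against a generic `is_similar` callable so that it runs without external packages; with pymatgen installed, one plausible choice for that callable is the `fit` method of a `StructureMatcher` instance. The function name `unique_prototypes` is hypothetical, and the greedy strategy is an assumption rather than a description of the PNcsp+ code.

```python
def unique_prototypes(structures, is_similar):
    """Return the structures that are not similar to any earlier kept structure.

    structures : iterable of candidate structures (any object type)
    is_similar : callable (a, b) -> bool; with pymatgen this could be
                 StructureMatcher().fit (assumption, not verified against PNcsp+)
    """
    kept = []
    for candidate in structures:
        if not any(is_similar(candidate, reference) for reference in kept):
            kept.append(candidate)
    return kept


# Stand-in example: numbers play the role of structures, and two "structures"
# count as similar when they differ by less than 0.1.
toy_candidates = [1.0, 1.05, 2.0, 2.02, 3.0]
toy_unique = unique_prototypes(toy_candidates, lambda a, b: abs(a - b) < 0.1)
```

The filter performs at most one comparison per kept structure for each candidate, so its cost grows with the number of distinct prototypes rather than with the square of the full candidate count.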

It is important to note that both structure relaxation and redundant-prototype elimination are optional features and were not employed in the benchmarking analysis presented in this work, as detailed in the following section. Structure relaxation can further improve ranking accuracy, albeit at significantly increased computational cost. Accordingly, PNcsp+ incorporates relaxation as an optional stage, allowing users to balance computational efficiency against predictive refinement. The structure ranking may be applied either without relaxation, for rapid screening, or after relaxation, for enhanced energetic refinement. This design provides methodological flexibility, enabling users to tailor the workflow according to available resources and accuracy requirements.

Significant enhancements have also been implemented in the data-source architecture and data-management strategy. The original PNcsp framework was primarily designed for equimolar binary systems, relying mainly on a locally generated data set. Its applicability to more complex, nonequimolar, and multicomponent systems had been verified only through limited preliminary tests. In PNcsp+, we extended the pool of phases to include higher-order systems and any possible constitutions by adopting the Open Quantum Materials Database (OQMD) as the primary source. While numerous computational structure databases exist, OQMD offers distinct advantages owing to the sheer volume of its data (over 1.4 million ICSD-derived and hypothetical structures) and its focus on structural decorations. Rather than prioritizing curated properties of experimentally known compounds, OQMD systematically explores hypothetical chemical space by decorating common crystal prototypes (e.g., perovskites, Heusler alloys, and spinels) with a large variety of feasible elemental combinations.

Moreover, the extensive enumeration of stoichiometric variants enables smoother and more complete convex hull construction for thermodynamic stability analysis. For a candidate compound $A_xB_yC_z$, OQMD typically includes many competing nearby phases, allowing more reliable evaluation of the energy above hull ($E_\mathrm{hull}$) against a nearly exhaustive set of competitors, thereby functioning as a large-scale prototype reference space. Furthermore, unlike databases primarily accessed through web APIs, OQMD’s qmpy framework supports full local deployment, facilitating integration into high-throughput CSP workflows. In PNcsp+, local deployment enabled offline querying with minimal latency and without API limitations, eliminating data-access bottlenecks present in earlier implementations and enabling efficient large-scale structural screening. This design makes OQMD particularly well-suited for template-based CSP.

In addition, complementary repositories have been integrated, such as the Materials Project (MP), which is particularly strong in systems with organic components, and the Materials Platform for Data Science (MPDS), recognized as the world’s largest experimental materials database and fully curated by specialists in crystallography and materials science. Full access to the MPDS platform is reserved for registered users, whereas only a limited amount of content is publicly available. When activated, these auxiliary databases expand the search space and enhance the likelihood of identifying relevant and structurally analogous prototypes, further strengthening the robustness and flexibility of the PNcsp+ framework.

Figure 1 illustrates the multistep prediction process of PNcsp+. Initially, the program generates candidate structures by assessing crystal structure type similarity within the PN representation of the phase map. These candidates are subsequently refined by a GNN-based evaluator module, either through single-point energy prediction or optional structure relaxation, leading to the final selection of the top-n results. No additional structural feature engineering or chemical rule encoding is required. Regarding hyperparameters and user-defined settings, the framework requires only a small number of choices: the number of nearest neighbors to include, the GNN model used for evaluation, and whether structure relaxation is enabled. Thus, the workflow does not involve extensive hyperparameter tuning or trial-and-error optimization.

Figure 1. Workflow of prototype prediction with the PNcsp+ framework.

Benchmark Set and Evaluation

Performance Evaluation of the PNcsp+ Approach

To provide a rigorous assessment of PNcsp+, we conducted evaluations using the CSPBench benchmark set, which comprises 180 crystal structures. This data set, derived from the MP database, was specifically designed to challenge CSP algorithms across a wide spectrum of difficulties. The difficulty classification accounts for multiple factors, including space group diversity, template-based categories, and elemental composition that characterize distinct crystal families. Such a design ensures that the benchmark not only captures straightforward cases but also includes complex systems where structural prediction is inherently more demanding.

To assess the predictive performance of PNcsp+ on the 180 test systems in comparison with other programs, the evaluation was carried out using two complementary criteria, following the approach of Wei et al.: (i) the success rate in reproducing the correct space group, which reflects how accurately the algorithm captures crystallographic symmetry, and (ii) a crystal structure similarity analysis, which quantifies the degree of correspondence between predicted configurations and their reference structures. Crystal structural similarity was evaluated using the StructureMatcher class from the pymatgen library, employing its default tolerance parameters (ltol = 0.2, stol = 0.3, angle_tol = 5).

Performance of GNN Models under PNcsp+

Given the broad scope of the benchmark set, OQMD was employed as the sole data source during the evaluation. To maintain computational tractability, the nearest neighbor order was restricted to the fourth order for each constituent element, thereby effectively constraining the search radius in the phase map. If no suitable candidates were identified within the fourth order, the search was extended up to the sixth order (a condition required only for a small number of systems). This controlled neighborhood enlargement ensures that the algorithm balances chemical diversity with chemical property similarities while still capturing a sufficiently rich set of substitutional candidates.
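The controlled enlargement of the neighborhood can be expressed as a short fallback loop. The sketch below assumes that the search is widened one order at a time from four up to six and stops at the first order that returns candidates; the text does not state whether order five is tried as an intermediate step, so that detail is an assumption. The helper name `search_with_fallback` and the `find_candidates` callable are hypothetical.

```python
def search_with_fallback(find_candidates, base_order=4, max_order=6):
    """Search at base_order and widen the neighborhood only if nothing is found.

    find_candidates : callable taking a neighbor order and returning a list of
                      candidate prototypes (hypothetical interface)
    Returns (order_used, candidates); candidates is empty if every order fails.
    """
    for order in range(base_order, max_order + 1):
        candidates = find_candidates(order)
        if candidates:
            return order, candidates
    return max_order, []


# Stand-in search function: pretends that prototypes only appear at order 6.
def fake_search(order):
    return ["prototype-x"] if order >= 6 else []


order_used, found = search_with_fallback(fake_search)
```

Keeping the default at four and widening only on failure means that the more expensive sixth-order search is paid for only by the few systems that need it.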

Prototype evaluation was carried out through single-point energy predictions without structure relaxation, employing both individual GNN models and their ensemble average. Structure relaxation was intentionally excluded from the CSPBench benchmarking to emphasize a key strength of the PNcsp+ framework: the template-based search generates high-quality initial structures that are already very close to their optimized geometries. This allows reliable ranking of candidate structures using single-point energies alone. In addition, for large-scale screening tasks, such as CSPBench (180 systems spanning varying levels of structural complexity), omitting structure relaxation yields substantial computational savings while preserving competitive predictive accuracy.

Table 1 summarizes the percentage of cases in which PNcsp+ successfully identified the target structure within the top-n predictions when different GNN models were used as scoring functions. Performance of the models was compared using three evaluation metrics: (i) Space Group matching (SG), (ii) StructureMatcher matching (SM), and (iii) simultaneous agreement in both Space Group and StructureMatcher metrics (Both). Among the individual models tested, MACE (MPA-0 middle) consistently outperformed the others in the top-3 and top-5 categories, while M3GNet exhibited lower but still comparable performance in these two cases. In top-10 predictions, M3GNet shows slightly better performance than MACE in metrics based on structure similarity, whereas both models achieve the same matching rates in the Space Group metric. On the other hand, ALIGNN-FF performed significantly worse, ranking last across all metrics. These observations are consistent with the mean absolute errors obtained in predicting DFT reference total energies for the CSPBench data set, as reported by the Materials Project, with values of 0.185 eV/atom for MACE, 0.194 eV/atom for M3GNet, and 2.650 eV/atom for ALIGNN-FF. Moreover, we observed that incorporating ALIGNN-FF into the ensemble averaging did not enhance the overall ranking quality. Based on these findings, the ensemble strategy was constructed by averaging the single-point energy predictions from the two best-performing models, MACE and M3GNet. Under this configuration, the ensemble model yielded the highest accuracy in the top-5 category and the second-highest performance across the remaining categories.

Table 1. Prediction Performance Comparison of PNcsp+ and Its Variants Using Four Nearest Neighbors (a)

                 TOP-10                   TOP-5                    TOP-3
PNcsp+ Model     SG      SM      Both     SG      SM      Both     SG      SM      Both
Ensemble (b)     90.56   86.11   85.56    86.11   85.00   82.22    82.22   82.78   78.89
MACE             90.56   86.11   85.56    85.56   85.00   81.67    83.33   83.33   79.44
M3GNet           90.56   86.67   86.11    84.44   83.33   81.11    78.33   80.56   75.56
ALIGNN-FF        87.78   80.00   78.33    76.11   72.78   70.00    62.78   64.44   59.44

(a) Numerical values are reported as percentages. “SG” refers to matches based on space group symmetry, “SM” to matches determined by StructureMatcher, and “Both” to predictions meeting both criteria.

(b) The ensemble evaluation is based on the single-point energy predictions of M3GNet and MACE models.
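The SG, SM, and Both percentages can be computed from per-system ranked match flags, as in the sketch below. It assumes that “Both” requires a single candidate within the top n to satisfy the space group and StructureMatcher criteria at the same time; the function name `top_n_rates` and the input layout are hypothetical.

```python
def top_n_rates(results, n):
    """Compute top-n success rates (in percent) over a benchmark.

    results : one ranked list per benchmark system; each entry is a tuple
              (sg_match, sm_match) of booleans for one predicted candidate,
              ordered from best to worst rank. Systems with no candidates
              are represented by an empty list and count as failures.
    """
    total = len(results)
    if total == 0:
        raise ValueError("results must contain at least one system")
    sg = sum(any(sg_ok for sg_ok, _ in ranked[:n]) for ranked in results)
    sm = sum(any(sm_ok for _, sm_ok in ranked[:n]) for ranked in results)
    # Assumption: the same candidate must pass both tests to count as "Both".
    both = sum(any(sg_ok and sm_ok for sg_ok, sm_ok in ranked[:n]) for ranked in results)
    return {
        "SG": 100.0 * sg / total,
        "SM": 100.0 * sm / total,
        "Both": 100.0 * both / total,
    }


# Toy benchmark with four systems (illustrative flags only).
toy_results = [
    [(True, False), (True, True)],
    [(False, True)],
    [],
    [(False, False), (False, False), (True, True)],
]
toy_top2 = top_n_rates(toy_results, 2)
```

On a 180-system benchmark each system contributes about 0.56 percentage points, so a rate of 86.11% corresponds to 155 successful systems.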

In Table 1, the accuracy of PNcsp+ is assessed based on its top-3, top-5, and top-10 predictions, with consideration given to the characteristics of the data set. Overall, 20% of the test structures in the data set belong to the polymorph category, consisting of structures that do not always correspond to the lowest-energy (i.e., thermodynamically most stable) configurations reported in the MP database. Consequently, evaluating PNcsp+ solely on the basis of its top-1 prediction would not provide a fair measure of performance, since in this category the target structures may not correspond to the ground-state configurations. To enable a more rigorous evaluation of top-1 prediction accuracy, the polymorph category was extended by incorporating additional polymorphs reported in the MP database, and prediction performance was examined for both the original test set and the extended data set. Notably, inspection of the results revealed that the first-ranked predictions produced by PNcsp+ frequently corresponded to ground-state configurations, even within this challenging category. Using the best-performing model, MACE, PNcsp+ achieved SG: 62.78%, SM: 73.89%, and Both: 61.67% on the original data set, while the performance improved to SG: 71.11%, SM: 81.67%, and Both: 69.44% on the extended data set. In contrast, M3GNet and ALIGNN-FF exhibited significantly lower performance (further details are provided in the SI).

With these settings, PNcsp+ completes the structural screening on a standard server equipped with a 32-core AMD processor in approximately 3 h (1 min per system on average). Based on our systematic tests, the use of four nearest neighbors provides an optimal balance between accuracy and efficiency, while increasing the order beyond this threshold yields no significant improvements in predictive accuracy. As shown in Table 2, the screening time for the benchmark set of 180 systems increases from ∼1.5 h when considering two neighbors to ∼3 h for four neighbors, and rises sharply to ∼15 h when the nearest neighbor order is extended to six. The number of candidate prototypes processed by the structure evaluation module strongly depends on the chemical system under consideration. Table 2 reports the average number of candidates per system across the 180 benchmark systems for different nearest neighbor settings. In contrast to other prominent template-based approaches, which typically evaluate thousands of templates, PNcsp+ operates on a much smaller and more refined candidate pool, owing to its neighbor-order–guided pre-elimination strategy.

Table 2. Computation Time and Number of Evaluated Systems for Different Nearest Neighbor Configurations

Search Configuration   Average Candidates per System   Search Time   Ensemble Evaluation Time (a)   Total Time
2 Neighbors            78                              18 min        1 h 8 min                      1 h 26 min
4 Neighbors            242                             23 min        2 h 42 min                     3 h 5 min
6 Neighbors            677                             1 h 25 min    13 h 24 min                    14 h 59 min

(a) The ensemble evaluation is based on the single-point energy predictions of M3GNet and MACE models.
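As a rough back-of-the-envelope reading of Table 2, the reported times can be converted into a per-system wall time and an approximate evaluation cost per candidate. The sketch below only rearranges the tabulated values; the derived per-candidate figure is an estimate that also absorbs any per-system overhead, and the helper names are hypothetical.

```python
N_SYSTEMS = 180  # size of the CSPBench benchmark set

# neighbor order -> (average candidates per system,
#                    ensemble evaluation time in minutes,
#                    total time in minutes), as reported in Table 2
TABLE_2 = {
    2: (78, 68, 86),     # 1 h 8 min evaluation, 1 h 26 min total
    4: (242, 162, 185),  # 2 h 42 min evaluation, 3 h 5 min total
    6: (677, 804, 899),  # 13 h 24 min evaluation, 14 h 59 min total
}


def per_system_minutes(order):
    """Average total wall time per benchmark system, in minutes."""
    return TABLE_2[order][2] / N_SYSTEMS


def seconds_per_candidate(order):
    """Approximate ensemble evaluation time per candidate, in seconds."""
    avg_candidates, eval_minutes, _ = TABLE_2[order]
    return eval_minutes * 60 / (avg_candidates * N_SYSTEMS)


summary = {
    order: (per_system_minutes(order), seconds_per_candidate(order))
    for order in TABLE_2
}
```

For four neighbors this gives roughly one minute per system, in line with the average quoted in the text, and an evaluation cost on the order of a few tenths of a second per candidate.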

In terms of computational efficiency, M3GNet exhibited by far the highest prediction speed. MACE achieved the second-best performance but remained considerably slower than M3GNet, with prediction times approximately three times longer. This apparent performance gap, however, was partially influenced by the numerical precision used in the evaluation: M3GNet and ALIGNN-FF were run with their default float32 data type, whereas MACE was evaluated with the recommended float64 precision. When the precision of MACE was reduced to float32, its computational speed improved noticeably, resulting in an approximately 40–50% reduction in runtime. ALIGNN-FF, meanwhile, showed the lowest computational efficiency, operating about four times slower than M3GNet.

In light of these observations, it can be concluded that, for performance-critical large-scale screening tasks, replacing the ensemble strategy with a single high-performing model such as M3GNet can significantly reduce the runtime of the prototype evaluation stage. In addition, it is worth noting that, in this performance benchmark, the GNN-based evaluation module was initialized from scratch for each individual system. In practical large-scale structure screening, deploying the evaluation module once and reusing it across the entire target system pool would provide additional performance gains.
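The suggestion of deploying the evaluation module once and reusing it can be illustrated with a cached loader. In the sketch below, `load_model` is a placeholder standing in for an expensive model initialization and returns a dummy scorer; it is not a real MACE, M3GNet, or ALIGNN-FF call, and all names are hypothetical.

```python
from functools import lru_cache

LOAD_COUNT = {"n": 0}  # counts how many times the expensive load runs


def load_model(model_name):
    """Placeholder for an expensive model initialization (hypothetical)."""
    LOAD_COUNT["n"] += 1

    def score(structure):
        # Dummy scorer; a real evaluator would return a predicted energy.
        return 0.0

    return score


@lru_cache(maxsize=None)
def get_evaluator(model_name):
    """Load each model at most once per process and reuse it afterwards."""
    return load_model(model_name)


def screen(systems, model_name):
    """Score every candidate of every system with one shared evaluator."""
    results = {}
    for name, candidates in systems.items():
        evaluator = get_evaluator(model_name)  # cached after the first call
        results[name] = [evaluator(c) for c in candidates]
    return results


toy_systems = {"sys-1": ["a", "b"], "sys-2": ["c"], "sys-3": []}
toy_scores = screen(toy_systems, "model-x")
```

With this pattern the initialization cost is paid once per model rather than once per target system, which is the source of the additional gain mentioned above.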

The efficiency of PNcsp+ arises from its targeted screening strategy, which selectively yields prototypes exhibiting the most compatible structural characteristics (such as lattice parameters, bonding types, and interatomic distances) while ensuring thermodynamic stability. Low-quality candidates are systematically filtered out prior to the atomic substitution stage, thereby avoiding unnecessary computational overhead. In the subsequent step, the structure evaluation module further helps to refine the candidate pool and rapidly identify the most promising prototypes. This hierarchical selection ensures that only the most physically and chemically plausible structures advance to the final stage. Owing to this multilevel filtering, the need for the subsequent geometry optimization (either DFT- or ML-based) is significantly reduced. Remarkably, the predicted structures closely matched their experimentally determined counterparts even in the absence of structure relaxation, highlighting the intrinsic accuracy and computational efficiency of the PNcsp+ framework.

Comparison with Different CSP Algorithms

In line with the recent comparative study by Wei et al., which evaluated the performance of their template-based program TCSP 2.0 against two leading alternatives, CSPML and EquiCSP, we adopted their published results as reference data to enable a direct and transparent comparison with our own method, PNcsp+. Notably, while the competing methods employed CHGNet-based structure relaxation, our evaluation relies exclusively on the unrelaxed structures generated by PNcsp+, ranked via an ensemble of single-point energy predictions from M3GNet and MACE.

Figure 2 compares the performance of the CSP algorithms for top-5 predictions using the same three evaluation metrics as in Table 1. PNcsp+ (Ensemble) attains the highest accuracy, achieving 86.1% in space group matching, 85.0% in structure matching, and 82.2% in the combined criterion. It outperforms its closest competitor, TCSP 2.0, by approximately 7 percentage points in structure matching (TCSP 2.0 achieves 83.9%, 78.3%, and 75.0% for the three metrics, respectively). In contrast, EquiCSP and CSPML perform considerably lower, indicating larger deviations from the target structures. The small difference between the blue and red bars for PNcsp+ suggests strong agreement not only in symmetry but also in full atomic structure. This comparison demonstrates that PNcsp+ outperforms alternative CSP frameworks in both symmetry-based and atomic-level structure matching. For top-1 predictions on the original data set, PNcsp+ outperforms the leading competitor, TCSP 2.0, in the structure matching metric (SM: 73.9% for PNcsp+ vs 68.3% for TCSP 2.0). However, TCSP 2.0 achieves higher performance in the other two evaluation metrics (SG: 62.8% for PNcsp+ vs 70.6% for TCSP 2.0; Both: 61.4% for PNcsp+ vs 61.7% for TCSP 2.0).

Figure 2. Performance comparison of PNcsp+ and other CSP approaches on the CSPBench set within the Top-5 category.

Across the full benchmark, PNcsp+ failed to identify any viable prototype candidates for five systems (Co4NiSb12, Fe2Cu6SnS8, MgV4SnO12, YbH3CN3, and Tb4Al) for distinct reasons. Regarding Co4NiSb12, the reported composition appears to be inconsistent: the phase may contain three, rather than four, formula units. Furthermore, inspection of the phase diagrams available in the Linus Pauling File (LPF) database (MPDS is the primary online access point and platform for LPF), particularly the isothermal sections at 813 and 873 K, indicates that no ternary compounds form in this system. The ternary phase listed in CSPBench is therefore likely a solid solution based on the binary CoSb3 phase (cubic modification) alloyed with Co–Ni, rather than a genuine ternary compound.

For MgV4SnO12 and Tb4Al, neither structural information nor constitution data are available in the LPF database. In the case of Tb–Al, only binary aluminide phases of rare-earth elements are documented in the Al-rich regionsuch as the trialuminides of Ho or Dysuggesting that the ternary or more complex phases listed in CSPBench may not be experimentally confirmed.

Interestingly, Fe2Cu6SnS8 and YbH3CN3 are reported in the LPF databasethe former crystallizing in a tetragonal phase and the latter in a hexagonal structure belonging to the family of rare-earth metal guanidinates, containing the functional group [C–N3–H3]2–. However, because both compounds reside in sparsely populated regions of the OQMD-derived phase map, PNcsp+ was unable to retrieve or match suitable prototype candidates, despite the availability of corresponding structural information in LPF. This underscores the value of incorporating LPF into PNcsp+’s data inventory, as targeted searches over a small set of systems can enhance prototype-matching accuracy while adding only minimal search overhead.

Among the structures that the StructureMatcher classified as “not similar”, several cases lie near the similarity threshold. For example, KCuCl3 and K3MnO4 exhibit borderline average root-mean-square (RMS) distances, yet upon local relaxation using the MACE potential, their predicted structures evolve into forms that closely match the target structures. PNcsp+ also identified alternative polymorphs for some systems. As an example, LuSeO3F has three known polymorphs in the MP database (two of which are reported as experimental structures): two are monoclinic (P121/m1 and P121/c1) and one is triclinic (P1̅). Although PNcsp+ did not recover the specific target structure in the benchmark set (the ground-state P121/m1 phase), it successfully predicted the other two polymorphs. The reported formation energies are nearly degenerate, with only ∼0.002 eV/atom separating the triclinic and ground-state structures. Structural analysis shows that the triclinic phase represents a slightly distorted variant of the ground state, consistent with the distortion pattern obtained by PNcsp+. Thus, the predicted structures remain very close to the energetic and structural ground-state landscape despite the absence of an exact match. This behavior highlights PNcsp+’s ability to navigate physically meaningful structural variants even when the exact target prototype is not captured.
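To make the notion of a borderline RMS distance concrete, the sketch below computes a normalized RMS displacement between two structures whose sites are already paired one to one. It is a deliberately simplified illustration, assuming an orthorhombic cell and a known site pairing; it is not the StructureMatcher algorithm, which additionally searches over lattice mappings and site permutations. The function name and the toy coordinates are hypothetical.

```python
import math


def normalized_rms(frac_a, frac_b, cell_lengths):
    """RMS displacement between paired sites, divided by (V / N)^(1/3).

    frac_a, frac_b : lists of fractional coordinates (x, y, z), same order.
    cell_lengths   : (a, b, c) of an orthorhombic cell in angstrom (assumption).
    """
    if len(frac_a) != len(frac_b) or not frac_a:
        raise ValueError("both structures need the same, non-zero number of sites")

    sq_sum = 0.0
    for site_a, site_b in zip(frac_a, frac_b):
        for xa, xb, length in zip(site_a, site_b, cell_lengths):
            d = xa - xb
            d -= round(d)  # minimum-image convention in fractional space
            sq_sum += (d * length) ** 2

    n_sites = len(frac_a)
    volume = cell_lengths[0] * cell_lengths[1] * cell_lengths[2]
    free_length = (volume / n_sites) ** (1.0 / 3.0)  # average length per atom
    return math.sqrt(sq_sum / n_sites) / free_length


# Toy example: a two-atom cubic cell (4 angstrom) with one atom shifted by 0.1 in x.
cell = (4.0, 4.0, 4.0)
target = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
predicted = [(0.1, 0.0, 0.0), (0.5, 0.5, 0.5)]

rms_identical = normalized_rms(target, target, cell)
rms_shifted = normalized_rms(target, predicted, cell)
print(f"identical: {rms_identical:.4f}, shifted: {rms_shifted:.4f}")
```

Because the value is dimensionless, a single cutoff can be applied across cells of different sizes, which is why a small distortion can push an otherwise correct prediction just over the threshold.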

A Case Study beyond Inorganic Systems

In this section, we applied PNcsp+ to a set of structurally and chemically diverse systems, ranging from hybrid organic–inorganic perovskites (HOIPs) with halides to ammonium and antiperovskite salts, that represent some of the most challenging classes of ionic and hybrid materials. CSP for these kinds of materials is challenging due to the coexistence of ionic, covalent, and hydrogen-bonding interactions, along with orientational disorder of molecular or polyatomic cations. In addition, limited information about them is available in the structure databases. Therefore, these systems serve as ideal test cases to evaluate PNcsp+’s capability in accurately predicting complex lattice architectures governed by competing interactions and compositional flexibility. To this end, we focused on representative compounds, including perovskite and antiperovskite structures: methylammonium lead iodide, CH3NH3PbI3, and some of the common precursor methylammonium halides, CH3NH3X (MAX), X = Cl, Br, I, as well as ammonium lead iodide, NH4PbI3; ammonium potassium sulfate, NH4KSO4; and sodium-rich chlorosulfate, Na3SO4Cl.

These compounds collectively exemplify a broad class of ionic and hybrid crystals with exceptional functional properties. Their significance arises from their diverse physicochemical behavior. For instance, HOIPs with halides have revolutionized the field of photovoltaics due to their strong light absorption and long carrier diffusion lengths, while ammonium and alkali-metal salts are widely used for their functional properties in areas such as agriculture (fertilizers), energy storage (batteries), food processing (additives and preservatives), and as raw materials for the chemical industry (glass, soap, and bulk chemicals). Sodium-rich antiperovskites, in particular, are considered promising solid electrolytes for all-solid-state Na-ion batteries because of their intrinsic fast-ion mobility and structural stability under varying chemical environments.

PNcsp+ successfully proposed multiple polymorphic phases for all these systems through a fully template-based strategy, without manually defined rules to specify which atoms constitute molecular subunits in hybrid systems. Given the structural complexity and the strong dependence of stability on the atomic configuration, all predicted candidates were further refined via DFT-based geometry optimization, and their enthalpies of formation were computed to assess thermodynamic stability. Remarkably, all predicted structures were found to be thermodynamically stable, as shown in Table 3. The DFT calculations employed the CASTEP code with the PBEsol generalized gradient approximation (GGA) exchange-correlation functional and on-the-fly generated pseudopotentials of the 80.otfg type. Figures 3–5 show the corresponding optimized crystal structures. Notably, the results demonstrate that both ammonium and methylammonium molecular cations are constructed within the predicted frameworks solely based on the PN neighborhood relationships, without any explicit predefined molecular templates. The perovskite frameworks are also reproduced, as clearly observed in the cases of CH3NH3PbI3 and NH4PbI3, where the methylammonium and ammonium cations occupy the cavities between the corner-sharing octahedra formed by the Pb–I network (Figure 3).

Table 3. Calculated Enthalpies of Formation for the Selected Systems

Chemical System   Space Group   ΔfH [eV/atom]
CH3NH3I           P21/m         −0.440
CH3NH3I           R3m           −0.438
CH3NH3I           Pbcm          −0.434
CH3NH3Cl          R3m           −0.519
CH3NH3Cl          P21/m         −0.514
CH3NH3Cl          Pbcm          −0.505
CH3NH3Br          R3m           −0.488
CH3NH3Br          P21/m         −0.487
CH3NH3Br          Pbcm          −0.478
CH3NH3PbI3        Pnma          −0.541
CH3NH3PbI3        Pm            −0.535
NH4PbI3           P21           −0.684
NH4PbI3           R3            −0.661
NH4KSO4           P21/c         −1.364
NH4KSO4           Pna21         −1.363
NH4KSO4           Pnma          −1.363
NH4KSO4           Pca21         −1.352
Na3SClO4          P4/nmm        −1.965
Na3SClO4          R3m           −1.922
Na3SClO4          P4̅3m          −1.922
Na3SClO4          I4/mcm        −1.900
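The relative ordering of the polymorphs can be read off the tabulated enthalpies directly. The sketch below, which only re-uses the values listed in the table, groups the entries by chemical system and reports each polymorph's gap to the lowest-enthalpy candidate in meV/atom. The function name `rank_polymorphs` is illustrative, and the space group with an overbar is written in ASCII as `P-43m`.

```python
from collections import defaultdict

# (chemical system, space group, enthalpy of formation in eV/atom), from the table.
TABLE_3 = [
    ("CH3NH3I", "P21/m", -0.440), ("CH3NH3I", "R3m", -0.438), ("CH3NH3I", "Pbcm", -0.434),
    ("CH3NH3Cl", "R3m", -0.519), ("CH3NH3Cl", "P21/m", -0.514), ("CH3NH3Cl", "Pbcm", -0.505),
    ("CH3NH3Br", "R3m", -0.488), ("CH3NH3Br", "P21/m", -0.487), ("CH3NH3Br", "Pbcm", -0.478),
    ("CH3NH3PbI3", "Pnma", -0.541), ("CH3NH3PbI3", "Pm", -0.535),
    ("NH4PbI3", "P21", -0.684), ("NH4PbI3", "R3", -0.661),
    ("NH4KSO4", "P21/c", -1.364), ("NH4KSO4", "Pna21", -1.363),
    ("NH4KSO4", "Pnma", -1.363), ("NH4KSO4", "Pca21", -1.352),
    ("Na3SClO4", "P4/nmm", -1.965), ("Na3SClO4", "R3m", -1.922),
    ("Na3SClO4", "P-43m", -1.922), ("Na3SClO4", "I4/mcm", -1.900),
]


def rank_polymorphs(rows):
    """Map each system to [(space group, gap to lowest enthalpy in meV/atom), ...]."""
    by_system = defaultdict(list)
    for system, space_group, enthalpy in rows:
        by_system[system].append((enthalpy, space_group))

    ranked = {}
    for system, entries in by_system.items():
        entries.sort()  # most negative enthalpy first
        lowest = entries[0][0]
        ranked[system] = [
            (space_group, round((enthalpy - lowest) * 1000))
            for enthalpy, space_group in entries
        ]
    return ranked


ranked = rank_polymorphs(TABLE_3)
for system, entries in ranked.items():
    summary = ", ".join(f"{sg} (+{gap} meV)" for sg, gap in entries)
    print(f"{system:11s} {summary}")
```

Most gaps are only a few meV/atom, so the ordering within a system should be read as a set of competing candidates rather than a decisive ranking.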

Figure 3. Optimized structures of CH3NH3PbI3 (a) and NH4PbI3 (b) perovskites and CH3NH3I (c) salt representing the typical lattice arrangement of methylammonium halides. PbI6 octahedra are highlighted in gray within corresponding structures.

Figure 5. Optimized perovskite structures of Na3SClO4. SO4 tetrahedra are highlighted in yellow, and the ClNa3 octahedra are highlighted in green, with Na atoms depicted in orange.

Among these systems, CH3NH3PbI3 has attracted the most extensive attention owing to its technological and industrial importance. In the literature to date, three modifications are reported for CH3NH3PbI3: the cubic phase stable above room temperature, the tetragonal phase stable around room temperature and below, and a low-temperature phase with orthorhombic structure. Using the methodology described in the present work, we found two possible stable modifications of CH3NH3PbI3 with monoclinic and orthorhombic symmetry, the latter being slightly lower in energy than the former, as reported in Table 3. The monoclinic modification, with Pearson symbol mP12 and space group number 6, can be interpreted as a distorted perovskite structure, where the center of mass of the methylammonium cations sits on the corner of a distorted cubic lattice, the (1a) position of the ideal cubic perovskite structure, and Pb atoms occupy the body-centered position of the lattice. Similarly, the orthorhombic modification (oP48, space group 62) is a variation of the orthorhombic perovskite phase with prototype GdFeO3 (oP20, space group 62), adopted by CaTiO3 in the room-temperature phase.
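The Pearson symbols quoted above encode the crystal family, the lattice centering, and the number of atoms in the conventional cell, so the number of formula units per cell follows by simple division. The sketch below illustrates this under the assumption that all atoms, including hydrogen, are counted in the symbol; the helper names are hypothetical.

```python
import re

# First letter of a Pearson symbol: crystal family.
CRYSTAL_FAMILY = {
    "a": "triclinic",
    "m": "monoclinic",
    "o": "orthorhombic",
    "t": "tetragonal",
    "h": "hexagonal",
    "c": "cubic",
}


def parse_pearson(symbol):
    """Split a Pearson symbol such as 'oP48' into (family, centering, atom count)."""
    match = re.fullmatch(r"([amothc])([PSABCIFR])(\d+)", symbol)
    if match is None:
        raise ValueError(f"not a valid Pearson symbol: {symbol!r}")
    family, centering, count = match.groups()
    return CRYSTAL_FAMILY[family], centering, int(count)


def formula_units(symbol, atoms_per_formula_unit):
    """Number of formula units per cell implied by the Pearson symbol."""
    _, _, atoms_in_cell = parse_pearson(symbol)
    z, remainder = divmod(atoms_in_cell, atoms_per_formula_unit)
    if remainder:
        raise ValueError("atom count is not a multiple of the formula unit")
    return z


# CH3NH3PbI3 has 12 atoms per formula unit (C + N + 6 H + Pb + 3 I);
# GdFeO3 has 5 atoms per formula unit.
for symbol, atoms, label in (
    ("mP12", 12, "CH3NH3PbI3, monoclinic"),
    ("oP48", 12, "CH3NH3PbI3, orthorhombic"),
    ("oP20", 5, "GdFeO3 prototype"),
):
    print(f"{label:26s} {parse_pearson(symbol)}  Z = {formula_units(symbol, atoms)}")
```

Under this counting, the orthorhombic modification and the GdFeO3 prototype both contain four formula units per cell, which is consistent with describing the former as a variation of the latter, while the monoclinic cell contains a single formula unit.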

Notably, PNcsp+ successfully identified structures that closely match those described in previous studies. Regarding the precursor salts, the MP database reports two structures for CH3NH3I with Pbcm and P21 space groups, while CH3NH3Cl has a single entry with Pbcm symmetry and no entry is available for CH3NH3Br (similarly, no entries exist for these systems in OQMD). Remarkably, PNcsp+ not only recovered these reported structures but also predicted an additional polymorph with R3m symmetry, which exhibits the lowest enthalpy of formation for both the CH3NH3Cl and CH3NH3Br systems.

The rhombohedral phase of methylammonium halides is derived from the low-temperature (123 K) structure of methylammonium fluoride, which was initially determined experimentally and later examined computationally for phase stability.

For NH4PbI3, experimental reports indicate that the compound primarily adopts an orthorhombic crystal structure with Pnma symmetry, while the MP database lists an additional Cm polymorph. PNcsp+ identified two competitive alternatives, a monoclinic (P21) and a trigonal (R3) phase, whose enthalpies of formation are comparable to that of the experimentally observed orthorhombic phase, reported as −0.675 eV per atom.

Regarding the crystal structure of NH4KSO4, two orthorhombic phases have been reported in two previous studies: one with Pnma symmetry and a slightly modified variant with Pna21 symmetry. PNcsp+ not only reproduced these reported structures but also predicted two additional modifications with P21/c and Pca21 symmetries, the former exhibiting a slightly lower enthalpy of formation than the others. As shown in Figure 4, SO4 tetrahedra coordinate with potassium atoms via oxygen atoms, forming the fundamental inorganic framework. The ammonium ions are linked to this framework through an extensive network of hydrogen bonds. In the lower two structures, exhibiting Pnma and Pna21 symmetries, the crystal structure adopts a compact and highly interconnected topology. In contrast, the P21/c and Pca21 variants display a more open framework, where the enlarged voids accommodate the ammonium cations within the resulting cavities.

Figure 4. Optimized structures of NH4KSO4. SO4 tetrahedra are highlighted in yellow.

For Na3SClO4, although an experimental study has confirmed the formation of the compound based on X-ray diffraction patterns, no direct structural information has been reported. A prior theoretical study proposed a hypothetical cubic structure with Fm3̅m symmetry and suggested its stability. PNcsp+, however, uncovered four distinct structural candidates: cubic (P4̅3m), tetragonal (P4/nmm and I4/mcm), and trigonal (R3m) phases, with the P4/nmm configuration exhibiting the lowest enthalpy of formation (Table 3). It is worth noting that the Fm3̅m structure proposed in the study of Xu et al. possesses a significantly higher enthalpy of formation than the PNcsp+ candidates. Figure 5 shows that two types of polyhedra interconnect to stabilize the lattice: the smaller polyhedra formed by sulfur and oxygen are encapsulated within a framework composed of chlorine and sodium (located interior), yielding a balanced and cohesive structure.

Conclusion

PNcsp+ offers a chemically interpretable and computationally efficient route for CSP by harnessing the Mendeleev PN to quantify elemental similarity. Its hierarchical screening across multiple databases, coupled with a prototype evaluation pipeline that integrates diverse GNN models, enables the systematic identification of structurally consistent and thermodynamically stable candidates with minimal computational overhead.

Comprehensive benchmarking against established CSP approaches reveals that PNcsp+ surpasses existing methods in identifying structurally and energetically plausible prototypes for inorganic systems with a diverse range of complexity. PNcsp+ achieved its best performance with the ensemble averaging scheme, attaining 86.1% accuracy in space group classification and 85.0% in structure matching within the Top-5 predictions without reliance on structure relaxation. Furthermore, its successful application to hybrid organic–inorganic compounds highlights the framework’s versatility beyond purely inorganic chemistry. Applications to hybrid systems, including ammonium and methylammonium compounds, demonstrate that the molecular cations emerge naturally within the predicted lattice frameworks, guided solely by PN-derived neighborhood relationships and without reliance on manually defined molecular templates, while accurately reproducing the underlying crystalline architectures. By capturing fundamental structure–chemistry relationships, PNcsp+ shows that periodic trends can be leveraged to accelerate materials discovery while maintaining high predictive accuracy.

Supplementary Material

ct6c00044_si_001.pdf (6.6MB, pdf)

Acknowledgments

The work was financially supported by Material Phases Data Systems (MPDS) and the İstanbul Technical University BAP (Scientific Research Projects Coordination) Office under Grant Number TDK-2025-47476. Computing resources were provided by the National Center for High Performance Computing of Turkey (UHEM) under Grant Number 1002132012.

An open-source software implementation of our similarity-based initial crystal structure prediction method is available at https://github.com/tccdem/PNcsp_Plus.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.6c00044.

  • A detailed description of the PN concept, Top-1 performance metrics of PNcsp+, and more information about the CSP for LuSeO3F are provided (PDF)

The authors declare no competing financial interest.

References

  1. Kresse G., Furthmüller J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B. 1996;54:11169–11186. doi: 10.1103/PhysRevB.54.11169.
  2. Clark S. J., Segall M. D., Pickard C. J., Hasnip P. J., Probert M. J., Refson K., Payne M. C. First principles methods using CASTEP. Z. Kristallogr. 2005;220:567–570. doi: 10.1524/zkri.220.5.567.65075.
  3. Legrain F., Carrete J., van Roekeghem A., Madsen G. K. H., Mingo N. Materials Screening for the Discovery of New Half-Heuslers: Machine Learning versus ab Initio Methods. J. Phys. Chem. B. 2018;122:625–632. doi: 10.1021/acs.jpcb.7b05296.
  4. Kim K., Ward L., He J., Krishna A., Agrawal A., Wolverton C. Machine-learning-accelerated high-throughput materials screening: Discovery of novel quaternary Heusler compounds. Phys. Rev. Mater. 2018;2:123801. doi: 10.1103/PhysRevMaterials.2.123801.
  5. Faber F. A., Lindmaa A., von Lilienfeld O. A., Armiento R. Machine Learning Energies of 2 Million Elpasolite (ABC2D6) Crystals. Phys. Rev. Lett. 2016;117:135502. doi: 10.1103/PhysRevLett.117.135502.
  6. Wang H. C., Botti S., Marques M. A. L. Predicting stable crystalline compounds using chemical similarity. Npj Comput. Mater. 2021;7:12. doi: 10.1038/s41524-020-00481-6.
  7. Wei L., Omee S. S., Dong R., Fu N., Song Y., Siriwardane E. M. D., Xu M., Wolverton C., Hu J. CSPBench: A benchmark and critical evaluation of Crystal Structure Prediction. 2024; https://arxiv.org/abs/2407.00733.
  8. Wei L., Dong R., Fu N., Omee S. S., Hu J. TCSP 2.0: Template based crystal structure prediction with improved oxidation state prediction and chemistry heuristics. Comput. Mater. Sci. 2026;261:114317. doi: 10.1016/j.commatsci.2025.114317.
  9. Kusaba M., Liu C., Yoshida R. Crystal structure prediction with machine learning-based element substitution. Comput. Mater. Sci. 2022;211:111496. doi: 10.1016/j.commatsci.2022.111496.
  10. Lin P., Chen P., Jiao R., Mo Q., Jianhuan C., Huang W., Liu Y., Huang D., Lu Y. Equivariant Diffusion for Crystal Structure Prediction. Proceedings of the 41st International Conference on Machine Learning. PMLR; 2024. pp 29890–29913.
  11. Fu N., Hu J., Feng Y., Morrison G., Loye H.-C. Z., Hu J. Composition Based Oxidation State Prediction of Materials Using Deep Learning Language Models. Adv. Sci. 2023;10:2301011. doi: 10.1002/advs.202301011.
  12. Deng B., Zhong P., Jun K., Riebesell J., Han K., Bartel C. J., Ceder G. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 2023;5:1031–1041. doi: 10.1038/s42256-023-00716-3.
  13. Wei L., Fu N., Siriwardane E. M. D., Yang W., Omee S. S., Dong R., Xin R., Hu J. TCSP: a Template-Based Crystal Structure Prediction Algorithm for Materials Discovery. Inorg. Chem. 2022;61:8431–8439. doi: 10.1021/acs.inorgchem.1c03879.
  14. Oran C., Caputo R., Villars P., Özcü H. B., Canbaz F. H., Tekin A. Phase Prediction via Crystal Structure Similarity in the Periodic Number Representation. Inorg. Chem. 2024;63:20521–20530. doi: 10.1021/acs.inorgchem.4c03137.
  15. Caputo R., Villars P., Tekin A., Oran C. Periodic table representation of binary, ternary and higher-order systems of inorganic compounds. J. Alloys Compd. 2024;970:172638. doi: 10.1016/j.jallcom.2023.172638.
  16. Pettifor D. G. A chemical scale for crystal-structure maps. Solid State Commun. 1984;51:31–34. doi: 10.1016/0038-1098(84)90765-8.
  17. Villars P., Cenzual K., Daams J., Chen Y., Iwata S. Data-driven atomic environment prediction for binaries using the Mendeleev Number: Part 1. Composition AB. J. Alloys Compd. 2004;367:167–175. doi: 10.1016/j.jallcom.2003.08.060.
  18. Villars P., Daams J., Shikata Y., Rajan K., Iwata S. A New Approach to Describe Elemental-Property Parameters. Chem. Met. Alloys. 2008;1:1–23. doi: 10.30970/cma1.0007.
  19. Blokhin E., Villars P. Handbook of Materials Modeling, 2nd ed.; Andreoni W., Yip S., Eds.; Springer: Cham, 2020. p. 1837.
  20. Yu H., Giantomassi M., Materzanini G., Wang J., Rignanese G. M. Systematic assessment of various universal machine-learning interatomic potentials. Mater. Genome Eng. Adv. 2024;2:e58. doi: 10.1002/mgea.58.
  21. Chen C., Ye W., Zuo Y., Zheng C., Ong S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019;31:3564–3572. doi: 10.1021/acs.chemmater.9b01294.
  22. Chen C., Ong S. P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2022;2:718–728. doi: 10.1038/s43588-022-00349-3.
  23. Batatia I., Kovacs D. P., Simm G., Ortner C., Csanyi G. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields. Adv. Neural Inf. Process. Syst. 2022;35:11423–11436. doi: 10.52202/068431-0830.
  24. Choudhary K., DeCost B., Major L., Butler K., Thiyagalingam J., Tavazza F. Unified graph neural network force-field for the periodic table: solid state applications. Digital Discovery. 2023;2:346–355. doi: 10.1039/D2DD00096B.
  25. Kirklin S., Saal J. E., Meredig B., Thompson A., Doak J. W., Aykol M., Rühl S., Wolverton C. The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies. Npj Comput. Mater. 2015;1:1–15. doi: 10.1038/npjcompumats.2015.10.
  26. Jain A., Ong S. P., Hautier G., Chen W., Richards W. D., Dacek S., Cholia S., Gunter D., Skinner D., Ceder G., Persson K. A. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 2013;1:011002. doi: 10.1063/1.4812323.
  27. Ong S., Richards W. D., Jain A., Hautier G., Kocher M., Cholia S., Gunter D., Chevrier V. L., Persson K. A., Ceder G. Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis. Comput. Mater. Sci. 2013;68:314–319. doi: 10.1016/j.commatsci.2012.10.028.
  28. Castelliz L. Kristallstruktur von Mn5Ge3 und einiger ternärer Phasen mit zwei Übergangselementen. Monatsh. Chem. Verw. Teile Anderer Wiss. 1953;84:765–776. doi: 10.1007/BF00902776.
  29. Görne A., George J., van Leusen J., Dück G., Jacobs P., Chogondahalli Muniraju N., Dronskowski R. Ammonothermal Synthesis, Crystal Structure, and Properties of the Ytterbium(II) and Ytterbium(III) Amides and the First Two Rare-Earth-Metal Guanidinates, YbC(NH)3 and Yb(CN3H4)3. Inorg. Chem. 2016;55:6161–6168. doi: 10.1021/acs.inorgchem.6b00736.
  30. Szymanski J. T. The crystal structure of mawsonite, Cu6Fe2SnS8. Can. Mineral. 1976;14:529–535.
  31. Koza J. A., Hill J. C., Demster A. C., Switzer J. A. Epitaxial Electrodeposition of Methylammonium Lead Iodide Perovskites. Chem. Mater. 2016;28:399–405. doi: 10.1021/acs.chemmater.5b04524.
  32. Wang X. D., Huang Y. H., Liao J. F., Wei Z. F., Li W. G., Xu Y. F., Chen H. Y., Kuang D. B. Surface passivated halide perovskite single-crystal for efficient photoelectrochemical synthesis of dimethoxydihydrofuran. Nat. Commun. 2021;12:1202. doi: 10.1038/s41467-021-21487-8.
  33. Kosmatos K. O., Theofylaktos L., Giannakaki E., Deligiannis D., Konstantakou M., Stergiopoulos T. Methylammonium Chloride: A Key Additive for Highly Efficient, Stable, and Up-Scalable Perovskite Solar Cells. Energy Environ. Mater. 2019;2:79–92. doi: 10.1002/eem2.12040.
  34. Kojima A., Teshima K., Shirai Y., Miyasaka T. Organometal Halide Perovskites as Visible-Light Sensitizers for Photovoltaic Cells. J. Am. Chem. Soc. 2009;131:6050–6051. doi: 10.1021/ja809598r.
  35. Stranks S. D., Snaith H. J. Metal-halide perovskites for photovoltaic and light-emitting devices. Nat. Nanotechnol. 2015;10:391–402. doi: 10.1038/nnano.2015.90.
  36. National Laboratory of the Rockies. Best Research-Cell Efficiency Chart. https://www.nrel.gov/pv/cell-efficiency, 2025. Accessed: 27 October 2025.
  37. Li J.-B., Jiang Z. K., Wang R., Zhao J. Z., Wang R. Ferroelectric order in hybrid organic-inorganic perovskite NH4PbI3 with non-polar molecules and small tolerance factor. Npj Comput. Mater. 2023;9:62. doi: 10.1038/s41524-023-01019-2.
  38. Zhou Z., Pang S., Ji F., Zhang B., Cui G. The fabrication of formamidinium lead iodide perovskite thin films via organic cation exchange. Chem. Commun. 2016;52:3828–3831. doi: 10.1039/C5CC09873D.
  39. Frost J. M., Butler K. T., Brivio F., Hendon C. H., van Schilfgaarde M., Walsh A. Atomistic Origins of High-Performance in Hybrid Halide Perovskite Solar Cells. Nano Lett. 2014;14:2584–2590. doi: 10.1021/nl500390f.
  40. Powlson D. S., Dawson C. J. Use of ammonium sulphate as a sulphur fertilizer: Implications for ammonia volatilization. Soil Use Manage. 2022;38:622–634. doi: 10.1111/sum.12733.
  41. Sheveleva O. G., Rupcheva V. A., Poilov V. Z. Production and properties of double potassium-ammonium sulfate. Russ. J. Appl. Chem. 2016;89:29–33. doi: 10.1134/S1070427216010043.
  42. Zhumashev K., Serikbayeva A., Boranbayeva A., Altybayeva Z., Bussurmanova A., Akkenzheyeva A., Gusmanova A., Cherkeshova S., Narembekova A., Berdikulova F., et al. Study of Kinetics of Interaction Between Ammonium Bisulfate and Potassium Chloride & for Fertilizers. ES Mater. Manuf. 2025;27:1421. doi: 10.30919/mm1421.
  43. Xia W., Zhao Y., Zhao F., Adair K., Zhao R., Li S., Zou R., Zhao Y., Sun X. Antiperovskite Electrolytes for Solid-State Batteries. Chem. Rev. 2022;122:3763–3819. doi: 10.1021/acs.chemrev.1c00594.
  44. Perdew J. P., Ruzsinszky A., Csonka G. I., Vydrov O. A., Scuseria G. E., Constantin L. A., Zhou X., Burke K. Restoring the density-gradient expansion for exchange in solids and surfaces. Phys. Rev. Lett. 2008;100:136406. doi: 10.1103/PhysRevLett.100.136406.
  45. Lejaeghere K., Van Speybroeck V., Van Oost G., Cottenier S. Error estimates for solid-state Density-Functional Theory predictions: An overview by means of the ground-state elemental crystals. Crit. Rev. Solid State Mater. Sci. 2014;39:1. doi: 10.1080/10408436.2013.772503.
  46. Marin-Villa P., Gila-Herranz P., Jimenez-Ruiz M., Ivanov A., Armstrong J., Drużbicki K., Fernandez-Alonso F. Molecular Derailment via Pressurization in Methylammonium Lead Iodide. J. Phys. Chem. Lett. 2025;16:10906–10914. doi: 10.1021/acs.jpclett.5c01832.
  47. Whitfield P. S., Herron N., Guise W. E., Page K., Cheng Y. Q., Milas I., Crawford M. K. Structures, Phase Transitions and Tricritical Behavior of the Hybrid Perovskite Methyl Ammonium Lead Iodide. Sci. Rep. 2016;6:35685. doi: 10.1038/srep35685.
  48. Lee J.-H., Bristowe N. C., Bristowe P. D., Cheetham A. K. Role of hydrogen-bonding and its interplay with octahedral tilting in CH3NH3PbI3. Chem. Commun. 2015;51:6434–6437. doi: 10.1039/C5CC00979K.
  49. Demchenko D. O., Izyumskaya N., Feneberg M., Avrutin V., Özgür U., Goldhahn R., Morkoç H. Optical properties of the organic-inorganic hybrid perovskite CH3NH3PbI3: Theory and experiment. Phys. Rev. B. 2016;94:075206. doi: 10.1103/PhysRevB.94.075206.
  50. Montero-Alejo A. L., Menéndez-Proupin E., Hidalgo-Rojas D., Palacios P., Wahnón P., Conesa J. C. Modeling of Thermal Effect on the Electronic Properties of Photovoltaic Perovskite CH3NH3PbI3: The Case of Tetragonal Phase. J. Phys. Chem. C. 2016;120:7976–7986. doi: 10.1021/acs.jpcc.6b01013.
  51. Hong Q. J., Ushakov S. V., van de Walle A., Navrotsky A. Melting temperature prediction using a graph neural network model: From ancient minerals to new materials. Proc. Natl. Acad. Sci. U. S. A. 2022;119:e2209630119. doi: 10.1073/pnas.2209630119.
  52. Woods-Robinson R., Xiong Y., Shen J.-X., Winner N., Horton M. K., Asta M., Ganose A. M., Hautier G., Persson K. A. Designing transparent conductors using forbidden optical transitions. Matter. 2023;6:3021–3039. doi: 10.1016/j.matt.2023.06.043.
  53. Lux D., Schwarz W., Hess H. Methylammoniumfluoride CH3NH3F. Cryst. Struct. Commun. 1979;60:41–43.
  54. Solladié-Cavallo A., Khiar N. Methylammonium Fluoride (MAF): A Convenient Reagent for Si-O Bond Cleavage. Synth. Commun. 1989;19:1335–1340. doi: 10.1080/00397918908054542.
  55. Gebhardt J., Rappe A. M. Design of Metal-Halide Inverse-Hybrid Perovskites. J. Phys. Chem. C. 2018;122:13872–13883. doi: 10.1021/acs.jpcc.8b01008.
  56. Fan L. Q., Wu J. H. NH4PbI3. Acta Crystallogr., Sect. E. 2007;63:i189. doi: 10.1107/S1600536807050581.
  57. Ohi K., Osaka J., Uno H. Ferroelectric Phase Transition in Rb2SO4-(NH4)2SO4 and Cs2SO4-(NH4)2SO4 Mixed Crystals. J. Phys. Soc. Jpn. 1978;44:529–536. doi: 10.1143/JPSJ.44.529.
  58. Shamah A. M., Ahmed S., Kamel R., Badr Y. Structural changes of ((NH4)1-xKx)2SO4 crystals. Phys. Status Solidi. 1987;100:115–119. doi: 10.1002/pssa.2211000113.
  59. Gedam S. C., Dhoble S. J., Moharil S. V. Dy3+ and Mn2+ emission in fluoride- and chloride-based Na3(SO4)X (X = F or Cl) phosphors. Luminescence. 2012;27:441–446. doi: 10.1002/bio.1371.
  60. Xu Z., Liu Y., Sun X., Xie X., Guan X., Chen C., Lu P., Ma X. Theoretical design of Na-rich anti-perovskite as solid electrolyte: The effect of cluster anion in stability and ionic conductivity. J. Solid State Chem. 2022;316:123643. doi: 10.1016/j.jssc.2022.123643.


Articles from Journal of Chemical Theory and Computation are provided here courtesy of American Chemical Society
