Nature Communications. 2022 Jun 2;13:3095. doi: 10.1038/s41467-022-30692-y

Reliable crystal structure predictions from first principles

Rahul Nikhar, Krzysztof Szalewicz
PMCID: PMC9163189  PMID: 35654882

Abstract

An inexpensive and reliable method for molecular crystal structure predictions (CSPs) has been developed. The new CSP protocol starts from a two-dimensional graph of crystal’s monomer(s) and utilizes no experimental information. Using results of quantum mechanical calculations for molecular dimers, an accurate two-body, rigid-monomer ab initio-based force field (aiFF) for the crystal is developed. Since CSPs with aiFFs are essentially as expensive as with empirical FFs, tens of thousands of plausible polymorphs generated by the crystal packing procedures can be optimized. Here we show the robustness of this protocol which found the experimental crystal within the 20 most stable predicted polymorphs for each of the 15 investigated molecules. The ranking was further refined by performing periodic density-functional theory (DFT) plus dispersion correction (pDFT+D) calculations for these 20 top-ranked polymorphs, resulting in the experimental crystal ranked as number one for all the systems studied (and the second polymorph, if known, ranked in the top few). Alternatively, the polymorphs generated can be used to improve aiFFs, which also leads to rank one predictions. The proposed CSP protocol should result in aiFFs replacing empirical FFs in CSP research.

Subject terms: Chemical physics, Crystal engineering, Density functional theory, Structure prediction


Developing theoretical frameworks to predict new polymorphs is highly desirable. Here the authors present an ab initio based force-field approach for crystal structure prediction offering a dramatic computational speed-up over fully ab initio schemes.

Introduction

Properties of crystalline solids depend critically on the polymorphic form of a given substance, and many crystals can exist in several such forms1,2. Knowledge of the possible stable polymorphic forms of a crystal is of particular importance for the pharmaceutical industry3. If a polymorph different from the one obtained in laboratories crystallizes during manufacturing of a drug, it will have different physicochemical properties and may lead to undesirable therapeutic effects; two examples are ritonavir4,5 and rotigotine6–8. Thus, in the drug development process, one would like to know whether the polymorph used is thermodynamically the most stable form under ambient conditions. In the defense industry, development of energetic materials is costly and highly dangerous9,10, and a priori knowledge of the crystal structure of notional materials would allow accelerated screening of such materials. The semiconductor industry can also benefit from such knowledge11. CSP methods answer these needs by finding a set of the most stable crystalline polymorphs of a given molecule starting from its two-dimensional diagram and not using any experimental information specific to this molecule.

Reliable CSPs for molecular crystals starting from the knowledge of only two-dimensional diagrams of monomer(s) were nearly impossible for a long time. In 1988, Maddox12 described the failure of CSPs as a continuing scandal in the physical sciences and stated that, in general, even the simplest crystalline solids posed a great challenge. In the mid-1990s, Gavezzotti13 asked the fundamental question: ‘Are crystal structures predictable?’, and his answer was ‘No’. In response to this criticism, the Cambridge Crystallographic Data Centre (CCDC) conducted a series of “blind” tests14–19 by providing only two-dimensional diagrams of monomers of crystals that had been measured but not published and asking research groups to submit their predictions, with the results of the first test published in 2000. While the field has advanced significantly since the first test, the results of the last, 6th test19 are still not completely satisfactory. The participating groups achieved success rates between 13% and 57% (not including polymorphs C and E of system XXIII), where success means that the experimental polymorph was found among polymorphs on two lists containing 100 polymorphs each.

One should remark here that predicting crystal structures is actually a difficult problem for physical science, contrary to what Maddox12 implied. The reason is the high dimensionality of the conformational and crystallographic space, which results in thousands of plausible polymorphs produced by sampling of this space within a relatively narrow window of lattice energies and densities. The energy differences between consecutive polymorphs ordered by lattice energy are of the order of 1 kJ/mol at the low-energy end, which requires accuracies nearly impossible to achieve with empirical FFs. Also, for experimentally observed polymorphs, the differences between their computed lattice energies are of the same order20.

While there are several variants of CSPs, including a recent use of deep neural networks21, the majority of groups participating in the 6th blind test used some form of FFs, mostly of empirical character. The most successful CSP protocol, consisting of polymorph-space sampling plus lattice-energy minimization, has been developed by Neumann et al.22,23. This protocol uses a tailor-made FF which is obtained by refining parameters of an empirical FF to reproduce as closely as possible pDFT+D lattice energies (and their derivatives). The initial polymorphs for pDFT+D calculations are obtained using the empirical FF. The method is included in the commercial software package GRACE (Generation Ranking and Characterization Engine)24, but some of its details are not available. Recent reviews of the field of CSPs can be found in refs. 25–31.

In the present work, a CSP protocol is proposed based entirely on first principles, i.e., not utilizing any experimental information. Since the main characteristic of this method is the use of aiFFs, we will refer to it as the CSP(aiFF) protocol. This protocol consists of several stages shown in Fig. 1. While aiFFs have been used in CSPs for some time19,32–34, such predictions took a long time (several months at the minimum), required huge amounts of human effort, and were possible only for monomers with up to about 20 atoms. In the present work, four recent developments are combined to dramatically reduce costs and increase predictability of such CSPs: (a) the development of a very effective variant35 of symmetry-adapted perturbation theory (SAPT)36 for ab initio calculations of interaction energies; (b) the creation of autoPES37,38, an automatic, effective, and reliable method for generation of potential energy surfaces (PESs) with minimal human involvement; (c) enabling the use of such aiFFs in the lattice-energy minimization stage of CSPs, a part of the present work; (d) the application of pDFT+D for a final refinement of polymorph rankings. Stage 3 of Fig. 1 can produce even millions of polymorphs at low cost, and past experience indicates that the experimentally relevant polymorphs are almost always among them. Thus, the essence of CSP protocols is to filter all relevant low-lattice-energy polymorphs out of this set. In the past few years, it has been demonstrated by several groups that pDFT+D geometry optimization of polymorphs places the experimental polymorph within the top few, often as number one19,39–41. However, such calculations are so expensive that they can be afforded for only a hundred or so polymorphs. In contrast, if an FF is used in Stage 4, tens of thousands of polymorphs can be optimized. This FF has to be sufficiently accurate not to miss any important polymorphs. 
Thus, both the ab initio method and the fit to interaction energies computed using this method must have sufficiently small uncertainties. In calculations of dimer interaction energies, the variant of SAPT used by us (see “Methods”) is nearly as accurate35,42,43 as the coupled cluster method with single, double, and noniterative triple excitations, CCSD(T), the “gold-standard” method of electronic structure theory, but is significantly less expensive. To prevent loss of accuracy due to fitting, the form of the fitting function has to be significantly more involved than those of empirical FFs that are typically built from Lennard-Jones plus Coulomb potentials, see “Methods”. The extended form can fit ab initio data with uncertainties of about 1 kJ/mol, which we will show to be sufficient for reliable CSPs. Such form has never been used in lattice energy minimizations and we had to modify CSP software to make it possible. Finally, to make Stage 2 affordable, the number of ab initio grid points needed to fit an aiFF has to be reasonably small. The autoPES method37,38 reduces this number by two orders of magnitude compared to typical surface-fitting approaches, reducing in this way the development costs by the same ratio. It also reduces amount of human involvement almost to zero as the whole process is completely automated. We show below that the proposed protocol found the experimental crystal ranked as number one for all 15 molecules studied (and the second polymorph, if known, ranked in the top few).

Fig. 1. Overview of aiFF-based CSP protocol.


Stage 1: monomer energy minimization to find the equilibrium geometry. Stage 2: ab initio calculations of dimer intermolecular interaction energies followed by fitting an analytic form of aiFF to these data. Stage 3: generation of millions of plausible packing arrangements of polymorphs by sampling different space groups, orientations of monomers, and unit cell parameters, followed by a reduction of this set to tens of thousands of polymorphs using density criteria or crude lattice energy minimizations with simple FFs. Stage 4: fine minimization with aiFFs for all polymorphs in the reduced set. Stage 5: refinement of the ranking via pDFT+D calculations on a couple dozen top-ranked polymorphs from Stage 4.

Results and discussion

Performance of CSP(aiFF) protocol

To assess the performance of our method, we carried out CSPs for 15 molecules, including several systems from the CCDC blind tests14–19 (denoted by Roman numerals), as well as methanol, benzene, nitromethane, 5,5′-dinitro-2H,2′H-3,3′-bi-1,2,4-triazole (DNBT), 1,3,5-trinitrobenzene (TNB), deferiprone, and fluorouracil. The molecular graphs are shown in Supplementary Fig. 1. The results are summarized in Table 1. An extended version of this table is available in Supplementary Table 1.

Table 1.

CSPs from SAPT(DFT)-based aiFFs minimizations followed by pDFT+D fixed-geometry calculations.

System               SG         Rank (aiFF/pDFT+D)   RMSD20 (Å)   RMSE (kJ/mol)
I (Poly1)            P21/c      2/1                  0.09         0.6
I (Poly2)            Pbca       8/2                  0.32         0.6
II                   P21/n      1/1                  0.59         1.3
IV                   P21/c      2/1                  0.24         0.63
VIII                 C2/c       4/1                  0.28         1.1
XII                  Pbca       9/1                  0.53         0.84
XIII                 P21/c      4/1                  0.45         1.1
XVI                  Pbca       16/1                 0.29         1.0
XXII                 P21/n      1/1                  0.15         1.4
Methanol             P212121    6/1                  0.4          0.92
Benzene (Poly1)      Pbca       1/1                  0.16         0.59
Benzene (Poly2)      P21/c      4/3                  0.4          0.59
Nitromethane         P212121    1/1                  0.27         0.74
DNBT                 P21/c      1/1                  0.58         1.56
TNB                  P21/c      3/1                  0.67         1.28
Deferiprone (Poly1)  Pbca       2/2                  0.28         0.71
Deferiprone (Poly2)  P21/c      8/1                  0.24         0.71
Fluorouracil         P21/c      9/1                  0.61         1.06

SG: predicted space group of the crystal (SG is the same for experimental and predicted polymorphs); Rank: rank of the experimental polymorph after minimizations and after pDFT+D calculations; RMSD20: root mean square deviation (in Å) between the experimental crystal and the calculated polymorph for 20 overlapping molecules (heavy atoms only); RMSE: root mean square error (in kJ/mol) of the fit for negative interaction energies.

The CSP(aiFF) protocol ranked the experimental polymorph as number 1 in 5 cases, as number 2–6 in 7 cases, and as numbers 9, 9, and 16. We have also included a second experimentally identified polymorph in the cases of system I, benzene, and deferiprone, denoted as “Poly2” in Table 1, and these are ranked as numbers 8, 4, and 8, respectively. After pDFT+D calculations on top-ranked 20 polymorphs of each crystal, without any further geometry optimization, an experimental crystal became ranked as number 1 in each case. For deferiprone, it was Poly2 that became the rank 1 polymorph, while Poly1 remained at rank 2. For system I and benzene, Poly2 changed rank from 8 to 2 and from 4 to 3, respectively. RMSD20’s between the calculated and experimental crystals vary between 0.09 and 0.67 Å, below the CCDC threshold of 0.8 Å. Also densities and cell parameters, shown in Supplementary Table 1, agree very closely. Supplementary Fig. 2 displays the percent deviations between the calculated and experimental lattice parameters. The average errors for the cell parameters a, b, c, and β amount to 4.3%, 2.6%, 4.3%, and 2.4%, respectively. Such level of predictivity is unprecedented for a completely first-principles CSP protocol. The overlaps of the experimental polymorphs with the closest calculated ones are shown in Fig. 2. This figure allows intuitive appreciation how close these structures are. This exceptional performance of CSP(aiFF) has been achieved despite the investigated systems exhibiting typical difficulties due to closeness of polymorphs’ lattice energies and despite using rigid-monomer approximation. The lattice energy vs. density landscapes from the aiFF minimizations for systems IV and XXII are shown in Supplementary Fig. 3. Analogous graphs for the other systems look similar. 
The 100 lowest-energy polymorphs span a range of about 5 kJ/mol for systems I, XII, XIII, benzene, and nitromethane, about 10 kJ/mol for systems II, IV, VIII, XVI, XXII, methanol, TNB, deferiprone, and fluorouracil, and about 20 kJ/mol for DNBT. At the low-energy end, the energy differences between consecutive polymorphs are less than 1 kJ/mol, i.e., comparable to the RMSEs of the fits over all dimer configurations with negative interaction energies, shown in Table 1.
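
The rank statistics quoted above can be re-derived directly from the Rank column of Table 1. The short Python sketch below does so; the ranks are transcribed from Table 1, while the dictionary layout, the "molecule/PolyN" key convention, and the variable names are illustrative choices made for this sketch only.

```python
# Ranks transcribed from Table 1: (rank after aiFF minimization, rank after pDFT+D).
# The "molecule/PolyN" keys are a naming convention invented for this sketch.
TABLE1_RANKS = {
    "I/Poly1": (2, 1), "I/Poly2": (8, 2), "II": (1, 1), "IV": (2, 1),
    "VIII": (4, 1), "XII": (9, 1), "XIII": (4, 1), "XVI": (16, 1),
    "XXII": (1, 1), "Methanol": (6, 1), "Benzene/Poly1": (1, 1),
    "Benzene/Poly2": (4, 3), "Nitromethane": (1, 1), "DNBT": (1, 1),
    "TNB": (3, 1), "Deferiprone/Poly1": (2, 2), "Deferiprone/Poly2": (8, 1),
    "Fluorouracil": (9, 1),
}

# aiFF-only statistics for the first-listed polymorph of each of the 15 molecules.
primary = {k: v for k, v in TABLE1_RANKS.items() if not k.endswith("/Poly2")}
aiff_primary = [ranks[0] for ranks in primary.values()]
n_rank1 = sum(r == 1 for r in aiff_primary)
n_rank2to6 = sum(2 <= r <= 6 for r in aiff_primary)
remaining = sorted(r for r in aiff_primary if r > 6)

# Fraction of all 18 experimental polymorphs found at aiFF rank 10 or better.
all_aiff = [ranks[0] for ranks in TABLE1_RANKS.values()]
top10_fraction = sum(r <= 10 for r in all_aiff) / len(all_aiff)

# Best pDFT+D rank reached by any experimental polymorph of each molecule.
best_pdftd = {}
for key, (_, pdftd_rank) in TABLE1_RANKS.items():
    molecule = key.split("/")[0]
    best_pdftd[molecule] = min(pdftd_rank, best_pdftd.get(molecule, pdftd_rank))

print(n_rank1, n_rank2to6, remaining)
print(f"{100 * top10_fraction:.0f}% of polymorphs at aiFF rank <= 10")
print(all(rank == 1 for rank in best_pdftd.values()))
```

Grouping by molecule in the last step reflects how the text counts success after pDFT+D: for deferiprone it is Poly2, not Poly1, that reaches rank 1.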

Fig. 2. Overlaps of crystal structures.


Overlap of the experimental crystal structure (element-specific colors) with the closest calculated crystal structure (green) using SAPT(DFT)-based aiFFs for systems: a and b I, c II, d IV, e VIII, f XII, g XIII, h XVI, i XXII, j methanol, k and l benzene, m nitromethane, n DNBT, o TNB, p and q Deferiprone, r Fluorouracil.

Performance of a simplified aiFF form

The use of the extended functional form of aiFFs in the lattice energy minimizations instead of the simpler exp-6-1 form (not including a polynomial in front of exponential, damping functions, etc., see “Methods”) used in some empirical FFs leads to enormous improvements in rankings. To quantify such improvements, we performed lattice energy minimizations with the exp-6-1 form of aiFFs, fitted using the same level of theory as in the case of the extended form, for systems I, II, IV, and XXII, achieving rankings of 138, 2231, 49, and 60, respectively, while the rankings of the extended aiFF form for these systems are 1 or 2, see Table 1. The main reason for this improvement is that RMSEs for negative interaction energies are from 2.3 to 5.3 times smaller in the latter case (these ratios are correlated with the number of fit parameters: e.g., 30 and 270 for the exp-6-1 and the extended form, respectively, in the case of system IV).

Performance of an empirical FF

In order to quantify better the predictive power of our approach, calculations analogous to those described above have been performed with an empirical FF. We have chosen the W99 FF44 with point charges computed by us using the CHELPG method45. For the 18 experimental polymorphs considered, the W99+charges FF found 33% of them at rank 10 or better, while the analogous result for aiFF (without the pDFT+D step) is 94%. This amounts to a qualitative difference for technological applications. For more details on CSPs with the W99+charges FF, see Supplementary Tables 1 and 2.

Alternative CSP(aiFF) protocol

One may ask why pDFT+D calculations are needed to improve the rankings, while several comparisons on benchmark interaction energies, see, e.g., refs. 42,43, show that SAPT(DFT) is nearly as accurate as CCSD(T) and more accurate than DFT+D methods. The main reason is that what is used in CSPs are aiFFs, and they include additional uncertainties due to fitting. Although the average fit error for negative interaction energies is only ~1 kJ/mol, errors may be larger at some configurations. If a configuration with a too negative interaction energy is important for a polymorph, this polymorph may become overly stable and therefore too highly ranked. Two other possible reasons, basis set size and neglect of many-body effects in CSP(aiFF), are discussed in Supplementary Information and found unlikely to be a reason. To improve the predictions from Stage 4, we have developed an alternative version of our method, alt-CSP(aiFF). After executing the CSP(aiFF) protocol less the pDFT+D stage, the geometries of 20 polymorphs with the lowest lattice energies are examined and consecutive nearest neighbor dimers identified. Then SAPT calculations are performed for these dimers, the aiFF is refitted, and lattice minimizations for the 20 polymorphs are performed with the new aiFF. This procedure is iterated until the energies of the 5x5x5 clusters extracted from each polymorph computed in two ways: just from the aiFF and in a hybrid way, replacing the aiFF interaction energies by the available SAPT ones, are the same to within some threshold. We have applied alt-CSP(aiFF) to two of the worst ranking crystals from Table 1: system XVI (rank 16) and fluorouracil (rank 9). In each case, alt-CSP(aiFF) resulted in the experimental polymorph at rank 1, while RMSD20 was reduced from 0.29 to 0.15 Å and from 0.61 to 0.42 Å, respectively. Thus, alt-CSP(aiFF) can be used without the pDFT+D stage. 
However, the additional ab initio calculations are about as expensive as the pDFT+D ones, so there is no gain in terms of efficiency.
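
The control flow of the alt-CSP(aiFF) iteration described above can be sketched as follows. This is only a schematic outline under stated assumptions: every callable passed in (`find_dimers`, `run_sapt`, `refit`, `minimize`, `energy_ff`, `energy_hybrid`) is a hypothetical placeholder for a step named in the text, not an interface of autoPES, SAPT2020, or any CSP package, and the toy stand-ins at the bottom exist only so the loop can be executed.

```python
def alt_csp_refine(aiff, polymorphs, *, find_dimers, run_sapt, refit,
                   minimize, energy_ff, energy_hybrid, tol, max_iter=20):
    """Iterate: add SAPT data for crystal dimers, refit, re-minimize, compare."""
    sapt_energies = {}  # dimer -> ab initio interaction energy, grows each pass
    for iteration in range(1, max_iter + 1):
        # Compute SAPT energies for nearest-neighbor dimers not seen before.
        for poly in polymorphs:
            for dimer in find_dimers(poly):
                if dimer not in sapt_energies:
                    sapt_energies[dimer] = run_sapt(dimer)
        # Refit the force field, then re-minimize the top-ranked polymorphs.
        aiff = refit(aiff, sapt_energies)
        polymorphs = [minimize(poly, aiff) for poly in polymorphs]
        # Compare pure-aiFF and hybrid (aiFF + available SAPT) cluster energies.
        gap = max(abs(energy_ff(p, aiff) - energy_hybrid(p, aiff, sapt_energies))
                  for p in polymorphs)
        if gap <= tol:
            return aiff, polymorphs, iteration
    raise RuntimeError("cluster energies did not agree within tol")


# Toy stand-ins (purely illustrative): the "force field" is just its own fit
# error, and each refit halves that error.
toy_ff, toy_polys, n_iter = alt_csp_refine(
    1.0, [1.0, 2.0],
    find_dimers=lambda poly: [poly],
    run_sapt=lambda dimer: -dimer,
    refit=lambda ff, data: ff / 2.0,
    minimize=lambda poly, ff: poly,
    energy_ff=lambda poly, ff: -poly + ff,
    energy_hybrid=lambda poly, ff, data: data[poly],
    tol=0.2,
)
print(n_iter, toy_ff)
```

The sketch keeps previously computed SAPT energies between passes, which is one natural reading of "the available SAPT ones" in the text; whether the actual implementation caches them in this way is not stated.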

Cost comparisons

The proposed method is not only highly reliable, as shown above, but also very efficient compared to alternative ways of combining FF-based CSPs with pDFT+D calculations. To demonstrate this, we show in Fig. 3 the costs of three possible CSP strategies in terms of single-core wall times, using system I as an example. Note that calculations of this type are typically performed on hundreds of cores, so the actual wall time is just a couple of hours for Strategy 1, the approach proposed here. The majority of the time for Strategy 1, 7 core-days, is spent on the development of an aiFF, and most of this time is used to compute SAPT(DFT) interaction energies for 706 dimer configurations, with very little time spent on fitting these energies. The next stage, the packing and minimization (PACK+MIN) of hundreds of thousands of crystals, requires less than a third of a day. The final stage, pDFT+D calculations for the top 20 polymorphs at aiFF geometries, requires approximately one day. Hypothetical Strategy 2 differs from Strategy 1 by the use of an empirical FF in the PACK+MIN stage and by performing pDFT+D calculations for 100 polymorphs with reoptimization of geometries (this number of polymorphs was chosen as a trade-off between success rate and computational costs). The time required for the latter stage would be 70 core-days, so Strategy 2 is about an order of magnitude more expensive than Strategy 1. Moreover, if the W99+charges FF were used, the success rate of Strategy 2 on the set of 18 polymorphs examined here would be 72% (see Supplementary Table 2), while the success rate of Strategy 1 is 100% already with 16 top-ranked polymorphs. All the PACK+MIN bars appear to be of about the same height for aiFF and for the empirical FF. This is because the calculation of the lattice energy is only about two times more expensive in the former case. Hypothetical Strategy 3 performs pDFT+D calculations with geometry optimization for all 25,500 polymorphs produced by PACK+MIN. 
This strategy would have a very high reliability (since practice indicates that the experimental polymorphs are almost always included in such a large pool of candidate structures), but it would be extremely costly, 49 single-core years, and hence not practical (although possible if a few thousand cores were used). With the use of an empirical FF, one can set the number of polymorphs included in the pDFT+D stage anywhere between 100 and 25,000, systematically increasing costs and reliability relative to Strategy 2. However, with W99+charges and our set of polymorphs, the success rate would remain at 72% until the number of polymorphs is at least 589. For Strategies 2 and 3, the PACK+MIN stage can be replaced by any other protocol producing the required number of candidate polymorphs, with insignificant effects on the total timings.
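
The cost figures quoted above for system I can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; two things are assumptions of the sketch rather than statements of the source: that the cost of pDFT+D geometry optimization scales linearly with the number of polymorphs, and that the workload parallelizes perfectly over cores. The "less than a third of a day" PACK+MIN time is used as an upper bound for both force fields.

```python
# Core-day figures quoted in the text for system I.
AIFF_DEVELOPMENT = 7.0       # SAPT(DFT) grid (706 dimers) plus fit
PACK_MIN = 1.0 / 3.0         # "less than a third of a day", taken as an upper bound
PDFTD_FIXED_TOP20 = 1.0      # pDFT+D at fixed aiFF geometries, 20 polymorphs
PDFTD_OPT_100 = 70.0         # pDFT+D with geometry reoptimization, 100 polymorphs
N_ALL_POLYMORPHS = 25_500    # pool produced by PACK+MIN
DAYS_PER_YEAR = 365.25

strategy1 = AIFF_DEVELOPMENT + PACK_MIN + PDFTD_FIXED_TOP20
strategy2 = PACK_MIN + PDFTD_OPT_100

# Assumption: optimization cost is linear in the number of polymorphs.
cost_per_optimized_polymorph = PDFTD_OPT_100 / 100
strategy3_years = N_ALL_POLYMORPHS * cost_per_optimized_polymorph / DAYS_PER_YEAR


def wall_hours(core_days, n_cores):
    """Wall time in hours, assuming perfect parallel scaling (an idealization)."""
    return core_days * 24.0 / n_cores


print(f"Strategy 1: {strategy1:.1f} core-days")
print(f"Strategy 2: {strategy2:.1f} core-days ({strategy2 / strategy1:.1f}x Strategy 1)")
print(f"Strategy 3: {strategy3_years:.0f} core-years")
print(f"Strategy 1 on 100 cores: {wall_hours(strategy1, 100):.1f} h")
```

Under the linear-scaling assumption, 0.7 core-days per optimized polymorph applied to 25,500 polymorphs gives a figure consistent with the 49 single-core years quoted for Strategy 3.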

Fig. 3. Computational cost of the considered CSP protocols.


Total wall times required for system I CSPs on a single core of the Intel E5-2670 processor using different strategies. Rows “aiFF”, “PACK+MIN”, and “pDFT+D” denote times of an aiFF development, packing and minimization, and of periodic DFT+D calculations.

Neglected effects

Since aiFFs are sums of two-body interactions, they neglect the many-body effects mentioned earlier and discussed in Supplementary Information. While we show that these effects are not critical in CSPs for the crystals considered here, they may be significant for some other crystals46–48. The most important many-body effect, the many-body polarization, can be accounted for using polarizable aiFFs that can be developed using autoPES, but are not yet implemented in our CSP codes. In Supplementary Information, we also explain why the relatively small basis set that we used is adequate for CSPs. A much more important neglected effect is the flexibility of monomers. Although the monomers considered by us were assumed to be rigid, the proposed CSP(aiFF) protocol can be applied to monomers with soft degrees of freedom. Such monomers may be significantly deformed in crystals compared to their equilibrium structures in the gas phase. The recent version of autoPES38 has the capability of computing interaction energies accounting for all or selected intramonomer degrees of freedom, and most CSP codes can perform packing and minimization including all degrees of freedom; therefore, such predictions can still be made completely from first principles. However, the costs of such calculations increase steeply with the total number of degrees of freedom. One way around this problem is to assume separation of inter- and intramonomer degrees of freedom in Stage 2, as has been done in all biomolecular FFs and in all FFs used in flexible-monomer CSPs. Since our aiFFs depend only on separations between atoms of different monomers, interaction energies can be computed for arbitrary monomer configurations. Such a “flexibilized” intermonomer FF can replace the intermonomer component of current empirical FFs, while the intramonomer component can be kept unchanged. One can expect that such a replacement should lead to improved predictions in flexible-monomer CSPs.

Other effects neglected by the present version of CSP(aiFF) are thermal and entropic ones, as the results presented by us correspond to 0 K temperature. For some crystals, these effects can change the rankings of polymorphs, as pointed out by Brandenburg and Grimme39 and recently investigated extensively by Hoja et al.41. The thermal and entropic effects can be routinely computed using pDFT+D, although such calculations are several times more expensive than pDFT+D calculations with static geometries. As a test, we have computed both effects for the 5 lowest lattice energy polymorphs of system XXII, leading to no change of rankings.

Concluding remarks

The first-principles CSP(aiFF) method developed here was applied to crystals of 15 rigid molecules with 18 known experimental polymorphs. When aiFFs are applied in CSPs for crystals of these molecules, 17 (94%) of the polymorphs are ranked in the range 1–10, while the remaining one has rank 16. For comparison, analogous CSPs with the empirical W99+charges FF rank only 33% of polymorphs in the range 1–10, 3 experimental polymorphs are not found within 568 or more generated ones, and for two molecules predictions were not possible due to missing atom types. The ability of CSP(aiFF) to minimize tens of thousands of polymorphs is its key advantage over alternative approaches, which have to use low-accuracy methods at this stage, often erroneously discarding correct structures. Upon a subsequent reranking of the top 20 polymorphs with pDFT+D calculations at fixed aiFF geometries, for all 15 molecules an experimental polymorph became ranked as number 1, while the second polymorphs became ranked as numbers 2, 2, and 3. The pDFT+D step can be omitted if aiFFs are iteratively improved by performing ab initio calculations on dimers extracted from crystals predicted with the previous iteration of an aiFF [the alt-CSP(aiFF) protocol]. The proposed CSP protocol not only shows ultimate predictive power for the systems tested, but is also inexpensive compared to other highly predictive approaches. On about a hundred cores, complete predictions for any of the systems investigated here take less than a day, including the aiFF generation. The CSP(aiFF) protocol requires minimal human involvement, consisting only of input preparation for autoPES, UPACK, and pDFT+D calculations, and includes only free software with open source codes. Limitations of the current implementation of the CSP(aiFF) methodology have been discussed, in particular the neglect of many-body interactions and the rigid-monomer approximation. 
Although the test set included only homogeneous crystals, there are no reasons to doubt that the method will work equally well for cocrystals including salts since the quality of aiFFs does not depend on dimers being homogeneous or heterogeneous (of course, for two-component cocrystals, three PESs have to be developed). Also, while the largest of the test molecules included 22 atoms, the method should apply equally well to larger molecules since the relative accuracy of SAPT(DFT) does not change with system size35. Of course, calculations will be more expensive as the size increases, but molecules with about 100 atoms are within reach. The effectiveness of the proposed CSP protocol is due to the use of the SAPT(DFT) method which is computationally efficient relative to other accurate electronic structure methods and due to the use of the autoPES method for fitting aiFFs since this method not only cuts the costs of such fits by orders of magnitude, but also reduces human effort of this most difficult to automate step almost to zero. An important element of the CSP(aiFF) protocol is that it replaces simple potential forms used in all earlier CSP protocols by an extended form capable of fitting ab initio interaction energies with significantly decreased uncertainties. An advantage of the proposed protocol is that it constitutes a complete first-principles procedure for investigating crystal structures and properties. Such a protocol should work equally well for any type of monomer, in contrast to the protocols using empirical FFs, which are expected to work well only for systems similar to those used in fitting such FFs. We believe that the overall effect of the proposed CSP protocol will be that the field of CSPs will move from the use of empirical FFs to aiFFs. 
This should increase the reliability of predictions and therefore, while CSPs have so far played at best an advisory role in technology development, they may become a leading element in the development of novel crystalline materials. More generally, aiFFs can be used in several types of computational material design.

Methods

Monomer geometry minimization

In Stage 1, monomer geometries were optimized using ORCA49,50 with the PBE51 functional and D3 correction52 in the aug-cc-pVTZ53 basis set.

Ab initio calculations of interaction energies

To make the CSP(aiFF) protocol practical, aiFFs have to be constructed in Stage 2 at reasonably low costs, but at the same time with small uncertainties, for monomers with dozens of atoms. This requires first that the ab initio method used to compute intermolecular interaction energies is inexpensive and accurate. It appears that the best current choice for such calculations is SAPT36,54, an ab initio method that computes interaction energies directly, starting from isolated monomers and imposing the correct electron permutational symmetry. We applied the SAPT variant based on DFT, SAPT(DFT)55,56, see ref. 35 for a recent review of this method. SAPT(DFT) and CCSD(T) calculations scale as $O(n^5)$ and $O(n^7)$ with system size, respectively, where $n$ is the number of electrons, and for dimers with a couple dozen atoms, SAPT(DFT) calculations are about two orders of magnitude less expensive than CCSD(T) calculations. The recently developed SAPT(DFT) algorithms and effective computer codes35,42 can be used to compute thousands of grid points for dimers with ~100-atom monomers using reasonable computer resources, and this can be achieved in a few days if a sufficient number of computer cores is available.
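
As a rough illustration of what these formal scalings imply, and assuming that prefactors and implementation details can be ignored (which actual timings do not allow), the cost ratio of the two methods grows quadratically with system size, and doubling the number of electrons multiplies the cost of each method by a fixed factor:

$$
\frac{t_{\mathrm{CCSD(T)}}(n)}{t_{\mathrm{SAPT(DFT)}}(n)} \propto \frac{n^7}{n^5} = n^2,
\qquad
\frac{t_{\mathrm{SAPT(DFT)}}(2n)}{t_{\mathrm{SAPT(DFT)}}(n)} = 2^5 = 32,
\qquad
\frac{t_{\mathrm{CCSD(T)}}(2n)}{t_{\mathrm{CCSD(T)}}(n)} = 2^7 = 128 .
$$

The symbols $t_{\mathrm{CCSD(T)}}$ and $t_{\mathrm{SAPT(DFT)}}$ are introduced here only for this estimate. The quoted two-orders-of-magnitude gap for dimers with a couple dozen atoms is an observed timing and is not derived from this expression.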

The details of calculations of SAPT(DFT)55–58 first- and second-order interaction energies are as follows. We used the density-fitting version35,59,60 in the SAPT202061 codes interfaced with the ORCA package49,50 for calculations on monomers. The PBE51 functional was used in DFT calculations applying the gradient-regulated asymptotic correction (GRAC)62,63. The aug-cc-pVDZ53 basis set plus a set of 3s3p2d2f midbond functions (default of autoPES) was used in the monomer-centered plus basis set (MC+BS) format64. The term accounting for higher-order induction and exchange-induction effects, denoted as $\delta E^{\mathrm{HF}}_{\mathrm{int,resp}}$ and obtained as the difference between Hartree–Fock (HF) interaction energies and the sum of appropriate SAPT(HF) first- and second-order corrections in their response (resp) versions, was included for all systems except system XIII, benzene, DNBT, and TNB. We use a short-hand notation for SAPT interaction energy components: “indx” is the sum of the second-order induction and exchange-induction components, as well as of the $\delta E^{\mathrm{HF}}_{\mathrm{int,resp}}$ contribution, “dispx” is the sum of the dispersion and exchange-dispersion components, “elst” is the electrostatic component, and “exch” is the first-order exchange component. The relative importance of attractive components is illustrated in Supplementary Fig. 4.

Generation of aiFFs

In all past CSPs, only simple FFs have been used at the lattice-energy minimization stage. The two most often used forms are the Lennard-Jones 12-6-1 potential, $A_{12}/r^{12} - C_6/r^6 + q_a q_b/r$, and the Buckingham exp-6-1 potential, $A e^{-\beta r} - C_6/r^6 + q_a q_b/r$, where $r$ is an atom-atom distance and $A_{12}$, $A$, $\beta$, $C_6$, $q_a$, and $q_b$ are adjustable parameters. SAPT(DFT)-based aiFFs have been used in CSPs, but always with the exp-6-1 potential form in the minimization stage. This form is not pliable enough to fit well ab initio data, leading to uncertainties of a few kJ/mol, too large for reliable CSPs. In contrast, the extended form used by us in the CSP(aiFF) protocol can fit ab initio data with uncertainties of about 1 kJ/mol, which we show to be sufficient for reliable CSPs. This functional form is37,38

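$$
V = \sum_{a \in A,\, b \in B} \left[ \left( 1 + \sum_{i=1,2} a_i^{ab} (r_{ab})^i \right) e^{\alpha^{ab} - \beta^{ab} r_{ab}} + \frac{A_{12}^{ab}}{(r_{ab})^{12}} - \sum_{n=6,8} f_n(\delta_n^{ab}, r_{ab}) \frac{C_n^{ab}}{(r_{ab})^n} + f_1(\delta_1^{ab}, r_{ab}) \frac{q_a q_b}{r_{ab}} \right] \tag{1}
$$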

where $a$ ($b$) runs over the set of atoms in monomer A (B), $\alpha^{ab}$, $\beta^{ab}$, $a_i^{ab}$, and $A_{12}^{ab}$ are repulsion-energy parameters, $C_n^{ab}$ are long-range dispersion plus induction energy parameters, $q_x$, $x = a, b$, are atomic partial charges, $\delta_n^{ab}$ are damping parameters, and $f_n$ are the Tang-Toennies65 damping functions: $f_n(\delta, r) = 1 - e^{-\delta r} \sum_{m=0}^{n} (\delta r)^m / m!$. Long-range interaction energies were computed using an ab initio-distributed approach. The damping parameters in the dispersion plus induction term were fitted separately to the sum of all close-range second-order components plus $\delta E^{\mathrm{HF}}_{\mathrm{int,resp}}$, while $\delta_1^{ab}$ were fitted to electrostatic energies. All PESs developed here are two-body, 6-dimensional PESs, i.e., they assume rigid monomers. The aiFFs were constructed as sums of these two-body PESs. One should add that the extended form of FFs given by Eq. (1) has been used in some published CSPs, but only in molecular dynamics (MD) simulations that can replace the pDFT+D calculations of Stage 5. Note that MD calculations are about as expensive as pDFT+D ones and significantly more expensive than the minimizations of Stage 4. Graphs showing SAPT(DFT) interaction energy components and their fits as functions of the distance $R$ between the centers of mass of monomers are included in Supplementary Fig. 5. One can see in particular that the ab initio electrostatic energies are reproduced very well for $R$'s larger than the van der Waals minimum distance $R_{\mathrm{vdW}}$ despite using only damped charge-charge interactions, i.e., omitting higher multipolar terms. While the use of the latter terms in empirical FFs improves the predictions compared to the use of charges only66,67, our results show that higher-rank multipoles are not needed if the electrostatic function includes damping and is fitted to ab initio electrostatic energies. The worsening of the agreement with ab initio values seen for $R < R_{\mathrm{vdW}}$ is inevitable and is due to the charge-overlap effects that are not proportional to inverse powers of $R$ (ref. 36). 
These effects are accounted for in the overall fit by the first term in Eq. (1). This is why the total fitted and ab initio interaction energies are in excellent agreement for all R.
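To make the functional form of Eq. (1) concrete, the following minimal Python sketch evaluates the Tang-Toennies damping function and a single atom-atom term of the potential. It is an illustration only, not the authors' FORTRAN program: the function names, the parameter dictionary layout, and the use of atomic units (so that the Coulomb term is simply $q_a q_b / r$) are assumptions made here.

```python
import math


def tang_toennies(n: int, delta: float, r: float) -> float:
    """f_n(delta, r) = 1 - exp(-delta*r) * sum_{m=0}^{n} (delta*r)^m / m!"""
    x = delta * r
    term = 1.0   # (delta*r)^0 / 0!
    total = 1.0
    for m in range(1, n + 1):
        term *= x / m          # builds (delta*r)^m / m! iteratively
        total += term
    return 1.0 - math.exp(-x) * total


def pair_energy(r: float, p: dict) -> float:
    """One atom-atom contribution to Eq. (1).

    `p` is a hypothetical parameter container (not a format used by the
    paper); all quantities are assumed to be in atomic units.
    """
    polynomial = 1.0 + p["a1"] * r + p["a2"] * r ** 2
    exponential = polynomial * math.exp(p["alpha"] - p["beta"] * r)
    r12_term = p["A12"] / r ** 12
    dispersion_induction = sum(
        tang_toennies(n, p["delta_n"][n], r) * p["C"][n] / r ** n
        for n in (6, 8)
    )
    electrostatic = tang_toennies(1, p["delta_1"], r) * p["qa"] * p["qb"] / r
    return exponential + r12_term - dispersion_induction + electrostatic
```

The total two-body energy would be the sum of `pair_energy` over all atom pairs a in A and b in B. At large separations the damping functions approach 1, so the term reduces to the undamped $-C_6/r^6 - C_8/r^8 + q_a q_b/r$ asymptotics.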

Crystal packing and lattice-energy minimization

Since none of the available CSP packages can use aiFFs of the form given by Eq. (1), we modified two such packages, MOLPAK68 and UPACK69, for use in Stages 3 and 4. MOLPAK uses the concept of coordination geometry and by default searches in 26 space groups: P1, $P\bar{1}$, P2, Pm, Pc, P21, P2/c, P21/m, P2/m, P21/c, Cc, C2, C2/c, Pnn2, Pba2, Pnc2, P221, Pmn21, Pma2, P21212, P212121, Pca21, Pna21, Pnma, Fdd2, Pbcn, and Pbca. It generates polymorphs on a grid in a three-dimensional search space by systematically varying the orientation of the central molecule in steps of 10°. This generation is performed in all 51 coordination geometries. The packing in the unit cell is controlled by a simple repulsive $1/r^{12}$ interaction between atoms: the molecules are brought together until an energy threshold is reached. This step provides an initial set of 6859 angle combinations × 51 coordination geometries = 349,809 hypothetical polymorphs. From this set, the 25,500 densest polymorphs, 500 from each coordination geometry, are minimized using the program WMIN70. The default functional form of the FF in WMIN is exp-6-1. We have modified this code to include FFs of the form of Eq. (1).
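The counts quoted for the MOLPAK search can be checked with a short sketch. Note that 6859 = 19³, which is consistent with 19 values per rotation angle; the specific range of −90° to +90° used below is an assumption, as are all variable and function names. The density-based selection is shown only schematically and is not MOLPAK's actual code.

```python
from itertools import product

STEP_DEG = 10             # angular step of the orientation grid
N_GEOMETRIES = 51         # coordination geometries searched
KEEP_PER_GEOMETRY = 500   # densest structures kept per geometry

# Assumed range: 19 values per rotation angle, from -90 to +90 degrees.
angle_values = list(range(-90, 91, STEP_DEG))
orientations = list(product(angle_values, repeat=3))

n_orientations = len(orientations)
n_hypothetical = n_orientations * N_GEOMETRIES
n_minimized = KEEP_PER_GEOMETRY * N_GEOMETRIES


def densest(candidates, keep):
    """Return the `keep` densest entries from (density, label) pairs."""
    return sorted(candidates, key=lambda c: c[0], reverse=True)[:keep]
```

Here `n_orientations`, `n_hypothetical`, and `n_minimized` evaluate to 6859, 349,809, and 25,500, matching the numbers in the text.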

UPACK generates random crystal structures in 13 default space groups: C2, C2/c, Cc, P1, $P\bar{1}$, P21, P21/c, P212121, Pbca, Pc, Pbcn, Pca21, and Pna21. It can use any 12-6-1 potential, and we selected the OPLS-AA FF71. The packing stage in UPACK is divided into two steps. In the first step, only 500 reasonable structures per symmetry group are randomly generated in an unrestricted way and are then used to estimate cell dimensions. In the second step, the random generation is performed in a restricted coordinate space using this cell estimate. Most of the generated structures are immediately rejected using the criterion that the atom-atom 12-6-1 interaction is not allowed to be larger than 2000 kJ/mol for any pair. Such generation and energy-criterion tests continue until 5000 polymorphs per symmetry group, i.e., a total of 65,000 polymorphs, are found. This second step also involves a rough optimization of lattice energies. The resulting list is subjected to clustering72 to remove duplicates. Clustering reduces the pool significantly. For example, for system XXII it is reduced to 13,014 polymorphs.
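The pair-energy rejection criterion described above can be sketched as follows. This is not UPACK code: the Lennard-Jones-plus-Coulomb parametrization in terms of σ and ε, the function names, and the tuple layout are assumptions for illustration, and any parameter values passed in are hypothetical.

```python
import math

# e^2 / (4*pi*eps0) expressed in kJ mol^-1 Angstrom
COULOMB_KJ_MOL_ANGSTROM = 1389.35
PAIR_THRESHOLD_KJ_MOL = 2000.0   # rejection threshold quoted in the text
STRUCTURES_PER_GROUP = 5000
N_SPACE_GROUPS = 13


def pair_12_6_1(r, sigma, epsilon, qi, qj):
    """12-6-1 atom-atom energy in kJ/mol (r, sigma in Angstrom; charges in e)."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_KJ_MOL_ANGSTROM * qi * qj / r
    return lennard_jones + coulomb


def passes_pair_criterion(pairs, threshold=PAIR_THRESHOLD_KJ_MOL):
    """Accept a trial structure only if no atom pair exceeds the threshold.

    `pairs` is an iterable of (r, sigma, epsilon, qi, qj) tuples.
    """
    return all(pair_12_6_1(*pair) <= threshold for pair in pairs)


total_structures = STRUCTURES_PER_GROUP * N_SPACE_GROUPS
```

Because the 12-6 repulsion grows steeply at short range, a single badly overlapping atom pair is enough to reject a trial structure, which is why most random candidates are discarded cheaply before any optimization.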

In Stage 4 of CSP(aiFF) realized with UPACK, all the polymorphs from the reduced set are minimized with tight thresholds. We have modified UPACK to be able to use FFs of the form of Eq. (1). We found that it is advantageous to perform Stage 4 first with the OPLS-AA FF, i.e., using the original UPACK path including clustering, and then minimize the reduced set using the aiFF. This procedure was chosen not to save time, although it does result in minor savings, but to avoid minimizations ending up in “holes” of an FF, i.e., unphysical minima at very short intermonomer separations. By construction, 12-6-1 FFs do not have any holes, while exp-6-1 and our extended-form FFs almost always have holes (although behind barriers of about 100 kJ/mol, one of the constraints of the autoPES fitting). We found that aiFF minimizations starting from the OPLS-minimized structures almost never end up in holes. We could have easily avoided the use of OPLS by fitting a 12-6-1 FF to the ab initio data.
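The origin of such holes can be illustrated with a one-dimensional sketch. An exp-6 (Buckingham-type) pair potential has a finite exponential repulsion but a $-C/r^6$ attraction that diverges as r approaches zero, so the curve turns over at short range; a 12-6 potential does not. The parameter values and function names below are hypothetical and chosen only for illustration; they are not taken from the paper or from any fitted FF.

```python
import math


def exp_6(r, A=3.7e5, B=3.6, C=2.4e3):
    """Buckingham-type pair energy (hypothetical parameters, kJ/mol and Angstrom)."""
    return A * math.exp(-B * r) - C / r ** 6


def lj_12_6(r, A12=4.0e6, C=2.4e3):
    """12-6 pair energy with hypothetical parameters."""
    return A12 / r ** 12 - C / r ** 6


def barrier_top(potential, r_lo=0.3, r_hi=3.0, n=2701):
    """Scan a radial grid and locate the highest point of the curve.

    Returns (r_top, e_top, has_hole). A hole is flagged when the maximum
    lies inside the grid, i.e., the energy decreases again at shorter r.
    """
    grid = [r_lo + i * (r_hi - r_lo) / (n - 1) for i in range(n)]
    i_max = max(range(n), key=lambda i: potential(grid[i]))
    return grid[i_max], potential(grid[i_max]), i_max > 0


r_top, e_top, exp6_has_hole = barrier_top(exp_6)
_, _, lj_has_hole = barrier_top(lj_12_6)
```

With these illustrative parameters the exp-6 curve shows a turnover while the 12-6 curve rises monotonically toward short distances, which is the behavior that motivates starting aiFF minimizations from structures pre-minimized with a 12-6-1 FF.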

The two CSP packages modified by us produced almost identical predictions in the cases where we used both. MOLPAK was used for systems I, II, XII, XXII, nitromethane, and benzene. UPACK was used for the remaining systems, as well as for systems I, II, and XXII, which were also treated with MOLPAK. For these three systems, the rankings of the experimental crystal by the two packages were identical.

PLATON73 was used for checking missed symmetries74 and for space group transformations from non-standard setting to standard setting by assigning the target crystal the proper space group and cell parameters, leading to the data in Table 1. For example, for system II both MOLPAK and UPACK predicted the experimental crystal in P21/c symmetry, and PLATON transformed it to P21/n symmetry.

pDFT+D calculations

In Stage 5, periodic single-point DFT+D lattice energy calculations, i.e., without geometry optimizations, were performed for the 20 top-ranked polymorphs from aiFF minimizations using the PBE51 functional with pseudopotentials75 plus the D3 dispersion correction52 with the Becke–Johnson (BJ) damping76,77. We used Quantum ESPRESSO (QE)78,79 codes, with the plane-wave kinetic energy cutoffs of 340 and 3061 eV for the wave functions and charge densities, respectively.
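The logic of Stage 5 amounts to a shortlist-and-rerank step, which can be sketched as below. The function name `rerank`, the dictionary of aiFF lattice energies, and the `single_point_energy` callable (standing in for an external pDFT+D calculation) are all hypothetical; the sketch does not call Quantum ESPRESSO.

```python
def rerank(aiff_energies, single_point_energy, n_top=20):
    """Re-rank the n_top lowest-energy aiFF polymorphs by a second method.

    aiff_energies: dict mapping polymorph label -> aiFF lattice energy.
    single_point_energy: callable returning the refined energy for a label
        (placeholder for a pDFT+D single-point calculation).
    Returns the shortlisted labels ordered by the refined energy.
    """
    shortlist = sorted(aiff_energies, key=aiff_energies.get)[:n_top]
    refined = {label: single_point_energy(label) for label in shortlist}
    return sorted(shortlist, key=refined.get)
```

Only structures that survive the aiFF shortlist are ever passed to the expensive method, so a polymorph ranked outside the top `n_top` by the aiFF cannot be recovered at this stage.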

The zero-point vibrational energy (ZPVE) and thermal effects were calculated within the harmonic approximation using Phonopy 2.8.1 (ref. 80) and VASP 5.4.4 (refs. 81–85) with the same DFT+D approach as applied in QE. The projector augmented-wave pseudopotentials86,87 were used. For the relaxation of the crystal, a cutoff of 1000 eV for the plane-wave basis set was used. The relaxation was stopped when the total-energy changes between two consecutive steps for the electronic and ionic motions were smaller than $10^{-5}$ and $0.5 \times 10^{-2}$ eV, respectively. Phonon calculations were performed at the Γ-point using a supercell of at least 10 Å length in each direction. As in the relaxation step, a cutoff of 1000 eV for the plane-wave basis set was used in the total energy calculation, together with a convergence threshold of $10^{-8}$ eV. Next, ZPVE and thermal effects were calculated on a mesh of 8 × 8 × 8 using the dynamical matrix built from the force constants of the displaced atoms in the supercell.
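For reference, the standard harmonic-approximation expressions behind these quantities are $E_{\mathrm{ZPVE}} = \tfrac{1}{2}\sum_i h\nu_i$ and $F_{\mathrm{vib}}(T) = \sum_i \left[\tfrac{1}{2}h\nu_i + k_B T \ln\left(1 - e^{-h\nu_i/k_B T}\right)\right]$. The sketch below evaluates them for a plain list of mode frequencies. It is not Phonopy's implementation: the function names are hypothetical, frequencies are assumed to be given in THz, and the averaging over the q-point mesh is omitted.

```python
import math

H_EV_PER_THZ = 4.135667696e-3   # Planck constant times 1 THz, in eV
KB_EV_PER_K = 8.617333262e-5    # Boltzmann constant in eV/K


def zpve(frequencies_thz):
    """Zero-point vibrational energy in eV: sum of h*nu/2 over modes."""
    return 0.5 * sum(H_EV_PER_THZ * f for f in frequencies_thz if f > 0.0)


def vibrational_free_energy(frequencies_thz, temperature_k):
    """Harmonic vibrational Helmholtz free energy in eV at temperature T."""
    total = 0.0
    for f in frequencies_thz:
        if f <= 0.0:
            continue  # skip zero (acoustic) or imaginary modes
        energy = H_EV_PER_THZ * f
        total += 0.5 * energy
        if temperature_k > 0.0:
            kt = KB_EV_PER_K * temperature_k
            total += kt * math.log1p(-math.exp(-energy / kt))
    return total
```

At T = 0 the free energy reduces to the ZPVE, and at any finite temperature the thermal term is negative, so `vibrational_free_energy` lies below `zpve` for the same set of modes.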

Supplementary information

Peer Review File (4MB, pdf)

Acknowledgements

This work was supported by the U.S. Army Research Laboratory and Army Research Office (Grant No. W911NF-19-1-0117) and the NSF (Grant No. CHE-1900551). We thank Rafał Podeszwa for comments on the manuscript.

Author contributions

R.N. and K.S. designed the method. R.N. coded it and performed numerical calculations. Both authors analyzed the results, wrote the manuscript, and revised it.

Peer review

Peer review information

Nature Communications thanks Gregory Beran, Graeme Day and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.

Data availability

The data that support the findings of this study are included within the Article and Supplementary Information. In particular, the .zip file contains coordinates and energies of all computed data points, parameters of the fits, and the crystallographic information files for a set of top-ranked polymorphs.

Code availability

The codes used for electronic structure calculations, fitting, CSPs, and pDFT+D calculations (SAPT, ORCA, autoPES (part of the SAPT package), MOLPAK, UPACK, Quantum ESPRESSO, and VASP) are available on the web, and the links are provided in the references of the main paper and in the Supplementary Information. A patch to UPACK is available on the SAPT web site. A FORTRAN program computing the fitted potentials is included in the Supplementary_Data_1.zip file.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-022-30692-y.

References

  • 1. Cruz-Cabeza AJ, Reutzel-Edens SM, Bernstein J. Facts and fictions about polymorphism. Chem. Soc. Rev. 2015;44:8619–8635. doi: 10.1039/C5CS00227C.
  • 2. Bučar D, Lancaster RW, Bernstein J. Disappearing Polymorphs Revisited. Angew. Chem. Int. Ed. 2015;54:6972–6993. doi: 10.1002/anie.201410356.
  • 3. Hilfiker, R. & von Raumer, M. (eds.) Polymorphism in the Pharmaceutical Industry: Solid Form and Drug Development (Wiley-VCH, Weinheim, Germany, 2019).
  • 4. Bauer J, et al. Ritonavir: An Extraordinary Example of Conformational Polymorphism. Pharm. Res. 2001;18:859–866. doi: 10.1023/A:1011052932607.
  • 5. Chemburkar SR, et al. Dealing with the Impact of Ritonavir Polymorphs on the Late Stages of Bulk Drug Process Development. Org. Process Res. Dev. 2000;4:413–417. doi: 10.1021/op000023y.
  • 6. Waknine, Y. Rotigotine patch recalled due to drug crystallization. Medscape (2008).
  • 7. Rietveld IB, Ceolin R. Rotigotine: Unexpected Polymorphism with Predictable Overall Monotropic Behavior. J. Pharm. Sci. 2015;104:4117–4122. doi: 10.1002/jps.24626.
  • 8. Mortazavi M, et al. Computational polymorph screening reveals late-appearing and poorly-soluble form of rotigotine. Commun. Chem. 2019;2:70. doi: 10.1038/s42004-019-0171-y.
  • 9. Badgujar DM, Talawar MB, Asthana SN, Mahulikar PP. Advances in science and technology of modern energetic materials: An overview. J. Hazard. Mater. 2008;151:289–305. doi: 10.1016/j.jhazmat.2007.10.039.
  • 10. Zhang C. Origins of the Energy and Safety of Energetic Materials and of the Energy & Safety Contradiction. Propellants Explos. Pyrotech. 2018;43:855–856. doi: 10.1002/prep.201880931.
  • 11. Jurchescu OD, et al. Effects of polymorphism on charge transport in organic semiconductors. Phys. Rev. B. 2009;80:085201. doi: 10.1103/PhysRevB.80.085201.
  • 12. Maddox J. Crystals from first principles. Nature. 1988;335:201. doi: 10.1038/335201a0.
  • 13. Gavezzotti A. Are Crystal Structures Predictable? Acc. Chem. Res. 1994;27:309–314. doi: 10.1021/ar00046a004.
  • 14. Lommerse JPM, et al. A test of crystal structure prediction of small organic molecules. Acta Cryst. B. 2000;56:697–714. doi: 10.1107/S0108768100004584.
  • 15. Motherwell WDS, et al. Crystal structure prediction of small organic molecules: a second blind test. Acta Cryst. B. 2002;58:647–661. doi: 10.1107/S0108768102005669.
  • 16. Day GM, et al. A third blind test of crystal structure prediction. Acta Cryst. B. 2005;61:511–527. doi: 10.1107/S0108768105016563.
  • 17. Day GM, et al. Significant progress in predicting the crystal structures of small organic molecules - a report on the fourth blind test. Acta Cryst. B. 2009;65:107–125. doi: 10.1107/S0108768109004066.
  • 18. Bardwell DA, et al. Towards crystal structure prediction of complex organic compounds - a report on the fifth blind test. Acta Cryst. B. 2011;67:535–551. doi: 10.1107/S0108768111042868.
  • 19. Reilly AM, et al. Report on the sixth blind test of organic crystal-structure prediction methods. Acta Cryst. B. 2016;72:439–459. doi: 10.1107/S2052520616007447.
  • 20. Nyman J, Day GM. Static and lattice vibrational energy differences between polymorphs. CrystEngComm. 2015;17:5154–5165. doi: 10.1039/C5CE00045A.
  • 21. Ryan K, Lengyel J, Shatruk M. Crystal Structure Prediction via Deep Learning. J. Am. Chem. Soc. 2018;140:10158–10168. doi: 10.1021/jacs.8b03913.
  • 22. Neumann MA, Leusen FJJ, Kendrick J. A Major Advance in Crystal Structure Prediction. Angew. Chem. Int. Ed. 2008;47:1–5. doi: 10.1002/anie.200790254.
  • 23. Neumann MA. Tailor-Made Force Fields for Crystal-Structure Prediction. J. Phys. Chem. B. 2008;112:9810–9829. doi: 10.1021/jp710575h.
  • 24. Neumann, M. A. GRACE; Avant-garde Materials Simulation: St-Germain-en-Laye, France. https://www.avmatsim.eu/ (2008).
  • 25. Day GM. Current approaches to predicting molecular organic crystal structures. Crystallogr. Rev. 2011;17:3–52. doi: 10.1080/0889311X.2010.517526.
  • 26. Price SL. Predicting crystal structures of organic compounds. Chem. Soc. Rev. 2014;43:2098–2111. doi: 10.1039/C3CS60279F.
  • 27. Szalewicz K. Determination of Structure and Properties of Molecular Crystals from First Principles. Acc. Chem. Res. 2014;47:3266–3274. doi: 10.1021/ar500275m.
  • 28. Hoja J, Reilly AM, Tkatchenko A. First-principles modeling of molecular crystals: structures and stabilities, temperature and pressure. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2017;7:e1294.
  • 29. Oganov AR. Crystal structure prediction: reflections on present status and challenges. Faraday Discuss. R. Soc. Chem. 2018;211:643–660. doi: 10.1039/C8FD90033G.
  • 30. Price, S. L. & Brandenburg, J. G. Molecular crystal structure prediction. In Non-Covalent Interactions in Quantum Chemistry and Physics, (eds de-la Roza, A. O. & DiLabio, G. A.) 333–363 (Elsevier, 2017).
  • 31. Price SL. Is zeroth order crystal structure prediction (CSP_0) coming to maturity? What should we aim for in an ideal crystal structure prediction code? Faraday Discuss. R. Soc. Chem. 2018;211:9–30. doi: 10.1039/C8FD00121A.
  • 32. Podeszwa R, Bukowski R, Rice BM, Szalewicz K. Potential energy surface for cyclotrimethylene trinitramine dimer from symmetry-adapted perturbation theory. Phys. Chem. Chem. Phys. 2007;9:5561–5569. doi: 10.1039/b709192c.
  • 33. Podeszwa R, Rice BM, Szalewicz K. Predicting Structure of Molecular Crystals from First Principles. Phys. Rev. Lett. 2008;101:115503. doi: 10.1103/PhysRevLett.101.115503.
  • 34. Misquitta AJ, Welch GWA, Stone AJ, Price SL. A first principles prediction of the crystal structure of C6Br2ClFH2. Chem. Phys. Lett. 2008;456:105–109. doi: 10.1016/j.cplett.2008.02.113.
  • 35. Garcia J, Podeszwa R, Szalewicz K. SAPT codes for calculations of intermolecular interaction energies. J. Chem. Phys. 2020;152:184109. doi: 10.1063/5.0005093.
  • 36. Jeziorski B, Moszyński R, Szalewicz K. Perturbation Theory Approach to Intermolecular Potential Energy Surfaces of van der Waals Complexes. Chem. Rev. 1994;94:1887–1930. doi: 10.1021/cr00031a008.
  • 37. Metz MP, Piszczatowski K, Szalewicz K. Automatic Generation of Intermolecular Potential Energy Surfaces. J. Chem. Theory Comput. 2016;12:5895–5919. doi: 10.1021/acs.jctc.6b00913.
  • 38. Metz MP, Szalewicz K. Automatic Generation of Flexible-Monomer Intermolecular Potential Energy Surfaces. J. Chem. Theory Comput. 2020;16:2317–2339. doi: 10.1021/acs.jctc.9b01241.
  • 39. Brandenburg JG, Grimme S. Organic crystal polymorphism: a benchmark for dispersion-corrected mean-field electronic structure methods. Acta Cryst. B. 2016;72:502–513. doi: 10.1107/S2052520616007885.
  • 40. Whittleton SR, de-la Roza AO, Johnson ER. Exchange-Hole Dipole Dispersion Model for Accurate Energy Ranking in Molecular Crystal Structure Prediction. J. Chem. Theory Comput. 2017;13:441–450. doi: 10.1021/acs.jctc.6b00679.
  • 41. Hoja J, et al. Reliable and practical computational description of molecular crystal polymorphs. Sci. Adv. 2019;5:eaau3338. doi: 10.1126/sciadv.aau3338.
  • 42. Garcia J, Szalewicz K. Ab Initio Extended Hartree–Fock plus Dispersion Method Applied to Dimers with Hundreds of Atoms. J. Phys. Chem. A. 2020;124:1196–1203. doi: 10.1021/acs.jpca.9b11900.
  • 43. Taylor DC, et al. Blind test of density-functional-based methods on intermolecular interaction energies. J. Chem. Phys. 2016;145:124105. doi: 10.1063/1.4961095.
  • 44. Williams DE. Improved intermolecular force field for molecules containing H, C, N, and O atoms, with application to nucleoside and peptide crystals. J. Comput. Chem. 2001;22:1154–1166. doi: 10.1002/jcc.1074.
  • 45. Breneman CM, Wiberg KB. Determining atom-centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis. J. Comput. Chem. 1990;11:361. doi: 10.1002/jcc.540110311.
  • 46. Welch GWA, Karamertzanis PG, Misquitta AJ, Stone AJ, Price SL. Is the Induction Energy Important for Modeling Organic Crystals? J. Chem. Theory Comput. 2008;4:522–532. doi: 10.1021/ct700270d.
  • 47. Karamertzanis PG, et al. Modeling the interplay of inter- and intramolecular hydrogen bonding in conformational polymorphs. J. Chem. Phys. 2008;128:244708. doi: 10.1063/1.2937446.
  • 48. Greenwell C, et al. Overcoming the difficulties of predicting conformational polymorph energetics in molecular crystals via correlated wavefunction methods. Chem. Sci. 2020;11:2200–2214. doi: 10.1039/C9SC05689K.
  • 49. Neese F. The ORCA program system. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012;2:73–78.
  • 50. Neese, F. ORCA, an ab initio, DFT, and semiempirical electronic structure package. with contributions from U. Becker, et al. https://orcaforum.kofo.mpg.de.
  • 51. Perdew JP, Burke K, Ernzerhof M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996;77:3865–3868. doi: 10.1103/PhysRevLett.77.3865.
  • 52. Grimme S, Antony J, Ehrlich S, Krieg H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010;132:154104. doi: 10.1063/1.3382344.
  • 53. Kendall RA, Dunning TH, Jr, Harrison RJ. Electron-affinities of the first-row atoms revisited. Systematic basis sets and wave functions. J. Chem. Phys. 1992;96:6796–6806. doi: 10.1063/1.462569.
  • 54. Szalewicz K. Symmetry-adapted perturbation theory of intermolecular forces. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012;2:254–272. doi: 10.1002/wcms.86.
  • 55. Misquitta AJ, Podeszwa R, Jeziorski B, Szalewicz K. Intermolecular potentials based on symmetry-adapted perturbation theory including dispersion energies from time-dependent density functional calculations. J. Chem. Phys. 2005;123:214103. doi: 10.1063/1.2135288.
  • 56. Hesselmann A, Jansen G, Schütz M. Density-functional theory-symmetry-adapted intermolecular perturbation theory with density fitting: A new efficient method to study intermolecular interaction energies. J. Chem. Phys. 2005;122:014103. doi: 10.1063/1.1824898.
  • 57. Misquitta AJ, Jeziorski B, Szalewicz K. Dispersion Energy from Density-Functional Theory Description of Monomers. Phys. Rev. Lett. 2003;91:033201. doi: 10.1103/PhysRevLett.91.033201.
  • 58. Hesselmann A, Jansen G. Intermolecular dispersion energies from time-dependent density functional theory. Chem. Phys. Lett. 2003;367:778–784. doi: 10.1016/S0009-2614(02)01796-7.
  • 59. Bukowski R, Podeszwa R, Szalewicz K. Efficient calculations of coupled Kohn-Sham dynamic susceptibility functions and dispersion energies with density fitting. Chem. Phys. Lett. 2005;414:111–116. doi: 10.1016/j.cplett.2005.08.048.
  • 60. Podeszwa R, Bukowski R, Szalewicz K. Density-Fitting Method in Symmetry-Adapted Perturbation Theory Based on Kohn-Sham Description of Monomers. J. Chem. Theory Comput. 2006;2:400–412. doi: 10.1021/ct050304h.
  • 61. Bukowski, R. et al. SAPT2020: An ab initio program for many-body symmetry-adapted perturbation theory calculations of intermolecular interaction energies. http://www.physics.udel.edu/~szalewic/SAPT/SAPT.html (2020).
  • 62. Grüning M, Gritsenko OV, van Gisbergen SJA, Baerends EJ. Shape corrections to exchange-correlation potentials by gradient-regulated seamless connection of model potentials for inner and outer region. J. Chem. Phys. 2001;114:652–660. doi: 10.1063/1.1327260.
  • 63. Cencek W, Szalewicz K. On asymptotic behavior of density functional theory. J. Chem. Phys. 2013;139:024104–(1:27). doi: 10.1063/1.4811833.
  • 64. Williams HL, Mas EM, Szalewicz K, Jeziorski B. On the effectiveness of monomer-, dimer-, and bond-centered basis functions in calculations of intermolecular interaction energies. J. Chem. Phys. 1995;103:7374–7391. doi: 10.1063/1.470309.
  • 65. Tang KT, Toennies JP. An improved simple-model for the van der Waals potential based on universal damping functions for the dispersion coefficients. J. Chem. Phys. 1984;80:3726–3741. doi: 10.1063/1.447150.
  • 66. Day GM, Motherwell WDS, Jones W. Beyond the Isotropic Atom Model in Crystal Structure Prediction of Rigid Molecules: Atomic Multipoles versus Point Charges. Cryst. Growth Des. 2005;5:1023–1033. doi: 10.1021/cg049651n.
  • 67. Price SL. Computational prediction of organic crystal structures and polymorphism. Int. Rev. Phys. Chem. 2008;27:541–568. doi: 10.1080/01442350802102387.
  • 68. Holden JR, Du Z, Ammon HL. Prediction of possible crystal structures for C-, H-, N-, O-, and F- containing organic compounds. J. Comput. Chem. 1993;14:422–437. doi: 10.1002/jcc.540140406.
  • 69. van Eijck BP, Kroon J. UPACK program package for crystal structure prediction: Force fields and crystal structure generation for small carbohydrate molecules. J. Comput. Chem. 1999;20:799–812. doi: 10.1002/(SICI)1096-987X(199906)20:8<799::AID-JCC6>3.0.CO;2-Z.
  • 70. Busing, W. R. Report ORNL-5747. Oak Ridge National Laboratory, Oak Ridge, TN (1981).
  • 71. Jorgensen WL, Tirado-Rives J. The OPLS [Optimized Potentials for Liquid Simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc. 1988;110:1657–1666. doi: 10.1021/ja00214a001.
  • 72. Van Eijck BP, Kroon J. Fast clustering of equivalent structures in crystal structure prediction. J. Comput. Chem. 1997;18:1036–1042. doi: 10.1002/(SICI)1096-987X(199706)18:8<1036::AID-JCC7>3.0.CO;2-U.
  • 73. Spek AL. Single-crystal structure validation with the program PLATON. J. Appl. Crystallogr. 2003;36:7–13. doi: 10.1107/S0021889802022112.
  • 74. Spek AL. Structure validation in chemical crystallography. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2009;65:148–155. doi: 10.1107/S090744490804362X.
  • 75. Rappe AM, Rabe KM, Kaxiras E, Joannopoulos JD. Optimized pseudopotentials. Phys. Rev. B: Condens. Matter. 1990;41:1227. doi: 10.1103/PhysRevB.41.1227.
  • 76. Becke AD, Johnson ER. Exchange-hole dipole moment and the dispersion interaction. J. Chem. Phys. 2005;122:154104–(1:5). doi: 10.1063/1.1884601.
  • 77. Grimme S, Ehrlich S, Goerigk L. Effect of the damping function in dispersion corrected density functional theory. J. Comput. Chem. 2011;32:1456–1465. doi: 10.1002/jcc.21759.
  • 78. Giannozzi P, et al. Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter. 2009;21:395502. doi: 10.1088/0953-8984/21/39/395502.
  • 79. Giannozzi P, et al. Advanced capabilities for materials modelling with quantum ESPRESSO. J. Phys. Condens. Matter. 2017;29:465901. doi: 10.1088/1361-648X/aa8f79.
  • 80. Togo A, Tanaka I. First principles phonon calculations in materials science. Scr. Mater. 2015;108:1–5. doi: 10.1016/j.scriptamat.2015.07.021.
  • 81. Kresse G, Hafner J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B. 1993;47:558–561. doi: 10.1103/PhysRevB.47.558.
  • 82. Kresse G, Hafner J. Ab initio molecular-dynamics simulation of the liquid-metal–amorphous-semiconductor transition in germanium. Phys. Rev. B: Condens. Matter. 1994;49:14251–14269. doi: 10.1103/PhysRevB.49.14251.
  • 83. Kresse G, Furthmüller J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 1996;6:15–50. doi: 10.1016/0927-0256(96)00008-0.
  • 84. Kresse G, Furthmüller J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B: Condens. Matter. 1996;54:11169–11186. doi: 10.1103/PhysRevB.54.11169.
  • 85. Kresse, G. et al. VASP: Vienna ab initio simulation package. http://www.vasp.at (2021).
  • 86. Blöchl PE. Projector augmented-wave method. Phys. Rev. B: Condens. Matter. 1994;50:17953–17979. doi: 10.1103/PhysRevB.50.17953.
  • 87. Kresse G, Joubert D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B: Condens. Matter. 1999;59:1758–1775. doi: 10.1103/PhysRevB.59.1758.

Articles from Nature Communications are provided here courtesy of Nature Publishing Group