Abstract
This study systematically investigates the influence of various parameters of the wavefunction calculation during Hirshfeld atom refinement (HAR). We aim to address the lack of consensus in the literature and conflicting information on a generally recommended procedure. A set of amino acid test structures, known for their immense biochemical importance and unimpeachable experimental data quality, was employed to ensure reliable results, unbiased by the question of insufficient diffraction data quality. A comprehensive permutation of refinement parameters was conducted to avoid overlooking potential influences, resulting in 2496 structure refinements per amino acid. Applying a solvent model systematically improved refinement results compared to gas-phase calculations. Additionally, it was observed that the pure Hartree–Fock method outperforms all tested density functional theory methods across all structures in this test set of polar-organic molecules. These findings underscore the importance of carefully considering the level of theory applied in HAR and offer an overview of the performance of various methods and parameters.
I. INTRODUCTION
Structural scientists use single-crystal x-ray diffraction (SCXRD) to get precise information about the structural parameters of a substance of interest. Nowadays, they can rely on modern experimental setups, enabling high-quality data acquisition within a significantly reduced timeframe, easily outweighing the data collection potential of SCXRD equipment available to researchers 50 years ago. Still, data analysis is often based on the same model applied for half a century for the structure refinement. The independent atom model (IAM) laid the foundation for the precalculated scattering factors, which are frequently used.1 It has been shown extensively that the IAM and its assumption of non-interacting atoms are too crude for a modern electronic structure representation.2–5 The effects the IAM neglects include chemical bonding phenomena, electron lone pairs, weak interactions, and atom or molecular polarization.6,7 These effects can only be described by applying a non-spherical treatment of the electron density of atoms in a crystal structure. The lower quality of IAM structure models has been demonstrated explicitly for anisotropic displacement parameters (ADPs)6 and bond lengths.8,9 The shortcoming is particularly pronounced for hydrogen atoms, which are classically located at positions that do not agree with accurate neutron data,10,11 reinforcing the assumption of invisible H-atoms in x-ray experiments. For more complex endeavors than refining non-hydrogen atoms and displacement parameters, it is therefore considered imperative to replace the IAM, which promoted the development of new models over the decades.12
Two methodological branches concerning non-spherical refinement have been established in this context. The first branch is the multipole model,13 which utilizes an expansion of spherical harmonics to describe the electron density deformation. Its model parameters are usually fitted against high-resolution, high-quality data in a least squares procedure. As this method provided significantly better flexibility and systematic improvement in structure modeling,14 it has been extended with radial scaling in the Hansen–Coppens formalism,15 and in the last 30 years, database approaches have been successfully introduced to make it more feasible for routine applications.16–21 This approach focuses on reconstructing the electron density, the observable quantity of the x-ray diffraction experiment.
The second branch of approaches introducing asphericity in the atomistic crystallographic model can be summarized by the involvement of a quantum mechanical calculation used to augment the model.12,22 One such method is Hirshfeld atom refinement (HAR).23 Here, scattering factors are calculated from a stockholder partitioning of a molecular electron density, the so-called Hirshfeld atoms.20 Applying this method thus requires a preceding quantum mechanical calculation to obtain the electron density, which is subsequently used to derive unique scattering factors in each structure. HAR was found to be a powerful tool for the refinement of crystal structures24 as shown in various examples,25–28 including disordered moieties,5,29 heavy elements,30–32 and extended periodic structures.29,33,34 Using HAR in structure refinement has no additional requirement regarding experimental data quality (compared to a final IAM model). However, the quantum chemical calculation might sometimes pose a bigger issue, as the calculation cannot be performed on an unfinished structural model with missing or misassigned atoms or when the charge state of the asymmetric unit is not yet fully resolved. When the quantum chemical calculation is possible, applying the method does not entail any inherent disadvantages except for the additional investment of computational time. The model quality is expected to improve when the quantum chemical calculation is sufficiently accurate and no inherent discrepancies between the measured sample and the built model remain, such as missed substitutional or conformational disorder or systematic errors in the measured data like detector count-rate saturation or untreated absorption.
An implementation of the original software framework for HAR, tonto,35 was interfaced with Olex2,36 named HARt,26 offering a restricted set of functionality, making this method available to a broader audience. More recently, NoSpherA25 replaced this interface with a native implementation of HAR inside Olex2. It is designed to be a routine refinement tool as it directly interfaces with the refinement engine olex2.refine,37 allowing the usage of restraints, constraints, disorder, and treatment of special positions, which posed a problem at the time within tonto. The NoSpherA2-HAR is identical to HAR within tonto, in the sense that the initial density calculation during the refinement process relies purely on a single point (SP) calculation of the structure of interest, the calculation of Hirshfeld Atoms based on the pro-molecular atomic densities, and subsequent calculation of atomic scattering factors. Several methodological improvements have since been introduced. A fragmentation approach38,39 was implemented to refine structures at the scale of proteins in a reasonable time frame by enabling linear scaling, significantly reducing the computation requirements. The absence of an intermolecular environment with neighboring molecules during the density calculation was sometimes considered a drawback for the current implementation of HAR. The placement of cluster charges,26 extremely localized molecular orbitals (ELMO) embedding,40 or periodic density functional theory (DFT) in HAR34,41 can remedy this neglect. These additions provide more accurate structure models in cases where intermolecular interactions dominate the electron density distribution.
However, among all these methods, a significant diversity in the fundamental step of calculating the electron density prevails. This complexity is compounded by the demanding task of choosing the right level of theory, as in all other fields of quantum chemistry.42 Since standard software like ORCA43 or Gaussian,44 which offer a wide variety of methods, are compatible with NoSpherA2,5 the crystallographic users are confronted with the methodological question of a quantum chemist. There are, of course, guidelines for choosing an appropriate level of theory when it comes to optimizing a molecular geometry or calculating reaction energies, but for HAR, those recommendations are not yet tested or necessarily valid. In the context of HAR, comparing crystallographic quality indicators, such as deformation or residual electron density, or the resulting accuracy in hydrogen atom positions, provides a robust assessment of various methods. This has been shown in several examples.5,9,23,24,26,32,34,40,45,46 Previous results indicate that DFT is the most recommendable method (for HAR).5,23,24,40 A frequently utilized functional is the popular B3LYP from the hybrid generalized gradient approximation (GGA) family.47,48 It is known to perform poorly in other fields of quantum chemistry49,50 but remains a frequently presented representative of its functional class. The Hartree–Fock (HF) electronic exchange component, intrinsic to the pure HF method and the hybrid class functionals, was demonstrated to benefit the quality of HAR results.46 When post-HF methods were investigated in the context of HAR, they showed only minor improvement relative to the tested DFT methods at the cost of a significant increase in computational efforts.45
The choice of a basis set plays a fundamental role in the HAR procedure. Among the most commonly used basis sets is Ahlrichs' def2 series.51 Notably, smaller basis sets were observed to perform comparably to, or even better than,52 larger ones.46 In each attempt, different methods for comparison were employed for only a limited number of crystal structures. Such restrictions in study scope are reasonable since acquiring high-quality datasets is accompanied by substantial requirements in terms of crystal quality, instrumentation, and data collection. Additionally, the chosen target structures must be suitable for the investigated matter. Therefore, the benchmarking of HAR suffers from a bias, as it aims to make a general statement about the performance of a method, which should hold for a broad range of structures. However, the limited coverage of chemical space and the limited number of similar structures in the existing benchmark studies restrict the generalizability of the conclusions drawn. Moreover, observing trends when comparing levels of theory places a strong requirement of generality on the selection of tested parameters. For example, a simplified statement that hybrid functionals outperform GGA-class functionals lacks transferability when only B3LYP and PBE were tested. The available benchmark studies do not conclude with a general “best method” for HAR. Since the conflicting results of the available literature do not allow such conclusions, the question arises whether a general statement about a method's performance is possible. To investigate further, it is essential to use a broader range of refinement results from a wide spectrum of levels of theory. In this work, we present a benchmarking study of HAR, which aims to provide a general statement about the performance of a given method when applied to a selection of small organic polar molecules.
The methods used for calculating the electron density are focused on isolated molecular-level calculations in this study. The set of test structures analyzed is restricted to the well-defined class of amino acids. Therefore, any statement about the performance of a level of theory should show transferability mostly to similar structures of organic molecules containing polar groups.
II. METHODOLOGY
The benchmark study was conducted on SCXRD datasets of 14 amino acids. Parameters and combinations thereof were chosen based on the availability within the refinement software. The implementation of NoSpherA25 in Olex236 was used to perform HAR. A complete permutation of the refinement parameters was performed so as not to miss a potential influence by the correlation of inputs on the outcome (see Sec. II B). The sequential refinement was aided by the plugin SISYPHOS.53
A. X-ray data
The SCXRD data recorded for the amino acids were of adequate quality with a resolution of dmin = 0.6 Å or better (see Sec. V). The selection includes two non-proteinogenic amino acids, 2-aminoisobutyric acid (Aib) and racemic pidolic acid (racPA). L-arginine, L-asparagine, and L-serine datasets were collected as hydrates (see Fig. 1). The datasets are labeled by the three-letter code of the standard amino acids,54 and the L-stereo descriptor and hydrate state are not explicitly stated from here on. The initial IAM refinement, used as input for SISYPHOS, was performed with olex2.refine,36,37 and all data were refined at their full resolution. Positions of the H-atoms were refined freely, and the hydrogen displacement parameters were refined isotropically. Anisotropic displacement parameters were employed for all non-hydrogen atoms. The resulting models are the starting points from which the HARs are performed and are referred to as IAM structures from here on.
FIG. 1.
Structures used for this benchmark after initial IAM refinement. ADPs are shown at a 50% probability level.
B. Selection of quantum mechanical methods in HAR
All single-point calculations were performed using the ORCA 5.0 software,55 operated by NoSpherA2.5 Twelve density functional theory (DFT) functionals and the Hartree–Fock (HF) method were tested, as available within the NoSpherA2 user interface. This selection contains several functionals from each class of DFT methods (see Sec. II C). The tested basis sets include all 16 (non-relativistic) ones available in the NoSpherA2-GUI, covering several examples of the most widespread basis set families (see Sec. II D). The default SP calculation is performed for a free molecule in vacuum. As described above, considering the molecular surroundings in the crystal packing is deemed advantageous to HAR. Instead of explicit modeling of a crystal environment, the easily applicable conductor-like polarizable continuum model (CPCM solvent model)56 is tested, in which the environment is treated as a dielectric polarizable continuum surrounding the molecule. A water model is chosen since the crystal packing of the amino acids is dominated by hydrogen bonding. Including the self-consistent non-local (SCNL) dispersion correction VV10 tests the influence of dispersion interactions on the refinement outcome.57 The Becke grid size,58 which determines the accuracy of the numerical integration during both the single-point calculation and the calculation of scattering factors, is varied to test for the influence of this numerical parameter. A systematic permutation of the parameters was performed according to the decision tree in Fig. 2 to investigate any conceivable trends.
FIG. 2.
Flow chart presenting the variation tree and tested parameters for the refinement; following one branch represents one set of options chosen for one refinement. The options cover the choice whether to include a self-consistent non-local (SCNL) dispersion correction after Vydrov and Van Voorhis, a solvation model, the grid density for numerical integration, and the level of theory, including QM method and basis set choice. Three dots represent a repetition of the boxes above for the same path, but are omitted for readability. References for each parameter and further details about the depicted options are listed in Secs. III A–III E.
In combination, all permutation paths yield 2496 refinements, which were performed for each amino acid dataset, yielding 34 944 structural models in total. The sequential processing of these calculations was aided by the SISYPHOS plugin, where an input file controls the settings of each refinement performed successively. Two parameters were kept constant: the convergence strategy was fixed at “EasyConv,” and the SCF threshold at “NoSpherA2SCF” (NoSpherA2SCF is a selection of ORCA input criteria: energy convergence at 3 × 10⁻⁵, DIIS error convergence at 1 × 10⁻⁴, the density matrix threshold at 1 × 10⁻⁹, the orbital gradient convergence at 3 × 10⁻⁴, and the orbital rotation angle convergence at 3 × 10⁻⁴). Hydrogen atoms were refined anisotropically and free of positional constraints or restraints.
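The count of 2496 refinements per structure follows directly from the option lists described above. The sketch below enumerates them; the method names are taken from the text, while the 16 basis-set labels are placeholders (an assumption, since the individual basis sets are not named here), and the option strings are illustrative rather than actual SISYPHOS or NoSpherA2 keywords.

```python
from itertools import product

# Option lists following Fig. 2 and Secs. II B-II D of the text.
scnl_options = [True, False]
solvent_options = ["CPCM(water)", "vacuum"]
grid_options = ["Low", "Normal", "High"]
methods = ["HF", "PWLDA", "BP", "BP86", "PBE", "BLYP", "TPSS", "R2SCAN",
           "PBE0", "B3LYP", "M06-2X", "wB97", "wB97X"]
# Placeholder labels (assumption): the 16 basis sets are not listed here.
basis_sets = [f"basis_{i:02d}" for i in range(1, 17)]

# One tuple per path through the decision tree.
settings = list(product(scnl_options, solvent_options, grid_options,
                        methods, basis_sets))

n_structures = 14
print(len(settings))                 # 2 * 2 * 3 * 13 * 16 = 2496
print(len(settings) * n_structures)  # 34944
```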
C. Density functionals
The rating of DFT functionals is often conceptualized with rungs on Jacob's ladder, as proposed by Perdew,59 which illustrates the hierarchical nature of functionals in terms of their inherent properties and accuracy. In this work, the low-rung class of local density approximated (LDA) functionals is represented by PWLDA.60 The GGA family is represented by BP,61 BP86,62 PBE,63 and BLYP;48 the meta-GGA functionals, including higher derivatives of the electron density, tested are TPSS64 and R2SCAN.65 Hybrid functionals are considered especially suited for molecular systems, represented in this benchmark by PBE066 and B3LYP.47,48 The meta hybrid functionals M06-2X,67 ωB97,68 and ωB97X68 represent the highest rung tested, closest to chemical accuracy.
D. Basis sets
The 16 basis sets tested comprise several members of different popular families.
E. Quality indicators for quantitative evaluation of refinement results
The performance of each refinement is quantified using two quality indicators. The first accounts for the representation of the electronic structure and combines classical R-values (R1, wR2), minimum and maximum residual electron density, and the root mean square (RMS) of the residual electron density map. They are combined into a single value, since they describe different aspects of the agreement between the modeled electron density distribution and the diffraction data, either in reciprocal or real space. Their improvement compared to the IAM shows high correlation in most cases (see Fig. S5.1), which allows the combination into a single value. For further details, the comparison of all refinement results for all individual descriptors can be found in the corresponding chapters of the supplementary material.99 For a better-performing refinement, all individual values should be as low as possible. Those indicators K are combined into a single figure of merit (FOM) of the electron density description that quantifies the improvement relative to the IAM model [Eq. (1)]
| (1) |
A negative value would indicate a worsening of refinement statistics, while a value of 5.0 would indicate a perfect refinement, with zero discrepancy between the measured data and the HAR model. A superior set of settings for HAR is, therefore, reflected by a higher value of the FOM, which allows a ranking of the parameter sets. Despite their high correlation for HAR (see Fig. S5.1),99 the supplementary material contains the visual representations of the discussion with each K separately, replacing the FOM with the individual metric for more in-depth information.
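As an illustration of how such a figure of merit behaves, the following minimal sketch assumes that the FOM is the sum of the relative improvements (K_IAM − K_HAR)/K_IAM over the five indicators. This form is an assumption chosen to match the stated limits (5.0 for zero discrepancy, negative values for a worsening) and is not necessarily the exact Eq. (1); the function name and all numerical values are hypothetical.

```python
def figure_of_merit(iam, har):
    """Sum of relative improvements over the IAM for each indicator K.

    Assumed form: sum over K of (K_IAM - K_HAR) / K_IAM.
    """
    total = 0.0
    for key, k_iam in iam.items():
        # Magnitudes are used so that the (negative) minimum residual
        # density is treated like the other indicators.
        k_iam = abs(k_iam)
        k_har = abs(har[key])
        total += (k_iam - k_har) / k_iam
    return total


# Hypothetical indicator values, for illustration only.
iam = {"R1": 0.030, "wR2": 0.080, "rho_min": -0.30, "rho_max": 0.40, "rho_rms": 0.060}
har = {"R1": 0.018, "wR2": 0.040, "rho_min": -0.15, "rho_max": 0.20, "rho_rms": 0.030}
perfect = {key: 0.0 for key in iam}

print(figure_of_merit(iam, har))      # positive: HAR improves on the IAM
print(figure_of_merit(iam, perfect))  # upper limit for five indicators
print(figure_of_merit(iam, iam))      # no change relative to the IAM
```

Under this assumed form, each indicator contributes at most 1.0, which is why five indicators give an upper bound of 5.0.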
The accuracy of the refined hydrogen atom positions is evaluated to complement the electronic description captured by the FOM. The corresponding neutron experiments of the amino acids in the literature are used as a reference. The average deviation of the refined X-H distances from the reference neutron data is calculated as the weighted root mean square deviation (wRMSD) [Eq. (2)]. We only consider the improvement of hydrogen atom positions since other atomic positions have been shown to be influenced far less by the introduction of non-spherical scattering factors5,80
| (2) |
Employing the wRMSD, the structural agreement between the HAR SCXRD and the neutron diffraction results regarding hydrogen atom positions is quantified. The experimental neutron diffraction data for the amino acids were used from the corresponding studies: Ala,81 Arg,82 Asn,83 Cys,84 His,85 Glu,86 Gly,87 Ser,88 Thr,89 and Tyr.90 For structures not subjected to neutron diffraction experiments (including Aib, Asp, Pro, and racPA), generalized X-H bond distances were applied as in Ref. 91. A comparison of the bond lengths from the available neutron structures with the generalized neutron data (Fig. S1.1) shows a close agreement in this regard. The systematic error of the X-H bond introduced by the temperature difference between our x-ray diffraction data and the reported neutron diffraction experiment is considered negligible for this study.24
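A weighted root mean square deviation of bond lengths can be sketched as follows. The weighting scheme of Eq. (2) is not restated here, so the `weights` argument is a hypothetical placeholder (for example, inverse variances of the refined distances), and equal weights are used by default; the distances in the example are invented for illustration.

```python
from math import sqrt


def weighted_rmsd(xray, neutron, weights=None):
    """sqrt( sum_i w_i * (d_xray_i - d_neutron_i)^2 / sum_i w_i )."""
    if len(xray) != len(neutron):
        raise ValueError("distance lists must have the same length")
    if weights is None:
        # Assumption: equal weights when no scheme is supplied.
        weights = [1.0] * len(xray)
    numerator = sum(w * (dx - dn) ** 2
                    for w, dx, dn in zip(weights, xray, neutron))
    return sqrt(numerator / sum(weights))


# Hypothetical X-H distances in angstrom, for illustration only.
d_har = [1.00, 1.10]
d_neutron = [1.03, 1.06]
print(weighted_rmsd(d_har, d_neutron))
print(weighted_rmsd(d_har, d_neutron, weights=[1.0, 0.0]))
```

A value of zero would correspond to X-H distances identical to the neutron reference, so lower values indicate better hydrogen positions.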
During the preparation of this manuscript, two articles were published that highlight the benefits of the general use of NoSpherA2 for small molecule structure refinement92 and the benefits of HAR in quantum crystallographic protocols.93 Together, they provide recommendations for the application of HAR in quantum crystallographically augmented procedures and further discussion of the experimental requirements and expectations associated with HAR, which would go beyond the scope of this study.
III. RESULTS AND DISCUSSION
Using the FOM [Eq. (1)] and wRMSD [Eq. (2)], each refinement's result is evaluated, allowing the analysis of each parameter's impact on the refinement result. Consequently, the global average of a parameter's influence on the FOM and wRMSD is computed. The respective relative mean values represent the influence on the refinement outcome. For clarification, the four structures for which no neutron diffraction data were available have been positioned at the bottom of Figs. 3 and 4. Additionally, alternative versions of Figs. 5 and 6 are displayed in the supplementary material (Figs. S8.1 and S8.7).99 These figures only consider the ten amino acid datasets, which include neutron diffraction data for the calculation of wRMSD. The results are presented in Secs. III A–III E, grouped by the option.
FIG. 3.
FOM (orange) and wRMSD (blue) violin plots for the distributions of refinements using a CPCM model for water or without any treatment of environmental effects (Vacuum), plotted around the global mean (dotted line); the respective means are marked in black. The scales are oriented so that improvement compared to the global mean is located at the right-hand side of the mean line.
FIG. 4.
FOM (orange) and wRMSD (blue) violin plots of the distributions using different integration grid settings plotted around the global mean (red dotted line), the respective means of individual distributions are marked in black. The scales are oriented so that improvement compared to the global mean is located at the right-hand side of the mean line.
FIG. 5.
2D-scatterplot of refinement results for all structures using different QM-methods in the FOM*-wRMSD* plane. Mean values of the different method clusters are indicated as black x-markers; each method-cluster is colored according to the distance of the mean value to the point (−1,−1); a short distance is indicated with a darker color. Box-and-whiskers plots (right) show the median (white line), and the box frames the interquartile range (IQR). The whiskers are in a range of 1.5 times the IQR; datapoints outside this range are drawn as flier-props.
FIG. 6.
Two-dimensional metric multidimensional scaling (MDS)97 output for the employed quantum mechanical methods, according to which the similarities are visualized as pairwise distances. The dimensions which span up the graph do not have an intrinsic meaning and are arbitrarily chosen to preserve the underlying data structure. Stress value for both MDS: 0.011.
A. Influence of the non-local dispersion correction
Listed first in the decision tree (Fig. 2) is whether or not to apply the self-consistent non-local dispersion correction (keyword: SCNL True, or False). The use of a dispersion correction is a standard procedure in quantum chemistry since accounting for the missing contribution of dispersion interactions to the electron density is indispensable for optimizing molecular geometries or calculating reaction energies. The SCNL method in ORCA pairs a nonlocal van der Waals functional with the corresponding exchange-correlation functional of the respective DFT method. A recommended choice is the VV10 correlation model introduced by Vydrov and Van Voorhis.57 The influence on the refinements using NoSpherA2 by dispersion correction is visualized in Fig. S2.1. It displays the refinement results of each amino acid dataset separated into two groups, one with the SCNL method applied (1920 refinements per structure, as SCNL is not defined for M06-2X, wB97, PWLDA) and the ones without. The results show that the indicators FOM and wRMSD are closely located on the global mean in all cases with their distributions not being distinguishable. The SCNL method thus shows no significant influence on the refinement outcome. Therefore, the less time-consuming refinement without the SCNL method is recommended for small polar molecules.
B. Influence of a solvent model
Performing the molecular electron density calculation in the gas phase is a simplification, which introduces a systematic error. Several methods to account for the molecular environment were reported for HAR, ranging from periodic DFT calculations34 to explicitly arranging cluster point charges based on the calculated unit.24,26 This section presents the influence of the electrostatic contribution of a CPCM solvent model56 on the refinement outcome, an approach that has not previously been reported for HAR. Since the method is applied to a solid-state structure, including solvation in the quantum mechanical calculation might appear counterintuitive, but modeling the surroundings even to an imprecise extent may offer an improved structure model compared to the complete neglect of environmental effects. Due to the strong hydrogen bonding network in the crystal packing of the amino acids, a water model was chosen as a suitable CPCM model. The results from these calculations were compared against a solvent-free (isolated molecule in vacuo) calculation.
As presented in Fig. 3, the solvation model exhibits a positive effect for almost all structures. The averages of both improvement indicators, FOM and wRMSD, shift above the average of all calculations. Glu is the only exception, showing a negative influence of the solvation model regarding the wRMSD. Even though the model is applied to the solid state, the results suggest that using a solvation model is generally beneficial for HAR of small polar organic molecules. The overall beneficial effect probably originates from two contributions: the water solvation model imitates the hydrogen bonding network in the crystal packing by polarizing the bonds within the calculated molecule, and it provides a boundary for the negatively charged carboxylate groups, which are otherwise known to become disproportionately diffuse in vacuo.94,95
C. Influence of the integration grid
The accuracy of the integration grid is a computational parameter that influences the numerical calculation of the electron density based on spherical, atom centered grids, affecting both the single point calculation in ORCA and the calculation of scattering factors within NoSpherA2. These grids are not evenly spaced with respect to the radius, but grid points are more concentrated in regions of high electron density near the nucleus to capture regions of higher variance more effectively.96 The respective options that can be adjusted and tested are the grid settings Low, Normal, and High (Fig. 4).
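The uneven radial spacing described above can be illustrated with the mapping introduced by Becke for atom-centered grids, r = r_m (1 + x)/(1 − x), applied to Chebyshev nodes x in (−1, 1). This is only a sketch of the general idea: the function name and the parameter `r_m` are chosen here for illustration, and the radial schemes actually used by ORCA and NoSpherA2 for the Low, Normal, and High settings may differ.

```python
from math import cos, pi


def becke_radial_points(n, r_m=1.0):
    """Radial grid points from Chebyshev nodes via r = r_m * (1 + x) / (1 - x)."""
    points = []
    for i in range(1, n + 1):
        x = cos(i * pi / (n + 1))  # Chebyshev node, strictly inside (-1, 1)
        points.append(r_m * (1.0 + x) / (1.0 - x))
    return sorted(points)


radii = becke_radial_points(20)
inner = sum(1 for r in radii if r < 1.0)
print(f"{inner} of {len(radii)} points lie within r < r_m")
print(f"smallest radius: {radii[0]:.4f}, largest radius: {radii[-1]:.1f}")
```

Half of the points fall inside the radius r_m, while the remaining half covers the entire outer region, so the sampling is far denser close to the nucleus.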
The resulting distribution highlights that the grid size influences the refinement outcome. In addition to Asn and Asp, the highest FOM values are achieved with the High grid setting, while the mean values of the Low setting are located below the average. A numerical error introduced by the grid accuracy thus impacts the electronic structure representation, and the coarse-meshed grid results in a lower-quality structural model. Since the difference between the Normal and High grid settings is small compared to that between Low and Normal, the numerical stability has mostly been recovered by the Normal grid setting.
The wRMSD of bond lengths is less influenced by the grid size. Still, the Low setting is, surprisingly, above average in most cases, contradicting the above-mentioned trend observed for the electron density description. From this standpoint, the Normal grid setting is recommended to refine amino acids. The Normal setting bears another advantage, as it is significantly less time-consuming for the partitioning than the High setting (see Fig. S7.1), making it a reasonable choice for refinements from a practical standpoint.
D. Influence of the QM-method
Changing each of the parameters discussed above shifted the refinement quality indicators for a whole series of refinements. The influence of the method and the basis set on the refinement outcome is discussed in this section and Sec. III E. To allow a better relative evaluation of the effect on the refinement outcome, rescaled versions of the electronic FOM and the geometric wRMSD [Eqs. (3) and (4)] are introduced. The values for each refinement i are referenced to the global mean of the respective geometrical or electronic parameter of all refinements. A value of 0.0 for this rescaled metric therefore corresponds to the average performance. Both indicators are calculated so that a negative value indicates an improvement compared to the average across all datapoints
| (3) |
| (4) |
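The rescaling can be sketched as follows, assuming that each value is expressed as a relative deviation from the global mean with the sign chosen so that negative values indicate an improvement: FOM*_i = (mean(FOM) − FOM_i)/mean(FOM) and wRMSD*_i = (wRMSD_i − mean(wRMSD))/mean(wRMSD). The normalization by the mean is an assumption made for this illustration and is not necessarily the exact form of Eqs. (3) and (4); the input values are hypothetical.

```python
from statistics import mean


def rescale(fom_values, wrmsd_values):
    """Reference each refinement to the global mean of all refinements.

    Assumed sign convention: negative values are better than average.
    Requires a positive global mean for both indicators.
    """
    fom_mean = mean(fom_values)
    wrmsd_mean = mean(wrmsd_values)
    # A higher FOM is better, so the sign is flipped.
    fom_star = [(fom_mean - f) / fom_mean for f in fom_values]
    # A lower wRMSD is better, so the sign is kept.
    wrmsd_star = [(w - wrmsd_mean) / wrmsd_mean for w in wrmsd_values]
    return fom_star, wrmsd_star


# Hypothetical results of three refinements, for illustration only.
fom_star, wrmsd_star = rescale([2.0, 3.0, 4.0], [0.02, 0.03, 0.04])
print(fom_star)
print(wrmsd_star)
```

With this convention, a refinement at the global mean scores 0.0 in both metrics, and a vanishing wRMSD would map to −1.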
All 2496 refinements performed for each test structure were color-coded by QM method and plotted in the FOM*-wRMSD* plane (Fig. 5). The different performance of the individual quantum mechanical methods applied during HAR is observable from the shifted mean value of each subset, which is marked by a cross, the label of the method used, and the color. Due to the rescaling of the quality parameters, the best result in this plot would be a negative score in both FOM* and wRMSD*.
The data obtained using Eqs. (3) and (4), shown in Fig. 5, indicate a clear trend by which the methods can be ranked. The mean value of the refinements using the HF method has the shortest distance to the point (−1, −1); HF outperforms all other methods regarding both the FOM and the wRMSD. Functionals with intrinsic HF exchange, including PBE0, the long-range corrected ωB97(X), and the meta-hybrid functional M06-2X, are positioned on the left side of the ranking by method. The meta-GGA R2SCAN is the best-performing non-hybrid functional, surpassing the results of B3LYP. The tested GGA functionals yield surprisingly poor results.
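The ranking criterion, the distance of each cluster mean to the point (−1, −1) in the FOM*-wRMSD* plane, can be sketched as below. The function name, the cluster labels, and the coordinates are hypothetical and serve only to show the procedure; they are not results of this study.

```python
from math import dist
from statistics import mean


def rank_by_distance(clusters, target=(-1.0, -1.0)):
    """Order cluster labels by the distance of their mean to the target point."""
    ranking = []
    for label, points in clusters.items():
        centre = (mean(p[0] for p in points), mean(p[1] for p in points))
        ranking.append((dist(centre, target), label))
    # Shortest distance first, i.e., best-performing cluster first.
    return [label for _, label in sorted(ranking)]


# Hypothetical (FOM*, wRMSD*) points per method, for illustration only.
clusters = {
    "method_A": [(-0.2, -0.1), (-0.4, -0.3)],
    "method_B": [(0.1, 0.2), (0.3, 0.0)],
    "method_C": [(0.0, -0.1), (-0.1, 0.1)],
}
print(rank_by_distance(clusters))
```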
In conclusion, pure HF outperforms all other tested methods, which is, considering the missing account of electron correlation, surprising. Additionally, R2SCAN, a modern meta-GGA without HF exchange, surpasses the popular hybrid functional B3LYP. Ranked in between, PWLDA, considered a rudimentary method unsuitable for modern applications, yields hydrogen positions of comparable quality to functionals of higher rungs on the Jacob's ladder, surpassing those obtained with B3LYP. The amount of HF exchange for modeling strong polar environments is known to impact the results, and comparisons between the HF and hybrid functionals have already shown this result in recent studies.46,80 However, the results of the complete benchmark show that HF plays an outstanding role as its refinement outcomes make it stand out from all high-level hybrid functionals, not only in individual refinements but also observable as a global average over a broad range of structures and parameter combinations within the tested class of compounds.
The performance of pure HF and how it compares qualitatively with the hybrid methods is demonstrated in Fig. 7 by multidimensional scaling (MDS) analysis. MDS represents the (dis)similarities within high-dimensional data as distances in a 2D space, allowing an inspection of multidimensional datapoints in only two dimensions. Starting from a distance (or dissimilarity) matrix of the data, MDS seeks points in a low-dimensional space that best preserve the initial pairwise distances. This eases the visual interpretation and identification of patterns compared to the original high-dimensional data. Here, MDS was performed on the wRMSD and FOM of all refinements, labeling them by the methods used. The pairwise distance indicates the dissimilarity; since HF takes a distinct position in both plots, its dissimilarity to the other methods is highly pronounced. The hybrid functionals are located close to each other, with R2SCAN as the only non-hybrid functional in the cluster, which is especially pronounced in the wRMSD MDS. The observable separation in Fig. 7 between the higher and lower rungs of Jacob's ladder indicates the disparate behavior and overall performance differences of the functional families.
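How well a 2D embedding preserves the initial pairwise distances is commonly summarized by a stress value. The following sketch computes one normalized variant, sqrt(Σ(δ_ij − d_ij)² / Σ δ_ij²), where δ_ij are the input dissimilarities and d_ij the distances in the embedding. The choice of normalization is an assumption for illustration and may differ from the definition behind the stress value reported for the MDS figure; the data are hypothetical.

```python
from math import dist, sqrt


def normalized_stress(dissimilarity, embedding):
    """sqrt( sum_{i<j} (delta_ij - d_ij)^2 / sum_{i<j} delta_ij^2 )."""
    n = len(embedding)
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = dist(embedding[i], embedding[j])
            numerator += (dissimilarity[i][j] - d_ij) ** 2
            denominator += dissimilarity[i][j] ** 2
    return sqrt(numerator / denominator)


# Hypothetical dissimilarities of three objects (a 3-4-5 triangle).
delta = [[0.0, 3.0, 4.0],
         [3.0, 0.0, 5.0],
         [4.0, 5.0, 0.0]]
exact = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]      # reproduces all distances
distorted = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]  # one point misplaced

print(normalized_stress(delta, exact))
print(normalized_stress(delta, distorted))
```

A stress of zero means that the embedding reproduces every pairwise dissimilarity; larger values indicate increasing distortion.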
FIG. 7.
2D scatterplot assigning the refinements using different basis sets for all structures to a point in the FOM*-wRMSD* plane. The mean value of each basis-set cluster is indicated by a black x-marker; each cluster is colored according to the distance from its mean value to the point (−1,−1), with a shorter distance indicated by a darker color. Box-and-whisker plots (right) show the median (white line), the interquartile range (IQR, box), and whiskers within a range of 1.5 times the IQR for FOM* (top) and wRMSD* (bottom); data outside this range are drawn as outliers (fliers).
The qualitative distinction of pure HF is evident in both the wRMSD and the FOM MDS plots, where it appears noticeably more distant from all other methods. Based on this benchmark study of amino acids, the HF method therefore emerges as the most favorable choice. However, the generalizability of these findings to a broader chemical context requires further investigation. The R2SCAN functional is a recommended alternative because of its performance and lower computational cost, especially if the system is expected to show strong electron correlation. The suggestion to refrain from using B3LYP (and BLYP) should also be highlighted at this point, particularly when the more cost-effective R2SCAN functional is available.
E. Influence of the basis set
Traditionally, the choice of a basis set is primarily a question of time and computational resources. The qualitative differences that emerge from the selection of a basis set are analyzed using the same methodology as for the methods. Figure 6 is the 2D map of all refinements in the FOM*-wRMSD* plane. Most basis sets tested show average values (black “X”) close to each other in both metrics, indicating similar performance. The expected outliers are the small basis sets STO-3G and 3-21G (a minimal and a small split-valence basis set, respectively). The refinements with these basis sets are significantly separated regarding wRMSD* and FOM*. The 6-31G(d) basis set also deviates notably from the bulk of refinements concerning the wRMSD*. Interestingly, the double-zeta basis sets show superior performance regarding the FOM*, as indicated by their median values. For the wRMSD*, on the other hand, the larger basis sets exhibit better results.
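The summary quantities read off such a map (cluster mean, its distance to the point (−1,−1), the medians, and the 1.5 × IQR range of the box plots) can be computed as in the following minimal sketch. It assumes that each refinement is already available as a (FOM*, wRMSD*) pair; the function names and the input values are hypothetical and do not reproduce data from this study.

```python
import math
import statistics

REFERENCE_POINT = (-1.0, -1.0)  # reference point named in the figure caption


def summarize(points):
    """Mean point, its distance to the reference point, and both medians."""
    fom = [p[0] for p in points]
    wrmsd = [p[1] for p in points]
    mean = (statistics.fmean(fom), statistics.fmean(wrmsd))
    return {
        "mean": mean,
        "distance": math.dist(mean, REFERENCE_POINT),
        "median_fom": statistics.median(fom),
        "median_wrmsd": statistics.median(wrmsd),
    }


def tukey_fences(values, k=1.5):
    """Lower and upper limits at k times the IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up (FOM*, wRMSD*) pairs for one hypothetical basis-set cluster.
cluster = [(-0.9, -0.8), (-0.7, -0.9), (-0.8, -0.6), (-0.6, -0.7)]
summary = summarize(cluster)
print(summary)
print(tukey_fences([p[0] for p in cluster]))
```

Values outside the two limits returned by `tukey_fences` are the ones a box plot with 1.5 × IQR whiskers would draw as individual outlier points.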
Considering the observed deviations between the basis sets (besides the clear outliers), the choice of the basis set is less critical than that of the method. Therefore, a definite ranking according to the performance of the basis sets is difficult due to the slight differences in the FOM* and wRMSD* values. Regarding resources and time, even smaller basis sets like def2-SVP are recommendable for HAR. The MDS similarity analysis of the basis sets is presented in Fig. 8.
FIG. 8.
Two-dimensional metric multidimensional scaling (MDS)97 output for the quantum basis sets, according to which the similarities are visualized as pairwise distances. The dimensions that span the graph do not have an intrinsic meaning and are arbitrarily chosen to preserve the underlying data structure. Stress value for both MDS: 0.033. Note: Refinements using STO-3G did not reach convergence during the HAR procedure for all structures, and STO-3G was therefore excluded from the MDS analysis.
The double-zeta basis sets exhibit proximity within a cluster, while the larger sets demonstrate a distinct separation. The small basis 3-21G is located in a distant position for both the FOM* and wRMSD* MDS, indicating its different behavior. The 6-31G(d) basis set is notably segregated from most basis sets regarding wRMSD*, aligning with its comparatively poor refinement performance observed in Fig. 6.
The clustering of basis sets in the MDS, showing similar performance in Fig. 8, emphasizes the relatively minor influence of the basis set during HAR. Only minute differences in the refinement outcomes are observed for double-zeta and larger basis sets, indicating a limited possibility of spoiling a refinement by choice of a particular basis set.
IV. CONCLUSIONS
This study investigated the impact of various parameters of the computational methods on the quality of the resulting Hirshfeld atom refinement (HAR) models. Because the existing literature offers conflicting information, no clear consensus on a recommended procedure has emerged. A well-defined set of test structures with sufficient experimental data quality was crucial to address this issue systematically and ensure reliable results. The selection of amino acids represented a diverse chemical range within a defined group and, coupled with enhanced experimental data quality, established a distinct and reliable test set. A comprehensive permutation of refinement parameters was conducted to avoid overlooking potential influences or correlations.
Among these parameters, the SCNL correction did not qualitatively affect the outcomes, while applying a continuum water solvent model proved beneficial for the tested molecules compared to a calculation in vacuo. The integration grid accuracy should be chosen as “Normal,” as it balances refinement outcome and computational effort. The method and basis set exhibited the highest variability and potential influence on refinement outcome. Various DFT methods demonstrated performance proportional to their computational effort, approximately aligning with a Jacob's ladder ranking. Especially the use of the fast R2SCAN functional was beneficial, and it outperformed other, more sophisticated functionals, most notably the popular B3LYP functional.
In contrast, while HF neglects electron correlation, it outperformed all DFT methods across all structures in the test set. This observation can be understood as a hint that another effect overcompensates the missing electron density redistribution into the electronic structure's core during HAR. One possibility is the neglect of the atomic charge by Hirshfeld partitioning, which other partitioning schemes can overcome.52,98 Future studies could also be extended to explicitly include packing effects or perform periodic calculations to more accurately capture the crystalline surroundings, which may significantly influence the observed trends and provide a more comprehensive assessment of the methods.
Small Pople basis sets generally performed worse than the remaining ones. Larger basis sets, particularly from the def2 series, demonstrated superior performance when combined with the parameter settings found to be beneficial in this benchmark. The qualitatively independent MDS analysis supported the observed trends regarding method and basis set choices.
In summary, based on the obtained results, the following settings are recommended for a HAR on polar organic molecules:
- Method: R2SCAN
- Basis set of the def2 family: def2-SVP or higher
- Integration accuracy: Normal
- Solvation: Water
- SCF convergence: NormalSCF or NoSpherA2SCF
These settings do not guarantee the best results, but they have yielded results comparable to those of computationally more demanding settings at a lower computational effort.
Settings like “High” integration accuracy or extremely sophisticated basis sets like cc-pVQZ or 6-311++G(2d,2p) do not perform better than the recommended settings for polar organic molecules and can, therefore, be avoided. Also, functionals like B3LYP and calculations in vacuo should be exchanged for either more efficient ones or more suitable ones, as shown above.
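For orientation, the recommended wavefunction settings can be collected in a small configuration sketch. The Python snippet below is illustrative only: it assumes that the calculation is run with ORCA and that the simple-input keywords `r2SCAN`, `def2-SVP`, `CPCM(Water)`, and `NormalSCF` express the method, basis set, continuum water model, and SCF convergence, respectively. The dictionary layout and the helper function are hypothetical, the integration accuracy is deliberately left out because its mapping to a grid keyword is not specified here, and in practice the refinement software writes the input file itself.

```python
# Hypothetical container for the recommended settings (names are assumptions).
RECOMMENDED = {
    "method": "r2SCAN",          # meta-GGA functional
    "basis": "def2-SVP",         # or a larger member of the def2 family
    "solvation": "CPCM(Water)",  # implicit water, assumed CPCM keyword
    "scf": "NormalSCF",          # SCF convergence level
}


def orca_keyword_line(settings):
    """Join the settings into one ORCA-style simple-input keyword line."""
    order = ("method", "basis", "solvation", "scf")
    return "! " + " ".join(settings[key] for key in order)


print(orca_keyword_line(RECOMMENDED))
# prints: ! r2SCAN def2-SVP CPCM(Water) NormalSCF
```

Keeping the settings in one place makes it easy to swap a single entry, for example a larger def2 basis set, without touching the rest of the setup.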
V. EXPERIMENT
The amino acid crystals were obtained by recrystallizing the commercially obtained bulk materials. Details are described in the supplementary material. The diffraction experiments were conducted on a STOE STADIVARI Eulerian 4-circle diffractometer. All measurements utilized Mo-Kα radiation generated from a Genix microfocus tube, and the intensities were collected with a DECTRIS Pilatus 200K detector. The experiments were conducted at 100 K, cooled by an Oxford Cryostream 800 with nitrogen gas. The frames were integrated with integrate3d, and the raw intensities were further corrected for absorption by multi-scan absorption correction and scaling with STOE's LANA software. The unmerged data were subsequently employed for structure solution and refinement by olex2.solve and olex2.refine.36,37 The resulting IAM structure models are the starting points for all HARs as described.
ACKNOWLEDGMENTS
D.B. acknowledges funding by the RWTH-Graduiertenförderung. F.M. acknowledges funding by the Studienstiftung des Deutschen Volkes. Calculations were performed on the RWTH HPC computing infrastructure within thesis and small computing projects.
Note: This paper is part of the Special Topic on Neutron Scattering and Quantum Crystallography.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Daniel Brüx: Conceptualization (equal); Data curation (lead); Investigation (lead); Methodology (equal); Software (equal); Validation (equal); Visualization (lead); Writing – original draft (lead). Florian Meurer: Conceptualization (equal); Software (equal); Writing – review & editing (equal). Florian Kleemiss: Conceptualization (equal); Funding acquisition (lead); Methodology (supporting); Project administration (lead); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article and its supplementary material.
References
- 1. International Tables for Crystallography: Mathematical, Physical and Chemical Tables, edited by Prince E., 1st ed. (International Union of Crystallography, Chester, England, 2006).
- 2. Stewart R. F., “Electron population analysis with rigid pseudoatoms,” Acta Crystallogr. A 32(4), 565–574 (1976). 10.1107/S056773947600123X
- 3. Flierler U. and Stalke D., “More than just distances from electron density studies,” in Electron Density and Chemical Bonding I, edited by Stalke D. (Springer, Berlin, Heidelberg, 2012), pp. 1–20.
- 4. Lecomte C., Guillot B., Jelsch C., and Podjarny A., “Frontier example in experimental charge density research: Experimental electrostatics of proteins,” Int. J. Quantum Chem. 101(5), 624–634 (2005). 10.1002/qua.20317
- 5. Kleemiss F., Dolomanov O. V., Bodensteiner M., Peyerimhoff N., Midgley L., Bourhis L. J., Genoni A., Malaspina L. A., Jayatilaka D., Spencer J. L., White F., Grundkötter-Stock B., Steinhauer S., Lentz D., Puschmann H., and Grabowsky S., “Accurate crystal structures and chemical properties from NoSpherA2,” Chem. Sci. 12(5), 1675–1692 (2021). 10.1039/D0SC05526C
- 6. Hupf E., Kleemiss F., Borrmann T., Pal R., Krzeszczakowska J. M., Woińska M., Jayatilaka D., Genoni A., and Grabowsky S., “The effects of experimentally obtained electron correlation and polarization on electron densities and exchange-correlation potentials,” J. Chem. Phys. 158(12), 124103 (2023). 10.1063/5.0138312
- 7. Ernst M., Genoni A., and Macchi P., “Analysis of crystal field effects and interactions using X-ray restrained ELMOs,” J. Mol. Struct. 1209, 127975 (2020). 10.1016/j.molstruc.2020.127975
- 8. Sanjuan-Szklarz W. F., Woińska M., Domagała S., Dominiak P. M., Grabowsky S., Jayatilaka D., Gutmann M., and Woźniak K., “On the accuracy and precision of X-ray and neutron diffraction results as a function of resolution and the electron density model,” IUCrJ 7(5), 920–933 (2020). 10.1107/S2052252520010441
- 9. Chrappová J., Pateda Y. R., and Rakovský E., “Synthesis and crystal structure analysis of NH4[Zn(cma)(H2O)2]·H2O using IAM and HAR approaches,” J. Chem. Crystallogr. 53(2), 228–235 (2023). 10.1007/s10870-022-00961-1
- 10. Hoser A. A., Dominiak P. M., and Woźniak K., “Towards the best model for H atoms in experimental charge-density refinement,” Acta Crystallogr. A 65(4), 300–311 (2009). 10.1107/S0108767309019862
- 11. Hanson J. C., Sieker L. C., and Jensen L. H., “Sucrose: X-ray refinement and comparison with neutron refinement,” Acta Crystallogr. B 29(4), 797–808 (1973). 10.1107/S0567740873003365
- 12. Genoni A. and Macchi P., “Quantum crystallography in the last decade: Developments and outlooks,” Crystals 10(6), 473 (2020). 10.3390/cryst10060473
- 13. Stewart R. F., “Valence structure from x-ray diffraction data: Physical properties,” J. Chem. Phys. 57(4), 1664–1668 (1972). 10.1063/1.1678452
- 14. Dittrich B., Hübschle C. B., Messerschmidt M., Kalinowski R., Girnt D., and Luger P., “The invariom model and its application: Refinement of D,L-serine at different temperatures and resolution,” Acta Crystallogr. A 61(Pt. 3), 314–320 (2005). 10.1107/S0108767305005039
- 15. Hansen N. K. and Coppens P., “Testing aspherical atom refinements on small-molecule data sets,” Acta Crystallogr. A 34(6), 909–921 (1978). 10.1107/S0567739478001886
- 16. Pichon-Pesme V., Lecomte C., and Lachekar H., “On building a data bank of transferable experimental electron density parameters applicable to polypeptides,” J. Phys. Chem. 99(16), 6242–6250 (1995). 10.1021/j100016a071
- 17. Volkov A., Li X., Koritsanszky T., and Coppens P., “Ab initio quality electrostatic atomic and molecular properties including intermolecular energies from a transferable theoretical pseudoatom databank,” J. Phys. Chem. A 108(19), 4283–4300 (2004). 10.1021/jp0379796
- 18. Zarychta B., Pichon-Pesme V., Guillot B., Lecomte C., and Jelsch C., “On the application of an experimental multipolar pseudo-atom library for accurate refinement of small-molecule and protein crystal structures,” Acta Crystallogr. A 63(2), 108–125 (2007). 10.1107/S0108767306053748
- 19. Volkov A., Messerschmidt M., and Coppens P., “Improving the scattering-factor formalism in protein refinement: Application of the University at Buffalo Aspherical-Atom Databank to polypeptide structures,” Acta Crystallogr. D 63(2), 160–170 (2007). 10.1107/S0907444906044453
- 20. Dittrich B., Hübschle C. B., Pröpper K., Dietrich F., Stolper T., and Holstein J. J., “The generalized invariom database (GID),” Acta Crystallogr. B 69(2), 91–104 (2013). 10.1107/S2052519213002285
- 21. Jha K. K., Gruza B., Sypko A., Kumar P., Chodkiewicz M. L., and Dominiak P. M., “Multipolar atom types from theory and statistical clustering (MATTS) data bank: Restructurization and extension of UBDB,” J. Chem. Inf. Model. 62(16), 3752–3765 (2022). 10.1021/acs.jcim.2c00144
- 22. Grabowsky S., Genoni A., and Bürgi H.-B., “Quantum crystallography,” Chem. Sci. 8(6), 4159–4176 (2017). 10.1039/C6SC05504D
- 23. Dittrich B. and Jayatilaka D., “X-ray structure refinement using aspherical atomic density functions obtained from quantum-mechanical calculations,” Acta Crystallogr. A 64(3), 383–393 (2008). 10.1107/S0108767308005709
- 24. Capelli S. C., Bürgi H.-B., Dittrich B., Grabowsky S., and Jayatilaka D., “Hirshfeld atom refinement,” IUCrJ 1(5), 361–379 (2014). 10.1107/S2052252514014845
- 25. Malaspina L. A., Edwards A. J., Woińska M., Jayatilaka D., Turner M. J., Price J. R., Herbst-Irmer R., Sugimoto K., Nishibori E., and Grabowsky S., “Predicting the position of the hydrogen atom in the short intramolecular hydrogen bond of the hydrogen maleate anion from geometric correlations,” Cryst. Growth Des. 17(7), 3812–3825 (2017). 10.1021/acs.cgd.7b00390
- 26. Fugel M., Jayatilaka D., Hupf E., Overgaard J., Hathwar V. R., Macchi P., Turner M. J., Howard J. A. K., Dolomanov O. V., Puschmann H., Iversen B. B., Bürgi H.-B., and Grabowsky S., “Probing the accuracy and precision of Hirshfeld atom refinement with HARt interfaced with Olex2,” IUCrJ 5(1), 32–44 (2018). 10.1107/S2052252517015548
- 27. Woińska M., Grabowsky S., Dominiak P. M., Woźniak K., and Jayatilaka D., “Hydrogen atoms can be located accurately and precisely by x-ray crystallography,” Sci. Adv. 2(5), e1600192 (2016). 10.1126/sciadv.1600192
- 28. Brüx D., Ebel B., Pelzer N., Kalf I., and Kleemiss F., “Experimental spin state determination of iron(II) complexes by Hirshfeld atom refinement,” Chemistry 31(14), e202404017 (2025). 10.1002/chem.202404017
- 29. Jha K. K., Kleemiss F., Chodkiewicz M. L., and Dominiak P. M., “Aspherical atom refinements on X-ray data of diverse structures including disordered and covalent organic framework systems: A time–accuracy trade-off,” J. Appl. Crystallogr. 56(1), 116–127 (2023). 10.1107/S1600576722010883
- 30. Bučinský L., Jayatilaka D., and Grabowsky S., “Relativistic quantum crystallography of diphenyl- and dicyanomercury. Theoretical structure factors and Hirshfeld atom refinement,” Acta Crystallogr. A 75(5), 705–717 (2019). 10.1107/S2053273319008027
- 31. Pawlędzio S., Malinska M., Woińska M., Wojciechowski J., Andrade Malaspina L., Kleemiss F., Grabowsky S., and Woźniak K., “Relativistic Hirshfeld atom refinement of an organo-gold(I) compound,” IUCrJ 8(4), 608–620 (2021). 10.1107/S2052252521004541
- 32. Woińska M., Pawlędzio S., Chodkiewicz M. L., and Woźniak K., “Hirshfeld atom refinement of metal–organic complexes: Treatment of hydrogen atoms bonded to transition metals,” J. Phys. Chem. A 127(13), 3020–3035 (2023). 10.1021/acs.jpca.2c06998
- 33. Xu Y., Chodkiewicz M. L., Woińska M., Trzybiński D., Brekalo I., Topić F., Woźniak K., and Arhangelskis M., “Hirshfeld atom refinement of metal–organic frameworks for accurate positioning of hydrogen atoms and disorder analysis,” Chem. Commun. 59(57), 8799–8802 (2023). 10.1039/D3CC01369C
- 34. Ruth P. N., Herbst-Irmer R., and Stalke D., “Hirshfeld atom refinement based on projector augmented wave densities with periodic boundary conditions,” IUCrJ 9(2), 286–297 (2022). 10.1107/S2052252522001385
- 35. Jayatilaka D. and Grimwood D. J., “Tonto: A Fortran based object-oriented system for quantum chemistry and crystallography,” in Lecture Notes in Computer Science, edited by Sloot P. M. A., Abramson D., Bogdanov A. V., Gorbachev Y. E., Dongarra J. J., and Zomaya A. Y. (Springer, Berlin, Heidelberg, 2003), pp. 142–151.
- 36. Dolomanov O. V., Bourhis L. J., Gildea R. J., Howard J. A. K., and Puschmann H., “OLEX2: A complete structure solution, refinement and analysis program,” J. Appl. Crystallogr. 42(2), 339–341 (2009). 10.1107/S0021889808042726
- 37. Bourhis L. J., Dolomanov O. V., Gildea R. J., Howard J. A. K., and Puschmann H., “The anatomy of a comprehensive constrained, restrained refinement program for the modern computing environment – Olex2 dissected,” Acta Crystallogr. A 71(1), 59–75 (2015). 10.1107/S2053273314022207
- 38. Bergmann J., Davidson M., Oksanen E., Ryde U., and Jayatilaka D., “fragHAR: Towards ab initio quantum-crystallographic X-ray structure refinement for polypeptides and proteins,” IUCrJ 7(2), 158–165 (2020). 10.1107/S2052252519015975
- 39. Chodkiewicz M., Patrikeev L., Pawlędzio S., and Woźniak K., “Transferable Hirshfeld atom model for rapid evaluation of aspherical atomic form factors,” IUCrJ 11(2), 249–259 (2024). 10.1107/S2052252524001507
- 40. Wieduwilt E. K., Macetti G., and Genoni A., “Climbing Jacob's ladder of structural refinement: Introduction of a localized molecular orbital-based embedding for accurate x-ray determinations of hydrogen atom positions,” J. Phys. Chem. Lett. 12(1), 463–471 (2021). 10.1021/acs.jpclett.0c03421
- 41. Wall M. E., “Quantum crystallographic charge density of urea,” IUCrJ 3(4), 237–246 (2016). 10.1107/S2052252516006242
- 42. Bursch M., Mewes J., Hansen A., and Grimme S., “Best-practice DFT protocols for basic molecular computational chemistry,” Angew. Chem., Int. Ed. 61(42), e202205735 (2022). 10.1002/anie.202205735
- 43. Neese F., “The ORCA program system,” WIREs Comput. Mol. Sci. 2(1), 73–78 (2012). 10.1002/wcms.81
- 44. Frisch M. J., Trucks G. W., Schlegel H. B., Scuseria G. E., Robb M. A., Cheeseman J. R., Scalmani G., Barone V., Mennucci B., Petersson G. A., Nakatsuji H., Caricato M., Li X., Hratchian H. P., Izmaylov A. F., Bloino J., Zheng G., Sonnenberg J. L., Hada M., Ehara M., Toyota K., Fukuda R., Hasegawa J., Ishida M., Nakajima T., Honda Y., Kitao O., Nakai H., Vreven T., Montgomery J. A., Peralta J. E., Ogliaro F., Bearpark M., Heyd J. J., Brothers E., Kudin K. N., Staroverov V. N., Keith T., Kobayashi R., Normand J., Raghavachari K., Rendell A., Burant J. C., Iyengar S. S., Tomasi J., Cossi M., Rega N., Millam J. M., Klene M., Knox J. E., Cross J. B., Bakken V., Adamo C., Jaramillo J., Gomperts R., Stratmann R. E., Yazyev O., Austin A. J., Cammi R., Pomelli C., Ochterski J. W., Martin R. L., Morokuma K., Zakrzewski V. G., Voth G. A., Salvador P., Dannenberg J. J., Dapprich S., Daniels A. D., Farkas O., Foresman J. B., Ortiz J. V., Cioslowski J., and Fox D. J., Gaussian 09, Revision D.01 (Gaussian, Inc., Pittsburgh, PA, 2013).
- 45. Wieduwilt E. K., Macetti G., Malaspina L. A., Jayatilaka D., Grabowsky S., and Genoni A., “Post-Hartree-Fock methods for Hirshfeld atom refinement: Are they necessary? Investigation of a strongly hydrogen-bonded molecular crystal,” J. Mol. Struct. 1209, 127934 (2020). 10.1016/j.molstruc.2020.127934
- 46. Landeros-Rivera B., Ramírez-Palma D., Cortés-Guzmán F., Dominiak P. M., and Contreras-García J., “How do density functionals affect the Hirshfeld atom refinement?” Phys. Chem. Chem. Phys. 25(18), 12702–12711 (2023). 10.1039/D2CP04098K
- 47. Becke A. D., “Density-functional thermochemistry. I. The effect of the exchange-only gradient correction,” J. Chem. Phys. 96(3), 2155–2160 (1992). 10.1063/1.462066
- 48. Lee C., Yang W., and Parr R. G., “Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density,” Phys. Rev. B 37(2), 785–789 (1988). 10.1103/PhysRevB.37.785
- 49. Chéron N., Jacquemin D., and Fleurat-Lessard P., “A qualitative failure of B3LYP for textbook organic reactions,” Phys. Chem. Chem. Phys. 14(19), 7170 (2012). 10.1039/c2cp40438a
- 50. Check C. E. and Gilbert T. M., “Progressive systematic underestimation of reaction energies by the B3LYP model as the number of C−C bonds increases: Why organic chemists should use multiple DFT models for calculations involving polycarbon hydrocarbons,” J. Org. Chem. 70(24), 9828–9834 (2005). 10.1021/jo051545k
- 51. Schäfer A., Horn H., and Ahlrichs R., “Fully optimized contracted Gaussian basis sets for atoms Li to Kr,” J. Chem. Phys. 97(4), 2571–2577 (1992). 10.1063/1.463096
- 52. Chodkiewicz M. L., Woińska M., and Woźniak K., “Hirshfeld atom like refinement with alternative electron density partitions,” IUCrJ 7(6), 1199–1215 (2020). 10.1107/S2052252520013603
- 53. F. Meurer (2023). “plugin-SISYPHOS,” Github. https://github.com/FlorianMeurer/plugin-SISYPHOS
- 54. Bielka G. D. R., Sharon N., and Australia E. W., “Nomenclature and symbolism for amino acids and peptides (Recommendations 1983),” Pure Appl. Chem. 56(5), 595–624 (1984). 10.1351/pac198456050595
- 55. Neese F., “Software update: The ORCA program system—Version 5.0,” WIREs Comput. Mol. Sci. 12(5), e1606 (2022). 10.1002/wcms.1606
- 56. Barone V. and Cossi M., “Quantum calculation of molecular energies and energy gradients in solution by a conductor solvent model,” J. Phys. Chem. A 102(11), 1995–2001 (1998). 10.1021/jp9716997
- 57. Vydrov O. A. and Van Voorhis T., “Nonlocal van der Waals density functional: The simpler the better,” J. Chem. Phys. 133(24), 244103 (2010). 10.1063/1.3521275
- 58. Pérez-Jordá J. M., Becke A. D., and San-Fabián E., “Automatic numerical integration techniques for polyatomic molecules,” J. Chem. Phys. 100(9), 6520–6534 (1994). 10.1063/1.467061
- 59. Perdew J. P., “Jacob's ladder of density functional approximations for the exchange-correlation energy,” AIP Conf. Proc. 577, 1–20 (2001). 10.1063/1.1390175
- 60. Perdew J. P. and Wang Y., “Accurate and simple analytic representation of the electron-gas correlation energy,” Phys. Rev. B 45(23), 13244–13249 (1992). 10.1103/PhysRevB.45.13244
- 61. Becke A. D., “Density-functional exchange-energy approximation with correct asymptotic behavior,” Phys. Rev. A 38(6), 3098–3100 (1988). 10.1103/PhysRevA.38.3098
- 62. Perdew J. P., “Density-functional approximation for the correlation energy of the inhomogeneous electron gas,” Phys. Rev. B 33(12), 8822–8824 (1986). 10.1103/PhysRevB.33.8822
- 63. Ernzerhof M. and Scuseria G. E., “Assessment of the Perdew–Burke–Ernzerhof exchange-correlation functional,” J. Chem. Phys. 110(11), 5029–5036 (1999). 10.1063/1.478401
- 64. Tao J., Perdew J. P., Staroverov V. N., and Scuseria G. E., “Climbing the density functional ladder: Nonempirical meta–generalized gradient approximation designed for molecules and solids,” Phys. Rev. Lett. 91(14), 146401 (2003). 10.1103/PhysRevLett.91.146401
- 65. Furness J. W., Kaplan A. D., Ning J., Perdew J. P., and Sun J., “Accurate and numerically efficient r2SCAN meta-generalized gradient approximation,” J. Phys. Chem. Lett. 11(19), 8208–8215 (2020). 10.1021/acs.jpclett.0c02405
- 66. Adamo C. and Barone V., “Toward reliable density functional methods without adjustable parameters: The PBE0 model,” J. Chem. Phys. 110(13), 6158–6170 (1999). 10.1063/1.478522
- 67. Zhao Y. and Truhlar D. G., “The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: Two new functionals and systematic testing of four M06-class functionals and 12 other functionals,” Theor. Chem. Account. 120(1–3), 215–241 (2008). 10.1007/s00214-007-0310-x
- 68. Chai J.-D. and Head-Gordon M., “Systematic optimization of long-range corrected hybrid density functionals,” J. Chem. Phys. 128(8), 084106 (2008). 10.1063/1.2834918
- 69. Hehre W. J., Stewart R. F., and Pople J. A., “Self-consistent molecular-orbital methods. I. Use of Gaussian expansions of Slater-type atomic orbitals,” J. Chem. Phys. 51(6), 2657–2664 (1969). 10.1063/1.1672392
- 70. Binkley J. S., Pople J. A., and Hehre W. J., “Self-consistent molecular orbital methods. 21. Small split-valence basis sets for first-row elements,” J. Am. Chem. Soc. 102(3), 939–947 (1980). 10.1021/ja00523a008
- 71. Hehre W. J., Ditchfield R., and Pople J. A., “Self-consistent molecular orbital methods. XII. Further extensions of Gaussian-type basis sets for use in molecular orbital studies of organic molecules,” J. Chem. Phys. 56(5), 2257–2261 (1972). 10.1063/1.1677527
- 72. Hariharan P. C. and Pople J. A., “The influence of polarization functions on molecular orbital hydrogenation energies,” Theoret. Chim. Acta 28(3), 213–222 (1973). 10.1007/BF00533485
- 73. Dill J. D. and Pople J. A., “Self-consistent molecular orbital methods. XV. Extended Gaussian-type basis sets for lithium, beryllium, and boron,” J. Chem. Phys. 62(7), 2921–2923 (1975). 10.1063/1.430801
- 74. Krishnan R., Binkley J. S., Seeger R., and Pople J. A., “Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions,” J. Chem. Phys. 72(1), 650–654 (1980). 10.1063/1.438955
- 75. Dunning T. H., “Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen,” J. Chem. Phys. 90(2), 1007–1023 (1989). 10.1063/1.456153
- 76. Weigend F. and Ahlrichs R., “Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy,” Phys. Chem. Chem. Phys. 7(18), 3297 (2005). 10.1039/b508541a
- 77. Rappoport D. and Furche F., “Property-optimized Gaussian basis sets for molecular response calculations,” J. Chem. Phys. 133(13), 134105 (2010). 10.1063/1.3484283
- 78. Canal Neto A., Muniz E. P., Centoducatte R., and Jorge F. E., “Gaussian basis sets for correlated wave functions. Hydrogen, helium, first- and second-row atoms,” J. Mol. Struct. THEOCHEM 718(1–3), 219–224 (2005). 10.1016/j.theochem.2004.11.037
- 79. Barbieri P. L., Fantin P. A., and Jorge F. E., “Gaussian basis sets of triple and quadruple zeta valence quality for correlated wave functions,” Mol. Phys. 104(18), 2945–2954 (2006). 10.1080/00268970600899018
- 80. Malaspina L. A., Genoni A., Jayatilaka D., Turner M. J., Sugimoto K., Nishibori E., and Grabowsky S., “The advanced treatment of hydrogen bonding in quantum crystallography,” J. Appl. Crystallogr. 54(3), 718–729 (2021). 10.1107/S1600576721001126
- 81.Wilson C. C., Myles D., Ghosh M., Johnson L. N., and Wang W., “Neutron diffraction investigations of L- and D-alanine at different temperatures: The search for structural evidence for parity violation,” New J. Chem. 29(10), 1318 (2005). 10.1039/b419295h [DOI] [Google Scholar]
- 82.Lehmann M. S., Verbist J. J., Hamilton W. C., and Koetzle T. F., “Precision neutron diffraction structure determination of protein and nucleic acid components. Part V. Crystal and molecular structure of the amino-acid L-arginine dihydrate,” J. Chem. Soc., Perkin Trans. 2 2, 133 (1973). 10.1039/P29730000133 [DOI] [Google Scholar]
- 83.Ramanadham M., Sikka S. K., and Chidambaram R., “Structure of L-asparagine monohydrate by neutron diffraction,” Acta Crystallogr. B 28(10), 3000–3005 (1972). 10.1107/S0567740872007356 [DOI] [Google Scholar]
- 84.Kerr K. A., Ashmore J. P., and Koetzle T. F., “A neutron diffraction study of L -cysteine,” Acta Crystallogr. B 31(8), 2022–2026 (1975). 10.1107/S0567740875006772 [DOI] [Google Scholar]
- 85.Lehmann M. S., Koetzle T. F., and Hamilton W. C., “Precision neutron diffraction structure determination of protein and nucleic acid components. IV. The crystal and molecular structure of the amino acid L-histidine,” Int. J. Pept. Protein Res. 4(4), 229–239 (1972). 10.1111/j.1399-3011.1972.tb03424.x [DOI] [PubMed] [Google Scholar]
- 86.Lehmann M. S., Koetzle T. F., and Hamilton W. C., “Precision neutron diffraction structure determination of protein and nucleic acid components. VIII: The crystal and molecular structure of the β-form of the amino acidL-glutamic acid,” J. Cryst. Mol. Struct. 2(5–6), 225–233 (1972). 10.1007/BF01246639 [DOI] [Google Scholar]
- 87.Power L. F., Turner K. E., and Moore F. H., “The crystal and molecular structure of α-glycine by neutron diffraction – A comparison,” Acta Crystallogr. B 32(1), 11–16 (1976). 10.1107/S0567740876002227
- 88.Frey M. N., Lehmann M. S., Koetzle T. F., and Hamilton W. C., “Precision neutron diffraction structure determination of protein and nucleic acid components. XI. Molecular configuration and hydrogen bonding of serine in the crystalline amino acids L-serine monohydrate and DL-serine,” Acta Crystallogr. B 29(4), 876–884 (1973). 10.1107/S0567740873003481
- 89.Ramanadham M., Sikka S. K., and Chidambaram R., “Structure determination of Ls-threonine by neutron diffraction,” Pramana 1(6), 247–259 (1973). 10.1007/BF02848502
- 90.Frey M. N., Koetzle T. F., Lehmann M. S., and Hamilton W. C., “Precision neutron diffraction structure determination of protein and nucleic acid components. X. A comparison between the crystal and molecular structures of L-tyrosine and L-tyrosine hydrochloride,” J. Chem. Phys. 58(6), 2547–2556 (1973). 10.1063/1.1679537
- 91.Allen F. H. and Bruno I. J., “Bond lengths in organic and metal-organic compounds revisited: X—H bond lengths from neutron diffraction data,” Acta Crystallogr. B 66(3), 380–386 (2010). 10.1107/S0108768110012048
- 92.Hill N. D. D. and Boeré R. T., “Small molecule X-ray crystal structures at a crossroads,” Chem. Methods 5(5), e202400052 (2025). 10.1002/cmtd.202400052
- 93.Balmohammadi Y., Malaspina L. A., Nakamura Y., Cametti G., Siczek M., and Grabowsky S., “A quantum crystallographic protocol for general use,” Sci. Rep. 15(1), 13584 (2025). 10.1038/s41598-025-96400-0
- 94.Dale S. G. and Johnson E. R., “Counterintuitive electron localisation from density-functional theory with polarisable solvent models,” J. Chem. Phys. 143(18), 184112 (2015). 10.1063/1.4935177
- 95.Ren F. and Liu F., “Impacts of polarizable continuum models on the SCF convergence and DFT delocalization error of large molecules,” J. Chem. Phys. 157(18), 184106 (2022). 10.1063/5.0121991
- 96.Stratmann R. E., Scuseria G. E., and Frisch M. J., “Achieving linear scaling in exchange-correlation density functional quadratures,” Chem. Phys. Lett. 257(3–4), 213–223 (1996). 10.1016/0009-2614(96)00600-8
- 97.Carroll J. D. and Arabie P., “Multidimensional scaling,” in Measurement, Judgment and Decision Making (Elsevier, 1998), pp. 179–250.
- 98.Verstraelen T., Ayers P. W., Van Speybroeck V., and Waroquier M., “Hirshfeld-E partitioning: AIM charges with an improved trade-off between robustness and accurate electrostatics,” J. Chem. Theory Comput. 9(5), 2221–2225 (2013). 10.1021/ct4000923
- 99.See the supplementary material (10.60893/figshare.sdy.c.7989461) for further statistical and crystallographic information. Associated structures are obtainable under the accession codes 2422229, 2422230, 2422231, 2422150, 2422151, 2422152, 2422154, 2422155, 2422156, 2422157, 2422158, 2422159, 2422160, 2422161.
Associated Data
Data Availability Statement
The data that support the findings of this study are available within the article and its supplementary material.








