Skip to main content
ACS AuthorChoice logoLink to ACS AuthorChoice
. 2020 May 26;16(7):4238–4255. doi: 10.1021/acs.jctc.0c00189

Canonical and DLPNO-Based G4(MP2)XK-Inspired Composite Wave Function Methods Parametrized against Large and Chemically Diverse Training Sets: Are They More Accurate and/or Robust than Double-Hybrid DFT?

Emmanouil Semidalas 1, Jan M L Martin 1,*
PMCID: PMC7366511  PMID: 32456427

Abstract

graphic file with name ct0c00189_0004.jpg

The large and chemically diverse GMTKN55 benchmark was used as a training set for parametrizing composite wave function thermochemistry protocols akin to G4(MP2)XK theory (Chan, B.; Karton, A.; Raghavachari, K. J. Chem. Theory Comput. 2019, 15, 4478–4484). On account of their availability for elements H through Rn, Karlsruhe def2 basis sets were employed. Even after reparametrization, the GMTKN55 WTMAD2 (weighted mean absolute deviation, type 2) for G4(MP2)-XK is actually inferior to that of the best rung-4 DFT functional, ωB97M-V. By increasing the basis set for the MP2 part to def2-QZVPPD, we were able to substantially improve performance at modest cost (if an RI-MP2 approximation is made), with WTMAD2 for this G4(MP2)-XK-D method now comparable to the better rung-5 functionals (albeit at greater cost). A three-tier approach with a scaled MP3/def2-TZVPP intermediate step, however, leads to a G4(MP3)-D method that is markedly superior to even the best double hybrids ωB97M(2) and revDSD-PBEP86-D4. Evaluating the CCSD(T) component with a triple-ζ, rather than split-valence, basis set yields only a modest further improvement that is incommensurate with the drastic increase in computational cost. G4(MP3)-D and G4(MP2)-XK-D have about 40% better WTMAD2, at similar or lower computational cost, than their counterparts G4 and G4(MP2), respectively: detailed comparison reveals that the difference lies in larger molecules due to basis set incompleteness error. An E2/{T,Q} extrapolation and a CCSD(T)/def2-TZVP step provided the G4-T method of high accuracy and with just three fitted parameters. Using KS orbitals in MP2 leads to the G4(MP3|KS)-D method, which entirely eliminates the CCSD(T) step and has no steps costlier than scaled MP3; this shows a path forward to further improvements in double-hybrid density functional methods. None of our final selections require an empirical HLC correction; this cuts the number of empirical parameters in half and avoids discontinuities on potential energy surfaces. G4-T-DLPNO, a variant in which post-MP2 corrections are evaluated at the DLPNO-CCSD(T) level, achieves nearly the accuracy of G4-T but is applicable to much larger systems.

I. Introduction

Among applied computational chemists, density functional theory (DFT) is presently the most widely used electronic structure approach, followed by wave function ab initio theory (WFT). WFT has a precisely defined Hamiltonian and a clear “road map” for systematic refinement, but in unmodified form hits an “exponential scaling wall” that limits application to small molecules. DFT “tunnels” through the scaling wall by reducing the many-electron Schrödinger equation to a set of coupled one-particle equations for an approximate exchange-correlation functional.1 The accuracy of DFT stands or falls with the functional.

One well-established approach for reducing the computational cost of WFT methods has been the introduction of composite WFT (cWFT) protocols such as the following:

  • Gaussian-n theory (Gn)28 by the Pople group (see ref (9) for a review);

  • The CBS-QB310,11 and related methods12 by Petersson and co-workers;

  • Multicoefficient correlation methods of Zhao and co-workers;1315

  • In a higher accuracy regime, the ccCA approach1618 of Wilson and co-workers;

  • In still higher accuracy regimes, the Weizmann-n approaches1924 of the present group and lower-cost modifications thereof (e.g., refs (2527)); the HEAT-n approaches of an international team around Stanton;2831 and the more general Feller–Peterson–Dixon (FPD) approach (refs (3234) and references therein).

Especially the first two types of methods, being built into the popular Gaussian program system,35 are applicable in a fairly black-box fashion, and hence are in fairly widespread use by nonspecialists. For recent reviews of cWFT methods and their performance for the thermochemistry and thermochemical kinetics of organic molecules, see Karton36 and Chan;37 for a recent general review of both experimental and computational thermochemistry, see Ruscic and Bross.38

One feature that cWFTs all share is the additivity approximation of the following form:

I.

in which LARGE and SMALL represent two different basis set sizes, HIGH and LOW represent a more rigorous and a more economical electron correlation level, respectively, and the cost savings derives from assuming the coupling term is negligible. (For a discussion of one-particle basis set–n-particle space coupling, see ref (39).)

Such approximations can be nested, e.g., in G4 theory,4 which features three levels of electron correlation (MP2, MP4, and CCSD(T)).

While the more accurate of these methods, such as HEAT-n, Weizmann-n, and ccCA, achieve their performance without fitted empirical parameters, computational demands of even the most economical among them preclude routine use for larger molecules: for both ccCA and W1, CCSD(T) calculations in the spdf basis sets are required, and W1 requires CCSD/spdfg calculations, while ccCA “makes do” with MP2 calculations for the basis set extension steps. Our present focus will be on more economical methods similar in cost to G4(MP2) or G4 theory, i.e. (with one exception), requiring no basis sets larger than the polarized split-valence for the CCSD(T) step.

An approach that, in practical operation, combines elements of WFT and DFT methods is the double-hybrid density functional method (DHDFT,4045 see ref (46) for a recent review), which occupies the fifth rung on Perdew’s “Jacob’s Ladder”.47 In DHDFT, both the exchange and the correlation terms are mixtures of DFT and WFT approaches (evaluated in a basis of Kohn–Sham orbitals). Using the very large GMTKN55 (general main-group thermochemistry, kinetics, and noncovalent interactions, 55 subsets) benchmark suite,48 with nearly 2500 main-group molecules, we found that the best DHDFT functionals, ωB97M(2) by Mardirossian and Head-Gordon49 and revDSD-PBEP86-D4 by our group,45 have WTMAD2 (weighted mean absolute deviation) statistics around 2.2 kcal/mol, competitive with or superior to the cWFT methods we tested. Needless to say, double hybrids are computationally much more economical, especially if RI (resolution of the identity) approximations5052 are applied.

An additional impetus for the present work was the recent publication of the G4(MP2)-XK method,53 which employs Weigend–Ahlrichs54 def2 basis sets rather than Pople basis sets, and is thus applicable to the first five rows of the Periodic Table (H–Rn). Somewhat to our surprise, its WTMAD2 proved inferior to the best DHDFT functionals.

Deeper inspection of performance for the subsets revealed that, while the cWFT methods yielded better performance for heats of formation of small molecules (i.e., the W4-11 benchmark55), this was outweighed by degraded performance for larger-molecule subsets (see below and Table 1). As essentially all cDFT methods are parametrized against small-molecule training sets, this prompts the question whether this result was not an artifact of parametrization bias—if overall performance hadn’t been “sacrificed on the altar of small-molecule thermochemistry”, so to speak.

Table 1. Error Statistics (kcal/mol) for the GMTKN55 Database and Its Five Top-Level Subsets of Selected WFT, cWFT, and DFT Methodsa.

methods WTMAD2 thermo barrier large confor. intermol.
cWFT
G4(MP2)-XK-T 1.43 0.40 0.18 0.18 0.11 0.56
G4-Q-DLPNO 1.52 0.25 0.12 0.20 0.46 0.49
G4-T 1.51 0.35 0.16 0.23 0.14 0.63
G4-T-DLPNO 1.66 0.26 0.12 0.24 0.52 0.52
G4(MP3)-D 1.65 0.37 0.17 0.28 0.30 0.55
G4(MP3|KS)-D 1.96 0.41 0.28 0.26 0.45 0.56
G4(MP2)-XK-D 2.56 0.46 0.29 0.34 0.68 0.79
G44 2.52 0.38 0.23 0.75 0.38 0.78
G4(MP2)5 2.96 0.53 0.34 0.91 0.33 0.85
CBS-QB310,11 3.10 0.40 0.35 0.60 0.20 1.55
rev-G4MP2XK 3.53 0.50 0.29 0.61 1.16 0.96
MP2.X-Q 3.29 0.71 0.78 0.88 0.42 0.50
G4(MP2)-XK53 3.71 0.45 0.31 0.67 1.25 1.02
WFT
SCS-MP2-D389,b 5.22 1.23 0.95 1.39 0.91 0.75
SCS-MP289 5.35 0.94 1.01 1.15 1.02 1.23
MP2-D3b 5.83 1.21 1.21 1.66 0.87 0.87
MP2-D3c 5.54 1.20 1.18 1.52 0.80 0.84
MP2 6.91 1.21 1.23 1.78 1.47 1.21
HF-D3d 13.08 5.05 2.65 2.06 1.85 1.48
HF 29.46 5.87 3.74 3.66 7.27 8.92
DFT
ωB97M(2)49 2.19 0.44 0.26 0.42 0.58 0.49
xrevDSD-PBEP86-D445 2.26 0.56 0.27 0.52 0.43 0.47
revDSD-PBEP86-D445 2.33 0.56 0.31 0.58 0.41 0.48
revDOD-PBEP86-D445 2.36 0.59 0.30 0.59 0.41 0.47
revDSD-PBEP86-NL 2.44 0.55 0.30 0.55 0.47 0.57
revDSD-PBE-D445 2.46 0.65 0.35 0.53 0.43 0.50
revDSD-PBEP86-D345 2.42 0.54 0.31 0.55 0.46 0.57
revDSD-BLYP-D445 2.59 0.57 0.34 0.58 0.48 0.62
DSD-SCAN-D445 2.64 0.60 0.40 0.62 0.45 0.56
DSD-PBE-D442 2.64 0.61 0.39 0.56 0.53 0.54
DSD-PBEP86-D442 2.65 0.54 0.37 0.63 0.55 0.56
revDSD-PBEB95-D445 2.70 0.64 0.31 0.45 0.78 0.52
DSD-BLYP-D445 2.83 0.58 0.38 0.59 0.68 0.60
DSD-PBEP86-D342 3.10 0.55 0.45 0.49 0.65 0.97
DSD-PBE-D342 3.17 0.66 0.41 0.54 0.73 0.83
B2GP-PLYP-D390 3.19 0.63 0.42 0.66 0.64 0.85
ωB97M-V91 3.29 0.73 0.45 0.64 0.90 0.57
ωB97X-V92 3.96 1.02 0.56 1.07 0.73 0.58
M06-2X-D3(0)93 4.79 0.86 0.48 1.08 1.22 1.14
B3LYP-D3 6.50 1.31 1.14 1.66 1.15 1.24
a

D3 stands for D3(BJ) throughout this table; M06-2X was evaluated with a D3(0) correction, for want of D3BJ parameters. Statistics for M06-2X without D3(0) are essentially identical. Tabulated data for the DFT methods employing the def2-QZVPP basis set (def2-QZVPPD for subsets AHB21, G21EA, IL16, RG18, and WATER27) were obtained from refs (45 and 46) while the WFT (MP2, SCS-MP2, and HF) data in the same basis sets were obtained in this work, as were all cWFT results. The abbreviations “thermo”, “barrier”, “large”, “confor.”, and ‘intermol.’ stand for basic thermochemistry, barrier heights, reactions involving large molecules, conformer energies, and intermolecular interactions, respectively.

b

D3(BJ) parameters taken from Table S1 of ref (94).

c

a1 = 0, a2 = 5.5, s6 = −0.345, s8 = 0 (this work).

d

Present work; D3(BJ) parameters from Table 2 of the original D3(BJ) paper.70

The purpose of the present paper is twofold. The first aim is to investigate whether superior, or simply more transferable, cWFT methods can be obtained by employing a large and diverse training set like GMTKN55. We shall show below that this is the case and shall present three options of ascending accuracy and computational cost.

The second objective is to see if such methods still have an edge in accuracy over the best DHDFT methods. We shall show that the answer is “yes, but not as big as you might expect”.

II. Computational Details

All calculations were carried out on the ChemFarm HPC cluster of the Faculty of Chemistry at the Weizmann Institute. CCSD(T) calculations,56,57 as well as standard G4,4 G4(MP2),5 and CBS-QB310,11 composite methods calculations, were performed using Gaussian 16, revision C.01;35 medium-basis set MP3 and large-basis set MP2 (the latter using the RI-MP2 approximation50,52) calculations were performed using Q-CHEM 5.2;58 the DLPNO-MP259 and DLPNO-CCSD(T)60 calculations were carried out using ORCA 4.2.1.61 The conventional calculations benefited from 4TB SSD (solid state disk) arrays on our dedicated nodes, although some of the largest MP3 calculations exceeded that limit; for them we resorted to a 40TB shared-over-InfiniBand storage server custom-developed for us by Access Technologies of Ness Ziona, Israel.

As regards basis sets, for the standard G4, G4(MP2), and CBS-QB3 methods, the Pople basis sets specified in the original protocols were used without modification. Otherwise, we relied on the Weigend–Ahlrichs/Karlsruhe def2 family,54 namely, the original def2-SVP, def2-TZVP, def2-TZVPP, and def2-QZVPP basis sets, as well as their diffuse function-augmented variants def2-SVPD, def2-TZVPPD, and def2-QZVPPD.62 The unaugmented variants are available for all elements H–Rn, and the augmented variants for H–La and Hf–Rn.

For the original G4(MP2)-XK approach,53 we employed the “minimally augmented” Karlsruhe basis sets ma-TZVP, ma-QZVP, ma-TZVXP (the prefix indicates addition of a single shell of diffuse valence exponents obtained by dividing the smallest s and p exponents by a factor of 3). In addition, the CCSD(T) step was carried out in a def2-SVSP basis set, which corresponds to def2-SVP with the polarization functions on hydrogen removed. Calculations were performed and results processed with the script provided in the Supporting Information to ref (53).

Where not already available internally in the respective codes, basis sets were retrieved from the Basis Set Exchange.63 Auxiliary basis sets for RI-MP2 and DF-HF were used as stored in the Q-CHEM internal library; see refs (52, 64, 65), and (66) for the original references.

As in the original G4(MP2)-XK approach and the standard G4 and CBS-QB3 methods, all open-shell species were treated using UHF (unrestricted HF) orbitals. We mention in passing the existence of ROHF-based variants of G4(MP2) that are more suitable to radical research.67

For the alkaline and alkaline earth metals heavier than Ne, we correlated the (n −1)sp subvalence electrons, as it is well-known that these orbitals intrude into the valence band, especially of {O, F, Cl} compounds.45 For the heavy p-block elements, we additionally correlated the (n – 1)d subvalence electrons for similar reasons. As standard in def2 basis sets, small-core energy-consistent relativistic pseudopotentials68 were used for elements heavier than Kr.

Dispersion corrections were evaluated using the Grimme et al. D3 model69 with the Becke–Johnson damping function.70 This combination is conventionally denoted D3(BJ). Based on our experience with double-hybrid functional parametrization,45 the damping function’s shape parameters were held fixed at a1 = s8 = 0, a2 = 5.5, leaving the R–6 overall scaling parameter s6 as the only adjustable parameter.

For the DLPNO-CCSD(T), the “TightPNO” option71 and the aforementioned settings of frozen inner orbitals were applied along with the original (T0) triples approximation;60 we used the “VeryTightSCF” convergence criteria and the RIJCOX approximation for constructing the Fock matrices.72 Similarly, we used the def2-SVPD, def2-TZVPP, and def2-QZVPP basis sets along with the auxiliary versions of def2/J (see ref (73)) and def2-SVP/C, def2-TZVPP/C, and def2-QZVPP/C (see ref (74)) as implemented in ORCA. The core electrons were described by the def2 effective core potential (def2-ECP).75 For the subsets AHB21, G21EA, IL16, RG18, and WATER27, we employed the diffuse-function augmented def2-TZVPPD and def2-QZVPPD,62 inspired by ref (45). The DLPNO-CCSD(T)- based cWFT methods presented here have their energy difference CCSD-MP2 obtained from separate DLPNO-MP2 calculations rather than from the “semi-local (SL) MP2” energy obtained as a byproduct of the DLPNO-CCSD(T) step.

Our training set was the GMTKN55 benchmark;48 as in our previous study on the revDSD functionals,45 reference energetics, geometries, and charge/multiplicity information were obtained via the ACCDB database of Morgante and Peverati76 and used as-is. (For the G4, G4(MP2), and CBS-QB3 implementations in Gaussian, “NoOpt” was specified to suppress the geometry optimization at these composite methods’ standard levels of theory.) The calculations for the subsets C60ISO (isomerization energies of C60 molecules)77 and UPU23 (relative energies of uracil dinucleotides)78 were currently not within reach for MP3 and canonical CCSD(T) methods. While the reported WTMAD2 values of DH-DFT and MP2-based WFT methods, i.e., SCS-MP2, are based on analysis of all subsets, excluding C60ISO and UPU23 does not appreciably affect WTMAD2 owing to the small (UPU23) and very small (C60ISO) weights these two subsets have in eq 1. For instance, C60ISO and UPU23 contribute only 0.3% and 2.5%, respectively, to the WTMAD2 for the ωB97M(2) double hybrid. Hence, despite excluding these two subsets, our WTMAD2 results are de facto equivalent to all-GMTKN55 WTMAD2 values, and we shall hence refer to them as GMTKN55 throughout. Statistical analysis was automated using a Fortran code developed in-house and available upon request. (We note that, as the reference data are complete basis set limit CCSD(T) calculations in the hypothetical motionless state, zero-point and thermal corrections need not be evaluated.)

The main performance metric for the GMTKN55 database is the WTMAD2,48 which takes into account the different sizes and energy ranges of each subset; it is obtained according to the following equation:

II. 1

where Δ corresponds to the mean signed deviation from the reference reaction energies for the ith subset, Ni is the number of systems in the subset, and the MADi is the mean absolute deviation between calculated and reference energies for the ith subset.

At one extreme, interaction energies |ΔE| for the RG18 subset48 of rare-gas complexes range from 0.1 to 1.5 kcal/mol. At the other extreme, decomposition reaction energies |ΔE| for the MB16-43 subset48 of artificial molecules run the gamut from −363 to 1290 kcal/mol. Thus, while a deviation of a few kcal/mol for an MB16-43 reaction would not materially affect the overall statistical error, a deviation of 1 kcal/mol at the RG18 subset would make it “shoot” up. Through trial and error, the application of numerous statistical schemes across the GMTKN24,79 GMTKN30,80 and GMTKN5548 databases demonstrated in these papers that the balanced metric should encompass the number of reactions per subset Νi, the mean absolute deviation MADi between the calculated and reference relative energies, and the total mean absolute deviation per subset |Δ|i. That metric was dubbed the weighted mean absolute deviation, type 2, conventionally abbreviated WTMAD2.48 In prior papers45,46,48 (as in the present work), MAD was favored over the RMSD (root-mean-square deviation) as MAD is a more “robust” measure81 in the statistical sense of being less sensitive to one or a few outlier data points. For a pure normal distribution without systematic error, it can easily be shown82 that Inline graphic = Inline graphic = 1.25331 ... ≈ Inline graphic, which may be helpful to readers more attuned to RMSD error measures. The MAD, RMSD, and the MAD/RMSD ratio values for each subset are included in the Supporting Information for the final selected methods.

The optimization of the parameters for each method was accomplished by the late Michael Powell’s BOBYQA83 (Bound Optimization BY Quadratic Approximation) derivative-free constrained optimizer. Different initial guesses were provided to ensure a global minimum was obtained for each optimization.

The reference energies of the GMTKN55 database are obtained from high-level ab initio methods including CCSD(T)/CBS, CCSD(T)-F12/CBS, or Weizmann-n theories such as W1-F12 or W2-F12.84 These energies are nonrelativistic and ZPVE exclusive where all electrons were correlated for most calculations. For example, the references energies for the subset of rare-gas complexes RG18 were calculated at CCSD(T)/CBS(aug-cc-pVTZ/aug-cc-pVQZ), while for the subset of protonated isomers “PAREL”, they were calculated at CCSD(T)/CBS(def2-TZVPP/def2-QZVPP).48 The subsets of GMTKN55 are described in ref (48) as well as in the Supporting Information of the present paper.

III. Description and Nomenclature of the Theoretical Methods

III.A. Description of the Theoretical Methods

The original G4(MP2)-XK method has the following energy expression:

III.A. 2

where each energy term is given by the following equations. The Hartree–Fock energy extrapolated to the complete basis set limit may be written as

III.A. 3

The correction terms for the spin-component-scaled MP2 correlation energy, ΔESCS-MP2, the scaled coupled cluster singles–double correlation energy, and the coupled cluster triples excitations, are

III.A. 4
III.A. 5
III.A. 6

The “high-level correction” HLC, like in G4 and G4(MP2) theory, is defined as

III.A. 7

The variables nα and nβ (by convention nαnβ) correspond to the number of α and β valence electrons, respectively, according to the conventional largest noble-gas-core definition. The parameters A, A′, B, C, D, and E were established through parameter optimization based on the chosen training set of reference energies. The HLC was originally incorporated in the Gaussian-n theories and the recent G4(MP2)-XK method to account for residual basis set incompleteness error; its oldest incarnation was based on correcting the total energies for the hydrogen atom and the hydrogen molecule.7

The inexpensive two-tier methods considered here share the G4(MP2)-XK energy expression, albeit with different parameters.

The three-tier methods considered have an energy expression of the following energy form, e.g., when Schwenke-type85,86 MP2 extrapolation is used:

III.A. 8

The first term is the extrapolated Hartree–Fock energy of the form

III.A. 9

The remaining terms are the post-HF correlation corrections. The energy difference, ΔEMP2+MP3, combining the third-order MP correction and the second-order correlation term at the extrapolation limit is given by

III.A. 10

The post-MP3 improvements to the total energy from the coupled cluster wave function are

III.A. 11
III.A. 12

The last two terms involve the empirical dispersion correction and the HLC.

III.A. 13

Finally, in the most expensive two-tier method we considered, CCSD(T)/def2-TZVP is the smallest basis set step, which makes the separate MP3/def2-TZVPP step redundant. Hence, we have

III.A. 14

The Hartree–Fock extrapolated energy, EHF/CBS, is likewise given by eq 9. The post-HF energy corrections arise from MP2, CCSD(T), and the dispersion correction. For the avoidance of doubt, the coefficients in the SCS-MP2 term are adjustable parameters and not standard87,88 SCS-MP2 coefficients.

III.A. 15

The remaining corrections are the energy difference between CCSD and MP2 in the same basis set, ΔEc,CCSD-MP2, the triples excitations correlation energy, and the dispersion energy:

III.A. 16
III.A. 17
III.A. 18

In all of the above, varying the HF extrapolation parameter (originally from ref (4)) had only insignificant effect, and we have hence held it fixed. Detailed equations and parameters for all individual composite methods considered in this work are given in the Supporting Information.

A minor but significant practical detail about the HLC needs to be clarified here. When introducing separate coefficients for molecules and separate atoms in the HLC, the authors of the original G4 papers probably never envisaged an application to noble gas complexes: hence, the unmodified HLC yields an abnormally large relative error in the RG18 subset, which through RG18’s large weight in WTMAD2 has an effect of several kcal/mol there. The problem can be solved by the simple expedient of treating the closed-shell rare gas atoms as if they were molecules: we have done so throughout the present paper. The original G4(MP2)-XK yields a discouraging WTMAD2 of 6.31 kcal/mol; a closer inspection revealed a contribution of 1.58 kcal/mol to the WTMAD2 from the RG18 subset. After treating the “rare gas atoms” as molecules, the RG18 contribution to WTMAD2 plummets to 0.17 kcal/mol, bringing the total down to 3.71 kcal/mol.

III.B. Nomenclature of the Theoretical Methods

In this section, we focus on the nomenclature of the theoretical methods, providing concise and unambiguous names. The naming scheme is illustrated in Figure 1. There are two different ways of naming the methods based on the E2 extrapolation: either G4-scs-n and G4(scsMP2.X)-n, if the same-spin and opposite-spin E2 components are separately extrapolated, or G4-n and G4(MP2.X)-n, if the total E2 energy is extrapolated. The presence of the MP3/def2-TZVPP step is denoted as “MP2.X”. We will use the suffix KS if the MP2(-like) term is evaluated in a basis of Kohn–Sham orbitals with the largest basis set; additionally including an MP3/def2-TZVPP step determines the “MP3” label. The DLPNO suffix indicates the use of DLPNO-CCSD(T) instead of canonical CCSD(T). The basis set for the canonical CCSD(T) step is noted in abbreviated form as “D” or ‘T” for def2-SVSP or def2-TZVP, respectively. Similarly, for the DLPNO-CCSD(T), the basis set is denoted as “D”, “T”, or “Q” for def2-SVPD, def2-TZVPP, or def2-QZVPP, respectively.

Figure 1.

Figure 1

Nomenclature scheme of the theoretical methods.

Regarding the HLC, we add the suffix H6 to indicate the original HLC of G4(MP2)-XK. If the constraint A = A′ is imposed on eq 7, this is indicated by the H5 label, while the absence of either label indicates that the HLC was eliminated altogether. Overall, we tried numerous variants of the composite methods described herein, namely, 291 variants for the canonical CCSD(T)-based cWFTs, 18 variants for DLNPO-CCSD(T)-, and 18 variants for DLPNO-CCSD(T1)-based methods. Said variants differ from each other in the number of optimized parameters or HLC treatment; e.g., some parameters might be set equal to zero, such as the triples correction term, or some pairs of parameters might be set to equal during the parametrization. Therefore, we have adopted the suffix “vx”, where x is the variant number. The equations of all composite methods are explicitly listed in the Supporting Information.

A few examples will further clarify the nomenclature. When we use the formalism of the original G4(MP2)-XK, the methods’ names retain the “XK” label, and suffixes contain information about the HLC and the basis set used for the CCSD(T) step. G4(MP2)-XK-T-H6-v1 has an energy expression similar to the original G4(MP2)-XK; the letter “T” stands for the CCSD(T)/def2-TZVP step and “H6” for the six-parameter HLC, while “v1” denotes the variant with dispersion correction included. One primary goal was to limit the empirical parameters of the composite methods, and the “XK” label is omitted for the minimally empirical methods where we did not scale the E2 correlation energy of the smaller basis set compared to the original G4(MP2)-XK. G4-T-v2 is based on a CCSD(T)/def2-TZVP step, without HLC, and the suffix “v2” stands for the second variant of the G4-T. As we proceed toward the final choice of the minimally empirical methods, we will drop the variant label “vn” for the best overall methods.

IV. Results and Discussion

Full statistics and parameters for all the different empirical methods considered are given in the Supporting Information. The component breakdown of WTMAD2 for selected methods is presented in Table 1. A selection of the most relevant data is summarized by the rows in Table 2, and the final selected composite methods are briefly presented in Table 3. Figure 2 depicts WTMAD2 values for several DFT functionals and cWFT methods taken from refs (45 and 46) as well as for the cWFT methods developed in the present work.

Table 2. WTMAD2 Values (kcal/mol) for Selected Standard cWFT and Presently Developed cWFT Methods, as Well as Optimized Parameters for the Latter.

            coefficients of the energy components
HLC coefficients
MP2 MP3 CCSD(T) Nparam method WTMAD2 c(E2,os)′ c(E2,ss)′ cE2,os cE2,ss cE(C,CCSD) cT cDisp A A B C D E
T D 12 G4(MP2)-XK 3.71 1.131 0.512 1.041 0.704 1.048 0.526 [0] 9.369 9.449 3.832 9.594 1.874 2.491
T D 12 revG4(MP2)-XK-H6-v1 3.53 1.307 0.385 1.170 0.614 0.984 0.736 [0] 8.540 8.603 2.199 7.771 1.052 3.108
Q D 12 G4(MP2)-XK-D-H5-v1 2.53 1.216 0.660 1.008 0.925 1.062 0.778 –0.276 7.082 7.082 2.263 7.198 1.379 3.292
Q D 11 G4(MP2)-XK-D-H5-v2 2.64 1.270 0.417 1.069 0.669 1.041 0.701 [0] 7.262 7.262 2.035 6.999 1.420 3.467
Q D 13 G4(MP2)-XK-D-H6-v1 2.52 1.210 0.616 1.026 0.865 1.042 0.825 –0.227 7.161 7.197 2.081 7.131 1.584 3.546
Q D 7 G4(MP2)-XK-D-v1 2.56 1.309 0.674 1.124 0.890 1.113 0.699 –0.383 [0] [0] [0] [0] [0] [0]
Q D 6 G4(MP2)-XK-D-v2 2.73 1.482 0.271 1.320 0.457 1.052 0.778 [0] [0] [0] [0] [0] [0] [0]
Q T 13 G4(MP2)-XK-T-H6-v1 1.36 1.227 1.030 1.128 1.128 1.029 1.000 –0.095 7.139 7.172 1.886 7.262 1.591 3.513
Q T 6 G4(MP2)-XK-T-v2 1.43 1.609 0.720 1.516 0.722 1.068 1.051 [0] [0] [0] [0] [0] [0] [0]
            cE2,ss cE2,os   cCCSD-MP2   cT cDisp A A B C D E
Q D 10 G4(MP2)-D-H5-v1 2.59 1.224 0.865   1.038   0.881 –0.294 7.456 7.456 2.268 7.318 1.043 3.653
Q D 9 G4(MP2)-D-H5-v2 2.71 1.106 0.852   0.956   0.837 [0] 7.547 7.547 2.206 7.385 0.278 4.005
Q D 5 G4(MP2)-D-v1 2.68 1.226 0.977   1.111   0.776 –0.456 [0] [0] [0] [0] [0] [0]
Q D 4 G4(MP2)-D-v2 3.01 1.045 0.968   1.033   0.796 [0] [0] [0] [0] [0] [0] [0]
            c(E2/CBS)     cCCSD-MP2   cTcT cDisp A A B C D E
{T, Q} T 10 G4-T-H6-v1 1.42 0.312     1.024   1.053 –0.025 7.046 7.107 2.783 6.880 2.598 4.298
{T, Q} T 4 G4-T-v1 1.48 0.550     1.063   1.187 –0.070 [0] [0] [0] [0] [0] [0]
{T, Q} T 3 G4-T-v2 1.51 0.542     1.060   1.114 [0] [0] [0] [0] [0] [0] [0]
{T, Q} T(DLPNO) 3 G4-T-DLPNO-v2 1.66 0.601     1.019   1.204 [0] [0] [0] [0] [0] [0] [0]
{T, Q} Q(DLPNO) 3 G4-Q-DLPNO-v2 1.52 0.513     1.003   1.185 [0] [0] [0] [0] [0] [0] [0]
            cE2,ss cE2,os cE3 cCCSD-MP3   cT cDisp A A B C D E
Q T D 12 G4(MP3)-D-H6-v1 1.63 1.250 0.951 1.033 1.039   0.566 –0.087 7.138 7.173 2.228 7.162 1.467 3.581
Q T D 6 G4(MP3)-D-v1 1.65 1.284 1.018 1.089 1.119   0.411 –0.185 [0] [0] [0] [0] [0] [0]
KSQ T D 11 G4(MP3|KS)-D-H5-v1 1.89 0.831 0.944 0.601 –0.089   0.057 –0.056 7.614 7.614 2.293 7.828 1.525 3.600
KSQ T 9 G4(MP3|KS)-D-H5-v5 1.89 0.845 0.938 0.596 [0]   [0] –0.053 7.505 7.505 2.057 7.639 1.544 3.861
KSQ T 8 G4(MP3|KS)-D-H5-v6 1.89 0.822 0.938 0.566 [0]   [0] [0] 6.233 6.233 3.006 6.566 1.076 3.366
KSQ T 4 G4(MP3|KS)-D-v5 1.93 0.795 1.013 0.623 [0]   [0] –0.099 [0] [0] [0] [0] [0] [0]
KSQ T 3 G4(MP3|KS)-D-v6 1.96 0.751 1.023 0.609 [0]   [0] [0] [0] [0] [0] [0] [0] [0]
Q T 4 MP2.X-Q 3.28 1.007 1.128 0.816 [0]   [0] –0.033 [0] [0] [0] [0] [0] [0]
{T, Q} T D 4 G4(scsMP2.X)-D-v8 1.84 0.633 0.906 0.965 1.026   1.026 [0] [0] [0] [0] [0] [0] [0]
        G44 2.52                          
        G4MP25 2.96                          
        CBS-QB311 3.10                          
method energy terms included
G4(MP2)-XK = EHF/CBS + c3EMP2,OS/ma-TZVXP + c4EMP2,SS/ma-TZVXP + c5ECCSD/def2-SVSPc1EMP2,OS/def2-SVSPc2EMP2,SS/def2-SVSP + c6EC,(T)/def2-SVSP + HLC
revG4(MP2)-XK-H6-v1
graphic file with name ct0c00189_m023.jpg
G4(MP2)-XK-D-H5-v1
graphic file with name ct0c00189_m024.jpg
G4(MP2)-XK-D-H5-v2 = EHF/CBS + c3EMP2,OS/def2-QZVPPD + c4EMP2,SS/def2-QZVPPD + c5ECCSD/def2-SVSPc1EMP2,OS/def2-SVSPc2EMP2,SS/def2-SVSP + c6EC,(T)/def2-SVSP + HLC′
G4(MP2)-XK-D-H6-v1
graphic file with name ct0c00189_m025.jpg
G4(MP2)-XK-D-v1
graphic file with name ct0c00189_m026.jpg
G4(MP2)-XK-D-v2 = EHF/CBS + c3EMP2,OS/def2-QZVPPD + c4EMP2,SS/def2-QZVPPD + c5ECCSD/def2-SVSPc1EMP2,OS/def2-SVSPc2EMP2,SS/def2-SVSP + c6EC,(T)/def2-SVSP
G4(MP2)-XK-T-H6-v1
graphic file with name ct0c00189_m027.jpg
G4(MP2)-XK-T-v2
graphic file with name ct0c00189_m028.jpg
G4(MP2)-D-H5-v1
graphic file with name ct0c00189_m029.jpg
G4(MP2)-D-H5-v2
graphic file with name ct0c00189_m030.jpg
G4(MP2)-D-v1
graphic file with name ct0c00189_m031.jpg
G4(MP2)-D-v2
graphic file with name ct0c00189_m032.jpg
G4-T-H6-v1
graphic file with name ct0c00189_m033.jpg
G4-T-v1 = EHF/CBS + [(c1 + 1)Ecorr,MP2/def2-QZVPPDc1Ecorr,MP2/def2-TZVPPD] + c2(EC,CCSD/def2-TZVPEMP2/def2-TZVP) + c3EC,(T)/def2-TZVP + c4E[D3BJ]
G4-T-v2
graphic file with name ct0c00189_m034.jpg
G4-T-DLPNO-v2
graphic file with name ct0c00189_m035.jpg
G4-Q-DLPNO-v2
graphic file with name ct0c00189_m036.jpg
G4(MP3)-D-H6-v1
graphic file with name ct0c00189_m037.jpg
G4(MP3)-D-v1
graphic file with name ct0c00189_m038.jpg
G4(MP3|KS)-D-H5-v1
graphic file with name ct0c00189_m039.jpg
G4(MP3|KS)-D-H5-v5
graphic file with name ct0c00189_m040.jpg
G4(MP3|KS)-D-H5-v6
graphic file with name ct0c00189_m041.jpg
G4(MP3|KS)-D-v5
graphic file with name ct0c00189_m042.jpg
G4(MP3|KS)-D-v6
graphic file with name ct0c00189_m043.jpg
MP2.X-Q
graphic file with name ct0c00189_m044.jpg
G4(scsMP2.X)-D-v8
graphic file with name ct0c00189_m045.jpg
G44
graphic file with name ct0c00189_m046.jpg
G4MP25
graphic file with name ct0c00189_m047.jpg
CBS-QB311
graphic file with name ct0c00189_m048.jpg
a

The HF{T,Q} extrapolation was used for all methods. The coefficients c(E2,os)’ and c(E2,ss)’ correspond to the E2αβ and E2ss components from the CCSD(T) step, respectively. The final methods are highlighted. All methods are contained in the full table in the Supporting Information and highlighted are the final selected methods.

Table 3. Summary of Recommended cWFT Methods.

method WTMAD2 parameters
G4(MP2)XK-T 1.43 6
G4-T 1.51 3
G4-Q-DLPNO 1.52 3
G4-T-DLPNO 1.66 3
G4(MP3)-D 1.65 6
G4(MP3|KS)-D 1.96 3
G4(MP2)XK-D 2.56 7

Figure 2.

Figure 2

Weighted mean absolute deviation (WTMAD2 in kcal mol–1) of selected composite wave function and double-hybrid DFT methods over the GMTKN55 database.

IV.A. Using Our Largest Available Basis Sets

Our best result overall is WTMAD2 = 1.36 kcal/mol (G4(MP2)-XK-T-H6-v1, Table 2). Interestingly, replacing def2-QZVPPD by an extrapolation E2/{T,Q} does not improve WTMAD2 and in fact slightly raises it to 1.42 kcal/mol (G4-T-H6-v1).

The best approximation has 13 empirical parameters. Can we reduce this number? Setting A = A′ comes at a minimal cost of just 0.02 kcal/mol. Instead eliminating the dispersion correction while retaining the HLC has a slightly greater cost, 0.07 kcal/mol, but then also setting A = A′ (leaving 11 parameters) barely affects statistics. (Conveniently, setting A = A′ eliminates a problem for radical reactions first identified by Chan, Coote, and Radom.95) However, deleting the HLC entirely and leaving just seven parameters yields a result of the same quality, 1.42 kcal/mol, which increases marginally to 1.43 kcal/mol if we remove the dispersion correction. Inspection of the WTMAD2 component breakdown reveals that dropping the HLC slightly degrades the thermochemistry and large-molecule reaction components but (as expected) leaves the intermolecular and conformer sets unaffected and actually slightly improves the barrier heights.

Extrapolating E2/{T,Q} instead [G4-T-v1 and G4-T-v2] gives WTMAD2 = 1.48 and 1.51 kcal/mol, respectively, with and without the dispersion correction: the latter especially could be regarded as a def2-based version of ccCA, albeit with the addition of three empirical parameters.

These latter statistics can be marginally lowered (about 0.02–0.03 kcal/mol) through separate extrapolation of E2αβ and E2ssΕ2αα + Ε2ββ, at the expense of introducing one additional parameter.

IV.B. Using Three Basis Set Tiers

Obviously, CCSD(T)/def2-TZVP is computationally quite costly for larger systems, and in fact proved out of reach for the very largest species in GMTKN55. Hence, we considered a three-tier approach akin to, but different from, G4 theory. The (scaled, see below) MP3 component was evaluated using the def2-TZVPP basis set, while for the CCSD(T)-MP3 difference, we fell back to the smallest basis set used in G4(MP2)-XK, namely, def2-SVSP, which is just def2-SVP but omitting the p polarization functions on hydrogen. For MP2, we used the same basis sets as in the previous paragraph.

Our best result here, WTMAD2 = 1.63 kcal/mol (G4(MP3)-D-H6-v1), at first sight, represents a degradation of 0.3 kcal/mol over our best result in (III.A)—however, the comparison is not entirely fair as the 1.63 result is weighted over almost 200 additional (larger-molecule) reactions. If we re-evaluate for the same 1257 systems as above, then we find WTMAD2 = 1.55, which represents about a 0.2 kcal/mol degradation for a vast reduction in CPU time.

By way of comparison, standard G4 theory—which employs a similar three-tier basis set scheme—achieves 2.52 kcal/mol on 1273 reactions. (It should be noted that this required tweaking the HLC for the rare gas atoms, as detailed at the end of the Methods section.) We note that the 2.52 kcal/mol number is in fact inferior to several of the best double-hybrid density functionals, notably46 ωB97M(2) of the Head–Gordon group (2.18 kcal/mol) and revDSD-PBEP86-D4 (2.33 kcal/mol) and xrevDSD-PBEP86-D4 (2.23 kcal/mol) of the Weizmann group. In contrast, our best approach here has comfortably lower WTMAD2 than all of these.

CBS-QB3, which is likewise a three-tiered method, but with MP4(SDQ) as the middle level, reaches WTMAD2 = 3.10 kcal/mol. This figure would be considerably lower were it not for the poor performance of the intermolecular interactions subsets (which at 1.55 kcal/mol accounts for half the total WTMAD2); detailed inspection reveals this to be due to basis set superposition error. Numerous DH-DFT methods are in the 2.3–2.7 kcal/mol WTMAD2 range, and they show significant overall improvement over CBS-QB3; i.e., ωB97M(2) and xrevDSD-PBEP86-D4 are at 0.49 and 0.47 kcal/mol for the intermolecular interactions, respectively. At this point, we can correlate our CBS-QB3 findings with the conclusions of the detailed theoretical study of Karton and Goerigk on pericyclic reactions,96 where dispersion was shown to play an important role (stabilizing the transition states). These authors compared the performance of WFT, cWFT, and DH-DFT methods with W1-F12 reference energies and found CBS-QB3 to be inferior (RMSD = 2.6 kcal/mol) to the DH-DFT methods DSD-BLYP-D3, B2GP-PLYP-D3, and B2K-PLYP-D3 (RMSD = 1.4 kcal/mol for all three).

Our best value requires 12 empirical parameters. Can we winnow this down at an acceptable cost in accuracy? Setting A = A′ costs a negligible 0.01 kcal/mol; excising the dispersion correction costs a still smallish 0.04 kcal/mol—but eliminating the HLC, thus cutting our number of parameters in half, actually achieves a slightly lower WTMAD2 at 1.65 kcal/mol. (This has the additional benefit of eliminating discontinuities in HLC as a bond is broken.) Once again, that small cost is confined to thermochemistry and large-molecule reactions.

Further reduction from six to five parameters can be achieved by eliminating the dispersion term, but now this comes at a cost of 0.11 kcal/mol. (Note the negative coefficient of −0.19 for the dispersion term, which appears to indicate it compensates for overbinding due to BSSE in the noncovalent interaction subsets. For detailed discussions of that issue, see refs (97 and 98).)

Using E2/{T,Q} extrapolation instead of scaling does come at a small cost in accuracy; however, the dispersion correction can then be eliminated at the expense of just 0.02 kcal/mol, leaving us with 1.84 kcal/mol (G4(scsMP2.X)-D-v8) for four parameters.

One additional refinement, at zero computational cost, we can consider would be to carry out separate extrapolations for E2αβ and E2ss. Thus, introducing an additional parameter turns out to yield only negligible benefits across the board, and we have hence abandoned this.

The conventional MP3/def2-TZVPP turned out to be something of a bottleneck for the largest systems. This could obviously be mitigated with an RI-MP3 algorithm (e.g., such as reported in refs (99 and 100)), but none was available to us. This prompted the question at what price the middle tier could be eliminated.

IV.C. Eliminating the Middle Tier

Our best result then (G4(MP2)-XK-D-H6-v1) is 2.52 kcal/mol, which represents a substantial degradation of over 1.63 kcal/mol above. A component breakdown (Table 1) reveals that the said degradation happens across the board for all subsets.

On the one hand, that implies the middle MP3-like tier (incidentally, inspired by the MP2.X method101) is clearly beneficial; on the other hand, that means that without it we are, in fact, at a slight disadvantage compared to the best double hybrids.

The costs of setting A = A′ in the HLC, and of eliminating the HLC outright, follow the same trends as in the previous section. This leaves us with 2.56 kcal/mol for seven parameters; additionally, dropping the dispersion correction brings us up to 2.73 kcal/mol.

The G4(MP2)-XK paper applies separate scaling coefficients to the overall CCSD correlation energy and the same-spin and opposite-spin CCSD components. (The third, hidden, component is the singles contribution for open-shell cases.) Does this breakdown have material advantages over simply scaling the CCSD-MP2 difference globally? We find [G4(MP2)-XK-D-H5-v1 vs G4(MP2)-D-H5-v1] that applying the latter constraint (thus eliminating two empirical parameters) adds just 0.06 kcal/mol to WTMAD2. Without dispersion [G4(MP2)-XK-D-H5-v2 vs G4(MP2)-D-H5-v2] this increases to 0.07 kcal/mol. In the absence of HLC, this increases further to 0.12 kcal/mol [G4(MP2)-XK-D-v1 vs G4(MP2)-D-v1], and when the dispersion correction is removed [G4(MP2)-XK-D-v2 vs G4(MP2)-D-v2] this shoots up to 0.3 kcal/mol. Hence, the advantage of scaling the CCSD spin components separately appears to lie primarily in noncovalent interactions. Setting cCCSD-MP2 = c(T), i.e., treating all post-MP2 terms as a single correction, incurs a comparatively low cost. Somewhat surprisingly, perhaps, when eliminating (T) altogether (if dispersion is left in), WTMAD2 still stays below 3 kcal/mol. For perspective, however, we are now approaching the performance territory of the ωB97M-V range-separated hybrid meta-GGA91 (WTMAD2 = 3.29 kcal/mol),46 the best rung-4 functional!

To put this in context, let us consider the original G4(MP2) and G4(MP2)-XK methods. For a smaller sample of 1208 molecules where we have G4(MP2) results, the WTMAD2 of G4(MP2) was equal to 2.96 compared to our best method, 2.31 kcal/mol.

(A short note is in order about some of the DFT functionals. WTMAD2 statistics for ωB97X-V and ωB97M-V were first published by Najibi and Goerigk,102 and for B3LYP-D3 and M06-2X-D3(0) in the original GMTKN55 paper. These authors use the def2-QZVP basis set, augmented with diffuse s and p functions for the anionic subsets G21EA, AHB21, and IL16, while we, in the present work and in refs (45 and 46), employ def2-QZVPP as our baseline and def2-QZVPPD for the three aforementioned subsets plus RG18 and WATER27. While our WTMAD2 values (taken from ref (45) by way of ref (46)) for ωB97X-V and B3LYP-D3 are quite similar to those reported earlier in ref (102), our WTMAD2s for ωB97M-V and M06-2X-D3(0) are lower by 0.24 and 0.40 kcal/mol, respectively. Part of the difference could be traced to our larger basis sets, and the remainder is likely due to the superfine integration grids used here: meta-GGAs are well-known to display strong grid sensitivity,103 even though mitigation measures had been taken in the development of wB97M-V. For MP2-D3, the larger basis sets used here impart WTMAD2 improvements for the anionic species, but also for ACONF and RG18, which have large weights in the WTMAD2 formula.)

IV.D. Reducing the MP2 Basis Set to TZ

If we do so, we effectively obtain a reparametrized version (rev-G4MP2XK-H6-v1) of G4(MP2)-XK.53 The refitted WTMAD2, 3.53 kcal/mol, is not much lower than the 3.71 kcal/mol obtained with the original G4(MP2)-XK parameters. This relatively small reduction suggests that the training set used in ref (53) was fairly representative. Both the original and refitted WTMAD2 values are, in fact, inferior to ωB97M-V and represent a marked deterioration over G4(MP2)-XK-T-H6-v1 in the previous section; a component breakdown of WTMAD2 (Table 1) reveals that this happens across the board. Considering the fairly low cost of an RI-MP2/def2-QZVPPD calculation if an RI-MP2 capable code is available (be it MOLPRO, Q-Chem, PSI4, ORCA, ...), there is really no valid reason not to “walk that additional tenth of a mile”.

If we include the dispersion correction, WTMAD2 drops to 3.28 kcal/mol (note the sizable coefficient cDisp = −0.43). Without D3(BJ), eliminating HLC comes at a price of about 0.5 kcal/mol, but with D3(BJ), no such large price need be paid: WTMAD2 = 3.35 kcal/mol, with cDisp = −0.81. As above, the physical meaning of the large negative coefficient is clearly to compensate for basis set superposition error, particularly in the noncovalent interaction subsets.

IV.E. Using the E2 Correlation Energy from a Double-Hybrid Calculation

The good performance of double hybrids, inspired by GLPT2 (second-order Görling–Levy perturbation theory),104 elicits the question: could we improve performance by using KS orbitals in the MP2 for the large-basis steps?

We have investigated this possibility both for the two-tier scenario from subsection IV.C and for the three-tier scenario from subsection IV.B. The E2 energy components were taken from a ωrevDSD-PBEP86-D3BJ calculation with x = 0.72, ω = 0.16.

In the two-tier scenario, we see no noticeable improvement, but an intriguing phenomenon can be observed in the triple excitations coefficient cT, which takes on relative small negative values. In the three-tier scenario (G4(MP3|KS)-D-H5-v1), we can get down to WTMAD = 1.89 kcal/mol—but cT ≈ 0, and in addition cCCSD–MP3 ≈ 0. This offers the tantalizing possibility of eliminating the costly CCSD(T) calculation step entirely (i.e., cCCSD-MP3 = cT = 0): doing so [G4(MP3|KS)-D-H5-v5 and G4(MP3|KS)-D-H5-v6] yields WTMAD = 1.89 with or without cDisp = 0. This offers a performance superior to double hybrids but requiring no steps more expensive than MP3/def2-TZVPP and with only three parameters (plus the six of the HLC). Can we cut out the HLC as well? Doing so [G4(MP3|KS)-D-v5 and G4(MP3|KS)-D-v6] only slightly increases WTMAD2 to 1.93 kcal/mol (with dispersion) and 1.96 kcal/mol (without dispersion): these last two options only have four and three parameters total, respectively, i.e., fewer than the double hybrids.

We note that code limitations precluded carrying out MP3 calculations in a basis of KS orbitals; hence, the MP3 term could only be considered in a basis of HF orbitals. From another perspective, that of fifth-rung density functional methods, this points toward a path for further improving on ωB97M(2) and revDSD-PBEP86-D4 at a relatively modest computational cost. Past attempts involving post-MP2 methods by Chan, Goerigk, and Radom,105 using correlation ranging from MP3 to CCSD(T), showed no improvement: however, these authors for obvious cost reasons used the G2/97 small-molecule thermochemistry data set of 148 small molecules (about one-tenth the size of the present training set) and also restricted themselves to TZ quality basis sets. In view of our findings about the importance of a QZ quality basis set for the MP2-like component, it is quite possible that any improvements from post-MP2 methods in ref (105) were masked by residual basis set incompleteness error in the KS-MP2 contribution. The proverb about a chain being no stronger than its weakest link comes to mind.

The benefits of using KS orbitals in the MP2 for the large-basis steps can be illustrated from a different angle: if we use instead (spin-component-scaled) MP2, MP3, and a dispersion correction and optimize the four adjustable parameters (MP2.X-Q), we obtain WTMAD2 = 3.28 kcal/mol, which stays the same if we cut out the dispersion correction. Comparison with the MP2|KS cognates of these two methods (1.93 and 1.96 kcal/mol, see above) reveals that substituting MP2|KS for MP2 imparts a benefit of about 1.2–1.3 kcal/mol. A component breakdown shows that the difference lies in the small-molecule thermochemistry, barrier heights, and large-molecule reaction energies, while the two noncovalent interaction subsets taken together stay approximately unchanged in quality.

For this “empirical SCS-MP2x” approach, basis set extrapolation does offer a slight improvement (on the order of 0.05 kcal/mol).

IV.F. Performance of the DLPNO-CCSD(T)-Based Composite Methods

As the O(n7) CPU time scaling with a system size of canonical CCSD(T) will present an insurmountable obstacle for application to larger systems, we considered its replacement by the DLPNO-CCSD(T) approach, which is linear-scaling with system size. Specifically, for two-tier approaches, we replaced the post-MP2 correction, E[CCSD(T)] – E[MP2], by E[DLPNO-CCSD(T)] – E[DLPNO-MP2] and left the remaining terms unchanged. The performance of such methods, in which we combine an E2/{T,Q} extrapolation with post-MP2 corrections for different-sized basis sets, is illustrated in Figure 3. Using the largest basis set def2-QZVPP in the post-MP2 correction, our best overall WTMAD2 is 1.50 kcal/mol for G4-Q-DLPNO-H6-v1 with 10 parameters. Setting cDisp = 0 for G4-Q-DLPNO-H6-v2 leaves WTMAD2 unchanged. When A = A′ for HLC, that yields 1.51 kcal/mol with or without dispersion [G4-Q-DLPNO-H5-v1 and G4-Q-DLPNO-H5-v2]. Eliminating the HLC has a negligible effect: WTMAD2 = 1.52 kcal/mol for G4-Q-DLPNO-v1 and G4-Q-DLPNO-v2. The second variant, G4-Q-DLPNO-v2, has only three parameters, and this minimally empirical cWFT will be referred to as G4-Q-DLPNO.

Figure 3.

Figure 3

Weighted mean absolute deviation (WTMAD2 in kcal mol–1) of DLPNO-CCSD(T)-based composite methods over the GMTKN55 database. Darkest colors represent the non-HLC variants.

What about using the smaller basis sets def2-TZVPP or even def2-SVPD in the DLPNO-CCSD(T) step instead? For the larger def2-TZVPP, the dispersion similarly has a negligible contribution to WTMAD2 of all G4-T-DLPNO variants. The lowest WTMAD2 is 1.65 for G4-T-DLPNO-H6-v1 with 10 parameters and does not change when constraining cDisp = 0. Setting A = A′ slightly raises WTMAD2 by 0.03 with, and 0.04 kcal/mol without, dispersion [G4-T-DLPNO-H5-v1 and G4-T-DLPNO-H5-v2, respectively]. Removing HLC yields 1.67 and 1.68 kcal/mol when retaining or discarding dispersion, respectively.

The smallest basis set def2-SVPD renders the dispersion beneficial for the G4-D-DLPNO variants. The lowest WTMAD2 is 2.26 kcal/mol for G4-D-DLPNO-H6-v1 and 10 fitted parameters. Removal of dispersion increases WTMAD2 by 0.06 kcal/mol, while setting A = A′ raises it by 0.05 and 0.10 kcal/mol with and without dispersion [G4-D-DLPNO-H5-v1 vs G4-D-DLPNO-H5-v2]. Eliminating the HLC increases WTMAD2 to 2.31 kcal/mol for G4-Q-DLPNO-v1, while further eliminating the dispersion leads to WTMAD2 = 2.38 kcal/mol for G4-D-DLPNO-v2 with only three parameters. This is better by 1.33 kcal/mol than the original G4(MP2)-XK at a fraction of its cost, and with nine fewer parameters.

Recently, Neese and co-workers106 found for the GMTKN55 data set and the aug-cc-pVDZ basis set that DLPNO-CCSD(T1) with TightPNO has a WTMAD2 from the canonical results of 0.87 kcal/mol (1.58 kcal/mol for NormalPNO). Intriguingly, the present results show a much smaller difference between corresponding canonical and DLPNO approaches, implying a degree of error compensation between the MP2 and post-MP2 components. Moreover, replacing DLPNO-CCSD(T) by DLPNO-CCSD(T1) yields a marginal improvement of statistics for the G4-DLPNO-T1 type methods in this framework (see the Supporting Information).

Gas-phase thermochemistry of small organic molecules has previously been studied employing DLPNO-CCSD(T) electronic energies by Paulechka and Kazakov,107 who compared experimental gas-phase formation enthalpies with those at DLPNO-CCSD(T)/def2-QZVP//RI-MP2/def2-QZVP. They considered four variants, the most costly scheme being single-point energies at DLPNO-CCSD(T)/def2-QZVP and a geometry optimization at RI-MP2/def2-QZVP. Their data set consisted of 45 small organic molecules, ranging in size from benzene and pyridine to biphenyl. That implementation was available only for C, H, O, and N containing closed-shell molecules. Mielczarek et al.108 extended this work to 164 molecules composed of the elements H, C, N, O, F, S, Cl, and Br.

IV.G. Final Selected Methods

It is clear from the discussion above that the WTMAD2 improvement from including the HLC is not commensurate with having to include an extra six parameters. This winnows our choices to the non-HLC ones.

We then find that the benefit of the dispersion corrections is negligible for the QZ/TZ two-tier approaches but rather less negligible for the QZ/TZ/DZ three-tier approaches and the Q/D two-tier. This leaves us with the following hierarchy of three methods; the variant label is omitted for the final selected methods, and the original method’s name is in brackets:

GOOD: [G4(MP2)-XK-D-v1] G4(MP2)-XK-D WTMAD2 = 2.56 kcal/mol, with seven fitted parameters, and with CCSD(T)/def2-SVSP as the computationally costliest step;

BETTER: [G4(MP3)-D-v1] G4(MP3)-D WTMAD2 = 1.65 kcal/mol with six fitted parameters, and with MP3/def2-TZVPP and CCSD(T)/def2-SVSP as the costliest steps;

BEST: [G4(MP2)-XK-T-v2] G4(MP2)-XK-T WTMAD2 = 1.43 kcal/mol likewise with six fitted parameters, and with CCSD(T)/def2-TZVP as the costliest step. (The “T” refers to the basis set for CCSD(T)/def2-TZVP used there. The basis set extrapolation steps are only a subsidiary element of the computational cost if an RI-MP2 approximation is used.)

If one seeks an alternative to “best” with fewer fitted parameters, [G4-T-v2] G4-T with WTMAD2 = 1.51 kcal/mol fits the bill.

If KS orbitals in the MP2 are used, then [G4(MP3|KS)-D-v6] G4(MP3|KS)-D with WTMAD2 = 1.96 kcal/mol, with no step costlier than MP3/def2-TZVPP.

If one considers DLPNO-CCSD(T) as a gentler-scaling alternative to canonical CCSD(T), then G4-T-DLPNO becomes an attractive option, with WTMAD2 = 1.66 kcal/mol, no step costlier than DLPNO-CCSD(T)/def2-TZVPP, and just three fitted parameters. Somewhat superior accuracy can be obtained (WTMAD2 = 1.52 kcal/mol) at the G4-Q-DLPNO level, where DLPNO-CCSD(T)/def2-QZVPP is the costliest step.

In Table 3, we present the final selected cWFT methods. They combine the highest accuracy for the molecular energies across the GMTKN55 database with a minimal number of empirical parameters. The energy expressions that we employed eliminate any redundant parameters (HLC) and render the listed methods transferable to various chemical systems, other than those of the 2459 molecules of GMTKN55. Extensive benchmark sets and the survival of the fittest approach, where the proper energy expressions eliminate empirical parameters, improved the DFT performance significantly within chemical accuracy for energetics. Representative cases include ωB97M(2) and revDSD-PBEP86-D4 that have been successfully employed to expanded porphyrins109 or transition metal reaction barrier heights of the MOBH35 database.110,111

Example input and output files for the popular Gaussian 16 electronic structure system, as well as postprocessing scripts, have been given in the Supporting Information. The geometry optimization and frequencies steps (and scale factors) are the same there as in the original G4(MP2)XK specification.53

When perusing the CPU and wall clock times in these outputs, it should be kept in mind that this code has no RI-MP2 implementation, and hence the largest basis set MP2 step will weigh much heavier on the total than it would with another code like ORCA or Q-CHEM.

Further performance assessment of the presently proposed cWFT methods is especially desirable for transition metal (and heavy p-block) energetics and barrier heights, particularly in order to learn their limitations. Toward that aim, we shall consider the following data sets in future work: TMC151 of transition metals;112 CUAGAU of small copper, silver, and gold compounds;113 the MOBH35 (metal–organic barrier height) data set;110,111 and for the main group, the heavy p-block subset from ref (53) and the smaller MG8 of noncovalent interactions, isomerization reactions, thermochemistry, and barrier heights.114

V. Conclusions

We have attempted to develop a hierarchy of composite WFT computational thermochemistry protocols based on Weigend–Ahlrichs basis sets but parametrized against the very large and chemically diverse GMTKN55 benchmark.

The original G4(MP2)-XK approach of Chan, Karton, and Raghavachari could be considered the lowest tier of our approach, using no larger basis set than def2-TZVPP for the MP2 correlation and def2-SVSP for the post-MP2 corrections. For GMTKN55, with the original parameters (but including a fix for rare gas atoms), G4(MP2)-XK is intermediate in performance between the ωB97X-V and ωB97M-V combinatorially optimized range-separated hybrid functionals. Refitting lowers WTMAD2 somewhat, but not spectacularly. Adding an empirical dispersion correction brings G4(MP2)-XK-D in the WTMAD2 = 3.3 kcal/mol range (equal to ωB97M-V, the best-performing rung-4 functional) and additionally allows us to eliminate the “high-level correction”.

If an RI-MP2 code is available, expanding the MP2 basis set to def2-QZVPPD is relatively inexpensive in both CPU time and storage requirements. The resulting G4(MP2)-XK-D method performs in the same range as revDSD double hybrids but is still inferior to the best among them like revDSD-PBEP86-D4, as well as to the combinatorially optimized ωB97M(2) range-separated double hybrid. Especially if a dispersion correction is included, the HLC turns out to be superfluous, bringing the total number of empirical parameters down to just six.

A significant improvement is achieved with the three-tier G4(MP3)-D method, which entails RI-MP2/def2-QZVPPD, MP3/def2-TZVPP, and CCSD(T)/def2-SVSP calculations. We can then achieve WTMAD2 = 1.65 kcal/mol, markedly superior to even the best double hybrids.

A further enhancement is possible as a two-tier approach with RI-MP2/def2-QZVPPD and CCSD(T)/def2-TZVP, but at WTMAD2 = 1.43 kcal/mol, the substantial premium in computational cost is hard to justify by a performance benefit of just 0.2 kcal/mol over G4(MP3)-D. Still, further improvements probably will require going to quintuple-zeta basis sets for the MP2 step, including inner-shell correlation, considering post-CCSD(T) correlation effects, or a combination thereof.

While G4(MP2)-XK-D and G4(MP3)-D are comparable in computational cost to G4(MP2) and G4, respectively, their WTMAD2 metrics over GMTKN55 are markedly superior. A component breakdown of WTMAD2 reveals that the advantage of G4(MP2)-XK-D and G4(MP3)-D resides primarily in the large-molecule, intramolecular NCI (noncovalent interactions), and intermolecular NCI subsets: we note that the HLC as defined for G4(MP2) and G4 cancels exactly between the reactant and product sides of these reactions (as for any reactions that only involve closed-shell molecules), and hence it is unable to correct for basis set incompleteness.

The combination of (MP2|KS)/def2-QZVPPD correlation from a double-hybrid calculation with scaled MP3/def2-TZVPP correlation energy yields very promising results, WTMAD2 = 1.96 kcal/mol, at a comparatively low computational cost. This points toward an avenue for further improving the performance of empirical double-hybrid density functionals.

The DLPNO-CCSD(T)-based variants of the two-tier family represent the lowest-cost approach due to their minimal requirements compared to the canonical CCSD(T)-based cWFT. (By way of illustration: the most time-consuming step of G4-T-DLPNO on a melatonin conformer took about 2 h on eight cores of a Skylake machine, compared to about 2 days for the canonical G4-T. This gap will only widen for larger molecules.) G4-D-DLPNO was found to have WTMAD = 2.38 kcal/mol, i.e., 1.33 kcal/mol lower that the original G4(MP2)-XK at a fraction of its cost and with nine fewer parameters. The G4-Q-DLPNO and G4-T-DLPNO had the lowest WTMAD2 = 1.52 and 1.65 kcal/mol for the GMTKN55 database, respectively. Out of the different DLPNO-based approaches, G4-T-DLPNO appears to offer the best price–performance ratio.

Overall, we can conclude that composite WFT methods still offer advantages over the best double hybrids—but that the gap has narrowed substantially.

Finally, we would like to point out that G4-T, G4-Q-DLPNO, G4-T-DLPNO, G4(MP2)-XK-T, G4(MP2)-XK-D, G4(MP3)-D, and G4(MP3|KS)-D are available for all s, p, and d block elements of the Periodic Table. Sample inputs and postprocessing scripts are given in the Supporting Information.

Acknowledgments

J.M.L.M. acknowledges Dr. Mark Vilensky (scientific computing manager of ChemFarm) and Mr. Jacov Wallerstein (CEO and CTO of Access Technologies) for development of the bscratch shared high-bandwidth scratch storage server, without which some of the largest conventional calculations would have been impossible. Peer reviewers are thanked for helpful suggestions. We also thank Mr. Golokesh Santra and Dr. Mark A. Iron for helpful discussions.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.0c00189.

  • Gaussian 16 files (including required basis sets) and postprocessing scripts for the G4-T, G4(MP2)-XK-T, G4(MP3|KS)-D, G4(MP2)-XK-T, and G4(MP3)-D methods, accompanied by an instructions document; detailed equations for all composite methods developed here; the full table of parameters and statistics of all methods considered in Microsoft Excel format; and a report with a detailed breakdown of all 55 individual subsets of the performance statistics of the methods proposed in this paper (ZIP)

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Research was supported by the Israel Science Foundation (grant 1358/15) and by the Yeda-Sela-SABRA program (Weizmann Institute of Science). The work of E.S. on this scientific paper was supported by the Onassis Foundation—Scholarship ID: F ZP 052-1/2019-2020.

The authors declare no competing financial interest.

Supplementary Material

ct0c00189_si_001.zip (13.4MB, zip)

References

  1. Kohn W.; Sham L. J. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 1965, 140, A1133–A1138. 10.1103/PhysRev.140.A1133. [DOI] [Google Scholar]
  2. Curtiss L. A.; Raghavachari K.; Redfern P. C.; Rassolov V.; Pople J. A. Gaussian-3 (G3) Theory for Molecules Containing First and Second-Row Atoms. J. Chem. Phys. 1998, 109, 7764–7776. 10.1063/1.477422. [DOI] [Google Scholar]
  3. Baboul A. G.; Curtiss L. A.; Redfern P. C.; Raghavachari K. Gaussian-3 Theory Using Density Functional Geometries and Zero-Point Energies. J. Chem. Phys. 1999, 110, 7650. 10.1063/1.478676. [DOI] [Google Scholar]
  4. Curtiss L. A.; Redfern P. C.; Raghavachari K. Gaussian-4 Theory. J. Chem. Phys. 2007, 127, 124105. 10.1063/1.2770701. [DOI] [PubMed] [Google Scholar]
  5. Curtiss L. A.; Redfern P. C.; Raghavachari K. Gaussian-4 Theory Using Reduced Order Perturbation Theory. J. Chem. Phys. 2007, 127, 124105. 10.1063/1.2770701. [DOI] [PubMed] [Google Scholar]
  6. Curtiss L. A.; Raghavachari K.; Trucks G. W.; Pople J. A. Gaussian-2 Theory for Molecular Energies of First- and Second-row Compounds. J. Chem. Phys. 1991, 94, 7221–7230. 10.1063/1.460205. [DOI] [Google Scholar]
  7. Pople J. A.; Head-Gordon M.; Fox D. J.; Raghavachari K.; Curtiss L. A. Gaussian-1 Theory: A General Procedure for Prediction of Molecular Energies. J. Chem. Phys. 1989, 90, 5622–5629. 10.1063/1.456415. [DOI] [Google Scholar]
  8. Curtiss L. A.; McGrath M. P.; Blaudeau J. P.; Davis N. E.; Binning R. C.; Radom L. Extension of Gaussian-2 Theory to Molecules Containing Third-Row Atoms Ga-Kr. J. Chem. Phys. 1995, 103, 6104–6113. 10.1063/1.470438. [DOI] [Google Scholar]
  9. Curtiss L. A.; Redfern P. C.; Raghavachari K. Gn Theory. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2011, 1, 810–825. 10.1002/wcms.59. [DOI] [Google Scholar]
  10. Montgomery J. A.; Frisch M. J.; Ochterski J. W.; Petersson G. A. A Complete Basis Set Model Chemistry. VI. Use of Density Functional Geometries and Frequencies. J. Chem. Phys. 1999, 110, 2822. 10.1063/1.477924. [DOI] [Google Scholar]
  11. Montgomery J. A.; Frisch M. J.; Ochterski J. W.; Petersson G. A. A Complete Basis Set Model Chemistry. VII. Use of the Minimum Population Localization Method. J. Chem. Phys. 2000, 112, 6532–6542. 10.1063/1.481224. [DOI] [Google Scholar]
  12. Petersson G. A.Complete Basis Set Models for Chemical Reactivity: From the Helium Atom to Enzyme Kinetics. In Quantum-Mechanical Prediction of Thermochemical Data; Cioslowski J., Ed.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2001; pp 99–130. [Google Scholar]
  13. Zhao Y.; Xia L.; Liao X.; He Q.; Zhao M. X.; Truhlar D. G. Extrapolation of High-Order Correlation Energies: The WMS Model. Phys. Chem. Chem. Phys. 2018, 20, 27375–27384. 10.1039/C8CP04973D. [DOI] [PubMed] [Google Scholar]
  14. Zhao Y.; Lynch B. J.; Truhlar D. G. Multi-Coefficient Extrapolated Density Functional Theory for Thermochemistry and Thermochemical Kinetics. Phys. Chem. Chem. Phys. 2005, 7, 43. 10.1039/b416937a. [DOI] [Google Scholar]
  15. Zhao Y.; Lynch B. J.; Truhlar D. G. Doubly Hybrid Meta DFT: New Multi-Coefficient Correlation and Density Functional Methods for Thermochemistry and Thermochemical Kinetics. J. Phys. Chem. A 2004, 108, 4786–4791. 10.1021/jp049253v. [DOI] [Google Scholar]
  16. Peterson C.; Penchoff D. A.; Wilson A. K. Prediction of Thermochemical Properties Across the Periodic Table: A Review of the Correlation Consistent Composite Approach (CcCA) Strategies and Applications. Annu. Rep. Comput. Chem. 2016, 12, 3–45. 10.1016/bs.arcc.2016.04.001. [DOI] [Google Scholar]
  17. DeYonker N. J.; Wilson B. R.; Pierpont A. W.; Cundari T. R.; Wilson A. K. Towards the Intrinsic Error of the Correlation Consistent Composite Approach (CcCA). Mol. Phys. 2009, 107, 1107–1121. 10.1080/00268970902744359. [DOI] [Google Scholar]
  18. DeYonker N. J.; Cundari T. R.; Wilson A. K. The Correlation Consistent Composite Approach (CcCA): An Alternative to the Gaussian-n Methods. J. Chem. Phys. 2006, 124, 114104. 10.1063/1.2173988. [DOI] [PubMed] [Google Scholar]
  19. Martin J. M. L.; de Oliveira G. Towards Standard Methods for Benchmark Quality Ab Initio Thermochemistry—W1 and W2 Theory. J. Chem. Phys. 1999, 111, 1843–1856. 10.1063/1.479454. [DOI] [Google Scholar]
  20. Parthiban S.; Martin J. M. L. Assessment of W1 and W2 Theories for the Computation of Electron Affinities, Ionization Potentials, Heats of Formation, and Proton Affinities. J. Chem. Phys. 2001, 114, 6014–6029. 10.1063/1.1356014. [DOI] [Google Scholar]
  21. Martin J. M. L.; Parthiban S.. W1 and W2 Theories, and Their Variants: Thermochemistry in the kJ/Mol Accuracy Range. In Quantum-Mechanical Prediction of Thermochemical Data; Cioslowski J., Ed.; Understanding Chemical Reactivity; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2002; Vol. 22, pp 31–65. [Google Scholar]
  22. Karton A.; Rabinovich E.; Martin J. M. L.; Ruscic B. W4 Theory for Computational Thermochemistry: In Pursuit of Confident Sub-KJ/Mol Predictions. J. Chem. Phys. 2006, 125, 144108. 10.1063/1.2348881. [DOI] [PubMed] [Google Scholar]
  23. Karton A.; Martin J. M. L. Explicitly Correlated W n Theory: W1-F12 and W2-F12. J. Chem. Phys. 2012, 136, 124114. 10.1063/1.3697678. [DOI] [PubMed] [Google Scholar]
  24. Sylvetsky N.; Peterson K. A.; Karton A.; Martin J. M. L. Toward a W4-F12 Approach: Can Explicitly Correlated and Orbital-Based Ab Initio CCSD(T) Limits Be Reconciled?. J. Chem. Phys. 2016, 144, 214101. 10.1063/1.4952410. [DOI] [PubMed] [Google Scholar]
  25. Chan B.; Radom L. W1X-1 and W1X-2: W1-Quality Accuracy with an Order of Magnitude Reduction in Computational Cost. J. Chem. Theory Comput. 2012, 8, 4259–4269. 10.1021/ct300632p. [DOI] [PubMed] [Google Scholar]
  26. Chan B.; Radom L. W3X: A Cost-Effective Post-CCSD(T) Composite Procedure. J. Chem. Theory Comput. 2013, 9, 4769–4778. 10.1021/ct4005323. [DOI] [PubMed] [Google Scholar]
  27. Chan B.; Radom L. W2X and W3X-L: Cost-Effective Approximations to W2 and W4 with KJ Mol – 1 Accuracy. J. Chem. Theory Comput. 2015, 11, 2109–2119. 10.1021/acs.jctc.5b00135. [DOI] [PubMed] [Google Scholar]
  28. Tajti A.; Szalay P. G.; Császár A. G.; Kállay M.; Gauss J.; Valeev E. F.; Flowers B. A.; Vázquez J.; Stanton J. F. HEAT: High Accuracy Extrapolated Ab Initio Thermochemistry. J. Chem. Phys. 2004, 121, 11599–11613. 10.1063/1.1811608. [DOI] [PubMed] [Google Scholar]
  29. Bomble Y. J.; Vázquez J.; Kállay M.; Michauk C.; Szalay P. G.; Császár A. G.; Gauss J.; Stanton J. F. High-Accuracy Extrapolated Ab Initio Thermochemistry. II. Minor Improvements to the Protocol and a Vital Simplification. J. Chem. Phys. 2006, 125, 064108. 10.1063/1.2206789. [DOI] [PubMed] [Google Scholar]
  30. Harding M. E.; Vázquez J.; Ruscic B.; Wilson A. K.; Gauss J.; Stanton J. F. High-Accuracy Extrapolated Ab Initio Thermochemistry. III. Additional Improvements and Overview. J. Chem. Phys. 2008, 128, 114111. 10.1063/1.2835612. [DOI] [PubMed] [Google Scholar]
  31. Thorpe J. H.; Lopez C. A.; Nguyen T. L.; Baraban J. H.; Bross D. H.; Ruscic B.; Stanton J. F. High-Accuracy Extrapolated Ab Initio Thermochemistry. IV. A Modified Recipe for Computational Efficiency. J. Chem. Phys. 2019, 150, 224102. 10.1063/1.5095937. [DOI] [PubMed] [Google Scholar]
  32. Feller D.; Peterson K. A.; Ruscic B. Improved Accuracy Benchmarks of Small Molecules Using Correlation Consistent Basis Sets. Theor. Chem. Acc. 2014, 133, 1407. 10.1007/s00214-013-1407-z. [DOI] [Google Scholar]
  33. Dixon D.; Feller D.; Peterson K. A Practical Guide to Reliable First Principles Computational Thermochemistry Predictions Across the Periodic Table. Annu. Rep. Comput. Chem. 2012, 8, 1–28. 10.1016/B978-0-444-59440-2.00001-6. [DOI] [Google Scholar]
  34. Feller D.; Peterson K. A.; Dixon D. A. The Impact of Larger Basis Sets and Explicitly Correlated Coupled Cluster Theory on the Feller–Peterson–Dixon Composite Method. Annu. Rep. Comput. Chem. 2016, 12, 47–48. 10.1016/bs.arcc.2016.02.001. [DOI] [Google Scholar]
  35. Frisch M. J.; Trucks G. W.; Schlegel H. B.; Scuseria G. E.; Robb M. A.; Cheeseman J. R.; Scalmani G.; Barone V.; Petersson G. A.; Nakatsuji H.; Li X.; Caricato M.; Marenich A. V.; Bloino J.; Janesko B. G.; Gomperts R.; Mennucci B.; Hratchian H. P.; Ortiz J. V.; Izmaylov A. F.; Sonnenberg J. L.; Williams-Young D.; Ding F.; Lipparini F.; Egidi F.; Goings J.; Peng B.; Petrone A.; Henderson T.; Ranasinghe D.; Zakrzewski V. G.; Gao J.; Rega N.; Zheng G.; Liang W.; Hada M.; Ehara M.; Toyota K.; Fukuda R.; Hasegawa J.; Ishida M.; Nakajima T.; Honda Y.; Kitao O.; Nakai H.; Vreven T.; Throssell K.; Montgomery J. A. J.; Peralta J. E.; Ogliaro F.; Bearpark M. J.; Heyd J. J.; Brothers E. N.; Kudin K. N.; Staroverov V. N.; Keith T. A.; Kobayashi R.; Normand J.; Raghavachari K.; Rendell A. P.; Burant J. C.; Iyengar S. S.; Tomasi J. C.; Cossi M.; Millam J. M.; Klene M.; Adamo C.; Cammi R.; Ochterski J. W.; Martin R. L.; Morokuma K.; Farkas O.; Foresman J. B.; Fox D. J.. Gaussian 16, rev. C.01; Gaussian, Inc.: Wallingford, CT, 2016. [Google Scholar]
  36. Karton A. A Computational Chemist’s Guide to Accurate Thermochemistry for Organic Molecules. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2016, 6, 292–310. 10.1002/wcms.1249. [DOI] [Google Scholar]
  37. Chan B. How to Computationally Calculate Thermochemical Properties Objectively, Accurately, and as Economically as Possible. Pure Appl. Chem. 2017, 89, 699–713. 10.1515/pac-2016-1116. [DOI] [Google Scholar]
  38. Ruscic B.; Bross D. H. Thermochemistry. Comput.-Aided Chem. Eng. 2019, 45, 3–114. 10.1016/B978-0-444-64087-1.00001-2. [DOI] [Google Scholar]
  39. Martin J. M. L. Coupling between the Convergence Behavior of Basis Set and Electron Correlation: A Quantitative Study. Theor. Chem. Acc. 1997, 97, 227–231. 10.1007/s002140050256. [DOI] [Google Scholar]
  40. Grimme S. Semiempirical Hybrid Density Functional with Perturbative Second-Order Correlation. J. Chem. Phys. 2006, 124, 034108. 10.1063/1.2148954. [DOI] [PubMed] [Google Scholar]
  41. Zhang I. Y.; Xu X. Doubly Hybrid Density Functional for Accurate Description of Thermochemistry, Thermochemical Kinetics and Nonbonded Interactions. Int. Rev. Phys. Chem. 2011, 30, 115–160. 10.1080/0144235X.2010.542618. [DOI] [Google Scholar]
  42. Kozuch S.; Martin J. M. L. Spin-Component-Scaled Double Hybrids: An Extensive Search for the Best Fifth-Rung Functionals Blending DFT and Perturbation Theory. J. Comput. Chem. 2013, 34, 2327–2344. 10.1002/jcc.23391. [DOI] [PubMed] [Google Scholar]
  43. Goerigk L.; Grimme S. Double-Hybrid Density Functionals. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2014, 4, 576–600. 10.1002/wcms.1193. [DOI] [Google Scholar]
  44. Schwabe T.Double Hybrid Density Functional Approximations. In Chemical Modelling; Springborg M., Joswig J.-O., Eds.; Royal Society of Chemistry: Cambridge, U.K., 2017; Vol. 13, pp 191–220. [Google Scholar]
  45. Santra G.; Sylvetsky N.; Martin J. M. L. Minimally Empirical Double-Hybrid Functionals Trained against the GMTKN55 Database: RevDSD-PBEP86-D4, RevDOD-PBE-D4, and DOD-SCAN-D4. J. Phys. Chem. A 2019, 123, 5129–5143. 10.1021/acs.jpca.9b03157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Martin J. M. L.; Santra G. Empirical Double-Hybrid Density Functional Theory: A ‘Third Way’ in Between WFT and DFT. Isr. J. Chem. 2020, 60, ijch.201900114 10.1002/ijch.201900114. [DOI] [Google Scholar]
  47. Perdew J. P.; Schmidt K. Jacob’s Ladder of Density Functional Approximations for the Exchange-Correlation Energy. AIP Conf. Proc. 2000, 577, 1–20. 10.1063/1.1390175. [DOI] [Google Scholar]
  48. Goerigk L.; Hansen A.; Bauer C.; Ehrlich S.; Najibi A.; Grimme S. A Look at the Density Functional Theory Zoo with the Advanced GMTKN55 Database for General Main Group Thermochemistry, Kinetics and Noncovalent Interactions. Phys. Chem. Chem. Phys. 2017, 19, 32184–32215. 10.1039/C7CP04913G. [DOI] [PubMed] [Google Scholar]
  49. Mardirossian N.; Head-Gordon M. Survival of the Most Transferable at the Top of Jacob’s Ladder: Defining and Testing the ωB97M(2) Double Hybrid Density Functional. J. Chem. Phys. 2018, 148, 241736. 10.1063/1.5025226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Häser M. Møller-Plesset (MP2) Perturbation Theory for Large Molecules. Theor. Chim. Acta 1993, 87, 147–173. 10.1007/BF01113535. [DOI] [Google Scholar]
  51. Weigend F.; Häser M. RI-MP2: First Derivatives and Global Consistency. Theor. Chem. Acc. 1997, 97, 331–340. 10.1007/s002140050269. [DOI] [Google Scholar]
  52. Weigend F.; Häser M.; Patzelt H.; Ahlrichs R. RI-MP2: Optimized Auxiliary Basis Sets and Demonstration of Efficiency. Chem. Phys. Lett. 1998, 294, 143–152. 10.1016/S0009-2614(98)00862-8. [DOI] [Google Scholar]
  53. Chan B.; Karton A.; Raghavachari K. G4(MP2)-XK: A Variant of the G4(MP2)-6X Composite Method with Expanded Applicability for Main-Group Elements up to Radon. J. Chem. Theory Comput. 2019, 15, 4478–4484. 10.1021/acs.jctc.9b00449. [DOI] [PubMed] [Google Scholar]
  54. Weigend F.; Ahlrichs R. Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297–3305. 10.1039/b508541a. [DOI] [PubMed] [Google Scholar]
  55. Karton A.; Daon S.; Martin J. M. L. W4–11: A High-Confidence Benchmark Dataset for Computational Thermochemistry Derived from First-Principles W4 Data. Chem. Phys. Lett. 2011, 510, 165–178. 10.1016/j.cplett.2011.05.007. [DOI] [Google Scholar]
  56. Raghavachari K.; Trucks G. W.; Pople J. A.; Head-Gordon M. A Fifth-Order Perturbation Comparison of Electron Correlation Theories. Chem. Phys. Lett. 1989, 157, 479–483. 10.1016/S0009-2614(89)87395-6. [DOI] [Google Scholar]
  57. Watts J. D.; Gauss J.; Bartlett R. J. Coupled-Cluster Methods with Noniterative Triple Excitations for Restricted Open-Shell Hartree–Fock and Other General Single Determinant Reference Functions. Energies and Analytical Gradients. J. Chem. Phys. 1993, 98, 8718–8733. 10.1063/1.464480. [DOI] [Google Scholar]
  58. Shao Y.; Gan Z.; Epifanovsky E.; Gilbert A. T. B.; Wormit M.; Kussmann J.; Lange A. W.; Behn A.; Deng J.; Feng X.; Ghosh D.; Goldey M.; Horn P. R.; Jacobson L. D.; Kaliman I.; Khaliullin R. Z.; Kuś T.; Landau A.; Liu J.; Proynov E. I.; Rhee Y. M.; Richard R. M.; Rohrdanz M. a.; Steele R. P.; Sundstrom E. J.; Woodcock H. L.; Zimmerman P. M.; Zuev D.; Albrecht B.; Alguire E.; Austin B.; Beran G. J. O.; Bernard Y. a.; Berquist E.; Brandhorst K.; Bravaya K. B.; Brown S. T.; Casanova D.; Chang C.-M.; Chen Y.; Chien S. H.; Closser K. D.; Crittenden D. L.; Diedenhofen M.; DiStasio R. A.; Do H.; Dutoi A. D.; Edgar R. G.; Fatehi S.; Fusti-Molnar L.; Ghysels A.; Golubeva-Zadorozhnaya A.; Gomes J.; Hanson-Heine M. W. D.; Harbach P. H. P.; Hauser A. W.; Hohenstein E. G.; Holden Z. C.; Jagau T.-C.; Ji H.; Kaduk B.; Khistyaev K.; Kim J.; Kim J.; King R. a.; Klunzinger P.; Kosenkov D.; Kowalczyk T.; Krauter C. M.; Lao K. U.; Laurent A. D.; Lawler K. V.; Levchenko S. V.; Lin C. Y.; Liu F.; Livshits E.; Lochan R. C.; Luenser A.; Manohar P.; Manzer S. F.; Mao S.-P.; Mardirossian N.; Marenich A. V.; Maurer S. a.; Mayhall N. J.; Neuscamman E.; Oana C. M.; Olivares-Amaya R.; O’Neill D. P.; Parkhill J. a.; Perrine T. M.; Peverati R.; Prociuk A.; Rehn D. R.; Rosta E.; Russ N. J.; Sharada S. M.; Sharma S.; Small D. W.; Sodt A.; Stein T.; Stück D.; Su Y.-C.; Thom A. J. W.; Tsuchimochi T.; Vanovschi V.; Vogt L.; Vydrov O.; Wang T.; Watson M. a.; Wenzel J.; White A.; Williams C. F.; Yang J.; Yeganeh S.; Yost S. R.; You Z.-Q.; Zhang I. Y.; Zhang X.; Zhao Y.; Brooks B. R.; Chan G. K. L.; Chipman D. M.; Cramer C. J.; Goddard W. A.; Gordon M. S.; Hehre W. J.; Klamt A.; Schaefer H. F.; Schmidt M. W.; Sherrill C. D.; Truhlar D. G.; Warshel A.; Xu X.; Aspuru-Guzik A.; Baer R.; Bell A. T.; Besley N. a.; Chai J.-D.; Dreuw A.; Dunietz B. D.; Furlani T. R.; Gwaltney S. R.; Hsu C.-P.; Jung Y.; Kong J.; Lambrecht D. S.; Liang W.; Ochsenfeld C.; Rassolov V. a.; Slipchenko L. V.; Subotnik J. E.; Van Voorhis T.; Herbert J. M.; Krylov A. I.; Gill P. M. W.; Head-Gordon M. Advances in Molecular Quantum Chemistry Contained in the Q-Chem 4 Program Package. Mol. Phys. 2015, 113, 184–215. 10.1080/00268976.2014.952696. [DOI] [Google Scholar]
  59. Pinski P.; Riplinger C.; Valeev E. F.; Neese F. Sparse Maps - A Systematic Infrastructure for Reduced-Scaling Electronic Structure Methods. I. An Efficient and Simple Linear Scaling Local MP2 Method That Uses an Intermediate Basis of Pair Natural Orbitals. J. Chem. Phys. 2015, 143, 034108. 10.1063/1.4926879. [DOI] [PubMed] [Google Scholar]
  60. Riplinger C.; Sandhoefer B.; Hansen A.; Neese F. Natural Triple Excitations in Local Coupled Cluster Calculations with Pair Natural Orbitals. J. Chem. Phys. 2013, 139, 134101. 10.1063/1.4821834. [DOI] [PubMed] [Google Scholar]
  61. Neese F. The ORCA Program System. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012, 2, 73–78. 10.1002/wcms.81. [DOI] [Google Scholar]; Software update: the ORCA program system, version 4.0. 2018, 8, e1327. 10.1002/wcms.1327
  62. Rappoport D.; Furche F. Property-Optimized Gaussian Basis Sets for Molecular Response Calculations. J. Chem. Phys. 2010, 133, 134105. 10.1063/1.3484283. [DOI] [PubMed] [Google Scholar]
  63. Pritchard B. P.; Altarawy D.; Didier B. T.; Gibson T. D.; Windus T. L. A New Basis Set Exchange: An Open, Up-to-Date Resource for the Molecular Sciences Community. J. Chem. Inf. Model. 2019, 59, 4814. 10.1021/acs.jcim.9b00725. [DOI] [PubMed] [Google Scholar]
  64. Hättig C. Optimization of Auxiliary Basis Sets for RI-MP2 and RI-CC2 Calculations: Core-Valence and Quintuple-Zeta Basis Sets for H to Ar and QZVPP Basis Sets for Li to Kr. Phys. Chem. Chem. Phys. 2005, 7, 59–66. 10.1039/B415208E. [DOI] [Google Scholar]
  65. Weigend F.; Köhn A.; Hättig C. Efficient Use of the Correlation Consistent Basis Sets in Resolution of the Identity MP2 Calculations. J. Chem. Phys. 2002, 116, 3175–3183. 10.1063/1.1445115. [DOI] [Google Scholar]
  66. Weigend F. Hartree-Fock Exchange Fitting Basis Sets for H to Rn. J. Comput. Chem. 2008, 29, 167–175. 10.1002/jcc.20702. [DOI] [PubMed] [Google Scholar]
  67. Chan B.; Karton A.; Raghavachari K.; Radom L. Restricted-Open-Shell G4(MP2)-Type Procedures. J. Phys. Chem. A 2016, 120, 9299–9304. 10.1021/acs.jpca.6b09361. [DOI] [PubMed] [Google Scholar]
  68. Dolg M.; Cao X. Relativistic Pseudopotentials: Their Development and Scope of Applications. Chem. Rev. 2012, 112, 403–480. 10.1021/cr2001383. [DOI] [PubMed] [Google Scholar]
  69. Grimme S.; Antony J.; Ehrlich S.; Krieg H. A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion Correction (DFT-D) for the 94 Elements H-Pu. J. Chem. Phys. 2010, 132, 154104. 10.1063/1.3382344. [DOI] [PubMed] [Google Scholar]
  70. Grimme S.; Ehrlich S.; Goerigk L. Effect of the Damping Function in Dispersion Corrected Density Functional Theory. J. Comput. Chem. 2011, 32, 1456–1465. 10.1002/jcc.21759. [DOI] [PubMed] [Google Scholar]
  71. Liakos D. G.; Sparta M.; Kesharwani M. K.; Martin J. M. L.; Neese F. Exploring the Accuracy Limits of Local Pair Natural Orbital Coupled-Cluster Theory. J. Chem. Theory Comput. 2015, 11, 1525–1539. 10.1021/ct501129s. [DOI] [PubMed] [Google Scholar]
  72. Neese F.; Wennmohs F.; Hansen A.; Becker U. Efficient, Approximate and Parallel Hartree-Fock and Hybrid DFT Calculations. A “chain-of-Spheres” Algorithm for the Hartree-Fock Exchange. Chem. Phys. 2009, 356, 98–109. 10.1016/j.chemphys.2008.10.036. [DOI] [Google Scholar]
  73. Weigend F. Accurate Coulomb-Fitting Basis Sets for H to Rn. Phys. Chem. Chem. Phys. 2006, 8, 1057–1065. 10.1039/b515623h. [DOI] [PubMed] [Google Scholar]
  74. Hellweg A.; Hättig C.; Höfener S.; Klopper W. Optimized Accurate Auxiliary Basis Sets for RI-MP2 and RI-CC2 Calculations for the Atoms Rb to Rn. Theor. Chem. Acc. 2007, 117, 587–597. 10.1007/s00214-007-0250-5. [DOI] [Google Scholar]
  75. Andrae D.; Häußermann U.; Dolg M.; Stoll H.; Preuß H. Energy-Adjustedab Initio Pseudopotentials for the Second and Third Row Transition Elements. Theor. Chim. Acta 1990, 77, 123–141. 10.1007/BF01114537. [DOI] [Google Scholar]
  76. Morgante P.; Peverati R. ACCDB: A Collection of Chemistry Databases for Broad Computational Purposes. J. Comput. Chem. 2019, 40, 839–848. 10.1002/jcc.25761. [DOI] [PubMed] [Google Scholar]
  77. Sure R.; Hansen A.; Schwerdtfeger P.; Grimme S. Comprehensive Theoretical Study of All 1812 C 60 Isomers. Phys. Chem. Chem. Phys. 2017, 19, 14296–14305. 10.1039/C7CP00735C. [DOI] [PubMed] [Google Scholar]
  78. Kruse H.; Mladek A.; Gkionis K.; Hansen A.; Grimme S.; Sponer J. Quantum Chemical Benchmark Study on 46 RNA Backbone Families Using a Dinucleotide Unit. J. Chem. Theory Comput. 2015, 11, 4972–4991. 10.1021/acs.jctc.5b00515. [DOI] [PubMed] [Google Scholar]
  79. Goerigk L.; Grimme S. A General Database for Main Group Thermochemistry, Kinetics, and Noncovalent Interactions - Assessment of Common and Reparameterized (Meta -)GGA Density Functionals. J. Chem. Theory Comput. 2010, 6, 107–126. 10.1021/ct900489g. [DOI] [PubMed] [Google Scholar]
  80. Goerigk L.; Grimme S. Efficient and Accurate Double-Hybrid-Meta-GGA Density Functionals—Evaluation with the Extended GMTKN30 Database for General Main Group Thermochemistry, Kinetics, and Noncovalent Interactions. J. Chem. Theory Comput. 2011, 7, 291–309. 10.1021/ct100466k. [DOI] [PubMed] [Google Scholar]
  81. Huber P. J.; Ronchetti E. M.. Robust Statistics; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, 2009. [Google Scholar]
  82. Geary R. C. The Ratio of the Mean Deviation to the Standard Deviation as a Test of Normality. Biometrika 1935, 27, 310–332. 10.1093/biomet/27.3-4.310. [DOI] [Google Scholar]
  83. Powell M. J. D.The BOBYQA Algorithm for Bound Constrained Optimization without Derivatives; Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences: Cambridge, U.K., 2009. [Google Scholar]
  84. Karton A.; Martin J. M. L. Explicitly Correlated Wn Theory: W1-F12 and W2-F12. J. Chem. Phys. 2012, 136, 124114. 10.1063/1.3697678. [DOI] [PubMed] [Google Scholar]
  85. Schwenke D. W. The Extrapolation of One-Electron Basis Sets in Electronic Structure Calculations: How It Should Work and How It Can Be Made to Work. J. Chem. Phys. 2005, 122, 014107. 10.1063/1.1824880. [DOI] [PubMed] [Google Scholar]
  86. Martin J. M. L. A Simple “Range Extender” for Basis Set Extrapolation Methods for MP2 and Coupled Cluster Correlation Energies. AIP Conf. Proc. 2018, 2040, 020008. 10.1063/1.5079050. [DOI] [Google Scholar]
  87. Grimme S. Improved Third-Order Møller-Plesset Perturbation Theory. J. Comput. Chem. 2003, 24, 1529–1537. 10.1002/jcc.10320. [DOI] [PubMed] [Google Scholar]
  88. Grimme S.; Goerigk L.; Fink R. F. Spin-Component-Scaled Electron Correlation Methods. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2012, 2, 886–906. 10.1002/wcms.1110. [DOI] [Google Scholar]
  89. Grimme S. Improved Second-Order Møller–Plesset Perturbation Theory by Separate Scaling of Parallel- and Antiparallel-Spin Pair Correlation Energies. J. Chem. Phys. 2003, 118, 9095–9102. 10.1063/1.1569242. [DOI] [Google Scholar]
  90. Karton A.; Tarnopolsky A.; Lamère J.-F.; Schatz G. C.; Martin J. M. L. Highly Accurate First-Principles Benchmark Data Sets for the Parametrization and Validation of Density Functional and Other Approximate Methods. Derivation of a Robust, Generally Applicable, Double-Hybrid Functional for Thermochemistry and Thermochemical. J. Phys. Chem. A 2008, 112, 12868–12886. 10.1021/jp801805p. [DOI] [PubMed] [Google Scholar]
  91. Mardirossian N.; Head-Gordon M. ωB97M-V: A Combinatorially Optimized, Range-Separated Hybrid, Meta-GGA Density Functional with VV10 Nonlocal Correlation. J. Chem. Phys. 2016, 144, 214110. 10.1063/1.4952647. [DOI] [PubMed] [Google Scholar]
  92. Mardirossian N.; Head-Gordon M. ωB97X-V: A 10-Parameter, Range-Separated Hybrid, Generalized Gradient Approximation Density Functional with Nonlocal Correlation, Designed by a Survival-of-the-Fittest Strategy. Phys. Chem. Chem. Phys. 2014, 16, 9904–9924. 10.1039/c3cp54374a. [DOI] [PubMed] [Google Scholar]
  93. Zhao Y.; Truhlar D. G. The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Function. Theor. Chem. Acc. 2008, 120, 215–241. 10.1007/s00214-007-0310-x. [DOI] [Google Scholar]
  94. Mehta N.; Casanova-Páez M.; Goerigk L. Semi-Empirical or Non-Empirical Double-Hybrid Density Functionals: Which Are More Robust?. Phys. Chem. Chem. Phys. 2018, 20, 23175–23194. 10.1039/C8CP03852J. [DOI] [PubMed] [Google Scholar]
  95. Chan B.; Coote M. L.; Radom L. G4-SP, G4(MP2)-SP, G4-Sc, and G4(MP2)-Sc: Modifications to G4 and G4(MP2) for the Treatment of Medium-Sized Radicals. J. Chem. Theory Comput. 2010, 6, 2647–2653. 10.1021/ct100266u. [DOI] [PubMed] [Google Scholar]
  96. Karton A.; Goerigk L. Accurate Reaction Barrier Heights of Pericyclic Reactions: Surprisingly Large Deviations for the CBS-QB3 Composite Method and Their Consequences in DFT Benchmark Studies. J. Comput. Chem. 2015, 36, 622–632. 10.1002/jcc.23837. [DOI] [PubMed] [Google Scholar]
  97. Burns L. A.; Marshall M. S.; Sherrill C. D. Comparing Counterpoise-Corrected, Uncorrected, and Averaged Binding Energies for Benchmarking Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 49–57. 10.1021/ct400149j. [DOI] [PubMed] [Google Scholar]
  98. Brauer B.; Kesharwani M. K.; Martin J. M. L. Some Observations on Counterpoise Corrections for Explicitly Correlated Calculations on Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 3791–3799. 10.1021/ct500513b. [DOI] [PubMed] [Google Scholar]
  99. Ishikawa T.; Sakakura K.; Mochizuki Y. RI-MP3 Calculations of Biomolecules Based on the Fragment Molecular Orbital Method. J. Comput. Chem. 2018, 39, 1970–1978. 10.1002/jcc.25368. [DOI] [PubMed] [Google Scholar]
  100. Bozkaya U. Orbital-Optimized MP3 and MP2.5 with Density-Fitting and Cholesky Decomposition Approximations. J. Chem. Theory Comput. 2016, 12, 1179–1188. 10.1021/acs.jctc.5b01128. [DOI] [PubMed] [Google Scholar]
  101. Riley K. E.; Řezáč J.; Hobza P. The Performance of MP2.5 and MP2.X Methods for Nonequilibrium Geometries of Molecular Complexes. Phys. Chem. Chem. Phys. 2012, 14, 13187. 10.1039/c2cp41874f. [DOI] [PubMed] [Google Scholar]
  102. Najibi A.; Goerigk L. The Nonlocal Kernel in van Der Waals Density Functionals as an Additive Correction: An Extensive Analysis with Special Emphasis on the B97M-V and ωB97M-V Approaches. J. Chem. Theory Comput. 2018, 14, 5725–5738. 10.1021/acs.jctc.8b00842. [DOI] [PubMed] [Google Scholar]
  103. Wheeler S.; Houk K. N. Integration Grid Errors for Meta-GGA-Predicted Reaction Energies: Origin of Grid Errors for the M06 Suite of Functionals. J. Chem. Theory Comput. 2010, 6, 395–404. 10.1021/ct900639j. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Görling A.; Levy M. Exact Kohn-Sham Scheme Based on Perturbation Theory. Phys. Rev. A: At., Mol., Opt. Phys. 1994, 50, 196–204. 10.1103/PhysRevA.50.196. [DOI] [PubMed] [Google Scholar]
  105. Chan B.; Goerigk L.; Radom L. On the Inclusion of Post-MP2 Contributions to Double-Hybrid Density Functionals. J. Comput. Chem. 2016, 37, 183–193. 10.1002/jcc.23972. [DOI] [PubMed] [Google Scholar]
  106. Liakos D. G.; Guo Y.; Neese F. Comprehensive Benchmark Results for the Domain Based Local Pair Natural Orbital Coupled Cluster Method (DLPNO-CCSD(T)) for Closed- and Open-Shell Systems. J. Phys. Chem. A 2020, 124, 90–100. 10.1021/acs.jpca.9b05734. [DOI] [PubMed] [Google Scholar]
  107. Paulechka E.; Kazakov A. Efficient DLPNO–CCSD(T)-Based Estimation of Formation Enthalpies for C-, H-, O-, and N-Containing Closed-Shell Compounds Validated Against Critically Evaluated Experimental Data. J. Phys. Chem. A 2017, 121, 4379–4387. 10.1021/acs.jpca.7b03195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Mielczarek D. C.; Nait Saidi C.; Paricaud P.; Catoire L. Generalized Prediction of Enthalpies of Formation Using DLPNO-CCSD(T) Ab Initio Calculations for Molecules Containing the Elements H, C, N, O, F, S, Cl, Br. J. Comput. Chem. 2019, 40, 768–793. 10.1002/jcc.25763. [DOI] [PubMed] [Google Scholar]
  109. Woller T.; Banerjee A.; Sylvetsky N.; Santra G.; Deraet X.; De Proft F.; Martin J. M. L.; Alonso M. Performance of Electronic Structure Methods for the Description of Hückel–Möbius Interconversions in Extended π-Systems. J. Phys. Chem. A 2020, 124, 2380–2397. 10.1021/acs.jpca.9b10880. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Iron M. A.; Janes T. Evaluating Transition Metal Barrier Heights with the Latest Density Functional Theory Exchange–Correlation Functionals: The MOBH35 Benchmark Database. J. Phys. Chem. A 2019, 123, 3761–3781. 10.1021/acs.jpca.9b01546. [DOI] [PubMed] [Google Scholar]
  111. Iron M. A.; Janes T. Correction to “Evaluating Transition Metal Barrier Heights with the Latest Density Functional Theory Exchange–Correlation Functionals: The MOBH35 Benchmark Database.. J. Phys. Chem. A 2019, 123, 6379–6380. 10.1021/acs.jpca.9b06135. [DOI] [PubMed] [Google Scholar]
  112. Chan B.; Gill P. M. W.; Kimura M. Assessment of DFT Methods for Transition Metals with the TMC151 Compilation of Data Sets and Comparison with Accuracies for Main-Group Chemistry. J. Chem. Theory Comput. 2019, 15, 3610–3622. 10.1021/acs.jctc.9b00239. [DOI] [PubMed] [Google Scholar]
  113. Chan B. The CUAGAU Set of Coupled-Cluster Reference Data for Small Copper, Silver, and Gold Compounds and Assessment of DFT Methods. J. Phys. Chem. A 2019, 123, 5781–5788. 10.1021/acs.jpca.9b03976. [DOI] [PubMed] [Google Scholar]
  114. Chan B. Formulation of Small Test Sets Using Large Test Sets for Efficient Assessment of Quantum Chemistry Methods. J. Chem. Theory Comput. 2018, 14, 4254–4262. 10.1021/acs.jctc.8b00514. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ct0c00189_si_001.zip (13.4MB, zip)

Articles from Journal of Chemical Theory and Computation are provided here courtesy of American Chemical Society

RESOURCES