Author manuscript; available in PMC: 2011 Mar 4.
Published in final edited form as: J Chem Theory Comput. 2010 Mar 4;6(4):1109–1117. doi: 10.1021/ct100025j

Free Energies of Solvation with Surface, Volume, and Local Electrostatic Effects and Atomic Surface Tensions to Represent the First Solvation Shell

Junjun Liu 1,2, Casey P Kelly 3, Alan C Goren 2,4, Aleksandr V Marenich 3, Christopher J Cramer 3,*, Donald G Truhlar 3,*, Chang-Guo Zhan 2,*
PMCID: PMC2856966  NIHMSID: NIHMS184751  PMID: 20419072

Abstract

Building on the SVPE (surface and volume polarization for electrostatics) model for electrostatic contributions to the free energy of solvation with explicit consideration of both surface and volume polarization effects, on the SMx approach to including first-solvation-shell contributions, and on the linear relationship between the electric field and short-range electrostatic contributions found by Chipman, we have developed a new method for computing absolute aqueous solvation free energies by combining the SVPE method with semiempirical terms that account for effects beyond bulk electrostatics. The new method is called SMVLE, and the elements it contains are denoted by SVPE-CDSL where SVPE denotes accounting for bulk electrostatic interactions between solute and solvent with both surface and volume contributions, CDS denotes the inclusion of solvent cavitation, changes in dispersion energy, and possible changes in local solvent structure by a semiempirical term utilizing geometry-dependent atomic surface tensions as implemented in SMx models, and L represents the local electrostatic effect derived from the outward-directed normal electric field on the cavity surface. The semiempirical CDS and L terms together represent the deviation of short-range contributions to the free energy of solvation from those accounted for by the SVPE term based on the bulk solvent dielectric constant. A solute training set containing a broad range of molecules used previously in the development of SM6 is used here for SMVLE model calibration. The aqueous solvation free energies predicted by the parameterized SMVLE model correlate exceedingly well with experimental values. The square of the correlation coefficient is 0.9949 and the slope is 1.0079. 
Comparison of the final SMVLE model against the earlier SMx solvation model shows that the parameterized SMVLE model not only yields good accuracy for neutrals but also significantly increases the accuracy for ions, making it the best implicit solvation model to date for aqueous solvation free energies of ions. The semiempirical terms associated with the outward-directed electric field account in a physical way for the improvement in the predictive accuracy for ions. The SMVLE method greatly decreases the need to include explicit water molecules for accurate modeling of solvation free energies of ions.

Introduction

Dielectric continuum solvation models1,2 have been widely and successfully used for estimating solvation free energies. Such models are also called implicit solvation models because the solvent is not atomistically explicit but rather is implicit in the dielectric medium. In the self-consistent versions of such models, the solvent is considered to be a continuous dielectric medium that is polarized by the solute, leading to a reaction field that in turn polarizes the solute, which changes the solvent polarization, leading ultimately to a self-consistent reaction field (SCRF). A dielectric continuum solvation model accurately describes the long-range permanent-multipole-moment and inductive interactions between solute and bulk solvent; in the language conventionally used in the continuum solvation model literature, both permanent-multipole-moment and induction effects are labeled as electrostatic, and we will follow that convention in the rest of this article. SCRF methods require less computational effort than explicit-solvent approaches involving the same quality treatment of the solute, and this makes them appealing for the study of complex chemical, materials, and biochemical processes and for the rapid screening of many solutes in, for example, molecular docking studies. When the bulk-solvent model is augmented with additional terms to account for the deviation of short-range solute-solvent interactions from the bulk electrostatic model, useful accuracy can be obtained.3

One popular way to implement the SCRF approach is to describe the solvent polarization in terms of the electrostatic potential that it introduces inside the solute cavity under the assumption that all solute charge density resides inside the cavity; this is often called the polarized continuum model (PCM).1,2 However, unconstrained quantum mechanical calculations of solute electronic structure always lead to a tail of the wave function penetrating outside the cavity, thereby causing an additional polarization effect called volume polarization.4 It has been demonstrated5 that neglecting charge penetration (also called outlying charge) leads to inconsistencies in the course of solving Poisson’s equation. Such inconsistencies render many SCRF implementations sensitive to cavity size6 and prone to overestimating solvent shifts of energy barriers in aqueous solution.7,8 In previous studies,5–7,9,10 a general model called surface and volume polarization for electrostatic interaction (SVPE), or the fully polarizable continuum model (FPCM),11–18 was developed. This model, implemented for irregularly shaped solute cavities, fully accounts for both surface and volume polarization effects in solute-solvent electrostatic interactions. Therefore the SVPE solvation model provides a theoretically well justified continuum methodology for studying long-range electrostatic interactions. It has also been useful in a practical sense, having been applied successfully to study mechanisms for various chemical reactions and to make pKa predictions.7,8,19,20

One must bear in mind that absolute solvation free energies result not only from long-range electrostatic interactions between solute and bulk solvent, but also from significant short-range contributions, such as short-range, non-bulk electrostatics, as well as cavitation, exchange repulsion, dispersion, and disruption or formation of the nearby solvent structure. (Note that the short-range, non-bulk electrostatic effect may be considered to be a solvent structure effect.) These interactions are not treated satisfactorily within the framework of a pure dielectric continuum model. Previously, the difference between solvation free energies calculated by dielectric continuum solvation methods and experimental solvation free energies has been labeled in an SVPE context as the nondielectric or short-range contribution21,22 and in other contexts3 as a cavity–dispersion–solvent-structure (CDS) effect. Such short-range contributions are often7,8 (but not always23) neglected in estimating energy barriers by implicit solvation methods, but for systems with strong hydrophobic effects or hydrogen bonding between solute and solvent molecules, the short-range contributions to the energy barriers may be very significant or even decisive. Furthermore, accounting for the short-range contributions is essential for calculating reliable absolute solvation free energies of neutral solutes.3

In a supermolecular approach the nearby solvent molecules are represented explicitly as components of a cluster containing the solute. It has been reported that, by employing a combined supermolecule–continuum approach,15,18,24–27 involving both explicit and implicit solvent, the SVPE method can account for short-range contributions between solute and solvent and hence more accurately predict the absolute solvation free energies for a series of charged chemical species, including H+, Li+, OH−, e− (the hydrated electron), and F−.15,18,24,27 The supermolecule–continuum approach explicitly treats a portion of the solvent surrounding the solute at a high quantum mechanical level. The computational efficiency depends heavily on solute size, and larger solutes require more solvent water molecules to be explicitly considered, thus in practice limiting the application to smaller solutes. Even more significantly, the proper application of the supermolecule–continuum method requires statistical mechanical averaging over the various possible sites and orientations of the explicit solvent molecules, and this becomes impractical for even a small number (for example, two or three) of explicit solvent molecules. Finally, one should note that the interaction of solvent molecules in the supersolute (cluster) with the continuum must be treated accurately, e.g., through recourse to very high levels of electronic structure theory.

For these reasons, much effort has been made to augment the dielectric continuum model with short-range contributions.1,2,28 For example, the SMx series of solvation models (with x being 1–6, 8, 8AD, or D)29–46 augment and correct the bulk electrostatic portion, obtained by the generalized Born (GB) approximation29,47 (for x = 1–6, 8, or 8AD) or the PCM approximation2 (for x = 5C or D), with a semiempirical term that accounts for short-range contributions. By employing a training data set containing a broad range of solutes, atomic radii used for defining the cavity in electrostatic calculations (such radii are called intrinsic Coulomb radii) were calibrated to calculate the bulk electrostatic interactions, and a set of atomic surface tension parameters was optimized to calculate the short-range contributions. It has been shown3,43,44 that the accuracy of SMx models for predicting absolute aqueous solvation free energies is quite good, about 0.5 kcal/mol for neutral solutes. Although SMx models, like all other solvation models, have larger absolute errors for predicting aqueous solvation free energies of ions, which are much larger than those of neutral solutes, SMx still outperforms other continuum models for ionic solvation free energies.3,48 While the SMx models, by implicitly including local electrostatic effects as part of the semiempirical CDS terms, provide significant improvement over PCM models in predicting absolute solvation free energies,43–45 it is worthwhile to consider more explicit ways to include local electrostatic effects.

The long-range electrostatic contribution, which is a function only of the solvent’s bulk dielectric constant, is included in the bulk electrostatic term. The bulk electrostatic term, however, also includes a somewhat arbitrary approximation to the short-range electrostatic effect, because the solute–solvent boundary that surrounds the solute cavity lies within the region occupied by the first solvation shell, and this shell does not behave like a bulk dielectric. The deviation of short-range electrostatics from bulk electrostatics (this is called the non-bulk electrostatic contribution or the local electrostatic contribution) has previously been included in SMx models as a solvent-structure contribution to the CDS term. Here we incorporate a new function, denoted by L for local electrostatics, that treats the non-bulk electrostatic contribution explicitly.

The motivation for the new function is the observation that the short-range electrostatic contribution is linearly correlated with the maximum or minimum outward-directed normal electric field on the solute–solvent boundary surface.21,22 Therefore we make the contribution beyond the bulk part calculated with SVPE dependent not only on the solvent-accessible atomic surface area of the solute, as in the CDS terms, but also on the outward-directed normal electric field on the cavity surface; the combination of the new local electrostatic terms (L) with the atomic surface tension terms (CDS) is denoted CDSL. Replacing the CDS terms by CDSL terms and replacing the generalized Born approximation (of SM1–SM8 or SM8AD)3,29–39,41–45 or the PCM approximation (of SM5C and SMD)40,46 by the SVPE treatment is the essence of the present attempt to make the solvation model more physical. As we have just explained, a long abbreviation for the new method is SVPE-CDSL. It will, however, be more convenient to simply call the new method SMVLE, which denotes Solvation Model with Volume and Local Electrostatics, since the explicit accounting for volume polarization and local electrostatics are the new elements beyond those included in previous SMx solvation models.

The same training set that was used to calibrate parameters for SM643 is used here for optimizing the parameters of SMVLE. The absolute aqueous solvation free energies obtained by the parameterized SMVLE model are compared with experimentally measured aqueous solvation free energies to calculate the mean unsigned error (MUE), which measures the predictive accuracy of the solvation model. The accuracy of the SMVLE model is compared with SM643 and the recent SM844 and SMD46 solvation models, and the role of the new kind of semiempirical term, called GL, is discussed.

Methods

Description of the SMVLE model

As explained in the introduction, the free energy of solvation is a sum of three terms:

$$\Delta G_S^{*} = \Delta G_{\mathrm{SVPE}} + G_{\mathrm{CDS}} + G_L \tag{1}$$

Here $\Delta G_S^{*}$ is the fixed-concentration absolute solvation free energy,49 $\Delta G_{\mathrm{SVPE}}$ is the bulk electrostatic portion calculated by the SVPE method, $G_{\mathrm{CDS}}$ is the semiempirical term based on atomic surface tensions, and $G_L$ is the semiempirical electric-field-dependent term, whose form is motivated by Chipman’s work on ions21,22 where two semiempirical terms were generated for anions and cations, respectively. If the standard-state solvation free energy, $\Delta G_S^{\circ}$, with a concentration corresponding to a solute partial pressure of 1 atm in the gas phase and a solute concentration of 1 M in the liquid, is desired instead of $\Delta G_S^{*}$, then another term, $G_{\mathrm{conc}}^{\circ} = 1.89$ kcal/mol, must be added to account for the change in concentration.50 This value of $G_{\mathrm{conc}}^{\circ}$ and all other free energies considered in this paper correspond to a temperature of 298 K.
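As a rough numerical check of the concentration term, the sketch below assumes the usual ideal-gas conversion, RT ln(V_gas × 1 M), where V_gas is the molar volume of an ideal gas at 1 atm. The text quotes only the value 1.89 kcal/mol, so the formula, the constants, and the function names `concentration_term` and `standard_state_free_energy` are illustrative assumptions rather than the authors' code.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
R_L_ATM = 0.0820574    # gas constant in L atm mol^-1 K^-1

def concentration_term(temperature=298.0, pressure_atm=1.0, molarity=1.0):
    """Assumed form: RT ln(c_liquid / c_gas) for an ideal gas at the given pressure."""
    gas_molar_volume = R_L_ATM * temperature / pressure_atm  # L/mol
    return R_KCAL * temperature * math.log(gas_molar_volume * molarity)

def standard_state_free_energy(dg_star, temperature=298.0):
    """Convert a fixed-concentration free energy (kcal/mol) to the 1 atm -> 1 M standard state."""
    return dg_star + concentration_term(temperature)

if __name__ == "__main__":
    # Should land close to the 1.89 kcal/mol quoted in the text.
    print(round(concentration_term(), 2))
```

Under these assumptions the term depends only on temperature and the two standard-state concentrations, which is why a single constant can be added to every solute.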

ΔGSVPE can be expressed as:

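$$\Delta G_{\mathrm{SVPE}} = \left\langle \Psi^{(1)} \right| H_0 + \tfrac{1}{2} V \left| \Psi^{(1)} \right\rangle - \left\langle \Psi^{(0)} \right| H_0 \left| \Psi^{(0)} \right\rangle \tag{2}$$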

where Ψ is the solute wave function, H0 is the solute Hamiltonian in vacuum, and V is the energy operator associated with the reaction field. The factor of ½ in eq. 2 stems from assuming a linear response of the surrounding medium to the solute’s charge distribution so that one half of the induced favorable solute-solvent interaction is canceled by the cost of reorganizing the solvent.51 The superscripts (0) and (1) refer to the gas-phase isolated molecule and the liquid-phase solution, respectively.

The GCDS term includes free energy changes associated with solvent cavitation, changes in dispersion energy, and possible changes in local solvent structure. It is calculated according to29,43

$$G_{\mathrm{CDS}} = \sum_{k}^{\mathrm{atoms}} \sigma_k A_k \tag{3}$$

where Ak is the solvent-accessible surface area52,53 of atom k, which depends on the solute geometry, atomic van der Waals radius, and solvent radius, and σk is the atomic surface tension of atom k. The physical basis for eq. 3 is that deviations from bulk electrostatics, as well as cavitation, dispersion, and solvent-structural contributions, are all concentrated in the first solvation shell, and Ak is basically a continuous measure of the amount of solvent in the first solvation shell of atom k.29,52,54 The atomic surface tensions are sensitive to local environment, and therefore they are computed according to

$$\sigma_k = \tilde{\sigma}_{Z_k} + \sum_{k'}^{\mathrm{atoms}} \sum_{m=1}^{M} \tilde{\sigma}^{(m)}_{Z_k Z_{k'}} T^{(m)}_{kk'} \tag{4}$$

where $\tilde{\sigma}_{Z_k}$ and $\tilde{\sigma}_{Z_k Z_{k'}}$ are the semiempirical surface tension coefficients for atom k or atom pair kk′, $Z_k$ is the atomic number of atom k, and $T_{kk'}$ is a geometry-dependent switching function called a cutoff tanh (referred to as a COT).34 In most cases the sum over m has only one term, and when m = 1, the superscript is omitted. For $Z_k = 7$ and $Z_{k'} = 6$ (the N,C pair), we have M = 3.
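A minimal sketch of how eqs. 3 and 4 combine is given below. It uses a few coefficients from Table 1 but restricts eq. 4 to its m = 1 terms, and it takes the COT switching values as precomputed inputs because their functional form is not given here. The function name `g_cds`, the `cot` dictionary, the reading of a pair label such as "H,O" as (k, k′), and the water-like areas in the usage lines are all assumptions made for illustration.

```python
# A few Table 1 coefficients, in the units of that table.
SIGMA_ATOM = {"H": 57.88, "C": 114.49, "O": -84.28}
SIGMA_PAIR = {("H", "C"): -75.22, ("C", "C"): -70.59,
              ("H", "O"): 110.62, ("O", "C"): 187.69}

def g_cds(elements, areas, cot):
    """Eq. 3, with each atomic surface tension built from the m = 1 terms of eq. 4.

    elements: element symbol of each atom
    areas:    solvent-accessible surface area A_k of each atom
    cot:      hypothetical precomputed switching values, {(k, k_prime): T_kk'};
              missing pairs are treated as switched off (T = 0)
    """
    total = 0.0
    for k, (zk, area) in enumerate(zip(elements, areas)):
        sigma = SIGMA_ATOM[zk]
        for kp, zkp in enumerate(elements):
            if kp != k and (zk, zkp) in SIGMA_PAIR:
                sigma += SIGMA_PAIR[(zk, zkp)] * cot.get((k, kp), 0.0)
        total += sigma * area
    return total

if __name__ == "__main__":
    # Water-like toy input: made-up areas, both O-H contacts fully switched on.
    print(g_cds(["O", "H", "H"], [10.0, 5.0, 5.0], {(1, 0): 1.0, (2, 0): 1.0}))
```

Keeping the switching values as an input separates the geometry-dependent part of the model from the fitted coefficients, which is convenient when only the coefficients are being recalibrated.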

The remaining term GL is motivated by the work of Chipman, who found that more accurate solvation energies could be obtained for ions by adding the following terms to the bulk-electrostatic term:21,22

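$$G_L(\text{for anions}) \approx W_{\mathrm{anion}} \left( E^{\min} - E_0^{\min} \right) \tag{5}$$
$$G_L(\text{for cations}) \approx W_{\mathrm{cation}} \left( E^{\max} - E_0^{\max} \right) \tag{6}$$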

where GL (for anions) and GL (for cations) are “local” (or short-range) electrostatic contributions for anions and cations, respectively; Emin and Emax are the minimum and maximum outward-directed normal electric field on the cavity surface, respectively; and Wanion, E0min, Wcation, and E0max are fitting parameters. One may interpret these terms as corrections for local electrostatics. Equation (5) and Equation (6) reveal a linear relationship between the local electrostatic contribution and the minimum or the maximum outward-directed normal electric field on the cavity surface for anions and cations, respectively, and the existence of this relationship indicates that a local (short-range) electrostatic interaction between solute and solvent can physically be modeled in terms of the outward-directed normal electric field. However, it is not straightforward to generalize the above linear relationships into a formula valid for all solutes, including neutrals and ions with either sign of the charge. For example, one cannot simply combine Eqs. (5) and (6) because that ignores the local electrostatic effects for neutrals and zwitterions, and therefore the solvation free energy would not vary smoothly along a reaction coordinate where charge is developed or neutralized. The quantities Emin and Emax are also not adequate to represent the local electrostatic effects for dianions or dications with separated charge centers, where both the minimum/maximum and the second minimum/maximum normal electric field should be considered. Furthermore, Emin and Emax do not necessarily vary smoothly during geometry optimization. Therefore a more complicated functional form that does not have these disadvantages is required. We obtain such a function by summing over terms involving the normal electric fields on each surface node and by taking advantage of the properties of the COT function. In particular, we postulate that

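$$G_L = \sum_{i=1}^{I} \left[ B_i \left( \sum_{m=1}^{M} T(A_i, x_i, E_m)\, E_m^{\,i}\, w_m \right) + B_{i+I} \left( \sum_{m=1}^{M} T(-A_i, -x_i, E_m)\, E_m^{\,i}\, w_m \right) \right] \tag{7}$$
$$T(A_i, x_i, E_m) = \frac{1 + \tanh\left[ A_i \left( E_m - x_i \right) \right]}{2} \tag{8}$$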

where Em is the outward-directed normal electric field at node m on the cavity surface; wm is the surface area of node m; M is the total number of surface nodes used (M = 1202 in the present study); I is an integer that represents the highest power of Em; tanh is the hyperbolic tangent; and the Bi, Ai, and xi are semiempirical parameters.

In eq. 7, the summation over all surface elements means that we consider the local electrostatic contribution not only for ions but also for neutral solutes and zwitterions, and all normal electric fields, instead of only one minimum or one maximum normal electric field for singly charged ions, are considered for all kinds of uncharged and charged systems. In addition, the high powers of Em allow significant nonlinearity in the relationship. The functional form of eq. 7 varies smoothly along a reaction coordinate.
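The node sum of eqs. 7 and 8 can be sketched as follows, using the Table 2 parameters with I = 3. One point is an assumption: the second inner sum is taken to use the switching function with A_i and x_i negated, so that it turns on for fields below −x_i while the first turns on for fields above +x_i. The function names and the list-based inputs are likewise illustrative, and the fields and node areas are assumed to be in whatever units the parameters were fitted in.

```python
import math

# Table 2 parameters (I = 3).
A = [1984.0, 1528.0, 1488.0]
X = [0.07, 0.037, 0.057]
B_POS = [-2.679, -23.413, 453.544]   # B_i
B_NEG = [1.454, -5.64, -139.35]      # B_{i+3}

def switch(a, x, e):
    """Eq. 8: T(A, x, E) = (1 + tanh[A (E - x)]) / 2, a smooth step at E = x."""
    return 0.5 * (1.0 + math.tanh(a * (e - x)))

def g_local(fields, weights):
    """Eq. 7 summed over surface nodes with normal fields E_m and areas w_m."""
    total = 0.0
    for i, (a, x, b_pos, b_neg) in enumerate(zip(A, X, B_POS, B_NEG), start=1):
        # First inner sum: nodes whose outward normal field exceeds +x_i.
        pos = sum(switch(a, x, e) * e**i * w for e, w in zip(fields, weights))
        # Second inner sum (assumed sign convention): nodes with field below -x_i.
        neg = sum(switch(-a, -x, e) * e**i * w for e, w in zip(fields, weights))
        total += b_pos * pos + b_neg * neg
    return total

if __name__ == "__main__":
    # Toy surface: four nodes of unit area with made-up field values.
    print(g_local([0.10, 0.02, -0.02, -0.10], [1.0, 1.0, 1.0, 1.0]))
```

Because the A_i are large, each switch is nearly a step function, so weak-field nodes contribute almost nothing while the result still changes continuously as the fields change.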

Computational details

The optimized geometries and the corresponding experimental aqueous solvation free energies were obtained from the data set used to calibrate the parameters in the development of SM6,43 with two exceptions. One exception is that one neutral molecule (O-ethyl O'-4-bromo-2-chlorophenyl S-propyl phosphorothioate or profenofos) in the SM6 training set could not be treated by the SVPE program because of its irregular molecular shape. Thus the SM6 data set has 273 neutral solutes and 143 ions (416 data), and the SMVLE training set has 272 neutral solutes and 143 ions (415 data). All of these molecules and their experimental aqueous solvation free energies are provided as supporting material. The second exception concerns the data used for the solvation energies of the ions. Most experimental aqueous solvation free energies are calculated based on thermodynamic cycles, in which the solvation free energies are based on the absolute aqueous solvation free energy of the proton, denoted by ΔGS(H+).55 The parametrization of SM6 was based on Zhan and Dixon’s ΔGS(H+) value18 of −264.3 kcal/mol, whereas the later SM8,44 SM8AD,45 and SMD46 were based on Tissandier et al.’s ΔGsol(H+) value56 of −265.9 kcal/mol. We used the Tissandier et al. value for the present work.

We previously43 concluded that the partial charges in some ions are so large that they should be treated by a supermolecule–continuum approach. Therefore we developed a procedure based on the criterion that if any atom of the ion has partial atomic charge greater than or equal to the partial atomic charge on oxygen in a water molecule, then the ion should be treated as a supermolecule consisting of a cluster of a bare ion and one solvent molecule. The ionic data set started with 112 bare ions, and by this criterion, 31 of them should be clustered (so the unclustered instances of these 31 ions are called improperly unclustered). This gives three sets of ions to consider: 81 properly unclustered (PU) ions, 31 improperly unclustered (IU) ions, and 31 monohydrated (MH) ions. The collection of 81 PU and 31 MH ions (112 ions in total) is called the selectively clustered (SC) set, the collection of 81 PU and 31 IU ions (112 ions in total) is called the unclustered (UC) set, and the collection of 81 PU, 31 IU, and 31 MH ions (143 ions in total) is called “all ions”. We used the set of all 143 ions for parametrization, but we give statistics for various subsets for discussion purposes.

The bulk electrostatics were calculated by the SVPE method by using a local version10 of Gaussian03.57 The SVPE results depend only on the level and basis set of the quantum mechanical calculation and the isodensity contour value that defines the solute cavity. All the solvation calculations with the SMVLE model were carried out at the HF/6–31+G* level of theory. Previous studies have shown that contour values in the range of 0.0005 to 0.002 atomic units lead to a satisfactory description of the electrostatic contributions to the solvation energies for many neutral6,9 and ionic21,22 solutes. For this reason, we chose 0.001 atomic units as the contour value to determine the dielectric cavity. Cavity surface interactions were calculated using a set of 1202 Lebedev grid points and weights that are expected to yield precision of 0.1 kcal/mol or less for SVPE contributions to solvation free energies of all the solutes examined. Surface areas for the CDS term of eq 3 were calculated by the ASA algorithm58 with Bondi’s values59 for the atomic radii.

The molecules studied here are generally rigid, except for methyl rotors, whose conformation does not have a large effect on solvation free energies, and consequently no attempt was made to account for relaxation of geometry, change of conformation, or change in vibrational frequencies upon solvation. For solvation free energy calculations we adopt the Ben-Naim convention49 that the solute is transferred from a fixed position in the gas phase to a fixed position in solvent (this is called the fixed-concentration solvation free energy above). A value of 78.5 is used for the dielectric constant of water in all our solvation calculations; this nominally corresponds to 298 K.

Calibration

After the bulk electrostatic interactions between solute and solvent were accurately determined for each molecule with the SVPE method, a set of target short-range contributions were obtained from the difference between bulk electrostatic interactions and experimental aqueous solvation free energies. All the CDSL parameters were then subjected to a fitting routine. First the atomic surface tension coefficients were optimized to minimize the root-mean-square error (RMSE) over the 272 neutral solutes. Then the semiempirical parameters in eq. 7 were optimized against all 415 molecules (272 neutral solutes and 143 ions) with the atomic surface tension coefficients frozen. The optimization of Ai and xi was carried out in steps:

  1. First I was set temporarily to 1, and values from 0.001 to 2 were tried for x1; for each x1, values of 0.1 to 2000 were tried for A1. The values of x1 and A1 that produced the smallest value of the weighted root-mean-square deviation (WRMSD) for the 415 solvation energies in the multiple linear regression fitting process were selected for the next step. In computing the WRMSD in this step and all subsequent steps, the neutral solutes and the ions had relative weights of W:1 where W is an integer parameter of the optimization scheme.

  2. I was increased by 1. The values of Ai and xi already obtained were fixed and the values of AI and xI were optimized in the same way as in step 1.

  3. Step 2 was repeated until an arbitrary maximum value of I was reached. In this way, a first-round set of values of Ai and xi (i = 1, 2, …, I) was obtained.

  4. Now with I fixed at its maximum value, each Ai and xi were re-optimized with the remaining parameters fixed. For example, A1 and x1 were re-optimized with other Ai and xi (i=2, 3, …, I) obtained from previous steps fixed; then the re-optimized values of A1 and x1 along with the values of other Ai and xi (i = 3, …, I) were fixed in the re-optimization of A2 and x2; and so forth.

  5. Step 4 was repeated until the values of Ai and xi (i = 1, 2, …, I) did not change.

  6. Steps 1–5 were repeated for several maximal values of I. The final value of I was chosen to be 3 as discussed in the next section.
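The search in steps 1–5 can be sketched as a stagewise grid search followed by cyclic re-optimization. This is an illustration under stated assumptions, not the fitting routine used for the paper: `objective` stands for any caller-supplied function that refits the linear coefficients and returns an error for a trial list of (A, x) pairs, the `wrmsd` helper assumes a simple W:1 weighting of squared errors, and all names are hypothetical.

```python
import math
from itertools import product

def wrmsd(neutral_errors, ion_errors, w):
    """Assumed weighting: each neutral error counts W times, each ion error once."""
    num = w * sum(e * e for e in neutral_errors) + sum(e * e for e in ion_errors)
    den = w * len(neutral_errors) + len(ion_errors)
    return math.sqrt(num / den)

def stagewise_grid_search(objective, a_grid, x_grid, max_order, max_sweeps=20):
    """Return a list of (A_i, x_i) pairs, i = 1..max_order, found as in steps 1-5."""
    candidates = list(product(a_grid, x_grid))
    pairs = []
    # Steps 1-3: add one (A_I, x_I) pair at a time with the earlier pairs frozen.
    for _ in range(max_order):
        best = min(candidates, key=lambda c: objective(pairs + [c]))
        pairs.append(best)
    # Steps 4-5: re-optimize each pair in turn until a full sweep changes nothing.
    for _ in range(max_sweeps):
        changed = False
        for i in range(max_order):
            def trial(c, i=i):
                return objective(pairs[:i] + [c] + pairs[i + 1:])
            best = min(candidates, key=trial)
            if best != pairs[i]:
                pairs[i] = best
                changed = True
        if not changed:
            break
    return pairs

if __name__ == "__main__":
    # Toy objective with a known minimum, standing in for a real WRMSD evaluation.
    targets = [(10.0, 0.05), (100.0, 0.01)]
    def toy(pairs):
        return sum((a - ta) ** 2 + (x - tx) ** 2
                   for (a, x), (ta, tx) in zip(pairs, targets))
    print(stagewise_grid_search(toy, [1.0, 10.0, 100.0], [0.01, 0.05], 2))
```

Step 6 would then amount to calling the search for several values of `max_order` and comparing the resulting errors.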

Results and Discussion

We tested the SMVLE method by examining the errors obtained with maximal values of I from 1 to 5. The mean errors for ions decreased when I was increased from 1 to 3; when I was increased further, the mean errors for anions continued to improve but those for cations increased. Therefore, we set I = 3.

Because solvation free energies of ions are about an order of magnitude larger than those for neutrals, the predictions would have larger errors for ions even if the relative errors were similar. Furthermore, the experimental data for ions usually have larger absolute uncertainties. Another consideration is that our training set contains more neutral data than ionic data. Thus it is a matter of subjective judgment what value to choose for the parametrization weight W to balance the relative accuracies and desired accuracies for neutrals and for ions; we chose W = 3.

The calibrated parameters, namely the surface tension parameters and the semiempirical GL parameters, are given in Table 1 and Table 2, respectively. The calculated absolute aqueous solvation free energies are obtained by the SMVLE method with these calibrated parameters. They are plotted in Figure 1 along with the experimental values. The square of the correlation coefficient between the calculated and the experimental aqueous solvation free energies is 0.9949. Moreover, the slope (1.0079) of the correlation equation is essentially 1 and the corresponding constant (0.1039) is nearly 0, showing that the solvation free energies predicted by the parameterized SMVLE method are in excellent agreement with the experimental values.
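The statistics quoted here (slope, constant, squared correlation coefficient) and the mean unsigned error used later can be computed as in the sketch below. Which quantity is treated as the independent variable in the published correlation is not stated, so the argument order is an assumption, and the numbers in the usage lines are toy values rather than data from the training set.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared

def mean_unsigned_error(calculated, experimental):
    """MUE: average absolute deviation between calculated and experimental values."""
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(calculated)

if __name__ == "__main__":
    # Toy free energies in kcal/mol, not taken from the training set.
    experimental = [-6.3, -4.9, -75.0, -104.0]
    calculated = [-6.0, -5.4, -77.1, -102.5]
    print(linear_fit(experimental, calculated))
    print(mean_unsigned_error(calculated, experimental))
```

A slope near 1 and a constant near 0 indicate the absence of systematic scaling or offset errors, while the MUE measures the typical size of the remaining scatter.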

Table 1. SMVLE Surface Tension Coefficients (cal/Å²)

| k | σ̃(Zk) | k,k′ | σ̃(ZkZk′) |
|---|---|---|---|
| H | 57.88 | H,C | −75.22 |
| C | 114.49 | C,C | −70.59 |
| N | −30.82 | H,O | 110.62 |
| O | −84.28 | O,C | 187.69 |
| F | 46.48 | O,O | 98.59 |
| Cl | 14.69 | C,N | 30.94 |
| Br | 12.56 | N,C | −52.83 |
| P | −31.35 | N,C (2)a | −261.62 |
| S | −4.13 | N,C (3)a | 97.52 |
| | | O,N | 256.52 |
| | | O,P | 79.30 |

a Number in parentheses is m when m is not 1.

Table 2. Parameters for the Local Electrostatic Terms

| i | 1 | 2 | 3 |
|---|---|---|---|
| Ai | 1984.0 | 1528.0 | 1488.0 |
| xi | 0.07 | 0.037 | 0.057 |
| Bi | −2.679 | −23.413 | 453.544 |
| Bi+3 | 1.454 | −5.64 | −139.35 |

Figure 1. Correlation between the experimental and predicted aqueous solvation free energies.

The mean unsigned errors for each solute class obtained from SMVLE were calculated to compare with those from SM6, SM8, SM8AD, and SMD. For the present study we recalculated the previously reported SM6 and SM8 errors using the reference solvation free energies for water-cluster data (data for ionic clusters and the water dimer) corrected by +2.38 kcal/mol to account for a recently discovered error due to a missing concentration correction term (see ref 46 for more detail) and to convert the SM6 error analysis to the scale based on the proton solvation energy of Tissandier et al., as discussed above. Note also that although SM8, SM8AD, and SMD were parameterized with 274 data for neutral solutes in water (the 272 data used here plus profenofos and tetramethylsilane), all mean errors given in the present article have been calculated for the 272 neutral data used here. The mean errors for the new SMVLE model and the previous SM6, SM8, SM8AD, and SMD models are shown in Table 3.

Table 3. Mean Unsigned Errors (kcal/mol) in Aqueous Solvation Free Energies Calculated by SMVLE and Older SMx Methodsa

| Model | ESM | 272 neutrals | 81 PU ions | 31 IU ions | 31 MH ions | 143 all ions | 415 all solutes | 60 SC anions | 52 SC cations | 112 SC ions | 112 UC ions | 384 proper solutesb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SMVLE | HF/6–31+G* | 0.55 | 2.55 | 4.45 | 3.88 | 3.25 | 1.48 | 3.17 | 2.63 | 2.92 | 3.07 | 1.24 |
| SM6 | mPW1PW/6–31G* | 0.47 | 2.55 | 8.57 | 4.34 | 4.24 | 1.77 | 3.45 | 2.58 | 3.05 | 4.22 | 1.22 |
| SM6 | mPW1PW/6–31+G* | 0.55 | 2.90 | 8.52 | 4.29 | 4.42 | 1.89 | 3.00 | 3.61 | 3.28 | 4.46 | 1.35 |
| SM8 | mPW1PW/6–31G* | 0.56 | 2.51 | 8.41 | 4.16 | 4.15 | 1.80 | 3.44 | 2.42 | 2.97 | 4.14 | 1.26 |
| SM8 | mPW1PW/6–31+G* | 0.63 | 2.59 | 8.46 | 4.03 | 4.17 | 1.85 | 2.96 | 3.03 | 2.99 | 4.21 | 1.32 |
| SM8 | M05-2X/6–31G* | 0.59 | 2.54 | 8.41 | 4.18 | 4.17 | 1.83 | 3.49 | 2.42 | 2.99 | 4.16 | 1.29 |
| SM8AD | mPW1PW/6–31G* | 0.60 | 2.94 | 6.25 | 2.64 | 3.59 | 1.63 | 2.86 | 2.85 | 2.86 | 3.86 | 1.26 |
| SM8AD | M05-2X/6–31G* | 0.52 | 3.06 | 6.25 | 2.68 | 3.67 | 1.61 | 2.95 | 2.96 | 2.95 | 3.94 | 1.23 |
| SM8AD | M06-2X/6–31G* | 0.61 | 2.88 | 6.05 | 2.59 | 3.50 | 1.61 | 2.78 | 2.82 | 2.80 | 3.76 | 1.25 |
| SMD | HF/6–31G* | 0.91 | 3.28 | 8.10 | 3.63 | 4.40 | 2.12 | 3.86 | 2.82 | 3.38 | 4.61 | 1.63 |
| SMD | HF/6–31+G* | 1.31 | 3.59 | 8.93 | 3.60 | 4.75 | 2.50 | 4.53 | 2.50 | 3.59 | 5.07 | 1.98 |
| SMD | mPW1PW/6–31G* | 0.62 | 4.50 | 9.64 | 4.63 | 5.64 | 2.35 | 5.61 | 3.29 | 4.53 | 5.92 | 1.76 |
| SMD | M05-2X/6–31G* | 0.59 | 4.08 | 9.11 | 4.24 | 5.21 | 2.18 | 5.01 | 3.09 | 4.12 | 5.47 | 1.62 |
| SMD | M06-2X/6–31G* | 0.62 | 4.39 | 9.35 | 4.44 | 5.48 | 2.30 | 5.45 | 3.19 | 4.40 | 5.76 | 1.73 |

a ESM = electronic structure method, PU = properly unclustered, IU = improperly unclustered, MH = monohydrated, SC = selectively clustered, UC = unclustered. See text for the detailed description of the subsets (“Computational details” subsection).

b All solutes except IU ions.

The error of the SMVLE model for neutrals is 0.55 kcal/mol, and it is close to or better than that of the SMx methods, for which mean unsigned errors range between 0.47 (SM6/mPW1PW/6–31G*) and 1.31 (SMD/HF/6–31+G*) kcal/mol (Table 3). The SMVLE method not only retains good accuracy for neutrals but also significantly improves the accuracy for ions. The SMVLE mean unsigned errors for the set of all 143 ions (3.25 kcal/mol) and for the set of 112 UC ions (3.07 kcal/mol) are smaller than the corresponding errors obtained with any other method tested in the present work (and we have shown previously3,44 that the methods tested here are better than other available methods). The error of the SMVLE model for the set of 112 SC ions is 2.92 kcal/mol, which is close to or better than that of the SMx methods, for which the MUE ranges between 2.80 (SM8AD/M06-2X/6–31G*) and 4.53 (SMD/mPW1PW/6–31G*) kcal/mol.

The local solvent environment is critical for ions, and it is difficult to simulate with implicit solvent. It was noticed in a previous study43 that the overall error for aqueous ions decreased when one explicitly bound solvent molecule was introduced. The data for the two subsets of 31 ions (IU and MH) listed in Table 3 show that, in contrast to previous models, explicitly including one solvent water molecule with the ion only slightly increases the predictive accuracy for ions within the SMVLE method. The SMVLE error over the 31 tested ions decreases from 4.45 kcal/mol (31 IU ions) to 3.88 kcal/mol (31 MH ions). The difference of ~0.57 kcal/mol in SMVLE is much less than the ~4 kcal/mol difference found on average with the non-SMVLE models tested in the present study (Table 3). The similar accuracies obtained with or without the addition of one explicit solvent molecule suggest that SMVLE is capable of successfully modeling the strong local electrostatic interactions between ionic solutes and solvent, and that the addition of an explicit solvent molecule is unnecessary for SMVLE.
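The subset errors in Table 3 are related by simple size-weighted averaging, which the sketch below illustrates with the SMVLE row. The helper name `pooled_mue` is hypothetical, and because the inputs are the rounded tabulated values, the pooled results can differ from the tabulated ones in the last digit.

```python
def pooled_mue(parts):
    """Size-weighted mean of subset MUEs; parts is a list of (count, mue) pairs."""
    total = sum(count for count, _ in parts)
    return sum(count * mue for count, mue in parts) / total

# SMVLE row of Table 3 (kcal/mol): (number of data, mean unsigned error).
NEUTRALS, PU, IU, MH = (272, 0.55), (81, 2.55), (31, 4.45), (31, 3.88)

all_ions = pooled_mue([PU, IU, MH])               # 143 ions
sc_ions = pooled_mue([PU, MH])                    # 112 selectively clustered ions
uc_ions = pooled_mue([PU, IU])                    # 112 unclustered ions
all_solutes = pooled_mue([NEUTRALS, PU, IU, MH])  # 415 data
clustering_gain = IU[1] - MH[1]                   # effect of adding one explicit water

if __name__ == "__main__":
    for name, value in [("all ions", all_ions), ("SC ions", sc_ions),
                        ("UC ions", uc_ions), ("all solutes", all_solutes),
                        ("IU minus MH", clustering_gain)]:
        print(f"{name}: {value:.2f}")
```

The same helper can be applied to any other row of the table to compare how much each model gains from adding one explicit water molecule.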

In an attempt to assess the reason for the success of SMVLE in modeling unclustered ions, we removed the semiempirical electric field term, i.e., GL, in eq (1) and recalculated the errors. The MUE for neutrals changes by only about 4%, whereas the MUE for ions increases from ~5 kcal/mol to ~12 kcal/mol. This implies that the improvement in the predictive accuracy for ions can be attributed to the newly introduced GL term. The values of the outward-directed normal electric field act as indicators of the local solvent environment, such that solute-solvent interactions stronger than would be anticipated from the bulk dielectric constant are associated with large values of the outward-directed normal electric field. As a special case, a hydrogen bond between solute and solvent may often be located by the direction of the strongest outward-directed normal electric field, and the strength of the hydrogen bond might be represented by the magnitude of that field.21,22
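Schematically, this test can be written as a comparison of two expressions. The sketch below assumes the additive SVPE-CDSL decomposition described in the abstract; the symbols for the total free energy, the SVPE term, and the CDS term are notational assumptions made here for illustration, and eq (1) remains the authoritative definition of each term.

```latex
\begin{align}
\Delta G_{\mathrm{S}} &= \Delta G_{\mathrm{SVPE}} + G_{\mathrm{CDS}} + G_{\mathrm{L}}
  && \text{(full SMVLE)} \\
\Delta G_{\mathrm{S}}^{\text{(no L)}} &= \Delta G_{\mathrm{SVPE}} + G_{\mathrm{CDS}}
  && \text{(local electrostatic term removed)}
\end{align}
```

In this notation, the difference between the two expressions is exactly the local electrostatic contribution, so the change in MUE on going from the first to the second isolates the effect of that term for each class of solutes.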

SMVLE does not involve the optimization of intrinsic Coulomb radii for each atomic number, as in previous SMx models, or for atoms with various bonding types, as in some more empirical models. It is especially encouraging that SMVLE yields good results for ions despite not requiring this. In addition, this is of practical importance because it means SMVLE should be easier to extend to additional atomic numbers, if desired.

Although SMVLE provides significant improvement over all previous SMx models for ions, the improvement over SMD is particularly large and especially important. The reason it is especially important is that SMVLE and SMD do not require accurate partial atomic charges, which can sometimes be difficult to obtain for arbitrary levels of theory, extended basis sets, and complex systems. Thus SMVLE and SMD are more generally applicable. We also note that the present test of SMVLE includes diffuse functions (denoted by the “+” in 6–31+G*), and Table 3 shows that in previous methods the use of diffuse functions often decreases accuracy, which we interpreted as due to less stable partial atomic charges in SM6 and SM8 and to outlying charge in SMD. The good performance of SMVLE with a basis set containing diffuse functions is therefore particularly encouraging.

Concerning the computational cost of the SMVLE method, the ratio of the computing time spent on each SCRF cycle of the SMVLE calculation to that spent on each SCF cycle of the corresponding gas-phase calculation ranges from 1.2 to 1.6 when the number of basis functions used for the solute is larger than ~200.

Concluding remarks

We have developed a new method, called SMVLE, for predicting absolute aqueous free energies of solvation by combining (1) the SVPE method, (2) semiempirical atomic surface tensions as used in the SM6 model, and (3) a new functional form, developed in the present study, that explicitly accounts for the local electrostatic effect. The SVPE term accounts for bulk electrostatics; the atomic surface tensions account for solvent cavitation, changes in dispersion energy, and possible changes in local solvent structure; and the final contribution accounts explicitly for non-bulk electrostatics in terms of the local electric field at the solute–solvent boundary. The parameters for SMVLE have been calibrated against a broad range of solutes, including 272 neutrals and 143 ions. The aqueous solvation free energies predicted by the parameterized SMVLE method correlate very well with experiment, with a square of the correlation coefficient of 0.9949 and a slope of 1.0079. Comparisons with previous SMx solvation models show that the SMVLE model not only has comparable accuracy for neutrals but also substantially increases the predictive accuracy for ions. The semiempirical term (GL) derived from the electric field is found to be primarily responsible for the increase in predictive accuracy for ions. The outward-directed normal electric fields that make the most important contributions account for strong interactions between the ionic solute and the nearby solvent, which makes the addition of explicit water molecules unnecessary. These encouraging results demonstrate that the parameterized SMVLE is accurate and effective in predicting absolute solvation free energies not only for neutral molecules but also for ions exhibiting strong solute-solvent interactions.
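The two correlation statistics quoted above can be obtained from an ordinary least-squares fit of predicted against experimental values. The following is a minimal sketch under that assumption (the choice of which variable is regressed on which is ours, as are the function name and the toy data, which are not results from this work).

```python
def slope_and_r2(x, y):
    """Least-squares slope of y on x and the squared Pearson correlation."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx              # regression slope of y on x
    r2 = sxy ** 2 / (sxx * syy)    # square of the correlation coefficient
    return slope, r2


# Toy data lying exactly on the line y = 2x (illustration only).
experimental = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]

slope, r2 = slope_and_r2(experimental, predicted)
print(f"slope = {slope:.4f}, R^2 = {r2:.4f}")
```

A slope near 1 together with a squared correlation coefficient near 1 indicates that the predictions track experiment without systematic scaling.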

Supplementary Material

1_si_001

Acknowledgments

This work was supported in part by the NIH (grants R01DA013930, R01DA025100, and RC1MH088480 to C.-G.Z.), the NSF (grant CHE07-04974 to D. G. T.), and the U. S. Army (grant US ARMY RES LAB/W911NF09-1-0377 to C. J. C. and D. G. T.). J. Liu worked in C.-G. Zhan’s laboratory at the University of Kentucky as an exchange graduate student from Central China Normal University. A. Goren worked in C.-G. Zhan’s laboratory at the University of Kentucky as a visiting professor from Transylvania University. The authors also acknowledge the Center for Computational Sciences (CCS) at the University of Kentucky for supercomputing time on an IBM X-series Cluster with 340 nodes or 1,360 processors. Computational resources were also provided by the Minnesota Supercomputing Institute.

Footnotes

Supporting Information Available. Three tables for all molecules involved in this study, including 272 neutral solutes, 112 unclustered ions, and 31 clustered monohydrated ions, and their experimental aqueous solvation free energies. This material is available free of charge via the Internet at http://pubs.acs.org.

Contributor Information

Christopher J. Cramer, Email: cramer@umn.edu.

Donald G. Truhlar, Email: truhlar@umn.edu.

Chang-Guo Zhan, Email: zhan@uky.edu.

References

  • 1. Cramer CJ, Truhlar DG. Chem. Rev. 1999;99:2161–2200. doi: 10.1021/cr960149m.
  • 2. (a) Tomasi J, Persico M. Chem. Rev. 1994;94:2027–2094; (b) Rivail J-L, Rinaldi D. In: Computational Chemistry: Reviews of Current Trends. Leszczynski J, editor. Vol. 1. Singapore: World Scientific; 1996. pp. 139–174; (c) Orozco M, Luque FJ. Chem. Rev. 2000;100:4187–4225. doi: 10.1021/cr000703z; (d) Tomasi J, Mennucci B, Cammi R. Chem. Rev. 2005;105:2999–3093. doi: 10.1021/cr9904009; (e) Mennucci B, Cammi R, editors. Continuum Solvation Models in Chemical Physics. Chichester: Wiley; 2007.
  • 3. Cramer CJ, Truhlar DG. Acc. Chem. Res. 2008;41:760–768. doi: 10.1021/ar800019z; Acc. Chem. Res. 2009;42:493–497.
  • 4. Chipman DM. J. Chem. Phys. 1997;106:10194–10206; Chipman DM. J. Chem. Phys. 1996;104:3276–3289.
  • 5. Zhan C-G, Bentley J, Chipman DM. J. Chem. Phys. 1998;108:177–192.
  • 6. Zhan C-G, Chipman DM. J. Chem. Phys. 1999;110:1611–1622.
  • 7. Zhan C-G, Landry DW, Ornstein RL. J. Phys. Chem. A. 2000;104:7672–7678; Chen X, Zhan C-G. J. Phys. Chem. A. 2004;108:6407–6413.
  • 9. Zhan C-G, Chipman DM. J. Chem. Phys. 1998;109:10543–10558.
  • 10. Vilkas MJ, Zhan C-G. J. Chem. Phys. 2008;129:194109. doi: 10.1063/1.3020767.
  • 11. Zhan C-G, Niu SQ, Ornstein RL. J. Chem. Soc., Perkin Trans. 2001;2:23–29.
  • 12. Dixon DA, Feller D, Zhan C-G, Francisco JS. J. Phys. Chem. A. 2002;106:3191–3196.
  • 13. Zhan C-G, Dixon DA, Sabri MI, Kim MS, Spencer PS. J. Am. Chem. Soc. 2002;124:2744–2752. doi: 10.1021/ja0113394.
  • 14. Zheng F, Zhan C-G, Ornstein RL. J. Phys. Chem. B. 2002;106:717–722.
  • 15. Zhan C-G, Dixon DA. J. Phys. Chem. B. 2003;107:4403–4417.
  • 16. Zhan C-G, Spencer P, Dixon DA. J. Phys. Chem. B. 2003;107:2853–2861.
  • 17. Dixon DA, Feller D, Zhan C-G, Francisco JS. Int. J. Mass Spectrom. 2003;227:421–438.
  • 18. Zhan C-G, Dixon DA. J. Phys. Chem. A. 2001;105:11534–11540.
  • 19. Xiong Y, Zhan C-G. J. Org. Chem. 2004;69:8451–8458. doi: 10.1021/jo0487597.
  • 20. Chen X, Zhan C-G. J. Phys. Chem. A. 2004;108:3789–3797; Lu H, Chen X, Zhan C-G. J. Phys. Chem. B. 2007;111:10599–10605. doi: 10.1021/jp072917r.
  • 21. Chipman DM. J. Chem. Phys. 2003;118:9937–9942.
  • 22. Chipman DM, Chen FW. J. Chem. Phys. 2006;124:144507/1–5. doi: 10.1063/1.2162542.
  • 23. Cramer CJ, Hawkins GD, Truhlar DG. J. Chem. Soc. Faraday Trans. 1994;90:1802–1804; Storer JW, Giesen DJ, Hawkins GD, Lynch GC, Cramer CJ, Truhlar DG, Liotard DA. ACS Symp. Ser. 1994;568:24–49; Tuñón I, Ruiz-López MF, Rinaldi D, Bertrán J. J. Comp. Chem. 1996;17:148–155; Chuang Y-Y, Cramer CJ, Truhlar DG. Int. J. Quantum Chem. 1998;70:887–896; Cramer CJ, Truhlar DG. Faraday Discuss. Chem. Soc. 1998;110:477–479; Chuang Y-Y, Radhakrishnan ML, Fast PL, Cramer CJ, Truhlar DG. J. Phys. Chem. A. 1999;103:4893–4909; Chuang Y-Y, Truhlar DG. J. Am. Chem. Soc. 1999;121:10157–10167; Jaque P, Marenich A, Cramer CJ, Truhlar DG. J. Phys. Chem. C. 2007;111:5783–5799; Kim Y, Cramer CJ, Truhlar DG. J. Phys. Chem. A. 2009;113:9109–9114. doi: 10.1021/jp905429p; Kim Y, Marenich AV, Zheng J, Kim KH, Kołodziejska-Huben M, Rostkowski M, Paneth P, Truhlar DG. J. Chem. Theory Comput. 2009;5:59–67. doi: 10.1021/ct800345j.
  • 24. Zhan C-G, Dixon DA. J. Phys. Chem. A. 2004;108:2020–2029.
  • 25. Xiong Y, Zhan C-G. J. Phys. Chem. A. 2006;110:12644–12652. doi: 10.1021/jp063140p.
  • 26. Zhan C-G, Landry DW, Ornstein RL. J. Am. Chem. Soc. 2000;122:2621–2627.
  • 27. Zhan C-G, Dixon DA. J. Phys. Chem. A. 2002;106:9737–9744.
  • 28. Rivail JL, Terryn B, Rinaldi D, Ruiz-Lopez MF. J. Mol. Struct. Theochem. 1985;120:387–400; Rinaldi D, Costa Cabral BJ, Rivail J-L. Chem. Phys. Lett. 1986;125:495–499.
  • 29. Cramer CJ, Truhlar DG. J. Am. Chem. Soc. 1991;113:8305–8311.
  • 30. Cramer CJ, Truhlar DG. J. Comput. Chem. 1992;13:1089–1097.
  • 31. Cramer CJ, Truhlar DG. Science. 1992;256(5054):213–217. doi: 10.1126/science.256.5054.213.
  • 32. Chambers CC, Hawkins GD, Cramer CJ, Truhlar DG. J. Phys. Chem. 1996;100:16385–16398.
  • 33. Hawkins GD, Cramer CJ, Truhlar DG. J. Phys. Chem. 1996;100:19824–19839.
  • 34. Hawkins GD, Cramer CJ, Truhlar DG. J. Phys. Chem. B. 1998;102:3257–3271.
  • 35. Li JB, Hawkins GD, Cramer CJ, Truhlar DG. Chem. Phys. Lett. 1998;288:293–298.
  • 36. Zhu TH, Li JB, Hawkins GD, Cramer CJ, Truhlar DG. J. Chem. Phys. 1998;109:9117–9133.
  • 37. Zhu TH, Li JB, Hawkins GD, Cramer CJ, Truhlar DG. J. Chem. Phys. 1999;111:5624–5624.
  • 38. Li JB, Zhu TH, Hawkins GD, Winget P, Liotard DA, Cramer CJ, Truhlar DG. Theor. Chem. Acc. 1999;103:9–63.
  • 39. Li JB, Zhu TH, Cramer CJ, Truhlar DG. J. Phys. Chem. A. 2000;104:2178–2182.
  • 40. Dolney DM, Hawkins GD, Winget P, Liotard DA, Cramer CJ, Truhlar DG. J. Comput. Chem. 2000;21:340–366.
  • 41. Thompson JD, Cramer CJ, Truhlar DG. J. Phys. Chem. A. 2004;108:6532–6542.
  • 42. Thompson JD, Cramer CJ, Truhlar DG. Theor. Chem. Acc. 2005;113:107–131.
  • 43. Kelly CP, Cramer CJ, Truhlar DG. J. Chem. Theory Comput. 2005;1:1133–1152. doi: 10.1021/ct050164b.
  • 44. Marenich AV, Olson RM, Kelly CP, Cramer CJ, Truhlar DG. J. Chem. Theory Comput. 2007;3:2011–2033. doi: 10.1021/ct7001418.
  • 45. Marenich AV, Cramer CJ, Truhlar DG. J. Chem. Theory Comput. 2009;5:2447–2464. doi: 10.1021/ct900312z.
  • 46. Marenich AV, Cramer CJ, Truhlar DG. J. Phys. Chem. B. 2009;113:6378–6396. doi: 10.1021/jp810292n.
  • 47. Hoijtink GJ, Deboer E, Vandermeij PH, Weijland WP. Recl. Trav. Chim. Pays-Bas Belg. 1956;75:487–503; Peradejordi F. Cahiers Phys. 1963;17:393; Constanciel R, Contreras R. Theor. Chim. Acta. 1984;65:1–11; Tucker SC, Truhlar DG. Chem. Phys. Lett. 1989;157:164–170; Still WC, Tempczyk A, Hawley RC, Hendrickson T. J. Am. Chem. Soc. 1990;112:6127–6129; Cramer CJ, Truhlar DG. Rev. Comp. Chem. 1995;6:1–72; Bashford D, Case DA. Annu. Rev. Phys. Chem. 2000;51:129–153. doi: 10.1146/annurev.physchem.51.1.129.
  • 48. Cramer CJ, Truhlar DG. Acc. Chem. Res. 2009;42:493. doi: 10.1021/ar800019z.
  • 49. Ben-Naim A. J. Phys. Chem. 1978;82:792–803.
  • 50. Cramer CJ, Truhlar DG. In: Free Energy Calculations in Rational Drug Design. Reddy MR, Erion MD, editors. New York: Kluwer; 2001. pp. 63–95.
  • 51. Cramer CJ, Truhlar DG. In: Solvent Effects and Chemical Reactivity. Tapia O, Bertran J, editors. Dordrecht: Kluwer; 1996. pp. 1–81.
  • 52. Hermann RB. J. Phys. Chem. 1972;76:2754–2759.
  • 53. Lee B, Richards FM. J. Mol. Biol. 1971;55:379–400. doi: 10.1016/0022-2836(71)90324-x.
  • 54. Nemethy G, Scheraga HA. J. Chem. Phys. 1962;36:3401–3417.
  • 55. Kelly CP, Cramer CJ, Truhlar DG. J. Phys. Chem. B. 2006;110:16066–16081. doi: 10.1021/jp063552y.
  • 56. Tissandier MD, Cowen KA, Feng WY, Gundlach E, Cohen MH, Earhart AD, Coe JV, Tuttle TR. J. Phys. Chem. A. 1998;102:7787–7794.
  • 57. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Montgomery JA Jr, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03. Wallingford CT: Gaussian, Inc.; 2004. Revision C.02.
  • 58. Liotard DA, Hawkins GD, Lynch GC, Cramer CJ, Truhlar DG. J. Comp. Chem. 1995;16:422–440.
  • 59. Bondi A. J. Phys. Chem. 1954;58:929–939.
