Frontiers in Molecular Biosciences. 2018 Feb 8;5:11. doi: 10.3389/fmolb.2018.00011

Free Energy, Enthalpy and Entropy from Implicit Solvent End-Point Simulations

Federico Fogolari 1,2,*, Alessandra Corazza 2,3, Gennaro Esposito 1,2,4
PMCID: PMC5809407  PMID: 29473043

Abstract

Free energy is the key quantity to describe the thermodynamics of biological systems. In this perspective we consider the calculation of free energy, enthalpy and entropy from end-point molecular dynamics simulations. Since the enthalpy may be calculated as the ensemble average over equilibrated simulation snapshots, the difficulties of free energy calculation ultimately lie in the calculation of the entropy of the system and in particular of the solvent entropy. In the last two decades implicit solvent models have been used to circumvent the problem and to take into account solvent entropy implicitly in the solvation terms. More recently, outstanding advancements in both implicit solvent models and entropy calculations are making the goal of free energy estimation from end-point simulations more feasible than ever before. We briefly review the basic theory and discuss the advancements in light of practical applications.

Keywords: free energy, enthalpy, entropy, molecular dynamics simulations, implicit solvent, continuum solvent, MM/GBSA

1. Introduction

Free energy is the key quantity in the description of the thermodynamics of biological systems, and therefore an important objective of biomolecular simulations. In spite of its relevance, free energy is not calculated for every simulation, even though concepts like (local) stability, which imply a thermodynamic description, are most often used to describe the results.

The difficulties arise obviously from the calculation of the entropy of the system which cannot be immediately computed as an ensemble average over simulation snapshots, contrary to the enthalpy. The free energy G can be formally related to an ensemble average:

$$G = RT \log \langle \exp(\beta U) \rangle \tag{1}$$

where R is the gas constant, T the temperature, β is equal to 1/(RT), U is the potential energy and ⟨⟩ indicates the thermodynamic ensemble average. However, as pointed out many years ago (Beveridge and DiCapua, 1989), this formula is of no practical use unless all of configurational space enters (implicitly or explicitly) the average, because the lowest-probability configurations make the largest contributions to the average, and the average cannot converge for potentials that are unbounded from above. In molecular simulations typically only near-equilibrium configurations are sampled.
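
The point can be made explicit with a one-line calculation, sketched here for a classical configurational ensemble with partition function $Z = \int \exp(-\beta U)\,dr$ over a configurational volume $V$ (the symbols $Z$ and $V$ are introduced only for this illustration):

$$\langle \exp(\beta U) \rangle = \frac{\int \exp(\beta U)\exp(-\beta U)\,dr}{\int \exp(-\beta U)\,dr} = \frac{V}{Z}, \qquad RT \log \langle \exp(\beta U) \rangle = -RT \log Z + RT \log V$$

Every configuration, however improbable, contributes equally to the numerator, so an average restricted to the sampled near-equilibrium region misses most of $V$.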

For this reason free energy differences are calculated using methods such as umbrella sampling (Torrie and Valleau, 1977), thermodynamic integration (Straatsma and McCammon, 1991) or metadynamics (Laio and Parrinello, 2002), which compute in various ways the free energy along a pathway connecting the end-point states, even though often only the free energy difference between the end points is actually needed.

If the two end-points are close enough a simulation of one or both end-points may be performed and free energy differences obtained by free energy perturbation (Zwanzig, 1954), but this is not possible in general.
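
For reference, the free energy perturbation estimate is usually written as follows, where the labels 0 and 1 for the two end-point states and the subscript on the average (sampling performed in state 0) are notation chosen here for illustration:

$$\Delta G_{0 \to 1} = -RT \log \langle \exp(-\beta (U_1 - U_0)) \rangle_0$$

The average converges only when the configurations that matter for state 1 are adequately sampled in state 0, which is why the two end points must be close.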

The problem with the computation of the entropy from a single MD simulation is that only part of the configurational space is accessed by simulation.

In this short perspective we remark that the long-sought calculation of free energies from end-point simulations may be achieved with reasonable accuracy using implicit solvent end-point simulations. In particular, some recent developments in the way entropy is calculated make it possible to compute the free energy from a single simulation, so it is worth briefly reviewing the theory here and providing simple formulae to compute free energy, entropy and enthalpy from implicit solvent simulations.

2. Theory

We consider here simulations sampling different states of a system A in solution. This can be done by different simulations if the states are kinetically well separated, or by post-processing a single simulation to divide microstates belonging to different states.

If the simulation is extensive enough most probable microstates are sampled and representative thermodynamic ensembles are generated for the different states which include samples of all microstates whose probability density is higher than a threshold.

Enthalpy can be computed as the energy ensemble average, whereas entropy is problematic, in particular for what concerns solvent degrees of freedom. Although solvent entropy has been taken into account in some studies (e.g., De Simone et al., 2005) the extensive correlation among solvent molecules makes solvent entropy estimation a difficult task. In order to get rid of this problem we treat the system using an implicit solvent model.

2.1. Implicit solvent models

Following the excellent reviews available (Gilson et al., 1997; Roux and Simonson, 1999; Wereszczynski and McCammon, 2012) we write the standard chemical potential, or molar free energy, in the following form:

$$G_A^0 = -RT \log \left( \frac{\int \exp(-\beta U(r_A, r_S))\,dr_A\,dr_S}{\int \exp(-\beta U(r_S))\,dr_S} \right) + C \tag{2}$$

where $r_A$ and $r_S$ are the solute and solvent coordinates. C includes the integrals over the reference state (1 M, random orientation), the momentum integrals (independent of conformation), which cancel when comparing different states of the same system, and the term $P^0 \bar{V}_A$, with $P^0$ the standard pressure and $\bar{V}_A$ the partial volume of the solute, whose dependence on conformation has negligible effects at standard pressure. For a derivation see Gilson et al. (1997).

For this reason in the following C will not be considered further.

Due to the difficulties in estimating the entropy of highly correlated solvent molecules, the solvation potential of mean force $\Delta W(r_A, T)$ is defined by integrating out the solvent degrees of freedom:

$$\exp(-\beta \Delta W(r_A, T)) = \frac{\int \exp(-\beta (U_{AS}(r_A, r_S) + U_S(r_S)))\,dr_S}{\int \exp(-\beta U_S(r_S))\,dr_S} \tag{3}$$

where $U_{AS}$ denotes the energy terms that couple the solute and the solvent and $U_S$ is the solvent energy.

With this definition and the assumptions made above we have:

$$G_A^0 = -RT \log \left( \int \exp(-\beta (U(r_A) + \Delta W(r_A, T)))\,dr_A \right) \tag{4}$$

where the dependence on the temperature of the solvation energy ΔW has been made explicit.

Implicit solvent models provide a functional form and parameters for ΔW(rA,T).

Although the implicit solvent model based on the Poisson-Boltzmann equation (Fogolari et al., 2002) and the solvent accessible surface area (PBSA) can be used in molecular dynamics simulations (Fogolari et al., 2003), the method based on the Generalized Born model and solvent accessible surface area (GBSA) is the method of choice for its computational efficiency (Still et al., 1990; Onufriev et al., 2004). Limitations of both methods include the treatment of small crevices and cavities, where water may not display bulk solvation properties, and the neglect of the curvature dependence of the surface tension coefficient (Nicholls et al., 1991). For GBSA an additional limitation may arise from the dependence of empirical parameters on molecular shape (see e.g., Fogolari et al., 2015a). Compared to approaches that simulate the solutes in explicit solvent and post-process the trajectory in implicit solvent (Kollman et al., 2000), methods that generate the configurational ensemble using the same implicit solvent used for energy calculations are more consistent because they do not suffer from possible mismatches between the implicit and explicit solvent models.

In recent years some of the available implicit solvent models (including forcefield and set of parameters used (Swanson et al., 2005, 2007)) have been shown to be extremely accurate in the treatment of protein thermodynamics, as demonstrated by the study by Simmerling and coworkers, where 16 out of 17 proteins could be correctly folded using the GB-Neck2 implicit solvent model and for 14 of them the native fold was preferred over the misfolded one (Mongan et al., 2007; Nguyen et al., 2014).

2.2. Enthalpy in implicit solvent models

The molar enthalpy is obtained as:

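$$H_A^0 = G_A^0 + T S_A^0 = G_A^0 - T \frac{\partial G_A^0}{\partial T} \tag{5}$$

When the differentiation is performed, taking into account that $\Delta W(r_A, T)$ also depends on the temperature, we obtain:

$$H_A^0 = \langle U + \Delta W \rangle - T \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle \tag{6}$$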

where the symbol <> indicates the ensemble average of the quantity within brackets, i.e., the average over the simulation of the same quantity.

The above equation shows that the enthalpy, which is obtained as the energy ensemble average in MD simulations, is directly related to the ensemble average of the implicit solvent model potential energy. The difference is expressed by the $-T \langle \partial \Delta W / \partial T \rangle$ term, which may however be obtained by explicit differentiation, with respect to the temperature, of the solvation energy terms in the implicit solvation model.
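
The intermediate steps can be sketched as follows, writing $E = U(r_A) + \Delta W(r_A, T)$ and $Z = \int \exp(-\beta E)\,dr_A$ as shorthand (notation introduced here for convenience), so that equation 4 reads $G_A^0 = -RT \log Z$:

$$\frac{\partial Z}{\partial T} = \int \exp(-\beta E) \left( \frac{E}{RT^2} - \frac{1}{RT} \frac{\partial \Delta W}{\partial T} \right) dr_A = Z \left( \frac{\langle E \rangle}{RT^2} - \frac{1}{RT} \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle \right)$$

$$\frac{\partial G_A^0}{\partial T} = -R \log Z - \frac{RT}{Z} \frac{\partial Z}{\partial T} = \frac{G_A^0}{T} - \frac{\langle E \rangle}{T} + \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle$$

$$H_A^0 = G_A^0 - T \frac{\partial G_A^0}{\partial T} = \langle U + \Delta W \rangle - T \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle$$

The extra term appears only because the effective energy itself depends on temperature; for a temperature-independent potential the enthalpy reduces to the plain ensemble average of the energy.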

In the GBSA model the solvation energy is the sum of an electrostatic (ΔWel) and a surface tension (ΔWSA) term:

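$$\Delta W = \Delta W_{el} + \Delta W_{SA} = \frac{1}{2} \left( \frac{1}{\epsilon(T)} - \frac{1}{\epsilon_{in}} \right) \sum_{i,j} \frac{q_i q_j}{f_{ij}} + \gamma(T) A \tag{7}$$

where $\epsilon(T)$ and $\epsilon_{in}$ are the solvent and molecular (typically 1) dielectric constants, respectively, $q_i$ is the charge of the ith atom, $f_{ij}$ a pairwise function depending on all atoms' coordinates, $\gamma(T)$ is a surface tension and A is the solvent-accessible surface area. The dependence on temperature of the solvent dielectric constant and of the surface tension has been made explicit in the above equation. The contributions to the derivative of ΔW with respect to temperature are easily obtained from the electrostatic and surface tension solvation energy terms, when the temperature dependence of the parameters of the implicit solvent model is known:

$$\frac{\partial \Delta W}{\partial T} = \frac{\partial \Delta W_{el}}{\partial T} + \frac{\partial \Delta W_{SA}}{\partial T} = \Delta W_{el} \left( \frac{\epsilon_{in}}{\epsilon (\epsilon - \epsilon_{in})} \right) \frac{\partial \epsilon}{\partial T} + \Delta W_{SA} \frac{1}{\gamma} \frac{\partial \gamma}{\partial T} \tag{8}$$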
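
As a minimal sketch of how equations 6 and 8 might be applied to per-snapshot energies, the following Python functions take the electrostatic and surface solvation terms of each snapshot and return the temperature derivative and the enthalpy. The function names and every numerical value (energies, dielectric constant and its derivative, surface tension and its derivative) are illustrative placeholders, not parameters of any specific implicit solvent model or results of a real simulation.

```python
def dW_dT(dW_el, dW_sa, eps, deps_dT, gamma, dgamma_dT, eps_in=1.0):
    """Temperature derivative of the GBSA solvation energy (equation 8)."""
    electrostatic = dW_el * eps_in / (eps * (eps - eps_in)) * deps_dT
    surface = dW_sa * dgamma_dT / gamma
    return electrostatic + surface


def enthalpy(energies, dW_el_list, dW_sa_list, T, eps, deps_dT,
             gamma, dgamma_dT, eps_in=1.0):
    """Equation 6: H = <U + dW> - T <d(dW)/dT>, averaged over snapshots."""
    n = len(energies)
    mean_energy = sum(energies) / n
    mean_deriv = sum(
        dW_dT(el, sa, eps, deps_dT, gamma, dgamma_dT, eps_in)
        for el, sa in zip(dW_el_list, dW_sa_list)
    ) / n
    return mean_energy - T * mean_deriv


# Placeholder inputs for three snapshots (kcal/mol); not from a real trajectory.
energies = [-1520.3, -1518.9, -1521.7]   # U + dW for each snapshot
dW_el = [-310.2, -305.8, -312.4]         # electrostatic solvation term
dW_sa = [12.1, 12.4, 11.9]               # surface area solvation term
H = enthalpy(energies, dW_el, dW_sa, T=300.0,
             eps=78.4, deps_dT=-0.36,        # assumed solvent dielectric data
             gamma=0.005, dgamma_dT=-1.0e-5)  # assumed surface tension data
print(f"H = {H:.2f} kcal/mol")
```

The correction is computed from quantities that implicit solvent codes typically report separately, so no additional simulation is needed, only the temperature dependence of the model parameters.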

2.3. Entropy in implicit solvent models

The entropy is linked to the configurational solute probability distribution.

First consider that the probability density distribution in the implicit solvent model is:

$$\rho(r_A, T) = \frac{\exp(-\beta (U(r_A) + \Delta W(r_A, T)))}{\int \exp(-\beta (U(r_A) + \Delta W(r_A, T)))\,dr_A} \tag{9}$$

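When the differentiation of the free energy is performed the entropy can be written as:

$$S_A^0 = -\frac{\partial G_A^0}{\partial T} \tag{10}$$
$$= -R \int \rho(r_A, T) \log(\rho(r_A, T))\,dr_A - \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle \tag{11}$$
$$= -R \langle \log(\rho(r_A, T)) \rangle - \left\langle \frac{\partial \Delta W}{\partial T} \right\rangle \tag{12}$$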

Rewriting the entropy in terms of the probability density allows the free energy in turn to be rewritten as the sum of an ensemble average and a configurational entropy term, which is itself an ensemble average:

$$G_A^0 = \langle U + \Delta W \rangle + RT \langle \log(\rho(r_A, T)) \rangle \tag{13}$$

As said above, the difficulty in estimating free energies resides in the entropy estimation, which requires consideration of both sampled and non-sampled configurational space. Similarly, to estimate ρ(rA,T), consideration of non-sampled configurational space is needed. Implicit solvent models circumvent the problem for the solvent by treating solvent entropy implicitly, through the parameters and their temperature dependence.

In the above equation the configurational entropy is written formally as an ensemble average of the logarithm of the probability density. In this respect a convenient description of systems containing proteins is the bond, angle, torsion (BAT) representation (Go and Scheraga, 1976), because bonds, and to some extent also angles, contribute very little to changes in entropy between different states of proteins (e.g., notably, bound and non-bound) (Karplus et al., 1987). For this reason entropy is often estimated considering, to a first approximation, only torsional degrees of freedom and possibly external rotation and translation degrees of freedom.

Other descriptions, e.g., in cartesian coordinates, have been used and the entropy has been calculated assuming the system is moving harmonically or anharmonically about the energy minimum (see recent reviews Polyansky et al., 2012; Wereszczynski and McCammon, 2012). This approximation may however be poor for loops or other unrestrained parts of the molecules.

A particularly attractive method to compute entropy is the approach proposed by Singh et al. (2003) which was further developed by the same authors and others (Hnizdo et al., 2003, 2007, 2008; Darian et al., 2005; Numata et al., 2007; Wang et al., 2009; Mukherjee, 2011; Fenley et al., 2014; Huggins, 2014; Fogolari et al., 2015b, 2016). In practice the probability density $\rho(x_1, x_2, \ldots, x_s)$ may be estimated by considering, around each configurational sample $x_i$, a ball whose radius $r_i$ extends to the k-th nearest neighbor (Figure 1). The local probability density $\hat{\rho}(x_i)$ is then obtained by dividing the number k of samples found inside the ball by the volume $V_i$ of the ball and by the total number of samples n, i.e.:

Figure 1. Pictorial illustration of the k-th nearest neighbor estimation of entropy. See text.

$$\hat{\rho}(x_i) = \frac{1}{V_i} \frac{k}{n} \tag{14}$$

The idea is very simple and can be made rigorous (Singh et al., 2003). When equation 14, or its exact form, is substituted in equation 13 the free energy may be estimated easily from configurational samples.

The discrepancy between the naive and the exact treatment is limited. Note that the volume around each configurational sample is tuned to match the local density and there is no need to consider regions where no configurational sample is found. Obviously the dimensionality of a protein system is large and it is not possible in practice to consider a probability density without actually assuming that most degrees of freedom are decoupled from each other.
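
The substitution of equation 14 into equation 13 can be illustrated on a one-dimensional toy system. The sketch below is not the implementation used by the authors: it is a plain-Python estimator for a single coordinate, in which the naive form follows equation 14 and the "corrected" form assumes the bias correction usually quoted for nearest neighbor estimators, where log(k) is replaced by the digamma function ψ(k). The harmonic well, the value of RT, the force constant and all function names are assumptions made for the example.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function psi(k) for a positive integer k."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def knn_entropy_1d(samples, k=5, corrected=True):
    """Entropy in units of R from k-th nearest neighbor distances in 1D.

    corrected=False uses the naive density of equation 14, -<log(k / (n V_i))>;
    corrected=True replaces log(k) by psi(k) (bias-corrected form assumed here).
    Assumes more than k samples and no exactly repeated values.
    """
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        # In sorted 1D data the k nearest neighbors lie within k positions of i.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        dists = sorted(abs(xs[j] - x) for j in range(lo, hi) if j != i)
        volume = 2.0 * dists[k - 1]  # length of the 1D "ball" of radius r_i
        total += math.log(n * volume)
    return total / n - (digamma_int(k) if corrected else math.log(k))


def free_energy(energies, entropy_over_R, RT):
    """Equation 13 rewritten as G = <U + dW> - RT * (S_conf / R)."""
    return sum(energies) / len(energies) - RT * entropy_over_R


# Toy check: harmonic well U = 0.5 * kappa * x**2, where G = -RT log Z is known.
random.seed(0)
RT, kappa = 0.596, 2.0                 # kcal/mol near 300 K; kappa is arbitrary
sigma = math.sqrt(RT / kappa)          # the Boltzmann distribution is Gaussian
xs = [random.gauss(0.0, sigma) for _ in range(2000)]
energies = [0.5 * kappa * x * x for x in xs]
G_est = free_energy(energies, knn_entropy_1d(xs, k=5), RT)
G_exact = -0.5 * RT * math.log(2.0 * math.pi * RT / kappa)
print(f"estimated G = {G_est:.3f}, exact G = {G_exact:.3f}")
```

For a given k the naive and corrected estimates differ by the constant log(k) - ψ(k), which shrinks as k grows. Real applications work in many torsional dimensions, where the ball volume and the neighbor search must be generalized accordingly.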

The groups of Gilson (Killian et al., 2007) and Tidor (King and Tidor, 2009; King et al., 2012) have proposed treatments of mutual information that provide a practical estimate of global entropy (actually, an upper bound) considering only single degrees of freedom and pairwise mutual information. Depending on the number of samples the approach can be easily extended to higher orders of correlation.
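
A common way of writing such a pairwise approximation, given here as a sketch rather than as the exact expression used in the cited works, is the second-order truncation of the mutual information expansion. In the notation assumed for this illustration, $S_i$ is the entropy of the i-th degree of freedom, $S_{ij}$ the joint entropy of a pair, and $I_{ij}$ their mutual information:

$$S \approx \sum_i S_i - \sum_{i<j} I_{ij}, \qquad I_{ij} = S_i + S_j - S_{ij}$$

Each term involves at most a two-dimensional probability density, which can be estimated from a realistic number of samples, for instance with the nearest neighbor density of equation 14.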

3. Conclusion

We have recapitulated above the fundamentals of implicit solvent free energy calculations from end-point simulations, recalling the relationship between explicit and implicit solvent models, and showing how entropy and enthalpy can be obtained from implicit solvent simulations.

From a practical point of view we can estimate free energy from implicit solvent simulations using equation 13:

$$G_A^0 = \langle U + \Delta W \rangle + RT \langle \log(\rho(r_A, T)) \rangle$$

The ensemble average of U + ΔW is provided by the potential energy in the implicit solvent model used, whereas the entropy -R<log(ρ(rA,T))> must be computed from the conformational ensemble.

The available implicit solvent models and parameters, together with forcefield parameters, have been shown to be accurate enough to reproduce complex phenomena like protein folding giving confidence in the accuracy of the implicit solvent potential of mean force. The average of the potential energy over an implicit solvent molecular dynamics trajectory thus provides the first term in the above equation.

The entropic term may be estimated using the nearest neighbor method which is emerging as an accurate entropy estimator, with many advantages over more traditional methods, including lack of hypothesis on non-sampled conformational space, no need to consider explicitly non-sampled conformational space and sound theoretical basis.

To make the application of the method straightforward we have implemented the nearest-neighbor method in two programs, PDB2ENTROPY and PDB2TRENT, available through the GitHub repository (URL: https://github.com/federico-fogolari), which compute conformational and rotational-translational entropy directly from the conformational ensemble in PDB format.

In summary, the recent advancements in solvation models and in entropy calculation based on the nearest-neighbor method are making the computation of free energy from end-point simulations significantly more accurate than before, with many possible applications in the near future.

Author contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Dr. I. M. Lait and A. Makek are gratefully acknowledged for support.

References

  1. Beveridge D. L., DiCapua F. M. (1989). Free energy via molecular simulation: application to chemical and biomolecular systems. Annu. Rev. Biophys. Biophys. Chem. 18, 431–492. doi: 10.1146/annurev.bb.18.060189.002243
  2. Darian E., Hnizdo V., Fedorowicz A., Singh H., Demchuk E. (2005). Estimation of the absolute internal-rotation entropy of molecules with two torsional degrees of freedom from stochastic simulations. J. Comput. Chem. 26, 651–660. doi: 10.1002/jcc.20198
  3. De Simone A., Dodson G. G., Verma C. S., Zagari A., Fraternali F. (2005). Prion and water: tight and dynamical hydration sites have a key role in structural stability. Proc. Natl. Acad. Sci. U.S.A. 102, 7535–7540. doi: 10.1073/pnas.0501748102
  4. Fenley A. T., Killian B. J., Hnizdo V., Fedorowicz A., Sharp D. S., Gilson M. K. (2014). Correlation as a determinant of configurational entropy in supramolecular and protein systems. J. Phys. Chem. B 118, 6447–6455. doi: 10.1021/jp411588b
  5. Fogolari F., Brigo A., Molinari H. (2002). The Poisson-Boltzmann equation for biomolecular electrostatics: a tool for structural biology. J. Mol. Recogn. 15, 377–392. doi: 10.1002/jmr.577
  6. Fogolari F., Brigo A., Molinari H. (2003). Protocol for MM/PBSA molecular dynamics simulations of proteins. Biophys. J. 85, 159–166. doi: 10.1016/S0006-3495(03)74462-2
  7. Fogolari F., Corazza A., Esposito G. (2015a). Accuracy assessment of the linear Poisson-Boltzmann equation and reparametrization of the OBC generalized Born model for nucleic acids and nucleic acid-protein complexes. J. Comput. Chem. 36, 585–596. doi: 10.1002/jcc.23832
  8. Fogolari F., Corazza A., Fortuna S., Soler M. A., VanSchouwen B., Brancolini G., et al. (2015b). Distance-based configurational entropy of proteins from molecular dynamics simulations. PLoS ONE 10:e0132356. doi: 10.1371/journal.pone.0132356
  9. Fogolari F., Dongmo Foumthuim C. J., Fortuna S., Soler M. A., Corazza A., Esposito G. (2016). Accurate estimation of the entropy of rotation-translation probability distributions. J. Chem. Theory Comput. 12, 1–8. doi: 10.1021/acs.jctc.5b00731
  10. Gilson M. K., Given J. A., Bush B. L., McCammon J. A. (1997). The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys. J. 72, 1047–1069. doi: 10.1016/S0006-3495(97)78756-3
  11. Go N., Scheraga H. A. (1976). On the use of classical statistical mechanics in the treatment of polymer chain conformation. Macromolecules 9, 535–542. doi: 10.1021/ma60052a001
  12. Hnizdo V., Darian E., Fedorowicz A., Demchuk E., Li S., Singh H. (2007). Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules. J. Comput. Chem. 28, 655–668. doi: 10.1002/jcc.20589
  13. Hnizdo V., Fedorowicz A., Singh H., Demchuk E. (2003). Statistical thermodynamics of internal rotation in a hindering potential of mean force obtained from computer simulations. J. Comput. Chem. 24, 1172–1183. doi: 10.1002/jcc.10289
  14. Hnizdo V., Tan J., Killian B. J., Gilson M. K. (2008). Efficient calculation of configurational entropy from molecular simulations by combining the mutual-information expansion and nearest-neighbor methods. J. Comput. Chem. 29, 1605–1614. doi: 10.1002/jcc.20919
  15. Huggins D. J. (2014). Quantifying the entropy of binding for water molecules in protein cavities by computing correlations. Biophys. J. 108, 928–936. doi: 10.1016/j.bpj.2014.12.035
  16. Karplus M., Ichiye T., Pettitt B. M. (1987). Configurational entropy of native proteins. Biophys. J. 52, 1083–1085. doi: 10.1016/S0006-3495(87)83303-9
  17. Killian B. J., Yundenfreund Kravitz J., Gilson M. K. (2007). Extraction of configurational entropy from molecular simulations via an expansion approximation. J. Chem. Phys. 127:024107. doi: 10.1063/1.2746329
  18. King B. M., Silver N. W., Tidor B. (2012). Efficient calculation of molecular configurational entropies using an information theoretic approximation. J. Phys. Chem. B 116, 2891–2904. doi: 10.1021/jp2068123
  19. King B. M., Tidor B. (2009). MIST: maximum information spanning trees for dimension reduction of biological data sets. Bioinformatics 25, 1165–1172. doi: 10.1093/bioinformatics/btp109
  20. Kollman P., Massova I., Reyes C., Kuhn B., Huo S., Chong L., et al. (2000). Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models. Acc. Chem. Res. 33, 889–897. doi: 10.1021/ar000033j
  21. Laio A., Parrinello M. (2002). Escaping free energy minima. Proc. Natl. Acad. Sci. U.S.A. 99, 12562–12566. doi: 10.1073/pnas.202427399
  22. Mongan J., Simmerling C., McCammon J. A., Case D. A., Onufriev A. (2007). Generalized Born model with a simple robust molecular volume correction. J. Chem. Theory Comput. 3, 156–169. doi: 10.1021/ct600085e
  23. Mukherjee A. (2011). Entropy balance in the intercalation process of an anti-cancer drug daunomycin. J. Phys. Chem. Lett. 2, 3021–3026. doi: 10.1021/jz2013566
  24. Nguyen H., Maier J., Huang H., Perrone V., Simmerling C. (2014). Folding simulations for proteins with diverse topologies are accessible in days with a single physics-based force field and implicit solvent. J. Am. Chem. Soc. 136, 13959–13962. doi: 10.1021/ja5032776
  25. Nicholls A., Sharp K. A., Honig B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins Struct. Funct. Genet. 11, 281–296. doi: 10.1002/prot.340110407
  26. Numata J., Wan M., Knapp E. W. (2007). Conformational entropy of biomolecules: beyond the quasi-harmonic approximation. Genome Inform. 18, 192–205. doi: 10.1142/9781860949920_0019
  27. Onufriev A., Bashford D., Case D. A. (2004). Exploring protein native states and large-scale conformational changes with a modified generalized Born model. Proteins Struct. Funct. Genet. 55, 383–394. doi: 10.1002/prot.20033
  28. Polyansky A. A., Zubac R., Zagrovic B. (2012). Estimation of conformational entropy in protein-ligand interactions: a computational perspective. Methods Mol. Biol. 819, 327–353. doi: 10.1007/978-1-61779-465-0_21
  29. Roux B., Simonson T. (1999). Implicit solvent models. Biophys. Chem. 78, 1–20. doi: 10.1016/S0301-4622(98)00226-9
  30. Singh H., Misra N., Hnizdo V., Fedorowicz A., Demchuk E. (2003). Nearest neighbours estimates of entropy. J. Math. Manag. Sci. 23, 301–321. doi: 10.1080/01966324.2003.10737616
  31. Still W. C., Tempczyk A., Hawley R. C., Hendrickson T. (1990). Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 112, 6127–6129. doi: 10.1021/ja00172a038
  32. Straatsma T. P., McCammon J. A. (1991). Multiconfiguration thermodynamic integration. J. Chem. Phys. 95, 1175–1188. doi: 10.1063/1.461148
  33. Swanson J. M. J., Adcock S. A., McCammon J. A. (2005). Optimized radii for Poisson–Boltzmann calculations with the AMBER force field. J. Chem. Theory Comput. 1, 484–493. doi: 10.1021/ct049834o
  34. Swanson J. M. J., Wagoner J. A., Baker N. A., McCammon J. A. (2007). Optimizing the Poisson dielectric boundary with explicit solvent forces and energies: lessons learned with atom-centered dielectric functions. J. Chem. Theory Comput. 3, 170–183. doi: 10.1021/ct600216k
  35. Torrie G. M., Valleau J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J. Comp. Phys. 23, 187–199. doi: 10.1016/0021-9991(77)90121-8
  36. Wang L., Abel R., Friesner R. A., Berne B. J. (2009). Thermodynamic properties of liquid water: an application of a nonparametric approach to computing the entropy of a neat fluid. J. Chem. Theory Comput. 5, 1462–1473. doi: 10.1021/ct900078k
  37. Wereszczynski J., McCammon J. A. (2012). Statistical mechanics and molecular dynamics in evaluating thermodynamic properties of biomolecular recognition. Q. Rev. Biophys. 45, 1–25. doi: 10.1017/S0033583511000096
  38. Zwanzig R. W. (1954). High temperature equation of state by a perturbation method. I. Nonpolar gases. J. Chem. Phys. 22, 1420–1426. doi: 10.1063/1.1740409

Articles from Frontiers in Molecular Biosciences are provided here courtesy of Frontiers Media SA
