Estimating Absolute Configurational Entropies of Macromolecules: The Minimally Coupled Subspace Approach

Ulf Hensen; Oliver F Lange; Helmut Grubmüller

doi:10.1371/journal.pone.0009179

PLoS ONE. 2010 Feb 23;5(2):e9179.

Editor: Jörg Langowski

PMCID: PMC2826394 PMID: 20186277

Abstract

We develop a general minimally coupled subspace approach (MCSA) to compute absolute entropies of macromolecules, such as proteins, from computer-generated canonical ensembles. Our approach overcomes limitations of current estimates such as the quasi-harmonic approximation, which neglects non-linear and higher-order correlations as well as the multi-minima characteristics of protein energy landscapes. Here, Full Correlation Analysis, adaptive kernel density estimation, and mutual information expansions are combined, and high accuracy is demonstrated for a number of test systems ranging from alkanes to a 14-residue peptide. We further computed the configurational entropy for the full 67-residue cofactor of the TATA box binding protein, illustrating that MCSA yields improved results also for large macromolecular systems.

Introduction

Entropies are key quantities in physics, chemistry, and biology. While free energy changes govern the direction of all chemical processes including reaction equilibria, entropy changes are the underlying driving forces of ligand binding, protein folding and other phenomena driven by the hydrophobic effect. Calculating entropies from atomistic ensembles $\{\mathbf{x}\}$ of configurations of a macromolecule of $N$ atoms has traditionally been, and remains, notoriously difficult.

We here propose and apply a method for calculating configurational entropies

$$S = -k_B \int \rho(\mathbf{x}) \ln \rho(\mathbf{x})\, d\mathbf{x} \qquad (1)$$

where $\rho(\mathbf{x})$ denotes the configurational probability density in the $3N$-dimensional configurational space governed by the potential energy $V(\mathbf{x})$ of the system. The fact that $N$ is usually on the order of several hundreds or thousands renders the evaluation of this integral quite challenging, despite a number of successful attempts [1]–[4]. These broadly fall into three classes: (i) special-purpose perturbation type approaches, also known as thermodynamic integration [5], (ii) step-by-step reconstruction methods, in particular the scanning procedures introduced by Meirovitch [6], [7], and (iii) direct approaches which analyse information readily available in standard equilibrium simulation trajectories [8]–[10].

While perturbation approaches provide relatively accurate free energy differences also for larger systems, accurate entropies are obtained only for smaller molecules. The main obstacle, which worsens with system size, is the sampling problem, which severely limits the accuracy, in particular for explicit solvent models [2], [5].

The most widely used direct method is the quasi-harmonic approximation (QH) [8], which provides an upper limit to the configurational entropy in terms of independent classical or quantum mechanical harmonic oscillators [9], [10]. This is equivalent to approximating the configurational density by a multivariate Gaussian function,

$$\rho_{\mathrm{QH}}(\mathbf{x}) \propto \exp\left[-\tfrac{1}{2}\,(\mathbf{x}-\langle\mathbf{x}\rangle)^{T}\,\mathbf{C}^{-1}\,(\mathbf{x}-\langle\mathbf{x}\rangle)\right]$$

with the matrix $\mathbf{C}$ derived from the covariance matrix [9], [10]. However, for macromolecules undergoing large conformational motions the entropy is likely to be considerably smaller than this QH upper limit due to coupling and anharmonicities and, in particular, due to the existence of multiple conformational states [11]–[14]. Indeed, for smaller systems such as disaccharides [15] or lipids [16], or for small subsets of larger proteins [17], significantly lower entropies than with QH were obtained by including anharmonicities [11]–[13], [18], [19] and pairwise correlations of QH modes [20].
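As a minimal sketch of the Gaussian picture behind QH, the following Python snippet computes the differential entropy of a multivariate Gaussian with a given covariance matrix, $S = \tfrac{k_B}{2}\ln\det(2\pi e\,\mathbf{C})$. Because a Gaussian has the largest entropy of all densities with the same covariance, this value bounds the entropy from above. This is the plain classical Gaussian entropy, not the mass-weighted formula of Ref. [9]; the function names and the choice of the molar gas constant as prefactor are assumptions made for illustration.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K); assumed prefactor for molar entropies


def covariance(samples):
    """Sample covariance matrix (list of lists) of row-wise samples."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[a] for row in samples) / n for a in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in samples:
        dev = [row[a] - mean[a] for a in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += dev[a] * dev[b] / (n - 1)
    return cov


def log_det_spd(matrix):
    """log(det(M)) of a symmetric positive-definite matrix via Cholesky."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(d))


def gaussian_entropy(cov, k_b=GAS_CONSTANT):
    """Differential entropy of a Gaussian with covariance matrix `cov`."""
    d = len(cov)
    return 0.5 * k_b * (d * math.log(2.0 * math.pi * math.e) + log_det_spd(cov))
```

Note that a differential entropy depends on the units of the coordinates through an additive constant, so only values computed in the same coordinates and units are comparable.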

Results

The MCSA Scheme

Here we develop a direct method consisting of three building blocks. Results for small test systems are presented alongside the methodology to illustrate the effect of each building block. Figure 1 shows that for various small test systems (alkanes, dialanine and a complete 14-residue β-turn) the quasi-harmonic approximation indeed severely overestimates the reference entropy. The reference values were obtained by thermodynamic integration (TI), gradually perturbing the systems towards an analytically tractable reference state consisting of non-interacting particles in harmonic wells, as described in Methods and Refs. [21], [22]. Entropy estimates obtained for all test systems are also summarized in Table 1.

Figure 1. Entropy estimates for five selected alkane systems and dialanine (left), and for the C-terminal turn of Protein G (right; note that here the units are kJ/(mol K)). Thermodynamic integration (TI), density estimates over the whole configurational space (dir), full correlation analysis with subsequent clustering and kernel density estimation (FCA), quasi-harmonic (QH), and mutual information expansion estimates of 2nd (MIE2) and 3rd (MIE3) order were obtained as described in the text.

Table 1. Entropy estimates obtained for all systems.

System	N	TI	dir	FCA	MIE2	MIE3	largest cluster	QH
Butane	4	185±0.29	187±0.11	187±0.36	160±0.24	197±0.34	5	211±0.18
Pentane	5	245±0.30	251±0.17	252±0.69	203±0.44	265±0.25	8	303±0.08
Hexane	6	307±0.68	319±0.21	323±0.40	244±0.55	383±1.15	11	395±0.17
Heptane	7	388±0.92	399±0.34	407±0.33	317±1.26	484±1.58	13	492±0.17
Octane	8	450±0.48	485±0.67	492±0.59	397±1.13	522±1.15	15	587±0.07
Nonane	9	502±0.46	577±0.88	589±1.8	515±0.95	544±0.88	19	682±0.14
Decane	10	564±0.75	670±1.10	683±1.3	571±1.57	685±0.88	21	778±0.13
Dialanine	15	524±1.1	566±0.4	610±2.2	359±2.67	653±2.23	32	707±2.1
β-turn	169						84–108
TBP cofactor	696	–	–	22250±58	21543±152	21853±93	32–88	23226±88
TBP complex	696	–	–	24918±229	24371±392	24514±500	56–80	25880±197

Alkane test systems butane to decane, dialanine, the 14-residue β-turn, as well as free and complexed TATA box binding protein (TBP) cofactor. N: number of atoms; TI: absolute configurational entropy obtained by TI (in J/(mol K)); dir: direct density estimate without clustering; FCA: sum of density estimates after subspace clustering; MIE2 and MIE3: mutual information expansion estimates of 2nd and 3rd order; largest cluster: size of largest cluster; QH: QH entropy estimate. Values are given as estimate ± error.

Non-Parametric Density Estimation

As the first of the three building blocks of the methodology, we recently introduced a non-parametric density estimation resting on adaptive anisotropic ellipsoidal kernels [21] that captures the configurational density in sufficient detail. Briefly, the configurational part of the entropy in a $d$-dimensional space is estimated from $n$ configurations according to

(2)

Eq. 2 involves the ensemble average of an adaptive anisotropic kernel function, whose anisotropy and scaling depend on the local density at point $\mathbf{x}_i$, together with the $d$-dimensional measure of that kernel. This formula simplifies to the well-known $k$-nearest neighbour ($k$-NN) entropy estimator by fixing the kernel function to an (isotropic) sphere whose radius $r_i$ is chosen such that exactly $k$ configurations lie within the sphere centered at configuration $\mathbf{x}_i$. In this limiting case, the kernel measure is determined by the volume of the $d$-dimensional unit sphere. NN estimators in general are entirely non-parametric and, at a finite sample size $n$, have minimal bias [23] in any given number of dimensions $d$. A major drawback, however, is that due to the so-called ‘curse of dimensionality’ [24] simple $k$-NN estimators are applicable only to configurational spaces of up to ten dimensions [25]. In contrast, as can be seen in Fig. 1 (left, “dir” bar), adaptive anisotropic kernels yield accurate results even for the 45-dimensional configurational space of dialanine. For the more than 500-dimensional configurational space of the 14-residue β-turn, however, the ‘curse of dimensionality’ [24] renders it impossible to improve on the quasi-harmonic approximation with direct density estimation alone (Fig. 1 right). Convergence properties and full technical details of this first MCSA module are discussed in Ref. [21].
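The $k$-NN limiting case mentioned above can be sketched in a few lines of Python. The snippet implements the standard Kozachenko-Leonenko form of the estimator (in the variant discussed in Ref. [23]), $\hat{S}/k_B = \psi(n) - \psi(k) + \ln V_d + \tfrac{d}{n}\sum_i \ln r_i$, with $V_d$ the volume of the $d$-dimensional unit sphere. It is not the adaptive anisotropic kernel estimator of Ref. [21], and the brute-force neighbour search and function names are assumptions chosen for brevity.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(m):
    """Digamma function psi(m) for a positive integer m."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, m))


def unit_ball_volume(d):
    """Volume of the d-dimensional unit sphere."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)


def knn_entropy(samples, k=4):
    """Kozachenko-Leonenko entropy estimate in units of k_B (nats).

    `samples` is a list of equally long coordinate tuples; the neighbour
    search is brute force, so this is only meant for small sample sizes.
    """
    n, d = len(samples), len(samples[0])
    log_r_sum = 0.0
    for i, x in enumerate(samples):
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th neighbour
    return (digamma_int(n) - digamma_int(k)
            + math.log(unit_ball_volume(d)) + d * log_r_sum / n)
```

For points drawn from a two-dimensional standard normal distribution, the estimate should approach the exact value $\ln(2\pi e)$ as $n$ grows; in high dimensions the fixed isotropic sphere becomes the limiting factor, which is the motivation given in the text for adaptive anisotropic kernels.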

Generation of Minimally Coupled Subspaces

As the second building block of our method, we apply an entropy-invariant transformation $\mathbf{x} \mapsto \mathbf{c}$ such that the usually highly coupled degrees of freedom separate into optimally uncoupled subspaces, each of which is sufficiently low-dimensional to render non-parametric density estimation applicable. As the most straightforward class of entropy-invariant transformations, we consider here linear orthonormal transformations of the form $\mathbf{c} = \mathbf{T}\mathbf{x}$ with $\mathbf{T}^{T}\mathbf{T} = \mathbf{1}$. More general transformations are currently being explored [26]. We apply Full Correlation Analysis (FCA) [27], which minimizes the mutual information

$$I[c_1, \ldots, c_{3N}] = \int \rho(\mathbf{c}) \ln \frac{\rho(\mathbf{c})}{\prod_{i} \rho_i(c_i)}\, d\mathbf{c},$$

where $c_i$ denote the components of $\mathbf{c}$ and $\rho_i(c_i)$ the 1-dimensional marginal density along $c_i$. This procedure minimizes non-linear correlations of second and higher order [27] and therefore generalizes principal component analysis (PCA), which only considers linear correlations of second order. For complex macromolecules, however, even for the optimal linear FCA transformation $\mathbf{T}$, considerable non-linear correlations between several degrees of freedom will remain and cannot be neglected. To address this issue, the FCA modes are subsequently clustered according to the generalized correlation coefficient [25], [28]

$$r_{\mathrm{MI}}[c_i, c_j] = \sqrt{1 - \exp\left(-2\, I[c_i, c_j]\right)}$$

with the mutual information

$$I[c_i, c_j] = \int \rho(c_i, c_j) \ln \frac{\rho(c_i, c_j)}{\rho_i(c_i)\,\rho_j(c_j)}\, dc_i\, dc_j$$

between components $c_i$ and $c_j$. This is achieved by assigning mode indices to clusters such that all modes with correlation coefficients larger than a certain threshold are assigned to the same cluster. This disjoint clustering defines an approximate factorization $\rho(\mathbf{c}) \approx \prod_k \rho_k(\mathbf{c}_k)$, where $\rho_k$ denotes the generalized $d_k$-dimensional marginal density along the modes $\mathbf{c}_k$ of cluster $k$. This factorization is approximate in the sense that for the entropy

$$S = \sum_k S_k - I_{\mathrm{res}} \qquad (3)$$

with $S_k$ the entropy of cluster $k$, the residual entropy $I_{\mathrm{res}}$ is small.
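The residual entropy of Eq. 3, i.e. the difference between the summed cluster entropies and the full entropy, can be made explicit with a standard information-theoretic identity. The following is a sketch in which $\rho_k$ denotes the marginal density of cluster $k$ and $S_k$ its entropy; this notation is assumed here, and the identity is a textbook result rather than a derivation taken from this work. It uses $\int \rho_k(\mathbf{c}_k) \ln \rho_k(\mathbf{c}_k)\, d\mathbf{c}_k = \int \rho(\mathbf{c}) \ln \rho_k(\mathbf{c}_k)\, d\mathbf{c}$, which holds because $\rho_k$ is a marginal of $\rho$.

```latex
\begin{aligned}
\sum_k S_k - S
  &= -k_B \sum_k \int \rho_k(\mathbf{c}_k) \ln \rho_k(\mathbf{c}_k)\, d\mathbf{c}_k
     + k_B \int \rho(\mathbf{c}) \ln \rho(\mathbf{c})\, d\mathbf{c} \\
  &= k_B \int \rho(\mathbf{c})\,
     \ln \frac{\rho(\mathbf{c})}{\prod_k \rho_k(\mathbf{c}_k)}\, d\mathbf{c}
  \;\ge\; 0 .
\end{aligned}
```

The right-hand side is $k_B$ times a Kullback-Leibler divergence, so it vanishes only if the factorization is exact. Summing the cluster entropies therefore never underestimates the full entropy, which is why neglected correlations always push the estimate upwards.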

Such approximate factorization, of course, neglects all inter-cluster correlations. These can be pairwise correlations, which are small (below the clustering threshold) by construction, or higher-order correlations. For the latter we have to assume that they are also effectively eliminated by our threshold criterion. This assumption is supported by the results obtained for the alkanes and for dialanine (cf. Fig. 1). Thus, our factorization yields accurate entropies, and the residual entropy in Eq. 3 is indeed small.
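The threshold clustering of modes described above can be sketched as follows. The snippet assumes that the generalized correlation coefficient of two one-dimensional modes takes the form $r_{\mathrm{MI}} = \sqrt{1 - e^{-2I}}$ of Ref. [28], and it reads "all modes with correlation coefficients larger than a threshold are assigned to the same cluster" as a transitive (single-linkage) grouping implemented with a union-find structure. Both the reading and the function names are assumptions for illustration.

```python
import math


def generalized_correlation(mutual_information):
    """r_MI = sqrt(1 - exp(-2 I)) for two 1-dimensional modes (I in nats)."""
    return math.sqrt(1.0 - math.exp(-2.0 * mutual_information))


def threshold_clusters(corr, threshold):
    """Group mode indices so that any pair with corr > threshold shares a cluster.

    `corr` is a symmetric matrix (list of lists) of pairwise correlation
    coefficients; clusters are returned as sorted lists of mode indices.
    """
    n = len(corr)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] > threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Lowering the threshold merges more modes into fewer, larger clusters, which reduces the neglected inter-cluster correlations at the price of higher-dimensional density estimates.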

Mutual Information Expansions for Oversized Clusters

However, for the larger molecules considered here, the necessarily small threshold typically results in at least one cluster being too large for a sufficiently accurate density estimate (e.g., for the β-turn the largest cluster comprises 84–108 modes, see Table 1). Accordingly, while our factorization still improves the entropy estimate (cf. Fig. 1), the residual entropy of Eq. 3 can no longer be neglected. The third building block of our method addresses this issue by subdividing each oversized cluster into disjoint, smaller subclusters, irrespective of the necessarily remaining strong correlations between these. As a result, the residual entropy contributions to the configurational entropy are drastically increased due to non-negligible intra-cluster contributions from all subdivided clusters. We here propose to compute each of these intra-cluster contributions via the mutual information expansion (MIE) as

$$I_{\mathrm{intra}} = \sum_{m=2}^{M} (-1)^{m} \sum_{i_1 < \dots < i_m} I_m(\mathbf{c}_{i_1}, \dots, \mathbf{c}_{i_m}) \qquad (4)$$

where $M$ is the number of subclusters $\mathbf{c}_i$ of the subdivided cluster, such that the entropy of the cluster is the sum of its subcluster entropies minus $I_{\mathrm{intra}}$. Expanding the mutual information terms

$$I_m(\mathbf{c}_1, \dots, \mathbf{c}_m) = \sum_{l=1}^{m} (-1)^{l+1} \sum_{i_1 < \dots < i_l} S(\mathbf{c}_{i_1}, \dots, \mathbf{c}_{i_l}) \qquad (5)$$

up to second or third order, respectively, with the right-hand sum running over all possible index combinations $i_1 < \dots < i_l$, has proven sufficiently accurate in liquid state theory [29] and information theory [30], [31]. Indeed, for the β-turn, inclusion of the remaining correlations via this expansion improved the entropy estimate (Fig. 1). For the other test systems, the MIE estimates are listed in Table 1. For some of the test systems, the 2nd-order estimate falls below the TI reference, such that, from our observations, 3rd-order MIE provides a better estimate and an upper bound to the true entropy.
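To illustrate how a truncated mutual information expansion behaves, the sketch below applies it to a small discrete toy data set, using plug-in entropies from counted frequencies instead of the kernel density estimates used in this work. The alternating-sum definition of the $m$-th order mutual information follows the standard expansion of Refs. [32], [33]; the function names and the toy data are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(samples, idx):
    """Plug-in entropy (nats) of the joint distribution of columns `idx`."""
    counts = Counter(tuple(row[i] for i in idx) for row in samples)
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in counts.values())


def mutual_information(samples, idx):
    """m-th order mutual information as an alternating sum of entropies."""
    total = 0.0
    for l in range(1, len(idx) + 1):
        for sub in combinations(idx, l):
            total += (-1) ** (l + 1) * entropy(samples, sub)
    return total


def mie_entropy(samples, order):
    """Entropy estimate from the expansion truncated at `order`."""
    d = len(samples[0])
    total = sum(entropy(samples, (i,)) for i in range(d))
    for m in range(2, order + 1):
        for idx in combinations(range(d), m):
            total += (-1) ** (m + 1) * mutual_information(samples, idx)
    return total


# Toy data: three identical copies of one fair binary variable.
toy = [(0, 0, 0), (1, 1, 1)]
second_order = mie_entropy(toy, order=2)
third_order = mie_entropy(toy, order=3)
```

In this fully redundant toy case the exact joint entropy is $\ln 2$. The 2nd-order truncation subtracts the shared information once per pair and thus falls below the exact value, whereas the 3rd-order expansion is exact for three variables. This only illustrates one way in which a 2nd-order estimate can undershoot; it does not reproduce the molecular results.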

Applications of MIE to macromolecular systems can be hampered by the curse of dimensionality and by the combinatorial explosion of the number of terms [32], [33]. In this work, the problem is circumvented by clustering into sufficiently high-dimensional subspaces, which minimizes residual inter-cluster correlations and delays the onset of the combinatorial explosion. At the same time, the subspaces are sufficiently small that even for the 3rd-order MIE no direct density estimates beyond the critical dimensionality are required.
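The combinatorial growth mentioned above is easy to quantify: an expansion over $n$ subspaces contains $\binom{n}{m}$ mutual information terms of order $m$. The helper below counts the terms of orders 2 up to a chosen truncation order; the function name and the example numbers in the note are hypothetical and not taken from this work.

```python
from math import comb


def mie_term_count(n_subspaces, order):
    """Number of mutual-information terms of orders 2..order among n subspaces."""
    return sum(comb(n_subspaces, m) for m in range(2, order + 1))
```

For example, a 3rd-order expansion over 500 individual modes would involve `mie_term_count(500, 3)`, about $2 \times 10^7$ terms, whereas grouping the same modes into 25 subspaces leaves `mie_term_count(25, 3)`, i.e. 2600 terms.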

TATA Box Binding Protein: Protein Test Case and Error Estimate

Together, these three building blocks enable one to calculate configurational entropies even for larger biomolecules. We considered the 67-residue TATA box binding protein (TBP, PDB code 1TBA) inhibitor in two different states: complexed (Fig. 2 top left) and free (Fig. 2 top right). To estimate the statistical error of the MCSA and QH configurational entropy estimates, five independent molecular dynamics (MD) simulations were carried out for each state using the OPLS force field [34] and the TIP4P explicit solvent model [35] (see Methods for full simulation details). Fig. 2 shows the results obtained by the five entropy estimation methods for both the complexed (left) and the free (right) inhibitor. All methods estimate the free cofactor's entropy to be significantly higher than that of the bound cofactor. As can be seen, for both the complexed and the free cofactor, QH yields the largest estimate. The first two MCSA modules combined (kernel density estimation on weakly correlated configurational subspaces obtained from FCA) already yield markedly smaller estimates, irrespective of whether a high or a low clustering threshold was chosen (hi thresh and lo thresh in Fig. 2); i.e., choosing small but more strongly correlated subspaces or larger but weakly correlated subspaces provides similar estimates. Finally, employing all three MCSA modules, including MIE of 2nd (MIE2) and 3rd (MIE3) order, lowered the estimate again with, as before, the 2nd-order estimate being lower than the 3rd-order estimate.

Figure 2. Entropy estimates for the complexed and the free TBP cofactor. The following techniques are used: quasi-harmonic approximation (QH); FCA with subsequent density estimation using a high clustering threshold (hi thresh) or, respectively, a low threshold (lo thresh); mutual information expansion of order 2 (MIE2) or, respectively, of order 3 (MIE3). The displayed entropy estimates are averages over five independent simulations of 100 ns each; the error bars indicate standard deviations of the mean.

The fact that the QH estimate is the largest in all cases corroborates the observations for the small test cases, and generally shows that MCSA yields improved estimates also for large macromolecules. Already the first two MCSA modules provide lower entropy estimates, even though relatively large configurational subspaces (see the largest cluster sizes in Table 1) were obtained from FCA, which illustrates that our kernel density estimator indeed works accurately also for the complex high-dimensional configurational spaces spanned by proteins. Further, the fact that the clustering threshold did not affect the final estimate very much reflects a trade-off. Clustering with a high threshold yields small subspaces which remain correlated, such that the residual entropy in Eq. 3 is large, increasing our estimate. On the other hand, clustering with a small threshold gives rise to a small residual entropy, but the sparse sampling of the resulting large subspaces then entails higher cluster entropy estimates, such that the total estimate is also increased in this case. As expected, the third MCSA module, MIE, circumvents this problem and lowers the MCSA estimate further by 404 or 397 J/(mol K) for the free and the complexed cofactor, respectively. The 2nd-order estimate is lower than the 3rd-order estimate in all cases, which shows that also for proteins the pair correlations are generally overestimated, and inclusion of 3rd-order correlations is indeed crucial.

The statistical errors are relatively small in all cases, but generally twice as large for the free as for the complexed cofactor. We attribute this observation to the larger inherent flexibility of the free state, and hence to insufficient molecular dynamics sampling. Interestingly, the MIE estimates are slightly more affected, with the error for the free cofactor being three to four times as large as for the complex. Due to the high number of terms to be evaluated for the MIEs (Eq. 5), already small errors in each term result in relatively large errors in the total estimate.

Discussion

We have developed a minimally coupled subspace approach (MCSA) to estimate absolute macromolecular configurational entropies from structure ensembles which takes anharmonicities and higher-order correlations into account. The approach combines three building blocks which together allow one to calculate absolute entropies even for the highly complex configurational densities generated by the dynamics of biological macromolecules such as proteins. MCSA shares the versatility of the quasi-harmonic approach as it can be applied to unperturbed equilibrium trajectories while achieving the accuracy of special-purpose perturbation type methods. The effective dimension reduction provided by the Full Correlation Analysis allows for the application of mutual information expansions to large macromolecules. Further, the adaptive kernel non-parametric density estimation method developed for MCSA requires much weaker a-priori assumptions about the properties of the configurational densities than (quasi-)harmonic approaches. The method is applicable also to large macromolecules such as proteins. In this study, we showed that MCSA applied to the TATA box binding protein yielded significantly smaller and thus improved entropy estimates.

We note that here we focus on configurational entropies of the solute only, thus missing both the solvent and the solvent/solute contributions. Using permutation reduction techniques [36], our method should be capable of capturing these important contributions as well, which, however, lies outside the scope of the present work.

Methods

Thermodynamic Integration Reference Entropy

Absolute free energies for the test systems butane to decane, dialanine, and the Protein G β-turn were calculated by thermodynamic integration (TI); simulation parameters are given below. The TI scheme we have chosen to obtain the Helmholtz free energy of the fully interacting particles consists of two phases. In the first phase, harmonic position restraints were slowly switched on for each atom; in the second phase, all force-field components were gradually switched off, with the charges being switched off prior to the rest of the force field. After the second phase, the system consisted of non-interacting dummy particles of mass $m_i$ oscillating in their respective harmonic position restraint potentials.

The free energy of this harmonic system can be obtained analytically in terms of the mass-weighted force constant. Hence, the thermodynamic integration yields the absolute free energy of the fully interacting system, and the entropy then follows from this free energy and the ensemble average of the potential energy.

For the TI between the systems given by $\lambda = 0$ (start) and $\lambda = 1$ (end), 21 intermediate steps were used, and the intermediate values of $\lambda$ (1e-6, 5e-6, 1e-5, 5e-4, 1e-4, 1e-3, 1e-2, 2e-2, 3e-2, 5e-2, 7e-2, 9e-2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1) were distributed unevenly to obtain approximately balanced contributions to the integral. For each value of $\lambda$, a separate trajectory was generated, with one trajectory length used for the alkanes and dialanine and another for the β-turn.
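Thermodynamic integration obtains a free energy difference as $\Delta F = \int_0^1 \langle \partial H / \partial \lambda \rangle_\lambda \, d\lambda$, with the ensemble average evaluated separately at each $\lambda$ value. A minimal sketch of the numerical integration over an unevenly spaced $\lambda$ grid is given below. The trapezoidal rule and the function name are assumptions; the quadrature actually used is not stated in the text.

```python
def ti_free_energy(lambdas, dh_dlambda):
    """Trapezoidal integral of <dH/dlambda> over a (possibly uneven) lambda grid.

    `lambdas` must be sorted in increasing order, and `dh_dlambda[i]` is the
    ensemble average of dH/dlambda sampled at `lambdas[i]`.
    """
    if len(lambdas) != len(dh_dlambda):
        raise ValueError("need exactly one average per lambda value")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * width * (dh_dlambda[i] + dh_dlambda[i - 1])
    return total
```

Placing more $\lambda$ points where the integrand changes rapidly, as done here with the dense spacing near the start of the path, keeps the per-interval quadrature error comparable across the path.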

The error estimates of the TI reference entropies detailed in Table 1 were obtained in two ways for the alkane test systems and dialanine: first, by averaging over five independent simulations and, second, by performing blockwise averaging as derived in Ref. [37] over each of the 23 $\lambda$ steps of each of these five simulations. We found that the error estimates obtained by these two methods agree very well. Accordingly, for the β-turn only the block averaging method was applied, and the resulting error estimates are also given in Table 1.
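The idea behind blockwise averaging can be sketched as follows: a correlated time series is cut into contiguous blocks, and the scatter of the block averages yields the standard error of the overall mean, provided the blocks are longer than the correlation time. This is a simplified variant with a fixed number of blocks and a hypothetical function name; the scheme of Ref. [37] may differ in detail.

```python
import math


def block_standard_error(series, n_blocks):
    """Standard error of the mean estimated from the scatter of block averages.

    Trailing samples that do not fill a complete block are ignored.
    """
    block_len = len(series) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least two non-empty blocks")
    means = [
        sum(series[b * block_len:(b + 1) * block_len]) / block_len
        for b in range(n_blocks)
    ]
    grand = sum(means) / n_blocks
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)
```

In practice one would repeat the estimate for several block lengths and check that it has reached a plateau before trusting it.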

Molecular/Stochastic Dynamics Simulations

The test systems that were compared with a thermodynamic integration reference (butane to decane, dialanine, and the Protein G β-turn) were set up as follows. Force-field parameterizations were obtained from the Dundee PRODRG server [38] based on the GROMOS united-atom force field [39]. Stochastic Dynamics simulations were performed in vacuo at constant temperature using the molecular simulations package GROMACS [40], with the friction constant set to 10 and no bond constraints. Positional restraints were applied to three adjacent terminal heavy atoms. To obtain MCSA error estimates, each of the simulations was carried out five times using different starting velocities. MCSA and QH entropy estimates were obtained from trajectories whose length differed between the alkanes and dialanine on the one hand and the β-turn on the other, i.e., the TI entropy references required many times as much computing time as the MCSA and QH estimates.

The TATA box binding protein (TBP) complex (Protein Data Bank entry 1TBA) was simulated using the OPLS all-atom force field [34] in explicit TIP4P solvent [35] with periodic boundary conditions. NpT ensembles were simulated, with the protein and solvent coupled separately to a 300 K heat bath [41]. The systems were isotropically coupled to a pressure bath at 1 bar [41]. Application of the LINCS [42] and SETTLE [43] algorithms allowed for a longer integration time step. Short-range electrostatics and Lennard–Jones interactions were calculated within a cut-off, and the neighbour list was updated every 10 steps. The particle mesh Ewald (PME) method was used for the long-range electrostatic interactions [44]. The free cofactor was simulated using the same parameters as above. The starting structure was obtained by removing the TBP from the X-ray structure of the complex and equilibrating for 2 ns. Entropy estimates and corresponding errors for both complexed and free cofactor were obtained from five trajectories of 200 ns length each.

Mutual Information Expansions Implementation Details

Fill modes

Due to the moderate regularization assumptions, our adaptive kernel density estimator is sensitive to the sparse sampling problem, whose effect is highly dependent on the dimensionality. To guarantee the same accuracy for all density estimates required for the computation of the correlation terms $I_m$ of Eq. 5 despite their different dimensionality, it is thus necessary to ensure the same local densities around points in different terms. This is normally not provided. The mutual information between two modes $c_i$ and $c_j$,

$$I_2(c_i, c_j) = k_B \int \rho(c_i, c_j) \ln \frac{\rho(c_i, c_j)}{\rho(c_i)\,\rho(c_j)}\, dc_i\, dc_j \qquad (6)$$

contains differently well sampled terms in denominator and numerator, because the number of sampling points available to estimate $\rho(c_i, c_j)$ is only half the number of sampling points available for estimating the marginal densities $\rho(c_i)$ and $\rho(c_j)$ (see Fig. 3). The accuracy of the estimated marginal densities is, consequently, possibly higher than that of the joint estimate, yielding an inaccurate correlation estimate. To overcome this problem, we devised the concept of fill modes. Accordingly, an artificially decorrelated mode $c_j'$ is created by permuting the components of $c_j$. The marginal densities $\rho(c_i)$ and $\rho(c_j)$ are unaffected by such a permutation, yielding a new expression for Eq. 6,

$$I_2(c_i, c_j) = k_B \int \rho(c_i, c_j) \ln \frac{\rho(c_i, c_j)}{\rho(c_i, c_j')}\, dc_i\, dc_j \qquad (7)$$

where the product of the marginal densities $\rho(c_i)$ and $\rho(c_j)$ is now computed from the synthetically decorrelated joint distribution $\rho(c_i, c_j')$, such that the same accuracy for the joint estimate is guaranteed as for the marginal estimates. Conducting this scheme on the 3rd-order correlation function of three modes $c_i$, $c_j$ and $c_k$,

$$I_3(c_i, c_j, c_k) = S(c_i) + S(c_j) + S(c_k) - S(c_i, c_j) - S(c_i, c_k) - S(c_j, c_k) + S(c_i, c_j, c_k),$$

yields

(8)

where the pairwise joint distributions have been ‘filled up’ with permuted ‘fill modes’, as described above.

Figure 3. a) Two arbitrarily correlated modes $c_i$ and $c_j$ with their marginal distributions shown on the axes. Correlation is clearly visible from the joint distribution of $(c_i, c_j)$. The joint distribution is more sparsely sampled than both marginal distributions. b) The joint distribution of $(c_i, c_j')$ with the permuted fill mode $c_j'$ is decorrelated and has exactly as many sample points as the joint distribution in a), allowing precise computation of the mutual information.
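The construction of a fill mode can be sketched in a few lines: permuting the time frames of one mode leaves its marginal distribution untouched but destroys its correlation with every other mode, while the paired data set keeps exactly the same number of sample points. The use of a uniformly random permutation and the function names are assumptions; the text only specifies that the components are permuted.

```python
import random


def make_fill_mode(mode, rng=None):
    """Return a randomly permuted copy of a mode's time series."""
    rng = rng or random.Random()
    permuted = list(mode)
    rng.shuffle(permuted)
    return permuted


def decorrelated_pair(mode_i, mode_j, rng=None):
    """Pair mode_i with a permuted mode_j: same marginals, correlation destroyed."""
    return list(zip(mode_i, make_fill_mode(mode_j, rng)))
```

A density estimate on the output of `decorrelated_pair` can then stand in for the product of the two marginal densities, at the same dimensionality and sample size as the estimate of the true joint density.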

Consistent dimensions

The sensitivity of the nearest-neighbour estimates, Eq. 2, towards the sparse sampling problem also affects the different terms of Eq. 5, which inevitably suffer from different sparse sampling problems if computed separately. Furthermore, a huge number of probability density distributions is computed more than once for the many instances of identical correlation terms appearing in that equation. Expanding over entropy terms rather than correlation terms, in contrast, yields

$$S \approx \sum_{l=1}^{m} a_l \sum_{i_1 < \dots < i_l} S(c_{i_1}, \dots, c_{i_l}) \qquad (9)$$

where the first summation runs over the different orders $l$ up to the truncation order $m$, the coefficient $a_l$ designates how many times a certain order appears and whether it needs to be added or subtracted, and the second sum runs over all possible combinations $i_1 < \dots < i_l$. To guarantee the same estimation accuracy for all terms of Eq. 9, each term is filled up to the truncation order $m$ with fill modes. Under this modification, Eq. 9 reads

(10)

with the number of marginal entropies depending on the fill mode weighting index. As above, primes indicate permuted entries.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: U.H. was supported by the Deutsche Forschungsgemeinschaft (research training group 782). O.F.L. was supported by the Human Frontiers of Science Program and by the Volkswagen Foundation, Grant I/80436. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1.Beveridge DL, DiCapua FM. Free energy via molecular simulation: Applications to chemical and biomolecular systems. Annual Review of Biophysics and Biophysical Chemistry. 1989;18:431–492. doi: 10.1146/annurev.bb.18.060189.002243. [DOI] [PubMed] [Google Scholar]
2.Straatsma TP, McCammon JA. Computational alchemy. Annual Review of Physical Chemistry. 1992;43:407–435. [Google Scholar]
3.Kollman P. Free energy calculations: Applications to chemical and biochemical phenomena. Chem Rev. 1993;93:2395–2417. [Google Scholar]
4.Meirovitch H. Recent developments in methodologies for calculating the entropy and free energy of biological systems by computer simulation. Curr Opin Struct Biol. 2007;17:181–186. doi: 10.1016/j.sbi.2007.03.016. [DOI] [PubMed] [Google Scholar]
5.Peter C, Oostenbrink C, van Dorp A, van Gunsteren WF. Estimating entropies from molecular dynamics simulations. J Chem Phys. 2004;120:2652–2661. doi: 10.1063/1.1636153. [DOI] [PubMed] [Google Scholar]
6.Cheluvaraja S, Meirovitch H. Simulation method for calculating the entropy and free energy of peptides and proteins. PNAS. 2004;101:9241–9246. doi: 10.1073/pnas.0308201101. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Cheluvaraja S, Meirovitch H. Calculation of the entropy and free energy of peptides by molecular dynamics simulations using the hypothetical scanning molecular dynamics method. J Chem Phys. 2006;125:024905. doi: 10.1063/1.2208608. [DOI] [PubMed] [Google Scholar]
8.Karplus M, Kushick JN. Method for estimating the configurational entropy of macromolecules. Macromolecules. 1981;14:325–332. [Google Scholar]
9.Schlitter J. Estimation of absolute and relative entropies of macromolecules using the covariance matrix. Chemical Physics Letters. 1993;215:617–621. [Google Scholar]
10.Karplus M, McCammon JA. Molecular dynamics simulations of biomolecules. Nat Struct Mol Biol. 2002;9:646–652. doi: 10.1038/nsb0902-646. [DOI] [PubMed] [Google Scholar]
11.Chang C, Chen W, Gilson M. Evaluating the accuracy of the quasiharmonic approximation. J Chem Theory Comput. 2005;1:1017–1028. doi: 10.1021/ct0500904. [DOI] [PubMed] [Google Scholar]
12.Chang C, Chen W, Gilson MK. Ligand configurational entropy and protein binding. PNAS. 2007;104:1534–1539. doi: 10.1073/pnas.0610494104. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Gilson MK, Zhou HX. Calculation of protein-ligand binding affinities. Ann Rev Biophys Biomol Struct. 2007;36:21–42. doi: 10.1146/annurev.biophys.36.040306.132550. [DOI] [PubMed] [Google Scholar]
14.Minh DDL, Bui JM, Chang C, Jain T, Swanson JMJ, et al. The entropic cost of protein-protein association: A case study on acetylcholinesterase binding to fasciculin-2. Biophys J. 2005;89:25–27. doi: 10.1529/biophysj.105.069336. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Pereira CS, Kony D, Baron R, Müller M, van Gunsteren WF, et al. Conformational and dynamical properties of disaccharides in water: a molecular dynamics study. Biophysical Journal. 2006;90:4337–4344. doi: 10.1529/biophysj.106.081539. [DOI] [PMC free article] [PubMed] [Google Scholar]
16. Baron R, de Vries A, Hünenberger P, van Gunsteren W. Comparison of atomic-level and coarse-grained models for liquid hydrocarbons from molecular dynamics configurational entropy estimates. J Phys Chem B. 2006;110:8464–8473. doi: 10.1021/jp055888y.
17. Baron R, McCammon JA. (Thermo)dynamic role of receptor flexibility, entropy, and motional correlation in protein-ligand binding. ChemPhysChem. 2008;9:983–988. doi: 10.1002/cphc.200700857.
18. Kolossvary I. Evaluation of the molecular configuration integral in all degrees of freedom for the direct calculation of conformational free energies: Prediction of the anomeric free energy of monosaccharides. J Phys Chem A. 1997;101:9900–9905.
19. Chang C, Potter M, Gilson M. Calculation of molecular configuration integrals. J Phys Chem B. 2003;107:1048–1055.
20. Baron R, van Gunsteren W, Hünenberger P. Estimating the configurational entropy from molecular dynamics simulations: anharmonicity and correlation corrections to the quasi-harmonic approximation. Trends Phys Chem. 2006;11:87–122.
21. Hensen U, Grubmüller H, Lange OF. Adaptive anisotropic kernels for nonparametric estimation of absolute configurational entropies in high-dimensional configuration spaces. Phys Rev E. 2009;80:011913. doi: 10.1103/PhysRevE.80.011913.
22. Tyka M, Clarke A, Sessions R. An efficient, path-independent method for free-energy calculations. J Phys Chem B. 2006;110:17212–17220. doi: 10.1021/jp060734j.
23. Kraskov A, Stögbauer H, Grassberger P. Estimating mutual information. Phys Rev E. 2004;69:066138. doi: 10.1103/PhysRevE.69.066138.
24. Bellman RE. Adaptive Control Processes. Princeton University Press; 1961.
25. Hnizdo V, Darian E, Fedorowicz A, Demchuk E, Li S, et al. Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules. J Comp Chem. 2007;28:655–668. doi: 10.1002/jcc.20589.
26. Hennig M. Entropy invariant transformations. 2007. Master's thesis, Universität Jena.
27. Lange OF, Grubmüller H. Full correlation analysis of conformational protein dynamics. Proteins. 2008;70:1294–1312. doi: 10.1002/prot.21618.
28. Lange OF, Grubmüller H. Generalized correlation for biomolecular dynamics. Proteins. 2006;62:1053–1061. doi: 10.1002/prot.20784.
29. Baranyai A, Evans DJ. Direct entropy calculation from computer simulation of liquids. Phys Rev A. 1989;40:3817–3822. doi: 10.1103/physreva.40.3817.
30. Attard P, Jepps OG, Marčelja S. Information content of signals using correlation function expansions of the entropy. Phys Rev E. 1997;56:4052–4067.
31. Attard P. Markov superposition expansion for the entropy and correlation functions in two and three dimensions. In: Statistical Physics on the Eve of the Twenty-First Century. World Scientific; 1999.
32. Killian BJ, Kravitz JY, Gilson MK. Extraction of configurational entropy from molecular simulations via an expansion approximation. J Chem Phys. 2007;127:024107. doi: 10.1063/1.2746329.
33. Hnizdo V, Tan J, Killian BJ, Gilson MK. Efficient calculation of configurational entropy from molecular simulations by combining the mutual-information expansion and nearest-neighbor methods. J Comp Chem. 2008;29:1605–1614. doi: 10.1002/jcc.20919.
34. Kaminski G, Friesner R, Tirado-Rives J, Jorgensen W. Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides. J Phys Chem B. 2001;105:6474–6487.
35. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys. 1983;79:926–935.
36. Reinhard F, Grubmüller H. Estimation of absolute solvent and solvation shell entropies via permutation reduction. J Chem Phys. 2007;126:014102. doi: 10.1063/1.2400220.
37. Hess B. Determining the shear viscosity of model liquids from molecular dynamics simulations. J Chem Phys. 2002;116:209–217.
38. Schüttelkopf AW, van Aalten DMF. PRODRG - a tool for high-throughput crystallography of protein-ligand complexes. Acta Crystallographica D. 2004;60:1355–1363. doi: 10.1107/S0907444904011679.
39. van Gunsteren WF, Daura X, Mark AE. GROMOS force field. In: Encyclopaedia of Computational Chemistry. 1998. pp. 1211–1216.
40. van der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, et al. GROMACS: Fast, flexible, and free. J Comp Chem. 2005;26:1701–1718. doi: 10.1002/jcc.20291.
41. Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR. Molecular dynamics with coupling to an external bath. J Chem Phys. 1984;81:3684–3690.
42. Hess B, Bekker H, Berendsen HJC, Fraaije JGEM. LINCS: A linear constraint solver for molecular simulations. J Comp Chem. 1997;18:1463–1472.
43. Miyamoto S, Kollman PA. SETTLE: An analytical version of the SHAKE and RATTLE algorithm for rigid water models. J Comp Chem. 1992;13:952–962.
44. Darden T, York D, Pedersen L. Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J Chem Phys. 1993;98:10089–10092.

