Abstract
The differential Shannon entropy of information theory can change under a change of variables (coordinates), but the thermodynamic entropy of a physical system must be invariant under such a change. This difference is puzzling, because the Shannon and Gibbs entropies have the same functional form. We show that a canonical change of variables can, indeed, alter the spatial component of the thermodynamic entropy just as it alters the differential Shannon entropy. However, there is also a momentum part of the entropy, which turns out to undergo an equal and opposite change when the coordinates are transformed, so that the total thermodynamic entropy remains invariant. We furthermore show how one may correctly write the change in total entropy for an isothermal physical process in any set of spatial coordinates.
Keywords: thermodynamic entropy, Shannon entropy, spatial entropy, non-Cartesian coordinates, canonical transformation, Jacobian
“Since no one understands what entropy is, by using this word you will have an advantage over your adversary in any debate.” John von Neumann’s advice to Claude Shannon, according to J. Campbell [1].
1. Introduction
The Gibbs entropy of classical statistical thermodynamics is, apart from some non-essential constants, the differential Shannon entropy [2] of the probability density function (pdf) in the phase space of the system under consideration. However, whereas the thermodynamic entropy is not expected to depend upon the choice of variables, the differential entropy can be changed by a transformation of variables. In particular, the differential entropy of a spatial pdf depends on the choice of the coordinates used to describe the spatial configuration of the system. Moreover, a change of variables can alter not only the absolute differential entropy, but also the entropy difference that results from a change in the pdf, as shown below. This sensitivity to coordinates appears paradoxical, since a physically meaningful quantity ought to be independent of the choice of spatial coordinates. A similar concern was previously expressed in a critique of the concept of the differential entropy itself [3].
Here, we demonstrate that, for the thermodynamic entropy, a transformation of the spatial coordinates is accompanied by a compensating change of the entropy of the canonically conjugate momenta so that the full thermodynamic entropy remains invariant. The invariance of the full entropy stems from the fact that the Jacobian of a canonical transformation equals unity, and explicit demonstration of this invariance yields a simple formula for correcting a spatial entropy computed with transformed coordinates to yield correct full entropy changes. These results have application in calculations of spatial entropy from molecular simulations when Cartesian coordinates are transformed, for example to bond-angle-torsion [4] coordinates.
The paper is structured as follows. Section 2 reviews the formalism of entropy in classical statistical thermodynamics, defines a splitting of the full thermodynamic entropy into momentum and spatial parts for the case in which the spatial coordinates used are Cartesian, and shows that the change in the spatial entropy then equals the change in the total entropy for an isothermal process. (Note that the spatial entropy of the solute part of a solute-solvent system is often termed the solute’s configurational entropy.) Sections 3 and 4 investigate the effects on the spatial and momentum entropy, respectively, of a transformation of the Cartesian coordinates to general spatial coordinates. Section 5 discusses how one may evaluate the change in the full thermodynamic entropy due to an isothermal process in terms of the change in the spatial entropy evaluated in non-Cartesian coordinates. Finally, Section 6 draws conclusions.
2. Spatial Entropy in Cartesian Coordinates
In classical statistical thermodynamics, the entropy S of a system described by coordinates q1, …, qs and the canonically conjugate momenta p1, …, ps, briefly q and p, respectively, is given in terms of the system’s pdf ρ(p, q) in the phase space (p, q) as [5]
$$S = -k_B \int dp\, dq\; \rho(p,q)\, \ln\left[h^s \rho(p,q)\right]. \tag{1}$$
Here, kB is Boltzmann’s constant, and the factor $h^s$, where h = 2πħ is Planck’s constant (quasi-classically, the number of states in a volume element ΔpΔq of the phase space (p, q) is ΔpΔq/$h^s$; see, e.g., [6]), ensures that the argument of the logarithm is dimensionless. The probability density function ρ(p, q) is given by the Boltzmann-Gibbs distribution
$$\rho(p,q) = \frac{e^{-\beta E(p,q)}}{h^s Z}, \tag{2}$$
where β = 1/(kBT), with T being the absolute temperature, E(p, q) is the system’s energy, and
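$$Z = \frac{1}{h^s}\int dp\, dq\; e^{-\beta E(p,q)} \tag{3}$$
is the partition function, which is the distribution’s normalization constant divided by $h^s$ to make it dimensionless. The entropy S can be written in terms of the partition function Z as
$$S = k_B\left(\ln Z + \beta \langle E \rangle\right), \tag{4}$$
where
$$\langle E \rangle = \int dp\, dq\; \rho(p,q)\, E(p,q) \tag{5}$$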
is the mean (expectation) value of the energy E(p, q).
Let us assume that the coordinates q are Cartesian. Then the energy E(p, q) is the sum of a kinetic energy, K, which is a function of the momenta p only, $K(p) = \sum_{i=1}^{s} p_i^2/(2m_i)$, where mi are the masses associated with degrees of freedom i, and a potential energy U = U(q), which depends only on spatial coordinates. The probability distribution (2) and the partition function (3) now factorize as
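$$\rho(p,q) = \rho_m(p)\, \rho_s(q), \tag{6}$$
$$Z = Z_m Z_s, \tag{7}$$
where
$$\rho_m(p) = \frac{e^{-\beta K(p)}}{h^s Z_m}, \tag{8}$$
$$\rho_s(q) = \frac{e^{-\beta U(q)}}{Z_s} \tag{9}$$
are momentum and spatial probability distributions with normalization constants
$$Z_m = \frac{1}{h^s}\int dp\; e^{-\beta K(p)}, \tag{10}$$
$$Z_s = \int dq\; e^{-\beta U(q)}, \tag{11}$$
which can be termed, respectively, the momentum and spatial partition functions. It should be noted that neither of these partition functions is dimensionless; the treatment of the factor $1/h^s$ is discussed later in this section.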
Accordingly, the full entropy (1) can now be separated as
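$$S = S_m + S_s, \tag{12}$$
where Sm is a momentum entropy which can be evaluated in closed form,
$$S_m = -k_B \int dp\; \rho_m(p) \ln\left[h^s \rho_m(p)\right] = k_B \sum_{i=1}^{s} \ln \frac{(2\pi e\, m_i k_B T)^{1/2}}{h}, \tag{13}$$
and Ss is a spatial entropy,
$$S_s = -k_B \int dq\; \rho_s(q) \ln \rho_s(q). \tag{14}$$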
Similarly to (4), the spatial entropy Ss can be written as
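$$S_s = k_B\left(\ln Z_s + \beta \langle U \rangle\right), \tag{15}$$
where
$$\langle U \rangle = \int dq\; \rho_s(q)\, U(q) \tag{16}$$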
is the mean value of the potential energy U(q).
A factor of $h^s$ is included in the definition of the momentum entropy (13) so that the full entropy Sm + Ss has the correct physical dimensions of energy/temperature. The association of this factor with the momentum entropy is arbitrary (as was its inclusion in the momentum partition function (10)); it could instead have been included in the spatial entropy. Unfortunately, the factor cannot be split so that the arguments of the logarithms in both parts of the full entropy are dimensionless. Thus, neither of the “partial” entropies Sm and Ss is correctly dimensioned. Nonetheless, the troubling dimensions cancel for differences in Sm and Ss arising from changes in the pdf, since the relevant terms appear in the arguments of logarithms, and so such differences are physically meaningful.
It is evident from Equation (13) that the momentum entropy does not depend upon the spatial pdf ρs(q), but only upon the momentum pdf ρm(p), which in turn depends only on the atomic masses and the temperature. As a consequence, for an isothermal physical process with a fixed set of atoms, the change in total entropy equals the change in spatial entropy:
$$\Delta S = \Delta S_s. \tag{17}$$
More generally, this equation holds in any coordinate system for which the kinetic energy K is independent of the spatial coordinates, K = K(p).
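The statement that the momentum entropy depends only on the masses and the temperature can be checked numerically. The following is a minimal sketch, assuming the standard closed form for a Maxwell-Boltzmann momentum distribution, S_m/k_B = Σ_i [½ ln(2πe m_i k_B T) − ln h]; the function name and the example masses are illustrative assumptions, not taken from the original text. Because the logarithm acts on a dimensional quantity, only differences of the returned values are physically meaningful, as discussed above.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
AMU = 1.66053906660e-27    # atomic mass unit, kg


def momentum_entropy_over_kB(masses_kg, temperature_K):
    """Momentum entropy S_m / k_B for Cartesian momenta (SI units).

    masses_kg holds one mass per degree of freedom, so each atom
    contributes its mass three times. Assumes a Maxwell-Boltzmann
    momentum distribution (hypothetical helper for illustration).
    """
    total = 0.0
    for m in masses_kg:
        total += 0.5 * math.log(2.0 * math.pi * math.e * m * K_B * temperature_K)
        total -= math.log(H_PLANCK)
    return total


# Three atoms (O, H, H), three Cartesian degrees of freedom each.
masses = [16.0 * AMU] * 3 + [1.0 * AMU] * 6

# Isothermal process: the spatial pdf may change, but S_m cannot,
# because it is a function of the masses and T only.
s_before = momentum_entropy_over_kB(masses, 300.0)
s_after = momentum_entropy_over_kB(masses, 300.0)
delta_isothermal = s_after - s_before

# Heating from 300 K to 600 K: Delta S_m / k_B = (s/2) ln(T2/T1).
delta_heating = momentum_entropy_over_kB(masses, 600.0) - s_before
```

In this sketch `delta_isothermal` vanishes identically, while `delta_heating` equals (9/2) ln 2 for the nine degrees of freedom, which is the analytic temperature contribution mentioned in the Conclusions.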
Note that the spatial entropy defined here is akin to the configurational entropy. However, the latter usually refers specifically to the entropy associated with the conformational fluctuations of a molecule in solution, and is therefore exclusive of the solvent entropy [9]. The spatial entropy is more general, as it may refer to the whole system or any of its parts.
3. Spatial Entropy under a Coordinate Transformation
It is sometimes of interest to compute the change in total entropy, ΔS, associated with an isothermal molecular process, such as protein-ligand binding or protein-folding. Equation (17) shows that the change in total entropy can be obtained by computing the change in the spatial entropy in Cartesian coordinates. However, Cartesian coordinates are not always optimal for this purpose, because the pdf in Cartesian coordinates includes many coordinate dependencies that are rather easily removed by transforming to more natural coordinates, such as bond-angle-torsion (BAT) coordinates [4]. For example, even the Cartesian coordinates of a single atom, (xi, yi, zi), can be strongly correlated with each other, due to the natural tendency of each atom to move in a circular trajectory corresponding to a bond-rotation. This motion is readily captured by a single torsional variable. For this reason, a transformation from Cartesian coordinates to suitably defined internal coordinates of the molecular system under consideration, plus the coordinates of the translation and rotation of the system as a whole, is often performed.
Transforming from Cartesian coordinates q to new coordinates Q(q) (with an inverse q = q(Q)) transforms the probability density function (pdf) ρs(q) into a pdf ρ̃s(Q) of the new coordinates Q according to the standard rule of probability theory,
$$\tilde\rho_s(Q) = J(Q)\, \rho_s(q(Q)), \qquad J(Q) = \left|\det \frac{\partial q}{\partial Q}\right|, \tag{18}$$
where J(Q) is the Jacobian of the transformation. The differential Shannon entropy Hq of information theory, defined as [2]
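$$H_q = -\int dq\; \rho_s(q) \ln \rho_s(q), \tag{19}$$
can be written in terms of the new coordinates as
$$H_q = H_Q + \langle \ln J(Q) \rangle, \tag{20}$$
where
$$H_Q = -\int dQ\; \tilde\rho_s(Q) \ln \tilde\rho_s(Q) \tag{21}$$
is the Shannon entropy of the transformed pdf ρ̃s(Q) and
$$\langle \ln J(Q) \rangle = \int dQ\; \tilde\rho_s(Q) \ln J(Q) \tag{22}$$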
is the expectation value of the logarithm of the Jacobian J(Q).
Based upon Equation (20), the spatial part of the thermodynamic entropy (14) in Cartesian coordinates q, Ss = kBHq, transforms to a different value on changing to internal coordinates Q:
$$S_s = \tilde S_s + k_B \langle \ln J(Q) \rangle, \tag{23}$$
where S̃s = kBHQ. This result may seem troubling, since the thermodynamic entropy of a physical system should not depend on the choice of the spatial coordinates used. Nonetheless, this Jacobian correction is real and is not even guaranteed to cancel when one computes the difference between two entropies associated with a change in the potential function and a consequent change in the physical pdf from ρ to some ρ′. That is, the change ΔSs in the Cartesian spatial entropy Ss arising from a change in the pdf of q does not necessarily equal the change ΔS̃s in the transformed spatial entropy S̃s arising from the corresponding change in the transformed pdf of Q. This is because the Jacobian can vary with Q, so its contribution to the entropy for ρ (22) can differ from that for ρ′
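$$\langle \ln J(Q) \rangle' = \int dQ\; \tilde\rho_s'(Q) \ln J(Q). \tag{24}$$
Hence,
$$\Delta S_s = \Delta \tilde S_s + k_B\left[\langle \ln J(Q) \rangle' - \langle \ln J(Q) \rangle\right] \neq \Delta \tilde S_s \quad \text{in general}. \tag{25}$$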
We note that a similar treatment of the transformation of spatial entropy under a change of coordinates has been given in [7].
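A simple case in which both the entropy shift and the mismatch of entropy changes can be checked is an isotropic two-dimensional Gaussian of width σ, described either in Cartesian coordinates (x, y) or in polar coordinates (r, φ) with Jacobian J = r. The sketch below is an illustration under these assumptions (the Gaussian model, the function names and the integration settings are not from the original text); entropies are in units of kB with lengths in arbitrary units.

```python
import math


def polar_entropy_terms(sigma, n=100_000, r_max_factor=12.0):
    """Return (H_Q, <ln J>) for an isotropic 2D Gaussian in polar coordinates.

    The pdf of (r, phi) is J * rho_s with J = r, uniform in phi.
    The r integral uses a midpoint rule; phi is integrated analytically.
    """
    dr = r_max_factor * sigma / n
    h_Q = 0.0
    mean_ln_J = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        rho_tilde = r * math.exp(-r * r / (2.0 * sigma**2)) / (2.0 * math.pi * sigma**2)
        weight = 2.0 * math.pi * dr * rho_tilde  # probability in this radial shell
        h_Q -= weight * math.log(rho_tilde)
        mean_ln_J += weight * math.log(r)
    return h_Q, mean_ln_J


def cartesian_entropy(sigma):
    """Differential entropy of the same Gaussian in (x, y): ln(2 pi e sigma^2)."""
    return math.log(2.0 * math.pi * math.e * sigma**2)


h_Q1, ln_J1 = polar_entropy_terms(1.0)
h_Q2, ln_J2 = polar_entropy_terms(2.0)

# Shift of the absolute entropy: H_q = H_Q + <ln J>.
residual = h_Q1 + ln_J1 - cartesian_entropy(1.0)

# Entropy change when the width doubles, in each coordinate system.
delta_cartesian = cartesian_entropy(2.0) - cartesian_entropy(1.0)  # 2 ln 2
delta_polar = h_Q2 - h_Q1                                          # ln 2
```

Here the Cartesian entropy change is 2 ln 2 whereas the polar one is only ln 2; the missing ln 2 is exactly the change in the mean of ln J, because the Jacobian varies with r.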
4. Momentum Entropy under a Coordinate Transformation
The coordinate transformation q → Q = Q(q) is called a point transformation because the new coordinates Q are functions only of the old coordinates q; that is, they do not involve the old conjugate momenta p. (A general point transformation may also involve an explicit dependence on time.) A point transformation of the spatial coordinates is associated with a canonical transformation of the full phase space coordinate system; i.e., of both the spatial and momentum coordinates, such that p, q → P = P(p, q), Q = Q(q), with the inverses p = p(P,Q), q = q(Q). In a canonical transformation, the new variables P,Q remain canonically conjugate, which means that
$$P_i = \frac{\partial L}{\partial \dot Q_i}, \qquad i = 1, \ldots, s, \tag{26}$$
where L = L(Q, Q̇, t), with Q̇ = dQ/dt, is the system’s Lagrangian expressed in terms of Q. Classical statistical thermodynamics is based on the Hamilton formalism of mechanics, and property (26) ensures that the Hamilton equations in terms of P, Q retain their canonical form. Here, an important property of a canonical transformation p, q → P,Q is that its Jacobian equals unity [8]. This means that the phase-space volume element of an integration in the full phase space is invariant:
$$dp\, dq = dP\, dQ. \tag{27}$$
A point transformation q → Q = Q(q) itself has in general a Jacobian J(Q) ≠ 1, and so dq = J(Q) dQ. Thus, for invariance (27) to hold, the momentum volume element dp must transform in a canonical transformation as
$$dp = \frac{dP}{J(Q)}. \tag{28}$$
Using (27), we can write the full entropy (1) in terms of an integral in the new phase space variables P,Q simply as
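$$S = -k_B \int dP\, dQ\; \tilde\rho(P,Q) \ln\left[h^s \tilde\rho(P,Q)\right], \tag{29}$$
where
$$\tilde\rho(P,Q) = \rho\big(p(P,Q), q(Q)\big) = \frac{e^{-\beta\left[K(P,Q) + U(q(Q))\right]}}{h^s Z} \tag{30}$$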
is the pdf of the new canonical variables P,Q. Note that because in general p = p(P,Q), the kinetic energy K is now a function of not only the momenta P, but also of the non-Cartesian coordinates Q: K = K(P,Q). Like any joint pdf, (30) may be factorized by means of the product rule as
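$$\tilde\rho(P,Q) = \tilde\rho_s(Q)\, \tilde\rho_m(P|Q), \tag{31}$$
where
$$\tilde\rho_s(Q) = \int dP\; \tilde\rho(P,Q) \tag{32}$$
is the marginal pdf of the coordinates Q, and
$$\tilde\rho_m(P|Q) = \frac{\tilde\rho(P,Q)}{\tilde\rho_s(Q)} \tag{33}$$
is the conditional pdf of P given Q. Using (30), (28) and (10), it can be verified that the marginal pdf (32) equals the spatial probability density (18) that was introduced in Section 3 under the same symbol:
$$\begin{aligned} \tilde\rho_s(Q) &= \frac{e^{-\beta U(q(Q))}}{h^s Z} \int dP\; e^{-\beta K(P,Q)} \\ &= \frac{e^{-\beta U(q(Q))}}{h^s Z}\, J(Q) \int dp\; e^{-\beta K(p)} \\ &= J(Q)\, \frac{Z_m}{Z}\, e^{-\beta U(q(Q))}. \end{aligned} \tag{34}$$
Here, the transformation dP = J(Q) dp was performed, which reverts the kinetic energy K(P,Q) to being a function of the Cartesian momenta p only, K = K(p), since this transformation is the inverse of that which made the kinetic energy K(p) a function of both P and Q. (The transformation P → p = p(P,Q) has an inverse P = P(p,Q) that is such that K(P(p,Q), Q) is in fact a function of p only.) Since Zm/Z = 1/Zs [see Equation (7)], the last line of (34) indeed yields the spatial pdf (18).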
Using (31) and the fact that, like any conditional pdf, the distribution (33) is properly normalized (i.e., ∫dP ρ̃m(P|Q) = 1 at any Q), the full entropy (29) can be written as
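$$S = k_B\left(\langle H_{P|Q} \rangle + H_Q\right), \tag{35}$$
where
$$\langle H_{P|Q} \rangle = \int dQ\; \tilde\rho_s(Q)\, H_{P|Q}(Q) \tag{36}$$
is the mean (expectation) value with respect to Q of the Shannon entropy of the conditional pdf ρ̃m(P|Q) (with the factor $h^s$ included in the logarithm, as in (13)),
$$H_{P|Q}(Q) = -\int dP\; \tilde\rho_m(P|Q) \ln\left[h^s \tilde\rho_m(P|Q)\right], \tag{37}$$
and HQ is the Shannon entropy (21) of ρ̃s(Q). In analogy with (12), Equation (35) in turn can be written as
$$S = \tilde S_m + \tilde S_s, \tag{38}$$
where
$$\tilde S_m = k_B \langle H_{P|Q} \rangle \tag{39}$$
may be termed the momentum entropy associated with the new momenta P, and
$$\tilde S_s = k_B H_Q \tag{40}$$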
is the spatial entropy in the new spatial coordinates Q. Equations (12) and (38), together with (23), straightforwardly yield the transformation of the momentum entropy:
$$\tilde S_m = S_m + k_B \langle \ln J(Q) \rangle. \tag{41}$$
This relation complements the transformation (23) of the spatial entropy so that, on a point transformation q → Q = Q(q), a change in the spatial entropy is compensated by a change in the momentum entropy and the full entropy remains invariant: Sm + Ss = S̃m + S̃s.
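The compensation can be made concrete for a single particle of mass m moving in a plane. In polar coordinates the kinetic energy is K = p_r²/(2m) + p_φ²/(2mr²), so the conditional pdf of (p_r, p_φ) at fixed r is Gaussian with variances m kBT and m r² kBT. The sketch below assumes this two-dimensional model and reduced units (neither is from the original text); the common factor involving h is omitted because it cancels in the difference.

```python
import math


def gaussian_entropy(variances):
    """Differential entropy of independent Gaussians: sum of 0.5 ln(2 pi e var)."""
    return sum(0.5 * math.log(2.0 * math.pi * math.e * v) for v in variances)


def cartesian_momentum_entropy(mass, kT):
    """Entropy of (p_x, p_y), each with variance m kT."""
    return gaussian_entropy([mass * kT, mass * kT])


def polar_conditional_momentum_entropy(r, mass, kT):
    """Entropy of (p_r, p_phi) at fixed r: variances m kT and m r^2 kT."""
    return gaussian_entropy([mass * kT, mass * r * r * kT])


def momentum_entropy_shift(r_samples, mass, kT):
    """Average of the conditional entropy over r samples, minus the Cartesian value."""
    mean_conditional = sum(
        polar_conditional_momentum_entropy(r, mass, kT) for r in r_samples
    ) / len(r_samples)
    return mean_conditional - cartesian_momentum_entropy(mass, kT)


# Hypothetical radial samples standing in for draws from the spatial pdf.
r_samples = [0.5, 1.0, 2.0, 4.0]
shift = momentum_entropy_shift(r_samples, mass=1.0, kT=1.0)
mean_ln_J = sum(math.log(r) for r in r_samples) / len(r_samples)
```

For these samples `shift` equals `mean_ln_J`, so the momentum entropy gains exactly the kB⟨ln J⟩ that the spatial entropy loses on passing to polar coordinates.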
5. Total Entropy in Terms of the Spatial Entropy in Non-Cartesian Coordinates
It is now easy to show how the change in the spatial entropy of a molecular system, calculated using a non-Cartesian system of coordinates, can be corrected to provide the corresponding change in the full thermodynamic entropy of the system, which is the quantity of physical interest. We consider the entropy change arising from some physicochemical process which changes the phase-space pdf of the system from ρ to ρ′. The molecular species of interest is considered to be present at a standard concentration C∘, which corresponds to an isolated molecule in a container of volume V∘ = 1/C∘ [9]. For an isothermal process, and with Cartesian coordinates, the momentum pdf is unchanged, so that there is no change in the momentum entropy, Sm. The change ΔS of the full thermodynamic entropy therefore equals the change in the Cartesian spatial entropy Ss:
$$\Delta S = \Delta S_s. \tag{42}$$
If non-Cartesian coordinates Q are used, as in many methods for calculating the spatial entropy, a change ΔS̃s in the entropy of the coordinates Q obtained in this way can be corrected to the change ΔSs in the Cartesian spatial entropy Ss using (23):
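$$\Delta S_s = \Delta \tilde S_s + k_B\, \Delta \langle \ln J(Q) \rangle. \tag{43}$$
Here, the correction term
$$\Delta \langle \ln J(Q) \rangle = \langle \ln J(Q) \rangle' - \langle \ln J(Q) \rangle \tag{44}$$
is the difference between the means, evaluated with the changed and original spatial distributions ρ̃′s(Q) and ρ̃s(Q), respectively, of the logarithm of the Jacobian of the transformation q → Q = Q(q). If molecular simulations are used for the evaluation of spatial entropy, then 〈ln J(Q)〉 is evaluated easily as a simple arithmetic mean,
$$\langle \ln J(Q) \rangle \approx \frac{1}{n} \sum_{i=1}^{n} \ln J(Q_i), \tag{45}$$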
where Qi, i = 1, …, n is a sample of the coordinates obtained from snapshots of the simulation trajectory.
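The sample mean of ln J and the resulting correction term can be sketched as follows. The function names, the polar-coordinate Jacobian J = r and the snapshot values are hypothetical and serve only to illustrate the arithmetic; in practice the samples would come from the two simulation trajectories.

```python
import math


def mean_log_jacobian(samples, jacobian):
    """Arithmetic mean of ln J(Q_i) over simulation snapshots Q_i."""
    return sum(math.log(jacobian(Q)) for Q in samples) / len(samples)


def jacobian_correction(samples_before, samples_after, jacobian):
    """Difference of <ln J> between the changed and the original ensembles."""
    return mean_log_jacobian(samples_after, jacobian) - mean_log_jacobian(
        samples_before, jacobian
    )


def polar_jacobian(Q):
    """Jacobian of (x, y) -> (r, phi); Q is the tuple (r, phi)."""
    r, _phi = Q
    return r


# Hypothetical snapshots (r, phi) before and after the process.
before = [(1.0, 0.1), (math.e, 2.0)]
after = [(math.e**2, 0.3), (math.e**2, 1.7)]

correction = jacobian_correction(before, after, polar_jacobian)
# Delta S_s / k_B is then Delta S~_s / k_B + correction.
```

With these made-up snapshots the correction is 2 − 0.5 = 1.5 in units of kB.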
We now consider the specific case where Q represents bond-angle-torsion (BAT) coordinates. For N atoms, Q comprises 3 external translational coordinates rex = (xex, yex, zex), 3 external rotational coordinates θex, ϕex, ψex, and 3N − 6 internal coordinates of N − 1 bond lengths b = (b2, …, bN), N − 2 bond angles θ = (θ3, …, θN), and N − 3 torsional angles ϕ = (ϕ4, …, ϕN), where the subscripts indicate the atoms to which the internal coordinates correspond. The Jacobian for this transformation is given as [4]
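$$J_{BAT} = \sin\theta_{ex}\; J(b, \theta), \qquad J(b, \theta) = \prod_{i=2}^{N} b_i^2 \prod_{i=3}^{N} \sin\theta_i, \tag{46}$$
and the spatial pdf in Cartesian coordinates q = x, ρs(x), is thus transformed into the following pdf in BAT coordinates:
$$\tilde\rho_s(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}, b, \theta, \phi) = \sin\theta_{ex}\; J(b, \theta)\, \frac{e^{-\beta U(b, \theta, \phi)}}{Z_s}, \tag{47}$$
$$Z_s = \int dx\; e^{-\beta U(x)} = 8\pi^2 V^\circ \int db\, d\theta\, d\phi\; J(b, \theta)\, e^{-\beta U(b, \theta, \phi)}, \tag{48}$$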
where we have used expression (9) for ρs, along with the fact that the potential energy U(x) of a molecule of N atoms that is not located in an external field depends only on its 3N − 6 internal BAT coordinates (b, θ, ϕ). The right-hand side of (47) may be written as a product of two factors, one depending on only the external coordinate θex and the other on only the internal coordinates (b, θ, ϕ). Therefore, the joint pdf (47) can be factorized as
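$$\tilde\rho_s(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}, b, \theta, \phi) = \tilde\rho_{ex}(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex})\; \tilde\rho_{in}(b, \theta, \phi), \tag{49}$$
where
$$\tilde\rho_{ex}(r_{ex}, \theta_{ex}, \phi_{ex}, \psi_{ex}) = \frac{\sin\theta_{ex}}{8\pi^2 V^\circ} \tag{50}$$
and
$$\tilde\rho_{in}(b, \theta, \phi) = \frac{8\pi^2 V^\circ\, J(b, \theta)\, e^{-\beta U(b, \theta, \phi)}}{Z_s}. \tag{51}$$
Here, ρ̃ex, the marginal pdf of the external coordinates, is clearly normalized because
$$\int_{V^\circ} dr_{ex} \int_0^{\pi} d\theta_{ex} \sin\theta_{ex} \int_0^{2\pi} d\phi_{ex} \int_0^{2\pi} d\psi_{ex} = 8\pi^2 V^\circ, \tag{52}$$
and ρ̃in(b, θ, ϕ), the marginal pdf of the internal BAT coordinates, is normalized based on the definition of Zs [Equation (48)].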
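For use with sampled conformations, the logarithm of the internal part of the BAT Jacobian can be evaluated directly from the bond lengths and bond angles. The sketch below assumes the commonly quoted form J(b, θ) = ∏ b_i² ∏ sin θ_i (bond lengths b2…bN, bond angles θ3…θN), which should be checked against [4]; the function name and the numerical values are illustrative.

```python
import math


def log_bat_jacobian(bonds, angles):
    """ln J(b, theta) = sum_i 2 ln b_i + sum_i ln sin(theta_i).

    bonds  : the N - 1 bond lengths b_2 ... b_N
    angles : the N - 2 bond angles theta_3 ... theta_N, in radians
    Assumed form of the internal BAT Jacobian; torsions do not enter.
    """
    return sum(2.0 * math.log(b) for b in bonds) + sum(
        math.log(math.sin(t)) for t in angles
    )


# A three-atom fragment: two bonds of 1.5 (arbitrary length units)
# and one bond angle of 109.5 degrees.
ln_J = log_bat_jacobian([1.5, 1.5], [math.radians(109.5)])

# Stretching a soft pseudo-bond from 4.0 to 8.0 changes ln J by 2 ln 2,
# which is why the Jacobian cannot be treated as constant in that case.
shift = log_bat_jacobian([8.0], []) - log_bat_jacobian([4.0], [])
```

Working with ln J rather than J avoids overflow or underflow of the product for large N and yields directly the quantity averaged in Equation (45).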
The entropy of the joint pdf (49) now separates as
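$$\tilde S_s = \tilde S_{ex} + \tilde S_{in}, \tag{53}$$
where S̃ex and S̃in are the entropies of the marginal pdf’s ρ̃ex and ρ̃in, respectively. Using expression (50) for ρ̃ex, the external entropy S̃ex can be written as
$$\tilde S_{ex} = k_B\left[\ln(8\pi^2 V^\circ) - \langle \ln(\sin\theta_{ex}) \rangle\right], \tag{54}$$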
where 〈ln(sin θex)〉 is the mean of ln(sin θex) for a uniform distribution of molecular orientations. Finally, using Equations (23), (46), (53) and (54), we have for the spatial entropy in Cartesian coordinates, Ss:
$$S_s = k_B \ln(8\pi^2 V^\circ) + \tilde S_{in} + k_B \langle \ln J(b, \theta) \rangle. \tag{55}$$
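The orientational average entering the external entropy has the closed form 〈ln(sin θex)〉 = ln 2 − 1 for a uniform distribution of orientations, whose polar-angle density is ½ sin θex. A short numerical check is sketched below; the midpoint integration and the value V° ≈ 1660.5 Å³ (the volume per molecule at 1 mol/L) are illustrative choices rather than part of the original derivation.

```python
import math


def mean_ln_sin_theta(n=100_000):
    """<ln sin(theta)> for the polar-angle density 0.5 * sin(theta) on [0, pi]."""
    d_theta = math.pi / n
    total = 0.0
    for k in range(n):
        s = math.sin((k + 0.5) * d_theta)  # midpoint rule avoids sin = 0
        total += 0.5 * s * math.log(s) * d_theta
    return total


def external_entropy_over_kB(volume):
    """S~_ex / k_B = ln(8 pi^2 V) - <ln sin(theta_ex)>, using the closed form."""
    return math.log(8.0 * math.pi**2 * volume) - (math.log(2.0) - 1.0)


numeric = mean_ln_sin_theta()
analytic = math.log(2.0) - 1.0

# Adding back the external Jacobian term leaves only ln(8 pi^2 V),
# so the sin(theta_ex) factor drops out of the Cartesian spatial entropy.
V_STANDARD = 1660.5  # cubic angstroms per molecule at 1 mol/L (assumed standard state)
external_total = external_entropy_over_kB(V_STANDARD) + analytic
```

The numerical and analytic averages agree, and `external_total` reduces to ln(8π²V°), which is the only external contribution that survives in the Cartesian spatial entropy.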
In previous work [10], Ss has been written in terms of the BAT coordinates (b, θ, ϕ) as
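$$S_s = k_B \ln(8\pi^2 V^\circ) - k_B \int db\, d\theta\, d\phi\; J(b, \theta)\, \rho(b, \theta, \phi) \ln \rho(b, \theta, \phi), \tag{56}$$
where ρ(b, θ, ϕ) is a distribution function of the internal BAT coordinates that becomes the normalized marginal pdf ρ̃in(b, θ, ϕ) on multiplication by J(b, θ),
$$\tilde\rho_{in}(b, \theta, \phi) = J(b, \theta)\, \rho(b, \theta, \phi). \tag{57}$$
Expression (56) is entirely consistent with the presented formalism, as it can be rewritten in terms of the properly normalized marginal pdf ρ̃in(b, θ, ϕ) and J(b, θ) to give
$$S_s = k_B \ln(8\pi^2 V^\circ) - k_B \int db\, d\theta\, d\phi\; \tilde\rho_{in}(b, \theta, \phi) \ln \frac{\tilde\rho_{in}(b, \theta, \phi)}{J(b, \theta)} = k_B \ln(8\pi^2 V^\circ) + \tilde S_{in} + k_B \langle \ln J(b, \theta) \rangle, \tag{58}$$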
which is identical with (55). The results of [10] and other work that adopted a similar evaluation of Ss as that in (56) are thus not affected by our findings.
The approaches outlined above for evaluating the Cartesian spatial entropy in terms of BAT coordinates avoid potential shortcomings that may arise from approximating J(b, θ) as a constant equal to its value at the equilibrium values b = b0 and θ = θ0 of the internal BAT coordinates. In particular, although most of these coordinates are “hard” and therefore make nearly constant contributions to the Jacobian, this is by no means the case for the pseudo-bonds and pseudo-angles often used to define the position and orientation of one molecule relative to another in a noncovalent complex [11].
One circumstance in which the Jacobian can be approximated as constant is that in which the molecule occupies only a single, reasonably narrow energy well with its local minimum at internal coordinates b0, θ0, ϕ0. One may then make the approximation J(b, θ) ≈ J(b0, θ0), in which case the pdf of Equation (51) simplifies to
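$$\tilde\rho_{in}(b, \theta, \phi) \approx \frac{e^{-\beta U(b, \theta, \phi)}}{\int db\, d\theta\, d\phi\; e^{-\beta U(b, \theta, \phi)}}. \tag{59}$$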
If, as a further approximation, the harmonic approximation is used for the potential energy U(b, θ, ϕ), this pdf becomes a multivariate normal (Gaussian) distribution, the entropy of which can be evaluated in closed form, yielding for the entropy of internal coordinates the following estimate:
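$$\tilde S_{in} \approx \frac{k_B}{2} \ln \frac{(2\pi e\, k_B T)^{3N-6}}{\det F(b_0, \theta_0, \phi_0)}, \tag{60}$$
where F(b0, θ0, ϕ0) is the Hessian matrix of the harmonic potential energy at the energy minimum. The widely used quasiharmonic approximation for estimation of the configurational entropy from molecular simulations was based in its original formulation [12] on the assumption that F ≈ Σ⁻¹/β, where Σ is the covariance matrix of a simulation sample of internal coordinates. Using now Equations (55) and (60), the spatial entropy in Cartesian coordinates is obtained as
$$S_s \approx k_B \ln(8\pi^2 V^\circ) + k_B \ln J(b_0, \theta_0) + \frac{k_B}{2} \ln \frac{(2\pi e\, k_B T)^{3N-6}}{\det F(b_0, \theta_0, \phi_0)}. \tag{61}$$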
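With the quasiharmonic assumption F ≈ Σ⁻¹/β, the Gaussian entropy of the internal coordinates depends only on the covariance matrix: S̃in/kB ≈ ½ ln[(2πe)ⁿ det Σ] with n = 3N − 6. The following sketch evaluates this expression with a hand-written Cholesky factorization so that it needs only the standard library; the function names and the small covariance matrices are made up for illustration.

```python
import math


def log_det_spd(matrix):
    """ln det of a symmetric positive-definite matrix via Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
                log_det += math.log(pivot)  # ln of the squared diagonal element
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return log_det


def quasiharmonic_entropy_over_kB(covariance):
    """Gaussian entropy 0.5 * ln[(2 pi e)^n det(Sigma)] in units of k_B."""
    n = len(covariance)
    return 0.5 * (n * math.log(2.0 * math.pi * math.e) + log_det_spd(covariance))


# Hypothetical 2x2 covariance matrices of two internal coordinates.
uncorrelated = [[4.0, 0.0], [0.0, 9.0]]
correlated = [[4.0, 3.0], [3.0, 9.0]]

s_uncorrelated = quasiharmonic_entropy_over_kB(uncorrelated)
s_correlated = quasiharmonic_entropy_over_kB(correlated)
```

Correlation lowers the determinant (36 versus 27 here) and hence the entropy estimate. Adding kB ln(8π²V°) and kB ln J(b0, θ0) to this estimate then gives the approximate Cartesian spatial entropy.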
With these approximations, an isothermal process that changes the conformation of the system produces a change ΔSs in the spatial entropy Ss given by
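$$\Delta S_s \approx k_B \ln \frac{J(b_0', \theta_0')}{J(b_0, \theta_0)} + \frac{k_B}{2} \ln \frac{\det F(b_0, \theta_0, \phi_0)}{\det F'(b_0', \theta_0', \phi_0')}, \tag{62}$$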
where the primed quantities pertain to the changed system. Clearly, the Jacobian-dependent term may be neglected only when the equilibrium Jacobians of the two conformations are approximately the same. As noted above, this approximation holds to good accuracy when b0′ ≈ b0 and θ0′ ≈ θ0.
Another perspective on this condition is also of interest. We first note that $J_{BAT} = (\det \mathcal{G})^{-1/2} \prod_{i=1}^{N} m_i^{-3/2}$, where 𝒢 is the total kinetic-energy matrix in the BAT coordinates, and mi are the masses of the atoms; 𝒢, like JBAT, is a function of the BAT coordinates. This expression follows from the fact that $\mathcal{G} = \mathcal{B}\mathcal{M}^{-1}\mathcal{B}^{T}$, where ℳ = diag(m1, m1, m1, …, mN, mN, mN) and ℬ is a 3N × 3N matrix whose elements are the partial derivatives ∂Qi/∂xj of all the BAT coordinates with respect to the Cartesian coordinates, so that $\det \mathcal{B}^{-1} = J_{BAT}$ (see, e.g., [14]). Using this identity, along with, e.g., Equations (2.34) and (3.5) of [13], one may show that
(63) |
where I(b, θ, ϕ) is the matrix of the instantaneous inertia tensor and G(b, θ, ϕ) is the kinetic-energy matrix of the 3N − 6 internal degrees of freedom. Note that det I = I1I2I3, where Ii are the principal moments of inertia. The Jacobian term in (62) therefore may be written as
(64) |
6. Conclusions
We have addressed the situation in which one wishes to compute ΔS for a classically treated, isothermal physical process where the phase-space probability distribution ρ(p, q) goes to ρ′(p, q). In the biophysical context, this might be a binding or folding process. If (p, q) are Cartesian, then we can factorize ρ(p, q) as ρm(p)ρs(q) and accordingly decompose the entropy S into Ss + Sm. The momentum entropy, Sm, is not affected by the physical process, so ΔS = ΔSs. (If the temperature T changes, then there is a contribution from Sm as well, which can be computed analytically.) In practical applications, it is often preferable to compute spatial entropy in non-Cartesian coordinates, Q, but questions arise regarding the correct way to treat the entropy under a coordinate transformation because the differential Shannon entropy of information theory, which has the same mathematical form as the thermodynamic entropy, is not invariant under a change of coordinates. This lack of invariance appears problematic, because a simple change of coordinates must not affect the change in the entropy computed for a physical process.
This paradox is reconciled when one recognizes that the thermodynamic spatial entropy does in fact transform in the same manner as the differential Shannon entropy, but that the change in the transformed spatial entropy, Δ S̃s, is not in general equal to the change in total entropy, ΔS. The reason is that, in the new coordinates, unlike in Cartesians, the physical process also produces a change Δ S̃m in the entropy associated with the conjugate momenta, where S̃m is defined as an average of the momentum entropy over all values of the spatial coordinates. This change in the transformed momentum entropy precisely cancels the change in the spatial entropy associated with the transformation of coordinates, so that the change in total entropy due to the physical process is invariant under the transformation of coordinates.
The present analysis furthermore has provided useful expressions for the total entropy change for an isothermal physical process in terms of the spatial entropy in any set of spatial coordinates.
Acknowledgments
The authors thank Dan S. Sharp for many helpful discussions, and the anonymous referees for their insightful comments. This research was supported by grant no. 212-2006-M-16936 from the National Institute for Occupational Safety and Health to MKG and grant no. GM61300 from the National Institute of General Medical Sciences to MKG. The findings of this report are solely those of the authors and do not necessarily represent the views of the National Institute for Occupational Safety and Health or the National Institutes of Health.
References
- 1. Campbell J. Grammatical Man; Information, Entropy, Language & Life. New York, NY, USA: Simon and Schuster; 1982.
- 2. Cover TM, Thomas JA. Elements of Information Theory. New York, NY, USA: Wiley; 1991.
- 3. White H. The entropy of a continuous distribution. Bull. Math. Biophys. 1965;27:135–143. doi: 10.1007/BF02477270.
- 4. Gō N, Scheraga HA. On the use of classical statistical mechanics in the treatment of polymer chain conformation. Macromolecules. 1976;9:535–542.
- 5. Landau LD, Lifshitz EM. Statistical Physics, Part 1. Oxford, UK: Butterworth-Heinemann; 1989.
- 6. Landau LD, Lifshitz EM. Quantum Mechanics. Oxford, UK: Butterworth-Heinemann; 1981.
- 7. Edholm O, Berendsen HJC. Entropy estimation from simulations of non-diffusive systems. Mol. Phys. 1984;51:1011–1028.
- 8. Landau LD, Lifshitz EM. Mechanics. Oxford, UK: Butterworth-Heinemann; 1976.
- 9. Gilson MK, Given JA, Busch B, McCammon JA. The statistical-thermodynamic basis for computation of binding affinities: A critical review. Biophys. J. 1997;72:1047–1069. doi: 10.1016/S0006-3495(97)78756-3.
- 10. Killian BJ, Kravitz JY, Gilson MK. Extraction of configurational entropy from molecular simulations via an expansion approximation. J. Chem. Phys. 2007;127:024107. doi: 10.1063/1.2746329.
- 11. Killian BJ, Kravitz JY, Somani S, Dasgupta P, Pang YP, Gilson MK. Configurational entropy in protein-protein binding: Computational study of Tsg101 ubiquitin E2 variant domain with an HIV-derived PTAP nonapeptide. J. Mol. Biol. 2009;389:315–335. doi: 10.1016/j.jmb.2009.04.003.
- 12. Karplus M, Kushick JN. Method for estimating the configurational entropy of macromolecules. Macromolecules. 1981;14:325–332.
- 13. Meyer R, Günthard HH. General internal motion of molecules, classical and quantum-mechanical Hamiltonian. J. Chem. Phys. 1969;49:1510–1520.
- 14. Turrell G. Mathematics for Chemistry and Physics. London, UK: Academic Press; 2002.