Abstract
We present an iteration-free weighted histogram method in terms of intensive variables that directly determines the inverse statistical temperature, βS = ∂S/∂E, with S the microcanonical entropy. The method eliminates iterative evaluations of the partition functions intrinsic to the conventional approach and leads to a dramatic acceleration of the posterior analysis of combining statistically independent simulations with no loss in accuracy. The synergistic combination of the method with generalized ensemble weights provides insights into the nature of the underlying phase transitions via signatures in βS characteristic of finite size systems. The versatility and accuracy of the method are illustrated for the Ising and Potts models.
The weighted histogram analysis method (WHAM) or multiple histogram method1 is a powerful technique for combining multiple independent Monte Carlo (MC) or molecular dynamics simulations to consistently calculate thermodynamic properties. Enhanced sampling methods greatly benefit from WHAM, improving the precision of the density of states,2 free energy differences,3 and potentials of mean force along reaction coordinates.4, 5, 6, 7
The central quantity in the original formulation of WHAM (Ref. 1) is the density of states Ω(E) or the microcanonical entropy S(E) = kB ln Ω(E) (kB = 1). In this approach, M independent simulations performed with the sampling weights Wα(E) = exp[−wα(E)] (α = 1, …, M), with wα the effective potential, are combined to determine the optimal estimate for Ω as
$$\Omega(E) = \frac{H(E)}{\sum_{\alpha} N_\alpha W_\alpha(E)/Z_\alpha}, \tag{1}$$
where H(E) = ∑α Hα(E), Hα(E) = NαPα(E), and Nα and Pα are the number of samples and the normalized distribution in run α, respectively. The unknown relative partition function Zα in Eq. 1 is determined self-consistently by solving Zα = ∑E Ω(E)Wα(E). The direct iteration method for Zα is commonly used, with a convergence criterion on the change in Zα between successive iterations, where δZ is a threshold value and k is the iteration step.2 However, the convergence often becomes slow with increasing M, requiring thousands of iterations.8
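For orientation, the direct iteration described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the paper: the function name `wham_direct_iteration`, the log-space arithmetic, the gauge choice ln Z1 = 0, and the stopping rule max|Δ ln Zα| < δZ are all assumptions, since the text does not fix these details.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    out = a_max + np.log(np.sum(np.exp(a - a_max), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def wham_direct_iteration(H_runs, w_runs, delta_z=1e-10, max_iter=100_000):
    """Iterate Eq. 1 and Z_alpha = sum_E Omega(E) W_alpha(E) to self-consistency.

    H_runs : (M, K) array, histogram H_alpha(E_k) of each run
    w_runs : (M, K) array, effective potential w_alpha(E_k), W_alpha = exp(-w_alpha)
    Returns (ln_omega, ln_Z, number_of_iterations).
    """
    H_runs = np.asarray(H_runs, dtype=float)
    w_runs = np.asarray(w_runs, dtype=float)
    ln_N = np.log(H_runs.sum(axis=1))            # ln N_alpha
    with np.errstate(divide="ignore"):
        ln_H = np.log(H_runs.sum(axis=0))        # -inf for unvisited bins
    ln_Z = np.zeros(H_runs.shape[0])             # initial guess Z_alpha = 1
    for k in range(1, max_iter + 1):
        # ln of the denominator of Eq. 1: sum_alpha N_alpha W_alpha / Z_alpha
        ln_den = _logsumexp(ln_N[:, None] - w_runs - ln_Z[:, None], axis=0)
        ln_omega = ln_H - ln_den
        # Update Z_alpha = sum_E Omega(E) W_alpha(E)
        ln_Z_new = _logsumexp(ln_omega[None, :] - w_runs, axis=1)
        ln_Z_new -= ln_Z_new[0]                  # assumed gauge: ln Z_1 = 0
        converged = np.max(np.abs(ln_Z_new - ln_Z)) < delta_z
        ln_Z = ln_Z_new
        if converged:
            break
    return ln_omega, ln_Z, k
```

Each pass costs O(MK) operations for K energy bins, and the number of passes grows as the overlap between neighboring histograms shrinks, which is the cost that ST-WHAM avoids.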
In this paper, we propose an iteration-free, statistical temperature weighted histogram analysis method (ST-WHAM). While conventional WHAM is formulated entirely in terms of extensive quantities {S; Hα, Wα}, ST-WHAM is expressed in terms of the corresponding derivatives. The goal is to directly determine the inverse statistical temperature βS = ∂S/∂E as a weighted superposition of the individual statistical temperature estimates from each run, with no undetermined parameters. Eliminating the need to determine Zα leads to a substantial acceleration of the posterior analysis of merging independent runs without a loss in accuracy. The determination of βS yields S, and reveals valuable information characteristic of phase transitions in finite size systems.9
Basic formulation: We proceed by first converting Eq. 1 into a weighted average of the individual density of states estimates, Ωα = Hα/Πα, Πα = NαWα/Zα,
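$$\Omega = \sum_{\alpha} \tilde f_\alpha \Omega_\alpha, \tag{2}$$
where $\tilde f_\alpha$ represents the energy-dependent, normalized weight, Πα/∑αΠα. Multiplying the numerator and denominator by Ω further identifies $\tilde f_\alpha = \tilde H_\alpha / \sum_\alpha \tilde H_\alpha$, where the histogram is reweighted by $\tilde H_\alpha = \Omega \Pi_\alpha = N_\alpha \Omega W_\alpha / Z_\alpha$. The reweighted $\tilde H_\alpha$ is not necessarily identical to the simulated Hα even though $\sum_\alpha \tilde H_\alpha = \sum_\alpha H_\alpha$ and $\sum_E \tilde H_\alpha = \sum_E H_\alpha = N_\alpha$.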
We take the logarithm of both sides of Eq. 2 and differentiate with respect to E to express
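$$\beta_S = \frac{\partial \ln \Omega}{\partial E} = \sum_{\alpha} f^*_\alpha \frac{\partial \ln \Omega_\alpha}{\partial E} + \sum_{\alpha} f^*_\alpha \frac{\partial \ln \tilde f_\alpha}{\partial E}, \tag{3}$$
where $f^*_\alpha = H_\alpha/H$ is the simulated histogram fraction. Throughout the paper the “*” symbol denotes ST-WHAM estimates.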
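The step from Eq. 2 to Eq. 3 can be made explicit. The short derivation below is a reconstruction from the definitions given above (Ωα = Hα/Πα, the normalized weight Πα/∑αΠα, and Ω = H/∑αΠα from Eq. 1); the tilde and star notation is assumed here for illustration.

```latex
\begin{aligned}
\frac{\partial \ln \Omega}{\partial E}
  &= \frac{1}{\Omega} \sum_{\alpha}
     \left( \tilde f_\alpha \frac{\partial \Omega_\alpha}{\partial E}
          + \Omega_\alpha \frac{\partial \tilde f_\alpha}{\partial E} \right)
   = \sum_{\alpha} \frac{\tilde f_\alpha \Omega_\alpha}{\Omega}
     \left( \frac{\partial \ln \Omega_\alpha}{\partial E}
          + \frac{\partial \ln \tilde f_\alpha}{\partial E} \right), \\
\frac{\tilde f_\alpha \Omega_\alpha}{\Omega}
  &= \frac{\Pi_\alpha}{\sum_{\alpha'} \Pi_{\alpha'}}
     \cdot \frac{H_\alpha}{\Pi_\alpha}
     \cdot \frac{\sum_{\alpha'} \Pi_{\alpha'}}{H}
   = \frac{H_\alpha}{H} = f^*_\alpha .
\end{aligned}
```

The unknown Zα cancels in the prefactor, so the weights multiplying each derivative are fixed by the simulated histograms alone.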
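The first term in Eq. 3 is a weighted superposition of each individual statistical temperature estimate, $\beta^\alpha_S = \partial \ln \Omega_\alpha/\partial E$, yielding the ST-WHAM estimate for βS, as
$$\beta^*_S = \sum_{\alpha} f^*_\alpha \beta^\alpha_S = \sum_{\alpha} f^*_\alpha \left( \beta^\alpha_H + \beta^\alpha_W \right), \tag{4}$$
where $\beta^\alpha_S = \beta^\alpha_H + \beta^\alpha_W$, $\beta^\alpha_H = \partial \ln H_\alpha/\partial E$, and $\beta^\alpha_W = \partial w_\alpha/\partial E$, with $\beta^\alpha_W$ the weight-dependent, inverse effective temperature.10 The key observation is that with no undetermined parameters, Eq. 4 uniquely determines $\beta^*_S$ by weighting the known, intensive estimates, $\beta^\alpha_H$ and $\beta^\alpha_W$, in proportion to the number of samples in the corresponding histograms at energy E. In contrast, smoothly joining the extensive quantity Ωα requires the determination of Zα, even though $\tilde f_\alpha$ is replaced by $f^*_\alpha$ in Eq. 2.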
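A minimal numerical sketch of Eq. 4 follows. It assumes a uniform energy grid and a central finite difference for ∂ln Hα/∂E, and it skips bins where a run has no samples in a neighboring bin; the function name `st_wham_beta` and these discretization choices are illustrative assumptions, not the scheme used in the paper.

```python
import numpy as np

def st_wham_beta(E, H_runs, beta_w_runs):
    """Histogram-weighted inverse statistical temperature on interior grid points.

    E           : (K,) uniform energy grid
    H_runs      : (M, K) histograms H_alpha(E_k)
    beta_w_runs : (M, K) values of d w_alpha / dE on the grid
    Returns (E_interior, beta_star); beta_star is NaN where no run contributes.
    """
    E = np.asarray(E, dtype=float)
    H_runs = np.asarray(H_runs, dtype=float)
    beta_w_runs = np.asarray(beta_w_runs, dtype=float)
    dE = E[1] - E[0]                      # assumes equally spaced bins
    n_mid = E.size - 2
    num = np.zeros(n_mid)                 # sum_alpha H_alpha (beta_H + beta_W)
    den = np.zeros(n_mid)                 # sum_alpha H_alpha
    for H_a, bw_a in zip(H_runs, beta_w_runs):
        left, mid, right = H_a[:-2], H_a[1:-1], H_a[2:]
        ok = (left > 0) & (mid > 0) & (right > 0)
        beta_h = np.zeros(n_mid)
        # Central difference of ln H_alpha, only where all three bins are populated
        beta_h[ok] = (np.log(right[ok]) - np.log(left[ok])) / (2.0 * dE)
        num += np.where(ok, mid * (beta_h + bw_a[1:-1]), 0.0)
        den += np.where(ok, mid, 0.0)
    beta_star = np.full(n_mid, np.nan)
    good = den > 0
    beta_star[good] = num[good] / den[good]
    return E[1:-1], beta_star
```

For canonical runs, wα = E/Tα, so `beta_w_runs` is simply the constant 1/Tα in each row. No partition function appears anywhere in the calculation, and the cost is a single pass over the M histograms.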
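The second term in Eq. 3 is the difference between the WHAM and ST-WHAM estimates, and after substituting $\tilde f_\alpha = \tilde H_\alpha/H$, with $\tilde H_\alpha$ the reweighted histogram, reduces to
$$\delta\beta_S = \sum_{\alpha} f^*_\alpha \left( \tilde\beta^\alpha_H - \beta^\alpha_H \right), \qquad \tilde\beta^\alpha_H = \frac{\partial \ln \tilde H_\alpha}{\partial E}, \quad \beta^\alpha_H = \frac{\partial \ln H_\alpha}{\partial E}, \tag{5}$$
yielding $\beta_S = \beta^*_S + \delta\beta_S$. Note that replacing $\tilde H_\alpha$ by Hα leads to $\delta\beta_S = 0$ and $\beta_S = \beta^*_S$. As Nα increases, both $\tilde H_\alpha$ and Hα rapidly converge to the exact result $H^{\rm ex}_\alpha$, where “ex” denotes exact values. Hence, the accuracy of both methods is similar with δβS ≃ 0 for Nα ≫ 1, which we will demonstrate for the Ising model.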
Once $\beta^*_S$ is determined via Eq. 4, we can compute the corresponding entropy estimate
| (6) |
where , . Directly integrating Eq. 6 is not desirable due to the rapid variation of βS for small E. We approximate the statistical temperature on an equally spaced energy grid Ej = G(E/Δ)Δ, where Δ is the bin size and G(x) returns the nearest integer to x. Hence, for E ∈ [Ej, Ej + 1], with and . This approximation allows an analytical integration and gives a mapping from to
| (7) |
where , and imax = i − 1 if E ∈ [Ei − Δ/2, Ei], and imax = i if E ∈ [Ei, Ei + Δ/2].
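The idea of an analytical integration on an energy grid can be illustrated as follows. The sketch assumes that the statistical temperature T = 1/βS is interpolated linearly between neighboring grid points and is positive throughout; both are assumptions made for this example. It returns the entropy only at the grid points, relative to the first one, and does not reproduce the half-bin bookkeeping of Eq. 7. The name `entropy_from_beta` is hypothetical.

```python
import numpy as np

def entropy_from_beta(E, beta):
    """Integrate beta = dS/dE on a grid, assuming T = 1/beta is piecewise linear.

    On [E_j, E_j+1], T(E) = T_j + lam_j (E - E_j), so the segment integral is
    ln(1 + lam_j dE / T_j) / lam_j, or dE / T_j when lam_j = 0.
    Returns S(E_j) - S(E_0) for every grid point.
    """
    E = np.asarray(E, dtype=float)
    T = 1.0 / np.asarray(beta, dtype=float)
    if np.any(T <= 0):
        raise ValueError("this sketch assumes a positive statistical temperature")
    dE = np.diff(E)
    lam = np.diff(T) / dE                      # slope of T on each segment
    x = lam * dE / T[:-1]                      # equals T[j+1]/T[j] - 1, always > -1
    safe_lam = np.where(lam == 0.0, 1.0, lam)  # avoid 0/0 on flat segments
    seg = np.where(lam == 0.0, dE / T[:-1], np.log1p(x) / safe_lam)
    return np.concatenate(([0.0], np.cumsum(seg)))
```

Because each segment is integrated in closed form, the result is exact whenever T is truly linear within a bin, and no quadrature error accumulates where βS varies rapidly.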
The same strategy is equally applicable to the calculation of the potential of mean force (PMF) along the reaction coordinate η(x), x being coordinates. The PMF at the inverse temperature β0 with the reference potential w0 is determined as $-\beta_0^{-1} \ln \rho(\eta)$, ρ(η) = ∫dx δ[η(x) − η]W0(x)/Z0, W0 = exp(−β0w0). The WHAM estimate for ρ(η), obtained by combining multiple runs with the sampling weights Wα = exp{−β0[w0 + wα(η)]}, wα being the biasing potential, is5
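$$\rho(\eta) = \frac{\sum_{\alpha} H_\alpha(\eta)}{\sum_{\alpha} \Pi_\alpha(\eta)}, \tag{8}$$
where $\Pi_\alpha = N_\alpha (Z_0/Z_\alpha)\, e^{-\beta_0 w_\alpha(\eta)}$ and $Z_\alpha = \int dx\, W_\alpha(x)$. Denoting each individual estimate, ρα = Hα/Πα, and the normalized weight $\tilde f_\alpha = \Pi_\alpha/\sum_\alpha \Pi_\alpha$, Eq. 8 further transforms to
$$\rho = \sum_{\alpha} \tilde f_\alpha \rho_\alpha, \tag{9}$$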
analogous to Eq. 2. Taking the logarithm of both sides and differentiating with respect to η yields the WHAM estimate for the derivative of ln ρ(η) as
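$$\frac{\partial \ln \rho}{\partial \eta} = \sum_{\alpha} f^*_\alpha \frac{\partial \ln \rho_\alpha}{\partial \eta} + \sum_{\alpha} f^*_\alpha \frac{\partial \ln \tilde f_\alpha}{\partial \eta}, \qquad f^*_\alpha = \frac{H_\alpha}{\sum_{\alpha'} H_{\alpha'}}. \tag{10}$$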
By retaining only the first term in Eq. 10, the ST-WHAM estimate for ∂ln ρ/∂η is obtained as a weighted superposition of ∂ln ρα/∂η over the simulated histogram fraction, as in Eq. 4. An expression similar to the first term in Eq. 10 is also derived in “umbrella integration” by extending the thermodynamic integration method, and has been shown to reduce the statistical errors compared to conventional WHAM.6
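As a concrete instance, consider umbrella sampling with harmonic biasing potentials. This choice is assumed here purely for illustration: the spring constants kα and window centers ηα are not taken from the text. Assuming also that Πα depends on η only through exp[−β0wα(η)], as implied by the sampling weight Wα, each individual estimate and the retained first term become

```latex
\begin{aligned}
\frac{\partial \ln \rho_\alpha}{\partial \eta}
  &= \frac{\partial \ln H_\alpha}{\partial \eta}
   + \beta_0 \frac{\partial w_\alpha}{\partial \eta}
   = \frac{\partial \ln H_\alpha}{\partial \eta}
   + \beta_0 k_\alpha (\eta - \eta_\alpha),
  \qquad w_\alpha(\eta) = \tfrac{1}{2} k_\alpha (\eta - \eta_\alpha)^2, \\
\frac{\partial \ln \rho^*}{\partial \eta}
  &= \sum_{\alpha} \frac{H_\alpha}{\sum_{\alpha'} H_{\alpha'}}
     \left[ \frac{\partial \ln H_\alpha}{\partial \eta}
          + \beta_0 k_\alpha (\eta - \eta_\alpha) \right].
\end{aligned}
```

The PMF then follows by integrating this derivative along η and multiplying by −1/β0, with no window free energies to be determined.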
Numerical simulations: The determination of $\beta^*_S$ by ST-WHAM given Hα and Wα is now illustrated for the 2D Ising model. We exploit the known exact values Sex (Ref. 11) to prepare the normalized canonical histograms, proportional to exp[Sex(E) − E/Tα], with Δ = 4 at four equally distributed temperatures Tα between T1 = 2.0 and T4 = 2.6 [see Fig. 1a]. The normalized weight in Fig. 1a equals one for non-overlapping energy regions and rapidly decreases to zero as the corresponding histogram decreases.
Figure 1.
ST-WHAM results for the 2D Ising model with linear dimension L = 32 and M = 4. (a) Hex (solid line), (dashed line), and (dotted line); (b) (solid line) and (dashed line); (c) βH (solid line) and (dashed line), and (d) βW (solid line) and (dashed line) as a function of e = E/2L^2. The magnitude of is adjusted for visualization, and α = 1, 2, 3, and 4 from left to right in (a). The same color scheme is used for all figures.
By replacing by its finite difference form, , Eq. 4 determines the smoothly varying in Fig. 1b, which is indistinguishable from . Both and in Figs. 1c, 1d, respectively, are significant only for . Approximating , where the prime indicates differentiation and is determined from , we find that changes sign as E crosses , giving rise to the oscillatory behavior of βH in Fig. 1c. The weighted average of , , exhibits a staircase modulation [see Fig. 1d] and offsets βH, resulting in corresponding to a weighted superposition of tangents of βS at .
As $\beta^*_S$ is indistinguishable from the exact result, most errors in S* arise from the mapping in Eq. 7. To examine the accuracy of this mapping we compute the error ε(S*), with for , by shifting and to their corresponding values at . We find ε(S*) ≈ 10^−9, showing that the error is negligible, even though the energies are discrete. To demonstrate the speed-up of the posterior data analysis using ST-WHAM we compare the time, τM, needed to determine the entropy estimate for increasing M. Histograms (α = 1, …, M) are prepared at M equally divided temperatures between T1 = 2.0 and TM = 2.6. The log-log plot in Fig. 2a reveals that τM in WHAM scales as τM ∼ M^2.3 for large M regardless of the value of δZ. In contrast, τM in ST-WHAM is independent of M, because the need to determine Zα has been eliminated.
Figure 2.
(a) Time τM required to compute S given the histograms at M equally distributed temperatures between T1 = 2.0 and TM = 2.6 for L = 32, (b) the error estimates ε(βS), and (c) ε(S) as a function of Monte Carlo steps per spin (MCS) averaged over ten realizations of canonical runs with M = 4.
The main source of error in finite length simulations is the statistical fluctuation of Hα. The accuracy of WHAM and ST-WHAM is compared by plotting ε(βS) and ε(S) as a function of MC steps per spin (MCS) in Figs. 2b, 2c, respectively, for canonical runs at evenly distributed temperatures with M = 4. All quantities are averages over ten independent realizations, and the WHAM estimate of βS is calculated from the reweighted histograms. Both ε(βS) and ε(S) in WHAM depend strongly on δZ and gradually decrease with decreasing δZ. These errors reach the accuracy of ST-WHAM for δZ = 10^−11, implying that ST-WHAM corresponds to the asymptotic limit of WHAM associated with δZ = 0. Because ε(S*) is greater than the error intrinsic to the mapping (≈10^−9), the errors in S* are mostly due to the statistical uncertainties in Hα.
In addition to the simplification of the numerical analysis realized using ST-WHAM, the direct determination of βS via ST-WHAM unveils key signatures characteristic of phase transitions.9 Of particular interest is its use with the generalized ensemble weight,
| (11) |
where {λα,γ,E} are a set of tunable parameters.12, 13 This form of Wα yields centered at the crossing point between TS and . Here and . If we vary γ from −∞ to γS, we can continuously tune the ensemble from to a locally flat Hα. The use of Wα is particularly well suited to sampling strong first-order phase transitions, in which coexisting states are associated with the characteristic backbending of TS, i.e., γS(E) < 0.9 Phase-mixed configurations are intrinsically unstable in the canonical ensemble due to κ* > 0. These metastable states are directly accessible in Wα with γ < γS via a unimodal Hα.13
To explore the synergistic combination of ST-WHAM with generalized ensembles in strong first-order phase transitions, we consider the q-state 2D Potts model with toroidal geometry. For each q, two short canonical runs at Tl = 0.9Tc and Th = 1.1Tc, with Tc the critical temperature of the infinite lattice, were performed to approximately determine the internal energies El and Eh, giving γ = 10(Th − Tl)/(El − Eh) < γS for all sampled energies and E = El, λ1 = Tl, λM = Th + γ(Uh − Ul), and λα = λ1 + (α − 1)(λM − λ1)/(M − 1).13 Runs of 10^6 MCS for each α with M = 100 associated with in Fig. 3a produce successive unimodal Hα, which are merged to determine for q = 50. Note that Hα are peaked at crossing points between and . Representative configurations at intermediate α in Fig. 3b demonstrate that various mixed-phase configurations are sampled.
Figure 3.
Results for the q-state 2D Potts model. (a) (dashed line), (solid line), Hα (dotted line), and characteristic energies , , , , , and (see the text) from left to right (circles); (b) representative configurations at different α for q = 50 and L = 24, (c) , (d) for L = 24 (solid line) and L = 36 (dashed line); (e) free energy densities per spin with varying q. In (a), α = 10, 20, 30, 40, 50, 60, 70, 80, and 90 in both and Hα. The same color scheme is applied to (c)–(e).
The non-monotonic variation of in Fig. 3a characterizes a sequence of phase transitions.14 The local maximum and minimum at and are associated with the nucleation of disordered (α = 20) and ordered (α = 80) droplets in each stable phase. The flat region near Tc between and represents the formation of strip phases corresponding to α = 50 and 60 in Fig. 3b. Here , with eo and ed the energies of the free energy minima of the ordered and disordered phases at Tc. As q increases both backbending ( and the strip phase region ) gradually expand with more pronounced transition markers [see Fig. 3c].
All the relevant transitions are determined by identifying the locations of zeros and peaks in the derivatives of in Fig. 3d. The two central peaks at and , locate the transitions between the droplet and strip phases, and are close to the droplet-strip transition energies (gray vertical lines) in the infinite volume limit, π−1 − 1 and π−1, respectively.15 For L increasing from 24 to 36 (dashed line) both and for q = 50 and 100 shift to the thermodynamic transition points. Zeros of corresponding to and yield “effective spinodal points,” in which metastable droplets start to grow by absorbing background fluctuations in stable phases.14 The free energy densities per spin in Fig. 3e, , exhibit wells at and 0, and inflections at and , with flat humps between and . Here F is set to zero at .
In summary, an efficient weighted histogram analysis method, ST-WHAM, has been proposed in terms of intensive variables. The method directly determines βS and S with no iterative evaluations of partition functions, providing the same accuracy as conventional WHAM for infinite iterations. If combined with parameterized, generalized ensemble weights, ST-WHAM gives the complete sequence of phase transitions among various metastable states via distinct markers in βS as exemplified by our simulations of the q-state Potts model. We anticipate that directly accessing both βS and S “on the fly” during the simulation via ST-WHAM will allow for considerable acceleration in the performance of sampling algorithms that rely on iterative refinements of S (Ref. 16) or βS.17
In closing, some potential limitations of our approach should be addressed. As both WHAM and ST-WHAM assume overlaps between energy distributions, extra interpolations or extrapolations of Hα(E) using a proper functional form would be necessary for unvisited energy regions in rugged or glassy systems. The numerical instability of computing partial derivatives with respect to each order parameter and recovering extensive quantities from intensive ones poses a challenge in the extension of our approach to PMF calculations in multiple order parameters.
Acknowledgments
We thank the National Science Foundation (NSF) (CHE-0750309, CHE-1114676, and CHE-0833605) and the National Institutes of Health (NIH) (R01 GM076688) for support. Special thanks to Professor Harvey Gould for a careful reading of the manuscript.
References
- Ferrenberg A. M. and Swendsen R. H., Phys. Rev. Lett. 63, 1195 (1989). doi:10.1103/PhysRevLett.63.1195
- Newman M. E. J. and Barkema G. T., Monte Carlo Methods in Statistical Physics (Clarendon, Oxford, 1999).
- Frenkel D. and Smit B., Understanding Molecular Simulation: From Algorithms to Applications (Academic, San Diego, 1996).
- Kumar S., Bouzida D., Swendsen R. H., Kollman P. A., and Rosenberg J. M., J. Comput. Chem. 13, 1011 (1992). doi:10.1002/jcc.540130812
- Souaille M. and Roux B., Comput. Phys. Commun. 135, 40 (2001). doi:10.1016/S0010-4655(00)00215-0
- Kastner J. and Thiel W., J. Chem. Phys. 123, 144104 (2005). doi:10.1063/1.2052648
- Chodera J. D., Swope W. C., Pitera J. W., Seok C., and Dill K. A., J. Chem. Theory Comput. 3, 26 (2007). doi:10.1021/ct0502864
- Bereau T. and Swendsen R. H., J. Comput. Phys. 228, 6119 (2009). doi:10.1016/j.jcp.2009.05.011
- Gross D. H. E., Rep. Prog. Phys. 53, 605 (1990), doi:10.1080/02783199009553305; Wales D. J. and Berry R. S., Phys. Rev. Lett. 73, 2875 (1994), doi:10.1103/PhysRevLett.73.2875; Kim J., Keyes T., and Straub J. E., Phys. Rev. E 79, 030902(R) (2009), doi:10.1103/PhysRevE.79.030902
- Kim J., Fukunishi Y., and Nakamura H., J. Chem. Phys. 121, 1626 (2004), doi:10.1063/1.1763841; Kim J., Fukunishi Y., Kidera A., and Nakamura H., J. Chem. Phys. 121, 5590 (2004), doi:10.1063/1.1786578
- Beale P. D., Phys. Rev. Lett. 76, 78 (1996). doi:10.1103/PhysRevLett.76.78
- Tsallis C., J. Stat. Phys. 52, 479 (1988). doi:10.1007/BF01016429
- Kim J., Keyes T., and Straub J. E., J. Chem. Phys. 132, 224107 (2010), doi:10.1063/1.3432176; Kim J. and Straub J. E., J. Chem. Phys. 133, 154101 (2010), doi:10.1063/1.3503503
- MacDowell L. G., Virnau P., Muller M., and Binder K., J. Chem. Phys. 120, 5293 (2004). doi:10.1063/1.1645784
- Bauer B., Gull E., Trebst S., Troyer M., and Huse D. A., J. Stat. Mech. P01020 (2010). doi:10.1088/1742-5468/2010/01/P01020
- Berg B. A. and Celik T., Phys. Rev. Lett. 69, 2292 (1992). doi:10.1103/PhysRevLett.69.2292
- Hansmann U. H. E., Okamoto Y., and Eisenmenger F., Chem. Phys. Lett. 259, 321 (1996), doi:10.1016/0009-2614(96)00761-0; Nakajima N., Nakamura H., and Kidera A., J. Phys. Chem. B 101, 817 (1997), doi:10.1021/jp962142e; Kim J., Fukunishi Y., and Nakamura H., Phys. Rev. E 70, 057103 (2004), doi:10.1103/PhysRevE.70.057103



