Author manuscript; available in PMC: 2016 Oct 1.
Published in final edited form as: J Phys Chem Lett. 2015 Sep 14;6(19):3834–3840. doi: 10.1021/acs.jpclett.5b01771

A Stochastic Solution to the Unbinned WHAM Equations

Bin W Zhang, Junchao Xia, Zhiqiang Tan, Ronald M Levy
PMCID: PMC4894662  NIHMSID: NIHMS790610  PMID: 26722879

Abstract

The Weighted Histogram Analysis Method (WHAM) and unbinned versions such as the Multistate Bennett Acceptance Ratio (MBAR) and Unbinned WHAM (UWHAM) are widely used to compute free energies and expectations from data generated by independent or coupled parallel simulations. Here we introduce a Replica Exchange-like algorithm (RE-SWHAM) that can be used to solve the UWHAM equations stochastically. This method is capable of analyzing large data sets generated by hundreds or even thousands of parallel simulations that are too large to be “WHAMMED” using standard methods. We illustrate the method by applying it to obtain free energy weights for each of the 240 states in a simulation of host-guest ligand binding containing ~3.5 × 10^7 data elements collected from 16 parallel Hamiltonian Replica Exchange simulations, performed at 15 temperatures. In addition to using much less memory, RE-SWHAM showed a nearly eighty-fold improvement in computational time compared with UWHAM.

Keywords: UWHAM, Stochastic Reweighting, Parallel Simulations, Free Energy, MBAR



The Weighted Histogram Analysis Method (WHAM) is used to compute free energies and expectations from data generated by multi-canonical simulations and independent simulations at multiple Hamiltonian or thermodynamic states.1–13 For example, after running Umbrella Sampling, the standard procedure is to use WHAM to combine the pieces of the free energy profile along chosen reaction coordinates.4 After running Replica Exchange (RE) simulations,14,15 WHAM is used to estimate the free energy differences between Hamiltonian states, or to obtain expectation values at the targeted thermodynamic states from the simulation data at all of the thermodynamic states.6,7 In this letter we describe a RE-like algorithm we have developed that corresponds to a stochastic solution of the UWHAM equations. We refer to this algorithm as Replica Exchange Stochastic WHAM or RE-SWHAM. Since running RE-SWHAM requires much less computing power than present WHAM-based analysis tools such as MBAR8 and UWHAM,5,9 we believe RE-SWHAM is a promising method for analyzing very large ensembles of data generated by multi-canonical simulations or massively parallel but independent datasets generated on very large computer grids.16,17

Before introducing RE-SWHAM, we review the binless WHAM (UWHAM) equations.9 In parallel simulations, if each simulation has different thermodynamic parameters such as temperature and pressure, they are referred to as simulations at different thermodynamic states; if each simulation has different potential energy functions, such as in Hamiltonian RE simulations, they can be referred to as simulations at different alchemical Hamiltonian states. Moreover, parallel simulations can differ in both thermodynamic parameters and potential energy functions such as in two dimensional RE simulations. For simplicity we refer to each of these states, which is characterized by a specific combination of thermodynamic parameters and potential energy functions, as a thermodynamic state, in order to avoid confusion with our description of the UP and DOWN conformational states. Consider a system of M parallel simulations labeled by Greek letters. Each thermodynamic state has a biased potential wα(u), where u is the reduced coordinate of the observation. Suppose uαi is the ith observation at the αth thermodynamic state, and Nα is the total number of observations at the αth thermodynamic state. The UWHAM estimates of the density of states Ω(uγi) and the partition function Zα are (up to a multiplicative constant):6,9

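$$\hat{Z}_\alpha = \sum_{\gamma=1}^{M} \sum_{i=1}^{N_\gamma} c_\alpha(u_{\gamma i})\, \hat{\Omega}(u_{\gamma i}), \tag{1}$$

$$\hat{\Omega}(u_{\gamma i}) = \frac{1}{\sum_{\kappa=1}^{M} N_\kappa\, \hat{Z}_\kappa^{-1}\, c_\kappa(u_{\gamma i})}, \tag{2}$$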

where

$$c_\alpha(u_{\gamma i}) = \exp[-\beta_\alpha w_\alpha(u_{\gamma i})], \tag{3}$$

is the bias factor. Equations (1), (2) and (3) constitute a coupled set of equations which can be solved by Newton iteration8 or optimization.9,10 The UWHAM estimate of the probability of the observation uγi at the αth state is

$$\hat{p}_\alpha(u_{\gamma i}) = \hat{Z}_\alpha^{-1}\, \hat{\Omega}(u_{\gamma i})\, c_\alpha(u_{\gamma i}). \tag{4}$$

We note that although the solution to the UWHAM equations depends on the number of observations at each thermodynamic state Nα, it does not depend on the original thermodynamic state at which each sample uγi is observed.
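As an illustration of what solving Eqs. (1) and (2) involves, the following minimal sketch uses a plain self-consistent (fixed-point) iteration rather than the Newton or optimization solvers cited above. The toy data (three harmonic biases on a flat density, β = 1), the function names, and the normalization Z[0] = 1 are assumptions made for this example only.

```python
import math
import random

def density_of_states(c, n_obs, z):
    """Eq. (2): Omega(u_i) = 1 / sum_k N_k c_k(u_i) / Z_k, for every pooled observation."""
    n_states, n_total = len(c), len(c[0])
    return [1.0 / sum(n_obs[k] * c[k][i] / z[k] for k in range(n_states))
            for i in range(n_total)]

def solve_uwham(c, n_obs, max_iter=1000, tol=1e-12):
    """Iterate Eqs. (1)-(2) until Z stops changing; Z is fixed up to a constant by Z[0] = 1."""
    n_states, n_total = len(c), len(c[0])
    z = [1.0] * n_states
    for _ in range(max_iter):
        omega = density_of_states(c, n_obs, z)
        z_new = [sum(c[a][i] * omega[i] for i in range(n_total))   # Eq. (1)
                 for a in range(n_states)]
        z_new = [v / z_new[0] for v in z_new]                      # remove the free constant
        change = max(abs(zn - zo) / zo for zn, zo in zip(z_new, z))
        z = z_new
        if change < tol:
            break
    return z, density_of_states(c, n_obs, z)

# Hypothetical toy data: state a samples x ~ N(center_a, 1), i.e. bias w_a(x) = (x - center_a)^2 / 2.
rng = random.Random(0)
centers = [-0.5, 0.0, 0.5]
n_obs = [200, 200, 200]
samples = [rng.gauss(x0, 1.0) for x0, n in zip(centers, n_obs) for _ in range(n)]
c = [[math.exp(-0.5 * (x - x0) ** 2) for x in samples] for x0 in centers]  # Eq. (3)

z_hat, omega_hat = solve_uwham(c, n_obs)
# Eq. (4): UWHAM probability of every pooled observation at every state.
p_hat = [[omega_hat[i] * c[a][i] / z_hat[a] for i in range(len(samples))]
         for a in range(len(centers))]
```

Note that `c` holds M × N bias factors, one per (state, observation) pair, which is the storage cost that grows quickly with the number of states.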

We now describe a stochastic Monte-Carlo approach to solve the coupled equations (1), (2) and (3) inspired by the Replica Exchange algorithm. Consider a system of M parallel simulations labeled by Greek letters. Suppose there are Nα observations at the αth thermodynamic state and N is the total number of observations, namely $N = \sum_{\alpha=1}^{M} N_\alpha$. To initialize RE-SWHAM we first construct a weight array of N elements for each thermodynamic state, corresponding to the N observations. (A computationally efficient algorithm implementing RE-SWHAM is presented later and illustrated by Fig. 1.) There are M arrays of N elements, one for each thermodynamic state. The individual elements of the weight array have value 0 or 1 at every iteration. Initially, we set all the elements corresponding to the data actually observed at this thermodynamic state to 1, and the others to 0. For the αth thermodynamic state, the initial weight array is

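$$X_\alpha(t=0) = \{\delta_\alpha(u_{11}, t=0) = 0, \cdots, \delta_\alpha(u_{\alpha 1}, t=0) = 1, \delta_\alpha(u_{\alpha 2}, t=0) = 1, \cdots, \delta_\alpha(u_{\alpha N_\alpha}, t=0) = 1, \cdots, \delta_\alpha(u_{\gamma i}, t=0) = 0, \cdots, \delta_\alpha(u_{M N_M}, t=0) = 0\}, \tag{5}$$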

where γ ≠ α, and it has Nα nonzero elements. For simplicity, we reindex the N observations and omit the label of the original thermodynamic state at which each data element was observed since that information is not required. The weight array for the αth thermodynamic state at time t is referred to as Xα(t) = {δα(u1, t), δα(u2, t), ⋯, δα(ui, t), ⋯, δα(uN, t)}, where δα(ui, t) = 1 if the observation ui occupies the αth thermodynamic state at t, δα(ui, t) = 0 otherwise.

Figure 1.


An illustration of the RE-SWHAM algorithm. This drawing illustrates two replica exchange cycles of the RE-SWHAM method, and shows only two thermodynamic states, colored “grey” and “cyan”. In each cycle one data element is first chosen from each thermodynamic state, then a replica exchange is attempted. In the first cycle, since the swap is accepted, the data elements associated with the two replicas are swapped to the other thermodynamic state’s data array. At the end of each cycle, the data associated with the replicas are recorded as the output, as in explicit RE simulations.

The Metropolis Monte-Carlo RE-SWHAM algorithm resembles running explicit replica exchange simulations, which consist of cycles. There are two components in each cycle: i) the MD or MC simulation of each replica at a fixed thermodynamic state (the “move” process), which includes conformational relaxation at that thermodynamic state; ii) the attempted swaps of replicas (the “exchange” process), which is the relaxation in the replica and thermodynamic state permutation space. The RE-SWHAM algorithm also runs by replica exchange cycles. At each cycle, a data element at each thermodynamic state is chosen based on the current normalized weights, $\delta_\alpha(u_i,t) / \sum_{j=1}^{N} \delta_\alpha(u_j,t) = \delta_\alpha(u_i,t)/N_\alpha$. Since in RE-SWHAM the elements of the weight array at each iteration represent the set of Nα out of N data elements that currently occupy that thermodynamic state, randomly choosing the next data element is analogous to the move process of an explicit RE simulation when its MD simulation period per cycle is so long that two adjacent configurations chosen for replica exchange have no correlations. Then a set of replica exchanges is performed which follows the same exchange criterion as in explicit replica exchange simulations (see the Metropolis function in Eq. (9)). In this letter we use the independence sampling algorithm as the proposal scheme,18 which attempts to exchange replicas from two thermodynamic states chosen at random.

RE-SWHAM updates the weights of observations of thermodynamic states based on the exchange process. When an exchange attempt is accepted, RE-SWHAM swaps the thermodynamic states of the two replicas, and changes the weights of the observations associated with the replicas at both thermodynamic states. For example, if one replica associated with the observation ui at the αth thermodynamic state hops to the γth thermodynamic state at time t, RE-SWHAM changes the weight of ui to 0 at the αth thermodynamic state, namely δα (ui, t) = 0, and to 1 at the γth thermodynamic state, namely δγ (ui, t) = 1. As part of this exchange move, a corresponding data element uj moves from the γth to the αth thermodynamic state. Notice that in the RE-SWHAM algorithm the number of nonzero weights remains the same at each thermodynamic state and each observation has a nonzero weight at one and only one thermodynamic state. Following this procedure, the instantaneous weight of each data element at each thermodynamic state oscillates between 0 and 1, but the normalized time-average of the weight converges to the unbiased estimate of the probability of the data element, p̂α(ui), in Eq. (4).

The RE-SWHAM algorithm performs a random walk in the space of the weight arrays of observations. In each cycle, RE-SWHAM moves from one set of weight arrays, or a configuration, to another following the same exchange criterion as that for multi-canonical replica exchange simulations:

$$X(0) \xrightarrow{\text{RE cycle}} X(1) \xrightarrow{\text{RE cycle}} \cdots X(t) \xrightarrow{\text{RE cycle}} \cdots, \tag{6}$$

where X(t) is a set of weight arrays {X1(t), X2(t), ⋯, Xα(t), ⋯, XM(t)} with Xα(t) corresponding to the weight array at the αth thermodynamic state as described earlier. The total number of configurations of this Markov chain corresponds to the total number of ways to assign N non-zero instantaneous weights to M thermodynamic states, with the number of observations at each thermodynamic state fixed at {N1, N2, ⋯, Nα, ⋯, NM}, which is $N! / \prod_{\alpha=1}^{M} N_\alpha!$. The equivalent Master equation perspective for this random walk is
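To give a feel for how quickly this configuration count grows, here is a small sketch that evaluates the multinomial coefficient exactly for tiny cases and through log-gamma for large ones. The function names are hypothetical and chosen for this example.

```python
from math import factorial, lgamma, log, prod

def n_configurations(n_obs):
    """Exact N! / prod(N_alpha!) for small problems (integer arithmetic)."""
    return factorial(sum(n_obs)) // prod(factorial(n) for n in n_obs)

def log10_n_configurations(n_obs):
    """log10 of the same count via log-gamma, usable when N is in the millions."""
    ln_count = lgamma(sum(n_obs) + 1) - sum(lgamma(n + 1) for n in n_obs)
    return ln_count / log(10)

small = n_configurations([2, 2])        # two states, two observations each
medium = n_configurations([3, 2, 1])    # three states with 3, 2 and 1 observations
```

For a data set of the size analyzed later in this letter, the log-gamma form is the only practical one, since the count itself has tens of millions of decimal digits; this is why enumerating or diagonalizing over configurations is not an option.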

$$\frac{dP_X(t)}{dt} = A \cdot P_X(t), \tag{7}$$

where A is the rate matrix and PX(t) represents the vector probability of all the possible configurations. The solution of Eq. (7) can be written as

$$P_X(t+\Delta t) = P_X(t) \cdot T, \tag{8}$$

where T is the row-normalized transition matrix.

It is possible to write down the transition matrix for RE-SWHAM when one exchange is attempted per cycle. Suppose at the beginning of a cycle, RE-SWHAM is at the ith configuration (which corresponds to a specification of the N instantaneous weights at each of the M thermodynamic states), and two replicas are chosen at random to attempt an exchange. Consider the trial move in RE-SWHAM which attempts to exchange one observation um at the αth thermodynamic state and the other observation un at the γth thermodynamic state. The new configuration will be called the jth configuration, which is the same as the ith configuration except that the weights of um and un are exchanged in the weight arrays at the αth and the γth thermodynamic states. The probability that this trial move is accepted, namely, the exchange from the ith configuration to the jth configuration is accepted is

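$$T_{ij} = \frac{2}{M(M-1)} \frac{1}{N_\alpha} \frac{1}{N_\gamma} \min\left(1, \frac{\exp[-\beta_\alpha w_\alpha(u_n)]\, \exp[-\beta_\gamma w_\gamma(u_m)]}{\exp[-\beta_\alpha w_\alpha(u_m)]\, \exp[-\beta_\gamma w_\gamma(u_n)]}\right) = \frac{2}{M(M-1)} \frac{1}{N_\alpha} \frac{1}{N_\gamma} \Psi\left(\log\left[\frac{c_\alpha(u_m)\, c_\gamma(u_n)}{c_\alpha(u_n)\, c_\gamma(u_m)}\right]\right), \tag{9}$$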

where the first factor 2/(M(M − 1)) is the probability of choosing the replicas at the αth and the γth thermodynamic states from the M replicas, and the factors 1/Nα and 1/Nγ are the probabilities of choosing one observation from the respective thermodynamic states. Throughout, Ψ is the Metropolis function19

$$\Psi(x) = \min(1, \exp[-x]). \tag{10}$$
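The Metropolis function and the swap acceptance it produces can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and the arguments are the four bias factors that appear in Eq. (9).

```python
import math

def metropolis(x):
    """Psi(x) = min(1, exp(-x)); the branch avoids overflow for very negative x."""
    return 1.0 if x <= 0.0 else math.exp(-x)

def swap_acceptance(c_alpha_m, c_gamma_n, c_alpha_n, c_gamma_m):
    """Acceptance probability for moving u_m from state alpha to gamma and u_n from gamma to alpha.

    Each argument c_s_k is the bias factor of observation u_k evaluated at state s.
    """
    x = math.log((c_alpha_m * c_gamma_n) / (c_alpha_n * c_gamma_m))
    return metropolis(x)

# The ratio Psi(x) / Psi(-x) equals exp(-x) for either sign of x.
ratio_checks = [(x, metropolis(x) / metropolis(-x), math.exp(-x)) for x in (-3.0, -0.5, 0.7, 2.0)]
```

The full transition probability additionally carries the selection factors 2/(M(M − 1)), 1/Nα and 1/Nγ, which are the same for a move and its reverse and therefore cancel in the detailed balance argument.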

Consider the reverse trial exchange move from the jth configuration to the ith configuration, in other words, the swap of the observation un at the αth state and the observation um at the γth state, with all other weights in configurations i and j having the same values. The probability of this move is

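$$T_{ji} = \frac{2}{M(M-1)} \frac{1}{N_\alpha} \frac{1}{N_\gamma} \Psi\left(\log\left[\frac{c_\alpha(u_n)\, c_\gamma(u_m)}{c_\alpha(u_m)\, c_\gamma(u_n)}\right]\right). \tag{11}$$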

By construction the stationary probabilities of the ith and jth configurations, denoted by pi and pj, satisfy the detailed balance condition:

$$p_i T_{ij} = p_j T_{ji}. \tag{12}$$

Next consider a subgroup I including all the configurations which have the observation um at the αth thermodynamic state and the other observation un at the γth thermodynamic state, and another subgroup J including all the configurations which have the observation un at the αth thermodynamic state and the other observation um at the γth thermodynamic state. For each configuration k in the subgroup I, there exists one configuration l in the subgroup J for which the only difference between these two configurations is the thermodynamic state to which the observations um and un belong. Every pair of such configurations, (k, l) with k ∈ I and l ∈ J, satisfies pkTij = plTji, where Tij and Tji remain the same regardless of (k, l). Then the total probabilities of these two subgroups satisfy

$$\left(\sum_{k \in I} p_k\right) T_{ij} = \left(\sum_{l \in J} p_l\right) T_{ji}. \tag{13}$$

Theoretically the correlation is nonzero between the occurrence of the mth observation at the αth thermodynamic state and the occurrence of the nth observation at the γth thermodynamic state (or the occurrence of the mth observation at the γth thermodynamic state and the occurrence of the nth observation at the αth thermodynamic state). However, such correlations become negligible when the total number of observations at each thermodynamic state is large. Therefore, the total probabilities of the subgroup I and J are

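$$\begin{aligned}
\sum_{k \in I} p_k &= \langle \delta_\alpha(u_m,t)\, \delta_\gamma(u_n,t) \rangle \approx \langle \delta_\alpha(u_m,t) \rangle \langle \delta_\gamma(u_n,t) \rangle = N_\alpha \tilde{p}_\alpha(u_m)\, N_\gamma \tilde{p}_\gamma(u_n) \\
\sum_{l \in J} p_l &= \langle \delta_\alpha(u_n,t)\, \delta_\gamma(u_m,t) \rangle \approx \langle \delta_\alpha(u_n,t) \rangle \langle \delta_\gamma(u_m,t) \rangle = N_\alpha \tilde{p}_\alpha(u_n)\, N_\gamma \tilde{p}_\gamma(u_m),
\end{aligned} \tag{14}$$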

where 〈δα(um, t)〉 is the time-average weight of the observation um at the αth thermodynamic state, and p̃α(um) is the normalized RE-SWHAM time-average weight, namely p̃α(um) = 〈δα(um, t)〉/Nα. Combining Eqs. (9), (11), (13) and (14) and applying the property Ψ(x)/Ψ(−x) = exp(−x) leads to the detailed balance relation

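$$\frac{\tilde{p}_\alpha(u_m)/c_\alpha(u_m)}{\tilde{p}_\gamma(u_m)/c_\gamma(u_m)} = \frac{\tilde{p}_\alpha(u_n)/c_\alpha(u_n)}{\tilde{p}_\gamma(u_n)/c_\gamma(u_n)}, \quad \text{for all } (u_m, u_n) \text{ and } (\alpha, \gamma). \tag{15}$$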
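For readers who want the intermediate algebra, the step from Eqs. (13) and (14) to Eq. (15) can be spelled out as follows. This is a sketch using only the definitions above; the shorthand x for the argument of Ψ in Eq. (9) is introduced here for convenience.

```latex
\begin{align*}
x &\equiv \log\frac{c_\alpha(u_m)\,c_\gamma(u_n)}{c_\alpha(u_n)\,c_\gamma(u_m)},
  \qquad T_{ij} \propto \Psi(x), \quad T_{ji} \propto \Psi(-x)
  \quad \text{(same prefactor)},\\[4pt]
N_\alpha \tilde p_\alpha(u_m)\, N_\gamma \tilde p_\gamma(u_n)\,\Psi(x)
  &= N_\alpha \tilde p_\alpha(u_n)\, N_\gamma \tilde p_\gamma(u_m)\,\Psi(-x),\\[4pt]
\frac{\tilde p_\alpha(u_n)\,\tilde p_\gamma(u_m)}{\tilde p_\alpha(u_m)\,\tilde p_\gamma(u_n)}
  &= \frac{\Psi(x)}{\Psi(-x)} = e^{-x}
   = \frac{c_\alpha(u_n)\,c_\gamma(u_m)}{c_\alpha(u_m)\,c_\gamma(u_n)}.
\end{align*}
```

Dividing each normalized weight by its own bias factor and collecting the terms that involve um on one side and those that involve un on the other gives Eq. (15).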

Denote the common value of the ratios in Eq. (15) by f(α, γ) which depends on (α, γ), but not (um, un), and fix γ at a baseline thermodynamic state, say α0. Then Eq. (15) gives

$$\tilde{p}_\alpha(u_m)/c_\alpha(u_m) = f(\alpha, \alpha_0)\, \left(\tilde{p}_{\alpha_0}(u_m)/c_{\alpha_0}(u_m)\right). \tag{16}$$

That is, the probability p̃α(um) can be expressed in the form

$$\tilde{p}_\alpha(u_m) = \tilde{Z}_\alpha^{-1}\, \tilde{\Omega}(u_m)\, c_\alpha(u_m), \quad \text{for all } u_m \text{ and } \alpha, \tag{17}$$

where Z̃α = 1/f(α, α0) and Ω̃(um) = p̃α0(um)/cα0(um). Since $\sum_{m=1}^{N} \tilde{p}_\alpha(u_m) = 1$ at each thermodynamic state, summing both sides of Eq. (17) over m yields

$$\tilde{Z}_\alpha = \sum_{m=1}^{N} c_\alpha(u_m)\, \tilde{\Omega}(u_m). \tag{18}$$

As mentioned previously, the RE-SWHAM algorithm keeps the total number of observations, Nα, unchanged at each thermodynamic state; the normalized time-average weight of every observation p̃α(um) satisfies $\sum_{\alpha=1}^{M} N_\alpha \tilde{p}_\alpha(u_m) = \sum_{\alpha=1}^{M} \langle \delta_\alpha(u_m,t) \rangle = 1$, because each observation appears at one and only one thermodynamic state at one time. Multiplying both sides of Eq. (17) by Nα and summing over α yields

$$\tilde{\Omega}(u_m) = \frac{1}{\sum_{\alpha=1}^{M} N_\alpha\, \tilde{Z}_\alpha^{-1}\, c_\alpha(u_m)}. \tag{19}$$

Thus the RE-SWHAM estimates {Ω̃(um) : m = 1, …, N} and {Z̃α : α = 1, …, M} satisfy the array of equations (18) and (19), which are equivalent to the UWHAM equations (1) and (2). But the UWHAM estimates {Ω̂(um) : m = 1, …, N} and {Ẑα : α = 1, …, M} are, up to a multiplicative constant, unique solutions to Eqs. (1) and (2).5,9 Then Ω̃(um) = Ω̂(um) for all m = 1, …, N and Z̃α = Ẑα for all α = 1, …, M provided that the same baseline thermodynamic state α0 is used in UWHAM and RE-SWHAM. Therefore by Eqs. (4) and (17), the estimated probabilities from RE-SWHAM agree with those from UWHAM: p̃α(um) = p̂α(um).

In the previous discussion, we showed how to write any element of the transition matrix T for RE-SWHAM with one exchange attempt per cycle. Theoretically one can obtain the UWHAM solution by diagonalizing the transition matrix T and solving the equilibrium probabilities of all the possible configurations. This is impractical because of the overwhelming size of the transition matrix; however, RE-SWHAM provides a Markov Chain Monte Carlo solution to the problem.

RE-SWHAM, described conceptually above, can be implemented in practice as a computationally efficient algorithm. See Fig. 1 for an illustration of RE-SWHAM for a problem containing two thermodynamic states. To initialize RE-SWHAM, instead of constructing a weight array, we construct a current data array for each thermodynamic state using all the data elements observed from that thermodynamic state. In Fig. 1, there are m observations originally from the “grey” state and n observations from the “cyan” state. Then RE-SWHAM is run by cycles: first, one of the data elements is chosen with equal probability from each thermodynamic state. Second, a replica exchange attempt is performed following the multi-canonical exchange criterion. When an exchange attempt is accepted, the thermodynamic states of the two replicas are swapped, and the observations (data elements) associated with these two replicas are also swapped to the other thermodynamic state’s data array. As shown by the first replica exchange cycle in Fig. 1, the observation up is swapped to the “cyan” state and the observation um+q is swapped to the “grey” state. At the end of each cycle, whether or not the exchange attempt is accepted, the observation associated with the replica at each thermodynamic state is recorded as the output, as in explicit RE simulations. Notice that our implementation does not track the weight array but instead tracks the explicit observations with non-zero weights for each thermodynamic state to save memory, since the total number of weights, $M \times \sum_{\alpha=1}^{M} N_\alpha$, increases rapidly with the number of thermodynamic states M. Although the illustration shows only one exchange attempt per cycle, in practice performing multiple exchange attempts per cycle accelerates the convergence of RE-SWHAM.18
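The cycle described above can be sketched in pure Python as follows. This is a minimal illustration with one exchange attempt per cycle, not the authors' implementation: the function name `re_swham`, the storage of log bias factors in `log_c`, and the toy data (three harmonic biases on a flat density, β = 1) are all assumptions made for the example.

```python
import math
import random

def re_swham(log_c, n_obs, n_cycles, rng):
    """Run RE-SWHAM cycles and return counts[a][i]: how often observation i was recorded at state a.

    log_c[a][i] is ln c_a(u_i); pooled observations are ordered state by state.
    """
    n_states = len(n_obs)
    # Current data array of each state, initialized with that state's own observations.
    data, start = [], 0
    for size in n_obs:
        data.append(list(range(start, start + size)))
        start += size
    counts = [[0] * start for _ in range(n_states)]
    for _ in range(n_cycles):
        # "Move": pick one element uniformly from every state's current data array.
        slot = [rng.randrange(len(d)) for d in data]
        # "Exchange": one attempted swap between two states chosen at random.
        a, g = rng.sample(range(n_states), 2)
        obs_m, obs_n = data[a][slot[a]], data[g][slot[g]]
        x = (log_c[a][obs_m] + log_c[g][obs_n]) - (log_c[a][obs_n] + log_c[g][obs_m])
        if x <= 0.0 or rng.random() < math.exp(-x):      # accept with probability min(1, exp(-x))
            data[a][slot[a]], data[g][slot[g]] = obs_n, obs_m
        # Record the observation held by each replica, whether or not the swap was accepted.
        for s in range(n_states):
            counts[s][data[s][slot[s]]] += 1
    return counts

# Hypothetical toy data: state a samples x ~ N(center_a, 1), i.e. ln c_a(x) = -(x - center_a)^2 / 2.
rng = random.Random(1)
centers = [-0.5, 0.0, 0.5]
n_obs = [200, 200, 200]
samples = [rng.gauss(x0, 1.0) for x0, n in zip(centers, n_obs) for _ in range(n)]
log_c = [[-0.5 * (x - x0) ** 2 for x in samples] for x0 in centers]

n_cycles = 20000
counts = re_swham(log_c, n_obs, n_cycles, rng)
# counts[a][i] / n_cycles estimates the normalized weight of observation i at state a;
# here it is used to form the reweighted mean of x at each state.
mean_x = [sum(k * x for k, x in zip(row, samples)) / n_cycles for row in counts]
```

Only the M data arrays of observation indices are updated during the run; the M × N table of 0/1 weights is never built. The `counts` table is kept here purely for the illustration, and streaming the recorded observations to an output file would avoid it as well.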

The output of RE-SWHAM at each thermodynamic state is a sample of all observations according to their UWHAM weights from Eq. (4) or RE-SWHAM weights from Eq. (17). The free energy difference between two adjacent thermodynamic states can be obtained by the standard “free energy perturbation formula” (FEP),20 but note that unlike standard FEP, in Eq. (20) the reweighted (i.e. maximum likelihood) density of states appears. For instance, the free energy difference between the αth and (α + 1)th thermodynamic states is

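$$-k_B T \ln\left(\frac{\tilde{Z}_{\alpha+1}}{\tilde{Z}_\alpha}\right) = -k_B T \ln \frac{\sum_{i=1}^{N} \tilde{\Omega}(u_i)\, c_{\alpha+1}(u_i)}{\sum_{i=1}^{N} \tilde{\Omega}(u_i)\, c_\alpha(u_i)} = -k_B T \ln \frac{\sum_{i=1}^{N} \tilde{\Omega}(u_i)\, c_\alpha(u_i) \left(\frac{c_{\alpha+1}(u_i)}{c_\alpha(u_i)}\right)}{\sum_{i=1}^{N} \tilde{\Omega}(u_i)\, c_\alpha(u_i)} = -k_B T \ln \left\langle \frac{c_{\alpha+1}(u_i)}{c_\alpha(u_i)} \right\rangle_\alpha, \tag{20}$$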

where the angle brackets denote an average over the ui values sampled at the αth thermodynamic state according to their converged RE-SWHAM probabilities p̃α(ui).
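Given the output stream recorded at state α, the average in Eq. (20) reduces to a few lines. The sketch below assumes the log bias factors of the two adjacent states are available per observation and that the output is a list of observation indices; the function name and the default kT = 1 (reduced units) are choices made for this example.

```python
import math

def fep_free_energy(log_c_alpha, log_c_next, sampled_indices, kT=1.0):
    """-kT ln < c_{alpha+1}(u) / c_alpha(u) >, averaged over the observations recorded at state alpha."""
    ratios = [math.exp(log_c_next[i] - log_c_alpha[i]) for i in sampled_indices]
    return -kT * math.log(sum(ratios) / len(ratios))

# Sanity check on made-up numbers: if c_{alpha+1} = exp(-2) * c_alpha for every observation,
# the free energy difference must be 2 kT regardless of which observations were recorded.
log_c_alpha = [0.0, -1.0, -0.5]
log_c_next = [-2.0, -3.0, -2.5]
delta_f = fep_free_energy(log_c_alpha, log_c_next, sampled_indices=[0, 1, 1, 2])
```

Observations that were recorded several times appear several times in `sampled_indices`, which is how the RE-SWHAM weights enter the average.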

We include three numerical examples of the application of RE-SWHAM in this letter. The first example is to estimate the binding of a guest molecule to a host at 300 K from 16 independent 12 ns long MD simulations with different Hamiltonians for which the coupling between the guest and the host is varied. See the Supporting Information and the Simulation Methods section for more details. We deliberately chose this set of short, unconverged, independent parallel simulations to guarantee obvious differences between the raw data and their RE-SWHAM estimates. The second example is to calculate the free energy differences between the 16 λ states at 300 K from a 72 ns one dimensional RE simulation using the same 16 Hamiltonians used in the first example. In both examples we applied UWHAM to analyze the data as the benchmark. As shown in Fig. 2(a) and (b), the binding energy distribution at the λ = 0.95 thermodynamic state and the free energy differences estimated by UWHAM and RE-SWHAM are indistinguishable. (See the Supporting Information for the distribution of binding energy at each λ thermodynamic state.)

Figure 2.


Numerical comparisons between UWHAM and RE-SWHAM. Plot (a) shows the Heptanoate-β-cyclodextrin binding energy distribution of λ = 0.95 state at 300 K obtained from a 12 ns independent MD simulation, and the UWHAM and RE-SWHAM estimates calculated from 16 independent 12 ns MD simulations run at different λ states. Plot (b) shows the free energy differences between these 16 λ states at 300 K estimated by UWHAM and RE-SWHAM from a 72 ns RE simulation. In both plots the UWHAM and RE-SWHAM results are indistinguishable.

To illustrate the ability of RE-SWHAM to analyze a very large data set which is difficult to analyze using standard UWHAM or MBAR methods, we applied RE-SWHAM to estimate the binding energy distributions from the data generated by a set of independent one dimensional RE simulations (each with 16 replicas, using the same 16 Hamiltonians used in the first two examples) run at 15 different temperatures. The data ensemble in this example contains 240 thermodynamic states varying in λ values and temperatures, and a total of 3.456 × 10^7 data elements. Fig. 3 shows the raw data and the RE-SWHAM estimates of the binding energy distributions of the λ = 1.0 state at 200 K and 300 K. We also plotted the results obtained from an asynchronous two dimensional 240 state RE simulation reported previously.16 These results serve as a converged benchmark. As can be seen, at low temperature (200 K), even the one dimensional RE simulations are not converged after 72 ns. However, compared with the two dimensional asynchronous RE simulation results, applying RE-SWHAM to this unconverged two dimensional (λ, T) data set led to significant improvements for the binding energy distribution at 200 K by using information from the simulations at higher temperature. At 300 K, RE-SWHAM estimates also show some improvements compared with the raw data. (See the Supporting Information for more discussion including the distribution of binding energy of the λ = 1.0 state at each of the 15 temperatures and the corresponding UWHAM estimate.) The RE-SWHAM estimates in Fig. 3 are the statistical results from the output stream of RE-SWHAM run for 12 minutes (2.0 × 10^6 RE cycles); in contrast, it took 15.7 hours to obtain similar converged estimates with UWHAM.

Figure 3.


An application of RE-SWHAM to analyze a very large data set containing 240 thermodynamic states and 3.456 × 10^7 data elements. These plots show the binding energy distributions of the λ = 1.0 state at 200 K and 300 K obtained from one dimensional RE simulations and the respective RE-SWHAM estimates calculated from 15 independent one dimensional RE simulations run at 15 different temperatures. The blue dashed lines are the binding energy distributions obtained from an asynchronous two dimensional 240 state RE simulation, which serve as a converged benchmark. The estimates from RE-SWHAM run for 12 minutes agree with those obtained from UWHAM (not shown) which required 15.7 hours to converge. The RE-SWHAM results show significant improvements for the binding energy distribution at 200 K compared with the unconverged one dimensional RE simulation data. At 300 K the one dimensional RE simulation data are better converged, but even for this data set, reweighting with RE-SWHAM leads to some improvements in the estimates of the low energy tail of the distribution.

As far as we are aware, this is the first time that the WHAM equations have been solved stochastically. More importantly, we believe RE-SWHAM is a promising tool to handle very large data sets. Over the last decade, hardware and software developments have made it possible to run enhanced sampling simulations with hundreds of thermodynamic states, or Umbrella Sampling with thousands of windows.16,17,21–23 These kinds of large-scale simulations provide significant sampling power to study complex biological systems; however, they also pose a challenge for reweighting techniques like UWHAM when analyzing the large raw data sets. For example, for the data obtained from a multicanonical simulation with M thermodynamic states and $\bar{N}_\alpha$ observations per state, UWHAM needs to manipulate a matrix with $\bar{N}_\alpha \times M^2$ elements, and the size of this matrix increases rapidly with the number of thermodynamic states M. The RE-SWHAM algorithm is a better choice for such scenarios because it only manipulates $\bar{N}_\alpha \times M$ data elements, and performs simple replica exchange processes during the analysis. We have developed another algorithm called “local WHAM” (LWHAM) to analyze very large data sets, which is based on the idea of only “WHAMMing” the data elements in the local neighborhood of each thermodynamic state (manuscript in preparation). Finally, we note a recent study by Meng and Roux, who proposed an algorithm based on a multivariate linear regression to process large data sets generated by Umbrella Sampling along a chosen reaction coordinate without solving the WHAM equations.23 RE-SWHAM can also be used to construct potentials of mean force from data generated by Umbrella Sampling.
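To make the scaling argument concrete, the short sketch below plugs in the size of the 240-state data set analyzed above. It assumes the observations are spread evenly over the states; the function name is hypothetical, and the 8 bytes per element mentioned afterwards is an assumption about storage, not a figure from this work.

```python
def element_counts(n_states, n_total):
    """Return (UWHAM matrix elements, RE-SWHAM data elements) for evenly populated states."""
    per_state = n_total // n_states          # average number of observations per state
    return per_state * n_states ** 2, per_state * n_states

uwham_elements, reswham_elements = element_counts(n_states=240, n_total=34_560_000)
ratio = uwham_elements // reswham_elements   # equals the number of states M
```

With double-precision storage at 8 bytes per element, these counts would correspond to roughly 66 GB for the UWHAM matrix against about 0.28 GB for the RE-SWHAM data arrays.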

Simulation Methods

The biological system used to generate simulation data is the binding of a guest molecule (Heptanoate) to a host molecule (β-cyclodextrins) — a problem we studied previously.24,25 The binding energy distribution analysis method (BEDAM) was applied to study the binding of Heptanoate/BCD complex.26 BEDAM is a free energy method based on RE simulations in which the interaction between ligand and acceptor is scaled by the factor λ changing gradually from zero to one, namely (H = H0 + λV ; 0 ≤ λ ≤ 1). Here we chose 16 λ values: (0.0, 0.001, 0.002, 0.004, 0.01, 0.04, 0.07, 0.1, 0.2, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0). For the two dimensional problem we “RE-SWHAMED” data from 240 states with different (λ, T) values. One dimensional replica exchange simulations each with 16 λ values were carried out independently at 15 temperatures: (200 K, 206 K, 212 K, 218 K, 225 K, 231 K, 238 K, 245 K, 252 K, 260 K, 267 K, 275 K, 283 K, 291 K, 300 K). There are replica exchanges between simulations of different λ states at each temperature but no replica exchanges allowed between simulations at different temperatures. Each one dimensional replica exchange simulation lasted 72 ns. (See Supporting Information for more details.)
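The 240 thermodynamic states are simply the Cartesian product of the 16 λ values and the 15 temperatures listed above, which can be written out as a small configuration sketch. The tuple layout and the choice of kcal/mol units for the Boltzmann constant are assumptions made for this example.

```python
from itertools import product

LAMBDAS = [0.0, 0.001, 0.002, 0.004, 0.01, 0.04, 0.07, 0.1,
           0.2, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0]
TEMPERATURES_K = [200, 206, 212, 218, 225, 231, 238, 245,
                  252, 260, 267, 275, 283, 291, 300]
K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); the unit system is assumed here

# One entry per thermodynamic state: (lambda, temperature in K, inverse temperature beta).
states = [(lam, temp, 1.0 / (K_B * temp))
          for temp, lam in product(TEMPERATURES_K, LAMBDAS)]
```

Each block of 16 consecutive entries corresponds to one of the independent one dimensional replica exchange simulations at a fixed temperature.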

Acknowledgments

This work was supported by grants from the National Science Foundation (CDI type II 1125332) and the National Institutes of Health (GM30580). We would like to acknowledge valuable scientific discussions with Dr. Emilio Gallicchio, Peng He and Wei Dai.

Footnotes

Supporting Information Available

The Heptanoate and β-cyclodextrins binding complex, additional plots to Fig. 2(a) and additional plots to Fig. 3. This material is available free of charge via the Internet at http://pubs.acs.org/.

References

  • 1. Ferrenberg A, Swendsen R. Optimized Monte Carlo Data Analysis. Phys. Rev. Lett. 1989;63:1195–1198. doi: 10.1103/PhysRevLett.63.1195.
  • 2. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA. The Weighted Histogram Analysis Method for Free-Energy Calculations on Biomolecules. I. The Method. J. Comput. Chem. 1992;13:1011–1021.
  • 3. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA. Multidimensional Free-energy Calculations Using the Weighted Histogram Analysis Method. J. Comput. Chem. 1995;16:1339–1350.
  • 4. Bartels C, Karplus M. Multidimensional Adaptive Umbrella Sampling: Applications to Main Chain and Side Chain Peptide Conformations. J. Comput. Chem. 1997;18:1450–1462.
  • 5. Tan Z. On a Likelihood Approach for Monte Carlo Integration. J. Am. Stat. Assoc. 2004;99:1027–1036.
  • 6. Gallicchio E, Andrec M, Felts AK, Levy RM. Temperature Weighted Histogram Analysis Method, Replica Exchange, and Transition Paths. J. Phys. Chem. B. 2005;109:6722–6731. doi: 10.1021/jp045294f.
  • 7. Chodera JD, Swope WC, Pitera JW, Seok C, Dill KA. Use of the Weighted Histogram Analysis Method for the Analysis of Simulated and Parallel Tempering Simulations. J. Chem. Theory Comput. 2007;3:26–41. doi: 10.1021/ct0502864.
  • 8. Shirts MR, Chodera JD. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008;129:124105. doi: 10.1063/1.2978177.
  • 9. Tan Z, Gallicchio E, Lapelosa M, Levy RM. Theory of Binless Multi-State Free Energy Estimation with Applications to Protein-Ligand Binding. J. Chem. Phys. 2012;136:144102. doi: 10.1063/1.3701175.
  • 10. Zhu F, Hummer G. Convergence and Error Estimation in Free Energy Calculations Using the Weighted Histogram Analysis Method. J. Comput. Chem. 2012;33:453–465. doi: 10.1002/jcc.21989.
  • 11. Law SM, Ahlstrom LS, Panahi A, Brooks CL III. Hamiltonian Mapping Revisited: Calibrating Minimalist Models to Capture Molecular Recognition by Intrinsically Disordered Proteins. J. Phys. Chem. Lett. 2014;5:3441–3444. doi: 10.1021/jz501811k.
  • 12. Wu H, Mey ASJS, Rosta E, Noé F. Statistically Optimal Analysis of State-Discretized Trajectory Data from Multiple Thermodynamic States. J. Chem. Phys. 2014;141:214106. doi: 10.1063/1.4902240.
  • 13. Rosta E, Hummer G. Free Energies from Dynamic Weighted Histogram Analysis Using Unbiased Markov State Model. J. Chem. Theory Comput. 2015;11:276–285. doi: 10.1021/ct500719p.
  • 14. Sugita Y, Okamoto Y. Replica-exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999;314:141–151.
  • 15. Lyman E, Ytreberg FM, Zuckerman DM. Resolution Exchange Simulation. Phys. Rev. Lett. 2006;96:028105. doi: 10.1103/PhysRevLett.96.028105.
  • 16. Xia J, Flynn WF, Gallicchio E, Zhang BW, He P, Tan Z, Levy RM. Large-scale Asynchronous and Distributed Multidimensional Replica Exchange Molecular Simulations and Efficiency Analysis. J. Comput. Chem. 2015;36:1772–1785. doi: 10.1002/jcc.23996.
  • 17. Gallicchio E, Xia J, Flynn WF, Zhang B, Samlalsingh S, Mentes A, Levy RM. Asynchronous Replica Exchange Software for Grid and Heterogeneous Computing. Comput. Phys. Commun. 2015 (in press). doi: 10.1016/j.cpc.2015.06.010.
  • 18. Chodera JD, Shirts MR. Replica Exchange and Expanded Ensemble Simulations as Gibbs Sampling: Simple Improvements for Enhanced Mixing. J. Chem. Phys. 2011;135:194110. doi: 10.1063/1.3660669.
  • 19. Bennett CH. Efficient Estimation of Free Energy Differences from Monte Carlo Data. J. Comput. Phys. 1976;22:245–268.
  • 20. Zwanzig RW. High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954;22:1420–1426.
  • 21. Jiang W, Luo Y, Maragliano L, Roux B. Calculation of Free Energy Landscape in Multi-Dimensions with Hamiltonian-Exchange Umbrella Sampling on Petascale Supercomputer. J. Chem. Theory Comput. 2012;8:4672–4680. doi: 10.1021/ct300468g.
  • 22. Kokubo H, Tanaka T, Okamoto Y. Two-dimensional Replica-Exchange Method for Predicting Protein-Ligand Binding Structures. J. Comput. Chem. 2013;34:2601–2614. doi: 10.1002/jcc.23427.
  • 23. Meng Y, Roux B. Efficient Determination of Free Energy Landscapes in Multiple Dimensions from Biased Umbrella Sampling Simulations Using Linear Regression. J. Chem. Theory Comput. 2015;11:3523–3529. doi: 10.1021/ct501130r.
  • 24. Wickstrom L, He P, Gallicchio E, Levy RM. Large Scale Affinity Calculations of Cyclodextrin Host-Guest Complexes: Understanding the Role of Reorganization in the Molecular Recognition Process. J. Chem. Theory Comput. 2013;9:3136–3150. doi: 10.1021/ct400003r.
  • 25. Gallicchio E, Levy RM. Prediction of SAMPL3 Host-Guest Affinities with the Binding Energy Distribution Analysis Method (BEDAM). J. Comput. Aided Mol. Des. 2012;26:505–516. doi: 10.1007/s10822-012-9552-3.
  • 26. Gallicchio E, Lapelosa M, Levy RM. The Binding Energy Distribution Analysis Method (BEDAM) for the Estimation of Protein-Ligand Binding Affinities. J. Chem. Theory Comput. 2010;6:2961–2977. doi: 10.1021/ct1002913.
