The Journal of Chemical Physics. 2012 Feb 13; 136(6): 064108. doi: 10.1063/1.3681941

Markov processes follow from the principle of maximum caliber

Hao Ge 1,2,3,a), Steve Pressé 4, Kingshuk Ghosh 5, Ken A Dill 6
PMCID: PMC3292588  PMID: 22360170

Abstract

Markov models are widely used to describe stochastic dynamics. Here, we show that Markov models follow directly from the dynamical principle of maximum caliber (Max Cal). Max Cal is a method of deriving dynamical models based on maximizing the path entropy subject to dynamical constraints. We give three different cases. First, we show that if constraints (or data) are given in the form of singlet statistics (average occupation probabilities), then maximizing the caliber predicts a time-independent process that is modeled by independent, identically distributed random variables. Second, we show that if constraints are given in the form of sequential pairwise statistics, then maximizing the caliber dictates that the kinetic process will be Markovian with a uniform initial distribution. Third, if the initial distribution is known and is not uniform, we show that the only process that maximizes the path entropy is still the Markov process. We give an example of how Max Cal can be used to discriminate between different dynamical models given data.

INTRODUCTION

Dynamical processes are commonly modeled as Markov processes.1, 2 When making a dynamical model, what is the underlying justification for asserting a Markov model? We show here that Markov modeling can be justified on the grounds of the principle of maximum caliber (Max Cal), the dynamical analog of maximum entropy.3 Max Cal provides a general foundation for stochastic dynamics in the same way that the principle of maximizing the entropy provides a foundation for equilibrium statistical physics.4, 5, 6, 7, 8

In Max Cal, one begins with knowledge of all the possible microscopic trajectories that a system could take in time. A small number of dynamical quantities are measured (far fewer than the total number of possible microscopic trajectories). These quantities can include known average macroscopic fluxes. The path entropy is then maximized subject to the dynamical constraints on these average quantities (called maximizing the caliber) to build a weighted ensemble of these microscopic trajectories consistent with the constrained averages. The ensemble is then used to infer other dynamical averages and fluctuations.9, 10, 11, 12, 13, 14, 15, 16 The first application of this basic idea is probably due to Filyukov and Karpov who, in 1967,17 assumed dynamical microscopic trajectories that could be treated as Markov chains. They then maximized the caliber to parameterize the rates of their model, which was assumed to be Markovian from the outset.

Here, we pose Filyukov and Karpov's problem in reverse and ask: given dynamical constraints, what models will maximize the caliber? The answer to this question provides a systematic recipe for building dynamical models. A common approach to dynamical modeling is to assert a particular kinetic model having a given set of rates as parameters. Then, the parameters of the model are fit to the experimental data. However, this type of approach has limitations: the models are often not unique, they are sometimes overparameterized, and they are not guaranteed to provide a principled description of the data beyond the known average quantities that are used as input. We are interested in answering the following question: Without invoking a hypothetical model in advance, how much can the data alone tell us about how to model the dynamics of the system underlying it?

THE MAXIMUM-CALIBER APPROACH

Suppose we have a stochastic process with a discrete state space Γ = {1, 2, ⋅ ⋅ ⋅, N}.

Consider its trajectories of length T and denote the probability of the trajectory {i_0, i_1, ⋯, i_T} by p_{i_0 i_1 ⋯ i_T}, where i_n is the state visited at time n. The path entropy, S(T), is

S(T) = −∑_{i_0, i_1, …, i_T} p_{i_0 i_1 ⋯ i_T} log p_{i_0 i_1 ⋯ i_T}. (1)

Each variable i_n in the sum runs from 1 to N. Equation 1 is the measure which, when maximized subject to constraints, yields the least biased probability for the set of microscopic trajectories, {p_{i_0 i_1 ⋯ i_T}}, consistent with the observed constraints.18, 19 For a trajectory of length T = 1, p_{12} is the probability of being in state 1 at time 0 and state 2 at time 1. The sum in Eq. 1 is then over all possible combinations of states at times 0 and 1.
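For a small state space, Eq. 1 can be evaluated by brute-force enumeration of all N^(T+1) trajectories. Below is a minimal Python sketch of our own (not from the paper); the unconstrained uniform case, where S(T) = (T + 1) log N, serves as a sanity check:

```python
import itertools
import math

def path_entropy(p, N, T):
    """S(T) = -sum over all (T+1)-state paths of p(path) * log p(path), Eq. 1."""
    S = 0.0
    for path in itertools.product(range(N), repeat=T + 1):
        prob = p(path)
        if prob > 0.0:
            S -= prob * math.log(prob)
    return S

# With no constraints at all, the caliber is maximized by the uniform
# distribution over all N**(T+1) paths, for which S(T) = (T+1) * log(N).
N, T = 3, 2
S = path_entropy(lambda path: 1.0 / N ** (T + 1), N, T)
assert abs(S - (T + 1) * math.log(N)) < 1e-12
```

Enumeration is exponential in T, so this is only a pedagogical check, not a practical computational scheme.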

Furthermore, if p_{i_0 i_1 ⋯ i_T} is a path probability, then it must satisfy certain logical consistency requirements, as established by Kolmogorov.20

Specifically, for any T equal to 0 or to any positive integer, the probability distribution for a microscopic trajectory must satisfy

∑_{i_{T+1}} p_{i_0 ⋯ i_T i_{T+1}} = p_{i_0 ⋯ i_T}, (2)

i.e., the distribution of the process during the time interval {0, 1, 2, ⋯, T} is the distribution of the process during the time interval {0, 1, 2, ⋯, T + 1} marginalized over the state at time T + 1.

Constraint on the mean of the singlet distribution

First, we suppose we only know the mean number of time intervals during which the system dwells in each state m. Let A_m denote this quantity for a trajectory of length T:

A_m(T) = ∑_{i_0, i_1, …, i_T} p_{i_0 i_1 ⋯ i_T} ∑_{k=0}^{T} δ_{i_k, m}, (3)

where δ_{i, j} = 1 when i = j and 0 otherwise. Thus, in Eq. 3, ∑_{k=0}^{T} δ_{i_k, m} counts the number of times state m is visited. By averaging this quantity over all microscopic trajectories, we find the mean number of times this state is visited from time 0 to T. We, therefore, have that ∑_m A_m = T + 1. When data are given in terms of quantities like A_m(T), we refer to these as “singlet statistics.”

Now, we maximize the path entropy in Eq. 1 subject to constraints on our singlet statistics, imposed using Lagrange multipliers. That is, we maximize the caliber, S(T) − ∑_m λ_m A_m(T), with respect to each p_{i_0 i_1 ⋯ i_T}, where λ_m is the Lagrange multiplier for state m. We therefore solve

∂[S(T) − ∑_m λ_m A_m(T)]/∂p_{i_0 i_1 ⋯ i_T} = −1 − log p_{i_0 i_1 ⋯ i_T} − ∑_m λ_m ∑_{k=0}^{T} δ_{i_k, m} = 0,

which yields

p_{i_0 i_1 ⋯ i_T} ∝ e^{−∑_m λ_m ∑_{k=0}^{T} δ_{i_k, m}} = ∏_{k=0}^{T} e^{−λ_{i_k}}. (4)

Note that these λ's are all implicitly dependent on the length of the microscopic trajectory in time, T.

Once normalized, the probability becomes

p_{i_0 i_1 ⋯ i_T} = ∏_{k=0}^{T} e^{−λ_{i_k}} / ∑_{i_0 i_1 ⋯ i_T} ∏_{k=0}^{T} e^{−λ_{i_k}} = ∏_{k=0}^{T} e^{−λ_{i_k}} / (∑_{i=1}^{N} e^{−λ_i})^{T+1} = ∏_{k=0}^{T} p_{i_k}(T), (5)

where p_{i_k}(T) = e^{−λ_{i_k}} / ∑_{i=1}^{N} e^{−λ_i} is the probability of being in state i_k at time k for a trajectory of length T. The trajectory probability p_{i_0 i_1 ⋯ i_T} thus breaks up into a product of independent probabilities, one for each time interval.

Therefore, we have

A_m = ∑_{i_0, i_1, …, i_T} p_{i_0 i_1 ⋯ i_T} ∑_{k=0}^{T} δ_{i_k, m} = (T + 1) p_m(T), (6)

where p_m(T) = ∑_i p_i(T) δ_{i, m} is interpreted as the probability of being found in state m over the course of time T.
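Equations 5, 6 are easy to verify numerically for a small system by enumerating all trajectories. In the sketch below (our own check, not from the paper; the λ values are arbitrary illustrative numbers), the path probability is built from Eq. 5 and the constrained average A_m of Eq. 3 is compared against (T + 1) p_m:

```python
import itertools
import math

N, T = 3, 4
lams = [0.2, -0.5, 1.0]                      # illustrative Lagrange multipliers
Z = sum(math.exp(-lam) for lam in lams)      # single-time normalizer in Eq. 5
p_state = [math.exp(-lam) / Z for lam in lams]

def p_path(path):
    """Eq. 5: the path probability factorizes into singlet probabilities."""
    prob = 1.0
    for i in path:
        prob *= p_state[i]
    return prob

# Eq. 6: the mean occupation count of state m equals (T + 1) * p_m.
for m in range(N):
    A_m = sum(p_path(path) * sum(1 for i in path if i == m)
              for path in itertools.product(range(N), repeat=T + 1))
    assert abs(A_m - (T + 1) * p_state[m]) < 1e-12
```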

As a final note on this example, we consider the result of imposing the consistency condition, Eq. 2, to constrain the distribution obtained above. From this we obtain for the time interval [0, T + 1],

∑_{i_{T+1}} p_{i_0 ⋯ i_T i_{T+1}} = ∏_{k=0}^{T} p_{i_k}(T+1) ∑_{i_{T+1}} p_{i_{T+1}}(T+1) = ∏_{k=0}^{T} p_{i_k}(T+1). (7)

Since ∑_{i_{T+1}} p_{i_0 ⋯ i_T i_{T+1}} = p_{i_0 i_1 ⋯ i_T} = ∏_{k=0}^{T} p_{i_k}(T) by the Max Cal result for the time interval [0, T], it then follows that

∏_{k=0}^{T} p_{i_k}(T+1) = ∏_{k=0}^{T} p_{i_k}(T), (8)

which gives

p_{i_k}(T+1) = p_{i_k}(T) (9)

for any i_k.

The proof of Eq. 9 from Eq. 8 follows from summing both sides of Eq. 8 over all indices except i_k. Since the state i_k is arbitrary, we recover Eq. 9.

So, under singlet constraints, maximizing the caliber leads to a model which is an independent, identically distributed (i.i.d.) process. Equation 5 is the statement of independence and Eq. 9 is the statement of identical distributions.

Constraint on pairwise statistics

Now we consider instead the situation in which the constraint is on the pairwise statistics for each step m → n over the time period [0, T], i.e.,

A_{mn} = ∑_{i_0, …, i_T} p_{i_0 ⋯ i_T} ∑_{k=0}^{T−1} δ_{i_k, m} δ_{i_{k+1}, n}. (10)

Here, ∑_{k=0}^{T−1} δ_{i_k, m} δ_{i_{k+1}, n} is just the number of occurrences of the transition m → n, and ∑_{m, n} A_{mn} = T.

Now we maximize the path entropy in Eq. 1 subject to constraints on the pairwise statistics given by Eq. 10. We use the quantities λ_{mn} as Lagrange multipliers to constrain the A_{mn}. Following a derivation similar to that of Eq. 5, we now get

p_{i_0 ⋯ i_T} = ∏_{k=0}^{T−1} p_{i_k i_{k+1}} ∝ e^{−∑_{m,n} λ_{mn} ∑_{k=0}^{T−1} δ_{i_k, m} δ_{i_{k+1}, n}}, (11)

where p_{i_k i_{k+1}} ∝ e^{−λ_{i_k i_{k+1}}}. Following reasoning similar to Eqs. 7, 8, we find

p_{i_k i_{k+1}}(T+1) = p_{i_k i_{k+1}}(T). (12)

Equations 11, 12 are the main results of this subsection. In particular, Eq. 11 shows that when transitions are used as constraints, the procedure of Max Cal yields a Markov process as the model, where the probability of a microtrajectory is given by the product of the independent transition probabilities. Furthermore, Eq. 12 says that these transition probabilities are independent of the total trajectory length.
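The Markov form of Eq. 11 can be checked directly: path probabilities built as a product of one-step transition factors are normalized and satisfy the Kolmogorov consistency condition, Eq. 2. The transition matrix in this sketch is an arbitrary illustrative choice of our own:

```python
import itertools

# Illustrative row-stochastic transition probabilities p_{ij}.
P = [[0.7, 0.2, 0.1],
     [0.3, 0.4, 0.3],
     [0.5, 0.25, 0.25]]
N, T = 3, 3
p0 = [1.0 / N] * N                  # uniform initial distribution

def p_path(path):
    """Markov path probability: initial weight times one-step factors."""
    prob = p0[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    return prob

# Normalization over all (T+1)-state paths.
total = sum(p_path(path) for path in itertools.product(range(N), repeat=T + 1))
assert abs(total - 1.0) < 1e-12

# Kolmogorov consistency, Eq. 2: marginalizing over the last state
# recovers the probability of the shorter path.
for stub in itertools.product(range(N), repeat=T):
    marg = sum(p_path(stub + (j,)) for j in range(N))
    assert abs(marg - p_path(stub)) < 1e-12
```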

What if the initial distribution is not uniformly distributed?

In the two preceding subsections, we did not use any information about the initial condition of the states; hence, we implicitly assumed uniform initial conditions. In many cases, however, the initial conditions are known and not uniform. How can we include this information within Max Cal? As an example, if a system is subjected to a temperature spike at time t = 0, prior to relaxation, its initial conditions will not be uniform. By this we mean that the molecules (e.g., proteins) are initially sampled from a particular (presumably unequilibrated) conformational state. T-jump experiments are useful, for instance, in studying the folding kinetics of proteins starting from a denatured state.21

Max Cal still applies in cases where the initial condition is not uniform. However, we must now condition the path entropy on the known initial distribution, p_{i_0}(0) ≡ p_{i_0}(t = 0). This is an additional piece of information that can be regarded as a constraint on the path probability. In general, we will denote the probability of occupying state i_n at time n as p_{i_n}(n).

In other words, the following expression:

p_{i_0 i_1 ⋯ i_T} = p_{i_0}(0) p_{i_1 ⋯ i_T | i_0} (13)

must be substituted into the definition of the path entropy, Eq. 1, which is then maximized subject to any further constraint available on p_{i_0}(0). In the above, p_{i_1 ⋯ i_T | i_0} denotes the probability of the path from time 1 to T conditioned on being in state i_0 at time 0.

Having pre-specified the initial conditions and now constraining pairwise statistics, we recover

p_{i_0 ⋯ i_T} = p_{i_0}(0) ∏_{k=0}^{T−1} p_{i_k i_{k+1}}. (14)

In this case, the state occupancies, the p_i’s, are not necessarily independent of time. A particular state occupancy p_{i_k}(k) can depend on how much time has elapsed since t = 0. To see this explicitly, we note that ∑_{i_0} p_{i_0}(0) p_{i_0 i_1} is now equal to p_{i_1}(1), which might differ from the steady-state value (denoted π_{i_1}). The steady-state value would instead be obtained from ∑_{i_0} π_{i_0} p_{i_0 i_1}. Upon specifying initial conditions, we have in general

∑_{i_n} p_{i_n}(n) p_{i_n i_{n+1}} = p_{i_{n+1}}(n + 1). (15)
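Equation 15 is an ordinary matrix-vector propagation of the occupation probabilities, so a non-uniform initial condition simply relaxes toward the stationary distribution of the transition matrix. A small sketch of our own (the two-state transition matrix is illustrative):

```python
# Eq. 15: p_{i_{n+1}}(n+1) = sum_{i_n} p_{i_n}(n) * p_{i_n i_{n+1}}.
P = [[0.9, 0.1],
     [0.4, 0.6]]          # illustrative transition probabilities
p = [1.0, 0.0]            # non-uniform initial condition: all weight on state 0

def step(p, P):
    """One application of Eq. 15."""
    return [sum(p[i] * P[i][j] for i in range(len(p)))
            for j in range(len(P[0]))]

for _ in range(200):      # repeated propagation relaxes toward steady state
    p = step(p, P)

# The stationary distribution of this chain is pi = (0.8, 0.2),
# since 0.1 * pi_0 = 0.4 * pi_1 at stationarity.
assert abs(p[0] - 0.8) < 1e-9 and abs(p[1] - 0.2) < 1e-9
```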

When initial conditions are not specified, i.e., no constraints are imposed on p_{i_0}(0), then plugging Eq. 14 into the expression for the path entropy, Eq. 1, we have

S(T) = −∑_{i_0, i_1, …, i_T} p_{i_0}(0) ∏_{k=0}^{T−1} p_{i_k i_{k+1}} log [p_{i_0}(0) ∏_{k=0}^{T−1} p_{i_k i_{k+1}}].

The p_{i_0}(0) which maximizes this expression (i.e., the one obtained by taking the derivative of the above with respect to p_{i_0}(0) and setting that derivative equal to zero) is the uniform distribution, i.e., the initial distribution with equal a priori weight assigned to each state. The same result is obtained in statistical mechanics by maximizing the entropy over states without constraints. The initial distribution obtained this way is, in general, not the stationary distribution.

In theory, when the stationary distribution is known a priori, i.e., when π_{i_0} is known, this value can be used as a prior on p_{i_0}(0). That is,

S(T) = −∑_{i_0, i_1, …, i_T} p_{i_0}(0) ∏_{k=0}^{T−1} p_{i_k i_{k+1}} × log [p_{i_0}(0) ∏_{k=0}^{T−1} p_{i_k i_{k+1}} / π_{i_0}]. (16)

In this case, the value of p_{i_0}(0) which maximizes the path entropy is found, upon normalization, to be π_{i_0}.

In practical terms, for a long trajectory, where each state is visited multiple times, the steady state is reliably obtained as follows. We define the dynamical partition function

Q_d(T) = ∑_{i_0, …, i_T} e^{−∑_{m,n} λ_{mn} ∑_{k=0}^{T−1} δ_{i_k, m} δ_{i_{k+1}, n}}. (17)

Taking derivatives of this dynamical partition function gives the mean number of m → n transitions, A_{mn} = −∂log Q_d(T)/∂λ_{mn} (resembling the way that derivatives of equilibrium partition functions yield equilibrium averages and higher cumulants). When T is large enough, we have ∑_m A_{mn}(T)/T → π_n, the stationary probability of state n.

Thus, even when the initial conditions are assumed to be uniform, rather than selected from the stationary distribution, the system still tends towards the correct stationary state.
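The large-T statement ∑_m A_{mn}(T)/T → π_n can be illustrated by counting transitions along one long simulated trajectory, estimating A_{mn} by its empirical count (a sketch of our own with illustrative transition probabilities; π_A = 0.8 for this chain):

```python
import random

random.seed(0)
P = {('A', 'A'): 0.9, ('A', 'B'): 0.1,     # illustrative transition
     ('B', 'A'): 0.4, ('B', 'B'): 0.6}     # probabilities

T = 200_000
state = 'A'
counts = {pair: 0 for pair in P}           # empirical A_mn
for _ in range(T):
    nxt = 'A' if random.random() < P[(state, 'A')] else 'B'
    counts[(state, nxt)] += 1
    state = nxt

# sum_m A_mn / T approaches the stationary probability pi_n.
pi_A = (counts[('A', 'A')] + counts[('B', 'A')]) / T
assert abs(pi_A - 0.8) < 0.01
```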

A note on data analysis is warranted. We have just seen that constraints determine the functional form of the probability for observing a particular trajectory in maximum caliber – in other words, they justify the Markov chain. Max Cal thus yields the mathematical form for the likelihood function used to describe the probability of a particular trajectory. Parameterizing the model – i.e., finding numerical values for all transition probabilities – is then equivalent to maximizing the caliber.

MODEL-BUILDING IN MAX CAL IS DRIVEN BY THE DATA

Max Cal gives a systematic procedure for building kinetic models. To build a kinetic model, pairwise statistics are first used as constraints. But how do we know this is the correct model? If measured correlation functions show this constraint to be insufficient, triplet statistics are used as constraints, and so forth. Below we show a simple problem where pairwise statistics give the correct distribution, and another, slightly more complex, example where they make the wrong prediction. The example further shows how we can improve the prediction by considering higher-order statistics.

Suppose the underlying physics is just two-state, A ⇌ B, with forward rate k_A and backward rate k_B, and the populations of A and B are measured over time intervals of length δt. One could apply Max Cal by first considering a microtrajectory starting in state A, i.e., using p_A(0) = 1 and p_B(0) = 0. For example, suppose we measure p_AA = 1 − k_A δt, where p_AA is the probability that the system is still in state A at time δt, given that it was in state A at time t = 0. Since p_AB + p_AA = 1, then p_AB = k_A δt. The probability of staying a total time t ≡ Nδt in state A is p_A(t) ≈ p_AA^N = (1 − k_A δt)^N ≈ exp(−k_A t) for small enough δt. This is the expectation based only on pairwise constraints. The same reasoning applies to p_BB, p_BA, and p_B(t). We know that for such a system the steady-state distribution is p_A^ss = k_B/(k_A + k_B) and p_B^ss = k_A/(k_A + k_B). These are obtained from p_A^ss = p_BA/(p_AB + p_BA) and p_B^ss = p_AB/(p_AB + p_BA). To summarize, Max Cal infers the correct steady-state distribution after repeated measurements, i.e., accurate measurements of p_AB and p_BA, though, in the absence of initial conditions, it infers uniform initial conditions. Moreover, if the observed trajectory is sufficiently long, Max Cal predicts the correct steady-state occupation of each state, no matter how many states there are, provided all states are clearly resolved.
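The two-state bookkeeping above can be spelled out numerically; in the sketch below, the rates and δt are illustrative values of our own choosing:

```python
import math

kA, kB, dt = 2.0, 3.0, 1e-3
pAB = kA * dt                 # P(B at t + dt | A at t), to first order in dt
pBA = kB * dt

# Steady state inferred from pairwise statistics alone:
# p_A^ss = pBA / (pAB + pBA) reproduces kB / (kA + kB).
pA_ss = pBA / (pAB + pBA)
pB_ss = pAB / (pAB + pBA)
assert abs(pA_ss - kB / (kA + kB)) < 1e-12
assert abs(pB_ss - kA / (kA + kB)) < 1e-12

# Survival in A: (1 - kA*dt)**N approaches exp(-kA*t) for small dt.
t = 0.5
N = round(t / dt)
assert abs((1 - kA * dt) ** N - math.exp(-kA * t)) < 1e-3
```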

Now we consider the case where not all states are resolved. Consider the following reaction scheme as an example: A → B → C, with rate k_A for the first step and rate k_C for the second. Suppose that species A and B cannot be resolved experimentally; we label the aggregate of A and B as Ā. When not all states are resolved, transitions out of the aggregated state are not Markovian (technically, not first-order Markov). In this case, the probability of dwelling in state Ā as a function of t, p_Ā(t), is

p_Ā(t) = p_A(t) + p_B(t) = (k_C − k_A)^{−1} (k_C e^{−k_A t} − k_A e^{−k_C t}). (18)
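Equation 18 can be cross-checked against a direct forward-Euler integration of the master equation for A → B → C; the rates, final time, and step size in this sketch are illustrative choices of our own:

```python
import math

kA, kC = 1.0, 2.5
t_final, n_steps = 1.0, 200_000
h = t_final / n_steps

pA, pB = 1.0, 0.0                 # start in state A
for _ in range(n_steps):
    # dpA/dt = -kA*pA ; dpB/dt = kA*pA - kC*pB  (tuple assignment keeps
    # the update simultaneous)
    pA, pB = pA + h * (-kA * pA), pB + h * (kA * pA - kC * pB)

closed_form = (kC * math.exp(-kA * t_final)
               - kA * math.exp(-kC * t_final)) / (kC - kA)
assert abs((pA + pB) - closed_form) < 1e-4
```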

Theoretically, there are two ways of backing out a model. One is to fit, or to infer, the number of exponential contributions to the decay of Eq. 18 and thereby infer the intermediate state B.22 The other is to build a Markovian model from Max Cal through entropy maximization with finite-step constraints. In practice, Max Cal is more general (it does not rely on exponentials) and is easier to apply.

Pairwise constraints, such as above, would predict an exponential distribution for p_Ā(t) instead of Eq. 18. Clearly, this would be inconsistent with a measurement described by Eq. 18. From pairwise constraints, we could also predict quantities like the three-point probability p_{i_0 i_1 i_2} = p_ĀĀĀ within δt. However, our prediction would be incorrect.

We, therefore, turn to data beyond two-point constraints to see if we can improve our prediction. We use the set of three-point constraints, such as p_ĀĀĀ, based on the underlying model A → B → C. The constraint p_ĀĀĀ is computed as follows:

p_ĀĀĀ = p_AAA + p_AAB + p_ABB, (19)

where p_AAA = p_AA p_AA = (1 − k_A(δt/2))², p_AAB = p_AA p_AB = (1 − k_A(δt/2)) k_A(δt/2), and p_ABB = p_AB p_BB = k_A(δt/2)(1 − k_C(δt/2)). Therefore,

p_ĀĀĀ = 1 − k_A k_C (δt/2)². (20)

In a similar fashion, we find p_ĀĀC = k_A k_C (δt/2)² and p_ĀCC = 0, where p_ĀĀC = p_ABC = k_A k_C (δt/2)².

Now, expanding p_Ā(t), Eq. 18, to second order at t = δt, we have p_Ā(δt) ≈ 1 − k_A k_C δt²/2. Comparing this with p_ĀĀĀ, it is clear that both capture the correct k_A and k_C dependence to second order in δt, though the pre-factors do not agree (1/2 versus 1/4). Predictions of other higher-order trajectories would also be in better agreement given third-order constraints.

This exercise can be carried out to even higher order, and the difference in the pre-factor is eliminated as more and more steps are used as constraints. For instance, using fourth-order time constraints we find

p_ĀĀĀĀ = 1 − 3k_A k_C [(δt/3)² − (k_A + k_C)(δt/3)³/3]. (21)

This should be compared with the expansion of Eq. 18 to third order in δt, which yields

p_Ā(δt) = 1 − (k_A k_C/2)[(δt)² − (k_A + k_C)(δt)³/3]. (22)

This time the prefactor of the second-order term in Eq. 21 changes to 1/3, from 1/4 in Eq. 20, and in the limit of a large number of time steps it approaches the correct value of 1/2. Thus, we find that with higher-order constraints we capture more of the behavior of the full profile p_Ā(t) when the underlying process cannot be accurately determined by a two-point constraint. The example above outlines how to improve the choice of constraints depending on the data.
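The trend in the prefactor (1/4 with two sub-steps, 1/3 with three, approaching 1/2) can be reproduced by propagating the dynamics restricted to {A, B} with n sub-steps of size δt/n. In this sketch of our own, the rates and δt are illustrative; to leading order in δt the prefactor is (n − 1)/(2n):

```python
kA, kC, dt = 1.0, 2.0, 1e-3

def stay_prob(n):
    """Probability of remaining in the aggregate A-bar over [0, dt],
    using n sub-steps of size dt/n and starting from A."""
    a, c = kA * dt / n, kC * dt / n
    pA, pB = 1.0, 0.0
    for _ in range(n):          # A -> B with prob a; B -> C with prob c
        pA, pB = pA * (1 - a), pA * a + pB * (1 - c)
    return pA + pB

def prefactor(n):
    """Coefficient of kA*kC*dt**2 in 1 - stay_prob(n); ~ (n-1)/(2n)."""
    return (1.0 - stay_prob(n)) / (kA * kC * dt ** 2)

assert abs(prefactor(2) - 1 / 4) < 1e-3     # Eq. 20
assert abs(prefactor(3) - 1 / 3) < 1e-3     # Eq. 21
assert abs(prefactor(1000) - 1 / 2) < 1e-2  # limit of many sub-steps
```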

CONCLUSIONS AND DISCUSSION

The dynamics of bulk chemical reaction kinetics are governed by the law of mass action. For chemical reactions in small systems, like cellular biochemical environments, the dynamics are stochastic, and many models assume Markovian dynamics. On the other hand, with growing microscopic biochemical data on dynamical systems, a statistical approach to chemical kinetics founded on the inferential principles of maximum entropy is desirable. For example, it is well known that the exponential distribution for a positive random variable and the Gaussian distribution, both essential to chemical kinetics, can be justified using Jaynes' "principle of maximum entropy."7, 8 It is in this spirit that the approach based on Max Cal for statistical dynamics was proposed.3, 9, 10, 11, 12 Maximum caliber can be used both as a first principle from which to derive stochastic dynamical models and as a method of data analysis. We showed here that Markovian dynamics can be justified from Max Cal; the Markov property is a natural consequence of maximizing a dynamical entropy over trajectories for time-independent processes. Our treatment can be generalized to handle dynamical systems having finite memory (time delays), corresponding to cases of inference used in time-series analysis.23

ACKNOWLEDGMENTS

We thank Professor Hong Qian at the University of Washington and Professor Julian Lee at Soongsil University in Seoul for very helpful discussions. We acknowledge National Institutes of Health (NIH) Grant Nos. R01GM 34993 and 1R01GM090205. In addition, H. Ge acknowledges support by the NSF of China (NSFC) 10901040 and the specialized Research Fund for the Doctoral Program of Higher Education (New Teachers) 20090071120003. This article was first written while H. Ge visited the Xie group in the Department of Chemistry and Chemical Biology, Harvard University. H. Ge thanks Professor Xiaoliang Sunney Xie and his department for their support and hospitality.

References

  1. van Kampen N. G., Stochastic Processes in Chemistry and Physics (North-Holland, Amsterdam, 1981).
  2. Chung K. L., Lectures from Markov Processes to Brownian Motion (Springer-Verlag, New York, 1982).
  3. Jaynes E. T., “Macroscopic prediction,” in Complex Systems – Operational Approaches in Neurobiology, Physics, and Computers, edited by Haken H. (Springer-Verlag, Berlin, 1985).
  4. Jaynes E. T., Probability Theory: The Logic of Science (Cambridge University Press, London, 2003).
  5. Steinbach P. J., Chu K., Frauenfelder H., Johnson J. B., Lamb D. C., Nienhaus G. U., Sauke T. B., and Young R. D., Biophys. J. 61, 235 (1992). doi: 10.1016/S0006-3495(92)81830-1
  6. Gull S. F. and Daniell G. J., Nature (London) 272, 686 (1978). doi: 10.1038/272686a0
  7. Jaynes E. T., Phys. Rev. 106, 620 (1957). doi: 10.1103/PhysRev.106.620
  8. Jaynes E. T., Phys. Rev. 108, 171 (1957). doi: 10.1103/PhysRev.108.171
  9. Seitaridou E., Inamdar M. M., Phillips R., Ghosh K., and Dill K. A., J. Phys. Chem. B 111, 2288 (2007). doi: 10.1021/jp067036j
  10. Pressé S., Ghosh K., Phillips R., and Dill K. A., Phys. Rev. E 82, 031905 (2010). doi: 10.1103/PhysRevE.82.031905
  11. Pressé S., Ghosh K., and Dill K. A., J. Phys. Chem. B 115, 6202 (2011). doi: 10.1021/jp111112s
  12. Otten M. and Stock G., J. Chem. Phys. 133, 034119 (2010). doi: 10.1063/1.3455333
  13. Stock G., Ghosh K., and Dill K. A., J. Chem. Phys. 128, 194102 (2008). doi: 10.1063/1.2918345
  14. Smith E., Rep. Prog. Phys. 74, 046601 (2011). doi: 10.1088/0034-4885/74/4/046601
  15. Monthus C., J. Stat. Mech. 2011, P03008. doi: 10.1088/1742-5468/2011/03/P03008
  16. Ghosh K., J. Chem. Phys. 134, 195101 (2011). doi: 10.1063/1.3590918
  17. Filyukov A. A. and Karpov V. Ya., Inzh.-Fiz. Zh. 13, 798 (1967).
  18. Livesey A. K. and Skilling J., Acta Cryst. A41, 113 (1985).
  19. Shore J. E. and Johnson R. W., IEEE Trans. Inf. Theory IT-26, 26 (1980). doi: 10.1109/TIT.1980.1056144
  20. Kolmogorov A. N., Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer, Berlin, 1933); Foundations of the Theory of Probability (Chelsea, New York, 1950) (in English).
  21. Gruebele M., Annu. Rev. Phys. Chem. 50, 485 (1999). doi: 10.1146/annurev.physchem.50.1.485
  22. Lu H. P., Xun L., and Xie X. S., Science 282, 1877 (1998). doi: 10.1126/science.282.5395.1877
  23. Brockwell P. J. and Davis R. A., Time Series: Theory and Methods, 2nd ed. (Springer-Verlag, New York, 2009).
