Geometric fluid approximation for general continuous-time Markov chains

Michalis Michaelides; Jane Hillston; Guido Sanguinetti

doi:10.1098/rspa.2019.0100

2019 Sep 25;475(2229):20190100. doi: 10.1098/rspa.2019.0100

PMCID: PMC6784392 PMID: 31611711

Abstract

Fluid approximations have seen great success in approximating the macro-scale behaviour of Markov systems with a large number of discrete states. However, these methods rely on the continuous-time Markov chain (CTMC) having a particular population structure which suggests a natural continuous state-space endowed with a dynamics for the approximating process. We construct here a general method based on spectral analysis of the transition matrix of the CTMC, without the need for a population structure. Specifically, we use the popular manifold learning method of diffusion maps to analyse the transition matrix as the operator of a hidden continuous process. An embedding of states in a continuous space is recovered, and the space is endowed with a drift vector field inferred via Gaussian process regression. In this manner, we construct an ordinary differential equation whose solution approximates the evolution of the CTMC mean, mapped onto the continuous space (known as the fluid limit).

Keywords: continuous-time Markov chains, Markov jump processes, fluid approximation, diffusion maps, Gaussian processes

1. Introduction

Stochastic process models of dynamical systems play a central role in scientific investigations across a broad range of disciplines, from computer science, to physics, to biology. Continuous-time Markov chains (CTMCs), in particular, have emerged over the last two decades as an especially powerful class of models to capture the intrinsic discreteness and stochasticity of biological systems at the single-cell level. The ensuing cross-fertilization of stochastic systems biology with methods emerging from the formal modelling of computer systems has led to the dramatic explosion of a new interdisciplinary field at the intersection of computer science and biology [1–3].

Despite the unquestionable success of these modelling efforts, scaling formal analysis techniques to larger systems remains a major challenge, since such systems usually result in very large state-spaces, making subsequent analysis particularly onerous. In most cases, retrieving the evolution of the state distribution, while theoretically possible (by solving the Chapman–Kolmogorov equations), is in practice prohibitively expensive. These hurdles also affect statistical techniques based on Monte Carlo sampling, since trajectories from CTMCs with a large state-space typically exhibit very frequent transitions and therefore require the generation of a very large number of random numbers. A popular alternative is therefore to rely on model approximations, by constructing alternative models which in some sense approximate the system's behaviour. In the special case of CTMCs with a population structure (pCTMCs), fluid approximations replacing the original dynamics with a deterministic set of ordinary differential equations (ODEs) have seen great success, due to their scalability and their well-understood convergence properties [2,4,5]. Such approximations rely on the particular structure of the state-space of pCTMCs; to the best of our knowledge, fluid approximations for general CTMCs have not been developed.

In this paper, we propose a general strategy to obtain a fluid approximation for any CTMC. Our approach uses manifold learning approaches, popular in machine learning, to embed the transition graph of a general CTMC in a Euclidean space. A powerful Bayesian non-parametric regression method complements the embedding, by inferring a drift vector field over the Euclidean space to yield a continuous process, which we term the geometric fluid approximation (GFA). The accuracy of the GFA depends on how naturally the CTMC embeds in a Euclidean space. Crucially, we show that in a simple pCTMC case an approximation related to the GFA is consistent with the standard fluid approximation. Empirical results on a range of examples of CTMCs without a strict population structure show that if the transition graph approximately resembles a continuous manifold, our approach captures well the average behaviour of the CTMC trajectories, and can be useful to efficiently solve approximate reachability problems.

2. Background theory and related work

We briefly review here the mathematical definition of CTMCs, as well as the foundations of fluid approximation for population CTMCs (pCTMCs) [5–7], highlighting the specific aspects of pCTMCs that enable such a construction.

(a). Continuous-time Markov chains

A continuous-time Markov Chain (CTMC) is a continuous-time, Markovian stochastic process over a finite state-space I. The process is characterized by its generator matrix Q, encoding the infinitesimal probability of transitioning between any two states. A more formal definition of CTMC [7] can be given as follows:

Definition 2.1. —

Let X(t), t∈T, be a right-continuous process with values in a countable set $I \subset N_{> 0}$ . Let Q be a generator matrix on I with jump matrix Π, such that for i, j∈I:

$\begin{aligned} & Q_{i j} \in R_{\geq 0} \ \forall i \neq j, \quad Q_{i i} = - \sum_{j \neq i} Q_{i j} \ \forall i \\ \text{and} \quad & Π_{i j} = \begin{cases} - \frac{Q_{i j}}{Q_{i i}} & \text{if } i \neq j \land Q_{i i} \neq 0, \\ 0 & \text{otherwise.} \end{cases} \end{aligned}$

The process X(t) is a continuous-time Markov chain with initial state distribution π and generator matrix Q (CTMC(π, Q)), if it satisfies

(i)
P(X(0) = i) = π_i; and

(ii)
X(t) = Y_n for $\sum_{k = 0}^{n} S_{k} \geq t \geq \sum_{k = 0}^{n - 1} S_{k}$ , where $n \in N_{> 0}$ and S₀ = 0, such that P(Y_{n+1} = j | Y_n = i) = Π_ij, and ∀ n ≥ 1, S_n | Y_n = i ∼ Exp(−Q_ii).

This definition emphasizes the so-called holding (or residence) times, i.e. the random time that the system spends in any one state. The exponential distribution of the holding times is a simple consequence of the Markovian nature of the process; it also naturally suggests an exact algorithm to sample trajectories of CTMCs by repeatedly drawing exponential random numbers. This consideration forms the basis of the Gillespie algorithm, widely used in the field of systems biology and known as the stochastic simulation algorithm (SSA) [8].
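The sampling scheme implied by Definition 2.1 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation; the function name `sample_ctmc` and the 3-state generator are hypothetical, chosen only for the example.

```python
import numpy as np

def sample_ctmc(Q, x0, t_max, rng):
    """Sample one trajectory of a CTMC with generator Q, started in state x0.

    Returns the jump times and the visited states; each state holds until
    the next jump time. Sampling stops at t_max or in an absorbing state.
    """
    Q = np.asarray(Q, dtype=float)
    times, states = [0.0], [x0]
    t, i = 0.0, x0
    while True:
        exit_rate = -Q[i, i]
        if exit_rate <= 0.0:                    # absorbing state: no jump possible
            break
        t += rng.exponential(1.0 / exit_rate)   # holding time ~ Exp(-Q_ii)
        if t > t_max:
            break
        jump_probs = Q[i].copy()
        jump_probs[i] = 0.0
        jump_probs /= exit_rate                 # row i of the jump matrix Pi
        i = int(rng.choice(len(jump_probs), p=jump_probs))
        times.append(t)
        states.append(i)
    return np.array(times), np.array(states)

# Hypothetical 3-state generator (non-negative off-diagonal rates, rows sum to zero).
Q = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.0, 2.0, -2.0]])
rng = np.random.default_rng(0)
times, states = sample_ctmc(Q, x0=0, t_max=10.0, rng=rng)
```

Each iteration draws one exponential holding time and one categorical jump, which is why trajectories of chains with very frequent transitions are expensive to generate.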

An important special instance of the CTMC is the so-called population CTMC (pCTMC). Population CTMCs model systems consisting of indistinguishable agents of different types, or species. The state-space is, therefore, identified with a vector of integer numbers, with each entry counting the species population, and transitions occurring at different rates depending on the counts of each agent type.

The transitions in a pCTMC of $m \in N_{> 0}$ species can be regarded as occurrences of R chemical reactions, written as

\sum_{i = 1}^{m} u_{i r} {\hat{e}}_{i} \overset{k_{r}}{\to} \sum_{i = 1}^{m} v_{i r} {\hat{e}}_{i}, r = 1, \dots, R,

where u_ir counts the number of particles of species ${\hat{e}}_{i}$ that are consumed and v_ir the particles of the same species created, in reaction r (species unit vectors are orthonormal: $⟨ {\hat{e}}_{i} | {\hat{e}}_{j} ⟩ = δ_{i, j}$ ). The reaction rate constant k_r is a factor in the propensity function f_r(u), which constructs the Q-matrix of the CTMC. See [1,6] for a thorough exposition of pCTMCs for chemical reaction networks (CRNs) and a justified definition of the propensity function.

(b). Continuous relaxation and the fluid limit

As discussed in the previous section, the Markovian nature of CTMCs naturally provides an exact sampling algorithm for drawing CTMC trajectories. The same Markovian nature also leads to a set of ODEs governing the evolution of the single-time marginal state probability, the celebrated Chapman–Kolmogorov equations (CKE), which in the case of pCTMCs go under the name of Master equation. Unfortunately, such equations are rarely practically solvable, and analysis of CTMCs is often reliant on computationally intensive simulations.

In the case of pCTMCs, a more concise description in terms of the collective dynamics of population averages is however available. Starting with the seminal work of van Kampen [9], and motivated by the interpretation of pCTMCs as chemical reaction systems, several approximation schemes have been developed which relax the original pCTMC to a continuous stochastic process; see [1] for a recent review.

In this paper, we are primarily interested in the so-called fluid approximation, which replaces the pCTMC with a set of ODEs, which captures the average behaviour of the system. Fluid approximations have been intensely studied and their limiting behaviour is well understood, providing specific error guarantees. There are two characteristics of a pCTMC which are instrumental to enabling the fluid approximation. Firstly, there is a natural interpretation of the states as points in a vector space, where each dimension represents a species. Secondly, a drift vector field can be naturally defined by extending the propensity function to be defined on the whole vector space, which is a polynomial function of the number of agents of each type (i.e. polynomial function of the elements of the system state vector).

(i). Established guarantees

Following Darling & Norris [5], we examine and formalize the aspects of pCTMCs which render them especially amenable to the fluid approximation. As mentioned, the first is that pCTMC state-spaces are countable and there exists an obvious ordering. We can therefore write a trivial linear mapping from the discrete, countable state-space I to a continuous Euclidean space $x : I \to R^{d}$ , where d is the number of agent types in the system.

The second aspect is that rates of transition from each state to all others (i.e. elements of the Q-matrix) can be expressed as a function of the state vector x. A drift vector β(ξ) can be defined as

β (ξ) = \sum_{ξ^{'} \neq ξ} (x (ξ^{'}) - x (ξ)) q (ξ, ξ^{'}),

for each ξ∈I. Since q(ξ, ξ′) is some parametric function of ξ, ξ′ in pCTMCs (due to the indistinguishable nature of the agents) the definition of the drift vector can be extended over the entire Euclidean space $R^{d}$ to produce the drift vector field $b (x) : U \to R^{d}$ , where $U \subseteq R^{d}$ . There is then a set of conditions given in [5] that must be satisfied by these elements to bound the error of the fluid approximation to the Markov process. The conditions ensure that: ‘The first exit time from a suitably selected domain of the Euclidean mapping of the Markov chain state-space U, converges in probability to the first exit time of the fluid limit’.

(ii). Canonical embedding of pCTMCs

In the canonical embedding for continuous relaxation of pCTMCs, we construct a Euclidean space $E \subset R^{d}$ (with d = m), where each dimension corresponds to the concentration of one species in the system, i∈{1, …, m}. The states are then uniformly embedded in the continuous space [0, 1]^m ⊂ E at intervals 1/n_i by x_i(ξ) = u_i/n_i, where ξ represents the population $\sum_{i} u_{i} {\hat{e}}_{i}$ . Further, $N = | I | = \prod_{i} n_{i}$ , is a scale parameter which defines x_N(ξ), q_N(ξ, ξ′) and β_N(ξ) for any such pCTMC of size N. The motivation is that in the limit of N → ∞, the distance between neighbouring states will vanish in the embedding, and jump sizes will similarly vanish, producing an approximately continuous trajectory of the system in the continuous concentration space.

In [10], we find how the canonical embedding above satisfies the conditions given in [5], and that the approximation error shrinks as the scale parameter N grows. Specifically, the authors show that there exists a fluid approximation (deterministic trajectory) to the x-mapped pCTMC, whose error diminishes in N, under the conditions that:

—
initial conditions converge, i.e. ∃a∈U, a≠x(ξ₀) such that
$Pr [∥ x_{N} (ξ_{0}) - a ∥ > δ] \leq \frac{κ_{1} (δ)}{N}, \forall δ > 0;$
—
mean dynamics converge as N → ∞, i.e. $\tilde{b} : U \to R^{d}$ is a Lipschitz field independent of N, such that
$sup_{ξ} ∥ β_{N} (ξ) - \tilde{b} (x_{N} (ξ)) ∥ \to 0 as N \to \infty;$
—
noise converges to zero as N → ∞, i.e. that,
$sup_{ξ} {\sum_{ξ^{'} \neq ξ} q_{N} (ξ, ξ^{'})} \leq κ_{2} N$
and
$sup_{ξ} {{∥ \frac{β_{N} (ξ)}{q_{N} (ξ)} ∥}^{2} + \sum_{ξ^{'} \neq ξ} ∥ x_{N} (ξ^{'}) - x_{N} (ξ) ∥^{2} \frac{q_{N} (ξ, ξ^{'})}{q_{N} (ξ)}} \leq κ_{3} N^{- 2},$

where κ₁(δ), κ₂, κ₃ are positive constants, $q (ξ) = \sum_{ξ^{'} \neq ξ} q (ξ, ξ^{'})$ , and the inequalities hold uniformly in N.

There are many ways to satisfy the above criteria, but a common one (used in pCTMCs) is ‘hydrodynamic scaling’, where the increments of the N-state Markov process mapped to the Euclidean space are $O (N^{- 1})$ and the jump rate is $O (N)$ .
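As an illustration of hydrodynamic scaling, the sketch below checks the mean-dynamics and jump-rate conditions numerically for a one-species example. The model (n independent agents switching on at rate `a` and off at rate `b`), the helper `birth_death_generator` and the candidate field `b_tilde` are assumptions made for this example and do not come from the paper; here the population size n plays the role of the scale parameter.

```python
import numpy as np

def birth_death_generator(n, a, b):
    """Generator for n independent agents switching on at rate a, off at rate b.

    State u in {0, ..., n} counts the agents that are currently 'on'.
    """
    Q = np.zeros((n + 1, n + 1))
    for u in range(n + 1):
        if u < n:
            Q[u, u + 1] = a * (n - u)
        if u > 0:
            Q[u, u - 1] = b * u
        Q[u, u] = -Q[u].sum()            # diagonal is still zero when summed
    return Q

a, b = 1.0, 2.0
for n in (10, 100, 1000):
    Q = birth_death_generator(n, a, b)
    x = np.arange(n + 1) / n             # canonical embedding x(u) = u / n
    beta = Q @ x                         # drift: sum over u' of (x(u') - x(u)) q(u, u')
    b_tilde = a * (1.0 - x) - b * x      # candidate scale-independent Lipschitz field
    max_rate = np.max(-np.diag(Q))       # largest total jump rate over all states
    print(n, np.max(np.abs(beta - b_tilde)), max_rate / n)
```

In this example the embedded increments are 1/n and the total jump rate grows linearly in n, so the drift agrees with `b_tilde` at every scale up to rounding error.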

3. Methodology

As discussed in the previous section, the fluid approximation of pCTMCs is critically reliant on the structure of the state-space of pCTMCs being isomorphic to a lattice in $R^{n}$ . This enables the definition of a drift vector field, which can be then naturally extended to the whole ambient space and, under mild assumptions, leads to convergence under suitable scaling. Neither of these ingredients is obviously available in the general case of CTMCs lacking a population structure.

In this section, we describe the proposed methodology for a GFA for CTMCs. We motivate our approach by describing an exact, if trivial, general embedding of a CTMC's state-space into a very high-dimensional space. Such an embedding, however, affords the non-trivial insight that suitable approximate embeddings may be obtained considering the spectral geometry of the generator matrix. This provides an unexpected link with a set of techniques from machine learning, diffusion maps (DM), which embed graphs into Euclidean spaces. The geometry of DM is well studied, and their distance preservation property is particularly useful for our purpose of obtaining a fluid approximation.

DM, however, provide only one ingredient to a fluid approximation; they do not define an ODE flow over the ambient Euclidean space. To do so, we use Gaussian Process regression: this provides a smooth interpolation of the dynamic field between embedded states. Smoothness guarantees that nearby states in the CTMC (which are embedded to nearby points in Euclidean space by virtue of the distance preservation property of DM) will have nearby drift values, somewhat enforcing the pCTMC property that the transition rates are a function of the state vector.

This two-step strategy provides a general approach to associate a deterministic flow on a vector space to a CTMC. We empirically validate that such flow indeed approximates the mean behaviour of the CTMCs on a range of examples in the next section. Prior to the empirical section, we prove a theorem showing that, in the special case of pCTMCs of birth/death type, our geometric construction returns the same fluid approximation as the standard pCTMC construction, providing an important consistency result.

(a). Eigen-embeddings of continuous-time Markov chains

(i). Trivial embedding of continuous-time Markov chains

Consider a CTMC with initial distribution π and generator matrix Q, on countable state-space $Ξ \subset N$ . The single time marginal p_t over Ξ at time t of the process obeys the CKE:

\partial_{t} p_{t} = Q^{⊤} p_{t}, \tag{3.1}

where p_t is a column vector. Given an arbitrary embedding of the states in some continuous space, $x : Ξ \to R^{d}$ , the projected mean 〈x_t〉 = X^⊤ p_t obeys:

\partial_{t} ⟨ x_{t} ⟩ = X^{⊤} \partial_{t} p_{t} = X^{⊤} Q^{⊤} p_{t} = X^{⊤} Q^{⊤} X^{- ⊤} X^{⊤} p_{t} = X^{⊤} Q^{⊤} X^{- ⊤} ⟨ x_{t} ⟩,

where X_ij refers to the j∈{1, …, d} coordinate of state i∈Ξ. In general, the last step is only possible for XX⁻¹ = I, with I the |Ξ| × |Ξ| identity matrix (i.e. with d = |Ξ|).

We note that choosing the trivial embedding X = I (i.e. each state mapped to the vertex of the probability (|Ξ| − 1)-simplex), equates the fluid process to the original CKE:

\partial_{t} ⟨ x_{t} ⟩ = Q^{⊤} ⟨ x_{t} ⟩ . \tag{3.2}

(ii). The fluid approximation

For any embedding $y : Ξ \to R^{d}$ , the standard fluid approximation defines the drift at any state y(i)≡y_i, i∈Ξ, to be:

β (y_{i}) = \sum_{j \neq i} (y_{j} - y_{i}) Q_{i j} = \sum_{j \neq i} y_{j} Q_{i j} - y_{i} \sum_{j \neq i} Q_{i j} = \sum_{j \neq i} y_{j} Q_{i j} + y_{i} Q_{i i} = {[Q Y]}_{i},

where Y is a |Ξ| × d matrix, and β(y_i) is the ith row of QY .
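The identity β(y_i) = [QY]_i can be checked numerically. The sketch below uses a hypothetical random generator and a random embedding purely for illustration; the only property it relies on is that the rows of Q sum to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, d = 6, 2

# Hypothetical generator: random non-negative off-diagonal rates, rows summing to zero.
Q = rng.uniform(0.0, 1.0, size=(n_states, n_states))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

Y = rng.normal(size=(n_states, d))       # arbitrary embedding; row i is y_i

# Drift from the definition: beta(y_i) = sum over j != i of (y_j - y_i) Q_ij.
beta_sum = np.array([
    sum((Y[j] - Y[i]) * Q[i, j] for j in range(n_states) if j != i)
    for i in range(n_states)
])

beta_matrix = Q @ Y                      # the same drift, read off as the rows of QY
```

The matrix form is the practical one: a single product gives the drift at every embedded state.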

In order to extend the drift over the entire space $R^{d}$ , we let $q (y_{t}, y_{j}) = \sum_{i} Q_{i j} y_{i}^{⊤} y_{t}$ be the transition kernel between any point $y_{t} \in R^{d}$ and any state y_j, j∈Ξ. Then we naturally define the continuous vector field b to be

b (y_{t}) = \sum_{j} y_{j} q (y_{t}, y_{j}) = \sum_{j} y_{j} \sum_{i} Q_{i j} y_{i}^{⊤} y_{t} = Y^{⊤} Q^{⊤} Y y_{t} . \tag{3.3}

Trivially, Y = I yields the original CKE as shown above,

\partial_{t} y_{t} = b (y_{t}) = Y^{⊤} Q^{⊤} Y y_{t} = Q^{⊤} y_{t} . \tag{3.4}

If embedded states V are eigenvectors of Q = V ΛV⁻¹, then the mean of the mapped process 〈v_t〉 is given by

⟨ v_{t} ⟩ = V^{⊤} p_{t} = V^{⊤} V^{- ⊤} e^{t Λ} V^{⊤} π = e^{t Λ} V^{⊤} π . \tag{3.5}

Similarly, in the fluid approximation with $q (y_{t}, y_{j}) = \sum_{i} Q_{i j} y_{i}^{⊤} y_{t}$ , and y₀ = V^⊤ π, one has

y_{t} = e^{t Λ V^{⊤} V} y_{0} = e^{t Λ V^{⊤} V} V^{⊤} π . \tag{3.6}

We can, therefore, claim the following: ‘in the case of V^⊤ V = I, or for a symmetric Q = V ΛV^⊤, the fluid approximation process is exactly equivalent to the projected mean’. This suggests that an approximate, low-dimensional representation might be obtained by truncating the spectral expansion of the generator matrix of the CTMC. Spectral analysis of a transport operator is also the approach taken by DM, a method which is part of the burgeoning field of manifold learning for finding low-dimensional representations of high-dimensional data.
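The equivalence for symmetric Q can be checked numerically, including for a truncated set of eigenvectors. The sketch below is an illustration under assumptions: the symmetric generator is random and hypothetical, and the matrix exponential is approximated by the product formula (I + tQ^⊤/m)^m with large m so that the reference solution does not itself depend on the eigendecomposition.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Hypothetical symmetric generator: symmetric non-negative rates, rows summing to zero.
W = rng.uniform(0.0, 1.0, size=(n, n))
W = (W + W.T) / 2.0
np.fill_diagonal(W, 0.0)
Q = W - np.diag(W.sum(axis=1))

lam, V = np.linalg.eigh(Q)               # Q = V diag(lam) V^T, with V^T V = I
k = 3
lam_k, V_k = lam[-k:], V[:, -k:]         # k eigenvalues closest to zero (ascending order)

pi = np.zeros(n)
pi[0] = 1.0                              # initial distribution concentrated on state 0
t = 0.7

# Reference marginal p_t = exp(t Q^T) pi, with exp approximated by a product formula.
m = 2 ** 20
P_t = np.linalg.matrix_power(np.eye(n) + t * Q.T / m, m)
projected_mean = V_k.T @ (P_t @ pi)      # <v_t> = V^T p_t

# Fluid approximation y_t = exp(t Lambda V^T V) y_0 with y_0 = V^T pi;
# since V_k^T V_k = I, the exponential is diagonal.
fluid = np.exp(t * lam_k) * (V_k.T @ pi)
```

The two vectors agree up to the error of the product formula, which is what motivates truncating the spectral expansion to obtain a low-dimensional representation.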

(iii). Markov chain as a random walk on a graph

Another avenue to reach the same conclusion is to consider the CTMC as a random walk on an abstract graph, where vertices represent states of the chain, and weighted directed edges represent possible transitions. From this perspective, it is natural to seek an embedding of the graph in a suitable low-dimensional vector space; it is well known in the machine learning community that an optimal (in a sense specified in §3b) embedding can be obtained by spectral analysis of the transport operator linked to the random walk on the graph. Intuitively, we expect that if the graph geometry discretely approximates some continuous space, then the underlying space will serve well as a continuous state-space for a fluid limit approximation, when endowed with an appropriate drift vector field to capture the non-geometric dynamics in the graph.

(b). Diffusion maps

A natural method to embed the CTMC states in continuous space for our purposes is DM [11–15]. This is a manifold learning method, where the authors consider a network defined by a symmetric adjacency matrix, with the aim of finding coordinates for the network vertices on a continuous manifold (as is usually the case with similarities of high-dimensional points).

(i). Diffusion on a manifold

The method of DM follows from regarding high-dimensional points in $R^{p}$ to be observations of a diffusion process at regular intervals, which evolves on a hidden manifold $M \subset R^{p}$ with a smooth boundary $\partial M$ . Concurrently, a similarity matrix between the high-dimensional points (i.e. an adjacency matrix) is interpreted as the un-normalized transition kernel of the hidden diffusion process. These assumptions imply that the geometry of $M$ must be such that similar points are likely consecutive observations of the diffusion process, since the latter is dependent on the geometry of $M$ . The goal is then to recover coordinates for the points, natural to their position on the assumed manifold—in doing so, we infer a continuum $M$ from relations between some of its point elements. In essence, we seek a low-dimensional representation of the points which best preserves their similarities as proximity.

In the context of CTMCs, the points that are to be embedded are the CTMC states, and the transition matrix Q is the transition kernel of the diffusion process evaluated at discrete points (CTMC states). A family of diffusion operators are constructed which can be spectrally analysed to yield coordinates for each vertex on the manifold. The continuous operators, which are theoretically constructed to govern the diffusion process, are assumed to be approximated by the analogous discrete operators which are constructed from data. The method can be thought to optimally preserve the normalized diffusion distance of the diffusion process on the high-dimensional manifold, as Euclidean distance in the embedding. Diffusion distance between two vertices x₀, x₁ at time t is defined to be the distance between the probability densities over the state-space, each initialized at x₀, x₁, respectively, and after a time t has passed:

D_{t}^{2} (x_{0}, x_{1}) = ∥ p (x, t | x_{0}) - p (x, t | x_{1}) ∥_{L_{2} (w)}^{2},

where $L_{2} (M, w)$ is a Hilbert space in which the distance is defined, with w(x) = 1/ϕ₀(x), the inverse of the steady-state distribution ϕ₀(x) = lim_{t → ∞}p(x, t|x₀, 0). The procedure is similar to principal component analysis, since taking the first k < p eigenvectors of the diffusion distance matrix provides coordinates for nodes in a low-dimensional space optimally preserving $\sum_{i, j} D_{t}^{2} (x_{i}, x_{j})$ . We point to [11,12] for a comprehensive theoretical exposition of DM.

(ii). Diffusion with drift for asymmetric networks

The methodology of DM has been extended in [16] to deal with learning manifold embeddings for directed weighted networks. Given an asymmetric adjacency matrix, the symmetric part is extracted and serves as a discrete approximation to a geometric operator on the manifold. Spectral analysis of the relevant matrix can then yield embedding coordinates for the nodes of the network. In the same manner as for the original formulation of DM, a set of backward evolution operators is derived, the two relevant ones being:

- \partial_{t} ψ_{t} = H_{a a}^{(α)} ψ_{t} = [Δ + (r - 2 (1 - α) \nabla U) \cdot \nabla] ψ_{t} \tag{3.7}

and

- \partial_{t} ψ_{t} = H_{s s}^{(α)} ψ_{t} = [Δ - 2 (1 - α) \nabla U \cdot \nabla] ψ_{t}, \tag{3.8}

where $ψ_{t} \in C^{2} (M)$ is the mean of a real-valued bounded function of a random walker on the manifold after time t (e.g. ψ_t is a probability density). The sampling potential U defines the steady-state distribution of the diffusion, lim_{t → ∞}p_t = e^{−U}/Z, which is taken to represent the sampling density of points on the manifold. The operators are parameterized by α, which determines how affected the diffusion process on the manifold is by U. Choosing α = 1 allows us to spectrally analyse a discrete approximation to the Laplace–Beltrami operator $Δ = H_{s s}^{(α = 1)}$ , extricating the geometry of the manifold from the density dependent term −2(1 − α)∇U · ∇ in the diffusion operator $H_{s s}^{(α)}$ . The choice of α effectively allows one to control how much the recovered Euclidean representation of the manifold geometry is affected by the sampling density. Finally, r is a drift vector component tangential to the manifold, which additively guides the diffusion process. Perrault–Joncas and Meilă comprehensively treat the application of DM to directed graphs in [16].

(iii). Diffusion maps for continuous-time Markov chains

For an arbitrary CTMC(π, Q), we regard $Q \in R^{N \times N}$ to be a discrete approximation of the operator $H_{a a}^{(α)}$ . However, it is unclear how one can extract the geometrically relevant component Δ under a hidden potential U and parameter α. In practice, therefore, we assume a uniform measure on the manifold, i.e. constant U, which renders Q a discrete approximation of $H_{a a} = Δ + r \cdot \nabla$ (the choice of α no longer matters); further, we take the sampling transition kernel corresponding to this operator to be composed of a symmetric and anti-symmetric part (without loss of generality), which renders $lim_{N \to \infty} (Q + Q^{T}) / 2 = \tilde{Δ}$ , an un-normalized version of $Δ = diag (β_{1}) \tilde{Δ} diag (β_{2})$ .¹ Δ contains the relevant geometric information about the network, with the first k + 1 eigenvectors of the operator used as embedding coordinates in a k-dimensional Euclidean space (ignoring the first eigenvector which is trivial by construction). A detailed exposition of the method as it relates to our purposes of embedding a Markov chain network can be found in electronic supplementary material, appendix C.
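A simplified version of this embedding step can be sketched as follows. It is an illustration under stated assumptions rather than the procedure of appendix C: the symmetric part of the off-diagonal rates is used as edge weights of an un-normalised graph Laplacian, and the normalisation by diag(β₁) and diag(β₂) is omitted. The function name `spectral_embedding` and the example chain are hypothetical.

```python
import numpy as np

def spectral_embedding(Q, k):
    """Embed CTMC states in R^k using eigenvectors of a symmetrised generator.

    Simplification: symmetric part of the off-diagonal rates as edge weights
    of an un-normalised graph Laplacian; no further normalisation is applied.
    """
    Q = np.asarray(Q, dtype=float)
    W = (Q + Q.T) / 2.0
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W           # symmetric, positive semi-definite
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    # Skip the first (constant) eigenvector and keep the next k as coordinates.
    return eigvecs[:, 1:k + 1], eigvals[1:k + 1]

# Hypothetical example: a chain of 20 states with unit rates between neighbours.
n = 20
Q = np.zeros((n, n))
for i in range(n - 1):
    Q[i, i + 1] = 1.0
    Q[i + 1, i] = 1.0
np.fill_diagonal(Q, -Q.sum(axis=1))

Y, eigvals = spectral_embedding(Q, k=1)
```

For this chain the single recovered coordinate is monotone along the states, so the ordering of the chain is preserved in the embedding (up to an overall sign).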

It should be noted that, while DM have been used to construct low-dimensional approximations of high-dimensional SDEs [15], and to embed a discrete-time Markov chain in continuous space with an accompanying advective field [16], doing the same for a continuous-time Markov chain has not been attempted. Distinctively, the focus of that work was not to clear a path between discrete and continuous state Markov processes, but rather the low-dimensional embedding of processes or sample points. In terms of the convenient table presented in [13] and restated here in table 1, we seek to examine the omitted entry that completes the set of Markov models; this is the third entry added here to the original table, taking N < ∞ and the time interval between transitions limit ϵ → 0 to be the case of a CTMC with finite generator matrix Q.²

Table 1.

Resulting random walk (RW) or process from the limiting cases of number of vertices N and time interval between transitions ϵ in the diffusion maps literature [13]. We highlight the addition of the third entry for CTMCs to complete the set.

case	operator	stochastic process
ϵ > 0, N < ∞	finite N × N matrix P	RW in discrete space, discrete in time (DTMC)
ϵ > 0, N → ∞	operators T_f, T_b	RW in continuous space, discrete in time
ϵ → 0, N < ∞	infinitesimal generator matrix $Q \in R^{N \times N}$	Markov jump process; discrete in space, continuous in time
ϵ → 0, N → ∞	infinitesimal generator $H_{f}$	diffusion process, continuous in space and time

(c). Gaussian processes for inferring drift vector field

DM provide a convenient way to embed the CTMC graph into a Euclidean space E; however, the push-forward CTMC dynamics is only defined on the image of the embedding, i.e. where the embedded states are. In order to define a fluid approximation, we require a continuous drift vector field to be defined everywhere in E. A natural approach is to treat this extension problem as a regression problem, where we use the push-forward dynamics at the isolated state embeddings as observations. We, therefore, use Gaussian processes (GPs), a non-parametric Bayesian approach to regression, to infer a smooth function $b : E \to R^{d}$ that has the appropriate drift vectors where states lie.

A Gaussian process is a collection of random variables {f_t}_{t∈T} indexed by a continuous quantity t, which follows a distribution over a family of functions $f : T \to R$ , $f \in H$ . Over the Hilbert space $H = L^{2} (T)$ , the distribution can be thought of as an infinite-dimensional Gaussian distribution over function values, where each dimension corresponds to a point on the domain of the function. We write

f (\cdot) \sim G P (m (\cdot), k (\cdot, \cdot)),

where $m : T \to R$ is the mean function, and $k : T \times T \to R$ is the covariance kernel of the distribution over $H$ . The choice of kernel k( · , · ) acts as the inner product in the space of functions $H$ , and so determines the kind of functions over which the distribution is defined. Certain kernels define a distribution over a dense subspace of L²(T), and we therefore say that the GP is a universal approximator—it can approximate any function in L²(T) arbitrarily well. One such kernel is the squared exponential

k (t, t^{'}) = a^{2} \exp (- \frac{∣ t - t^{'} ∣^{2}}{2 l^{2}}),

where the constants a and l are hyperparameters: the amplitude and length scale, respectively.

(i). Gaussian process regression

Suppose we observe evaluations f = (f(t₁), …, f(t_n)) of an (otherwise hidden) function, at points t = (t₁, …, t_n) of the function's domain. Once an appropriate prior is established, we are able to perform Bayesian inference to obtain a posterior distribution over possible functions consistent with our observations. In Gaussian process regression, the prior is the distribution given by the kernel. The function value f_⋆ at an unobserved domain point t_⋆, conditioned on observations f at points t, follows the predictive distribution

f_{⋆} ∣ f, t \sim \int p (f_{⋆} ∣ t_{⋆}, t, f) p (f ∣ t) d f .

Since the integral involves only normal distributions, it is tractable and has a closed-form solution, which is again a normal distribution. The observations may also be regarded as noisy, which will allow the function to deviate from the observed value in order to avoid extreme fluctuations. Using an appropriate noise model (Gaussian noise), retains the tractability and normality properties. Usually, the mean of the predictive distribution is used as a point estimate of the function value. For a comprehensive understanding of GPs for regression purposes, we refer to [17].

In our case, the choice of kernel and its hyperparameters is critical, especially when the density of states is low. In the limit of infinite observations of the function, the Gaussian process will converge to the true function over T, if the function is in the space defined by the kernel, regardless of the hyperparameters chosen. However, the number of states we embed is finite and so the choice of an appropriate prior can greatly aid the Gaussian process in inferring a good drift vector field. Here, we use the standard squared exponential kernel with a different length scale for each dimension, and select hyperparameters which optimize the likelihood of the observations. The optimization is performed via gradient descent since the gradient for the marginal likelihood is available.
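The regression step can be sketched with plain NumPy. This is a minimal illustration, not the authors' code: it assumes a zero prior mean and Gaussian observation noise, and it fixes the hyperparameters by hand instead of optimising the marginal likelihood. The function names and the sine-wave data are hypothetical.

```python
import numpy as np

def sq_exp_kernel(A, B, amplitude, length_scales):
    """Squared exponential kernel with one length scale per input dimension."""
    A = np.atleast_2d(A) / length_scales
    B = np.atleast_2d(B) / length_scales
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return amplitude ** 2 * np.exp(-0.5 * sq_dist)

def gp_posterior_mean(T_obs, F_obs, T_new, amplitude, length_scales, noise_std):
    """Mean of the GP predictive distribution (zero prior mean, Gaussian noise)."""
    K = sq_exp_kernel(T_obs, T_obs, amplitude, length_scales)
    K += noise_std ** 2 * np.eye(len(T_obs))
    K_star = sq_exp_kernel(T_new, T_obs, amplitude, length_scales)
    return K_star @ np.linalg.solve(K, F_obs)

# Hypothetical data: a smooth function observed at 15 points in one dimension.
T_obs = np.linspace(0.0, 1.0, 15)[:, None]
F_obs = np.sin(2.0 * np.pi * T_obs[:, 0])
T_new = np.array([[0.33], [0.71]])

mean = gp_posterior_mean(T_obs, F_obs, T_new,
                         amplitude=1.0, length_scales=np.array([0.2]),
                         noise_std=1e-3)
```

In the setting of this paper the inputs would be the embedded state coordinates and the outputs the drift vectors at those states, with one such regression per output dimension.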

(d). The geometric fluid approximation algorithm

Instructions for implementing the GFA are given in algorithm ??, detailing the recovery of embedding coordinates using DM, and inferring the drift vector field using Gaussian process regression. Given initial state coordinates y(t = 0) and a duration of time T, the inferred drift vector field is used as the gradient in an ODE solver to produce deterministic continuous trajectories in the Euclidean space where states have been embedded. These trajectories are interpreted as approximations to the evolution of the mean of the original process, mapped to the Euclidean embedding space.
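The final step, using a drift field as the right-hand side of an ODE solver, can be sketched with a fixed-step classical Runge–Kutta scheme. The solver choice is an assumption, since the text does not name one. The callable `drift` stands in for the GP posterior mean; here it is a hypothetical linear field in one dimension so that the trajectory can be compared with a closed-form solution.

```python
import numpy as np

def integrate_drift(drift, y0, duration, n_steps=1000):
    """Integrate dy/dt = drift(y) from y0 over [0, duration] with classical RK4."""
    h = duration / n_steps
    ys = np.empty((n_steps + 1, len(y0)))
    ys[0] = y0
    for s in range(n_steps):
        y = ys[s]
        k1 = drift(y)
        k2 = drift(y + 0.5 * h * k1)
        k3 = drift(y + 0.5 * h * k2)
        k4 = drift(y + h * k3)
        ys[s + 1] = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return np.linspace(0.0, duration, n_steps + 1), ys

# Hypothetical stand-in for the inferred field: b(y) = a (1 - y) - c y.
a, c = 1.0, 2.0

def drift(y):
    return a * (1.0 - y) - c * y

times, ys = integrate_drift(drift, y0=np.array([0.0]), duration=3.0)
```

For this linear field the exact solution is y(t) = a/(a + c) · (1 − e^{−(a + c)t}), which gives a direct check on the integrator before it is applied to an inferred field.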

(e). Consistency result

The GFA scheme is applicable in general to all CTMCs; it is, therefore, natural to ask whether it reduces to the standard fluid approximation on pCTMCs. We have the following result for a related construction, the unweighted Laplacian fluid approximation.

Theorem 3.1. —

Let $C$ be a pCTMC, whose underlying transition graph maps to a multi-dimensional grid graph in Euclidean space using the canonical hydrodynamic scaling embedding. The unweighted Laplacian fluid approximation of $C$ coincides with the canonical fluid approximation in the hydrodynamic scaling limit.

The proof (see electronic supplementary material, appendix B) relies on the explicit computation of the spectral decomposition of the Laplacian operator of an unweighted grid graph [18], and appeals to the universal approximation property of GP [17]. We conjecture that the conditions for fluid approximation for such a pCTMC will also be satisfied by our GFA.

Intuitively, away from the boundaries of the network, the coordinates of the embedded states approach the classical concentration embedding, where each dimension corresponds to a measure of concentration for each species. As the network grows (i.e. allowing larger maximum species numbers in the state-space of the chain) states are mapped closer together, reducing jump size, but preserving the ordering. The spacing of states near the centre of the population size is almost regular, approaching the classical density embedding, and the GP smoothing will therefore converge to the classical extended drift field.

4. Empirical observations

Empirical evidence is necessary to indicate the validity of our GFA and to give a better intuition for its domain of effectiveness. We apply the GFA to a range of CTMCs with differing structure, and present the experimental results in this section. The CTMC models we used are defined in §4a, and the Python code used to produce the results can be found at https://bitbucket.org/webdrone/gfa/src/master/.

There is no absolute way to assess whether the method produces a good approximation to the true probability density evolution; we, therefore, focus on two comparisons: how close the geometric fluid trajectory over time is to the empirical mean of the original CTMC, mapped to the same embedding space (§4b); and how close the first-passage time (FPT) estimate from the fluid approximation is to the true FPT cumulative distribution function (estimated by computationally intensive Monte Carlo sampling; §4d).

Furthermore, we demonstrate in §4c how the method is applicable to a subset of the CTMC graph, such that only a neighbourhood of the state-space is embedded. This may result in fluid approximations for graphs whose global structure is not particularly amenable to embedding in a low-dimensional Euclidean space, and so is useful for gauging the behavioural characteristics of the system near a section of the state-space.

In all figures in this section, red lines are solutions of our GFA, obtained via numerical integration of the drift vector field as inferred by GP regression, and blue lines are the mean of CTMC trajectories mapped to the embedding space, which were obtained via Gillespie's exact SSA [8]. Finally, in figures showing trajectories on the DM manifold, grey line intersections are embedded states (the grey lines being the possible transitions, or edges of the network).

(a). Models

We examine an array of models to assess the applicability domain of our method. The models are defined below and empirical comparisons for each are presented throughout this section.

(i). Two species birth–death processes

This model describes two independent birth–death processes for two species, and serves as a basic sanity check. The CTMC graph has a two-dimensional (2D) grid structure and in this sense resembles the system in theorem 3.1. In the usual CRN notation, we write:

\emptyset \overset{10}{\to} A, A \overset{1 / 2}{\to} \emptyset, \emptyset \overset{10}{\to} B and B \overset{1 / 2}{\to} \emptyset,

for the two species A, B, and note that, contrary to standard CRN convention for open systems, we introduce a system size variable N = 30 such that the count for each species n_A, n_B cannot exceed N; this produces a finite-state CTMC that can be spectrally decomposed and embedded. Note further that the birth process involves no particles here, and so transitioning from state s = (n_A, n_B) to state s′ = (n_A + 1, n_B) (or from s = (n_A, n_B) to s′ = (n_A, n_B + 1)) occurs at the same rate of 10/N per second ∀ n_A, n_B. Conversely, death processes are uni-molecular reactions, such that transitioning from s = (n_A, n_B) to s′ = (n_A − 1, n_B) occurs at a rate of (1/2)n_A per second ∀ n_A, n_B, as the CRN interpretation dictates.³
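The finite-state chain described above can be written down explicitly as a generator matrix. The sketch below is illustrative only: the function name `birth_death_generator` and the ordering of states are assumptions, and the birth and death rates are passed in as parameters so that the values stated in the text can be supplied by the caller.

```python
def birth_death_generator(N, birth, death):
    """Generator matrix Q for two independent birth-death processes.

    States are pairs (n_A, n_B) with 0 <= n_A, n_B <= N. Births occur at the
    constant rate `birth`; deaths of a species occur at rate `death` times
    its current count. Births that would exceed the cap N are disallowed.
    """
    states = [(a, b) for a in range(N + 1) for b in range(N + 1)]
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for (a, b), i in index.items():
        moves = []
        if a < N:
            moves.append(((a + 1, b), birth))
        if b < N:
            moves.append(((a, b + 1), birth))
        if a > 0:
            moves.append(((a - 1, b), death * a))
        if b > 0:
            moves.append(((a, b - 1), death * b))
        for target, rate in moves:
            Q[i][index[target]] = rate
        # The diagonal makes every row sum to zero.
        Q[i][i] = -sum(rate for _, rate in moves)
    return states, Q

# Rates as stated in the text for N = 30: births at 10/N, deaths at (1/2) n.
states, Q = birth_death_generator(N=30, birth=10 / 30, death=0.5)
```

The resulting matrix has (N + 1)² rows, one per state on the 2D grid, which is the object that is subsequently decomposed and embedded.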

(ii). Two species Lotka–Volterra model

This is a Lotka–Volterra model of a predator–prey system. Allowed interactions are prey birth, predators consuming prey and reproducing, and predator death. The interactions with associated reaction rates are defined below in the usual CRN notation:

R \overset{b = 1 / 2}{\to} 2 R, R + F \overset{c = 1 / 10}{\to} 2 F and F \overset{d = 1 / 3}{\to} \emptyset,

where prey is represented by species R (rabbits) and predators by species F (foxes), with maximum predator and prey numbers of N = 30.

(iii). SIRS model

We describe a widely used stochastic model of disease spread in a fixed population, wherein agents can be in three states: susceptible, infected and recovered (S, I, R) and a contagious disease spreads from infected individuals to susceptible ones. After some time, infected individuals recover and are immune to the disease, before losing the immunity and re-entering the susceptible state. We define a pCTMC for the process as follows:

S + I \overset{k_{i} = 0.1}{\to} 2 I, I \overset{k_{r} = 0.05}{\to} R and R \overset{k_{s} = 0.01}{\to} S,

where the constants (k_i, k_r, k_s) have been chosen such that the ODE steady state is reached some time after t = 100 s. The state of the pCTMC at time t is X(t) = (S(t), I(t), R(t)), where S(t) refers to the number of agents in state S at time t, and so on for all species.

(iv). Genetic switch model

This is a popular model for the expression of a gene, when the latter switches between two activation modes: active and inactive [3,19]. While active, the gene is transcribed into mRNA at a much faster rate than while inactive (factor of approx. 10). The gene switches between the two modes stochastically with a slow rate. We have the following reactions:

P \overset{10^{- 4}}{\longleftrightarrow} \bar{P}, P \overset{1}{\to} A + P, \bar{P} \overset{0.1}{\to} A + \bar{P} and A \overset{0.05}{\to} \emptyset,

where the active and inactive modes are represented by the species P and $\bar{P}$ , respectively, with a maximum count of 1. Despite being able to express this model in the usual CRN language, we emphasize that the binary nature of the switch prohibits usual scaling arguments for reaching the fluid limit.

(b). Assessing fluid solution and mean trajectory in embedding space

In our GFA, we create a map using directed DM to embed the CTMC states into a Euclidean space of small dimensionality, and use Gaussian process regression to infer a drift vector field over the space. The resulting continuous trajectories, which we refer to as the GFA, are in this section compared to average trajectories of the CTMC systems, projected on the same space.⁴ The latter are obtained by drawing 1000 trajectories of the CTMC using the SSA algorithm, and taking a weighted average of the state positions in the embedding space.
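The projected mean used for comparison can be sketched as below. This is a simplified illustration rather than the authors' code: the chain is given as a dictionary `rates` mapping each state to a list of `(target, rate)` pairs, the embedding as a dictionary from states to coordinates, and both names, along with `ssa_state_at` and `projected_mean`, are assumptions of this example. Averaging the embedded position over sampled trajectories weights each state position by its empirical occupation frequency at the chosen time.

```python
import random

def ssa_state_at(rates, state, t_end, rng):
    """Run Gillespie's SSA from `state` and return the state occupied at t_end."""
    t = 0.0
    while True:
        moves = rates.get(state, [])
        total = sum(rate for _, rate in moves)
        if total == 0.0:
            return state  # absorbing state: nothing further can happen
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            return state
        # Choose the next state with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for target, rate in moves:
            acc += rate
            if u < acc:
                break
        state = target

def projected_mean(rates, embedding, start, t_end, n_traj=1000, seed=0):
    """Average embedded position at time t_end over n_traj SSA trajectories."""
    rng = random.Random(seed)
    dim = len(embedding[start])
    total = [0.0] * dim
    for _ in range(n_traj):
        s = ssa_state_at(rates, start, t_end, rng)
        for d in range(dim):
            total[d] += embedding[s][d]
    return tuple(x / n_traj for x in total)
```

Calling `projected_mean` on a grid of time points gives a sampled mean trajectory in the embedding space that can be compared with the fluid solution.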

Our GFA does well for pCTMC models, where we know that the state-space can be naturally embedded in a Euclidean manifold. This is especially true for systems like the independent birth–death processes of two species, which do not involve heavy asymmetries in the graph structure. The more the structure deviates from a pCTMC and the more asymmetries in the structure, the larger the deviations we expect from the mean SSA trajectory. Additionally, we expect large deviations in the case of bi-modal distributions over the state-space, as is the general case for fluid approximations. This is because the latter are point-mass approximations of a distribution, and so are naturally more suited to approximate uni-modal, concentrated densities.

(i). Two species birth–death processes

As a sanity check, we examine how our method approximates the mean trajectory of the trivial system of two independent birth–death processes described above. The true distribution for such a system is uni-modal in the usual concentration space, and the graph has the structure of a 2D grid lattice with no asymmetries. As shown in figure 1, the geometric fluid approximation is very close to the empirical mean trajectory, which supports our consistency theorem and expectations for agreement in the case of symmetric graphs.

Figure 1. — Independent birth–death process for two species, showing the fluid solution (red) and the projected mean evolution (blue). (a) Embedded state-space and trajectories in $R^{2}$ , where grid structure is preserved and species counts are in orthogonal directions. (b) Fluid and mean SSA trajectories along embedded dimensions over time.

(ii). Lotka–Volterra model

We perform our geometric fluid approximation for the non-trivial case of a Lotka–Volterra system, which models a closed predator–prey system as described above. The asymmetric consumption reaction distorts the grid structure representative of the Euclidean square two species space. Therefore, the manifold recovered is the Euclidean square with shrinkage along the consumption dimension—more shrinkage is observed where predators and prey numbers are higher, since this implies faster consumption reactions. We observe in figure 2 that the fluid estimate keeps close to the mean initially and slowly diverges; however, the qualitative characteristics of the trajectory remain similar.

Figure 2. — A two species Lotka–Volterra model, showing the fluid solution (red) and the projected mean evolution (blue) slowly diverging from each other. The qualitative behaviour of both is similar as they begin to perform the oscillations typical of this system.

(iii). SIRS model

The SIRS model gives us the opportunity to compare trajectories in the embedding space of the geometric fluid, with trajectories in the concentration space used by the standard fluid approximation. We observe in figure 3 good agreement with the empirical mean trajectory for both fluid methods.

The classical fluid trajectory (figure 3a) is attainable in terms of the concentration of each species; it evolves according to coupled ODEs:

\frac{d s}{d t} = k_{s} r(t) - \frac{k_{i}}{N} i(t) s(t), \quad \frac{d i}{d t} = \frac{k_{i}}{N} i(t) s(t) - k_{r} i(t) \quad \text{and} \quad \frac{d r}{d t} = k_{r} i(t) - k_{s} r(t), \qquad (4.1)

where x(t) = (s(t), i(t), r(t)) = (S(t), I(t), R(t))/N and $N \in \mathbb{N}_{> 0}$ is the total population. Increasing N linearly scales the ODE solution without affecting the dynamics; the SSA average converges to the ODE solution as N → ∞. Similarly, figure 3b shows the fluid solution in $R^{3}$ obtained by our geometric fluid approximation.
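The classical fluid ODEs (4.1) can be integrated numerically as in the following sketch. The right-hand side follows equation (4.1) term by term with the rate constants of the SIRS model defined in §4a. The function names, the fixed-step Runge–Kutta integrator, the step size and the initial conditions used in the example are assumptions made here for illustration.

```python
def sirs_rhs(x, N, ki=0.1, kr=0.05, ks=0.01):
    """Right-hand side of the SIRS fluid ODEs, as written in equation (4.1)."""
    s, i, r = x
    infection = ki / N * i * s
    return (ks * r - infection, infection - kr * i, kr * i - ks * r)

def solve_sirs(x0, N, t_end, dt=0.1):
    """Integrate the SIRS ODEs from x0 up to t_end with classical RK4."""
    x = tuple(x0)
    for _ in range(int(round(t_end / dt))):
        k1 = sirs_rhs(x, N)
        k2 = sirs_rhs(tuple(a + 0.5 * dt * b for a, b in zip(x, k1)), N)
        k3 = sirs_rhs(tuple(a + 0.5 * dt * b for a, b in zip(x, k2)), N)
        k4 = sirs_rhs(tuple(a + dt * b for a, b in zip(x, k3)), N)
        x = tuple(a + dt / 6.0 * (p + 2.0 * q + 2.0 * u + v)
                  for a, p, q, u, v in zip(x, k1, k2, k3, k4))
    return x

# Illustrative initial conditions: the same system at two population scales.
small = solve_sirs((0.9, 0.1, 0.0), N=1.0, t_end=100.0)
large = solve_sirs((90.0, 10.0, 0.0), N=100.0, t_end=100.0)
```

The three derivatives sum to zero, so the total s + i + r is conserved along the solution. In this example `large` is `small` multiplied by 100 up to rounding, which illustrates the linear scaling with N noted above.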

(iv). Genetic switch model

The model of a genetic switch is a departure from the usual pCTMC structure, since the binary switch introduces very slow mixing between two birth–death processes each with a different fixed point. The bi-modality of the resulting steady-state distribution is problematic to capture for any point-mass trajectory, and quickly leads to divergence of the fluid trajectory from the mean. With the particularly slow switching rate of 10⁻⁴ s⁻¹, our method produces fluid trajectories close to the mean trajectory for up to 100 s, mostly because the mixing is very slow and the distribution remains relatively concentrated for a long time (figure 4). However, with the faster rate of 5 × 10⁻³ s⁻¹, our fluid approximation quickly diverges from the mean trajectory (figure 5), as the expected result of faster mixing.

Figure 4. — The genetic switch model with switching rate 10⁻⁴ s⁻¹, showing the fluid solution (red) and the projected mean evolution (blue) keeping close to each other. Transitions from the set of states at d₁ = − 0.1 (inactive mode) to the set of states at d₁ = 0.1 happen very rarely, which is reflected by the mean SSA trajectory.

Figure 5. — The genetic switch model with a faster switching rate (5 × 10⁻³ s⁻¹), showing how the fluid solution (red) diverges from the projected mean evolution (blue) after t ≈ 20 s; the qualitative aspects of the trajectory remain similar.

(v). Population continuous-time Markov chains perturbations

It is expected that the method will perform well for CTMCs that are in some sense similar to a pCTMC, but cannot be exactly described by a CRN. We, therefore, demonstrate how the method performs for perturbations of a Lotka–Volterra system. To achieve the perturbation, we add noise to every existing transition rate (non-zero element of Q) of the Lotka–Volterra system we had above. The perturbed transition matrix Q_per is described in terms of the Lotka–Volterra matrix Q_LV by

{[Q_{per}]}_{i j} = \begin{cases} {[Q_{LV}]}_{i j} + | \eta_{i j} |, & \text{if } {[Q_{LV}]}_{i j} > 0, \\ 0, & \text{otherwise}, \end{cases} \qquad (4.2)

for all i≠j, where $\eta_{i j} \sim \mathcal{N}(0, {0.5}^{2})$, and ${[Q_{per}]}_{i i} = - \sum_{j \neq i} {[Q_{per}]}_{i j}$, so that every row sums to zero, as usual. The projection in figure 6 shows that our method performs reasonably well near the pCTMC regime, where no classical continuous state-space approximation method exists.

A different kind of perturbation is achieved by randomly removing possible transitions of the original pCTMC. This amounts to setting some off-diagonal elements of the Q matrix to 0, and re-adjusting the diagonal so that all rows sum to 0. In order to avoid creating absorbing states or isolated states, we remove transitions randomly with a probability of 0.1. Our method performs reasonably under both kinds of perturbations, as seen in figure 7.
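The two kinds of perturbation described in this subsection can be sketched as follows, for a generator matrix stored as a list of rows. The function names `add_rate_noise` and `remove_transitions` are assumptions of this example. Note that this sketch does not check whether a removal creates absorbing or isolated states; it only relies on a small removal probability, as in the text.

```python
import random

def add_rate_noise(Q, sigma=0.5, rng=None):
    """Add |eta| with eta ~ N(0, sigma^2) to every existing off-diagonal rate."""
    rng = random.Random() if rng is None else rng
    n = len(Q)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and Q[i][j] > 0:
                P[i][j] = Q[i][j] + abs(rng.gauss(0.0, sigma))
        # Re-adjust the diagonal so that the row sums to zero.
        P[i][i] = -sum(P[i][j] for j in range(n) if j != i)
    return P

def remove_transitions(Q, p=0.1, rng=None):
    """Drop each existing transition independently with probability p."""
    rng = random.Random() if rng is None else rng
    n = len(Q)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and Q[i][j] > 0 and rng.random() >= p:
                P[i][j] = Q[i][j]
        P[i][i] = -sum(P[i][j] for j in range(n) if j != i)
    return P
```

Applying `remove_transitions` to the output of `add_rate_noise` combines both perturbations.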

Figure 7. — A two species Lotka–Volterra model, perturbed by both noisy transition rates and random removal of transitions. The fluid solution (red) remains similar to the projected mean trajectory (blue) away from the boundary.

(c). Embedding a subset of the system

The empirical success of the method on perturbed pCTMC systems encouraged further exploration in cases where there is no global continuous approximation method, but the CTMC graph has regions which resemble a pCTMC structure, or are otherwise suitable for embedding in a continuous space. Consequently, we sought to embed only a subset of the state-space of a CTMC. Embedding state-space subsets can be useful for CTMCs that have a particularly disordered global structure (e.g. require many dimensions, or have areas on the manifold with low density), but which may contain a neighbourhood of the state-space that better admits a natural embedding. Additionally, one could introduce coffin states near the boundary of a pCTMC to apply the method on reachability problems.

A subset includes every reachable state within r transitions from a selected root state s_r, denoted as Δ(s_r, r). Transitions from or to states outside the selected subset are ignored, and the remaining Q matrix is embedded in $R^{2}$ . The drift vectors on boundary states lack all components of transitions outside the subset, and so the probability flux is inaccurate on the boundary. Figure 8 shows the Lotka–Volterra model subset Δ(s_r = (R = 5, F = 9), r = 8), embedded in $R^{2}$ . We can see that the behaviour near the root state is close to the projected sample mean evolution, despite the boundary issues.
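Selecting the subset Δ(s_r, r) amounts to a breadth-first search of depth r over the transition graph, followed by a restriction of Q to the states found. The sketch below is one possible reading of that procedure. The function names are assumptions, and recomputing the diagonal from the retained transitions only is also an assumption about how the ignored transitions are handled.

```python
from collections import deque

def neighbourhood(Q, root, r):
    """Indices of all states reachable from `root` within r transitions."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        if dist[i] == r:
            continue  # do not expand beyond depth r
        for j, rate in enumerate(Q[i]):
            if j != i and rate > 0 and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return sorted(dist)

def restrict_generator(Q, subset):
    """Generator restricted to `subset`, ignoring transitions that leave or enter it."""
    sub = [[Q[i][j] if i != j else 0.0 for j in subset] for i in subset]
    for a in range(len(subset)):
        # Assumption: the diagonal is recomputed from the retained transitions only.
        sub[a][a] = -sum(sub[a][b] for b in range(len(subset)) if b != a)
    return sub
```

For a root state with index `k`, `restrict_generator(Q, neighbourhood(Q, k, 8))` would give the sub-generator for a subset of radius r = 8, as used in figure 8.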

Figure 8. — Embedding the subset Δ(s_r = (R = 5, F = 9), r = 8) of the two species Lotka–Volterra model. The fluid solution (red) remains similar to the projected mean trajectory (blue) away from the boundary, despite the boundary inaccuracies of the probability flux.

(d). First passage times

Another common goal of such approximation techniques is estimating the FPT distribution for a target subset of states of the Markov chain. The literature on this is rich: there has been significant effort in this direction, both utilizing established probability evolution methods and constructing new theoretical methods tailored to this problem [20–22]. The former is possible since FPT estimation can be formulated as the classical problem of estimating how the probability distribution over the state-space evolves for a modified version of the Markov chain in question.

Specifically, consider a Markov chain with rate matrix Q for the state-space I. Let B⊆I be a set of target states for which we want to estimate the distribution for the FPT τ, given some initial state ξ₀∈I\B. The FPT cumulative distribution function (CDF) evaluated at τ is equal to the probability mass on the set B at time τ, if every state in B is made absorbing. In this manner, many methods for approximating probability density evolution over the state-space of a CTMC can also be used to approximate FPT distributions.
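The absorbing-state formulation can be made concrete for small chains with the following sketch. It makes every state in B absorbing and then computes the transient distribution by uniformization, a standard technique chosen here for simplicity; it is not a method proposed in this paper. The function name `fpt_cdf`, the truncation tolerance and the term limit are assumptions of the example.

```python
import math

def fpt_cdf(Q, targets, start, t, tol=1e-12, max_terms=100000):
    """P(FPT into `targets` <= t) for a chain with generator Q started at `start`."""
    n = len(Q)
    targets = set(targets)
    # Make every target state absorbing by zeroing its row.
    absorbing = [[0.0] * n if i in targets else list(Q[i]) for i in range(n)]
    lam = max(-absorbing[i][i] for i in range(n))  # uniformization rate
    if lam <= 0.0:
        return 1.0 if start in targets else 0.0
    # Embedded discrete-time chain P = I + Q / lam.
    P = [[(1.0 if i == j else 0.0) + absorbing[i][j] / lam for j in range(n)]
         for i in range(n)]
    v = [0.0] * n
    v[start] = 1.0
    weight = math.exp(-lam * t)  # Poisson probability of zero jumps
    cdf, covered = 0.0, 0.0
    for k in range(1, max_terms + 1):
        cdf += weight * sum(v[j] for j in targets)
        covered += weight
        if covered >= 1.0 - tol:
            break
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k  # next Poisson weight
    return cdf
```

For a two-state chain that leaves state 0 for the target state 1 at rate 2, `fpt_cdf([[-2.0, 2.0], [3.0, -3.0]], {1}, 0, t)` agrees with 1 − exp(−2t). For large state-spaces this direct computation becomes expensive, which is why approximations of the FPT are of interest.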

(i). The fluid proximity approach

A natural way to estimate the FPT, when a fluid approximation to the CTMC exists, is to consider how close the fluid solution is to the target set B. The classical fluid approximation usually relies on population structured CTMCs, where the target set is often a result of some population ratio threshold (e.g. all states where more than 30% of the total population is of species A: N_A/N > 0.3). Since the set is defined in terms of population ratios, it is trivial to map threshold ratios to the continuous concentration space where the pCTMC is embedded, and hence define corresponding concentration regions. The time at which the fluid ODE solution enters that region of concentration space then gives an approximation to the FPT CDF. The latter will, of course, be a step function (from 0 to 1) since the solution is the trajectory of a point mass. Keeping the same threshold ratios for the target set, and scaling the population size N should drive the true FPT CDF towards the fluid approximation. If more moments of the probability distribution are approximated (for instance in moment closure methods) one can derive bounds for the FPT CDF; these can be made tighter as higher-order moments are considered, as shown in [21].

In our case, the fluid ODE solution only tracks the first moment of the distribution which implies a point mass approximation. Additionally, we have done away with the population structure requirement, such that thresholds for defining target sets are no longer trivially projected to the continuous space where we embed the chain. The latter challenge is overcome by considering the Voronoi tessellation of the continuous space, where each embedded state serves as the seed for a Voronoi cell. We then say that the fluid solution has entered the target region if it has entered a cell whose seed state belongs in the target set B. Equivalently, the solution is in the target region when it is closer (with Euclidean distance as the metric) to any target state than to any non-target state.
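The Voronoi criterion reduces to a nearest-neighbour query, as in the following sketch. The names `nearest_state` and `fluid_fpt`, and the representation of the fluid solution as a list of `(time, coordinates)` pairs, are assumptions made for this example.

```python
def nearest_state(point, embedding):
    """State whose embedded coordinates are closest to `point` (Euclidean)."""
    return min(embedding,
               key=lambda s: sum((p - c) ** 2 for p, c in zip(point, embedding[s])))

def fluid_fpt(trajectory, embedding, targets):
    """First time at which the fluid solution lies in the Voronoi cell of a target state.

    `trajectory` is an iterable of (time, coordinates) pairs in increasing time
    order. Returns None if the trajectory never enters a target cell.
    """
    for t, y in trajectory:
        if nearest_state(y, embedding) in targets:
            return t
    return None
```

The returned time is the location of the single step in the point-mass FPT CDF. Its resolution is limited by the time step of the supplied trajectory.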

Checking which is the closest state is computationally cheap, and so we can produce FPT estimates at little further cost from the fluid construction. Results for the SIRS model, the Lotka–Volterra and perturbed Lotka–Volterra models follow.

(ii). First-passage time in the SIRS model

We define a set of barrier states in the SIRS model, B = {(S, I, R)|R/N≥1/10}, and examine the FPT distribution of the system into the set B, with initial state X(0)∉B. Note that the trivial scaling laws for this model, owing to the fixed population size, make it simple to identify corresponding barrier regions in concentration space: b = {(s, i, r)|r≥1/10}. We can, therefore, compare the fluid solution FPT estimate to the empirical CDF (trajectories drawn by the SSA), as well as to our own fluid construction with an embedding given by DM and a drift vector field estimated via a Gaussian process. Figure 9 shows that our approach is in good agreement with both the empirical mean FPT and the classical fluid result.

Figure 9. — First passage time CDFs for the SIRS model with different populations. The classical solution gives the same estimate for all N, to which the SSA estimates converge as N → ∞. Naturally, both the classical and our estimates are single step functions, since we approximate the probability distribution evolution by a point mass. We are consistently close to both the SSA and classical fluid CDFs.

(iii). First-passage time in the Lotka–Volterra model

Here, we embed the Lotka–Volterra model, and define the barrier set of states B = {(R, F)|0.6N > F≥0.2N} for which we estimate FPT CDFs, with initial state X(0) = (0.3, 0.7)N, for various system sizes N = {30, 40, 50}.

We show in figure 10a how our fluid construction estimates an FPT close to the SSA CDF. This is expected when embedding a structured model such as the Lotka–Volterra, where two dimensions are adequate to preserve the network topology and the Gaussian process can well approximate the continuous drift vector field. Finally, we show in figure 10b that a good estimate of the FPT is recovered for the perturbed Lotka–Volterra, which is no longer a CRN.

5. Conclusion

CTMCs retain a central role as models of stochastic behaviour across a number of scientific and engineering disciplines. For pCTMCs, model approximation techniques such as fluid approximations have played a central role in enabling scalable analysis of such models. These approximations, however, critically rely on structural features of pCTMCs which are not shared by general CTMCs. In this paper, we presented a novel construction based on machine learning which extends fluid approximation techniques to general CTMCs. Our new construction, the geometric fluid approximation, is (with certain hyperparameters) equivalent to classical fluid approximations for a class of pCTMCs; empirically, the geometric fluid approximation provides good quality approximations in a number of non-trivial case studies from epidemiology, ecology and systems biology.

While this work was motivated by generalizing methods whose aim was to scale analysis of pCTMCs, applying the GFA on a large CTMC may prove computationally prohibitive; however, approximating subsets of the state-space, or subsystems present in the CTMC, is possible as demonstrated, which can be computationally beneficial.

On a more conceptual note, all approximations are accurate over a specific range of applicability and most only possible on certain classes of CTMCs. We, therefore, sought to construct a universally applicable method resting upon fundamental properties of a general CTMC. Our method conjectures that the quality of a continuous approximation depends on how well diffusion distance can be preserved in a Euclidean space, over which the approximation evolves. Despite no significant decrease of the computational cost for CTMC analysis, we offer insight on how this property influences the quality of continuous approximations to CTMCs in general, and expect that this property may be cast into a suitable metric to quantify how continuous approximation quality varies across general CTMCs. For instance, a CTMC which is close to a pCTMC in this sense (as the ones used in the empirical section above) is expected to admit a relatively accurate continuous approximation.

Some potential paths forward become apparent under the lens of this work. Firstly, our method might be optimized to accommodate particular classes of CTMCs, for example, by designing specific kernels for the GP regression part. This might be an effective way to incorporate domain knowledge and further improve the quality of the geometric approximation.

Secondly, we can extend this methodology by approximating the diffusion matrix field as well as the drift vector field. This would enable us to define a diffusion process on the manifold and so construct an approximating pdf rather than a point mass. An evolving pdf will be comparable to solutions produced by Van Kampen's system size expansion, moment closure methods and the chemical Langevin equation for the case of CTMCs representing CRNs.

Finally, the geometric fluid approximation produces trajectories in a low-dimensional Euclidean space, but these coordinates are not immediately interpretable as they are in the canonical fluid approximation. We have here used a Voronoi tessellation to relate continuous trajectories in the Euclidean space to those in the original discrete state-space, and used this to estimate first passage times. However, our method would benefit from further work on interpreting the dimensions of the embedding. In particular, it would be interesting to extend the work on FPTs to define methodologies to approximate more complex path properties, such as temporal logic formulae which are often encountered in computer science applications [23,24].

Supplementary Material

Supplementary material for "Geometric fluid approximation for general continuous-time Markov chains"

rspa20190100supp1.pdf (234.8 KB, pdf)

Reviewer comments

rspa20190100_review_history.pdf (635.4 KB, pdf)

Acknowledgments

The authors thank Luca Bortolussi for the many useful discussions.

Notes

¹ Recall that r · ∇ψ is the drift vector component tangential to the manifold. As such, it is an anti-symmetric field in the limit N → ∞, and so r · ∇ must be an anti-symmetric operator under transposition when N < ∞.

² Note that a discrete-time Markov chain (π, P) with P = I + ϵQ, where ϵ is a small time interval, will tend to the CTMC (π, Q) as ϵ → 0.

³ This follows from the definition of the propensity function, see [1,6] for details.

⁴ Note that in figures 1–8 when referring to ‘dimension d = j’ of the diffusion map projection, we refer to the jth coordinate of the embedding of the manifold in Euclidean space, as recovered by diffusion maps.

Data accessibility

This article has no additional data.

Authors' contributions

M.M. instigated this work, and produced the theoretical and empirical results presented. J.H. and G.S. provided feedback and advice during the development process, and guided the research path. The manuscript was jointly drafted and edited.

Competing interests

We declare we have no competing interests.

Funding

This work was supported by the EPSRC under grant no. EP/L027208/1.

References

1. Schnoerr D, Sanguinetti G, Grima R. 2017. Approximation and inference methods for stochastic biochemical kinetics-a tutorial review. J. Phys. A: Math. Theor. 50, 093001. (doi:10.1088/1751-8121/aa54d9)
2. Hillston J. 2005. Fluid flow approximation of PEPA models. In Second Int. Conf. on the Quantitative Evaluation of Systems (QEST'05), Torino, Italy, 19–22 September, pp. 33–42. Piscataway, NJ: IEEE CS Press.
3. Larsson AJM et al. 2019. Genomic encoding of transcriptional burst kinetics. Nature 565, 251–254. (doi:10.1038/s41586-018-0836-1)
4. Kurtz TG. 1971. Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. J. Appl. Probab. 8, 344–356. (doi:10.2307/3211904)
5. Darling R, Norris J. 2008. Differential equation approximations for Markov chains. Probab. Surv. 5, 37–79. (doi:10.1214/07-PS121)
6. Gardiner CW. 2009. Stochastic methods: a handbook for the natural and social sciences, 4th edn. Springer Series in Synergetics. Berlin, Germany: Springer.
7. Norris JR. 1998. Markov chains. Cambridge, UK: Cambridge University Press.
8. Gillespie DT. 1977. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361. (doi:10.1021/j100540a008)
9. van Kampen N. 1961. A power series expansion of the master equation. Can. J. Phys. 39, 551–567. (doi:10.1139/p61-056)
10. Darling RWR. 2002. Fluid limits of pure jump Markov processes: a practical guide. (http://arxiv.org/abs/quant-ph/math/0210109)
11. Coifman RR, Lafon S, Lee AB, Maggioni M, Nadler B, Warner F, Zucker SW. 2005. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc. Natl Acad. Sci. USA 102, 7426–7431. (doi:10.1073/pnas.0500334102)
12. Coifman RR, Lafon S. 2006. Diffusion maps. Appl. Comp. Harm. Anal. 21, 5–30. (doi:10.1016/j.acha.2006.04.006)
13. Nadler B, Lafon S, Kevrekidis I, Coifman RR. 2006. Diffusion maps, spectral clustering and eigenfunctions of Fokker–Planck operators. In Advances in NeurIPS 18 (eds Y Weiss, B Schölkopf, JC Platt), pp. 955–962. Cambridge, MA: MIT Press.
14. Nadler B, Lafon S, Coifman RR, Kevrekidis IG. 2006. Diffusion maps, spectral clustering and reaction coordinates of dynamical systems. Appl. Comp. Harm. Anal. 21, 113–127. (doi:10.1016/j.acha.2005.07.004)
15. Coifman RR, Kevrekidis IG, Lafon S, Maggioni M, Nadler B. 2008. Diffusion maps, reduction coordinates, and low dimensional representation of stochastic systems. Multiscale Model. Simul. 7, 842–864. (doi:10.1137/070696325)
16. Perrault-Joncas DC, Meilă M. 2011. Directed graph embedding: an algorithm based on continuous limits of Laplacian-type operators. In Advances in NeurIPS 24 (eds J Shawe-Taylor, RS Zemel, PL Bartlett, F Pereira, KQ Weinberger), pp. 990–998. Curran Associates, Inc.
17. Rasmussen CE, Williams CKI. 2006. Gaussian processes for machine learning. Adaptive Computation and Machine Learning. Cambridge, MA: MIT Press.
18. Kłopotek MA. 2017. Spectral analysis of Laplacian of a multidimensional grid graph. (http://arxiv.org/abs/quant-ph/1707.05210)
19. Vu TN, Wills QF, Kalari KR, Niu N, Wang L, Rantalainen M, Pawitan Y. 2016. Beta-Poisson model for single-cell RNA-seq data analyses. Bioinformatics 32, 2128–2135. (doi:10.1093/bioinformatics/btw202)
20. Darling DA, Siegert AJF. 1953. The first passage problem for a continuous Markov process. Ann. Math. Stat. 24, 624–639. (doi:10.1214/aoms/1177728918)
21. Hayden RA, Stefanek A, Bradley JT. 2012. Fluid computation of passage-time distributions in large Markov models. Theor. Comput. Sci. 413, 106–141. (doi:10.1016/j.tcs.2011.07.017)
22. Schnoerr D, Cseke B, Grima R, Sanguinetti G. 2017. Efficient low-order approximation of first-passage time distributions. Phys. Rev. Lett. 119, 210601. (doi:10.1103/PhysRevLett.119.210601)
23. Milios D, Sanguinetti G, Schnoerr D. 2018. Probabilistic model checking for continuous-time Markov chains via sequential Bayesian inference. In Quantitative Evaluation of Systems (eds A McIver, A Horvath), vol. 11024, pp. 289–305. Cham, Switzerland: Springer International Publishers.
24. Bortolussi L, Milios D, Sanguinetti G. 2016. Smoothed model checking for uncertain continuous-time Markov chains. Inf. Comput. 247, 235–253. (doi:10.1016/j.ic.2016.01.004)

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary material for "Geometric fluid approximation for general continuous-time Markov chains"

rspa20190100supp1.pdf^{(234.8KB, pdf)}

Reviewer comments

rspa20190100_review_history.pdf^{(635.4KB, pdf)}

Data Availability Statement

This article has no additional data.
