A hybrid tau-leap for simulating chemical kinetics with applications to parameter estimation

Thomas Trigo Trindade; Konstantinos C Zygalakis

doi:10.1098/rsos.240157

2024 Dec 4;11(12):240157. doi: 10.1098/rsos.240157


PMCID: PMC11615191 PMID: 39635156

Abstract

We consider the problem of efficiently simulating stochastic models of chemical kinetics. The Gillespie stochastic simulation algorithm (SSA) is often used to simulate these models; however, in many scenarios of interest, the computational cost quickly becomes prohibitive. This is further exacerbated in the Bayesian inference context when estimating parameters of chemical models, as the intractability of the likelihood requires multiple simulations of the underlying system. To address this computational cost, we propose a novel hybrid τ-leap algorithm for simulating well-mixed chemical systems. In particular, the algorithm uses τ-leap when appropriate (high population densities), and SSA when necessary (low population densities, when discrete effects become non-negligible). In the intermediate regime, a combination of the two methods, which uses the properties of the underlying Poisson formulation, is employed. As illustrated through a number of numerical experiments, the hybrid τ offers significant computational savings when compared with SSA without sacrificing overall accuracy. This feature is particularly welcome in the Bayesian inference context, as it allows for parameter estimation of stochastic chemical kinetics at reduced computational cost.

Keywords: stochastic chemical kinetics, hybrid scheme, jump process, parameter inference

1. Introduction

In the last few decades, there has been an increase in the interest in biochemical systems with a small number of interacting components; see for example the phage $λ$ -lysis decision circuit [1], circadian rhythms [2] as well as the cell cycle [3]. In the setting of low copy numbers of interacting components, the stochastic variations may constitute a crucial element in the description of the dynamics of the systems, often in the form of bursts and cascading mechanisms that are typically not well captured by macroscopic models. Additionally, the general consensus now is that accounting for the stochasticity plays a central role in the interpretation of experimental data originating from cell and molecular processes [4,5].

Even when incorporating stochasticity in the modelling of biochemical systems one needs to decide on the assumptions that hold for the system in question. In particular, when the underlying system is not well mixed, the appropriate microscopic description involves describing the dynamics of each chemical molecule separately [6,7]. On the other hand, when the system is sufficiently well mixed, the kinetics of each species are described by a continuous time discrete space Markov chain, and in this case, the corresponding master equation is known as the chemical master equation (CME) [8]. Essentially, the CME is a (potentially infinite-dimensional) system of ordinary differential equations (ODEs) that describes, at each point in time, the probability density of all the different possible states of the system.

Except for some very simple chemical systems [9], due to the inherent high dimensionality of the CME, analytic solutions of the CME are not available. Therefore several methods have been developed [10–12] that try to solve the corresponding system of differential equations directly. An alternative and more widely adopted approach relates to the direct simulation of the underlying Markov process. More precisely, the stochastic simulation algorithm (SSA) [13] exactly simulates trajectories whose probability density function matches that of the CME as the system evolves in time. In addition, several alternative exact algorithms have been subsequently proposed [14,15] that were shown to be computationally more efficient than SSA. The core idea behind these algorithms is that one samples a waiting time for the next reaction from an appropriate exponential distribution, while another draw of a random variable is then used to decide which of the possible reactions will occur.

A fundamental issue with all the exact algorithms described above is that running them can be computationally intensive for realistic problems. The reason is that the time between subsequent reactions becomes very small, which creates a computational bottleneck. This issue is further exacerbated when one is interested in estimating parameters of stochastic kinetics models from data, since the underlying likelihood is intractable, i.e. not available in closed form, and one needs to perform multiple stochastic simulations, for example within a particle Markov chain Monte Carlo framework [16–19], to deal with this intractability.

One approach to dealing with the computational complexity of exact algorithms like SSA is to use an approximate algorithm such as $τ$-leap [20], in which the system is simulated over suitable time intervals during which several chemical events might occur. This lumping of chemical events can lead to significant computational savings [8]. Furthermore, several variants of this algorithm have been proposed in the literature [21–24]. An alternative approach to speeding up the SSA is to employ different approximations on the level of the description of the chemical system. A prime example of this is the reaction rate equation (RRE) [8]. This is an ODE that is valid in the limit of large molecular populations, and it can be thought of as approximating the time evolution of the mean of the evolving Markov chain. An intermediate regime between the SSA and the RRE is the one where stochasticity is still important, but there exists a sufficient number of molecules to describe the evolving kinetics by a continuous model. This regime is described by the chemical Langevin equation (CLE) [25], which is an Itô stochastic differential equation (SDE) driven by a multi-dimensional Wiener process.

In practice, a lot of chemical systems can contain many different species with a wide range of population numbers. This multi-scale nature makes the direct application of approximate methods such as $τ$ -leap or of approximate models such as the CLE or the RRE non-trivial. This has motivated several different hybrid algorithms [26–28] that only treat certain chemical species as continuous variables and others as discrete. By doing so, such schemes can benefit from the computational efficiency of continuum approximations (either deterministic or stochastic) while still taking into account discrete fluctuations when necessary. Such schemes typically involve partitioning the reactions into fast and slow reactions, with the fast reactions modelled using a continuum approximation (CLE or the reaction rate equation), while using the Markov jump process to simulate the discrete reactions. As the effective firing rate of reactions depends on the present chemical populations, the reactions may effectively transition from displaying fast to slow behaviours if the involved chemical populations vary significantly over time. This issue can be addressed by periodic repartitioning [29,30] or by adopting a different approach based on population scaling, i.e. adapting the type of reaction simulation as a function of the current population. Such an approach was followed in [31], where an algorithm was proposed that would perform Langevin dynamics in regions of abundance, jump dynamics in regions where one of the involved chemical species is in small concentrations, and a mixture of both in intermediate regions.

In this paper, inspired by the work in [31], we propose a hybrid $τ$-leap scheme (hereafter denoted hybrid $τ$) that uses $τ$-leaping dynamics to simulate reactions in which the discreteness cannot be discounted. In particular, the proposed algorithm corresponds to a discretization of the CME and in the limit of small $τ$ coincides with SSA. This is contrary to the algorithm proposed in [31], where the limit of small time-step does not coincide with SSA. In addition, similar to the approach in [31], our scheme does not explicitly keep track of fast and slow reactions, but rather performs $τ$-leap dynamics in regions of abundance, jump dynamics in regions where one of the involved chemical species is in small concentrations, and a mixture of both in intermediate regions. The preference for jump over $τ$-leap dynamics is controlled for each individual reaction using a blending function, which is chosen to take value $1$ in regions of low concentration and $0$ in regions where all involved chemical species are abundant, and to interpolate smoothly in between. The choice of each blending region will depend on the reaction rate associated with the given reaction. The region should generally be chosen so that the resulting propensity is large in the $τ$-leaping region and small in the discrete region.

The rest of the paper is organized as follows. In §2, we review the standard approaches for simulating chemical kinetics such as SSA and the $τ$ -leap method. Furthermore, we introduce some basic ideas associated with parameter estimation for chemical kinetics, highlighting the fact that since the underlying likelihood is intractable one needs to design inference algorithms based on using fast and accurate simulations of the underlying chemical system. Then, in §3, we introduce our new hybrid $τ$ algorithm, while in §4 we perform a number of numerical simulations that demonstrate the excellent performance of the proposed numerical scheme when compared with other state-of-the-art methods. We conclude in §5 with a summary of our findings and a discussion of future directions.

2. Preliminaries

We will consider a biochemical network of $N$ species that interact through $M$ reaction channels within an isothermal reactor of fixed volume $V$. We will denote with $X_{i} (t), i = 1, \dots, N$ the number of molecules of species $S_{i}$ at time $t$ and let $𝑿 (t) = (X_{1} (t), \dots, X_{N} (t))$. Throughout this work, we will assume that the chemical species are well mixed and hence $𝑿 (t)$ can be modelled as a continuous time discrete space Markov process [32]. More precisely, when in state $𝑿 (t)$, the $j$th reaction gives rise to a transition $𝑿 (t) \to 𝑿 (t) + 𝝂_{j}$ with exponentially distributed waiting time with inhomogeneous rate $a_{j} (𝑿 (t))$, where $a_{j} (\cdot)$ and $𝝂_{j} \in ℤ^{N}$ denote the propensity and stoichiometric vector corresponding to the $j$th reaction, respectively.

Each reaction is of the form

μ_{j, 1} S_{1} + μ_{j, 2} S_{2} + \dots + μ_{j, N} S_{N} \overset{c_{j}}{\to} μ_{j, 1}^{'} S_{1} + μ_{j, 2}^{'} S_{2} + \dots + μ_{j, N}^{'} S_{N},

where $j = 1, \dots, M$ and $μ_{j, i}, μ_{j, i}^{'} \in ℕ = \{0, 1, 2, \dots\}$, for $i = 1, \dots, N$. Writing $𝝁_{j} = (μ_{j, 1}, \dots, μ_{j, N})$ and $𝝁_{j}^{'} = (μ_{j, 1}^{'}, \dots, μ_{j, N}^{'})$, the stoichiometric vectors $𝝂_{j}$, $j = 1, \dots, M$, satisfy

𝝂_{j} = 𝝁_{j}^{'} - 𝝁_{j} .

These vectors describe how the number of molecules of each species changes when the $j$th reaction takes place. For notational convenience, hereafter, $V = [𝝂_{1}, \dots, 𝝂_{M}]$. Under the assumptions of mass action kinetics, the associated propensity $a_{j}$ for the $j$th reaction is

a_{j} (x) = c_{j} \prod_{i = 1}^{N} \frac{x_{i}!}{(x_{i} - μ_{j, i})!},

where $x_{i}$ is the number of molecules of $S_{i}$ . Again, for notational convenience, hereafter $𝒂 (𝒙) = (a_{1} (𝒙), \dots, a_{M} (𝒙))$ .
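The propensity formula and the relation $𝝂_{j} = 𝝁_{j}^{'} - 𝝁_{j}$ can be evaluated directly from the reactant and product coefficients. The following is a minimal sketch, not part of the original implementation; the function names are hypothetical, and the convention that the propensity vanishes when $x_{i} < μ_{j, i}$ (where the factorial ratio is undefined) is an assumption made here.

```python
from math import prod

def falling_factorial(x, m):
    """x! / (x - m)! = x (x - 1) ... (x - m + 1); assumed to be 0 when x < m."""
    if x < m:
        return 0
    return prod(range(x - m + 1, x + 1))

def propensities(x, c, mu):
    """Mass-action propensities a_j(x) = c_j * prod_i x_i! / (x_i - mu[j][i])!."""
    return [c_j * prod(falling_factorial(x_i, m) for x_i, m in zip(x, mu_j))
            for c_j, mu_j in zip(c, mu)]

def stoichiometry(mu, mu_prime):
    """Stoichiometric vectors nu_j = mu'_j - mu_j, one row per reaction."""
    return [[p - r for r, p in zip(mu_j, mup_j)]
            for mu_j, mup_j in zip(mu, mu_prime)]

# Illustration: A -> 2A, A + B -> 2B, B -> 0 with species ordered (A, B)
mu = [[1, 0], [1, 1], [0, 1]]          # reactant coefficients
mu_prime = [[2, 0], [0, 2], [0, 0]]    # product coefficients
c = [2.0, 0.002, 2.0]
nu = stoichiometry(mu, mu_prime)
a = propensities([50, 60], c, mu)
```

For first-order and bimolecular reactions between distinct species, the falling factorial reduces to the familiar products $c_{j} x_{i}$ and $c_{j} x_{i} x_{k}$.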

2.1. Algorithms for simulating chemical kinetics

The main assumption in modelling the evolution of $𝑿 (t)$ is that within the time interval $[t, t + d t)$ , the probability of the reaction $j$ firing is proportional to $a_{j} (𝑿 (t)) d t + o (d t)$ . The process $𝑿 (t)$ can thus be expressed as the sum of $M$ Poisson processes with inhomogeneous rates $a_{j} (𝑿 (t))$ . Furthermore, the process $𝑿 (t)$ can be expressed [25,33] as a random time change of unit rate Poisson processes

X (t) = X (0) + \sum_{j = 1}^{M} P_{j} (\int_{0}^{t} a_{j} (X (s)) d s) ν_{j},

(2.1)

where $P_{j}$ are independent unit-rate Poisson processes. This formulation of the stochastic process is very helpful in terms of designing numerical methods that produce either exact or approximate samples. In particular, the standard way of sampling realizations of $𝑿 (t)$ is the Gillespie SSA [13], see Algorithm 2.1.


As we can see in Algorithm 2.1, in order to advance the system from time $t$ to time $t + τ$, one needs to generate two random variables. The next reaction method [14] further exploits the structure of (2.1) to provide a more efficient implementation of Gillespie’s SSA when simulating systems with many reaction channels. However, a fundamental computational issue with exact algorithms such as Gillespie’s SSA or the next reaction method is that they become computationally expensive when the number of molecules becomes large. In particular, in this case, the time to the next reaction becomes small and, if one is interested in simulating the chemical system up to timescales of $O (1)$, one has to simulate a very large number of reaction events.
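The two random draws per step can be made concrete with a short sketch of the direct method. This is an illustrative implementation under the assumptions stated in the comments, not a transcription of Algorithm 2.1; the function name `ssa` and its signature are hypothetical.

```python
import random

def ssa(x0, nu, propensity, t_end, rng):
    """Return the state at time t_end of one exact trajectory started at x0."""
    x, t = list(x0), 0.0
    while True:
        a = propensity(x)
        a0 = sum(a)
        if a0 <= 0.0:                 # no reaction can fire any more
            return x
        t += rng.expovariate(a0)      # first draw: waiting time ~ Exp(a0)
        if t > t_end:
            return x
        u = rng.random() * a0         # second draw: reaction j with prob. a_j / a0
        j, acc = 0, a[0]
        while acc <= u and j < len(a) - 1:
            j += 1
            acc += a[j]
        x = [x_i + d for x_i, d in zip(x, nu[j])]

# Lotka-Volterra system of section 4.1.1: A -> 2A, A + B -> 2B, B -> 0
nu = [[1, 0], [-1, 1], [0, -1]]
lotka_volterra = lambda x: [2.0 * x[0], 0.002 * x[0] * x[1], 2.0 * x[1]]
state = ssa([50, 60], nu, lotka_volterra, 0.5, random.Random(0))
```

Since the mean waiting time is $1 / \sum_{j} a_{j} (𝒙)$, the number of loop iterations needed to reach a fixed final time grows with the total propensity, which is the bottleneck described above.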

One approach for speeding up exact algorithms is to further exploit the structure of (2.1) to construct approximate algorithms. In particular, instead of explicitly calculating the time to the next reaction, one can choose a timescale of interest $τ$ and then calculate how many reactions have occurred in each of the reaction channels. More precisely, one can use the following approximation:

\int_{t}^{t + τ} a_{j} (X (s)) d s ≃ a_{j} (X (t)) τ,

(2.2)

and then using the formulation (2.1), it is not difficult to see that the number of reactions $k_{j}$ in the $j$th channel can be approximated by

k_{j} \sim P (a_{j} (𝑿 (t)) τ),

where $P (a_{j} (𝑿 (t)) τ)$ is a Poisson random variable with mean given by $a_{j} (𝑿 (t)) τ$. The corresponding algorithm is called $τ$-leaping [20], see also Algorithm 2.2. A discussion of the assumptions under which the approximation (2.2) is valid, as well as strategies for choosing $τ$, can be found in [25,34,35].

The main computational savings here come from the fact that several reaction events are lumped together, while in addition, under appropriate assumptions on the propensity functions of the system [34,36], the error induced by this approximation is not very large. However, unlike exact methods such as SSA, the simple $τ$-leap method can in principle lead to negative populations, so one has to modify the original algorithm to avoid this issue [22,37].
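A single leap under approximation (2.2) can be sketched as follows. This is a minimal illustration rather than Algorithm 2.2 itself: the names are hypothetical, the Poisson sampler simply counts unit-rate arrivals (adequate for small means, slow for large ones), and no safeguard against negative populations is included beyond clipping the propensity in the example.

```python
import random

def poisson(rng, mean):
    """Sample Poisson(mean) by counting unit-rate arrivals in [0, mean]."""
    if mean <= 0.0:
        return 0
    k, t = 0, rng.expovariate(1.0)
    while t <= mean:
        k += 1
        t += rng.expovariate(1.0)
    return k

def tau_leap_step(x, nu, propensity, tau, rng):
    """One tau-leap step: k_j ~ Poisson(a_j(x) * tau), then x <- x + sum_j k_j nu_j."""
    a = propensity(x)                               # frozen at time t, as in (2.2)
    k = [poisson(rng, a_j * tau) for a_j in a]      # firings per channel
    return [x_i + sum(k_j * nu_j[i] for k_j, nu_j in zip(k, nu))
            for i, x_i in enumerate(x)]

# Birth-death example: S -> 0 (c1 = 1) and 0 -> S (c2 = 10)
nu = [[-1], [1]]
birth_death = lambda x: [1.0 * max(x[0], 0), 10.0]  # clip: the state may go negative
rng = random.Random(0)
x = [10]
for _ in range(100):
    x = tau_leap_step(x, nu, birth_death, 0.1, rng)
```

Each step costs one Poisson draw per channel regardless of how many reactions fire, which is where the savings over SSA come from.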


2.2. Parameter estimation for chemical kinetics

In many practical applications one might be interested in estimating parameters of stochastic kinetics models such as reaction rates from time series data. In the typical setting [38,39], given a realization of the stochastic process $\{𝑿_{t}, t \in [0, T]\} ≕ 𝑿_{[0, T]}$, the law of which depends on some parameter $𝒄$, we consider the problem of inferring the value of $𝒄$ while having access only to discrete and noisy observations $\{𝒚_{i}, i = 1, \dots, L\}$ of $𝑿_{[0, T]}$ at discrete times $(t_{1}, \dots, t_{L}) \subset [0, T]$.

In our setting, the true data $𝒙_{[0, T]}$ is a realization of a stochastic chemical system $𝑿_{[0, T]}$. That true data is (for simplicity) measured and saved at integer times $[[0, T]] ≔ \{0, 1, \dots, T\}$, during which measurement errors are possible. This leads to the noisy data $𝒚_{[[0, T]]}$ (a realization of $𝒀_{[[0, T]]}$). The observations $𝒀_{[[0, T]]}$ are assumed to be conditionally independent of each other given $𝑿_{[0, T]}$.

We aim to characterize or sample from the probability density $ℙ (𝒄 | 𝒚_{[[0, T]]})$ , to which end we use a Markov chain targeting that density. By Bayes’ theorem,

ℙ (𝒄 | 𝒚_{[[0, T]]}) = \frac{ℙ (𝒚_{[[0, T]]} | 𝒄) ℙ_{0} (𝒄)}{ℙ (𝒚_{[[0, T]]})} .

(2.3)

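Here, $ℙ_{0} (𝒄)$ is the prior assigned to $𝒄$. Applying the law of total probability to the likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$ and dropping the normalizing constant $ℙ (𝒚_{[[0, T]]})$, which does not depend on $𝒄$, we obtain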

\begin{matrix} ℙ (𝒄 | 𝒚_{[[0, T]]}) \propto ℙ_{0} (𝒄) \int ℙ (𝒚_{[[0, T]]} | 𝒙_{[0, T]}, 𝒄) ℙ (𝒙_{[0, T]} | 𝒄) d 𝒙_{[0, T]} . \end{matrix}

(2.4)

As the following example reveals, the likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$ is intractable, even in the case where there is no noise in the data $𝒚_{[[0, T]]}$ and one observes $𝒙_{[0, T]}$ directly.

Example 2.1. We consider the simplest case where the data $𝒚_{[[0, T]]}$ coincide exactly with the values of $𝒙_{[0, T]}$ at the observation times. In this case, $ℙ (𝒚_{[[0, T]]} | 𝒄)$ becomes

ℙ (𝒚_{[[0, T]]} | 𝒄) = \prod_{i = 1}^{T} p (𝑿 (t_{i}) | 𝒄, 𝑿 (t_{i - 1})),

where $p (𝑿 (t_{i}) | 𝒄, 𝑿 (t_{i - 1}))$ is the solution to the CME. However, except for some very simple chemical systems [9], solutions to the CME are not analytically available, which in turn implies that the likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$ is in general intractable.

As example 2.1 indicates, the likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$ is in general intractable. There are different ways of dealing with this issue, one of which is approximate Bayesian computation [19]. However, here we follow the pseudo-marginal approach [40], similarly to what was done in [18]. In particular, the idea is that if we have access to an unbiased estimator $\hat{ℙ} (𝒚_{[[0, T]]} | 𝒄)$ of our intractable likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$, we can perform Bayesian inference within a standard Metropolis–Hastings framework by replacing the intractable likelihood with its unbiased estimator.

2.2.1. Particle pseudo-marginal Metropolis–Hastings algorithm

As discussed above, within the pseudo-marginal framework we need access to an unbiased estimator of our intractable likelihood. We obtain one by using a (bootstrap) particle filter [41] with importance resampling to iteratively construct an unbiased estimate of the likelihood $ℙ (𝒚_{[[0, T]]} | 𝒄)$. Combining this unbiased estimate with a Metropolis–Hastings step gives rise to the particle pseudo-marginal Metropolis–Hastings algorithm (PPMMH) [39]; see Algorithm 2.3. In typical Metropolis–Hastings fashion, the state space is explored via a proposal kernel $q (\cdot | 𝒄)$ generating proposals $𝒄^{⋆}$ from the current state $𝒄$, and the proposals are kept in the chain using a Metropolis–Hastings accept/reject mechanism.


The bootstrap particle filter (computed in line $2$ of Algorithm 2.3) relies on the fact that, for $j > 0$ ,

\begin{matrix} ℙ (𝒙_{[0, j + 1]} | 𝒚_{[[0, j + 1]]}) \propto ℙ (𝒚_{j + 1} | 𝒙_{j + 1}) ℙ (𝒙_{[0, j]} | 𝒚_{[[0, j]]}) ℙ (𝒙_{(j, j + 1]} | 𝒙_{[0, j]}) . \end{matrix}

(2.5)

This allows refining the naive approach of simply simulating $K$ realizations/particles up to time $T$. To avoid ambiguities, we hereafter solely use the term ‘particle’ to denote realizations of the particle filter, and not chemical species. The iterative method consists of propagating the particles over a time interval of length $1$, evaluating the likelihood of each particle given the data, and using an importance resampling mechanism. Among other benefits, this helps to avoid the degeneracy of the filter [42]. The unbiasedness of the estimator can be established using, e.g. [39, p. 290]. The initial step of the algorithm is performed by sampling from a given prior $ℙ_{0} (𝒙_{0})$.

The steps are summarized in Algorithm 2.4.

Line 7 of Algorithm 2.4 entails the repeated simulation of K particles over a time interval, resulting in a computationally intensive process. In this work, we speed up those computations by using the hybrid τ algorithm.
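The propagate, weight and resample loop described above can be sketched generically. The code below is an assumption-laden illustration and not a transcription of Algorithm 2.4: the function names, the argument layout, the use of multinomial resampling and the toy pure-death model with Gaussian observation noise are all hypothetical choices made for this example.

```python
import math
import random

def bootstrap_log_likelihood(y, c, sample_x0, propagate, obs_density, K, rng):
    """Bootstrap particle filter estimate of log P(y_0, ..., y_T | c).

    y[j] is the observation at integer time j; propagate(x, c, rng) must return
    a NEW state one time unit later (any simulator, e.g. SSA, can be plugged in).
    """
    particles = [sample_x0(rng) for _ in range(K)]        # draw from the prior on x_0
    log_lik = 0.0
    for j, y_j in enumerate(y):
        if j > 0:
            particles = [propagate(x, c, rng) for x in particles]
        w = [obs_density(y_j, x) for x in particles]      # weight by P(y_j | x_j)
        w_sum = sum(w)
        if w_sum == 0.0:
            return -math.inf
        log_lik += math.log(w_sum / K)                    # running likelihood factor
        particles = rng.choices(particles, weights=w, k=K)  # importance resampling
    return log_lik

# Toy model (assumption): pure death S -> 0 with rate c per molecule, for which
# each molecule survives a unit time interval with probability exp(-c).
def propagate(x, c, rng):
    p = math.exp(-c)
    return sum(rng.random() < p for _ in range(x))

def obs_density(y, x, sigma=2.0):
    """Gaussian measurement noise with standard deviation sigma (assumption)."""
    return math.exp(-0.5 * ((y - x) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

y = [20.0, 13.0, 7.0, 4.0]
ll = bootstrap_log_likelihood(y, 0.5, lambda rng: 20, propagate, obs_density,
                              200, random.Random(1))
```

The cost is dominated by the `K` calls to `propagate` per observation, which is why a faster simulator translates directly into cheaper inference.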


Remark 2.1. Running the bootstrap particle filter in practice requires making a few algorithmic choices and performing some amount of fine-tuning, chief among which are determining the number of particles $K$ and the type and parameters of the proposal kernel $q (\cdot | 𝒄)$. In this work, we make use of a Gaussian proposal kernel, and proceed in a bootstrap fashion by performing a sequence of exploratory runs in order to determine those quantities. Those exploratory runs provide a first estimate of the mean $𝒄_{pre}$ and the covariance of the kernel $\hat{𝐂}$ (up to a tuning constant $γ$). Following [43,44], the number of particles $K$ is then chosen such that the variance of the estimated log-likelihood, $Var (\log \hat{ℙ} (𝒚 | 𝒄_{pre}))$, is approximately $2$. The tuning constant $γ$ of the proposal kernel is also determined in a similar bootstrap fashion, again following the approach outlined in [43], with the aim that the accept–reject ratio is close to $10\%$. The practical details of our implementation are discussed in §4.2.2.

3. Hybrid $τ$ -leap

We now introduce our proposed algorithm. The idea here is similar to the one in [31]. In particular, we will introduce one blending function for each reaction denoted by $β_{j} (𝒙) : ℝ^{N} \mapsto [0,1], j = 1, \dots, M$ . One can then simply rewrite equation (2.1) in the following way:

X (t) = X (0) + \sum_{j = 1}^{M} P_{j} (\int_{0}^{t} β_{j} (X (s)) a_{j} (X (s)) d s + \int_{0}^{t} (1 - β_{j} (X (s))) a_{j} (X (s)) d s) ν_{j} .

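Using the superposition property of Poisson processes (the sum of two independent Poisson processes is again a Poisson process, whose rate is the sum of the two rates), it is now possible to rewrite the equation above in the following manner, where the two Poisson processes attached to each reaction are understood to be independent:

\begin{aligned} X (t) = {} & X (0) + \sum_{j = 1}^{M} P_{j} (\int_{0}^{t} β_{j} (X (s)) a_{j} (X (s)) d s) ν_{j} \\ & + \sum_{j = 1}^{M} P_{j} (\int_{0}^{t} (1 - β_{j} (X (s))) a_{j} (X (s)) d s) ν_{j} . \end{aligned}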

(3.1)

This rewriting might appear trivial at first sight, but it is essential in terms of explaining our algorithm, given the form of the blending functions $β_{j}$. In particular, for a single-species system, a natural choice of blending function is the following piecewise linear function:

β (x, I_{1}, I_{2}) = \begin{cases} 1, & \text{if } x \leq I_{1}, \\ \frac{I_{2} - x}{I_{2} - I_{1}}, & \text{if } I_{1} \leq x \leq I_{2}, \\ 0, & \text{if } x \geq I_{2}, \end{cases}

where $0 \leq I_{1} < I_{2}$. Furthermore, in the case of a chemical system with $N$ species, we can construct blending functions in the following way. Let $R_{j}$ be the indices of the chemical species involved in the $j$th reaction (both as reactants and products of reaction), i.e. $R_{j} = \{i : μ_{j, i} \neq 0 \text{ or } μ_{j, i}^{'} \neq 0\}$. Then we can define $β_{j} (𝒙), j = 1, \dots, M$ as follows:

\begin{matrix} β_{j} (𝒙) = 1 - \prod_{n \in R_{j}} (1 - β (x_{n}, I_{1}^{n, j}, I_{2}^{n, j})), \end{matrix}

(3.2)

where $I_{1}^{n, j} < I_{2}^{n, j}$ are the boundaries for each individual chemical species. The boundaries $I_{1}^{n, j}$ and $I_{2}^{n, j}$ are user-defined and depend on the problem at hand; in the general case, they may depend on both the species and the reaction. They separate the state space into a region where the reaction is simulated with SSA dynamics, a region of pure $τ$-leap and a region of transitory hybrid dynamics in between. As such, they should be chosen in such a way that $τ$-leaping is a valid numerical approximation in the region where it is applied. Two natural simplifications of this approach are when the boundaries depend only on the species ($I_{i}^{n, j} \equiv I_{i}^{n}$ for $i = 1, 2$) and when the boundaries are independent of species and reaction ($I_{i}^{n, j} \equiv I_{i}$ for $i = 1, 2$). In our numerical experiments, we restrict ourselves to the latter case.
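The piecewise linear function and the product formula (3.2) translate into a few lines of code. The sketch below assumes the simplest case, boundaries $(I_{1}, I_{2})$ shared by all species and reactions; the function names are hypothetical.

```python
def beta(x, i1, i2):
    """Single-species blending function: 1 below i1, 0 above i2, linear in between."""
    if x <= i1:
        return 1.0
    if x >= i2:
        return 0.0
    return (i2 - x) / (i2 - i1)

def beta_reaction(x, species, i1, i2):
    """Equation (3.2): beta_j(x) = 1 - prod_{n in R_j} (1 - beta(x_n, i1, i2)).

    `species` is the index set R_j of the species involved in reaction j.
    """
    p = 1.0
    for n in species:
        p *= 1.0 - beta(x[n], i1, i2)
    return 1.0 - p

# A bimolecular reaction involving species 0 and 1, with (I1, I2) = (5, 10)
scarce = beta_reaction([3, 100], [0, 1], 5, 10)       # one scarce species: SSA
abundant = beta_reaction([100, 100], [0, 1], 5, 10)   # all abundant: tau-leap
blended = beta_reaction([7.5, 7.5], [0, 1], 5, 10)    # both in the blending region
```

A single scarce species is enough to push $β_{j}$ to $1$, so a reaction is handed to $τ$-leap only when every species it involves is abundant.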

We now define the following sets:

C_{SSA}^{j} = \{𝒙 \in ℕ^{N} \mid \exists n \in R_{j}, x_{n} \leq I_{1}^{n}\},

(3.3a)

C_{τ-leap}^{j} = \{𝒙 \in ℕ^{N} \mid \forall n \in R_{j}, x_{n} \geq I_{2}^{n}\},

(3.3b)

C_{hybrid}^{j} = ℕ^{N} ∖ (C_{SSA}^{j} \cup C_{τ-leap}^{j}) .

(3.3c)

It is not difficult to see that $𝒙 \in \cap_{j = 1}^{M} C_{SSA}^{j}$ translates to $\min_{j} β_{j} (𝒙) = 1$, while $𝒙 \in \cap_{j = 1}^{M} C_{τ-leap}^{j}$ corresponds to $\max_{j} β_{j} (𝒙) = 0$. Hence, combining (3.2) with (3.1), when $𝒙 \in \cap_{j = 1}^{M} C_{SSA}^{j}$ we will use SSA to simulate (3.1), while when $𝒙 \in \cap_{j = 1}^{M} C_{τ-leap}^{j}$ we will use the $τ$-leap to simulate (3.1). It is only in the region $𝒙 \in \cup_{j = 1}^{M} C_{hybrid}^{j}$ that some reactions have blending functions taking values in $(0,1)$, and it is for these reactions that we use a combination of SSA for the term $P_{j} (\int_{0}^{t} β_{j} (𝑿 (s)) a_{j} (𝑿 (s)) d s)$ and $τ$-leap for the term $P_{j} (\int_{0}^{t} (1 - β_{j} (𝑿 (s))) a_{j} (𝑿 (s)) d s)$. Note that, outside the regions where the system is simulated solely via SSA or solely via $τ$-leap, the simulation regime might be quite diverse, with some reactions being simulated with SSA, others with $τ$-leap and others still in the hybrid regime. For convenience, we call this region ‘mixed’ in figure 1d; it is simply given by

ℕ^{N} ∖ (\cap_{j = 1}^{M} C_{SSA}^{j} \cup \cap_{j = 1}^{M} C_{τ-leap}^{j}) .

Figure 1. (a–c) Illustration of $C_{SSA}^{j}$, $C_{hybrid}^{j}$ and $C_{τ-leap}^{j}$ associated with each of the reactions of the chemical system in example 3.1; (d) partition of state space as per applicable simulation regime.

We call the resulting algorithm the hybrid $τ$ -method (see Algorithm 3.1). Note that, for clarity, we denote by $τ$ -leap( $τ$ , $𝒂$ ) the shorthand for performing one $τ$ -leap step through Algorithm 2.2, only expressing its dependence on the time-step $τ$ and the involved propensities $𝒂$ . Furthermore, for the propensities, the expression $𝜸 𝒂$ is to be understood component-wise.
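To make the role of the split (3.1) concrete, the sketch below advances the state over one window of length `h` by treating each reaction as two channels: a jump channel with propensity $β_{j} a_{j}$, simulated exactly, and a lumped channel with propensity $(1 - β_{j}) a_{j}$, simulated with a Poisson draw. This is only one possible discretization written for illustration and should not be read as Algorithm 3.1; all names are hypothetical, the boundaries are assumed independent of species and reaction, the lumped propensities are frozen at the start of the window, and nothing prevents the lumped firings from producing negative populations.

```python
import random

def poisson(rng, mean):
    """Sample Poisson(mean) by counting unit-rate arrivals in [0, mean]."""
    if mean <= 0.0:
        return 0
    k, t = 0, rng.expovariate(1.0)
    while t <= mean:
        k += 1
        t += rng.expovariate(1.0)
    return k

def blend(x, i1, i2):
    """Piecewise linear single-species blending function."""
    if x <= i1:
        return 1.0
    if x >= i2:
        return 0.0
    return (i2 - x) / (i2 - i1)

def hybrid_window(x, nu, species, propensity, i1, i2, h, rng):
    """Advance x over one window of length h using the split in (3.1)."""
    def weights(state):                       # beta_j(state) as in (3.2)
        out = []
        for R_j in species:
            p = 1.0
            for n in R_j:
                p *= 1.0 - blend(state[n], i1, i2)
            out.append(1.0 - p)
        return out

    x = list(x)
    # Lumped part: propensities (1 - beta_j) a_j frozen at the window start
    leaps = [poisson(rng, (1.0 - b) * a * h)
             for b, a in zip(weights(x), propensity(x))]
    # Jump part: propensities beta_j a_j, refreshed after every jump
    t = 0.0
    while True:
        rates = [b * a for b, a in zip(weights(x), propensity(x))]
        r0 = sum(rates)
        if r0 <= 0.0:
            break
        t += rng.expovariate(r0)
        if t > h:
            break
        u, j, acc = rng.random() * r0, 0, rates[0]
        while acc <= u and j < len(rates) - 1:
            j += 1
            acc += rates[j]
        x = [x_i + d for x_i, d in zip(x, nu[j])]
    # Apply the lumped firings at the end of the window
    return [x_i + sum(k * nu_j[i] for k, nu_j in zip(leaps, nu))
            for i, x_i in enumerate(x)]
```

When every $β_{j}$ equals $1$ the lumped counts are all zero and the window is an exact SSA simulation; when every $β_{j}$ equals $0$ the jump loop exits immediately and the window is a single $τ$-leap step.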

Example 3.1. We consider here the following chemical system

S_{1} \overset{c_{1}}{\to} \emptyset, \qquad \emptyset \overset{c_{2}}{\to} S_{1}, \qquad S_{1} + S_{2} \overset{c_{3}}{\to} \emptyset, \qquad \emptyset \overset{c_{4}}{\to} S_{2} .

Figure 1a–c illustrates the sets $C_{SSA}^{j}, C_{τ-leap}^{j}, C_{hybrid}^{j}$ associated with each of the reactions, here $j = 1, \dots, 4$. In addition, in figure 1d we can see the partition of the state space in terms of which simulation regime is applied where.
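The partition can also be computed pointwise. The sketch below derives the index sets $R_{j}$ for the four reactions of example 3.1 and classifies a state according to (3.3a)–(3.3c); the function names are hypothetical and the boundaries $(I_{1}, I_{2}) = (5, 10)$ are chosen purely for illustration.

```python
# Example 3.1: S1 -> 0, 0 -> S1, S1 + S2 -> 0, 0 -> S2 (species indexed 0 and 1)
mu = [[1, 0], [0, 0], [1, 1], [0, 0]]          # reactant coefficients
mu_prime = [[0, 0], [1, 0], [0, 0], [0, 1]]    # product coefficients

# R_j: species appearing in reaction j as reactant or product
R = [[i for i in range(2) if mu[j][i] != 0 or mu_prime[j][i] != 0]
     for j in range(4)]

def reaction_regime(x, species, i1, i2):
    """Membership of x in C_SSA^j, C_tau-leap^j or C_hybrid^j."""
    if any(x[n] <= i1 for n in species):
        return "SSA"
    if all(x[n] >= i2 for n in species):
        return "tau-leap"
    return "hybrid"

def system_regime(x, i1, i2):
    """'SSA' or 'tau-leap' if every reaction agrees, otherwise 'mixed'."""
    regimes = {reaction_regime(x, R_j, i1, i2) for R_j in R}
    if regimes == {"SSA"} or regimes == {"tau-leap"}:
        return regimes.pop()
    return "mixed"

for state in [(3, 3), (20, 20), (20, 3), (7, 7)]:
    print(state, [reaction_regime(state, R_j, 5, 10) for R_j in R],
          system_regime(state, 5, 10))
```

For the state $(20, 3)$, the reactions involving only $S_{1}$ fall in the $τ$-leap set while those involving $S_{2}$ fall in the SSA set, which is the kind of configuration the ‘mixed’ label refers to.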

Remark 3.2. In Algorithm 3.1, two different time-stepping strategies are being used. There is $δ t$ that relates to the time-step used by the $τ$ -leap method in the intermediate regime and $Δ t$ that relates to the time-step used by the $τ$ -leap method when only $τ$ -leap is used for simulation. This is done to provide extra flexibility but is not crucial for the performance of the algorithm.

Remark 3.3. Choosing one blending function per reaction is a modelling choice that tries to fully exploit the multi-scale nature (when present) of the chemical kinetics. A more conservative approach would be to define a single blending function for all reactions, i.e. $\tilde{β} (𝒙) = 1 - \prod_{i = 1}^{N} (1 - β (x_{i}, I_{1}, I_{2}))$, in analogy with (3.2). This would cause the simulation regime of any reaction to be determined by the current state of all chemical species present in the system. As long as this choice of blending function sensibly partitions the state space, i.e. ensures that the underlying stochastic process spends some time outside the SSA region, it would still lead to an algorithm that is more efficient than SSA.

Remark 3.4. The idea of partitioning the state space is rather general. In particular, one could replace $τ$-leap with the numerical method of their choice, and the only thing that would need to be considered is how to do the simulation in the region of space where that numerical method co-exists with SSA. An example of this is the hybrid CLE method proposed in [31], where instead of using $τ$-leap one simulates the term $P_{j} (\int_{0}^{t} (1 - β_{j} (𝑿 (s))) a_{j} (𝑿 (s)) d s)$ in (3.1) by using the diffusion approximation.

4. Numerical investigations

We now present several different numerical experiments to illustrate the robustness and the accuracy of our proposed approach. In particular, in §4.1 we study three different model chemical systems and compare the performance of hybrid $τ$ with other algorithms. Furthermore, in §4.2, we study the performance of the hybrid $τ$ method when used as the stochastic simulator of choice for parameter estimation as described in §2.2.

4.1. Comparison with other numerical methods

4.1.1. Lotka–Volterra system

We begin by considering a stochastic version of the Lotka–Volterra system. It is a first example where the hybrid $τ$ algorithm captures the correct statistical behaviour, while the standard CLE approximation (with reflective boundary conditions) completely fails to do so. The system is defined as:

Chemical system 4.1. Lotka–Volterra system

A \overset{c_{1}}{\to} 2 A, \qquad A + B \overset{c_{2}}{\to} 2 B, \qquad B \overset{c_{3}}{\to} \emptyset .

The molecules of $A$ and $B$ are in a predator–prey relationship, both populations oscillating between states of abundance and scarcity. As in [31], the reaction constants are set to $c_{1} = 2$ , $c_{2} = 0.002$ , $c_{3} = 2$ . In figure 2, a histogram is generated using $10^{3}$ SSA realizations simulated until $T = 5$ , which corresponds to one period of the solution to the corresponding RREs. The initial conditions are chosen to be $A (0) = 50, B (0) = 60$ . The system clearly exhibits a multi-scale behaviour, spending time in all possible configurations of scarcity and abundance for both species. Furthermore, a peculiarity of the system is that, for some realizations, the chemical species $B$ reaches $0$ ; in consequence, the system reduces to $A \to 2 A$ , i.e. exponential growth of the chemical species $A$ . The locally high copy numbers of species cause the simulation via SSA to become excessively slow (especially so in the situation of exponential growth, as the generated time-steps evolve as $O (A^{- 1})$ for rapidly growing $A$ , causing the simulation cost to become prohibitively expensive); this calls for employing approximate but accelerated schemes. Here we will consider two such schemes, the first being the CLE with reflective boundary conditions to ensure the non-negativity of the chemical species, and the second one being the hybrid CLE proposed in [31].

Figure 2. Histogram of $10^{3}$ SSA simulations up to time $T = 5$ of the Lotka–Volterra system with $A (0) = 50, B (0) = 60$.

Figure 3a displays the numerical means of $A$ computed using respectively SSA, the hybrid $τ$, the hybrid CLE [31] and the CLE (with reflective boundary conditions) with $10^{4}$ samples. We use an Euler–Maruyama discretization for the CLE with time-step $Δ t = 10^{- 2}$, while for the hybrid algorithms, the parameters are set to $(I_{1}, I_{2}) = (5,10)$ and the step-sizes to $Δ t = 10^{- 2}$ and $δ t = 10^{- 3}$. The step-sizes were chosen manually by computing the error for a number of short exploratory runs. A more sophisticated implementation of the hybrid $τ$ algorithm would require an adaptive scheme for the $τ$-leap part of the process. As we can observe, the numerical means of hybrid $τ$ and hybrid CLE follow the same trend as SSA, whereas the CLE simulation completely fails to capture the right behaviour. The mismatch is not due to the chosen step-size in the CLE simulations, but rather to the fact that the solution to the CLE with reflective boundary conditions is fundamentally different from that of the CME. This underlines the need for simulators directly based on the CME and not approximations thereof.

Figure 3. Chemical system 4.1 (Lotka–Volterra). Simulators: SSA, $τ$-leap (with reflective boundary conditions), CLE (also with reflective boundary conditions), hybrid CLE, hybrid $τ$. All averages were performed over $10^{4}$ simulations.

We now investigate the gains in computational efficiency. To this end, we simulate the Lotka–Volterra system up to time $T = 6$ and measure the average computational time (using $10^{4}$ realizations) as a function of time, using different simulation algorithms: SSA, $τ$ -leap, CLE, hybrid CLE and the hybrid $τ$ . For the hybrid algorithms, we used the blending region $(20,25)$ , and the time-steps are fixed to $Δ t = δ t = 10^{- 2}$ . The results are presented in figure 3b. The computational time for the SSA grows exponentially for increasing $t$ , while for the hybrid $τ$ simulations the computational cost grows linearly with time. At $T = 6$ , the computational cost of the SSA algorithm is two orders of magnitude higher than that of the hybrid $τ$ and hybrid CLE. Note that the hybrid CLE and $τ$ -leap perform similarly well in this situation.

4.1.2. Birth–death system

As a next example, we consider the following one-dimensional birth–death process:

Chemical system 4.2. Birth–death system

\begin{aligned} S \overset{c_{1}}{\to} \emptyset & \emptyset \overset{c_{2}}{\to} S \end{aligned} .

We modify the propensity function of the birth reaction to be ${\tilde{a}}_{2} (x) = c_{2} \cdot χ_{(0, N_{\max})} (x)$ , which ensures the process never leaves the interval $[0, N_{\max}]$ [31]. We are interested in calculating the amount of time that it takes for the system to reach zero when starting from the value $S = c_{2} / c_{1}$ . Note that due to our modification of the birth reaction, once the system reaches zero, it stays there, therefore we refer to this time as the extinction time.
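As a minimal sketch of the extinction-time experiment, the SSA for this birth–death system with the modified birth propensity could look as follows. The function name and the rounding of the initial condition $c_{2} / c_{1}$ to an integer are assumptions made for illustration; this is not the authors' implementation.

```python
import random

def birth_death_extinction_time(c1, c2, n_max, rng):
    """SSA for S -> 0 (propensity c1 * S) and 0 -> S (propensity c2 on 0 < S < n_max).

    Starts from S = round(c2 / c1) and returns the first time S hits zero.
    """
    s, t = round(c2 / c1), 0.0
    while s > 0:
        death = c1 * s
        # Modified birth propensity: switched off at the boundaries 0 and n_max.
        birth = c2 if 0 < s < n_max else 0.0
        total = death + birth
        t += rng.expovariate(total)
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * total < death:
            s -= 1
        else:
            s += 1
    return t

if __name__ == "__main__":
    rng = random.Random(0)
    # A smaller c2 than in the experiments keeps this demonstration fast.
    times = [birth_death_extinction_time(1.0, 3.0, 50, rng) for _ in range(200)]
    print(sum(times) / len(times))
```

Because the birth propensity vanishes at $S = 0$ , the loop terminates exactly when the state is absorbed, which is why the returned value can be read as an extinction time.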

In our experiments, we chose the parameters $N_{\max} = 50$ , $c_{1} = 1$ and $c_{2} = 10$ , and simulated the system $10^{4}$ times to record extinction times via SSA, $τ$ -leaping, CLE, hybrid CLE and hybrid $τ$ . For the hybrid $τ$ and hybrid CLE, we used the blending region $(5,7)$ and time-steps $Δ t = δ t = 10^{- 1}$ ; $τ$ -leap and CLE use the same time-step. Figure 4a shows the quantile–quantile plot of extinction times obtained via $τ$ -leaping, CLE, hybrid $τ$ and hybrid CLE, compared with those obtained from SSA. A quantile–quantile plot compares the quantiles of one distribution against those of another, so the closer the plotted points are to the $y = x$ line, the more similar the distributions. From that perspective, we see that $τ$ -leaping yields extinction time samples with a distribution significantly different from that of SSA, while the hybrid $τ$ matches the distribution much better; the same statement is true for CLE and hybrid CLE. Figure 4b displays the same data, this time by comparing the empirical cumulative distribution functions. Furthermore, we report in table 1 the relative speed-ups of each of the approximate algorithms with respect to SSA. As we can see, all the approximate algorithms are faster than SSA, and the hybrid $τ$ is somewhat faster than the hybrid CLE while displaying similar accuracy. In principle, further speed-ups could be obtained for all the algorithms by using larger time-steps, but this would result in a loss of accuracy.
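The quantile–quantile comparison can be sketched with the Python standard library alone. The helper names `qq_points` and `max_qq_deviation` are hypothetical, and the choice of 20 quantile levels is an assumption for illustration rather than the setting used for figure 4a.

```python
import statistics

def qq_points(sample_a, sample_b, n_quantiles=20):
    """Return paired quantiles (q_a, q_b) of two samples at the same levels."""
    qa = statistics.quantiles(sample_a, n=n_quantiles)
    qb = statistics.quantiles(sample_b, n=n_quantiles)
    return list(zip(qa, qb))

def max_qq_deviation(sample_a, sample_b, n_quantiles=20):
    """Largest vertical distance of the Q-Q points from the line y = x."""
    return max(abs(a - b) for a, b in qq_points(sample_a, sample_b, n_quantiles))

if __name__ == "__main__":
    reference = [float(i) for i in range(100)]
    shifted = [v + 1.0 for v in reference]
    # Identical samples lie on y = x; a shifted sample sits a constant distance away.
    print(max_qq_deviation(reference, reference), max_qq_deviation(reference, shifted))
```

Here `sample_a` would hold the SSA extinction times and `sample_b` those of an approximate simulator; a small deviation indicates that the two empirical distributions are close.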

Birth–death system. Statistical comparisons of extinction times obtained via SSA, $τ$ -leap, CLE, hybrid $τ$ and hybrid CLE, taken over $10^{4}$ simulations.

Table 1.

Simulation times of the birth–death system until extinction, averaged over $10^{4}$ simulations.

algorithm	avg. sim. time (s)	ratio SSA
SSA	1.760409e−02	1
h- $τ$	1.258614e−02	1.39869
$τ$ -L	7.784003e−03	2.26157
h CLE	1.689645e−02	1.04188
CLE	8.669714e−03	2.03053


4.1.3. Schlögl system

We now study the Schlögl system taken from [45]. It is a nonlinear model where the density function of the main reactant $S$ displays bistability for a certain choice of reaction constants.

Chemical system 4.3. Schlögl system

\begin{aligned} B_{1} + 2 S \underset{c_{2}}{\overset{c_{1}}{\rightleftharpoons}} 3 S & \qquad B_{2} \underset{c_{4}}{\overset{c_{3}}{\rightleftharpoons}} S, \end{aligned}

where $B_{1}$ and $B_{2}$ are buffered species, i.e. their populations are kept at constant values $N_{1} = 10^{5}$ and $N_{2} = 2 \cdot 10^{5}$ , respectively. In effect, only $S$ needs to be tracked and this results in the following propensity functions:

\begin{matrix} a_{1} (x) = \frac{c_{1}}{2} N_{1} x (x - 1), & a_{2} (x) = \frac{c_{2}}{6} x (x - 1) (x - 2), & a_{3} (x) = c_{3} N_{2}, & a_{4} (x) = c_{4} x . \end{matrix}

We set $c_{1} = 3 \cdot 10^{- 7}$ , $c_{2} = 10^{- 4}$ , $c_{3} = 10^{- 3}$ and $c_{4} = 3.5$ . In figure 5a, we plot one trajectory of the system for this choice of parameters up to time $T = 10^{6}$ using the SSA. As we can see, the system exhibits bistable behaviour, as it tends to spend time around two peaks, one located around $80$ and one around $560$ .
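To see where the two peaks come from, the propensities above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and locating the peaks through sign changes of the net mean drift $a_{1} - a_{2} + a_{3} - a_{4}$ is a deterministic heuristic, not part of the hybrid $τ$ algorithm.

```python
N1, N2 = 1e5, 2e5                        # buffered populations of B1 and B2
C1, C2, C3, C4 = 3e-7, 1e-4, 1e-3, 3.5   # reaction constants from the text

def propensities(x):
    """Propensities a_1..a_4 of the Schlogl system at copy number x of S."""
    a1 = 0.5 * C1 * N1 * x * (x - 1)
    a2 = C2 / 6.0 * x * (x - 1) * (x - 2)
    a3 = C3 * N2
    a4 = C4 * x
    return a1, a2, a3, a4

def drift(x):
    """Net mean rate of change of S: reactions 1 and 3 add one S, 2 and 4 remove one."""
    a1, a2, a3, a4 = propensities(x)
    return a1 - a2 + a3 - a4

def stable_states(x_max=1000):
    """Integers x where the drift switches from positive to non-positive."""
    return [x for x in range(1, x_max) if drift(x) > 0 >= drift(x + 1)]

if __name__ == "__main__":
    print(stable_states())
```

With the constants above, the drift has one downward sign change in each of the two intervals shown in the zoom-ins of figure 6, which is consistent with the bistability observed in the SSA trajectory.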

Trajectories of the Schlögl chemical system.

Following the experiment in [45], we compare the numerical probability distributions obtained from the SSA and the hybrid $τ$ . For the hybrid $τ$ , different combinations of blending regions $(I_{1}, I_{2})$ and time-steps $(Δ t, δ t)$ were tested. The initial condition is given by $S (0) = 250$ . For each combination of parameters, $10^{6}$ paths were generated up to time $T = 50$ . Note that this time is not large enough for the system to jump between the two peaks of the distribution, as we can see in figure 5b, in which we plot $10$ trajectories of the system using SSA.

These results are presented in figure 6, in which the lower plot represents the numerical probability distributions, and the upper plots are zoom-ins corresponding to the intervals where the peaks are located. It is seen that, when using $Δ t = 10^{- 2}$ (parameter sets 1–4), the position of the blending region and its width hardly affect the probability distribution, which aligns very well with that obtained via SSA. By contrast, a numerical bias is clearly visible when using the time-step $Δ t = 0.25$ (parameter sets 5–8). In the high peak region (upper right zoom-in of figure 6), the obtained numerical densities concentrate on the same probability distribution which is not aligned with the SSA distribution. This is of course to be expected, since the hybrid $τ$ method coincides with standard $τ$ -leap in this region due to the choice of the blending function.

Numerical probability density functions (pdfs) at $T = 50$ for the Schlögl system using hybrid $τ$ , for different parameter sets of $(Δ t, δ t, I_{1}, I_{2})$ . See table 2 for the values of each parameter set. Upper left plot: zoom-in of numerical pdfs on $[70,100]$ . Upper right plot: zoom-in on $[500, 620]$ .

The numerical densities appear less stable in the first peak. The kink in the parameter set $5$ density is probably due to the combination of the sharply varying step-size between $δ t$ and $Δ t$ (two orders of magnitude), the transition from the hybrid regime to the pure $τ$ -leaping regime at $S = 80$ , and the non-trivial interplay of dynamics oscillating between these two regions. This is further justified by the fact that the distribution associated to the parameter set $1$ , which has the same blending function but smaller $Δ t$ , does not display such a behaviour and on the contrary is well aligned with the SSA density.

Table 2 reports the average time taken (over $10^{3}$ realizations) to simulate realizations oscillating near the low peak and near the high peak, respectively. Note that for $T = 50$ the probability of switching between these two peaks is very small. The last two columns display the ratio between the average time taken by SSA to simulate realizations in the low peak (resp. high peak) and that of the hybrid $τ$ . This illustrates the potential computational benefits of the hybrid $τ$ but also the caveats of the algorithm. In all cases except for the simulation of the low peak when using $(I_{1}, I_{2}) = (50, 200)$ , hybrid $τ$ outperforms SSA. The slow-down in the latter case can be attributed to the fact that a large portion of the mass of the probability density function (and hence the region where the algorithm spends most of its time) is located in a part of the state space that is still simulated partly with SSA with propensities weighted by $β$ . Observing a speed-up when the system mainly evolves in the hybrid region is still possible; see the lines corresponding to parameters $(I_{1}, I_{2}) = (40, 100)$ in table 2. This, however, is only possible when carefully choosing the time-step in the hybrid dynamics region. Indeed, choosing $δ t$ too big will cause the hybrid $τ$ to often accept the next reaction time proposed by the SSA process with modified propensities, hence causing the algorithm to advance (at most) at the speed of the SSA. On the other hand, choosing $δ t$ too small causes the algorithm to simply advance at the speed of that small time-step, which might be slower than just propagating the system with SSA. A compromise is found when the chosen time-step allows the algorithm to consistently skip a few reactions within the hybrid dynamics region, i.e. it should correspond to a few multiples of the average time until the next reaction of the SSA dynamics with propensities weighted by $β$ . Furthermore, we remark that these speed-ups are obtained even with the computational overhead that the hybrid $τ$ carries at each iteration through its conditional logic.

Table 2.

Schlögl system simulation times for SSA and hybrid $τ$ (H $τ$ ) with multiple parameters. LP = low peak, HP = high peak. Timings show the average runtime (over $10^{3}$ simulations) to simulate a realization in LP or HP over the time interval $[0,50]$ . Ratio SSA columns show the ratio of that time with the time needed by the SSA for the same peak.

algorithm	param. set	$Δ t$	$δ t$	$I_{1}$	$I_{2}$	time LP (s)	time HP (s)	ratio SSA LP	ratio SSA HP
h $τ$	1	1.0e−2	5e−3	40	80	5.47e−03	3.51e−03	1.801	38.05
h $τ$	2	1.0e−2	1e−2	20	40	3.03e−03	3.49e−03	3.2531	38.161
h $τ$	3	1.0e−2	2e−3	50	200	2.98e−02	3.51e−03	0.3307	38.008
h $τ$	4	1.0e−2	1e−2	40	100	7.88e−03	3.53e−03	1.2512	37.779
h $τ$	5	2.5e−1	5e−3	40	80	2.77e−03	1.38e−04	3.554	968.82
h $τ$	6	2.5e−1	1e−2	20	40	1.42e−04	1.34e−04	69.506	993.38
h $τ$	7	2.5e−1	2e−3	50	200	2.95e−02	1.39e−04	0.33435	957.39
h $τ$	8	2.5e−1	1e−2	40	100	6.60e−03	1.36e−04	1.4925	983
SSA						9.85e−03	1.33e−01	1	1


4.1.4. Autorepressive system

Finally, we consider the autorepressive genetic system, taken from [28]. This is yet another example where chemical reactions happen on different scales. In this gene regulatory system, the DNA is only present in one or two copies (active or inactive), while the messenger RNA (mRNA) and proteins may reach much higher numbers. The system is described as follows:

System 4.4. Autorepressive gene system

\begin{aligned} \mathrm{DNA} \overset{c_{1}}{\to} \mathrm{DNA} + \mathrm{mRNA} & \quad \mathrm{mRNA} \overset{c_{2}}{\to} \mathrm{mRNA} + P & \quad \mathrm{DNA} + P \overset{c_{3}}{\to} \mathrm{DNA}_{0} \\ \mathrm{mRNA} \overset{c_{5}}{\to} \emptyset & \quad P \overset{c_{6}}{\to} \emptyset & \quad \mathrm{DNA}_{0} \overset{c_{4}}{\to} \mathrm{DNA} + P \end{aligned} .

In essence, the active DNA is transcribed to mRNA at rate $c_{1}$ , which in turn produces proteins (P) at rate $c_{2}$ . A protein may repress the active DNA, causing it to become inactive (represented by DNA₀) at rate $c_{3}$ . The inactive DNA reactivates at rate $c_{4}$ . Finally, the mRNA and the proteins are degraded at rates $c_{5}$ and $c_{6}$ , respectively.

The reaction rates are set to $c_{1} = 10^{- 2}$ , $c_{2} = 0.5$ , $c_{3} = 0.1$ , $c_{4} = 10^{- 2}$ , $c_{5} = 5 \cdot 10^{- 3}$ and $c_{6} = 0.2$ . Figure 7 shows a simulation of the system up to time $T = 10^{3}$ using SSA. As we can see, DNA and DNA₀ only take very small values, whereas mRNA and P cover a much wider range of values.
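A minimal SSA sketch of this six-reaction system is given below. It assumes mass-action propensities and an initial state with one active DNA copy and no mRNA or protein; neither the initial condition nor the function names are taken from the source.

```python
import random

# Rate constants c1..c6 from the text; species order: DNA, DNA0, mRNA, P.
RATES = (1e-2, 0.5, 0.1, 1e-2, 5e-3, 0.2)
# State change of each reaction (mass-action kinetics assumed).
STOICH = (
    (0, 0, 1, 0),    # DNA -> DNA + mRNA   (c1)
    (0, 0, 0, 1),    # mRNA -> mRNA + P    (c2)
    (-1, 1, 0, -1),  # DNA + P -> DNA0     (c3)
    (1, -1, 0, 1),   # DNA0 -> DNA + P     (c4)
    (0, 0, -1, 0),   # mRNA -> 0           (c5)
    (0, 0, 0, -1),   # P -> 0              (c6)
)

def propensities(state):
    dna, dna0, mrna, p = state
    c1, c2, c3, c4, c5, c6 = RATES
    return (c1 * dna, c2 * mrna, c3 * dna * p, c4 * dna0, c5 * mrna, c6 * p)

def ssa(state, t_end, rng):
    """Run the SSA up to t_end and return the final state."""
    state, t = list(state), 0.0
    while True:
        a = propensities(state)
        total = sum(a)
        if total == 0.0:
            return tuple(state)
        t += rng.expovariate(total)
        if t > t_end:
            return tuple(state)
        # Select reaction j with probability a[j] / total.
        j = rng.choices(range(len(a)), weights=a)[0]
        state = [s + d for s, d in zip(state, STOICH[j])]

if __name__ == "__main__":
    # Assumed initial condition: one active DNA copy, nothing else.
    print(ssa((1, 0, 0, 0), 1e3, random.Random(0)))
```

The repression and reactivation reactions move the single DNA copy between its two states, so DNA + DNA₀ stays equal to one along every trajectory. A scheme that leaps over these two reactions would lose this switch structure.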

An SSA-simulated realization of the autorepressive gene system. The system displays a multiple-scales structure. The mRNA and P species go through all possible simulation regime zones with blending regions BR 1 = $(5,10)$ , BR 2 = $(10, 15)$ and BR 3 = $(15,20)$ .

We now simulate this chemical system with the hybrid $τ$ and the hybrid CLE for three different blending regions, namely $(5,10), (10,15)$ and $(15, 20)$ ; the time-steps were set to $Δ t = 10^{- 2}$ and $δ t = 10^{- 3}$ . For each combination of parameters, $10^{4}$ simulations were performed up to time $T = 10^{3}$ . The means and standard deviations of these different processes are detailed in tables 4 and 5 (in the appendix). The results are again fairly accurate for all combinations of parameters and algorithms, and indicate a certain robustness with respect to the chosen blending region. Although the difference is marginal, the hybrid $τ$ seems slightly more accurate than the hybrid CLE when using bigger time-steps. We do not compare the efficiency with respect to $τ$ -leap or CLE; simulation via either of these methods would immediately break the ‘switch’ nature of the DNA, which is either activated ( $DNA = 1$ and ${DNA}_{0} = 0$ ) or not ( $DNA = 0$ and ${DNA}_{0} = 1$ ). Furthermore, in table 6 in the appendix, we report the times taken by each algorithm for each combination of parameters, as well as their speed-up compared with SSA. Both methods display a modest but consistent speed-up with respect to SSA, up to three times faster.

4.2. Parameter estimation

This section investigates the reliability of the hybrid $τ$ algorithm for Bayesian inverse problems, in the setting described in §2.2.

4.2.1. Experiment setting

The system on which the parameter estimation is performed is defined by the reactions (see also [18]):

System 4.5. Bayesian inference system

\begin{aligned} R_{1} : \emptyset \overset{c_{1}}{\to} S_{1} & R_{3} : S_{1} \overset{c_{3}}{\to} \emptyset & R_{5} : S_{1} + S_{2} \overset{c_{5}}{\to} 20 S_{2} \\ R_{2} : \emptyset \overset{c_{2}}{\to} S_{2} & R_{4} : S_{2} \overset{c_{4}}{\to} \emptyset . \end{aligned}

(4.1)

$R_{1}$ and $R_{2}$ are production reactions, $R_{3}$ and $R_{4}$ are decay reactions, and $R_{5}$ is a second-order reaction. The reaction rates are given by

𝐜 = (2, s c, 1 / 50, 1, 1 / (50 \cdot s c)),

where $s c$ is a scale parameter that takes the values $s c \in \{1, 10, 10^{2}, 10^{3}\}$ . We aim to infer the value of $𝐜$ for each of those values. To ensure identifiability, $c_{3}$ is set to its true value $1 / 50$ in all simulations; hence, in practice, the Bayesian inference is performed on $(c_{1}, c_{2}, c_{4}, c_{5})$ .

Following [18], the noisy measurements are assumed to follow the distribution

Y_{i}^{n} \mid X_{i} (t_{n}) \sim \begin{cases} \mathrm{Poisson} (X_{i} (t_{n})) & \text{if } X_{i} (t_{n}) > 0, \\ \mathrm{Bernoulli} (0.1) & \text{else} . \end{cases}

(4.2)
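In a particle filter, the observation model (4.2) enters through the weight assigned to each simulated state. A minimal sketch of the corresponding log-density, using only the standard library, could read as follows; the function name `log_obs_density` is hypothetical, and observations are assumed to be non-negative integers.

```python
import math

def log_obs_density(y, x):
    """Log-probability of observing count y given latent copy number x."""
    if x > 0:
        # Poisson(x) log-pmf: y * log(x) - x - log(y!)
        return y * math.log(x) - x - math.lgamma(y + 1)
    # Latent count is zero: the observation is Bernoulli(0.1) on {0, 1}.
    if y == 1:
        return math.log(0.1)
    if y == 0:
        return math.log(0.9)
    return -math.inf

if __name__ == "__main__":
    # Weight of a particle at x = 12 for an observed count of 10,
    # and of a particle at x = 0 for an observed count of 0.
    print(log_obs_density(10, 12), log_obs_density(0, 0))
```

Working with logarithms avoids underflow when the copy numbers are large, which is the relevant regime for the larger values of $s c$ .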

4.2.2. Methodology

We are now interested in calculating the posterior distribution $ℙ (𝒄 | 𝒚_{[0, T]})$ . We do this using Algorithms 2.3 and 2.4.

A number of modelling choices and tuning parameters need to be fixed before we proceed with our numerical experiments and comparisons. In this work, we use a Gaussian proposal kernel $q (\cdot | 𝒄)$ with mean $𝒄$ and covariance $γ \hat{𝐂}$ , which is estimated using successive short exploratory runs. A first exploratory run of the particle filter with $γ = 1, \hat{𝐂} = I$ (Algorithm 2.3) allows us to determine a first estimate of the parameter, denoted by $𝒄_{pre}$ , and the sample covariance matrix $\hat{𝐂}$ . The number of particles $K$ is then chosen such that $Var (\log \hat{ℙ} (𝒚 | 𝒄_{pre}))$ lies between $0.25$ and $2.5$ . Finally, the scaling parameter $γ$ is set such that the accept–reject ratio is close to $10 %$ . The number of particles $K$ and the acceptance probability are chosen following the guidelines outlined in [43,44]. We summarize the obtained values in table 3.
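One way to automate the choice of $K$ is sketched below. The doubling rule, the starting value and the callable `loglik_estimator` are assumptions made for illustration; the source only states the target range for the variance, not how $K$ was searched.

```python
import random
import statistics

def choose_num_particles(loglik_estimator, k_start=80, k_max=100_000,
                         n_reps=50, upper=2.5):
    """Double K until the sample variance of the log-likelihood estimate is <= upper.

    `loglik_estimator(k)` is a hypothetical callable returning one noisy
    estimate of log P(y | c_pre) computed with k particles.
    """
    k = k_start
    while k <= k_max:
        estimates = [loglik_estimator(k) for _ in range(n_reps)]
        var = statistics.variance(estimates)
        if var <= upper:
            return k, var
        k *= 2
    raise RuntimeError("variance target not reached; increase k_max")

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a particle filter: Gaussian noise with variance 320 / k.
    toy = lambda k: rng.gauss(-100.0, (320.0 / k) ** 0.5)
    print(choose_num_particles(toy))
```

The lower end of the target range ( $0.25$ ) is not enforced here; it serves as a reminder that a variance far below the range means more particles are being used than necessary.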

Table 3.

Experimental values obtained for particle filter simulation.

	$K$ (SSA)	$K$ (hybrid $τ$)	$γ$ (SSA)	$γ$ (hybrid $τ$)
$s c = 1$	800	320	0.05	2.0
$s c = 10$	640	1280	4.0	1.0
$s c = 10^{2}$	80	160	1.0	1.0
$s c = 10^{3}$	320	160	2.0	1.0


After those values are determined, a small burn-in phase of $200$ iterations is performed to ensure the particle filter starts in a credible region. Finally, the actual experiment is run with $n_{iter} = 2 \cdot 10^{5}$ iterations, using the last explored state of the burn-in phase as the starting point. This procedure is repeated for each value of $s c$ , and for simulations obtained via the SSA or via the hybrid $τ$ . The tunable parameters in the hybrid $τ$ simulations were set to $I_{1} = 20$ , $I_{2} = 60$ and $Δ t = 10^{- 1}$ and $δ t = 10^{- 2}$ .

The values $(N_{part}, γ)$ chosen in the set-up phase of the bootstrap particle filter are roughly of the same order for both forward simulators. This suggests that the hybrid $τ$ simulates the system in the same way as SSA, and thus causes the particle filter to explore the state space in the same way (only much more efficiently when $s c$ becomes large).

4.2.3. Numerical results

We run the inference procedure using either the standard SSA or the hybrid $τ$ algorithm. As can be seen in the box-and-whiskers plots (figures 8 and 9), across all scales the inferred values of the parameters $c_{1}, c_{2}, c_{4}, c_{5}$ are similar regardless of which simulator is used. In all plots, the dashed red line represents the true value to be inferred. The coloured boxes are delimited by the first and third quartiles, the whiskers have length $1.5 \times IQR$ (interquartile range), and the points outside are statistical outliers. The values $c_{1}$ and $c_{4}$ have no multi-scale dependence, while the parameters $c_{2}$ and $c_{5}$ do.

Bayesian inference system. Box plots of samples of parameters $c_{1}, c_{2}, c_{4}, c_{5}$ from PPMMH algorithm, with simulator SSA and hybrid $τ$ , respectively, for multi-scale parameter $s c = 1$ . The red line represents the true value to be inferred.

Bayesian inference system. Box plots of samples of parameters $c_{1}, c_{2}, c_{4}, c_{5}$ from PPMMH algorithm, with simulator SSA and hybrid $τ$ , respectively, for multi-scale parameter $s c = 10^{3}$ . The red line represents the true value to be inferred.

The major difference is noted in the computational times (figure 10 illustrates how); as the system displays stronger multi-scale behaviour, the simulation time using the hybrid $τ$ algorithm is bounded, while the SSA simulation time grows linearly.

Mean simulation time for system (4.1) (averaged over $100$ simulations) for different values of the parameter $s c$ .

5. Conclusions

We have proposed a novel hybrid $τ$ algorithm, inspired by the hybrid CLE method proposed in [31]. Unlike in that method, the quantities remain fully discrete at all times. The method leverages the splitting property of Poisson processes to rewrite the stochastic process as a sum of Poisson random variables that are simulated using SSA or $τ$ -leaping; the computational gains become obvious when one manages to ensure that the reactions that fire frequently are simulated using $τ$ -leaping, while those firing less often (and for which small stochastic variations may lead to important changes) are simulated using the exact SSA.
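The splitting property can be written out for a single reaction channel over one step with frozen propensities. The notation here is an assumption for illustration: $a_{j}$ is the propensity of reaction $j$ , $β_{j} \in [0, 1]$ is the value of its blending function at the current state, and the two Poisson variables on the right-hand side are independent.

```latex
\mathrm{Poisson}\big(a_{j} \, \Delta t\big)
\overset{d}{=}
\mathrm{Poisson}\big(\beta_{j} \, a_{j} \, \Delta t\big)
+ \mathrm{Poisson}\big((1 - \beta_{j}) \, a_{j} \, \Delta t\big)
```

Under this reading, the part with rate $β_{j} a_{j}$ is the one simulated exactly by the SSA (consistent with the SSA with propensities weighted by $β$ discussed in §4.1.3), and the remainder is the part handled by $τ$ -leaping, so that $β_{j} = 1$ recovers pure SSA and $β_{j} = 0$ pure $τ$ -leaping for that channel.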

We point out that this algorithm belongs to a larger class of hybrid schemes based on the splitting of the Poisson process into separate processes that are simulated differently. In this sense, there is a significant amount of flexibility in the simulation of (multi-scale) stochastic chemical kinetics. For instance, one could replace $τ$ -leaping with a more robust algorithm such as binomial $τ$ -leaping, $R$ -leaping, etc. Other approaches such as a separation into multiple levels (e.g. micro-meso-macro) with a suitable simulation at each level are also possible.

The choice of blending function discussed in this article is a ‘cautious’ one, as it considers all the species involved in a given reaction (reactants and products). This can be understood as dynamically tagging a reaction to be simulated using the SSA if any species is low, $τ$ -leaping if all are abundant, or a combination of both in the intermediate regime. Including the reactants in the blending functions makes intuitive sense; among other benefits, it can make the algorithm particularly robust to the issue of negative species counts. However, if it is known a priori that certain products do not have a significant impact on the system (no feedback loops, etc.), not including them in the blending function may prove to be computationally interesting, as it would naturally result in a larger region of the state space in which the system is simulated with $τ$ -leaping.

The various experiments presented in this article show that the hybrid $τ$ algorithm produces accurate and computationally efficient results. This is especially the case in the multi-scale setting. The algorithm is naturally amenable to that setting, as it handles it without additional modelling steps, unlike other algorithms specifically designed for the multi-scale case, such as the piecewise deterministic Markov processes in [28]. The algorithm requires some user-prescribed values: the time-steps for the $τ$ -leaping and for the intermediate regime, as well as the blending functions (in our case, together with their corresponding parameters $I_{1}$ and $I_{2}$ ). Numerically, it was observed that the quality of the simulations depends significantly more on the time-step sizes than on the blending regions. It would be of interest to investigate further the impact of the blending functions and regions, establishing criteria on how to suitably define them, as well as incorporating adaptive time-step mechanisms for the different simulation regimes. Additionally, in the future, it would be interesting to adopt time-step strategies similar to the ones in [35] for choosing $τ$ . Finally, in terms of Bayesian inference, an interesting research direction would be to see if the proposed framework could be modified to allow particle trajectories to be conditioned on the next observation, similarly to the approach in [46], as this could help to address the shortcomings of the bootstrap particle filter. Similarly, it would be interesting to see if the proposed approach could be adjusted to perform inference using the procedure outlined in [47].

Appendix

We present here some tables summarizing our numerical simulations for the autorepressive gene regulatory system (tables 4–6).

Table 4.

Mean values of the autorepressive gene regulatory system at $T = 10^{3}$ with sample size $N = 10^{4}$ using SSA, hybrid $τ$ (h $τ$ ) and hybrid CLE with different parameters.

algorithm	$(Δ t, δ t)$	$I_{1}$	$I_{2}$	DNA	DNA$_{0}$	proteins	mRNA
h $τ$	$(0.2, 0.15)$	5	10	0.173	0.827	64.411	25.868
h $τ$	$(0.2, 0.15)$	10	15	0.134	0.866	63.041	25.267
h $τ$	$(0.2, 0.15)$	15	20	0.111	0.889	61.763	24.734
h $τ$	$(1, 0.4)$	5	10	0.12	0.88	64.235	25.514
h $τ$	$(1, 0.4)$	10	15	0.151	0.849	63.974	25.401
h $τ$	$(1, 0.4)$	15	20	0.134	0.866	63.286	25.54
h CLE	$(0.2, 0.15)$	5	10	0.133	0.867	63.148	25.241
h CLE	$(0.2, 0.15)$	10	15	0.125	0.875	62.039	24.962
h CLE	$(0.2, 0.15)$	15	20	0.121	0.879	63.462	25.546
h CLE	$(1, 0.4)$	5	10	0.122	0.878	64.336	25.797
h CLE	$(1, 0.4)$	10	15	0.133	0.867	61.483	24.616
h CLE	$(1, 0.4)$	15	20	0.128	0.872	61.61	24.679
SSA				0.137	0.863	64.323	25.775


Table 5.

Standard deviations of the autorepressive gene regulatory system at $T = 10^{3}$ with sample size $N = 10^{4}$ using SSA, hybrid $τ$ (h $τ$ ) and hybrid CLE with different parameters.

algorithm	$(Δ t, δ t)$	$I_{1}$	$I_{2}$	DNA	DNA$_{0}$	proteins	mRNA
h $τ$	$(0.2, 0.15)$	5	10	0.37825	0.37825	32.777	12.726
h $τ$	$(0.2, 0.15)$	10	15	0.34065	0.34065	32.699	12.789
h $τ$	$(0.2, 0.15)$	15	20	0.31413	0.31413	33.77	13.286
h $τ$	$(1, 0.4)$	5	10	0.32496	0.32496	34.09	13.228
h $τ$	$(1, 0.4)$	10	15	0.35805	0.35805	34.61	13.492
h $τ$	$(1, 0.4)$	15	20	0.34065	0.34065	33.797	13.271
h CLE	$(0.2, 0.15)$	5	10	0.33957	0.33957	33.043	13.051
h CLE	$(0.2, 0.15)$	10	15	0.33072	0.33072	32.518	12.759
h CLE	$(0.2, 0.15)$	15	20	0.32613	0.32613	32.486	12.933
h CLE	$(1, 0.4)$	5	10	0.32729	0.32729	33.769	13.232
h CLE	$(1, 0.4)$	10	15	0.33957	0.33957	33.414	13.085
h CLE	$(1, 0.4)$	15	20	0.33409	0.33409	33.662	13.015
SSA				0.34385	0.34385	33.101	13.346


Table 6.

Simulation times of the autorepressive gene regulatory system at $T = 10^{3}$ , averaged over $N = 500$ realizations.

algorithm	$(Δ t, δ t)$	$I_{1}$	$I_{2}$	simulation time	ratio SSA
h $τ$	$(0.2, 0.15)$	5	10	5.66759e−03	1.485
h $τ$	$(0.2, 0.15)$	10	15	5.92247e−03	1.421
h $τ$	$(0.2, 0.15)$	15	20	6.47657e−03	1.3
h $τ$	$(1, 0.4)$	5	10	2.36346e−03	3.562
h $τ$	$(1, 0.4)$	10	15	2.72807e−03	3.086
h $τ$	$(1, 0.4)$	15	20	3.46922e−03	2.426
h CLE	$(0.2, 0.15)$	5	10	6.98658e−03	1.205
h CLE	$(0.2, 0.15)$	10	15	7.31605e−03	1.151
h CLE	$(0.2, 0.15)$	15	20	8.07277e−03	1.043
h CLE	$(1, 0.4)$	5	10	2.78733e−03	3.02
h CLE	$(1, 0.4)$	10	15	3.28434e−03	2.563
h CLE	$(1, 0.4)$	15	20	4.19815e−03	2.005
SSA				8.41800e−03	1


Contributor Information

Thomas Trigo Trindade, Email: thomas.trigotrindade@epfl.ch; t.trigotrindade@gmx.ch.

Konstantinos C. Zygalakis, Email: K.Zygalakis@ed.ac.uk.

Ethics

This work did not require ethical approval from a human subject or animal welfare committee.

Data accessibility

Data and relevant code for this research work are stored in GitHub [48] and have been archived within the Zenodo repository [49].

Declaration of AI use

We have not used AI-assisted technologies in creating this article.

Authors’ contributions

T.T.T.: software, writing—original draft, writing—review and editing; K.C.Z.: conceptualization, writing—original draft, writing—review and editing.

Both authors gave final approval for publication and agreed to be held accountable for the work performed therein.

Conflict of interest declaration

We declare we have no competing interests.

Funding

No funding has been received for this article.

References

1. Arkin A, Ross J, McAdams HH. 1998. Stochastic kinetic analysis of developmental pathway bifurcation in phage lambda-infected Escherichia coli cells. Genetics 149, 1633–1648. (doi:10.1093/genetics/149.4.1633)
2. Vilar JMG, Kueh HY, Barkai N, Leibler S. 2002. Mechanisms of noise-resistance in genetic oscillators. Proc. Natl Acad. Sci. USA 99, 5988–5992. (doi:10.1073/pnas.092133899)
3. Kar S, Baumann WT, Paul MR, Tyson JJ. 2009. Exploring the roles of noise in the eukaryotic cell cycle. Proc. Natl Acad. Sci. USA 106, 6471–6476. (doi:10.1073/pnas.0810034106)
4. McAdams HH, Arkin A. 1997. Stochastic mechanisms in gene expression. Proc. Natl Acad. Sci. USA 94, 814–819. (doi:10.1073/pnas.94.3.814)
5. Swain PS, Elowitz MB, Siggia ED. 2002. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl Acad. Sci. USA 99, 12795–12800. (doi:10.1073/pnas.162041399)
6. Doi M. 1976. Stochastic theory of diffusion-controlled reaction. J. Phys. A. Math. Gen. 9, 1479–1495. (doi:10.1088/0305-4470/9/9/009)
7. Erban R, Chapman SJ. 2009. Stochastic modelling of reaction-diffusion processes: algorithms for bimolecular reactions. Phys. Biol. 6, 1. (doi:10.1088/1478-3975/6/4/046001)
8. Higham DJ. 2008. Modeling and simulating chemical reactions. SIAM Rev. 50, 347–368. (doi:10.1137/060666457)
9. Jahnke T, Huisinga W. 2007. Solving the chemical master equation for monomolecular reaction systems analytically. J. Math. Biol. 54, 1–26. (doi:10.1007/s00285-006-0034-x)
10. Wolf V, Goel R, Mateescu M, Henzinger TA. 2010. Solving the chemical master equation using sliding windows. BMC Syst. Biol. 4, 42. (doi:10.1186/1752-0509-4-42)
11. MacNamara S, Burrage K. 2009. Krylov and steady-state techniques for the solution of the chemical master equation for the mitogen-activated protein kinase cascade. Numer. Algorithms 51, 281–307. (doi:10.1007/s11075-008-9239-y)
12. Schnoerr D, Cseke B, Grima R, Sanguinetti G. 2017. Efficient low-order approximation of first-passage time distributions. Phys. Rev. Lett. 119, 210601. (doi:10.1103/PhysRevLett.119.210601)
13. Gillespie DT. 1977. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361. (doi:10.1021/j100540a008)
14. Gibson MA, Bruck J. 2000. Efficient exact stochastic simulation of chemical systems with many species and many channels. J. Phys. Chem. A 104, 1876–1889. (doi:10.1021/jp993732q)
15. Cao Y, Li H, Petzold L. 2004. Efficient formulation of the stochastic simulation algorithm for chemically reacting systems. J. Chem. Phys. 121, 4059–4067. (doi:10.1063/1.1778376)
16. Golightly A, Wilkinson DJ. 2011. Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface Focus 1, 807–820. (doi:10.1098/rsfs.2011.0047)
17. Wilkinson DJ. 2020. Stochastic modelling for systems biology, 3rd edn. Philadelphia, PA: Chapman & Hall/CRC. (Computational Biology Series).
18. Sherlock C, Golightly A, Gillespie CS. 2014. Bayesian inference for hybrid discrete-continuous stochastic kinetic models. Inverse Probl. 30, 114005. (doi:10.1088/0266-5611/30/11/114005)
19. Molyneux GW, Abate A. 2020. ABC(SMC)²: simultaneous inference and model checking of chemical reaction networks. In Computational methods in systems biology (eds Abate A, Petrov T, Wolf V), p. 255. Cham, Switzerland: Springer. (doi:10.1007/978-3-030-60327-4_14)
20. Gillespie DT. 2001. Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115, 1716–1733. (doi:10.1063/1.1378322)
21. Auger A, Chatelain P, Koumoutsakos P. 2006. R-leaping: accelerating the stochastic simulation algorithm by reaction leaps. J. Chem. Phys. 125, 084103. (doi:10.1063/1.2218339)
22. Tian T, Burrage K. 2004. Binomial leap methods for simulating stochastic chemical kinetics. J. Chem. Phys. 121, 10356–10364. (doi:10.1063/1.1810475)
23. Burrage K, Tian T. 2004. Poisson Runge-Kutta methods for chemical reaction systems. In Advances in scientific computing and applications (eds Lu Y, Sun W, Tang T), pp. 82–96. Beijing, China: Science Press.
24. Yates CA, Burrage K. 2011. Look before you leap: a confidence-based method for selecting species criticality while avoiding negative populations in τ-leaping. J. Chem. Phys. 134, 084109. (doi:10.1063/1.3554385)
25. Gillespie DT. 2000. The chemical Langevin equation. J. Chem. Phys. 113, 297–306. (doi:10.1063/1.481811)
26. Hepp B, Gupta A, Khammash M. 2015. Adaptive hybrid simulations for multiscale stochastic reaction networks. J. Chem. Phys. 142, 034118. (doi:10.1063/1.4905196)
27. Safta C, Sargsyan K, Debusschere B, Najm HN. 2015. Hybrid discrete/continuum algorithms for stochastic reaction networks. J. Comput. Phys. 281, 177–198. (doi:10.1016/j.jcp.2014.10.026)
28. Winkelmann S, Schütte C. 2017. Hybrid models for chemical reaction networks: multiscale theory and application to gene regulatory systems. J. Chem. Phys. 147, 114115. (doi:10.1063/1.4986560)
29. Haseltine EL, Rawlings JB. 2002. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J. Chem. Phys. 117, 6959–6969. (doi:10.1063/1.1505860)
30. Crudu A, Debussche A, Radulescu O. 2009. Hybrid stochastic simplifications for multiscale gene networks. BMC Syst. Biol. 3, 89. (doi:10.1186/1752-0509-3-89)
31. Duncan A, Erban R, Zygalakis KC. 2016. Hybrid framework for the simulation of stochastic chemical kinetics. J. Comput. Phys. 326, 398–419. (doi:10.1016/j.jcp.2016.08.034)
32. Gillespie DT. 1992. A rigorous derivation of the chemical master equation. Physica A Stat. Mech. Appl. 188, 404–425. (doi:10.1016/0378-4371(92)90283-V)
33. Anderson DF, Kurtz TG. 2011. Continuous time Markov chain models for chemical reaction networks. In Design and analysis of biomolecular circuits (eds Koeppl H, Setti G, di Bernardo M, Densmore D), pp. 3–42. New York, NY: Springer New York. (doi:10.1007/978-1-4419-6766-4_1)
34. Cao Y, Gillespie DT, Petzold LR. 2006. Efficient step size selection for the tau-leaping simulation method. J. Chem. Phys. 124, 044109. (doi:10.1063/1.2159468)
35. Moraes A, Tempone R, Vilanova P. 2014. Hybrid Chernoff tau-leap. Multiscale Model. Simul. 12, 581–615. (doi:10.1137/130925657)
36. Anderson DF, Koyama M. 2012. Weak error analysis of numerical methods for stochastic models of population processes. Multiscale Model. Simul. 10, 1493–1524. (doi:10.1137/110849699)
37. Cao Y, Gillespie DT, Petzold LR. 2005. Avoiding negative populations in explicit Poisson tau-leaping. J. Chem. Phys. 123, 054104. (doi:10.1063/1.1992473)
38. Andrieu C, Doucet A, Holenstein R. 2009. Particle Markov chain Monte Carlo for efficient numerical simulation. In Monte Carlo and quasi-Monte Carlo methods 2008 (eds L’ Ecuyer P, Owen A), pp. 45–60. Berlin, Germany: Springer. (doi:10.1007/978-3-642-04107-5_3)
39. Andrieu C, Doucet A, Holenstein R. 2010. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Series B Stat. Methodol. 72, 269–342. (doi:10.1111/j.1467-9868.2009.00736.x)
40. Andrieu C, Roberts GO. 2009. The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Statist. 37, 697. (doi:10.1214/07-AOS574)
41. Gordon NJ, Salmond DJ, Smith AFM. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F 140, 107. ( 10.1049/ip-f-2.1993.0015) [DOI] [Google Scholar]
42. Doucet A, Godsill S, Andrieu C. 2000. On sequential Monte Carlo sampling methods for Bayesian filtering. Stat. Comput. 10, 197–208. ( 10.1023/A:1008935410038) [DOI] [Google Scholar]
43. Sherlock C, Thiery AH, Roberts GO, Rosenthal JS. 2015. On the efficiency of pseudo-marginal random walk metropolis algorithms. Ann. Statist. 43. ( 10.1214/14-AOS1278) [DOI] [Google Scholar]
44. Pitt MK, dos Silva RS, Giordani P, Kohn R. 2012. On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. J. Econ. 171, 134–151. ( 10.1016/j.jeconom.2012.06.004) [DOI] [Google Scholar]
45. Abdulle A, Gander L, Rosilho de Souza G. 2023. Optimal explicit stabilized postprocessed τ-leap method for the simulation of chemical kinetics. J. Comput. Phys. 493, 112482. ( 10.1016/j.jcp.2023.112482) [DOI] [Google Scholar]
46. Golightly A, Sherlock C. 2019. Efficient sampling of conditioned Markov jump processes. Stat. Comput. 29, 1149–1163. ( 10.1007/s11222-019-09861-5) [DOI] [Google Scholar]
47. Altıntan D, Alt B, Koeppl H. 2023. Bayesian inference for jump-diffusion approximations of biochemical reaction networks. arXiv. ( 10.48550/arXiv.2304.06592) [DOI] [Google Scholar]
48. Thomas TT. 2024. hybrid-tau-schemePub. https://github.com/trigotri/hybrid-tau-scheme
49. Thomas TT. 2024. trigotri/hybrid-tau-scheme: article support code (v1.0.0). Zenodo. ( 10.5281/zenodo.13318872) [DOI]

Associated Data

Data Availability Statement

Data and relevant code for this research work are stored in GitHub [48] and have been archived within the Zenodo repository [49].

3. Hybrid $τ$-leap