Abstract
The discrete chemical master equation (dCME) provides a general framework for studying stochasticity in mesoscopic reaction networks. Since its direct solution rapidly becomes intractable due to the increasing size of the state space, truncation of the state space is necessary for solving most dCMEs. It is therefore important to assess the consequences of state space truncations so errors can be quantified and minimized. Here we describe a novel method for state space truncation. By partitioning a reaction network into multiple molecular equivalence groups (MEGs), we truncate the state space by limiting the total molecular copy numbers in each MEG. We further describe a theoretical framework for analysis of the truncation error in the steady state probability landscape using reflecting boundaries. By aggregating the state space based on the usage of a MEG and constructing an aggregated Markov process, we show that the truncation error of a MEG can be asymptotically bounded by the probability of states on the reflecting boundary of the MEG. Furthermore, truncating states of an arbitrary MEG will not undermine the estimated error of truncating any other MEGs. We then provide an overall error estimate for networks with multiple MEGs. To rapidly determine the appropriate size of an arbitrary MEG, we also introduce an a priori method to estimate the upper bound of its truncation error. This a priori estimate can be rapidly computed from reaction rates of the network, without the need for costly trial solutions of the dCME. As examples, we show results of applying our methods to four stochastic networks: 1) the birth and death model, 2) the single gene expression model, 3) the genetic toggle switch model, and 4) the phage lambda bistable epigenetic switch model. We demonstrate how truncation errors and steady state probability landscapes can be computed using different sizes of the MEG(s) and how the results validate our theories.
Overall, the novel state space truncation and error analysis methods developed here can be used to ensure accurate direct solutions to the dCME for a large number of stochastic networks.
Keywords: Stochastic biological networks, discrete chemical master equation, state space truncation
Introduction
Biochemical reaction networks are intrinsically stochastic [1, 2]. Deterministic models based on chemical mass action kinetics cannot capture the stochastic nature of these networks [3, 4, 5]. Instead, the discrete Chemical Master Equation (dCME) that describes the probabilistic reaction jumps between discrete states provides a general framework for fully characterizing mesoscopic stochastic processes in a well mixed system [6, 7, 8, 9, 10]. The steady state and time-evolving probability landscapes over discrete states governed by the dCME provide detailed information of these dynamic stochastic processes. However, the dCME cannot be solved analytically, except for a few very simple cases [11, 12, 13, 14, 15].
The dCME can be approximated using the Fokker-Planck equation (FPE) and the chemical Langevin equation (CLE). These approximations are not applicable when copy numbers are small [16], as relatively large copy numbers of molecules are required for accurate approximation [17, 16, 18, 19, 20]. Recent studies provided assessment of errors in these approximations for several reaction networks [21, 22], as well as numerical demonstration in which the CLE of a 13-node lysogeny-lysis decision network of phage-lambda was found to fail to converge to the correct steady state probability landscape (see appendix of ref [5]). However, consequences of such approximations involving many molecular species and with complex reaction schemes are generally not known.
A widely used approach to study stochasticity is the stochastic simulation algorithm (SSA). It generates reaction trajectories following the underlying dCME [6], and the stochastic properties of the network can then be inferred through analysis of a large number of simulation trajectories. However, convergence of such simulations is difficult to determine, and the errors in the sampled steady state probability landscape are unknown.
Directly solving the dCME offers another attractive approach. By computing the probability landscape of a stochastic network numerically, its properties, such as those involving rare events, can be studied accurately and in detail. The finite state projection (FSP) method is among several methods that have been developed to solve the dCME directly [23, 24, 25, 26, 5, 27, 28]. The FSP is based on a truncated projection of the state space and uses numerical techniques to compute the time-evolving probability landscapes, which are solutions to the dCME [29, 23, 30]. Although the error due to state space truncation can be calculated for the time-evolving probability landscape [23], the use of an absorbing boundary, to which all truncated states are projected, leads to the accumulation of errors as time proceeds, and the absorbing boundary eventually traps all probability mass. The FSP method was designed to study transient behavior of stochastic networks, and is not well suited to study the long-term behavior and the steady state probability landscape of a network.
A bottleneck problem for solving the dCME directly is to have an efficient and adequate account of the discrete state space. As the copy number of each of the n molecular species takes an integer value, conventional hypercube-based methods of state enumeration incorporate all vertices in an n-dimensional hypercube of the non-negative integer lattice, which has an overall size of $O\left(\prod_{i=1}^{n} b_i\right)$, where $b_i$ is the maximally allowed copy number of molecular species i. State enumeration rapidly becomes intractable, both in storage and in computing time. This makes the direct solution of the dCME impossible for many realistic problems. To address this issue, the finite buffer discrete CME (fb-dCME) method was developed for efficient enumeration of the state space [24]. This algorithm is provably optimal in both memory usage and in time required for enumeration. It introduces a buffer queue with a fixed number of molecular tokens to keep track of the remaining number of states that can be enumerated. States with depleted buffer do not absorb probability mass but reflect it to states already enumerated, with the overall probability mass conserved. Further, instead of including every state in a hypercube, it examines only states that can be reached from a given initial state. It can be used to compute the exact steady state and time-evolving probability landscape of a closed network, or an open network when the net gain in newly synthesized molecules does not exceed the predefined finite buffer capacity.
State-space truncation eventually occurs in all methods that directly solve the dCME. For example, it occurs in open systems when no new states can be enumerated, so that synthesis reactions cannot proceed. However, it is unclear how accurate the probability landscape computed using a truncated state space is. It is also unclear how to minimize truncation errors, which limits the scope of applications of direct methods such as the fb-dCME method.
In this study, we develop a new method for state space truncation and provide a general theoretical framework for characterizing the error due to state space truncation. We start by partitioning the molecular species in a reaction network into a number of molecular equivalent groups (MEGs) according to their chemical compositions. The state space is then truncated by limiting the maximum copy number of each MEG instead of individual molecular species. States with exactly the maximum copy number of a MEG form the reflecting boundary of the state space. We further discuss networks with a single reflecting boundary in the truncated state space. We then show that the total probability of the boundary states can be used as an upper bound of the truncation error in the computed steady state probability landscape. This is then generalized to networks with an arbitrary number of reflecting boundaries. We further develop an a priori method derived from stochastic ordering for rapid estimation of the truncation errors of the steady state probability landscape for a given truncated state space. The required maximum copy number of each MEG for a pre-defined error tolerance can also be determined without computing costly trial solutions to the dCME. Overall, the method of state space truncation and the upper bounds of truncation errors established in this study enable accurate quantification of errors in numerical solutions of the dCME, and can help to design strategies so that probability landscapes with small and controlled errors can be computed for a large class of biological problems that were previously infeasible.
This paper is organized as follows. We first review basic concepts of the discrete chemical master equation and issues associated with the finite discrete state space. We then describe how to partition a reaction network into molecular equivalent groups and how to truncate the discrete state space. We further discuss truncation errors of the steady state probability landscape and how to construct upper bounds of the truncation errors. This is followed by detailed studies of the single gene expression system and the genetic toggle switch system. We examine the a priori estimated error bound, the computed error, and the true error for different state truncations. We end with discussions and conclusions.
Methods
Theoretical Framework
Reaction Network, State Space and Probability Landscape
In a well-mixed biochemical system with constant volume and temperature, there are n molecular species, denoted as 𝒳 = {X1,X2,⋯,Xn}, and m reactions, denoted as ℛ = {R1,R2,⋯,Rm}. Each reaction Rk has an intrinsic reaction rate constant rk. The microstate of the system at time t is given by the non-negative integer column vector of copy numbers of each molecular species: x(t) = (x1(t), x2(t),⋯, xn(t))T, where xi(t) is the copy number of molecular species Xi at time t. An arbitrary reaction Rk with intrinsic rate rk takes the general form of

$$c_{1k} X_1 + c_{2k} X_2 + \cdots + c_{nk} X_n \xrightarrow{\; r_k \;} c'_{1k} X_1 + c'_{2k} X_2 + \cdots + c'_{nk} X_n,$$

which brings the system from a microstate xj to xi. The difference between xi and xj is the stoichiometry vector sk of reaction Rk: $s_k = x_i - x_j = (c'_{1k} - c_{1k}, \cdots, c'_{nk} - c_{nk})^T$. The rate Ak(xi, xj) of reaction Rk that brings the microstate from xj to xi is determined by rk and the combination number of relevant reactants in the current microstate xj:

$$A_k(x_i, x_j) = r_k \prod_{l=1}^{n} \binom{x_l}{c_{lk}},$$

where $x_l$ is the copy number of species $X_l$ in microstate xj, assuming the convention $\binom{0}{0} = 1$.
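The rate rule above, an intrinsic rate constant multiplied by the combination numbers of the reactants, can be illustrated with a minimal Python sketch. The function name `reaction_rate`, the dimerization example, and all numerical values are hypothetical and chosen only for illustration.

```python
from math import comb

def reaction_rate(r_k, reactant_coeffs, state):
    """Rate of one reaction in a given microstate.

    r_k             : intrinsic rate constant of the reaction
    reactant_coeffs : stoichiometric coefficient of each species as a reactant
    state           : copy number of each species in the current microstate
    """
    rate = r_k
    for copies, coeff in zip(state, reactant_coeffs):
        # math.comb(0, 0) == 1, matching the convention stated in the text;
        # math.comb(x, c) == 0 when x < c, so impossible reactions get rate 0.
        rate *= comb(copies, coeff)
    return rate

# Hypothetical dimerization 2 X1 -> X2 with r = 0.5 in microstate (x1, x2) = (4, 1).
rate_dimer = reaction_rate(0.5, (2, 0), (4, 1))

# Hypothetical synthesis reaction with no reactants: the rate is state independent.
rate_synth = reaction_rate(2.0, (0, 0), (0, 0))

# Dimerization cannot fire with a single copy of X1.
rate_blocked = reaction_rate(0.5, (2, 0), (1, 0))
```

Here the dimerization rate is 0.5 times the number of distinct reactant pairs, which is the sense in which the rate depends on the "combination number" of reactants.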
All possible microstates that a system can visit from a given initial condition form the state space: Ω = {x(t)|x(0), t ∈ (0, ∞)}. We denote the probability of each microstate at time t as p(x(t)), and the probability distribution at time t over the full state space as p(t) = {(p(x(t))|x(t) ∈ Ω)}. We also call p(t) the probability landscape of the network [5].
Discrete Chemical Master Equation
The discrete chemical master equation (dCME) can be written as a set of linear ordinary differential equations describing the change in probability of each discrete state over time:
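$$\frac{dp(x, t)}{dt} = \sum_{x' \neq x} \left[ A(x, x')\, p(x', t) - A(x', x)\, p(x, t) \right] \tag{1}$$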
Note that p(x, t) is continuous in time, but is discrete over the state space. In matrix form, the dCME can be written as:
$$\frac{dp(t)}{dt} = A\, p(t), \tag{2}$$
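where $A \in \mathbb{R}^{|\Omega| \times |\Omega|}$ is the transition rate matrix formed by the collection of all A(xi, xj), which describes the overall reaction rate from state xj to state xi:

$$A(x_i, x_j) = \begin{cases} -\sum_{x' \in \Omega,\, x' \neq x_j} A(x', x_j) & \text{if } x_i = x_j, \\ \sum_{k} A_k(x_i, x_j) & \text{if } x_i \neq x_j \text{ and } x_j \xrightarrow{R_k} x_i, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$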
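As a minimal illustration of the matrix form of the dCME, the sketch below builds the rate matrix of a birth and death process on a truncated state space and integrates dp/dt = Ap with a plain forward Euler scheme. The rate constants, the truncation level, the step size, and the use of forward Euler are all assumptions made for illustration, not the numerical method of this paper. Synthesis is switched off at the largest state, so the boundary reflects probability mass rather than absorbing it.

```python
def birth_death_rate_matrix(k_syn, k_deg, n_max):
    """A[i][j] is the rate from state j to state i; every column sums to zero."""
    size = n_max + 1
    A = [[0.0] * size for _ in range(size)]
    for j in range(size):
        if j < n_max:               # synthesis j -> j + 1 (blocked at the boundary)
            A[j + 1][j] += k_syn
            A[j][j] -= k_syn
        if j > 0:                   # degradation j -> j - 1, rate proportional to j
            A[j - 1][j] += k_deg * j
            A[j][j] -= k_deg * j
    return A

def euler_step(A, p, dt):
    """One forward Euler step of dp/dt = A p."""
    n = len(p)
    return [p[i] + dt * sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]

# Hypothetical parameters: synthesis rate 2, degradation rate 1, at most 20 copies.
A = birth_death_rate_matrix(2.0, 1.0, 20)
p = [1.0] + [0.0] * 20              # start with zero molecules
for _ in range(2000):               # integrate to t = 10
    p = euler_step(A, p, 0.005)

total_probability = sum(p)
column_sums = [sum(A[i][j] for i in range(21)) for j in range(21)]
```

Because every column of A sums to zero, the total probability stays at 1 throughout the integration, which is the conservation property that distinguishes a reflecting boundary from an absorbing one.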
Molecular Equivalent Groups and Independent Birth-Death Processes
In an open reaction network, synthesis reactions are the only ones that generate new molecules and increase the total mass of the system. Degradation reactions are the only ones that destroy molecules and remove mass from the system. The net copy numbers of various molecular species in an open network give its total mass. For a given microstate, the mass for each molecular species is defined. The total mass in a network can increase to infinity if synthesis reactions persist. The truncation of the infinite state space of such an open network, which is inevitable due to limited computing capacity, can lead to errors in computing the probability landscapes of a dCME.
Here we introduce the concept of Molecular Equivalence Groups (MEG), which will be useful for state space truncation. Specifically, molecular species Xi and Xj belong to the same MEG if Xi can be transformed into Xj or Xj can be transformed into Xi through one or more mass-balanced reactions. A stochastic network can have one or more Molecular Equivalent Groups. The total mass of a Molecular Equivalent Group for a specific microstate is defined as the total copy number of the most elementary equivalent molecular species in the Molecular Equivalent Group.
(4) |
For example, the reaction network shown in Eqn. (4) has two Molecular Equivalent Groups, i.e., MEG1 = {A,B,C,D} and MEG2 = {X, Y, Z}. The most elementary molecular species in MEG1 and MEG2 are A and X, respectively. For any specific microstate of the network x = {a, b, c, d, x, y, z}, the total net copy number of the Molecular Equivalent Group MEG1 is calculated as nMEG1(x) = a + b + 2c + 3d, and the total net copy number of MEG2 can be calculated as nMEG2(x) = x+2y + 2z, where the a, b, c, d, x, y, and z are copy numbers of corresponding molecular species.
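The net copy number calculation for the two MEGs can be sketched in a few lines of Python. The weights are taken from the formulas in the text (a + b + 2c + 3d and x + 2y + 2z); the function name and the example microstate are hypothetical.

```python
def meg_net_copy_number(state, weights):
    """Total net copy number of a MEG: copies weighted by elementary-unit content."""
    return sum(weight * state.get(species, 0) for species, weight in weights.items())

# Number of elementary units (A or X) contained in one copy of each species.
MEG1_WEIGHTS = {"A": 1, "B": 1, "C": 2, "D": 3}
MEG2_WEIGHTS = {"X": 1, "Y": 2, "Z": 2}

# Hypothetical microstate of the network.
state = {"A": 3, "B": 1, "C": 2, "D": 1, "X": 4, "Y": 0, "Z": 5}

n_meg1 = meg_net_copy_number(state, MEG1_WEIGHTS)   # 3 + 1 + 2*2 + 3*1
n_meg2 = meg_net_copy_number(state, MEG2_WEIGHTS)   # 4 + 2*0 + 2*5
```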
We are interested in MEGs containing synthesis and degradation reactions. The set of reactions associated with such an open MEG is called an independent Birth-Death process (iBD). Reactions in an iBD can increase or decrease the total net copy number of molecules in the associated MEG.
State Space Truncation by Molecular Equivalent Group
Here we introduce a novel state truncation method. Instead of truncating the state space by specifying a maximum allowed copy number B for each molecular species, we specify a maximum allowed molecular copy number B for the j-th MEG. Assuming that the j-th MEG contains $n_j$ distinct molecular species, and conservatively ignoring the effects of stoichiometry, the possible states of the j-th MEG are the integer lattice nodes of an $n_j$-dimensional orthogonal corner simplex, with B the length of all edges that have the origin as a vertex. The number of integer lattice nodes in this $n_j$-dimensional simplex gives the precise number of states of the j-th MEG, which is in turn exactly given by the multiset number $\left(\!\!\binom{B+1}{n_j}\!\!\right) = \binom{B + n_j}{n_j}$. The size of the state space is therefore much smaller than the size $B^{n_j}$ of the state space that would be generated by the hypercube method, with a reduction factor of roughly $n_j!$. Note that under the constraint of mass conservation, each molecular species in this MEG can still have a maximum of B copies of molecules.
We further conservatively assume that different MEGs are independent, and each can have maximally B copies of molecules. The size of the overall truncated state space is then $O\left(\prod_j \binom{B + n_j}{n_j}\right)$. This is much smaller than the n-dimensional hypercube, which has an overall size of $O\left(\prod_j B^{n_j}\right) = O(B^n)$, with n the total number of molecular species in the network. Overall, the size of the state space generated by MEG truncation can be dramatically smaller than that generated using the hypercube method.
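The size comparison can be checked numerically with a short sketch. It assumes, as in the text, that stoichiometry is ignored, so that the states of a MEG with n species are the non-negative integer vectors whose entries sum to at most B. The function names and the values of B and n are hypothetical. Note that a hypercube with copy numbers 0 through B has (B + 1)^n nodes, which is of order B^n.

```python
from itertools import product
from math import comb, factorial

def simplex_state_count(B, n):
    """Lattice points with x_1 + ... + x_n <= B and all x_i >= 0."""
    return comb(B + n, n)

def hypercube_state_count(B, n):
    """Lattice points with 0 <= x_i <= B for every species."""
    return (B + 1) ** n

def brute_force_simplex_count(B, n):
    """Direct enumeration, feasible only for small B and n."""
    return sum(1 for x in product(range(B + 1), repeat=n) if sum(x) <= B)

# Cross-check the closed form against enumeration on a small case.
small_closed_form = simplex_state_count(6, 3)
small_enumerated = brute_force_simplex_count(6, 3)

# Hypothetical MEG with 4 species and at most 50 net copies.
simplex_size = simplex_state_count(50, 4)
hypercube_size = hypercube_state_count(50, 4)
reduction = hypercube_size / simplex_size   # of the same order as factorial(4)
```

For these values the reduction factor is somewhat below 4! = 24, consistent with the "roughly n!" estimate in the text.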
State Space Aggregation According to the Net Copy Number in Molecular Equivalent Group
We first consider the stochastic network with only one Molecular Equivalent Group. We truncate the state space by fixing the maximum amount of total mass in the network. We are interested in estimating the errors due to such a state truncation. To do so, we first factor states in the original state space Ω(∞) of infinite size according to the total net copy number of the MEG in each state. The infinite state space Ω(∞) can be partitioned into disjoint groups of subsets Ω̃(∞) ≡ {𝒢0, 𝒢1,⋯, 𝒢N,⋯}, where states in each aggregated subset 𝒢s have exactly the same s total copies of equivalent elementary molecular species of the MEG. The total steady state probability on microstates in each group 𝒢s can then be written as:
$$\tilde{\pi}^{(\infty)}(\mathcal{G}_s) \equiv \sum_{x \in \mathcal{G}_s} \pi^{(\infty)}(x). \tag{5}$$
Based on the state space partition Ω̃(∞), we can re-construct a transition rate matrix Ã, which is a permutation of the original dCME matrix A in Eqn. (2):
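$$\tilde{A} = \begin{pmatrix} A_{0,0} & A_{0,1} & A_{0,2} & \cdots \\ A_{1,0} & A_{1,1} & A_{1,2} & \cdots \\ A_{2,0} & A_{2,1} & A_{2,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{6}$$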
where each block sub-matrix Ai, j includes all transitions from states in group 𝒢j to states in 𝒢i.
In a continuous-time Markov model of mesoscopic systems, reactions occur instantaneously, and the synthesis and degradation reactions always generate or destroy one molecule at a time. This also applies to oligomers, which are assumed to form only upon association of monomers already synthesized, and to dissociate into monomers first before full degradation. The re-constructed matrix Ã is thus a tri-diagonal block matrix, i.e., Ai, j is all 0s if |i−j| > 1. Moreover, synthesis reactions always appear as lower blocks Ai+1, i, and degradation reactions always as upper blocks Ai, i+1. Diagonal blocks Ai, i contain all coupling reactions that do not alter the net number of synthesized molecules. Note that every Ai+1, i block and Ai, i+1 block only includes synthesis and degradation reactions associated with the current MEG. For analysis of networks with multiple MEGs, we assume at this time that there is no limit on the total mass of other MEGs, so the state space is not truncated on these MEGs. These other MEGs do not alter the total net copy number of molecular species in the current MEG.
Note that the assumption of the stoichiometric coefficient of 1 for synthesis and degradation is only for constructing the proofs of the theorems. In computation, there is no condition on the stoichiometry of any reaction, and our method is general and can be applied to any reaction network.
We can obtain the steady state probability on aggregated states without solving the dCME. It is tempting to lump all microstates in each group 𝒢j into one state and replace the original |Ω(∞)| × |Ω(∞)| rate matrix à with an aggregated matrix to study the dynamic changes of the probability landscape on this aggregated state space. However, stringent requirements must be satisfied for such lumped states to follow a Markov process [31, 32, 33, 34, 35, 36]. Specifically, a transition rate matrix A for a continuous Markov process is lumpable with respect to a partition Ω̃(∞) if and only if for all pairs of 𝒢s, 𝒢t ∈ Ω̃(∞), the condition
$$\sum_{x_k \in \mathcal{G}_t} A(x_k, x_i) = \sum_{x_k \in \mathcal{G}_t} A(x_k, x_j) \tag{7}$$
holds for all xi, xj ∈ 𝒢s [31]. In other words, every state in 𝒢s must have the same total transition rate to group 𝒢t, and this must be true for all 𝒢s and 𝒢t [31].
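The lumpability condition can be tested mechanically for a small rate matrix. The sketch below assumes the column convention used in this paper, in which the entry A[i][j] is the rate from state j to state i. The function name, the tolerance, and both 3-state example matrices are hypothetical.

```python
def is_lumpable(A, groups, tol=1e-12):
    """Check that all states in a group have equal total rate into every group.

    A      : square rate matrix as nested lists, A[i][j] = rate from state j to i
    groups : partition of the state indices, e.g. [[0], [1, 2]]
    """
    for source_group in groups:
        for target_group in groups:
            # Total rate from each source state into the target group.
            totals = [sum(A[k][i] for k in target_group) for i in source_group]
            if max(totals) - min(totals) > tol:
                return False
    return True

partition = [[0], [1, 2]]

# States 1 and 2 both jump to state 0 at rate 2: the partition is lumpable.
A_lumpable = [[-4.0, 2.0, 2.0],
              [1.0, -3.0, 5.0],
              [3.0, 1.0, -7.0]]

# State 2 now jumps to state 0 at rate 3 while state 1 keeps rate 2: not lumpable.
A_not_lumpable = [[-4.0, 2.0, 3.0],
                  [1.0, -3.0, 5.0],
                  [3.0, 1.0, -8.0]]

lumpable_result = is_lumpable(A_lumpable, partition)
not_lumpable_result = is_lumpable(A_not_lumpable, partition)
```

The second matrix shows how easily the condition fails: a single state-dependent rate within a group is enough, which is typical when degradation rates depend on the copy numbers of individual species.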
While à does not satisfy this strong condition in general, we can instead construct a lumped transition matrix B, which is associated with the aggregated state space derived from the partition Ω̃(∞), such that the aggregated steady state probability distribution on the partition Ω̃(∞) computed from the lumped matrix B is equal to that derived from the steady state distribution computed from the original matrix A. That is, steady state probabilities on partitioned groups in Ω̃(∞) are identical using either B or the original A.
Assuming that the steady state probability distribution π̃(x) over the partitioned state space Ω̃(∞) is known, the aggregated synthesis rate for the group 𝒢i and the aggregated degradation rate for the group 𝒢i+1 at the steady state are two constants (Fig 1) defined as
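$$\alpha_i^{(\infty)} \equiv \mathbf{1}^T A_{i+1,i}\, \frac{\tilde{\pi}(\mathcal{G}_i)}{\mathbf{1}^T \tilde{\pi}(\mathcal{G}_i)} \quad \text{and} \quad \beta_{i+1}^{(\infty)} \equiv \mathbf{1}^T A_{i,i+1}\, \frac{\tilde{\pi}(\mathcal{G}_{i+1})}{\mathbf{1}^T \tilde{\pi}(\mathcal{G}_{i+1})}, \tag{8}$$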
where π̃(𝒢i) and π̃(𝒢i+1) are the steady state probability vectors over microstates in the lumped states 𝒢i and 𝒢i+1, respectively. The term 1TAi+1,i is the row vector of column-summed rates from Ai+1,i for microstates in 𝒢i, and $\tilde{\pi}(\mathcal{G}_i) / (\mathbf{1}^T \tilde{\pi}(\mathcal{G}_i))$ is the steady state probability vector π̃(𝒢i) over microstates in 𝒢i normalized by the total steady state probability on 𝒢i. Similarly, 1TAi,i+1 is the row vector of column-summed rates from Ai,i+1 for microstates in 𝒢i+1, and $\tilde{\pi}(\mathcal{G}_{i+1}) / (\mathbf{1}^T \tilde{\pi}(\mathcal{G}_{i+1}))$ is the steady state probability vector π̃(𝒢i+1) over microstates in 𝒢i+1 normalized by the total steady state probability on 𝒢i+1. We can construct an aggregated transition rate matrix B from Ã based on the following Lemma:
Lemma 1 (Rate Matrix Aggregation.)
If a Molecular Equivalence Group has no limit on the total copy number, it generates an infinite state space Ω(∞) and the rate matrix A is of infinite dimension. For any homogeneous continuous-time Markov process with such a rate matrix A, an aggregated continuous-time Markov process with an infinite rate matrix B(∞) can be constructed on the partition Ω̃(∞) = {𝒢0, 𝒢1,⋯, 𝒢N, ⋯} with respect to the total net copy number of molecules in the network, such that it gives the same steady state probability distribution for each partitioned group {𝒢s} as that given by the original matrix A, i.e., π(𝒢s) = Σx∈𝒢s π(x) for all s = 0, 1,⋯, where π(Ω(∞)) is the steady state probability distribution associated with A. Specifically, the infinite transition rate matrix B(∞) can be constructed as a tridiagonal matrix:
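$$B^{(\infty)} = \begin{pmatrix} \gamma_0^{(\infty)} & \beta_1^{(\infty)} & & & \\ \alpha_0^{(\infty)} & \gamma_1^{(\infty)} & \beta_2^{(\infty)} & & \\ & \alpha_1^{(\infty)} & \gamma_2^{(\infty)} & \beta_3^{(\infty)} & \\ & & \ddots & \ddots & \ddots \end{pmatrix}, \tag{9}$$

with the lower off-diagonal vector $\alpha^{(\infty)} = (\alpha_i^{(\infty)})$, the upper off-diagonal vector $\beta^{(\infty)} = (\beta_{i+1}^{(\infty)})$, and the diagonal vector $\gamma^{(\infty)} = (\gamma_i^{(\infty)}) = (-\alpha_i^{(\infty)} - \beta_i^{(\infty)})$, i = 0,⋯,∞, where $\beta_0^{(\infty)} \equiv 0$. This is equivalent to transforming the corresponding infinite transition rate matrix Ã in Eqn. (6) into B(∞) by substituting each block sub-matrix Ai+1, i of synthesis reactions with the corresponding aggregated synthesis rate $\alpha_i^{(\infty)}$, and each block Ai, i+1 of degradation reactions with the aggregated degradation rate $\beta_{i+1}^{(\infty)}$, respectively, with $\alpha_i^{(\infty)}$ and $\beta_{i+1}^{(\infty)}$ defined in Eqn. (8).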
Proof can be found in the Appendix.
Analytical Solution of Steady State Probability of Aggregated States
The system associated with the aggregated rate matrix B can be viewed as a birth-death process controlled by a pair of “synthesis” and “degradation” transitions between aggregated states associated with different net copy number of the MEG. It takes the form:
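$$\emptyset \xrightarrow{\; \alpha_i^{(\infty)} \;} E, \qquad E \xrightarrow{\; \beta_{i+1}^{(\infty)} \;} \emptyset, \tag{10}$$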
where E represents the elementary molecular species in the MEG, with its copy number the total net copy number of the MEG. The rates $\alpha_i^{(\infty)}$ and $\beta_{i+1}^{(\infty)}$ are the aggregated “synthesis” and “degradation” rates for this MEG. The aggregated state space and the transitions between aggregated states are illustrated in Fig. 1. The steady state probability distribution over the aggregated states is governed by Bπ̃(∞) = 0.
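The aggregated rates $\alpha_i^{(\infty)}$ and $\beta_{i+1}^{(\infty)}$ in B are obtained from summations of all entries in the non-negative block matrices Ai+1, i and Ai, i+1. As long as there is one or more microstates in Ai+1, i or Ai, i+1 with non-zero copies of reactants, $\alpha_i^{(\infty)}$ or $\beta_{i+1}^{(\infty)}$ will be non-zero. We next examine the most general case when $\alpha_i^{(\infty)} > 0$ and $\beta_{i+1}^{(\infty)} > 0$ for all i = 0, 1,⋯. We simplify our notation and use $\tilde{\pi}_i^{(\infty)}$ for π̃(∞)(𝒢i). Following the well-known results on the analytical solution of the steady state distribution of birth-death processes [13, 15], the steady state solution for $\tilde{\pi}_0^{(\infty)}$ and $\tilde{\pi}_i^{(\infty)}$ can be written as:

$$\tilde{\pi}_0^{(\infty)} = \frac{1}{1 + \sum_{j=1}^{\infty} \prod_{k=0}^{j-1} \frac{\alpha_k^{(\infty)}}{\beta_{k+1}^{(\infty)}}} \tag{11}$$

and

$$\tilde{\pi}_i^{(\infty)} = \tilde{\pi}_0^{(\infty)} \prod_{k=0}^{i-1} \frac{\alpha_k^{(\infty)}}{\beta_{k+1}^{(\infty)}}. \tag{12}$$

Therefore, the steady state probability of an arbitrary group 𝒢i can be written as:

$$\tilde{\pi}_i^{(\infty)} = \frac{\prod_{k=0}^{i-1} \frac{\alpha_k^{(\infty)}}{\beta_{k+1}^{(\infty)}}}{1 + \sum_{j=1}^{\infty} \prod_{k=0}^{j-1} \frac{\alpha_k^{(\infty)}}{\beta_{k+1}^{(\infty)}}}. \tag{13}$$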
Once $\alpha_i^{(\infty)}$ and $\beta_{i+1}^{(\infty)}$ are known, the total probability of any aggregated state 𝒢i at the steady state can be easily computed. We will introduce a method in later sections for easy a priori calculation of error estimates based on Eqn. (13) and on bounds of $\alpha_i^{(\infty)}$ and $\beta_{i+1}^{(\infty)}$, which are directly obtained from reaction rate constants of the network model, without the need for solving the dCME.
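The aggregation idea can be checked on a small example. The sketch below is an illustration under stated assumptions, not the computation used in this paper: it takes a hypothetical one-MEG network with two interconverting species A and B, synthesis of A, and degradation of both, truncated at N = 4 net copies so that the steady state can be solved exactly with rational arithmetic. It then forms the aggregated synthesis and degradation rates as probability-weighted averages and verifies that the birth-death product formula reproduces the group probabilities. All rate constants and helper names are invented for illustration.

```python
from fractions import Fraction as F

# Hypothetical rate constants: synthesis of A, A -> B, B -> A, degradation of A and B.
K_SYN, K_AB, K_BA, D_A, D_B = F(2), F(1), F(1, 2), F(1), F(3)
N = 4                                    # maximum net copy number a + b
states = [(a, n - a) for n in range(N + 1) for a in range(n + 1)]
index = {s: i for i, s in enumerate(states)}

def synthesis_rate(s):
    return K_SYN if sum(s) < N else F(0)     # reflecting boundary at a + b == N

def degradation_rate(s):
    return D_A * s[0] + D_B * s[1]

def transitions(s):
    a, b = s
    moves = []
    if synthesis_rate(s):
        moves.append(((a + 1, b), synthesis_rate(s)))
    if a:
        moves += [((a - 1, b + 1), K_AB * a), ((a - 1, b), D_A * a)]
    if b:
        moves += [((a + 1, b - 1), K_BA * b), ((a, b - 1), D_B * b)]
    return moves

def steady_state():
    """Solve A pi = 0 with sum(pi) = 1 by exact Gauss-Jordan elimination."""
    n = len(states)
    M = [[F(0)] * (n + 1) for _ in range(n)]     # augmented matrix [A | 0]
    for s in states:
        j = index[s]
        for target, rate in transitions(s):
            M[index[target]][j] += rate
            M[j][j] -= rate
    M[-1] = [F(1)] * (n + 1)                     # replace one equation by normalization
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        scale = M[c][c]
        M[c] = [v / scale for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return {s: M[index[s]][n] for s in states}

pi = steady_state()
group = [sum(p for s, p in pi.items() if sum(s) == n) for n in range(N + 1)]

# Aggregated rates: alpha[i] leaves group i upward, beta[i] leaves group i + 1 downward.
alpha = [sum(synthesis_rate(s) * p for s, p in pi.items() if sum(s) == i) / group[i]
         for i in range(N)]
beta = [sum(degradation_rate(s) * p for s, p in pi.items() if sum(s) == i + 1) / group[i + 1]
        for i in range(N)]

# Birth-death product formula on the aggregated chain.
weights = [F(1)]
for a_i, b_i in zip(alpha, beta):
    weights.append(weights[-1] * a_i / b_i)
lumped = [w / sum(weights) for w in weights]
```

In this example the aggregated degradation rates depend on how probability is distributed inside each group, so the microstate chain is not lumpable in the strict sense, yet `lumped` and `group` agree exactly, which is the content of the aggregation lemma.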
Truncation Error Is Bounded Asymptotically by Probability of Boundary States
When the maximum total net copy number of the MEG is limited to N, states with a total net copy number larger than N will not be included, resulting in a truncated state space Ω(N). Those microstates with exactly N total net copies of molecules in the network are the boundary states, because neighboring states with one additional molecule are truncated. The true error for the steady state Err(N) due to truncating states beyond those with N net copies of molecules is the summation of true probabilities over microstates that have been truncated from the original infinite state space:
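$$\mathrm{Err}^{(N)} = \sum_{x \in \Omega^{(\infty)},\, x \notin \Omega^{(N)}} \pi^{(\infty)}(x) = 1 - \sum_{x \in \Omega^{(N)}} \pi^{(\infty)}(x) = \sum_{i = N+1}^{\infty} \tilde{\pi}_i^{(\infty)}. \tag{14}$$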
The true error Err(N) is unknown, as it requires knowledge of π(∞)(x) for all x ∈ Ω(∞). In this section, we show that Err(N) asymptotically converges to the true boundary probability $\tilde{\pi}_N^{(\infty)}$ as the maximum net copy number limit N increases. If N is sufficiently large, the true error Err(N) is bounded by the true boundary probability $\tilde{\pi}_N^{(\infty)}$ times a constant. First, we have:
Lemma 2 (Finite Biological System.)
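For any biological system in which the total amount of mass is finite, the aggregated synthesis rate $\alpha_N^{(\infty)}$ becomes smaller than the aggregated degradation rate $\beta_{N+1}^{(\infty)}$ when the total molecular copy number N is sufficiently large:

$$\lim_{N \to \infty} \sup_{i > N} \frac{\alpha_i^{(\infty)}}{\beta_{i+1}^{(\infty)}} < 1. \tag{15}$$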
Proof can be found in the Appendix.
Note that in most biological reaction networks, the stronger condition $\lim_{i \to \infty} \alpha_i^{(\infty)} / \beta_{i+1}^{(\infty)} = 0$ should hold, as synthesis reactions usually have constant rates, while degradation reactions have increasing rates when the copy number of the molecule increases. When the net copy number i is sufficiently large, the ratio $\alpha_i^{(\infty)} / \beta_{i+1}^{(\infty)}$ approaches zero.
According to Eqn. (14) and as discussed above, when the total net molecular copy number N increases to infinity, the true error Err(N) converges to zero. For a finite system, the series of the boundary probabilities $\{\tilde{\pi}_N^{(\infty)}\}$ (Eqn. (13)) also converges to 0, since the sequence of its partial sums converges to 1. That is, the N-th member of this series converges to 0 and the residual sum of this series converges to 0. We now study the convergence behavior of the ratio of Err(N) and $\tilde{\pi}_N^{(\infty)}$.
Theorem 3 (Asymptotic Convergence of Error.)
For a truncated state space with a maximum net molecular copy number N in the network, the true error Err(N) follows the inequality below when N increases to infinity:
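$$\mathrm{Err}^{(N)} \le \frac{\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)}}{1 - \alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)}}\; \tilde{\pi}_N^{(\infty)}, \tag{16}$$

where M is an integer selected from N, ⋯, ∞ to satisfy $\frac{\alpha_M^{(\infty)}}{\beta_{M+1}^{(\infty)}} = \sup_{i \ge N} \left\{ \frac{\alpha_i^{(\infty)}}{\beta_{i+1}^{(\infty)}} \right\}$.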
Proof can be found in the Appendix.
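According to Theorem 3, the true error Err(N) is asymptotically bounded by the boundary probability $\tilde{\pi}_N^{(\infty)}$ multiplied by a simple function of the aggregated synthesis rates $\alpha_i^{(\infty)}$ and degradation rates $\beta_{i+1}^{(\infty)}$. We can therefore use Inequality (16) to construct an upper bound for Err(N). We examine three cases: (1) If $\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)} < 0.5$, the true error is always smaller than the boundary probability: $\mathrm{Err}^{(N)} < \tilde{\pi}_N^{(\infty)}$, when the maximum net molecular copy number N is sufficiently large. (2) If $\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)} = 0.5$, the true error converges asymptotically to $\tilde{\pi}_N^{(\infty)}$. (3) If $0.5 < \alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)} < 1$, the error is bounded by $\tilde{\pi}_N^{(\infty)}$ multiplied by a constant according to Inequality (16).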
In realistic biological reaction networks, case (1) is most applicable. As rates of synthesis reactions usually are constant, whereas rates of degradation reactions depend on the copy number of net molecules in the network, the ratio between the aggregated synthesis rate and degradation rate decreases monotonically with increasing net molecular copy number N. We therefore conclude that the boundary probability $\tilde{\pi}_N^{(\infty)}$ indeed provides an upper bound to the state space truncation error. In addition, in case (1) M = N, and $\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)} = \alpha_N^{(\infty)} / \beta_{N+1}^{(\infty)}$.
Therefore Inequality (16) can be further rewritten as:

$$\mathrm{Err}^{(N)} \le \frac{\alpha_N^{(\infty)} / \beta_{N+1}^{(\infty)}}{1 - \alpha_N^{(\infty)} / \beta_{N+1}^{(\infty)}}\; \tilde{\pi}_N^{(\infty)}. \tag{17}$$
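The bound can be checked numerically on the simplest case, a birth and death process with constant synthesis rate and degradation rate proportional to the copy number, whose untruncated steady state is a Poisson distribution. The sketch below is a hedged illustration: the rate constants, the truncation level N = 25, and the cutoff of 400 terms used to approximate the infinite tail are all assumptions.

```python
from math import exp

def poisson_probs(lam, n_max):
    """Poisson probabilities p_0 .. p_n_max, built by recurrence to avoid overflow."""
    probs = [exp(-lam)]
    for i in range(1, n_max + 1):
        probs.append(probs[-1] * lam / i)
    return probs

# Hypothetical birth-death network: synthesis rate k_syn, degradation rate k_deg * i.
k_syn, k_deg, N = 10.0, 1.0, 25
pi_inf = poisson_probs(k_syn / k_deg, 400)    # proxy for the infinite state space

true_error = sum(pi_inf[N + 1:])              # probability mass beyond the boundary
boundary_probability = pi_inf[N]              # true boundary probability

# For this network the ratio alpha_i / beta_{i+1} = k_syn / (k_deg * (i + 1))
# decreases with i, so the supremum over i >= N is attained at i = N (M = N).
ratio = k_syn / (k_deg * (N + 1))
error_bound = ratio / (1.0 - ratio) * boundary_probability
```

With these values the ratio is below 0.5, so this is case (1): the true error lies below the bound, which in turn lies below the boundary probability.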
Computed Probability of Boundary States on Truncated State Space Bounds the True Boundary Probability
It is not practical to compute the true boundary probability $\tilde{\pi}_N^{(\infty)}$ on the original infinite state space. In this section, we show that the probability of boundary states $\tilde{\pi}_N^{(N)}$ computed using the truncated state space is no smaller than $\tilde{\pi}_N^{(\infty)}$. Therefore, we can use $\tilde{\pi}_N^{(N)}$ on the truncated state space as an upper bound for Err(N). That is, the steady state probability computed using the truncated state space over the boundary states can be used to bound Err(N).
We first show that the truncated state space and its rate matrix can also be aggregated according to the net copy number of molecules in MEG following Lemma 4, which is similar to Lemma 1:
Lemma 4
A Molecular Equivalent Group with a maximum of N total copy number of elementary molecular species gives a truncated state space Ω(N) and a truncated rate matrix A(N). For any homogeneous continuous-time Markov process with such a rate matrix A(N), an aggregated continuous-time Markov process with a rate matrix B(N) can be constructed on the partition Ω̃(N) = {𝒢0, 𝒢1, ⋯, 𝒢N } with respect to the total net copy number of molecules in the network, such that it gives the same steady state probability distribution for each partitioned group {𝒢s} as that given by the original matrix A(N), i.e., π(𝒢s) = Σx∈𝒢s π(x) for all s = 0, 1, ⋯, N, where π(x) is the steady state probability distribution associated with A(N).
Specifically, the rate matrix B(N) can be constructed as:
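$$B^{(N)} = \begin{pmatrix} \gamma_0^{(N)} & \beta_1^{(N)} & & & \\ \alpha_0^{(N)} & \gamma_1^{(N)} & \beta_2^{(N)} & & \\ & \ddots & \ddots & \ddots & \\ & & \alpha_{N-2}^{(N)} & \gamma_{N-1}^{(N)} & \beta_N^{(N)} \\ & & & \alpha_{N-1}^{(N)} & \gamma_N^{(N)} \end{pmatrix}, \tag{18}$$

with the lower off-diagonal vector

$$\alpha^{(N)} = \left(\alpha_0^{(N)}, \cdots, \alpha_{N-1}^{(N)}\right),$$

the upper off-diagonal vector

$$\beta^{(N)} = \left(\beta_1^{(N)}, \cdots, \beta_N^{(N)}\right),$$

and the diagonal vector

$$\gamma^{(N)} = \left(-\alpha_0^{(N)},\; -\alpha_1^{(N)} - \beta_1^{(N)},\; \cdots,\; -\alpha_{N-1}^{(N)} - \beta_{N-1}^{(N)},\; -\beta_N^{(N)}\right).$$

It is equivalent to substituting the block sub-matrices Ai+1, i and Ai, i+1 in the original rate matrix Ã with the corresponding aggregated synthesis rate $\alpha_i^{(N)}$ and degradation rate $\beta_{i+1}^{(N)}$, respectively. The aggregated rates on the truncated state space are:

$$\alpha_i^{(N)} \equiv \mathbf{1}^T A_{i+1,i}\, \frac{\tilde{\pi}^{(N)}(\mathcal{G}_i)}{\mathbf{1}^T \tilde{\pi}^{(N)}(\mathcal{G}_i)} \quad \text{and} \quad \beta_{i+1}^{(N)} \equiv \mathbf{1}^T A_{i,i+1}\, \frac{\tilde{\pi}^{(N)}(\mathcal{G}_{i+1})}{\mathbf{1}^T \tilde{\pi}^{(N)}(\mathcal{G}_{i+1})}. \tag{19}$$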
Proof
Same as Lemma 1.
Similar to the case of infinite state space, we can write out in analytic form the total steady state probability over each aggregated group 𝒢i as:
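$$\tilde{\pi}_i^{(N)} = \frac{\prod_{k=0}^{i-1} \frac{\alpha_k^{(N)}}{\beta_{k+1}^{(N)}}}{1 + \sum_{j=1}^{N} \prod_{k=0}^{j-1} \frac{\alpha_k^{(N)}}{\beta_{k+1}^{(N)}}}. \tag{20}$$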
Specifically, the total steady state probability over the group of aggregated boundary states 𝒢N is:
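$$\tilde{\pi}_N^{(N)} = \frac{\prod_{k=0}^{N-1} \frac{\alpha_k^{(N)}}{\beta_{k+1}^{(N)}}}{1 + \sum_{j=1}^{N} \prod_{k=0}^{j-1} \frac{\alpha_k^{(N)}}{\beta_{k+1}^{(N)}}}. \tag{21}$$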
We now study how state space truncation affects the steady state probabilities over the aggregated groups.
Theorem 5 (Boundary Probability Increases after State Space Truncation)
The total steady state probability $\tilde{\pi}_i^{(N)}$ of an aggregated state group 𝒢i, for all i = 0, 1, ⋯, N, on the truncated state space Ω̃(N) with a maximum net molecular copy number N, is greater than or equal to the non-truncated probability $\tilde{\pi}_i^{(\infty)}$ over the same group 𝒢i obtained using the original state space Ω̃(∞) of infinite size, i.e., $\tilde{\pi}_i^{(N)} \ge \tilde{\pi}_i^{(\infty)}$.
Proof can be found in the Appendix.
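In summary, the boundary probability increases when the state space is truncated: $\tilde{\pi}_N^{(N)} \ge \tilde{\pi}_N^{(\infty)}$. From Theorem 3, we always have $\mathrm{Err}^{(N)} \le \frac{\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)}}{1 - \alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)}}\, \tilde{\pi}_N^{(\infty)}$. Therefore, we can bound Err(N) by the boundary probability $\tilde{\pi}_N^{(N)}$ computed using the truncated state space when N is sufficiently large and $\alpha_M^{(\infty)} / \beta_{M+1}^{(\infty)} < 0.5$.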
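The ordering between the true error, the true boundary probability, and the boundary probability computed on the truncated state space can be illustrated with the birth and death model. In the sketch below, a chain truncated at 400 copies stands in for the infinite state space; this proxy, the rate constants, and the function names are assumptions made for illustration.

```python
def birth_death_steady_state(alpha, beta):
    """Steady state of a finite birth-death chain.

    alpha[i] is the rate from level i to i + 1 and beta[i] the rate from
    level i + 1 to i, so the chain has len(alpha) + 1 levels.
    """
    weights = [1.0]
    for a, b in zip(alpha, beta):
        weights.append(weights[-1] * a / b)
    total = sum(weights)
    return [w / total for w in weights]

def rates(k_syn, k_deg, n_max):
    """Constant synthesis and linear degradation up to a reflecting boundary n_max."""
    alpha = [k_syn] * n_max
    beta = [k_deg * (i + 1) for i in range(n_max)]
    return alpha, beta

k_syn, k_deg = 10.0, 1.0
reference = birth_death_steady_state(*rates(k_syn, k_deg, 400))   # proxy for N = infinity

rows = []
for N in (20, 25, 30):
    truncated = birth_death_steady_state(*rates(k_syn, k_deg, N))
    true_error = sum(reference[N + 1:])          # mass lost by truncating at N
    rows.append((N, true_error, reference[N], truncated[N]))

for N, err, true_boundary, computed_boundary in rows:
    print(f"N={N:3d}  Err={err:.3e}  true boundary={true_boundary:.3e}  "
          f"computed boundary={computed_boundary:.3e}")
```

For each N in this example the three printed quantities appear in increasing order, so the computed boundary probability is a usable upper bound on the truncation error without any knowledge of the untruncated solution.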
From One to Multiple MEGs
In complex reaction networks, multiple MEGs occur. Since different MEGs are pairwise disjoint, we can aggregate the same state space and re-construct the permuted rate matrix according to each MEG, one at a time. Lemmas 1, 2, and 4, and Theorems 3 and 5 are all valid for each individual MEG. That is, the true error of truncating one MEG is bounded by the boundary probability computed using the state space truncated in that particular MEG, while all other MEGs have infinite net molecular copy numbers. However, it is not possible to compute the solution of the dCME with infinite molecules in any MEG. Below we study how error bounds can be constructed when states in all MEGs are truncated simultaneously.
From Truncating One to Truncating All MEGs
We use ℐ = (∞, ⋯, ∞) to denote the vector of infinite net copy numbers for all MEGs in the network. ℐ corresponds to the original infinite state space Ω(ℐ) without any truncation. We use A(ℐ) and π(ℐ) to denote the transition rate matrix and the steady state probability distribution over Ω(ℐ), respectively. Furthermore, we have A(ℐ)π(ℐ) = 0.
We use ℐj = (∞, ⋯, Nj, ⋯, ∞) to denote the vector of maximum copy numbers with only the j-th MEG limited to a finite copy number Nj and all other MEGs with infinite copy numbers. The corresponding state space is denoted Ω(ℐj), the transition rate matrix A(ℐj), and the steady state probability distribution π(ℐj). At the steady state, we also have A(ℐj)π(ℐj) = 0.
We now add one more truncation to the i-th MEG in addition to the j-th MEG. We denote the vector of maximum copies as ℐi,j = (∞, ⋯, Ni, ⋯, Nj, ⋯, ∞), with Ni and Nj the maximum copy numbers of the i-th and j-th MEG, respectively. All other MEGs can have infinite molecular copy numbers. We denote the corresponding state space as Ω(ℐi,j), the transition rate matrix as A(ℐi,j), and the steady state probability distribution as π(ℐi,j). At the steady state, we have A(ℐi,j)π(ℐi,j) = 0.
When all w MEGs in the network are truncated using a vector of maximum copies ℬ = (N1, ⋯, Ni, ⋯, Nj, ⋯, Nw), we have a finite state space Ω(ℬ). By construction, we have Ω(ℬ) ⊆ Ω(ℐi,j) ⊆ Ω(ℐj) ⊆ Ω(ℐ).
We have already shown that for each truncated MEG on the infinite state space, the truncation error is bounded by the corresponding boundary probability. We now show that this error bound also holds for the fully truncated state space Ω(ℬ). We first add only one additional truncation at the i-th MEG to the singly truncated state space Ω(ℐj), and demonstrate that the probability of each state in the doubly truncated state space Ω(ℐi,j) is no smaller than its probability in the singly truncated state space Ω(ℐj), i.e., π(ℐi,j)(x) ≥ π(ℐj)(x) for all x ∈ Ω(ℐi,j).
Theorem 6
At steady state, π(ℐi,j) ≥ π(ℐj) and π(ℐi,j) approaches π(ℐj) component-wise for any state in Ω(ℐi,j) when the maximum net copy number limit for the i-th MEG Ni goes to ∞.
Proof can be found in the Appendix.
Theorem 6 shows that introducing an additional truncation at the i-th MEG does not decrease the boundary probability of the j-th MEG. Therefore, the boundary probability from doubly truncated state space Ω(ℐi,j) can also be used to bound the true error after state truncations at both i-th and j-th MEG. Furthermore, we can show by induction that boundary probabilities computed from the fully truncated state space Ω(ℬ) can also be used to bound the truncation errors of each MEG, respectively.
Upper and Lower Bounds for Steady State Boundary Probability
In this section, we introduce an efficient and easy-to-compute method to obtain upper and lower bounds of the boundary probabilities a priori, without the need to solve the dCME. The method can be used to rapidly determine whether the maximum copy number limits of the MEGs are adequate to obtain a direct solution to the dCME with a truncation error smaller than a predefined tolerance. The optimal maximum copy number for each MEG can therefore be estimated a priori.
As a consequence of Theorem 6 discussed above, the boundary probability computed on the truncated state space Ω(ℬ) can be used as an error bound. We now use the truncated rate matrix to derive the upper and lower bounds.
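Denote the maximum and minimum aggregated synthesis rates from the block sub-matrix Ai+1,i as
$$\bar{\alpha}_i = \max\{\mathbf{1}^T A_{i+1,i}\} \quad \text{and} \quad \underline{\alpha}_i = \min\{\mathbf{1}^T A_{i+1,i}\}, \tag{22}$$
respectively, and the maximum and minimum aggregated degradation rates from the block sub-matrix Ai,i+1 as
$$\bar{\beta}_{i+1} = \max\{\mathbf{1}^T A_{i,i+1}\} \quad \text{and} \quad \underline{\beta}_{i+1} = \min\{\mathbf{1}^T A_{i,i+1}\}, \tag{23}$$
respectively. Note that ᾱi, α̲i, β̄i+1, and β̲i+1 can be easily calculated from the reaction rates in the network without the need for generating and partitioning the dCME transition rate matrix Ã. As αi and βi+1 given in Eqn. (8) are weighted sums of the elements of the vectors 1T Ai+1,i and 1T Ai,i+1 with regard to the steady state probability distribution π̃(N)(𝒢i), respectively, we have
$$\underline{\alpha}_i \le \alpha_i \le \bar{\alpha}_i \quad \text{and} \quad \underline{\beta}_{i+1} \le \beta_{i+1} \le \bar{\beta}_{i+1}.$$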
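We use results from the theory of stochastic ordering for comparing Markov processes to bound the boundary probability $\tilde{\pi}_N^{(N)}$. Stochastic ordering “≤st” between two infinitesimal generator matrices Pn×n and Qn×n of Markov processes is defined as [32, 37]
$$P \le_{st} Q \quad \text{if and only if} \quad \sum_{k=j}^{n} P_{i,k} \le \sum_{k=j}^{n} Q_{i,k} \quad \text{for all } i, j.$$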
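To derive an upper bound for $\tilde{\pi}_N^{(N)}$ in Eqn. (21), we construct a new matrix B̄ by replacing αi with the corresponding ᾱi and βi+1 with the corresponding β̲i+1 in the matrix B. Similarly, to derive a lower bound for $\tilde{\pi}_N^{(N)}$, we construct the matrix B̲ by replacing αi with the corresponding α̲i and βi+1 with the corresponding β̄i+1 in B. We then have the following stochastic ordering:
$$\underline{B} \le_{st} B \le_{st} \bar{B}.$$
All three matrices B̲, B, and B̄ are “≤st-monotone” according to the definitions in Truffet [32]. The steady state probability distributions of matrices B̲, B, and B̄ maintain the same stochastic ordering (Theorem 4.1 of Truffet [32]):
$$\pi_{\underline{B}} \le_{st} \pi_{B} \le_{st} \pi_{\bar{B}}.$$
Therefore, we have the inequality:
$$\underline{\tilde{\pi}}_N^{(N)} \le \tilde{\pi}_N^{(N)} \le \bar{\tilde{\pi}}_N^{(N)}.$$
Here the lower bound $\underline{\tilde{\pi}}_N^{(N)}$ is the boundary probability from πB̲, $\tilde{\pi}_N^{(N)}$ is the boundary probability from πB, and the upper bound $\bar{\tilde{\pi}}_N^{(N)}$ is the boundary probability computed from πB̄. From Eqn. (21), the upper bound can be calculated a priori from reaction rates:
$$\bar{\tilde{\pi}}_N^{(N)} = \frac{\prod_{k=0}^{N-1} \bar{\alpha}_k / \underline{\beta}_{k+1}}{1 + \sum_{j=1}^{N} \prod_{k=0}^{j-1} \bar{\alpha}_k / \underline{\beta}_{k+1}}, \tag{24}$$
and the lower bound can also be calculated as:
$$\underline{\tilde{\pi}}_N^{(N)} = \frac{\prod_{k=0}^{N-1} \underline{\alpha}_k / \bar{\beta}_{k+1}}{1 + \sum_{j=1}^{N} \prod_{k=0}^{j-1} \underline{\alpha}_k / \bar{\beta}_{k+1}}. \tag{25}$$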
These are general formulas for upper and lower bounds of the boundary probabilities of any MEG in a reaction network. Note that while the a priori upper bound is easy to compute, it may not be a tight error bound when the MEG involves many molecular species with overall complex interactions. This will be shown in the example of the phage lambda epigenetic switch model (Fig. 6A and B).
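As a minimal sketch of how such bounds could be evaluated, the Python snippet below treats the aggregated process as a birth-death chain over the net copy number of one MEG and evaluates its boundary probability in log space to avoid overflow. The function names and the toy rate lists are illustrative assumptions rather than part of the original method. The upper bound pairs the largest synthesis rates with the smallest degradation rates, and the lower bound does the opposite.

```python
import math


def boundary_probability(alpha, beta):
    """Steady state probability of the top level N of a birth-death chain.

    alpha[k] is the rate from level k to k + 1 and beta[k] the rate from
    level k + 1 back to k, for k = 0..N-1 (all rates must be positive).
    """
    log_w = [0.0]  # log of prod_{k<j} alpha[k] / beta[k], starting at j = 0
    for a, b in zip(alpha, beta):
        log_w.append(log_w[-1] + math.log(a) - math.log(b))
    top = max(log_w)
    log_sum = top + math.log(sum(math.exp(v - top) for v in log_w))
    return math.exp(log_w[-1] - log_sum)


def a_priori_bounds(alpha_max, alpha_min, beta_max, beta_min):
    """Return (lower, upper) bounds of the boundary probability."""
    upper = boundary_probability(alpha_max, beta_min)
    lower = boundary_probability(alpha_min, beta_max)
    return lower, upper


# Toy rates for a MEG truncated at N = 2 (hypothetical values).
lower, upper = a_priori_bounds(
    alpha_max=[1.0, 1.0], alpha_min=[0.5, 0.5],
    beta_max=[2.0, 4.0], beta_min=[1.0, 2.0],
)
print(lower, upper)
```

Working in log space matters in practice because the products of rate ratios can span hundreds of orders of magnitude for realistic MEG sizes.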
For a reaction network with multiple MEGs, we have
$$\mathrm{Err}(\Omega(\mathcal{B})) \le \sum_{i=1}^{w} \tilde{\pi}_{N_i}^{(N_i)},$$
where Ni is the maximum copy number for the i-th MEG. The upper bound for the total error Err(Ω(ℬ)) can therefore be obtained straightforwardly by summing the upper bounds for each individual MEG:
$$\mathrm{Err}(\Omega(\mathcal{B})) \le \sum_{i=1}^{w} \bar{\tilde{\pi}}_{N_i}^{(N_i)}. \tag{26}$$
This upper bound can therefore be used as an a priori estimated bound for the total truncation error Err(Ω(ℬ)) for the state space Ω(ℬ) using truncation of ℬ = (N1, ⋯, Ni, ⋯, Nj, ⋯, Nw).
Biological Examples
Below we give examples of characterizing the truncation errors in the steady state probability landscapes for four biological reaction networks. We study the birth and death process, the single gene expression model, the genetic toggle switch model, and the phage lambda epigenetic switch model. We first show how each network can be partitioned into MEGs, and how truncation errors for each MEG can be estimated a priori. By enumerating the state space and directly computing the steady state probability landscapes of the dCMEs using the fb-dCME method, we examine the true truncation errors, the computed boundary probabilities, and the a priori estimated truncation error. We demonstrate that the truncation error is indeed bounded from above by the computed boundary probability, and by the a priori error estimate according to the theoretical analyses described earlier, once the copy number limit is sufficiently large for the MEG(s).
Birth-Death Process
The birth-death process is a ubiquitous biochemical phenomenon. In its simplest form, it involves synthesis and degradation of only one molecular species. We study this simple birth-death process, whose reaction scheme and rate constants are specified as follows:
$$\emptyset \xrightarrow{k_s} X, \qquad X \xrightarrow{k_d} \emptyset. \tag{27}$$
The steady state probability landscape of the birth-death process is well known [13, 15]. This process has also been studied extensively as a problem of estimating rare event probability [38, 39, 40].
Molecular equivalent group (MEG)
This single birth and death process is an open network because of the presence of the synthesis reaction. There is only one molecular equivalent group (MEG). We truncate the state space at different values of the maximum copy number of the MEG, ranging from 0 to 200, and compute the boundary probabilities at each different truncation.
Asymptotic convergence of errors (Theorem 3)
To numerically demonstrate Theorem 3, we compute the true truncation error of the steady state solution to the dCME. We use a large copy number of MEG = 200, which gives an infinitesimally small boundary probability of 1.391 × 10⁻⁷². The steady state solution obtained using this MEG size coincides with the analytical solution, and is therefore considered to be exact. With this exact steady state probability landscape, the true truncation error Err(N) at smaller MEG sizes can be computed using Eqn. (14) (Fig. 3A, blue dashed line and crosses). The corresponding boundary probabilities are computed from this exact steady state probability landscape (Fig. 3A, green dashed line and circles).
Consistent with the statement in the Theorem 3, we find here that the true error Err(N) (Fig. 3A, blue dashed line and crosses) is bounded by the computed boundary probability (Fig. 3A, green dashed line and circles) when the size of the MEG is sufficiently large. The inset of Fig. 3A shows the ratio of the true errors to the computed errors at different sizes of the MEG, and the grey straight line marks the ratio one. The computed errors are larger than the true errors when the black line is below the grey straight line (Fig. 3A inset). In this example, the computed boundary probability is greater than the true error when N > 79, as would be expected from Theorem 3.
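The comparison above can be sketched in a few lines of Python. The snippet assumes a ratio ks/kd = 40, chosen here only because it reproduces the boundary probability quoted for MEG = 200, and it takes the true error to be the probability mass of the exact landscape that lies outside the truncated state space, which is an assumption about the form of Eqn. (14). All function names are illustrative.

```python
import math

LAM = 40.0  # assumed ratio ks / kd, not a value taken from the text


def truncated_landscape(n_max, lam=LAM):
    """Steady state of the birth-death process on {0..n_max} with a
    reflecting boundary, i.e. a Poisson shape renormalized on the
    retained states."""
    log_w = [k * math.log(lam) - math.lgamma(k + 1) for k in range(n_max + 1)]
    top = max(log_w)
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


exact = truncated_landscape(200)   # reference landscape, treated as exact
boundary_200 = exact[200]          # boundary probability at MEG = 200


def error_and_boundary(n):
    """True error (mass outside {0..n}) and boundary probability at size n,
    both read off the reference landscape."""
    true_error = sum(exact[n + 1:])
    return true_error, exact[n]


print(f"boundary probability at N = 200: {boundary_200:.3e}")
for n in (50, 100, 150):
    err, bnd = error_and_boundary(n)
    print(n, err, bnd, err / bnd)
```

Under these assumptions the boundary probability at N = 200 comes out at about 1.39 × 10⁻⁷², and the ratio of true error to boundary probability falls below one once N is well past the mean copy number.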
A priori estimated error bound
To examine the a priori estimated upper bound for truncation error, we follow Eqn. (22) and (23) to assign values of ᾱi = ks and β̲i+1 = kd(i+1) for this network. We compute the a priori upper error bound for different truncations using Eqn. (24) (Fig. 3A, red solid line). For this simple network, ᾱi = αi = α̲i and β̄i = βi = β̲i; therefore, the a priori estimated error is exactly the same as the analytic solution for the steady state distribution of this simple birth-death network, and it coincides with the computed error (Fig. 3A, red and green lines). The true error, computed error, and the a priori error bound all decrease monotonically with increasing MEG size N (Fig. 3A).
Increased probability after state space truncation (Theorem 5)
According to Theorem 5, the probability of a state increases upon state space truncation. We compare the steady state probability landscapes of X computed using truncations at different sizes ranging from 40 to 50 with the exact steady state landscape (Fig. 3B, red line). Our results indeed show clearly that all probabilities increase as more states are truncated (Fig. 3B). The probability landscape computed using N = 50 (Fig. 3B, yellow line) or larger is very close to the exact landscape using N = 200 (Fig. 3B, red line). However, the probability landscapes computed using smaller N deviate significantly from the exact probability landscape. The smaller the MEG size, the more significant the deviation is. These results are fully consistent with the statements of Theorem 5.
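The inflation of probabilities under truncation can be checked numerically with a short Python sketch. It again assumes ks/kd = 40 (an assumption, not a value stated here) and uses the fact that a one-species birth-death process with a reflecting boundary keeps its Poisson shape on the retained states, so truncation rescales every retained probability by the same factor.

```python
import math


def landscape(n_max, lam):
    """Steady state on {0..n_max} with a reflecting boundary at n_max."""
    log_w = [k * math.log(lam) - math.lgamma(k + 1) for k in range(n_max + 1)]
    top = max(log_w)
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


LAM = 40.0                      # assumed ratio ks / kd
exact = landscape(200, LAM)     # reference landscape, treated as exact

inflation = {}                  # largest ratio truncated / exact per size
never_decreases = True
for n in range(40, 51):
    truncated = landscape(n, LAM)
    ratios = [truncated[x] / exact[x] for x in range(n + 1)]
    inflation[n] = max(ratios)
    # Allow for floating-point round-off when checking "no state decreases".
    never_decreases = never_decreases and min(ratios) >= 1.0 - 1e-12

for n in (40, 45, 50):
    print(n, inflation[n])
```

The inflation factor shrinks toward one as the truncation size grows, which is the behavior described for Fig. 3B.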
Single Gene Expression Model
Transcription and translation are fundamental processes in gene regulatory networks that often involve significant stochasticity. The abundance of mRNA and expressed proteins of a gene is usually 2–4 orders of magnitude apart in a cell. There are only a few or dozens of copies of mRNA molecules in each cell for one gene, but the copy number of proteins can range from hundreds to tens of thousands [41]. Here we study a model of the fundamental process of single gene transcription and translation using the following reaction scheme and rate constants:
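$$\emptyset \xrightarrow{k_e} \mathrm{mRNA}, \qquad \mathrm{mRNA} \xrightarrow{k_m} \emptyset, \qquad \mathrm{mRNA} \xrightarrow{k_t} \mathrm{mRNA} + \mathrm{Protein}, \qquad \mathrm{Protein} \xrightarrow{k_d} \emptyset. \tag{28}$$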
Molecular equivalent group (MEG)
This single gene expression model is an open network. We can partition this model into two molecular equivalent groups (MEGs), with MEG1 consisting of the species mRNA and MEG2 consisting of Protein. Note that protein synthesis depends on the copy number of mRNA, despite the fact that mRNA and Protein are two independent molecular species that cannot be transformed into each other.
Asymptotic convergence of errors (Theorem 3)
To numerically demonstrate Theorem 3, we compute the true error of the steady state solution to the dCME using sufficiently large sizes of MEG1 = 64 and MEG2 = 2,580, which give negligible truncation error, with infinitesimally small boundary probabilities of 3.58 × 10⁻³⁰ for MEG1 and 1.15 × 10⁻³² for MEG2. The solution obtained using these MEG sizes is therefore considered to be exact. With this exact steady state probability landscape, the true truncation error Err(N) at smaller sizes of MEG1 and MEG2 can be computed using Eqn. (14) (Fig. 4A and B, blue dashed lines and crosses). The corresponding boundary probabilities or computed errors are obtained from the exact steady state probability landscape for both MEG1 (Fig. 4A, green dashed line and circles) and MEG2 (Fig. 4B, green dashed line and circles).
Consistent with the statement in Theorem 3, our results show that the true error Err(N) is bounded by the computed boundary probability in MEG1 when N1 ≥ 20 (Fig. 4A, blue dashed lines and crosses, green dashed lines and circles, and the inset). In MEG2, although the true errors are larger than the computed errors even when the MEG size is large (Fig. 4B inset), the true error can be bounded by the computed error multiplied by a factor of 6 when N2 ≥ 5000 (Fig. 4B, blue dashed lines and crosses, green dashed lines and circles, and the inset). This is expected from Theorem 3.
A priori estimated error bound
To examine a priori estimated upper bounds for the truncation errors in MEG1 and MEG2, we follow Eqn. (22) and (23) to assign values of ᾱi = ke and β̲i+1 = km(i + 1) for MEG1. Because of the dependency of protein synthesis on the mRNA copy numbers, we set ᾱi = 64 · kt and β̲i+1 = kd(i + 1) following Eqn. (22) and (23) for MEG2, where the factor 64 is the maximum copy number of mRNA in MEG1. We compute the a priori estimated upper bounds of errors for different truncations of MEG1 and MEG2 using Eqn. (24) (Fig. 4A and B, red solid lines). The true truncation errors and the a priori estimated error bounds of MEG1 and MEG2 all decrease monotonically with increasing MEG sizes (Fig. 4A and B). The computed errors also monotonically decrease in both MEGs. For MEG1, the a priori estimated error bounds coincide with the computed errors (Fig. 4A, red and green lines). For MEG2, the a priori estimated error bounds are larger than the computed errors at all MEG sizes.
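The same assignment of rates can be turned into a simple size-selection routine. The Python sketch below scans MEG sizes upward and stops at the first size whose a priori upper bound drops below a tolerance, first for the mRNA group and then for the protein group, whose largest synthesis rate is the mRNA cap times kt. The rate constants, the tolerance, and the even split of the tolerance between the two MEGs are hypothetical choices made only for illustration.

```python
import math


def min_size_for_tolerance(alpha_max, beta_min, tol, n_limit=100_000):
    """Smallest MEG size N whose a priori upper bound is below tol.

    alpha_max(k): largest aggregated synthesis rate out of level k.
    beta_min(k):  smallest aggregated degradation rate out of level k >= 1.
    Returns (N, bound). Sums are accumulated in log space.
    """
    log_w = 0.0     # log of prod_{k<n} alpha_max(k) / beta_min(k + 1)
    log_sum = 0.0   # log of the sum of these products over levels 0..n
    for n in range(1, n_limit + 1):
        log_w += math.log(alpha_max(n - 1)) - math.log(beta_min(n))
        hi, lo = max(log_sum, log_w), min(log_sum, log_w)
        log_sum = hi + math.log1p(math.exp(lo - hi))
        bound = math.exp(log_w - log_sum)
        if bound < tol:
            return n, bound
    raise ValueError("no size below n_limit meets the tolerance")


# Hypothetical rate constants (not the values of the model in the text).
KE, KM, KT, KD = 1.0, 0.1, 1.0, 0.05
TOL = 1e-6   # overall error tolerance, split evenly between the two MEGs

# MEG1 (mRNA): synthesis KE, degradation KM * copy number.
n1, bound1 = min_size_for_tolerance(lambda k: KE, lambda k: KM * k, TOL / 2)
# MEG2 (Protein): worst-case synthesis uses the mRNA cap n1.
n2, bound2 = min_size_for_tolerance(lambda k: n1 * KT, lambda k: KD * k, TOL / 2)

total_bound = bound1 + bound2
print(n1, n2, total_bound)
```

Because the protein bound uses the worst-case mRNA copy number, it is conservative, in line with the observation that the a priori bound for MEG2 exceeds the computed error.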
Increased probability after state space truncation (Theorem 5)
According to Theorem 5, the probability landscape projected on the MEGs increases after state space truncation. We compute the steady state probability landscapes of Protein obtained using truncations at different sizes of the MEG, ranging from 0 to 2,580 for MEG2 while MEG1 is fixed at 64 (Fig. 4C). The results are compared with the exact steady state landscape computed using MEG2 = 2,600 (Fig. 4C, red line).
Our results show clearly that all probabilities in the landscapes increase when more states are truncated at smaller MEG size (Fig. 4C). The probability landscapes computed using larger size of the MEG (e.g., MEG2 = 1400, Fig. 4C, yellow line) are approaching the exact landscape (Fig. 4C, red line). The probability landscapes obtained using smaller MEG sizes deviate significantly from the exact probability landscape. The smaller the MEG size, the more pronounced the deviation is. These numerical results are fully consistent with Theorem 5.
Truncating additional MEGs does not decrease probabilities (Theorem 6)
We further examine Theorem 6, i.e., the probability landscape projected on one MEG does not decrease with state space truncation at another MEG. We compare the projected steady state probability landscapes on mRNA obtained using truncations of different sizes of MEG2 ranging from 0 to 2,580 while MEG1 is fixed at 64 (Fig. 4D). We compare the results with the exact steady state landscape (Fig. 4D, red line).
Our results show that all probabilities on the landscapes of mRNA are not affected by the truncations at the MEG2 (Fig. 4D). The probability landscapes computed using different sizes of MEG2 are the same (Fig. 4D). These numerical results are completely consistent with Theorem 6, because the probabilities of mRNA are not decreased by the truncation at the MEG of Protein.
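A small numerical experiment illustrates why the mRNA landscape is insensitive to the protein truncation. The Python sketch below builds the transition rate matrix of the single gene model on a truncated state space with reflecting boundaries, solves for the steady state by Gaussian elimination, and compares the mRNA marginals obtained with different protein caps. The rate constants and the very small MEG sizes are hypothetical and chosen only to keep the example fast; the solver is a plain dense one, not the fb-dCME method.

```python
import math


def gauss_solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            if factor:
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def steady_state(n_mrna, n_protein, ke, km, kt, kd):
    """Steady state on {0..n_mrna} x {0..n_protein}, reflecting boundaries."""
    states = [(m, p) for m in range(n_mrna + 1) for p in range(n_protein + 1)]
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    a = [[0.0] * n for _ in range(n)]   # a[target][source], columns sum to 0

    def add(src, dst, rate):
        a[index[dst]][index[src]] += rate
        a[index[src]][index[src]] -= rate

    for m, p in states:
        if m < n_mrna:
            add((m, p), (m + 1, p), ke)         # transcription
        if m > 0:
            add((m, p), (m - 1, p), km * m)     # mRNA degradation
        if m > 0 and p < n_protein:
            add((m, p), (m, p + 1), kt * m)     # translation
        if p > 0:
            add((m, p), (m, p - 1), kd * p)     # protein degradation

    a[-1] = [1.0] * n                   # replace one equation by sum(pi) = 1
    rhs = [0.0] * (n - 1) + [1.0]
    return dict(zip(states, gauss_solve(a, rhs)))


def mrna_marginal(n_mrna, n_protein, ke=1.0, km=0.5, kt=2.0, kd=0.5):
    pi = steady_state(n_mrna, n_protein, ke, km, kt, kd)
    return [sum(pi[(m, p)] for p in range(n_protein + 1))
            for m in range(n_mrna + 1)]


marginals = {n_p: mrna_marginal(5, n_p) for n_p in (3, 10, 20)}

# Reference: birth-death landscape of mRNA alone with the same cap of 5.
weights = [(1.0 / 0.5) ** k / math.factorial(k) for k in range(6)]
reference = [w / sum(weights) for w in weights]
print(marginals[3])
print(reference)
```

In this model the mRNA reactions do not depend on the protein count, so summing the steady state equations over the protein dimension leaves a closed birth-death equation for mRNA, whatever the protein cap.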
Genetic Toggle Switch
The bistable genetic toggle switch consists of two genes repressing each other through binding of their protein dimeric products on the promoter sites of the other genes. This genetic network has been studied extensively [42, 43, 44, 45]. We follow references [45, 24] and study a detailed model of the genetic toggle switch with a more realistic control mechanism of gene regulations. Different from simpler toggle switch models [46, 47, 48, 49], in which gene binding and unbinding reactions are approximated by Hill functions, here details of the gene binding and unbinding reactions are modeled explicitly. The molecular species, reactions, and their rate constants are listed below:
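$$\begin{aligned} &\mathrm{GeneA} \xrightarrow{k_{sA}} \mathrm{GeneA} + A, \qquad \mathrm{GeneB} \xrightarrow{k_{sB}} \mathrm{GeneB} + B, \\ &A \xrightarrow{k_{dA}} \emptyset, \qquad B \xrightarrow{k_{dB}} \emptyset, \\ &2A + \mathrm{GeneB} \rightleftharpoons \mathrm{bGeneB}, \qquad 2B + \mathrm{GeneA} \rightleftharpoons \mathrm{bGeneA}. \end{aligned} \tag{29}$$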
Specifically, two genes GeneA and GeneB express protein products A and B, respectively. Two protein monomers A or B can bind on the promoter site of GeneB or GeneA to form protein-DNA complexes bGeneB or bGeneA, and turn off the expression of GeneB or GeneA, respectively.
Molecular equivalent group (MEG)
There are two MEGs in this network: MEG1 consists of species A and bGeneB, and MEG2 consists of B and bGeneA.
Asymptotic convergence of errors (Theorem 3)
To numerically demonstrate Theorem 3, we compute the true error of the steady state solution to the dCME using sufficiently large sizes of MEG1 = 120 and MEG2 = 80, which give negligible truncation error, with infinitesimally small boundary probabilities of 5.275 × 10⁻²⁴ for MEG1 and 2.561 × 10⁻²³ for MEG2. The solution obtained using these MEG sizes is therefore considered to be exact. With this exact steady state probability landscape, the true truncation error Err(N) at smaller sizes of MEG1 and MEG2 can both be computed using Eqn. (14) (Fig. 5A and B, blue dashed lines and crosses). The corresponding boundary probabilities or computed errors are obtained from the exact steady state probability landscape for both MEG1 (Fig. 5A, green dashed line and circles) and MEG2 (Fig. 5B, green dashed line and circles).
Consistent with the statement in Theorem 3, our results show that the true error Err(N) (Fig. 5A and B, blue dashed lines and crosses) is bounded by the computed boundary probability (Fig. 5A and B, green dashed lines and circles) when the size of the MEG is sufficiently large. The insets in Fig. 5A and B show the ratios of the true errors to the computed errors at different sizes of the MEG, and the grey straight line marks the ratio one. The computed errors are larger than the true errors when the black line is below the grey straight line (Fig. 5A and B, insets). In this example, the computed boundary probability is greater than the true error when MEG1 > 82 and MEG2 > 42, as would be expected from Theorem 3.
A priori estimated error bound
To examine a priori estimated upper bounds for the truncation errors in MEG1 and MEG2, we follow Eqn. (22) and (23) to assign values of ᾱi = ksA and β̲i+1 = [(i + 1) − 2] · kdA for MEG1, where the subscript (i + 1) is the total copy number of species A in the system. The subtraction of 2 is necessary because up to 2 copies of A can be protected from degradation by binding to GeneB. This corresponds to the extreme case when GeneA is constantly turned on and GeneB is constantly turned off. Similarly, we have ᾱi = ksB and β̲i+1 = [(i + 1) − 2] · kdB following Eqn. (22) and (23) for MEG2. This corresponds to the other extreme case when GeneB is constantly turned on and GeneA is constantly turned off. We compute the a priori estimated upper bounds of errors for different truncations of MEG1 and MEG2 using Eqn. (24) (Fig. 5A and B, red solid lines). The true truncation errors and the a priori estimated error bounds of MEG1 and MEG2 all decrease monotonically with increasing MEG sizes (Fig. 5A and B). The computed errors also monotonically decrease when the MEG sizes are larger than 40 for MEG1 and 20 for MEG2. For both MEGs, the a priori estimated error bounds are larger than the computed errors at all MEG sizes. They are also larger than the true errors when the MEG sizes are sufficiently large.
Increased probability after state space truncation (Theorem 5)
According to Theorem 5, the probability landscape projected on the MEGs increases after state space truncation. We first compute the steady state probability landscapes of A obtained using truncations at different sizes of the MEG ranging from 0 to 119 for MEG1 while MEG2 is fixed at 80 (Fig. 5C). The results are compared with the exact steady state landscape computed using MEG1 = 120 and MEG2 = 80 (Fig. 5C, red line). We then similarly examine the steady state probability landscapes of B obtained using truncations at different sizes of MEG2 from 0 to 79 while MEG1 is fixed at 120 (Fig. 5D).
Our results show clearly that all probabilities in the landscapes increase when more states are truncated at smaller MEG sizes (Fig. 5C and D). The probability landscapes computed using larger sizes of MEGs (e.g., MEG1 = 50, Fig. 5C, yellow line and MEG2 = 32, Fig. 5D, yellow line) are approaching the exact landscape (Fig. 5C and D, red line). The probability landscapes obtained using smaller MEGs deviate significantly from the exact probability landscape. The smaller the MEG size, the more significant the deviation is. These numerical results are completely consistent with Theorem 5.
Truncating additional MEGs does not decrease probabilities (Theorem 6)
We further examine Theorem 6, i.e., the probability landscape projected on one MEG does not decrease with state space truncation at another MEG. We first compare the projected steady state probability landscapes on A obtained using truncations of different sizes of MEGs ranging from 0 to 80 for MEG2 while MEG1 is fixed at 120 (Fig. 5E). We compare the results with the exact steady state landscape (Fig. 5E, red line). We also similarly examine the projected steady state probability landscapes of B obtained using truncations at different sizes of MEG1 ranging from 0 to 120 while MEG2 is fixed at 80 (Fig. 5F).
Our results clearly show that all probabilities on the landscapes of MEG1 (MEG2) increase when the state space is truncated at MEG2 (MEG1) (Fig. 5E and F). The probability landscapes computed using larger sizes of MEGs (e.g., MEG2 = 32 in Fig. 5E, yellow line and MEG1 = 50 in Fig. 5F, yellow line) are approaching the exact landscape using MEG1 = 120 and MEG2 = 80 (Fig. 5E and F, red line). However, the probability landscapes using smaller MEGs significantly deviate from the exact probability landscape. The smaller the MEG size, the more significant the deviation is. These numerical results are completely consistent with Theorem 6.
Phage Lambda Bistable Epigenetic Switch
The bistable epigenetic switch for lysogenic maintenance and lytic induction in phage lambda is one of the well-parameterized realistic gene regulatory systems. The efficiency and stability of the switch have been extensively studied [50, 51, 52, 53, 54]. Here we characterize the truncation error of the dCME solutions of the reaction network adapted from Cao et al. [5]. The network consists of 11 different species and 50 different reactions. The detailed reaction schemes and rate constants are shown in Table 1.
Table 1.
Reactions | Rate constants |
---|---|
Synthesis reactions [50, 55, 56, 57] | |
| |
∅ + (OR3 + OR2) → CI2 + (OR3 + OR2) | ksCI2 = 0.0069/s |
∅ + (OR3 + COR2) → CI2 + (OR3 + COR2) | ksCI2 = 0.0069/s |
∅ + (OR3 + ROR2) → CI2 + (OR3 + ROR2) | ks1CI2 = 0.069/s |
∅ + (OR1 + OR2) → Cro2 + (OR1 + OR2) | ksCro2 = 0.0929/s |
| |
Degradation reactions [58, 50] | |
| |
CI2 → ∅ | kdCI2 = 0.0026/s |
Cro2 → ∅ | kdCro2 = 0.0025/s |
| |
Association rate of binding reactions [59] | |
| |
CI2 + OR1 → ROR1 | kbOR1CI2 = 0.021/s |
CI2 + OR2 → ROR2 | kbOR2CI2 = 0.021/s |
CI2 + OR3 → ROR3 | kbOR3CI2 = 0.021/s |
Cro2 + OR1 → COR1 | kbOR1Cro2 = 0.021/s |
Cro2 + OR2 → COR2 | kbOR2Cro2 = 0.021/s |
Cro2 + OR3 → COR3 | kbOR3Cro2 = 0.021/s |
| |
Dissociation reactions - CI2 dissociation from OR1 | |
| |
ROR1 + (OR2) → CI2 + OR1 + (OR2) | 0.00898/s |
ROR1 + (ROR2 + OR3) → CI2 + OR1 + (ROR2 + OR3) | 0.00011/s |
ROR1 + (ROR2 + ROR3) → CI2 + OR1 + (ROR2 + ROR3) | 0.01242/s |
ROR1 + (ROR2 + COR3) → CI2 + OR1 + (ROR2 + COR3) | 0.00011/s |
ROR1 + (COR2) → CI2 + OR1 + (COR2) | 0.00898/s |
| |
Dissociation reactions - CI2 dissociation from OR2 | |
| |
ROR2 + (OR1 + OR3) → CI2 + OR2 + (OR1 + OR3) | 0.2297/s |
ROR2 + (ROR1 + OR3) → CI2 + OR2 + (ROR1 + OR3) | 0.0029/s |
ROR2 + (OR1 + ROR3) → CI2 + OR2 + (OR1 + ROR3) | 0.0021/s |
ROR2 + (ROR1 + ROR3) → CI2 + OR2 + (ROR1 + ROR3) | 0.0029/s |
ROR2 + (COR1 + OR3) → CI2 + OR2 + (COR1 + OR3) | 0.2297/s |
ROR2 + (OR1 + COR3) → CI2 + OR2 + (OR1 + COR3) | 0.2297/s |
ROR2 + (COR1 + COR3) → CI2 + OR2 + (COR1 + COR3) | 0.2297/s |
ROR2 + (ROR1 + COR3) → CI2 + OR2 + (ROR1 + COR3) | 0.0029/s |
ROR2 + (COR1 + ROR3) → CI2 + OR2 + (COR1 + ROR3) | 0.0021/s |
| |
Dissociation reactions - CI2 dissociation from OR3 | |
| |
ROR3 + (OR2) → CI2 + OR3 + (OR2) | 1.13/s |
ROR3 + (ROR2 + OR1) → CI2 + OR3 + (ROR2 + OR1) | 0.0106/s |
ROR3 + (ROR2 + ROR1) → CI2 + OR3 + (ROR2 + ROR1) | 0.0106/s |
ROR3 + (ROR2 + COR1) → CI2 + OR3 + (ROR2 + COR1) | 0.0106/s |
ROR3 + (COR2) → CI2 + OR3 + (COR2) | 1.13/s |
| |
Dissociation reactions - Cro dissociation from OR1 | |
| |
COR1 + (OR2) → Cro2 + OR1 + (OR2) | 0.0202/s |
COR1 + (ROR2) → Cro2 + OR1 + (ROR2) | 0.0202/s |
COR1 + (COR2 + OR3) → Cro2 + OR1 + (COR2 + OR3) | 0.0040/s |
COR1 + (COR2 + ROR3) → Cro2 + OR1 + (COR2 + ROR3) | 0.0040/s |
COR1 + (COR2 + COR3) → Cro2 + OR1 + (COR2 + COR3) | 0.0040/s |
| |
Dissociation reactions - Cro dissociation from OR2 | |
| |
COR2 + (OR1 + OR3) → Cro2 + OR2 + (OR1 + OR3) | 0.1413/s |
COR2 + (ROR1 + OR3) → Cro2 + OR2 + (ROR1 + OR3) | 0.1413/s |
COR2 + (OR1 + ROR3) → Cro2 + OR2 + (OR1 + ROR3) | 0.1413/s |
COR2 + (ROR1 + ROR3) → Cro2 + OR2 + (ROR1 + ROR3) | 0.1413/s |
COR2 + (COR1 + OR3) → Cro2 + OR2 + (COR1 + OR3) | 0.0279/s |
COR2 + (OR1 + COR3) → Cro2 + OR2 + (OR1 + COR3) | 0.053/s |
COR2 + (COR1 + COR3) → Cro2 + OR2 + (COR1 + COR3) | 0.0328/s |
COR2 + (ROR1 + COR3) → Cro2 + OR2 + (ROR1 + COR3) | 0.053/s |
COR2 + (COR1 + ROR3) → Cro2 + OR2 + (COR1 + ROR3) | 0.0279/s |
| |
Dissociation reactions - Cro dissociation from OR3 | |
| |
COR3 + (OR2) → Cro2 + OR3 + (OR2) | 0.0022/s |
COR3 + (ROR2) → Cro2 + OR3 + (ROR2) | 0.0022/s |
COR3 + (COR2 + OR1) → Cro2 + OR3 + (COR2 + OR1) | 0.0008/s |
COR3 + (COR2 + ROR1) → Cro2 + OR3 + (COR2 + ROR1) | 0.0008/s |
COR3 + (COR2 + COR1) → Cro2 + OR3 + (COR2 + COR1) | 0.003/s |
Molecular equivalent group (MEG)
The network can be partitioned into two MEGs. MEG1 consists of the dimer of CI protein, CI2, and all complexes of operator sites bound to CI2. MEG2 consists of the dimer of Cro protein, Cro2, and all complexes of operator sites bound to Cro2.
Asymptotic convergence of errors (Theorem 3)
To numerically demonstrate Theorem 3, we compute the true error of the steady state solution to the dCME using sufficiently large sizes of MEG1 = 80 and MEG2 = 38, which give negligible truncation error, with infinitesimally small boundary probabilities of 6.96 × 10⁻³¹ for MEG1 and 3.95 × 10⁻³² for MEG2. The solution obtained using these MEG sizes is therefore considered to be exact. With this exact steady state probability landscape, the true truncation error Err(N) at smaller sizes of MEG1 and MEG2 can both be computed using Eqn. (14) (Fig. 6A and B, blue dashed lines and crosses). The corresponding boundary probabilities or computed errors are obtained from the exact steady state probability landscape for both MEG1 (Fig. 6A, green dashed line and circles) and MEG2 (Fig. 6B, green dashed line and circles).
Consistent with the statement in Theorem 3, our results show that the true error Err(N) (Fig. 6A and B, blue dashed lines and crosses) is bounded by the computed boundary probability (Fig. 6A and B, green dashed lines and circles) when the size of the MEG is sufficiently large. The insets in Fig. 6A and B show the ratios of the true errors to the computed errors at different sizes of the MEG, and the grey straight lines mark the ratio one. The computed errors are larger than the true errors when the black line is below the grey straight line (Fig. 6A and B, insets). In this example, the computed boundary probability is greater than the true error when MEG1 ≥ 24 and MEG2 ≥ 3, as would be expected from Theorem 3.
A priori estimated error bound
To examine a priori estimated upper bounds for the truncation errors in MEG1 and MEG2, we follow Eqn. (22) and (23) to assign values of ᾱi = ks1CI2 and β̲i+1 = [(i + 1) − 3] · kdCI2 for MEG1, where the subscript (i + 1) is the total copy number of species CI2 in the system. The subtraction of 3 is necessary because up to 3 copies of CI2 can be protected from degradation by binding to operator sites OR1, OR2, and OR3. Similarly, we have ᾱi = ksCro2 and β̲i+1 = [(i + 1) − 3] · kdCro2 following Eqn. (22) and (23) for MEG2. We compute the a priori estimated upper bounds of errors for different truncations of MEG1 and MEG2 using Eqn. (24) (Fig. 6A and B, red solid lines). The true truncation errors and the a priori estimated error bounds of MEG1 and MEG2 all decrease monotonically with increasing MEG sizes (Fig. 6A and B). The computed errors also monotonically decrease when the MEG sizes are larger than 13 for MEG1 and 4 for MEG2. For both MEGs, the a priori estimated error bounds are larger than the computed errors at all MEG sizes. They are also larger than the true errors when the MEG sizes are sufficiently large.
Increased probability after state space truncation (Theorem 5)
According to Theorem 5, the probability landscape projected on the MEGs increases after state space truncation. We first compute the steady state probability landscapes of CI2 obtained by truncating MEG1 at different sizes ranging from 0 to 80 while MEG2 is fixed at 38 (Fig. 6C). The results are compared with the exact steady state landscape computed using MEG1 = 80 and MEG2 = 38 (Fig. 6C, red line). We then similarly examine the steady state probability landscapes of Cro2 obtained by truncating MEG2 at different sizes from 0 to 38 while MEG1 is fixed at 80 (Fig. 6D).
Our results show clearly that all probabilities in the landscapes increase when more states are truncated at smaller MEG sizes (Fig. 6C and D). The probability landscapes computed using larger sizes of MEGs (e.g., MEG1 = 30, Fig. 6C, yellow line and MEG2 = 8, Fig. 6D, yellow line) are approaching the exact landscape (Fig. 6C and D, red line). The probability landscapes obtained using smaller MEGs deviate significantly from the exact probability landscape. The smaller the MEG size, the more significant the deviation is. These numerical results are completely consistent with Theorem 5.
Truncating additional MEGs does not decrease probabilities (Theorem 6)
We further examine Theorem 6, i.e., the probability landscape projected on one MEG does not decrease with state space truncation at another MEG. We first compare the projected steady state probability landscapes on CI2 obtained by truncating MEG2 at different sizes ranging from 0 to 38 while MEG1 is fixed at 80 (Fig. 6E). We compare the results with the exact steady state landscape (Fig. 6E, red line). We also similarly examine the projected steady state probability landscapes of Cro2 obtained by truncating MEG1 at different sizes ranging from 0 to 80 while MEG2 is fixed at 38 (Fig. 6F).
Our results show that all probabilities on the landscapes of MEG1 (MEG2) increase when the state space is truncated at MEG2 (MEG1) (Fig. 6E and F). The probability landscapes computed using larger sizes of MEGs (e.g., MEG2 = 8 in Fig. 6E, yellow line and MEG1 = 30 in Fig. 6F, yellow line) are approaching the exact landscape using MEG1 = 80 and MEG2 = 38 (Fig. 6E and F, red line). However, the probability landscapes using smaller MEGs significantly deviate from the exact probability landscape. The smaller the MEG size, the more significant the deviation is. These numerical results are completely consistent with Theorem 6.
Discussions and Conclusions
Solving the discrete chemical master equation (dCME) is of fundamental importance for studying stochasticity in reaction networks. The main challenges are the discrete nature of the states and the difficulty in enumerating these states, as the size of the state space expands rapidly when the network becomes more complex. In this study, we describe a novel approach for state space truncation. Instead of taking a high dimensional hypercube as the truncated state space, we introduce the concept of molecular equivalence group (MEG), and truncate the state space into the same or lower dimensional simplexes, with the same effective copy number of molecules in each dimension by taking advantage of the principle of mass conservation. For complex networks, the reduction of the size of the state space can be dramatic.
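To give a feel for the size difference between a hypercube and a simplex truncation, the following Python sketch counts states for a hypothetical MEG of d interconvertible species whose total copy number is capped at N. It assumes every species counts as one unit toward the cap, which need not hold for a real MEG (for example, when a complex contains several copies of a protein), and the function names are illustrative.

```python
from math import comb


def hypercube_states(n_species, max_copy):
    """States when each species is capped independently at max_copy."""
    return (max_copy + 1) ** n_species


def simplex_states(n_species, max_total):
    """States when only the total copy number is capped at max_total,
    i.e. the number of non-negative integer vectors with sum <= max_total."""
    return comb(max_total + n_species, n_species)


for d, n in [(2, 2), (5, 50), (10, 50)]:
    cube, simplex = hypercube_states(d, n), simplex_states(d, n)
    print(f"d={d:2d} N={n:3d} hypercube={cube} simplex={simplex} "
          f"ratio={cube / simplex:.1f}")
```

For five species capped at 50 copies, the simplex holds 3,478,761 states against 345,025,251 for the hypercube, and the gap widens quickly with the number of species.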
Our study addresses a key issue in obtaining direct solutions to the dCME. As state space truncation is inevitable, it is important to quantify the errors of such truncations, so the accuracy of the dCME solutions can be assessed and managed. We have developed a general theoretical framework for quantifying the errors of state space truncation on the steady state probability landscape. By decomposing the reaction network into MEGs, the error contribution from each individual MEG is quantified. This critically important task is made possible through analyzing the states on the reflecting boundary and their associated steady state probabilities. The boundary probability analysis is based on the construction of an aggregated continuous-time Markov process by factoring the state space according to the total numbers of molecules in each MEG. With explicit formulas for calculating conservative error bounds for the steady state, one can easily calculate the a priori error bounds for any given size of a MEG. Furthermore, our theory allows the determination of the minimally required sizes of MEGs if a predefined error tolerance is to be satisfied. As shown in the examples, to determine the appropriate MEG sizes a priori, one can first calculate the estimated errors at different sizes of each MEG, and choose the minimal MEG sizes that satisfy the overall error tolerance. This eliminates the need for multiple iterations of costly trial computations to solve the dCME for determining the appropriate total copy numbers necessary to ensure small truncation errors. This is advantageous over conventional numerical techniques, where errors are typically assessed through post-processing of trial solutions.
In complex networks, state truncation in one molecular group may affect the errors of other molecular groups. By partitioning the network into separate molecular equivalence groups (MEGs), the mutual influence of state truncations in different groups can be reduced. In such cases, we have proved that the asymptotic errors in any truncated MEG will not be under-estimated because of state truncations in other MEGs. Based on this conclusion, one can increase the size of each particular MEG in order to achieve a small truncation error for that MEG. When the truncation error for every MEG is below the prescribed threshold of error tolerance, the total truncation error of the whole state space is guaranteed to be bounded by the sum of the individual truncation errors of the MEGs.
While our method ensures that there is no mass exchange between different MEGs, and couplings between MEGs are often weak, it does not rule out the existence of strong couplings among MEGs. In the example of the single gene expression model, there is a strong coupling between the mass-isolated mRNA MEG and the protein MEG. In this case, protein synthesis strongly depends on the amount of available mRNA. As a result, the protein probability distribution can be heavily influenced by the choice of the mRNA MEG size, and its peak is shifted when the mRNA MEG is near exhaustion (data not shown). This issue rapidly disappears when MEG sizes become sufficiently large to ensure that the truncation error is smaller than the specified error tolerance (Fig. 4).
Our method differs from the finite state projection (FSP) method [23, 30], which employs an absorbing boundary state to calculate the truncation error. Transitions from any state in the available finite state space to any outside state are sent to the absorbing state, and the reactions are made irreversible. The truncation error in the FSP method is taken as the probability mass on the absorbing boundary state. It has two components: one from the probability mass lost due to the state truncation, the other from the probability mass trapped due to the absorbing nature of the boundary state. As time proceeds, the trapped probability mass on the absorbing state will grow and dominate. At the steady state, all probability mass will be trapped in the absorbing state, which can no longer reflect the truncated probability mass. Therefore, the FSP method cannot be used to study either the long-term or the steady state behavior of a stochastic network.
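The contrast between an absorbing and a reflecting boundary can be seen on a small birth-death chain. The sketch below is only an illustration of the two boundary conditions: it integrates the master equation by forward Euler rather than by the matrix exponential techniques used in FSP implementations, and the rates, the truncation size of 3, and the function name are hypothetical.

```python
KS, KD, N = 1.0, 1.0, 3  # hypothetical rates and truncation size


def euler_evolve(absorbing, t_end, dt=0.005):
    """Integrate dp/dt = A p for a birth-death chain on 0..N by forward Euler.

    absorbing=True : synthesis out of state N leaks into an extra sink state.
    absorbing=False: synthesis out of state N is removed (reflecting boundary).
    Returns (probabilities of states 0..N, probability mass in the sink).
    """
    p = [1.0] + [0.0] * N  # start with zero molecules
    sink = 0.0
    for _ in range(int(round(t_end / dt))):
        dp = [0.0] * (N + 1)
        leak = 0.0
        for n in range(N + 1):
            deg = KD * n * p[n]  # probability flux n -> n - 1
            if n > 0:
                dp[n] -= deg
                dp[n - 1] += deg
            syn = KS * p[n]  # probability flux n -> n + 1
            if n < N:
                dp[n] -= syn
                dp[n + 1] += syn
            elif absorbing:
                dp[n] -= syn
                leak += syn
        p = [x + dt * d for x, d in zip(p, dp)]
        sink += dt * leak
    return p, sink


for t in (1.0, 10.0, 200.0):
    _, absorbed = euler_evolve(True, t)
    p_reflect, _ = euler_evolve(False, t)
    print(f"t={t:6.1f}  absorbed mass={absorbed:.4f}  "
          f"reflecting boundary probability={p_reflect[-1]:.4f}")
```

With the absorbing boundary the sink mass keeps growing toward 1, whereas with the reflecting boundary the probability of the boundary state settles to a stationary value.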
In contrast, our method employs a reflecting boundary and can characterize the truncation errors in the steady state. All transitions between boundary and non-boundary states are retained after state space truncation, and the reversible nature of transitions is unaltered. The reflecting boundaries allow analysis of the steady state truncation error of each MEG. Our method can therefore be used to study the steady state probability landscape. Furthermore, our method also allows direct computation of the distribution of first passage time, an important problem in studying rare events in biological networks that currently relies heavily on sampling techniques.
We have also provided computational results for four stochastic networks: the birth-death process, consisting of one MEG, and the single gene expression model, the genetic toggle switch model, and the phage lambda epigenetic switch model, each consisting of two MEGs. By comparing true errors, computed errors, and a priori estimated errors at different truncation sizes, we have numerically verified the theorems presented in this study. First, the true error for truncating a MEG is bounded by the total probability mass on the reflecting boundary of the MEG (Theorem 3). Second, the projected probability on one MEG increases upon the state space truncation at this MEG (Theorem 5). Third, the projected probability on one MEG also increases when the state space is truncated at another MEG (Theorem 6). Furthermore, we show that the a priori estimated error bounds are effective when the network is truncated at a sufficiently large MEG size.
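The first two checks can be reproduced in miniature on a birth-death chain, whose untruncated steady state is a Poisson distribution with mean λ = synthesis rate / degradation rate. The sketch below is an illustration under assumptions, not the computation reported in this study: the true error is taken here as the Poisson probability mass above N, the value λ = 10 and the function names are hypothetical, and the closed-form steady state replaces a dCME solver. In this toy case the boundary bound is guaranteed once N + 1 ≥ 2λ (a property of the Poisson tail) and can fail for smaller N, which is consistent with the bound being asymptotic.

```python
from math import exp


def truncated_steady_state(lam, N):
    """Steady state on 0..N with a reflecting boundary; lam = synthesis / degradation."""
    weights = [1.0]
    for n in range(1, N + 1):
        weights.append(weights[-1] * lam / n)
    total = sum(weights)
    return [w / total for w in weights]


def true_error(lam, N, n_terms=400):
    """Probability mass of the untruncated (Poisson) steady state above N."""
    term = exp(-lam)
    for n in range(1, N + 1):
        term *= lam / n  # Poisson probability at N after the loop
    tail = 0.0
    for n in range(N + 1, N + 1 + n_terms):
        term *= lam / n
        tail += term  # summed directly to avoid cancellation in 1 - CDF
    return tail


lam = 10.0  # hypothetical ratio of synthesis to degradation rate
for N in (15, 20, 25, 30):
    boundary = truncated_steady_state(lam, N)[-1]
    err = true_error(lam, N)
    print(f"N={N}: true error={err:.3e}, boundary probability={boundary:.3e}, "
          f"bounded={err <= boundary}")

# Truncating at N instead of N + 1 does not lower any state probability.
small = truncated_steady_state(lam, 20)
large = truncated_steady_state(lam, 21)
print(all(a >= b for a, b in zip(small, large)))
```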
Recent studies based on tensor representation of the transition rate matrices show that the storage requirement of solving the CME can be significantly reduced and the computational time improved [49, 60]. However, accurate tensor representation and tensor-based approximation strongly depend on the separability of system states, that is, whether the system can be decomposed into a number of relatively independent smaller sub-systems [61, 49]. While complete separability can be achieved in some cases, e.g. the one-dimensional quantum spin system [61], errors are generally unknown for biological networks that are not fully separable.
The tensor method of Liao et al. can reduce the state space dramatically for a number of networks [60]. For example, the size of the state space of the Fokker-Planck equation of the Schlögl model is reduced from 2.74 × 10¹¹ to 4.01 × 10³ + 2.07 × 10⁵, a reduction factor of about 10⁶. It will be interesting to further assess the reduction factor if the full discrete CMEs instead of the Fokker-Planck equations of these network models are solved, so that a direct comparison can be carried out.
Our finite buffer approach compares favorably with the tensor train method of [49] for the network of enzymatic futile cycles [40]. This network is a closed system, and technically no finite buffer is required when the enumerated states fit into computer memory; analysis of the truncation error is therefore unnecessary. Regardless, our approach of state enumeration leads to a state space of only 1,071 microstates, a reflection of the O(n!) order of reduction. In contrast, the tensor train method is based on a state space of size 2²² = 4.19 × 10⁶. Using our finite buffer method, both the time-evolving and the steady state probability landscapes can be computed efficiently in < 10 seconds (data not shown), whereas the tensor-train method requires 1.52 × 10⁴ seconds to compute the time evolution up to t = 1, as reported in [49]. For the toggle switch model, computing the time evolution of the probability landscape up to t = 30 seconds requires 14,541 seconds, or about 4 hours of wall clock time, using the tensor-train method [49]. Our method completes the computation of the steady state probability landscape in ca. 3,300 seconds, or 55 minutes of wall clock time.
We further note that our work complements tensor-based methods [49, 60]. Tensor-based methods directly reduce the storage of the transition rate matrices [49], without altering the hypercubic nature of the underlying state space. In contrast, our method first reduces the state space by a factor of O(n!), leading to a dramatically reduced transition rate matrix. It is possible that there exist alternative approaches to construct tensors of the transition rate matrix without assuming that the truncated state space is a hypercube as is the case in [49]. Whether our approach can be useful for further reduction of storage and computational speed-up is a possible direction for future exploration.
Overall, we have introduced an efficient method for state space truncation and have developed theory to quantify the errors of state space truncations. Results presented here provide a general framework for high precision numerical solutions to a dCME. It is envisioned that the approach of direct solution of a dCME can be broadly applied to many stochastic reaction networks, such as those found in systems biology and in synthetic biology.
Acknowledgments
This work is supported by NIH grant GM079804, NSF grant MCB1415589, and the Chicago Biomedical Consortium with support from the Searle Funds at The Chicago Community Trust. We thank Dr. Ao Ma for helpful discussions and comments. YC is also supported by the LDRD program of CNLS at LANL.
APPENDIX
Proof of Lemma 1
Proof
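By sorting the state space according to the partition Ω̃(∞) and re-constructing the transition rate matrix Ã in Eqn. (6), the dCME can be re-written as $d\tilde{p}^{(\infty)}(t)/dt = \tilde{A}\,\tilde{p}^{(\infty)}(t)$, where p̃(∞) is the probability distribution on the partitioned state space. We sum the master equations over all microstates in each group 𝒢i and obtain a separate aggregated equation for each group. As the re-ordered matrix Ã is block tri-diagonal, the summed discrete chemical master equation reduces to:
$$
\begin{aligned}
\frac{d\,\mathbf{1}^T \tilde{p}^{(\infty)}(\mathcal{G}_0, t)}{dt} &= \mathbf{1}^T A_{0,0}\,\tilde{p}^{(\infty)}(\mathcal{G}_0, t) + \mathbf{1}^T A_{0,1}\,\tilde{p}^{(\infty)}(\mathcal{G}_1, t), \\
\frac{d\,\mathbf{1}^T \tilde{p}^{(\infty)}(\mathcal{G}_1, t)}{dt} &= \mathbf{1}^T A_{1,0}\,\tilde{p}^{(\infty)}(\mathcal{G}_0, t) + \mathbf{1}^T A_{1,1}\,\tilde{p}^{(\infty)}(\mathcal{G}_1, t) + \mathbf{1}^T A_{1,2}\,\tilde{p}^{(\infty)}(\mathcal{G}_2, t), \\
&\;\;\vdots \\
\frac{d\,\mathbf{1}^T \tilde{p}^{(\infty)}(\mathcal{G}_i, t)}{dt} &= \mathbf{1}^T A_{i,i-1}\,\tilde{p}^{(\infty)}(\mathcal{G}_{i-1}, t) + \mathbf{1}^T A_{i,i}\,\tilde{p}^{(\infty)}(\mathcal{G}_i, t) + \mathbf{1}^T A_{i,i+1}\,\tilde{p}^{(\infty)}(\mathcal{G}_{i+1}, t), \\
&\;\;\vdots
\end{aligned}
\tag{30}
$$
The overall probability change of each group 𝒢i depends on the probability vector p̃(∞)(𝒢i, t) itself, as well as the probability vectors p̃(∞)(𝒢i−1, t) and p̃(∞)(𝒢i+1, t) of the immediate neighboring groups. It also depends on the rates of synthesis and degradation reactions in the elements of $A_{i,i-1}$ and $A_{i,i+1}$, respectively, as well as the rates of coupling reactions in $A_{i,i}$. From the definition of the transition rate matrix given in Eqn. (3), every column of Ã sums to zero, so we have:
$$
\begin{aligned}
\mathbf{1}^T A_{0,0} + \mathbf{1}^T A_{1,0} &= \mathbf{0}^T, \\
\mathbf{1}^T A_{0,1} + \mathbf{1}^T A_{1,1} + \mathbf{1}^T A_{2,1} &= \mathbf{0}^T, \\
&\;\;\vdots \\
\mathbf{1}^T A_{i-1,i} + \mathbf{1}^T A_{i,i} + \mathbf{1}^T A_{i+1,i} &= \mathbf{0}^T, \\
&\;\;\vdots
\end{aligned}
\tag{31}
$$
At the steady state, when all $d\,\mathbf{1}^T \tilde{p}^{(\infty)}(\mathcal{G}_i, t)/dt = 0$, we combine line 1 of Eqn. (30) and line 1 of Eqn. (31), and obtain:
$$
(\mathbf{1}^T A_{1,0})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_0) = (\mathbf{1}^T A_{0,1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_1).
$$
From line 2 of Eqn. (30) at steady state and after incorporating line 1 of Eqn. (31), we have $(\mathbf{1}^T A_{1,2})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_2) = (\mathbf{1}^T A_{0,0})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_0) - (\mathbf{1}^T A_{1,1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_1)$. After further incorporating line 1 of Eqn. (30) at steady state, we have $(\mathbf{1}^T A_{1,2})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_2) = -(\mathbf{1}^T A_{0,1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_1) - (\mathbf{1}^T A_{1,1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_1)$. Incorporating line 2 of Eqn. (31), we have:
$$
(\mathbf{1}^T A_{1,2})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_2) = (\mathbf{1}^T A_{2,1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_1).
$$
Assuming $(\mathbf{1}^T A_{i,i-1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i-1}) = (\mathbf{1}^T A_{i-1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i)$, we have from the line of Eqn. (30) for group 𝒢i at the steady state:
$$
\begin{aligned}
(\mathbf{1}^T A_{i,i+1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1}) &= -(\mathbf{1}^T A_{i,i-1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i-1}) - (\mathbf{1}^T A_{i,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i) \\
&= -(\mathbf{1}^T A_{i-1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i) - (\mathbf{1}^T A_{i,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i).
\end{aligned}
\tag{32}
$$
With the line of Eqn. (31) for group 𝒢i, we further have:
$$
(\mathbf{1}^T A_{i,i+1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1}) = (\mathbf{1}^T A_{i+1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i).
$$
Overall, by induction, we have:
$$
(\mathbf{1}^T A_{i+1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i) = (\mathbf{1}^T A_{i,i+1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1}), \qquad i = 0, 1, \ldots
\tag{33}
$$
As both sides are constants, we can find $\alpha_i$ and $\beta_{i+1}$ such that:
$$
(\mathbf{1}^T A_{i+1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i) = \alpha_i\,\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_i)
\quad\text{and}\quad
(\mathbf{1}^T A_{i,i+1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1}) = \beta_{i+1}\,\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1})
\tag{34}
$$
for all i = 0, 1, …, where i is the total copy number of the MEG. We obviously have:
$$
\alpha_i = \frac{(\mathbf{1}^T A_{i+1,i})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_i)}{\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_i)}
\quad\text{and}\quad
\beta_{i+1} = \frac{(\mathbf{1}^T A_{i,i+1})\,\tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1})}{\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1})},
$$
where $\alpha_i$ is the sum of the column sums of sub-matrix $A_{i+1,i}$ weighted by the steady state probability distribution π̃(∞) on group 𝒢i, and $\beta_{i+1}$ is the sum of the column sums of sub-matrix $A_{i,i+1}$ weighted by the steady state probability distribution on group 𝒢i+1.
As $\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_i)$ is the total steady state probability mass over states in group 𝒢i, we substitute Eqn. (34) back into Eqn. (33) and obtain the following relationship for the steady state distribution on the partitions of Ω̃(∞):
$$
\alpha_i\,\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_i) = \beta_{i+1}\,\mathbf{1}^T \tilde{\pi}^{(\infty)}(\mathcal{G}_{i+1}), \qquad i = 0, 1, \ldots
\tag{35}
$$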
The steady state solution to Eqn. (35) is equivalent to the steady state solution of a dCME with the transition rate matrix B defined as in Eqn. (9).
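The balance between aggregated synthesis and degradation fluxes established in this proof can be checked numerically on a small example. The sketch below is an illustration under assumptions, not a network from this study: a hypothetical gene that switches between an off and an on state (coupling reactions within a group), synthesizes protein only when on, and loses protein by first-order degradation, with a reflecting boundary at N protein copies. The rates are arbitrary, and the steady state is found by power iteration on the uniformized chain rather than by the solver used for the results in this study.

```python
from math import isclose

# Hypothetical rates and buffer size, chosen only for illustration.
K_ON, K_OFF, KS, KD, N = 0.5, 0.5, 5.0, 1.0, 10


def transitions(state):
    """Outgoing (target, rate) pairs for microstate (gene, protein)."""
    g, n = state
    out = [((1 - g, n), K_ON if g == 0 else K_OFF)]  # coupling within a group
    if g == 1 and n < N:
        out.append(((1, n + 1), KS))  # synthesis, blocked at n = N
    if n > 0:
        out.append(((g, n - 1), KD * n))  # degradation
    return out


def steady_state(n_iter=5000):
    """Power iteration on the uniformized chain P = I + A / rate_cap."""
    states = [(g, n) for n in range(N + 1) for g in (0, 1)]
    moves = {s: transitions(s) for s in states}
    rate_cap = 1.1 * max(sum(r for _, r in m) for m in moves.values())
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(n_iter):
        new = dict.fromkeys(states, 0.0)
        for s, ps in p.items():
            stay = ps
            for target, rate in moves[s]:
                flow = ps * rate / rate_cap
                new[target] += flow
                stay -= flow
            new[s] += stay
        p = new
    return p


pi = steady_state()
for n in range(N):
    mass_n = pi[(0, n)] + pi[(1, n)]          # total probability of group n
    mass_next = pi[(0, n + 1)] + pi[(1, n + 1)]
    alpha_n = KS * pi[(1, n)] / mass_n        # weighted column sums of A_{n+1,n}
    beta_next = KD * (n + 1)                  # weighted column sums of A_{n,n+1}
    balanced = isclose(alpha_n * mass_n, beta_next * mass_next, rel_tol=1e-6)
    print(f"n={n}: alpha={alpha_n:.4f}, beta={beta_next:.1f}, balanced={balanced}")
```

Here each group collects the two microstates with the same protein copy number, so the aggregated process is a birth-death chain with rates alpha and beta.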
Proof of Lemma 2
Proof
If held, then there would be an infinite number of terms . There should exist an integer N′ such that for all i > N′, we have . According to Eqn. (35), we would have in the steady state for all i > N′. This contradicts the assumption of a finite system, as the total probability mass on boundary states would increase monotonically as the net molecular copy number of the network increases beyond N′. This would make the overall system a pure-birth process. Therefore, for a finite biological system, we have Eqn. (15).
Proof of Theorem 3
Proof
From Eqn. (13), we can first derive an explicit expression of the true error Err(N) using the aggregated synthesis and degradation rates and given in Eqn. (8):
(36) |
From Eqn. (36), Eqn. (13), and Lemma 2, we have:
(37) |
When N is sufficiently large, sup from Lemma 2, the terms in the infinite series then form a converging geometric series. Therefore, we have
and the following inequality holds:
Let M ∈ {N, …, ∞} be the integer such that , we have the following inequality equivalent to Inequality (16):
Proof of Theorem 5
Proof
We first consider two truncated state spaces Ω̃(N) and Ω̃(N+1). Following Eqn. (30), two finite sets of the block chemical master equation can be constructed for these two state spaces. The first set containing N equations is built on the state space Ω̃(N).
(38) |
The second set is built on the state space Ω̃(N+1) containing N + 1 equations.
(39) |
At steady state, the left-hand sides of the equations are zero. For the first N equations, the corresponding block matrices are the same for both state spaces Ω̃(N) and Ω̃(N+1). We can then subtract the right-hand side of Eqn. (39) from that of Eqn. (38) and obtain the following steady state equations:
(40) |
where is the steady state probability difference between the state group 𝒢i in the dCME on Ω̃(N) and Ω̃(N+1). However, the block sub-matrix AN,N of the boundary group 𝒢N is different between the two state spaces. From the construction of the aggregated dCME matrix Ã, columns of the full matrices Ã(N+1) over Ω̃(N+1) and ÃN over Ω̃N all sum to 0 (see Eqn 31). We use to denote the block sub-matrix of the group 𝒢N for the state space Ω̃(N), and use to denote the corresponding block sub-matrix for the state space Ω̃(N+1). From the N-th line of the truncated version of Eqn (31), we have for Ω̃(N+1) and for Ω̃(N). Since , we have the following property
(41) |
We also have
(42) |
From Eqn. (38), we have for the steady state the probability of the state group 𝒢N over the state space Ω̃(N) as:
(43) |
From Eqn. (39), we have for the steady state the probability of the state group 𝒢N and 𝒢N+1 over the state space Ω̃(N+1) as:
(44) |
and
(45) |
respectively.
As , we subtract Eqn. (44) from Eqn. (43), and obtain:
It can be re-written by applying the matrix property of Eqn. (41) as:
By using the matrix property in Eqn. (42), we can further re-write it as:
From Eqn. (45), the last two terms sum to 0. Therefore, we obtain the (N + 1)-st equation of the steady state probability difference as:
Taken together, we have the set of equations for steady state probability differences for all N + 1 blocks as:
(46) |
where all block sub-matrices are identical between those over the state spaces Ω̃(N) and Ω̃(N+1). We therefore obtain the set of equations of differences in steady state probability equivalent to Eqn. (33):
(47) |
which produces the same steady state solution as that of Eqn. (33) after scaling by a constant. As the probability vector solution to Eqn. (33) has non-negative elements, this equivalence implies that all elements in each Δπi have the same sign. As the total steady state probability mass in each of the two state spaces sums to 1,
we therefore know that the total probability difference is non-negative:
Therefore, the probability difference of each individual 𝒢i between two state spaces must be non-negative:
This can be generalized. As N increases to infinity, we have:
Proof of Theorem 6
Proof
For convenience, we use M = Ni to denote the maximum net copy number in the truncated i-th MEG. We first aggregate the state space Ω(ℐj) into infinitely many groups {𝒢0, 𝒢1, …, 𝒢M, 𝒢M+1, …} according to the net copy number in the i-th MEG. We then re-construct the permuted matrix Ã(ℐj) according to this aggregation. We have:
(48) |
where the subscripts m and n of each block matrix indicate the actual net copy numbers of the corresponding aggregated states of the i-th MEG. Next, we further partition the matrix into four blocks by truncating the i-th MEG at the maximum copy number of M. Specifically, in the right-hand side of Eqn. (48) is the north-west corner sub-matrix of Ã(ℐj), which contains all transitions between microstates in the state space Ω(ℐi,j):
(49) |
is the north-east corner sub-matrix of Ã(ℐj), which contains all transitions from microstates in state space Ω(ℐj)/Ω(ℐi,j) to microstates in state space Ω(ℐi,j):
(50) |
is the south-west corner sub-matrix of Ã(ℐj), which contains all transitions from microstates in state space Ω(ℐi,j) to microstates in state space Ω(ℐj)/Ω(ℐi,j):
(51) |
and is the south-east corner sub-matrix of Ã(ℐj), which contains all transitions between microstates in state space Ω(ℐj)/Ω(ℐi,j):
(52) |
We now truncate the state space at the maximum copy number M of the i-th MEG. A matrix A(ℐi,j) on the truncated state space Ω(ℐi,j) using the same partition {𝒢0, 𝒢1, …, 𝒢M} can be constructed as:
(53) |
Similar to the matrix Ã in Eqn. (6), both matrices Ã(ℐi,j) and Ã(ℐj) are tri-diagonal matrices with and for any |m − n| > 1.
Matrix Ã(ℐi,j) and sub-matrix reside on the same state space Ω(ℐi,j) and have exactly the same permutation, i.e., the matrix elements and describe the same transitions between microstates xm, xn ∈ Ω(ℐi,j) ⊂ Ω(ℐj). Only diagonal elements in have different rates. By construction, 𝒢M and 𝒢M+1 are the only two aggregated groups that are involved in transitions between states across the boundary of Ω(ℐi,j). The sub-matrix is the only nonzero sub-matrix in , which forms the reflecting boundary and is involved in synthesis reactions from microstates in group 𝒢M to microstates in 𝒢M+1. As a property of the rate matrix, we have
(54) |
and
(55) |
Since , we have from Eqn (55) . With Eqn (54), we further have
By construction, the only differences between the sub-matrix and are in the diagonal elements. Therefore, we have
That is:
For convenience, we use the notation , and have . We partition the steady state vector π(ℐj) accordingly into two sub-vectors: , where corresponds to states in Ω(ℐi,j), and corresponds to states in Ω(ℐj)/Ω(ℐi,j). As Ã(ℐj) π(ℐj) = 0, we have:
therefore
Hence, we have:
(56) |
As all off-diagonal entries of transition rate matrix Ã(ℐj) are non-negative, we know that , and . Since is the steady state distribution of the rate matrix Ã(ℐj) with and , we have , and . As all columns of matrix Ã(ℐi,j) sum to zero, i.e., 1TÃ(ℐi,j) = 0T, we have:
Therefore, we have:
As all entries in vector and are non-negative, we have the following equality of 1-norms, i.e. the summation of absolute values of vector elements:
(57) |
From the Minkowski inequality for vector norms and Eqn. (56), we have:
(58) |
From Eqn. (57), we have:
(59) |
Now we show that the norm of converges to zero when the maximum copy number M of the i-th MEG goes to infinity. In the block tri-diagonal matrix Ã(ℐj), only the boundary block contains nonzero elements in sub-matrix , and all other blocks in contain only zero entries. From the Cauchy–Schwarz inequality, we have:
where is the sub-vector corresponding to the state partition 𝒢M. Furthermore, according to Lemma 2 and Eqn. (13) after replacing the subscript i in with M and taking into consideration the equivalence of the infinite space Ω(ℐj) and Ω(∞) in regard to truncation at ℐi, we have the probability of the boundary block when M → ∞. When synthesis reactions are concentration independent (zero-order reactions), as is usually the case [62], the norm is a constant representing the total synthesis rates over states in 𝒢M. We have: when M → ∞. Therefore with Eqn. (59), we have:
Hence,
That is, when the maximum copy number limit of the i-th MEG is sufficiently large, both π(ℐi,j) and are the steady state solutions of Ã(ℐi,j) y = 0. According to the Perron–Frobenius theorem for the transition rate matrix of continuous-time Markov chains [63], the dCME governed by Ã(ℐi,j) has a globally unique stationary distribution. In addition, by construction of the matrix via enumeration of the state space, matrix Ã(ℐi,j) is irreducible, as all microstates in the state space can be reached from the initial state. Therefore, the matrix Ã(ℐi,j) has only one zero eigenvalue [63], and both π(ℐi,j) and are eigenvectors corresponding to the eigenvalue 0. Therefore, we have the relationship , where c is an arbitrary real number. As both vectors are non-negative, and , there must exist an , such that . According to Lemma 2, , when the maximum copy number limit of the i-th MEG goes to infinity. Therefore we have ε → 0 when M → ∞. Therefore, we have shown both and component-wise, when the maximum copy number limit of the i-th MEG goes to infinity.
References
- 1. Stewart-Ornstein J, El-Samad H. Stochastic modeling of cellular networks. Computational Methods in Cell Biology. 2012;110:111. doi: 10.1016/B978-0-12-388403-9.00005-9.
- 2. Qian H. Cooperativity in cellular biochemical processes: noise-enhanced sensitivity, fluctuating enzyme, bistability with nonlinear feedback, and other mechanisms for sigmoidal responses. Annual Review of Biophysics. 2012;41:179–204. doi: 10.1146/annurev-biophys-050511-102240.
- 3. McAdams H, Arkin A. It’s a noisy business! Genetic regulation at the nanomolar scale. Trends in Genetics. 1999;15(2):65–69. doi: 10.1016/s0168-9525(98)01659-x.
- 4. Wilkinson DJ. Stochastic modelling for quantitative description of heterogeneous biological systems. Nature Reviews Genetics. 2009;10(2):122–133. doi: 10.1038/nrg2509.
- 5. Cao Y, Lu H-M, Liang J. Probability landscape of heritable and robust epigenetic state of lysogeny in phage lambda. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(43):18445–18450. doi: 10.1073/pnas.1001455107.
- 6. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. Journal of Physical Chemistry. 1977;81:2340–2361.
- 7. Gillespie DT. A rigorous derivation of the chemical master equation. Physica A. 1992;188:404–425.
- 8. Van Kampen N. Stochastic processes in physics and chemistry. 3rd ed. Elsevier Science and Technology books; 2007.
- 9. Beard D, Qian H. Chemical biophysics: quantitative analysis of cellular systems. Cambridge Univ Pr; 2008.
- 10. Gillespie DT. A diffusional bimolecular propensity function. The Journal of Chemical Physics. 2009;131(16):164109. doi: 10.1063/1.3253798.
- 11. Darvey I, Ninham B, Staff P. Stochastic models for second order chemical reaction kinetics. The equilibrium state. The Journal of Chemical Physics. 1966;45:2145–2155.
- 12. McQuarrie D. Stochastic approach to chemical kinetics. Journal of Applied Probability. 1967;4:413–478.
- 13. Taylor H, Karlin S. An Introduction to Stochastic Modeling. 3rd ed. Academic Press; 1998.
- 14. Laurenzi I. An analytical solution of the stochastic master equation for reversible bimolecular reaction kinetics. The Journal of Chemical Physics. 2000;113:3315–3322.
- 15. Vellela M, Qian H. A quasistationary analysis of a stochastic chemical reaction: Keizer's paradox. Bulletin of Mathematical Biology. 2007;69(5):1727–1746. doi: 10.1007/s11538-006-9188-3.
- 16. Gillespie DT. The chemical Langevin equation. The Journal of Chemical Physics. 2000;113:297–306.
- 17. Van Kampen NG. A power series expansion of the master equation. Canadian Journal of Physics. 1961;39(4):551–567.
- 18. Gillespie DT. The chemical Langevin and Fokker–Planck equations for the reversible isomerization reaction. The Journal of Physical Chemistry A. 2002;106(20):5063–5071.
- 19. Haseltine EL, Rawlings JB. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. The Journal of Chemical Physics. 2002;117(15):6959–6969.
- 20. Gardiner CW. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer; New York: 2004.
- 21. Grima R, Thomas P, Straube AV. How accurate are the nonlinear chemical Fokker-Planck and chemical Langevin equations? The Journal of Chemical Physics. 2011 Aug;135:084103. doi: 10.1063/1.3625958.
- 22. Thomas P, Matuschek H, Grima R. How reliable is the linear noise approximation of gene regulatory networks? BMC Genomics. 2013;14(Suppl 4):S5. doi: 10.1186/1471-2164-14-S4-S5.
- 23. Munsky B, Khammash M. The finite state projection algorithm for the solution of the chemical master equation. The Journal of Chemical Physics. 2006;124(4):044104. doi: 10.1063/1.2145882.
- 24. Cao Y, Liang J. Optimal enumeration of state space of finitely buffered stochastic molecular networks and exact computation of steady state landscape probability. BMC Systems Biology. 2008;2(1):30. doi: 10.1186/1752-0509-2-30.
- 25. MacNamara S, Bersani AM, Burrage K, Sidje RB. Stochastic chemical kinetics and the total quasi-steady-state assumption: application to the stochastic simulation algorithm and chemical master equation. The Journal of Chemical Physics. 2008;129(9):095105. doi: 10.1063/1.2971036.
- 26. MacNamara S, Burrage K, Sidje RB. Multiscale modeling of chemical kinetics via the master equation. Multiscale Modeling & Simulation. 2008;6(4):1146–1168.
- 27. Wolf V, Goel R, Mateescu M, Henzinger T. Solving the chemical master equation using sliding windows. BMC Systems Biology. 2010;4(1):42. doi: 10.1186/1752-0509-4-42.
- 28. Jahnke T. On reduced models for the chemical master equation. Multiscale Modeling & Simulation. 2011;9(4):1646–1676.
- 29. Sidje RB. Expokit: a software package for computing matrix exponentials. ACM Transactions on Mathematical Software (TOMS). 1998;24(1):130–156.
- 30. Munsky B, Khammash M. A multiple time interval finite state projection algorithm for the solution to the chemical master equation. Journal of Computational Physics. 2007;226(1):818–835.
- 31. Tian JP, Kannan D. Lumpability and commutativity of Markov processes. Stochastic Analysis and Applications. 2006;24(3):685–702.
- 32. Truffet L. Near complete decomposability: bounding the error by a stochastic comparison method. Advances in Applied Probability. 1997:830–855.
- 33. Buchholz P. Exact and ordinary lumpability in finite Markov chains. Journal of Applied Probability. 1994:59–75.
- 34. Stewart W. Introduction to the numerical solution of Markov chains. Princeton University Press; NJ: 1994.
- 35. Vantilborgh H. Aggregation with an error of o(ε²). Journal of the ACM (JACM). 1985;32(1):162–190.
- 36. Kemeny JG, Snell JL. Finite Markov chains. Vol. 210. Springer-Verlag; New York: 1976.
- 37. Irle A. Stochastic ordering for continuous-time processes. Journal of Applied Probability. 2003:361–375.
- 38. Daigle B, Roh M, Gillespie D, Petzold L. Automated estimation of rare event probabilities in biochemical systems. The Journal of Chemical Physics. 2011;134:044110. doi: 10.1063/1.3522769.
- 39. Roh M, Daigle B, Gillespie D, Petzold L. State-dependent doubly weighted stochastic simulation algorithm for automatic characterization of stochastic biochemical rare events. Journal of Chemical Physics. 2011;135(23):234108. doi: 10.1063/1.3668100.
- 40. Cao Y, Liang J. Adaptively biased sequential importance sampling for rare events in reaction networks with comparison to exact solutions from finite buffer dCME method. The Journal of Chemical Physics. 2013;139(2):025101. doi: 10.1063/1.4811286.
- 41. Taniguchi Y, Choi PJ, Li G-W, Chen H, Babu M, Hearn J, Emili A, Xie XS. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science. 2010;329(5991):533–538. doi: 10.1126/science.1188308.
- 42. Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000;403(6767):339–342. doi: 10.1038/35002131.
- 43. Kepler TB, Elston TC. Stochasticity in transcriptional regulation: origins, consequences, and mathematical representations. Biophysical Journal. 2001;81(6):3116–3136. doi: 10.1016/S0006-3495(01)75949-8.
- 44. Kim K-Y, Wang J. Potential energy landscape and robustness of a gene regulatory network: toggle switch. PLoS Computational Biology. 2007;3(3):e60. doi: 10.1371/journal.pcbi.0030060.
- 45. Schultz D, Onuchic JN, Wolynes PG. Understanding stochastic simulations of the smallest genetic networks. The Journal of Chemical Physics. 2007;126(24):245102. doi: 10.1063/1.2741544.
- 46. Munsky B, Khammash M. The finite state projection approach for the analysis of stochastic noise in gene networks. Automatic Control, IEEE Transactions on. 2008;53(Special Issue):201–214.
- 47. Deuflhard P, Huisinga W, Jahnke T, Wulkow M. Adaptive discrete Galerkin methods applied to the chemical master equation. SIAM Journal on Scientific Computing. 2008;30(6):2990–3011.
- 48. Sjöberg P, Lötstedt P, Elf J. Fokker–Planck approximation of the master equation in molecular biology. Computing and Visualization in Science. 2009;12(1):37–50.
- 49. Kazeev V, Khammash M, Nip M, Schwab C. Direct solution of the chemical master equation using quantized tensor trains. PLoS Computational Biology. 2014 Mar;10:e1003359. doi: 10.1371/journal.pcbi.1003359.
- 50. Arkin A, Ross J, McAdams HH. Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics. 1998;149(4):1633–1648. doi: 10.1093/genetics/149.4.1633.
- 51. Aurell E, Brown S, Johanson J, Sneppen K. Stability puzzles in phage λ. Physical Review E. 2002;65(5):051914. doi: 10.1103/PhysRevE.65.051914.
- 52. Aurell E, Sneppen K. Epigenetics as a first exit problem. Physical Review Letters. 2002;88(4):048101. doi: 10.1103/PhysRevLett.88.048101.
- 53. Zhu X-M, Yin L, Hood L, Ao P. Robustness, stability and efficiency of phage lambda genetic switch: dynamical structure analysis. Journal of Bioinformatics and Computational Biology. 2004;2:785–817. doi: 10.1142/s0219720004000946.
- 54. Zhu X-M, Yin L, Hood L, Ao P. Calculating biological behaviors of epigenetic states in the phage λ life cycle. Functional & Integrative Genomics. 2004;4(3):188–195. doi: 10.1007/s10142-003-0095-5.
- 55. Li M, McClure W, Susskind M. Changing the mechanism of transcriptional activation by phage lambda repressor. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(8):3691–3696. doi: 10.1073/pnas.94.8.3691.
- 56. Hawley D, McClure W. In vitro comparison of initiation properties of bacteriophage lambda wild-type PR and x3 mutant promoters. Proceedings of the National Academy of Sciences of the United States of America. 1980;77(11):6381–6385. doi: 10.1073/pnas.77.11.6381.
- 57. Hawley D, McClure W. Mechanism of activation of transcription initiation from the lambda PRM promoter. Journal of Molecular Biology. 1982;157(3):493–525. doi: 10.1016/0022-2836(82)90473-9.
- 58. Shea MA, Ackers GK. The OR control system of bacteriophage lambda a physical-chemical model for gene regulation. Journal of Molecular Biology. 1985;181(2):211–230. doi: 10.1016/0022-2836(85)90086-5.
- 59. Kuttler C, Niehren J. Gene Regulation in the Pi Calculus: Simulating Cooperativity at the Lambda Switch. Transactions on Computational Systems Biology VII. 2006;4230:24–55.
- 60. Liao S, Vejchodsky T, Erban R. Tensor methods for parameter estimation and bifurcation analysis of stochastic reaction networks. Journal of the Royal Society Interface. 2015;12(108):20150233. doi: 10.1098/rsif.2015.0233.
- 61. Verstraete F, Cirac JI. Matrix product states represent ground states faithfully. Phys Rev B. 2006 Mar;73:094423.
- 62. Nelson P. Physical Models of Living Systems. Macmillan; 2015.
- 63. Meyer CD. Matrix analysis and applied linear algebra. SIAM; 2000.