Abstract
Noise and stochasticity are fundamental to biology and derive from the very nature of biochemical reactions where thermal motion of molecules translates into randomness in the sequence and timing of reactions. This randomness leads to cell-to-cell variability even in clonal populations. Stochastic biochemical networks have been traditionally modeled as continuous-time discrete-state Markov processes whose probability density functions evolve according to a chemical master equation (CME). In diffusion reaction systems on membranes, the Markov formalism, which assumes constant reaction propensities, is not directly appropriate. This is because the instantaneous propensity for a diffusion reaction to occur depends on the creation times of the molecules involved. In this work, we develop a chemical master equation for systems of this type. While this new CME is computationally intractable, we make rational dimensional reductions to form an approximate equation, whose moments are also derived and are shown to yield efficient, accurate results. This new framework forms a more general approach than the Markov CME and expands the realm of stochastic biochemical systems that can be efficiently modeled.
I. INTRODUCTION
Individual cells in isogenic populations exhibit significant variability under the same environmental conditions.1–4 The variability observed in these cases is believed to take root in the stochastic nature of biochemical reactions. For example, fluctuations in gene expression, a major contributor to cell-to-cell variability, can arise from the stochastic steps involved in transcription and translation.3,4 Further sources of fluctuations include diffusion-reactions and dissociations, allosteric changes, and degradation of biological molecules. In general, for a given biochemical network, the occurrence of the underlying biochemical reactions depends on the channels of possible interactions (topological structure of the interaction network), the interaction affinity (reaction rates), as well as the number of molecules available for such interactions.5 This inevitably results in fluctuations in the levels of the various molecular species, and translates into cell-to-cell phenotypic differences in cellular behaviors.
In some cases, cells may exploit fluctuations in expression levels productively to allow populations to “hedge their bets” with respect to future environmental shifts. This is thought to explain the ability of subpopulations of bacteria to resist antibiotics and promote latency of infection by viruses.1,2 The impact of stochastic fluctuations has even been identified in simple synthetic gene circuits, inducing substantial variability in the period of oscillators and stochastic transitions in synthetic toggle switches.6 As a consequence, the thorough investigation of the principles underlying such non-genetic individuality is essential for engineering genetic circuits as well as for understanding the differential susceptibility of cells and organisms to diseases, drugs, and pathogens. These behaviors can only be captured and rigorously dissected in the context of stochastic mathematical and computational models, which are essential to understand how cells may exploit or filter stochastic fluctuations to achieve function.
A commonly used stochastic representation of biological systems adopts the formalism of the chemical master equation (CME),7–9 a differential-difference equation that describes the time-dependent joint probability distribution of the biochemical species in a system, or its accompanying numerical Stochastic Simulation Algorithm (SSA).10 The SSA is a Monte Carlo procedure that generates stochastic trajectories of memoryless processes governed by the CME. The CME/SSA approach has found great success in modeling cellular networks, including stochasticity in gene expression3 and larger transcriptional pathways that generate interesting stochastic behaviors such as excitability.11 However, there are numerous cases where the Markov approach is not directly applicable. One biologically prominent case is that of diffusion-limited reactions such as for proteins in the cytosol12 and for receptors on membranes.13 In the latter case, many important biological processes rely on sensing external signals by modulating the activity of receptors that diffuse on the surface of cells or cellular organelles. These receptors often dimerize or oligomerize, a feature that is important for their activity. We have recently established that the precise modeling of the stochastic diffusion reactions of these molecules is needed to account for their dimerization dynamics, and illustrated that the CME/SSA framework does not constitute an appropriate framework to do so.13 This is because the instantaneous propensity for a diffusion reaction to occur depends on the creation times of the molecules involved. Here, creation times could be, for example, the times at which the given molecules were exocytosed onto the membrane to allow for the diffusion reaction to occur. Our work has defined new creation-time dependent (CTD) propensities, specifically bimolecular diffusion-reaction propensities. 
We derived CTD propensities that model the diffusion reaction of spatially uncorrelated molecules (random initial starting location for each molecule) as well as the rebinding of recently dissociated molecules (spatially correlated molecules).13 We have developed an exact algorithm13 (NMSSA) to simulate stochastic realizations of systems with these types of CTD propensities. We will refer to the NMSSA as the CTD-SSA to keep in theme with the creation-time-dependent nomenclature. In this work, we develop a CTD-CME for general systems of this type. We then derive an approximate representation of this equation that leads to moment equations which can be efficiently solved. We illustrate our results by deriving the CME and corresponding moment equations for a number of biologically motivated examples where the CME/SSA framework is not directly applicable. Overall, our CTD-CME framework constitutes a general formulation in which the Markov CME is a limiting special case.
II. TRADITIONAL MARKOV APPROACHES TO MODEL STOCHASTIC BIOCHEMICAL SYSTEMS
The CME/SSA framework assumes a system of well-stirred chemical reactions with V molecular species. We use the state X(t) to denote the vector whose integer elements Xm(t) are the number of molecules of the mth species at time t. If there are W elementary chemical reactions that can occur among these V species, then we associate with each reaction rk (k = 1, …, W) a nonnegative propensity function defined such that ak(X(t)) τ + o(τ) is the probability that reaction rk will happen in the next small time interval (t, t + τ], as τ → 0. Under certain assumptions such as a well-stirred reaction volume, ak(x) assumes a polynomial form.10 The occurrence of a reaction rk leads to a change of νk ∈ ℤ^V (the set of V-dimensional integer vectors, whose entries may be negative) in the state X. νk is therefore the stoichiometric vector that records the integer change in each species due to reaction rk.
This set of well-stirred chemical reactions can be represented by the time-dependent joint probability distribution p(x, t) which describes the probability of the system being in state X(t) = x at time t. The evolution of p(x, t) is given by
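\[ \frac{\partial p(x,t)}{\partial t} = \sum_{k=1}^{W} \left[ a_k(x - \nu_k)\, p(x - \nu_k, t) - a_k(x)\, p(x,t) \right] \tag{1} \]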
Equation (1) is the so-called CME.7–9 The CME is the limit of the Chapman-Kolmogorov equation, an identity that must be obeyed by the transition probabilities of any Markov process.
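As a point of reference for the memoryless framework, the following minimal Python sketch implements the direct-method SSA for a hypothetical birth-death system (creation with constant propensity beta, first-order removal with propensity gamma*x). The system, the function names, and the parameter values are illustrative assumptions and are not taken from the examples in this paper.

```python
import random

def ssa_birth_death(beta, gamma, x0, t_end, rng):
    """Direct-method SSA for 0 -> X (propensity beta) and X -> 0 (propensity gamma*x).

    Returns the list of (time, state) pairs visited up to t_end.
    """
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        a_birth = beta            # constant creation propensity
        a_death = gamma * x       # first-order removal propensity
        a_total = a_birth + a_death
        if a_total == 0.0:
            break                 # no reaction can fire
        # Memoryless waiting time: exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose the reaction in proportion to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        path.append((t, x))
    return path

def time_average(path, t_end):
    """Time-weighted mean copy number along one trajectory."""
    total = 0.0
    for (t0, x), (t1, _) in zip(path, path[1:] + [(t_end, None)]):
        total += x * (t1 - t0)
    return total / t_end

rng = random.Random(0)            # fixed seed for reproducibility
path = ssa_birth_death(beta=50.0, gamma=1.0, x0=0, t_end=200.0, rng=rng)
mean_x = time_average(path, 200.0)
```

For this memoryless system the long-run time average should fluctuate around beta/gamma. The waiting-time draw relies on the propensities staying constant between events, which is precisely the assumption that fails for creation-time-dependent propensities.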
The CME/SSA approach has found great success in modeling stochastic cellular networks. However, as discussed above, there are numerous cases where the Markov approach is not applicable, for example, in many diffusion-limited reactions of receptors on membranes.12 For these cases, a more appropriate stochastic simulation framework (CTD-SSA) was developed.13 The basic premise of the CTD-SSA is the concept of CTD propensities, which we describe below. The main contribution of this work is the development of a chemical master equation for these types of systems.
III. CREATION-TIME-DEPENDENT BIMOLECULAR PROPENSITIES
We have recently addressed cases that occur in diffusion reaction systems on membranes where the Markov formalism, which assumes constant reaction propensities (mean reaction times), is not applicable.13 This is due to the diffusion-reaction propensities being CTD. Let us assume that the kth reaction type in the system is a hetero-bimolecular reaction resulting from the diffusion reaction between species m and n. For the Markov case, the propensity takes the form ak(x) = cmnxmxn where cmn is the rate constant and xmxn is the number of different reacting pairs. For the CTD case, let the creation time variables $\tau_m^i$ and $\tau_n^j$ denote the creation times of the ith molecule of species m and the jth molecule of species n. The rate function for the hetero-bimolecular reaction between the two molecules takes the form $k_{mn}(t - \max(\tau_m^i, \tau_n^j))$. The starting time of the pairwise rate function is determined by the “max” function, since an interaction cannot begin until the later of the two molecules is created. We define τm as the creation time vector of species m, which has the form $\tau_m = [\tau_m^1 \ldots \tau_m^{x_m}]$, where xm is the number of species m molecules in the system at time t (likewise for the creation time vector τn). The propensity, ak(x, t, τm, τn), of the reaction evaluated at time t, given all pairwise rate functions between the two species m and n, would be
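\[ a_k(x, t, \tau_m, \tau_n) = \sum_{i=1}^{x_m} \sum_{j=1}^{x_n} k_{mn}\left(t - \max(\tau_m^i, \tau_n^j)\right) \tag{2} \]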
Similarly, if the kth reaction were a homo-bimolecular reaction of species m, the Markov propensity would be ak(x) = cmmxm(xm − 1)/2!. For the CTD case, the rate function is $k_{mm}(t - \max(\tau_m^i, \tau_m^j))$ with a propensity of $\sum_{i=1}^{x_m - 1} \sum_{j=i+1}^{x_m} k_{mm}(t - \max(\tau_m^i, \tau_m^j))$, i.e., a sum over all distinct pairs. To keep notation to a minimum, the propensity form ak(x, t, τm, τn) will also be used to denote memoryless propensities, ak(x). In ak(x, t, τm, τn), we will also assume the species indices, m and n, to be implicit functions of k.
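The pairwise sums that define the CTD propensities can be sketched in a few lines of Python. This is an illustration only: the function names, the exponentially relaxing rate function `example_rate`, and all numerical values are hypothetical and are not the rate functions derived in Ref. 13. The homo-bimolecular sum is assumed to run over distinct pairs i < j, consistent with the Markov count xm(xm − 1)/2.

```python
import math

def hetero_propensity(t, tau_m, tau_n, rate_fn):
    """Sum the pairwise rate function over all (i, j) pairs of species m and n.

    Each pair's clock starts when the later of the two molecules is created.
    """
    return sum(rate_fn(t - max(ti, tj)) for ti in tau_m for tj in tau_n)

def homo_propensity(t, tau_m, rate_fn):
    """Same idea for a homo-bimolecular reaction: sum over distinct pairs i < j."""
    total = 0.0
    for i in range(len(tau_m)):
        for j in range(i + 1, len(tau_m)):
            total += rate_fn(t - max(tau_m[i], tau_m[j]))
    return total

def example_rate(age, k_inf=1.0, k_0=5.0, t_relax=2.0):
    """Hypothetical rate function relaxing from k_0 to k_inf with pair age."""
    return k_inf + (k_0 - k_inf) * math.exp(-age / t_relax)

tau_m = [0.0, 1.5, 3.0]    # creation times of species m molecules
tau_n = [0.5, 2.0]         # creation times of species n molecules
a_ctd = hetero_propensity(4.0, tau_m, tau_n, example_rate)
a_markov = hetero_propensity(4.0, tau_m, tau_n, lambda age: 2.0)  # constant rate
```

With a constant rate function the sums collapse to the Markov forms cmnxmxn and cmmxm(xm − 1)/2, which is a convenient sanity check on any implementation.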
IV. A CREATION-TIME-DEPENDENT CHEMICAL MASTER EQUATION
For our reacting system, we assume that the first U species are each involved in at least one reaction whose propensity depends on that species’ creation times. We will specify these species as CTD species. For the rest of the paper, we define the state x as the species state x, and the creation time vectors τ1 through τU as the creation time state [τ1 … τU]. With the state of our system fully defined, we can write down the joint distribution as
\[ p(x, \tau_1, \ldots, \tau_U, t) = p(x, t)\, p(\tau_1, \ldots, \tau_U \mid x, t) \tag{3} \]
where the left side of (3) is simply the typical joint distribution representation and the right side is the chain rule representation. The chain rule representation in (3) quantitatively represents the probability of being in species state x at time t multiplied by the probability density function of the creation time state [τ1 … τU] given x and t. We impose a causality constraint that there can only be non-zero probability in (3) for $\tau_h^e \leq t$, for 1 ⩽ h ⩽ U and 1 ⩽ e ⩽ xh. That is, any molecule affecting the system at time t must have been created before or at time t. Given the stoichiometry of the reactions, the form of propensities and the joint distribution, the chemical master equation can be constructed.7–10 The chemical master equation that evolves (3) over time can be written in its most general form as
| (4) |
We will refer to this equation as the CTD-CME. The second flux term on the right side of (4), the out-flux, describes how probability leaves species state x, characterized by a given creation time state [τ1 … τU], through the propensity of the kth reaction, ak(x, t, τm, τn). The first flux term on the right side of (4) defines the in-flux operator which dictates how the probability that leaves species state x − νk and creation time state through reaction k enters species state x and creation time state [τ1 … τU]. Notice in state x − νk the creation time vector for any species h (for 1 ⩽ h ⩽ U) is represented as and for species state x it is τh. This is because and τh are each from different creation time distributions, vs. p(τ1, …, τU|x, t). Thus, and τh can have different dimensions, for example, if species h either gains or loses a molecule through reaction k. Hence, how the in-flux maps into p(x, t)p(τ1, …, τU|x, t) will explicitly depend on the type of reaction that occurs. To determine this mapping, next we develop the form of .
A. Determining the form of
Although the approach we present is applicable to general CTD systems, we develop the CTD-CME for the following simple absorbing system for illustrative purposes:
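\[ \varnothing \xrightarrow{\;\beta_1\;} X_1 \tag{5a} \]
\[ \varnothing \xrightarrow{\;\beta_2\;} X_2 \tag{5b} \]
\[ X_1 + X_2 \xrightarrow{\;k_{12}(t - \max(\tau_1^i, \tau_2^j))\;} X_2 \tag{5c} \]
\[ X_2 \xrightarrow{\;\gamma_2\;} \varnothing \tag{5d} \]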
where reaction equations (5a)–(5d) represent the four reactions in the CTD-CME, respectively. In this system, the X1 molecules are created at a rate β1 (mol s^−1). The X2 molecules are created at a rate β2 (mol s^−1) and decay at a rate γ2 (s^−1). The bimolecular absorbing reaction between X1 and X2 results in the destruction of X1 while X2 itself remains unaffected. For this system, the bimolecular absorbing reaction contains the only CTD propensity, and requires the knowledge of both the X1 creation time and the X2 creation time. Thus, our joint probability function will take the form p(x, t)p(τ1, τ2|x, t) where x = [x1 x2].
In constructing , we will first define a non-symmetric in-flux operator which produces a CTD-CME but whose creation time distribution lacks certain symmetric properties. Through symmetry operations on we will then obtain a unique symmetrized in-flux operator whose CTD-CME evolves a creation time distribution with corresponding symmetry properties. The resulting symmetric form of the CTD-CME allows for dimensionality reductions enabling the derivation of moment equations for the CTD-CME and tractable numerical solutions.
1. Application of the sifting property to construct the in-flux operator
For the kth reaction, we construct how the probability flux enters p(x, t)p(τ1, τ2|x, t) from . To achieve this we take advantage of the sifting property14 of products of delta functions
| (6) |
For ease of presentation, we will refer to a given product of delta functions as a sifting function. The utility in this case will be to assign the creation time state [τ1 τ2] to particular creation time state space coordinates in the flux term due to the kth reaction. For the CTD-CME of this absorbing system, the abstract form of the sifting function is . It describes how the creation times in τh of species h are assigned to particular creation time state space coordinates in the flux term .
For the k = 1 reaction, the creation reaction for X1, the species 1 population is increased by one with creation propensity . For the set of molecules of species 1 in state x1 − 1 with creation times , a molecule created at time t is added. A particular sifting function for species 1 that achieves this is
| (7) |
where the delta functions will allow us to “assign” each to for e ⩽ x1 − 1 as well as assign the newly created molecule at time t to . Since the creation times in species 2 are not affected by reaction 1, its sifting function is a one-to-one mapping of the form
| (8) |
We define the in-flux operator as multiplied by the sifting functions integrated over all variables, i.e., h = 1: 2 and e = 1: xh − ν1h. The in-flux operator would be
| (9) |
Notice that the final form of the in-flux in (9) is a function of the creation time variables from species state x, i.e., τ1 and τ2. For the k = 2 reaction, the creation reaction of species 2, one can easily see that the sifting functions would have the same form, but with the species switched. The in-flux operator would be
| (10) |
The k = 3 reaction, the absorbing reaction, is a hetero-bimolecular reaction and, hence, has a propensity of the form . The sifting functions for the absorbing reaction for species 1 where the ith x1 molecule is absorbed would be
| (11) |
where the creation time variable is removed since the ith x1 molecule from state x − ν3 has been absorbed. The sifting function for species 2, which is unaffected would be
| (12) |
The in-flux operator would be
| (13) |
where the creation time variable is integrated out, over all possible values, since the ith x1 molecule from state x − ν3 has been absorbed. This ensures that the in-flux has the proper creation time variable dimensions. The decay reaction of X2, the k = 4 reaction, would use the same sifting functions as the absorbing reactions with the species switched. Here, the propensity would be . The in-flux operator would be
| (14) |
2. A natural order CTD-CME
We can constrain p(τ1, τ2|x, t) such that non-zero probability density exists only in the creation time state-space where for the h = 1, 2 given the species state x. We call this the natural-order constraint simply because the creation time values from a given sample ascend in the order of the indexes of the creation time variables. To visualize the natural-order constraint in 2D, Figure 1(a) plots the simple 2D distribution which has a single CTD species with two creation time variables and adheres to the natural order constraint. For the simple absorbing system, we denote the creation time distribution p(τ1, τ2|x, t) that adheres to the natural order constraint as p0(τ1, τ2|x, t) where the zero in the subscript indicates the natural order constraint. Examining the sifting operations, through , one can see that for this particular system, they preserve the natural order and causality constraints. We can therefore construct a CTD-CME of the form
| (15) |
that will maintain the natural order and causality constraints in p0(τ1, τ2|x, t) for t > 0. However, p0(τ1, τ2|x, t) lacks symmetry. Next, we develop symmetrized versions of both p0(τ1, τ2|x, t), , and (15). The symmetrized versions enable approximations that lead to computationally tractable equations.
FIG. 1.
Simple 2D case of natural-ordered, permuted, and S-symmetric distributions. (a) The natural-ordered distribution , nonzero for . (b) The permuted distribution , nonzero for . (c) The S-symmetric distribution .
3. A symmetrized CTD-CME
To begin, for the hth species consider the creation time vector for h = 1, 2 and where . Here, κh satisfies the natural-order constraint and therefore p0(τ1 = κ1, τ2 = κ2|x, t) can be nonzero. There are xh! vectors that could be created by permuting the order of through in the vector κh. Therefore, applying this to each of the vectors, κ1 and κ2, there would be x1!x2! unique ways to permute κ1 and κ2. Let be the vector due to the gth permutation where represents the permuting operator of the hth species creation time vector for the gth permutation. We can generate a class of probability distributions, pg(τ1, τ2|x, t), that map as
| (16) |
where and are equivalent permuted state space coordinates, that is the values within κ1 and κ2 are unchanged. From this point of view, all pg(τ1, τ2|x, t) are equivalent, but the non-zero probability density resides in an equivalent permuted state space region for each pg(τ1, τ2|x, t). To visualize this idea in 2D, consider again the simple 2D distribution in Figure 1(a). Figure 1(b) shows the permuted but equivalent distribution .
We can then create a symmetrized probability distribution of the form
| (17) |
which naturally fuses and normalizes all pg(τ1, τ2|x, t) together and guarantees continuity at the interfaces of the various pg(τ1, τ2|x, t). To visualize this, consider again the simple 2D distribution in Figure 1(a). Figure 1(c) plots the symmetrized distribution . The distribution from (17) has the following property:
| (18) |
for all g, e ⩽ x1!x2!, i.e., it is a symmetric function with respect to permutations of the creation times within a given species. We call this property S-symmetry. One can think of ps(τ1, τ2|x, t) as taking p0(τ1, τ2|x, t) and equally redistributing the probability density at τ1, τ2 over all x1!x2! equivalent, permuted creation time state-space coordinates. Applying the results from (15) to (18), we can construct a master equation for pg(τ1, τ2|x, t) with the in-flux operator applied to yielding
| (19) |
for all g ⩽ x1!x2!. For the in-flux from the S-symmetric distribution, , we have equally weighted all possible permutations to the sift operation of the kth reaction. To be clear, recall that the sift operation in assigns the creation time values of to to a particular creation time state space coordinate in . The permuting operators in from (19) simply assign to to equivalent, but permuted coordinates. We are simply recombining the in-flux from all possible equivalent, permuted coordinates, each of which has the same probability density (since ps(τ1, τ2|x, t) is S-symmetric).
Next, we apply the same approach from (17), i.e., the operator, to (19) which fuses all x1!x2! CTD-CME equations together and results in a CTD-CME for the S-symmetric ps(τ1, τ2|x, t) of the form
| (20) |
which preserves S-symmetry in ps(τ1, τ2|x, t) throughout time for all τ1, τ2. The only difference between (15) and (20) is that (20) maximally utilizes the creation time state space, but their respective solutions yield identical creation time sampling statistics and species-state distribution p(x, t). In addition, as we will later show, (20) is more amenable to approximate dimensionality reductions to make the equation computationally tractable to solve.
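The relation between a natural-ordered distribution and its S-symmetric counterpart can be illustrated numerically for a single CTD species. The sketch below assumes, purely for illustration, that p0 is the density of the sorted creation times of iid molecules with exponentially distributed ages (rate `lam`); it then averages p0 over all permutations of the coordinates, in the spirit of (17). The function names and the choice of p0 are hypothetical.

```python
import itertools
import math

def natural_ordered_density(tau, t, lam):
    """Hypothetical p0: density of sorted iid exponential ages, written in creation times.

    Nonzero only where tau[0] <= tau[1] <= ... <= t (natural order and causality).
    """
    in_order = all(a <= b for a, b in zip(tau, tau[1:]))
    if not in_order or tau[-1] > t:
        return 0.0
    n = len(tau)
    return math.factorial(n) * lam ** n * math.exp(-lam * sum(t - s for s in tau))

def symmetrized_density(tau, t, lam):
    """Average p0 over all n! permutations of the creation time coordinates."""
    n = len(tau)
    perms = itertools.permutations(tau)
    return sum(natural_ordered_density(p, t, lam) for p in perms) / math.factorial(n)

t, lam = 5.0, 0.8
p_a = symmetrized_density((1.0, 3.0), t, lam)   # natural order
p_b = symmetrized_density((3.0, 1.0), t, lam)   # permuted coordinates
p_iid = lam ** 2 * math.exp(-lam * ((t - 1.0) + (t - 3.0)))
```

In this example the symmetrized density takes the same value at permuted coordinates and coincides with the product of two identical exponential factors. For this particular choice of p0, the S-symmetric form is therefore exactly an iid product; for a general p0 the product form is only an approximation.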
Without any loss of generality, the S-symmetric preserving approach can be applied to the general system in (4). Given that the general distribution p(τ1, …, τU|x, t) is S-symmetric at time t = 0, we define the symmetrized version of the in-flux operator from (4) to be
| (21) |
Note that the sifting function in (21) is written to include and . This was done to include more general reaction types where molecules from species m or n that appear in the propensity can map to species h through reaction k. Reactions of this type occur in spatio-temporally correlated systems, an example of which we model in the numerical section below. In addition to the ones defined for the absorbing system above, all other sifting functions for the reacting systems in this paper are presented in the supplementary material.15 Because of the S-symmetry, the expression for can be greatly simplified. However, the simplification is reaction specific, e.g., creation reaction vs. bimolecular reaction. In the supplementary material,15 we present simplified forms for each reaction type used in the paper. A requirement for (21) to preserve S-symmetry is that the propensity ak(x, t, τm, τn) is S-symmetric. In all our cases, this is true. For example, it is easy to see that under any of the permuting operators (homodimer reaction) and (heterodimer reaction) are all S-symmetric. This guarantees from (4) that p(τ1, …, τU|x, t), if S-symmetric at t = 0, will stay S-symmetric for t > 0.
We can apply (21) to (4) and integrate out all creation times, resulting in the equation
| (22) |
which has the same form as (1), but with expected values for the propensities with respect to the creation time distributions and p(τ1, …, τU|x, t). Indeed, if the propensities have no creation time dependencies, then as one would expect, (22) reduces to the traditional Markov CME (Eq. (1)).
B. Approximations for enabling tractable solutions of the CTD-CME and its moment equations
Our motivation for enforcing that p(τ1, …, τU|x, t) be an S-symmetric function is for dimensionality reduction purposes to enable a computationally tractable approximate solution. Arguably, the simplest S-symmetric function that the CTD-CME distribution might be approximated by would be one where the creation times within a given species are independent and identically distributed (iid) random variables. Let be the iid distribution for the ith molecule of the dth species. We approximate the creation time distribution with
| (23) |
This approximation requires only one creation time variable to be updated for each CTD species. The second approximation we make for numerically solving the systems in this paper is that where is the single parameter necessary to uniquely describe the distribution at time t. In the supplementary material,15 we show that if the variance in is dominated by creation reaction terms, then the exponential distribution is the natural solution at steady state. This is a testable assumption once the numerical solution is obtained (Appendix B). For this paper, our numerical examples will focus on systems at steady state. Fully capturing the time dynamics of may require more sophisticated approaches such as Discontinuous Galerkin (DG) numerical methods.16,17 However, in many systems the exponential assumption may be sufficient when the distributions associated with the CTD propensities are in quasi-steady state (fast) as compared with the dominant timescales of the system (slow).
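Under the two approximations above, the creation time dependence enters only through expected values of the rate function taken over a one-parameter exponential distribution. The Python sketch below estimates such an expectation by trapezoidal quadrature, assuming that the age t − τ of a molecule is exponentially distributed with a given mean. The rate function `example_rate`, its parameters, and the truncation settings are hypothetical choices made for illustration.

```python
import math

def example_rate(age, k_inf=1.0, k_0=5.0, t_relax=2.0):
    """Hypothetical rate function of the pair age (time since creation)."""
    return k_inf + (k_0 - k_inf) * math.exp(-age / t_relax)

def expected_rate(rate_fn, mean_age, n_steps=20000, cutoff=40.0):
    """Trapezoidal estimate of E[k(t - tau)] when the age t - tau is exponential.

    The integral over ages is truncated at cutoff * mean_age.
    """
    upper = cutoff * mean_age
    h = upper / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        age = i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        density = math.exp(-age / mean_age) / mean_age
        total += weight * rate_fn(age) * density
    return total * h

mean_age = 3.0
numeric = expected_rate(example_rate, mean_age)
# Closed form for this particular rate function: E[exp(-a/T)] = T / (T + mean_age).
analytic = 1.0 + (5.0 - 1.0) * 2.0 / (2.0 + mean_age)
```

For rate functions without a closed-form expectation, the same quadrature can be re-evaluated whenever the mean of the exponential distribution is updated during time integration.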
In Sec. V, we solve the CTD-CME using the assumptions above for the first numerical example. However, most of our examples will be analyzed with moment equations derived from the CTD-CME. The general approach for constructing the CTD-moment equations is discussed in Appendix A. In the supplementary material,15 we present the mathematical expressions as they appear in the CTD-CME and CTD-moment equations for every reaction type used in this paper.
V. NUMERICAL EXAMPLES
A. CTD-CME results for the simple absorber system when the number of absorbers remains constant over time
For the absorber system (5a)–(5d), we first consider a CTD-CME when the number of X2 molecules remains constant over time, i.e., β2 = γ2 = 0. The X1 molecules are created at a rate β1 (mol s^−1). The absorber diffusion reaction of X1 with X2 results in the destruction of X1 while X2 itself remains unaffected. The rate function for this reaction is derived from the time-dependent mathematical solution of the probability that the diffusion reaction between the ith X1 and the jth X2 molecules has occurred by time t, given that their initial starting locations are uncorrelated with each other (spatially uncorrelated).13 Figure 2(a) plots the particular rate function used for this example, which was derived for a particular absorber system discussed in Chevalier and El-Samad13 whose CTD-SSA results were shown to agree well with Brownian dynamics simulations. In a biological context, absorber-like reactions model the binding of receptors to clathrin-coated-pits, resulting in their endocytosis.18 Because the number of X2 molecules is constant over time, i.e., all X2 molecules were created at t = −∞, the absorber diffusion-reaction propensity is only dependent on the X1 creation times. This is because the bimolecular rate function term . The instantaneous absorbing rate for the ith X1 molecule in state x1 is thus , where is the constant number of X2 molecules.
FIG. 2.
Results for simple absorbing system: (a) Creation-Time-Dependent (CTD) rate function used for the absorbing reaction. (b) SSA and CTD-SSA results (constant absorber case). (c) CTD-CME and CTD-moments results (constant absorber case). (d) CTD-SSA results (variable absorber case). (e) CTD-moments results (variable absorber case).
We first run the CTD-SSA for 3 cases accounting for a range of parameters. For case 1: β1 = β = 5 × 10^4 and , case 2: and case 3: . These cases are chosen to differentiate non-memoryless systems from memoryless ones. If the system were memoryless, then in the steady state one can show that , where kon is a fundamental rate constant that can be fit from one of the cases. For the three cases, is a constant and therefore, the means of the distributions should be identical. One can see in Figure 2(b) that this is not the case: the CTD-SSA simulations show a lower mean for case 2 than for case 1 and a higher mean for case 3. This example clearly demonstrates that for such systems, a fundamental rate constant kon in the mass-action sense does not exist, and that it is crucial to account for creation-time-dependent propensities in order to accurately capture the statistics of molecular reactions.
It is easy to see in the CTD-CME framework that only the X1 species will have a creation time distribution. The joint distribution will be of the form p(x1, t)p(τ1|x1, t) where x1 is the number of X1 molecules and is its vector of creation times. Here, p(τ1|x1, t) starts and evolves as a symmetric function. Applying the iid approximation (23) to p(x1, t)p(τ1|x1, t), we project the CTD-CME onto the creation time variable, i.e., integrate out all other creation time variables. For simpler notation, we set to get the equation
| (24) |
Here, we assume that p1(τ1|x1, t) is an exponential distribution. In practice, (24) is transformed into two time dependent equations, one for p(x1, t) and one for p1(τ1, t|x1). For p1(τ1, t|x1), since we assume an exponential distribution, we only need to solve for the time-dependent mean . We directly solve for the time-dependent CTD-CME by applying the Finite-State-Projection19 approach and then numerically integrating the resulting coupled ODEs with ode45 in MATLAB. The direct CTD-CME solution results are in satisfactory agreement with the CTD-SSA results (Figures 2(b) and 2(c)).
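The finite state projection step can be illustrated on a memoryless analogue of the problem. The Python sketch below truncates the CME of a hypothetical birth-death process to the states 0..n_max and integrates the resulting linear ODE system with a classical fourth-order Runge-Kutta scheme. It is not the CTD-CME solver used here: it omits the coupled equation for the creation time mean, it substitutes a fixed-step integrator for ode45, and its function name and parameter values are assumptions made for illustration.

```python
def fsp_birth_death(beta, gamma, n_max, t_end, dt):
    """Integrate the truncated birth-death CME dp/dt = A p with classical RK4.

    States 0..n_max are retained; probability flowing past n_max is dropped,
    so 1 - sum(p) bounds the truncation error (the finite state projection idea).
    """
    def rhs(p):
        dp = [0.0] * (n_max + 1)
        for x in range(n_max + 1):
            out_flux = (beta + gamma * x) * p[x]
            in_birth = beta * p[x - 1] if x > 0 else 0.0
            in_death = gamma * (x + 1) * p[x + 1] if x < n_max else 0.0
            dp[x] = in_birth + in_death - out_flux
        return dp

    p = [1.0] + [0.0] * n_max          # start with zero molecules
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(p)
        k2 = rhs([pi + 0.5 * dt * ki for pi, ki in zip(p, k1)])
        k3 = rhs([pi + 0.5 * dt * ki for pi, ki in zip(p, k2)])
        k4 = rhs([pi + dt * ki for pi, ki in zip(p, k3)])
        p = [pi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

p_final = fsp_birth_death(beta=10.0, gamma=1.0, n_max=60, t_end=10.0, dt=0.005)
mean_x = sum(x * px for x, px in enumerate(p_final))
leaked = 1.0 - sum(p_final)            # probability lost through truncation
```

The quantity `leaked` gives a direct check on whether n_max is large enough; if it grows beyond an acceptable tolerance, the projection should be enlarged and the integration repeated.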
If the system were memoryless, the moment equations20 derived from the Markov CME can be shown to be
| (25a) |
| (25b) |
As with the Markov CME, the CTD-CME describes the time evolution of a probability function for the system's biochemical species and therefore can be used to derive moment equations for the first two moments20 (Appendix A and the supplementary material15). However, in the context of the CTD-CME, the mean equation also needs to be derived for the random variable τ1. The system of moment equations for the CTD-CME is
| (26a) |
| (26b) |
| (26c) |
Here, the notation E[f(τ1)] is used to represent the expected value of a function f(τ1). With the exponential distribution assumption, explicit formulas can be generated for E[f(τ1)].15 Notice here that the creation time distributions are coupled to the species X moment equations through E[k12(t − τ1)]. When k12(t − τ1) is a constant, the creation time distributions uncouple from the species distributions and the moment equations for the chemical species reduce to those derived for the Markov CME. Results for the CTD-CME moment equations used as parameters in a Gaussian agree extremely well with the direct CTD-CME results for all the cases (Figure 2(c)).
B. CTD-CME results for the absorber system when the number of absorbers varies over time
Next, we allowed fluctuations in the number of X2 molecules, i.e., β2 ≠ 0 and γ2 ≠ 0, with β2 = 1 × 10^5. In this case, the steady-state average of X2 molecules is . To be consistent with the constant absorber system cases, the mean value was set to be (case 1, γ2 = 2.22 × 10^2), (case 2, γ2 = 1.11 × 10^2), and (case 3, γ2 = 2 × 10^3). As above, we keep constant to test deviation from the memoryless (exponential) assumption as these parameters change. The bimolecular rate function from Figure 2(a) is used for the absorber diffusion reaction. Because both X1 and X2 molecules are being created over time, the bimolecular rate function takes the standard form . For the CTD-SSA simulations (Figure 2(d)), case 1 and case 2 for the variable absorber system are quite similar to those for the constant absorber system. In contrast, for case 3 in the variable absorber system, the mean of the distribution shifts below the means in cases 1 and 2. We will explain this peculiar shift from the CTD-moment results below. For the CTD-CME, we model this system with the iid approximation and project the creation time distribution onto the and creation time variables and for simpler notation set and . The joint distribution then has the form p(x1, x2, t)p1(τ1|x1, x2, t)p2(τ2|x1, x2, t). We derived both the CTD-CME and the CTD-moments for this system (supplementary material15). Because p(x1, x2, t) is 2D with respect to species, its direct solution is computationally inefficient. However, we efficiently solve the CTD-moment equations (Figure 2(e)) and use the resulting moments as parameters in a Gaussian, which agrees extremely well with the CTD-SSA results.
To understand the shift in case 3 between the constant and variable absorber systems, we analyze the steady-state creation time distributions for cases 2 and 3. Figure 3(a) (constant absorber system) only plots p1(τ, t) since the absorbers are constant, while Figure 3(b) (variable absorber system) plots p1(τ, t) and p2(τ, t). Here, τ is the dummy variable for the plots in Figure 3. For case 2, the distributions p1(τ, t) of the constant and variable absorber systems are very similar. The mean of p1(τ, t), , for each system is therefore quantitatively similar as well. It is easy to see that for the variable absorber system, in case 2, the mean for p2(τ, t), , is larger than which implies that . Therefore, the average rate function will approximately be E[k12(t − max(τ1, τ2))] ≈ E[k12(t − τ1)], dominated by the creation times of the X1 molecules. Hence, for case 2, there is agreement between the two systems. Applying the same arguments for case 3, in the variable absorber system one can see that . Therefore, the average rate function will approximately be E[k12(t − max(τ1, τ2))] ≈ E[k12(t − τ2)], dominated by the creation times of the X2 absorbing molecules. The two systems therefore behave differently in case 3, and the variable absorber system is predicted to have a substantially lower mean, as observed.
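The dominance argument can be checked numerically with a short Python sketch. It works with molecule ages a = t − τ, so that t − max(τ1, τ2) = min(a1, a2), and assumes independent, exponentially distributed ages with rates `lam1` and `lam2` together with a hypothetical exponentially relaxing rate function (not the one in Figure 2(a)). Under these assumptions min(a1, a2) is itself exponential with rate `lam1 + lam2`, so the pair expectation is dominated by whichever species is younger on average.

```python
import math
import random

K_INF, K0, T = 1.0, 5.0, 0.2      # hypothetical rate-function parameters

def rate(age):
    return K_INF + (K0 - K_INF) * math.exp(-age / T)

def expected_rate(lam):
    """E[rate(a)] for an age a ~ Exponential(lam), in closed form."""
    return K_INF + (K0 - K_INF) * lam / (lam + 1.0 / T)

def pair_rate_monte_carlo(lam1, lam2, n=100_000, seed=0):
    """Monte Carlo estimate of E[k(t - max(tau1, tau2))] = E[k(min(a1, a2))]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        a1 = rng.expovariate(lam1)
        a2 = rng.expovariate(lam2)
        total += rate(min(a1, a2))
    return total / n

# X1 molecules are much younger on average than X2 molecules.
lam1, lam2 = 10.0, 0.1
mc = pair_rate_monte_carlo(lam1, lam2)
exact = expected_rate(lam1 + lam2)   # min of two exponentials is exponential
print(mc, exact, expected_rate(lam1), expected_rate(lam2))
```

With these illustrative parameters the pair expectation is close to the X1-only expectation and far from the X2-only one. Swapping the two rates reverses the roles, which corresponds to the regime in which the absorber creation times dominate.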
FIG. 3.
(a) Creation time distributions (scaled to the plot) overlaid onto the rate function curve for Cases 2 and 3 (constant absorber system) where green curves represent creation time distribution for species X1. (b) Creation time distributions (scaled to the plot) overlaid onto the rate function curve for Cases 2 and 3 (variable absorber system) where green and black curves represent creation time distributions for species X1 and X2, respectively.
C. Application of CTD-CME to model systems with spatio-temporal correlations
We model a homodimerizing system with spatial-temporal correlated rebinding whose basic chemical equations are
| (27a) |
| (27b) |
| (27c) |
| (27d) |
| (27e) |
Here, X3 represents a dimer complex. X2 represents a dissociated, spatially correlated monomer pair that can rebind. X1 represents a spatially uncorrelated monomer. And represents a newly spatially uncorrelated monomer whose spatially correlated partner associated with another monomer. Furthermore, its creation time is preserved, i.e., the time when it first dissociated from the dimer complex. In essence, there is a spatially uncorrelated population and a spatially correlated population of monomers, each of which can dimerize with monomers in the other population as well as with monomers from its own population. The dissociation rate constant of the dimer complex is γd. The rate functions k11(t) and k2(t) (Figures 4(a) and 4(b)) represent the spatially uncorrelated bimolecular rate function and the spatially correlated rebinding rate function, respectively. These functions were derived for a particular homodimerization system discussed in Chevalier and El-Samad,13 where results from the CTD-SSA were shown to agree well with Brownian dynamics simulations. From Figure 4(b) one can see that, upon dissociation, the dissociated molecules have a high rebinding rate (the two molecules are in close proximity at early times), which then decreases rapidly with time as the molecules diffuse away from each other. Simulations for this system were run with the CTD-SSA. Figures 4(c) and 4(d) illustrate the difference between accounting for rebinding with the spatially correlated rate function, k2(t), and making the well-mixed assumption, in which rebinding is modeled with the spatially uncorrelated rate function, k11(t). The use of the spatially correlated rate function yields the correct distributions of the molecules, while not accounting for rebinding (uncorrelated assumption) results in pronounced errors.
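To make the effect of a decaying rebinding rate concrete, the following Python sketch samples the age at which a dissociated pair rebinds. The rate `K0 * exp(-age / T)` is a hypothetical stand-in for k2(t), not the function derived in Ref. 13, and competing reactions are ignored. The sampling uses standard inverse-transform sampling for a time-dependent hazard; it is not claimed to be how the CTD-SSA is implemented.

```python
import math
import random

K0, T = 4.0, 0.5   # hypothetical: initial rebinding rate and decay time

def rebinding_rate(age):
    """Hypothetical correlated rebinding rate, highest right after dissociation."""
    return K0 * math.exp(-age / T)

def sample_rebinding_age(rng):
    """Inverse-transform sample of the age at which the pair rebinds.

    The cumulative hazard H(a) = K0*T*(1 - exp(-a/T)) saturates at K0*T,
    so the pair escapes (None is returned) with probability exp(-K0*T).
    """
    target = -math.log(1.0 - rng.random())   # unit-rate exponential variate
    if target >= K0 * T:
        return None                           # pair diffuses apart for good
    return -T * math.log(1.0 - target / (K0 * T))

rng = random.Random(1)
samples = [sample_rebinding_age(rng) for _ in range(100_000)]
escape_fraction = sum(s is None for s in samples) / len(samples)
rebound = [s for s in samples if s is not None]
print(escape_fraction, math.exp(-K0 * T), sum(rebound) / len(rebound))
```

Because the cumulative hazard saturates, a finite fraction of pairs never rebinds, and those that do rebind tend to do so early. A constant, well-mixed rate has neither feature, which is the qualitative difference that the correlated and uncorrelated treatments in Figures 4(c) and 4(d) probe.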
FIG. 4.
(a) Uncorrelated bimolecular rate function k11(t − τ). (b) Correlated rebinding rate function k2(t − τ). (c)–(f) Distribution at steady state of dimer numbers for a set number of total receptors Ntot and a given value of γd. (c) Dashed (CTD-SSA/correlated re-association), dotted (CTD-SSA/uncorrelated case) for γd = 1000. (d) Dashed (CTD-SSA/correlated re-association), dotted (CTD-SSA/uncorrelated case) for γd = 100. (e) Dashed (CTD-moments/correlated re-association), dotted (CTD-moments/uncorrelated case) γd = 1000. (f) Dashed (CTD-moments/correlated re-association), dotted (CTD-moments/uncorrelated case) γd = 100.
We then generated CTD-moment equations for the system. Figures 4(e) and 4(f) show Gaussians based on the steady-state moments of the CTD-CME for the same parameters used in the CTD-SSA simulations. There is good agreement, but with a small error incurred by the CTD-moment solutions for the spatially correlated system. The error is most noticeable for the Ntot = 1500, γd = 1000 case, and we suspect it is due to assuming all creation time distributions to be exponential, an assumption that might not be accurate for the correlated CTD species. Future work will investigate different representations of creation time distributions that can accommodate the subtleties of spatial-temporal correlations.
VI. SUMMARY AND FUTURE WORK
In this work, we have extended the CTD-SSA approach to include the CTD-CME. The new CTD-CME/CTD-SSA represents a general framework for systems with CTD propensities. This new framework forms a more general approach than the Markov CME and expands upon the realm of possible stochastic biochemical systems that can be modeled, including systems with spatial-temporal correlations.
To derive moment equations, we approximated the creation time distributions as iid and exponential. Our calculation of the normalized residual error (NRE) (Appendix B) suggests that the steady-state solution is very close to iid for the systems modeled. In the future, to obtain more accurate solutions of the CTD-CME and the CTD-moments, we need to better approximate the creation time distributions, especially when the system undergoes time-dynamic changes. One potential avenue is discontinuous Galerkin (DG) methods, which constitute a finite, discontinuous basis function approach for solving hyperbolic and parabolic partial differential equations.16 A successful approach will also allow the iid approximation to be tested during dynamical changes in the system.
It is possible to represent time-dependent propensities as Markov chains and thereby embed them in Markov systems.21,22 It would be interesting to map our systems to a Markov chain and generate time-dynamic moment equations to compare with the time-dynamic CTD-moment equations. This is a topic of future research.
Our original work on the CTD-SSA provided a new means for simulating reactions that have CTD propensities, such as diffusion reactions on membranes. As with the SSA, when the population of a CTD species becomes large, the algorithm spends the majority of its time on CTD-dependent reactions, such as CTD diffusion reactions, thereby increasing the computational cost. To circumvent this, future work will exploit the CTD-moment equations derived here to develop faster approximate Monte Carlo methods that speed up the CTD-SSA by incorporating ideas from techniques like tau-leaping/Langevin23,24 and other hybrid approaches.25 In summary, a wide-ranging battery of methods, similar to that developed for the Markov CME/SSA framework, is necessary for the application of the CTD-CME/CTD-SSA formalism to efficiently model and analyze stochastic biochemical systems with CTD propensities.
ACKNOWLEDGMENTS
This work was supported by NGIMS Systems Biology Center P5.GM.81879 and the Paul G. Allen Foundation to H.E. We would like to thank David Sivak for useful discussions and both Raj Bhatnagar and Benjamin Heineike for useful discussions and critical reading of the paper.
APPENDIX A: GENERAL METHOD FOR CONSTRUCTING CTD-MOMENT EQUATIONS
In the master equation with expected propensity values (22), we transform and E[ak(x, t, τm, τn)] into equivalent, but simpler forms that are amenable to moment equation generation. For instance, take the expected propensity value for the hetero-bimolecular reaction, . Here, is the expected value of the bimolecular rate function for the ith species m molecule and jth species n molecule evaluated at species state x and at time t. Because of S-symmetry, it is easy to see that is the same for all i and j. We can therefore express the propensity as . The same logic holds for the homo-bimolecular reaction as well. More generally, we can express E[ak(x, t, τm, τn)] as where is the expected value of the rate function evaluated at state x and at time t, and fk(x) is the state-dependent function, for example, xmxn in the hetero-bimolecular case. The master equation with expected propensity values (22) becomes
| (A1) |
We will generate moment equations assuming local linearity in the propensities. To do so we first represent and fk(x) as linear Taylor series about the mean z(t) of the distribution p(x, t). This yields the linearized propensities
| (A2) |
The final expression in (A2) keeps terms only up to linear order, which assumes that higher-order terms are negligible in the region of x spanned by the distribution p(x, t). Given this constraint, standard moment generation approaches derived for the Markov (memoryless) master equation20 can be applied. The time-dependent mean equation can be expressed as
| (A3) |
where all linear terms integrate out. We can also generate time-dependent covariance equations. To directly apply the work of Engblom,20 we must make a further assumption in (A2), which is for e = e′. This assumption removes the last term in (A2) and allows the time-dependent covariance equation to be expressed as
| (A4) |
The time-dependent mean and covariance equations are necessarily coupled to equation(s) that evolve the creation time distribution which enables the calculation of . Just as the moment equations are species state averaged equations, a species state averaged equation(s) for the creation time distribution is required in order to continuously calculate . The linear Taylor series approximation of that was applied to the linear propensity approximation in (A2) makes the assumption that the creation time distribution takes the form
| (A5) |
which is simply the creation time distribution at the mean z(t) plus a sum of species-dependent basis functions, , whose weight varies linearly away from the mean. Similar to the linear propensity approximation, we keep only terms up to first order in the general equation(s) evolving the creation time distribution and then average over species state to yield the species-state-averaged equation(s). Because we are assuming that p(x, t) is approximately Gaussian, and hence an even function about the mean, only the p(τ1, …, τU|z(t), t) term will remain in a given time-dependent creation-time-distribution equation, i.e., all the basis function terms will integrate out. All the terms that appear in the species-state-averaged equation(s) evolving the time-dependent creation time distribution, for the specific reactions of the systems in this paper, are presented in the supplementary material.15
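The first-order Taylor step used in this appendix can be illustrated numerically for the state-dependent factor alone. The Python sketch below takes the hetero-bimolecular function fk(x) = xm xn, linearizes it about the sample mean z, and averages over hypothetical, correlated, approximately Gaussian copy numbers. It covers only the fk(x) factor, not the full expression (A2), which also linearizes the expected rate function; all numerical values are illustrative assumptions.

```python
import math
import random

rng = random.Random(2)
N = 20_000
rho = 0.6   # hypothetical correlation between the two species counts

# Correlated, approximately Gaussian copy numbers (x_m, x_n).
xs = []
for _ in range(N):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    xm = 100.0 + 10.0 * g1
    xn = 50.0 + 5.0 * (rho * g1 + math.sqrt(1.0 - rho ** 2) * g2)
    xs.append((xm, xn))

zm = sum(x[0] for x in xs) / N   # mean z_m
zn = sum(x[1] for x in xs) / N   # mean z_n

def f(xm, xn):
    return xm * xn

def f_lin(xm, xn):
    # First-order Taylor expansion of f about the mean z.
    return zm * zn + zn * (xm - zm) + zm * (xn - zn)

mean_exact = sum(f(*x) for x in xs) / N
mean_lin = sum(f_lin(*x) for x in xs) / N
cov = sum((x[0] - zm) * (x[1] - zn) for x in xs) / N
print(mean_lin, zm * zn, mean_exact - mean_lin, cov)
```

The linear terms average to zero, so the averaged linearized function equals its value at the mean. The part that the linearization discards is exactly the covariance of the two species, which is small relative to zm zn when the distribution is narrow compared with its mean.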
APPENDIX B: NORMALIZED RESIDUAL ERROR IN THE CREATION TIME DISTRIBUTION VARIANCE AND COVARIANCE EQUATIONS
In the supplementary material,15 we derive the terms for the mean, variance, and covariance equations for the creation time distributions due to all reactions applied to the systems in this paper. Given the iid, exponential distribution assumptions used to solve the systems, we can substitute the resulting solution into the exact variance and covariance equations and evaluate the NRE, a dimensionless quantity. To get the dimensionless NRE for the variance equations, the equations are normalized by , where is the variance of the exponential distribution. This gives an NRE relative to the exponential assumption. Similarly, the covariance equations are normalized by where is the product of the standard deviations assuming exponential distributions for CTD species m and n. This yields a dimensionless NRE for the covariance equations relative to the exponential assumption, which can be interpreted as a Pearson-like correlation coefficient. If the solution is truly iid and exponential, then the NRE should be zero for both the variance and covariance equations.
We computed the NRE for all systems and all cases within each system. For the constant absorber case, given that the rate function is k12(t − τ1), one can show that the covariance equations result in zero normalized residual error. Hence, the system is always iid. The maximum NRE for the variance equations was 0.036 (Case 2); thus, the exponential assumption is very reasonable.
For the variable absorber case, given the bimolecular rate function k12(t − max(τ1, τ2)), one can show that the system is non-iid, i.e., the NRE is non-zero. For the covariance equations, the maximum NRE is 1.1 × 10⁻⁵, suggesting that the system is very close to iid for all cases. The maximum variance NRE is 0.0276 (Case 2). Again, the exponential distribution assumption is reasonable.
Finally, for the correlated homodimer system, the covariance equations yielded a maximum NRE of 1.9 × 10⁻⁴ over all cases, very close to iid. However, the maximum variance NRE was 0.141 (the Ntot = 1500, γd = 1000 case), roughly four to five times higher than the maximum variance NRE of the constant and variable absorber systems (0.036 and 0.0276, respectively).
For the steady-state systems presented, the NRE results provide support for the iid assumption. For the exponential creation time distribution assumption, the correlated homodimer system (the Ntot = 1500, γd = 1000 case) has the largest NRE in the variance equations, which might explain the error between the distributions calculated by the CTD-SSA and the CTD-moments for this case.
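The normalization idea behind the NRE can be illustrated with a sample-based analogue in Python. This sketch is not the NRE defined above, which is obtained by substituting the solution into the exact variance and covariance equations. It only uses the same two facts: an exponential distribution has a variance equal to its squared mean, and independent variables have zero correlation. The function names and the gamma-distributed comparison case are illustrative assumptions.

```python
import random

def mean(v):
    return sum(v) / len(v)

def variance(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

def variance_residual(v):
    """(variance - mean^2) / mean^2, which is zero for exponential samples."""
    m = mean(v)
    return (variance(v) - m * m) / (m * m)

def pearson(a, b):
    """Sample Pearson correlation, which is near zero for independent samples."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (variance(a) * variance(b)) ** 0.5

rng = random.Random(3)
n = 100_000
tau_m = [rng.expovariate(2.0) for _ in range(n)]        # iid exponential
tau_n = [rng.expovariate(0.5) for _ in range(n)]        # independent of tau_m
non_exp = [rng.gammavariate(2.0, 0.5) for _ in range(n)]  # shape 2, not exponential

res_exp = variance_residual(tau_m)
res_gamma = variance_residual(non_exp)
corr = pearson(tau_m, tau_n)
print(res_exp, res_gamma, corr)
```

For the exponential samples both diagnostics are close to zero, while the gamma samples with shape 2 give a variance residual near −0.5, showing how a dimensionless residual of this kind flags departures from the exponential form.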
REFERENCES
- 1. Balaban N., Merrin J., Chait R., Kowalik L., and Leibler S., Science 305, 1622 (2004). doi:10.1126/science.1099390
- 2. Kaern M., Elston T., Blake W., and Collins J., Nat. Rev. Genet. 6, 451 (2005). doi:10.1038/nrg1615
- 3. McAdams H. and Arkin A., Proc. Natl. Acad. Sci. U.S.A. 94, 814 (1997). doi:10.1073/pnas.94.3.814
- 4. Swain P. S., Elowitz M. B., and Siggia E., Proc. Natl. Acad. Sci. U.S.A. 99, 12795 (2002). doi:10.1073/pnas.162041399
- 5. Paulsson J., Phys. Life Rev. 2, 157 (2005). doi:10.1016/j.plrev.2005.03.003
- 6. Drubin D. A., Way J. C., and Silver P. A., Genes Dev. 21, 254 (2007). doi:10.1101/gad.1507207
- 7. McQuarrie D., J. Appl. Prob. 4, 413 (1967). doi:10.2307/3212214
- 8. Gillespie D., Physica A 188, 404 (1992). doi:10.1016/0378-4371(92)90283-V
- 9. van Kampen N., Stochastic Processes in Physics and Chemistry (North-Holland, Amsterdam, 1992).
- 10. Gillespie D., J. Comput. Phys. 22, 403 (1976). doi:10.1016/0021-9991(76)90041-3
- 11. Suel G., Kulkarni R., Dworkin J., Garcia-Ojalvo J., and Elowitz M., Science 315, 1716 (2007). doi:10.1126/science.1137455
- 12. Collins F. and Kimball G., J. Colloid Sci. 4, 425 (1949). doi:10.1016/0095-8522(49)90023-9
- 13. Chevalier M. and El-Samad H., J. Chem. Phys. 137, 084103 (2012). doi:10.1063/1.4746692
- 14. Bracewell R., The Fourier Transform and its Applications, 3rd ed. (McGraw-Hill, 2000).
- 15. See supplementary material at http://dx.doi.org/10.1063/1.4902239E-JCPSA6-141-020445 for a construction of the mathematical terms within the CTD-CME/moment equations due to a given reaction type as well as analytical results involving the exponential distribution.
- 16. Hesthaven J. and Warburton T., Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications, 1st ed. (Springer Publishing Company, Inc., 2007).
- 17. Hesthaven J. and Warburton T., J. Comput. Phys. 181, 186 (2002). doi:10.1006/jcph.2002.7118
- 18. Puthenveedu M., Lauffer B., Temkin P., Vistein R., Carlton P., Thorn K., Taunton J., Weiner O., Parton R., and von Zastrow M., Cell 143, 761 (2010). doi:10.1016/j.cell.2010.10.003
- 19. Munsky B. and Khammash M., J. Chem. Phys. 124, 044104 (2006). doi:10.1063/1.2145882
- 20. Engblom S., Appl. Math. Comput. 180, 498 (2006). doi:10.1016/j.amc.2005.12.032
- 21. Arazi A., Ben-Jacob E., and Yechiali U., Physica A 332, 585 (2004). doi:10.1016/j.physa.2003.07.009
- 22. Deneke C., Lipowsky R., and Valleriani A., PLoS One 8, e55442 (2013). doi:10.1371/journal.pone.0055442
- 23. Gillespie D., J. Chem. Phys. 115, 1716 (2001). doi:10.1063/1.1378322
- 24. Gillespie D., J. Chem. Phys. 113, 297 (2000). doi:10.1063/1.481811
- 25. Chevalier M. and El-Samad H., J. Chem. Phys. 131, 054102 (2009). doi:10.1063/1.3190327