The Journal of Chemical Physics. 2009 Feb 10;130(6):064103. doi: 10.1063/1.3072704

The subtle business of model reduction for stochastic chemical kinetics

Dan T Gillespie 1,a), Yang Cao 2, Kevin R Sanft 3, Linda R Petzold 3
PMCID: PMC2675560  NIHMSID: NIHMS101906  PMID: 19222263

Abstract

This paper addresses the problem of simplifying chemical reaction networks by adroitly reducing the number of reaction channels and chemical species. The analysis adopts a discrete-stochastic point of view and focuses on the model reaction set S1⇌S2→S3, whose simplicity allows all the mathematics to be done exactly. The advantages and disadvantages of replacing this reaction set with a single S3-producing reaction are analyzed quantitatively using novel criteria for measuring simulation accuracy and simulation efficiency. It is shown that in all cases in which such a model reduction can be accomplished accurately and with a significant gain in simulation efficiency, a procedure called the slow-scale stochastic simulation algorithm provides a robust and theoretically transparent way of implementing the reduction.

INTRODUCTION

Biochemical systems typically contain networks of many chemical reaction channels involving many molecular species. This circumstance encourages attempts to construct simpler but equivalent “reduced” reaction networks. A well known example of such a reduction is the Michaelis–Menten abridgment of the enzyme-substrate reactions,1, 2 which has been the subject of many refinements over the years3, 4 and which continues to play an important role in biochemistry today.5

Typically, an abridgment replaces the given reaction network with a network that involves fewer reaction channels and fewer chemical species. Perhaps the simplest reaction set that presents the opportunity for doing that, one that has several features in common with the enzyme-substrate reactions but is mathematically more tractable, is

$$S_1 \underset{c_2}{\overset{c_1}{\rightleftharpoons}} S_2 \xrightarrow{c_3} S_3, \tag{1}$$

where we assume that c1 and c3 are both nonzero. It is tempting to cut to the chase and replace this set of three three-species reactions with one two-species reaction, such as

$$S_1 \xrightarrow{c} S_3, \tag{2}$$

where the reaction constant c is given some “suitable” value. Our focus in this paper will be to determine the conditions under which it is advisable to make such a replacement and to show how the replacement should be implemented. Of course, if a modeler deliberately chooses to model the production of S3 molecules from S1 molecules by reaction 2 instead of by reactions 1, then this issue is moot. But we are assuming here that the modeler believes that reactions 1 really describe what is going on physically, and therefore wants any abridgement of Eq. 1, such as reaction 2, to mimic the salient effects of reactions 1 with reasonable accuracy. A modeler might choose to use reaction 2 instead of reactions 1 because the values of the rate constants c1, c2, and c3 in Eq. 1 are not all known. But choosing an appropriate value for c in Eq. 2 inevitably makes assumptions about those three rate constants; thus, it might be better to use Eq. 1 with those assumptions made explicitly and openly, since that would not only preserve the topology of reactions 1 but also make it easy to incorporate later new information about the unknown rate constants.

The most obvious advantage in replacing reactions 1 with a single S3-producing reaction like Eq. 2 is the reduction in the numbers of reactions and species that we have to contend with. Another advantage might be speeding up the numerical simulation of reactions 1. By simulation we mean here stochastic simulation, since stochasticity often plays a role in cellular systems. But there are two potential drawbacks to such a reduction: First, as will be elaborated on below, this is always an approximation, since it is simply not possible for any single reaction to exactly mimic reactions 1 in all respects. Second, if we want to have the option of embedding reactions 1 in a larger network of reactions, some of which may involve species that get removed in the model reduction, as S2 has in Eq. 2, then it may be impossible to simulate those other reactions when using the reduced model.

In this paper, we will address these matters in detail for reaction set 1. We will begin by presenting some novel perspectives on simulation efficiency and simulation accuracy. We will show that these new perspectives imply that a one-reaction abridgment of Eq. 1 will be advisable in some circumstances, but not in others. We will then show that, in all cases where a model reduction can be done accurately and with a significant gain in stochastic simulation efficiency, implementing the reduction will be more involved than just swapping reactions 1 for reaction 2. Finally, we will establish a new perspective on the results of two recent papers, namely, the slow-scale stochastic simulation algorithm (ssSSA) of Cao et al.,6 and the stochastic quasi-steady-state approximation singular perturbation analysis (sQSPA) of Mastny et al.7

The reaction network 1 we are focusing on here is obviously very simple. But that simplicity allows all the mathematics, which even in this case is nontrivial, to be done exactly, and thus all issues to be explored thoroughly. We believe that a clarification of these issues in the context of reactions 1 can lead to a better understanding of how these issues play out in more complicated reaction networks.

QUANTIFYING THE GAIN IN SIMULATION EFFICIENCY

Well-stirred chemical systems with discrete molecular populations and stochastic reaction dynamics can be exactly simulated by the well-known SSA.8 The only downside is that the SSA is usually quite slow: The SSA simulates every reaction event, so the time required to make a SSA simulation run is proportional to the number of reaction events that occur.

Replacing reactions 1 with a single S3-producing reaction, such as reaction 2, would evidently have the consequence that a new S3 molecule would be produced by each reaction event. In contrast, the creation of a new S3 molecule via reactions 1 usually requires more than one reaction event. Therefore, a fair measure of the gain in simulation efficiency realized by such an abridgment would be the average number of reaction events that are needed by reactions 1 to produce one S3 molecule. That number turns out to be surprisingly easy to compute.

Suppose a molecule starts out as an S1 molecule or, as we shall say, starts in state S1. On each visit to state S2, the molecule has probability c3∕(c2+c3) of going on to state S3 and probability c2∕(c2+c3) of going back to S1. So in n visits to state S2, the molecule would go on to state S3 an average of nc3∕(c2+c3) times; thus, in order to get an average of one visit to state S3, the molecule needs to visit state S2 a total of n1 times, where n1c3∕(c2+c3)=1. It follows that n1=(c2+c3)∕c3. But each of these n1 visits to state S2 requires exactly two reaction events, namely, the reaction R1 that brings the molecule to state S2, and either reaction R2 or R3 that takes the molecule away. Therefore, the average number of reaction events required for the molecule to get from state S1 to state S3 via the reaction set 1 is 2n1=2(c2+c3)∕c3. If the molecule had started in state S2 instead of state S1, it would be exactly one reaction event closer to state S3, so the average number of reaction events for a molecule in state S2 to reach state S3 would be 2n1−1, a difference that is not significant for our purposes here. We thus conclude that the gain in simulation efficiency achieved by replacing reactions 1 with a single S3-producing reaction is approximately

$$G = 2\left(\frac{c_2+c_3}{c_3}\right). \tag{3}$$

That is to say, simulating the production of S3 molecules via some single reaction like Eq. 2 will be approximately G times faster than simulating via reactions 1. The overall gain in simulation efficiency would actually be less than this if reactions 1 were embedded in a larger reaction network that is also simulated.
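The counting argument behind Eq. 3 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it follows single molecules from state S1 to state S3, counts two reaction events per visit to S2, and compares the sample mean with 2(c2+c3)∕c3. The function names `gain` and `mean_events_to_s3`, the sample size, and the seed are illustrative assumptions.

```python
import random

def gain(c2, c3):
    """Eq. 3: average number of reaction events needed to produce one S3 molecule."""
    return 2.0 * (c2 + c3) / c3

def mean_events_to_s3(c2, c3, n_molecules=100_000, seed=0):
    """Monte Carlo estimate of the mean number of events for one molecule to go S1 -> S3."""
    rng = random.Random(seed)
    p_forward = c3 / (c2 + c3)      # probability that a visit to S2 ends with R3
    total = 0
    for _ in range(n_molecules):
        events = 0
        while True:
            events += 2             # R1 into S2, then either R2 or R3 out of S2
            if rng.random() < p_forward:
                break               # R3 fired: the molecule is now in state S3
        total += events
    return total / n_molecules

print(gain(1.0, 1.0), mean_events_to_s3(1.0, 1.0))
```

Note that c1 does not appear anywhere in the sketch, in line with the observation that the gain depends only on c2 and c3.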

Two things are noteworthy about the result 3. First, the gain depends on c2 and c3 but not on c1. Second, we will have

$$G \gg 1 \quad \text{if and only if} \quad c_2 \gg c_3. \tag{4}$$

The assertion of Eq. 3 that G=2 when c2=0 is an obvious result; the assertion that G=4 when c2=c3 is perhaps less obvious. But both of those efficiency gains are modest compared to the gains achievable when c2 is one or more orders of magnitude larger than c3. It follows that one should carefully examine one’s goals in reducing reactions 1 when the strong inequality c2⪢c3 is not satisfied.

ACCURACY: THE IMPORTANCE OF BEING EXPONENTIAL

Let T(x1,x2) be the time to the next firing of reaction R3 in Eq. 1 when there are x1 S1 molecules and x2 S2 molecules. Obviously, T(x1,x2) will be some kind of random variable. In this section, we will show that a necessary condition for reactions 1 to be accurately replaceable by a single S3-producing reaction, for example, reaction 2, is that T(x1,x2) be approximately an exponential random variable. In Sec. 4, we will calculate the exact probability density function (pdf) of T(x1,x2), with the aim of finding out under what conditions this exponentiality requirement is satisfied.

Stochastic chemical kinetics, which encompasses both the SSA and the chemical master equation, assumes that the dynamics of reaction 2 are described by8

$$c\,dt = \text{the probability that a given } S_1 \text{ molecule in reaction (2) will become an } S_3 \text{ molecule in the next infinitesimal time interval } dt. \tag{5}$$

Verification of this critical condition is awkward to accomplish directly. A more convenient but completely equivalent condition is afforded by the following theorem, which is proved in Appendix A.

Theorem 1. Condition 5 is equivalent to saying that the time required for a given S1 molecule to become an S3 molecule via reaction 2 is an exponential random variable with mean 1∕c.

We recall that the exponential random variable with mean 1∕c is defined to be the random variable on 0≤t<∞ which has pdf c exp(−ct) and cumulative distribution function (cdf) 1−exp(−ct). It follows from Theorem 1 that reaction 2 cannot be simulated using the SSA, nor analyzed via the chemical master equation, unless the time required for a given S1 molecule to become an S3 molecule via reaction 2 is, at least approximately, an exponential random variable. This result has motivated some recent molecular dynamics studies of excluded volume effects in simple well-stirred one- and two-dimensional chemical systems.9

It might be thought that this exponential requirement, being stochastic, would not apply if we were content to describe reaction 2 in terms of traditional deterministic chemical kinetics. However, that is not true. To see why, recall that the traditional reaction rate equation for Eq. 2, written in terms of the number Xi(t) of Si molecules at time t, and assuming X3(0)=0, is

$$\frac{dX_3(t)}{dt} = cX_1(t) = c\,\bigl(X_1(0) - X_3(t)\bigr). \tag{6a}$$

The solution to this equation is

$$X_3(t) = X_1(0)\,\bigl(1 - e^{-ct}\bigr). \tag{6b}$$

Consistency requires that X3(t) in Eq. 6b should accurately describe the behavior of the average number of S3 molecules in the stochastic formulation (note that we are dealing here with a linear first-order reaction). Let f(t) be the cdf for the time-to-reaction τ of any particular S1 molecule; i.e., f(t) is the probability that τ≤t, and hence the probability that an S1 molecule will have become an S3 molecule by time t. Then since the S1 molecules react independently of each other, the probability that exactly n of them will have become an S3 molecule by time t is

$$\frac{X_1(0)!}{n!\,[X_1(0)-n]!}\,[f(t)]^{n}\,[1-f(t)]^{X_1(0)-n}.$$

This implies that the number of S3 molecules created in time t is the binomial random variable B(f(t),X1(0)). Since the mean of that random variable is f(t)X1(0), then agreement with Eq. 6b requires that

$$X_1(0)\,\bigl(1 - e^{-ct}\bigr) = f(t)\,X_1(0),$$

or

$$f(t) = 1 - e^{-ct}. \tag{7}$$

But this is the cdf of the exponential random variable with mean 1∕c. Thus we conclude that the time τ to reaction 2 for each individual S1 molecule must be exponentially distributed in order for the deterministic rate equations 6 to be valid.

For an example of a nonexponential τ-distribution that is clearly inconsistent with Eq. 6b, suppose that τ were uniformly distributed in the interval [1∕c−ε, 1∕c+ε) for some ε<1∕c. Then the mean time-to-reaction for each S1 molecule would indeed be 1∕c. But with this lifetime distribution, the number of S3 molecules would obviously stay at zero until time 1∕c−ε, and then rise roughly linearly to X1(0) in a time 2ε. This is clearly not the behavior predicted by formula 6b.
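The contrast between the two lifetime distributions can be made concrete with a short sketch. It evaluates the mean S3 population f(t)X1(0) for an exponential lifetime and for the uniform lifetime of the example above; both have mean 1∕c. The function names and the numerical values of c, ε, and X1(0) used in the demonstration are illustrative assumptions.

```python
import math

def cdf_exponential(t, c):
    """Cdf of the exponential lifetime with mean 1/c (Eq. 7)."""
    return 1.0 - math.exp(-c * t) if t > 0.0 else 0.0

def cdf_uniform(t, c, eps):
    """Cdf of a lifetime uniformly distributed on [1/c - eps, 1/c + eps)."""
    lo, hi = 1.0 / c - eps, 1.0 / c + eps
    if t <= lo:
        return 0.0
    if t >= hi:
        return 1.0
    return (t - lo) / (hi - lo)

def mean_x3(t, x1_0, cdf, *args):
    """Mean number of S3 molecules at time t: the binomial mean f(t) * X1(0)."""
    return x1_0 * cdf(t, *args)

# Same mean lifetime 1/c = 1, but very different mean trajectories.
for t in (0.5, 1.0, 1.5):
    print(t, mean_x3(t, 300, cdf_exponential, 1.0), mean_x3(t, 300, cdf_uniform, 1.0, 0.25))
```

Only the exponential case reproduces the deterministic solution 6b at every time t.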

The relevance of the foregoing result to the problem of replacing reactions 1 with some single S3-producing reaction such as Eq. 2 can be understood as follows. If there are x1 S1 molecules in the system, then Eq. 5 and the addition law of probability imply that the probability that reaction 2 will fire in the next dt is x1×c dt=cx1 dt. More generally, any single reaction that produces one S3 molecule will have the property that, for some state-dependent function a which is called the reaction’s propensity function, a dt gives the probability that the reaction will fire in the next dt. This implies, by the same reasoning that led to Theorem 1, that the time to the next firing of that reaction will be exponentially distributed with mean 1∕a. Therefore, if this reaction is to be a surrogate for reactions 1 (a replacement that approximately replicates the way in which reactions 1 produce S3 molecules), then the time T(x1,x2) to the next firing of reaction R3 in Eq. 1 must be, at least approximately, exponentially distributed. If that turns out to be so, then an approximating surrogate reaction for Eq. 1 should exist. But if T(x1,x2) is found to be clearly nonexponential, then we must conclude that reactions 1 cannot be accurately replaced by a single S3-producing reaction.

DISTRIBUTION OF THE TIME TO THE NEXT R3 REACTION

In Appendix B, we prove that the pdf P(t;x1,x2) of T(x1,x2), the time to the next R3 reaction in Eq. 1 when there are x1 S1 molecules and x2 S2 molecules, is given by Eq. B18. In that formula, the four functions p(β,t|α,0) for α, β=1,2 are given explicitly by Eqs. B13, B14, and

$$\lambda_{\pm} \equiv \tfrac{1}{2}\left[(c_1+c_2+c_3) \pm \sqrt{(c_1+c_2+c_3)^2 - 4c_1c_3}\,\right]. \tag{8}$$

That P(t;x1,x2) in Eq. B18 is not generally exponential can be seen by noting that its form for T(1,0), the time for a single S1 molecule to become an S3 molecule via reactions 1, turns out to be10

$$P(t;1,0) = \frac{c_1c_3}{(\lambda_+ - \lambda_-)}\left[e^{-\lambda_- t} - e^{-\lambda_+ t}\right]. \tag{9}$$

This pdf is obviously not exponential; indeed, it vanishes at t=0, whereas the pdf of any exponential random variable has its maximum at t=0. It also follows from Eq. B18 that the pdf of the time for a single S2 molecule to become an S3 molecule via reactions 1 is

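$$P(t;0,1) = \frac{c_3}{(\lambda_+ - \lambda_-)}\left[(c_1 - \lambda_-)\,e^{-\lambda_- t} + (\lambda_+ - c_1)\,e^{-\lambda_+ t}\right]. \tag{10}$$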

Although this pdf achieves its maximum at t=0, it still does not generally have a simple exponential form. Plots of the two pdfs 9, 10 for c1=c3=1 and c2=0.1 are shown in Fig. 1 on a semilog scale, where a truly exponential pdf would appear as a downsloping straight line. The nonexponential character of P(t;1,0) is obvious; that of P(t;0,1) is evinced by a gradual change in slope around t=2.
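As an illustrative numerical check of Eqs. 8, 9, and 10, the sketch below evaluates the two pdfs for the parameter values of Fig. 1 and verifies by simple trapezoidal integration that the pdf 9 is normalized and that its mean is (c1+c2+c3)∕(c1c3). The helper names and the integration settings are assumptions made for this sketch; the formulas require λ+≠λ−.

```python
import math

def lambdas(c1, c2, c3):
    """Eq. 8: return (lambda_minus, lambda_plus)."""
    s = c1 + c2 + c3
    root = math.sqrt(s * s - 4.0 * c1 * c3)
    return 0.5 * (s - root), 0.5 * (s + root)

def pdf_from_s1(t, c1, c2, c3):
    """Eq. 9: pdf of the time for a single S1 molecule to become an S3 molecule."""
    lm, lp = lambdas(c1, c2, c3)
    return c1 * c3 / (lp - lm) * (math.exp(-lm * t) - math.exp(-lp * t))

def pdf_from_s2(t, c1, c2, c3):
    """Eq. 10: pdf of the time for a single S2 molecule to become an S3 molecule."""
    lm, lp = lambdas(c1, c2, c3)
    return c3 / (lp - lm) * ((c1 - lm) * math.exp(-lm * t) + (lp - c1) * math.exp(-lp * t))

def trapezoid(f, t_max=60.0, n=60_000):
    """Trapezoidal estimate of the integral of f over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max))
    for k in range(1, n):
        total += f(k * h)
    return total * h

c = (1.0, 0.1, 1.0)  # c1, c2, c3 as in Fig. 1
norm = trapezoid(lambda t: pdf_from_s1(t, *c))      # should be close to 1
mean = trapezoid(lambda t: t * pdf_from_s1(t, *c))  # should be close to 2.1
print(norm, mean, pdf_from_s1(0.0, *c), pdf_from_s2(0.0, *c))
```

The value of the pdf 10 at t=0 is c3, the rate at which a lone S2 molecule undergoes R3, whereas the pdf 9 starts from zero.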

Figure 1.


Semilog plots of P(t;x1,x2) for c1=c3=1 and c2=0.1 for two cases: The solid curve is for x1=1 and x2=0, from Eq. 9. The dashed curve is for x1=0 and x2=1, from Eq. 10. Neither pdf has the straight-line form of an exponential pdf (there is a gradual change in slope in the dashed curve around t=2). The figure also shows that P(t;x1,x2) in this case depends on x1 and x2 individually, and not just on their sum.

The consequences of the nonexponential form of the pdf P(t;1,0) in Eq. 9 are illustrated in Fig. 2. The jagged solid curve shows a single X3(t) trajectory obtained in an exact SSA run of reactions 1, using the parameter values c1=c3=1 and c2=0.1 and the initial conditions X1(0)=300 and X2(0)=X3(0)=0. The dashed curve shows the average of 10 000 such trajectories. It can be shown that the mean of the pdf in Eq. 9 is (c1+c2+c3)∕(c1c3), which in this case equals 2.1; i.e., the average time for an S1 molecule to become an S3 molecule via reactions 1 in this case is 2.1. If we made the usual deterministic assumption that formula 6a applies with c=1∕2.1, then Eq. 6b would give the trajectory shown as the dotted curve in Fig. 2. The mismatch between that curve and the dashed curve illustrates the inappropriateness of replacing reactions 1 with reaction 2 when the time between R3 reactions is not exponentially distributed.

Figure 2.


The solid curve shows a single X3(t) trajectory obtained in a SSA run of reactions 1 with c1=c3=1, c2=0.1, X1(0)=300, and X2(0)=X3(0)=0. The dashed curve shows the average of 10 000 such trajectories. The dotted curve plots the function 6b with c=1∕2.1, which corresponds to the same mean S1→S3 conversion time. The mismatch between the dashed curve and the dotted curve indicates the error that results from replacing reactions 1 with reaction 2 in this nonexponential case.
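The kind of comparison shown in Fig. 2 can be reproduced in miniature. The following sketch is a plain direct-method SSA for reactions 1, written for illustration only and not taken from the authors' code; it samples X3 at a single time and compares the sample mean with formula 6b evaluated at c=1∕2.1. The function names, the number of runs, the sampling time t=1, and the seed are assumptions.

```python
import math
import random

def ssa_x3_at(t_end, c1, c2, c3, x1, x2=0, x3=0, rng=random):
    """Direct-method SSA for S1 <-> S2 -> S3; returns the S3 population at time t_end."""
    t = 0.0
    while True:
        a1, a2, a3 = c1 * x1, c2 * x2, c3 * x2   # propensities of R1, R2, R3
        a0 = a1 + a2 + a3
        if a0 == 0.0:
            return x3                             # no S1 or S2 molecules remain
        t += rng.expovariate(a0)                  # time to the next reaction event
        if t > t_end:
            return x3
        r = rng.random() * a0                     # choose which reaction fires
        if r < a1:
            x1, x2 = x1 - 1, x2 + 1               # R1: S1 -> S2
        elif r < a1 + a2:
            x1, x2 = x1 + 1, x2 - 1               # R2: S2 -> S1
        else:
            x2, x3 = x2 - 1, x3 + 1               # R3: S2 -> S3

def rre_x3(t, c, x1_0):
    """Formula 6b for the single-reaction model."""
    return x1_0 * (1.0 - math.exp(-c * t))

rng = random.Random(1)
runs = [ssa_x3_at(1.0, 1.0, 0.1, 1.0, 300, rng=rng) for _ in range(200)]
ssa_mean = sum(runs) / len(runs)
print(ssa_mean, rre_x3(1.0, 1.0 / 2.1, 300))
```

At this early time the single-reaction formula overestimates the mean S3 population, because the pdf 9 vanishes at t=0 while an exponential pdf is largest there.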

The additional revelation in Fig. 1 that P(t;1,0) is not the same curve as P(t;0,1) illustrates another potential problem for model reduction: While any acceptable single-reaction abridgment of reactions 1 will accurately replicate the time evolution of the S3 population, and hence also the time evolution of the total S1 and S2 population, the abridgment might not accurately replicate the time evolutions of the S1 and S2 populations separately; e.g., reaction 2 gives us no indication of the S2 population. Therefore, if P(t;x1,x2) depends on x1 and x2 individually, and not just on their sum

$$x_1 + x_2 \equiv x_{12}, \tag{11}$$

as Fig. 1 shows happens when c1=c3=1, c2=0.1, and x12=1, then the lack of information about the individual values of x1 and x2 could make using the abridged reaction in a real simulation impossible, even if P(t;x1,x2) were exponential.

A close inspection of Eqs. B18, B13, B14 reveals that P(t;x1,x2) is in general a polynomial in exp(−λ−t) and exp(−λ+t). From Eq. 8, it can be shown that when both c1 and c3 are positive, as we are assuming here, then 0<λ−≤λ+. Therefore, a necessary condition for P(t;x1,x2) to be approximately exponential is for the rate constants to be such that

$$\lambda_- \ll \lambda_+, \tag{12}$$

because then, all terms involving exp(−λ+t) will be negligibly small for t⪢1∕λ+, and we can hope that the t-dependence for t⪢1∕λ+ will be given by some power of exp(−λ−t).

When 0<λ−≤λ+, the extreme inequality 12 will be satisfied if and only if

$$\frac{\lambda_+\lambda_-}{(\lambda_+ + \lambda_-)^2} \ll 1. \tag{13}$$

Since Eq. 8 implies that λ+λ−=c1c3 and λ++λ−=c1+c2+c3, then condition 13 is the same as

$$\frac{c_1c_3}{(c_1+c_2+c_3)^2} = \left(\frac{c_1}{c_1+c_2+c_3}\right)\left(\frac{c_3}{c_1+c_2+c_3}\right) \ll 1. \tag{14}$$

Since each factor in the middle of Eq. 14 is less than 1, then the right inequality in Eq. 14 can be satisfied if and only if at least one of those two factors is ⪡1. The first factor will be ⪡1 if and only if c2+c3⪢c1, which is the same as either c2⪢c1 or c3⪢c1. And the second factor will be ⪡1 if and only if c1+c2⪢c3, which is the same as either c1⪢c3 or c2⪢c3. Thus we conclude that condition 12 will be satisfied if and only if at least one of the following four conditions holds:

$$c_2 \gg c_1, \tag{15a}$$
$$c_3 \gg c_1, \tag{15b}$$
$$c_1 \gg c_3, \tag{15c}$$
$$c_2 \gg c_3. \tag{15d}$$

Note that these four conditions are not mutually exclusive; e.g., the condition c2⪢c3⪢c1 satisfies both conditions 15d, 15b. Nor are these conditions collectively exhaustive; e.g., the condition c1=c2=c3 satisfies none of conditions 15. But satisfaction of at least one of conditions 15 is necessary, and as we shall see shortly sufficient, for P(t;x1,x2) to be exponential.
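A small sketch can make conditions 15 and the separation measure of Eq. 14 easy to experiment with. The strong inequality ⪢ has no sharp numerical definition, so the sketch reads "x⪢y" as "x is at least `factor` times y"; that threshold, like the function names, is an arbitrary assumption made here for illustration.

```python
def separation(c1, c2, c3):
    """Left side of Eq. 14, which equals lambda_+ lambda_- / (lambda_+ + lambda_-)^2."""
    s = c1 + c2 + c3
    return c1 * c3 / (s * s)

def satisfied_conditions(c1, c2, c3, factor=100.0):
    """List which of conditions 15a-15d hold, reading 'x >> y' as x >= factor * y."""
    checks = {
        "15a": c2 >= factor * c1,
        "15b": c3 >= factor * c1,
        "15c": c1 >= factor * c3,
        "15d": c2 >= factor * c3,
    }
    return [name for name, ok in checks.items() if ok]

print(separation(1.0, 1.0, 1.0), satisfied_conditions(1.0, 1.0, 1.0))
print(separation(1.0, 100.0, 1.0), satisfied_conditions(1.0, 100.0, 1.0))
```

With c1=c2=c3 the separation measure is 1∕9 and no condition holds; raising c2 by two orders of magnitude drives the measure far below 1 and satisfies 15a and 15d together.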

Assume now that at least one of conditions 15 is satisfied. Then the strong inequality 14 will also be satisfied, so we will have from Eq. 8 that

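$$\lambda_{\pm} = \frac{(c_1+c_2+c_3)}{2}\left[1 \pm \sqrt{1 - \frac{4c_1c_3}{(c_1+c_2+c_3)^2}}\,\right] \approx \frac{(c_1+c_2+c_3)}{2}\left[1 \pm \left(1 - \frac{1}{2}\,\frac{4c_1c_3}{(c_1+c_2+c_3)^2}\right)\right],$$

whence

$$\lambda_+ \approx c_1+c_2+c_3, \qquad \lambda_- \approx \frac{c_1c_3}{c_1+c_2+c_3}. \tag{16}$$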

When Eq. 16 is substituted into Eqs. B13, B14, and the results are substituted into Eq. B18, we obtain, since λ−⪡λ+,

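$$P(t;x_1,x_2) \approx \frac{c_1c_3x_1}{c_1+c_2+c_3}\,e^{-\lambda_-(x_1+x_2)t}\left(\frac{c_1+c_2}{c_1+c_2+c_3}\right)^{x_2} + \frac{c_1c_3x_2}{c_1+c_2+c_3}\,e^{-\lambda_-(x_1+x_2)t}\left(\frac{c_1+c_2}{c_1+c_2+c_3}\right)^{x_2-1} \quad \left(t \gg \frac{1}{\lambda_+}\right). \tag{17}$$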

The restriction on t here ensures that all terms involving exp(−λ+t) have become negligibly small. Again, this approximation assumes that at least one of conditions 15 is satisfied.

Now let us examine Eq. 17 for the individual conditions 15. First, if either condition 15a or 15b holds, so that c2+c3⪢c1, then Eq. 16 gives λ+≈c2+c3 and λ−≈c1c3∕(c2+c3). Equation 17 simplifies slightly, in that c1 gets dropped from all denominators. Further simplification of Eq. 17 follows from the observation that the condition c2+c3⪢c1 implies that reaction R1, which creates S2 molecules, will occur much less frequently than reactions R2 and R3, which destroy S2 molecules. The S2 population will thus usually be very small, and a reasonable approximation would be to set x2≈0, and hence x1≈x12. With those approximations, the second term in Eq. 17 is effectively removed and the equation finally reduces to

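$$P(t;x_1,x_2) \approx \left(\frac{c_1c_3x_{12}}{c_2+c_3}\right) e^{-\left(c_1c_3x_{12}/(c_2+c_3)\right)t} \quad \left(t \gg (c_2+c_3)^{-1},\ \text{for (15a) or (15b)}\right). \tag{18}$$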

Since this pdf has the exponential form, an accurate single-reaction abridgment should be possible. And the decay constant in Eq. 18 will be the propensity function of the surrogate reaction. The fact that this decay constant depends on x1 and x2 only through their sum x12 suggests that the reduced model should be amenable to simulation.

Now suppose that either condition 15c or condition 15d holds. Then c1+c2⪢c3, and Eq. 16 gives λ+≈c1+c2 and λ−≈c1c3∕(c1+c2). The relation c1+c2⪢c3 implies that c3 can be dropped from all denominators in Eq. 17. That equation then reduces to

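$$P(t;x_1,x_2) \approx \left(\frac{c_1c_3x_{12}}{c_1+c_2}\right) e^{-\left(c_1c_3x_{12}/(c_1+c_2)\right)t} \quad \left(t \gg (c_1+c_2)^{-1},\ \text{for (15c) or (15d)}\right). \tag{19}$$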

Again this pdf has the exponential form, with the decay constant depending on x1 and x2 only through their sum x12. Therefore, replacing reactions 1 with a single S3-producing reaction, whose propensity function is the decay constant in Eq. 19, should be feasible. Note that Eq. 19 does not assume, as Eq. 18 does, that x2≈0.
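To get a feel for how good the approximate decay constants in Eqs. 18 and 19 are, one can compare them with x12 times the exact slow eigenvalue from Eq. 8. The sketch below does this for two sets of rate constants chosen here purely for illustration; the function names are likewise assumptions. The exact slow eigenvalue is computed from the product relation λ+λ−=c1c3 to avoid subtracting two nearly equal numbers.

```python
import math

def lambda_minus_exact(c1, c2, c3):
    """Exact slow eigenvalue of Eq. 8, computed as c1*c3 / lambda_plus."""
    s = c1 + c2 + c3
    lam_plus = 0.5 * (s + math.sqrt(s * s - 4.0 * c1 * c3))
    return c1 * c3 / lam_plus

def decay_constant_18(c1, c2, c3, x12):
    """Decay constant of Eq. 18, intended for conditions 15a or 15b (with x2 ~ 0)."""
    return c1 * c3 * x12 / (c2 + c3)

def decay_constant_19(c1, c2, c3, x12):
    """Decay constant of Eq. 19, intended for conditions 15c or 15d."""
    return c1 * c3 * x12 / (c1 + c2)

def relative_error(approx, exact):
    return abs(approx - exact) / exact

x12 = 300
slow_r1 = (1e-3, 1.0, 1.0)   # illustrative values with c2, c3 >> c1
slow_r3 = (3.0, 2.0, 1e-4)   # illustrative values with c1, c2 >> c3
print(relative_error(decay_constant_18(*slow_r1, x12), x12 * lambda_minus_exact(*slow_r1)))
print(relative_error(decay_constant_19(*slow_r3, x12), x12 * lambda_minus_exact(*slow_r3)))
```

In both illustrative cases the relative error is of the order of the small ratio of rate constants, which is what the derivation of Eq. 16 suggests.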

IMPLEMENTING THE REDUCED MODEL

We showed in Sec. 4 that an accurate replacement of reactions 1 by a single S3-producing reaction should be possible under conditions 15a, 15b, 15c, 15d. More specifically, the result in Eq. 18 shows that under conditions 15a, 15b the S3-producing reaction should have the propensity function

$$a = \frac{c_1c_3x_{12}}{c_2+c_3} \quad (\text{for } c_2 \gg c_1 \text{ or } c_3 \gg c_1), \tag{20}$$

with the understanding that x2≈0, and that we are not interested in phenomena occurring on timescales of order 1∕(c2+c3) or smaller. And the result in Eq. 19 shows that under conditions 15c, 15d the S3-producing reaction should have propensity function

$$a = \frac{c_1c_3x_{12}}{c_1+c_2} \quad (\text{for } c_1 \gg c_3 \text{ or } c_2 \gg c_3), \tag{21}$$

with no restrictions on x2, but with the understanding that we are not interested in phenomena occurring on timescales of order 1∕(c1+c2) or smaller. But exactly how should the replacement reaction be framed in these cases?

First let us dispose of two “obvious” cases in which c2≈0, and reaction R2 practically never fires. The first case couples that condition with condition 15b, c3⪢c1: In a short time of order 1∕c3, all S2 molecules become S3 molecules via reaction R3; thereafter, the S1 molecules convert to S3 molecules essentially via reaction 2 with the approximate propensity function a=c1x1, since each R1 firing will practically always be followed immediately by an R3 firing. This result also follows by setting c2≈0 in Eq. 20, and remembering that x12≈x1 since Eq. 20 assumes that x2≈0. The other obvious case couples condition c2≈0 with condition 15c, c1⪢c3: In a short time of order 1∕c1, all S1 molecules become S2 molecules via reaction R1; thereafter, the S2 molecules convert to S3 molecules via reaction R3 with the approximate propensity function a=c3x2=c3x12. This result also follows by putting c2≈0 in Eq. 21. In both of these obvious cases, the simulation speedup factor realized by the abridgment is about 2, which is rather modest.

A more interesting case arises by conjoining conditions 15a, 15b, and requiring that both c2 and c3 be ⪢c1, a condition that we will write c2, c3⪢c1. This condition has been analyzed in detail by Mastny et al.7 using a reduction method which they call the sQSPA. The conclusion of their analysis (see Ref. 7, Table II) expressed in our notation here is that reactions 1 can be replaced by reaction 2 with propensity function a=c1c3x1∕(c2+c3). This is the same as our result in Eq. 20, since the assumption c2, c3⪢c1 in Ref. 7 implies x2≈0, and hence x1≈x12. Our first-passage-time analysis thus confirms the result of Mastny et al.,7 including the proviso which is implicit in their derivation that this approximation is valid only on timescales larger than 1∕(c2+c3). The resulting gain in simulation efficiency 3 will be large or small according to whether c2∕c3 is large or small. But we note that the condition x2≈0 would appear to pose a problem if we wanted to embed the abridged reaction in a network of other reactions, some of which create or consume S2 molecules.

Another interesting case is Eq. 15d, c2⪢c3. We showed in Sec. 2 that this is the condition for a truly substantial speedup in stochastic simulation. But it turns out that simply replacing reactions 1 with reaction 2 using the propensity function 21 has some limitations. To illustrate, we have used the exact SSA to simulate each of reactions 1, 2 for parameter values

$$c_1 = 3, \qquad c_2 = 2, \qquad c_3 = 10^{-4} \tag{22a}$$

and the initial conditions

$$X_1(0) = 300, \qquad X_2(0) = X_3(0) = 0. \tag{22b}$$

Figure 3 shows the results of the SSA simulation of reactions 1. In this figure, the species populations have been plotted out immediately after the occurrence of each R3 reaction, so only 300 points get plotted in the conversion of the 300 S1 molecules. But approximately 1.2×10^7 reaction events had to be simulated in order to get those 300 R3 reactions, so there were on average 4×10^4 R1 and R2 reaction events between successive R3 reaction events, a figure that agrees with formula 3.

Figure 3.


A “true” picture of reactions 1 for the parameter settings 22 is provided by this SSA run of those reactions. Here the species populations have been plotted out only after each R3 event. Since the S3 population remains constant between successive R3 reactions, this plotting strategy reveals the full trajectory of X3(t). But of course, the populations of species S1 and S2 are not constant between successive R3 reactions.

Figure 4 shows the results of the SSA simulation of the surrogate reaction 2 using the propensity function in Eq. 21 and the same parameter values 22 as used in Fig. 3. Here the populations have been plotted out after each simulated reaction. Since only 300 reaction events were simulated in this run, compared to the 1.2×10^7 events that were simulated to produce Fig. 3, the gain in simulation efficiency achieved by using the surrogate reaction 2 is truly large. Comparing Fig. 4 with Fig. 3 shows that the surrogate reaction 2 does give a satisfactory representation of the X3(t) trajectory, just as we expect on the basis of our analysis. But reaction 2 evidently does not provide a satisfactory representation of the X1(t) trajectory; furthermore, it gives us no information at all about the X2(t) trajectory. The explanation for these shortcomings is not hard to fathom: When we stop simulating reactions R1 and R2, as we do when we substitute reaction 2 for reactions 1, we lose the ability to accurately track the populations of species S1 and S2.

Figure 4.


A SSA simulation of the surrogate reaction 2 with propensity function 21, using the same settings 22 as in Fig. 3. Only 300 reaction events were simulated here, as compared to 1.2×10^7 reaction events in Fig. 3, so the gain in computational speed over reactions 1 is truly enormous. The X3(t) trajectory has been accurately rendered. But the X1(t) trajectory has not, and the X2(t) trajectory has been completely lost.

If we were interested in only X3(t), and if reactions 1 were the only reactions in the system that involve species S1 and S2, then we might be satisfied with this state of affairs. But we are often concerned with situations in which reactions 1 take place concurrently with other reactions, some of which have one or both of species S1 and S2 as reactants. With no reliable information about the instantaneous populations of S1 and S2 when using reaction 2, how are we to evaluate the propensity functions of those other reactions in order to simulate their firings along with the firings of reaction 2? Evidently, simply replacing reactions 1 with reaction 2 when c2⪢c3 will not be satisfactory if there are other reactions in the system that have S1 and S2 as reactants, or if we want to see how species S1 and S2 behave on the timescale of reaction R3.

THE SLOW-SCALE SSA: A ROBUST RECIPE FOR CONDITION 15d

We will show in this section that, under condition 15d,

$$c_2 \gg c_3, \tag{23}$$

replacing reactions 1 with a single S3-producing reaction can be accurately and robustly accomplished using a procedure called the ssSSA. Designed more generally for “stiff” stochastic systems (systems with a wide separation of timescales with the fastest mode being stable), the ssSSA was introduced in Ref. 6 by some of the present authors, and is basically a refinement of ideas introduced earlier by Haseltine and Rawlings11 and Rao and Arkin.12 Instead of replacing reactions 1 with a single new reaction like Eq. 2, the ssSSA eliminates reactions R1 and R2, and then uses a modified propensity function for reaction R3.

We should note that condition 23 differs from, and thus corrects, the condition advertised in Ref. 6 for applying the ssSSA to reactions 1.13 Also, as will be explained in the next paragraph, condition 23 does not need to be supplemented by c1⪢c3.

When condition 23 is satisfied, an S2 molecule is much more likely to become an S1 molecule than an S3 molecule. Thus, successive occurrences of reaction R3 will usually be separated by very many occurrences of reactions R1 and R2; indeed, as we showed in Sec. 2, there will be on average 2(c2+c3)∕c3 R1 and R2 reactions occurring between successive R3 reactions, and that number will be ⪢1 when condition 23 holds. Since R1 and R2 will be firing much more frequently than R3, we call R1 and R2 “fast reactions,” and R3 a “slow reaction.” Notice that the designation of R1 as a fast reaction under condition 23 is justified regardless of the size of c1>0, because between two successive R3 reactions, there will inevitably be as many R1 firings as R2 firings. And it is the average frequency of firing, not the size of the reaction rate constant, that determines whether a reaction is “fast” or “slow” for the ssSSA. Species S1 and S2 are then designated as “fast species” because their populations get changed by a fast reaction, and S3 is called a “slow species” because its population does not.

The fast species populations evolving under only the fast reactions, i.e., S1⇌S2 with forward rate constant c1 and reverse rate constant c2, constitute what is called the virtual fast process. We denote it by (X̂1(t),X̂2(t)), using the hat to distinguish it from the real fast process (X1(t),X2(t)) which evolves under all three reactions in Eq. 1. For the virtual fast process (but not for the real fast process), we have the conservation relation

$$\hat{X}_1(t) + \hat{X}_2(t) = x_{12} \quad (\text{a constant}); \tag{24}$$

therefore, the virtual fast process has only one independent variable. We choose it to be X̂2(t), and then take X̂1(t)=x12−X̂2(t). The process X̂2(t) thus evolves according to the propensity functions

$$\hat{a}_1(x_2) = c_1(x_{12} - x_2), \qquad \hat{a}_2(x_2) = c_2x_2,$$

with X̂2(t) increasing by 1 each time R1 fires, and decreasing by 1 each time R2 fires. This simple stochastic process has been well studied.14 It can be shown that its t→∞ limit X̂2(∞) is the binomial random variable with parameters c1∕(c1+c2) and x12:

$$\hat{X}_2(\infty) = B\!\left(\frac{c_1}{c_1+c_2},\, x_{12}\right). \tag{25}$$

Since B(p,N) has mean Np and variance Np(1−p), then

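$$\langle \hat{X}_2(\infty) \rangle = \frac{c_1x_{12}}{c_1+c_2} \quad \text{and} \quad \operatorname{var}\{\hat{X}_2(\infty)\} = \frac{c_1c_2x_{12}}{(c_1+c_2)^2}. \tag{26a}$$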

It then follows from Eq. 24 (or by symmetry) that

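$$\langle \hat{X}_1(\infty) \rangle = \frac{c_2x_{12}}{c_1+c_2} \quad \text{and} \quad \operatorname{var}\{\hat{X}_1(\infty)\} = \frac{c_1c_2x_{12}}{(c_1+c_2)^2}. \tag{26b}$$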

Notice that the asymptotic distribution of the virtual fast process depends on the current state (x1,x2,x3)≡x only through the quantity x1+x2=x12. That these few facts about X̂2(t) are all that is needed to construct a computationally viable abridgment of reactions 1 when c2⪢c3 is a consequence of the following theorem.

Theorem 2. Given condition 23, let the system be in state (x1,x2,x3)≡x at time t. Then for any δt that is large compared to the expected time to the next R1 or R2 reaction, but small compared to the expected time to the next R3 reaction, the probability that reaction R3 will fire in $[t, t+\delta t)$ is approximately $\bar{a}_3(x)\,\delta t$, where

$$\bar{a}_3(x) \equiv c_3 \langle \hat{X}_2(\infty) \rangle. \tag{27}$$

Furthermore, $\hat{X}_2(\infty)$ and $x_{12} - \hat{X}_2(\infty)$ provide good estimates of the populations of species S2 and S1 at any time after $t+\delta t$ but before the next R3 reaction occurs.

This theorem is proved in Appendix C. It says, first of all, that $\bar{a}_3(x)$ as defined in Eq. 27 is the “effective propensity function” of reaction R3 on the timescale of that (slow) reaction. This is so because the defining attribute of a propensity function is that its product with an “effectively infinitesimal” time span gives the probability that the reaction will occur in that time span. With Eq. 26a, Eq. 27 takes the explicit form

$$\bar{a}_3(x) = \frac{c_3 c_1 x_{12}}{c_1+c_2}. \tag{28}$$

Note that this is the same as the propensity function 21 that our first-passage-time analysis gave for condition 23. Theorem 2 also tells us that the S2 and S1 populations at any time greater than δt after the last R3 reaction can be estimated by drawing a sample x2 of the random variable $\hat{X}_2(\infty)$ in Eq. 25, and then taking X2=x2 and X1=x12−x2.

The critical assumption used in proving Theorem 2 (see Appendix C) is that between successive firings of reaction R3 there will typically be many firings of reactions R1 and R2. We showed in Sec. 2 that this will always be so if condition 23 holds. To see that the result 28 is consistent with this fact, we reason as follows: Since $\bar{a}_3(x)\,\delta t$ is (approximately) the probability that R3 will fire in the next δt, then the mean time to the next firing of R3 will be (approximately)

$$\frac{1}{\bar{a}_3(x)} = \frac{c_1+c_2}{c_3 c_1 x_{12}}. \tag{29a}$$

And since the average probability that either R1 or R2 will fire in the next dt is $c_1 \langle \hat{X}_1(\infty) \rangle\,dt + c_2 \langle \hat{X}_2(\infty) \rangle\,dt$, then the average mean time to the next firing of either R1 or R2 will be (approximately)

$$\frac{1}{c_1 \langle \hat{X}_1(\infty) \rangle + c_2 \langle \hat{X}_2(\infty) \rangle} = \frac{c_1+c_2}{2 c_2 c_1 x_{12}}, \tag{29b}$$

where the last step follows upon substituting from Eq. 26. Now observe that, under condition c2⪢c3, the time 29b will indeed be very much smaller than the time 29a; moreover, no other condition is required to ensure this.
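As a short worked check of this last statement, dividing the fast timescale 29b by the slow timescale 29a gives a ratio in which both c1 and x12 cancel:

```latex
\frac{(c_1+c_2)/(2 c_1 c_2 x_{12})}{(c_1+c_2)/(c_1 c_3 x_{12})}
  = \frac{c_3}{2 c_2} \ll 1
  \quad \text{when } c_2 \gg c_3 .
```

The reciprocal, 2c2∕c3, is the c2⪢c3 limit of the count 2(c2+c3)∕c3 quoted from Sec. 2. With the purely illustrative values c2=100 and c3=1, the ratio is 0.005.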

The strategy of the ssSSA is to use the standard SSA to simulate only reaction R3, but taking the propensity function for that reaction to be the function 28 instead of c3x2. At each firing of R3, the ssSSA increases the S3 population by 1 and decreases x12 by 1. The ssSSA then “waits” for a time of order δt, which is very small on the timescale of reaction R3 but nevertheless large enough for the fast species populations to “relax” to their t=∞ values, and it then estimates the populations of the fast species by sampling the binomial random variable 25. The full ssSSA simulation procedure for reactions 1 thus proceeds as follows.

  1. In state (x1,x2,x3) at time t, and with x12=x1+x2, evaluate $\bar{a}_3$ in Eq. 28.

  2. Draw a unit-interval uniform random number r and compute the time to the next R3 reaction, $\tau = (1/\bar{a}_3)\ln(1/r)$.

  3. Advance time to the next R3 reaction by replacing $t \leftarrow t+\tau$. Then actualize that reaction by replacing $x_3 \leftarrow x_3+1$ and $x_{12} \leftarrow x_{12}-1$.

  4. Generate the “relaxed” populations of the fast species by taking x2 to be a sample of the binomial random variable 25, and $x_1 = x_{12} - x_2$.

  5. Record (t,x1,x2,x3) if desired. Then return to step 1, or else stop.
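The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ss_ssa`, the trajectory format, and the rate constants in the usage lines are hypothetical (the parameter values 22 are not reproduced here), and the binomial sample in step 4 is drawn by summing Bernoulli trials, which is simple but not the fastest method.

```python
import math
import random

def ss_ssa(c1, c2, c3, x1, x2, x3, n_events, rng):
    """Slow-scale SSA sketch for S1 <-> S2 -> S3: only R3 firings are simulated."""
    x12 = x1 + x2
    t = 0.0
    p_s2 = c1 / (c1 + c2)              # probability parameter of the binomial in Eq. 25
    trajectory = [(t, x1, x2, x3)]     # the initial (unrelaxed) state is recorded as given
    for _ in range(n_events):
        if x12 == 0:                   # no S1 or S2 molecules left, so R3 cannot fire
            break
        a3_bar = c3 * p_s2 * x12                       # step 1: Eq. 28
        r = 1.0 - rng.random()                         # uniform on (0, 1], avoids log(1/0)
        tau = (1.0 / a3_bar) * math.log(1.0 / r)       # step 2
        t += tau                                       # step 3: advance time ...
        x3 += 1                                        # ... and fire R3
        x12 -= 1
        # Step 4: binomial sample with parameters (p_s2, x12), drawn as a sum
        # of x12 Bernoulli trials.
        x2 = sum(1 for _ in range(x12) if rng.random() < p_s2)
        x1 = x12 - x2
        trajectory.append((t, x1, x2, x3))             # step 5
    return trajectory

# Hypothetical parameter values with c2 >> c3; 300 S1 molecules initially.
trajectory = ss_ssa(c1=1.0, c2=2.0, c3=0.01, x1=300, x2=0, x3=0,
                    n_events=300, rng=random.Random(0))
print(trajectory[-1])
```

Because x1 and x2 never feed back into the propensity, step 4 could be skipped when the fast-species populations are not needed; the sketch keeps it so that the recorded states match the procedure as listed.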

Figure 5 shows the results of a ssSSA run made in this way for the parameter values 22. The results are seen to be practically indistinguishable (in a statistical sense) from the exact SSA results in Fig. 3. But whereas the SSA run took about 6 min to execute, the ssSSA run took only a fraction of a second. Notice that the ssSSA remedies the deficiencies of the reaction 2 simulation in Fig. 4 as regards species S1 and S2.

Figure 5. A ssSSA simulation of reactions 1 using the same settings 22 as used in Figs. 3 and 4. Here the fast reactions R1 and R2 have been skipped over, and only firings of the slow reaction R3 have been simulated, using, however, the modified propensity function 28 or 21. As in Fig. 4, only 300 reaction events were simulated in this run (but here those were modified R3 reactions), and the population of the slow species S3 has been accurately rendered. But this ssSSA run evidently gives a much more accurate picture of the behavior of the fast species S1 and S2 than does the run in Fig. 4. Notice also that the initial rapid relaxation in Fig. 3 of X1 (from 300) and X2 (from 0) is accurately replicated in this ssSSA run.

What happens if reactions 1 are embedded in a network of other reactions, some of which involve the fast species S1 and S2 as reactants? The answer to this question depends on whether the other reactions are fast or slow. If any of the other reactions are as fast as or faster than reactions R1 and R2, then we must start the analysis all over by finding, if possible, a new virtual fast process that is asymptotically stable. But if all of the new reactions are slow (i.e., they occur infrequently relative to reactions R1 and R2), then they can easily be accommodated in the above simulation procedure. For example, the additional slow reaction R4, $S_1 + S_4 \xrightarrow{c_4} S_5$, which has propensity function a4(x)=c4x1x4, would be assigned the effective propensity function

$$\bar{a}_4(x) = c_4 \langle \hat{X}_1(\infty) \rangle x_4 = \frac{c_4 c_2 x_{12} x_4}{c_1+c_2},$$

where the last equality follows from Eq. 26b. And the additional slow reaction $S_1 + S_2 \xrightarrow{c_5} S_6$ with propensity function a5(x)=c5x1x2 would be assigned the effective propensity function

$$\bar{a}_5(x) = c_5 \langle \hat{X}_1(\infty) \hat{X}_2(\infty) \rangle = \frac{c_5 c_1 c_2}{(c_1+c_2)^2}\, x_{12} (x_{12}-1).$$

The last step here follows by first writing

$$\langle \hat{X}_1(\infty) \hat{X}_2(\infty) \rangle = \langle (x_{12} - \hat{X}_2(\infty)) \hat{X}_2(\infty) \rangle = x_{12} \langle \hat{X}_2(\infty) \rangle - \langle \hat{X}_2^2(\infty) \rangle,$$

then using $\langle \hat{X}_2^2(\infty) \rangle = \langle \hat{X}_2(\infty) \rangle^2 + \mathrm{var}\{\hat{X}_2(\infty)\}$, and finally invoking Eq. 26. Thus, any new slow reactions involving the fast species S1 and S2 can be accommodated by the ssSSA, despite the fact that we have no sure knowledge of the instantaneous populations of those fast species.
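The two effective propensities can be evaluated directly, and the closed form for $\bar{a}_5$ can be cross-checked by averaging x1·x2 over the binomial distribution of Eq. 25. The sketch below is illustrative only: the function names and all numerical values are assumptions introduced for this example.

```python
from math import comb, isclose

def a4_bar(c1, c2, c4, x12, x4):
    """Effective propensity of the slow reaction S1 + S4 -> S5."""
    return c4 * c2 * x12 * x4 / (c1 + c2)

def a5_bar(c1, c2, c5, x12):
    """Effective propensity of the slow reaction S1 + S2 -> S6 (closed form)."""
    return c5 * c1 * c2 * x12 * (x12 - 1) / (c1 + c2) ** 2

def a5_bar_by_summation(c1, c2, c5, x12):
    """Same quantity, computed as c5 * <(x12 - X2) * X2> over the binomial pmf."""
    p = c1 / (c1 + c2)
    return c5 * sum((x12 - n) * n * comb(x12, n) * p**n * (1 - p) ** (x12 - n)
                    for n in range(x12 + 1))

# Illustrative values only (not from the paper).
a4 = a4_bar(c1=1.0, c2=3.0, c4=0.5, x12=20, x4=7)
a5 = a5_bar(c1=1.0, c2=3.0, c5=0.2, x12=20)
a5_check = a5_bar_by_summation(c1=1.0, c2=3.0, c5=0.2, x12=20)
print(a4, a5, a5_check, isclose(a5, a5_check))
```

Both propensities depend on the fast species only through x12, which is why the ssSSA can fire these reactions without tracking x1 and x2 individually.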

The status of the fast species populations in the ssSSA merits further comment: Although the values for x1 and x2 generated in step 4 get plotted in step 5, those values are not used in the computations that drive the simulation; therefore, if plots of the fast species populations are not needed, step 4 can be omitted without any impairment to simulation accuracy. The fact is that x1 and x2 are not individually “tracked” by the ssSSA, because the ssSSA does not simulate reactions R1 and R2. Step 4 merely estimates how the values of x1 and x2 might appear on the slow timescale. But the sum x1+x2=x12 is accurately tracked, and that sum is all that we need to implement reaction R3, or any other slow reaction that involves one or both of S1 and S2 as reactants.

SUMMARY AND CONCLUSIONS

In this paper, we have shown that replacing reactions 1 with a single reaction that produces S3 cannot be done accurately unless the time to the next creation of an S3 molecule via reactions 1 can be well approximated by an exponential random variable. We showed that this applies even to the associated deterministic reaction rate equations. The specific requirement for accuracy is that P(t;x1,x2), the pdf of the time to the next R3 event in Eq. 1 given x1 S1 molecules and x2 S2 molecules, should be well approximated by the canonical exponential form $a e^{-at}$. If that is so, then a surrogate reaction that accurately mimics the production of S3 molecules should exist, and a will be its propensity function. If, however, the surrogate reaction is unable to accurately track the S1 and S2 populations individually, then even if the exponential approximation obtains, a model reduction will be feasible only if a depends on x1 and x2 only through their sum x12.

Against this background, we derived using first-passage-time theory an exact formula for P(t;x1,x2). We then showed that there are only four situations in which that function satisfies the foregoing conditions: c2⪢c1 [Eq. 15a], c3⪢c1 [Eq. 15b], c1⪢c3 [Eq. 15c], and c2⪢c3 [Eq. 15d]. We found that if either of conditions 15a or 15b holds, then, under the reasonable assumption that the S2 population is practically always zero, the propensity function of the surrogate reaction will have the form a=c1c3x12∕(c2+c3). And we found that if either of conditions 15c or 15d holds, then the propensity function of the surrogate reaction will have the form a=c1c3x12∕(c1+c2), with no assumptions being made regarding the S2 population. Note that conditions 15a, 15b, 15c, 15d are not mutually exclusive; e.g., if c2⪢c3⪢c1, then conditions 15d, 15b are both satisfied, and each of the two different formulas for a in those two cases reduces to the same result, c1c3x12∕c2.
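The agreement of the two surrogate propensities in the overlapping regime c2⪢c3⪢c1 is easy to confirm numerically. The following sketch is only an illustration; the function names and the numerical values are assumptions made for this example.

```python
def propensity_15ab(c1, c2, c3, x12):
    """Surrogate propensity under condition 15a or 15b: c1*c3*x12/(c2+c3)."""
    return c1 * c3 * x12 / (c2 + c3)

def propensity_15cd(c1, c2, c3, x12):
    """Surrogate propensity under condition 15c or 15d: c1*c3*x12/(c1+c2)."""
    return c1 * c3 * x12 / (c1 + c2)

# Illustrative rate constants satisfying c2 >> c3 >> c1.
c1, c2, c3, x12 = 0.01, 100.0, 1.0, 300
a_ab = propensity_15ab(c1, c2, c3, x12)
a_cd = propensity_15cd(c1, c2, c3, x12)
a_limit = c1 * c3 * x12 / c2          # common limiting form
print(a_ab, a_cd, a_limit)
```

With these values both formulas should lie within about 1% of the common limit c1c3x12∕c2.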

We pointed out that abridgment solely for the sake of reducing the size of the model is not always prudent. Abridging a set of reactions is always an approximation, so there is always some loss of accuracy. In particular, although we can be confident that in the scenarios 15a, 15b, 15c, 15d the true behavior of the S3 population in reactions 1 will be accurately replicated by the surrogate reaction, that might not be so for the S1 and S2 populations, since most model reductions will eliminate or severely constrain those two species. That might not matter if reactions 1 occur in isolation, in which case it would be a clear benefit of the abridgment. But it could give rise to a serious problem if reactions 1 are embedded in a larger network of reactions, some of which have S1 and S2 as reactants or products.

Since stochastic simulation is usually the tool of choice for analyzing complex cellular reaction networks, one reasonable goal of model reduction is to make stochastic simulation run faster. We showed that the maximum speedup factor in any single-reaction abridgment of reactions 1 is 2(c2+c3)∕c3. This implies that, of the four cases 15a, 15b, 15c, 15d, the only one for which abridgment has a chance of producing a significant gain in simulation speed is case 15d, c2⪢c3. If that condition is satisfied, the speedup factor will be ⪢1. If it is not satisfied, the speedup factor will typically be rather small, and possibly not large enough to compensate for the loss of accuracy and robustness that invariably attends model reduction.

We showed that condition c2⪢c3 is the sole requirement for accurately applying the ssSSA procedure of Cao et al.6 to reactions 1, contrary to earlier assertions.13 We emphasized that the ssSSA implements a single-reaction abridgment of reactions 1 in a way that overcomes several shortcomings that arise if reactions 1 are simply replaced by reaction 2: In the ssSSA, the S1 and S2 populations are accurately represented on the timescale of reaction R3, and additional slow reactions that involve S1 and S2 as reactants can easily be accommodated.

Finally, we showed that our first-passage-time analysis provides a framework which unites the abridgment under condition c2⪢c3 given by the ssSSA of Cao et al.,6 and the abridgment under condition c2, c3⪢c1 given by the sQSPA procedure of Mastny et al.7 Furthermore, our first-passage-time analysis enables us to identify all of the conditions under which a single-reaction abridgment of reactions 1 is possible.

ACKNOWLEDGMENTS

The authors thank Sotiria Lampoudi for some helpful discussions, and also the journal’s anonymous reviewer for some pertinent observations. The authors gratefully acknowledge financial support as follows: D.G. was supported by the California Institute of Technology through Consulting Agreement No. 102-1080890 pursuant to Grant No. R01GM078992 from the National Institute of General Medical Sciences, and through Contract No. 82-1083250 pursuant to Grant No. R01EB007511 from the National Institute of Biomedical Imaging and Bioengineering, and also from the University of California at Santa Barbara under Consulting Agreement No. 054281A20 pursuant to funding from the National Institutes of Health. Y.C. was supported by the National Science Foundation under Award No. CCF-0726763, and also the National Institutes of Health under Award Nos. GM073744 and GM078989. K.S. and L.P. were supported by Grant No. R01EB007511 from the National Institute of Biomedical Imaging and Bioengineering, Pfizer Inc., DOE Contract No. DE-FG02-04ER25621, NSF Contract No. IGERT DG02-21715, and the Institute for Collaborative Biotechnologies through Grant No. DFR3A-8-447850-23002 from the U.S. Army Research Office. K.S. was also supported by a National Science Foundation Graduate Research Fellowship.

APPENDIX A: PROOF OF THEOREM 1

(Necessity) Given Eq. 5, let P0(τ) denote the probability that the S1 molecule will not react during the next time span τ. By the laws of probability, this function must satisfy P0(τ+dτ)=P0(τ)×[1−c dτ], where the last factor is the probability that the S1 molecule, having not reacted in time τ, will not react in [τ,τ+dτ). This implies the differential equation dP0(τ)∕dτ=−cP0(τ). The solution of that equation for the initial condition P0(0)=1 is P0(τ)=exp(−cτ). Therefore, the probability that a given S1 molecule will react in the infinitesimal time interval [τ,τ+dτ) is {the probability that it will not react in [0,τ)} times {the probability that it will react in the next dτ}: P0(τ)×c dτ=c exp(−cτ)dτ. This implies that the pdf of the time for the S1 molecule to react is c exp(−cτ), which is precisely the pdf of the exponential random variable with mean $c^{-1}$.

(Sufficiency) Given that the pdf of the time until the S1 molecule reacts is c exp(−cτ), it follows that the probability that the molecule will react in the time interval [τ,τ+dτ) is c exp(−cτ)dτ. Therefore, the probability that the molecule will react in the next dτ, i.e., in the time interval [0,dτ), is c exp(0)dτ=c dτ, as asserted in Eq. 5.

APPENDIX B: PDF OF THE TIME TO THE NEXT R3 EVENT

Regard Eq. 1 as depicting the transitions of a single “random walker” among three “states” S1, S2, and S3. We will first derive formulas for the pdfs Pα→3(t) of the times Tα→3 (α=1 or 2) required for the walker, starting in state Sα, to first reach state S3. To that end, let p(n,t|α,0) be the probability that the walker, having started at time 0 in state Sα (α=1 or 2), will be found at time t≥0 to be in state Sn (n=1,2,3). Since according to Eq. 1 the walker will remain in state S3 after it arrives there, then

$$p(3,t|\alpha,0) = \mathrm{Prob}\{T_{\alpha\to 3} \le t\}. \tag{B1}$$

The probability on the right is, by definition, the cdf of the random variable Tα→3. Since the derivative of the cdf with respect to t gives the corresponding pdf, then

$$P_{\alpha\to 3}(t) = \frac{d\,p(3,t|\alpha,0)}{dt}. \tag{B2}$$

The laws of probability give us the following relations among the functions p(n,t|α,0) at any time t and any infinitesimally later time t+dt:

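$$p(1,t+dt|\alpha,0) = p(1,t|\alpha,0) \times [1 - c_1 dt] + p(2,t|\alpha,0) \times c_2 dt, \tag{B3a}$$
$$p(2,t+dt|\alpha,0) = p(1,t|\alpha,0) \times c_1 dt + p(2,t|\alpha,0) \times [1 - c_2 dt - c_3 dt], \tag{B3b}$$
$$p(3,t+dt|\alpha,0) = p(3,t|\alpha,0) + p(2,t|\alpha,0) \times c_3 dt. \tag{B3c}$$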

For example, Eq. B3a is the statement that {the probability of being in state S1 at time t+dt} is equal to the sum of {the probability of being in state S1 at time t and then not jumping away in the next dt} plus {the probability of being in state S2 at time t and then jumping to state S1 in the next dt}. This logic ignores routes to state S1 at time t+dt that involve more than one jump in time [t,t+dt), but that is permissible here since the probabilities for those paths will be of higher order than 1 in dt. Analogous reasoning gives Eqs. B3b, B3c. By algebraically rearranging each of these equations, dividing through by dt, and then taking the limit dt→0, we obtain the following set of coupled ordinary differential equations, which constitute the “master equation” for this stochastic process:

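$$\frac{d\,p(1,t|\alpha,0)}{dt} = -c_1\, p(1,t|\alpha,0) + c_2\, p(2,t|\alpha,0), \tag{B4a}$$
$$\frac{d\,p(2,t|\alpha,0)}{dt} = c_1\, p(1,t|\alpha,0) - (c_2+c_3)\, p(2,t|\alpha,0), \tag{B4b}$$
$$\frac{d\,p(3,t|\alpha,0)}{dt} = c_3\, p(2,t|\alpha,0). \tag{B5}$$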

Note that Eqs. B4a, B4b constitute a closed pair of coupled differential equations for p(1,t|α,0) and p(2,t|α,0). Once that pair of equations has been solved, p(3,t|α,0) can be obtained either by solving Eq. B5 or more simply from the fact that

$$p(3,t|\alpha,0) = 1 - p(1,t|\alpha,0) - p(2,t|\alpha,0). \tag{B6}$$

Combining Eqs. B2, B5, we see that the function Pα→3(t) can be computed as

$$P_{\alpha\to 3}(t) = c_3\, p(2,t|\alpha,0). \tag{B7}$$

Equations B4 can be solved in a standard way that begins by writing them as

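$$\frac{d\,\mathbf{p}_\alpha(t)}{dt} = -\mathbf{A}\,\mathbf{p}_\alpha(t), \tag{B8}$$

where

$$\mathbf{p}_\alpha(t) \equiv \begin{pmatrix} p(1,t|\alpha,0) \\ p(2,t|\alpha,0) \end{pmatrix} \quad \text{and} \quad \mathbf{A} \equiv \begin{pmatrix} c_1 & -c_2 \\ -c_1 & c_2+c_3 \end{pmatrix}. \tag{B9}$$

The solution to Eq. B8 turns out to involve the eigenvalues $\lambda_+$ and $\lambda_-$ of $\mathbf{A}$. These are evidently the solutions of the quadratic equation

$$(c_1 - \lambda_\pm)(c_2 + c_3 - \lambda_\pm) - c_1 c_2 = 0, \tag{B10}$$

and are easily found to be

$$\lambda_\pm \equiv \tfrac{1}{2}\left[(c_1+c_2+c_3) \pm \sqrt{(c_1+c_2+c_3)^2 - 4 c_1 c_3}\,\right]. \tag{B11}$$

A little algebra shows that the quantity under the radical here is never negative, so

$$0 < \lambda_- \le \lambda_+. \tag{B12}$$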

We shall not belabor the process by which the solutions to Eq. B4 for α=1 and 2 are obtained, because it can be verified by simple differentiation that the functions below satisfy Eq. B4. And it is also easy to verify that they satisfy the required initial conditions.

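$$p(1,t|1,0) = \frac{1}{\lambda_+ - \lambda_-}\left[(\lambda_+ - c_1)\, e^{-\lambda_- t} + (c_1 - \lambda_-)\, e^{-\lambda_+ t}\right], \tag{B13a}$$
$$p(2,t|1,0) = \frac{c_1}{\lambda_+ - \lambda_-}\left[e^{-\lambda_- t} - e^{-\lambda_+ t}\right]; \tag{B13b}$$
$$p(1,t|2,0) = \frac{c_2}{\lambda_+ - \lambda_-}\left[e^{-\lambda_- t} - e^{-\lambda_+ t}\right], \tag{B14a}$$
$$p(2,t|2,0) = \frac{1}{\lambda_+ - \lambda_-}\left[(c_1 - \lambda_-)\, e^{-\lambda_- t} + (\lambda_+ - c_1)\, e^{-\lambda_+ t}\right]. \tag{B14b}$$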
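The verification by differentiation mentioned above can also be done numerically. The sketch below, offered as an illustration under assumed parameter values, evaluates Eqs. B13 and B14, approximates their time derivatives by central differences, and reports the largest residual of Eqs. B4a and B4b. The function names and the step size `h` are choices made for this example, and the code assumes $\lambda_+ \neq \lambda_-$ (the degenerate case c2=0, c1=c3 is excluded).

```python
import math

def eigenvalues(c1, c2, c3):
    """Return (lambda_minus, lambda_plus) from Eq. B11."""
    s = c1 + c2 + c3
    root = math.sqrt(s * s - 4.0 * c1 * c3)
    return 0.5 * (s - root), 0.5 * (s + root)

def occupation_probs(t, alpha, c1, c2, c3):
    """Return (p(1,t|alpha,0), p(2,t|alpha,0)) from Eqs. B13 (alpha=1) or B14 (alpha=2)."""
    lm, lp = eigenvalues(c1, c2, c3)
    em, ep = math.exp(-lm * t), math.exp(-lp * t)
    d = lp - lm
    if alpha == 1:
        return ((lp - c1) * em + (c1 - lm) * ep) / d, c1 * (em - ep) / d
    return c2 * (em - ep) / d, ((c1 - lm) * em + (lp - c1) * ep) / d

def master_equation_residual(t, alpha, c1, c2, c3, h=1e-6):
    """Largest absolute residual of Eqs. B4a and B4b, using central differences."""
    p1, p2 = occupation_probs(t, alpha, c1, c2, c3)
    p1_hi, p2_hi = occupation_probs(t + h, alpha, c1, c2, c3)
    p1_lo, p2_lo = occupation_probs(t - h, alpha, c1, c2, c3)
    dp1 = (p1_hi - p1_lo) / (2.0 * h)
    dp2 = (p2_hi - p2_lo) / (2.0 * h)
    r1 = dp1 - (-c1 * p1 + c2 * p2)            # Eq. B4a
    r2 = dp2 - (c1 * p1 - (c2 + c3) * p2)      # Eq. B4b
    return max(abs(r1), abs(r2))

# Illustrative rate constants only.
rates = (1.0, 2.0, 0.5)
for alpha in (1, 2):
    print(alpha, occupation_probs(0.0, alpha, *rates),
          master_equation_residual(0.7, alpha, *rates))
```

The printed initial values should be (1, 0) for α=1 and (0, 1) for α=2 up to rounding, and the residuals should be at the level of finite-difference error.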

The pdfs of the first passage times T1→3 and T2→3 can now be obtained simply by substituting Eqs. B13b, B14b into formula B7. However, our main concern here is with the more general case in which x1 random walkers are initially in state S1 and x2 are initially in state S2. The pdf P(t;x1,x2) of the time T(x1,x2) required for the first of those walkers to reach state S3 can be computed by reasoning as follows: Since the walkers evolve independently of each other, the probability that none of them will reach state S3 earlier than time t is

$$\mathrm{Prob}\{T(x_1,x_2) > t\} = \left(\mathrm{Prob}\{T_{1\to 3} > t\}\right)^{x_1} \left(\mathrm{Prob}\{T_{2\to 3} > t\}\right)^{x_2}. \tag{B15}$$

This is equivalent to

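$$\mathrm{Prob}\{T(x_1,x_2) \le t\} = 1 - \left(1 - \mathrm{Prob}\{T_{1\to 3} \le t\}\right)^{x_1} \left(1 - \mathrm{Prob}\{T_{2\to 3} \le t\}\right)^{x_2}.$$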

Using Eq. B1, this last equation can be written

$$\mathrm{Prob}\{T(x_1,x_2) \le t\} = 1 - \left(1 - p(3,t|1,0)\right)^{x_1} \left(1 - p(3,t|2,0)\right)^{x_2}. \tag{B16}$$

But the left side of Eq. B16 is, by definition, the cdf of the random variable T(x1,x2). Therefore, the derivative of Eq. B16 with respect to t gives the pdf of T(x1,x2):

$$P(t;x_1,x_2) = \frac{d}{dt}\left[1 - \left(1 - p(3,t|1,0)\right)^{x_1} \left(1 - p(3,t|2,0)\right)^{x_2}\right]. \tag{B17}$$

Upon evaluating this derivative with the help of Eqs. B5, B6, we get

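$$P(t;x_1,x_2) = x_1 c_3\, p(2,t|1,0) \left(p(1,t|1,0) + p(2,t|1,0)\right)^{x_1 - 1} \left(p(1,t|2,0) + p(2,t|2,0)\right)^{x_2} + x_2 c_3\, p(2,t|2,0) \left(p(1,t|1,0) + p(2,t|1,0)\right)^{x_1} \left(p(1,t|2,0) + p(2,t|2,0)\right)^{x_2 - 1}. \tag{B18}$$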

Since all the p-functions on the right side of Eq. B18 are given explicitly by Eqs. B13, B14, we have in Eq. B18 an exact, explicit formula for the pdf of the first-passage time T(x1,x2).
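Equation B18 is straightforward to evaluate. The sketch below, written as an illustration with assumed names and parameter values, computes the exact pdf from Eqs. B13, B14, and B18, the cdf from Eq. B16, and the exponential approximation $a e^{-at}$ with a=c1c3x12∕(c1+c2) for a case in which c2⪢c3. It assumes $\lambda_+ \neq \lambda_-$.

```python
import math

def eigenvalues(c1, c2, c3):
    """Return (lambda_minus, lambda_plus) from Eq. B11."""
    s = c1 + c2 + c3
    root = math.sqrt(s * s - 4.0 * c1 * c3)
    return 0.5 * (s - root), 0.5 * (s + root)

def occupation_probs(t, alpha, c1, c2, c3):
    """Return (p(1,t|alpha,0), p(2,t|alpha,0)) from Eqs. B13 or B14."""
    lm, lp = eigenvalues(c1, c2, c3)
    em, ep = math.exp(-lm * t), math.exp(-lp * t)
    d = lp - lm
    if alpha == 1:
        return ((lp - c1) * em + (c1 - lm) * ep) / d, c1 * (em - ep) / d
    return c2 * (em - ep) / d, ((c1 - lm) * em + (lp - c1) * ep) / d

def first_passage_cdf(t, x1, x2, c1, c2, c3):
    """Eq. B16, using 1 - p(3,t|alpha,0) = p(1,t|alpha,0) + p(2,t|alpha,0)."""
    s1 = sum(occupation_probs(t, 1, c1, c2, c3))
    s2 = sum(occupation_probs(t, 2, c1, c2, c3))
    return 1.0 - s1**x1 * s2**x2

def first_passage_pdf(t, x1, x2, c1, c2, c3):
    """Eq. B18: pdf of the time for the first of x1 + x2 walkers to reach S3."""
    p11, p21 = occupation_probs(t, 1, c1, c2, c3)
    p12, p22 = occupation_probs(t, 2, c1, c2, c3)
    s1, s2 = p11 + p21, p12 + p22
    # Guard the x = 0 cases so that no negative power is ever evaluated.
    term1 = x1 * c3 * p21 * s1 ** (x1 - 1) * s2**x2 if x1 > 0 else 0.0
    term2 = x2 * c3 * p22 * s1**x1 * s2 ** (x2 - 1) if x2 > 0 else 0.0
    return term1 + term2

# Illustrative values with c2 >> c3 (not the paper's parameter set).
c1, c2, c3, x1, x2, t = 1.0, 100.0, 1.0, 10, 0, 5.0
exact = first_passage_pdf(t, x1, x2, c1, c2, c3)
a = c1 * c3 * (x1 + x2) / (c1 + c2)
approx = a * math.exp(-a * t)
print(exact, approx)
```

For times well beyond the fast relaxation time the two printed values should be close, which is the exponential behavior that the surrogate reaction relies on.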

APPENDIX C: PROOF OF THEOREM 2

That it is possible, when c2∕c3⪢1, to choose a time span δt that contains very many R1 and R2 events but practically no R3 events, follows from the fact established in Sec. 2 that successive R3 reactions will, on average, be separated by (c2+c3)∕c3 pairs of R1 and R2 reactions. Let $[t', t'+dt')$ be an infinitesimal subinterval of the interval $[t, t+\delta t)$. The probability that R3 will fire in $[t', t'+dt')$ is $c_3 X_2(t')\,dt'$. But

$$c_3 X_2(t')\,dt' \approx c_3 \hat{X}_2(t')\,dt', \tag{C1}$$

because the dearth of R3 events in $[t, t+\delta t)$ implies that the real fast process $X_2(t')$ can be well approximated there by the virtual fast process $\hat{X}_2(t')$. The probability that R3 will fire anywhere in the interval $[t, t+\delta t)$ can now be computed by summing the probabilities C1 over all the $dt'$ subintervals of $[t, t+\delta t)$:

$$\int_t^{t+\delta t} c_3 \hat{X}_2(t')\,dt' \equiv c_3 \left\{ \frac{1}{\delta t} \int_t^{t+\delta t} \hat{X}_2(t')\,dt' \right\} \delta t. \tag{C2}$$

This invocation of the addition law of probability for mutually exclusive events is justified since the probability for more than one R3 firing in $[t, t+\delta t)$ is practically zero. Now let K be an integer that is roughly equal to the expected number of firings of R1 and R2 in $[t, t+\delta t)$, a number that will be ⪢1. Subdividing $[t, t+\delta t)$ into K subintervals of equal length δt∕K, we can approximate the integral in braces in Eq. C2 as

$$\frac{1}{\delta t} \int_t^{t+\delta t} \hat{X}_2(t')\,dt' \approx \frac{1}{\delta t} \sum_{k=1}^{K} \hat{X}_2(t_k) \left(\frac{\delta t}{K}\right) = \frac{1}{K} \sum_{k=1}^{K} \hat{X}_2(t_k), \tag{C3}$$

where $t_k$ (k=1,…,K) locates the center of the kth subinterval. After the first few R1 and R2 firings, the process $\hat{X}_2(t')$ will effectively “decorrelate” and “relax” to its time-independent form $\hat{X}_2(\infty)$; thus, the K values $\hat{X}_2(t_1), \ldots, \hat{X}_2(t_K)$ in Eq. C3 can collectively be approximated by K sample values $\hat{X}_2^{(1)}(\infty), \ldots, \hat{X}_2^{(K)}(\infty)$ of the random variable $\hat{X}_2(\infty)$. Equation C3 then becomes

$$\frac{1}{\delta t} \int_t^{t+\delta t} \hat{X}_2(t')\,dt' \approx \frac{1}{K} \sum_{k=1}^{K} \hat{X}_2^{(k)}(\infty) \approx \langle \hat{X}_2(\infty) \rangle. \tag{C4}$$

Substituting Eq. C4 into Eq. C2, we conclude that the probability that reaction R3 will fire in $[t, t+\delta t)$ is approximately equal to $c_3 \langle \hat{X}_2(\infty) \rangle\,\delta t$. That is the first assertion of Theorem 2. The second assertion follows from the fact that, for any $t' > t+\delta t$ prior to the next R3 event, $\hat{X}_2(t')$ can be approximated by $\hat{X}_2(\infty)$.
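The time-averaging step in Eqs. C3 and C4 can be illustrated by simulating the virtual fast process alone and comparing its time average with $\langle \hat{X}_2(\infty) \rangle$ from Eq. 26a. The sketch below is a hypothetical illustration: the function name, the seed, the run length, and the parameter values are all assumptions, and the run is made long only so that the statistical scatter of the average is small.

```python
import random

def time_average_x2(c1, c2, x12, x2_start, t_end, rng):
    """Time average of the virtual fast process X2-hat over [0, t_end),
    simulated with the standard SSA applied to reactions R1 and R2 only."""
    t, x2, integral = 0.0, x2_start, 0.0
    while True:
        a1 = c1 * (x12 - x2)          # propensity of R1 (S1 -> S2)
        a2 = c2 * x2                  # propensity of R2 (S2 -> S1)
        a0 = a1 + a2
        tau = rng.expovariate(a0)     # exponential waiting time with mean 1/a0
        if t + tau >= t_end:
            integral += x2 * (t_end - t)
            return integral / t_end
        integral += x2 * tau
        t += tau
        if rng.random() * a0 < a1:    # choose which fast reaction fires
            x2 += 1
        else:
            x2 -= 1

# Illustrative values only; Eq. 26a gives <X2-hat(inf)> = c1*x12/(c1+c2) = 5.
average = time_average_x2(c1=1.0, c2=3.0, x12=20, x2_start=0,
                          t_end=2000.0, rng=random.Random(1))
print(average)
```

The initial transient (starting from x2=0) lasts only a few multiples of 1∕(c1+c2), so it contributes negligibly to an average taken over this long a window.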

References

  1. Michaelis L. and Menten M. L., Biochem. Z. 49, 333 (1913).
  2. Briggs G. E. and Haldane J. B. S., Biochem. J. 19, 338 (1925); Nelson D. L. and Cox M. M., Lehninger Principles of Biochemistry (Freeman, San Francisco, 2005).
  3. Borghans J. A., deBoer R. J., and Segel L. A., Bull. Math. Biol. 58, 43 (1996). doi:10.1007/BF02458281
  4. Tzafriri A. R. and Edelman E. R., J. Theor. Biol. 226, 303 (2004).
  5. Barik D., Paul M., Baumann W., Cao Y., and Tyson J., Biophys. J. 95, 3563 (2008). doi:10.1529/biophysj.108.129155
  6. Cao Y., Gillespie D. T., and Petzold L. R., J. Chem. Phys. 122, 014116 (2005). doi:10.1063/1.1824902
  7. Mastny E. A., Haseltine E. L., and Rawlings J. B., J. Chem. Phys. 127, 094106 (2007). doi:10.1063/1.2764480
  8. For a review of the stochastic simulation algorithm, the chemical master equation, and related topics, see Gillespie D. T., Annu. Rev. Phys. Chem. 58, 35 (2007). doi:10.1146/annurev.physchem.58.032806.104637
  9. Gillespie D. T., Lampoudi S., and Petzold L. R., J. Chem. Phys. 126, 034302 (2007), doi:10.1063/1.2424461; Lampoudi S., Gillespie D. T., and Petzold L. R., “Effect of excluded volume on 2D discrete stochastic chemical kinetics,” J. Comput. Phys. (in press).
  10. If c2=0 and c3≥c1, Eq. gives λ+=c3 and λ−=c1. Setting those results into Eq. gives a formula that is indeterminate when c3=c1. But applying L’Hospital’s rule to that indeterminate form, taking derivatives with respect to c3, yields the pdf $c_1^2\, t\, e^{-c_1 t}$. This nonexponential form, which goes to zero as t→0, is the pdf of the gamma random variable Γ(c1,2), which is defined as the sum of two statistically independent exponentials with the same mean $c_1^{-1}$. And this is exactly what we should expect for the time for an S1–S3 conversion via reactions when c2=0 and c3=c1.
  11. Haseltine E. L. and Rawlings J. B., J. Chem. Phys. 117, 6959 (2002). doi:10.1063/1.1505860
  12. Rao C. and Arkin A. P., J. Chem. Phys. 118, 4999 (2003). doi:10.1063/1.1545446
  13. In Ref. , it was stated that the condition for applying the ssSSA to reactions is (c1+c2)²⪢c1c3x12. That is incorrect, as it arises from comparing a single-walker timescale with a many-walker timescale. The correct condition is simply c2⪢c3, as can be seen not only from the result but also from the argument at Eq. . The reason why it is not necessary to supplement the condition c2⪢c3 with the condition c1⪢c3 is explained in the second paragraph of Sec. .
  14. McQuarrie D. A., J. Chem. Phys. 38, 433 (1963), doi:10.1063/1.1733676; Darvey I. G., Ninham B. W., and Staff P. J., J. Chem. Phys. 45, 2145 (1966). doi:10.1063/1.1727900

Articles from The Journal of Chemical Physics are provided here courtesy of American Institute of Physics
