Author manuscript; available in PMC: 2019 Feb 25.
Published in final edited form as: Bull Math Biol. 2005 Jan 19;67(5):901–946. doi: 10.1016/j.bulm.2004.09.009

A stochastic analysis of first-order reaction networks

Chetan Gadgil a,1, Chang Hyeong Lee a,1, Hans G Othmer b,*,1,2
PMCID: PMC6388624  NIHMSID: NIHMS1011837  PMID: 15998488

Abstract

A stochastic model for a general system of first-order reactions in which each reaction may be either a conversion reaction or a catalytic reaction is derived. The governing master equation is formulated in a manner that explicitly separates the effects of network topology from other aspects, and the evolution equations for the first two moments are derived. We find the surprising, and apparently unknown, result that the time evolution of the second moments can be represented explicitly in terms of the eigenvalues and projections of the matrix that governs the evolution of the means. The model is used to analyze the effects of network topology and the reaction type on the moments of the probability distribution. In particular, it is shown that for an open system of first-order conversion reactions, the distribution of all the system components is a Poisson distribution at steady state. Two different measures of the noise have been used previously, and it is shown that different qualitative and quantitative conclusions can result, depending on which measure is used. The effect of catalytic reactions on the variance of the system components is also analyzed, and the master equation for a coupled system of first-order reactions and diffusion is derived.

1. Introduction

Understanding the time-dependent behavior of a system of interacting species is necessary for analyzing numerous problems, including the dynamics of chemical reactions, gene expression profiles, signal transduction, and other biochemical processes. Many of these systems are characterized by low numbers of interacting species: for example, gene transcription involves interactions between 1–3 promoter elements, 10–20 polymerase holoenzyme units, 10–20 molecules of repressor proteins, 3000 RNA polymerase molecules, and ca. 1000 ribosomes (Kuthan, 2001). Since interactions at the molecular level are inherently stochastic, there is an unavoidable “irreproducibility” in these dynamics, which has been demonstrated experimentally for single cell gene expression events (Ozbudak et al., 2002; Spudich and Koshland, 1976; Levsky and Singer, 2003). A major unsolved problem is to understand how the interplay between the nature of the individual steps and the connectivity or topology of the entire network affects the dynamics of the system, irrespective of whether a deterministic or a stochastic description is the most appropriate. In this paper we formulate and analyze the master equation that governs the time evolution of the number density of species that participate in a network of first-order reactions. The network may comprise both conversion reactions of the form $A \to B$, in which one component is converted to another, and catalytic reactions of the form $\phi \xrightarrow{B} A$, in which the rate of formation of a particular component depends on the concentration of the other (the catalyst), but the concentration of the catalyst is unchanged by the reaction. This is the first step in the analysis of higher-order reaction networks.

There are numerous examples of first-order reaction networks that involve a small number of molecules, for which this analysis is directly applicable. Transcription and translation have been modeled as first-order catalytic reactions (Thattai and van Oudenaarden, 2001). The evolution of the surface morphology during epitaxial growth involves the nucleation and growth of atomic islands, and these processes may be described by first-order adsorption and desorption reactions coupled with diffusion along the surface. Proteins exist in various conformational states, and the reversible transitions between states may be described as a first-order conversion processes (Mayor et al., 2003). Fluctuating protein conformations are important in the movement of small molecules through proteins such as myoglobin; hence it is important to understand the distribution of these states (Iorio et al., 1991; Austin et al., 1975). RNA also exists in several conformations, and the transitions between various folding states follow first-order kinetics (Bokinsky et al., 2003).

One of the earliest investigations of stochastic effects in reactions is by Delbruck (1940), who studied the distribution of the number of molecules for a single reacting species in an auto-catalytic one-component system, and derived an expression for the variance as a function of the mean and initial values of the mean and variance. Siegert (1949) derived the probability distribution for the momentum of a gas as a function of time, and formulated the first stochastic model of a system of first-order conversion reactions, using a matrix formulation to derive the master equation for first and second-order reaction networks. He also outlined the generating function approach for characterizing the distribution of the network components. The system he studied is equivalent to a closed system (i.e., the total mass is conserved) of conversion reactions. He proved that one eigenvalue of the matrix of reaction rate constants is zero and the rest are real and negative. Unfortunately, this important work was largely overlooked in the field of stochastic chemical reaction kinetics for more than a decade, until Krieger and Gans (1960) re-derived these results formulating the problem as a chemical reaction network problem.

Klein (1956) used Siegert’s results to analyze the Ehrenfest Urn problem, in which balls are transferred between two urns with given probabilities. Klein treated the number of balls in an urn as a measure of the occupancy of an energy state, and calculated the probability of the number of balls in an urn as a function of the transition probability and the initial distribution. This can be interpreted as a closed system with one first-order reversible reaction, with the urns characterizing the reactant and product of the reversible reaction. He showed that the stationary distribution is independent of the initial distribution, but assumed that transitions occur at fixed intervals of time. Kendall (1948) formulated a master equation for a birth–death process starting with one ancestor and computed the extinction time of the population. He also discussed the case of time-dependent rate constants. Birth processes may be modeled as auto-catalytic production reactions and death as a first-order degradation reaction. Bartholomay (1958) was apparently the first to derive the master equation for a unimolecular reaction allowing steps at random times, and he used the generating function approach to calculate the mean and standard deviation of the number of reactant molecules. He also showed that the stochastic process is “consistent in the mean” with the deterministic description, and later showed how to calculate the observed first-order rate constant (Bartholomay, 1959).

At about the same time Montroll and Shuler (1958) modeled chemical decomposition as a random walk between reflecting and absorbing barriers, and Kim (1958) computed mean first passage times for general first-order stochastic processes. Shuler (1960) analyzed the relaxation kinetics of a multi-state system, which is equivalent to a closed first-order system of reversible conversion reactions, and pointed out that there was no single characteristic “relaxation time”. He re-derived the property that such a system cannot exhibit any form of periodic behavior, and showed that the relaxation of any one of the moments of the distribution does not convey any information about the relaxation of the distribution.

A systematic analysis of a closed system of first-order conversion reactions was done by Krieger and Gans (1960), who showed that a multinomial distribution characterizes the temporal evolution of the system. This generalized results of Montroll and Shuler, who had proved that the end states were characterized by a multinomial distribution. Gans extended this and previous analyses of closed systems to an open system of first-order conversion reactions (Gans, 1960). Following Krieger and Gans (1960), he derived a condition for the eigenvalues associated with the evolution of the mean to be negative ($\alpha_{ii} \le \sum_j \alpha_{ij}$), but did not analyze the evolution of the higher moments or their relation to the mean. McQuarrie (1963) derived solutions for the mean and variance of closed systems with reactions of the type $A \to B$; $A \rightleftharpoons B$; and $A \to B$, $A \to C$. He also discussed the use of a cumulant generating function as a method of generating lower-order moments. Gani (1965) formulated a birth–death model for bacteriophage kinetics that was similar to an open system model for one species. Fredrickson (1966) computed the stochastic mean and variance for the concentrations in a closed cyclic ternary system, and again showed that these moments do not oscillate in time. Darvey and Staff (1966) presented the first derivation for the time-dependent mean and variance of all the species present in a closed system with first-order conversion reactions. They derived an expression for the moment generating function for a case when only one species is present initially, and showed that the first moment is the same as the solution of the corresponding deterministic system, but did not analyze the evolution of the variance.

Other processes such as the waiting times in a queue have also been analyzed, and some of these results can be applied to reaction networks. Kelly (1979) considered reversible queuing processes in which “customers” enter a queue either with a defined distribution from a source or defined transition probabilities from other queues. Every queue has a finite number of “servers”, with a characteristic waiting time associated with the service. The entry into a queue from another queue can be regarded as a conversion process, entry from the source is equivalent to production from the source, and service can be thought of as a degradation process that removes customers from the queue. A pure conversion process can be considered as an infinite server queue (see Appendix), but catalytic reactions have no apparent analog in queuing theory. Kelly considered several aspects of the process, such as whether it was open or closed, and the nature of the connections between the queues (linear vs. looped), and derived the important result that the equilibrium distribution for a closed system tends to the equilibrium distribution for an open system when the number of individuals is large. We show later that this result is also true for open and closed systems where all the reactions are first-order conversion reactions. For a particular class of open migration processes in which the transition probabilities between queues (or colonies) are proportional to the number of individuals in the colony, Kelly proved that the number of individuals in each queue has a Poisson distribution, assuming that the inflows of individuals to queues (or colonies) from outside the system (the source) are Poisson processes. If one considers each colony to be a distinct species, the open migration process is equivalent to an open conversion reaction system, and the proof for the stationary distribution of the number of individuals in each colony stated by Kelly (1979) may be considered as another proof for the distribution of the number of each species in an open conversion network that we derive later. Branching Markov processes also give rise to problems with a similar mathematical structure to that in kinetics (Harris, 1963; Athreya and Ney, 1972). Athreya and Ney (1972) considered continuous-time, multi-type branching processes initiated by one particle of each type and derived first and second moments of the distribution of the number of particles of each type. Catalytic reactions can be interpreted as the death of a particle with two offspring, one identical to the original and the other possibly different, but a systematic analysis of the effect of catalytic reactions on the resulting probability distribution has not been reported earlier to our knowledge.

Thattai and van Oudenaarden (2001) presented the first analysis of a system of first-order catalytic reactions. They formulated a procedure for deriving the master equation for such systems, similar to the general procedure given in Gardiner (1983). They incorporated a negative feedback regulation of some reactions (production from source), and derived the steady-state means and covariances for a system of catalytic transformations with one source term, and first-order degradation of all the species. Recently, Brown (2003) derived the probability of the number of forward and reverse steps in a reversible first-order conversion reaction in which the transition probabilities are time-dependent.

A major objective of many of the analyses treating biological systems is prediction of the stochastic variations or noise of the concentrations. Two measures of the noise have been used in the past. Until recently the standard measure was the coefficient of variation (CV), defined as the standard deviation divided by the mean (Delbruck, 1940; Singer, 1953; Kepler and Elston, 2001), or its square. The CV is used as a measure of noise in McQuarrie et al. (1964), Darvey et al. (1966), Laurenzi (2000), Elowitz et al. (2002) and Swain et al. (2002), while the Fano factor F, defined as the variance divided by the mean, was introduced by Thattai and van Oudenaarden (2001), and used in Blake et al. (2003) and Ozbudak et al. (2002). It has been shown that the use of different measures of noise may lead to different conclusions concerning the importance of noise in the underlying process (Swain et al., 2002).

Our objectives here are (i) to introduce a derivation of the master equation that clarifies the separate roles of reaction rates and network topology in the master equation, (ii) to develop a unified treatment of first-order networks, including the evolution of both the mean and the variance of any species and (iii) to understand the effect of network topology on the stochastic fluctuations in specified components. We compare the noise in the amount of a species as measured by the Fano factor or the coefficient of variation for both conversion and catalytic reactions, as well as for species in open and closed systems. We also demonstrate through simple examples the utility of this framework in the analysis of the effect of reaction network topology on the variation of the number of molecules of each network component. All of the preceding analyses can be treated as special cases of the general framework that we develop here. No previous analysis of first-order reaction systems has to our knowledge considered a system consisting of both catalytic and conversion reactions, nor has a systematic comparison of the stochastic behavior of conversion and catalytic systems been made.

2. Formulation of the master equation

We first derive the master equation for a general system of reactions to provide a framework for the analysis of reactions of arbitrary order. We then focus on first-order reactions and specialize the general result for a very large class of first-order processes.

2.1. The master equation for a general system of reactions

We begin with some background on a general deterministic description of reacting systems, and then derive the master equation for an arbitrary network of reacting species. The abstract formulation is presented in brief here and follows that given elsewhere (Othmer, 1979, 1981).

Suppose that the reacting mixture contains the set M of s chemical species Mi that participate in a total of r reactions. Let vij be the stoichiometric coefficient of the ith species in the jth reaction. The vij are non-negative integers that represent the normalized molar proportions of the species in a reaction. Each reaction is written in the form

$$\sum_{i\,\in\,\mathrm{reac}} \nu_{ij}^{reac}\,M_i \;\longrightarrow\; \sum_{i\,\in\,\mathrm{prod}} \nu_{ij}^{prod}\,M_i, \qquad j = 1, \ldots, r, \qquad (1)$$

where the sums are over reactants and products, respectively in the jth reaction. In this formulation, the forward and reverse reaction of a reversible pair are considered separately as two irreversible reactions.

For each reaction, once the reactants and products are specified, the significant entities so far as the network topology is concerned are not the species themselves, but rather the linear combinations of species that appear as reactants or products in the various elementary steps. Following Horn and Jackson (1972), these linear combinations of species will be called complexes. A species may also be a complex (as is the case for first-order reactions). We assume that changes in temperature, pressure and volume V of the mixture during reaction are negligible. Thus the state of the system is specified by the concentration vector c = (c1, … , cs)T, where ci is the non-negative concentration of species Mi measured in moles/liter.

Let $\hat{\mathcal{M}}$ be the set of linear combinations with integral coefficients of the species, and let $\mathcal{C} = \{C(1), \ldots, C(p)\}$ be a set of complexes. A reaction network consists of the triple $\{\mathcal{M}, \hat{\mathcal{M}}, \mathcal{C}\}$, together with a stoichiometric function $\hat{\nu} : \hat{\mathcal{M}} \to \mathcal{C}$ and a binary relation $\mathcal{R} \subset \mathcal{C} \times \mathcal{C}$. The function $\hat{\nu}$, which identifies a linear combination of species as a complex, is onto, and the relation $\mathcal{R}$ has the properties (i) (C(i), C(j)) ∈ $\mathcal{R}$ if and only if there exists one and only one reaction of the form C(i) → C(j), (ii) for every i there is a j ≠ i such that (C(i), C(j)) ∈ $\mathcal{R}$, (iii) (C(i), C(i)) ∉ $\mathcal{R}$. Thus every complex is related to at least one other complex and the trivial reaction C(i) → C(i) that produces no change is not admitted. Therefore $\mathcal{R}$ is never reflexive and in general it is neither symmetric nor transitive.

The relation on $\mathcal{C}$ gives rise to a directed graph $\mathcal{G}$ in the following way. Each complex is identified with a vertex $V_k$ in $\mathcal{G}$, and a directed edge $E_\ell$ is introduced into $\mathcal{G}$ for each reaction. Each edge carries a non-negative weight $\hat{R}_\ell(c)$ given by the intrinsic rate of the corresponding reaction. $\mathcal{G}$ provides a concise representation of the reaction network.

The topology of G is in turn represented in its vertex–edge incidence matrix E, which is defined as follows.

$$\mathcal{E}_{i\ell} = \begin{cases} +1 & \text{if } E_\ell \text{ is incident at } V_i \text{ and is directed toward it,}\\ -1 & \text{if } E_\ell \text{ is incident at } V_i \text{ and is directed away from it,}\\ 0 & \text{otherwise.}\end{cases} \qquad (2)$$

If there are r reactions on $\mathcal{C}$, then $\mathcal{E}$ has p rows and r columns and every column has exactly one +1 and one −1. The rate $\hat{R}_\ell(c)$ of an elementary reaction C(j) → C(k) is generally not a function of C(j), but of the concentration or activity of the individual species in the complex. Once the complexes and reactions are fixed, the stoichiometry of the complexes is specified unambiguously, and we let ν denote the s × p matrix whose jth column encodes the stoichiometric amounts of the species in the jth complex. Then the temporal evolution of the composition of a reacting mixture is governed by

$$\frac{dc}{dt} = \nu\,\mathcal{E}\,\hat{R}(c), \qquad (3)$$

where the columns of ν are given by the columns of $\nu^{reac}$ and $\nu^{prod}$, and the initial condition is c(0) = c0. It follows from (2) that the columns of the product $\nu\mathcal{E}$ are the stoichiometric vectors of reactions written according to the standard convention. When the reactions are first-order this deterministic equation also governs the evolution of the mean in the Markov process description discussed later.

A special but important class of rate functions is that in which the rate of the $\ell$th reaction can be written as

$$\hat{R}_\ell(c) = k_\ell\,R_j(c) \qquad (4)$$

for every reaction that involves the jth complex as the reactant. This includes ideal mass action rate laws, in which the rate is proportional to the product of the concentrations of the species in the reactant complex, each concentration raised to a power equal to the stoichiometric coefficient of the corresponding species in the complex. In that case

$$R_j = \prod_{i=1}^{s}(c_i)^{\nu_{ij}}. \qquad (5)$$

For mass-action kinetics (4) implies that

$$\hat{R}(c) = K\,R(c), \qquad (6)$$

where K is an r × p matrix with $k_{\ell j} > 0$ if and only if the $\ell$th edge leaves the jth vertex, and $k_{\ell j} = 0$ otherwise. The topology of the underlying graph $\mathcal{G}$ enters into K as follows. Define the exit matrix $\mathcal{E}_o$ of $\mathcal{G}$ by replacing all 1’s in $\mathcal{E}$ by zeros, and changing the sign of the resulting matrix. Let $\hat{K}$ be the r × r diagonal matrix with the $k_\ell$’s, $\ell = 1, \ldots, r$, along the diagonal. Then it is easy to see that $K = \hat{K}\,\mathcal{E}_o^T$ and therefore

$$\frac{dc}{dt} = \nu\,\mathcal{E}\,K\,R(c) = \nu\,\mathcal{E}\,\hat{K}\,\mathcal{E}_o^T\,R(c). \qquad (7)$$

It follows from the definitions that (i) the (p, q)th entry, p ≠ q, of $\mathcal{E}\hat{K}\mathcal{E}_o^T$ is nonzero (and positive) if and only if there is a directed edge $(q, p) \in \mathcal{G}$, (ii) each diagonal entry of $\mathcal{E}\hat{K}\mathcal{E}_o^T$ is minus the sum of the $k_\ell$’s for all edges that leave the corresponding vertex, and (iii) the columns of $\mathcal{E}\hat{K}\mathcal{E}_o^T$ all sum to zero, and so the rank of $\mathcal{E}\hat{K}\mathcal{E}_o^T$ is ≤ p − 1. When all complexes are species and all reactions are first-order, ν = I for a closed system and ν = [I | 0] for an open system, where I is the s × s identity matrix and 0 is the zero vector, and the right-hand side reduces to the usual form Kc for a suitably-defined matrix K. In the following section we will treat the stochastic analysis of first-order systems in detail.
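To make the graph-theoretic formulation concrete, the following short numerical sketch (not part of the original development; the network, the rate values and the NumPy/SciPy tooling are our own illustrative choices) assembles ν, $\mathcal{E}$, the exit matrix $\mathcal{E}_o$ and $\hat{K}$ for the open two-species network $\phi \to A$, $A \to B$, $B \to A$, $B \to \phi$ and integrates (7).

```python
# A minimal numerical sketch (illustrative, not from the paper) of dc/dt = nu*E*Khat*Eo^T*R(c)
# for the open network  0 -> A (k3),  A -> B (k1),  B -> A (k2),  B -> 0 (k4),
# where "0" denotes the null complex.  Requires NumPy and SciPy.
import numpy as np
from scipy.integrate import solve_ivp

k = np.array([1.0, 0.5, 2.0, 0.3])           # k1..k4, assumed illustrative values
# complexes: C1 = A, C2 = B, C3 = null
nu = np.array([[1, 0, 0],                    # stoichiometry of species A in each complex
               [0, 1, 0]])                   # stoichiometry of species B in each complex
E = np.array([[-1,  1,  1,  0],              # vertex-edge incidence matrix, eq. (2)
              [ 1, -1,  0, -1],
              [ 0,  0, -1,  1]])
Eo = np.array([[1, 0, 0, 0],                 # exit matrix: Eo[i, l] = 1 iff edge l leaves vertex i
               [0, 1, 0, 1],
               [0, 0, 1, 0]])
Khat = np.diag(k)

def R(c):
    """Mass-action complex rates R_j = prod_i c_i^{nu_ij}, eq. (5); the null complex has rate 1."""
    cA, cB = c
    return np.array([cA, cB, 1.0])

def rhs(t, c):
    return nu @ E @ Khat @ Eo.T @ R(c)       # eq. (7)

sol = solve_ivp(rhs, (0.0, 20.0), [0.0, 0.0])
print("approximate steady state:", sol.y[:, -1])   # approaches cB* = k3/k4, cA* = (k2*cB* + k3)/k1
```

The product $\nu\mathcal{E}\hat{K}\mathcal{E}_o^T$ applied to R(c) reproduces the familiar rate equations of the open reversible pair, illustrating how the stoichiometric and topological information enter separately.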

As it stands, (5) includes all reacting species, but those whose concentration is constant on the time scale of interest can be deleted from each of the complexes in which it appears and its concentration or mole fraction can be absorbed into the rate constant of that reaction in which it participates as reactant.3 As a result of these deletions, it will appear that reactions which involve constant species do not necessarily conserve mass. Furthermore, some complexes may not comprise any time-dependent species; these will be called zero or null complexes. Each null complex gives rise to a column of zeros in ν and the rate of any reaction in which the reactant complex is a null complex is usually constant. For instance, any transport reaction of the form $\mathcal{M}_0 \to \mathcal{M}_i$ introduces a null complex and the corresponding flux of Mi represents a constant input to the reaction network, provided that the rate of the transport step does not depend on the concentration of a time-dependent species. Of course, a constant species that appears in a complex which also contains a variable species likewise represents an input to the network, and to distinguish these from inputs due to null complexes, the former are called implicit inputs and the latter are called explicit inputs.

An alternate description of the deterministic dynamics is obtained by introducing an extent for each reaction and expressing composition changes in terms of extents. It follows from (3) that the composition changes due to reaction lie in a coset by c0 of the range of vE, and this coset is called the reaction simplex (Othmer, 1979, 1981). Therefore, by choosing coordinates in the simplex, the composition changes can be expressed in terms of an extent for each reaction as follows

$$c = c_0 + \nu\,\mathcal{E}\,\xi. \qquad (8)$$

If the reactions are all independent, i.e. if the rank of vE is r, it follows from (3) and (8) that

$$\frac{d\xi}{dt} = \hat{R}(c_0 + \nu\mathcal{E}\xi), \qquad (9)$$

but in general we can only conclude that

$$\frac{d\xi}{dt} = \hat{R}(c_0 + \nu\mathcal{E}\xi) + \sum_k \gamma_k N_k, \qquad (10)$$

where {Nk} is a basis for the null space of vE. The γk can be chosen so as to remove all dependent steps, in particular, those that arise from cycles in the graph.

We can also describe the evolution in terms of the number of molecules present for each species. Let n = (n1, n2, … , ns) denote the discrete composition vector whose ith component ni is the number of molecules of species Mi present in the volume V. This is the discrete version of the composition vector c, and they are related by $n = N_A V c$, where $N_A$ is Avogadro’s number. From (3) we obtain the deterministic evolution for n as

$$\frac{dn}{dt} = \nu\,\mathcal{E}\,\tilde{R}(n), \qquad (11)$$

where $\tilde{R}(n) \equiv N_A V\,\hat{R}(n/N_A V)$. In particular, for mass-action kinetics

$$\tilde{R}_\ell(n) = N_AV\,k_\ell\,R_j(n/N_AV) = N_AV\,k_\ell\prod_{i=1}^{s}\Big(\frac{n_i}{N_AV}\Big)^{\nu_{ij}} = k_\ell\,(N_AV)^{\,1-\sum_i\nu_{ij}}\prod_{i=1}^{s}(n_i)^{\nu_{ij}} \equiv \hat{k}_\ell\prod_{i=1}^{s}(n_i)^{\nu_{ij}}. \qquad (12)$$

The number of molecules can be expressed in terms of the integer extents of each reaction as

$$n = n_0 + \nu\,\mathcal{E}\,\eta, \qquad (13)$$

and it follows from (11) that

$$\frac{d\eta}{dt} = \tilde{R}(n_0 + \nu\mathcal{E}\eta) + \sum_k\gamma_k N_k. \qquad (14)$$

The description in terms of the number of molecules present assumes that there are sufficient numbers present so that we can assume they vary continuously in time, but the same assumption is needed for (4).

2.2. The stochastic description

The first level of stochastic description is to consider an ensemble of deterministic systems that differ in the initial condition. Let P(c, t) be the probability that the state of the system is c; then the evolution of P is governed by

$$\frac{\partial P}{\partial t} + \sum_{i=1}^{s}\frac{\partial}{\partial c_i}\Big(\big(\nu\mathcal{E}\hat{R}(c)\big)_i\,P\Big) = 0, \qquad (15)$$

subject to the initial condition P(c, 0) = P0. The characteristic equations for this hyperbolic equation are precisely the evolution equations given at (3).

At the next level of description the numbers of the individual components are followed in time and the reactions are modeled as a continuous-time Markov jump process. Let Ni(t) be a random variable that represents the number of molecules of species Mi at time t, and let N denote the vector of the Ni’s. Further, let P(n, t) be the joint probability that N(t) = n, i.e., N1 = n1, N2 = n2, …, Ns = ns. Clearly the state of the system at any time is now a point in $\mathbb{Z}_0^s$, where $\mathbb{Z}_0$ is the set of non-negative integers. Formally the master equation that governs the evolution of P is

$$\frac{d}{dt}P(n,t) = \sum_{m\in S(n)}\mathcal{R}(m,n)\,P(m,t) - \sum_{m\in T(n)}\mathcal{R}(n,m)\,P(n,t), \qquad (16)$$

where $\mathcal{R}(m,n)$ is the probability per unit time of a transition from state m to state n, $\mathcal{R}(n,m)$ is the probability per unit time of a transition from state n to state m, S(n) is the set of all states that can terminate at n after one reaction step, and T(n) is the set of all states reachable from n in one step of the feasible reactions. The notation is meant to suggest the ‘source’ and ‘target’ states at n; one could also call S(n) the predecessors of state n and T(n) the successors of state n. The predecessor states must be non-negative for production reactions and positive for conversion, degradation and catalytic reactions. Similar bounds on the target states are naturally enforced by zero rates of reaction when the reactants are absent.

The sets S(n) and T(n) are easily determined using the underlying graph structure. It follows from the definition of ν and $\mathcal{E}$ that the $\ell$th reaction C(j) → C(k) induces a change $\Delta n^{(\ell)} = \nu\mathcal{E}_{(\ell)}$ in the number of molecules of all species after one step of the reaction, where the subscript $(\ell)$ denotes the $\ell$th column. Therefore the state $m = n - \nu\mathcal{E}_{(\ell)}$ is a source or predecessor to n under one step of the $\ell$th reaction. Similarly, states of the form $m = n + \nu\mathcal{E}_{(\ell)}$ are reachable from n in one step of the $\ell$th reaction.4 Once the graph of the network and the stoichiometry are fixed, we can sum over reactions rather than sources and targets, and consequently the master equation takes the form

$$\frac{d}{dt}P(n,t) = \sum_\ell \mathcal{R}_\ell\big(n - \nu\mathcal{E}_{(\ell)}\big)\,P\big(n - \nu\mathcal{E}_{(\ell)},t\big) - \sum_\ell \mathcal{R}_\ell(n)\,P(n,t). \qquad (17)$$

However, the transition probabilities $\mathcal{R}_\ell(n)$ are not simply the macroscopic rates $\hat{R}_\ell$ if the reactions are second-order (or higher), because as Gillespie (1976) and others have noted, combinatorial effects may play a significant role when the number of molecules is small. Hereafter we restrict attention to mass-action kinetics, and we suppose that the $\ell$th reaction involves conversion of the jth to the kth complex: C(j) → C(k). Then, using the notation of Gillespie (1976), we can write

$$\mathcal{R}_\ell = c_\ell\,h_{j(\ell)}(n), \qquad (18)$$

where $c_\ell$ is the probability per unit time that the molecular species in the jth complex react, $j(\ell)$ denotes the reactant complex for the $\ell$th reaction, and $h_{j(\ell)}(n)$ is the number of independent combinations of the molecular components in this complex. Thus

$$c_\ell = k_\ell\,(N_AV)^{\,1-\sum_i\nu_{ij(\ell)}} = \hat{k}_\ell \qquad (19)$$

and

$$h_{j(\ell)} = \prod_i\binom{n_i}{\nu_{ij(\ell)}}. \qquad (20)$$

In the definition of h we use the standard convention that $\binom{n}{0} = 1$.

We can write the master equation in terms of integer extents in the form

$$\frac{d}{dt}\tilde{P}(\eta,t) = \sum_\ell \mathcal{R}_\ell\big(n_0+\nu\mathcal{E}\eta-\nu\mathcal{E}_{(\ell)}\big)\,P\big(n_0+\nu\mathcal{E}\eta-\nu\mathcal{E}_{(\ell)},\,t\big) - \sum_\ell \mathcal{R}_\ell\big(n_0+\nu\mathcal{E}\eta\big)\,P\big(n_0+\nu\mathcal{E}\eta,\,t\big) = \sum_\ell \tilde{\mathcal{R}}_\ell\big(\eta-e_{(\ell)}\big)\,\tilde{P}\big(\eta-e_{(\ell)},\,t\big) - \sum_\ell \tilde{\mathcal{R}}_\ell(\eta)\,\tilde{P}(\eta,t), \qquad (21)$$

where $e_{(\ell)}$ denotes the $\ell$th unit vector of extents.

Moments of this equation or of (16) can be used to obtain the evolution equations for average extents and from this, the equations for the average change in the numbers. Only in the linear case is the right-hand side of the equation for the first moment the deterministic rate, as is shown in the following subsection. Others have derived a similar master equation for reacting systems, without the explicit inclusion of the underlying graph-theoretic structure (Gardiner, 1983; Rao and Arkin, 2003).
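As an illustration of how (17) can be simulated, the sketch below (our own minimal example, not the authors' code) implements a Gillespie-type jump process in which each reaction changes the state by the corresponding column of $\nu\mathcal{E}$ and fires with the propensity $\mathcal{R}_\ell(n)$; the network and rate constants are the same assumed values used in the deterministic sketch above.

```python
# A minimal Gillespie-type simulation (illustrative sketch) of the jump process in eq. (17):
# reaction l changes the state by the l-th column of nu*E and fires with propensity R_l(n).
# Network: 0 -> A (k3), A -> B (k1), B -> A (k2), B -> 0 (k4); assumed rate values.
import numpy as np

rng = np.random.default_rng(0)
k1, k2, k3, k4 = 1.0, 0.5, 2.0, 0.3
changes = np.array([[ 1,  0],                # 0 -> A      (columns of nu*E, species order A, B)
                    [-1,  1],                # A -> B
                    [ 1, -1],                # B -> A
                    [ 0, -1]])               # B -> 0

def propensities(n):
    nA, nB = n
    return np.array([k3, k1 * nA, k2 * nB, k4 * nB])

def ssa(n0, t_end):
    t, n = 0.0, np.array(n0, dtype=int)
    while True:
        a = propensities(n)
        a0 = a.sum()
        if a0 == 0:
            return n
        t += rng.exponential(1.0 / a0)       # waiting time to the next reaction
        if t > t_end:
            return n
        l = rng.choice(len(a), p=a / a0)     # which reaction fires
        n = n + changes[l]

samples = np.array([ssa([0, 0], 30.0) for _ in range(500)])
print("sample mean:", samples.mean(axis=0)) # close to the deterministic steady state (~5.3, ~6.7)
print("sample var :", samples.var(axis=0))  # for this open conversion network, variance ~ mean
```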

2.3. The master equation for general system of first-order reactions

The stochastic analysis of first-order reaction networks can be done in essentially complete generality, and in this section we analyze all cases in which every reactant and product complex is a species. We thereby exclude only those first-order splitting reactions of the type $M_i \to M_j + M_k$. Our aim is to separate the effects of various types of reactions (catalytic, conversion) on the distribution of the chemical species, and to this end we divide the set of all reactions, represented by the directed edges $E_\ell$, $\ell = 1, 2, \ldots, r$, into four subsets corresponding to the following reactions: production from a constant source (which in fact is a zero-order step), degradation, conversion to another species, and production catalyzed by another species. These four types are summarized in Table 1. The first type represents an explicit input to the system, whereas the last type represents an implicit input.

Table 1.

The four classes of first-order reactions considered in the stochastic model

Label | Type of reaction | Reaction | Rate
I | Production from a source | $\phi \to M_i$ | $k_i^{s}$
II | Degradation | $M_i \to \phi$ | $k_i^{d}\,n_i$
III | Conversion | $M_j \to M_i$ | $k_{ij}^{con}\,n_j$
IV | Catalytic production from source | $\phi \xrightarrow{M_j} M_i$ | $k_{ij}^{cat}\,n_j$

Every species Mi can be produced from a source at a specific rate $k_i^{s}$, and every species can be removed by degradation at a rate proportional to its concentration, with rate constant given by $k_i^{d}$. Each species may participate in two other types of first-order reactions: conversion reactions, in which species Mi is converted to species Mj at a rate proportional to its concentration, and catalytic reactions, in which species Mi catalyzes the formation of species Mj from a source, but is itself unchanged during the process. The first-order rate constant for the conversion reaction of species Mi to species Mj is denoted by $k_{ji}^{con}$, and the first-order rate constant for the catalytic production of species Mj, with species Mi catalyzing the reaction, is denoted by $k_{ji}^{cat}$. Thus there are s uncatalyzed source reactions, s first-order decay reactions, s(s − 1) conversion and $s^2$ catalytic reactions, for a total of up to $2s^2 + s$ reactions.

Since all reactant and product complexes are species, the stoichiometric matrix is

$$\nu = [\,I \mid 0\,]$$

if at least one reaction of type I, II, or IV is present, and

$$\nu = I$$

if the system is closed. The corresponding incidence matrices for the different types are equally simple, and if we order the types as in Table 1, then E can be written as follows.

$$\mathcal{E} = \left[\begin{array}{c|c|c|c} I & -I & \mathcal{E}^{con} & I\otimes\mathbf{1}^{T} \\ -\mathbf{1}^{T} & \mathbf{1}^{T} & 0 & -\mathbf{1}^{T} \end{array}\right] \qquad (22)$$

where $\mathbf{1}^T = (1, 1, \ldots, 1)$, $I$ is the identity matrix of the appropriate dimension, $\mathcal{E}^{con}$ is the incidence matrix for the conversion network, and the second block row corresponds to the null complex $\phi$. Thus the stoichiometry of the reactions and the topology of the network are easily encoded in ν and $\mathcal{E}$, respectively.

It follows easily that the deterministic equations for the first-order reaction network can be written as

$$\frac{dc}{dt} = K^{s}\mathbf{1} - K^{d}c + K^{cat}c + K^{con}c, \qquad (23)$$

where $K^{s} = \mathrm{diag}\{k_i^{s}\}$, $K^{d} = \mathrm{diag}\{k_i^{d}\}$, $K_{ij}^{cat} = k_{ij}^{cat}$ and $K^{con}$ is defined as follows.

$$K_{ij}^{con} = \begin{cases} k_{ij}^{con} & \text{if } i\neq j\\ -\sum_{k\neq j}k_{kj}^{con} & \text{if } i = j.\end{cases}$$

It is clear that reactions of types I and IV induce an increase of 1 in the number of molecules of species i without other changes, type II induces a decrease of one in i alone, and type III induces a decrease of one in j and an increase of 1 in i. Therefore, for reactions of type I the predecessor state to state n is $S_i^{-1}n \equiv (n_1, n_2, \ldots, n_i - 1, \ldots, n_s)$, and the successor state is $S_i^{+1}n \equiv (n_1, n_2, \ldots, n_i + 1, \ldots, n_s)$, where $S_i^{k}$ is the shift operator that increases the ith component of n by an integer amount k. For degradation of Mi, the predecessor state is $S_i^{+1}n$ and the successor state is $S_i^{-1}n$. In type III reactions the predecessor state is $S_i^{-1}S_j^{+1}n$ and the successor state is $S_i^{+1}S_j^{-1}n$. Finally, for the catalytic reaction the predecessor and successor states are $S_i^{-1}n$ and $S_i^{+1}n$, respectively. Using these, the master equation for the first-order reaction network can be written as follows (here and hereafter we drop the explicit time dependence in P(·, ·)).

$$\frac{dP(n)}{dt} = \sum_{i=1}^{s}\Bigg[K_{ii}^{s}\big(S_i^{-1}-1\big)P(n) + \sum_{j=1}^{s}\Big(K_{ij}^{con}\big(S_i^{-1}S_j^{+1}-1\big) + K_{ij}^{cat}\big(S_i^{-1}-1\big) + K_{ij}^{d}\big(S_i^{+1}-1\big)\Big)\big(n_jP(n)\big)\Bigg], \qquad (24)$$

where $S_i^{k}\big(n_iP(n)\big) = \big(S_i^{k}n_i\big)\,P\big(S_i^{k}n\big)$.
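For the simplest open network, production from a source and first-order degradation of a single species, the operator form (24) can be written down explicitly on a truncated state space. The sketch below (illustrative; the truncation level N and the rate values are assumptions) builds that generator and checks that its stationary distribution is Poisson with mean $k^{s}/k^{d}$, anticipating the result proved later for open conversion networks.

```python
# An illustrative sketch of the master equation (24) for the simplest open network:
# production from a source at rate ks and degradation at rate kd*n.  The generator is
# assembled on the truncated state space {0, 1, ..., N} and its stationary distribution
# is compared with a Poisson distribution of mean ks/kd.
import numpy as np
from scipy.linalg import null_space
from scipy.stats import poisson

ks, kd, N = 5.0, 1.0, 60                     # assumed illustrative values; N = truncation level
A = np.zeros((N + 1, N + 1))                 # A[m, n] = rate of the transition n -> m
for n in range(N + 1):
    if n < N:
        A[n + 1, n] += ks                    # source:      n -> n + 1
        A[n, n] -= ks
    if n > 0:
        A[n - 1, n] += kd * n                # degradation: n -> n - 1
        A[n, n] -= kd * n

p = null_space(A)[:, 0]
p = p / p.sum()                              # stationary distribution of the truncated chain
print(np.max(np.abs(p - poisson.pmf(np.arange(N + 1), ks / kd))))   # tiny truncation error
```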

3. Evolution equations for the mean and variance

The master equation derived in the previous section cannot be solved analytically except for a small number of specific simple systems. Usually the objective of a stochastic analysis is to calculate the moments of the distribution of the number of reactant molecules. There are several ways in which evolution equations for the moments of the distribution function can be obtained. One is to multiply both sides of the master equation by ni, ni nj, ni nj nk etc and sum over all possible values of n. Alternatively, one can use the moment generating function (MGF) approach, which is what we use here to calculate the mean and variance for all the reactants in an arbitrary network. The advantage of the MGF approach is that it allows us to get an analytical solution for the MGF of purely conversion systems, thus enabling the calculation of the probability distribution function for the distribution of each reactant in such systems.

Let z ≡ (z1, … , zs) where zi ∈ [0, 1]; then the MGF is defined as

$$G(z,t) = \sum_{n_i=0}^{\infty} z_1^{n_1}\cdots z_s^{n_s}\,P(n).$$

It follows that $G(z,t)|_{z=1} = 1$. The first and second moments can be obtained through successive derivatives of the MGF evaluated after setting all zi to one.

$$M_k(t) = \frac{\partial G(z,t)}{\partial z_k}\bigg|_{z=1} \equiv G_k(z,t)\big|_{z=1} = E[N_k]$$
$$V_{lk}(t) = G_{lk}(z,t)\big|_{z=1} = \begin{cases} E[N_lN_k] & \text{if } l \neq k\\ E[N_k^2] - E[N_k] & \text{if } l = k,\end{cases}$$

where E[ ] denotes the expectation of the quantity in the square brackets, and for any combination of indices we define

$$G_{ij\cdots k,\,t} \equiv \frac{\partial}{\partial z_i}\frac{\partial}{\partial z_j}\cdots\frac{\partial}{\partial z_k}\frac{\partial}{\partial t}\,G.$$

The probability distribution of the number of molecules of the ith species (Ni) at steady state can be derived from the MGF. Differentiating the MGF k times w.r.t. zi, we get

$$\frac{\partial^{k}G(z,t)}{\partial z_i^{k}} = \sum_{n_i=k}^{\infty}\frac{n_i!}{(n_i-k)!}\,z_i^{\,n_i-k}\sum_{n_j=0,\,j\neq i}^{\infty} z_1^{n_1}\cdots z_{i-1}^{n_{i-1}}z_{i+1}^{n_{i+1}}\cdots z_s^{n_s}\,P(n,t),$$

and therefore

$$\frac{\partial^{k}G(z,t)}{\partial z_i^{k}}\bigg|_{z_i=0,\,z_j=1,\,j\neq i} = k!\sum_{n_j,\,j\neq i}P_{n_1,\ldots,n_{i-1},k,n_{i+1},\ldots,n_s}(t) = k!\,P_{N_i}(k,t),$$

where PNi(k,t) represents the marginal probability density function of Ni. Therefore

$$P_{N_i}(k,t) = \frac{1}{k!}\,\frac{\partial^{k}G(z,t)}{\partial z_i^{k}}\bigg|_{z_i=0,\,z_j=1,\,j\neq i}.$$

These relationships are valid for the MGF corresponding to any reaction network. In the case of a system of reactions with first-order kinetics we obtain the partial differential equation for the MGF as

$$\frac{\partial G}{\partial t} = \sum_{i=1}^{s}(z_i - 1)\Bigg(K_{ii}^{s}\,G + \sum_{j=1}^{s}\big(K_{ij}^{con} + K_{ij}^{cat}z_j - K_{ij}^{d}\big)\,G_j\Bigg). \qquad (25)$$

Alternatively, and somewhat more directly, one can use a backward equation to obtain Eq. (25) (see the Appendix).

For simple network topologies with a small number of nodes, (25) can be solved analytically to get the complete characterization of the evolution of the probability distribution function P(n). We shall later derive such expressions for systems where only conversion reactions occur. However, this is not feasible for an arbitrary network structure, and therefore we first outline the procedure for obtaining the evolution equation for the moments, and then we focus on the first two moments. Differentiating Eq. (25) with respect to zk, we obtain

$$G_{k,t} = \sum_{i=1}^{s}(z_i-1)\Bigg(K_{ii}^{s}G_k + \sum_{j=1}^{s}\big(K_{ij}^{con}+K_{ij}^{cat}z_j-K_{ij}^{d}\big)G_{jk} + K_{ik}^{cat}G_k\Bigg) + \Bigg(K_{kk}^{s}G + \sum_{j=1}^{s}\big(K_{kj}^{con}+K_{kj}^{cat}z_j-K_{kj}^{d}\big)G_j\Bigg). \qquad (26)$$

Therefore the evolution equation for the mean of the kth component is

$$\frac{d}{dt}E[N_k] = \sum_{j=1}^{s}\big(K_{kj}^{con}+K_{kj}^{cat}-K_{kj}^{d}\big)E[N_j] + K_{kk}^{s},$$

or in matrix form

$$M'(t) = \big(K^{con} + K^{cat} - K^{d}\big)M(t) + K^{s}\mathbf{1} \qquad (27)$$
$$\phantom{M'(t)} = K\,M(t) + K^{s}\mathbf{1}, \qquad (28)$$

where $M(t) = [E[N_1(t)], \ldots, E[N_s(t)]]^{T}$ and K is defined by the second equality. From this one sees that the explicit inputs contained in the last term serve as a nonhomogeneous forcing term for the evolution of the mean. This equation is identical to Eq. (23) given earlier for the evolution of the deterministic first-order system. Therefore a general stochastic system of first-order reactions comprising catalytic and conversion steps is “consistent in the mean” with the corresponding deterministic system, as is well known (Darvey and Staff, 1966).
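A minimal sketch of (28), using assumed rate values for a two-species cascade in which a source produces M1, M1 catalyzes production of M2, and both species degrade:

```python
# A small sketch (illustrative, assumed rates) of eq. (28): dM/dt = K M + K^s 1 with
# K = K^con + K^cat - K^d, for a two-species catalytic cascade.
import numpy as np
from scipy.integrate import solve_ivp

ks   = np.array([4.0, 0.0])                  # k_i^s
Kd   = np.diag([1.0, 0.5])                   # k_i^d on the diagonal
Kcat = np.array([[0.0, 0.0],                 # Kcat[i, j] = rate constant for phi --M_j--> M_i
                 [2.0, 0.0]])
Kcon = np.zeros((2, 2))                      # no conversion reactions in this example
K = Kcon + Kcat - Kd

sol = solve_ivp(lambda t, M: K @ M + ks, (0, 10), [0.0, 0.0])
print("mean at t = 10      :", sol.y[:, -1])
print("steady state -K^-1 ks:", -np.linalg.solve(K, ks))   # (4, 16) for these assumed rates
```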

Higher moments of the distribution are obtained by successive differentiation of (25). For the second moment we obtain

$$G_{lk,t} = \sum_{i=1}^{s}(z_i-1)\Bigg(K_{ii}^{s}G_{kl} + \sum_{j=1}^{s}\big(K_{ij}^{con}+K_{ij}^{cat}z_j-K_{ij}^{d}\big)G_{jkl} + K_{ik}^{cat}G_{kl} + K_{il}^{cat}G_{lk}\Bigg) + \Bigg(K_{ll}^{s}G_k + \sum_{j=1}^{s}\big(K_{lj}^{con}+K_{lj}^{cat}z_j-K_{lj}^{d}\big)G_{jk} + K_{lk}^{cat}G_k\Bigg) + \Bigg(K_{kk}^{s}G_l + \sum_{j=1}^{s}\big(K_{kj}^{con}+K_{kj}^{cat}z_j-K_{kj}^{d}\big)G_{jl} + K_{kl}^{cat}G_l\Bigg).$$

Thus the matrix containing the second-order moments can be written as

$$V'(t) = K\,V(t) + \big(K\,V(t)\big)^{T} + \Gamma(t) + \Gamma^{T}(t), \qquad (29)$$

where

$$\Gamma_{ij}(t) \equiv \big(K_{ij}^{cat} + K_{ii}^{s}\big)\,M_j(t).$$

It is noteworthy that in this equation both the implicit and explicit inputs, as well as the mean, appear in the forcing term for the evolution of the second moments. In the equation for the mean the forcing is time-independent, but here the forcing is time-dependent via the appearance of the mean in this term. Later we will see how this time-dependence is filtered via the action of the kinetic matrix. The somewhat unusual structure on the right-hand side arises from the fact that V is a symmetric matrix, and thus the evolution equation for it must be symmetric as well.

The equations for the mean and the second moments can be integrated to get the first two moments of the distribution. Thus we now have an algorithm to compute both the steady-state and time-dependent behavior of the mean and variance of every species in a general first-order network. Next we analyze this behavior in detail.
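The algorithm can be summarized in a few lines of code. The sketch below (an illustration with assumed rates, not the authors' implementation) integrates (27)–(29) for the same two-species catalytic cascade used above and recovers the variance from $V_{ii} = E[N_i^2] - E[N_i]$:

```python
# A sketch (assumed example) integrating the coupled moment equations (27)-(29):
# dM/dt = K M + K^s 1,  dV/dt = K V + (K V)^T + Gamma + Gamma^T,
# with Gamma_ij = (Kcat_ij + Ks_ii) M_j.  V stores E[N_i N_j] off-diagonal and
# E[N_i^2] - E[N_i] on the diagonal, as in the text.
import numpy as np
from scipy.integrate import solve_ivp

s = 2
ks   = np.array([4.0, 0.0])
Kd   = np.diag([1.0, 0.5])
Kcat = np.array([[0.0, 0.0], [2.0, 0.0]])
K = Kcat - Kd                                # no conversion reactions in this example

def rhs(t, y):
    M, V = y[:s], y[s:].reshape(s, s)
    Gamma = (Kcat + ks[:, None]) * M[None, :]
    dM = K @ M + ks
    dV = K @ V + (K @ V).T + Gamma + Gamma.T
    return np.concatenate([dM, dV.ravel()])

y0 = np.zeros(s + s * s)                     # no molecules initially
sol = solve_ivp(rhs, (0, 40), y0, rtol=1e-8)
M, V = sol.y[:s, -1], sol.y[s:, -1].reshape(s, s)
var = np.diag(V) + M - M**2                  # Var(N_i) = (E[N_i^2] - E[N_i]) + E[N_i] - E[N_i]^2
print("mean    :", M)
print("variance:", var)                      # the catalytic step makes Var(N_2) exceed its mean
```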

3.1. The steady-state and time-dependent solution for the mean

The steady-state solution Ms for the mean is the solution of

$$K\,M^{s} = -K^{s}\mathbf{1} \equiv -k^{s}, \qquad (30)$$

where $k^{s} = (k_1^{s}, \ldots, k_s^{s})^{T}$. The nature of the reaction types and rates, as reflected in the spectral structure of K, dictates the steady-state mean. In general, if K is singular then $K^{s}\mathbf{1}$ must lie in the range of K, and in particular, if there are no explicit inputs ($K^{s} = 0$) then $M^{s}$ is given by an eigenvector corresponding to a zero eigenvalue of K. On the other hand, if degradation reactions are the only type present, and all species react, then K is nonsingular and $M^{s} = 0$ is the only solution. This case is of little interest and will be excluded in what follows. Results concerning the localization of the spectrum of K are summarized in the following theorem.

Theorem 1. The eigenvalues of K have non-positive real parts if either of the following conditions hold.

  (a) The sum of the specific rates of formation of each species Mi by conversion and catalytic reactions does not exceed the sum of the specific rates of loss of Mi by conversion reactions and degradation (this implies that the row sums of K are non-positive).

  (b) The sum of the specific rates of formation catalyzed by each species Mi is less than or equal to the specific rate of degradation of that species.

Proof. The statement in (a) can be translated into the inequality

$$\sum_{j\neq i}K_{ij}^{con} + \sum_{j}K_{ij}^{cat} \;\le\; \sum_{j\neq i}K_{ji}^{con} + K_{ii}^{d}. \qquad (31)$$

Since all the terms are non-negative, each of the terms on the left-hand side of the inequality is no larger than the right-hand side. In particular,

$$K_{ii}^{cat} \;\le\; \sum_{j\neq i}K_{ji}^{con} + K_{ii}^{d}, \qquad (32)$$

and therefore

$$K_{ii} = K_{ii}^{con} + K_{ii}^{cat} - K_{ii}^{d} \qquad (33)$$
$$\phantom{K_{ii}} = -\sum_{j\neq i}K_{ji}^{con} + K_{ii}^{cat} - K_{ii}^{d} \qquad (34)$$
$$\phantom{K_{ii}} \le 0. \qquad (35)$$

Using the definitions of $K_{ii}^{con}$ and $K_{ii}$, we can rewrite (31) as

$$\sum_{j\neq i}\big[K_{ij}^{con} + K_{ij}^{cat}\big] \;\le\; -\big[K_{ii}^{con} + K_{ii}^{cat} - K_{ii}^{d}\big] \qquad (36)$$
$$\Longrightarrow\qquad \sum_{j\neq i}K_{ij} \;\le\; -K_{ii}. \qquad (37)$$

Now, bounds on the eigenvalues of K are given by the Levy–Hadamard theorem (Bodewig, 1959), which states that for every eigenvalue λ of K,

$$|\lambda - K_{ii}| \;\le\; \sum_{j\neq i}|K_{ij}|.$$

Since the off-diagonal elements of K are non-negative, this can be written as

$$|\lambda - K_{ii}| \;\le\; \sum_{j\neq i}K_{ij} \;\le\; -K_{ii}, \qquad (38)$$

where the second inequality follows from (37). Since $K_{ii} \le 0$, it follows that Re(λ) ≤ 0, which proves (a).

To prove (b) we do a similar analysis, using the Levy–Hadamard theorem expressed in terms of a sum over the columns of K. The constraint (b) can be written as

$$\sum_{j}K_{ji}^{cat} \;\le\; K_{ii}^{d}.$$

By reasoning similar to that used in the proof of (a), this constraint leads to the relations

$$K_{ii} \le 0$$
$$\sum_{j\neq i}K_{ji} \;\le\; -K_{ii}.$$

The Levy–Hadamard theorem applied to the columns of K states that

$$|\lambda - K_{ii}| \;\le\; \sum_{j\neq i}K_{ji},$$

and the proof of (b) now follows as before. □

Remark 2. (i) A special case of (a) shows that in a closed system the eigenvalues have non-positive real parts, for in that case $K_{ii}^{d} = 0$, $K_{ii}^{cat} = 0$ and $K_{jj}^{con} = -\sum_{i\neq j}K_{ij}^{con}$. In fact in that case it is known that the eigenvalues are all real as well if the system satisfies detailed balance (Gans, 1960; Wei and Prater, 1962). If the underlying graph is strongly connected, then there is exactly one zero eigenvalue (Othmer, 1979).

The theorem gives conditions for the stability of an arbitrary first-order reaction system in terms of the specific rates of the reactions. The first sufficient condition is easily understood: one expects a system to be stable when the specific rate of production of each species does not exceed its specific rate of loss. The second criterion, which requires that the rate at which any component catalyzes the formation of other species is less than its degradation rate, is less immediately obvious. It is, however, an important relationship that can be used to guarantee stability of the mean in artificial transcriptional networks: once the inequality between the catalytic and degradation rates is satisfied, the system is stable irrespective of the conversion reactions present, and this may be exploited in the design of such networks. In the theory of branching processes condition (b) is equivalent to the assumption that the offspring distribution has mean less than or equal to 1.
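The conditions are easy to test numerically. The following sketch (assumed random rate constants, chosen only for illustration) constructs a K that satisfies condition (b) by making each column sum of $K^{cat}$ no larger than the corresponding degradation rate, and confirms that no eigenvalue has positive real part:

```python
# A quick numerical check (illustrative) of Theorem 1: condition (b) requires that each
# column sum of Kcat does not exceed the corresponding degradation rate; when it holds,
# every eigenvalue of K = Kcon + Kcat - Kd has non-positive real part.
import numpy as np

rng = np.random.default_rng(1)
s = 5
Kcat = rng.uniform(0.0, 0.2, (s, s))                         # assumed random catalytic rates
kcon = rng.uniform(0.0, 1.0, (s, s)); np.fill_diagonal(kcon, 0.0)
Kcon = kcon - np.diag(kcon.sum(axis=0))                      # conversion part: columns sum to zero
Kd   = np.diag(Kcat.sum(axis=0) + rng.uniform(0.1, 1.0, s))  # enforce condition (b) with slack
K = Kcon + Kcat - Kd

print("condition (b) satisfied:", np.all(Kcat.sum(axis=0) <= np.diag(Kd)))
print("max Re(eigenvalue)     :", np.max(np.linalg.eigvals(K).real))   # stays non-positive
```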

Further information about the structure of the solution can be obtained from a spectral representation of K. Hereafter we assume that K is semisimple (i.e., it has a complete set of eigenvectors), which is the generic case, and then it has the spectral representation

$$K = \sum_i \lambda_i\,P_i, \qquad (39)$$

where λi is the ith eigenvalue of K and Pi is the associated projection onto the span of the eigenvectors associated with λi. Since K is assumed to be semisimple, the projections have the property that $\sum_i P_i = I$.

The projections have the representation

$$P_i = \sum_{p=1}^{m_i}\phi_p \ast \overline{\phi_p^{*}}, \qquad (40)$$

where mi is the algebraic multiplicity of the ith eigenvalue, ‘*’ represents the dyad product, and the ϕ’s and ϕ*’s are the corresponding eigenvectors and adjoint eigenvectors, defined via

$$K\,\phi_p = \lambda_i\,\phi_p \qquad \text{for } p = 1, \ldots, m_i \qquad (41)$$
$$K^{T}\phi_p^{*} = \overline{\lambda_i}\,\phi_p^{*} \qquad \text{for } p = 1, \ldots, m_i. \qquad (42)$$

They can be chosen to satisfy the orthogonality relations

$$\langle\phi_p^{*},\phi_q\rangle = \sum_{i=1}^{s}\big(\overline{\phi_p^{*}}\big)_i\big(\phi_q\big)_i = \delta_{pq},$$

and the projection of any vector is defined as

$$P_i\,u \equiv \Big(\sum_{p=1}^{m_i}\phi_p\ast\overline{\phi_p^{*}}\Big)\,u = \sum_{p=1}^{m_i}\langle\phi_p^{*},u\rangle\,\phi_p.$$

When K is invertible the steady-state mean can be written as

$$M^{s} = -\Big(\sum_i\frac{P_i}{\lambda_i}\Big)k^{s} = -\sum_i\sum_{p=1}^{m_i}\frac{\langle\phi_p^{*},k^{s}\rangle}{\lambda_i}\,\phi_p. \qquad (43)$$

Thus in the nonsingular case the steady-state mean is a weighted sum of the projections of $k^{s}$ onto the eigenspaces of K, each weighted by the reciprocal of the corresponding eigenvalue. In particular, projections corresponding to eigenvalues of large modulus contribute less to the sum than those of small modulus. If K has a d-dimensional null space there are vectors $\{\eta_1, \eta_2, \ldots, \eta_d\}$ with the property that the functionals $\langle\eta_i, M\rangle$ are time-invariant, and this restricts the dynamics and steady-state solution to a lower-dimensional set. We leave the details of this case to the reader.

The transient solution of (28) is given by

$$M(t) = e^{Kt}M(0) + \Big(\int_0^t e^{K(t-\tau)}\,d\tau\Big)K^{s}\mathbf{1} = \Big(\sum_i e^{\lambda_i t}P_i\Big)M(0) + \Big(\int_0^t\sum_i e^{\lambda_i(t-\tau)}P_i\,d\tau\Big)K^{s}\mathbf{1} = \Big(\sum_i e^{\lambda_i t}P_i\Big)M(0) - \sum_i\frac{\big(1-e^{\lambda_i t}\big)P_i}{\lambda_i}K^{s}\mathbf{1}. \qquad (44)$$

Since $e^{Kt}$ is non-negative, i.e. all its entries are non-negative, the solution remains non-negative if M(0) is non-negative. Therefore if there are no eigenvalues with a positive real part the solution converges to a non-negative steady state. If the real parts of all eigenvalues are negative, the solution is globally asymptotically stable. However these conditions do not guarantee that the solution has strictly positive components, i.e. that it does not lie on the boundary of the positive ‘orthant’ of $\mathbb{R}^s$. Under stronger hypotheses one can guarantee that the solution lies in the interior of the orthant, as shown in the following theorem.
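The spectral solution (44) is straightforward to evaluate numerically when the eigenvalues are distinct. In the sketch below (an assumed 2 × 2 open conversion example) the projections $P_i$ are formed from the right eigenvectors of K and the rows of the inverse eigenvector matrix, and M(t) is seen to approach $-K^{-1}k^{s}$:

```python
# A sketch (illustrative values) of the spectral solution (44):
#   M(t) = sum_i e^{lambda_i t} P_i M(0) - sum_i (1 - e^{lambda_i t}) P_i k^s / lambda_i,
# with the steady state M^s = -K^{-1} k^s.  Assumes distinct eigenvalues.
import numpy as np

K  = np.array([[-1.0, 0.2],
               [ 0.5, -0.8]])                # assumed open two-species conversion network
ks = np.array([3.0, 1.0])
lam, Phi = np.linalg.eig(K)                  # right eigenvectors are the columns of Phi
Psi = np.linalg.inv(Phi)                     # rows of Psi play the role of adjoint eigenvectors
P = [np.outer(Phi[:, i], Psi[i, :]) for i in range(len(lam))]   # spectral projections

def mean(t, M0):
    terms = [np.exp(lam[i] * t) * (P[i] @ M0)
             - (1 - np.exp(lam[i] * t)) / lam[i] * (P[i] @ ks) for i in range(len(lam))]
    return np.real(sum(terms))

print("M(2)        :", mean(2.0, np.zeros(2)))
print("steady state:", -np.linalg.solve(K, ks))   # mean(t) approaches this as t grows
```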

Theorem 3. Suppose that the graph $\mathcal{G}$ associated with the reactions is strongly connected, and that the eigenvalues of K have negative real parts. Then $K^{-1} < 0$, and if there is at least one species produced by a source then the solution of (30) is component-wise positive.

Proof. Notice that K is irreducible since the graph $\mathcal{G}$ is strongly connected. Write $K = K_+ - \kappa I$, where $K_+ \ge 0$ and κ > 0. Observe that $K_+$ is also irreducible. Let

$$Q = -K = \kappa I - K_+. \qquad (45)$$

Since the eigenvalues of K have negative real parts by hypothesis, the eigenvalues λ(Q) of Q have positive real parts and

$$\lambda(Q) = \kappa - \lambda(K_+).$$

Let $\rho(K_+)$ be the spectral radius of $K_+$; then by the Perron–Frobenius theorem $\rho(K_+)$ is a simple positive real eigenvalue of $K_+$ and $\kappa - \rho(K_+)$ is an eigenvalue of Q. Thus $\kappa - \rho(K_+) > 0$, i.e., $\rho(K_+)/\kappa < 1$. Since $\rho(K_+)/\kappa < 1$, the series

$$I + \frac{K_+}{\kappa} + \Big(\frac{K_+}{\kappa}\Big)^2 + \cdots$$

converges, and so

$$\Big(I - \frac{K_+}{\kappa}\Big)^{-1} = \sum_{n=0}^{\infty}\Big(\frac{K_+}{\kappa}\Big)^{n}.$$

Since $I + K_+/\kappa$ is an s × s irreducible non-negative matrix with positive diagonal elements, it follows that

$$\Big(I + \frac{K_+}{\kappa}\Big)^{s-1} > 0,$$

which implies that

$$I + \frac{K_+}{\kappa} + \Big(\frac{K_+}{\kappa}\Big)^2 + \cdots + \Big(\frac{K_+}{\kappa}\Big)^{s-1} > 0.$$

Therefore

$$Q^{-1} = \kappa^{-1}\Big(I - \frac{K_+}{\kappa}\Big)^{-1} = \frac{1}{\kappa}\Big(I + \frac{K_+}{\kappa} + \Big(\frac{K_+}{\kappa}\Big)^2 + \cdots\Big) > 0.$$

Thus $Q^{-1} > 0$, so $K^{-1} < 0$, and finally, if there is at least one species produced by a source, i.e., $k_i^{s} > 0$ for at least one i, then

$$M^{s} = -K^{-1}k^{s} > 0.$$

Next we analyze the evolution of the second moment, and find that the conditions (a) and (b) in Theorem 1 will also guarantee the stability of the second moments.

3.2. Evolution of the second moment

One can show (cf. Appendix) that the evolution equation for the second moment (29) can be written in the form5

$$\frac{d\mathbf{v}}{dt} = \mathbb{V}\,\mathbf{v} + \gamma, \qquad (46)$$

where

$$\mathbf{v}(t) = \mathrm{col}\big(V(t)\big), \qquad \mathbb{V} = K\otimes I_n + I_n\otimes K \qquad \text{and} \qquad \gamma(t) = \mathrm{col}\big(\Gamma(t)+\Gamma(t)^{T}\big).$$

Here the notation col(A) denotes a vector of length s2 whose elements are the columns of A stacked in order (cf. Appendix). By the definition of the tensor product

$$\mathbb{V}\,(\phi_i\otimes\phi_j) = (K\otimes I_n + I_n\otimes K)(\phi_i\otimes\phi_j) = K\phi_i\otimes I_n\phi_j + I_n\phi_i\otimes K\phi_j = (\lambda_i+\lambda_j)(\phi_i\otimes\phi_j), \qquad (47)$$

and therefore $\phi_i\otimes\phi_j$ are the eigenvectors of $K\otimes I_n + I_n\otimes K$ corresponding to λi + λj. It follows that if Pi and Pj are the projections associated with the ith and jth eigenvalues, respectively, then

$$P_{ij} \equiv P_i\otimes P_j = \Big(\sum_{p=1}^{m_i}\phi_p\ast\overline{\phi_p^{*}}\Big)\otimes\Big(\sum_{q=1}^{m_j}\phi_q\ast\overline{\phi_q^{*}}\Big) \qquad (48)$$

is the projection associated with λi + λj. Consequently

$$\mathbb{V} = \sum_{i,j}(\lambda_i+\lambda_j)\,P_{ij}.$$

After some simplification one finds (cf. Appendix) that the solution for the second moment is

$$
\begin{aligned}
\mathbf{v} &= e^{\mathbb{V}t}\mathbf{v}(0) + \int_0^t e^{\mathbb{V}(t-\tau)}\gamma(\tau)\,d\tau
= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}P_{ij}\mathbf{v}(0) + \sum_{i,j}\int_0^t e^{(\lambda_i+\lambda_j)(t-\tau)}P_{ij}\gamma(\tau)\,d\tau\\
&= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}P_{ij}\mathbf{v}(0)
+ \sum_{i,j}\Bigg(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_i t}}{\lambda_j}\,\big(P_iM_0\otimes P_j\big)
+ \Big(\frac{1}{\lambda_i\lambda_j}\big(e^{(\lambda_i+\lambda_j)t}-e^{\lambda_i t}\big) - \frac{1}{\lambda_i(\lambda_i+\lambda_j)}\big(e^{(\lambda_i+\lambda_j)t}-1\big)\Big)\big(P_iS\otimes P_j\big)\\
&\qquad\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_j t}}{\lambda_i}\,\big(P_i\otimes P_jM_0\big)
+ \Big(\frac{1}{\lambda_i\lambda_j}\big(e^{(\lambda_i+\lambda_j)t}-e^{\lambda_j t}\big) - \frac{1}{\lambda_j(\lambda_i+\lambda_j)}\big(e^{(\lambda_i+\lambda_j)t}-1\big)\Big)\big(P_i\otimes P_jS\big)\Bigg)\,\mathrm{col}\,K^{s}\\
&\qquad + \sum_{i,j}\int_0^t e^{(\lambda_i+\lambda_j)(t-\tau)}\big(P_iK^{cat}\otimes P_j + P_i\otimes P_jK^{cat}\big)\,\mathrm{col}\,M_d\,d\tau, \qquad (49)
\end{aligned}
$$

where $M_0 = [M(0)\,|\,M(0)\,|\cdots|\,M(0)]$, $S = [k^{s}\,|\,k^{s}\,|\cdots|\,k^{s}]$ and $M_d = \mathrm{diag}\{M_1(t), M_2(t), \ldots, M_n(t)\}$. From (49) and (47) it is clear that the time-dependent behavior of the covariance is governed by the set of eigenvalues $\lambda_i^{K}+\lambda_j^{K}$, and if the eigenvalues of K have negative real parts, so do the eigenvalues of $\mathbb{V}$. Thus the sufficient conditions for the stability of the mean derived in the preceding section also guarantee the stability of the second moment. For a closed system, one of the eigenvalues is zero and hence the longest characteristic time for the evolution of M and V will be identical. For an open system, the characteristic time for the evolution of the second moment will be half as large as that for the evolution of M. Note however that $V_{ii} = E[N_i(t)^2] - E[N_i(t)]$, and therefore this should not be interpreted to imply that the variance of the number of molecules of a particular species evolves twice as rapidly as the mean.
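The statement about the eigenvalues of $\mathbb{V}$ is simply the Kronecker-sum property, which can be checked directly (the matrix below is the same assumed illustrative example used earlier):

```python
# A short check (illustrative) that V = K (x) I + I (x) K, which governs col(V) in (46),
# has eigenvalues lambda_i + lambda_j, so its homogeneous part relaxes twice as fast as the mean.
import numpy as np

K = np.array([[-1.0, 0.2],
              [ 0.5, -0.8]])
I = np.eye(2)
V = np.kron(K, I) + np.kron(I, K)            # Kronecker sum acting on col(V)
lam = np.linalg.eigvals(K)
pair_sums = np.sort(np.array([li + lj for li in lam for lj in lam]).real)
print(np.allclose(np.sort(np.linalg.eigvals(V).real), pair_sums))   # True
```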

From the expression for v, one obtains the variance of the lth species (cf. Appendix) explicitly as

$$\sigma_l^2(t) = -\sum_{i,j} e^{(\lambda_i+\lambda_j)t}\big((P_i\odot P_j)M(0)\big)_l + \sum_i e^{\lambda_i t}\Big[(P_iM(0))_l + \frac{(P_ik^{s})_l}{\lambda_i}\Big] - \sum_i \frac{1}{\lambda_i}(P_ik^{s})_l + \sum_{i,j}\sum_k D(l,k,j)\Bigg[\sum_r\Big(\big((P_rM(0))_k + \frac{(P_rk^{s})_k}{\lambda_r}\big)f_{r,i,j}(t) - \frac{(P_rk^{s})_k}{\lambda_r(\lambda_i+\lambda_j)}\big(e^{(\lambda_i+\lambda_j)t}-1\big)\Big)\Bigg],$$

where

$$D(l,k,j) = (P_j)_{lk}\,C(l,k,i) + (P_i)_{lk}\,C(l,k,j),$$
$$C(l,k,i) = \sum_{m=1}^{n}(P_i)_{lm}\,K_{mk}^{cat},$$
$$f_{r,i,j}(t) = \begin{cases}\dfrac{1}{\lambda_r - \lambda_i - \lambda_j}\big(e^{\lambda_r t} - e^{(\lambda_i+\lambda_j)t}\big) & \text{if } \lambda_r \neq \lambda_i + \lambda_j\\[4pt] t\,e^{(\lambda_i+\lambda_j)t} & \text{if } \lambda_r = \lambda_i + \lambda_j\end{cases}$$

and ⊙ denotes element-wise multiplication. Using the expression for the mean of the lth reactant given in (44), we obtain the relationship between the mean and variance of every species in the reaction network.

$$\sigma_l^2(t) = -\sum_{i,j} e^{(\lambda_i+\lambda_j)t}\big((P_i\odot P_j)M(0)\big)_l + M_l(t) + \sum_{i,j}\sum_k D(l,k,j)\Bigg[\sum_r\Big(\big((P_rM(0))_k + \frac{(P_rk^{s})_k}{\lambda_r}\big)f_{r,i,j}(t) - \frac{(P_rk^{s})_k}{\lambda_r(\lambda_i+\lambda_j)}\big(e^{(\lambda_i+\lambda_j)t}-1\big)\Big)\Bigg]. \qquad (50)$$

From this expression one can calculate the evolution of the Fano factor $F_l = \sigma_l^2/M_l$ and $CV_l = \sigma_l/M_l$ for every species, and thus determine the effect of various network structures and reaction types on the noise. We study several examples in the following section.
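At steady state (29) reduces to a Lyapunov equation, so both noise measures can be computed without integrating in time. The sketch below (assumed rates, the same two-species catalytic cascade used earlier) does this and shows a Fano factor of one for the species fed directly from the source and a Fano factor greater than one for the catalytically produced species:

```python
# An illustrative steady-state computation of the two noise measures.  At steady state
# (29) is the Lyapunov equation K V + V K^T = -(Gamma + Gamma^T), from which
# F_i = sigma_i^2 / M_i and CV_i = sigma_i / M_i follow.  Rates are assumed values.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

ks   = np.array([4.0, 0.0])
Kd   = np.diag([1.0, 0.5])
Kcat = np.array([[0.0, 0.0], [2.0, 0.0]])
K = Kcat - Kd                                # no conversion reactions in this example

M = -np.linalg.solve(K, ks)                  # steady-state mean
Gamma = (Kcat + ks[:, None]) * M[None, :]
V = solve_continuous_lyapunov(K, -(Gamma + Gamma.T))
var = np.diag(V) + M - M**2                  # Var(N_i), since V_ii = E[N_i^2] - E[N_i]
print("Fano factor:", var / M)               # = 1 for species 1, > 1 for species 2
print("CV         :", np.sqrt(var) / M)
```

The same routine applies to any first-order network once $K^{s}$, $K^{d}$, $K^{con}$ and $K^{cat}$ are specified, which is how the examples in the next section can be reproduced.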

4. The effect of network structure on the dynamics

The master equation for a system of first-order chemical reactions reflects three major characteristics of the system, (i) whether it is open or closed, (ii) the topology of the network of the chemical interactions, and (iii) whether or not the reacting mixture is spatially uniform, i.e., whether or not diffusive or other transport mechanisms play an important role. The effect of each of these factors on the distribution of species undergoing first-order catalytic and conversion reactions can be studied using the general results of the preceding section.

4.1. Open and closed conversion systems

In the context of first-order reaction dynamics, catalytic systems are necessarily open because they involve production from a source catalyzed by a time-dependent species (cf. Table 1). Thus the comparison of open and closed systems can only be made for those in which there are no catalytic reactions. Therefore we compare open conversion networks in which there is at least one Type I reaction and one Type II reaction, with closed networks in which all reactions are type III and thus the total mass is constant.

The equation for the MGF for a system of stochastic conversion reactions can be solved analytically (Gans, 1960; Darvey and Staff, 1966), and for closed conversion networks it has been shown that the distribution is multinomial when the eigenvalues are distinct (Darvey and Staff, 1966). We derive the general result via a backward equation. For open systems we prove that the distribution is Poisson, and we demonstrate how the choice of the noise measure leads to differing conclusions about the noise in open and closed conversion networks.

4.1.1. Closed conversion systems

In a closed system of linear reactions the molecules independently execute a random walk through the states, where the state of a molecule is the property of being a molecule of type i, i = 1, …, s, where s is the number of species or states. We let X(t) be the random process whose value at t is the state of a given molecule. Let $P_{ij}$ be Pr{X(t) = j | X(0) = i}; then the matrix P of transition probabilities satisfies the backward Kolmogorov differential equation

$$\frac{dP(t)}{dt} = K^{T}P(t), \qquad (51)$$

where $K = K^{con}$. Since P(0) = I, this has the solution

$$P(t) = e^{K^{T}t}.$$

If we denote by pi(t) the probability that the molecule is of type i at time t, then p(t) = (p1(t), … , ps(t))T is the solution of

$$\frac{dp}{dt} = K\,p,$$

and therefore

$$p(t) = P(t)^{T}\,p(0) = e^{Kt}\,p(0). \qquad (52)$$

Assume that the graph of the underlying network is strongly-connected; then K has exactly one zero eigenvalue and the equilibrium probability distribution is given by

$$\pi_i = \frac{\phi_{1,i}}{\sum_j \phi_{1,j}}, \qquad (53)$$

where ϕ1 is the unique positive eigenvector of K corresponding to the zero eigenvalue.

Now if there are a total of N molecules in the system, the joint equilibrium distribution is the multinomial distribution

$$\pi(n_1, n_2, \ldots, n_s) = \frac{N!}{n_1!\,n_2!\cdots n_s!}\prod_{i=1}^{s}\pi_i^{\,n_i}, \qquad (54)$$

since there are s classes to put the N molecules into, and the probability of the ith class is πi. In particular, the number of individuals of the ith species is binomially distributed according to

$$\pi(n_i) = \binom{N}{n_i}\pi_i^{\,n_i}(1-\pi_i)^{N-n_i}. \qquad (55)$$

If in addition the joint distribution of molecular numbers of two species is multinomial initially, then the joint distribution at any time t is also multinomial. First suppose there are only two molecular species and N molecules initially. Then for 0 ≤ mN, one can show that (see the Appendix for details)

$$
\begin{aligned}
\Pr\big(N_1(t)=m,\,N_2(t)=N-m\big) &= \sum_{i=0}^{N}\Pr\big(N_1(t)=m,\,N_2(t)=N-m \mid N_1(0)=i,\,N_2(0)=N-i\big)\,\Pr\big(N_1(0)=i,\,N_2(0)=N-i\big)\\
&= \sum_{k=0}^{m}\sum_{i=k}^{N-m+k}\binom{i}{k}\big(p_t^{\,k}(1,1)\,p_t^{\,i-k}(1,2)\big)\binom{N-i}{m-k}\big(p_t^{\,m-k}(2,1)\,p_t^{\,N-i-m+k}(2,2)\big)\binom{N}{i}\,p_1(0)^{i}\,p_2(0)^{N-i}\\
&= \binom{N}{m}\big(p_1(t)\big)^{m}\big(p_2(t)\big)^{N-m},
\end{aligned}
$$

where the last step follows from (52). Thus {N1(t), N2(t)} has a binomial distribution if initially it has a binomial distribution. By induction, it follows that the joint distribution of {N1(t), N2(t), …,Ns(t)} is multinomial if the initial distribution is multinomial. That is to say, if the initial joint distribution of molecular numbers of species is multinomial, i.e.,

$$P(n,0) \equiv \Pr\big(N_1(0)=n_1,\ldots,N_s(0)=n_s\big) = \frac{N!}{n_1!\cdots n_s!}\,\big(p_1(0)\big)^{n_1}\cdots\big(p_s(0)\big)^{n_s},$$

then the joint distribution at any time t is also multinomial and the density function is given by

$$P\big(N_1(t)=n_1,\ldots,N_s(t)=n_s\big) = \frac{N!}{n_1!\cdots n_s!}\,\big(p_1(t)\big)^{n_1}\cdots\big(p_s(t)\big)^{n_s},$$

where the probabilities $p_i(t)$, $1 \le i \le s$, are given by (52).

It follows from (55) that the mean and variance for the mth species at the steady state are given by

$$M_m = N\pi_m = E[N_m], \qquad \sigma^2(N_m) = N\pi_m(1-\pi_m) = E[N_m]\Big(1 - \frac{E[N_m]}{N}\Big). \qquad (56)$$

Notice that $\pi_m$ is the steady-state fraction of the mth molecular species in a deterministic description, and since this is fixed by the reaction rates, the variance $\sigma^2(N_m)$ does not approach the mean even as N → ∞. Thus the distribution is never Poisson in a closed conversion network. The maximum variance is attained for that species for which $\pi_m$ is closest to 1/2. The Fano factor $F_m = 1 - \pi_m$ is always less than 1 for all m, it is independent of N, and it is fixed entirely by the network topology and reaction rates. Thus it is an inappropriate measure of stochastic fluctuations in these networks.

In contrast,

$$CV_m = \frac{1}{\sqrt{N}}\sqrt{\frac{1-\pi_m}{\pi_m}} = \sqrt{\frac{1-\pi_m}{M_m}} \qquad (57)$$

varies as $1/\sqrt{N}$, while for fixed N it is monotone decreasing in $\pi_m$. In both cases the measures are smallest for the most-abundant species in the system.
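The closed-system formulas (53), (56) and (57) are easily evaluated; the sketch below (assumed conversion rate constants, chosen only for illustration) obtains π from the null eigenvector of $K^{con}$ and reports the two noise measures, showing that $F_m$ is independent of N while $CV_m$ scales as $1/\sqrt{N}$:

```python
# An illustrative computation for a closed conversion network: the stationary fractions
# pi are the normalized null eigenvector of K^con (eq. 53), and from (56)-(57) the Fano
# factor is F_m = 1 - pi_m while CV_m = sqrt((1 - pi_m) / (N * pi_m)).
import numpy as np

kcon = np.array([[0.0, 1.0, 0.0],            # kcon[i, j] = rate constant for M_j -> M_i
                 [2.0, 0.0, 3.0],
                 [0.0, 0.5, 0.0]])
K = kcon - np.diag(kcon.sum(axis=0))         # columns sum to zero (closed system)

lam, Phi = np.linalg.eig(K)
phi1 = np.real(Phi[:, np.argmin(np.abs(lam))])
pi = phi1 / phi1.sum()                       # stationary fractions

N = 100                                      # total number of molecules
print("pi  :", pi)
print("Fano:", 1 - pi)                       # independent of N
print("CV  :", np.sqrt((1 - pi) / (N * pi))) # decreases like 1/sqrt(N)
```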

4.1.2. Open conversion systems

Next we derive the steady-state distribution of species in an open conversion network. Here $K = K^{con} - K^{d}$, and (25) can be written as

$$\frac{\partial G}{\partial t} - \sum_{j=1}^{s}\sum_{i=1}^{s}K_{ij}(z_i-1)\frac{\partial G}{\partial z_j} = \sum_{i=1}^{s}K_{ii}^{s}(z_i-1)\,G.$$

We find the solution of this PDE using the method of characteristics, analogous to the procedure outlined by Gans (1960). The result is that

$$G = \exp\!\Bigg(\frac{1}{\det\Phi}\sum_i^{s}K_{ii}^{s}\sum_k^{s}\frac{1}{\lambda_k}\Phi_{ki}\big(e^{\lambda_k t}-1\big)\sum_j^{s}\phi_{kj}(z_j-1)\Bigg)\prod_i\Bigg(1 + \frac{1}{\det\Phi}\sum_k\Phi_{ki}\sum_j\phi_{kj}(z_j-1)\,e^{\lambda_k t}\Bigg)^{m_i},$$

where Φ is the matrix whose rows are the eigenvectors ϕk, Φkj is the cofactor of ϕkj, and mi is the number of molecules of species i present initially. Therefore at steady state

$$G(z,\infty) = \exp\!\Bigg(-\frac{1}{\det\Phi}\sum_i K_{ii}^{s}\sum_k\frac{1}{\lambda_k}\Phi_{ki}\sum_j\phi_{kj}(z_j-1)\Bigg),$$

and the steady-state value of the mean is given by

$$M_m = E(N_m) = \frac{\partial G(z,\infty)}{\partial z_m}\bigg|_{z=1} = -\frac{1}{\det\Phi}\sum_i^{s}K_{ii}^{s}\sum_k\frac{1}{\lambda_k}\Phi_{ki}\,\phi_{km}.$$

To obtain the probability distribution, we differentiate k times to get

$$\frac{\partial^{k}G(z,\infty)}{\partial z_m^{k}} = \Bigg(-\frac{1}{\det\Phi}\sum_i K_{ii}^{s}\sum_k\frac{1}{\lambda_k}\Phi_{ki}\,\phi_{km}\Bigg)^{\!k}\exp\!\Bigg(-\frac{1}{\det\Phi}\sum_i K_{ii}^{s}\sum_k\frac{1}{\lambda_k}\Phi_{ki}\sum_j\phi_{kj}(z_j-1)\Bigg),$$

and then

$$P_{N_m}(k,\infty) = \frac{1}{k!}\,\frac{\partial^{k}G(z,\infty)}{\partial z_m^{k}}\bigg|_{z_m=0,\,z_j=1,\,j\neq m} = \frac{1}{k!}\,M_m^{\,k}\,e^{-M_m}.$$

This defines the density for a Poisson distribution, and as in any Poisson distribution, the variance is equal to the mean. This shows that the steady-state distribution of any species in an open first-order conversion network is a Poisson distribution, but this does not extend to the transient dynamics unless one assumes that the initial distribution of molecules is a Poisson distribution, rather than the Dirac distribution generally used.6
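The Poisson property can also be recovered from the moment equations alone: with $K^{cat} = 0$ the steady-state Lyapunov equation is solved by $V = M M^{T}$, so the variance of every species equals its mean. The sketch below (assumed rates for a three-species open conversion network) confirms this numerically:

```python
# An illustrative check of the Poisson steady state for an open conversion network:
# with Kcat = 0 the steady-state second-moment (Lyapunov) equation gives Var(N_i) = M_i,
# i.e. a Fano factor of one for every species.  Rate values are assumed.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

ks = np.array([3.0, 0.0, 0.0])               # only species 1 is fed from the source
kcon = np.array([[0.0, 0.4, 0.0],
                 [1.0, 0.0, 0.2],
                 [0.0, 0.6, 0.0]])
Kcon = kcon - np.diag(kcon.sum(axis=0))
Kd = np.diag([0.1, 0.1, 0.5])
K = Kcon - Kd

M = -np.linalg.solve(K, ks)
Gamma = ks[:, None] * M[None, :]             # Gamma_ij = Ks_ii M_j (no catalytic part)
V = solve_continuous_lyapunov(K, -(Gamma + Gamma.T))
var = np.diag(V) + M - M**2
print("mean:", M)
print("Fano:", var / M)                      # all entries equal 1 up to round-off
```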

4.1.3. The noise during transients in conversion systems

The time-dependent variance in an open conversion system is given by (50) wherein $K^{cat} = 0$. In that case C(l, k, i) = 0 and (50) reduces to

$$\sigma_m^2(t) = -\sum_{i,j}e^{(\lambda_i+\lambda_j)t}\,\big((P_i\odot P_j)M(0)\big)_m + M_m(t).$$

Evidently $\lim_{t\to\infty}\sigma_m^2(t) = \lim_{t\to\infty}M_m(t)$, and if M(0) = 0, then $\sigma_m^2(t) = M_m(t)$. Thus $F_m$ will always be 1 for open conversion systems when no species is present initially (M(0) = 0, which is by definition a Poisson initial distribution), and for more general initial conditions, $F_m = 1$ for all species at the steady state. On the other hand, it is easy to see that $CV_m$ is always inversely proportional to the square root of the mean for all these situations where $F_m$ is a constant. There is thus a clear difference in the qualitative estimation of the noise predicted by the two factors during transients as well as at steady state.

For closed systems, $F_m = 1 - \pi_m$ at steady state, while $CV_m$ is given by (57). When $M_m \sim N$ for some m, $\pi_m \sim 1$ and $F_m$ and $CV_m$ are both small. However, when $N \gg M_m$, $F_m \approx 1$, while $CV_m$ is inversely proportional to $\sqrt{M_m}$. We illustrate this and the transient behavior of the noise measures with an example of a closed three-component system shown in the following scheme, where arrows indicate conversion reactions, and the symbols on the arrows indicate the specific rate constant associated with that reaction.

[Reaction scheme: a closed three-component conversion network with reactions $M_1 \rightleftharpoons M_2$, $M_2 \rightleftharpoons M_3$ and $M_1 \to M_3$; the rate constants $k_{21}, k_{12}, k_{32}, k_{23}, k_{31}$ label the corresponding arrows.]

Using the procedure given earlier, one can find the means and covariances by solving

$$M'(t) = K\,M(t) \qquad (58)$$
$$V'(t) = K\,V(t) + \big(K\,V(t)\big)^{T}, \qquad (59)$$

where

$$K = \begin{bmatrix} -k_{21}-k_{31} & k_{12} & 0\\ k_{21} & -k_{12}-k_{32} & k_{23}\\ k_{31} & k_{32} & -k_{23} \end{bmatrix}.$$

The eigenvalues of K are given by

$$\lambda_1 = 0,\qquad \lambda_{2,3} = \tfrac{1}{2}\Bigl[-(k_{12}+k_{21}+k_{23}+k_{31}+k_{32}) \pm \sqrt{(k_{12}+k_{21}+k_{23}+k_{31}+k_{32})^2 - 4(k_{12}k_{23}+k_{21}k_{23}+k_{12}k_{31}+k_{23}k_{31}+k_{21}k_{32}+k_{31}k_{32})}\Bigr]$$

and because

$$(k_{12}+k_{21}+k_{23}+k_{31}+k_{32})^2 - 4(k_{12}k_{23}+k_{21}k_{23}+k_{12}k_{31}+k_{23}k_{31}+k_{21}k_{32}+k_{31}k_{32}) > 0,$$

the eigenvalues are distinct and K is semisimple. Thus,

$$M(t) = e^{Kt}M(0) = \sum_{i=1}^{3}e^{\lambda_it}\,P_iM(0)$$

and furthermore

$$\sigma_k^2(t) = -\sum_{i,j}e^{(\lambda_i+\lambda_j)t}\,(P_i\odot P_jM(0))_k + \sum_{i=1}^{3}e^{\lambda_it}\,(P_iM(0))_k.$$

Assuming that only one species Ml for l ∈ {1, 2, 3} is present initially with N molecules, we find that for k = 1, 2, 3,

$$M_k(t) = N\sum_{i=1}^{3}e^{\lambda_it}(P_i)_{kl}$$

and so

$$\sigma_k^2(t) = -\sum_{i,j=1}^{3}e^{(\lambda_i+\lambda_j)t}(P_i\odot P_jM(0))_k + M_k(t) = M_k(t) - \sum_{i,j=1}^{3}e^{(\lambda_i+\lambda_j)t}(P_i)_{kl}(P_j)_{kl}\,N = M_k(t)\left(1-\frac{M_k(t)}{N}\right).$$

Fig. 1 shows the evolution of the Fano factor and CV for two components of the network. Since the sum of the number of molecules of all components at any instant is equal to the sum of the initial number, the mean and variance of the first component can be calculated from the mean and variance of the other two. It is clear that for these values of the rate parameters, which produce a steady state in which most of the total molecules exist as Species 3 and the steady-state value of the mean for Species 2 is a small fraction of the total number of molecules, the Fano factor for species 2 is close to one at steady state, and does not change as the total number of molecules is increased 10-fold. For Species 3 the steady-state value of the mean is almost equal to the total number of molecules, and both the Fano factor and the CV predict that the steady-state noise will not change appreciably when the total number of molecules in the system is changed.

Fig. 1.

Time evolution (x-axis) of the mean scaled to the total number of molecules (-.-.-.), the Fano factor (solid line) and the CV (dashed line) for Species 2 (left) and Species 3 (right) for two values of the total, $N = 100$ and $N = 1000$. The profiles of the fraction in each state and of the Fano factor are independent of $N$. These plots illustrate that use of the Fano factor leads to the conclusion that the noise of both species does not change on increasing $N$, whereas use of the CV predicts that increasing $N$ reduces the noise in Species 2 while the noise in Species 3 remains essentially unchanged at steady state. The values of the rate constants are (in units of time$^{-1}$) $k_{21} = 100$, $k_{31} = 100$, $k_{12} = 1$, $k_{23} = 1$, $k_{32} = 100$, and all $N$ molecules exist as $M_1$ initially.
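The quantities plotted in Fig. 1 can be reproduced, up to plotting, with a few lines of Python. The sketch below uses the rate constants listed in the caption; it evaluates $M(t)$ with a matrix exponential (numerically equivalent to the spectral expansion used above) together with the closed-system identity $\sigma_k^2(t) = M_k(t)\bigl(1 - M_k(t)/N\bigr)$ derived above.

```python
import numpy as np
from scipy.linalg import expm

# Rate constants from the Fig. 1 caption, inserted into the matrix K given above.
k21, k31, k12, k23, k32 = 100.0, 100.0, 1.0, 1.0, 100.0
K = np.array([[-(k21 + k31),  k12,          0.0],
              [  k21,        -(k12 + k32),  k23],
              [  k31,         k32,         -k23]])

N = 100
M0 = np.array([N, 0.0, 0.0])             # all N molecules start as M1

for t in (0.01, 0.1, 1.0, 10.0):
    M = expm(K * t) @ M0                 # means, Eq. (58)
    var = M * (1.0 - M / N)              # closed-system variance derived above
    print(f"t={t:5.2f}  M={np.round(M, 2)}  Fano={np.round(var / M, 3)}  CV={np.round(np.sqrt(var) / M, 3)}")
```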

4.2. The effect of network topology on stochastic reaction networks

The structure of the graph of interactions among the network components influences the transient stochastic evolution of the network through its effect on the eigenvalues and eigenvectors of K. However, we have shown that the steady-state distribution is always multinomial for a closed conversion system and Poisson for an open conversion system, and these conclusions are independent of the topology of the network. In both cases the distribution is completely characterized by the mean, and the effect of topology on the mean of the various species can be derived from an analysis of the structure of the vertex–edge incidence matrix E (Othmer, 1979). We illustrate the effects of a change in network topology on the steady-state distribution of a simple catalytic network, and on the evolution of the distribution to the steady state for general conversion or catalytic networks through a simple example.

We consider reaction networks that form closed "loops" (Fig. 2(b)) with either a feedforward or a feedback interaction, and compare the results with the evolution of the distribution in linear reaction networks of the form shown in Fig. 2(a). In general, in a conversion chain the $i$th species is formed from the $(i-1)$st species and is converted to the $(i+1)$st species, while in a catalytic chain the $i$th species is formed from the source, with the reaction catalyzed by the $(i-1)$st species, and in turn catalyzes the formation of the $(i+1)$st species. All species undergo first-order degradation. The conversion chain is an example of an open conversion system, and we have shown that the distribution of the number of molecules of all components is a Poisson distribution. The analysis of the dynamics of the mean for feedback networks has been carried out earlier (Tyson and Othmer, 1978).

Fig. 2.

Linear (a) and looped (b) reaction networks. An arrow indicates that the species at the tail of the arrow affects the rate of formation of the species at the head of the arrow: the reactions may be conversion (solid lines) or catalytic (dashed lines).

Consider an example of a feedback loop with $s = 2$, where only $M_1$ has a nonzero rate of production from the source. Table 2 gives the steady-state mean of both species for conversion and catalytic reactions, and the steady-state variance for a network of catalytic reactions (the variance is equal to the mean in the conversion case).

Table 2.

Mean and variance for a 2-component feedback network with either conversion or catalytic reactions

| Species | Mean (conversion) | Mean (catalytic) | Variance (catalytic) |
|---|---|---|---|
| $M_1$ | $\dfrac{k_1^s(k_{12}^{con}+k_{2d})}{(k_{12}^{con}+k_{2d})(k_{21}^{con}+k_{1d})-k_{21}^{con}k_{12}^{con}}$ | $\dfrac{k_1^sk_{2d}}{k_{1d}k_{2d}-k_{21}^{cat}k_{12}^{cat}}$ | $\dfrac{k_{1d}k_{2d}(k_{1d}+k_{2d})-k_{21}^{cat}k_{12}^{cat}(k_{1d}-k_{12}^{cat})}{(k_{1d}k_{2d}-k_{21}^{cat}k_{12}^{cat})(k_{1d}+k_{2d})}\times M_1$ |
| $M_2$ | $\dfrac{k_1^sk_{21}^{con}}{(k_{12}^{con}+k_{2d})(k_{21}^{con}+k_{1d})-k_{21}^{con}k_{12}^{con}}$ | $\dfrac{k_1^sk_{21}^{cat}}{k_{1d}k_{2d}-k_{21}^{cat}k_{12}^{cat}}$ | $\dfrac{k_{1d}k_{2d}(k_{1d}+k_{2d})+k_{21}^{cat}k_{2d}(k_{1d}-k_{12}^{cat})}{(k_{1d}k_{2d}-k_{21}^{cat}k_{12}^{cat})(k_{1d}+k_{2d})}\times M_2$ |

It is clear that the presence or absence ($k_{12}^{con}=0$) of a feedback loop in a conversion network changes the mean value of both $M_1$ and $M_2$, but the variance is always equal to the mean and the Fano factor is always equal to one. In contrast, for a catalytic network the presence of a feedback loop changes not only the mean but also the steady-state fluctuations, as is most clear from the change in the Fano factor for $M_1$ from a value of one in the absence of feedback ($k_{12}^{cat}=0$) to the value indicated in Table 2 for a network with feedback. We will further explore the differences in the variance of conversion and catalytic networks in the following subsection.
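The catalytic entries of Table 2 can be cross-checked numerically. The sketch below solves the stationary second-moment (Lyapunov) equation for the two-species catalytic feedback loop; the way the matrix $B$ is assembled from the individual reaction propensities is a standard route for linear networks and is used here only as an independent check, not as the paper's derivation, and the rate values are illustrative (they must satisfy $k_{1d}k_{2d} > k_{21}^{cat}k_{12}^{cat}$).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Two-species catalytic feedback loop (illustrative rate values; the stability
# condition k1d*k2d > k21*k12 must hold).
k1s, k21, k12, k1d, k2d = 10.0, 1.0, 0.5, 2.0, 1.5

K = np.array([[-k1d,  k12],
              [ k21, -k2d]])
M = np.linalg.solve(K, -np.array([k1s, 0.0]))        # steady-state means

# Stationary covariance from K*Sigma + Sigma*K^T + B = 0, with B collecting the
# jump contributions of the individual reactions (used here only as a cross-check).
B = np.diag([k1s + k12 * M[1] + k1d * M[0], k21 * M[0] + k2d * M[1]])
Sigma = solve_continuous_lyapunov(K, -B)

# Table 2 expressions for the catalytic variances
den = (k1d * k2d - k21 * k12) * (k1d + k2d)
var1 = (k1d * k2d * (k1d + k2d) - k21 * k12 * (k1d - k12)) / den * M[0]
var2 = (k1d * k2d * (k1d + k2d) + k21 * k2d * (k1d - k12)) / den * M[1]

print("Lyapunov:", np.diag(Sigma))
print("Table 2 :", np.array([var1, var2]))
```

The two printed vectors agree, and setting `k12 = 0` reduces the $M_1$ variance to its mean (Fano factor one), consistent with the remark above.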

We can also compare the change in the evolution rates for these networks. For the reaction networks in Fig. 2, the matrix of reaction rates takes the form

$$K = \begin{bmatrix} K_{11} & 0 & \cdots & 0 & K_{1s} \\ K_{21} & K_{22} & 0 & \cdots & 0 \\ 0 & K_{32} & K_{33} & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & K_{s\,s-1} & K_{ss} \end{bmatrix}.$$

For the linear network, the value of $K_{1s}$ is zero and the $K_{ii}$ are its eigenvalues.

The characteristic equation f (λ) of K is given by

$$f(\lambda) = \bigl[(K_{11}-\lambda)(K_{22}-\lambda)\cdots(K_{ss}-\lambda)\bigr] + (-1)^{s+1}K_{1s}K_{21}\cdots K_{s\,s-1} = (-1)^s\Bigl[\prod_{i=1}^{s}(\lambda-K_{ii}) - K_{1s}K_{21}\cdots K_{s\,s-1}\Bigr] \equiv (-1)^s\,g(\lambda).$$

Consider a looped network that conforms to the stability condition (1) that the specific rate of production of each species is less than the specific rate of degradation. This requires that

$$K_{21} < -K_{11},\quad K_{32} < -K_{22},\quad \ldots,\quad K_{s\,s-1} < -K_{s-1\,s-1},\quad K_{1s} < -K_{ss}$$

and this implies that

$$K_{1s}K_{21}K_{32}\cdots K_{s\,s-1} < \prod_{i=1}^{s}(-K_{ii}).$$

Hence, for all $K_{kk}$, $1 \le k \le s$, we have

$$g(K_{kk}) = -K_{1s}K_{21}\cdots K_{s\,s-1} < 0$$

and

$$g(0) = \prod_{i=1}^{s}(-K_{ii}) - K_{1s}K_{21}K_{32}\cdots K_{s\,s-1} > 0.$$

Therefore there is a real root $\lambda_r$ of $g(\lambda) = 0$ such that $\max_iK_{ii} < \lambda_r < 0$, which implies that in the looped network there exists at least one real negative eigenvalue $\lambda_r$ of $K$ that is greater than $\max_iK_{ii}$.

Thus the presence of a loop leads to a slowing down of the evolution of the moments to their steady-state values. Notice that for a feedforward looped network (i.e. Fig. 2(b) with the arrow between $M_1$ and $M_s$ reversed), the corresponding reaction rate $K_{s1}$ is still in the lower triangular part of $K$, and the eigenvalues are identical to those of the corresponding linear system.
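A quick numerical illustration of this slowing down, with hypothetical, equal rate constants chosen to satisfy the stability condition, is:

```python
import numpy as np

# Compare the spectrum of a linear chain with that of the corresponding feedback
# loop (illustrative rate values with K_{i+1,i} < -K_{ii}).
s = 4
prod = 0.8        # sub-diagonal production rates K_{i+1,i}
decay = 1.0       # magnitude of the diagonal entries, so K_ii = -1.0

K = -decay * np.eye(s) + prod * np.eye(s, k=-1)
K_loop = K.copy()
K_loop[0, s - 1] = prod                  # add the feedback entry K_{1s}

eig_lin = np.sort(np.linalg.eigvals(K).real)
eig_loop = np.sort(np.linalg.eigvals(K_loop).real)
print("linear chain :", np.round(eig_lin, 3))      # all equal to K_ii = -1
print("feedback loop:", np.round(eig_loop, 3))     # slowest eigenvalue moves toward 0
```

With these values the linear chain relaxes with rate 1, while the loop has a real eigenvalue at $-0.2$, so its moments approach steady state more slowly.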

4.3. Effect of the nature of the chemical reaction

Previous analyses of stochastic first-order reaction systems have been restricted to either all-conversion or all-catalytic systems, and the two types have not been compared directly. Let us consider reactions of the form shown in Fig. 2(a) in which the first component is produced from a source and is then either converted to the second species (a conversion chain) or catalyzes the formation of the second species (a catalytic chain). As we have shown, the distribution of the number of molecules is a Poisson distribution for an open conversion chain, the Fano factor is one for all species at steady state, and the CV is inversely proportional to the square root of the mean. The results are quite different for a catalytic chain. In Fig. 3 we show the Fano factor and CV for a catalytic chain with rate parameters such that the means are identical (Fig. 3(a)) or different (Fig. 3(b)). It is clear that for a catalytic chain with identical means ($k_{i(i-1)}^{cat} = k_{id}$ for all $i \ge 2$), the variance reaches a limiting value as the chain length is increased. This agrees with the results of Thattai and van Oudenaarden (2001), who studied catalytic chains with hyperbolic activation functions. However, it is important to note that the variance may not saturate when the means are different, which is almost always the case in biological systems. When the means are the same, the Fano factor and CV give qualitatively consistent estimates of the noise, but when the means are different the two measures give different predictions of how the level of fluctuations changes as the number of species in a catalytic chain is increased.
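The qualitative behavior described here is easy to reproduce by direct simulation. The sketch below runs a Gillespie simulation of a catalytic chain with illustrative parameters (not the values used for Fig. 3) and prints the time-averaged mean, Fano factor and CV of each species; with equal means the Fano factor grows along the chain and levels off, consistent with the saturation noted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gillespie simulation of a catalytic chain (illustrative parameters): source -> M1,
# M_{i-1} catalyzes production of M_i, and every species degrades at a first-order rate.
s = 5
k_s = 10.0                     # production of M1 from the source
k_cat = np.full(s, 2.0)        # k_cat[i]: production of M_i catalyzed by M_{i-1}
k_d = np.full(s, 2.0)          # degradation rate constants

n = np.ones(s)                 # one molecule of each species initially
t, t_end, t_burn = 0.0, 2000.0, 50.0
w_tot, s1, s2 = 0.0, np.zeros(s), np.zeros(s)

while t < t_end:
    birth = np.empty(s)
    birth[0] = k_s
    birth[1:] = k_cat[1:] * n[:-1]      # catalytic production of M_i at rate k_cat*n_{i-1}
    death = k_d * n
    a = np.concatenate([birth, death])
    a0 = a.sum()
    dt = rng.exponential(1.0 / a0)
    if t > t_burn:
        w = min(dt, t_end - t)
        w_tot += w; s1 += w * n; s2 += w * n * n
    r = rng.choice(2 * s, p=a / a0)
    n[r % s] += 1.0 if r < s else -1.0
    t += dt

mean = s1 / w_tot
var = s2 / w_tot - mean**2
print("mean:", np.round(mean, 2))
print("Fano:", np.round(var / mean, 2))
print("CV  :", np.round(np.sqrt(var) / mean, 2))
```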

Fig. 3.

Fano factor (solid line) and CV (dashed line) at steady state for the $i$th species in a catalytic chain with equal (a) and unequal (b) means. In (a) all species have a steady-state mean value of 3, resulting from setting all catalysis and degradation rate constants to 2 and the production rate for $M_1$ to 2; random choices of the rate constants lead to mean values between $10^{-4}$ and 120 for the species in (b). In both simulations one molecule of each species was assumed to be present initially, but the results do not depend on the choice of the initial condition.

5. The effect of diffusion on stochastic reaction networks

Heretofore we have ignored the possible effects of spatial nonuniformity in the distribution of species, but when transport is solely by diffusion we can analyze a suitably-discretized spatial model within the preceding framework, because diffusion is a linear process. As the reader will appreciate after the development of the equations, the same analysis applies to compartmental models in which transport between compartments is linear in the concentration of the species. Here we restrict the discussion to a closed system containing an isotropic medium with no diffusive coupling between species, but the general formulation of the corresponding deterministic linear equations allows for bi-directional exchange with a reservoir, diffusive coupling between species, and anisotropy in the transport (Othmer and Scriven, 1971).

For simplicity of exposition we begin with a symmetric domain that is subdivided into identical cubical compartments, and denote the number of molecules of species $M_i$ present in the $j$th compartment by $N_{j,i}$. Diffusive transport from compartment $j$ to compartment $k$ can be represented as the reaction $(N_{j,i}, N_{k,i}) \rightarrow (N_{j,i}-1, N_{k,i}+1)$ at a rate given by $D_i/\delta^2$, where $\delta$ is the length scale of each compartment and $D_i$ is the diffusion constant for species $i$ (Nicolis and Prigogine, 1977; Stundzia and Lumsden, 1996).

Suppose that there are $N_c$ cells and $s$ reacting species. As before, $K = K^{cat} + K^{con} - K_d$ is the $s\times s$ reaction rate constant matrix defined for reactions without diffusion. Let $\Delta$ be the $N_c\times N_c$ structural matrix of the network of cells, whose off-diagonal elements $\Delta_{jk}$, $j\neq k$, are 1 if cell $k$ is connected to cell $j$ and 0 otherwise, and $-\Delta_{jj}$ is the number of cells connected to cell $j$. $\Delta$ encodes the connectivity of the network, and in the simple case treated here it simply reflects the discretization of the Laplace operator on the domain in question. We define $D$ as the $s\times s$ diagonal matrix of diffusion rates $D_i/\delta^2$ for the species.

The deterministic evolution of the system is governed by

$$\frac{dc}{dt} = \Omega c,$$

where $c$ is the composition vector for all cells and the $sN_c \times sN_c$ matrix $\Omega \equiv I_{N_c}\otimes K + \Delta\otimes D$. It follows immediately that the equations for the means and second moments are

$$M'(t) = \Omega M(t) + K_s\cdot\mathbf{1}$$
$$V'(t) = \Omega V(t) + [\Omega V(t)]^T + C + C^T.$$

Here the vector of means is defined as

$$M(t) = \bigl(M_{11}(t), M_{12}(t), \ldots, M_{1s}(t), M_{21}(t), \ldots, M_{2s}(t), \ldots, M_{N_c1}(t), \ldots, M_{N_cs}(t)\bigr)^T$$

and Ks is a matrix containing the rate constants for production of each of the Nc × s species, given by

$$K_s = \mathrm{diag}\bigl[k_{11}^s, k_{12}^s, \ldots, k_{1s}^s, k_{21}^s, \ldots, k_{2s}^s, \ldots, k_{N_cs}^s\bigr],$$

where $k_{ij}^s$ is the rate of inflow of species $j$ into compartment $i$ from sources.

The covariance matrix V(t) has matrix-valued elements Vij defined as

$$V_{ij}(k,m) = \begin{cases} E\bigl[N_{ik}(t)^2\bigr] - E\bigl[N_{ik}(t)\bigr] & \text{if } i=j \text{ and } k=m \\ E\bigl[N_{ik}(t)N_{jm}(t)\bigr] & \text{otherwise.} \end{cases}$$

Finally, $C = W + K_s\mathbf{1}\cdot M(t)^T$, where $W$ is a block-diagonal matrix with each block defined as $W_{kk}(i,j) = K_{ij}^{cat}E[N_{kj}(t)]$.

It is clear from earlier sections that the evolution of the first and second moments are formally known once the eigenvalues and eigenvectors of Ω are known, and these are known from earlier work. Let αk be an eigenvalue of the symmetric matrix Δ; then the eigenvalues λkj of Ω are solutions of the family of Nc sth-order determinantal equations

$$\bigl|K + \alpha_kD - \lambda I_s\bigr| = 0 \tag{60}$$

(Othmer and Scriven, 1971). It is known that whenever K is not diagonal there may be counter-intuitive effects of diffusion on the eigenvalues determined by (60), and this lies at the heart of Turing’s mechanism of pattern formation (Turing, 1952; Othmer, 1969).
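The block structure behind (60) is easy to verify numerically. In the sketch below ($K$, $D$ and the three-compartment connectivity $\Delta$ are illustrative choices, not taken from the paper), the eigenvalues of $\Omega$ computed directly coincide with the union of the eigenvalues of $K + \alpha_kD$ over the eigenvalues $\alpha_k$ of $\Delta$.

```python
import numpy as np

# Check that the spectrum of Omega = I_{Nc} (x) K + Delta (x) D is the union of
# the spectra of K + alpha_k D over the eigenvalues alpha_k of Delta.
K = np.array([[-1.0,  0.3],
              [ 0.5, -0.8]])
D = np.diag([0.1, 2.0])                      # D_i / delta^2 for the two species

Delta = np.array([[-2.0,  1.0,  1.0],
                  [ 1.0, -2.0,  1.0],
                  [ 1.0,  1.0, -2.0]])       # three compartments in a ring

Nc = Delta.shape[0]
Omega = np.kron(np.eye(Nc), K) + np.kron(Delta, D)

eigs_direct = np.sort_complex(np.linalg.eigvals(Omega))
alphas = np.linalg.eigvalsh(Delta)
eigs_blocks = np.sort_complex(np.concatenate(
    [np.linalg.eigvals(K + a * D) for a in alphas]))

print(np.allclose(eigs_direct, eigs_blocks))   # True: Eq. (60) gives the spectrum
```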

The foregoing has been formulated for a regular discretization of a domain, but it holds whatever the topology of the connections between the compartments, as long as transport depends only on the concentration difference between compartments. The advantage of the present formulation is that the effects of network structure in the reaction dynamics can be separated to the maximal extent possible from the topology of the compartmental connections, and effects due to spatial variations arise from the effect of the αk’s for different spatial modes. It should be noted here that in our treatment of reaction–diffusion systems there is an assumption that individual compartments are well-mixed, and for this to be valid, the size of each compartment should be related to the diffusion coefficient (and ultimately to the mean free path) of the diffusing species. However, the basis for the choice of compartment size is not clear for a system containing species with very different diffusion coefficients. On the one hand, if the size is based on the faster-diffusing species the well-mixed assumption may not be true for species that diffuse slowly, but if the size is computed using the slower-diffusing species, any solution algorithm will become computationally inefficient. More generally, the problem of how to treat wide disparities in time scales in the full master equation remains to be solved.

6. Discussion and conclusion

We have analyzed a general system of first-order reactions amongst s species that can be produced from sources, converted to other species or degraded, and catalyze the formation of other species. All previous stochastic analyses of systems of first-order reactions can be formulated as special cases of the general model studied here. We have derived explicit evolution equations for the mean and variance of the number of molecules of each reactant, and have solved them explicitly in a number of cases when the rate matrix K is semisimple. We find that the evolution of the second moments is completely determined by the spectral properties of K and the mean itself. To our knowledge this is the first report of a method to analytically compute the first two moments for an arbitrary first-order network comprising both conversion and catalytic reactions.

We have used the general framework to explore the effect of changes in the network topology on the distribution of the number of reactant molecules, and the difference between conversion and catalytic networks with the same topology. We prove that for an open system of first-order conversion reactions, the distribution of the number of molecules of every species is always a Poisson distribution. This is not the case for closed conversion systems, since the total number of molecules is constant. This result can be directly applied to the interpretation of experimental results on protein conformational-state transitions. The folding of a protein from its unfolded state to the fully folded ("native") state occurs through a series of intermediates. The first-order rate constants governing the reversible transitions from the unfolded state to the native state are estimated experimentally (Mayor et al., 2003). The above theory suggests the distributions to which the experimental data may be fitted in order to derive accurate estimates of the first-order transition rates. If ingress and egress through flow are allowed in the measurement device, each of the protein folding states will exhibit a Poisson distribution. If the system is closed, each state will be characterized by a multinomial distribution with a variance that is lower than the mean. A protein molecule undergoing conformational-state transitions is but one example of a set of chemical reactions that may be carried out in a closed or batch process, where there is no inflow or outflow of the chemical species, or in an open or continuous process, where one or more chemicals are introduced at a constant rate or removed from the system. The inflows may be modeled as production reactions of the form $\phi \rightarrow M_i$, in which species $M_i$ is produced at a constant rate from a source. The outflows are modeled as degradation reactions that deplete each species at a rate proportional to its concentration, with the specific rate constant corresponding to the dilution rate of the reactor. These equations exactly describe the addition and removal of species in microfluidic devices that may be modeled as continuous stirred-tank reactors. Our analysis of open and closed systems may be used to distinguish between the effects of batch-mode and continuous operation on the stochastic behavior of chemical species undergoing the same set of chemical conversion reactions.

Two measures are used to estimate the stochastic fluctuations of reactant concentrations: the Fano factor and the coefficient of variation. We have shown that the equilibrium distribution of all components is a Poisson distribution for open conversion networks, in which case use of the Fano factor as the measure of noise leads to the prediction that the fluctuations of all components in an open conversion reaction system are identical. Thus a species that has a mean of 10 molecules would exhibit the same amount of noise as a species that has a mean concentration of 1 M. This is clearly not correct, and the use of the CV as the measure of the noise correctly predicts that the noise is inversely proportional to the square root of the mean value. The only instance in which the use of the Fano factor has a distinct advantage is when noise is defined as a deviation from the Poisson distribution. Our work is the first instance in which the two measures have been compared theoretically, and we conclude that the only instance in which either measure can be used to compare the noise of two species is when the mean values are identical. We have also shown that the use of the two measures leads to contradictory conclusions about the noise when the means are not identical.

We use the example of a linear reaction chain with and without feedback to demonstrate the effect of changes in the species interconnectivity on the dynamics of the evolution of the moments of the distribution. For the same interconnectivity, the nature of the distribution changes depending on whether the reactions are conversion reactions or catalytic reactions. It has been shown for a catalytic chain whose mean values are equal at steady state that the variance of the last species in the chain increases as the number of species in the chain increases (Thattai and van Oudenaarden, 2001), but ultimately saturates for long chains. We show with a counterexample that the ratio of the variance to the mean does not show this saturation behavior when the steady-state mean values of all the components are different.

The mathematical formulation that leads to a direct solution of the moment equations for a well-stirred system can be extended to arbitrary networks of well-mixed compartments that are coupled by diffusion. We demonstrate that the eigenvalues that govern the evolution in such distributed systems are solutions of a one-parameter family of modified kinetic matrices and thus one can formally display the solution for the first two moments in this case as well. However much remains to be done for this case to develop computationally-efficient algorithms.

We anticipate that the analytical framework presented here will be extended to the stochastic analysis of nonlinear reaction networks, and that our analysis of first-order reaction networks will lead to insights into the local linear behavior of such networks.

Acknowledgements

This work was supported in part by NIH Grant #29123 to H.G. Othmer. C.J. Gadgil acknowledges funding from the Minnesota Supercomputing Institute (Research Scholar program). Computations were carried out using MSI and Digital Technology Center resources. We thank one of the reviewers for an extremely thorough review and for pointing out the connections to similar results in queuing theory and the theory of branching processes.

Appendix

An alternate approach to the evolution equation for the MGF7

To obtain the generating function for the first-order network, one may use the Kolmogorov backward equation instead of the master equation (24), which is generally used in the field of chemical reaction networks.

The Kolmogorov backward equation is given by

$$\frac{d}{dt}E\bigl[f(N(t))\bigr] = E\bigl[\mathcal{A}f(N(t))\bigr],$$

where

$$\mathcal{A}f(n) = \sum_ik_i^s\bigl(S_i^{+1}f(n)-f(n)\bigr) + \sum_{i,j}k_{ij}^{con}n_j\bigl(S_i^{+1}S_j^{-1}f(n)-f(n)\bigr) + \sum_{i,j}k_{ij}^{cat}n_j\bigl(S_i^{+1}f(n)-f(n)\bigr) + \sum_ik_i^dn_i\bigl(S_i^{-1}f(n)-f(n)\bigr).$$

Notice that $G(z,t) = E\bigl[\prod_iz_i^{N_i(t)}\bigr]$, and therefore, taking $f(N(t)) = \prod_iz_i^{N_i(t)}$,

$$\frac{d}{dt}G(z,t) = \frac{d}{dt}E\Bigl[\prod_iz_i^{N_i(t)}\Bigr] = \sum_ik_i^s(z_i-1)E\Bigl[\prod_iz_i^{N_i(t)}\Bigr] + \sum_{i,j}k_{ij}^{con}\Bigl(\frac{z_i}{z_j}-1\Bigr)E\Bigl[N_j(t)\prod_iz_i^{N_i(t)}\Bigr] + \sum_{i,j}k_{ij}^{cat}(z_i-1)E\Bigl[N_j(t)\prod_iz_i^{N_i(t)}\Bigr] + \sum_ik_i^d\Bigl(\frac{1}{z_i}-1\Bigr)E\Bigl[N_i(t)\prod_iz_i^{N_i(t)}\Bigr].$$

Using the fact that

$$E\Bigl[N_j(t)\prod_iz_i^{N_i(t)}\Bigr] = z_j\frac{\partial}{\partial z_j}E\Bigl[\prod_iz_i^{N_i(t)}\Bigr],$$

we obtain Eq. (24).

Tensor products and the column operation

Here we record a few basic facts about tensor products and the col operation that are used throughout.

Let x = (x1, x2, … , xn)T and y = (y1, y2, … , ym)T. Then we define the tensor product of x and y as (cf. Othmer and Scriven (1971) and references therein)

$$x\otimes y = (x_1y, x_2y, \ldots, x_ny)^T = (x_1y_1, x_1y_2, \ldots, x_1y_m, x_2y_1, x_2y_2, \ldots, x_2y_m, \ldots, x_ny_m)^T.$$

For any n × n matrix B = [bij] and an m × m matrix C, we define

$$(B\otimes C) = \begin{bmatrix} b_{11}C & b_{12}C & \cdots & b_{1n}C \\ b_{21}C & b_{22}C & \cdots & b_{2n}C \\ \vdots & & & \vdots \\ b_{n1}C & b_{n2}C & \cdots & b_{nn}C \end{bmatrix}.$$

Let $B^{(i)}$ be the $i$th column of an $n \times n$ matrix $B$. Then define the column operation as

$$\mathrm{col}\,B = \begin{pmatrix} B^{(1)} \\ B^{(2)} \\ \vdots \\ B^{(n)} \end{pmatrix}.$$

Then we have

$$\mathrm{col}(BC) = \begin{pmatrix} BC^{(1)} \\ BC^{(2)} \\ \vdots \\ BC^{(n)} \end{pmatrix} = \begin{bmatrix} B & & & \\ & B & & \\ & & \ddots & \\ & & & B \end{bmatrix}\begin{pmatrix} C^{(1)} \\ C^{(2)} \\ \vdots \\ C^{(n)} \end{pmatrix} = (I_n\otimes B)\,\mathrm{col}\,C.$$

Similarly

$$\mathrm{col}(CB) = (B^T\otimes I_n)\,\mathrm{col}\,C.$$
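These identities are easy to confirm numerically. The sketch below (with random, purely illustrative matrices) checks both; the same check works for the more general relation $\mathrm{col}(ABC) = (C^T\otimes A)\,\mathrm{col}\,B$ used later.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

col = lambda A: A.reshape(-1, order="F")      # stack columns: the col operation

lhs1 = col(B @ C)
rhs1 = np.kron(np.eye(n), B) @ col(C)         # col(BC) = (I_n (x) B) col C
lhs2 = col(C @ B)
rhs2 = np.kron(B.T, np.eye(n)) @ col(C)       # col(CB) = (B^T (x) I_n) col C

print(np.allclose(lhs1, rhs1), np.allclose(lhs2, rhs2))   # True True
```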

We can use these results to rewrite (29) as follows. We have

$$V'(t) = KV(t) + (KV(t))^T + \Gamma(t) + \Gamma^T(t)$$

and therefore

$$\mathrm{col}(V'(t)) = \mathrm{col}(KV(t)) + \mathrm{col}\bigl((KV(t))^T\bigr) + \mathrm{col}\bigl(\Gamma(t)+\Gamma^T(t)\bigr).$$

We apply the above to the first term on the right-hand side by setting B=K and C = V, and to the second term by setting B=KT and C = V, and noting that V is symmetric. This leads to (46).

Evolution of the moments

We consider the differential equation for the second moment

$$\frac{dv}{dt} = \mathcal{V}v + \gamma$$

where

$$v(t) = \mathrm{col}(V(t)),\qquad \mathcal{V} = \bigl(K\otimes I_n + I_n\otimes K\bigr)\qquad\text{and}\qquad \gamma(t) = \mathrm{col}\bigl(\Gamma(t)+\Gamma(t)^T\bigr).$$

Since

$$[\Gamma(t)]_{ij} = K_{ij}^{cat}E[N_j(t)] + k_i^sE[N_j(t)]$$

and Ks and Md are diagonal matrices,

$$\gamma(t) = \mathrm{col}\bigl(\Gamma(t)+\Gamma(t)^T\bigr) = \mathrm{col}\bigl(K^{cat}M_d(t)+(K^{cat}M_d(t))^T\bigr) + \mathrm{col}\bigl(K_sL(t)+(K_sL(t))^T\bigr) = \bigl(K^{cat}\otimes I_n + I_n\otimes K^{cat}\bigr)\mathrm{col}\,M_d + \bigl(L^T\otimes I_n + I_n\otimes L^T\bigr)\mathrm{col}\,K_s.$$

Thus

$$P_{ij}\gamma(t) = P_{ij}\,\mathrm{col}\bigl(K^{cat}M_d+(K^{cat}M_d)^T+K_sL+(K_sL)^T\bigr) = (P_i\otimes P_j)\bigl(K^{cat}\otimes I_n+I_n\otimes K^{cat}\bigr)\mathrm{col}\,M_d + P_{ij}\bigl(L^T\otimes I_n+I_n\otimes L^T\bigr)\mathrm{col}\,K_s = \bigl(P_iK^{cat}\otimes P_j + P_i\otimes P_jK^{cat}\bigr)\mathrm{col}\,M_d + \bigl(P_iL^T\otimes P_j + P_i\otimes P_jL^T\bigr)\mathrm{col}\,K_s.$$

We have that

$$P_iM(t) = P_i\left(\Bigl(\sum_ie^{\lambda_it}P_i\Bigr)M(0) - \sum_i\frac{1-e^{\lambda_it}}{\lambda_i}P_ik^s\right) = e^{\lambda_it}P_iM(0) - \frac{1-e^{\lambda_it}}{\lambda_i}P_ik^s$$

and

$$P_iL^T = P_i\bigl[M(t)\,|\cdots|\,M(t)\bigr] = \bigl[P_iM(t)\,|\cdots|\,P_iM(t)\bigr] = e^{\lambda_it}P_iM_0 - \frac{1-e^{\lambda_it}}{\lambda_i}P_iS,$$

where $M_0 = [M(0)\,|\,M(0)\,|\cdots|\,M(0)]$ and $S = [k^s\,|\,k^s\,|\cdots|\,k^s]$. Thus

$$P_{ij}\gamma(t) = \bigl(P_iK^{cat}\otimes P_j + P_i\otimes P_jK^{cat}\bigr)\mathrm{col}\,M_d + \left[e^{\lambda_it}(P_iM_0\otimes P_j) - \frac{1-e^{\lambda_it}}{\lambda_i}(P_iS\otimes P_j) + e^{\lambda_jt}(P_i\otimes P_jM_0) - \frac{1-e^{\lambda_jt}}{\lambda_j}(P_i\otimes P_jS)\right]\mathrm{col}\,K_s$$

and therefore

$$\begin{aligned} v &= e^{\mathcal{V}t}v(0) + \int_0^te^{\mathcal{V}(t-\tau)}\gamma(\tau)\,d\tau = \sum_{i,j}e^{(\lambda_i+\lambda_j)t}P_{ij}v(0) + \int_0^t\sum_{i,j}e^{(\lambda_i+\lambda_j)(t-\tau)}P_{ij}\gamma(\tau)\,d\tau\\ &= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}P_{ij}v(0) + \sum_{i,j}\biggl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_j}(P_iM_0\otimes P_j) + \Bigl(\frac{1}{\lambda_i\lambda_j}\bigl(e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}\bigr) - \frac{1}{\lambda_i(\lambda_i+\lambda_j)}\bigl(e^{(\lambda_i+\lambda_j)t}-1\bigr)\Bigr)(P_iS\otimes P_j)\\ &\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i}(P_i\otimes P_jM_0) + \Bigl(\frac{1}{\lambda_i\lambda_j}\bigl(e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}\bigr) - \frac{1}{\lambda_j(\lambda_i+\lambda_j)}\bigl(e^{(\lambda_i+\lambda_j)t}-1\bigr)\Bigr)(P_i\otimes P_jS)\biggr)\mathrm{col}\,K_s\\ &\qquad + \sum_{i,j}\int_0^te^{(\lambda_i+\lambda_j)(t-\tau)}\bigl(P_iK^{cat}\otimes P_j + P_i\otimes P_jK^{cat}\bigr)\,\mathrm{col}\,M_d\,d\tau. \end{aligned}$$

Using the fact that $\mathrm{col}(ABC) = (C^T\otimes A)\,\mathrm{col}\,B$, we obtain

$$\begin{aligned} V(t) &= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}\bigl[P_jV(0)P_i^T\bigr] + \sum_{i,j}\biggl[\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_j}\bigl[P_jK_s(P_iM_0)^T\bigr] + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jK_s(P_iS)^T\bigr]\\ &\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i}\bigl[P_jM_0K_sP_i^T\bigr] + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_j(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jSK_sP_i^T\bigr]\biggr]\\ &\qquad + \sum_{i,j}\int_0^te^{(\lambda_i+\lambda_j)(t-\tau)}\bigl(P_jM_d(P_iK^{cat})^T + P_jK^{cat}M_dP_i^T\bigr)\,d\tau. \end{aligned}$$

Thus

$$\begin{aligned} V_{ll}(t) &= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}\bigl[P_jV(0)P_i^T\bigr]_{ll} + \sum_{i,j}\biggl[\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_j}\bigl[P_jK_s(P_iM_0)^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jK_s(P_iS)^T\bigr]_{ll}\\ &\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i}\bigl[P_jM_0K_sP_i^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_j(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jSK_sP_i^T\bigr]_{ll}\biggr]\\ &\qquad + \sum_{i,j}\int_0^te^{(\lambda_i+\lambda_j)(t-\tau)}\bigl(P_jM_d(P_iK^{cat})^T + P_jK^{cat}M_dP_i^T\bigr)_{ll}\,d\tau. \end{aligned}$$

We have

$$P_jM_d = \begin{bmatrix} P_{j11}M_1 & P_{j12}M_2 & \cdots & P_{j1n}M_n\\ P_{j21}M_1 & P_{j22}M_2 & \cdots & P_{j2n}M_n\\ \vdots & & & \vdots\\ P_{jn1}M_1 & P_{jn2}M_2 & \cdots & P_{jnn}M_n \end{bmatrix},\qquad (P_iK^{cat})^T = \begin{bmatrix} \sum_jP_{i1j}K_{j1}^{cat} & \sum_jP_{i2j}K_{j1}^{cat} & \cdots & \sum_jP_{inj}K_{j1}^{cat}\\ \sum_jP_{i1j}K_{j2}^{cat} & \sum_jP_{i2j}K_{j2}^{cat} & \cdots & \sum_jP_{inj}K_{j2}^{cat}\\ \vdots & & & \vdots\\ \sum_jP_{i1j}K_{jn}^{cat} & \sum_jP_{i2j}K_{jn}^{cat} & \cdots & \sum_jP_{inj}K_{jn}^{cat} \end{bmatrix}.$$

Thus

$$\bigl[P_jM_d(P_iK^{cat})^T\bigr]_{ll} = P_{jl1}M_1\Bigl(\sum_jP_{ilj}K_{j1}^{cat}\Bigr) + P_{jl2}M_2\Bigl(\sum_jP_{ilj}K_{j2}^{cat}\Bigr) + \cdots + P_{jln}M_n\Bigl(\sum_jP_{ilj}K_{jn}^{cat}\Bigr) = \sum_kP_{jlk}M_kC(i,k,l)$$

where

$$C(i,k,l) = \sum_jP_{ilj}K_{jk}^{cat}.$$

Similarly

$$\bigl[P_jK^{cat}M_dP_i^T\bigr]_{ll} = \sum_kP_{ilk}M_kC(j,k,l).$$

Thus

$$\begin{aligned} &\int_0^te^{(\lambda_i+\lambda_j)(t-\tau)}\bigl(P_jM_d(P_iK^{cat})^T + P_jK^{cat}M_dP_i^T\bigr)_{ll}\,d\tau\\ &\quad= e^{(\lambda_i+\lambda_j)t}\int_0^te^{-(\lambda_i+\lambda_j)\tau}\Bigl(\sum_k(P_j)_{lk}M_k(\tau)C(i,k,l) + \sum_k(P_i)_{lk}M_k(\tau)C(j,k,l)\Bigr)d\tau\\ &\quad= e^{(\lambda_i+\lambda_j)t}\int_0^te^{-(\lambda_i+\lambda_j)\tau}\biggl[\sum_k(P_j)_{lk}\Bigl(\sum_re^{\lambda_r\tau}(P_rM(0))_k - \sum_r\frac{1-e^{\lambda_r\tau}}{\lambda_r}(P_rk^s)_k\Bigr)C(i,k,l) + \sum_k(P_i)_{lk}\Bigl(\sum_re^{\lambda_r\tau}(P_rM(0))_k - \sum_r\frac{1-e^{\lambda_r\tau}}{\lambda_r}(P_rk^s)_k\Bigr)C(j,k,l)\biggr]d\tau\\ &\quad= e^{(\lambda_i+\lambda_j)t}\sum_k(P_j)_{lk}\sum_rC(i,k,l)\biggl[(P_rM(0))_k\int_0^te^{(\lambda_r-\lambda_i-\lambda_j)\tau}d\tau - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\int_0^te^{-(\lambda_i+\lambda_j)\tau}d\tau - \int_0^te^{(\lambda_r-\lambda_i-\lambda_j)\tau}d\tau\Bigr)\biggr]\\ &\qquad + e^{(\lambda_i+\lambda_j)t}\sum_k(P_i)_{lk}\sum_rC(j,k,l)\biggl[(P_rM(0))_k\int_0^te^{(\lambda_r-\lambda_i-\lambda_j)\tau}d\tau - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\int_0^te^{-(\lambda_i+\lambda_j)\tau}d\tau - \int_0^te^{(\lambda_r-\lambda_i-\lambda_j)\tau}d\tau\Bigr)\biggr]\\ &\quad= \sum_k\bigl((P_j)_{lk}C(i,k,l) + (P_i)_{lk}C(j,k,l)\bigr)\biggl[\sum_{r,\,\lambda_r\neq\lambda_i+\lambda_j}(P_rM(0))_k\frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - \frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j}\Bigr)\\ &\qquad\qquad + \sum_{r,\,\lambda_r=\lambda_i+\lambda_j}(P_rM(0))_k\,te^{(\lambda_i+\lambda_j)t} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - te^{(\lambda_i+\lambda_j)t}\Bigr)\biggr], \end{aligned}$$

where

$$C(i,k,l) = \sum_{m=1}^{n}P_{ilm}\,K_{mk}^{cat}.$$

Thus the (l, l) component of V is

$$\begin{aligned} V_{ll}(t) &= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}\bigl[P_jV(0)P_i^T\bigr]_{ll} + \sum_{i,j}\biggl[\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_j}\bigl[P_jK_s(P_iM_0)^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jK_s(P_iS)^T\bigr]_{ll}\\ &\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i}\bigl[P_jM_0K_sP_i^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_j(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jSK_sP_i^T\bigr]_{ll}\biggr]\\ &\qquad + \sum_{i,j}\sum_k\bigl((P_j)_{lk}C(i,k,l) + (P_i)_{lk}C(j,k,l)\bigr)\biggl[\sum_{r,\,\lambda_r\neq\lambda_i+\lambda_j}(P_rM(0))_k\frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - \frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j}\Bigr)\\ &\qquad\qquad + \sum_{r,\,\lambda_r=\lambda_i+\lambda_j}(P_rM(0))_k\,te^{(\lambda_i+\lambda_j)t} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - te^{(\lambda_i+\lambda_j)t}\Bigr)\biggr]. \end{aligned}$$

Note that

$$E[N_l(t)] = [M(t)]_l = \sum_ie^{\lambda_it}(P_iM(0))_l - \sum_i\frac{1-e^{\lambda_it}}{\lambda_i}(P_ik^s)_l$$

and that

$$E[N_l(t)]^2 = \bigl([M(t)]_l\bigr)^2 = \sum_{i,j}e^{(\lambda_i+\lambda_j)t}(P_iM(0))_l(P_jM(0))_l + \sum_{i,j}\frac{(1-e^{\lambda_it})(1-e^{\lambda_jt})}{\lambda_i\lambda_j}(P_ik^s)_l(P_jk^s)_l - 2\sum_{i,j}\frac{e^{\lambda_it}(1-e^{\lambda_jt})}{\lambda_j}(P_iM(0))_l(P_jk^s)_l.$$

Therefore the variance of the $l$th species is

$$\begin{aligned} \sigma_l^2(t) &= V_{ll}(t) + [M(t)]_l - \bigl([M(t)]_l\bigr)^2\\ &= \sum_{i,j}e^{(\lambda_i+\lambda_j)t}\bigl[P_jV(0)P_i^T\bigr]_{ll} + \sum_{i,j}\biggl[\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_j}\bigl[P_jK_s(P_iM_0)^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_it}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jK_s(P_iS)^T\bigr]_{ll}\\ &\qquad + \frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i}\bigl[P_jM_0K_sP_i^T\bigr]_{ll} + \Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-e^{\lambda_jt}}{\lambda_i\lambda_j} - \frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_j(\lambda_i+\lambda_j)}\Bigr)\bigl[P_jSK_sP_i^T\bigr]_{ll}\biggr]\\ &\qquad + \sum_{i,j}\sum_k\bigl((P_j)_{lk}C(i,k,l) + (P_i)_{lk}C(j,k,l)\bigr)\biggl[\sum_{r,\,\lambda_r\neq\lambda_i+\lambda_j}(P_rM(0))_k\frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - \frac{e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}}{\lambda_r-\lambda_i-\lambda_j}\Bigr)\\ &\qquad\qquad + \sum_{r,\,\lambda_r=\lambda_i+\lambda_j}(P_rM(0))_k\,te^{(\lambda_i+\lambda_j)t} - \frac{(P_rk^s)_k}{\lambda_r}\Bigl(\frac{e^{(\lambda_i+\lambda_j)t}-1}{\lambda_i+\lambda_j} - te^{(\lambda_i+\lambda_j)t}\Bigr)\biggr]\\ &\qquad + [M(t)]_l - \sum_{i,j}e^{(\lambda_i+\lambda_j)t}(P_iM(0))_l(P_jM(0))_l - \sum_{i,j}\frac{(1-e^{\lambda_it})(1-e^{\lambda_jt})}{\lambda_i\lambda_j}(P_ik^s)_l(P_jk^s)_l + 2\sum_{i,j}\frac{e^{\lambda_it}(1-e^{\lambda_jt})}{\lambda_j}(P_iM(0))_l(P_jk^s)_l. \end{aligned}$$

After some computation we get

$$\sigma_l^2(t) = -\sum_{i,j}e^{(\lambda_i+\lambda_j)t}(P_i\odot P_jM(0))_l + \sum_ie^{\lambda_it}\Bigl[(P_iM(0))_l + \frac{(P_ik^s)_l}{\lambda_i}\Bigr] - \sum_i\frac{1}{\lambda_i}(P_ik^s)_l + \sum_{i,j}\sum_kD(i,j,k,l)\sum_r\Bigl[\Bigl((P_rM(0))_k + \frac{(P_rk^s)_k}{\lambda_r}\Bigr)f_{r,i,j}(t) - \frac{(P_rk^s)_k}{\lambda_r(\lambda_i+\lambda_j)}\bigl(e^{(\lambda_i+\lambda_j)t}-1\bigr)\Bigr],$$

where

$$C(i,k,l) = \sum_{m=1}^{n}(P_i)_{lm}K_{mk}^{cat},\qquad D(i,j,k,l) = \bigl((P_j)_{lk}C(i,k,l) + (P_i)_{lk}C(j,k,l)\bigr),\qquad f_{r,i,j}(t) = \begin{cases} \dfrac{1}{\lambda_r-\lambda_i-\lambda_j}\bigl(e^{\lambda_rt}-e^{(\lambda_i+\lambda_j)t}\bigr) & \text{if } \lambda_r\neq\lambda_i+\lambda_j\\[2mm] te^{(\lambda_i+\lambda_j)t} & \text{if } \lambda_r=\lambda_i+\lambda_j, \end{cases}$$

and ‘⊙’ is componentwise matrix multiplication.

If there is no catalysis in the system, then

D(i,j,k,l)=0

and we have

$$\sigma_l^2(t) = -\sum_{i,j}e^{(\lambda_i+\lambda_j)t}(P_i\odot P_jM(0))_l + \sum_ie^{\lambda_it}\Bigl[(P_iM(0))_l + \frac{(P_ik^s)_l}{\lambda_i}\Bigr] - \sum_i\frac{1}{\lambda_i}(P_ik^s)_l.$$

Details for the proof of the binomial distribution

$$\begin{aligned} &P\bigl(N_1(t)=m,\ N_2(t)=N-m\bigr)\\ &\quad= \sum_{i=0}^{N}P\bigl(N_1(t)=m,\ N_2(t)=N-m\,\big|\,N_1(0)=i,\ N_2(0)=N-i\bigr)\,P\bigl(N_1(0)=i,\ N_2(0)=N-i\bigr)\\ &\quad= \sum_{k=0}^{m}\sum_{i=k}^{N-m+k}\binom{i}{k}\bigl(p_t^k(1,1)\,p_t^{i-k}(1,2)\bigr)\binom{N-i}{m-k}\bigl(p_t^{m-k}(2,1)\,p_t^{N-i-m+k}(2,2)\bigr)\binom{N}{i}p_1(0)^ip_2(0)^{N-i}\\ &\quad= \sum_{k=0}^{m}\sum_{i=k}^{N-m+k}\binom{i}{k}\binom{N-i}{m-k}\binom{N}{i}\bigl(p_1(0)p_t(1,1)\bigr)^k\bigl(p_1(0)p_t(1,2)\bigr)^{i-k}\bigl(p_2(0)p_t(2,1)\bigr)^{m-k}\bigl(p_2(0)p_t(2,2)\bigr)^{N-i-m+k}\\ &\quad= \sum_{k=0}^{m}\sum_{l=0}^{N-m}\binom{k+l}{k}\binom{N-(k+l)}{m-k}\binom{N}{k+l}\bigl(p_1(0)p_t(1,1)\bigr)^k\bigl(p_1(0)p_t(1,2)\bigr)^l\bigl(p_2(0)p_t(2,1)\bigr)^{m-k}\bigl(p_2(0)p_t(2,2)\bigr)^{N-m-l},\quad\text{by letting } l=i-k\\ &\quad= \sum_{k=0}^{m}\sum_{l=0}^{N-m}\binom{N}{m}\binom{m}{k}\binom{N-m}{l}\bigl(p_1(0)p_t(1,1)\bigr)^k\bigl(p_1(0)p_t(1,2)\bigr)^l\bigl(p_2(0)p_t(2,1)\bigr)^{m-k}\bigl(p_2(0)p_t(2,2)\bigr)^{N-m-l}\\ &\quad= \binom{N}{m}\Bigl(\sum_{k=0}^{m}\binom{m}{k}\bigl(p_1(0)p_t(1,1)\bigr)^k\bigl(p_2(0)p_t(2,1)\bigr)^{m-k}\Bigr)\Bigl(\sum_{l=0}^{N-m}\binom{N-m}{l}\bigl(p_1(0)p_t(1,2)\bigr)^l\bigl(p_2(0)p_t(2,2)\bigr)^{N-m-l}\Bigr)\\ &\quad= \binom{N}{m}\bigl(p_1(0)p_t(1,1)+p_2(0)p_t(2,1)\bigr)^m\bigl(p_1(0)p_t(1,2)+p_2(0)p_t(2,2)\bigr)^{N-m}\\ &\quad= \binom{N}{m}\bigl(p_1(t)\bigr)^m\bigl(p_2(t)\bigr)^{N-m},\quad\text{by (52)} \end{aligned}$$

where we used the fact that

$$\binom{k+l}{k}\binom{N-(k+l)}{m-k}\binom{N}{k+l} = \frac{(k+l)!}{k!\,l!}\,\frac{(N-(k+l))!}{(m-k)!\,(N-l-m)!}\,\frac{N!}{(k+l)!\,(N-(k+l))!} = \frac{N!}{m!\,(N-m)!}\,\frac{m!}{(m-k)!\,k!}\,\frac{(N-m)!}{l!\,(N-m-l)!} = \binom{N}{m}\binom{m}{k}\binom{N-m}{l}.$$
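This identity can also be spot-checked numerically, e.g. for a hypothetical $N = 12$:

```python
from math import comb

# Spot-check of the binomial-coefficient identity used in the last step.
N = 12
ok = all(
    comb(k + l, k) * comb(N - (k + l), m - k) * comb(N, k + l)
    == comb(N, m) * comb(m, k) * comb(N - m, l)
    for m in range(N + 1)
    for k in range(m + 1)
    for l in range(N - m + 1)
)
print(ok)   # True
```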

Queuing theory and chemical reaction networks

There are formal similarities between chemical reaction networks and queuing networks (Arazi et al., 2004) that can be used to translate results between the two contexts. For example, the following shows that an open conversion network is equivalent to an M/M/∞ queuing network.

Consider an irreducible or strongly connected network consisting of M/M/∞ queues with s stations. The notation M/M/∞ means Poisson arrivals, exponential service time and an infinite number of servers. The scheme of this queuing network is as follows:


  • Each station has an M/M/∞ queuing scheme.

  • $a_i$ = rate of arrivals from outside the network into station $i$.

  • When station $i$ has $n$ occupants, individuals depart the system at rate $d_in$.

  • A customer leaving station $i$ goes to station $j$ with probability $\lambda_{ij}$, for $i \neq j$.

The connection between the network of M/M/∞ queue and the open conversion network is as follows.

| Network of M/M/∞ queues | Open conversion network |
|---|---|
| Station | Species |
| Customer | Molecule |
| Number of customers in station | Number of molecules of species |
| Arrival of a customer from outside the system | Production of a molecule from source |
| Departure out of system | Degradation |
| Transition from $i$th to $j$th station | Conversion from $i$th species to $j$th species |

From this table we can see that $N_i(t)$, the number of customers in the $i$th station at time $t$, corresponds to the number of molecules of the $i$th species at time $t$, and the rate $a_i$ can be identified with $k_i^s$, $d_i$ with $k_i^d$, and $\lambda_{ij}$ with $K_{ji}^{con}$.

It can be shown that in a network of M/M/∞ queues the stationary distribution is Poisson. Furthermore, the distributions of N1(t), …,Ns(t) for each time t are independent Poisson if the system is empty initially (Durrett, 1999).

Footnotes

3

Hereafter s will denote the number of species whose concentration may be time-dependent.

4

A slightly more abstract way of stating this is that each complex defines an equivalence class $E \subset \mathbb{Z}_{\ge0}^s$, and the change in the number of molecules due to one step of the $k$th reaction lies in the direction $\nu(j)-\nu(i)$ for the reaction $E_i \rightarrow E_j$.

5

A reviewer has pointed out that the equation for the second moments in the form (29) is solved formally, i.e. converted to an integral equation, in Athreya and Ney (1972), and estimates of the growth rate are derived from this. However, an explicit solution is not given.

6

A reviewer has remarked that the distribution in an open system is Poisson at time t if the initial distribution is Poisson.

7

We thank a reviewer for pointing out this approach.

References

  1. Arazi A, Ben-Jacob E, Yechiali U, 2004. Bridging genetic networks and queuing theory. Physica A 332, 585–616.
  2. Athreya K, Ney P, 1972. Branching Processes. Springer-Verlag.
  3. Austin RH, Beeson KW, Eisenstein L, Frauenfelder H, Gunsalus IC, 1975. Dynamics of ligand binding to myoglobin. Biochemistry 14 (24), 5355–5373.
  4. Bartholomay AF, 1958. Stochastic models for chemical reactions: I. theory of the unimolecular reaction process. Math. Biophys 20, 175–190.
  5. Bartholomay AF, 1959. Stochastic models for chemical reactions: II. the unimolecular rate constant. Math. Biophys 21, 363–373.
  6. Blake WJ, Kaern M, Cantor CR, Collins JJ, 2003. Noise in eukaryotic gene expression. Nature 422 (6932), 633–637.
  7. Bodewig E, 1959. Matrix Calculus. Interscience Publishers, Inc., New York.
  8. Bokinsky G, Rueda D, Misra VK, Rhodes MM, Gordus A, Babcock HP, Walter NG, Zhuang X, 2003. Single-molecule transition-state analysis of RNA folding. Proc. Natl. Acad. Sci. USA 100 (16), 9302–9307.
  9. Brown FLH, 2003. Single-molecule kinetics with time-dependent rates: a generating function approach. Phys. Rev. Lett 90 (2), 028302.
  10. Darvey IG, Ninham BW, Staff PJ, 1966. Stochastic models for second-order chemical reaction kinetics, the equilibrium state. J. Chem. Phys 45 (6), 2145–2155.
  11. Darvey IG, Staff PJ, 1966. Stochastic approach to first-order chemical reaction kinetics. J. Chem. Phys 44 (3), 990.
  12. Delbruck M, 1940. Statistical fluctuations in autocatalytic reactions. J. Chem. Phys 8, 120–124.
  13. Durrett R, 1999. Essentials of Stochastic Processes. Springer-Verlag, New York.
  14. Elowitz MB, Levine AJ, Siggia ED, Swain PS, 2002. Stochastic gene expression in a single cell. Science 297 (5584), 1183–1186.
  15. Fredrickson AG, 1966. Stochastic triangular reactions. Chem. Engg. Sci 21, 687–691.
  16. Gani J, 1965. Stochastic models for bacteriophage. J. Appl. Prob 2, 225–268.
  17. Gans PJ, 1960. Open first-order stochastic processes. J. Chem. Phys 33 (3), 691.
  18. Gardiner CW, 1983. Handbook of Stochastic Methods. Springer, Berlin, Heidelberg.
  19. Gillespie DT, 1976. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys 22, 403–434.
  20. Harris T, 1963. The Theory of Branching Processes. Springer-Verlag, Berlin.
  21. Horn F, Jackson R, 1972. General mass action kinetics. Arch. Ration. Mech. Anal 48, 81.
  22. Iorio EED, Hiltpold UR, Filipovic D, Winterhalter KH, Gratton E, Vitrano E, Cupane A, Leone M, Cordone L, 1991. Protein dynamics. comparative investigation on heme-proteins with different physiological roles. Biophys. J 59 (3), 742–754.
  23. Kelly FP, 1979. Reversibility and Stochastic Networks. In: Wiley Series in Probability and Mathematical Statistics. John Wiley and Sons, New York, NY, USA, London, UK, Sydney, Australia.
  24. Kendall DG, 1948. On the generalized "birth-and-death" process. Ann. Math. Stat 19 (1), 1–15.
  25. Kepler TB, Elston TC, 2001. Stochasticity in transcriptional regulation: origins, consequences, and mathematical representations. Biophys. J 81 (6), 3116–3136.
  26. Kim SK, 1958. Mean first passage time for a random walker and its application to chemical kinetics. J. Chem. Phys 28 (6), 1057–1067.
  27. Klein MJ, 1956. Generalization of the Ehrenfest urn model. Phys. Rev 103 (1), 17–20.
  28. Krieger IM, Gans PJ, 1960. First-order stochastic processes. J. Chem. Phys 32 (1), 247.
  29. Kuthan H, 2001. Self-organisation and orderly processes by individual protein complexes in the bacterial cell. Prog. Biophys. Mol. Biol 75 (1–2), 1–17.
  30. Laurenzi IJ, 2000. An analytical solution of the stochastic master equation for reversible biomolecular reaction kinetics. J. Chem. Phys 113 (8), 3315–3322.
  31. Levsky JM, Singer RH, 2003. Gene expression and the myth of the average cell. Trends Cell Biol 13 (1), 4–6.
  32. Mayor U, Guydosh NR, Johnson CM, Grossmann JG, Sato S, Jas GS, Freund SM, Alonso DO, Daggett V, Fersht AR, 2003. The complete folding pathway of a protein from nanoseconds to microseconds. Nature 421 (6925), 863–867.
  33. McQuarrie DA, 1963. Kinetics of small systems. J. Chem. Phys 38 (2), 433–436.
  34. McQuarrie DA, Jachimowski CJ, Russell ME, 1964. Kinetics of small systems. II. J. Chem. Phys 40 (10), 2914.
  35. Montroll EW, Shuler KE, 1958. The application of the theory of stochastic processes to chemical kinetics. Adv. Chem. Phys 1, 361–399.
  36. Nicolis G, Prigogine I, 1977. Self-organization in nonequilibrium systems: from dissipative structures to order through fluctuations. John Wiley and Sons, New York, NY, USA, London, UK, Sydney, Australia, A Wiley–Interscience Publication.
  37. Othmer HG, 1969. Interactions of reaction and diffusion in open systems. Ph.D. Thesis, University of Minnesota, Minneapolis.
  38. Othmer HG, 1979. A graph-theoretic analysis of chemical reaction networks, Lecture Notes, Rutgers University.
  39. Othmer HG, 1981. The interaction of structure and dynamics in chemical reaction networks. In: Ebert KH, Deuflhard P, Jager W (Eds.), Modelling of Chemical Reaction Systems. Springer-Verlag, New York, pp. 1–19.
  40. Othmer HG, Scriven LE, 1971. Instability and dynamic pattern in cellular networks. J. Theor. Biol 32, 507–537.
  41. Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A, 2002. Regulation of noise in the expression of a single gene. Nat. Genet 31 (1), 69–73.
  42. Rao CV, Arkin AP, 2003. Stochastic chemical kinetics and the quasi-steady-state assumption: Application to the Gillespie algorithm. J. Chem. Phys 118 (11), 4999–5010.
  43. Shuler KF, 1960. Relaxation processes in multistate systems. Phys. Fluids 2 (4), 442–448.
  44. Siegert AJF, 1949. On the approach to statistical equilibrium. Phys. Rev 76 (11), 1708–1714.
  45. Singer K, 1953. Application of the theory of stochastic processes to the study of irreproducible chemical reactions and nucleation processes. J. Roy. Stat. Soc. Ser. B 15 (1), 92–106.
  46. Spudich JL, Koshland DE, 1976. Non-genetic individuality: chance in the single cell. Nature 262 (5568), 467–471.
  47. Stundzia AB, Lumsden CJ, 1996. Stochastic simulation of coupled reaction–diffusion processes. J. Comput. Phys 127 (0168), 196–207.
  48. Swain PS, Elowitz MB, Siggia ED, 2002. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. USA 99 (20), 12795–12800.
  49. Thattai M, van Oudenaarden A, 2001. Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci. USA 98 (15), 8614–8619.
  50. Turing AM, 1952. The chemical basis of morphogenesis. Phil. Trans. Roy. Soc. Lond. B 237, 37–72.
  51. Tyson JJ, Othmer HG, 1978. The dynamics of feedback control circuits in biochemical pathways. Prog. Theor. Biol 5, 1–62.
  52. Wei J, Prater CD, 1962. The structure and analysis of complex reaction systems. Adv. Catal 13, 203.
