The Journal of Chemical Physics. 2013 Mar 13;138(10):104111. doi: 10.1063/1.4794128

Mathematics of small stochastic reaction networks: A boundary layer theory for eigenstate analysis

Eric Mjolsness 1,a), Upendra Prasad 2,b)
PMCID: PMC3612114  PMID: 23514469

Abstract

We study and analyze the stochastic dynamics of a reversible bimolecular reaction A + B ⇌ C, called the “trivalent reaction.” This reaction is of a fundamental nature and is part of many biochemical reaction networks. The stochastic dynamics is given by the stochastic master equation, which is difficult to solve except when the equilibrium state solution is desired. We present a novel way of finding the eigenstates of this system of difference-differential equations, using perturbation analysis of ordinary differential equations arising from approximation of the difference equations. The time evolution of the state probabilities can then be expressed in terms of the eigenvalues and the eigenvectors.

INTRODUCTION

The elementary chemical reaction

$A + B \;\underset{k_r}{\overset{k_f}{\rightleftharpoons}}\; C,$

which we will refer to as the “trivalent reaction,” holds foundational importance as a repeating unit within reaction network models of many biochemical processes. Most existing studies of this reaction have concentrated on the deterministic approximation and on its equilibrium state stochastic solution. The deterministic analysis of this reaction does not reflect the underlying stochasticity of the reaction and is often insufficient to completely understand the temporal progression of the process. For this reason we aim to understand the stochastic dynamics of such reactions, which describes the more general behavior.

Earlier works on the stochastic model for some elementary bimolecular reactions are due to Renyi3 and Darvey et al.,2 where the idea of probability generating functions was used and closed form solutions for the equilibrium state were provided. McQuarrie4 presents a comprehensive review of the stochastic models for several such chemical reactions. Laurenzi5 gave an exact closed form solution of the stochastic master equation for the trivalent reaction by using the Laplace transform. Recently, Lee and Kim1 have provided an exact solution for a slightly more general class of single-reaction systems. However, the resulting expression is complex and inefficient to evaluate for large numbers of molecules.

In this paper we present a perturbation theoretic approach, valid for relatively large numbers of molecules, to describe the eigenstates of the stochastic master equation for the trivalent reaction; this yields an efficient way to calculate the probability of any specific state of the reaction as a function of time.

Mass action kinetics

Given that the numbers of particles A, B, and C at time t are α − n, β − n, and n, respectively, the mean number of particles ⟨n⟩ should evolve roughly according to the deterministic expression

$\frac{dn}{dt} = k_f(\alpha - n)(\beta - n) - k_r n,$

or (to make contact with the notation used subsequently)

$\frac{dn}{d\tau} = (\alpha - n)(\beta - n) - \kappa n,$

where τ = kft and κ = kr/kf.
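As a minimal numerical sketch (the parameter values α = β = 100 and κ = 50, i.e., μ = 1/2, are assumed here for illustration), this deterministic rate equation can be integrated by forward Euler and seen to converge to the mass-action equilibrium discussed below:

```python
# Sketch (assumed example parameters): forward-Euler integration of
# dn/dtau = (alpha - n)(beta - n) - kappa*n, checking convergence to the
# root of the mass-action relation (alpha - n)(beta - n) = kappa*n.
alpha = beta = 100.0
kappa = 50.0            # mu = kappa/alpha = 1/2
n, dtau = 0.0, 1e-4
for _ in range(2000):   # integrate up to tau = 0.2
    n += dtau * ((alpha - n) * (beta - n) - kappa * n)
print(round(n, 3))      # settles at alpha * x_minus = 50 for mu = 1/2
```

The step size is chosen small enough that the explicit scheme is stable (the local relaxation rate near equilibrium is of order 2(α − n) + κ).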

The law of mass action gives the following relation in dynamic equilibrium:

$\frac{[A][B]}{[C]} = \kappa = k_r/k_f,$

where [ . ] stands for concentration. This could be restated in terms of number of molecules of A and B (after adjustments for the volume in the term κ) as

$(\alpha - n)(\beta - n) = \kappa n.$

In the special case when α = β, we have

$(\alpha - n)^2 = \kappa n \quad\Longrightarrow\quad n^2 - (2\alpha + \kappa)n + \alpha^2 = 0.$

Define

$x \equiv n/\alpha \approx (n+1)/(\alpha+2) \quad\text{and}\quad \mu = \kappa/\alpha.$

Then the law of mass action above implies

$\left(\frac{\alpha+1}{\alpha+2} - x\right)^{2} = \frac{\mu\alpha}{\alpha+2}\left(x - \frac{1}{\alpha+2}\right).$

For large values of α this suggests that

$(1-x)^2 \approx \mu x + O(1/\alpha) \quad\text{or}\quad x^2 - (2+\mu)x + 1 \approx 0.$

The mass action solutions are given by

$x_{\pm} = 1 + \mu/2 \pm \sqrt{\mu(4+\mu)}/2,$ (1)

and the mass action condition could be expressed in above notation as

$(x - x_+)(x - x_-) = 0.$

For all μ > 0, the value of x_+ > 1, so this root is less important than x_− ∈ (0, 1), which is the actual mass action solution. Note the special point μ = 1/2, for which x_− = 1/2. This symmetric situation will be the default when parameters are required.
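A quick numerical check of Eq. (1) at the default parameter point μ = 1/2 (this helper function is introduced here only for illustration):

```python
# Sketch: evaluate the mass-action roots x_pm of Eq. (1),
# x_pm = 1 + mu/2 +/- sqrt(mu(4 + mu))/2, at mu = 1/2.
import math

def mass_action_roots(mu):
    # roots of x^2 - (2 + mu) x + 1 = 0
    s = math.sqrt(mu * (4.0 + mu))
    return 1.0 + mu / 2.0 + s / 2.0, 1.0 + mu / 2.0 - s / 2.0

x_plus, x_minus = mass_action_roots(0.5)
print(x_plus, x_minus)   # x_+ = 2.0 > 1, and x_- = 0.5 as stated
```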

FULL STOCHASTIC DYNAMICS FOR TRIVALENT REACTION

Chemical master equation

Let α and β be the initial numbers of molecules of A and B, respectively. Clearly by law of conservation, if n is the number of C at time t, starting initially from 0, then α − n and β − n are the numbers of molecules of A and B at time t, respectively. Thus, only one variable, namely, n is sufficient to represent the state of the system at time t. Let p(n) denote the probability that the reaction is in state n at time t.

$p(n) \equiv p(n_A, n_B, n_C) = p(\alpha - n,\ \beta - n,\ n).$

The chemical master equation (CME) for the trivalent reaction is given by the following relation. See a recent work by Laurenzi5 for a brief derivation. For a rigorous derivation and complete description of the underlying assumptions see Gillespie6 and references therein,

$\frac{dp(n)}{dt} = k_f(\alpha - n + 1)(\beta - n + 1)\,p(n-1) - \left[k_f(\alpha - n)(\beta - n) + k_r n\right]p(n) + k_r(n+1)\,p(n+1),$

together with conditions p(−1) = 0 = p(α + 1). On using τ = kft, κ = kr/kf and on rewriting the scaled τ as t, i.e., t ← τ, we get the following simplification for the CME:

$\frac{dp(n)}{dt} = (\alpha - n + 1)(\beta - n + 1)\,p(n-1) - \left[(\alpha - n)(\beta - n) + \kappa n\right]p(n) + \kappa(n+1)\,p(n+1).$ (2)

Note that the rescaling of time does not affect the calculation of eigenvectors in Secs. 2B–4E, so there is no harm in writing t in place of τ.

The CME could be written in terms of the transition rate matrix W—the generator of the continuous-time Markov process—which, from Eq. 2, is clearly a tridiagonal matrix

$\frac{d}{dt}P(t) = W \cdot P(t) = \begin{pmatrix} c_0 & b_0 & 0 & 0 & \cdots & 0 \\ a_1 & c_1 & b_1 & 0 & \cdots & 0 \\ 0 & a_2 & c_2 & b_2 & \cdots & 0 \\ 0 & 0 & a_3 & c_3 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & c_\alpha \end{pmatrix} \cdot P(t),$

where

$P = [p(0), p(1), \ldots, p(\alpha)]^T,\qquad a_i = (\alpha - i + 1)(\beta - i + 1),\qquad c_i = -(\alpha - i)(\beta - i) - \kappa i,\qquad b_i = \kappa(i+1),\qquad \text{for } i = 0, 1, \ldots, \alpha.$

Matrix W could be represented in terms of birth and death operators that account for the inflow and outflow of probability in a particular state. It could be verified that every column of this matrix sums to zero, which reflects the conservation of total probability.
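A minimal sketch of this construction (α = β and the values α = 30, κ = 15 are assumed for illustration), verifying that each column of W sums to zero:

```python
# Sketch: assemble the tridiagonal generator W of the CME, Eq. (2),
# with entries a_i, c_i, b_i as defined above, and check that every
# column sums to zero (probability conservation).
import numpy as np

def build_W(alpha, kappa, beta=None):
    beta = alpha if beta is None else beta
    N = alpha + 1
    W = np.zeros((N, N))
    for i in range(N):
        W[i, i] = -((alpha - i) * (beta - i) + kappa * i)   # c_i
        if i > 0:
            W[i, i - 1] = (alpha - i + 1) * (beta - i + 1)  # a_i
        if i < N - 1:
            W[i, i + 1] = kappa * (i + 1)                   # b_i
    return W

W = build_W(alpha=30, kappa=15.0)
print(float(np.abs(W.sum(axis=0)).max()))   # column sums vanish
```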

Let the pair (λ, Pλ), where Pλ = [pλ(0), …, pλ(α)]T, denote an eigenvalue-eigenvector pair for the above system defined by the relation

$\frac{dP_\lambda}{dt} = -\lambda P_\lambda.$

This leads to the following three term recurrence relation when used in Eq. 2:

$\kappa(n+1)\,p_\lambda(n+1) - \left[(\alpha - n)(\beta - n) + \kappa n - \lambda\right]p_\lambda(n) + (\alpha - n + 1)(\beta - n + 1)\,p_\lambda(n-1) = 0,$ (3)

with boundary conditions $p_\lambda(-1) = p_\lambda(\alpha + 1) = 0$.

This recurrence relation could be solved for an eigenvalue-eigenvector pair recursively. However, this method is not numerically efficient and does not give a closed form solution for the eigenvector. We present a matrix theoretic approach to find the same, which takes advantage of the underlying structure in the recurrence relation and is numerically more stable and efficient.

Eigenvalues from matrix theory

From previous discussion, an eigenstate of the above system can be defined as a pair of eigenvalue λ and eigenvector Pλ that satisfies the relation WPλ = λPλ.

The special form of the matrix W—which is a tridiagonal matrix—lends an easier way of evaluating the eigenvalues and eigenvectors. The matrix W defined above is diagonally similar to a symmetric matrix Ŵ, i.e., W = DŴD⁻¹ for some diagonal matrix D. Indeed, by construction, D = diag(d₀, d₁, …, d_α) provides such a diagonal matrix, where dᵢ is defined by the first order recurrence relation

$d_0 = 1 \quad\text{and}\quad d_i = d_{i-1}\sqrt{a_i/b_{i-1}}.$ (4)

Every real symmetric matrix is diagonalizable in the sense that there exists an orthogonal matrix Q (see p. 104 of Horn and Johnson10) such that

$W = D\hat{W}D^{-1} = DQ\Lambda Q^{-1}D^{-1} = (DQ)\Lambda(DQ)^{-1},$

where the diagonal matrix $\Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_\alpha)$, and where the λᵢ are the eigenvalues of Ŵ. We will assume henceforth that these eigenvalues are arranged in increasing order of magnitude.

Once these eigenvalues and associated eigenvectors are known, the solution of the master equation could be given by

$P(t) = e^{tW}P(0) = (DQ)\,e^{t\Lambda}(DQ)^{-1}P(0),$ (5)

where matrix etΛ could be easily calculated as the diagonal matrix

$e^{t\Lambda} = \mathrm{diag}\left(e^{t\lambda_0}, e^{t\lambda_1}, \ldots, e^{t\lambda_\alpha}\right).$

On solving the recurrence relation in Eq. 4, we get the following elements for the matrix D:

$d_n = \sqrt{\frac{(\alpha)_n(\beta)_n}{n!\,\kappa^n}},$

where (α)n = α(α − 1)…(α − n + 1). In the case when α = β, we get

$d_n = \frac{(\alpha)_n}{\sqrt{n!\,\kappa^n}}.$

The symmetric matrix Ŵ is given by

$\hat{W} = \begin{pmatrix} c_0 & \sqrt{b_0 a_1} & 0 & 0 & \cdots & 0 \\ \sqrt{b_0 a_1} & c_1 & \sqrt{b_1 a_2} & 0 & \cdots & 0 \\ 0 & \sqrt{b_1 a_2} & c_2 & \sqrt{b_2 a_3} & \cdots & 0 \\ 0 & 0 & \sqrt{b_2 a_3} & c_3 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & c_\alpha \end{pmatrix}.$

In the case when α = β,

$c_i = -(\alpha - i)(\beta - i) - \kappa i = -(\alpha - i)^2 - \kappa i \quad\text{and}\quad \sqrt{b_i a_{i+1}} = (\alpha - i)\sqrt{\kappa(i+1)}.$

The symmetric tridiagonal form of the matrix Ŵ lends itself to a faster and more accurate algorithm for eigenvalue calculation. We have used steps of the implicit-shift QR algorithm to find these eigenvalues.9 Similarity of W and Ŵ implies that W has exactly the same eigenvalues.
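A sketch of this symmetrized eigenvalue computation, substituting numpy's dense symmetric eigensolver for the implicit-shift QR steps used in the paper (α = β = 30 and κ = 15 are assumed example values):

```python
# Sketch: build the symmetric tridiagonal W-hat with diagonal c_i and
# off-diagonal sqrt(b_i a_{i+1}) = (alpha - i) sqrt(kappa (i+1)), then
# compute its spectrum; the largest eigenvalue is zero and the rest
# are strictly negative.
import numpy as np

alpha, kappa = 30, 15.0
N = alpha + 1
c = np.array([-((alpha - i) ** 2 + kappa * i) for i in range(N)])
off = np.array([(alpha - i) * np.sqrt(kappa * (i + 1)) for i in range(N - 1)])
W_hat = np.diag(c) + np.diag(off, 1) + np.diag(off, -1)
evals = np.linalg.eigvalsh(W_hat)        # ascending order
print(evals[-1])                         # ~0 (the equilibrium eigenvalue)
print(bool(np.all(evals[:-1] < 0.0)))    # all other eigenvalues negative
```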

The eigenvectors associated with an eigenvalue can then be found by solving the recurrence relation in Eq. 3 for the given eigenvalue (accumulating the transformations in the QR algorithm would also have given us the eigenvectors, but with poor numerical accuracy).

Once the eigenvectors have been found, the solution of the CME with initial condition P(0) = [p(0), p(1), …, p(α)]T could be given more explicitly as

$p(k,t) = \left[(DQ)\,e^{t\Lambda}(DQ)^{-1}P(0)\right]_k.$ (6)

If only the first few eigenvectors of W are known then pseudo-inverse could be used to express the solution of the CME as

p(k,t)=[SetΛSP(0)]k, (7)

where S is a rectangular matrix whose columns are the first few eigenvectors of W, Λ is restricted to the corresponding eigenvalues, and S† denotes the pseudo-inverse of S.

It should be noted that, for all sufficiently large times t > 0, the first few terms suffice to provide a good approximation to the solution of p(k, t) in Eq. 6 as higher eigenstates make very small contribution due to large negative eigenvalues. This fact allows us to focus just on the first few eigenstates for the calculation of an approximate solution to the CME.
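A sketch of the spectral propagation formula, Eq. (5)/(6) (the values α = β = 30, κ = 15 and a point-mass initial condition are assumptions made here for illustration); the solution conserves probability and relaxes to the stationary state, which by detailed balance is proportional to d²:

```python
# Sketch: P(t) = D Q exp(t Lambda) Q^T D^{-1} P(0) for the trivalent
# reaction with alpha = beta; d_i follows the recurrence of Eq. (4).
import numpy as np

alpha, kappa = 30, 15.0
N = alpha + 1
d = np.ones(N)
for i in range(1, N):
    a_i = (alpha - i + 1) ** 2       # a_i with alpha = beta
    b_prev = kappa * i               # b_{i-1} = kappa * i
    d[i] = d[i - 1] * np.sqrt(a_i / b_prev)
c = np.array([-((alpha - i) ** 2 + kappa * i) for i in range(N)])
off = np.array([(alpha - i) * np.sqrt(kappa * (i + 1)) for i in range(N - 1)])
W_hat = np.diag(c) + np.diag(off, 1) + np.diag(off, -1)
lam, Q = np.linalg.eigh(W_hat)

def propagate(P0, t):
    # P(t) = D Q e^{t Lambda} Q^T D^{-1} P(0)
    return d * (Q @ (np.exp(t * lam) * (Q.T @ (P0 / d))))

P0 = np.zeros(N); P0[0] = 1.0        # all molecules start as A, B
Pt = propagate(P0, 0.1)
print(Pt.sum())                      # total probability stays 1
P_inf = propagate(P0, 50.0)
pi = d ** 2 / (d ** 2).sum()         # detailed-balance stationary state
print(np.abs(P_inf - pi).max())      # long-time limit matches it
```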

A motivating idea behind finding an approximate analytic solution to the trivalent reaction in the context of a large reaction network is the operator splitting method for solving differential equations. Our proposed analysis could be used to solve CME for more complex reaction networks involving a bimolecular reaction in conjunction with the splitting method that has been suggested by Jahnke and Altintan.11 The central notion in this strategy consists of subdividing the reactions in a large reaction network into subsets of uncoupled reaction channels on which either an explicit analytic solution or a stochastic simulation algorithm could be applied over small time intervals. The solution of the entire stochastic system then evolves as a flow of nested compositions of two or more operators working over small time intervals. Jahnke and Altintan11 also show that the error due to the splitting is comparable to the natural fluctuations arising in any stochastic system. As explicit solutions already exist for several reactions such as monomolecular, catalytic, and auto-catalytic reactions,12 this technique could be used with the stochastic simulation algorithm for solving larger reaction networks.

An alternative approach for solving the CME for large reaction networks is the finite state projection that has been suggested by Munsky and Khammash.13 In this method the solution of the CME evolves from a smaller subset of the state space to a larger one. The error accrued in the process is shown to be within permissible limits. However, this strategy also suffers from the computational expense of the operation, particularly for the parameter regime of medium to large but finite α (and thus particle number) studied in this paper, of calculating the exponential of the generator matrix or a submatrix. Indeed, calculating such matrix exponentials approximately for larger systems is one of the potential benefits of the perturbation theory method proposed in this paper, so perturbation theory methods could be used to extend the domain of applicability of finite state projection methods.

We believe that an explicit (albeit approximate) solution for a bimolecular reaction could be beneficial as it augments the set of reactions for which an analytic solution is known. This could be used together with existing methods such as operator splitting and/or finite state projection for efficient solutions of larger reaction systems.

Numerical evaluation of eigenstates

Figure 1a displays the plots of (−λᵢ)^{1/2}, i = 0, 1, …, α for α = 1, 2, …, 100 when α = β and μ = 0.5, and Figure 1b gives a plot of the reciprocal of differences of consecutive eigenvalues for α = 100. We observe the following from these two plots.

  • 1.

    There is a zero eigenvalue and all other eigenvalues of the system are nonpositive. This signifies, according to Eq. 6, that as time progresses, the contributions of higher eigenstates diminish, as the system evolves to the equilibrium solution which is associated with the zero eigenvalue and corresponding eigenvector.

  • 2.

All the eigenvalues for a given α value interlace with those for the next α value. This follows from the interlacing theorem for bordered matrices (see p. 185 of Horn and Johnson10). The interlacing theorem implies that the jth nonzero eigenvalue of W ∈ ℝ^{N×N} is sandwiched between the jth and (j + 1)st eigenvalues of W ∈ ℝ^{(N+1)×(N+1)}. This implies that the solution of the CME for higher α values could be seen as a perturbation of the solutions for the preceding α values.

  • 3.

It is observed that for higher α values, the plots of (−λᵢ)^{1/2} approach straight lines. This motivates us to fit extrapolation curves for the ith eigenvalue that are quadratic in α for large α values. Moreover, Figure 1b shows a peak in the middle, indicating a dense distribution of eigenvalues near the median. It also signifies a transition in the behavior of the differences of consecutive eigenvalues, which are decreasing to the left of the peak and increasing to the right. This suggests the need for two different kinds of extrapolation curves to fit the eigenvalues before and after the peak. We perform these curve fittings later in Sec. 4E.

Figure 1.

Figure 1

(a) The plot of (−λᵢ)^{1/2} for low-lying eigenvalues suggests a quadratic fit of λᵢ as a function of α (performed in Sec. 4E), whereas the sharp spectral density peak in panel (b) shows an effective transition in the behavior on either side, corresponding to slow time scales to the left and fast time scales to the right of the peak, and delimits the possible domain in i of such a quadratic fit. (a) Plot of (−λᵢ)^{1/2}, i = 0, 1, …, α, for α = 1, 2, …, 100 and μ = 1/2. (b) Plot of the reciprocal of the difference in successive eigenvalues, for α = 100.

Once the eigenvalues of the matrix W are found the associated eigenvectors could be found by using these eigenvalues in Eq. 3 and by solving the resulting recurrence relation starting from the initial conditions p(−1) = 0 and p(0) = 1.
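As a minimal sketch of this recurrence solve (α = β = 20 and κ = 10 are assumed example values), consider the zero eigenvalue, for which the sign convention of λ in Eq. (3) is immaterial; marching forward from p(−1) = 0, p(0) = 1 reproduces the equilibrium eigenvector, which obeys detailed balance:

```python
# Sketch: solve the three-term recurrence, Eq. (3), for a given
# eigenvalue lam, starting from p(-1) = 0 and p(0) = 1 (alpha = beta).
import numpy as np

alpha, kappa, lam = 20, 10.0, 0.0
p = np.zeros(alpha + 1)
p_prev = 0.0            # p(-1) = 0
p[0] = 1.0              # p(0) = 1
for n in range(alpha):
    # kappa (n+1) p(n+1) = [(alpha-n)^2 + kappa n - lam] p(n)
    #                      - (alpha-n+1)^2 p(n-1)
    p_next = (((alpha - n) ** 2 + kappa * n - lam) * p[n]
              - (alpha - n + 1) ** 2 * p_prev) / (kappa * (n + 1))
    p_prev = p[n]
    p[n + 1] = p_next
p /= p.sum()            # normalize the equilibrium eigenvector
# detailed-balance check: p(1)/p(0) = alpha^2 / kappa
print(p[1] / p[0])
```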

Figures 2a, 2b, 2c are the plots of the first three eigenvectors on log scale for α = 50 and κ = α/2 = 25 evaluated from the recurrence relation. Experiments suggest accumulation of error to the right end of the plot. However, this error is very small in relative terms.

Figure 2.

Figure 2

Plot of first few eigenvectors for α = 50 on log scale, n ∈ {0, … 51}. (a) Pλ0(n). (b) Pλ1(n). (c) Pλ2(n). (d) Pλ3(n).

Continuum differential equation

In Secs. 3, 4, we present a novel approach to find approximations to the eigenvectors using perturbation theory. The basic idea is to construct a continuum differential equation from the recurrence relation, Eq. 3, and solve the resulting boundary value problem which happens to be a singular perturbation problem. For this, we first find the difference equation from the recurrence relation.

We define the forward difference operators:

$\Delta f(n) \equiv f(n+1) - f(n), \qquad \Delta^2 f(n) = \Delta[f(n+1) - f(n)] = f(n+2) - 2f(n+1) + f(n).$

These could be reversed to express shifted functions in terms of unshifted ones and forward difference operators:

$f(n+1) = \Delta f(n) + f(n) \quad\text{and}\quad f(n+2) = \Delta^2 f(n) + 2\Delta f(n) + f(n).$
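These shift identities are easy to sanity-check numerically; a minimal sketch with an arbitrary quadratic test sequence (chosen here only for illustration):

```python
# Sketch: verify f(n+1) = Delta f(n) + f(n) and
# f(n+2) = Delta^2 f(n) + 2 Delta f(n) + f(n) for a test sequence.
def f(n):
    return (n - 3.0) ** 2                 # arbitrary test sequence

n = 5
d1 = f(n + 1) - f(n)                      # Delta f(n)
d2 = f(n + 2) - 2 * f(n + 1) + f(n)       # Delta^2 f(n)
print(f(n + 1) == d1 + f(n), f(n + 2) == d2 + 2 * d1 + f(n))
```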

Starting with the following form of the recurrence relation in Eq. 3:

$\kappa(n+2)\,p_\lambda(n+2) - \left[(\alpha-(n+1))(\beta-(n+1)) + \kappa(n+1) - \lambda\right]p_\lambda(n+1) + (\alpha-n)(\beta-n)\,p_\lambda(n) = 0,$

and by using the above forward differences, we get

$\kappa\left[\Delta^2(n\,p_\lambda(n)) + 2\Delta(n\,p_\lambda(n)) + n\,p_\lambda(n)\right] - \left\{\Delta\left[(\alpha-n)(\beta-n)\,p_\lambda(n)\right] + \Delta\left[(\kappa n - \lambda)\,p_\lambda(n)\right] + (\kappa n - \lambda)\,p_\lambda(n)\right\} = 0.$

On collecting terms, we have the difference equation for pλ(n),

$\kappa\Delta^2(n\,p_\lambda(n)) + \kappa\Delta(n\,p_\lambda(n)) - \Delta\left[(\alpha-n)(\beta-n)\,p_\lambda(n)\right] + \lambda\Delta p_\lambda(n) + \lambda p_\lambda(n) = 0.$ (8)

To convert the above difference equation to an approximating differential equation in x ≡ n/α, we use the following approximations:

$\Delta f(n) = \frac{\Delta f(n)}{(n+1) - n} = \frac{\Delta f(n)}{\Delta n} \approx \frac{1}{\alpha}\frac{\partial f}{\partial x} = \alpha^{-1}\partial_x f(x), \qquad \Delta^2 f(n) \approx \frac{1}{\alpha^2}\frac{\partial^2 f}{\partial x^2} = \alpha^{-2}\partial_x^2 f(x), \qquad \mu = \kappa/\alpha \quad\text{and}\quad \eta = \lambda/\alpha.$

Thus,

$\Delta \to \alpha^{-1}\partial_x, \qquad \Delta^2 \to \alpha^{-2}\partial_x^2, \qquad n \to \alpha x, \qquad \kappa = \mu\alpha, \qquad \lambda = \eta\alpha.$

On applying the above in Eq. 8, we have

$\frac{\mu\alpha}{\alpha^2}\,\partial_x^2\left(\alpha x\,p_\eta(x)\right) + \frac{\mu\alpha}{\alpha}\,\partial_x\left(\alpha x\,p_\eta(x)\right) - \frac{1}{\alpha}\,\partial_x\left[\alpha^2(1-x)(\beta/\alpha - x)\,p_\eta(x)\right] + \frac{\eta\alpha}{\alpha}\,\partial_x p_\eta(x) + \eta\alpha\,p_\eta(x) \approx 0.$

Further simplification gives

$\mu\,\partial_x^2\left(x\,p_\eta(x)\right) + \mu\alpha\,\partial_x\left(x\,p_\eta(x)\right) - \alpha\,\partial_x\left[(1-x)(\beta/\alpha - x)\,p_\eta(x)\right] + \eta\,\partial_x p_\eta(x) + \eta\alpha\,p_\eta(x) \approx 0.$

Dividing both sides by α, we get

$(\mu/\alpha)\,\partial_x^2\left(x\,p_\eta(x)\right) + \mu\,\partial_x\left(x\,p_\eta(x)\right) - \partial_x\left[(1-x)(\beta/\alpha - x)\,p_\eta(x)\right] + (\eta/\alpha)\,\partial_x p_\eta(x) + \eta\,p_\eta(x) \approx 0.$

Take the special case β = α for simplicity, and we have

$(\mu/\alpha)\,\partial_x^2\left(x\,p_\eta(x)\right) + \mu\,\partial_x\left(x\,p_\eta(x)\right) - \partial_x\left[(1-x)^2\,p_\eta(x)\right] + (\eta/\alpha)\,\partial_x p_\eta(x) + \eta\,p_\eta(x) \approx 0.$

So a large-α approximation to the recursion equation is given by this ordinary differential equation (ODE)-eigenvalue problem with unknown eigenvalues η in its large-α limit:

$(\mu/\alpha)x\,p''(x) + \left[2(\mu/\alpha) + \mu x + (\eta/\alpha) - (1-x)^2\right]p'(x) + \left[\mu + 2(1-x) + \eta\right]p(x) = 0,$ (9)

together with the boundary conditions p(0) = O(1/α) and p(1) = O(1/α). Accurate extrapolation formulae for the eigenvalues of the recurrence relation will be very helpful in setting η so as to satisfy the boundary conditions. We will describe this in Sec. 4E.

STEADY STATE ANALYSIS

Ground state solution

The ground state solution corresponds to the lowest eigenvalue, λ = 0, or equivalently η = 0. The boundary value problem (BVP) in Eq. 9 turns into the following problem in this case:

$(\mu/\alpha)x\,p''(x) + \left[2(\mu/\alpha) + \mu x - (1-x)^2\right]p'(x) + \left[\mu + 2(1-x)\right]p(x) = 0,$

with p(0) = p(1) = O(1/α) which could be rearranged as

$\left[(\mu/\alpha)x\,p'(x)\right]' + \left[\left((\mu/\alpha) + \mu x - (1-x)^2\right)p(x)\right]' = 0.$

Integrating both sides with respect to x gives

$(\mu/\alpha)x\,p'(x) + \left[(\mu/\alpha) + \mu x - (1-x)^2\right]p(x) = C_1,$

which is a first order linear equation having the solution

$p_{\text{ground}}(x) = C\,e^{-x\alpha - \alpha(2x - x^2/2)/\mu}\,x^{-1+\alpha/\mu}\left[C_2 + \int_x^1 e^{\alpha y + 2\alpha y/\mu - \alpha y^2/(2\mu) - (\alpha/\mu)\log y}\,dy\right].$

If we use the boundary condition p(1) = 0, we get C₂ = 0, and thus

p ground (x)=Cexα2xα/μ+x2α/(2μ)x1+αμ×x1eαy+2αy/μαy2/(2μ)αlogy/μdy. (10)

The unknown constant C could be taken as a normalization constant so that the sum of the state probabilities in steady state is 1.
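A hedged numerical counterpart of this normalized ground state (α = β = 100 and κ = 50 are assumed example values): the same distribution can be generated from the detailed-balance ratios of the master equation, p(n+1)/p(n) = (α − n)(β − n)/(κ(n + 1)), and normalized to unit total probability:

```python
# Sketch: steady-state distribution of the trivalent reaction from
# detailed balance, normalized so the probabilities sum to one.
import numpy as np

alpha, kappa = 100, 50.0
p = np.ones(alpha + 1)
for n in range(alpha):
    p[n + 1] = p[n] * (alpha - n) ** 2 / (kappa * (n + 1))
p /= p.sum()
print(int(np.argmax(p)))   # peak sits near alpha * x_minus = 50
```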

Figures 3a, 3b plot the ground state solutions divided by their maximum values for α = 100 and α = 200, respectively, with μ = 0.5. The approximation errors are 6% and 4%, respectively.

Figure 3.

Figure 3

Ground state solutions by recurrence (red dashes) and approximation (blue dots) methods. (a) Ground state solution for α = 100. (b) Ground state solution for α = 200.

We observe that there is a strong peak in p_ground(x) at x ≈ x_−. As α → ∞, we expect a delta function in p(x) at x = x_− (which depends on μ).

BOUNDARY LAYER ANALYSIS OF HIGHER EIGENSTATES

In this section, we present a perturbation theoretic approach to find the approximate solution of the BVP, Eq. 9, in the case when η ≠ 0:

$(\mu/\alpha)x\,p''(x) + \left[2(\mu/\alpha) + \mu x + (\eta/\alpha) - (1-x)^2\right]p'(x) + \left[\mu + 2(1-x) + \eta\right]p(x) = 0.$ (11)

It should be noted that conversion of Eq. 9 into Sturm-Liouville form guarantees a total of α nonzero eigenvalues for the above problem, in strictly increasing order of magnitude. The eigenvector associated with the kth eigenvalue has k zeroes, so the right boundary condition changes sign alternately according to (−1)^k. Our numerical experiments with the recurrence relation, Eq. 3, also suggest similar behavior.

It is not possible to analytically solve this boundary value problem explicitly for nonzero values of η. Numerical methods also fail. However, perturbation methods especially the boundary layer approach could be used successfully to provide a closed form approximate solution to this problem. Indeed, this is a boundary layer problem because of the small coefficient μ/α in the highest order derivative for large α.

We apply boundary layer analysis to this problem, where we aim to find an approximation to the solution. The narrow regions in the interval [0, 1] where p′(x) and p″(x) change very rapidly in comparison to p(x) are called boundary layers, in contrast to the outer layer where this change is not so rapid. We can recast Eq. 11 in a standard form as

$\varepsilon p''(x) + f(x,\varepsilon)\,p'(x) + g(x)\,p(x) = 0, \qquad \left[p(0) = p(1) = O(1/\alpha)\right],$ (12)

where ε = μ/α → 0+,

$f(x,\varepsilon) = \frac{2(\mu/\alpha) + \mu x + (\eta/\alpha) - (1-x)^2}{x} = \frac{(2 + \eta/\mu)\varepsilon + \mu x - (1-x)^2}{x}, \quad\text{and}\quad g(x) = \frac{\mu + 2(1-x) + \eta}{x}.$

Following are some salient points of this boundary value problem:

  • 1.

    It has a regular singularity at x = 0.

  • 2.

    It is a singular perturbation problem with a turning point (where f(x, ε) = 0).

  • 3.

    Together with the small parameter ε, there is also a large parameter η (of the order of O(1/ε)) when higher eigenstates are under consideration.

The basic idea behind boundary layer analysis of a singularly perturbed differential equation (see Bender and Orszag8) is to generate a solution in the form

$p(x,\varepsilon) = P(x,\varepsilon) + \tilde{P}(X,\varepsilon),$ (13)

where X = x/ε is called stretched variable, P(x, ε) is called the outer solution, given as an asymptotic power series in ε as

$P(x,\varepsilon) = \sum_{j=0}^{\infty} P_j(x)\,\varepsilon^j,$

and P̃(X, ε) is called the boundary layer solution, given by a power series

$\tilde{P}(X,\varepsilon) = \sum_{j=0}^{\infty} \tilde{P}_j(X)\,\varepsilon^j.$

The boundary layer solution is asymptotically matched to the outer solution so that it satisfies appropriate initial conditions and decays exponentially to zero as X goes to infinity (cf. the review paper7).

For most practical purposes, finding the leading order approximation, i.e., the first few terms in the power series, provides a good enough approximation to the solution. In this paper we mainly concentrate on finding the leading outer solution P₀(x) and the boundary layer correction P̃₀(X) unless it is imperative to find higher order terms.

Outer layer solution

The outer layer is the region where terms involving the small parameter ε = μ/α could be neglected. We get the following differential equation on doing so:

$\left[\mu x + (\eta/\alpha) - (1-x)^2\right]p'(x) + \left[\mu + 2(1-x) + \eta\right]p(x) = 0.$

For the first few lower eigenvalues, the term η/α is O(ε) and can thus be neglected. We get the following on rearrangement:

$\frac{p'(x)}{p(x)} = -\frac{\mu + 2(1-x) + \eta}{\mu x - (1-x)^2} = -\frac{\mu + 2(1-x)}{\mu x - (1-x)^2} - \frac{\eta}{\mu x - (1-x)^2} = -\frac{\mu + 2(1-x)}{\mu x - (1-x)^2} + \frac{\eta}{(x - x_-)(x - x_+)}.$

On integration with respect to x on both sides, we get

$\ln|p(x)| = -\ln\left|\mu x - (1-x)^2\right| - \nu\ln|x - x_-| + \nu\ln|x - x_+| + \ln C.$

After combining terms and rearranging, we get

$p_{\text{out}} = C\,|x - x_+|^{\nu - 1}\,|x - x_-|^{-\nu - 1},$ (14)

where $x_\pm = \left((2+\mu) \pm \sqrt{\mu^2 + 4\mu}\right)/2$ are the zeroes of the denominator μx − (1 − x)² and

$\nu = \frac{\eta}{x_+ - x_-} = \frac{\eta}{\sqrt{\mu^2 + 4\mu}}.$

The constant C is found by applying the left or right boundary condition, providing two outer layer solutions to the left and right of x_− (obviously, in the case when x_− ∈ (0, 1)).

If we assume that limx→0pout(x) = limx→1pout(x) = 1/α, we get the following values for C:

$C = \begin{cases} \dfrac{1}{\alpha}\,x_-^{2\nu}, & x < x_-, \\[4pt] \dfrac{(-1)^{\text{state}}\mu}{\alpha}\,x_-^{\nu}, & x > x_-, \end{cases}$

where “state” refers to the index of the eigenvalue.

To conform to the calculations in Sec. 4B for the purpose of asymptotic matching, however, let us denote the left and right outer solution as the following:

$p_{\text{out},-} = A\exp\left[-\int_0^x \frac{\mu + 2(1-t) + \eta}{\mu t - (1-t)^2}\,dt\right], \quad x < x_-,$ (15)
$p_{\text{out},+} = B\exp\left[\int_x^1 \frac{\mu + 2(1-t) + \eta}{\mu t - (1-t)^2}\,dt\right], \quad x \ge x_-,$ (16)

where A = O(1/α), B = O(1/α). Simple calculation would suffice to verify that the above two outer solutions, Eqs. 15, 16, are the same as Eq. 14 if we take the two values of C into account.
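This equivalence can be cross-checked numerically. In the sketch below, the parameter values μ = 1/2 and η = 3/2 are assumptions chosen so that ν = η/√(μ² + 4μ) = 1; the integral form of the left outer solution, Eq. (15), is compared against the closed form of Eq. (14), both normalized at x = 0:

```python
# Sketch: compare exp(-integral) of Eq. (15) with the closed form
# |x - x_+|^(nu-1) |x - x_-|^(-nu-1) of Eq. (14) on 0 <= x < x_-.
import math
from scipy.integrate import quad

mu, eta = 0.5, 1.5
s = math.sqrt(mu * mu + 4 * mu)            # x_+ - x_-  (= 1.5 here)
x_minus, x_plus = (2 + mu - s) / 2, (2 + mu + s) / 2
nu = eta / s                               # = 1.0 for these values

def integrand(t):
    return (mu + 2 * (1 - t) + eta) / (mu * t - (1 - t) ** 2)

x = 0.3
I, _ = quad(integrand, 0.0, x)
ratio_integral = math.exp(-I)              # p_out(x)/p_out(0) via Eq. (15)

def closed(t):                             # Eq. (14) up to the constant C
    return abs(t - x_plus) ** (nu - 1) * abs(t - x_minus) ** (-nu - 1)

ratio_closed = closed(x) / closed(0.0)
print(ratio_integral, ratio_closed)        # the two ratios agree
```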

Boundary layer

The nature and complexity of the boundary layer largely depend on the coefficient functions f(x, ε) and g(x). From the discussion by Bender and Orszag:8 when f(x, ε) > 0 on [0, 1], there is a boundary layer at the left end point; when f(x, ε) < 0 on [0, 1], there is a boundary layer at the right end point; and when f(x, ε) changes sign, there can be internal boundary layers at the zeros of f(x, ε)—called turning points—or boundary layers at both end points, depending on the sign of the derivative of f(x, ε) at the turning point.7, 14, 15 We find that there is an internal boundary layer at x = x_− in the BVP, Eq. 12. Following the discussion by Bender and Orszag8 (pp. 455–458), there cannot be any boundary layer at either x = 0 or x = 1 in this case.

Recall the boundary layer problem in Eq. 12. The coefficient function f(x, ε) has a simple zero at x = x_−(ε). In a small neighborhood around x = x_−, we use the following linear approximations:

$f(x,\varepsilon) \approx \frac{x_+ - x_-}{x_-}\,(x - x_-) = \gamma(x - x_-) \quad\text{and}\quad g(x) \approx \frac{\mu + \eta + 2 - 2x_-}{x_-} = \frac{\eta + \sqrt{\mu^2 + 4\mu + 4\eta/\alpha}}{x_-} = \beta.$

It could be verified that β, γ, and ν have the following relationship for large α:

$\frac{\beta}{\gamma} - 1 = \frac{\eta}{x_+ - x_-} = \frac{\eta}{\sqrt{\mu^2 + 4\mu + 4\eta/\alpha}} = \nu.$

So in a small neighborhood of x_−, we have the simplified ODE

$\varepsilon p''(x) + \gamma(x - x_-)\,p'(x) + \beta\,p(x) = 0.$

Note that the value of ν is crucial. Numerical experiments suggest that for smaller eigenvalues ν takes near-integral values. The following solution, which is based on Bender and Orszag,8 assumes that ν is non-integral.

Clearly, γ = (x_+ − x_−)/x_− > 0. Following the discussion in Bender and Orszag8 (p. 456), we have only an internal boundary layer at x = x_−. Defining the stretched variable X by δX = x − x_−, the internal layer equation becomes

$\frac{\varepsilon}{\delta^2}\frac{d^2\tilde{P}}{dX^2} + \frac{f(x_- + \delta X,\,\varepsilon)}{\delta}\frac{d\tilde{P}}{dX} + \beta\tilde{P} = 0.$

Taking δ = √ε and using the linear approximations for f and g from above, we have

$\frac{d^2\tilde{P}}{dX^2} + \gamma X\frac{d\tilde{P}}{dX} + \beta\tilde{P} = 0.$

Using the Liouville transformation P̃ = e^{−γX²/4}W and then setting Z = √γ X, we obtain the Weber equation

$\frac{d^2W}{dZ^2} + \left(\frac{\beta}{\gamma} - \frac{1}{2} - \frac{Z^2}{4}\right)W = 0.$

The solution of the above equation is given in terms of the parabolic cylinder functions Dν(Z) and Dν(−Z) (when ν is non-integral and positive) as

$p_{\text{in}}(x) = e^{-\alpha\gamma(x - x_-)^2/(4\mu)}\left[C_1 D_\nu(Z(x)) + C_2 D_\nu(-Z(x))\right],$ (17)

where $Z(x) = (x - x_-)\sqrt{\alpha\gamma/\mu}$ and ν = β/γ − 1. The constants C₁ and C₂ are found by applying asymptotic matching with the outer solutions in Eqs. 15, 16 obtained in Sec. 4A.

Note that in the case when ν is a positive integer, Dν(Z) and Dν(−Z) are no longer linearly independent. This leads to an overdetermined system (see Bender and Orszag8 for details) and further correction is needed to find the correct structure of the solution.
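Where numerical values of the inner solution are needed, the parabolic cylinder functions can be evaluated with SciPy (an implementation choice made here, not one taken from the paper). The sketch below checks `scipy.special.pbdv` against the elementary special cases D₀(z) = e^{−z²/4} and D₁(z) = z e^{−z²/4}:

```python
# Sketch: evaluate the parabolic cylinder functions D_nu of Eq. (17)
# via scipy.special.pbdv, which returns the pair (D_v(z), D_v'(z)).
import math
from scipy.special import pbdv

z = 1.3
D0, _ = pbdv(0.0, z)
D1, _ = pbdv(1.0, z)
print(abs(D0 - math.exp(-z * z / 4)),
      abs(D1 - z * math.exp(-z * z / 4)))   # both differences are ~0
```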

Uniform approximation

The uniform approximation over the entire interval is generally defined as

$p_{\text{unif}}(x) = p_{\text{out}}(x) + p_{\text{in}}(x) - p_{\text{match}}(x),$

where p_out(x), p_in(x), and p_match(x) are the outer, inner, and matched solutions, respectively.

Uniform approximation for a problem related with Eq. 12 has been provided by Bender and Orszag8 (p. 458) by using asymptotic matching. A detailed and rigorous proof is provided by Wong and Yang16 for a slightly general formulation of the problem

$\varepsilon y'' + a(x)y' + b(x)y = 0, \qquad x_l \le x \le x_r,$

with boundary conditions $y(x_l) = A$ and $y(x_r) = B$.

On combining the above results in Eqs. 15, 16, 17 by the help of results provided in Bender and Orszag8 and Wong and Yang,16 we get the following uniform approximation:

$p_{\text{unif}}(x) = \frac{\Gamma(\nu)}{2\pi}\left(\frac{\alpha\gamma}{\mu}\right)^{(\nu+1)/2} e^{-\alpha\gamma(x - x_-)^2/(4\mu)}\,R(x),$ (18)

where

$R(x) = A\exp\left[\int_x^0\left(\frac{g(t)}{f(t,\varepsilon)} - \frac{\beta}{\gamma(t - x_-)}\right)dt\right]D_\nu(Z(x)) + B\exp\left[\int_x^1\left(\frac{g(t)}{f(t,\varepsilon)} - \frac{\beta}{\gamma(t - x_-)}\right)dt\right]D_\nu(-Z(x)).$

On finding the integrals above, we have

$R(x) = A\left(\frac{x_+ - x}{x_+}\right)^{\nu - 1}D_\nu(Z(x)) + B\left(\frac{x_+ - x}{x_+ - 1}\right)^{\nu - 1}D_\nu(-Z(x)),$

where $\gamma = (x_+ - x_-)/x_-$, $A = 1/\alpha$, and $B = (-1)^{\text{state}}/\alpha$ (state refers to the index of the eigenvalue). See Bender and Orszag8 for a detailed derivation.

Perturbation method for eigenvectors

The uniform approximation in Eq. 18 could be used to find the eigenvector associated with a given eigenvalue. We call this the perturbation method. The nth element of the eigenvector is found by evaluating Eq. 18 at x = n/α. For the sake of comparison with the recurrence method, an eigenvector is divided by the magnitude of its element of maximum value. Figure 4a shows a plot of the first eigenvector by the recurrence method (red dashes) and the perturbation method (blue dots) for α = 200. It is observed from several such plots that the eigenvector found by the perturbation method is slightly shifted leftward from the one found by the recurrence method.

Figure 4.

Figure 4

Plot of first few eigenvectors by recurrence (red dashes) and by perturbation (blue dots) methods. Panel (a) clearly shows a horizontal shift to the left which has reduced in the other figures. (a) Plot of shifted Pλ1 with α = 200. (b) Plot of Pλ1 with α = 200. (c) Plot of Pλ2 with α = 200. (d) Plot of Pλ3 with α = 200.

We believe that this shift results from our initial approximation (n + 1)/(α + 2) ≈ n/α = x that we used to get a manageable differential equation from the recurrence relation. In doing so we are disregarding higher order terms in 1/α, as can be seen below:

$\frac{n+1}{\alpha+2} = \frac{n+1}{\alpha}\cdot\frac{1}{1 + 2/\alpha} = \frac{n}{\alpha}\cdot\frac{1}{1 + 2/\alpha} + \frac{1}{2+\alpha} = x\left(1 - 2/\alpha + 4/\alpha^2 - \cdots\right) + \frac{1}{2+\alpha} = x \underbrace{- \frac{2x}{\alpha}\left(1 - 2/\alpha + 4/\alpha^2 - \cdots\right) + \frac{1}{2+\alpha}}_{\text{disregarded terms}}.$

It is also observed from experiments that this horizontal shift could be minimized by a small rightward shift in the values of x in evaluating Eq. 18. The expression x = (n − 0.2k)/α − 7.0k2 for the nth element, where k is the index of the eigenvalue, incorporates such an empirical shift.

Table 1 gives the relative error in the calculations of the first five eigenvectors for different α values by the perturbation method. The error is calculated by

$\text{err}(\alpha, k) = \frac{\left\| W P_{\lambda_k} - \lambda_k P_{\lambda_k} \right\|}{|\lambda_k|\,\left\| P_{\lambda_k} \right\|} \times 100\%.$ (19)

In every row of the table, the first line gives the error with the empirical shift and the second line, in parentheses, without any shift. It is observed that the empirical shift leads to reduction in error. Better accuracy could be achieved by taking more terms in the shift.
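A sketch of the error measure of Eq. (19), applied here to a numerically exact eigenpair (α = β = 30 and κ = 15 are assumed example values, and the eigenpair comes from a dense eigensolver rather than the perturbation method), so the resulting error is at round-off level:

```python
# Sketch: err(alpha, k) = ||W P - lambda P|| / (|lambda| ||P||) * 100%
# for the first nonzero eigenvalue of the generator W.
import numpy as np

alpha, kappa = 30, 15.0
N = alpha + 1
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = -((alpha - i) ** 2 + kappa * i)
    if i > 0:
        W[i, i - 1] = (alpha - i + 1) ** 2
    if i < N - 1:
        W[i, i + 1] = kappa * (i + 1)
lam, V = np.linalg.eig(W)
k = int(np.argsort(-lam.real)[1])        # first nonzero eigenvalue
v = V[:, k]
err = (np.linalg.norm(W @ v - lam[k] * v)
       / (abs(lam[k]) * np.linalg.norm(v)) * 100.0)
print(err)                               # a tiny fraction of one percent
```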

Table 1.

Relative error in percentage points, in accordance with Eq. 19, in the calculation of Pλk by the perturbation method. The first line in every row is the error after an empirical shift in the x ≡ n/α axis, which shows an apparent reduction from the unshifted error shown on the second line in parentheses.

α Pλ1 Pλ2 Pλ3 Pλ4 Pλ5
100 3.12351 2.24851 2.41778 2.89429 3.4692
  (9.50364) (10.2087) (11.2257) (12.1963) (13.0851)
200 2.47366 1.68757 1.48142 1.48756 1.60785
  (6.73266) (7.29186) (8.06775) (8.81443) (9.50732)
400 1.89188 1.36369 1.19517 1.13698 1.13557
  (4.76492) (5.18217) (5.75141) (6.30149) (6.8153)
800 1.39515 1.04163 0.929837 0.885265 0.87155
  (3.37075) (3.67359) (4.08347) (4.48039) (4.85234)
1600 1.00789 0.766284 0.691778 0.661729 0.650777
  (2.38398) (2.60089) (2.89334) (3.17685) (3.44294)

The plots in Figures 4b, 4c, 4d compare the eigenvectors found by solving the recurrence relation to those obtained by the perturbation method with empirical shift in colors red and blue, respectively, for the first three higher eigenstates. We can see that the plots by the perturbation calculation agree well with those by recurrence calculation, once the empirical shift is incorporated in the perturbation method.

The plots in Figure 5 illustrate very schematically the possible application of our perturbation theory eigenvector results to compute time-varying CME solutions, according to Eq. 6 or the pseudo-inverse approximation thereof, Eq. 7, in Sec. 2B. An initial condition consisting of a Gaussian probability distribution is evolved forward in time using the 10 lowest-lying eigenstates (out of a total of α = 100), using (a) the exact solutions (i.e., arbitrary-precision, without use of perturbation theory) from the recurrence equation or, alternatively, (b) their perturbation theory approximations. There is essentially no difference in solution quality, and both solutions converge quickly in simulated time to the equilibrium distribution given by the zero-eigenvalue eigenvector, which can be accurately computed by perturbation theory (Figure 3a). Errors such as small negative probabilities at early times in both histories are due to the projection down to 10 out of 100 eigenvectors (a), rather than to errors in the perturbation theory approximation of those eigenvectors (b). Including all 100 exact eigenvectors eliminates these small constraint violations but otherwise does not substantially change the results (plots not shown).

Figure 5.

Figure 5

Comparison of the state probabilities calculated (a) by recurrence and (b) by perturbation method for α = 100, κ = 50, and number of initial eigenstates = 10. Initial P(t = 0) Gaussian distribution has parameters mean = 42, standard deviation = 8. Initial distribution is projected to the closest-fitting element of the subspace spanned by 10 low-lying eigenvalues to produce an initial approximation (essentially that shown at time t = 0.001); the weights of the linear expansion each then evolve under CME by multiplication by exp(tλ). Small numerical errors visible as negative values are essentially due to the projection to 10 eigenvectors (panel (a)) rather than the perturbation theory approximation of eigenvectors (panel (b)), whose constraint violations are no greater than those in (a). Retaining all 100 eigenvectors computed by recursion relation eliminates the negative values (data not shown). This observed error is less than 0.6% in terms of relative unsigned area of the curves below zero vs above zero (or total) for α = 100 and α = 200 and for early times such as 0.001, and rapidly approach zero thereafter. Thus, numerical errors visible as negative values are small, rapidly diminishing, and essentially due to the projection to the dominant subspaces in the problem rather than to approximations made in the perturbation theory calculations. (a) Plot of p(n, t) by recurrence method. (b) Plot of p(n, t) by perturbation method.

The Gaussian initial condition illustrated in Figure 5 stays away from the extreme ends of the interval of allowed values of n. Approaching these ends would require more terms in perturbation theory, which we believe would require a deeper investigation and understanding of the spectrum shown in Figure 1b to the left of the spectral density peak, i.e., at longer time scales, than we have yet undertaken. (Indeed, it is possible that the peak or scale-transition eigenvalues correspond roughly to those eigenvectors whose effective support first reaches the ends of the interval.) Improvements in this direction could yield competitive predictive CME-solving algorithms. For example, initial conditions given by globally low-order moment approximations (e.g., Gaussians) could be evolved forward accurately as in the foregoing example, perhaps within an operator-splitting framework.

Extrapolation of eigenvalues

We have seen in Secs. 2 and 4D that our approach to solving the BVP by the perturbation method relies on the values of the parameters α and μ. There is a further parameter, η = λ/α, that is unknown, and we need a reliable method to evaluate it. The most direct idea is to calculate the eigenvalues and then use ηi = λi/α; however, doing so for large values of α (as α → ∞) is not always practical. For this reason, we have used extrapolation to find the ηi corresponding to the first few eigenstates.

In the foregoing examples, we have used η values found by extrapolating the collection of (α, ηi) points obtained from the matrix method of Sec. 2B. The motivation for choosing a particular functional form for the extrapolation comes from the plots of ηi as a function of α shown in Figures 6a and 6b, and from Figure 1. We observe that as α → ∞, η1 and η5 approach horizontal lines. This behavior holds for the low-lying eigenvalues but not for the higher ones: for the last few eigenvalues, η is a linear function of α with positive slope, as in Figure 6c.

Figure 6.

Plot of some η values as a function of α.

The extrapolating functions suggested by such plots (Figures 6a, 6b, and 1) fall mainly into two types: those containing terms in 1/α, and those containing inverse exponential powers of α. After several experiments we have chosen the following form of the extrapolating function:

ϕ(u, α) = au + bu² + cu/α + d/α + e, (20)

for η1 and in general

ψ(u, α, i) = aiu + bi/α + ciu/α + di²u/α + eiu²/α, (21)

for ηi, where u = √(μ² + 4μ), the coefficients a, b, c, d, e are extrapolation parameters, and i is the index of the eigenvalue.

Fitting these forms to the ηi values over the dataset of triplets (α, u, i), with α = 100, 101, …, 800 and u corresponding to μ = 0.1, 0.2, …, 1.0, we find the following extrapolation functions to best describe the observed data (see Table 2):

ϕ(u, α) = 1.00137u − 0.00053u² + 0.0254567u/α − 0.388156/α − 0.00053, (22)
ψ(u, α, i) = 1.00011iu − 3.87967i/α + 2.97136iu/α − 0.424728i²u/α − 0.254182iu²/α. (23)
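Because ψ in Eq. (21) is linear in its parameters a, …, e, the fit over the (α, u, i) dataset reduces to ordinary linear least squares on a five-column design matrix. The sketch below is our own illustration, using synthetic η values generated from the published coefficients of Eq. (23) in place of the matrix-method data.

```python
import numpy as np

# Sketch: fitting the extrapolation form psi(u, alpha, i) of Eq. (21) by
# linear least squares. Synthetic eta values are generated from the
# coefficients of Eq. (23); in practice the eta_i would come from the
# matrix method of Sec. 2B.
coeffs_true = np.array([1.00011, -3.87967, 2.97136, -0.424728, -0.254182])

alphas = np.arange(100, 801)               # alpha = 100, 101, ..., 800
mus = np.arange(0.1, 1.05, 0.1)            # mu = 0.1, 0.2, ..., 1.0
idx = np.array([1, 2, 3])                  # eigenvalue indices i

A_, M, I = np.meshgrid(alphas, mus, idx, indexing="ij")
U = np.sqrt(M**2 + 4*M)                    # u = sqrt(mu^2 + 4 mu)

# One column per basis function of Eq. (21).
X = np.column_stack([
    (I * U).ravel(),                       # a * i u
    (I / A_).ravel(),                      # b * i / alpha
    (I * U / A_).ravel(),                  # c * i u / alpha
    (I**2 * U / A_).ravel(),               # d * i^2 u / alpha
    (I * U**2 / A_).ravel(),               # e * i u^2 / alpha
])
eta = X @ coeffs_true                      # synthetic "observed" eta values

coeffs_fit, *_ = np.linalg.lstsq(X, eta, rcond=None)
assert np.allclose(coeffs_fit, coeffs_true, atol=1e-8)
```

On real data the residuals of this fit, rather than being zero, give directly the relative errors reported in Table 2.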

Table 2.

Relative error in the calculation of η by extrapolation function ψ in Eq. 23 for the first three nonzero eigenvalues.

α	i	μ = 0.4	μ = 0.5	μ = 0.6
50	1	8.5241 × 10−5	5.8831 × 10−5	2.3497 × 10−5
 	2	1.3308 × 10−4	8.3248 × 10−5	2.4823 × 10−5
 	3	1.9096 × 10−4	1.1252 × 10−4	2.5003 × 10−5
100	1	7.2466 × 10−6	4.3603 × 10−6	5.2790 × 10−6
 	2	1.1451 × 10−5	4.3603 × 10−6	1.2423 × 10−5
 	3	1.5379 × 10−5	9.1540 × 10−6	2.4532 × 10−5
150	1	1.7316 × 10−5	1.9517 × 10−5	1.4465 × 10−5
 	2	2.5875 × 10−5	3.1560 × 10−5	2.4532 × 10−5
 	3	2.4502 × 10−5	3.6008 × 10−5	2.0395 × 10−5
200	1	2.3825 × 10−5	2.4722 × 10−5	1.6030 × 10−5
 	2	2.7273 × 10−5	3.2416 × 10−5	2.1337 × 10−5
 	3	2.4502 × 10−5	3.6008 × 10−5	2.3095 × 10−5

We remarked in Sec. 4B that the uniformly asymptotic expression derived there is valid only when the parameter ν takes non-integer values. Nevertheless, numerical observations suggest that the limiting values of ν as α → ∞ can come very close to integers. Table 3 and Figure 7 compare the observed η and ν values with their extrapolated values for the first few eigenvalues at α = 200 and μ = 0.5; the ν values are seen to lie close to the integer indices i of the eigenvalues.

Table 3.

First 10 values for η and ν from actual calculations and from extrapolation for α = 200 and μ = 0.5.

Index, i ηi(observed) ηi(extrapolation) νi(observed) νi(extrapolation)
1 1.49833 1.497 0.992299 0.991426
2 2.98997 2.98763 1.96734 1.96582
3 4.4749 4.47189 2.92564 2.92372
4 5.95308 5.94978 3.86771 3.86561
5 7.42451 7.42129 4.79401 4.792
6 8.88914 8.88644 5.70499 5.70332
7 10.3469 10.3452 6.60108 6.60002
8 11.7979 11.7976 7.4827 7.48252
9 13.242 13.2437 8.35023 8.35122
10 14.6792 14.6833 9.20405 9.20649

Figure 7.

Accuracy of the extrapolation method for finding η values: for low-lying eigenvalues the extrapolation error is small. (a) Comparison of the first few η values found by ψ extrapolation (blue dots) and by numerical calculation (red dots). (b) Plot of the error between the η values obtained by ψ extrapolation on (α, μ, i) and the actual values.

CONCLUDING REMARKS AND DISCUSSION

In this paper we have presented a stochastic analysis of the trivalent reaction A + B ↔ C. Unlike the deterministic analysis, which can only describe the ground (equilibrium) state of the reaction, stochastic analysis of the master equation captures not only the ground state but the higher eigenstates as well. Our approach uses the boundary layer structure of the singularly perturbed boundary value problem that arises as a continuum limit of the difference-differential master equation. Numerical experiments suggest that, for the lower nonzero eigenstates, the asymptotic approximation obtained by boundary layer analysis conforms well to the actual eigenvectors found by recursion from the difference equation. Taking more terms in the approximation would yield better accuracy, but it also further complicates the calculations and the asymptotic matching of the outer and inner solutions.

The analysis in this paper has been carried out assuming that the initial numbers of molecules of both A and B are the same, that is, α = β. A very similar approach could be applied in the more general case when (for example, and without loss of generality) α ⩾ β, and to other bimolecular reactions such as A + B ↔ C + D, A + B ↔ 2C, and 2A ↔ C.

The perturbation theoretic approach that we have developed for the calculation of approximate eigenvectors is so far accurate only for the first few eigenvalues. Its accuracy is increased if we adopt an empirical shift in the molecule-number axis, possibly representing higher-order effects that we have not calculated. To complete the method, the perturbation theory will need to be extended to higher eigenstates as well. One essential question that should be addressed is the orthonormalization of the eigenvectors, because the state probability function p(k, t) in Eq. 6 is expressed in terms of the rows of the orthogonal matrix Q. Complete orthonormalization is possible only when all the eigenvectors of the system have been calculated.
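One standard device for obtaining orthonormal eigenvectors in such birth-death problems (a general technique, not necessarily the authors' construction) exploits detailed balance: the similarity transform S = D^(−1/2) A D^(1/2), with D = diag(π) built from the equilibrium distribution, is symmetric, so its eigenvectors are orthonormal automatically. A minimal sketch, with assumed propensities for A + B ↔ C:

```python
import numpy as np

# Sketch (a standard symmetrization, not necessarily the authors' method):
# a birth-death CME generator satisfies detailed balance, so
# S = D^{-1/2} A D^{1/2}, D = diag(pi), is symmetric; its eigenvectors
# are orthonormal "for free", giving an orthogonal matrix analogous to Q.
alpha, kf, kr = 60, 1.0, 30.0
n = np.arange(alpha + 1)
fwd = kf * (alpha - n) ** 2            # assumed forward propensity A+B -> C
rev = kr * n                           # reverse propensity C -> A+B

A = np.diag(-(fwd + rev)) + np.diag(fwd[:-1], -1) + np.diag(rev[1:], 1)

# Equilibrium pi from the detailed-balance recursion pi_{n+1}/pi_n = fwd_n/rev_{n+1},
# accumulated in log space for numerical stability.
logpi = np.concatenate([[0.0], np.cumsum(np.log(fwd[:-1]) - np.log(rev[1:]))])
pi = np.exp(logpi - logpi.max())
pi /= pi.sum()

d = np.sqrt(pi)
S = A * d[None, :] / d[:, None]        # S_{mn} = A_{mn} * sqrt(pi_n / pi_m)
assert np.allclose(S, S.T)             # symmetric, so eigh applies

lam, W = np.linalg.eigh(S)             # W orthogonal: W.T @ W = I
Q = W * d[:, None]                     # columns D^{1/2} w_k are eigenvectors of A
assert np.allclose(A @ Q, Q * lam, atol=1e-6)
```

Since W is orthogonal by construction, no separate Gram-Schmidt pass over the computed eigenvectors is needed in this formulation.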

The major motivation for developing the perturbation theoretic approach for large N, the maximum number of molecules in the system, is the following. For large N we are able to approximate the steady-state probability distribution over states, as well as the slowest eigenvectors (those that dominate convergence to equilibrium after an initial transient), with an amount of computation that is essentially constant in N. Our method achieves this goal. That is much better scaling than that obtained by other recursive eigenvalue calculations of which we are aware, and it is the general advantage of a 1/N-expansion perturbation theory approach.

We also discovered empirically, using arbitrary-precision arithmetic, a peak in the density of eigenvalues (Figure 1b) that marks an effective transition between slower and faster time scales. While we do not fully understand the reason for this phenomenon, it may provide an important advantage to future multiple-time-scale approaches in solving the master equation for small reaction networks by delimiting the set of eigenvectors required for an accurate and computationally feasible slow time scale approximation.

ACKNOWLEDGMENTS

E.M. would like to acknowledge useful discussions and preliminary calculations with Carl Bender. This work has been supported in part by National Institutes of Health (NIH) R01 GM086883 and P50 GM76516.

References

  1. Lee C. H. and Kim P., J. Math. Chem. 50, 1550–1569 (2012). 10.1007/s10910-012-9988-7 [DOI] [Google Scholar]
  2. Darvey I. G., Ninham B. W., and Staff P. J., J. Chem. Phys. 45(6), 2145–2155 (1966). 10.1063/1.1727900 [DOI] [Google Scholar]
  3. Rényi A., Magy. Tud. Akad. Mat. Fiz. Tud. Oszt. Kozl. 2, 93 (1953). [Google Scholar]
  4. McQuarrie D. A., J. Appl. Probab. 4(3), 413–478 (1967). 10.2307/3212214 [DOI] [Google Scholar]
  5. Laurenzi I. J., J. Chem. Phys. 113(8), 3315–3322 (2000). 10.1063/1.1287273 [DOI] [Google Scholar]
  6. Gillespie D. T., J. Phys. Chem. 81, 2340–2361 (1977). 10.1021/j100540a008 [DOI] [Google Scholar]
  7. O’Malley R., SIAM Rev. 50(3), 459–482 (2008). 10.1137/060662058 [DOI] [Google Scholar]
  8. Bender C. and Orszag S., Advanced Mathematical Methods for Scientists and Engineers: Asymptotic Methods and Perturbation Theory (Springer, New York, 1999). [Google Scholar]
  9. Golub G. H. and Van Loan C. F., Matrix Computations (Johns Hopkins University Press, 1996). [Google Scholar]
  10. Horn R. A. and Johnson C. R., Matrix Analysis (Cambridge University Press, 1999). [Google Scholar]
  11. Jahnke T. and Altintan D., “Efficient simulation of discrete stochastic reaction systems with a splitting method,” BIT 50(4), 797–822 (2010). 10.1007/s10543-010-0286-0 [DOI] [Google Scholar]
  12. Jahnke T. and Huisinga W., “Solving the chemical master equation for monomolecular reaction systems analytically,” J. Math. Biol. 54(1), 1–26 (2007). 10.1007/s00285-006-0034-x [DOI] [PubMed] [Google Scholar]
  13. Munsky B. and Khammash M., “The finite state projection algorithm for the solution of the chemical master equation,” J. Chem. Phys. 124, 044104 (2006). 10.1063/1.2145882 [DOI] [PubMed] [Google Scholar]
  14. O’Malley R. Jr., Singular Perturbation Methods for Ordinary Differential Equations (Springer-Verlag, New York, 1991). [Google Scholar]
  15. Kevorkian J. and Cole J. D., Multiple Scale and Singular Perturbation Methods (Springer, New York, 1996). [Google Scholar]
  16. Wong R. and Yang H., J. Comput. Appl. Math. 144, 301–323 (2002). 10.1016/S0377-0427(01)00569-6 [DOI] [Google Scholar]
