BMC Bioinformatics. 2024 Oct 22;24(Suppl 1):493. doi: 10.1186/s12859-024-05939-8

Translation regulation by RNA stem-loops can reduce gene expression noise

Candan Çelik 1,2, Pavol Bokes 1,3, Abhyudai Singh 4
PMCID: PMC11515661  PMID: 39438826

Abstract

Background

Stochastic modelling plays a crucial role in comprehending the dynamics of intracellular events in various biochemical systems, including gene-expression models. Cell-to-cell variability arises from the stochasticity or noise in the levels of gene products such as messenger RNA (mRNA) and protein. The sources of noise can stem from different factors, including structural elements. Recent studies have revealed that the mRNA structure can be more intricate than previously assumed.

Results

Here, we focus on the formation of stem-loops and present a reinterpretation of previous data, offering new insights. Our analysis demonstrates that stem-loops that restrict translation have the potential to reduce noise.

Conclusions

In conclusion, we investigate a structured/generalised version of a stochastic gene-expression model, wherein mRNA molecules can be found in one of their finite number of different states and transition between them. By characterising and deriving non-trivial analytical expressions for the steady-state protein distribution, we provide two specific examples which can be readily obtained from the structured/generalised model, showcasing the model’s practical applicability.

Keywords: Stochastic gene expression, Master equation, Stochastic simulation

Background

Biochemical processes such as gene expression are inherently subject to random fluctuations that lead to noise in the number of constituents [1]. Quantifying the dynamics and the noise in such stochastic processes is an active topic in several research areas. Under the simplest assumptions, gene expression is described as a two-step stochastic process comprising transcription and translation, which together determine the levels of gene products. In the former, RNA polymerase enzymes produce mRNA molecules; in the latter, ribosomes synthesise protein. Because of this two-step structure, the description is often referred to as the (classical) two-stage gene-expression model. How gene-expression regulation affects the level of gene products such as mRNA and protein is a question of interest.

The contributions to gene expression noise give rise to cell-to-cell variability in the mRNA and protein levels [2–9]. The noise emerges from different sources, namely intrinsic and extrinsic noise [10, 11]; yet, structural elements such as stem-loops can also contribute to noise by binding to an untranslated region of mRNA [12, 13]. The untranslated regions of mRNAs often contain these stem-loops that can reversibly change configurations making individual mRNAs translationally active/inactive [14].

From a mathematical perspective, the dynamics of gene-expression mechanisms can be described in deterministic and stochastic settings by means of ordinary differential equations (ODEs) and a master equation formulation, respectively. Hybrid models have also been proposed as a combination of the two [15–17]. Only a few of these studies provide an explicit solution to the (classical) two-stage gene-expression model [18, 19]; most are based on Monte Carlo simulations, which are usually computationally expensive.

In recent decades, the (classical) two-stage model of gene expression has been extensively utilised to elucidate the underlying mechanisms of stochastic processes in living cells [19–22]. In particular, it has been extended by the regulation of transcription factors, which affect gene expression by modulating the binding rate of RNA polymerase [23]. Specifically, the stochastic dynamics of the classical two-stage model of gene expression is described by the reaction scheme [18, 24]

$$\emptyset \xrightarrow{\lambda^m} \mathrm{mRNA}, \quad \mathrm{mRNA} \xrightarrow{\gamma^m} \emptyset, \quad \mathrm{mRNA} \xrightarrow{\lambda^p} \mathrm{mRNA} + \mathrm{protein}, \quad \mathrm{protein} \xrightarrow{\gamma^p} \emptyset, \tag{1}$$

where $\lambda^m$ is the mRNA production rate, $\lambda^p$ is the protein translation rate, and $\gamma^m$ and $\gamma^p$ are the decay rate constants of mRNA and protein species, respectively. Here and henceforth, $m$ and $p$ in the superscript indicate the mRNA and protein species, respectively.
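As an illustration of how reaction scheme (1) can be explored numerically, the following is a minimal sketch of a Gillespie-type stochastic simulation. The function name, the burn-in fraction, and the choice to report time-averaged statistics are assumptions made for this example and are not taken from the paper.

```python
import random


def gillespie_two_stage(lam_m, gam_m, lam_p, gam_p, t_end, seed=0):
    """Simulate scheme (1); return time-averaged (mRNA mean, protein mean, protein Fano factor).

    Hypothetical helper for illustration: the first 10% of the run is
    discarded as burn-in, and each visited state is weighted by its holding time.
    """
    rng = random.Random(seed)
    t, m, n = 0.0, 0, 0
    burn_in = 0.1 * t_end
    weight = sum_m = sum_n = sum_n2 = 0.0
    while t < t_end:
        # Propensities: transcription, mRNA decay, translation, protein decay.
        rates = (lam_m, gam_m * m, lam_p * m, gam_p * n)
        total = sum(rates)
        dt = rng.expovariate(total)  # exponential waiting time to the next reaction
        if t >= burn_in:
            weight += dt
            sum_m += m * dt
            sum_n += n * dt
            sum_n2 += n * n * dt
        t += dt
        # Pick which reaction fires, proportionally to its propensity.
        r = rng.random() * total
        if r < rates[0]:
            m += 1
        elif r < rates[0] + rates[1]:
            m -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            n += 1
        elif n > 0:
            n -= 1
    mean_m = sum_m / weight
    mean_n = sum_n / weight
    variance_n = sum_n2 / weight - mean_n ** 2
    return mean_m, mean_n, variance_n / mean_n
```

For example, `gillespie_two_stage(2.0, 1.0, 5.0, 0.5, t_end=10000.0)` returns estimates that fluctuate from seed to seed, which is one reason why closed-form results for the moments and distributions are valuable.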

As a generalisation of the (classical) two-stage model, some studies in the literature consider a set of multiple gene states and investigate the dynamics of stochastic transitions among these states [25–28]. Here, we study a structuration/generalisation of the classical two-stage gene-expression model (1), which takes into account multiple mRNA states. More specifically, after being transcribed, mRNA molecules are considered to be transitioning among their different states at constant reaction rates. Subsequently, the nascent mRNA molecule is translated, and protein is degraded. The schematic of the reactions describing this system is given by the following set of chemical reactions:

$$\begin{aligned}
&\emptyset \xrightarrow{\lambda_i^m} \mathrm{mRNA}_i \xrightarrow{\gamma_i^m} \emptyset, && i=1,\ldots,K,\\
&\mathrm{mRNA}_i \xrightarrow{q_{ij}} \mathrm{mRNA}_j, && i,j=1,\ldots,K,\ i\neq j,\\
&\mathrm{mRNA}_i \xrightarrow{\lambda_i^p} \mathrm{mRNA}_i + \mathrm{protein}, && i=1,\ldots,K,\\
&\mathrm{protein} \xrightarrow{\gamma^p} \emptyset,
\end{aligned} \tag{2}$$

where $\lambda_i^m$ and $\gamma_i^m$ are the production and decay rates for an mRNA molecule in the $i$-th state, respectively. The term $q_{ij}$, $i\neq j$, denotes the mRNA transition rate from state $i$ to state $j$; $\lambda_i^p$ and $\gamma^p$ are the protein translation and decay rates, respectively. The subscript is reserved for multiple mRNA states. All model parameters and their biological meaning are listed in Table 1.

Table 1. Model parameters and their biological meaning used in all model variations

Parameter                      Meaning
$\lambda^m$                    mRNA production rate
$\gamma^m$                     mRNA decay rate
$\lambda^p$                    Protein translation rate
$\gamma^p$                     Protein decay rate
$\lambda_i^m$                  mRNA production rate in $i$-th state
$\gamma_i^m$                   mRNA decay rate in $i$-th state
$q_{ij}$                       mRNA transition rate from state $i$ to $j$
$\lambda_i^p$                  Protein translation rate in $i$-th state
$\gamma_{\mathrm{eff}}^m$      Effective mRNA decay rate
$K$                            The number of mRNA states

The chemical reactions in (2) correspond to mRNA transcription and decay, transitions among multiple mRNA states, protein translation, and protein decay, respectively. Throughout this paper, we refer to model (2) as the generalised two-stage model, by which we mean that the model is treated as an extension of the classical two-stage model concerning the structuration of mRNA. We note that the (classical) two-stage model has been extended in this manner by the inclusion of an mRNA activation/inactivation loop recently [29]; however, here we generalise the results of [29] for a more comprehensive model. Additionally, we reanalyse published data on the influence of RNA stem loops on gene expression noise and explore the influence of kinetic rate parameters on predicted noise reduction ratios. From a biologically relevant standpoint, a similar model involving multiple mRNA states has recently been studied to quantify protein variability arising from mRNA-microRNA interactions [30].

In what follows, we present two specific examples which can be obtained from the structured/generalised model (2): the mRNA inactivation loop model and the multiphasic mRNA model. These models are given by the reactions

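$$\emptyset \underset{\gamma_1^m}{\overset{\lambda^m}{\rightleftharpoons}} \mathrm{mRNA}, \quad \mathrm{mRNA} \underset{q_{21}}{\overset{q_{12}}{\rightleftharpoons}} \mathrm{imRNA}, \quad \mathrm{imRNA} \xrightarrow{\gamma_2^m} \emptyset, \quad \mathrm{mRNA} \xrightarrow{\lambda^p} \mathrm{mRNA} + \mathrm{protein}, \quad \mathrm{protein} \xrightarrow{\gamma^p} \emptyset, \tag{3}$$

where the abbreviation imRNA stands for an inactive mRNA molecule, and by

$$\begin{aligned}
&\emptyset \xrightarrow{\lambda^m} \mathrm{mRNA}_1 \xrightarrow{K\gamma_{\mathrm{eff}}^m} \mathrm{mRNA}_2 \xrightarrow{K\gamma_{\mathrm{eff}}^m} \cdots \xrightarrow{K\gamma_{\mathrm{eff}}^m} \mathrm{mRNA}_K \xrightarrow{K\gamma_{\mathrm{eff}}^m} \emptyset,\\
&\mathrm{mRNA}_i \xrightarrow{\lambda^p} \mathrm{mRNA}_i + \mathrm{protein}, \quad i=1,\ldots,K,\\
&\mathrm{protein} \xrightarrow{\gamma^p} \emptyset,
\end{aligned} \tag{4}$$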

respectively. Here, the reaction system (3) accounts for the activation/inactivation of an mRNA molecule modelled by involving a pair of reversible chemical reactions. In (4), an mRNA molecule is considered to move through its finite lifetime stages, which corresponds to the ageing of an mRNA. For a detailed discussion of these models, we refer the reader to Sections The mRNA inactivation loop model and Multiphasic mRNA lifetime and also the reference [29].

This paper is structured as follows. The core part of this study is given in Section Methods, where the generalised model is introduced and analysed in depth. Specifically, in Section Model formulation, a brief review of the classical two-stage gene-expression model is given in stochastic settings; the underlying chemical master equation (CME) is transformed into a partial differential equation (PDE) for the generating function. Then, the main focus of this paper, which is the introduction of a generalisation of the two-stage model, along with its corresponding CME and PDE, is presented. In Section Solution, a power series solution to the PDE is obtained. In Section Marginal distributions and moments, not only are the marginal mRNA and protein distributions obtained using the non-trivial analytical formula for the generating function, but the moments of the protein distributions are also determined by utilising factorial cumulants. The protein distribution is thereby recovered. Section Results pertains to data analysis and its interpretation, and summarises some of the key results of our mathematical analysis. The paper is concluded in Section Conclusions.

Results

The motivation for our mathematical analysis stems from a recent experimental study [12] on the influence of RNA stem loops on gene expression noise. Stem loops appear when two palindromic sequences on the chain of nucleic acids align and form hydrogen bonds. The aligned palindromic sequences then form the “stem” and the nucleic acids in between form the “loop” of a stem loop. Because of this shape, a stem loop is also called a “hairpin loop”.

The authors of [12] constructed several variants of a gene encoding for a fluorescent reporter protein. Although the constructs encode for the same reporter protein, they differ in palindromic sequences in the untranslated region at the 5’ end of the gene (5’UTR). The formation of a stem loop interferes with translation: the higher the stability of a stem loop, the greater the interference and the lower the mean protein level. The authors also show that this is associated with an increase in the coefficient of variation (CV).

Previous theoretical studies indicate that different noise metrics can lead to different interpretations of the effects of a particular mechanism on gene expression noise. The most common are the squared coefficient of variation and the Fano factor defined by

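$$\mathrm{CV}^2=\frac{\langle P^2\rangle-\langle P\rangle^2}{\langle P\rangle^2}, \qquad F=\frac{\langle P^2\rangle-\langle P\rangle^2}{\langle P\rangle},$$

where $P$ stands for the reporter protein and $\langle\cdot\rangle$ are the averaging brackets. In Fig. 1, in addition to showing the dependence of the $\mathrm{CV}^2$ on the mean (thus reproducing Fig. 1 of [12]), we also show the dependence of $F=\langle P\rangle\,\mathrm{CV}^2$ on the mean. Notably, decreasing the mean (which is associated with greater stem loop stability) decreases the Fano factor.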

Fig. 1.


Dependence of protein noise on protein mean for different 5’UTR constructs. The yEGFP reporter (bottom) and the ymNeonGreen reporter (top) constructs are treated separately. The use of a log-log scale is adopted from [12]. The dots give the experimental values taken from [12] (see Table 2). Each dot is a result of multiple experiments, and the error bars indicate the standard deviation (SD). These were obtained from the standard deviation of the (nonsquared) coefficient of variation by the Taylor formula: $\mathrm{SD}_{\mathrm{CV}^2}=2\,\mathrm{CV}\,\mathrm{SD}_{\mathrm{CV}}$, $\mathrm{SD}_F=\langle P\rangle\,\mathrm{SD}_{\mathrm{CV}^2}$. The dashed lines give the linear and hyperbolic dependence of the $F$ and $\mathrm{CV}^2$, respectively, which are predicted by the two-stage gene expression model (cf. (5)). The protein translation rate $\lambda^p$ and the mRNA decay rate $\gamma^m$ are being varied to change the mean levels. Note that the use of the log-log scale results in a slight curvature of the line (with a nonzero intercept)

In order to explain the apparently contradictory interpretations, we fit the classical two-stage (transcription-translation) model (1) of gene expression [15, 24]. The model is described in full mathematical detail in Section Model formulation. For the purposes of the current section, we mention that it predicts a stationary protein mean and Fano factor of the form

$$\langle P\rangle=\frac{\lambda^m\lambda^p}{\gamma^m\gamma^p}, \qquad F=1+\frac{\lambda^p}{\gamma^m+\gamma^p},$$

where $\lambda^m$ is the mRNA production rate, $\lambda^p$ is the protein translation rate, and $\gamma^m$ and $\gamma^p$ are the decay rate constants of mRNA and protein species, respectively. We note that here, in the expressions for the classical two-stage model, we omit the subscript $i$ used for mRNA in the generalised model (2) because there is only one mRNA state. Provided that the protein is more stable than the mRNA ($\gamma^p\ll\gamma^m$), we can simplify to

$$F=1+\frac{\lambda^p}{\gamma^m}=1+\frac{\gamma^p\langle P\rangle}{\lambda^m}, \qquad \mathrm{CV}^2=\frac{F}{\langle P\rangle}=\frac{1}{\langle P\rangle}+\frac{\gamma^p}{\lambda^m}. \tag{5}$$

Stem loops do not affect the transcription rate $\lambda^m$ or the protein stability $\gamma^p$, but they can affect the protein mean through the translation rate $\lambda^p$ and the mRNA decay rate $\gamma^m$. Thus, the two-stage model predicts an increasing linear dependence of the Fano factor, and a decreasing hyperbolic dependence of the $\mathrm{CV}^2$, on the mean. In Fig. 1, the Fano factor data are fit by a straight line using simple linear regression. The slope of the regression line corresponds to the fraction $\gamma^p/\lambda^m$ multiplying the mean in (5), which is calculated as 0.0141 and 0.0124 for the ymNeonGreen and yEGFP reporters, respectively. The regression coefficients are reused for the hyperbolic dependence of the $\mathrm{CV}^2$. The fits seem to be satisfactory, leading us to attribute the changes in the noise to the decrease of the mean rather than to an active control of noise by the stem–loop mechanism. In the same Fig. 1, a reporter of gene expression, yeast-enhanced green fluorescent protein (yEGFP), and a monomeric protein, yellow mNeonGreen (ymNeonGreen), are used to obtain the experimental results.
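The regression step described above can be sketched as follows. The data are the means and CV values of Table 2; the assignment of each construct to a reporter (constructs U, M1Ug, M3g, G10, G14 to yEGFP and M3Wn, M3n, M3Un to ymNeonGreen) is an assumption made for this example, as are the function names. The sketch converts each (mean, CV) pair to a Fano factor and fits a straight line by ordinary least squares.

```python
def fano_from_cv(mean, cv_percent):
    """Fano factor F = mean * CV^2, with CV given in percent."""
    cv = cv_percent / 100.0
    return mean * cv * cv


def linear_fit(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# (mean, CV in %) pairs; the reporter assignment is an assumption of this sketch.
YEGFP = [(1560, 11.8), (526, 12.4), (408, 13.1), (226, 16.0),
         (448, 12.9), (317, 15.4), (288, 13.8)]
YMNEONGREEN = [(3050, 12.2), (1143, 13.1), (579, 13.3),
               (377, 13.9), (495, 13.7)]


def fit_reporter(data):
    """Fit the Fano factor against the mean for one reporter."""
    means = [mean for mean, _ in data]
    fanos = [fano_from_cv(mean, cv) for mean, cv in data]
    return linear_fit(means, fanos)


if __name__ == "__main__":
    for name, data in (("yEGFP", YEGFP), ("ymNeonGreen", YMNEONGREEN)):
        slope, intercept = fit_reporter(data)
        print(f"{name}: slope = {slope:.4f}, intercept = {intercept:.2f}")
```

Under these assumptions the fitted slopes come out close to the values quoted in the text; small differences are to be expected, since the exact regression settings used for Fig. 1 (for instance, any weighting by the error bars) are not specified here.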

Table 2. Protein mean ($\mu$) and noise (CV) values for the yEGFP and the ymNeonGreen reporters obtained from [12]. The first column denotes distinct constructs that are of different stabilities. For instance, the L0 (PTEF1) construct is driven by a strong promoter PTEF1, generating a large abundance of protein molecules per cell, whereas PPAB1 is a mid-range promoter. The hyphen symbol denotes undetermined values

                yEGFP                    ymNeonGreen
Construct       μ       CV (%)           μ       CV (%)
L0 (PTEF1)      1560    11.8 ± 0.5       3050    12.2 ± 0.3
U               526     12.4 ± 0.6       -       -
M1Ug            408     13.1 ± 0.6       -       -
M3g             226     16.0 ± 0.4       -       -
G10             448     12.9 ± 0.5       -       -
G14             317     15.4 ± 1.5       -       -
M3Wn            -       -                1143    13.1 ± 0.8
M3n             -       -                579     13.3 ± 0.5
M3Un            -       -                377     13.9 ± 0.6
L0 (PPAB1)      288     13.8 ± 0.2       495     13.7 ± 0.4

Let us address the question of noise control by stem–loop formation theoretically. For reasons of mathematical elegance, we will introduce a general model that extends the classical two-stage model (1) by multiple transcript states in Section Model formulation and provide a thorough analysis of the mRNA inactivation model (3) in Sections Solution-Marginal distributions and moments. Here we discuss the special case with two states, one of them translationally active (without a stem–loop), the other translationally inactive (with a stem–loop) (cf. Eq. (3)). This special case is analysed in Section The mRNA inactivation loop model. Importantly, we note that our results pertain to this special case; therefore, we drop the subscript i on mRNA species (cf. Eqs. (45) and (46)). Using standard methods, we derive that the mean is given by

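$$\langle P\rangle=\frac{\lambda^p\lambda^m}{\gamma^p\gamma_{\mathrm{eff}}^m},$$

where

$$\gamma_{\mathrm{eff}}^m=\gamma_1^m+\frac{q_{12}\gamma_2^m}{\gamma_2^m+q_{21}} \tag{6}$$

gives an effective mRNA decay rate constant. The Fano factor satisfies

$$F=1+\frac{\lambda^p}{\gamma^p+\gamma_1^m+\dfrac{q_{12}(\gamma^p+\gamma_2^m)}{\gamma^p+\gamma_2^m+q_{21}}}. \tag{7}$$

The above equations give the steady-state protein mean and Fano factor as functions of the model parameters (degradation rate constants $\gamma_1^m$, $\gamma_2^m$, $\gamma^p$ of active/inactive mRNA and protein; inactivation/activation rate constants $q_{12}$, $q_{21}$; translation rate constant $\lambda^p$). The formula for the mean implies, in particular, that making the stem–loop more stable (i.e. decreasing $q_{21}$) decreases the mean, provided that the inactive mRNA degrades ($\gamma_2^m>0$). The noise requires a more subtle analysis, which is given below.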
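The formulas for the effective decay rate (6), the mean, and the Fano factor (7) translate directly into code. The following minimal sketch uses function and argument names chosen for this example (they are not from the paper) and assumes $q_{21}>0$ so that no denominator vanishes.

```python
def effective_mrna_decay(gam1_m, gam2_m, q12, q21):
    """Effective mRNA decay rate constant, Eq. (6)."""
    return gam1_m + q12 * gam2_m / (gam2_m + q21)


def protein_mean(lam_m, lam_p, gam_p, gam1_m, gam2_m, q12, q21):
    """Steady-state protein mean of the inactivation loop model."""
    gam_eff = effective_mrna_decay(gam1_m, gam2_m, q12, q21)
    return lam_p * lam_m / (gam_p * gam_eff)


def protein_fano(lam_p, gam_p, gam1_m, gam2_m, q12, q21):
    """Steady-state protein Fano factor, Eq. (7)."""
    loop_term = q12 * (gam_p + gam2_m) / (gam_p + gam2_m + q21)
    return 1.0 + lam_p / (gam_p + gam1_m + loop_term)


if __name__ == "__main__":
    # Illustrative parameter values only: a more stable stem loop (smaller q21)
    # lowers the mean when the inactive mRNA degrades (gam2_m > 0).
    for q21 in (2.0, 1.0, 0.5):
        mean = protein_mean(10.0, 50.0, 0.2, 1.0, 1.0, 3.0, q21)
        fano = protein_fano(50.0, 0.2, 1.0, 1.0, 3.0, q21)
        print(f"q21 = {q21}: mean = {mean:.1f}, Fano = {fano:.2f}")
```

Setting `q12 = 0` switches the loop off, in which case the functions reduce to the classical two-stage expressions with mRNA decay rate `gam1_m`.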

In order to compare the protein noise in the current model to that exhibited by the classical two-stage model (without the inactivation–activation loop) we define the baseline Fano factor as

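$$F_0=1+\frac{\lambda^p}{\gamma^p+\gamma_{\mathrm{eff}}^m}=1+\frac{\lambda^p}{\gamma^p+\gamma_1^m+\dfrac{q_{12}\gamma_2^m}{\gamma_2^m+q_{21}}}, \tag{8}$$

which can be obtained from (7) by first setting $q_{12}=0$ (no inactivation) and then replacing the mRNA decay rate $\gamma_1^m$ by its effective value (6). Adjusting the mRNA decay rate maintains the same species means in the baseline model as in the full model extended by the inactivation loop.

The protein variability formulae (7) and (8) can equivalently be expressed in terms of the squared coefficient of variation [31, 32] $\mathrm{CV}^2=F/\langle P\rangle$ and $\mathrm{CV}_0^2=F_0/\langle P\rangle$. We find that

$$\mathrm{CV}^2=\frac{1}{\langle P\rangle}+\frac{1}{\langle M\rangle}\,\frac{\gamma^p}{\gamma^p+\gamma_1^m+\dfrac{q_{12}(\gamma^p+\gamma_2^m)}{\gamma^p+\gamma_2^m+q_{21}}}, \tag{9}$$

$$\mathrm{CV}_0^2=\frac{1}{\langle P\rangle}+\frac{1}{\langle M\rangle}\,\frac{\gamma^p}{\gamma^p+\gamma_1^m+\dfrac{q_{12}\gamma_2^m}{\gamma_2^m+q_{21}}}, \tag{10}$$

where $\langle M\rangle=\lambda^m/\gamma_{\mathrm{eff}}^m$ is the mean value of the activated mRNA.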

Comparing (9) to (10), we see that $\mathrm{CV}^2<\mathrm{CV}_0^2$: since $(\gamma^p+\gamma_2^m)/(\gamma^p+\gamma_2^m+q_{21})>\gamma_2^m/(\gamma_2^m+q_{21})$ whenever $\gamma^p>0$ and $q_{21}>0$, the denominator in (9) exceeds that in (10) for any $q_{12}>0$. This allows us to conclude that the inclusion of the mRNA inactivation loop decreases protein noise. The key ingredient that distinguishes (9) from (10) is the Michaelis-Menten-type term in the denominator that involves the protein decay rate $\gamma^p$, the mRNA activation rate $q_{21}$ and the inactive mRNA decay rate $\gamma_2^m$. Figure 2 explores the dependence of the fractional protein noise reduction $\mathrm{CV}^2/\mathrm{CV}_0^2$ on these parameters. Without loss of generality, the active mRNA decay rate is set to one, and two plausible alternatives are considered for the decay of the inactive mRNA molecule: one, imRNA is unstable, decaying with the same rate as mRNA (left panel of Fig. 2); two, imRNA is stable, i.e. protected from degradation, and does not decay (right panel of Fig. 2). We observe qualitatively different noise reduction patterns for the two alternatives. For an unstable imRNA, there is an optimal combination of mRNA activation and protein decay rate constants that minimises the protein noise (Fig. 2, left panel). We observe that the optimal values are both greater than the unit value of the mRNA decay rate. This requirement runs counter to the biological evidence that proteins are typically more stable than mRNAs. The maximal reduction of noise is moderate (around 15% for the chosen parameter set). For a stable imRNA, the optimum is approached by making the mRNA activation as slow as possible (Fig. 2, right panel). The optimal value of the protein decay rate constant is then less than that of the active mRNA, and the maximal reduction of noise is more pronounced (around 60% for the chosen parameter set).
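The comparison of (9) and (10) can be explored with a short parameter scan. The sketch below evaluates the ratio $\mathrm{CV}^2/\mathrm{CV}_0^2$ on a logarithmic grid of protein decay and mRNA activation rate constants, holding the means fixed at the values stated in the caption of Fig. 2 ($\langle M\rangle=10$, $\langle P\rangle=500$, $\gamma_1^m=1$, $q_{12}=3$). The grid range and the function names are assumptions of this example, so the optimum it reports depends on the grid chosen.

```python
def cv2_ratio(gam_p, q21, gam2_m, gam1_m=1.0, q12=3.0, mean_m=10.0, mean_p=500.0):
    """Ratio CV^2 / CV_0^2 from Eqs. (9) and (10) at fixed species means."""
    loop_term = q12 * (gam_p + gam2_m) / (gam_p + gam2_m + q21)
    base_term = q12 * gam2_m / (gam2_m + q21)
    cv2 = 1.0 / mean_p + gam_p / (mean_m * (gam_p + gam1_m + loop_term))
    cv2_0 = 1.0 / mean_p + gam_p / (mean_m * (gam_p + gam1_m + base_term))
    return cv2 / cv2_0


# Logarithmic grid from 0.01 to 100 (an arbitrary choice for this sketch).
GRID = [10.0 ** (k / 10.0) for k in range(-20, 21)]


def best_reduction(gam2_m, grid=GRID):
    """Return (smallest ratio, protein decay rate, activation rate) on the grid."""
    return min((cv2_ratio(gam_p, q21, gam2_m), gam_p, q21)
               for gam_p in grid for q21 in grid)


if __name__ == "__main__":
    for label, gam2_m in (("unstable imRNA", 1.0), ("stable imRNA", 0.0)):
        ratio, gam_p, q21 = best_reduction(gam2_m)
        print(f"{label}: ratio = {ratio:.3f} at gamma_p = {gam_p:.3g}, q21 = {q21:.3g}")
```

A ratio below one means the inactivation loop reduces noise relative to the baseline; the two calls correspond to the left and right panels of Fig. 2.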

Fig. 2.


Fractional protein noise reduction by the mRNA inactivation loop as a function of protein decay and mRNA activation rate constants. The colour of the heat map gives the protein noise (the squared coefficient of variation) in the two-stage model extended by the mRNA inactivation loop relative to the protein noise in a baseline two-stage model without the mRNA inactivation loop (adjusting the mRNA decay rate to obtain the same species means). The mRNA mean is set to $\langle M\rangle=10$, and the protein mean is $\langle P\rangle=500$. The mRNA decay rate is set to $\gamma_1^m=1$ without loss of generality; the inactive mRNA decay rate is either the same as that of active mRNA ($\gamma_2^m=1$; left panel) or set to zero ($\gamma_2^m=0$; right panel). The inactivation rate constant is $q_{12}=3$

Our analysis goes beyond the first and second moments (means and variances). In particular, for the mRNA inactivation loop model (3), we show that its steady state distribution is generated by Taylor-expanding the explicit function

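$$\begin{aligned}
G(x_1,x_2,z)=\exp\Bigg(&\frac{\lambda^m\lambda^p}{\gamma_{\mathrm{eff}}^m\gamma^p}\int_1^z {}_2F_2\!\left[\begin{matrix}1,\ 1+\tau\\ 1+r_1,\ 1+r_2\end{matrix};\ \frac{\lambda^p}{\gamma^p}(s-1)\right]\mathrm{d}s\\
&+\frac{\lambda^m(x_1-1)}{\gamma_{\mathrm{eff}}^m}\,{}_2F_2\!\left[\begin{matrix}1,\ 1+\tau\\ 1+r_1,\ 1+r_2\end{matrix};\ \frac{\lambda^p}{\gamma^p}(z-1)\right]\\
&+\frac{q_{12}\lambda^m(x_2-1)}{\gamma_{\mathrm{eff}}^m(\gamma_2^m+q_{21})}\,{}_2F_2\!\left[\begin{matrix}1,\ \tau\\ 1+r_1,\ 1+r_2\end{matrix};\ \frac{\lambda^p}{\gamma^p}(z-1)\right]\Bigg).
\end{aligned}$$

The conjugate variables $x_1$, $x_2$, and $z$ correspond to mRNA, imRNA, and protein species, respectively. Parameters $r_1$, $r_2$, and $\tau$ are parameter groupings defined by (53) and (55), respectively. The symbol ${}_2F_2$ stands for

$${}_pF_q\!\left[\begin{matrix}a_1,\ldots,a_p\\ b_1,\ldots,b_q\end{matrix};\ \tilde{z}\right]=\sum_{n=0}^{\infty}\frac{(a_1)_n\cdots(a_p)_n}{(b_1)_n\cdots(b_q)_n}\,\frac{\tilde{z}^n}{n!}$$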

which is the generalised hypergeometric function [33].
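The series definition of the generalised hypergeometric function can be evaluated term by term, because consecutive terms differ by the factor $\prod_i(a_i+n)/\prod_j(b_j+n)\cdot\tilde{z}/(n+1)$. The following is a minimal sketch of direct summation; the function name, tolerance, and term limit are assumptions of this example, and direct summation is only reliable for moderate $|\tilde{z}|$ (for large negative arguments the alternating terms cancel and precision is lost).

```python
def hyp_pfq(a, b, z, tol=1e-15, max_terms=500):
    """Sum the series pFq(a; b; z) using the ratio of consecutive terms.

    `a` and `b` are sequences of upper and lower parameters; no lower
    parameter may be zero or a negative integer.
    """
    term = 1.0   # n = 0 term of the series
    total = 1.0
    for n in range(max_terms):
        numerator = 1.0
        for a_i in a:
            numerator *= a_i + n
        denominator = 1.0
        for b_j in b:
            denominator *= b_j + n
        # term_{n+1} = term_n * prod(a_i + n) / prod(b_j + n) * z / (n + 1)
        term *= numerator / denominator * z / (n + 1)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total
```

As a sanity check, equal upper and lower parameters cancel, so `hyp_pfq([1.0, 2.5], [1.0, 2.5], z)` should agree with `math.exp(z)`.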

Our mathematical analysis thus provides a complete characterisation of the steady state distribution in the mRNA inactivation model in particular (as well as the generalised model in general), and extends the generating function result previously given for the two stage model in [18].

Conclusions

In this paper, we formulated and analysed a structuration/generalisation of the two-stage gene expression model in terms of having multiple mRNA states. Unlike the classical two-stage model, the generalised model considers multiple mRNA states, among which mRNA molecules are assumed to be transitioning at constant rates. Additionally, we demonstrated that the generalised model can be used to capture the dynamics of simpler models such as the inactivation loop model and the multiphasic mRNA model, which were analysed in detail as a particular interest of this paper.

We first introduced the corresponding chemical reaction system describing the generalised model and its mathematical description given by the CME. Then, we focused on seeking a solution to the corresponding PDE, which is obtained by transforming the CME using the generating function approach. A suitable ansatz was employed for converting the PDE to a system of ODEs. Subsequently, using the power series method, we sought a solution to the ODE system, which is then expressed in matrix form as a system of recurrence equations. We recovered the generating function of the stationary distribution of mRNA and protein amounts by means of the coefficients of power series, which are obtained by solving the recurrence relations under the initial conditions.

Furthermore, the sought-after solution was then used to characterise the marginal protein and mRNA distributions. To determine the protein distribution, we used the factorial moments, which are calculated from the factorial cumulants. Additionally, we demonstrated that the mRNA distributions are Poissonian, as is expected for any monomolecular chemical reaction system [34]. We also derived the protein mean and Fano factor and expressed them in terms of the first two factorial moments. We then provided two different examples to which the generalised model and its results can be applied.

The first example concerns the inactivation loop model. We demonstrated that integrating the mRNA inactivation loop into the classical two-stage framework for gene expression results in reduced values of protein noise. Nevertheless, we note that certain conditions on the parameter rates must be met to obtain a significant protein noise reduction. These constraints take different forms depending on the interaction between the mRNA form and the mRNA degradation pathway. The first option is that the formation of the inactivation loop does not interfere with degradation, meaning that the inactive form degrades with the same rate constant as the active form. The second option is that the formation of the loop interferes with degradation so that its instantaneous degradation rate becomes zero. In both cases, protein stability must be optimally chosen to maximise noise reduction; the protein can be neither too stable nor too unstable. However, if inactive mRNAs are subject to degradation, noise reduction is optimised for relatively low protein stabilities, whereas if inactive mRNAs are protected from degradation, noise reduction is optimised for more realistic, larger values of protein stabilities. For inactive mRNAs that do not degrade, optimal noise reduction requires low mRNA activation rates, whereas relatively fast rates of activation optimise noise reduction if inactive mRNAs degrade. Generally, the stability of the inactive mRNA form sustains greater reductions of protein noise for wider and more realistic parameter values. Overall, the noise analysis suggests that the mRNA inactivation loop may play a role in controlling gene expression noise, while also highlighting the limitations of its effect. It is worth noting that one can also compare the protein variance between the extended and canonical two-stage models using the mRNA autocovariance function [35]. 
The approach taken in this work has the additional advantage of yielding a notably non-trivial distribution for protein, which is expressed in terms of the generalised hypergeometric series and is employed to obtain a recursive expression for the protein probability mass function.

As a second example, by making suitable parameter choices in the generalised model, we presented the multiphasic model in which an mRNA molecule is assumed to be transitioning through its lifetime stages. The solution obtained for the generalised model and the associated matrices (e.g., the transition matrix) were used to determine the first two moments of mRNA distributions, which allowed us to calculate the Fano factor for the multiphasic model.

We provided a biological example of the formation of RNA stem loops and performed data analysis to explain the influence of stem-loop structure on gene expression noise. Specifically, we based our extensive mathematical analysis on the two standard noise metrics: the $\mathrm{CV}^2$ and the Fano factor. By doing so, our calculations allowed us to conclude that noise in gene expression can be reduced if stem loops restrict translation.

In summary, the paper provides a systematic mathematical analysis for protein–mRNA interactions in a structured gene expression model. We believe that the model and its results can be used in understanding the dynamics of underlying biochemical processes.

Methods

Model formulation

For the two-stage gene expression model (1), the probability $p_{m,n}(t)$ of observing $m$ mRNA and $n$ protein molecules at time $t$ satisfies the CME

$$\frac{\mathrm{d}}{\mathrm{d}t}p_{m,n}=\lambda^m(p_{m-1,n}-p_{m,n})+\gamma^m\big((m+1)p_{m+1,n}-mp_{m,n}\big)+\lambda^pm(p_{m,n-1}-p_{m,n})+\gamma^p\big((n+1)p_{m,n+1}-np_{m,n}\big), \tag{11}$$

subject to the initial condition

$$p_{m,n}(0)=\delta_{m,m_0}\delta_{n,n_0},$$

where $\delta_{i,j}$ represents the Kronecker delta symbol, which is one if $i=j$ and zero otherwise; $m_0$ and $n_0$ are the initial mRNA and protein amounts, respectively.

Our aim is to obtain a PDE rather than working with the CME (11). To this end, we introduce the probability generating function defined by

$$G(x,y,t)=\sum_m\sum_n x^my^np_{m,n}(t). \tag{12}$$

Multiplying the CME (11) by the factor $x^my^n$, summing over all $m$ and $n$, and using (12), we find that the generating function satisfies the linear first-order PDE

$$\frac{\partial G}{\partial t}=\big(\gamma^m(1-x)+\lambda^px(y-1)\big)\frac{\partial G}{\partial x}+\gamma^p(1-y)\frac{\partial G}{\partial y}+\lambda^m(x-1)G. \tag{13}$$

Equation (13) has been used in [24] to derive mRNA and protein moments; it has been solved at steady state in [18]. Here we shall derive and study a generalisation of (13).
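As a brief worked illustration of how moments follow from (13), consider the steady state, $\partial G/\partial t=0$, and recall that $G(1,1)=1$, $\partial G/\partial x|_{x=y=1}=\langle m\rangle$ and $\partial G/\partial y|_{x=y=1}=\langle n\rangle$. Differentiating the right-hand side of (13) once with respect to $x$ and once with respect to $y$, and evaluating at $x=y=1$ (where every term carrying a factor $1-x$, $x-1$, $1-y$ or $y-1$ vanishes), gives a sketch of the standard calculation:

$$\begin{aligned}
\frac{\partial}{\partial x}:\quad & 0=-\gamma^m\langle m\rangle+\lambda^m &&\Longrightarrow\quad \langle m\rangle=\frac{\lambda^m}{\gamma^m},\\
\frac{\partial}{\partial y}:\quad & 0=\lambda^p\langle m\rangle-\gamma^p\langle n\rangle &&\Longrightarrow\quad \langle n\rangle=\frac{\lambda^m\lambda^p}{\gamma^m\gamma^p}.
\end{aligned}$$

The second result is the stationary protein mean of the classical two-stage model; second derivatives at the same point yield the second factorial moments in the same manner.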

For the generalised model (2), the probability $P(\mathbf{m},n,t)$ of observing $m_1$ mRNA copies in state 1, $m_2$ mRNA copies in state 2, and so on, together with $n$ protein copies, at a given time $t$ satisfies the following CME,

$$\frac{\mathrm{d}P(\mathbf{m},n,t)}{\mathrm{d}t}=\sum_{i=1}^{K}\Big[\lambda_i^m(E_i^{-1}-1)P+\gamma_i^m(E_i-1)m_iP+\sum_{j=1}^{K}q_{ij}(E_iE_j^{-1}-1)m_iP+\lambda_i^p(E_{K+1}^{-1}-1)m_iP\Big]+\gamma^p(E_{K+1}-1)nP, \tag{14}$$

where $\mathbf{m}=(m_1,m_2,m_3,\ldots,m_K)$ is a vector of species copy numbers. Note that the step operator [36] $E_i$ in (14) acts on the variable $m_i$, whereas $E_{K+1}$ acts on the variable $n$; $E_iE_j^{-1}-1=0$ for $i=j$.

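The multivariate probability generating function is given by

$$G(\mathbf{x},y,t)=\sum_{m_1}\cdots\sum_{m_K}\sum_nP(\mathbf{m},n,t)\,x_1^{m_1}x_2^{m_2}\cdots x_K^{m_K}y^n, \tag{15}$$

where $\mathbf{x}=(x_1,x_2,x_3,\ldots,x_K)$. Multiplying (14) by $x_1^{m_1}x_2^{m_2}\cdots x_K^{m_K}y^n$, summing over all $m_1,m_2,\ldots,m_K,n$, and employing (15), we arrive at the PDE

$$\frac{\partial G(\mathbf{x},y,t)}{\partial t}=\sum_{i=1}^{K}\Big[\lambda_i^m(x_i-1)G+\gamma_i^m(1-x_i)\frac{\partial G}{\partial x_i}+\sum_{j=1}^{K}q_{ij}(x_j-x_i)\frac{\partial G}{\partial x_i}+\lambda_i^p(y-1)x_i\frac{\partial G}{\partial x_i}\Big]+\gamma^p(1-y)\frac{\partial G}{\partial y}. \tag{16}$$

Note that the step operators $E_i^{\pm1}$ in (14) correspond to the factors $x_i^{\mp1}$, while the copy numbers $m_i$ correspond to the operators $x_i\,\partial/\partial x_i$ in (16) for the generating function. In the next section, we will seek a solution to the PDE (16).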

Solution

In this section, we shall provide a step-by-step breakdown of our solution method for solving the PDE (16). We are interested in the steady state; therefore, we set the time derivative in (16) to zero and rearrange the resulting equation to obtain

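$$\sum_{i=1}^{K}\Big\{\lambda_i^m(x_i-1)G+\Big[\gamma_i^m(1-x_i)+\sum_{j=1}^{K}q_{ij}(x_j-x_i)+\lambda_i^p(y-1)x_i\Big]\frac{\partial G}{\partial x_i}\Big\}+\gamma^p(1-y)\frac{\partial G}{\partial y}=0 \tag{17}$$

for the time-independent generating function $G(\mathbf{x},y)$ of the stationary distribution. The probability normalisation condition translates to $G(1,\ldots,1)=1$. Changing the variables according to

$$x_i=1+u_i, \qquad y=1+v, \qquad G=\exp(\varphi) \tag{18}$$

allows us to transform (17) into

$$\sum_{i=1}^{K}\Big\{\lambda_i^mu_i+\Big[\lambda_i^pv(1+u_i)-\gamma_i^mu_i+\sum_{j=1}^{K}q_{ij}(u_j-u_i)\Big]\frac{\partial\varphi}{\partial u_i}\Big\}=\gamma^pv\frac{\partial\varphi}{\partial v}, \tag{19}$$

which is subject to the normalisation condition

$$\varphi(\mathbf{0})=0. \tag{20}$$

Below, we focus on seeking a solution to (19)–(20) using a suitable ansatz.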

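Let us first assume that the solution is of the form

$$\varphi(u_1,u_2,u_3,\ldots,u_K,v)=\varphi_0(v)+u_1\varphi_1(v)+\cdots+u_K\varphi_K(v). \tag{21}$$

With this in mind, we obtain from (21) that

$$\frac{\partial\varphi}{\partial u_i}=\varphi_i(v), \qquad \frac{\partial\varphi}{\partial v}=\varphi_0'(v)+u_1\varphi_1'(v)+\cdots+u_K\varphi_K'(v). \tag{22}$$

Inserting the partial derivatives (22) into (19), we get

$$\sum_{i=1}^{K}\Big\{\lambda_i^mu_i+\Big[\lambda_i^pv(1+u_i)-\gamma_i^mu_i+\sum_{j=1}^{K}q_{ij}(u_j-u_i)\Big]\varphi_i-\gamma^pvu_i\varphi_i'\Big\}=\gamma^pv\varphi_0'. \tag{23}$$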

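Equation (23) can be rewritten as

$$\Big(\gamma^p\varphi_0'-\sum_{i=1}^{K}\lambda_i^p\varphi_i\Big)v+\sum_{i=1}^{K}\Big[\gamma^pv\varphi_i'+\Big(\gamma_i^m-\lambda_i^pv+\sum_{j=1}^{K}q_{ij}\Big)\varphi_i-\sum_{j=1}^{K}q_{ji}\varphi_j-\lambda_i^m\Big]u_i=0. \tag{24}$$

In order that (24) hold for all values of $u_1,\ldots,u_K$, we must necessarily have

$$\sum_{i=1}^{K}\lambda_i^p\varphi_i-\gamma^p\varphi_0'=0, \tag{25}$$

$$\gamma^pv\varphi_i'+\Big(\gamma_i^m-\lambda_i^pv+\sum_{j=1}^{K}q_{ij}\Big)\varphi_i-\sum_{j=1}^{K}q_{ji}\varphi_j=\lambda_i^m. \tag{26}$$

Thus far, we have converted the PDE (17) into the system of ODEs (25)–(26). Next, we provide a detailed explanation of solving this system using the power series method.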

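Let us assume that the functions $\varphi_0$ and $\varphi_i$ are of the power series form, i.e.,

$$\varphi_0(v)=\sum_{n=0}^{\infty}a_nv^n, \qquad \varphi_i(v)=\sum_{n=0}^{\infty}b_n^{(i)}v^n \tag{27}$$

for $i\in\{1,\ldots,K\}$. Differentiating (27) term by term we get

$$\varphi_0'(v)=\sum_{n=1}^{\infty}na_nv^{n-1}, \qquad \varphi_i'(v)=\sum_{n=1}^{\infty}nb_n^{(i)}v^{n-1}. \tag{28}$$

Inserting (27) and (28) into (26), and collecting same powers of $v$, we obtain the following system of recurrence relations

$$\Big(\gamma_i^m+\sum_{j=1}^{K}q_{ij}+n\gamma^p\Big)b_n^{(i)}-\sum_{j=1}^{K}q_{ji}b_n^{(j)}=\lambda_i^pb_{n-1}^{(i)} \tag{29}$$

for the coefficients $b_n^{(i)}$, where $i=1,\ldots,K$ and $n\geq1$. For the sake of simplicity, equations (29) can be rewritten in matrix form as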

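$$(A-Q^{\top}+n\gamma^pI)X_n=BX_{n-1}, \qquad n\geq1, \tag{30}$$

where $I$ is the identity matrix and the vector $X_n$ is defined as

$$X_n=\big(b_n^{(1)},b_n^{(2)},b_n^{(3)},\ldots,b_n^{(K)}\big)^{\top}.$$

In (30), $A$ is a $K\times K$ matrix defined by

$$A_{ij}:=\begin{cases}\gamma_i^m & \text{for } i=j,\\ 0 & \text{for } i\neq j,\end{cases} \tag{31}$$

$Q$ is a $K\times K$ matrix defined by

$$Q_{ij}:=\begin{cases}-\sum_{k\neq i}q_{ik} & \text{for } i=j,\\ q_{ij} & \text{for } i\neq j,\end{cases} \tag{32}$$

and $B$ is a $K\times K$ matrix defined by

$$B_{ij}:=\begin{cases}\lambda_i^p & \text{for } i=j,\\ 0 & \text{for } i\neq j.\end{cases} \tag{33}$$

The transpose of $Q$ appears because, in (29), the $i$-th equation collects the inflow terms $q_{ji}b_n^{(j)}$. In order to solve the recurrence relations (30), initial conditions are needed. These can be obtained from (26) by setting $v=0$ for each $i\in\{1,2,\ldots,K\}$. The resulting system of linear equations is given in matrix form as

$$(A-Q^{\top})X_0=C, \tag{34}$$

where $C$ is a column vector defined as $C=(\lambda_1^m,\lambda_2^m,\ldots,\lambda_K^m)^{\top}$.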

Solving the system of algebraic equations (30) under the initial conditions (34) yields the coefficients $b_n^{(i)}$; the sequence $a_n$ can be obtained by substituting (27) and (28) into (25) and collecting same powers of $v$. By doing so, we get

$$a_n=\frac{1}{n\gamma^p}\sum_{i=1}^{K}\lambda_i^pb_{n-1}^{(i)}, \qquad n\geq1. \tag{35}$$

Note that the normalisation condition (20) implies that $a_0=\varphi_0(0)=\varphi(\mathbf{0})=0$. Having found the sequences $a_n$ and $b_n^{(i)}$, we combine (21) and (27) to obtain

$$\varphi(\mathbf{u},v)=\sum_{n=1}^{\infty}a_nv^n+\sum_{i=1}^{K}u_i\sum_{n=0}^{\infty}b_n^{(i)}v^n. \tag{36}$$

We return to the original variables in (36) via (18) to obtain the generating function of the stationary distribution of mRNA and protein amounts, which is given by

$$G(\mathbf{x},y)=\exp\Big(\sum_{n=1}^{\infty}a_n(y-1)^n+\sum_{i=1}^{K}(x_i-1)\sum_{n=0}^{\infty}b_n^{(i)}(y-1)^n\Big). \tag{37}$$

Equation (37) provides the sought-after steady-state solution to the PDE (16) and will be used in the following section.
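The recurrence (29) with the initial condition obtained at $v=0$, together with (35), can be turned into a short numerical routine. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the rates $q_{ij}$ are passed as a nested list `q[i][j]`, and the linear systems are solved by a small Gaussian elimination to keep the example dependency-free. Row $i$ of each system has diagonal entry $\gamma_i^m+\sum_{j\neq i}q_{ij}+n\gamma^p$ and off-diagonal entries $-q_{ji}$, exactly as the terms appear in (29).

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        residual = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = residual / aug[r][r]
    return x


def cumulant_coefficients(lam_m, gam_m, q, lam_p, gam_p, n_max):
    """Return (a, b) with a[n] and b[n][i] for n = 0..n_max.

    lam_m, gam_m, lam_p: per-state lists of length K; q[i][j]: transition
    rate from state i to state j (diagonal ignored); gam_p: protein decay rate.
    """
    K = len(lam_m)

    def system(n):
        # Left-hand side of recurrence (29) written as a K x K matrix.
        return [[gam_m[i] + sum(q[i][k] for k in range(K) if k != i) + n * gam_p
                 if i == j else -q[j][i]
                 for j in range(K)] for i in range(K)]

    b = [solve_linear(system(0), lam_m)]  # initial condition at v = 0
    a = [0.0]                             # a_0 = 0 by normalisation (20)
    for n in range(1, n_max + 1):
        rhs = [lam_p[i] * b[n - 1][i] for i in range(K)]
        a.append(sum(rhs) / (n * gam_p))  # Eq. (35)
        b.append(solve_linear(system(n), rhs))
    return a, b


def fano_from_cumulants(a):
    """Protein Fano factor 1 + 2 a_2 / a_1."""
    return 1.0 + 2.0 * a[2] / a[1]
```

For the inactivation loop model one would take `K = 2`, with transcription and translation only in state 1 (for instance `lam_m = [2.0, 0.0]` and `lam_p = [5.0, 0.0]`), in which case `fano_from_cumulants` can be compared against the closed-form Fano factor (7).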

Marginal distributions and moments

In this section, we use the analytical formula for the generating function (37) to obtain marginal mRNA distributions. We determine the moments of the protein distribution by way of the factorial cumulants, which allow us to recover the protein distribution. Additionally, we derive the protein Fano factor (variance-to-mean ratio) and express it in terms of the first two factorial moments.

Marginal mRNA distributions In the generating function (37), if we take $y=1$, then we obtain the generating function of the marginal mRNA distributions as

$$G_m(\mathbf{x})=G(\mathbf{x},1)=\exp\Big(\sum_{i=1}^{K}b_0^{(i)}(x_i-1)\Big)=\prod_{i=1}^{K}\exp\big(b_0^{(i)}(x_i-1)\big), \tag{38}$$

from which we conclude that the steady state mRNA distributions are independent Poissons with means

$$\langle m_i\rangle=b_0^{(i)}.$$

Marginal protein distribution Likewise, by inserting $x_i=1$ ($i=1,\ldots,K$) into (37), we can recover the generating function of the marginal protein distribution

$$G(y)=G(\mathbf{1},y)=\exp\Big(\sum_{n=1}^{\infty}a_n(y-1)^n\Big), \tag{39}$$

where $\mathbf{1}$ is a $K$-dimensional row vector of ones.

Next, we determine the moments of the protein distributions. The factorial (combinatorial) moments hn are obtained by expanding the generating function into a power series around y=1:

G(y)=n=0hn(y-1)n.

We aim to calculate the factorial moments $h_n$ by way of the factorial cumulants $a_n$. To that end, we first differentiate (39) to obtain

$$DG(y)=G(y)\,D\ln G(y), \tag{40}$$

where $D$ denotes the differential operator $d/dy$. Then, taking the $(n-1)$th derivative of (40), we get

$$D^nG(y)=\sum_{i=0}^{n-1}\binom{n-1}{i}D^iG(y)\,D^{n-i}\ln G(y),$$

which can be recast as

$$\frac{D^nG(y)}{n!}=\sum_{i=0}^{n-1}\left(1-\frac{i}{n}\right)\frac{D^iG(y)}{i!}\,\frac{D^{n-i}(\ln G(y))}{(n-i)!}. \tag{41}$$

Evaluating (41) at $y=1$ gives the factorial moments of the protein distribution

$$h_n=\sum_{i=0}^{n-1}\left(1-\frac{i}{n}\right)a_{n-i}h_i,\quad\text{for } n\ge1, \tag{42}$$

where $h_0=1$. The terms of $h_n$ can be obtained recursively by inserting (35) into (42). Subsequently, by employing the recurrence method proposed in [37], we recover the protein distribution

$$p(n)=\sum_{j=0}^{\infty}\frac{(j+1)_n}{n!}\,h_{n+j}(-1)^j,$$

where $(x)_n$, with $n$ a nonnegative integer, denotes the rising factorial (Pochhammer symbol). Equivalently, $p(n)$ is the coefficient of $y^n$ in $\sum_{k}h_k(y-1)^k$, since $(j+1)_n/n!=\binom{n+j}{n}$.
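The recursion (42) and the series for $p(n)$ can be evaluated numerically. The following is a minimal Python sketch, not the authors' implementation: the function names, the truncation order `n_max`, and the Poisson test case are assumptions made for illustration. The series is summed from $j=0$, using $(j+1)_n/n!=\binom{n+j}{n}$, and is truncated at the highest available $h_k$.

```python
from math import comb, exp, factorial

def factorial_moments(a, n_max):
    """Return [h_0, ..., h_n_max] from the coefficients a[1], ..., a[n_max]
    via the recursion (42); a[0] is unused and h_0 = 1."""
    h = [1.0]
    for n in range(1, n_max + 1):
        h.append(sum((1 - i / n) * a[n - i] * h[i] for i in range(n)))
    return h

def protein_pmf(h, n):
    """Truncated alternating series for p(n); comb(n + j, n) equals (j+1)_n / n!."""
    return sum((-1) ** j * comb(n + j, n) * h[n + j] for j in range(len(h) - n))

# Illustrative check (assumed test case): a_1 = mu and a_n = 0 for n > 1
# should reproduce a Poisson distribution with mean mu.
mu, n_max = 2.0, 60
a = [0.0, mu] + [0.0] * (n_max - 1)
h = factorial_moments(a, n_max)
poisson_3 = exp(-mu) * mu ** 3 / factorial(3)
print(protein_pmf(h, 3), poisson_3)
```

Because the series alternates in sign, cancellation can degrade accuracy when the protein mean is large; in that regime higher-precision arithmetic or a different inversion method may be preferable.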

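Moments Clearly, the mRNA distributions in (38) are Poissonian. Therefore, the mRNA Fano factor is equal to 1. The protein mean and Fano factor can be derived from the factorial moments (42). The first two factorial moments are given by

$$\langle n\rangle=h_1=a_1\quad\text{and}\quad\langle n(n-1)\rangle=2h_2=2a_2+a_1^2, \tag{43}$$

respectively. The Fano factor,

$$F=\frac{\langle n^2\rangle}{\langle n\rangle}-\langle n\rangle=\frac{\langle n(n-1)\rangle}{\langle n\rangle}+1-\langle n\rangle=\frac{2a_2}{a_1}+1, \tag{44}$$

is thus expressed in terms of the first two factorial cumulants $a_1$ and $a_2$.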
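Formula (44) can be checked against a distribution whose coefficients $a_n$ are known in closed form. The sketch below is an illustration under an assumed test case, a negative binomial distribution with generating function $(1-\beta(y-1))^{-r}$, for which $\ln G(y)=\sum_{n\ge1}(r\beta^n/n)(y-1)^n$; the parameter values and function name are hypothetical.

```python
from math import comb

def fano_from_cumulants(a1, a2):
    """Fano factor (44) from the first two coefficients a_1 and a_2."""
    return 1.0 + 2.0 * a2 / a1

# Assumed test case: negative binomial with G(y) = (1 - beta*(y-1))**(-r),
# so that a_n = r * beta**n / n.
r, beta = 3, 0.5
a1, a2 = r * beta, r * beta ** 2 / 2

# Brute-force moments from the probability mass function, for comparison.
q = beta / (1 + beta)
pmf = [comb(k + r - 1, k) * q ** k * (1 - q) ** r for k in range(400)]
mean = sum(k * p for k, p in enumerate(pmf))
var = sum(k * k * p for k, p in enumerate(pmf)) - mean ** 2
print(fano_from_cumulants(a1, a2), var / mean)
```

Both printed numbers should agree with $1+\beta$, the Fano factor of this test distribution.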

The mRNA inactivation loop model

In this section, we present a particular example of the generalised model (2), which we refer to as the inactivation loop model, whose reaction scheme is given by (3). Specifically, we provide an explicit representation of the stationary solution using the cumulants. Furthermore, we calculate the steady-state protein Fano factor and express it as a function of the model parameters. One possible biological scenario that implements this model is a regulatory RNA that temporarily blocks mRNA function [38].

The inactivation loop model (3) can be readily obtained from the generalised model (2) by taking $K=2$, which accounts for only two mRNA states denoting the active mRNA state $m_1$ and the inactive mRNA state $m_2$. In what follows, we assume that a newly produced mRNA is active, i.e. that the transcription rate satisfies

$$\lambda_i^m=\lambda^m\delta_{i,1},\quad\text{for } i=1,2. \tag{45}$$

Additionally, we assume that proteins are translated only from an active mRNA, so that we have

$$\lambda_i^p=\lambda^p\delta_{i,1},\quad\text{for } i=1,2, \tag{46}$$

for the translation rate. Here, $\delta_{i,j}$ denotes the Kronecker delta symbol.

Cumulants We aim to recover expressions for the inactivation loop model from the generalised model. The system of algebraic equations for this model follows from (34), taking the form of

$$(\gamma_1^m+q_{12})\,b_0^{(1)}-q_{21}\,b_0^{(2)}=\lambda^m,\qquad(\gamma_2^m+q_{21})\,b_0^{(2)}-q_{12}\,b_0^{(1)}=0,$$

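from which we recover

$$b_0^{(1)}=\frac{\lambda^m(\gamma_2^m+q_{21})}{(\gamma_1^m+q_{12})(\gamma_2^m+q_{21})-q_{12}q_{21}}. \tag{47}$$

Combining (47) with (39) we find

$$\langle m_1\rangle=\frac{\lambda^m}{\gamma^m_{\mathrm{eff}}},$$

where

$$\gamma^m_{\mathrm{eff}}=\gamma_1^m+\frac{q_{12}\gamma_2^m}{\gamma_2^m+q_{21}} \tag{48}$$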

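is the effective rate of mRNA decay. The recurrence relations (30) read

$$(\gamma_1^m+q_{12}+n\gamma^p)\,b_n^{(1)}-\lambda^p b_{n-1}^{(1)}-q_{21}\,b_n^{(2)}=0, \tag{49}$$

$$(\gamma_2^m+q_{21}+n\gamma^p)\,b_n^{(2)}-q_{12}\,b_n^{(1)}=0, \tag{50}$$

for $n\ge1$. Solving the algebraic system (49)–(50) in $b_n^{(1)}$ yields

$$b_n^{(1)}=\frac{\lambda^p(\gamma_2^m+q_{21}+n\gamma^p)}{(\gamma^p)^2n^2+\gamma^p(\gamma_2^m+\gamma_1^m+q_{21}+q_{12})\,n+\gamma_2^m\gamma_1^m+\gamma_1^mq_{21}+\gamma_2^mq_{12}}\,b_{n-1}^{(1)}, \tag{51}$$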

which is a recursive expression whose initial term ($n=0$) is given by (47).

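Explicit representation The recursive formula (51) can further be simplified by factorising its denominator as

$$b_n^{(1)}=\lambda^p\,\frac{\gamma_2^m+q_{21}+n\gamma^p}{(\gamma^p)^2(n+r_1)(n+r_2)}\,b_{n-1}^{(1)}\quad\text{for } n\ge1, \tag{52}$$

where

$$r_{1,2}=\frac{\gamma_1^m+q_{12}+\gamma_2^m+q_{21}\pm\sqrt{(\gamma_2^m+q_{21}-\gamma_1^m-q_{12})^2+4q_{21}q_{12}}}{2\gamma^p} \tag{53}$$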

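are the negatives of the roots of the quadratic in the denominator of (51). The sequence (52) can be rewritten as

$$b_n^{(1)}=\frac{\lambda^m\,(1+\tau)_n}{\gamma^m_{\mathrm{eff}}\,(1+r_1)_n(1+r_2)_n}\left(\frac{\lambda^p}{\gamma^p}\right)^n,\quad n\ge1, \tag{54}$$

where

$$\tau=\frac{\gamma_2^m+q_{21}}{\gamma^p}, \tag{55}$$

and $(x)_n$ represents the rising factorial. Thus, $a_n$ can be obtained from (35) as

$$a_n=\frac{\lambda^p}{n\gamma^p}\,b_{n-1}^{(1)},\quad n\ge1. \tag{56}$$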

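Inserting (54) into (56) gives

$$a_n=\frac{\lambda^m r_1r_2}{\gamma^m_{\mathrm{eff}}\,\tau}\,\frac{(\tau)_n}{n\,(r_1)_n(r_2)_n}\left(\frac{\lambda^p}{\gamma^p}\right)^n,\quad n\ge1, \tag{57}$$

and, similarly, inserting (54) into (50) gives

$$b_n^{(2)}=\frac{q_{12}\lambda^m\,(\tau)_n}{\gamma^m_{\mathrm{eff}}(\gamma_2^m+q_{21})\,(1+r_1)_n(1+r_2)_n}\left(\frac{\lambda^p}{\gamma^p}\right)^n,\quad n\ge1. \tag{58}$$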
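The closed forms (54), (57) and (58) can be cross-checked against a direct iteration of the recurrences (49)–(50). The Python sketch below does this for a handful of terms; all rate values and helper names are hypothetical and chosen only for illustration.

```python
from math import sqrt

def rising(x, n):
    """Rising factorial (x)_n = x (x+1) ... (x+n-1), with (x)_0 = 1."""
    out = 1.0
    for k in range(n):
        out *= x + k
    return out

# Hypothetical rate values, chosen only for illustration.
lam_m, lam_p, gam_p = 2.0, 5.0, 1.0         # transcription, translation, protein decay
gam1, gam2, q12, q21 = 0.8, 0.3, 1.5, 0.7   # mRNA decay (active, inactive) and switching

gam_eff = gam1 + q12 * gam2 / (gam2 + q21)               # (48)
tau = (gam2 + q21) / gam_p                               # (55)
s = gam1 + q12 + gam2 + q21
d = sqrt((gam2 + q21 - gam1 - q12) ** 2 + 4 * q21 * q12)
r1, r2 = (s + d) / (2 * gam_p), (s - d) / (2 * gam_p)    # (53)

def closed_forms(n):
    """Return (b_n^(1), b_n^(2), a_n) from (54), (58) and (57), for n >= 1."""
    ratio = (lam_p / gam_p) ** n
    denom = rising(1 + r1, n) * rising(1 + r2, n)
    b1 = lam_m * rising(1 + tau, n) / (gam_eff * denom) * ratio
    b2 = q12 * lam_m * rising(tau, n) / (gam_eff * (gam2 + q21) * denom) * ratio
    a = (lam_m * r1 * r2 / (gam_eff * tau) * rising(tau, n)
         / (n * rising(r1, n) * rising(r2, n)) * ratio)
    return b1, b2, a

# Direct iteration of (49)-(50), starting from b_0^(1) = lam_m / gam_eff.
b1_prev = lam_m / gam_eff
for n in range(1, 6):
    quad = (gam1 + q12 + n * gam_p) * (gam2 + q21 + n * gam_p) - q12 * q21
    b1 = lam_p * (gam2 + q21 + n * gam_p) / quad * b1_prev   # (51)
    b2 = q12 * b1 / (gam2 + q21 + n * gam_p)                 # (50)
    a = lam_p / (n * gam_p) * b1_prev                        # (56)
    print(n, (b1, b2, a), closed_forms(n))
    b1_prev = b1
```

Each printed pair of triples should coincide up to floating-point rounding.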

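Substituting (54), (57), and (58) into (37), and writing $z$ for the protein variable $y$, we obtain an explicit representation of the stationary solution

$$G(x_1,x_2,z)=\exp\Bigg(\frac{\lambda^m\lambda^p}{\gamma^m_{\mathrm{eff}}\gamma^p}\int_1^z{}_2F_2\!\left(\begin{matrix}1,\,1+\tau\\1+r_1,\,1+r_2\end{matrix};\frac{\lambda^p}{\gamma^p}(s-1)\right)\mathrm{d}s+\frac{\lambda^m(x_1-1)}{\gamma^m_{\mathrm{eff}}}\,{}_2F_2\!\left(\begin{matrix}1,\,1+\tau\\1+r_1,\,1+r_2\end{matrix};\frac{\lambda^p}{\gamma^p}(z-1)\right)+\frac{q_{12}\lambda^m(x_2-1)}{\gamma^m_{\mathrm{eff}}(\gamma_2^m+q_{21})}\,{}_2F_2\!\left(\begin{matrix}1,\,\tau\\1+r_1,\,1+r_2\end{matrix};\frac{\lambda^p}{\gamma^p}(z-1)\right)\Bigg),$$

where

$${}_pF_q\!\left(\begin{matrix}a_1,\ldots,a_p\\b_1,\ldots,b_q\end{matrix};\tilde z\right)=\sum_{n=0}^{\infty}\frac{(a_1)_n\cdots(a_p)_n}{(b_1)_n\cdots(b_q)_n}\,\frac{\tilde z^n}{n!}$$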

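is the generalised hypergeometric function [33]. Furthermore, combining (47) and (56) yields an equivalent expression

$$\langle n\rangle=\frac{\lambda^p\lambda^m(\gamma_2^m+q_{21})}{\gamma^p\left((\gamma_1^m+q_{12})(\gamma_2^m+q_{21})-q_{12}q_{21}\right)}=\frac{\lambda^p\lambda^m}{\gamma^p\gamma^m_{\mathrm{eff}}}$$

for the protein mean given by (43) in terms of the model parameters. Likewise, substituting (56) and (51) into (44) and simplifying gives

$$F=1+\frac{b_1^{(1)}}{b_0^{(1)}}=1+\frac{\lambda^p}{\gamma^p+\gamma_1^m+\dfrac{q_{12}(\gamma^p+\gamma_2^m)}{\gamma^p+\gamma_2^m+q_{21}}} \tag{59}$$

for the steady-state protein Fano factor as a function of the model parameters.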
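The Fano factor (59) is straightforward to evaluate. The sketch below, with hypothetical function names and rate values, compares the closed form with the ratio $1+b_1^{(1)}/b_0^{(1)}$ obtained from (51) at $n=1$, and confirms that switching off the inactivation loop ($q_{12}=0$) reduces (59) to $1+\lambda^p/(\gamma^p+\gamma_1^m)$.

```python
def fano_inactivation_loop(lam_p, gam_p, gam1, gam2, q12, q21):
    """Steady-state protein Fano factor (59) of the inactivation loop model."""
    return 1.0 + lam_p / (gam_p + gam1 + q12 * (gam_p + gam2) / (gam_p + gam2 + q21))

def fano_via_recursion(lam_p, gam_p, gam1, gam2, q12, q21):
    """Same quantity written as 1 + b_1^(1)/b_0^(1), using (51) with n = 1."""
    quad = (gam1 + q12 + gam_p) * (gam2 + q21 + gam_p) - q12 * q21
    return 1.0 + lam_p * (gam2 + q21 + gam_p) / quad

# Hypothetical rate values, for illustration only.
params = dict(lam_p=5.0, gam_p=1.0, gam1=0.8, gam2=0.3, q12=1.5, q21=0.7)
f_closed = fano_inactivation_loop(**params)
f_recursive = fano_via_recursion(**params)
print(f_closed, f_recursive)

# With q12 = 0 no mRNA is ever inactivated, and (59) reduces to
# 1 + lam_p / (gam_p + gam1).
no_loop = dict(params, q12=0.0)
f_no_loop = fano_inactivation_loop(**no_loop)
print(f_no_loop, 1.0 + 5.0 / (1.0 + 0.8))
```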

Multiphasic mRNA lifetime

In this section, we consider the case in which mRNA molecules possess $K>2$ stages of their lifetime, where the transition rates correspond to the ageing of an mRNA molecule. The chemical reaction system for this multiphasic model was given in (4). We note that kinetic proofreading cascades can be an interesting application of our multiphasic model [39].

By (4), the lifetime of an mRNA molecule consists of $K$ stages, each of which lasts $1/(K\gamma^m_{\mathrm{eff}})$ on average. The mean total mRNA lifetime is then $1/\gamma^m_{\mathrm{eff}}$, so that $\gamma^m_{\mathrm{eff}}$ is interpreted as the effective mRNA decay rate. The multiphasic mRNA decay in $K$ steps leads to an Erlang-distributed lifetime with mean $1/\gamma^m_{\mathrm{eff}}$ and variance $1/(K(\gamma^m_{\mathrm{eff}})^2)$, whereas the lifetime distribution is exponential in the standard model (1).
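The lifetime statistics can be illustrated by sampling. The sketch below assumes that each of the $K$ stages is exponentially distributed with rate $K\gamma^m_{\mathrm{eff}}$ and independent of the others, so that the total lifetime is a sum of $K$ such waiting times with mean $1/\gamma^m_{\mathrm{eff}}$ and variance $K\cdot 1/(K\gamma^m_{\mathrm{eff}})^2=1/(K(\gamma^m_{\mathrm{eff}})^2)$. The values of `K`, `gam_eff`, the sample size and the seed are arbitrary choices.

```python
import random

def sample_lifetime(K, gam_eff, rng):
    """One mRNA lifetime: sum of K exponential stages, each with rate K * gam_eff."""
    return sum(rng.expovariate(K * gam_eff) for _ in range(K))

rng = random.Random(0)   # fixed seed, so that the run is reproducible
K, gam_eff, n_samples = 4, 1.0, 100_000
lifetimes = [sample_lifetime(K, gam_eff, rng) for _ in range(n_samples)]

mean_T = sum(lifetimes) / n_samples
var_T = sum((t - mean_T) ** 2 for t in lifetimes) / n_samples

print("sample mean    ", mean_T, "expected", 1 / gam_eff)
print("sample variance", var_T, "expected", 1 / (K * gam_eff ** 2))
```

Setting `K = 1` in this sketch recovers the exponential lifetime of the standard model, for which the variance equals the squared mean.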

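The multiphasic model (4) can be obtained by making the following choices in the general model statement (2):

$$\lambda_i^m=\begin{cases}\lambda^m&\text{for } i=1,\\0&\text{for } i\ne1,\end{cases}$$

and

$$\gamma_i^m=\begin{cases}K\gamma^m_{\mathrm{eff}}&\text{if } i=K,\\0&\text{otherwise.}\end{cases}$$

The transition matrix $Q$ (32) for the multiphasic model takes the form of

$$Q=K\gamma^m_{\mathrm{eff}}\begin{pmatrix}-1&1&&&\\&-1&1&&\\&&\ddots&\ddots&\\&&&-1&1\\&&&&0\end{pmatrix}, \tag{60}$$

and the matrix $A$ (31) is given by

$$A=K\gamma^m_{\mathrm{eff}}\begin{pmatrix}0&&&\\&\ddots&&\\&&0&\\&&&1\end{pmatrix}. \tag{61}$$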

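Inserting (61) and (60) into (34), we obtain the system of recurrence equations

$$K\gamma^m_{\mathrm{eff}}\begin{pmatrix}1&&&&&\\-1&1&&&&\\&\ddots&\ddots&&&\\&&-1&1&&\\&&&\ddots&\ddots&\\&&&&-1&1\end{pmatrix}\begin{pmatrix}b_0^{(1)}\\b_0^{(2)}\\\vdots\\b_0^{(i)}\\\vdots\\b_0^{(K)}\end{pmatrix}=\begin{pmatrix}\lambda^m\\0\\\vdots\\0\\\vdots\\0\end{pmatrix}, \tag{62}$$

from which, upon taking the $i$-th row of (62) and solving the recursive equations

$$-K\gamma^m_{\mathrm{eff}}\,b_0^{(i-1)}+K\gamma^m_{\mathrm{eff}}\,b_0^{(i)}=0,\quad\text{for } 2\le i\le K,$$

where $b_0^{(1)}=\lambda^m/(K\gamma^m_{\mathrm{eff}})$, we recover

$$b_0^{(i)}=\frac{\lambda^m}{K\gamma^m_{\mathrm{eff}}}. \tag{63}$$

Formula (63) gives the mean number of mRNA molecules in the $i$-th stage of their lifetime. Note that the matrix $B$ (33) takes the form of $B=\lambda^pI$, where $I$ is the identity matrix.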

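Having found the first moments (i.e. means) (63), we then determine the second moments. Taking $n=1$ in (30), we have

$$\begin{pmatrix}K\gamma^m_{\mathrm{eff}}+\gamma^p&&&&&\\-K\gamma^m_{\mathrm{eff}}&K\gamma^m_{\mathrm{eff}}+\gamma^p&&&&\\&\ddots&\ddots&&&\\&&-K\gamma^m_{\mathrm{eff}}&K\gamma^m_{\mathrm{eff}}+\gamma^p&&\\&&&\ddots&\ddots&\\&&&&-K\gamma^m_{\mathrm{eff}}&K\gamma^m_{\mathrm{eff}}+\gamma^p\end{pmatrix}\begin{pmatrix}b_1^{(1)}\\b_1^{(2)}\\\vdots\\b_1^{(i)}\\\vdots\\b_1^{(K)}\end{pmatrix}=\frac{\lambda^p\lambda^m}{K\gamma^m_{\mathrm{eff}}}\begin{pmatrix}1\\1\\\vdots\\1\\\vdots\\1\end{pmatrix}, \tag{64}$$

from which we obtain the first term of the sequence $b_1^{(i)}$ as

$$b_1^{(1)}:=u=\frac{\lambda^p\lambda^m}{K\gamma^m_{\mathrm{eff}}(K\gamma^m_{\mathrm{eff}}+\gamma^p)}. \tag{65}$$

Equation (64) implies that

$$-K\gamma^m_{\mathrm{eff}}\,b_1^{(i-1)}+(K\gamma^m_{\mathrm{eff}}+\gamma^p)\,b_1^{(i)}=\frac{\lambda^p\lambda^m}{K\gamma^m_{\mathrm{eff}}},\quad\text{for } 2\le i\le K,$$

which can equivalently be rewritten as

$$b_1^{(i)}=u+v\,b_1^{(i-1)},\quad 2\le i\le K, \tag{66}$$

where we set

$$v=\frac{K\gamma^m_{\mathrm{eff}}}{K\gamma^m_{\mathrm{eff}}+\gamma^p} \tag{67}$$

for simplicity. Combining (66) and (65), we obtain

$$b_1^{(i)}=\frac{u}{1-v}+v^{i-1}\left(u-\frac{u}{1-v}\right),\quad 1\le i\le K, \tag{68}$$

from which all the elements of $b_1^{(i)}$ (thereby the second moments) can be iteratively obtained. It is worth noting that one can derive higher moments using formula (30), but we limit our study to the first two moments.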
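Since the systems (62) and (64) are lower bidiagonal, they can be solved by forward substitution and compared with the closed forms (63) and (68). The following Python sketch does so; the helper name and all parameter values are hypothetical, and the right-hand side of (64) is written as $\lambda^p b_0^{(i)}$, which coincides with $\lambda^p\lambda^m/(K\gamma^m_{\mathrm{eff}})$ by (63).

```python
def solve_lower_bidiagonal(diag, sub, rhs):
    """Forward substitution for diag * x[i] + sub * x[i-1] = rhs[i],
    with constant diagonal and subdiagonal entries."""
    x = []
    for i, r in enumerate(rhs):
        prev = x[i - 1] if i > 0 else 0.0
        x.append((r - sub * prev) / diag)
    return x

# Hypothetical parameter values, for illustration only.
K, lam_m, lam_p, gam_eff, gam_p = 5, 2.0, 5.0, 1.25, 1.0
k = K * gam_eff

b0 = solve_lower_bidiagonal(k, -k, [lam_m] + [0.0] * (K - 1))        # system (62)
b1 = solve_lower_bidiagonal(k + gam_p, -k, [lam_p * x for x in b0])  # system (64)

u = lam_p * lam_m / (k * (k + gam_p))   # (65)
v = k / (k + gam_p)                     # (67)
b1_closed = [u / (1 - v) + v ** (i - 1) * (u - u / (1 - v))
             for i in range(1, K + 1)]  # (68)

print(b0)   # every entry should equal lam_m / (K * gam_eff), cf. (63)
print(b1)
print(b1_closed)
```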

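Next, we focus on calculating the first two terms of the sequence $a_n$ (35). Setting $n=1,2$ in (35) and inserting (63) and (68) into the resulting equations, respectively, we get

$$a_1=\frac{\lambda^p\lambda^m}{\gamma^p\gamma^m_{\mathrm{eff}}}\quad\text{and}\quad a_2=\frac{\lambda^pu\left(K+v\left(-1-K+v^K\right)\right)}{2\gamma^p(1-v)^2}. \tag{69}$$

Having found the first two terms of $a_n$, we are now ready to calculate the Fano factor. Inserting (69) into (44), and substituting (65) and (67) into the resulting expression yields

$$F_m=1+\frac{\lambda^p}{\gamma^p}\left(1+\frac{\gamma^m_{\mathrm{eff}}}{\gamma^p}\left(-1+\left(\frac{K\gamma^m_{\mathrm{eff}}}{K\gamma^m_{\mathrm{eff}}+\gamma^p}\right)^K\right)\right),$$

where $F_m$ stands for the multiphasic Fano factor.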
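The closed form for $F_m$ can be compared with $1+2a_2/a_1$ computed by summing the terms $b_1^{(i)}$ from the recursion (66). The sketch below assumes that, for $B=\lambda^pI$, formula (35) gives $a_2=\lambda^p\sum_{i=1}^{K}b_1^{(i)}/(2\gamma^p)$, which is consistent with (69); the function names and parameter values are hypothetical.

```python
def fano_multiphasic(K, lam_p, gam_p, gam_eff):
    """Closed-form multiphasic protein Fano factor F_m."""
    v = K * gam_eff / (K * gam_eff + gam_p)
    return 1.0 + lam_p / gam_p * (1.0 + gam_eff / gam_p * (v ** K - 1.0))

def fano_from_sums(K, lam_m, lam_p, gam_p, gam_eff):
    """Same quantity as 1 + 2*a_2/a_1, with a_2 built by summing b_1^(i) from (66).
    Assumes a_2 = lam_p * sum_i b_1^(i) / (2 * gam_p)."""
    k = K * gam_eff
    u, v = lam_p * lam_m / (k * (k + gam_p)), k / (k + gam_p)
    b1, total = u, u
    for _ in range(K - 1):
        b1 = u + v * b1
        total += b1
    a1 = lam_p * lam_m / (gam_p * gam_eff)
    a2 = lam_p * total / (2 * gam_p)
    return 1.0 + 2.0 * a2 / a1

lam_m, lam_p, gam_p, gam_eff = 2.0, 5.0, 1.0, 1.25   # hypothetical values
for K in (1, 2, 5, 20):
    print(K, fano_multiphasic(K, lam_p, gam_p, gam_eff),
          fano_from_sums(K, lam_m, lam_p, gam_p, gam_eff))
```

For $K=1$ the expression reduces to $1+\lambda^p/(\gamma^p+\gamma^m_{\mathrm{eff}})$, and for these illustrative values the printed Fano factor decreases as $K$ grows.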

Acknowledgements

AS acknowledges support by ARO W911NF-19-1-0243 and NIH grants R01GM124446 and R01GM126557.

About this supplement

This article has been published as part of BMC Bioinformatics Volume 24 Supplement 1, 2023: Special Issue of the 19th International Conference on Computational Methods in Systems Biology. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-24-supplement-1.

Abbreviations

mRNA: Messenger RNA
CME: Chemical master equation
PDE: Partial differential equation
ODE: Ordinary differential equation
imRNA: Inactive mRNA
UTR: Untranslated region
CV: Coefficient of variation
SD: Standard deviation
yEGFP: Yeast-enhanced green fluorescent protein
ymNeonGreen: Yellow monomeric fluorescent protein

Author contributions

CÇ performed simulations. CÇ and PB analysed and interpreted the results. AS conceived the research. CÇ, PB, and AS wrote the manuscript. All authors have read and approved the final manuscript.

Funding

This work has been supported by the Slovak Research and Development Agency under the contract No. APVV-18-0308 and the VEGA grants 1/0339/21 and 1/0755/22. The funding body had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Availability of data and materials

Not applicable.

Declarations

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no conflict of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297(5584):1183–6.
2. Munsky B, Neuert G, van Oudenaarden A. Using gene expression noise to understand gene regulation. Science. 2012;336(6078):183–7.
3. Raser JM, O’Shea EK. Noise in gene expression: origins, consequences, and control. Science. 2005;309(5743):2010–3.
4. Sanchez A, Choubey S, Kondev J. Regulation of noise in gene expression. Annu Rev Biophys. 2013;42:469–91.
5. Dar RD, Shaffer SM, Singh A, Razooky BS, Simpson ML, Raj A. Transcriptional bursting explains the noise-versus-mean relationship in mRNA and protein levels. PLOS ONE. 2016;11(7):e0158298. doi: 10.1371/journal.pone.0158298.
6. Kim S, Jacobs-Wagner C. Effects of mRNA degradation and site-specific transcriptional pausing on protein expression noise. Biophys J. 2018;114(7):1718–29.
7. Fraser LCR, Dikdan RJ, Dey S, Singh A, Tyagi S. Reduction in gene expression noise by targeted increase in accessibility at gene loci. Proceed Nat Acad Sci. 2021;118(42):e2018640118. doi: 10.1073/pnas.2018640118.
8. Modi S, Dey S, Singh A. Noise suppression in stochastic genetic circuits using PID controllers. PLOS Comput Biol. 2021;17(7):1–25. doi: 10.1371/journal.pcbi.1009249.
9. Smith M, Soltani M, Kulkarni R, Singh A. Modulation of stochastic gene expression by nuclear export processes. In: 2021 60th IEEE Conference on Decision and Control (CDC). 2021. p. 655–60.
10. Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proceed Nat Acad Sci. 2002;99(20):12795–800.
11. Thomas P. Intrinsic and extrinsic noise of gene expression in lineage trees. Sci Rep. 2019;9(1):474.
12. Dacheux E, Malys N, Meng X, Ramachandran V, Mendes P, McCarthy JEG. Translation initiation events on structured eukaryotic mRNAs generate gene expression noise. Nucleic Acids Res. 2017;45(11):6981–92.
13. Chiaruttini C, Guillier M. On the role of mRNA secondary structure in bacterial translation. Wiley Interdiscip Rev: RNA. 2020;11(3):e1579.
14. Roy B, Jacobson A. The intimate relationships of mRNA decay and translation. Trends Genet. 2013;29(12):691–9.
15. Bokes P, King JR, Wood ATA, Loose M. Transcriptional bursting diversifies the behaviour of a toggle switch: hybrid simulation of stochastic gene expression. Bull Math Biol. 2013;75(2):351–71.
16. Kurasov P, Mugnolo D, Wolf V. Analytic solutions for stochastic hybrid models of gene regulatory networks. J Math Biol. 2021;82(1):1–29.
17. Singh A, Hespanha JP. Stochastic hybrid systems for studying biochemical processes. Philosoph Trans Royal Soc A: Math, Phys Eng Sci. 2010;368(1930):4995–5011.
18. Bokes P, King JR, Wood ATA, Loose M. Exact and approximate distributions of protein and mRNA levels in the low-copy regime of gene expression. J Math Biol. 2012;64(5):829–54. doi: 10.1007/s00285-011-0433-5.
19. Shahrezaei V, Swain PS. Analytical distributions for stochastic gene expression. Proceed Nat Acad Sci. 2008.
20. Peccoud J, Ycart B. Markovian modeling of gene-product synthesis. Theor Popul Biol. 1995;48(2):222–34.
21. Pendar H, Platini T, Kulkarni RV. Exact protein distributions for stochastic models of gene expression using partitioning of Poisson processes. Phys Rev E. 2013;87(4):042720.
22. Schnoerr D, Sanguinetti G, Grima R. Approximation and inference methods for stochastic biochemical kinetics-a tutorial review. J Phys A: Math Theor. 2017;50(9):093001.
23. Bartman CR, Hamagami N, Keller CA, Giardine B, Hardison RC, Blobel GA, et al. Transcriptional burst initiation and polymerase pause release are key control points of transcriptional regulation. Mol Cell. 2019;73(3):519–32.
24. Thattai M, van Oudenaarden A. Intrinsic noise in gene regulatory networks. Proceed Nat Acad Sci. 2001;98(15):8614–9.
25. Li J, Ge H, Zhang Y. Fluctuating-rate model with multiple gene states. J Math Biol. 2020;81(4):1099–141.
26. Zhou T, Liu T. Quantitative analysis of gene expression systems. Quant Biol. 2015;3(4):168–81.
27. Szavits-Nossan J, Grima R. Mean-field theory accurately captures the variation of copy number distributions across the mRNA life cycle. Phys Rev E. 2022;105:014410. doi: 10.1103/PhysRevE.105.014410.
28. Filatova T, Popović N, Grima R. Modulation of nuclear and cytoplasmic mRNA fluctuations by time-dependent stimuli: Analytical distributions. Math Biosci. 2022;347:108828.
29. Çelik C, Bokes P, Singh A. Protein noise and distribution in a two-stage gene-expression model extended by an mRNA inactivation loop. In: Cinquemani E, Paulevé L, editors. Computational Methods in Systems Biology. Cham: Springer International Publishing; 2021. p. 215–29.
30. Fan R, Hilfinger A. The effect of microRNA on protein variability and gene expression fidelity. Biophys J. 2023;122(5):905–23.
31. Paulsson J. Summing up the noise in gene networks. Nature. 2004;427(6973):415–8.
32. Singh A, Bokes P. Consequences of mRNA transport on stochastic variability in protein levels. Biophys J. 2012;103(5):1087–96.
33. Abramowitz M, Stegun IA, Romer RH. Handbook of mathematical functions with formulas, graphs, and mathematical tables. Am J Phys. 1988;56(10):958–8.
34. Jahnke T, Huisinga W. Solving the chemical master equation for monomolecular reaction systems analytically. J Math Biol. 2007;54(1):1–26.
35. Warren PB, Tănase-Nicola S, ten Wolde PR. Exact results for noise power spectra in linear biochemical reaction networks. J Chem Phys. 2006;125(14):144904.
36. Kampen NGV. Stochastic Processes in Physics and Chemistry. 3rd ed. North Holland: Elsevier; 2007.
37. Ham L, Schnoerr D, Brackston RD, Stumpf MPH. Exactly solvable models of stochastic gene expression. J Chem Phys. 2020;152(14):144106. doi: 10.1063/1.5143540.
38. Rodríguez Martínez M, Soriano J, Tlusty T, Pilpel Y, Furman I. Messenger RNA fluctuations and regulatory RNAs shape the dynamics of a negative feedback loop. Phys Rev E. 2010;81(3):031924.
39. Hopfield JJ. Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proceed Nat Acad Sci. 1974;71(10):4135–9.


Articles from BMC Bioinformatics are provided here courtesy of BMC
