Biophysical Journal. 2020 Feb 25;118(7):1517–1525. doi: 10.1016/j.bpj.2020.02.016

Stochastic Modeling of Autoregulatory Genetic Feedback Loops: A Review and Comparative Study

James Holehouse 1, Zhixing Cao 1,2, Ramon Grima 1
PMCID: PMC7136347  PMID: 32155410

Abstract

Autoregulatory feedback loops are one of the most common network motifs. A wide variety of stochastic models have been constructed to understand how the fluctuations in protein numbers in these loops are influenced by the kinetic parameters of the main biochemical steps. These models differ according to 1) which subcellular processes are explicitly modeled, 2) the modeling methodology employed (discrete, continuous, or hybrid), and 3) whether they can be analytically solved for the steady-state distribution of protein numbers. We discuss the assumptions and properties of the main models in the literature, summarize our current understanding of the relationship between them, and highlight some of the insights gained through modeling.

Introduction

Gene regulatory networks provide an abstraction of the complex biochemical interactions behind transcription and translation, the central dogma of molecular biology. Feedback has been identified as an important motif in such networks, defined through the regulation of an upstream process by one downstream of it. Autoregulation is the most basic kind of feedback loop—a protein expressed from a gene activates or suppresses its own transcription. These lead to positive or negative feedback, respectively. It has been estimated that 40% of all transcription factors in Escherichia coli self-regulate (1), with most of them participating in autorepression (2). Many biological systems utilize a combination of positive and negative feedback loops, such as the circadian and segmentation clocks (3,4).

Because autoregulation is so widespread, the computational and experimental study of autoregulatory feedback loops is an important field. Measurements of the distribution of protein numbers along an arbitrarily chosen lineage (using a mother machine (5)) or across population snapshots (using flow cytometry (6)) are now routine. Mathematical models are a useful tool for understanding which interactions in feedback loops lead to the observed protein distributions, potentially giving insight into how noise (fluctuations in molecule numbers (7)) is managed at the subcellular level (8,9). These models have also been used to understand how autoactivation influences the sensitivity to input signals and the speed of induction (10,11) and to gain insight into the sources of noise in autoregulatory networks (12,13). Various inference approaches have also been devised to estimate the rate constants characterizing feedback loops from population snapshot data (14, 15, 16, 17).

The conventional stochastic description of gene regulatory networks is given by the chemical master equation (CME), a time-evolution equation for the probability of observing a gene state and a certain number of gene products at a given time (18). This Markovian description is discrete in the sense that it takes into account that molecule numbers change by integer amounts when a reaction occurs. In Discrete Models of Autoregulation, we describe the most common coarse-grained CME models for autoregulatory feedback loops, elucidate the relationship between them, and identify the regions of parameter space where their analytical predictions for the distribution of protein numbers are accurate compared to a fine-grained model. In Continuous and Hybrid Models of Autoregulation, we compare and discuss continuous approximations of the CME using Fokker-Planck equations and partial integro-differential equations (PIDEs) and briefly review other continuous approaches. In Insights from Models, we outline the main biological insights obtained from stochastic models. Finally, we conclude with Open Problems, in which we identify open problems.
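For reference, a CME of the kind described above can be written in the following generic form. The notation is ours, not the paper's: $\mathbf{n}$ collects the gene state and the molecule numbers, $a_j$ is the propensity of reaction $j$, $\mathbf{s}_j$ is the change in $\mathbf{n}$ when reaction $j$ fires, and $R$ is the number of reactions.

```latex
\frac{\partial P(\mathbf{n},t)}{\partial t}
  = \sum_{j=1}^{R} \left[ a_j(\mathbf{n}-\mathbf{s}_j)\,P(\mathbf{n}-\mathbf{s}_j,t)
  - a_j(\mathbf{n})\,P(\mathbf{n},t) \right]
```

The first term collects probability flowing into state $\mathbf{n}$ and the second term collects probability flowing out of it.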

Discrete models of autoregulation

From a biologist's perspective, a minimal model of autoregulation should describe the main biochemical processes that carry information from gene to messenger RNA (mRNA) to protein and back to the gene. Hence, the model should describe transcription and translation (the two steps at the heart of the central dogma of molecular biology), mRNA and protein degradation and dilution, and interactions of proteins with genes. For simplicity, we consider the case in which there is a single gene copy and all processes are modeled as first-order reactions except the protein-gene interactions, which naturally follow second-order kinetics. We refer to this model as the full model because it will be our ground truth, i.e., the finest scale model that we shall consider here. The reactions describing this model are $G \xrightarrow{\rho_u} G + M$, $M \xrightarrow{d_m} \varnothing$, $M \xrightarrow{k} M + P$, $P + G \xrightarrow{\sigma_b} G^{*}$, $G^{*} \xrightarrow{\sigma_u} G + P$, $G^{*} \xrightarrow{\rho_b} G^{*} + M$, $P \xrightarrow{d_p} \varnothing$, where $G$ and $G^{*}$ denote the unbound and protein-bound gene states, $M$ denotes mRNA, and $P$ denotes protein. Note that for simplicity, we have assumed noncooperative protein binding. Feedback is positive (i.e., protein enhances its own production) when $\rho_b > \rho_u$ and negative otherwise. The modeling of protein degradation by an effective first-order process captures both active degradation and dilution due to cell division (19); for the case of no feedback, it has recently been shown that this approximation is accurate provided the mean number of mRNAs produced per cycle is low and the cell cycle length variability is large (20). Although this model is intuitive, it has not been studied extensively because the mathematical description of its stochastic dynamics, as provided by the CME, is not easy to solve analytically. In fact, even in the absence of the feedback loop (i.e., no protein-gene interactions), its master equation has still not been solved exactly (21). Hence, historically, simplified versions of the full model have received much more attention in the literature. These are the models by Hornos et al. (22) in 2005, Grima et al. (23) in 2012, and Kumar et al. (24) in 2014. Henceforth, we shall refer to these as the Hornos, Grima, and Kumar models. There exist other discrete models, e.g., (25), which in certain limits reduce to the aforementioned three.
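The reaction scheme of the full model can be simulated directly with the stochastic simulation algorithm. The following is a minimal sketch, not the authors' code: the function name, argument names, and the choice to return a time-averaged free-protein number are our own, the bound gene state is labeled G* here, and both transcription rates are assumed to be positive so that the total propensity never vanishes.

```python
import random


def simulate_full_model(rho_u, rho_b, k, d_m, d_p, sigma_b, sigma_u,
                        t_end, seed=0):
    """Gillespie SSA of the full model; returns the time-averaged free-protein number."""
    rng = random.Random(seed)
    t = 0.0
    bound = False        # False: gene in state G, True: gene in state G*
    m = 0                # mRNA copy number
    p = 0                # free protein copy number
    protein_time = 0.0   # integral of p over time
    while True:
        rates = [
            rho_b if bound else rho_u,       # transcription: M -> M + 1
            d_m * m,                         # mRNA decay: M -> M - 1
            k * m,                           # translation: P -> P + 1
            0.0 if bound else sigma_b * p,   # binding: P + G -> G*
            sigma_u if bound else 0.0,       # unbinding: G* -> G + P
            d_p * p,                         # protein decay: P -> P - 1
        ]
        total = sum(rates)                   # > 0 because rho_u, rho_b > 0
        tau = rng.expovariate(total)         # waiting time to the next reaction
        if t + tau >= t_end:
            protein_time += p * (t_end - t)
            break
        protein_time += p * tau
        t += tau
        # Pick reaction j with probability rates[j] / total.
        threshold = rng.random() * total
        j, acc = 0, rates[0]
        while threshold >= acc and j < len(rates) - 1:
            j += 1
            acc += rates[j]
        if j == 0:
            m += 1
        elif j == 1:
            m -= 1
        elif j == 2:
            p += 1
        elif j == 3:
            p -= 1
            bound = True
        elif j == 4:
            p += 1
            bound = False
        else:
            p -= 1
    return protein_time / t_end
```

With equal transcription rates in both gene states (no feedback), the time average should approach the transcription rate times k/d_m divided by d_p, which gives a quick sanity check of the implementation. Fast promoter switching makes the simulation expensive because most events are then binding and unbinding.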

The three models share a few common properties: 1) they only describe protein fluctuations, i.e., there is no explicit mRNA description; 2) the models are discrete in the sense that protein numbers change by discrete integer amounts when reactions occur; and 3) the CME for each model admits an exact solution in steady-state conditions. These exact solutions have been obtained using the method of generating functions, but in other studies using similar models, the solution was obtained using the Poisson representation (12,26, 27, 28). There are, however, important differences between the models, particularly in how they describe protein production and protein-gene interactions; these differences are not often spelled out but can be discerned from the form of the CME. The Hornos model assumes protein molecules are produced one at a time and neglects changes in the protein molecule numbers when a protein binds to or unbinds from the gene. The Hornos model (excluding bound-protein degradation) is explicitly given by the set of reactions $G \xrightarrow{\rho_u B} G + P$, $P \xrightarrow{d_p} \varnothing$, $P + G \xrightarrow{\sigma_b} P + G^{*}$, $G^{*} \xrightarrow{\sigma_u} G$, $G^{*} \xrightarrow{\rho_b B} G^{*} + P$. Note that the effective rate of protein production in state $G$ is $\rho_u B$, where $B = k/d_m$ (the mean number of proteins produced by an mRNA molecule during its lifetime). The reason for this is that if we define $\langle n_m \rangle$ as the mean mRNA number in the full model, then the effective mean protein production rate is $k \langle n_m \rangle \approx \rho_u k/d_m = \rho_u B$ when mRNA equilibrates rapidly ($d_m \gg d_p$, a common assumption as we discuss later). The same analysis follows for the effective production rate in state $G^{*}$. The Grima model also assumes protein molecules are produced one at a time but takes into account protein fluctuations from the binding-unbinding process, i.e., when a protein binds a gene, the protein numbers are decreased by one, and conversely, they are increased by one when unbinding occurs. The Grima model (excluding bound-protein degradation) is explicitly given by the set of reactions $G \xrightarrow{\rho_u B} G + P$, $P \xrightarrow{d_p} \varnothing$, $P + G \xrightarrow{\sigma_b} G^{*}$, $G^{*} \xrightarrow{\sigma_u} G + P$, $G^{*} \xrightarrow{\rho_b B} G^{*} + P$. The Kumar model is similar to the Hornos model except that proteins are produced in a burst (a phenomenon called translational bursting (29,30)), where the burst size is a random number. Specifically, the Kumar model is given by the set of reactions $G \xrightarrow{\rho_u} G + rP$, $P \xrightarrow{d_p} \varnothing$, $P + G \xrightarrow{\sigma_b} P + G^{*}$, $G^{*} \xrightarrow{\sigma_u} G$, $G^{*} \xrightarrow{\rho_b} G^{*} + rP$, where $r$ is a non-negative integer drawn from the geometric distribution with mean $B$. This implies that if the gene is in state $G$, then the rate at which a burst of size $r$ is produced is $\rho_u B^{r} (1 + B)^{-(1 + r)}$, which means a mean rate of protein production equal to $\rho_u B$, i.e., the same as in the full, Grima, and Hornos models (similar reasoning follows for state $G^{*}$). The differences between the models are illustrated in Fig. 1.
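The claim that geometric bursts give a mean protein production rate equal to the transcription rate times B can be checked numerically. The sketch below is illustrative only: the function names and the truncation `r_max` are our own choices, and the burst rate is rewritten in terms of q = B/(1 + B) to avoid overflow for large burst sizes.

```python
def burst_rate(rho, B, r):
    """Rate at which bursts of size r occur: rho * B**r / (1 + B)**(r + 1)."""
    q = B / (1.0 + B)
    return rho * (1.0 - q) * q**r


def total_burst_rate(rho, B, r_max=2000):
    """Sum of burst rates over all burst sizes (should equal rho)."""
    return sum(burst_rate(rho, B, r) for r in range(r_max + 1))


def mean_production_rate(rho, B, r_max=2000):
    """Mean number of proteins produced per unit time (should equal rho * B)."""
    return sum(r * burst_rate(rho, B, r) for r in range(r_max + 1))


if __name__ == "__main__":
    rho, B = 2.5, 10.0
    print(total_burst_rate(rho, B), mean_production_rate(rho, B))
```

Because the burst sizes include r = 0, the rates over all sizes sum to the transcription rate itself, while the size-weighted sum gives the mean protein production rate.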

Figure 1.

Models of an autoregulatory feedback loop. The full model has an explicit description of both mRNA and protein, but its CME has no known exact steady-state analytical solution. The Grima, Hornos, Kumar, and modified Kumar models represent approximations of the full model wherein only the protein is explicitly described, and the CME can be solved exactly in steady state. The Grima and Hornos models assume proteins are produced one at a time, whereas Kumar assumes bursty production with mean burst size $B$ (denoted by dashed circles). The Hornos and Kumar models neglect protein number fluctuations due to protein-gene binding and unbinding, whereas the Grima model takes them into account. The modified Kumar model is the same as the Kumar model but takes into account fluctuations due to binding and unbinding. The LMA is a discrete approximation of the full model given by the exact solution of the CME of a bursty protein production process with promoter switching and no feedback. The rate of switching to state $G^{*}$ is not $\sigma_b$ but an effective rate $\bar{\sigma}_b$, which is a function of all the parameters in the full model (see Supplementary Note 7 in (39)). Note that under the assumption of rapid mRNA equilibration, the mean rate of protein production is the same in all models, and hence, the models are indistinguishable when fluctuations are ignored.

The relationship of these models to each other and to the full model is still not completely understood. In Fig. 2, we summarize our current understanding of the regions of parameter space where the models' analytical prediction of the steady-state protein number distribution agrees with stochastic simulations of the full model (using the stochastic simulation algorithm (SSA) (31)) for the cases of positive feedback (Fig. 2, I–IV) and negative feedback (Fig. 2, V–VIII). We enforce fast promoter switching conditions by choosing the rate constants of protein-gene binding ($\sigma_b$) and unbinding ($\sigma_u$) to be large compared to the other rate constants. For the full model, we also choose the protein degradation rate $d_p = 1$ to be significantly smaller than the mRNA degradation rate $d_m = 10$. Both of these conditions are common to a number of genes in both prokaryotic and eukaryotic cells (21,32). For each type of feedback, there are four plots for combinations of small and large values of $L$ and $B$, where $L = \sigma_u/\sigma_b$ is the ratio of unbinding/binding rates (inversely proportional to the feedback strength $\sigma_b$) and $B = k/d_m$ is the mean burst size. For completeness, we also show the deterministic rate equation prediction for the protein number in the full model (vertical black lines).
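The rate equation prediction mentioned above can be sketched as follows. This is our own derivation, not taken from the paper: applying mass-action kinetics to the full model with one gene copy and setting all time derivatives to zero gives a quadratic for the steady-state free-protein number p, namely d_p p^2 + (d_p L - B rho_b) p - B rho_u L = 0. The function name is hypothetical.

```python
import math


def rate_equation_protein(rho_u, rho_b, B, d_p, L):
    """Non-negative root of d_p*p**2 + (d_p*L - B*rho_b)*p - B*rho_u*L = 0."""
    a = d_p
    b = d_p * L - B * rho_b
    c = -B * rho_u * L            # c <= 0, so the discriminant is never negative
    root = math.sqrt(b * b - 4.0 * a * c)
    if b <= 0.0:
        return (-b + root) / (2.0 * a)
    # Algebraically identical form that avoids cancellation when b > 0.
    return (-2.0 * c) / (b + root)


if __name__ == "__main__":
    # Parameters of panel (I) in Fig. 2: L = 1e-2, B = 1e-2, rho_u = 1e-2, rho_b = 1e3.
    print(rate_equation_protein(1e-2, 1e3, 1e-2, 1.0, 1e-2))
```

Without feedback (equal transcription rates) the quadratic factorizes and the root reduces to the transcription rate times B divided by d_p, independent of L.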

Figure 2.

Comparison of stochastic simulations of the full model (denoted as SSA) with the steady-state analytical distributions predicted by the reduced models of Grima (G), Hornos (H), Kumar (K), and the LMA (illustrated in Fig. 1). (I)–(IV) show positive feedback ($\rho_u < \rho_b$), and (V)–(VIII) show negative feedback ($\rho_u > \rho_b$). The rate equation prediction for mean protein number in the full model is denoted as RE. Note that $L = \sigma_u/\sigma_b$ is inversely proportional to the feedback strength and $B = k/d_m$ is the mean protein burst size. The LMA provides the most accurate discrete approximation of the full model (eight out of eight regions of parameter space), followed by Grima/Kumar (four out of eight regions) and Hornos (two out of eight regions). See text for discussion. The parameters for small $L$ (I, II, V, and VI) are $\sigma_u = 10^{3}$, $\sigma_b = 10^{5}$, and for large $L$ (III, IV, VII, and VIII), they are $\sigma_u = 10^{5}$, $\sigma_b = 10^{3}$. The parameter for small $B$ (I, III, V, and VII) is $B = 10^{-2}$, and for large $B$ (II, IV, VI, and VIII), it is $B = 10$. The decay rates are fixed to $d_m = 10$, $d_p = 1$. The rest of the parameters are (I) $\rho_u = 10^{-2}$, $\rho_b = 10^{3}$; (II) $\rho_u = 10^{-1}$, $\rho_b = 2.5$; (III) $\rho_u = 10^{2}$, $\rho_b = 10^{3}$; (IV) $\rho_u = 10^{-1}$, $\rho_b = 2.5$; (V) $\rho_u = 10^{3}$, $\rho_b = 10^{2}$; (VI) $\rho_u = 10$, $\rho_b = 1$; (VII) $\rho_u = 10^{3}$, $\rho_b = 10^{2}$; and (VIII) $\rho_u = 2.5$, $\rho_b = 10^{-1}$.

The following considerations allow us to deduce which models are accurate in which part of parameter space. In the fast promoter switching regime, models that assume protein fluctuations in the binding-unbinding process are negligible are incorrect when feedback is strong (L is small); the error introduced by this assumption is particularly appreciable when the mean protein numbers are small and feedback is positive (33,34)—we shall call this Property 1. This is because under such conditions, the time to switch from one promoter state to another is highly sensitive to small protein number fluctuations (see later for a more extensive discussion). Models that assume proteins are produced one molecule at a time are only correct when B is small—we shall call this Property 2. The reason for the latter property is as follows. It is well known that when mRNA decays much faster than protein, then the production of proteins in the full model without feedback occurs in bursts of random size described by a geometric distribution with a mean of B, and the Fano factor of the protein distribution is 1 + B (21). In models that assume proteins are produced one at a time, if there were no feedback, then the Fano factor of the protein distribution would be 1 (because the distribution would be Poisson). Hence, these models (with or without feedback) can only provide a good approximation to the full model when B is small.
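The Fano factor argument behind Property 2 can be illustrated numerically. The sketch below assumes the standard result, not derived in this section, that bursty production without feedback (burst frequency `a`, geometric bursts of mean `B`, first-order decay at rate `d_p`) has a negative binomial steady state. The function name and the truncation `n_max` are our own.

```python
def bursty_protein_moments(a, B, d_p, n_max=5000):
    """Mean and Fano factor of the negative binomial steady state.

    Assumed form: P(0) = (1 - q)**r and P(n + 1) = P(n) * q * (n + r) / (n + 1),
    with r = a / d_p and q = B / (1 + B).
    """
    r = a / d_p
    q = B / (1.0 + B)
    pmf = (1.0 - q) ** r          # P(0)
    mean = 0.0
    second_moment = 0.0
    for n in range(n_max):
        mean += n * pmf
        second_moment += n * n * pmf
        pmf *= q * (n + r) / (n + 1)   # advance to P(n + 1)
    variance = second_moment - mean * mean
    return mean, variance / mean


if __name__ == "__main__":
    for burst_size in (0.01, 10.0):
        print(burst_size, bursty_protein_moments(5.0, burst_size, 1.0))
```

For small B the Fano factor is close to 1, so a one-at-a-time (Poisson-like) production model is a reasonable stand-in. For large B it is close to 1 + B and the one-at-a-time models underestimate the noise.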

Armed with Properties 1 and 2, we can now explain Fig. 2. The Grima model takes into account binding-unbinding fluctuations but neglects bursting and hence agrees with the SSA of the full model only when $B$ is small (Fig. 2, I, III, V, and VII). The Hornos model neglects both binding-unbinding fluctuations and bursting and hence agrees with the SSA of the full model only when feedback is weak ($L$ is large) and $B$ is small (Fig. 2, III and VII). The Kumar model neglects binding-unbinding fluctuations but includes bursting and hence agrees with the SSA of the full model only when feedback is weak ($L$ is large) (Fig. 2, III, IV, VII, and VIII). Note that the Hornos model agrees with the Kumar model for low mean burst sizes independently of the feedback strength (35). The least intuitive and most surprising model failures are those stemming from Property 1, and hence, in what follows, we provide a mechanistic explanation for the observations in the small $L$ cases in Fig. 2. Consider the case of strong positive feedback ($\rho_u \ll \rho_b$) with small $L$ and small $B$ (Fig. 2 I). When proteins are present in the system, because we are in the regime of fast promoter switching, there will be rapid switching between the $G$ and $G^{*}$ states. However, in the rare case of an extinction of proteins in the $G^{*}$ state and in which protein binding fluctuations are neglected (Hornos and Kumar models), a transition from the bound state $G^{*}$ to the unbound state $G$ does not release a protein. The system then must wait a long time for a protein to be produced via the low effective mean transcription rate $\rho_u B$ if it is to leave state $G$, which causes the dominant mode of the distribution to be centered at zero. However, when protein binding fluctuations are included (Grima model and SSA), a transition from $G^{*}$ to $G$ does release a protein that can immediately bind to $G$ (because the binding reaction with rate constant $\sigma_b$ fires rapidly when $L$ is small), and hence the system does not readily encounter an extinction of proteins; rather, it spends more time in $G^{*}$, which leads to a dominant mode centered on a nonzero protein value. The discrepancies introduced by the Kumar model, which neglects binding-unbinding fluctuations, are smaller if the mean burst size is large (Fig. 2 II) because by the same reasoning as above, the larger burst size reduces the time to switch from $G$ to $G^{*}$ in the event of a protein extinction in the latter state. For negative feedback, similar mechanistic explanations can be formulated; see (33). It is noteworthy that when the Kumar model is modified so that it takes into account fluctuations due to binding-unbinding processes (illustrated in Fig. 1 as Modified Kumar), the steady-state protein distributions obtained from the SSA of this model are practically indistinguishable from the SSA of the full model. It has recently been shown that this model can be derived from the full model and that it admits an exact analytical steady-state solution (34).

Another common discrete approximation of the full model is the master equation for an effective birth-death process for protein in which the propensity of the production reaction is a Hill function of the instantaneous number of proteins $n$, whereas the propensity for protein decay is the same as for the usual first-order decay process. Specifically, the propensity for the production reaction reads $(L\rho_u + \rho_b n)/(L + n)$ (the symbols as defined in Fig. 1 and above). Hill-type propensities of this type or similar are in common use in the literature (10,36,37). The reduced master equation for this effective birth-death process can be solved exactly in steady state, and it is often thought to be a valid approximation of the full model in the limit of fast promoter switching. It is worth noting that Hill-type propensities for protein production are not rigorously derived but rather written by analogy to the effective rate of protein production obtained from the deterministic rate equations under fast equilibrium conditions. Hence, the master equation's validity under the same conditions is doubtful (38). Indeed, it has recently been shown that in the fast switching limit, the steady-state solution of this master equation is precisely the same as that of the Hornos model and hence is not an accurate approximation of the full model if $L$ is small (33) (the solution of this model and that of Hornos are indistinguishable for the parameters in Fig. 2). It has recently been shown using the multiscale averaging method that in the limit of fast promoter switching, the correct reduced master equation for a single autoregulatory loop has an effective Hill-type propensity for protein production (of the same type as discussed above), as well as an effective protein degradation rate that depends on $d_p$, $\sigma_u$, and $\sigma_b$ (34).
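Because the effective model is a one-variable birth-death process, its steady state follows from detailed balance, P(n + 1) = P(n) a(n) / (d_p (n + 1)). The sketch below assumes a production propensity of the form a(n) = (L r_u + r_b n)/(L + n), where `r_u` and `r_b` are hypothetical names for the effective protein production rates in the unbound and bound promoter states. The function name and the truncation `n_max` are also our own.

```python
def hill_birth_death_steady_state(r_u, r_b, L, d_p, n_max=400):
    """Steady-state distribution of a birth-death process with Hill-type production.

    Intended for moderate rates; very large rates would need log-space weights.
    """
    weights = [1.0]                                   # unnormalized P(0)
    for n in range(n_max):
        birth = (L * r_u + r_b * n) / (L + n)         # production propensity at n
        death = d_p * (n + 1)                         # decay propensity at n + 1
        weights.append(weights[-1] * birth / death)   # detailed balance
    total = sum(weights)
    return [w / total for w in weights]


def distribution_mean(dist):
    return sum(n * prob for n, prob in enumerate(dist))


if __name__ == "__main__":
    dist = hill_birth_death_steady_state(r_u=1.0, r_b=20.0, L=5.0, d_p=1.0)
    print(distribution_mean(dist))
```

When the two production rates are equal the propensity no longer depends on n and the distribution reduces to a Poisson, which is a convenient check. As discussed above, agreement of this reduced model with the full model is not guaranteed when L is small.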

Lastly, we consider a recent novel discrete approximation of a class of gene regulatory networks called the linear mapping approximation (LMA (39)). In the LMA, in the limit of fast mRNA equilibration, the CME of the full model is approximated by the CME describing bursty protein production and effective promoter switching with no feedback, which has an exact steady-state solution. The LMA provides a computational recipe by which the effective rates of promoter switching can be obtained as functions of the parameters in the full model. The LMA turns out to be the best discrete analytical approximation of the full model, being accurate in all eight regions of parameter space in Fig. 2. This is followed by Kumar and Grima (both accurate in four out of eight regions) and Hornos (accurate in two out of eight regions). The modified Kumar model does as well as the LMA and has the advantage of an exact analytical solution. It is to be emphasized that the LMA gives only accurate results when the mean number of proteins conditional on the gene being in state G is much greater than 1, a condition met in all cases considered in Fig. 2.

In summary, what appear to be minor and subtle differences in the construction of discrete models of autoregulation actually lead to considerable differences in the prediction of the steady-state protein distribution. For both positive and negative feedback, the models all agree in only one region of parameter space where the mean burst size and feedback strength are both small (small B and large L). The differences between the Grima, Hornos, and Kumar models and the full model originate from the fact that these models were not derived rigorously from the full model, but rather, they were written down intuitively. On the other hand, the LMA and the modified Kumar model do so well because they are derived from the full model. The aforementioned discrepancies are for the case of fast promoter switching. Recent work has shown that for slow promoter switching, independent of the value of L, the Kumar model provides an accurate approximation of the full model; the Hornos model is also accurate for all L, but only provided B is small (34). Hence, the differences between models are less important for slow compared to fast promoter switching. In this section, we have considered models of the simplest type of autoregulatory loop. Discrete models of a loop with more complex mechanisms such as cooperative protein binding to the gene and oscillatory transcription rates (e.g., because of circadian rhythms) have also been solved recently (39).

Continuous and hybrid models of autoregulation

Besides discrete models, there are also continuous models of autoregulatory loops in which it is assumed that molecule number fluctuations correspond to hops on the real axis rather than on an integer axis. The simplicity of the distributions provided by the continuous models often makes the results easier to interpret than those obtained from exactly solvable discrete models, which are expressed in terms of hypergeometric functions.

These models are typically described by the WKB approximation (37,40,41), the linear-noise approximation (LNA) (a type of a Fokker-Planck equation that differs from the chemical Fokker-Planck equation) (42, 43, 44), or PIDEs (19,45, 46, 47, 48). There are several variations of these three approaches in the literature, too numerous to enumerate here. For the purpose of this discussion, we will compare and discuss two LNA variants and a PIDE approach: 1) by LNA, we specifically mean the Fokker-Planck equation obtained by applying the LNA to the CME of the modified Kumar model in Fig. 1; 2) by cLNA, we mean the conditional LNA derived in (49) applied to the CME of the modified Kumar model; and 3) by PIDE, we specifically mean the model presented by Bokes and Singh for the case that there are no decoy binding sites (46), its solution being given by Eq. S26 of the aforementioned work (this result has been previously reported (19,50)). These three have the following different properties. 1) LNA gives a Gaussian distribution; the cLNA gives a sum of two Gaussians, each associated with one of the promoter states; and the PIDE can lead to unimodal or bimodal non-Gaussian distributions for the protein numbers. 2) All implicitly take into account mRNA through the protein burst size distribution. This is a geometric distribution for the LNA and cLNA models, whereas it is an exponential distribution for the PIDE model. 3) The LNA and PIDE are continuum models, however the cLNA is a hybrid model since protein fluctuations are assumed to be continuous, but gene state fluctuations are assumed to be discrete.

To summarize our current understanding of the regions of their validity, in Fig. 3, A and B, we compare the three approximations under fast, intermediate, and slow rates of promoter switching for both high and low basal transcription rates ($\rho_u$) versus SSA simulations of the full model. For high basal transcription rates, the LNA and PIDE provide accurate approximations of the full model for fast promoter switching (Fig. 3 A, I) and break down for intermediate (Fig. 3 A, II) and slow promoter switching (Fig. 3 A, III) because they cannot capture the bimodality of the full model distribution. In the latter regime, the cLNA performs well instead. These results make sense in the light of the PIDE solution integrating out promoter switching by using a Hill-type function and hence presuming fast promoter switching, whereas the cLNA derivation assumes that promoter switching is much slower than transcription, translation, and decay. For intermediate switching, the cLNA misses the precise location of the modes; this is a limitation imposed by trying to fit a Gaussian to each mode. Non-Gaussian effects can be systematically corrected by considering higher-order terms in the system-size expansion than those used to compute the LNA (51); in particular, the cLNA can be improved by means of the conditional system-size expansion (52). In contrast, for low basal transcription rate, all three approximations fail independently of the switching rate (Fig. 3 B, I–III). This is because continuous approximations are only valid for large enough protein numbers, and the low basal transcription rate induces a mode of the protein distribution at zero. In Fig. 3, C–E, we compare the LNA and PIDE approximations as the mean burst size $B$ is increased at constant transcription rate for fast switching conditions. The LNA does best for low burst sizes because the full model distribution is almost Gaussian; the PIDE approximation is inaccurate here because for small mean burst sizes, the exponential distribution of burst sizes is not a good approximation of the geometric distribution. The reverse is true when the mean burst size becomes large: the full model distribution is highly non-Gaussian and cannot be captured by the LNA but is well captured by the PIDE, also because the exponential is a good approximation of the geometric distribution in this case.

Figure 3.

(A and B) Plots comparing the accuracy of continuous approximations (LNA, cLNA, PIDE) versus the SSA of the full model under fast, intermediate, and slow rates of promoter switching for high and low basal transcription rates. The parameter $\mu = \log_{10}(\sigma_u^{-1})$ is large for slow switching and small for fast switching. LNA and PIDE approximations are accurate for fast switching in high basal transcription rate conditions, whereas the cLNA is more accurate for slow switching conditions. All approximations break down for low basal transcription. (C)–(E) show plots of LNA versus PIDE versus SSA of the full model for fast switching as the mean burst size $B$ increases at constant transcription rates $\rho_u B$ and $\rho_b B$. The LNA performs best at low $B$, whereas the PIDE performs best at large $B$. Note that the x axis in (C)–(E) is logarithmic. Note that the LNA and PIDE distributions, although not shown (for clarity) in case III, are also unimodal as for cases I and II, whereas the cLNA is also bimodal for case I but not shown for clarity. The parameters are as follows. (A) (I) $\mu = -5$, $\rho_u = 7$, $\rho_b = 25$, $B = 2$, $\sigma_u = 10^{5}$, and $\sigma_b = 10^{4}$. (II) $\mu = 1$, $\rho_u = 7$, $\rho_b = 25$, $B = 2$, $\sigma_u = 10^{-1}$, and $\sigma_b = 10^{-2}$. (III) $\mu = 2$, $\rho_u = 7$, $\rho_b = 25$, $B = 2$, $\sigma_u = 10^{-2}$, and $\sigma_b = 10^{-3}$. (B) (I) $\mu = -4$, $\rho_u = 10^{-4}$, $\rho_b = 5$, $B = 2$, $\sigma_u = 10^{4}$, and $\sigma_b = 10^{6}$. (II) $\mu = -1$, $\rho_u = 10^{-4}$, $\rho_b = 5$, $B = 2$, $\sigma_u = 10$, and $\sigma_b = 10^{3}$. (III) $\mu = 2$, $\rho_u = 10^{-4}$, $\rho_b = 5$, $B = 2$, $\sigma_u = 10^{-2}$, and $\sigma_b = 1$. (C–E) $\rho_u B = 30$, $\rho_b B = 20$, $\sigma_u = 10^{6}$, and $\sigma_b = 10^{3}$.

In summary, various continuum models provide reasonably accurate analytical steady-state distributions for protein numbers in terms of simple functions, provided there is not a mode of the distribution at zero and provided parameters are such that gene switching is either very fast or very slow. In the literature, there are also analytically solvable continuum models with multiple gene copies (47), and a few results are also known for promoter switching that is neither extremely slow nor exceedingly rapid (40).

Insights from models

There are at least two main insights obtained from the analysis of stochastic models of autoregulation: 1) cooperativity is not necessary for protein distributions to be bimodal. Fast, slow, or intermediate promoter switching can in some cases lead to bimodality in the absence of cooperativity; 2) there is not a simple relationship between noise reduction or amplification and the type of feedback loop (negative or positive). We next discuss each of these in detail.

Insight 1

An important property of autoregulatory feedback loops is their ability to generate protein distributions with more than one mode. Each of these modes can be associated with a subpopulation of cells of a particular phenotype, and hence, their quantification is important for understanding cellular decision making. In a single autoregulatory feedback loop, these modes can arise in at least three ways: 1) if the deterministic rate equations are bistable (which arises if feedback is positive and there is cooperativity (53)), then provided intrinsic noise is sufficiently small, there are two modes of the protein distribution, each centered on a stable deterministic solution. If noise is large, then it can destroy this type of bimodality (54); 2) if promoter switching is fast and there is positive feedback but no cooperativity (and hence no deterministic bistability), bimodality can arise provided the transcription rate in the $G^{*}$ state ($\rho_b$) is sufficiently larger than that in the $G$ state ($\rho_u$) (see Fig. 2 b of (19), Fig. 6 a of (55), and Fig. 6 of (34)); 3) if promoter switching is slow and there is no cooperativity (deterministic rate equations are not bistable). In this case, independent of the type of feedback, the system will alternate between two steady-state protein distributions, each associated with a promoter state; hence, if these distributions are well separated, then the full distribution will be bimodal (see Fig. 2 of (23) and Fig. 5 of (34)). This has been shown to lead to birhythmical expression in genetic oscillators and to hysteresis in phenotypic induction (49); furthermore, it leads to an enhancement of the sensitivity of the circuit's response to input signals (10).
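The third mechanism can be illustrated with a toy mixture. The sketch below is an assumption-laden illustration rather than a model from the paper: it treats the protein distribution in each promoter state as Poisson (one-at-a-time production), fixes the promoter-state weight by hand instead of deriving it from the feedback kinetics, and counts strict local maxima. All names are hypothetical.

```python
import math


def poisson_pmf(lam, n_max):
    """Poisson probabilities P(0), ..., P(n_max) computed by recursion."""
    pmf = [math.exp(-lam)]
    for n in range(n_max):
        pmf.append(pmf[-1] * lam / (n + 1))
    return pmf


def slow_switching_mixture(weight_unbound, mean_unbound, mean_bound, n_max=200):
    """Weighted sum of the two conditional (per promoter state) distributions."""
    unbound = poisson_pmf(mean_unbound, n_max)
    bound = poisson_pmf(mean_bound, n_max)
    return [weight_unbound * u + (1.0 - weight_unbound) * b
            for u, b in zip(unbound, bound)]


def count_modes(dist):
    """Number of strict local maxima of a discrete distribution."""
    modes = 0
    for n, value in enumerate(dist):
        left = dist[n - 1] if n > 0 else -1.0
        right = dist[n + 1] if n + 1 < len(dist) else -1.0
        if value > left and value > right:
            modes += 1
    return modes


if __name__ == "__main__":
    print(count_modes(slow_switching_mixture(0.5, 2.5, 30.5)))   # well separated
    print(count_modes(slow_switching_mixture(0.5, 10.5, 12.5)))  # overlapping
```

Well-separated conditional means give two modes, whereas overlapping conditional distributions merge into one, in line with the requirement that the two distributions be well separated.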

The LNA cannot capture any type of bimodality because it is a Gaussian approximation to the number distribution solution of the CME; indeed, it is only valid for those systems in which deterministic rate equations are monostable (18), provided the average number of molecules is large enough. Hybrid methods such as the cLNA and others (56, 57, 58) can capture bimodality of type 3 because they model gene states discretely. Methods based on PIDE can capture bimodality of types 1 and 2, but not 3, because they do not model discrete gene states explicitly (45); however, recent generalizations (48) leading to a model consisting of two PIDEs, one for each gene state, may well be able to capture all types of bimodality. Methods based on the Fokker-Planck equation stemming from the Kramers-Moyal expansion also cannot capture bimodality of type 3 (59). Discrete models can capture all types of bimodality (39).

Insight 2

Early work reported that negative feedback reduces protein fluctuations (60), whereas positive feedback has the opposite effect (61). For negative feedback, there is an optimal feedback strength at which the protein fluctuations are minimal (9). However, later work showed that models with different assumptions can yield contradictory conclusions about how feedback affects noise (62). More recently, it has been claimed that the effect of feedback on noise can be more easily understood via a noise decomposition (12). Noise can be decomposed as the sum of three types: promoter noise, birth-death noise, and correlation noise induced by feedback. In the case of slow switching, in which the promoter noise is dominant, positive feedback reduces the total noise, whereas negative feedback amplifies it. In the case of fast switching, in which the correlation noise is dominant, positive feedback amplifies the total noise, whereas negative feedback reduces it. Further work in this direction can be found in (13,34,35). Hence, it appears that the general intuition that negative feedback reduces noise, whereas positive feedback increases it, is not correct. It follows that the ubiquity of negative self-regulating transcription factors in prokaryotic cells cannot be explained by an evolutionary pressure to select for mechanisms that enable control of protein noise (63).

Open problems

In summary, our literature review and comparative analysis show that differences between models of autoregulation can have a significant impact on the predicted distribution of protein numbers, e.g., different numbers of modes and different predictions for how positive and negative feedback influence the size of protein number fluctuations. The majority of the current generation of reduced models have been constructed intuitively, hence the existing differences between them.

We conclude by briefly pointing out a few of the open problems in the field: 1) the derivation of exact time-dependent solutions of the CME for the Grima, Kumar, and modified Kumar models; such a solution is presently known for the Hornos model (64), and the derivation of approximate time-dependent propagators has received recent attention (65). Explicit time-dependent solutions would enable a detailed study of how perturbations to an autoregulatory circuit, e.g., inhibition of certain reactions, influence the dynamics, which would aid the interpretation of experiments of this type; 2) the development of new continuous approximation methods that can accurately predict the steady-state distribution of protein numbers of the full model without the assumption of fast or slow promoter switching. This is particularly relevant when at least one of the rates of promoter switching is comparable with the rates of other important processes, e.g., for thousands of genes in mouse fibroblast cells, the rate of switching from the inactive to the active gene state is similar to the mRNA degradation rate (66); 3) the derivation of reduced models of feedback loops starting from fine-grained stochastic models incorporating biological processes that are known to affect gene expression, such as partitioning of proteins due to cell division, gene replication, gene dosage compensation, and nascent mRNA maturation (such models have been studied using simulations for systems with no feedback (67)). Such a reduction is likely possible through the careful application of timescale separation methods or possibly by mapping to an effective CME with stochastic rates of transcription, translation, and feedback (68).
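
In the absence of closed-form time-dependent solutions, the response to a perturbation can at least be explored numerically. The following is a minimal sketch under stated assumptions, not an analytical result: it uses a hypothetical simplified two-state feedback loop in which binding leaves the protein count unchanged, truncates the protein number at an assumed `n_max`, and integrates the truncated CME with backward Euler steps, which are only first-order accurate. The perturbation considered, chosen purely for illustration, is switching off protein-gene binding at time zero.

```python
import numpy as np

def build_generator(rho_u, rho_b, d, sigma_b, sigma_u, n_max):
    """Rate matrix Q (Q[i, j] = rate i -> j) of a hypothetical two-state loop.

    Index i = g * (n_max + 1) + n, with g = 0 (unbound gene) or 1 (bound gene).
    Binding occurs at rate sigma_b * n and does not consume the protein
    (a simplifying assumption made for this illustration).
    """
    size = n_max + 1
    Q = np.zeros((2 * size, 2 * size))
    for g, rho in enumerate((rho_u, rho_b)):
        for n in range(size):
            i = g * size + n
            if n < n_max:
                Q[i, i + 1] += rho              # production
            if n > 0:
                Q[i, i - 1] += d * n            # degradation
            if g == 0:
                Q[i, size + n] += sigma_b * n   # unbound -> bound
            else:
                Q[i, n] += sigma_u              # bound -> unbound
    Q -= np.diag(Q.sum(axis=1))
    return Q

def stationary(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for an irreducible truncated chain."""
    A = Q.T.copy()
    A[-1, :] = 1.0
    b = np.zeros(Q.shape[0])
    b[-1] = 1.0
    return np.linalg.solve(A, b)

def implicit_euler(Q, p0, dt, n_steps):
    """Return p(t_k), k = 0..n_steps, for dp/dt = Q^T p via backward Euler."""
    step = np.linalg.inv(np.eye(Q.shape[0]) - dt * Q.T)
    p = p0.copy()
    history = [p]
    for _ in range(n_steps):
        p = step @ p                            # one backward Euler step
        history.append(p)
    return np.array(history)

n_max = 60
params = dict(rho_u=20.0, rho_b=2.0, d=1.0, sigma_u=1.0, n_max=n_max)

# Start from the steady state of the unperturbed loop, then inhibit binding.
p_before = stationary(build_generator(sigma_b=0.05, **params))
Q_after = build_generator(sigma_b=0.0, **params)
history = implicit_euler(Q_after, p_before, dt=0.05, n_steps=800)

counts = np.tile(np.arange(n_max + 1), 2)       # protein number of each state
mean_n = history @ counts                       # mean protein number over time
print("mean before perturbation:", mean_n[0])
print("mean long after perturbation:", mean_n[-1])
```

Once binding is switched off in this toy model, the gene eventually remains unbound and the mean protein number relaxes toward `rho_u / d`. An exact time-dependent solution would give the whole relaxation curve in closed form, rather than on a time grid and for one parameter set at a time.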

In our opinion, the last of these open problems is probably the hardest and the most pressing because it is imperative that the reduced models studied are biologically and physically realistic before further mathematical analysis is undertaken.

Acknowledgments

This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) EASTBIO PhD studentship, BBSRC grant BB/M025551/1, and the UK Research Councils’ Synthetic Biology for Growth program of the BBSRC, Engineering and Physical Sciences Research Council, and Medical Research Council (BB/M018040/1).

Editor: Brian Salzberg.

References

  • 1. Rosenfeld N., Elowitz M.B., Alon U. Negative autoregulation speeds the response times of transcription networks. J. Mol. Biol. 2002;323:785–793. doi: 10.1016/s0022-2836(02)00994-4.
  • 2. Shen-Orr S.S., Milo R., Alon U. Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 2002;31:64–68. doi: 10.1038/ng881.
  • 3. Takahashi J.S. Transcriptional architecture of the mammalian circadian clock. Nat. Rev. Genet. 2017;18:164–179. doi: 10.1038/nrg.2016.150.
  • 4. Wiedermann G., Bone R.A., Dale J.K. A balance of positive and negative regulators determines the pace of the segmentation clock. eLife. 2015;4:e05842. doi: 10.7554/eLife.05842.
  • 5. Robert L., Ollion J., Elez M. Mutation dynamics and fitness effects followed in single cells. Science. 2018;359:1283–1286. doi: 10.1126/science.aan0797.
  • 6. Newman J.R.S., Ghaemmaghami S., Weissman J.S. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441:840–846. doi: 10.1038/nature04785.
  • 7. Elowitz M.B., Levine A.J., Swain P.S. Stochastic gene expression in a single cell. Science. 2002;297:1183–1186. doi: 10.1126/science.1070919.
  • 8. Hooshangi S., Weiss R. The effect of negative feedback on noise propagation in transcriptional gene networks. Chaos. 2006;16:026108. doi: 10.1063/1.2208927.
  • 9. Singh A., Hespanha J.P. Optimal feedback strength for noise suppression in autoregulatory gene networks. Biophys. J. 2009;96:4013–4023. doi: 10.1016/j.bpj.2009.02.064.
  • 10. Hermsen R., Erickson D.W., Hwa T. Speed, sensitivity, and bistability in auto-activating signaling circuits. PLoS Comput. Biol. 2011;7:e1002265. doi: 10.1371/journal.pcbi.1002265.
  • 11. Jia C., Qian H., Zhang M.Q. Relaxation rates of gene expression kinetics reveal the feedback signs of autoregulatory gene networks. J. Chem. Phys. 2018;148:095102.
  • 12. Liu P., Yuan Z., Zhou T. Decomposition and tunability of expression noise in the presence of coupled feedbacks. Chaos. 2016;26:043108. doi: 10.1063/1.4947202.
  • 13. Jia C., Xie P., Zhang M.Q. Stochastic fluctuations can reveal the feedback signs of gene regulatory networks at the single-molecule level. Sci. Rep. 2017;7:16037. doi: 10.1038/s41598-017-15464-9.
  • 14. Milner P., Gillespie C.S., Wilkinson D.J. Moment closure based parameter inference of stochastic kinetic models. Stat. Comput. 2013;23:287–295. Published online January 10, 2012.
  • 15. Stathopoulos V., Girolami M.A. Markov chain Monte Carlo inference for Markov jump processes via the linear noise approximation. Philos. Trans. A Math. Phys. Eng. Sci. 2013;371:20110541. doi: 10.1098/rsta.2011.0541. Published online December 31, 2012.
  • 16. Cao Z., Grima R. Accuracy of parameter estimation for auto-regulatory transcriptional feedback loops from noisy data. J. R. Soc. Interface. 2019;16:20180967. doi: 10.1098/rsif.2018.0967.
  • 17. Öcal K., Grima R., Sanguinetti G. Parameter estimation for biochemical reaction networks using Wasserstein distances. J. Phys. A Math. Theor. 2020;53:034002. Published online July 18, 2019.
  • 18. Schnoerr D., Sanguinetti G., Grima R. Approximation and inference methods for stochastic biochemical kinetics–a tutorial review. J. Phys. A Math. Theor. 2017;50:093001.
  • 19. Friedman N., Cai L., Xie X.S. Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Phys. Rev. Lett. 2006;97:168302. doi: 10.1103/PhysRevLett.97.168302.
  • 20. Beentjes C.H.L., Perez-Carrasco R., Grima R. Exact solution of stochastic gene expression models with bursting, cell cycle and replication dynamics. Phys. Rev. E. 2020;101:032403. doi: 10.1103/PhysRevE.101.032403.
  • 21. Shahrezaei V., Swain P.S. Analytical distributions for stochastic gene expression. Proc. Natl. Acad. Sci. USA. 2008;105:17256–17261. doi: 10.1073/pnas.0803850105.
  • 22. Hornos J.E.M., Schultz D., Wolynes P.G. Self-regulating gene: an exact solution. Phys. Rev. E. 2005;72:051907. doi: 10.1103/PhysRevE.72.051907.
  • 23. Grima R., Schmidt D.R., Newman T.J. Steady-state fluctuations of a genetic feedback loop: an exact solution. J. Chem. Phys. 2012;137:035104. doi: 10.1063/1.4736721.
  • 24. Kumar N., Platini T., Kulkarni R.V. Exact distributions for stochastic gene expression models with bursting and feedback. Phys. Rev. Lett. 2014;113:268105. doi: 10.1103/PhysRevLett.113.268105.
  • 25. Vandecan Y., Blossey R. Self-regulatory gene: an exact solution for the gene gate model. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2013;87:042705. doi: 10.1103/PhysRevE.87.042705.
  • 26. Gardiner C. Volume 4. Springer; Berlin, Germany: 2009. Stochastic Methods.
  • 27. Sugár I., Simon I. Self-regulating genes. exact steady state solution by using Poisson representation. Open Phys. 2014;12:615–627.
  • 28. Iyer-Biswas S., Jayaprakash C. Mixed Poisson distributions in exact solutions of stochastic autoregulation models. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2014;90:052712. doi: 10.1103/PhysRevE.90.052712.
  • 29. Yu J., Xiao J., Xie X.S. Probing gene expression in live cells, one protein molecule at a time. Science. 2006;311:1600–1603. doi: 10.1126/science.1119623.
  • 30. Cai L., Friedman N., Xie X.S. Stochastic protein expression in individual cells at the single molecule level. Nature. 2006;440:358–362. doi: 10.1038/nature04599.
  • 31. Gillespie D.T. Stochastic simulation of chemical kinetics. Annu. Rev. Phys. Chem. 2007;58:35–55. doi: 10.1146/annurev.physchem.58.032806.104637.
  • 32. Sepúlveda L.A., Xu H., Golding I. Measurement of gene regulation in individual cells reveals rapid switching between promoter states. Science. 2016;351:1218–1222. doi: 10.1126/science.aad0635.
  • 33. Holehouse J., Grima R. Revisiting the reduction of stochastic models of genetic feedback loops with fast promoter switching. Biophys. J. 2019;117:1311–1330. doi: 10.1016/j.bpj.2019.08.021.
  • 34. Jia C., Grima R. Small protein number effects in stochastic models of autoregulated bursty gene expression. J. Chem. Phys. 2020;152:084115. doi: 10.1063/1.5144578.
  • 35. Jia C., Wang L.Y., Zhang M.Q. Single-cell stochastic gene expression kinetics with coupled positive-plus-negative feedback. Phys. Rev. E. 2019;100:052406. doi: 10.1103/PhysRevE.100.052406.
  • 36. Aquino T., Abranches E., Nunes A. Stochastic single-gene autoregulation. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2012;85:061913. doi: 10.1103/PhysRevE.85.061913.
  • 37. Assaf M., Roberts E., Luthey-Schulten Z. Determining the stability of genetic switches: explicitly accounting for mRNA noise. Phys. Rev. Lett. 2011;106:248102. doi: 10.1103/PhysRevLett.106.248102.
  • 38. Kim J.K., Josić K., Bennett M.R. The relationship between stochastic and deterministic quasi-steady state approximations. BMC Syst. Biol. 2015;9:87. doi: 10.1186/s12918-015-0218-3.
  • 39. Cao Z., Grima R. Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nat. Commun. 2018;9:3305. doi: 10.1038/s41467-018-05822-0.
  • 40. Ge H., Qian H., Xie X.S. Stochastic phenotype transition of a single cell in an intermediate region of gene state switching. Phys. Rev. Lett. 2015;114:078101. doi: 10.1103/PhysRevLett.114.078101.
  • 41. Newby J. Bistable switching asymptotics for the self regulating gene. J. Phys. A Math. Theor. 2015;48:185001.
  • 42. Van Kampen N.G. Volume 1. Elsevier; Amsterdam, the Netherlands: 1992. Stochastic Processes in Physics and Chemistry.
  • 43. Grima R., Thomas P., Straube A.V. How accurate are the nonlinear chemical Fokker-Planck and chemical Langevin equations? J. Chem. Phys. 2011;135:084103. doi: 10.1063/1.3625958.
  • 44. Thomas P., Straube A.V., Grima R. The slow-scale linear noise approximation: an accurate, reduced stochastic description of biochemical networks under timescale separation conditions. BMC Syst. Biol. 2012;6:39. doi: 10.1186/1752-0509-6-39.
  • 45. Pájaro M., Alonso A.A., Vázquez C. Shaping protein distributions in stochastic self-regulated gene expression networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2015;92:032712. doi: 10.1103/PhysRevE.92.032712.
  • 46. Bokes P., Singh A. Protein copy number distributions for a self-regulating gene in the presence of decoy binding sites. PLoS One. 2015;10:e0120555. doi: 10.1371/journal.pone.0120555.
  • 47. Jędrak J., Ochab-Marcinek A. Influence of gene copy number on self-regulated gene expression. J. Theor. Biol. 2016;408:222–236. doi: 10.1016/j.jtbi.2016.08.018.
  • 48. Jia C., Zhang M.Q., Qian H. Emergent Lévy behavior in single-cell stochastic gene expression. Phys. Rev. E. 2017;96:040402. doi: 10.1103/PhysRevE.96.040402.
  • 49. Thomas P., Popović N., Grima R. Phenotypic switching in gene regulatory networks. Proc. Natl. Acad. Sci. USA. 2014;111:6994–6999. doi: 10.1073/pnas.1400049111.
  • 50. Mackey M.C., Tyran-Kaminska M., Yvinec R. Dynamic behavior of stochastic gene expression models in the presence of bursting. SIAM J. Appl. Math. 2013;73:1830–1852.
  • 51. Thomas P., Grima R. Approximate probability distributions of the master equation. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2015;92:012120. doi: 10.1103/PhysRevE.92.012120.
  • 52. Andreychenko A., Bortolussi L., Wolf V. Modeling Cellular Systems. Springer; 2017. Distribution approximations for the chemical master equation: comparison of the method of moments and the system size expansion; pp. 39–66.
  • 53. Cherry J.L., Adler F.R. How to make a biological switch. J. Theor. Biol. 2000;203:117–133. doi: 10.1006/jtbi.2000.1068.
  • 54. Ebeling W., Schimansky-Geier L. Stochastic dynamics of a bistable reaction system. Physica A. 1979;98:587–600.
  • 55. Ochab-Marcinek A., Tabaka M. Transcriptional leakage versus noise: a simple mechanism of conversion between binary and graded response in autoregulated genes. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2015;91:012704. doi: 10.1103/PhysRevE.91.012704.
  • 56. Kurasov P., Lück A., Wolf V. Stochastic hybrid models of gene regulatory networks - a PDE approach. Math. Biosci. 2018;305:170–177. doi: 10.1016/j.mbs.2018.09.009.
  • 57. Karmakar R., Bose I. Positive feedback, stochasticity and genetic competence. Phys. Biol. 2007;4:29–37. doi: 10.1088/1478-3975/4/1/004.
  • 58. Lin Y.T., Doering C.R. Gene expression dynamics with stochastic bursts: construction and exact results for a coarse-grained model. Phys. Rev. E. 2016;93:022409. doi: 10.1103/PhysRevE.93.022409.
  • 59. Duncan A., Liao S., Grima R. Noise-induced multistability in chemical systems: discrete versus continuum modeling. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2015;91:042111. doi: 10.1103/PhysRevE.91.042111.
  • 60. Simpson M.L., Cox C.D., Sayler G.S. Frequency domain analysis of noise in autoregulated gene circuits. Proc. Natl. Acad. Sci. USA. 2003;100:4551–4556. doi: 10.1073/pnas.0736140100.
  • 61. Hasty J., Pradines J., Collins J.J. Noise-based switches and amplifiers for gene expression. Proc. Natl. Acad. Sci. USA. 2000;97:2075–2080. doi: 10.1073/pnas.040411297.
  • 62. Marquez-Lago T.T., Stelling J. Counter-intuitive stochastic behavior of simple gene circuits with negative feedback. Biophys. J. 2010;98:1742–1750. doi: 10.1016/j.bpj.2010.01.018.
  • 63. Stekel D.J., Jenkins D.J. Strong negative self regulation of prokaryotic transcription factors increases the intrinsic noise of protein expression. BMC Syst. Biol. 2008;2:6. doi: 10.1186/1752-0509-2-6.
  • 64. Ramos A.F., Innocentini G.C.P., Hornos J.E.M. Exact time-dependent solutions for a self-regulating gene. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2011;83:062902. doi: 10.1103/PhysRevE.83.062902.
  • 65. Veerman F., Marr C., Popović N. Time-dependent propagators for stochastic models of gene expression: an analytical method. J. Math. Biol. 2018;77:261–312. doi: 10.1007/s00285-017-1196-4.
  • 66. Larsson A.J.M., Johnsson P., Sandberg R. Genomic encoding of transcriptional burst kinetics. Nature. 2019;565:251–254. doi: 10.1038/s41586-018-0836-1.
  • 67. Skinner S.O., Xu H., Golding I. Single-cell analysis of transcription kinetics across the cell cycle. eLife. 2016;5:e12175. doi: 10.7554/eLife.12175.
  • 68. Park S.J., Song S., Sung J. The Chemical Fluctuation Theorem governing gene expression. Nat. Commun. 2018;9:297. doi: 10.1038/s41467-017-02737-0.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
