Abstract
How are single-cell fate decisions, induced by activation of key signaling proteins above threshold concentrations within a time interval, affected by stochastic fluctuations in biochemical reactions? We address this question using minimal models of stochastic chemical reactions commonly found in cell signaling and gene regulatory systems. Employing exact solutions and semi-analytical methods, we calculate distributions of the maximum value (N) of activated species concentrations (Pmax(N)) and the time (t) taken to reach the maximum value (Pmax(t)) within a time interval in the minimal models. We find that the presence of positive feedback interactions makes Pmax(N) more spread out, with a higher “peakedness” in Pmax(t). Thus, positive feedback interactions may help single cells respond sensitively to a stimulus when cell decision processes require upregulation of activated forms of key proteins to a threshold number within a time window.
INTRODUCTION
Decisions made at the single cell level enable organisms to respond to changes in the local environment. Such decisions are usually processed upon upregulation of specific proteins, transcription factors, or soluble molecules that help cells to communicate with each other. These activation events often require concentrations of a few key proteins to reach a threshold level within a time window. Examples of such responses include activation of immune cells such as T cells triggered by a threshold number of pathogenic peptides,1 all-or-none maturation of oocytes in the frog Xenopus laevis induced by different concentrations of progesterone,2 and switch-like activation of Lac genes regulating lactose metabolism produced by a threshold concentration of stimulus in E. coli.3
However, every cell in a cell population interacting with stimuli possesses unique temporal profiles of concentrations of activated signaling molecules or genes. This cell-to-cell variability in the kinetics occurs due to the inherent stochastic nature of the associated biochemical processes (or intrinsic noise)4, 5, 6, 8 and variations in expression levels of genes and proteins (or extrinsic noise).8 Therefore, the threshold for activation of a specific signaling molecule and the time window within which the signaling molecule should be activated to influence cell functions can change from cell to cell.2, 28 How do nonlinearities commonly found in biochemical signaling networks, such as positive feedbacks, help cells to respond to these variations? We address this question in this article. In particular, we investigate the role of positive feedback interactions, which are often responsible for producing all-or-none responses in signaling or gene regulatory kinetics. We use a minimal model for a linear and a positive feedback interaction in a simple chemical reaction describing activation of a single chemical species representing a key signaling protein or a gene. Since many positive feedback interactions1, 2, 7 can be reduced to this form, the results from the model will be relevant for a wide range of biological systems.
We consider a minimal biochemical process, C ⇌ C*, describing production and deactivation of the activated species C*, which needs to reach a threshold concentration (say N) in a time interval [0, T] in order to mediate a functional response. Due to the stochastic fluctuations in the kinetics, the threshold concentration of C* could occur at different times (Fig. 1) or C* could even stay below the threshold level in the time window. Therefore, knowing the distribution of the number (n) of C* molecules at a time T will not reveal whether the concentration of C* attained the threshold level at an earlier time. However, knowledge of the joint distribution of the maximum number (N) of C* and the time t (0 ⩽ t ⩽ T) when this value was attained in a temporal profile describing the kinetics of C* in a single cell will inform us whether the cell was able to cross the threshold in the time interval [0, T]. Such distributions are regularly dealt with in extreme value theory, where extreme value distributions for identically distributed independent random variables have been studied extensively.9 Analysis of extreme value distributions for correlated random variables has been a topic of intense research in recent years due to its application in physics,10, 11, 18, 19, 24 climate science,25 finance,15 and population13, 16 and cell biology.14 Application of such distributions in stochastic biochemical reaction kinetics has been initiated only recently.17 Interestingly, it has been found that for strongly correlated random variables in different types of random walks or fluctuating interfaces, extreme value distributions can display simple one-parameter scaling behavior.10, 11
We solve the master equation associated with the minimal model and calculate the joint probability distribution for C* attaining a maximum value N at time t in the time interval [0, T], both exactly and semi-analytically. We show that when the system is far from the steady state, in the presence of the feedback reaction, the distribution of the maximum value N over the time interval [0, T] is spread out over a broader range of N compared to the linear model. In contrast, the distribution of the time t when the maximum value occurred is much more narrowly distributed in the presence of the feedback. This suggests that feedback interactions can help single cells to respond sensitively to a weak stimulus with a well-defined response time even in the face of stochastic fluctuations.
RESULTS
Irreversible kinetics
In order to understand the role of stochastic fluctuations in affecting the distribution of the maximum value of C*, it will be instructive to study the deterministic mass action kinetics for the concentration of C* (or [C*]) in the reaction C ⇌ C*, which is described by
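$$\frac{d[C^*]}{dt} = k_1\left(C_0 - [C^*]\right) + k_p\,[C^*]\left(C_0 - [C^*]\right) - k_{-1}\,[C^*] \qquad (1)$$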
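The deterministic behavior described here can be illustrated numerically. The following is a minimal sketch, assuming that Eq. (1) takes the standard mass-action form implied by the rate descriptions in the text, d[C*]/dt = k1(C0 − [C*]) + kp[C*](C0 − [C*]) − k−1[C*]. The function names and the parameter values are hypothetical and chosen only for illustration; they are not taken from the original work.

```python
import numpy as np

def rate(x, k1, kp, km1, C0):
    """Right-hand side of the assumed mass-action rate equation, x = [C*]."""
    return k1 * (C0 - x) + kp * x * (C0 - x) - km1 * x

def fixed_point(k1, kp, km1, C0):
    """Non-negative root of rate(x) = 0 (assumes kp > 0)."""
    b = kp * C0 - k1 - km1
    return (b + np.sqrt(b * b + 4.0 * kp * k1 * C0)) / (2.0 * kp)

def integrate(x0, T, k1, kp, km1, C0, steps=5000):
    """Fixed-step fourth-order Runge-Kutta integration on [0, T]."""
    dt = T / steps
    xs = np.empty(steps + 1)
    xs[0] = x0
    for i in range(steps):
        x = xs[i]
        a = rate(x, k1, kp, km1, C0)
        b = rate(x + 0.5 * dt * a, k1, kp, km1, C0)
        c = rate(x + 0.5 * dt * b, k1, kp, km1, C0)
        d = rate(x + dt * c, k1, kp, km1, C0)
        xs[i + 1] = x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
    return np.linspace(0.0, T, steps + 1), xs

# Illustrative (hypothetical) parameter values in arbitrary units.
params = dict(k1=0.01, kp=1.0, km1=0.5, C0=1.0)
ts, xs = integrate(0.0, 50.0, **params)
x_star = fixed_point(**params)
print("steady state:", x_star, "value at t = T:", xs[-1])
```

Under these assumptions the trajectory started below the fixed point increases monotonically, so its maximum over [0, T] sits at t = T, which is the deterministic statement that the stochastic analysis relaxes.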
The total concentration, C0 = [C] + [C*], is always fixed, and the rates k1 and kp determine the time scales for production of C* from C via a linear first-order reaction and a second-order reaction representing a positive feedback, respectively. C* is converted back to C with a rate k−1. These time scales in a biological network can be regulated by the strength of a stimulus that results in generation of C*, e.g., a weaker (or stronger) stimulus would give rise to longer (or shorter) time scales for production of C*. The rate equation contains a single stable fixed point; thus, starting with any initial concentration, [C*] monotonically reaches a steady state determined by the rate constants and C0. Consequently, if a reaction initiated with a concentration [C*(t = 0)] < [C*(t → ∞)] is followed until t = T, the maximum value of [C*], uniquely determined by the rate constants, C0, and T, is reached at t = T. However, in the presence of intrinsic stochastic fluctuations, the maximum value of the concentration of C* or the time when it is attained will vary in each stochastic “trajectory” (Fig. 1), where every trajectory represents activation of C* in a single cell. In this situation, P(n, t|m, 0), the conditional probability of having n molecules of the C* species at any time t starting with a distribution P(m, 0) at t = 0, follows the master equation,
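$$\frac{\partial P(n,t|m,0)}{\partial t} = \left[k_1 + k_p(n-1)\right](N_0 - n + 1)\,P(n-1,t|m,0) + k_{-1}(n+1)\,P(n+1,t|m,0) - \left[(k_1 + k_p n)(N_0 - n) + k_{-1}n\right]P(n,t|m,0) \qquad (2)$$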
where N0 denotes the total number of molecules of the C and C* species. The distribution of the maximum number (N) of C* molecules and the time when the maximum was reached in a time interval [0, T] can be calculated by solving the above master equation and using the renewal equation,20, 21
$$P(N,t|m,0) = \int_0^{t} dt'\, F_N(t'|m,0)\, P(N,t|N,t') \qquad (3)$$
Here, QN(n, t|m, 0) describes the probability of having n molecules of C* species at time t, when an absorbing boundary condition, QN(n, t|m, 0) = 0 for n ⩾ N, is imposed. FN(t|m, 0) denotes the probability of arriving at the state n = N for the first time at time t. If the time variable is Laplace transformed in the renewal equation, then FN(s|m, 0) is related to P(N, s|m, 0) simply by, FN(s|m, 0) = P(N, s|m, 0)/P(N, s|N, 0). The un-normalized joint probability distribution for attaining a maximum value N at time t in the time interval [0, T] is then given by
$$E_N(T,t|m,0) = F_N(t|m,0) \sum_{n=0}^{N} Q_{N+1}(n,T|N,t) \qquad (4)$$
We then calculate the un-normalized distribution of the maximum value N over the time interval [0, T], i.e.,
$$P_{\max}(N,T) = \int_0^{T} dt\, E_N(T,t|m,0) \qquad (5)$$
and the un-normalized distribution of the time t when the maximum value occurred given by
$$P_{\max}(t,T) = \sum_{N} E_N(T,t|m,0) \qquad (6)$$
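The quantities defined above can also be estimated by direct sampling of stochastic trajectories. The following is a minimal sketch of a Gillespie-type simulation that records, for each trajectory, the maximum number of C* molecules and the time at which it was first attained. It assumes the propensities (k1 + kp n)(N0 − n) for production and k−1 n for de-activation, which is the form implied by the rate descriptions in the text. The function name `simulate_max` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_max(N0, k1, kp, km1, T, m=0, rng=None):
    """Simulate one trajectory on [0, T] starting from n = m.

    Returns (n_max, t_max, n_end): the maximum of n, the time at which
    that maximum was first reached, and the value of n at time T.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, t = m, 0.0
    n_max, t_max = m, 0.0
    while True:
        up = (k1 + kp * n) * (N0 - n)   # assumed production propensity
        down = km1 * n                  # assumed de-activation propensity
        total = up + down
        if total == 0.0:                # no reaction can occur any more
            break
        t += rng.exponential(1.0 / total)
        if t > T:                       # next reaction falls outside the window
            break
        if rng.random() * total < up:
            n += 1
        else:
            n -= 1
        if n > n_max:                   # record only the first arrival at a new maximum
            n_max, t_max = n, t
    return n_max, t_max, n

# Illustrative run with hypothetical parameter values.
rng = np.random.default_rng(0)
samples = [simulate_max(50, 0.01, 0.02, 0.5, 3.0, m=1, rng=rng) for _ in range(2000)]
max_values = np.array([s[0] for s in samples])
max_times = np.array([s[1] for s in samples])
print("mean maximum:", max_values.mean(), "mean time of maximum:", max_times.mean())
```

Histograms of `max_values` and `max_times` give sampled estimates of the normalized counterparts of Pmax(N, T) and Pmax(t, T); they are subject to sampling noise and are not a substitute for the exact and semi-analytical calculations.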
We first consider the case where the production of C* occurs irreversibly, i.e., k−1 = 0. In this limit, EN(T, t|m, 0) can be evaluated analytically, as calculations simplify due to the following relations: P(N, T|m, 0) = 0 for m > N, thus, QN + 1(N, T|m, 0) = P(N, T|m, 0) for m ⩽ N. Consequently, the joint probability distribution can be expressed as EN(T, t|m, 0) = FN(t|m, 0)P(N, T|N, t). This essentially implies that the probability of having a maximum value N at time t is the probability that the state n = N was attained at time t for the first time and then no reaction occurred in the time interval T − t. Next we calculate these distributions for the linear and the feedback models by solving Eq. 2 for k−1 = 0.
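The factorization EN(T, t|m, 0) = FN(t|m, 0)P(N, T|N, t) can be checked numerically for an irreversible chain. The sketch below assumes production propensities b(n) = (k1 + kp n)(N0 − n), so that FN(t|m, 0) = b(N − 1)P(N − 1, t|m, 0) and P(N, T|N, t) = exp[−b(N)(T − t)], and it integrates the master equation with a simple Runge-Kutta scheme rather than using the exact solutions. All function names and parameter values are hypothetical.

```python
import numpy as np

def birth_rates(N0, k1, kp):
    """Assumed production propensity out of each state n = 0..N0 (k_-1 = 0)."""
    n = np.arange(N0 + 1)
    return (k1 + kp * n) * (N0 - n)

def solve_pure_birth(b, m, T, steps):
    """RK4 integration of dP_n/dt = b[n-1] P[n-1] - b[n] P[n], P_n(0) = delta_{n,m}."""
    def rhs(P):
        dP = -b * P
        dP[1:] += b[:-1] * P[:-1]
        return dP
    dt = T / steps
    P = np.zeros(len(b))
    P[m] = 1.0
    history = np.empty((steps + 1, len(b)))
    history[0] = P
    for i in range(steps):
        a = rhs(P)
        c = rhs(P + 0.5 * dt * a)
        d = rhs(P + 0.5 * dt * c)
        e = rhs(P + dt * d)
        P = P + dt * (a + 2.0 * c + 2.0 * d + e) / 6.0
        history[i + 1] = P
    return np.linspace(0.0, T, steps + 1), history

def joint_max_density(b, m, T, steps=4000):
    """E[i, N]: density for first reaching N at time ts[i] and then staying at N until T."""
    ts, P = solve_pure_birth(b, m, T, steps)
    E = np.zeros_like(P)
    for N in range(m + 1, len(b)):
        first_passage = b[N - 1] * P[:, N - 1]
        no_further_reaction = np.exp(-b[N] * (T - ts))
        E[:, N] = first_passage * no_further_reaction
    return ts, P, E

def trapezoid(y, x):
    """Trapezoidal integral of each column of y over the grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)[:, None], axis=0)

# Feedback-only example with hypothetical values: N0 = 20, kp = 1, m = 1.
b = birth_rates(N0=20, k1=0.0, kp=1.0)
ts, P, E = joint_max_density(b, m=1, T=0.1)
pmax_N = trapezoid(E, ts)          # time integral of E_N for each N > m
print(np.max(np.abs(pmax_N[2:] - P[-1, 2:])))
```

For N > m the time integral of EN reproduces P(N, T|m, 0) up to discretization error, which is the irreversible-case identity between the maximum value distribution and the occupation probability at t = T. The state N = m carries a point mass at t = 0 and is therefore excluded from the density.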
In the absence of the positive feedback (kp = 0), the exact solution of the master equation in Eq. 2 yields $P(n,t|m,0) = \binom{N_0}{n}\left(1 - e^{-k_1 t}\right)^{n} e^{-k_1 t (N_0 - n)}$, where P(m, 0) = δm, 0. The first passage time distribution is given by FN(t|m) = (N0 − N + 1)k1P(N − 1, t|m, 0); therefore, EN(T, t|m, 0) = k1(N0 − N + 1)P(N − 1, t|m, 0)P(N, T|N, t). Thus, Pmax(N, T) = P(N, T|m, 0), and Pmax(t, T) has a simple expression (see Ref. 27 for additional details). In the presence of the feedback, Eq. 2 can be solved exactly by Laplace transforming the time variable. We consider the “feedback only” (k1 = 0 and kp ≠ 0) case to exclusively interrogate the role of the positive feedback. The exact solution for the time-dependent probability distribution for m = 1 is given by
(7) |
The calculation of the inverse Laplace transformation of the above equation is tedious but straightforward and, since the poles of P(N, s|1, 0) at two different values of r can be equal, the probability distribution contains terms that are products of linear and exponential functions of t (details in Ref. 27). The first passage time distribution for this case is given by FN(t|m) = kp(N − 1)(N0 − (N − 1))P(N − 1, t|1, 0); therefore, EN(T, t|1, 0) = kp(N − 1)(N0 − (N − 1))P(N − 1, t|1, 0)P(N, T|N, t). As in the linear model, we find Pmax(N, T) = P(N, T|1, 0). However, Pmax(t, T) does not possess a simple expression as in the linear model. The shapes of the distributions, Pmax(N, T) and Pmax(t, T), depend on N0 and the dimensionless variable τ = kpT (or k1T for the linear model). In order to compare the distributions for the pure feedback and the linear models, we chose an end time T at which the average number of C* molecules was the same for both models. Figure 2a shows that the maximum value is distributed more evenly across different numbers of C* molecules in the presence of the feedback compared to the linear model. In contrast, Pmax(t, T) (Fig. 2b) is more sharply peaked for the feedback model, indicating that once the first molecules of C* are produced the positive feedback leads to fast production of C* molecules, giving rise to a peak at t = T. The variation (inset, Fig. 2a) of the Fano factor, f = (⟨N²⟩ − ⟨N⟩²)/⟨N⟩, which quantifies whether a distribution is broader than a Poisson distribution (where f = 1), with ⟨N⟩ at different times shows that Pmax(N, T) is more spread out for the pure feedback model as long as the system is away from the steady state. As the system approaches the steady state, due to the irreversibility of the reactions, all the C molecules are converted into C*, i.e., ⟨N⟩ → N0, and then f decreases and becomes comparable for both models.
We used the kurtosis (K), defined as K = μ4/σ⁴ − 3, where μ4 and σ² denote the fourth central moment and the variance, respectively, to quantify the “peakedness” and the presence of “heavy tails”22 in Pmax(t, T) compared to a Gaussian distribution (K = 0). The pure feedback model produces much larger values of K at different values of ⟨N⟩ compared to the linear model (inset, Fig. 2b), indicating higher “peakedness” of the distributions in the presence of the feedback.
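The two summary statistics used here are straightforward to evaluate for any tabulated distribution. The following is a minimal sketch; the helper names `fano_factor` and `excess_kurtosis` are hypothetical, and the weights are normalized inside the functions so that un-normalized distributions sampled on a uniform grid can be passed directly.

```python
import numpy as np

def _moments(values, weights):
    """Mean and central moments of a tabulated distribution (weights are normalized)."""
    v = np.asarray(values, dtype=float)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    mean = np.sum(v * p)
    var = np.sum((v - mean) ** 2 * p)
    mu4 = np.sum((v - mean) ** 4 * p)
    return mean, var, mu4

def fano_factor(values, weights):
    """f = variance / mean; f = 1 for a Poisson distribution."""
    mean, var, _ = _moments(values, weights)
    return var / mean

def excess_kurtosis(values, weights):
    """K = mu_4 / sigma^4 - 3; K = 0 for a Gaussian distribution."""
    _, var, mu4 = _moments(values, weights)
    return mu4 / var ** 2 - 3.0

# Reference cases: a Gaussian sampled on a fine grid and a flat distribution.
x = np.linspace(-10.0, 10.0, 4001)
print(excess_kurtosis(x, np.exp(-0.5 * x ** 2)))   # close to 0
print(excess_kurtosis(x, np.ones_like(x)))         # close to -1.2 (flatter than a Gaussian)
```

A positive K indicates a sharper peak or heavier tails than a Gaussian, and a negative K a flatter distribution, which is the sense in which the values for the two models are compared.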
The above results become evident at the limit, N0 → ∞, , and . Then the probability distribution, P(n, t|m, 0), follows a simple functional form for both the models. In the linear model,
(8) |
and
(9) |
whereas, in the model with pure feedback,5
(10) |
and
(11) |
Calculation of the Fano factor (f) for Pmax(N, T) shows that, and for the linear and the feedback model, respectively; clearly demonstrating that Pmax(N, T) contains more variation in N for the feedback model for . On the other hand, the denominator in Pmax(t, T) for the feedback model decreases as t approaches T, making the distribution more sharply peaked at t = T compared to the linear model. These results point to the following physical understanding. For the feedback model, the probability () to remain in the state n for a time interval Δt decreases with n, whereas this probability () is independent of n for the linear model. Therefore, in the feedback model, the states with smaller values of n spend a large fraction of their time initially in waiting for the first reactions to occur, and then at times closer to the end time T, as these states move to larger values of n, the next reactions take place in rapid succession. This produces a large range of maximum values N in the time interval [0, T] but a sharp peak in Pmax(t, T). In contrast, in the linear model a state with n molecules moves to the next higher state (n + 1 molecules) with a constant rate, so at t = T all the states reside close to the average value of C*. In addition, since the waiting time for the next reaction to occur does not depend on n, Pmax(t, T) is more spread out. We expect these features to persist even in the presence of a non-zero de-activation rate, as long as the time scales for the feedback reactions are smaller than the de-activation time scale. These results suggest the following biological significance. During early signaling events, if the rate of de-activation is slower than that of activation, then on time scales smaller than that of deactivation the presence of positive feedback interactions can help cells to achieve a wide range of activation within a narrow response time even in the presence of stochastic fluctuations. In Sec. 2B, we will show that the main features of the distributions Pmax(N, T) and Pmax(t, T) persist even in the presence of a non-zero de-activation rate.
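The physical picture above can be illustrated with two textbook pure-birth processes. The sketch below is an assumption-laden stand-in, not a reproduction of Eqs. (8)-(11): it takes the linear model in the large-N0 limit to behave as a constant-rate (Poisson) birth process, and the feedback-only model started from a single molecule to behave as a Yule process, whose occupation probabilities are geometric. The rate symbol `kappa_T` and the function names are hypothetical.

```python
import numpy as np

def poisson_pmf(mean, n_max):
    """Poisson probabilities for n = 0..n_max (constant-rate birth process)."""
    n = np.arange(n_max + 1)
    log_factorial = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))))
    return n, np.exp(n * np.log(mean) - mean - log_factorial)

def yule_pmf(kappa_T, n_max):
    """Geometric probabilities for n = 1..n_max of a Yule process started at n = 1."""
    n = np.arange(1, n_max + 1)
    q = np.exp(-kappa_T)
    return n, q * (1.0 - q) ** (n - 1)

def mean_and_fano(n, p):
    """Mean and Fano factor (variance / mean) of a tabulated distribution."""
    mean = np.sum(n * p)
    var = np.sum((n - mean) ** 2 * p)
    return mean, var / mean

kappa_T = 2.0                                  # hypothetical dimensionless time
n_y, p_y = yule_pmf(kappa_T, 400)
mean_y, fano_y = mean_and_fano(n_y, p_y)
n_p, p_p = poisson_pmf(mean_y, 400)            # Poisson matched to the same mean
mean_p, fano_p = mean_and_fano(n_p, p_p)
print("means:", mean_y, mean_p)
print("Fano factors (feedback-like, linear-like):", fano_y, fano_p)
```

Because both processes are irreversible, the occupation distribution at time T is also the distribution of the maximum. At equal means the Yule-type process has a Fano factor of ⟨N⟩ − 1 while the constant-rate process stays at 1, which mirrors the broader spread in the maximum value attributed to the feedback.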
Reversible kinetics
In the presence of a non-vanishing rate of de-activation (i.e., k−1 ≠ 0), the number of C* molecules can decrease after attaining the maximum value; therefore, the simple relationships between P(n, t|m, 0) and the maximum value distributions of Sec. 2A no longer hold. Moreover, it becomes difficult to solve the master equation exactly when the positive feedback is present. Therefore, we calculated the maximum value distributions semi-analytically. We briefly outline the method here; the details of the calculations are shown in Ref. 27. The master equation in Eq. 2 can be cast as an operator equation21 described by
(12) |
where, ⟨n|P(t)⟩ = P(n, t|m, 0) and . We solve the above equation by numerically evaluating the right (|Rr⟩) and left (⟨Lr|) eigenvectors, and the eigenvalues ({λr}) of the operator, L. The solution of the master equation then can be written as , where |P(0)⟩ describes the probability distribution at t = 0 and . {an(0)} is calculated from the initial condition. The same scheme is used to calculate the probability distribution, QN(n, t|m, 0) using , where and are the eigenvalues and eigenvectors of L with an absorbing boundary condition at n = N, respectively. We define the survival probability, , which can be used to evaluate the first passage time distribution, FN(t|m, 0), following the relation, FN(t|m, 0) = −∂SN/∂t. The maximum value distribution functions are then calculated using the following equations:
(13) |
and
(14) |
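The eigenvector scheme outlined above can be sketched numerically as follows. This is a minimal illustration under stated assumptions and not necessarily the form of Eqs. (13) and (14): the generator is built from the assumed propensities (k1 + kp n)(N0 − n) and k−1 n, the left eigenvectors are obtained implicitly by solving a linear system with the matrix of right eigenvectors, and the normalized distribution of the maximum is obtained from the survival probabilities through Prob(max ⩽ N) = SN+1(T). Function names and parameter values are hypothetical.

```python
import numpy as np

def generator(N0, k1, kp, km1, absorb_at=None):
    """Master-equation operator on states 0..N0, or on 0..absorb_at-1 with absorption."""
    size = N0 + 1 if absorb_at is None else absorb_at
    n = np.arange(size)
    up = (k1 + kp * n) * (N0 - n)      # assumed production propensity
    down = km1 * n                     # assumed de-activation propensity
    L = np.zeros((size, size))
    L[n, n] = -(up + down)             # loss terms (includes the leak into the absorbing state)
    L[n[1:], n[:-1]] = up[:-1]         # gain of state i+1 from state i
    L[n[:-1], n[1:]] = down[1:]        # gain of state i from state i+1
    return L

def propagate(L, p0, t):
    """P(t) = sum_r exp(lambda_r t) |R_r> a_r, with a = R^{-1} p0."""
    lam, R = np.linalg.eig(L)
    a = np.linalg.solve(R, p0)
    return (R @ (np.exp(lam * t) * a)).real

def survival(N0, k1, kp, km1, N, m, t):
    """S_N(t): probability that n has stayed below N up to time t, starting from m."""
    if m >= N:
        return 0.0
    p0 = np.zeros(N)
    p0[m] = 1.0
    return propagate(generator(N0, k1, kp, km1, absorb_at=N), p0, t).sum()

def pmax_N(N0, k1, kp, km1, m, T):
    """Normalized distribution of the maximum of n over [0, T]."""
    cdf = [survival(N0, k1, kp, km1, N + 1, m, T) for N in range(N0)] + [1.0]
    return np.diff(np.concatenate(([0.0], cdf)))

# Reversible feedback example with hypothetical values (k1/kp = 0.01).
N0, k1, kp, km1, T = 20, 0.01, 1.0, 5.0, 0.5
p0 = np.zeros(N0 + 1)
p0[0] = 1.0
P_T = propagate(generator(N0, k1, kp, km1), p0, T)
pm = pmax_N(N0, k1, kp, km1, 0, T)
states = np.arange(N0 + 1)
print("mean of n(T):", states @ P_T, "mean of the maximum:", states @ pm)
```

The first passage time distribution follows from the same survival function through FN(t|m, 0) = −∂SN/∂t, for example by differentiating the eigen-expansion term by term. The eigen-decomposition assumes a diagonalizable operator, which can fail when eigenvalues coincide, as noted in the text for the feedback-only irreversible case.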
The shapes of the probability distributions, P(n, t|m, 0) and EN(t, T|m, 0), depend on the dimensionless parameters kpt, k1/kp, and k−1/kp for the feedback model and k1t and k−1/k1 for the linear model. We varied k−1 as well as t and T in the models to investigate the effect of the de-activation rate on the above distributions. We kept k1/kp fixed to a small non-zero value (0.01) with two goals in mind: (i) to prevent the n = 0 state from becoming an absorbing state, and (ii) to exclusively study the effect of the feedback, since the feedback model starts behaving like the linear model when k1/kp ≫ 1. The presence of a non-zero de-activation rate makes Pmax(N, T) different from P(n, T|m, 0) (Fig. 3a), since P(n, t|m, 0) no longer vanishes for m > n. The non-zero deactivation rate can give rise to bimodal distributions for P(n, T|m, 0) (Fig. 3a) in the feedback model, which are generated purely by stochastic fluctuations since the deterministic rate equation (Eq. 1) does not possess any bistability. Like P(n, T|m, 0), Pmax(N, T) can also be bimodal. However, the peaks in Pmax(N, T) occur at larger values of N compared to the values of n where P(n, T|m, 0) is peaked. This behavior is produced by stochastic trajectories that attained the maximum value n = N at times earlier than T. Since C* in the linear and the feedback models can attain the state n = N0 with a non-vanishing probability, Pmax(N, T) → δN, N0 as T → ∞ in both the models, making the distributions similar for both the models at long times. Therefore, the bimodal distribution in Pmax(N, T) for the feedback model will be transient. These results can have implications for transient bimodal distributions observed in experiments when the underlying signaling network predicts steady state bimodal distributions.
In such experiments, activation of the cells could be caused by a key signaling protein crossing the activation threshold for the first time, consequently, the distribution of activated cells will be represented more appropriately by Pmax(N, T) instead of P(n, T|m, 0). Next we calculated Pmax(t, T) using Eq. 14. As in the irreversible case, the distribution, Pmax(t, T), shows a higher peakedness for the feedback model as compared to the linear model (Fig. 3b). The larger values of the Fano factor (f) for the distribution Pmax(N, T) for a range of de-activation rates (Figs. 4a, 4c) and end times T demonstrate that Pmax(N, T) continues to have more variance for the feedback model compared to the linear model as long as the system is away from the steady state. The peakedness in Pmax(t, T) was characterized by the kurtosis, K, as in Sec. 2A. The feedback model produces positive and substantially larger values of K than that for the linear model (Figs. 4b, 4d), where, for most of the parameter values, K is negative, indicating a flatter distribution compared to a Gaussian distribution. Therefore, the above results demonstrate that even in the presence of de-activation of signaling molecules, the presence of the positive feedback prepares the single cells to respond to a wide range of activation thresholds in a well-defined time window in a noisy environment.
CONCLUSION
We have studied how commonly found nonlinear biochemical processes in cell signaling and gene regulatory networks, such as positive feedbacks, influence single cell decision processes in the presence of stochastic fluctuations when these decisions are regulated by key proteins attaining threshold concentrations within a time window. We analyzed the joint probability distribution of the maximum value of concentration of a molecular species and the time when the maximum concentration was reached instead of the probability distribution of the concentration of the molecular species at any time, as the latter distribution does not contain information about whether the molecular species attained the threshold concentration at an earlier time. We calculated the maximum value distributions exactly and semi-analytically in minimal models that can effectively describe linear and positive feedback interactions in biochemical reactions found in a wide range of cell signaling networks. In particular, we investigated the role of positive feedback interactions in affecting the shape of the maximum value distributions. We find that in the presence of a positive feedback interaction the maximum values of concentrations of the activated species are distributed more broadly compared to the linear model when the system is away from the steady state. However, the positive feedback produces a narrower distribution in the time when the maximum activation was achieved. Therefore, a positive feedback interaction, even in situations when stochastic fluctuations dominate signaling kinetics, provides cells the ability to respond when specific cellular proteins need to attain a wide range of threshold concentrations within a narrow time window to influence cell decision processes. This property of the positive feedback could play an important role when a cell population has to sensitively respond to a weak stimulus.
The biological significance of the presence of a “heavy tail” at time scales smaller than the most probable time scale in Pmax(t, T) in the presence of positive feedback interactions is less evident. Perhaps the positive feedback helps create a small reservoir of cells that can react to very weak stimuli with a range of relatively smaller response times when the majority of the cells in a cell population are destined to respond at a much longer, biologically irrelevant time scale. The probability distributions calculated from the solutions of the master equations for the minimal models indicate the presence of multiple time scales in the system. It will be interesting to see if this can produce multi-scaling behavior12 in extreme value distributions in such chemical reactions in general. Such examples will be qualitatively different from extreme value distributions in standard Brownian motion11 or models of fluctuating interfaces,10 which display single parameter scaling. In addition, the minimal models studied here are embedded in larger biological networks which ultimately determine cell fate responses; therefore, it will be important to investigate the behavior of maximum value distributions when the minimal models are connected to a larger network.
ACKNOWLEDGMENTS
This work was funded by the Research Institute at the Nationwide Children's Hospital and Grant No. 1R56AI090115-01A1 from the National Institutes of Health (NIH). I thank C. Jayaprakash and M. Kardar for discussions, S. Mukherjee for help with LAPACK routines, and anonymous reviewers for their helpful comments. This work is dedicated to my parents.
References
1. Das J., Ho M., Zikherman J., Govern C., Yang M., Weiss A., Chakraborty A. K., and Roose J. P., Cell 136, 337 (2009). doi:10.1016/j.cell.2008.11.051
2. Ferrell J. E. Jr. and Machleder E. M., Science 280, 895 (1998). doi:10.1126/science.280.5365.895
3. Gardner T. S., Cantor C. R., and Collins J. J., Nature (London) 403, 339 (2000). doi:10.1038/35002131
4. Gardiner C., Handbook of Stochastic Methods for Physics, Chemistry and Natural Sciences (Springer-Verlag, Heidelberg, 2004).
5. Delbruck M., J. Chem. Phys. 8, 120 (1940). doi:10.1063/1.1750549
6. McAdams H. H. and Arkin A., Proc. Natl. Acad. Sci. U.S.A. 94, 814 (1997). doi:10.1073/pnas.94.3.814
7. Weinberger L. S. and Shenk T., PLoS Biol. 5, e9 (2007). doi:10.1371/journal.pbio.0050009
8. Swain P. S., Elowitz M. B., and Siggia E. D., Proc. Natl. Acad. Sci. U.S.A. 99, 12795 (2002). doi:10.1073/pnas.162041399
9. Gumbel E. J., Statistics of Extremes (Dover, New York, 2004).
10. Majumdar S. N. and Comtet A., Phys. Rev. Lett. 92, 225501 (2004). doi:10.1103/PhysRevLett.92.225501
11. Majumdar S. N., Randon-Furling J., Kearney M. J., and Yor M., J. Phys. A 41, 365005 (2008). doi:10.1088/1751-8113/41/36/365005
12. Kadanoff L. P., Chin. J. Phys. 29, 613 (1991), accession number WOS:A1991GU01300006.
13. Gillespie J. H., Theor. Popul. Biol. 23, 202 (1983). doi:10.1016/0040-5809(83)90014-X
14. Kosmerlj A., Chakraborty A. K., Kardar M., and Shakhnovich E. I., Phys. Rev. Lett. 103, 068103 (2009). doi:10.1103/PhysRevLett.103.068103
15. Embrechts P., Kluppelberg C., and Mikosch T., Modelling Extremal Events for Insurance and Finance (Springer, Berlin, 2004).
16. Orr H. A., Evolution 56, 1317 (2002). doi:10.1554/0014-3820(2002)056[1317:TPGOAT]2.0.CO;2
17. Artomov M., Kardar M., and Chakraborty A. K., J. Chem. Phys. 133, 105101 (2010). doi:10.1063/1.3482813
18. Burkhardt T. W., Gyorgyi G., Moloney N. R., and Racz Z., Phys. Rev. E 76, 041119 (2007). doi:10.1103/PhysRevE.76.041119
19. Redner S., A Guide to First-Passage Processes (Cambridge University Press, Cambridge, 2001).
20. Goel N. and Richter-Dyn N., Stochastic Models in Biology (Academic, New York, 1974).
21. Honerkamp J., Stochastic Dynamical Systems (VCH, New York, 1993).
22. DeCarlo L. T., Psychol. Methods 2, 292 (1997). doi:10.1037/1082-989X.2.3.292
23. Gillespie D. T., J. Phys. Chem. 81, 2340 (1977). doi:10.1021/j100540a008
24. Bouchaud J.-P. and Mezard M., J. Phys. A 30, 7997 (1997). doi:10.1088/0305-4470/30/23/004
25. Katz R. W. and Brown B. G., Clim. Change 21, 289 (1992). doi:10.1007/BF00139728
26. Lis M., Artyomov M. N., Devadas S., and Chakraborty A. K., Bioinformatics 25, 2289 (2009). doi:10.1093/bioinformatics/btp387
27. See supplementary material at http://dx.doi.org/10.1063/1.4772583 for additional details.
28. Walczak A. M., Onuchic J. N., and Wolynes P. G., Proc. Natl. Acad. Sci. U.S.A. 102, 18926 (2005). doi:10.1073/pnas.0509547102