Biophysical Journal. 2004 Nov 12;88(2):828–850. doi: 10.1529/biophysj.104.050666

Self-Consistent Proteomic Field Theory of Stochastic Gene Switches

Aleksandra M Walczak, Masaki Sasai, Peter G Wolynes
PMCID: PMC1305159  PMID: 15542546

Abstract

We present a self-consistent field approximation approach to the problem of the genetic switch composed of two mutually repressing/activating genes. The protein and DNA state dynamics are treated stochastically and on an equal footing. In this approach the mean influence of the proteomic cloud created by one gene on the action of another is self-consistently computed. Within this approximation a broad range of stochastic genetic switches may be solved exactly in terms of finding the probability distribution and its moments. A much larger class of problems, such as genetic networks and cascades, also remain exactly solvable with this approximation. We discuss, in depth, certain specific types of basic switches used by biological systems and compare their behavior to the expectation for a deterministic switch.

INTRODUCTION

Genetic switch systems are an elementary means of regulatory control present in every living organism. Their complexity and details differ, but the general mechanism, that of the expression of a given gene being regulated by proteins, is believed to be universal (Ptashne and Gann, 2002). They are building blocks of larger regulatory elements: genetic networks and signaling cascades. The pathways by which these systems operate are passed on from generation to generation. Understanding their stability and characteristics is therefore fundamental. A lot of previous work has considered a deterministic description of genetic switches (Ackers et al., 1982; Hasty et al., 2001). The need for a stochastic treatment of genetic switches, due to the single copy of the DNA molecule and multiple protein molecules in the cell, has been largely recognized (Sneppen and Aurell, 2002; Kepler and Elston, 2001).

The most general way of accounting for nondeterministic processes is to write down the master equation for a given system. To define the state of the switch, one must specify the DNA binding states of particular genes and the number of proteins of each type. The probability distribution for even a single switch consisting of two genes, the product proteins of which act as regulator proteins for the system, may not be determined exactly and approximations must be considered (Bialek, 2001; Hasty et al., 2000; Sneppen and Aurell, 2002).

Several approaches to account for the probabilistic nature of chemical reactions have been undertaken, ranging from the Langevin description of single genes (Bialek, 2001), and two interacting gene switches (Hasty et al., 2000), to the master equation reduced to Fokker-Planck equation considerations (Kepler and Elston, 2001; Hasty et al., 2001a). A dynamical action formulation has also been used (Sneppen and Aurell, 2002) to determine the lifetimes of states of the switch. A popular alternative to purely analytical methods, which often need to make approximations or are limited to very simple model systems, has been to conduct stochastic simulations of genetic switches. Two types of simulations are mostly used. In the first, the randomness of the system is introduced by means of a Monte Carlo algorithm with a fixed time step (Paulsson et al., 2000). The second is based on the Gillespie algorithm (Gillespie, 1977) to predict the probability of a given reaction occurring (Arkin et al., 1998). For single-gene systems, stochastic simulations have shown that the stochasticity in the system is responsible for the bimodal probability distributions (Cook et al., 1998) that have been experimentally observed. These methods prove very useful, because they allow us to test the theoretical predictions on model systems that might be hard to build experimentally. However, this approach often does not enable us to gain intuition or insight into the mechanisms behind the functioning of the system. The aim of the present work is to gain a better and deeper understanding of the device physics of genetic switches. We therefore, contrary to many important previous discussions (McAdams and Arkin, 1997; Aurell et al., 2002; Vilar et al., 2003), do not present a specific concrete biological system, but discuss generic behavior and try to understand its sources. 
Our approximation also allows for an exact solution of a broad class of genetic switch systems without any further assumptions and with little computational effort. Hasty et al. (2001b) present an overview of the existent theoretical approaches.

A popular approximation assumes that the DNA binding state reaches equilibrium much faster than the protein number state. The adiabatic approximation is therefore often considered (Ackers et al., 1982; Sneppen and Aurell, 2002; Darling et al., 2000), allowing for a thermodynamic treatment (Ackers et al., 1982) of the DNA binding state. The protein number fluctuations are then treated stochastically. Even before the statistical thermodynamics approach of Ackers et al. (1982) using partition functions, much previous work assumed that DNA binding and unbinding can simply be accounted for by an equilibrium constant, since the relaxation timescales for equilibration of the DNA state are much shorter than those of the protein numbers, which require protein synthesis and degradation to change. The partition function approach has also been successful for looking at logic gates built from switches (Buchler et al., 2003). The adiabatic approximation is believed to hold true in many cases, judging by the experimental parameters of biological switches (Darling et al., 2000). But as the experiments of, for example, Becskei et al. (2001) show, not all switches need to function within the adiabatic limit, and the nonadiabatic limit may result in new phenomena. We therefore consider a wide range of parameter ratios in our discussion.

In this article we explore more fully an approximation, previously used by Sasai and Wolynes (2003) for the variational treatment of the problem, the self-consistent proteomic field (SCPF) approximation. Within this approximation one assumes that the probability of finding the switch in a given state is a product of the probabilities of states of individual genes. One can then solve the steady-state master equation for the probability distribution of many regulatory systems exactly. We discuss the approximation and present a detailed study of different classes of genetic switches, some of which have never been considered theoretically. We consider separately several particular features of such systems that are found in known switches, to be able to characterize their contributions to the behavior of the whole system. To be specific, starting from a symmetric toggle switch, we go on to compare the effects of multimer binding, and of the production of proteins in bursts, on the stability of the switch.

The stochastic effects prove to be modest for symmetric switches without bursts, especially if the genes have a basal production rate. We find the deterministic and stochastic SCPF solutions to have similar probabilities of particular genes to be on, and similar mean numbers of proteins of a given species in the cell. However, in the nonadiabatic limit, when the unbinding rate from the DNA is smaller than the death rate of proteins, the probability distributions have two well-defined peaks, unlike in the deterministic approximation or adiabatic limit of the stochastic SCPF solution.

We also show that the effect of stochasticity on the observables becomes more apparent when proteins are produced in bursts. In these types of switches, the definition of the adiabatic limit, which was clear for the switches in which proteins are produced separately, is no longer simple. Our discussion shows that the properties of genes often analyzed in the deterministic limit may be strongly influenced by stochasticity in this case. Randomness in a biological reaction system leads to quantitative and, in many examples, even qualitative changes from the predictions of deterministic models.

We also discuss the differences in the behavior of asymmetric and symmetric switches. We point to the mechanisms resulting in different types of bifurcations and show how they are influenced by noise. Within the SCPF approximation, switches that are regulated by binding and unbinding of monomers do not have regions of bistability. This holds true for both symmetric and asymmetric switches. When proteins are produced individually rather than in bursts, fast unbinding from the DNA can effectively minimize the destructive effect of protein number fluctuations on the stability of the DNA binding state. Furthermore, a detailed analysis of the probability distributions shows that they have long tails and are far from Poissonian in both the adiabatic and nonadiabatic limits. We discuss the properties of the system in terms of clouds of proteins buffering the DNA. We show how fast or slow DNA binding characteristics and protein number fluctuations influence the stability of the buffering clouds, leading to specific emergent behavior of observables. Throughout the article, results of the exact stochastic solution within the self-consistent proteomic field approximation are compared with solutions of the deterministic kinetic equations for the system.

We establish a base of potential building blocks of more complicated switches and systems, such as networks and signaling cascades, for which an exact solution within the present approximation can also be obtained. A detailed discussion of these larger systems will be the topic of another article. We also present limitations of the present style of analysis where exact solutions are not possible.

There are two aims of this article. The first is to discuss the self-consistent field approximation and show that it has an exact solution that could be extended to a large class of systems. This approximation lets one deal in a straightforward and computationally inexpensive manner with the effect of random processes on genetic networks. The second is to discuss the many components of biological switches present in nature and in engineered systems, in the necessary stochastic framework.

THE SELF-CONSISTENT PROTEOMIC FIELD APPROXIMATION

The basic mechanism of gene transcription regulation in prokaryotes may be reduced to the binding and unbinding of regulatory proteins, repressors, and activators, to the operator site of the DNA. If we use this simplified treatment, which neglects extra levels of regulation such as the binding of RNA polymerase, each gene can effectively be described as being either in an active (on) state, when the repressor is unbound (activator bound), or in an inactive (off) state, with the repressor bound (activator unbound). The stochastic system of a single gene and its product proteins is described by the joint probability distribution $P_j(n)$ of the number of product proteins in the cell, n, and the DNA binding-site state j, with on (protein not bound) labeled j = 1 and off (protein bound) labeled j = 2. To conserve probability, $\sum_{j=1}^{2}\sum_{n=0}^{\infty} P_j(n) = 1$.

If one considers two interacting genes, the description in terms of a joint probability vector needs to be extended to four states: both genes may be on, or off; or one of the genes may be on, and the other off. If the two genes do not interact, as would be the case for two self-regulatory proteins, the probability of finding the two-gene system in a given state, defined by both the number of product proteins and the DNA binding-site state, would be the product of the states of particular genes Pjj′(n1,n2;t) = Pj(n1;t)Pj′(n2;t). This is generally not true for two interacting proteins, as is the case in a genetic switch. However, as a first approximation to the problem, one can ignore correlations between the spaces of the two genes and assume the space of the switch is a sum of spaces of the genes that compose it. Since we are looking for solutions in which the symmetry of the system is broken and different behaviors of the on- and off-state of a gene are possible, we must allow for different probability distribution functions for the on- and off-states. This is analogous to the unrestricted Hartree approximation in quantum mechanics, where allowing different spatial functions for spin-up and spin-down states results in breaking of the symmetry of the bound molecular orbital solution to the dissociated solution of two separate hydrogen atoms with opposite spin-states for large internuclear distances. We therefore allow for multiple solutions for a given set of parameters. The total probability of having a given gene state j and ni proteins of that type is simply given by Pj(ni,ni′) = Pj,j′=0(ni,ni′) + Pj,j′=1(ni,ni′).

The self-consistent approximation is a crude one, since in the case of the genetic switch, the state of a given gene is often determined by the number of protein products of the other gene. However, within this approximation, one can solve the master equation for the probability distribution exactly, without any further approximations. This yields a powerful computational tool, which simultaneously gives useful insight.

THE TOGGLE SWITCH

For clarity of exposition, we show how the problem may be solved exactly within the self-consistent proteomic field approximation for the well-defined system of the toggle switch. We then extend the method to other systems. The elementary system we use as an example is composed of two genes, labeled 1 and 2, as presented in Fig. 1. Gene 1 produces proteins of type 1, which act as regulatory proteins, i.e., repressors, on gene 2. The products of gene 2, proteins of type 2, in turn repress gene 1. In this simplified model, we assume that protein production occurs instantaneously upon unbinding of the repressor. For now, we assume that repressor proteins bind as dimers, since that is a common scenario in biological systems, but we do not treat dimerization kinetics explicitly. For simplicity, the coupling between the genes responsible for binding will be taken to be of the form $h_i n_{3-i}^{p}$, where p is the order of the multimerization of the repressor. This form is a slight approximation to the more exact $h_i n_{3-i}(n_{3-i}-1)\cdots(n_{3-i}-p+1)$. We have checked that using the simpler monomial does not influence the results in any regime discussed. We also do not account for the existence of mRNA molecules and the consequent time delays owing to their synthesis as intermediates. The extensions of the model are discussed later.

FIGURE 1

A schematic representation of the toggle switch. Gene 1 produces proteins of type 1, which repress gene 2; and gene 2 produces proteins of type 2, which repress gene 1.

Within the self-consistent proteomic field approximation, the set of master equations for the corresponding system is of the form

[Equation 1: source images M4.gif and M5.gif]

for n ≥ 1, where i = 1, 2 is the gene label. P1(n1) is the probability that gene 1 is in the on-state with n1 protein molecules of type 1 in the cell. The first term on the right-hand side of Eq. 1 describes the production of proteins of type i with a production rate gj(i), where j = 1, 2, depending on whether the gene is in the on- or off-state. The second term accounts for the destruction of proteins with rate ki. The binding of repressor proteins produced by the other gene occurs with rate hi and is proportional to the number of dimers formed by the n3−i repressor molecules present in the system. We assume unbinding occurs with a constant rate fi. Binding and unbinding contribute to the kinetics of the DNA binding states, as described by the last two terms. This set is supplemented by the Pj(ni = 0) equations to account for boundary conditions.

[Equation 2: source images M6.gif and M7.gif]
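
The typeset forms of Eqs. 1 and 2 are not available in this text. A form consistent with the term-by-term description above, offered as a reconstruction rather than the authors' own typesetting, would read as follows for gene i with dimer binding. The superscript gene labels and the grouping of terms are assumptions.

```latex
\[
\begin{aligned}
\frac{dP_1(n_i)}{dt} &= g_1^{(i)}\bigl[P_1(n_i-1)-P_1(n_i)\bigr]
  + k_i\bigl[(n_i+1)P_1(n_i+1)-n_iP_1(n_i)\bigr]
  - h_i\,n_{3-i}^{2}\,P_1(n_i) + f_i\,P_2(n_i),\\
\frac{dP_2(n_i)}{dt} &= g_2^{(i)}\bigl[P_2(n_i-1)-P_2(n_i)\bigr]
  + k_i\bigl[(n_i+1)P_2(n_i+1)-n_iP_2(n_i)\bigr]
  + h_i\,n_{3-i}^{2}\,P_1(n_i) - f_i\,P_2(n_i).
\end{aligned}
\]
```

At n_i = 0 the terms containing P_j(n_i − 1) and n_i P_j(n_i) are absent, which would give the boundary equations. The binding and unbinding terms enter the two lines with opposite signs, so total probability is conserved.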

For convenience, let us define Inline graphic the probability of finding the DNA binding site in a given state. One can now sum the Pj(1) equations over the number states of the second protein with P1(2) + P2(2), and likewise the Pj(2) equations. Due to the SCPF approximation, the only term affected is the repressor binding term Inline graphic and since Inline graphic the summation results in Inline graphic where Inline graphic is the second moment of the number distributions of type 2 proteins produced when gene 2 is in the jth state. The equations of motion of the moments of the probability distribution are of the form

[Equation 3: source image M13.gif]

The steady-state equations for the moments of the distributions that follow are closed-form; the Inline graphic order moment equation of motion depends only on the lower moments of the ith gene and Inline graphic

To analyze the behavior of switches we introduce the following scaled parameters: the adiabaticity parameter ωi = fi/ki, which represents the characteristic rate of change of the DNA state compared to the characteristic rate of change in protein number; Xeq, which measures the tendency for proteins to be unbound from the DNA; Xad, the effective production rate; and δXsw, which distinguishes between the two DNA states in terms of protein dynamics. We present a detailed derivation of the moment equations in Appendix A.
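
The explicit definitions of the last three scaled parameters are missing from this text. Definitions consistent with relations used later in the article are sketched below as an inference, not a quotation. The relations relied on are 〈n(i)〉 = 2Xad when the gene is almost always on and the off-state rate is zero, C1(i) = (1 + 〈n(3−i)〉²/Xeq)⁻¹ for the deterministic dimer switch, and Xad = δXsw for zero basal production.

```latex
\[
\omega_i = \frac{f_i}{k_i}, \qquad
X^{\mathrm{eq}}_i = \frac{f_i}{h_i}, \qquad
X^{\mathrm{ad}}_i = \frac{g_1^{(i)} + g_2^{(i)}}{2k_i}, \qquad
\delta X^{\mathrm{sw}}_i = \frac{g_1^{(i)} - g_2^{(i)}}{2k_i}.
\]
```

With these assumed forms, setting the off-state rate g_2 to zero gives X^ad = δX^sw = g_1/(2k), and balancing binding against unbinding in the deterministic limit, h n² C_1 = f (1 − C_1), gives C_1 = (1 + n²/X^eq)⁻¹.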

The resulting equations for the 0th moments couple to the higher moments by the interaction function F(i). These lower moments can be solved self-consistently. The resulting solution predetermines all the other moments, which completely describe the probability distribution. Each gene therefore couples to the other gene by the influence of the self-consistently generated proteomic field. One could define the generating function and calculate the probabilities of having a given DNA binding state j for the ith gene when there are ni proteins of type i in the cell. In practice, it is easier to go back to the steady-state master equation and solve directly for the probability distributions than sum an infinite number of moments. Rewriting the steady-state master equation (Eq. 1) one gets

[Equation 4: source images M19.gif and M20.gif]

These sets of equations give recursion relations for Pj(ni) that one can use to express Pj(ni) as a function of P1(0) and P2(0). The normalization condition then gives Pj(0) in terms of constants, and the result is the probability function Pj(ni) as a series. The SCPF approximation reduces the two-gene problem to a one-gene problem parameterized by the moments of the second gene, which can be worked out independently, as we have already shown. These moments enter through F(3−i), which is a constant in this calculation.
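
The reduction to a one-gene problem in a constant field can be illustrated numerically. The sketch below is not the recursion described above: it swaps in a direct linear solve of the steady-state master equation on a truncated protein-number range, which is simpler to make robust. The function names, the truncation `n_max`, the dimer-type binding rate `h * field`, and all parameter values are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            if factor != 0.0:
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def steady_state(g_on, g_off, k, h, f, field, n_max=60):
    """Steady state of one gene in a fixed field F of the other gene.

    Returns (p_on, p_off) with p_on[n] = P_1(n) and p_off[n] = P_2(n).
    """
    dim = 2 * (n_max + 1)
    a = [[0.0] * dim for _ in range(dim)]   # generator: dP/dt = a P

    def add_rate(src, dst, rate):
        a[dst][src] += rate                 # gain of the destination state
        a[src][src] -= rate                 # loss of the source state

    for n in range(n_max + 1):
        on, off = 2 * n, 2 * n + 1          # interleaved state indices
        if n < n_max:                       # production, cut off at n_max
            add_rate(on, on + 2, g_on)
            add_rate(off, off + 2, g_off)
        if n > 0:                           # degradation
            add_rate(on, on - 2, k * n)
            add_rate(off, off - 2, k * n)
        add_rate(on, off, h * field)        # repressor binding
        add_rate(off, on, f)                # repressor unbinding

    a[-1] = [1.0] * dim                     # swap one equation for sum(P) = 1
    p = solve_linear(a, [0.0] * (dim - 1) + [1.0])
    return p[0::2], p[1::2]


def second_moment(p_on, p_off):
    """Second moment of the protein number, summed over both DNA states."""
    return sum(n * n * (x + y) for n, (x, y) in enumerate(zip(p_on, p_off)))


# Hypothetical parameters: equal production in both DNA states.
p_on, p_off = steady_state(g_on=10.0, g_off=10.0, k=1.0, h=0.01, f=0.5, field=50.0)
c_on = sum(p_on)
mean_n = sum(n * (x + y) for n, (x, y) in enumerate(zip(p_on, p_off)))
```

With equal production rates in both DNA states the protein number decouples from the DNA state, so the marginal distribution should be Poissonian with mean g/k, and the on-probability should equal f/(f + hF). This gives a convenient check of the solver. A full SCPF calculation would alternate such solves between the two genes, feeding the second moment of one gene in as the field of the other until the values stop changing; convergence of that iteration is not examined here.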

To see the effect of the stochastic nature of the system, we compare the exact solutions of the self-consistent field approximation equations to the results that would follow from deterministic kinetic rate equations for the number of proteins of each type and the fraction of on/off DNA binding states for each gene,

[Equation 5: source image M22.gif]

where n(i) is the number of proteins of type i present in the cell. For the case discussed above, the exact SCPF equations reduce to the deterministic kinetic equations in the limit of large ω and Xad. The F(3−i) term in the stochastic SCPF equations is replaced by the n(3−i)² term in the deterministic kinetic rate equations. For the toggle switch, where repressors bind as dimers, it is easily shown that the interaction function may be rewritten in the form

[Equation 6: source image M23.gif]

which in the large ω-limit reduces to F(i) = 〈n(i)〉² + 〈n(i)〉. For large mean numbers of proteins in the cell, which correspond to large effective production rates Xad, the linear term is a small correction: when 〈n(i)〉 is of the order of hundreds, 〈n(i)〉 is small compared to 〈n(i)〉². We therefore reproduce the deterministic kinetics result.
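
The size of this correction can be made explicit. The step below assumes that the protein number is Poisson distributed in the large-ω limit, so that its variance equals its mean, which is what the limiting form of F(i) suggests.

```latex
\[
F^{(i)} = \langle n^{2}\rangle = \langle n\rangle^{2} + \mathrm{Var}(n)
        = \langle n\rangle^{2} + \langle n\rangle ,
\qquad
\frac{F^{(i)} - \langle n\rangle^{2}}{\langle n\rangle^{2}} = \frac{1}{\langle n\rangle}.
\]
```

For 〈n〉 = 100, for instance, the linear term changes F by 1%.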

As shown by Sasai and Wolynes (2003), the difference in the probability that gene 1 is active and that gene 2 is active, ΔC = |C1(1)−C1(2)|, plays the role of an order parameter. We can now consider a family of switches and discuss their stability, sensitivity of regions of bistability to control parameters, and types of bifurcations.

THE SYMMETRIC TOGGLE SWITCH

For pedagogic purposes we will start by analyzing the single symmetric toggle switch, such as discussed above, in which repressors bind as dimers, with ω1 = ω2 = ω and equal values of Xeq and Xad for the two genes, as it is the most intuitive and shows the most generic behavior. It is an academic example, as even individual genes in switches engineered in the laboratory mostly have different chemical parameters. Yet a lot can be learned from this simple system.

The general mechanism of the phase transition

Fig. 2 shows the phase diagrams for the system, |ΔC|, as a function of reservoir protein number and the adiabaticity parameter for the exact SCPF equations for growing values of Xeq, the parameter describing the tendency for proteins to be unbound from the DNA. The deterministic kinetics and exact SCPF approximations give qualitatively similar results. The analogous deterministic kinetic phase diagrams agree with the SCPF solutions in the large ω- and Xad-limit; hence they become more similar with growing Xeq, as the bifurcation occurs at larger effective production rates for larger Xeq. For large fluctuations and a small unbinding rate, neither gene 1 nor gene 2 is favored, and the probability of a given gene to be on is determined solely by the effective production rate of the other gene and decreases in a quadratic manner as the number of repressor proteins grows (Fig. 3). Since the switch is symmetric, the system has one stable state, ΔC = 0, where the probabilities of the genes to be on are equal. As the relative protein number fluctuations get smaller and the DNA unbinding rate grows, a proteomic cloud buffers the repressed gene, keeping it repressed. The symmetry of the system is broken and the solution bifurcates into two separate basins of attraction. For the stochastic SCPF equations the bifurcation takes place at larger effective production rates (larger Xad) than for the deterministic equations, even in the large ω-limit, which reflects their sensitivity to fluctuations. The critical number of reservoir proteins necessary for the bifurcation of the solution to take place is the same in both approximations and is determined by 〈n〉c = (Xeq)½ (Fig. 3). In the discussed example, 〈n〉c ≈ 32 ≈ 1000½ for Xeq = 1000. For the deterministic kinetic switch the bifurcation takes place when C1(i) = (1 + 〈n(3−i)〉²/Xeq)⁻¹ = 0.5, due to the simple form of the interaction function, which equals 〈n(3−i)〉² = (2XadC1(3−i))². So C1(i) = 0.5 is equivalent to 〈n(3−i)〉²/Xeq = 1.
In a noisy system larger effective production rates are needed to achieve the critical value of proteins. The interaction function in this case may be written as Inline graphic and Inline graphic always. So at 〈n〉c, F(3−i)/Xeq > 1 and the probability of the genes to be on is <0.5, therefore Inline graphic The mechanism of the bifurcation requires the two genes to be more likely to be bound than unbound for the phase transition to take place. The curvature of the null clines presented in Fig. 2 can be simply worked out to be of the form Inline graphic with ζi, ξi constants determined by the specific values of C1(1), C1(2).
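
The deterministic bifurcation condition quoted above can be checked with a few lines of code. The sketch assumes the deterministic steady state C1(i) = (1 + (2XadC1(3−i))²/Xeq)⁻¹ for both genes, with zero basal production. The closed form used for the asymmetric branch is derived here, not taken from the article: subtracting the two steady-state conditions gives (C1(1) − C1(2))(1 − a C1(1) C1(2)) = 0 with a = 4Xad²/Xeq, so the asymmetric solutions satisfy C1(1) + C1(2) = 1 and C1(1)C1(2) = 1/a. Function names are hypothetical.

```python
import math


def symmetric_solution(a):
    """Root of a*C**3 + C - 1 = 0 in [0, 1] (both genes equally on), by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if a * mid ** 3 + mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def on_probabilities(x_ad, x_eq):
    """Deterministic on-probabilities of the two genes of a symmetric dimer switch.

    Below the bifurcation only the symmetric solution exists. Above it, the
    asymmetric pair is returned; the symmetric root still exists there, and
    stability is not examined in this sketch.
    """
    a = 4.0 * x_ad ** 2 / x_eq
    if a <= 4.0:
        c = symmetric_solution(a)
        return c, c
    root = math.sqrt(1.0 - 4.0 / a)
    return 0.5 * (1.0 + root), 0.5 * (1.0 - root)


def delta_c(x_ad, x_eq):
    """Order parameter |C1(1) - C1(2)|."""
    c_a, c_b = on_probabilities(x_ad, x_eq)
    return abs(c_a - c_b)


# Xeq = 1000: the asymmetric branch appears between Xad = 31 and Xad = 32.
below = delta_c(31.0, 1000.0)
above = delta_c(32.0, 1000.0)
```

The asymmetric branch exists only for a ≥ 4, that is for Xad ≥ (Xeq)½, and at the threshold both on-probabilities equal 0.5 and 〈n〉 = 2Xad · 0.5 = (Xeq)½, in line with the critical protein number stated in the text.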

FIGURE 2

Phase diagram obtained as an exact solution within the SCPF approximation for the single symmetric switch when repressors bind as dimers with Xeq = 1 (A), 100 (B), and 1000 (C). Contour lines mark values of ΔC.

FIGURE 3

Probability that genes are in the active state (A), the mean number of proteins of each type present in the cell 〈n(i)〉(B), and the mean number of proteins of each type present in the cell if gene i is in the on-state 〈n1(i)〉 (C) as a function of Xad = δXsw for a symmetric switch. Exact solutions of the SCPF approximation equations compared with deterministic kinetic rate equations solutions, for a single symmetric switch, Xeq = 1000, ω = 0.5.

Adiabaticity parameter dependence

As the adiabaticity parameter decreases, the area of phase space corresponding to multiple solutions decreases (Fig. 2). For very small values of the adiabaticity parameter, there exists only one solution that corresponds to a state in which the two genes are off. The value of ω below which only one solution exists decreases with the tendency for proteins to be bound, but exists for all values of Xeq. Therefore if the two genes have very high repressor binding affinities, the critical number of proteins necessary for the phase transition to take place cannot be formed, even for very high production rates. This region of parameter space where one solution is possible corresponds to a situation in which a buffering proteomic cloud may not form, due to a very fast destruction rate of proteins or a very small unbinding rate from the DNA. The critical number of proteins necessary for the bifurcation to occur grows with the tendency for proteins to be unbound from the DNA (Xeq), as the cloud buffering the genes needs to be bigger and exhibit smaller relative protein number fluctuations, which effectively decrease with the growth of the adiabaticity parameter. This is further discussed in terms of the probability distributions. Therefore a monostable solution exists at all values of the effective growth rate, Xad, for larger values of ω at large Xeq than at smaller Xeq values. The bifurcation point is a result of competition between the number of reservoir repressor proteins and the tendency for proteins to be unbound from the DNA. This is clear from the dependence of the number of proteins present in the cell at the bifurcation point on the relative values of Xad and Xeq, but not the adiabaticity parameter ω.

Mean protein numbers

The total number of proteins present in the cell, produced both in the on- and off-states, asymptotically away from the bifurcation points is the same for the deterministic and stochastic approximations. It is given by 〈n(i)〉 = 2Xad when C1(i) ≈ 1, that is, when the probability of the gene to be on is close to unity. The number of proteins of a given type present in the cell, when the gene that produces them is in the on-state, is always considerably smaller in the noisy system than in the deterministic case (Fig. 3 C). Since the production rate in the off-state was assumed zero, in the deterministic case no proteins of a given type are present in the cell if the gene is in the off-state, unlike in the noisy system. Therefore the number of proteins in the deterministic system is nonzero only if the gene is on. But interaction of the DNA binding state with the proteins buffering it results in a residual number of proteins present in the off-state for all values of ω. The region of bistability of the switch in parameter space grows as the binding rate increases with respect to the unbinding rate, stabilizing the DNA binding states. As the susceptibility of the system to fluctuations increases, the deterministic equations prove to be a poor approximation to describe the state of the system.

Gene-buffering proteomic cloud interactions

The stochastic nature of the system also manifests itself at the DNA level (Fig. 2). As the tendency for proteins to be unbound from the DNA grows, the area of parameter space wherein multiple solutions are possible decreases, since a larger number of proteins is needed to reach a state in which two genes are more likely to be repressed (protein bound state) than at small Xeq. For small unbinding rates or large binding rates, regardless of the ratio of the rate of unbinding of repressors from the DNA to protein degradation, bistability requires smaller numbers of proteins, which correspond to larger relative fluctuations, than for large Xeq. Therefore a larger unbinding rate relative to the binding rate makes the system more susceptible to protein number noise. Competition between Xeq and 〈n(i)〉 results in Xeq, for a given null cline, being a parabolic function of Xad, for the dimer binding case, with coefficients determined by ω and C1(i). This is easily generalized to higher order functions for higher order (p) oligomers, and results in p-order dependence. The switching region, by which we mean the region of parameter space between the bifurcation point and ΔC > 0.9, decreases as the binding and unbinding rates become comparable (Xeq decreases). As discussed above, the probability of the genes to be on at the bifurcation point tends to 0.5 as the adiabaticity parameter grows (Fig. 3); therefore the probability to be on has to increase by a smaller ΔC to reach C1(i) = 1. The switching region therefore also decreases as the unbinding rate from the DNA grows, since smaller effective production rates are needed to reach ΔC = 1 than for small ω. Small values of ω correspond to large fluctuations in the DNA binding state as well as the protein number state, and result in destabilizing the gene-buffering protein cloud interactions. Hence very large effective production rates are needed for ΔC > 0.9.
Therefore the DNA unbinding rate must become considerably faster compared to the protein degradation rate for the switch to have two stable solutions in a large region of parameter space.

The probability distributions

A better understanding of the bifurcation can be gained from examining the probability distributions. Fig. 4, A and B, and Fig. 4, C and D, show the evolution of the probability distributions of gene 1 and gene 2, respectively, to be on and off as functions of Xad. The peak of the distribution decreases and the width spreads out as the control parameter grows, until it reaches the bifurcation point at Xad = 44. Then the value of the probability function corresponding to the most probable number of proteins grows again. The spread of the functions grows as the effective production rate in the on-state increases; it narrows, however, with the increase of the adiabaticity parameter, as would be expected, since the DNA state fluctuations become smaller with ω. The average number of proteins in the cell in the on-state (ΔC > 0.9) does not show a dependence on ω. Yet as the unbinding rate from the DNA becomes very fast compared to the protein number fluctuations, the system switches often between the two states, hence a large number of proteins is present even in the off-state. If the DNA unbinding rate is small, the protein number characteristics follow the DNA state having time to reach a steady state within each well, before the DNA binding site switches into the other state, so the number of proteins in the off-state falls to zero (Fig. 5, A and B). This results in a two-peak, bimodal probability distribution (Fig. 4). If ω is large, random fluctuations in the DNA state do not change the effective state of the system, since a residual high mean protein number is present even in the off-state. In such a case, lower effective production rates than for small ω result in higher protein yields, and hence smaller switching regions.

FIGURE 4

Evolution of the probability distributions for the gene that will be active (on) after the bifurcation to be on (A) and off (B), and for the gene that will be inactive (off) to be on (C) and off (D), as a function of the control parameter Xad for a symmetric switch. The bifurcation occurs at Xad = 44, Xeq = 1000, ω = 0.5.

FIGURE 5

Probability distributions for the gene to be in the on-state (A) and off-state (B) for a gene in the active state for different values of the adiabaticity parameter ω = 0.5, 10, 100. Xeq = 100, Xad = δXsw = 100. Comparison of probability distributions obtained by exactly solving the steady-state equations in the SCPF approximations with analogous Poissonian distributions (C and D). Symmetric switch, Xad = 44, Xeq = 1000, ω = 0.5.

For small ω one might expect Poisson distributions of proteins in each of the DNA states, since the unbinding rate from the DNA is smaller than the protein degradation rate, so the proteins may reach a steady state without the DNA state changing. Hence, the proteins would effectively feel the effect of only one well and be subject to a simple birth/death process. This is not true, however. The difference between the exact solution and a solution obtained within a Poissonian approximation to the state of the system is surprisingly large, owing to the skewed tails of these distributions. Fig. 5, C and D, compares these probability distributions with distributions for the same system if one assumes a Poissonian probability function. The distributions obtained as an exact solution within the SCPF approximation are clearly not symmetric, but exhibit long tails toward zero. Therefore, although the most probable values of the two types of distributions are similar, noise has a destructive impact on the system, resulting in a larger probability of having a smaller number of proteins in the cell than expected based on a Poissonian distribution, whose variance and higher cumulants are all equal to the mean. Consequently, a larger production rate is needed for one of the states to be favored as a result of noise than that predicted from a symmetric probability distribution. The most probable number of proteins in the on-state, if the unbinding from the DNA is slow, is zero, unlike the number predicted by Poissonian distributions. The influence of noise on protein number fluctuations brings the protein-number means down, as can also be seen from Fig. 3 C. Overall, the spread of the probability distributions is large, and their characteristics for small values of the control parameters are different from those predicted by Poissonian distributions, let alone by deterministic kinetic equations; therefore the effects of stochasticity may not be neglected.
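The departure from a Poissonian picture can be illustrated with a minimal sketch. It is not the SCPF calculation: it assumes a single gene whose DNA state flips at constant rates `f` (off to on, unbinding) and `h` (on to off, binding), so the dependence of the binding rate on the repressor number is dropped. The function names, the truncation `n_max`, and all numerical values are illustrative assumptions; ω is taken as f/k, the ratio of the DNA unbinding rate to the protein degradation rate.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            if factor:
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def two_state_gene(g_on, g_off, k, f, h, n_max=60):
    """Steady state of the truncated master equation for one two-state gene.

    Returns (p_off, p_on): joint probabilities of DNA state and protein number n.
    """
    size = 2 * (n_max + 1)
    index = lambda s, n: s * (n_max + 1) + n
    A = [[0.0] * size for _ in range(size)]

    def add_rate(src, dst, rate):
        A[dst][src] += rate   # probability flowing into dst
        A[src][src] -= rate   # probability leaving src

    for s, g, flip in ((0, g_off, f), (1, g_on, h)):
        for n in range(n_max + 1):
            here = index(s, n)
            if n < n_max and g > 0:
                add_rate(here, index(s, n + 1), g)      # production
            if n > 0:
                add_rate(here, index(s, n - 1), k * n)  # degradation
            add_rate(here, index(1 - s, n), flip)       # DNA state change

    A[-1] = [1.0] * size              # replace one equation by normalisation
    b = [0.0] * (size - 1) + [1.0]
    p = solve_linear(A, b)
    return p[:n_max + 1], p[n_max + 1:]


def summarize(p_off, p_on):
    """Marginal protein distribution with its mean, Fano factor and mode."""
    marginal = [a + b for a, b in zip(p_off, p_on)]
    mean = sum(n * p for n, p in enumerate(marginal))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(marginal))
    mode = max(range(len(marginal)), key=marginal.__getitem__)
    return marginal, mean, var / mean, mode


# Slow (omega = f/k = 0.05) versus fast (omega = 50) DNA switching, zero basal rate.
slow = two_state_gene(g_on=20.0, g_off=0.0, k=1.0, f=0.05, h=0.05)
fast = two_state_gene(g_on=20.0, g_off=0.0, k=1.0, f=50.0, h=50.0)
slow_marginal, slow_mean, slow_fano, slow_mode = summarize(*slow)
fast_marginal, fast_mean, fast_fano, fast_mode = summarize(*fast)
```

In this toy model both cases have the same mean, but for slow switching the marginal distribution has one peak at n = 0 and another near g_on/k, with a Fano factor far above the Poisson value of 1. The mean conditional on the on-state also falls below g_on/k, which reflects the tail toward zero.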

The nonzero basal effective production rate case

The above analysis concerns a switch with a zero basal production rate, so proteins were not produced in the off-state. In a number of biological systems (Ptashne and Gann, 2002) a nonzero basal production rate exists and we now turn to consider the effect of this on a symmetric switch. Fig. 6 B shows the dependence of the bifurcation curves for different values of the effective basal production rate g2/(2k). Values <1, when the death rate is larger than the production rate, show that, for the symmetric switch, assuming the effective production rate to be zero in the off-state is a reasonable approximation. If the on-state has a positive input to the number of reservoir proteins present due to g2/k > 1, the probability of the active gene to be on, even for very large on-state effective production levels Xad, is <1. Hence the off-state contributes considerably to the steady-state number of proteins. The solution that corresponds to the more active of the two states may effectively be an off-state, since it has C1(i) < 0.5, although the effective production rate in the on-state in the bifurcated region of parameter space is much larger than in the off-state (for example, the g2/(2k) = 20 line in Fig. 6 B). As the effective basal production rate increases, a larger production rate in the on-state than for small g2/(2k) > 1 is required to reach the critical number of proteins for the bifurcation to take place, which is given by 〈n(i)〉= 2XadC1(i) − g2/k(2C1(i) − 1). For this reason, even for the deterministic approximation at the bifurcation point, the two genes must be more probable to be off, as can also be seen for the exact SCPF solutions from the probability distributions (Fig. 7, B, C, E, and F). Fig. 6 A shows the dependence of the bifurcation curves on the adiabaticity parameter, which tend to the deterministic case for large ω. 
The g2/k < 1 case is analogous to the zero basal production rate case, which has already been discussed. A closer analysis of the g2/k > 1 case shows that the mean properties of the system are in even better agreement with the deterministic solution than in the g2 = 0 case (Fig. 7, A and D). The genes have a nonzero probability of being in the off-state, with the probability distribution of the off-gene having a long tail toward higher protein numbers (Fig. 7, E and F). In the off-state the effective production rate g2/(2k) is small and the noise input is small, relative to the large protein numbers present in the system. The small effect of stochasticity results in the observed similar mean characteristics. Yet the probability distributions for the genes to be on before the transition are especially broad, with far smaller probabilities than those of the off-state (Fig. 7, B, C, E, and F). These clearly show that the two genes are more likely to be in the off-state before the bifurcation point. Therefore, although the average observables are similar for the deterministic and SCPF stochastic solutions, the predicted distributions are unusual.
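The expression for 〈n(i)〉 quoted above can be read as a weighted average over the two DNA states. The short check below assumes the definition X^{ad} = (g_1 + g_2)/(2k), with g_1 and g_2 the on- and off-state production rates; this definition is not restated in this section, so it is an assumption here.

```latex
\begin{align}
\langle n(i)\rangle
  &= 2X^{ad}C_1(i) - \frac{g_2}{k}\bigl(2C_1(i) - 1\bigr) \\
  &= \frac{g_1 + g_2}{k}\,C_1(i) - \frac{2g_2}{k}\,C_1(i) + \frac{g_2}{k} \\
  &= \frac{g_1}{k}\,C_1(i) + \frac{g_2}{k}\bigl(1 - C_1(i)\bigr).
\end{align}
```

Under this assumption the first term is the on-state contribution and the second the off-state contribution. With g_2 = 0 the expression reduces to 2X^{ad}C_1(i), and for g_2 > 0 the off-state term survives whenever C_1(i) < 1.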

FIGURE 6.


Nullclines for a symmetric switch, where proteins bind as dimers, when the effective base production rate is g2/(2k) ≠ 0. (A) Dependence on the adiabaticity parameter ω = 0.005, 0.05, 0.5, 5, and 50, compared to the deterministic equations solution, g2/(2k) = 5. (B) Dependence on g2/(2k) = 0.01, 0.1, 0.5, 1.0, 5, 10, and 20, ω = 0.5. Xeq = 1000.

FIGURE 7.


Probability of genes to be on (A) and mean number of proteins of a given type present in the cell (D) for a symmetric switch with an effective base production rate. Evolution of probability distributions for the probability of the gene that will be active after the bifurcation to be on (B) and off (C) and the gene that will be inactive to be on (E) and off (F) as a function of the order parameter Xad for the same system. The bifurcation occurs at Xad = 61, g2/(2k) = 5, ω = 0.5, Xeq = 1000.

Summary

The symmetric switch is based on a competition between the accessibility of the repressor site and the number of repressor proteins present in the cell. The bifurcation is solely a result of the nonlinearity of the system, and introducing noise simply shifts the region in parameter space where given states occur. The protein number fluctuations have a destructive role in determining the stability of the bifurcated solution; however, fast DNA unbinding rates can compensate for the destabilizing effect of protein number fluctuations. In this region the stochastic solution predicts similar means to the deterministic case, but the form of the probability distributions, which depends on a large number of higher moments, is nontrivial. It is a result of the interplay of the DNA binding and protein degradation kinetics.

THE ASYMMETRIC TOGGLE SWITCH

Most switches found in nature are not symmetric. For asymmetric switches, when proteins bind as dimers, the two genes interact, resulting in probabilities to be on, different from those imposed purely by the equilibrium between binding and unbinding. The steady-state solution is a compromise between the tendency that repressors are unbound from the initially off-gene (Inline graphic for the forward transition, Inline graphic for the backward in the following discussion) and the effective production rate of the initially on-gene (Inline graphic forward; Inline graphic backward transition), at least for the deterministic case. This results in the characteristic S-curve bifurcation diagram, as presented in, for example, Fig. 12, with possible forward and backward transitions, hence hysteresis. We refer to the transition that occurs with increasing Inline graphic as the forward transition, and that with decreasing Inline graphic as the backward transition. Since Inline graphic is a well-defined function of the probabilities that the genes are on, the simplicity of the deterministic equations allows for a completely analytic discussion of the asymmetric switch. The more complicated form of the exact SCPF equations makes this approach impossible. However, the deterministic rate solution offers valuable insight into the basic mechanism behind the transition.

FIGURE 12.


Bifurcation diagrams as a function of Inline graphic for C1(1), with g2(1)/(2k) = g2(2)/(2k) = 5 (A) and C1(2), with g2(1)/(2k) = g2(2)/(2k) = 0.5 (B) for Inline graphic 50, and 500. Comparison of exact solutions of the SCPF and deterministic kinetic equations for an asymmetric switch. ω1 = ω2 = 0.5, Inline graphic and Inline graphic

The general mechanism

By combining the steady-state equations of motion for the probabilities of the two genes to be on and noting that, with a zero basal production rate Inline graphic one can derive the form of the deterministic bifurcation curves as

graphic file with name M38.gif (7)

as a function of C1(2), and

graphic file with name M39.gif (8)

as a function of C1(1). The transition points are determined as the extrema of Eqs. 7 and 8, which are functions solely of the scaled parameter Inline graphic and are plotted on the bifurcation graphs. It is worth noticing that the bifurcation points C1(i) do not depend on the value of Inline graphic the parameter describing the gene binding kinetics of the gene that is on initially. This is not true for the exact SCPF solution, which cannot be solved analytically, but the bifurcation curve has the more complex form of

graphic file with name M42.gif (9)

where C1(1) is a function of ω2, Inline graphic C1(2), and Inline graphic The bifurcation point is therefore determined by the protein (Inline graphic) and DNA (Inline graphic) characteristics and mutual interactions (ωi) of their two genes. The deterministic approximation therefore greatly simplifies the mathematical mechanism of the transition. This may lead to large errors when studying more complicated biologically relevant systems, where one considers asymmetric switches with nonzero basal production rates and proteins are produced in bursts. The case of the nonzero basal production rate within the deterministic approximation also cannot be solved analytically.

The general picture behind the transition is seen from the deterministic approach. The larger the tendency for proteins to be unbound from the DNA, the larger the effective production rate Inline graphic must be for the transition, from one gene to be active, to the other to be active, to take place —inasmuch as repressor proteins are less likely to bind to the on-gene (i) at large Inline graphic than at small Inline graphic However, if one considers a noisy system, it is effectively harder for proteins to stay bound to the initially off-gene due to the destabilizing effect of DNA binding noise (Fig. 8). For the stochastic system, apart from very low values of the adiabaticity parameter (ω < 0.1) (Fig. 11), there is a threshold number of reservoir proteins that will cause a rapid transition. If we start with a small effective production rate for one type of protein and increase this rate, keeping the production rate of the other gene fixed at an initially higher value, the proteins produced by the gene with the initially smaller production rate repress it gradually and ineffectively, until they reduce the probability of the gene to be on to one-half, for the exact SCPF solution. The number of proteins present in the on-state decreases much more rapidly with the change of Inline graphic—whether it be an increase for the forward transition or a decrease for the backward transition in the examples presented—than the number of proteins in the off-state grows (Fig. 10). Hence, the probability of the initially active gene to be on shows a larger sensitivity to the change of Inline graphic than does the off-state probability. This leads to a rapid transition of the previously active gene to an inactive state (Fig. 9). 
Such behavior is described by Ptashne (1992) and Ptashne and Gann (2002) in the λ-phage switch; they point out its role as a “buffer against ordinary fluctuations in repressor concentration.” The observed system switches when the “repression probability” drops to 50%, as in the solutions of this model. Our analysis seconds the hypothesis of Ptashne and Gann, inasmuch as the deterministic system lacks this behavior, the transition is rapid, and for certain values of parameters, takes place when the probability of the initially on-gene drops to 80% (Fig. 8). The buffering capabilities of the stochastic system are clearly seen in the long tails toward n = 0 of the probability distributions of the gene that is switching from the on- to the off-state (Fig. 9, A and B).

FIGURE 8.


Dependence of the probability of genes to be on in an asymmetric switch as a function of increasing parameters of one gene Inline graphic in the forward (top) and backward (bottom) transition for different values of Inline graphic: 5, 50, and 500. All other parameters fixed at Inline graphic and Inline graphic Comparison of solutions of deterministic and exact SCPF equations.

FIGURE 11.


Bifurcation diagrams for an asymmetric switch, presenting Inline graphic as a function of C1(2) (A–C), and C1(1) (D–F) for different values of the adiabaticity parameter: ω1 = ω2 (A,D), ω2, with ω1 = 0.001 = const (B,E), ω1, with ω2 = 0.001 = const (C,F). Inline graphic and Inline graphic

FIGURE 10.


Mean number of proteins of each type present in the cell, according to exact solutions of the SCPF approximation and deterministic kinetic rate equations for an asymmetric switch, with Inline graphic ω1 = ω2 = 0.5, Inline graphic and Inline graphic and 500 during the forward (A) and backward (B) transitions in an asymmetric switch.

FIGURE 9.


Evolution of the probability distributions for the two genes to be active for the forward transition (A and B) and the backward (C and D) as a function of Inline graphic for Inline graphic with Inline graphic ω1 = ω2 = 0.5; Inline graphic for an asymmetric switch.

The effect of noise on the bifurcation mechanism

The mean number of proteins at the transition point differs for the deterministic and exact SCPF solution (Fig. 10). More repressors are needed to induce the transition in the deterministic approximation than in the stochastic system, since, due to the form of the interaction function for the exact case, F(i) = 〈n(i)〉2(ω + 1)/(ω + C1(i)) + 〈n(i)〉 > 〈n(i)〉2. A smaller number of proteins is therefore needed for the inactive gene to become competitive with the active gene. The mechanism of the transition is different from the symmetric gene case, where a critical number of proteins needs to be reached. The asymmetric switch is based on the competition between the probability that proteins of one kind will repress the opposing genes and the analogous probability for the other kind of proteins. The repression capability is governed by Inline graphic which might be looked upon as the product of the probability of having a certain number of repressor proteins (3–i) in the cell and the tendency for them to be bound to the opposing gene (i). In fact, the transition point in the deterministic case is purely a function of such ratios, Inline graphic In both the stochastic and deterministic cases, the transition points are set by the interaction function which regulates the on- and off-state probabilities of a given gene Inline graphic Inclusion of noise in the system effectively increases the nonlinearity of the system, which results in the already discussed buffering capabilities of the system. Stochasticity alters the very simple competitive mechanism seen in the deterministic kinetics to allow for more levels of control of the stability of the state of the system against random fluctuations.
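The comparison of interaction functions above can be made concrete with a small numerical sketch. It uses only the two expressions quoted in the text, F = 〈n〉²(ω + 1)/(ω + C1) + 〈n〉 for the exact SCPF case and 〈n〉² for the deterministic case. The function names, the sample values, and the idea of matching a fixed target value of F are illustrative assumptions, not part of the original derivation.

```python
def interaction_exact(n_mean, c1, omega):
    """F = <n>^2 (omega + 1) / (omega + C1) + <n>, as quoted in the text."""
    return n_mean ** 2 * (omega + 1.0) / (omega + c1) + n_mean


def interaction_deterministic(n_mean):
    """Deterministic counterpart of the interaction function, <n>^2."""
    return n_mean ** 2


def repressors_needed(target, c1, omega):
    """Non-negative <n> at which interaction_exact equals `target`.

    Solves a <n>^2 + <n> - target = 0 with a = (omega + 1) / (omega + c1).
    """
    a = (omega + 1.0) / (omega + c1)
    return (-1.0 + (1.0 + 4.0 * a * target) ** 0.5) / (2.0 * a)


n_det = 30.0                                    # illustrative repressor number
target = interaction_deterministic(n_det)       # value of F to be matched
n_exact = repressors_needed(target, c1=0.5, omega=0.5)
```

With these sample values the exact form reaches the same value of F at roughly 24 repressors instead of 30, in line with the statement that fewer proteins are needed in the stochastic system. The same function gives 〈n〉²/C1 + 〈n〉 as ω → 0 and 〈n〉² + 〈n〉 when C1 = 1 or ω is large.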

Further comparison of solutions of the deterministic and stochastic equations leads to the same conclusions as for a symmetric switch. As the tendency for proteins to be unbound from the DNA grows, the difference in the critical number of reservoir proteins necessary for the transition to take place increases for both approximations. The critical number of proteins produced by a given gene necessary for the transition to take place for both genes is, in most cases (see ω dependence discussion), smaller for the exact solutions of the SCPF equations and the difference between the stochastic and deterministic result grows with both Inline graphic and decreases with ωi (Fig. 10). It has a value of 15 for Inline graphic ω1 = ω2 = 0.5 and 2 for Inline graphic ω1 = ω2 = 10.

Consider the forward transition. The initially inactive gene is buffered by a cloud of repressor proteins. As one increases the effective production rate of the proteins produced by the inactive gene (Inline graphic), the number of proteins that are able to repress gene 2 grows slowly and linearly Inline graphic where C1(1) ∼const, and forms a buffering proteomic cloud around it. In the results presented in the figures of this article, the tendency for proteins to be unbound from gene 2 (Inline graphic) is smaller than Inline graphic so gene 1 is able to produce enough repressors to form a stable buffering cloud around gene 2 and turn it into the inactive state at quite modest values of Inline graphic If Inline graphic gene 1 produces proteins less effectively, as the probability of it being repressed is larger than in the previous case, and larger values of Inline graphic are needed to produce enough repressors to achieve a high effective probability of binding, Inline graphic An example of how Inline graphic grows as Inline graphic is seen by comparing the Inline graphic for Inline graphic (Fig. 8) and Inline graphic for Inline graphic (Fig. 11).

Adiabaticity parameter dependence

The interaction of the buffering proteomic cloud with the DNA can be altered when the ratio of the DNA unbinding rate compared to the protein degradation rate is changed. For small ωi values the unbinding rate of repressors from the DNA is slower than the destruction of the produced proteins. Apart from very small ω-values, as long as there is a critical number of repressor proteins in the buffering cloud, the off-gene is repressed and it responds by turning on, but only once the initially on-gene is nearly totally repressed. Large adiabaticity parameters result in the efficient formation of the buffering proteomic cloud. For the initially off-gene, a small DNA unbinding rate of the off-gene decreases the effectiveness of the buffering proteomic cloud around it, as the protein number state can reach a steady state before the DNA state does. The hindered DNA reaction to the protein-number state effectively increases the tendency of repressor proteins to be unbound from the DNA for a given Inline graphic This, in turn, decreases the probability of the initially on-gene to be on, leading to rapid switching behavior as can be seen for gene 2 in the forward, or gene 1 in the backward, transition for ω > 0.1 in Fig. 11 A. The initially on-gene reacts to the interaction function of the initially off-gene, for which F(i) → 〈n(i)〉2/C1(i) + 〈n(i)〉 in the small ω-limit. Therefore, the interaction function is effectively increased for C1(i) ≈ 0, leading to the enhanced buffering. The reaction of the initially off-gene is unaltered, as for C1(i) ≈ 1 F(i) = 〈n(i)〉2 + 〈n(i)〉 ∼const, if C1(i) remains close to 1. However, if ω is very small (black dash-dot curve in Fig. 11, A and D), the buffering proteomic cloud is not given a chance to form due to a very high degradation rate of proteins and gene 2 is simply repressed in a gradual transition. 
If ω1 is extremely small and ω2 large, the buffering proteomic cloud around gene 1 cannot form and the probability of it to be off in the forward transition decreases gradually. A buffering proteomic cloud exists around gene 2, hence the backward transition is reminiscent of the deterministic result (Fig. 11, B and E). The most interesting case is shown in Fig. 11, C and F, where a large ω1 acts as a buffer against fluctuations in the number of proteins, which repress gene 1. For large production rates of repressors the probability of gene 2 to be on for the forward transition decreases faster than in the deterministic solution; however, the buffering cloud repressing gene 1 allows gene 2 to remain in the on-state. A buffering proteomic cloud does not form around gene 2, and it remains on until the number of proteins produced by gene 1 grows considerably, as the effective production rate, Inline graphic is increased. The effective production rate of gene 1 must be very large to sustain a sufficient steady-state number of proteins to repress gene 2 to the point that C1(1) < 0.5, which leads to switching. For the backward transition, the lack of a buffering proteomic cloud around gene 2 results in destabilizing gene 1 for larger Inline graphic effective production rates than for large ω2 values. These examples show how certain combinations of values of adiabaticity parameters can lead to a system with a larger switching region than the deterministic model predicts. This property may be useful when engineering artificial switches. If one has a constraint on the production rates of the genes, one can use repressors with different binding affinities to achieve switching in the desired region of parameter space.

In this simple system slow unbinding from the DNA can compensate for the destabilizing of the DNA state by protein number fluctuations. As the probability of the initially active gene to be on gradually decreases, the initially repressed gene becomes active only once the probability of the other gene to be on has fallen below a certain value, α. The susceptibility of the system to protein number fluctuations may be estimated by the value of α. For small ω, which is still able to sustain a buffering proteomic cloud, this value tends to be 0.5. The incapability of the system to form a buffering proteomic cloud is much stronger if both adiabaticity parameters are small, since the reaction of both genes to the change in the number of proteins is hindered (Fig. 11, A and D). DNA state fluctuations contribute to effectively faster protein number fluctuations, therefore the exact solution exhibits the very small ω-characteristics, where a buffering proteomic cloud cannot form, for a slightly wider range of the adiabaticity parameter than one would expect with a Poissonian distribution (results not shown). Combining these observations, a switch works most effectively if the change of the DNA state compared to the protein number fluctuations of one gene is sufficiently smaller than that of the other gene, to allow for effective buffering.

The nonzero basal production rate

The asymmetric switch, in which both genes have a nonzero basal effective production rate, proves to be susceptible to noise. In Fig. 12, we show the dependence of C1(1), with g2(1)/(2k) = g2(2)/(2k) = 5 and C1(2), with g2(1)/(2k) = g2(2)/(2k) = 0.5 in the small ωi limit. The stochastic solutions converge to the deterministic solutions for large ω. If gene 2 is initially in the on-state, the majority of proteins are produced with the high fixed rate in the on-state, as g1(2) ≫ g2(2). The repression of gene 2 is, in turn, governed by the interaction function of gene 1. If Inline graphic is small the number of proteins produced in the on- and off-states by gene 1 are comparable. Since the number of proteins produced by gene 1 grows faster the larger g2 is, gene 2 gets repressed more effectively at smaller Inline graphic values. This results in a smaller number of repressors produced by gene 2, and the transition from gene 1 being on to its being off, takes place for smaller Inline graphic effective growth rate values, than for small g2.

The deterministic solution is much more influenced by the production of proteins in the off-state than the stochastic solution. In the exact SCPF solution, slow DNA unbinding rates compared to protein degradation rates are another means of control of the stability of the DNA state against random protein number fluctuations. The state of the system is far less influenced by the exact protein numbers than in the deterministic solution. So until the probability of a gene to be on is larger than that of being off, the fraction of proteins produced with a smaller effective production rate in the off-state is treated as a random fluctuation by the system. Once again, the SCPF system demonstrates its susceptibility to protein number fluctuations.

The influence of the off-state protein production on the total repressor yield may also be seen in the fast decrease of C1(2) and increase of C1(1) in the forward transition. If g2 is considerably large, its effect can also be seen in the stochastic solution; hence even when gene 1 is in the on-state, it never reaches C1(1) = 1, although gene 2 is totally repressed (Fig. 12 B; results not shown for gene 2). The magnitude of the probability of gene 1 to be on for very large effective production parameters strongly depends on the tendencies of the proteins to be unbound from gene 1. As Inline graphic increases, the asymptotic Inline graphic limit of C1(1) becomes smaller, as it is effectively harder for repressors to stay bound to the DNA. The gene is more likely to be in the off-state, which, however, manages to sustain the necessary number of proteins produced by gene 1 to repress gene 2. As g2 increases, the region of bistability grows into areas of parameter space, in which the tendency of proteins to be unbound, Inline graphic is larger than for small g2. For small values of Inline graphic the number of repressors produced by gene 1 in the off-state is sufficient to repress gene 2, and one observes a smooth and slow transition in terms of Inline graphic If g2 is considerably large, the transition takes place for larger values of Inline graphic in the stochastic solution than in the deterministic solution, hence showing the large buffering region that the interplay of DNA and protein number fluctuations provides. This also results in an effective similarity of the deterministic and stochastic solutions. In regions of parameter space, in which the transition takes place, the deterministic and stochastic solutions differ, apart from the large ω-limit. Most experimentally observed proteins have very small basal production rates, which seconds our analysis that it is functionally unfavorable for large basal production to occur. 
The dependence on other parameters is analogous to the case without a basal production rate.

The region of bistability

The backward transition, as already discussed, is analogous to the forward transition. In most cases, the regions of bistability (Fig. 11) in parameter space are reduced in size by noise. When engineering artificial switches, one may be interested in making sure the forward and backward transitions take place for considerably different production rates. We therefore consider how the region of bistability, defined as the difference in the critical effective production rate for the forward and backward transition, depends on the parameters of the model. For the deterministic case the region of bistability depends on the tendencies for proteins to be unbound from the DNA in a quadratic manner, as can easily be seen from the bifurcation equations (Eqs. 7 and 8) and as is demonstrated in Fig. 13. The SCPF solution shows the same behavior. For large values of the adiabaticity parameter the size of the region of bistability is independent of ω, as is the form of the bifurcation curve (Fig. 13). The approach to this plateau is very rapid and is given by the ratio of polynomials. However, the size of the region of bistability for the ω1 = ω2 case never reaches that of the deterministic solution, as even in the large ω-limit the greater nonlinearity of the interaction function F(i) results in a more complex SCPF curve that does not reduce to the deterministic solution, but to

graphic file with name M84.gif (10)

This effect is true for both curves, as the presented graphs show C1(1) hysteresis and the chosen equations C1(2). The same behavior is observed for the case with a zero and a nonzero basal production rate. The increase with Inline graphic is slightly slower in the g2 ≠ 0 case as the bifurcation curve is smaller by |g2/k(C1f(i) − C1in(i)) − ln(C2f(i)/C1in(i))/2|.
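The notion of a region of bistability can be sketched numerically in a deterministic setting. The example below does not use Eqs. 7 and 8, which are not reproduced here. Instead it assumes a generic mass-action steady state for dimer binding with zero basal production, C1(i) = 1/(1 + n(j)²/Xeq(i)) with n(j) = 2 Xad(j) C1(j). This form, the function names, and the parameter values are assumptions made for illustration only.

```python
def prob_on(n_repressor, x_eq):
    """Assumed deterministic steady state for dimer binding: 1 / (1 + n^2 / X_eq)."""
    return 1.0 / (1.0 + n_repressor ** 2 / x_eq)


def count_steady_states(x_ad1, x_ad2, x_eq1, x_eq2, grid=4000):
    """Count solutions C1(1) in [0, 1] of the two coupled steady-state conditions."""
    def mismatch(c1):
        c2 = prob_on(2.0 * x_ad1 * c1, x_eq2)         # gene 2 repressed by protein 1
        return prob_on(2.0 * x_ad2 * c2, x_eq1) - c1  # gene 1 repressed by protein 2

    count = 0
    previous = mismatch(0.0) > 0.0
    for j in range(1, grid + 1):
        current = mismatch(j / grid) > 0.0
        if current != previous:       # sign change brackets one solution
            count += 1
        previous = current
    return count


def bistable_window(x_ad2, x_eq1, x_eq2, x_ad1_values):
    """Return (lowest, highest) sampled X_ad1 with three steady states, or None."""
    inside = [x for x in x_ad1_values
              if count_steady_states(x, x_ad2, x_eq1, x_eq2) == 3]
    return (min(inside), max(inside)) if inside else None


# Logarithmic sweep of the production parameter of gene 1, gene 2 held fixed.
x_ad1_values = [1.25 ** j for j in range(45)]
window = bistable_window(100.0, 1000.0, 1000.0, x_ad1_values)
```

In this sketch three steady states (two stable, one unstable) exist only inside a finite window of Xad(1), and a single steady state exists at both ends of the sweep. The window edges play the role of the forward and backward transition points, and the window width is the deterministic analog of the region of bistability defined above.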

FIGURE 13.


Region of C1(1) hysteresis for an asymmetric switch for the SCPF and deterministic approximations as a function of ω1 = ω2, with Inline graphic and Inline graphic with ω1 = ω2 = 100 (B). Inline graphic g2/(2k) = 0.5.

Summary

After the transition, the number of proteins produced by the now on-gene follows a linear dependence on Xad, similarly to the symmetric switch. The number of proteins in the cell is independent of the DNA dynamical characteristics, as those remain constant in that region of parameter space. The number of proteins of the on-gene rapidly falls before the transition takes place. Based on the bifurcation diagram of Fig. 12 the phase transition is discontinuous. The region of parameter space where switching may occur may be roughly estimated by the parameters of the genes which must be competitive, (Inline graphic This has a major implication for biological systems, such as the λ-phage, where many mechanisms are used to achieve balance between two genes. The first-order phase transition, as opposed to the second order present in the symmetric system, is a result of the breaking of symmetry and is clearly seen in the evolution of probability distributions in phase space (Fig. 9). The gene that is on after the transition rapidly increases its probability of being on, whereas the off-gene decreases with a rapid drop in the number of proteins it produces.

THE CASE WHEN PROTEINS BIND AS MONOMERS

Equations 1 and 2 can easily be augmented to describe the binding of monomers or higher order oligomers by changing the form of the binding term to Inline graphic where p = 1 for monomers. The equations remain solvable for any value of p.

Monomers do not make good repressors/activators

The behavior of the system is quite different if we consider the case when proteins bind as monomers. For a symmetric switch there is no region of the parameter space in which one observes switching. The SCPF equations may be reduced to a single quadratic equation,

graphic file with name M88.gif (11)

which has, at most, only one positive solution. Therefore the probability of one gene to be in the active state is always equal to that of the other to be in the active state, and no switching is observed. Equation 11, above, is independent of ω, the adiabaticity parameter; therefore, it is solely a consequence of the lack of nonlinearity in the binding of proteins and cannot be influenced by very slow DNA unbinding rates. By writing down deterministic equations we can also show that when proteins bind as monomers, switching does not occur. A similar equation to Eq. 11, also independent of ω, holds for asymmetric switches. It also has one positive solution, and, therefore, the parameters of the model predetermine the solution and each gene has a probability to be on, determined by its kinetic rates. Since the rates are different for the two genes, the gene with the larger production rate will be in the active state, repressing the weaker gene (Fig. 14 A).
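The deterministic version of this argument can be written out in a few lines. The sketch assumes a mass-action steady state with zero basal production, C_1(i)[1 + n(j)^p / X^{eq}] = 1 with n(j) = 2X^{ad}C_1(j), for a symmetric switch. This form and the shorthand a and b are assumptions for illustration; the quadratic obtained below is the deterministic analog and should not be read as Eq. 11 itself.

```latex
% Monomers (p = 1), with a = 2X^{ad}/X^{eq}:
\begin{align}
C_1(1) + a\,C_1(1)C_1(2) &= 1, &
C_1(2) + a\,C_1(1)C_1(2) &= 1 .
\end{align}
% Subtracting the two conditions gives C_1(1) = C_1(2) \equiv C, so that
\begin{equation}
aC^2 + C - 1 = 0, \qquad C = \frac{-1 + \sqrt{1 + 4a}}{2a},
\end{equation}
% which is the only positive root: no asymmetric (switched) solution exists.

% Dimers (p = 2), with b = (2X^{ad})^2/X^{eq}:
\begin{equation}
\bigl(C_1(1) - C_1(2)\bigr)\bigl(1 - b\,C_1(1)C_1(2)\bigr) = 0 ,
\end{equation}
% so besides the symmetric branch there is an asymmetric one with
\begin{equation}
C_1(1)C_1(2) = \frac{1}{b}, \qquad C_1(1) + C_1(2) = 1,
\end{equation}
% which has two distinct real solutions only for b > 4.
```

In this reduced description the monomer case forces the two probabilities to be equal for every parameter value, whereas dimer binding admits a broken-symmetry pair once b exceeds 4, at which point C_1 = 1/2.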

FIGURE 14.


(A) Probability of genes in an asymmetric switch to be active when proteins bind as monomers, for different values of Inline graphic Probability distributions for the gene to be in the on-state (B) and off-state (C) for a gene in the active state for different values of the adiabaticity parameter ω = 0.5, 5, and 100, when proteins bind as monomers to a symmetric switch. Xad = δXsw = 50, Xeq = 1000.

In naturally occurring biological switches and those developed experimentally, proteins bind as dimers, or higher order multimers (Ptashne, 1992). We see cooperativity contributes to improving the efficiency of a switch. A switch controlled by monomers is shown to react ineffectively to changes in the repressor concentration, just as in the case of the asymmetric switch in our model discussed above. Monomers do not have the ability to stabilize a broken symmetry state; therefore, the solution is fragile to kinetic rates and inefficient. Effectively monomers do not make good repressors/activators. Ptashne and Gann (2002) explain the cooperativity process between two monomers by claiming that one monomer bound to the DNA increases the local concentration of proteins around the binding site through weak protein-protein interaction, thus causing the second to bind cooperatively. Our model lacks spatial dependence, which therefore shows that this effect need not be thought of as due to changes in local concentration, but actually is required by the insufficient nonlinearity for monomers, which cannot produce bistability.

Bimodal probability distribution

Although the probabilities of the two genes to be on are equal for the whole region of parameter space, and the mean number of both types of proteins in the cell is the same as in the deterministic case, the probability distributions are bimodal when the DNA unbinding rates are slower than the protein number fluctuations (Fig. 14, B and C). The mechanism of this small ω-behavior has already been discussed in the example of the symmetric switch, when proteins bind as dimers. This is analogous to the case when DNA fluctuations induce a probability distribution with two peaks for the single gene with an external inducer (Cook et al., 1998). In fact, the SCPF approximation has reduced this two-gene system to an effective one-gene system with an external inducer. A bimodal distribution in the small ω-case is also observed for the asymmetric switch, when proteins bind as monomers.
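The small-ω bimodality can be illustrated with a minimal sketch of the effective one-gene picture. It assumes a single gene whose DNA state flips on and off at the same rate ωk, independent of the protein number (a simplification standing in for the external inducer), with production at rate g = 2k·Xad in the on-state only and degradation at rate k per protein. The function names, the truncation `n_max`, and the value Xad = 25 are illustrative choices, not taken from the article.

```python
import numpy as np

def stationary_protein_distribution(omega, x_ad=25.0, n_max=150):
    """Stationary P(n) of an assumed on/off gene with n-independent DNA flipping."""
    k = 1.0                 # degradation rate; sets the unit of time
    g = 2.0 * x_ad * k      # on-state production rate (assumes Xad = g / 2k)
    flip = omega * k        # DNA on<->off rate, taken equal in both directions
    size = 2 * (n_max + 1)
    rates = np.zeros((size, size))  # rates[i, j]: transition rate from state i to j

    def index(dna_state, n):
        return dna_state * (n_max + 1) + n

    for n in range(n_max + 1):
        rates[index(0, n), index(1, n)] = flip
        rates[index(1, n), index(0, n)] = flip
        if n < n_max:
            rates[index(1, n), index(1, n + 1)] = g  # production, on-state only
        if n > 0:
            for dna_state in (0, 1):
                rates[index(dna_state, n), index(dna_state, n - 1)] = k * n

    generator = rates - np.diag(rates.sum(axis=1))
    # Solve pi @ generator = 0, replacing one balance equation by normalization.
    lhs = generator.T.copy()
    lhs[-1, :] = 1.0
    rhs = np.zeros(size)
    rhs[-1] = 1.0
    pi = np.linalg.solve(lhs, rhs)
    return pi[: n_max + 1] + pi[n_max + 1:]  # marginal over the DNA state

def count_modes(prob, floor=1e-6):
    """Count local maxima of P(n) that rise above a small numerical floor."""
    modes = 0
    for n in range(len(prob) - 1):
        rises = n == 0 or prob[n] > prob[n - 1]
        if rises and prob[n] >= prob[n + 1] and prob[n] > floor:
            modes += 1
    return modes

if __name__ == "__main__":
    for omega in (0.5, 100.0):
        prob = stationary_protein_distribution(omega)
        mean_n = float(np.dot(np.arange(len(prob)), prob))
        print(f"omega = {omega}: mean n = {mean_n:.2f}, modes = {count_modes(prob)}")
```

In this sketch the mean protein number is the same for slow and fast DNA flipping, while the shape of the distribution changes from two peaks to one, which is the qualitative point made in the paragraph above.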

THE CASE WHEN PROTEINS BIND AS HIGHER ORDER OLIGOMERS

Switches in which effector proteins bind as higher order oligomers are omnipresent in nature and have been realized experimentally in artificial switches (McLure and Lee, 1998). We considered the binding of trimers Inline graphic and tetramers Inline graphic in symmetric switches. The equations of motion have the same form as before, but the interaction function F(i) accounts for the higher moments. For proteins binding as kth order oligomers, it has the form Inline graphic As shown when discussing the dimer binding switch, the kth order moments have a simple form in the creation operator representation.

The general mechanism

From Fig. 15 one notes that, for the system to act as a bistable switch, a considerably smaller number of reservoir proteins is needed than in the case of the dimer binding switch. As the multimericity number grows, the area of bistability of the switch in parameter space grows. Since we assumed that only one type of protein represses a given gene, binding of higher order multimers is an effective model of cooperativity. Therefore, we expect the region of bistability to be larger the higher the order of the binding multimer. The evolution of the system in parameter space when trimers bind is qualitatively similar to the dimer binding scenario (Fig. 16, B and C). Fast DNA unbinding rates stabilize the system, and the bifurcation takes place at smaller effective production rates for large ω than for small ω (Fig. 16, A and D). The critical number of proteins necessary for the bifurcation to take place is independent of the adiabaticity parameter and decreases with multimericity: 〈nc〉 = 32 for dimers binding, 〈nc〉 = 8 for trimers binding, and 〈nc〉 = 4 for tetramers binding. This, along with the narrow probability distributions (Fig. 16, E and F) and the small ω-dependence when tetramers bind (Fig. 15), shows that one binding event determines the result; hence DNA binding rates do not play a role. Once there are 〈nc〉 proteins of a given type in the cell, a tetramer repressor will bind and stay bound. In the deterministic case the probability of the genes to be on needs to fall to (p − 1)/p, where p is the order of multimerization of the repressor, for the bifurcation to take place. That, along with the need for the number of repressors to be comparable with the tendency for proteins to be unbound from the DNA, sets the critical number of proteins necessary for the bifurcation. Hence, the bifurcation occurs when both genes are more probable to be on than off, for both tetramers and trimers. Therefore, for the tetramer system, a large buffering proteomic cloud is not needed to stabilize the DNA binding state of the switch, and the characteristics of the system are practically independent of the adiabaticity parameter.
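The deterministic threshold quoted above can be reproduced with a short sketch. It assumes a deterministic symmetric switch in which a gene is on with probability C = 1/(1 + n^p/Xeq) and sustains n = 2·Xad·C proteins; the symmetric steady state of this assumed model loses stability when C = (p − 1)/p, which gives n_c = (Xeq/(p − 1))^(1/p). The model form and the function name `deterministic_critical_point` are assumptions made for illustration, not the article's own derivation.

```python
def deterministic_critical_point(p, x_eq):
    """Critical on-probability, protein number and Xad of an assumed deterministic toggle.

    Assumed model: C = 1 / (1 + n**p / x_eq) and n = 2 * x_ad * C for each gene.
    The symmetric state becomes unstable when p * (1 - C) = 1.
    """
    if p < 2:
        raise ValueError("the assumed model has no bifurcation for p < 2")
    c_crit = (p - 1) / p                      # on-probability at the bifurcation
    n_crit = (x_eq / (p - 1)) ** (1.0 / p)    # from n**p / x_eq = (1 - C) / C
    x_ad_crit = n_crit / (2.0 * c_crit)       # from n = 2 * x_ad * C
    return c_crit, n_crit, x_ad_crit

if __name__ == "__main__":
    for p, name in ((2, "dimers"), (3, "trimers"), (4, "tetramers")):
        c_crit, n_crit, x_ad_crit = deterministic_critical_point(p, 1000.0)
        print(f"{name}: C_c = {c_crit:.3f}, n_c = {n_crit:.2f}, Xad_c = {x_ad_crit:.2f}")
```

For Xeq = 1000 the critical protein numbers from this sketch round to 32, 8, and 4 for dimers, trimers, and tetramers, in line with the values quoted in the text, and the critical on-probability exceeds 0.5 for trimers and tetramers.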

FIGURE 15.

Phase diagram for the SCPF approximation for a single symmetric switch to which proteins bind as trimers (A) and tetramers (B), with Xeq = 1000. Contour lines mark values of ΔC.

FIGURE 16.

Mean number of proteins of each type in the cell when proteins bind as trimers (A) and tetramers (D), for ω = 0.5 and 10, symmetric switch. Evolution, as a function of Xad, of the probability distributions for the gene that will be active and the gene that will be inactive after the bifurcation to be on, when proteins bind as trimers (B and C) and tetramers (E and F). Xeq = 1000, ω = 0.5.

Tetramer binding results in nearly deterministic characteristics

In naturally occurring systems the production of the critical number of proteins is slowed down by relatively high multimerization rates and spatial dependence arising from the need of a large number of particles to diffuse together. These elements, which we neglect in our simple model, constitute what might be called the cost of multimerization. This analysis also explains why most repressors and activators bind as dimers and tetramers, not trimers or pentamers. The effect of trimers binding is not different from that of dimers: a buffering proteomic cloud needs to be formed; the state of the system is quite influenced by noise; and the switching region (region in Xad parameter space from the bifurcation point to ΔC > 0.9) is quite large. Yet in a real system there is an effective cost of trimerization: the energy of trimer formation and a need for the diffusion of particles. For tetramers the effect of stochasticity becomes negligible. Effectively one tetramer is sufficient for the bifurcation to take place. The binding of tetramer repressors may be thought of as a mechanism for increasing the deterministic nature of the switch.

Binding of higher order oligomers as a competitive mechanism

This analysis, although it neglects some important features, allows for a more quantitative formulation of cooperativity. Since most biological switches are asymmetric, cooperativity is also used as a means of making genes with smaller chemical rates more competitive. Tetramer binding seems to have a different role than that of lower order multimers. It may be used by genes that need to react to very small concentrations of proteins; for example, they turn on degradation mechanisms when even a small number of toxic molecules is present. Or they may act as an extra mechanism stabilizing the existent state of a gene, as seems to be the case for the cI gene of the λ-phage. It seems that tetramers are used either in a stabilizing role or as a drastic, all-or-none response to the protein distributions in the system. This formulation of the problem is naturally oversimplified, but it allows for general observations.

THE CASE WHEN PROTEINS ARE PRODUCED IN BURSTS

Many proteins in biological systems—for example, the Cro protein in λ-phage—are produced in bursts of N of the order of tens. We consider a symmetric switch, where proteins bind as dimers and are produced in bursts of N. The derivation of the moment equations for this case is presented in Appendix B.

The general mechanism

We discuss the effect of bursting using the example of a symmetric toggle switch in which proteins bind as dimers, since comparison with the previous results makes this case the most instructive. In this case, switching takes place for much smaller values of the effective production rate parameter Xad than when proteins are produced separately. Therefore, even in the large ω-limit, noise resulting from large protein number fluctuations plays a role in defining the region of stability of the switch, as the criterion of large Xad is not reached. The number of proteins in the cell when the bifurcation occurs is determined by the tendency for proteins to be unbound from the DNA and does not change when proteins are produced in bursts. For the rates discussed in Fig. 17, the critical mean number of proteins present in the cell at which the bifurcation occurs is nc = (Xeq)^(1/2) = 100^(1/2) = 10. If proteins are produced in bursts of N = 10, as in the left-hand panels, this value of nc is achieved when Xad > 1 (that is, proteins must be produced at a higher rate than they are destroyed to sustain the steady-state number of 10 proteins in the cell). In the right-hand panels of Fig. 17, proteins are produced in bursts of N = 100. In this case, even when the degradation rate is larger than the production rate, the critical steady-state number of proteins necessary for the bifurcation can be reached and a bistable switch is possible. A bistable switch can exist if the degradation rate exceeds the production rate even for burst sizes present in biology. For Xeq = 100, the order of the tendencies for proteins to be unbound from the DNA in the λ-phage, the value of N for which Inline graphic is smaller than N = 20, the burst size for Cro proteins in the λ-phage. Xad at the critical point decreases as a function of N (Fig. 18 A) and depends on the tendency for proteins to be unbound from the DNA, Xeq (Fig. 18 B), and on the adiabaticity parameter, ω (Fig. 19).
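The arithmetic behind the Xad > 1 statement can be sketched as follows, as a deterministic estimate only. It assumes that a dimer-binding gene which is on with probability C sustains a mean of 2·N·Xad·C proteins, and that the deterministic bifurcation occurs at C = 0.5 with n_c = (Xeq)^(1/2). The function name and the mean-number relation are illustrative assumptions; the text stresses that the stochastic bifurcation point deviates from this estimate.

```python
import math

def deterministic_critical_x_ad(burst_size, x_eq):
    """Deterministic estimate of Xad at the bifurcation for dimer binding with bursts.

    Assumes mean n = 2 * burst_size * x_ad * C and a bifurcation at C = 0.5,
    where n reaches n_c = sqrt(x_eq).
    """
    n_crit = math.sqrt(x_eq)
    c_crit = 0.5
    return n_crit / (2.0 * burst_size * c_crit)

if __name__ == "__main__":
    for burst_size in (1, 10, 100):
        x_ad_crit = deterministic_critical_x_ad(burst_size, 100.0)
        print(f"N = {burst_size}: deterministic Xad at bifurcation = {x_ad_crit}")
```

With Xeq = 100 this estimate gives a critical Xad of 1 for N = 10 and a value below 1 for N = 100, matching the statement that large bursts allow bistability even when degradation outpaces production.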

FIGURE 17.

Probability that gene i is on when proteins are produced in bursts of N = 10 (A) and N = 100 (B). Mean number of proteins of each type present in the cell when proteins are produced in bursts of N = 10 (C) and N = 100 (D). Symmetric switch proteins bind as dimers, Xeq = 100, ω = 100. Comparison of deterministic and stochastic solutions.

FIGURE 18.

Bifurcation curves as a function of Xad = δXsw, ω = 100 for different burst size values N = 1, 2, 5, 10, 50, and 100, with Xeq = 100 (A) and for proteins produced in bursts of N = 100 (B) for different values of Xeq = 1, 10, 100, and 1000.

FIGURE 19.

Bifurcation curves for proteins produced separately N = 1 (A), in bursts of N = 10 (B) and N = 100 (C) as a function of Xad = δXsw for different values of the adiabaticity parameter.

If proteins are produced individually, the span of the nonadiabatic regime is clear from Fig. 19. It corresponds to ω < 1. The bifurcation curves show small discrepancies for larger values of the adiabaticity parameter. However, for larger burst sizes, there is a continuous change in the form of the bifurcation curves with ω. All of the solutions differ substantially from the deterministic treatment, as shown in Fig. 17 A.

The influence of the adiabaticity parameter on the bifurcation mechanism

Contrary to the N = 1 case, the effective production rate at the bifurcation point Inline graphic grows with the increase of the adiabaticity parameter, for considerably large burst sizes, as in the N = 100 example in Fig. 19. In this case each gene produces a large number of repressors at a time. The bifurcation takes place in a region with Xad < 1, which corresponds to very small effective production rates, which denote very large death rates. Therefore, in the region of parameter space before the bifurcation takes place, both genes remain repressed (C1(i) < 0.5) in the steady state, as opposed to the previously discussed situations, in which both genes had equal probabilities to be active (C1(i) > 0.5). For large N bursts, the bifurcation takes place when one of the genes becomes unrepressed in the steady state. That is, when the repressor cloud buffering the DNA becomes destabilized, not when the cloud forms as in the smaller N examples. For large N bursts, if the rate of unbinding from the DNA is fast compared to the protein degradation rate, larger effective production rates are needed for the buffering proteomic cloud to stabilize the DNA state than for small ω (Fig. 19 C). The larger Xad is, the more repressor molecules are present in the system, which corresponds to the larger protein-number fluctuations, that are necessary for one of the genes to become unrepressed. For slower DNA unbinding rates, the buffering proteomic cloud is smaller, since the protein number reaches a steady state before the DNA state does. Therefore the buffering proteomic cloud is destabilized at smaller values of Xad. Hence, in the case of small ω the un-repressing bifurcation takes place for smaller effective production rates than for large ω. 
However, if the unbinding rate from the DNA is very small (i.e., ω < 0.01), Inline graphic as a function of the adiabaticity parameter grows again, as this corresponds to effectively large death rates that need very high production rates to sustain a proteomic cloud buffering the DNA. If the effective production rate is too small in this case, the steady-state number of proteins is too small to form the buffering proteomic cloud, although the burst size is enormous. In the very small ω-limit the buffering cloud needs to be formed for the bifurcation to be possible, as in the mechanism present in the small N case. The value of Xad at the bifurcation point in both the large and small ω-limits is strongly governed by protein and DNA binding-state fluctuations in the system. For this reason, the deterministic solution fails. It assumes the incorrect mechanism, in which the bifurcation is a result of repressing one of the genes. Such a scenario is possible if the death rate of proteins is slow enough to allow for the existence of 〈n(i)c〉 repressor molecules in the system at very small production rates (C1(1)biff,kin = 0.5) (Fig. 17, A and B). One can see that the order of taking the adiabatic limits in the steady state for proteins produced in large bursts is subtle and depends strongly on the parameters of the system, as the bifurcation is governed mainly by relative protein and DNA fluctuations, both of which are very large. Furthermore, the deterministic solution is closer to the small ω-limit, which corresponds to slow DNA unbinding rates compared to protein number fluctuations. Deterministic results may therefore be misleading in the bursting situation, even for large ω.

The steady state comes about through different mechanisms depending on the burst size N, and the order in which the protein dynamics and the DNA binding site dynamics reach the steady state changes with ω. For small burst sizes, slower DNA unbinding rates require larger effective production rates to reach the steady-state number of proteins necessary to form the buffering proteomic cloud than for large N. For larger burst sizes, faster DNA unbinding rates destabilize the buffering cloud of proteins for smaller effective production rates than in the small N case (Fig. 18 A).

Consequences of bifurcation at smaller Xad values

The divergence from the deterministic solution at the bifurcation point increases with the burst size. This is expected, since the large noise associated with large N acts on a system in which the number of proteins at the bifurcation point is constant and independent of the burst size. As already noted, the number of proteins in a cell is in the range of tens to hundreds, even if they are produced in bursts. This number is reached at smaller effective production rates for larger burst sizes than for small N-values. Therefore systems where proteins are produced in bursts display smaller values of Xad and are more susceptible to noise if the number of proteins in the cell is to be of the order observed experimentally. Furthermore, the noisy-burst systems, even for very large values of Xad, do not converge as closely to the deterministic solution as they do for the single protein production example. This can be seen from the form of the steady-state moment equations. The interaction function F(i) for the N = 1 case in the limit of large ω and Xad converges to F(i) → 〈n(i)〉 + 〈n(i)〉^2, whereas the deterministic solution corresponds to F(i) = 〈n(i)〉^2. Therefore for large mean values of proteins the two are approximately equal. However, in the case when N > 1, F(i) → 〈n(i)〉 (1 + (N−1)/2) + 〈n(i)〉^2, which requires N ≪ 2 〈n(i)〉 for the effect of bursting to be negligible at very large N. The values of the effective production rate that correspond to values of the proteins seen experimentally seem to be small. Therefore we can say that, effectively, the role of bursting is to enable the existence of a bistable solution at lower effective production rates, which determines a region of parameter space that has been previously unstudied. In this region, one cannot make the adiabatic assumption that the change in the DNA state can be integrated out due to a separation of timescales. That assumption leads to erroneous results, predicting a region of bistability where explicit treatment of both timescales suggests monostability. Furthermore, for very large N, the region of bistability decreases with the adiabaticity parameter, making the disagreement of the stochastic solutions with those of the deterministic rate equations larger. The adiabatic approximation and the full solutions converge only in the regime of large ω and Xad, the second of which is never fulfilled at the bifurcation point or for biological concentrations for systems in which proteins are produced in large bursts.
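The size of the bursting correction to the interaction function can be made concrete with a small sketch that evaluates the two limiting expressions quoted above for dimer binding. The function names are illustrative; the formulas are the large-ω, large-Xad limits stated in the text, not a solution of the full moment equations.

```python
def stochastic_interaction(mean_n, burst_size):
    """Large-omega limit quoted in the text: <n> (1 + (N - 1) / 2) + <n>^2."""
    return mean_n * (1.0 + (burst_size - 1) / 2.0) + mean_n**2

def deterministic_interaction(mean_n):
    """Deterministic interaction function for dimer binding: <n>^2."""
    return mean_n**2

def relative_excess(mean_n, burst_size):
    """Fractional amount by which the stochastic limit exceeds the deterministic one."""
    return stochastic_interaction(mean_n, burst_size) / deterministic_interaction(mean_n) - 1.0

if __name__ == "__main__":
    for burst_size in (1, 10, 100):
        for mean_n in (10.0, 100.0, 1000.0):
            excess = relative_excess(mean_n, burst_size)
            print(f"N = {burst_size:3d}, <n> = {mean_n:6.0f}: relative excess = {excess:.4f}")
```

The relative excess equals (N + 1)/(2〈n〉), so for mean protein numbers of tens to hundreds it is small for N = 1 but of order one or larger for N = 100.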

Dependence on the DNA binding coefficient

Like increasing the burst size, decreasing the tendency for proteins to be unbound from the DNA results in a different switching mechanism. The probability of the genes to be on falls to far smaller values than the 0.5 of the N = 1 case. If the burst size is large, both genes have a very low probability of being on before the critical number of proteins necessary for bifurcation is achieved. The same effect is observed if proteins are more likely to bind to the DNA (small Xeq) (Fig. 18 B). When the genes are more likely to bind a repressor and successful unbinding events are rare, the bifurcation occurs at smaller values of Xad. As Xeq increases, the probability of the genes to be on at the bifurcation point increases, since repressors have a higher tendency to unbind.

For very high values of the adiabaticity parameter, corresponding to high unbinding rates from the DNA binding site, the stable solution that corresponds to the off-state and the unstable state merge and the system is monostable again, with only the on-state present. This limit is also reached by keeping Xad fixed but taking the burst size N → ∞.

Probability distributions

In the case of the rates used in Fig. 20, nc = 32 is the same as for N = 1, but we note a 10-fold decrease in Inline graphic compared to when proteins are produced separately. When proteins are produced in bursts, the probability distributions have tails toward larger n, as opposed to the distributions for individual protein production. The mean number of proteins in the system for given states of the switch is similar to that of the N = 1 case; however, the distributions with bursts are much broader, as could be expected. In this case even very fast unbinding rates from the DNA cannot correct for the enormous protein number fluctuations and one must explicitly keep track of the change of the DNA binding state. A system in which proteins are produced in bursts is very noisy, especially compared to the nearly deterministic case of proteins binding as tetramers.

FIGURE 20.

The evolution of the probability distribution of the gene that is active after the bifurcation, to be on (A) and off (B) and the gene that is inactive to be on (C) and off (D) as a function of Xad for a switch when proteins are produced in bursts of N = 10, Xeq = 1000, ω = 100. Bifurcation point at Xad = δXsw = 35.

Nonzero basal effective production rate

If there is a nonzero basal production rate, the difference between the deterministic and stochastic solutions is also qualitative even for relatively small burst sizes. In this case, proteins are also produced in the off-state, so that the number of repressors produced by the off-gene after the bifurcation is nonzero, but equal to the burst size N, since Inline graphic. This number is equal for both the stochastic and deterministic solutions and is equal to 10 in the examples presented in Fig. 21, C and D. So production in bursts maintains a high level of repressor proteins, even for very small g2/k values, if the burst size is large. When using experimental data one must be very careful to consider the burst size when assuming the basal production level is zero. Furthermore, the value of the interaction function of the gene in the off-state (C1(i) ∼ 0) for the stochastic case is much larger than for the deterministic case, due to the multiplication of 〈n(i)〉^2, which gives F(i) → 〈n(i)〉^2 (1 + k/(2g2)) + Ng2/(2k), for large ω, the effect of which is shown in Fig. 21, A and B. The number of repressor proteins produced by the off-gene decreases as g2 → 0, as expected, and the probability of the on-gene to be active tends to 1. The dependence of the effective production rate at which the bifurcation occurs on the adiabaticity parameter is analogous to that of the case where g2 = 0. The probability distributions, in the on- and off-states, for the gene that is active after the bifurcation takes place are presented in Fig. 22, A and B, for large unbinding rates from the DNA; and Fig. 22, C and D, for small unbinding rates from the DNA. They exhibit maxima around 2Xad for the on-state and 2g2/(2k) for the off-state and display behavior analogous to that of proteins produced separately, apart from the different curvature of the slopes for n < N and n > N. 
For small ω-values the protein numbers reach a steady state before the DNA states, hence we observe bimodal probability distributions. The mechanism of competition in this noisy burst system is different than in the single protein production case. If the gene is in the on-state, probability states with higher n-values are strongly occupied and there is hardly any probability flux into the lower n-states. In the off-state, however, a flux pushes the system into the lower n-states, essentially trapping it there, hence the difference in the slopes, as can be seen in Fig. 22, C and D. This is also true for the g2 = 0 system when proteins are produced in bursts.
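The basal protein level of the repressed gene in the two examples of Fig. 21 can be checked with a one-line estimate. It assumes that a gene that stays in the off-state produces bursts of N proteins at rate g2 and loses each protein at rate k, so that its mean protein number is N·g2/k = 2N·(g2/(2k)). This relation and the function name are assumptions made for illustration; they are consistent with the value of 10 quoted in the text and with the large-ω expression for F(i) given above.

```python
def off_state_mean(burst_size, basal_x):
    """Mean protein number sustained by a gene that stays in the off-state.

    basal_x is the effective basal production rate g2 / (2k); the assumed
    relation is <n> = N * g2 / k = 2 * N * basal_x.
    """
    return 2.0 * burst_size * basal_x

if __name__ == "__main__":
    for burst_size, basal_x in ((10, 0.5), (100, 0.05)):
        mean_n = off_state_mean(burst_size, basal_x)
        print(f"N = {burst_size}, g2/(2k) = {basal_x}: off-state mean = {mean_n}")
```

Both parameter sets give the same off-state mean, which illustrates why a basal rate that looks negligible can still sustain a sizeable repressor level when the burst size is large.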

FIGURE 21.

Probability that gene i is on when proteins are produced in bursts of N = 10 with a basal effective production rate g2/(2k) = 0.5 (A) and N = 100, with a basal effective production rate g2/(2k) = 0.05 (B). Mean number of proteins produced by each gene in the two cases (C and D). Symmetric switch; proteins bind as dimers, Xeq = 100, ω = 100. Comparison of deterministic and stochastic solutions.

FIGURE 22.

The evolution of the probability distribution of the gene that is on after the bifurcation, to be on (A) and off (B) for ω = 100, and to be on (C) and off (D) for ω = 0.5, as a function of Xad for a switch when proteins are produced in bursts of N = 10 with a basal effective production rate g2/(2k) = 0.5, Xeq = 100. Bifurcation points at Xad = 8 (ω = 100) and Xad = 6 (ω = 0.5).

Limitations of the SCPF treatment

The examples presented above cover a large class of two-gene switches, all of which are exactly solvable within the SCPF approximation. An exact solution may be obtained within this approximation for systems of genetic networks and switching cascades. However, the SCPF approximation does not allow for an exact analytical solution of all systems. If we try to model one of the simplest natural systems where regulation is achieved by means of a switch, i.e., the λ-switch, we encounter a problem. The genes in the λ-switch, apart from having a toggle-like regulation, also exhibit autoregulation: cI proteins can bind to OR3, repressing the cI gene, and the Cro proteins can bind to OR1 or OR2, preventing the RNA polymerase from transcribing the Cro gene (Ptashne, 1992; Ptashne and Gann, 2002). If we expand the master equation (Eq. 1) to account for self-regulation we add a Inline graphic binding term to the Pj(ni) equations. Therefore the kth moment equation will display a dependence on the (k + p)th moment and the set of equations will not exhibit closure. One can find the probability distribution for a single self-regulating gene. However, if we consider a system like the λ-phage, where self-regulation is also combined with regulation by another gene, the problem is no longer solvable exactly and demands a cutoff of the hierarchy or other such approximations. We can nevertheless treat these systems using the variational method, as proposed by Sasai and Wolynes (2003). The fact that self-regulation renders the system incompletely solvable within the SCPF approximation is not surprising, since for such a system the approximation corresponds to the exact solution. Gene i is influenced only by the number of proteins it produces. It is independent of the state of the other gene. Therefore, as one would expect, the full solution should depend on all moments of the distribution of gene i. 
However, for systems such as the λ-phage, we can treat all intergene regulation effects exactly and truncate the self-regulation equation at the highest order of the intergene interaction.

CONCLUSIONS

The self-consistent proteomic field approximation for stochastic switches reproduces many intuitive notions about their behavior. It proves to be a very powerful tool that allows for the consideration of all but one of the basic building blocks of more general switches and networks. A switch with a self-repressing/activating gene cannot be solved exactly within the SCPF approximation, since, in this case, the approximation is equivalent to the full solution. Therefore the probability distribution is determined by an infinite number of moments. The probability distributions obtained for the systems considered in this article are not symmetric and exhibit long tails. This anticipates problems for using the variational principle for finding probability distributions when one accounts for correlations between the two states. The possibility to expand this method to consider networks and cascades will allow for a more realistic treatment of complex systems with emergent behavior at low computational costs.

One can account for the mRNA step in the system by adding a deterministic step which, using a deterministic kinetic rate equation, translates the number of mRNA molecules into proteins produced in bursts. This is a valid procedure, as separately shown by Thattai and van Oudenaarden (2001) and Swain et al. (2002); transcription noise is just amplified in the translation process. Therefore treating the mRNA step deterministically simply introduces another constant into the discussed case of proteins produced in bursts. Therefore the presented treatment of proteins produced in bursts with a modified effective production rate is a simple model of including mRNA in the system. Of course, the effect of mRNA is much more complicated, as it also introduces, for example, time delay between binding and production. This model in the present state neglects these effects.

Our analysis of a large class of switches shows how particular elements contribute to the emergent behavior of functioning switches. Comparison of the stochastic and deterministic treatments of a single gene switch shows convergence in the region of fast rates of unbinding from the DNA compared to protein number fluctuations and large effective production rates. For symmetric switches when proteins are produced separately, the two solutions converge after the bifurcation, but often differ when defining the region of parameter space where the bifurcation occurs. The agreement between the deterministic and stochastic solutions is especially good for symmetric switches, with N = 1 and a nonzero basal production rate. However, even though the mean repressor protein levels in the cell are similar in both approximations, the probability distributions are broad and far from Poissonian (i.e., they are not completely characterized by these means). If the adiabaticity parameter is small (ω < 1), the protein-number state will reach a steady state before the DNA binding state, and we observe a bimodal probability distribution. For the symmetric switch, noise has a destructive effect on the region of bistability. Increasing the adiabaticity parameter facilitates the formation of a buffering proteomic cloud around a gene, which leads to repression at lower effective production rates than for small ω.

As was already mentioned, the symmetric switch is hard to design and build experimentally. The asymmetric switch, which is the experimental model system, is much more susceptible to noise than the symmetric switch. Stochasticity not only has the destructive effect on the region of stability that one might expect, but also introduces new phenomena and can be utilized to increase the bistable region. This is of fundamental importance, since experimentally one deals with asymmetric switches and these offer greater possibilities for artificially engineering new systems. As can be learned from the asymmetric switch as well as from the analysis of binding of different oligomers, the region of bistability of a switch grows with increasing the interaction function. When creating artificial switches, one may argue that a large region of bistability is desirable, so that the switch responds with the forward or backward transition to very specific concentrations or production levels of a protein. If the experimental setup constrains the protein production rates, this can also be achieved by modifying the adiabaticity parameters of the system, which ensures the transition remains rapid and effective. Asymmetric switches exhibit first-order phase transitions. The size of the region of phase space in which the forward and backward transitions occur grows with the tendency for proteins to be unbound from the DNA of both genes. Large adiabaticity parameters stabilize the buffering proteomic cloud around the repressed gene and lead to the formation of an effectively repressing cloud for smaller numbers of repressors in the forward transition than for small ω, for the active gene.

Experimental data available at this point (Darling et al., 2000) suggest biological switches function in regions of high adiabaticity parameters from the deterministic point of view. Nevertheless, even for large values of adiabaticity parameters, one must account for the DNA binding site fluctuations explicitly when proteins are produced in bursts. The deterministic solutions give qualitatively wrong results in biologically relevant areas of parameter space. The stochastic solutions for large burst sizes suggest that the bifurcation of the solution is a result of the destabilization of the repressor cloud buffering the DNA, not of the formation of the cloud as for smaller-burst systems. The probability distributions therefore exhibit tails toward large n-values, not toward small n-values as in the small N case. The deterministic kinetics remains unchanged for large burst sizes, unlike the stochastic kinetics, and hence presents results derived from a wrong mechanism. The definition of the adiabatic limit, when proteins are produced in bursts, is not as clear as in the N = 1 case, where the nonadiabatic regime corresponds simply to ω < 1. This ambiguity does not allow one to integrate out the degrees of freedom corresponding to the change in DNA binding site occupation. Such an approximation leads one to erroneously identify the regions of bistability. For the switch with a nonzero basal production rate, when proteins are produced in bursts, the probabilities to be on and the mean numbers of proteins in the cell are very different from those of the deterministic solution, even for small effective basal production rates. If proteins are produced in bursts, assuming that a small effective basal production rate may be approximated by a zero rate may be misleading. Production of proteins in bursts results in a bifurcation transition at smaller values of the effective production rate. It is also a mechanism for making two genes in an asymmetric switch more competitive.

Binding of higher order oligomers leads to results closer to those of deterministic treatments, with narrower probability distributions. This can be used experimentally to stabilize DNA binding states. In this simple model, tetramers seem to be the optimal binders. The close to deterministic all-or-nothing switching that they offer may be worth the effective cost of the energy of multimerization and diffusion of particles. Binding of higher order oligomers may be viewed as a simple model of cooperativity, which increases the competitiveness of genes in an asymmetric switch. Within the SCPF approximation monomers do not make good switches due to the lack of nonlinearity in protein concentration. They do not exhibit a region of bistability. This model neglects any structural DNA-protein interactions and spatial dependence. Hence this conclusion is simply a result of the lack of cooperativity in the system. For small adiabaticity parameters, monomer switches do, however, exhibit bimodal probability distributions, unlike in the large ω-limit.

The thorough investigation of different components of gene regulatory networks using the self-consistent proteomic field approximation provides a tool kit for engineering new switches and networks. Based on our analysis, if one wanted to build a strong component of a switch out of a gene with relatively small chemical parameters, one could use components that bind tetramers and produce proteins in bursts. This is what the Cro gene in the λ-switch uses.

Acknowledgments

A.M.W. and P.G.W. were supported by the Center for Theoretical Biological Physics through National Science Foundation grants PHY0216576 and PHY0225630. M.S. was supported by the Research and Development for Applying Advanced Computational Science and Technology Project of the Japan Science and Technology Corporation and by grants from the Ministry of Education, Culture, Sports, Science, and Technology, Japan.

APPENDIX A

In this appendix we derive the explicit form of the moment equations for the switch discussed in The Toggle Switch, above. In the operator formalism developed for classical diffusion by Doi (1976) and Zeldovich and Ovchinikov (1978), the number operator may be written in terms of number state creation a† and annihilation a operators, as n = a†a. It is then particularly easy to write down the equations for the a moments instead of the n moments. Setting the left-hand side to zero, one obtains the steady-state equations

graphic file with name M120.gif
graphic file with name M121.gif (A1)

Using the probability conservation relation C1(i) + C2(i) = 1, the 0th order equations become

graphic file with name A2.jpg (A2)

Dividing the higher order aj(i) moment equations by Cj(i) and using the relation Inline graphic from the 0th order equations, one can calculate

graphic file with name M123.gif (A3)

which depends only on a moments of lower order than the kth moment. This allows one to obtain the following form for the higher order a moments,

graphic file with name M124.gif
graphic file with name M125.gif (A4)

Going back and forth between the two types of moments is straightforward. The n-moment equations have, however, more complicated forms, as for example

graphic file with name M126.gif (A5)
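
The conversion between the two types of moments can be sketched as follows, assuming, as in the standard Doi representation, that the kth a moment equals the kth factorial moment 〈n(n−1)…(n−k+1)〉. The n moments then follow from the a moments through Stirling numbers of the second kind. The function names are hypothetical and the Poisson check is only an illustration.

```python
def stirling2_table(m_max):
    """Stirling numbers of the second kind S[m][j] for 0 <= j <= m <= m_max."""
    S = [[0] * (m_max + 1) for _ in range(m_max + 1)]
    S[0][0] = 1
    for m in range(1, m_max + 1):
        for j in range(1, m + 1):
            S[m][j] = j * S[m - 1][j] + S[m - 1][j - 1]
    return S

def n_moments_from_a_moments(a_moments):
    """Convert factorial (a) moments into raw (n) moments.

    a_moments[j] is <n(n-1)...(n-j+1)>, with a_moments[0] = 1.
    Uses the identity n^m = sum_j S(m, j) n(n-1)...(n-j+1).
    """
    m_max = len(a_moments) - 1
    S = stirling2_table(m_max)
    return [sum(S[m][j] * a_moments[j] for j in range(m + 1))
            for m in range(m_max + 1)]

# Illustration: a Poisson distribution with mean lam has a moments lam**j.
lam = 3.0
print(n_moments_from_a_moments([lam**j for j in range(4)]))
```

For the second moment the relation reduces to 〈n²〉 = 〈a²〉 + 〈a〉, which already shows why equations written in terms of n moments carry extra lower-order terms.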

APPENDIX B

In the case when proteins are produced in bursts of N and repressors bind as dimers, the master equation has the form

graphic file with name M127.gif (B1)

for n ≥ N. For n < N, the equations have the form

graphic file with name M128.gif
graphic file with name M129.gif (B2)

Following the same procedure as for the single protein production case, we get the equations of motion for the first three moments, as

graphic file with name M130.gif (B3)

where Inline graphic as before. Writing out N² = N(N−1) + N and subtracting the 〈nj(i)〉 equations from Inline graphic we get the equations of motion for the previously defined annihilation operators a. Due to the form of F(i) for the dimer binding case only the first three moments are relevant. However, this procedure can generally be carried out for higher moments, yielding an expression for the mth annihilation operator moment in the steady state of the form
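
The role of the identity N² = N(N−1) + N can be seen in a simplified worked example. The sketch below keeps only production in bursts of N at a constant rate g and degradation at rate k per protein, ignoring the DNA state and the binding terms; the symbols g and k are assumptions introduced for this illustration.

```latex
% Assumed reactions: n -> n + N at rate g, and n -> n - 1 at rate k n.
\begin{align}
\frac{d\langle n\rangle}{dt} &= gN - k\langle n\rangle, \\
\frac{d\langle n^{2}\rangle}{dt} &= g\left(2N\langle n\rangle + N^{2}\right)
  - 2k\langle n^{2}\rangle + k\langle n\rangle, \\
\frac{d\langle a^{2}\rangle}{dt}
  &= \frac{d\langle n^{2}\rangle}{dt} - \frac{d\langle n\rangle}{dt}
   = 2gN\langle a\rangle + gN(N-1) - 2k\langle a^{2}\rangle ,
\end{align}
% using <a> = <n> and <a^2> = <n^2> - <n>.
```

In this reduced setting the burst size enters the second a moment through the extra source term gN(N−1), which vanishes for N = 1 and recovers the Poisson result 〈a²〉 = (g/k)² in the steady state.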

graphic file with name M133.gif (B4)

To consider the binding of higher order oligomers when proteins are produced in bursts one simply accounts for the changed form of F(i) as discussed in The Case when Proteins Bind as Higher Order Oligomers, above.

References

  1. Ackers, G. K., A. D. Johnson, and M. A. Shea. 1982. Quantitative model for gene regulation by λ-phage repressor. Proc. Natl. Acad. Sci. USA. 79:1129–1133.
  2. Arkin, A., J. Ross, and H. H. McAdams. 1998. Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics. 149:1633–1648.
  3. Aurell, E., S. Brown, J. Johanson, and K. Sneppen. 2002. Stability puzzles in phage lambda. Phys. Rev. E. 65:051914-1–051914-9.
  4. Buchler, N. E., U. Gerland, and T. Hwa. 2003. On schemes of combinatorial transcription logic. Proc. Natl. Acad. Sci. USA. 100:5136–5141.
  5. Bialek, W. 2001. Stability and noise in biochemical switches. Adv. Neural Inf. Proc. 13:103–109.
  6. Becskei, A., B. Seraphin, and L. Serrano. 2001. Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion. EMBO J. 20:2528–2535.
  7. Cook, D. L., A. N. Gerber, and S. J. Tapscott. 1998. Modeling stochastic gene expression: implications for haplo-insufficiency. Proc. Natl. Acad. Sci. USA. 95:15641–15646.
  8. Darling, P. J., J. M. Holt, and G. K. Ackers. 2000. Coupled energetics of λ-Cro repressor self-assembly and site-specific DNA operator binding. II. Cooperative interactions of Cro dimers. J. Mol. Biol. 302:625–638.
  9. Doi, M. 1976. Stochastic theory of diffusion-controlled reaction. J. Phys. A. 9:1479–1495.
  10. Gillespie, D. T. 1977. Exact stochastic simulation of coupled chemical-reactions. J. Phys. Chem. 81:2340–2361.
  11. Hasty, J., F. Isaacs, M. Dolnik, D. McMillen, and J. J. Collins. 2001a. Designer gene networks: towards fundamental cellular control. Chaos. 11:207–220.
  12. Hasty, J., D. McMillen, F. Isaacs, and J. J. Collins. 2001b. Computational studies of gene regulatory networks: in numero molecular biology. Nat. Rev. Genet. 2:268–279.
  13. Hasty, J., J. Pradines, M. Dolnik, and J. J. Collins. 2000. Noise-based switches and amplifiers for gene expression. Proc. Natl. Acad. Sci. USA. 97:2075–2080.
  14. Kepler, T. B., and T. C. Elston. 2001. Stochasticity in transcriptional regulation: origins, consequences, and mathematical representations. Biophys. J. 81:3116–3136.
  15. McAdams, H. H., and A. Arkin. 1997. Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci. USA. 94:814–819.
  16. McLure, K. G., and P. W. Lee. 1998. How p53 binds DNA as a tetramer. EMBO J. 17:3342–3350.
  17. Paulsson, J., O. G. Berg, and M. Ehrenberg. 2000. Stochastic focusing: fluctuation-enhanced sensitivity of intracellular regulation. Proc. Natl. Acad. Sci. USA. 97:7148–7153.
  18. Ptashne, M., and A. Gann. 2002. Genes and Signals. Cold Spring Harbor Laboratory Press, New York.
  19. Ptashne, M. 1992. A Genetic Switch, 2nd Ed. Cell Press and Blackwell Science, Oxford, UK.
  20. Sneppen, K., and E. Aurell. 2002. Epigenetics as a first exit problem. Phys. Rev. Lett. 88:048101-1–048101-4.
  21. Swain, P. S., M. B. Elowitz, and E. D. Siggia. 2002. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. USA. 99:12795–12800.
  22. Sasai, M., and P. G. Wolynes. 2003. Stochastic gene expression as a many-body problem. Proc. Natl. Acad. Sci. USA. 100:2374–2379.
  23. Thattai, M., and A. van Oudenaarden. 2001. Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci. USA. 98:8614–8619.
  24. Vilar, J. M. G., C. C. Guet, and S. Leibler. 2003. Modeling network dynamics: the lac operon, a case study. J. Cell Biol. 161:471–476.
  25. Zeldovich, Y. B., and A. A. Ovchinikov. 1978. Mass-action law and kinetics of chemical-reactions with allowance for thermodynamic density fluctuations. Sov. Phys. JETP. 74:1588–1598.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
