Abstract
Two powerful and complementary experimental approaches are commonly used to study the cell cycle and cell biology: One class of experiments characterizes the statistics (or demographics) of an unsynchronized exponentially growing population, while the other captures cell cycle dynamics, either by time-lapse imaging of full cell cycles or in bulk experiments on synchronized populations. In this paper, we study the subtle relationship between observations in these two distinct experimental approaches. We begin with an existing model: a single-cell deterministic description of cell cycle dynamics where cell states (i.e. periods or phases) have precise lifetimes. We then generalize this description to a stochastic model in which the states have stochastic lifetimes, as described by arbitrary probability distribution functions. Our analyses of the demographics of an exponential culture reveal a simple and exact correspondence between the deterministic and stochastic models: The state ages in the deterministic model are equal to the exponential means of the corresponding ages in the stochastic model. An important implication is therefore that the demographics of an exponential culture will be well fit by a deterministic model even if the state timing is stochastic. Although we explore the implications of the models in the context of the Escherichia coli cell cycle, we expect both the models and the exponential-mean lifetimes to find many applications in the quantitative analysis of cell cycle dynamics in other biological systems.
I. INTRODUCTION
Methods to quantitatively characterize cell cycle dynamics have expanded dramatically [1] since the pioneering model of the Escherichia coli cell cycle described by Cooper and Helmstetter [2]. Their initial work represented the cell cycle as a deterministic process in which each step was precisely timed. Although these assumptions were almost certainly viewed as a matter of mathematical convenience, some later readers have interpreted the experimental success of this model as evidence that stochasticity in the cell cycle has little biological significance [3]. Some later authors have relaxed some of these assumptions and found that the predictions are in fact robust to the model details [3], but none have yet reanalyzed these dynamics in the context of the significant level of stochasticity observed in cell cycle timing (e.g. [4, 5]). In this paper, we study a class of stochastic models that can be solved exactly, even in the strong stochasticity limit, and we explore their phenomenology.
One fundamental difficulty with reconciling the quantitative analyses of the cell cycle is the existence of two distinct classes of experiments: In unsynchronized approaches, an exponential culture is analyzed and the number of cells at time t is used to generate statistics defined with respect to cell number [2]. Examples of this approach are snapshot imaging (e.g. [6]), flow cytometry (e.g. [7]), and many deep-sequencing based approaches (e.g. [8]). We contrast these with synchronized approaches in which cells of a known state in the cell cycle progression are analyzed. Examples of this approach are the use of any of the previously described methods on cells which are first synchronized using a baby machine (e.g. [9]). Time-lapse imaging of full cell cycles (e.g. [10]), including the use of devices like the mother machine (e.g. [4]), can also be used to generate data for synchronized analyses. Although it might naïvely seem that averages over these two population ensembles are equivalent, they are not.
To demonstrate the subtlety of interpreting the data from an exponential culture, consider the probability of observing the Z ring, the ring-shaped protein complex that forms in E. coli at midcell and drives the process of septation (or cytokinesis) [11]. If the cell cycle has duration T and the Z ring has lifetime δτZ, one might naïvely assume the probability of observing the Z ring is:
$$P_Z = \frac{\delta\tau_Z}{T} \tag{1}$$
See Fig. 1A. Although this is true in the synchronized population, in an exponential culture the probability is roughly 30% lower as a direct consequence of the relative abundance of cells by age [12]. Why? The number of new-born cells is twice the abundance of cells at the end of the cell cycle when the Z ring forms. Although this seems like a trivial book-keeping annoyance, when we consider the stochastic model, this effect has consequential implications for timing throughout the cell cycle, including on the growth rate.
In Sec. II A, we will first revisit the existing deterministic model, where all events in the cell cycle are precisely timed. In this model, we will represent the fundamental state of the cell as an age τ and compute the statistics of cell age in an exponential culture. To make contact with observables, we then apply these results to describe the demographics of the E. coli cell cycle in Sec. II B. In Sec. II C, we consider a stochastic model, where the cell cycle is represented as discrete sequential states j = 1…m, each with a stochastic lifetime τδj. Although this model cannot fully capture all the complexities of the cell cycle, it is analytically tractable and we can exactly compute expressions for all the same statistics as the deterministic model. The relation between the deterministic and stochastic model statistics is at this point opaque. In Sec. II D, we define an exponential mean, which is a mean biased toward younger cells that are overabundant in exponential culture. In Sec. II E, we demonstrate that the predictions of the stochastic and deterministic models are in fact identical if the deterministic state ages τj are equal to the exponential-mean stochastic state ages. Finally, in Sec. II F, we consider a number of simple biological examples to underline both the mathematical behavior of the exponential mean as well as its biological implications. In the interest of brevity, we will discuss experimental support for this model elsewhere [13].
II. RESULTS
In this section, we will derive the expressions for a large number of statistics relevant for describing an exponential culture. We will first derive expressions for the statistics in the deterministic model and then the stochastic model.
A. Deterministic model
In the deterministic model, we will consider cells that are born with age τ = 0 and divide deterministically at age τ = Tτ. By cell age τ, we mean a continuous cell state variable representing cell cycle progression, not aging in the context of reduced cell fitness over time [14].
1. Definition of the deterministic model
In the deterministic model, cell state is described by a continuous variable, cell age τ, and therefore the population is described in terms of a number density with respect to age τ at time t: nτ(t). Age τ is defined on the interval [0, Tτ] with τ = 0 corresponding to cell birth and Tτ corresponding to cell division. Let the cumulative creation number, , be the cumulative number of cells that have entered state (i.e. age) τ and the cumulative annihilation number, , be the cumulative number of cells that have transitioned out of state τ [15]. In the deterministic model, where the cell state τ is continuous (not discrete), the cumulative creation and annihilation numbers are identical and are related to the number density by the equation:
(2) |
It is convenient to define these cumulative numbers in addition to the number density, since they will be a powerful tool for computing some observable quantities. To describe the dynamics, we can write an equation describing the number of cells entering the infinitesimal age interval [τ, τ + δτ] in the infinitesimal time interval [t, t + δt]:
(3) |
where and the first term on the RHS is the number of cells entering state τ and the second term represents the cells leaving state τ + δτ. Eq. 3 can then be rewritten:
(4) |
except at division, where some care is required. Now consider the process of cell division explicitly: The division process can be understood as the annihilation of a cell in state τ = Tτ and the creation of two new-born cells in state τ = 0:
(5) |
(6) |
Substituting Eq. 5 into Eq. 4 gives a single piecewise rate equation in terms of the number density nτ(t):
(7) |
where A′ ≡ ∂τA. Eq. 7 completely describes the cell cycle dynamics in the deterministic model. The details of the derivation are given in Appendix A 1.
2. Solution to the deterministic model
In steady-state growth, we can assume the total number of cells is:
$$N(t) = N(0)\, e^{kt} \tag{8}$$
where k is the growth rate that is determined by solving the rate equation (Eq. 7), as detailed in Appendix A 2. It will often be convenient to rewrite the equations in terms of the doubling time:
$$T \equiv \frac{\ln 2}{k} \tag{9}$$
rather than the growth rate k. Eq. 7 evaluated at τ = 0 gives a consistency condition between doubling time T and the duration of the cell cycle Tτ:
$$T = T_\tau \tag{10}$$
which is to say that the doubling time is equal to the duration of the cell cycle, as one would naïvely expect. In steady-state growth, one can compute the number density of cells, which is:
$$n_\tau(t) = n_0\, e^{k(t-\tau)} \tag{11}$$
where nτ is the density with respect to cell age τ and n0 is a constant determined by the initial cell number. The details of the derivations for the solution and the consistency condition are given in Appendix A 2.
3. Statistics of the deterministic model
The solution of Eq. 7 can be used to compute the probability (PDF) and cumulative (CDF) distribution functions with respect to cell age:
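$$p(\tau) = 2k\, e^{-k\tau}, \qquad 0 \le \tau \le T_\tau \tag{12}$$
$$P(\tau) = 2\left(1 - e^{-k\tau}\right), \qquad 0 \le \tau \le T_\tau \tag{13}$$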
The details of the derivation are given in Appendix A 3. Eq. 12 implies that in an exponential culture, there is an enrichment of young cells which decays exponentially with age τ. See Fig. 2A.
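As a numerical illustration of this enrichment of young cells, the following minimal sketch evaluates the age distribution under the assumption that the age density decays as exp(−kτ) on [0, Tτ] with k = ln 2/Tτ, which normalizes to 2k exp(−kτ). The function names and the 60-minute cell cycle are illustrative choices rather than quantities taken from the text.

```python
import math

def age_pdf(tau, T):
    """Age PDF in an exponential culture: 2k*exp(-k*tau) on [0, T], with k = ln(2)/T."""
    k = math.log(2) / T
    return 2 * k * math.exp(-k * tau) if 0 <= tau <= T else 0.0

def age_cdf(tau, T):
    """Fraction of cells younger than age tau (tau is clamped to [0, T])."""
    k = math.log(2) / T
    tau = min(max(tau, 0.0), T)
    return 2 * (1 - math.exp(-k * tau))

T = 60.0  # illustrative cell-cycle duration in minutes (assumption)
newborn_to_dividing = age_pdf(0.0, T) / age_pdf(T, T)  # relative abundance of newborn cells
median_age = T * math.log2(4 / 3)                      # age at which the CDF reaches 1/2
print(newborn_to_dividing, median_age, age_cdf(T, T))
```

Under these assumptions the median age falls well before mid-cycle, which is the bias toward young cells described above.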
Note that the canonical observable in an exponential culture is number as a function of time rather than abundances relative to the total number of cells N(t). However, we shall write each expression as the prefactor of N(t) and therefore the prefactor can be interpreted as the abundance relative to cell number N(t).
The cumulative creation number is:
$$N^+_\tau(t) = 2\, e^{-k\tau}\, N(t) \tag{14}$$
The details of the derivation are given in Appendix A 4. The number of cells younger than age τ is:
$$N_{<\tau}(t) = 2\left(1 - e^{-k\tau}\right) N(t) \tag{15}$$
and the number of cells older than age τ is:
$$N_{>\tau}(t) = \left(2\, e^{-k\tau} - 1\right) N(t) \tag{16}$$
Finally, the number of cells in a state defined by the age range τ1 < τ < τ2 is:
$$N_{\tau_1 < \tau < \tau_2}(t) = N_{>\tau_1}(t) - N_{>\tau_2}(t) \tag{17}$$
where the two terms on the right hand side are defined in Eq. 16.
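The demographic quantities of this section can be collected into a few helper functions. This is a minimal sketch, assuming the steady-state age density is proportional to 2^(−τ/T) with T = Tτ; the function names are hypothetical, and each function returns the prefactor of N(t), i.e. the abundance per cell.

```python
def created(tau, T):
    """Cumulative creation number at age tau, per cell: 2 * 2**(-tau/T)."""
    return 2.0 * 2.0 ** (-tau / T)

def younger(tau, T):
    """Fraction of cells younger than age tau."""
    return 2.0 * (1.0 - 2.0 ** (-tau / T))

def older(tau, T):
    """Fraction of cells older than age tau."""
    return 2.0 * 2.0 ** (-tau / T) - 1.0

def between(tau1, tau2, T):
    """Fraction of cells with tau1 < age < tau2, as a difference of 'older' terms."""
    return older(tau1, T) - older(tau2, T)

T = 60.0  # illustrative cell-cycle duration in minutes (assumption)
print(created(0.0, T), younger(30.0, T), older(30.0, T), between(20.0, 40.0, T))
```

Note that `created` exceeds the current cell number at τ = 0 because it counts every cell ever born, whereas `younger`, `older`, and `between` count only cells present at time t.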
B. Application to cell cycle dynamics
In this section, we demonstrate how to apply these results in the context of the E. coli cell cycle dynamics shown schematically in Fig. 1A. These formulae can be applied either to predict the numbers from the known replication timing or to infer timing from the observed numbers in an exponential culture.
1. Z ring
The Z ring is an ultra-structural complex responsible for the process of bacterial cytokinesis (or septation) in which the cell envelope contracts at midcell forming a septum that closes to form the new poles of the nascent daughter cells [16]. The assembly, dynamics and disassembly of this structure are easily visualized using a wide range of fluorescent fusions in live cells or immunofluorescence in fixed cells [17].
The Z ring is an example of a transient complex, therefore we need to use N (as opposed to N+). Furthermore, it assembles at τ = τZ and disassembles at the end of the cell cycle. The number of Z rings is therefore equal to the number of cells older than τZ:
$$N_Z(t) = N_{>\tau_Z}(t) = \left(2\, e^{-k\tau_Z} - 1\right) N(t) \tag{18}$$
It is interesting to consider the limit as δτZ ≡ Tτ − τZ is small relative to the cell cycle duration Tτ in order to compare this to our intuitive guess (Eq. 1):
$$N_Z(t) \approx \ln 2\; \frac{\delta\tau_Z}{T}\, N(t) \tag{19}$$
Since ln 2 ≈ 0.69, this is roughly 30% smaller than our naïve estimate due to depletion of older cells in an exponential culture (Fig. 2A).
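The size of this correction can be checked numerically. The sketch below assumes, as in the deterministic model, that the fraction of cells older than age τ is 2·2^(−τ/T) − 1, so that a Z ring present during the final δτZ of the cycle is observed in a fraction 2^(δτZ/T) − 1 of cells. The function name and the numerical values are illustrative assumptions.

```python
import math

def z_ring_fraction(delta, T):
    """Fraction of cells within the last `delta` of a cycle of duration T: 2**(delta/T) - 1."""
    return 2.0 ** (delta / T) - 1.0

T, delta = 60.0, 6.0                    # illustrative durations in minutes (assumption)
naive = delta / T                       # synchronized-population estimate
exact = z_ring_fraction(delta, T)       # exponential-culture value
small_delta = math.log(2) * delta / T   # small-lifetime limit
print(naive, exact, small_delta)
```

For these values the exponential-culture fraction lies between the small-lifetime limit and the naïve estimate, and the ratio `small_delta / naive` is exactly ln 2.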
2. Cell poles
Although the number of cell poles is twice the number of cells, it is useful to consider this example more formally. Unlike the Z ring which is transient, the poles are perpetual: Once the state is created, it is never annihilated (Fig. 1B). In this context, we can use the cumulative creation number N+. Note that we are immediately presented with a conundrum: Are two poles formed at the end of the cell cycle (τ = T) or is one pole created at birth (τ = 0)? Both approaches give the same number:
$$N_{\rm poles}(t) = 2\, N^+_{T_\tau}(t) = N^+_{0}(t) = 2\, N(t) \tag{20}$$
which is twice the number of cells, just as one intuitively expects.
3. DNA loci
The numbers of DNA loci can be observed by a number of different approaches: Modern deep sequencing methods allow a replication profile (i.e. the DNA copy number) of all loci to be measured in a single experiment (e.g. [8]). However, population-level analysis of the relative copy numbers of loci long predates this modern approach [2]. The single-cell dynamics of loci can also be observed: Imaging-based approaches, such as Fluorescence In Situ Hybridization and Fluorescent Repressor Operator Systems (or closely related approaches), can be used to visualize the numbers of segregated loci in single cells [18, 19].
There are multiple equivalent approaches to computing the numbers of genetic locus ℓ. First consider the slow-growth limit where both initiation and termination occur within the current cell cycle [2]. Assume the locus of interest is replicated at time τℓ. The number of copies per cell is one before replication and two after replication. We can therefore write:
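$$N_\ell(t) = N_{<\tau_\ell}(t) + 2\, N_{>\tau_\ell}(t) \tag{21}$$
$$\phantom{N_\ell(t)} = \left[\, 2\left(1 - e^{-k\tau_\ell}\right) + 2\left(2\, e^{-k\tau_\ell} - 1\right) \right] N(t) \tag{22}$$
$$\phantom{N_\ell(t)} = 2\, e^{-k\tau_\ell}\, N(t) = N(t + T_\tau - \tau_\ell) \tag{23}$$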
in agreement with previous results [20, 21]. Unlike transient quantities, e.g. the number of Z rings, the form of Eq. 23 implies that the number of genetic loci can be understood as a temporal shift of N(t) by Tτ − τℓ to shorter times, as illustrated in Fig. 2B.
The cancelation of the non-exponential terms between N>τℓ and N<τℓ in Eq. 22 may seem incidental, but from another perspective it is intuitive: The mathematical reason for the non-exponential terms in the prefactors of Eqs. 15-16 is annihilation, i.e. the reduction in the number of cells of a particular age τ due to aging. DNA loci correspond to a perpetual state: Once a locus state is created (i.e. replicated) it does not annihilate (i.e. transition into another state). To compute the number of genetic loci, we can therefore use the cumulative creation number N+ formula (as opposed to N which is reduced by annihilation). This more direct approach yields the same result as Eq. 22:
$$N_\ell(t) = N^+_{\tau_\ell}(t) = 2\, e^{-k\tau_\ell}\, N(t) \tag{24}$$
but is applicable for fast growth where replication initiates before the cell cycle begins (i.e. τℓ < 0), as illustrated in the cell cycle schematic in Fig. 2A.
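To illustrate how locus copy numbers behave in fast growth, the sketch below assumes that the copy number per cell of a locus replicated at age τℓ is 2^(1 − τℓ/T), which is the cumulative creation number per cell and remains valid for τℓ < 0. The function name and the timing values are illustrative assumptions, not measurements.

```python
def copies_per_cell(tau_locus, T):
    """Mean copies per cell of a locus replicated at age tau_locus (may be negative)."""
    return 2.0 ** (1.0 - tau_locus / T)

# Illustrative fast-growth timings in minutes (assumptions).
T, C, D = 30.0, 40.0, 20.0
tau_ter = T - D        # terminus replicated D before division
tau_ori = tau_ter - C  # origin replicated C before the terminus; negative here
ori_per_cell = copies_per_cell(tau_ori, T)
ter_per_cell = copies_per_cell(tau_ter, T)
ori_ter_ratio = ori_per_cell / ter_per_cell  # depends only on C / T
print(tau_ori, ori_per_cell, ter_per_cell, ori_ter_ratio)
```

A locus replicated at birth has two copies per cell and one replicated at division has one, while a negative replication age gives more than two copies per cell.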
4. B, C, and D periods
Traditionally, the bacterial cell cycle is described by three periods: The B period is defined as the period between birth and replication initiation. The C period is defined as the cell cycle period during which replication occurs: i.e. after replication initiation and before termination [2]. The D period is defined as the period between replication termination and cell division [2]. There are multiple approaches to characterizing the relative abundance of cells by period. A traditional approach is to infer this information from the relative abundance of the origin, terminus and number of cells [2]. However, more recent single-cell approaches can visualize the replication process itself in live cells [22, 23].
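The relations between the durations of these periods and the locus numbers (Eq. 24) are:
$$B = \tau_{\rm ori} = T \log_2 \frac{2N}{N_{\rm ori}} \tag{25}$$
$$C = \tau_{\rm ter} - \tau_{\rm ori} = T \log_2 \frac{N_{\rm ori}}{N_{\rm ter}} \tag{26}$$
$$D = T - \tau_{\rm ter} = T \log_2 \frac{N_{\rm ter}}{N} \tag{27}$$
where N, Nori, and Nter are the numbers of cells, origins, and termini in the exponential culture (not per cell); these relations have previously been reported [2, 3]. Note that if replication initiates before the start of the cell cycle, B = τori < 0.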
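The inference of period durations from culture-level counts can be sketched as a round trip: generate origin and terminus numbers from known timings, then recover the timings. The sketch assumes that the number of copies of a locus replicated at age τ is 2·2^(−τ/T)·N; the function name and all numerical values are illustrative assumptions.

```python
import math

def infer_periods(n_cells, n_ori, n_ter, T):
    """Infer (B, C, D) from numbers of cells, origins and termini at doubling time T."""
    B = T * math.log2(2.0 * n_cells / n_ori)
    C = T * math.log2(n_ori / n_ter)
    D = T * math.log2(n_ter / n_cells)
    return B, C, D

# Illustrative fast-growth timings in minutes (assumptions); B is negative here.
T, C_true, D_true = 30.0, 40.0, 20.0
B_true = T - C_true - D_true
n_cells = 1.0e6
n_ori = n_cells * 2.0 ** (1.0 - B_true / T)             # origin replicated at age B
n_ter = n_cells * 2.0 ** (1.0 - (B_true + C_true) / T)  # terminus replicated at age B + C
B, C, D = infer_periods(n_cells, n_ori, n_ter, T)
print(B, C, D)
```

By construction the three inferred periods sum to the doubling time, and a negative B signals initiation in a previous cell cycle.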
5. Replication
Finally, let us consider the replisomes and the replication process itself. A traditional population-level approach to determining the number of replicating cells is to infer it from the relative origin, terminus, and cell abundances. However, many single-cell and live single-cell approaches exist today as well. For instance, fluorescent fusions to core replisome components that localize during replication can be used to characterize the number of replicating cells [22, 24, 25].
Like the Z ring, replication is a transient state; however, there is a significant subtlety here: Do we count (i) replicating cells, (ii) individual replication processes consisting of replisome-pairs, or (iii) individual replisomes?
First let us consider the number of replisome-pairs. Since the replication process can span the overlap between two successive cell cycles, it is most convenient to use differences in the cumulative creation number N+. In fact, we can express the number of replisome-pairs concisely in terms of oriC and ter:
$$N_{\rm rp}(t) = N_{\rm ori}(t) - N_{\rm ter}(t) \tag{28}$$
and the number of individual replisomes will be twice the number of pairs. Nori and Nter are computed using Eq. 24.
For the number of replicating cells, we consider three different cases. First consider a case where the replication cycle is internal to the cell cycle. In this case, we have:
$$N_{\rm rep}(t) = N_{\tau_{\rm ori} < \tau < \tau_{\rm ter}}(t) \tag{29}$$
which can be evaluated using Eq. 17. If the replication process overlaps by a single cell cycle but replication rounds do not overlap:
$$N_{\rm rep}(t) = N_{<\tau_{\rm ter}}(t) + N_{>T_\tau + \tau_{\rm ori}}(t) \tag{30}$$
where τori is negative in this context, as noted in Sec. II B 4. Finally, if the rounds of replication overlap:
$$N_{\rm rep}(t) = N(t) \tag{31}$$
and all cells are replicating in the deterministic model.
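The three cases for the number of replicating cells, and the count of replisome-pairs, can be sketched as follows. The sketch assumes the deterministic age distribution (fraction older than τ equal to 2·2^(−τ/T) − 1), that termination occurs within the current cell cycle (0 ≤ τter ≤ T), and that a negative τori means initiation at age T + τori in the mother cell. Function names and timings are hypothetical.

```python
def younger(tau, T):
    """Fraction of cells younger than age tau."""
    return 2.0 * (1.0 - 2.0 ** (-tau / T))

def older(tau, T):
    """Fraction of cells older than age tau."""
    return 2.0 * 2.0 ** (-tau / T) - 1.0

def replicating_fraction(tau_ori, tau_ter, T):
    """Fraction of replicating cells; assumes 0 <= tau_ter <= T and tau_ori < tau_ter."""
    if tau_ori >= 0.0:
        # Replication is internal to the cell cycle.
        return older(tau_ori, T) - older(tau_ter, T)
    if tau_ter - tau_ori < T:
        # Initiation in the mother cell, but successive rounds do not overlap.
        return younger(tau_ter, T) + older(T + tau_ori, T)
    # Overlapping rounds: every cell is replicating.
    return 1.0

def replisome_pairs_per_cell(tau_ori, tau_ter, T):
    """Origins minus termini, per cell."""
    return 2.0 * (2.0 ** (-tau_ori / T) - 2.0 ** (-tau_ter / T))

slow = replicating_fraction(20.0, 60.0, 100.0)    # internal replication
medium = replicating_fraction(-10.0, 30.0, 50.0)  # spans one division
fast = replicating_fraction(-30.0, 10.0, 30.0)    # overlapping rounds
print(slow, medium, fast)
```

In the internal case each replicating cell carries exactly one replisome-pair, so the two counts agree; once initiation precedes division, old cells carry two pairs and the pair count exceeds the number of replicating cells.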
C. Stochastic model
An important complication of a more realistic model for cell cycle dynamics is stochasticity (i.e. randomness) in the lifetime of the states of the cell cycle. We will incorporate this stochasticity by dividing the cell cycle into m discrete states through which the cell must transition sequentially. This model is shown schematically in Fig. 3. The lifetime of each state τδj will be described by an arbitrary lifetime PDF, pδj(·), for the jth state. It is important to note that this sequential-state stochastic model is not general enough to be an accurate representation of the bacterial cell cycle; however, it is sufficient to explore a number of interesting stochasticity-related phenomena and is exactly solvable.
1. Definition of the stochastic model
In our analysis, we will use the rate equation approach, rather than a master equation approach, since we are interested in the steady-state behavior of the model in the large cell number limit where the relative size of the fluctuations is vanishingly small.
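Let Nj(t) be the number of cells in state j, and let the cumulative creation number, $N^+_j(t)$, and the cumulative annihilation number, $N^-_j(t)$, be the total number of cells to have arrived at and departed from state j over all time, respectively. The state dynamics is therefore described by the following rate equation:
$$\dot{N}_j(t) = \dot{N}^+_j(t) - \dot{N}^-_j(t) \tag{32}$$
where the dot denotes a time derivative. In this model, cells move sequentially through the m states before the final state (j = m) transitions to the initial state (j = 1) as two cells:
$$\dot{N}^+_j(t) = \begin{cases} 2\, \dot{N}^-_m(t), & j = 1 \\ \dot{N}^-_{j-1}(t), & 1 < j \le m \end{cases} \tag{33}$$
Each state j has a PDF of lifetimes pδj(t) and therefore the relation between state j arrivals and departures is given by:
$$\dot{N}^-_j(t) = \left( p_{\delta j} \otimes \dot{N}^+_j \right)(t) \tag{34}$$
where ⊗ is the convolution:
$$(f \otimes g)(t) \equiv \int_0^{\infty} dt'\, f(t')\, g(t - t') \tag{35}$$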
2. Solution to the stochastic model
We will work in the steady-state growth limit as before. It is most convenient to work in terms of the Laplace transforms of the rate equations (Eqs. 32-34). The Laplace transform is defined:
$$\tilde{f}(\lambda) \equiv \int_0^{\infty} dt\; e^{-\lambda t} f(t) \tag{36}$$
where the tilde denotes the transformation from time t to Laplace conjugate λ.
The transformed representation is convenient since the ordinary differential equations become algebraic equations in terms of the transformed quantities and the convolutions become products of transforms (e.g. [26]). Of particular importance in what follows is the relation between the PDF of lifetimes of individual state j, pδj(t), and the PDF of the age of the cell at the transition out of state j, pj(t):
$$\tilde{p}_j(\lambda) = \prod_{i=1}^{j} \tilde{p}_{\delta i}(\lambda) \tag{37}$$
A detailed derivation of the solution to the rate equations (Eqs. 32-34) is given in Appendix A 5.
For steady-state exponential growth, the consistency condition that relates the growth rate k to the PDF of cell cycle durations p(t) ≡ pm(t) can be written concisely in terms of the Laplace transform:
$$2\, \tilde{p}(k) = 1 \tag{38}$$
an equation that is well known [27, 28]. This consistency condition is equivalent to Eq. 10 in the deterministic model, although the mathematical equivalence between these two relations is opaque for the moment.
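The consistency condition can be solved numerically for the growth rate once a distribution of cell cycle durations is specified. The sketch below assumes the condition takes the standard form 2·p̃(k) = 1 and, purely for illustration, a gamma-distributed cell cycle duration, whose Laplace transform is (1 + θλ)^(−a) for shape a and scale θ. The solver name and parameter values are hypothetical.

```python
import math

def solve_growth_rate(laplace, k_hi=1.0, tol=1e-12):
    """Solve 2*laplace(k) = 1 for k > 0 by bisection; laplace must decrease with k."""
    while 2.0 * laplace(k_hi) > 1.0:  # expand the bracket until the root is enclosed
        k_hi *= 2.0
    k_lo = 0.0
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        if 2.0 * laplace(mid) > 1.0:
            k_lo = mid
        else:
            k_hi = mid
    return 0.5 * (k_lo + k_hi)

# Illustrative gamma-distributed cycle duration: mean 60 min, 25% coefficient of variation.
shape, scale = 16.0, 60.0 / 16.0

def gamma_transform(lam):
    """Laplace transform of the gamma PDF with the shape and scale above."""
    return (1.0 + scale * lam) ** (-shape)

k = solve_growth_rate(gamma_transform)
T = math.log(2) / k
print(k, T)
```

For this choice the root is also available in closed form, k = (2^(1/a) − 1)/θ, and the resulting doubling time is shorter than the arithmetic-mean cell cycle duration.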
3. The statistics of the stochastic model
In an exponential culture, the cell numbers are given by the expressions:
(39) |
(40) |
(41) |
(42) |
which have a similar structure to the dynamics of the deterministic model (Eqs. 14-17), but are dependent on the Laplace transforms of the state lifetime PDFs. Intuitively, these Laplace transforms give rise to an effective mean time.
D. The exponential mean
To understand the biological significance of the Laplace transform of the lifetime PDF, consider the generalized f-mean (or Kolmogorov mean) where the random variable t is first transformed by function g, an arithmetic mean is performed, and then the inverse function is applied to generate a generalized expectation [29]:
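$$\overline{t} \equiv g^{-1}\!\left( \mathbb{E}\left[ g(t) \right] \right) \tag{43}$$
where $\mathbb{E}[\,\cdot\,]$ is the arithmetic expectation over random variable t. Both the harmonic mean and geometric mean are special cases of this more general formulation. The Laplace transform of the lifetime and age PDFs can be reinterpreted as the expectation of g(t) = exp(−kt) and therefore we can generate the f-means:
$$\overline{\tau}_{\delta j} \equiv -\frac{1}{k} \ln \tilde{p}_{\delta j}(k) \tag{44}$$
$$\overline{\tau}_{j} \equiv -\frac{1}{k} \ln \tilde{p}_{j}(k) \tag{45}$$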
which can be understood as the exponential-mean of the lifetime and age of state j respectively.
Before returning to our model, we will explore the behavior of the exponential mean. Consider the special case of a distribution that is very narrow relative to the growth rate. In this case:
$$\overline{\tau} = \mu_\tau - \frac{k}{2}\, \sigma_\tau^2 + \dots \tag{46}$$
where the exponential mean is equal to the mean ($\mu_\tau$) to the order of the variance ($\sigma_\tau^2$) times the growth rate k. Details of the derivation are given in Appendix A 6. Short-lived states and states with small lifetime-variance will therefore have exponential means equal to the mean. More generally, the Jensen inequality always guarantees the exponential mean is less than or equal to the mean:
$$\overline{\tau} \le \mathbb{E}\left[ \tau \right] \tag{47}$$
since the function g(t) is convex [30].
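The behavior of the exponential mean can be explored directly from a list of observed lifetimes. The sketch below assumes the exponential mean is the generalized mean with g(t) = exp(−kt), i.e. −ln(E[exp(−kt)])/k, and compares it with the arithmetic mean and with the narrow-distribution approximation (mean minus k times half the variance). The lifetimes and the doubling time are illustrative assumptions.

```python
import math
from statistics import fmean, pvariance

def exponential_mean(lifetimes, k):
    """Generalized mean with g(t) = exp(-k*t): -ln(mean(exp(-k*t))) / k."""
    average = fmean([math.exp(-k * t) for t in lifetimes])
    return -math.log(average) / k

lifetimes = [18.0, 19.0, 20.0, 21.0, 22.0]  # illustrative state lifetimes in minutes
k = math.log(2) / 60.0                      # growth rate for an assumed 60-minute doubling time

mean = fmean(lifetimes)
exp_mean = exponential_mean(lifetimes, k)
narrow_approx = mean - 0.5 * k * pvariance(lifetimes)
print(mean, exp_mean, narrow_approx)
```

Because the lifetimes here are narrowly distributed relative to the doubling time, the exponential mean sits only slightly below the arithmetic mean and agrees closely with the approximation.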
Finally, let us consider the consequences of a very wide distribution of lifetimes. Consider a state j in which fraction ϵ of cells arrest (τδj → ∞) while the remaining cells have exponential-mean lifetime $\overline{\tau}'_{\delta j}$. Using Eq. 45, it is straightforward to compute the exponential-mean lifetime:
$$\overline{\tau}_{\delta j} = \overline{\tau}'_{\delta j} - T \log_2 (1 - \epsilon) \tag{48}$$
where the second term acts to extend the lifetime by a positive multiple of the doubling time T. Although the arrested cells do lengthen the exponential-mean lifetime, it remains finite. Eq. 48 is a useful approximation anytime some fraction of the cells have a lifetime much longer than the doubling time even if all lifetimes are finite.
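The effect of an arrested subpopulation can be made concrete with a short calculation. The sketch assumes that arrested cells (infinite lifetime) contribute nothing to the Laplace transform of the lifetime PDF, so a fraction ϵ of arrested cells scales the transform by 1 − ϵ. The function name and the numerical values are illustrative assumptions.

```python
import math

def exponential_mean_with_arrest(tau_bar, eps, T):
    """Exponential-mean lifetime when a fraction eps of cells arrest in the state.

    tau_bar is the exponential-mean lifetime of non-arrested cells and T the doubling time.
    """
    return tau_bar - T * math.log2(1.0 - eps)

T = 60.0        # doubling time in minutes (assumption)
tau_bar = 40.0  # exponential-mean lifetime of non-arrested cells (assumption)
eps = 0.01      # arrested fraction (assumption)

k = math.log(2) / T
transform = (1.0 - eps) * math.exp(-k * tau_bar)  # arrested cells contribute zero
direct = -math.log(transform) / k                 # exponential mean from its definition
closed_form = exponential_mean_with_arrest(tau_bar, eps, T)
lowest_order = tau_bar + eps * T / math.log(2)    # expansion to first order in eps
print(direct, closed_form, lowest_order)
```

Even though the arithmetic-mean lifetime diverges when any cells arrest, the exponential mean increases by less than one minute in this example.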
E. Model correspondence
To determine the differences between the deterministic and stochastic models, we eliminate the Laplace-transformed lifetime PDFs in favor of the exponential-mean lifetimes using their definition (Eq. 45). First consider the consistency condition for exponential growth (Eq. 38). The convolution theorem ensures that the exponential-mean lifetimes of successive states add to generate the age of state j (e.g. the natural logarithm of Eq. 37). Eq. 38 can now be rewritten as a relation between the exponential-mean cell-cycle duration and the doubling time:
$$\overline{T}_\tau = T \tag{49}$$
which is now intuitively equivalent to the consistency condition in the deterministic model (Eq. 10). Details of the derivation are given in Appendix A 7.
Now consider the expressions for state number in the stochastic model (Eqs. 39-41). When the deterministic age τ is evaluated at the exponential-mean stochastic age, the numbers are identical in the two models:
(50) |
(51) |
(52) |
(53) |
We therefore conclude that the statistics of the deterministic and stochastic models are identical in an exponential culture for models with the same growth rate k, once a correspondence has been established between states j and ages τ. In the deterministic model, state j corresponds to the interval of ages between the exponential-mean ages at which cells enter and leave state j. This correspondence is illustrated schematically in Fig. 4. Since we have demonstrated a correspondence between the models, almost all the applications discussed in Sec. II B generalize by replacing the deterministic time τ with the corresponding exponential mean [31].
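The correspondence can be illustrated for a concrete sequential-state model. The sketch below assumes three states with gamma-distributed lifetimes (Laplace transform (1 + θλ)^(−a)), solves the growth condition 2·p̃(k) = 1 by bisection, and then checks that the exponential-mean lifetimes sum to the doubling time. The state fractions are computed by evaluating the deterministic age-range formula at the exponential-mean ages. All parameter values and names are illustrative assumptions.

```python
import math

# (shape, scale) of the gamma lifetime of each sequential state; values are illustrative.
states = [(4.0, 2.5), (16.0, 2.5), (9.0, 2.0)]

def state_transform(shape, scale, lam):
    """Laplace transform of a gamma lifetime PDF."""
    return (1.0 + scale * lam) ** (-shape)

def cycle_transform(lam):
    """Transform of the cell-cycle duration: product over sequential states."""
    return math.prod(state_transform(a, s, lam) for a, s in states)

# Bisection for the growth rate k satisfying 2 * cycle_transform(k) = 1.
lo, hi = 0.0, 1.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if 2.0 * cycle_transform(mid) > 1.0:
        lo = mid
    else:
        hi = mid
k = 0.5 * (lo + hi)
T = math.log(2) / k

# Exponential-mean lifetimes, and the ages at which each state is exited.
lifetimes = [-math.log(state_transform(a, s, k)) / k for a, s in states]
exit_ages = [sum(lifetimes[: j + 1]) for j in range(len(states))]
entry_ages = [0.0] + exit_ages[:-1]

# Fraction of cells in each state from the deterministic age-range formula.
fractions = [2.0 * (2.0 ** (-a0 / T) - 2.0 ** (-a1 / T))
             for a0, a1 in zip(entry_ages, exit_ages)]
print(T, lifetimes, fractions)
```

Each exponential-mean lifetime is shorter than the corresponding arithmetic mean (a·θ), and the fractions sum to one precisely because the final exit age equals the doubling time.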
F. Implications for cell cycle phenomenology
To explore the nontrivial consequences of stochasticity in timing, consider an example motivated by replication conflicts [32]: By visualizing the replisome dynamics using single-molecule microscopy, we have recently reported that transcription leads to pervasive replisome instability [24]. To what extent should conflict-induced pauses in replication have been detectable in the classic analyses of unsynchronized cell populations?
Consider a simplified model in which an experiment probes the difference between the wildtype strain W and two mutant strains. The wildtype W grows with deterministic C period CW and deterministic cell cycle duration TW. In mutant strain A(rrest), a small fraction ϵ of cells arrest during replication (i.e. C period) and never complete the cell cycle, whereas non-arrested cells are identical to wildtype cells. In mutant strain S(low), the replication process is 1 + ϵ′ times slower, but the B and D periods are identical to wildtype. Using Eq. 48, one can compute the C period duration C and doubling time T. To lowest order in ϵ, the inferred cell cycle durations and C period are presented in Tab. I.
TABLE I.
Strain | Doubling Time: T | C period: C |
---|---|---|
Wildtype | TW | CW |
Mutant A | TW + ϵ TW / ln 2 | CW + ϵ TW / ln 2 |
Mutant S | TW + ϵ′ CW | CW + ϵ′ CW |
At an intuitive level, one aspect of the prediction is easy to understand: In both mutants the C period is lengthened, as one might naïvely expect since this is the replication period of the cell cycle. Furthermore, the doubling time increases by the same amount as the C period increases. But there is an aspect of this prediction which is perhaps less intuitive: One might naïvely expect to observe a more dramatic consequence of replication arrest, like a large buildup of C period cells, but the consequences are indistinguishable from a slowdown in an exponential culture. In both mutants, there is a slight lengthening of the inferred C period, even though the slowdown is caused by replication arrest in the context of the A mutant. Although this prediction is not new in a qualitative sense, it concisely illustrates how the statistics of the exponential culture mask two mechanistically distinct phenomena.
The statistics of an exponential culture can also generate distinctions where seemingly none exist. Consider a more realistic model in which the duration of the D period is stochastic, has a non-zero width, and is identical for all three strains. The more rapid growth rate of the wildtype strain implies that its effective D period is shorter than for mutant cells:
$$D_W < D_A,\; D_S \tag{54}$$
even though the distributions of the D period durations are identical in all three strains. (To understand how this occurs, see the second term on the RHS of Eq. 46.) In most cases this effect should be subtle, but for large changes in growth rate, these changes could be quite significant and can clearly complicate the interpretation of effective period lifetimes in an exponential culture.
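The dependence of the effective D period on growth rate can be sketched for a specific lifetime distribution. The example assumes a gamma-distributed D period, for which the exponential mean has the closed form a·ln(1 + θk)/k with shape a and scale θ; the parameter values and doubling times are illustrative assumptions.

```python
import math

def effective_period(shape, scale, T):
    """Exponential mean of a gamma(shape, scale) lifetime at doubling time T."""
    k = math.log(2) / T
    return shape * math.log1p(scale * k) / k

shape, scale = 4.0, 5.0  # identical D-period distribution in all strains: mean 20 min, s.d. 10 min
arithmetic_mean = shape * scale

d_fast = effective_period(shape, scale, 40.0)  # faster-growing strain
d_slow = effective_period(shape, scale, 80.0)  # slower-growing strain
print(d_fast, d_slow, arithmetic_mean)
```

The same underlying distribution yields a shorter effective D period in the faster-growing strain, and both values lie below the arithmetic mean.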
III. DISCUSSION
In this paper we provide a detailed analysis of both deterministic and stochastic models of the cell cycle. In Sec. II A, we solved the deterministic model in which the cell-aging and division processes are precisely timed and determined the demographics (i.e. statistics) of an exponential culture. Given a set of observed demographics, Sec. II B provides a detailed road map for how to infer cell cycle state timing in the context of the deterministic model. In Sec. II C, we solved the more realistic stochastic model in which the lifetimes of sequential states are stochastic and again we determined the demographics of an exponential culture. By defining an exponential mean in Sec. II D, we demonstrated that the statistics of the two models were equivalent in Sec. II E. The effective lifetime of states in the deterministic model is the exponential mean of the lifetimes in the stochastic model. That is to say that the exponential-mean lifetimes are the sufficient statistics of the model (e.g. [33]): Knowledge of only these lifetime statistics predicts the demographics of the exponential culture; therefore, inference on exponential-culture demographics infers only the exponential means, rather than the underlying lifetime distributions themselves. Finally in Sec. II F, we discussed some of the limitations of the exponential-mean lifetimes in resolving the underlying biological mechanisms.
A. Applicability of the stochastic model
Is the stochastic model sufficiently complex to capture all the relevant cell cycle phenomenology in E. coli and other bacterial systems? Like the deterministic model before it, the stochastic model is an idealized model that is simple enough to be tractable analytically, but complex enough to capture some important phenomenology. There are a number of shortcomings of this model but perhaps the most significant is that there is no memory beyond the cell state index j. As a consequence, it makes predictions at variance with some observed phenomenology: For instance, the stochastic model must predict that successive cell cycle durations are uncorrelated; however, these correlations are observed [4, 34]. (We briefly consider the implications of a more general model in Appendix A 8.) Another important limitation of the stochastic model is that cell divisions are symmetric, which is a good approximation in E. coli, but these types of stochastic models can easily be extended to the general asymmetric division case (e.g. [28]).
B. On the applicability of the exponential mean
Although the definition of the exponential mean was motivated by the correspondence between the deterministic and stochastic models, it almost certainly has much greater applicability to other more complicated scenarios. For instance, our own numerical experiments using more complex models suggest that the relation between the effective lifetime of the states and the exponential-mean lifetime appears to be more robust than the assumptions of the stochastic model might imply. Since the key mechanism for generating bias toward short times is steady-state exponential growth, we expect the exponential mean of wait times to be the determinative statistic in more general models, as demonstrated in Appendix A 8. As such, the exponential-mean lifetime could be a powerful observable to bridge timescales between single-cell and culture phenomenology in two different contexts: (i) in experiments probing cell cycle dynamics at the single-cell level and (ii) in complex numerical simulations that are too slow and too memory intensive to simulate in the long time limit [35].
We should note that although we believe our interpretation of the doubling time as an exponential mean (Eq. 49) is novel, it has already been appreciated in two important respects: (i) From a computational perspective, the Laplace-transform formulation (Eq. 38) of Eq. 49 has long been known [27]. (ii) From a qualitative perspective, biologists have long understood the consequences of the exponential-mean lifetime on cell growth rate, i.e. the doubling time T is “an average” of the cell cycle duration Tτ; however, a small arrested subpopulation, for whom Tτ → ∞, slows but does not stop growth. There is also physical precedent for this type of mean: Intriguingly, it emerges in the context of non-equilibrium statistical mechanics [36, 37], although what connection this has to our cellular dynamics is opaque.
C. On the significance of stochasticity
How does stochasticity affect biological function? Experimentally, we have long known that although the statistics of an exponentially growing population are well described by the deterministic model [2], nontrivial stochasticity in cell cycle timing is observed [4, 5]. It is therefore tempting to conclude, based on the literature and perhaps even our own results, that stochasticity is either small or simply does not significantly affect biological function.
Our own conclusions are much more nuanced. Although our results guarantee that the deterministic model fits the exponential-culture demographic data just as well as the stochastic model, we have demonstrated that the stochasticity in timing is hidden in plain sight. The distributions of state lifetimes determine the exponential means. Therefore, the success of the deterministic model should not be interpreted as evidence against stochasticity or against its importance, but rather it indicates that only the exponential-mean state lifetimes are determinative parameters in the model for the demographics of an exponential culture.
Perhaps more than anything else, the exact correspondence between the deterministic and stochastic models emphasizes the need for synchronized single-cell measurements: In Sec. II F, we illustrated (i) how similarities in the effective duration of the C period could obscure distinct biological mechanisms and (ii) how differences in the effective D period could belie an identical mechanism.
At a mechanistic level, stochasticity plays a central role in many processes. For instance, the mechanism that restarts replication will prevent the existence of a fat tail on the distribution of C periods [24, 32, 38]. Although the existence of the fat tail (i.e. a small number of cells with very long C periods) does not break the correspondence with the deterministic model, it does increase the exponential-mean C period, which in turn decreases the growth rate. (See, e.g., Tab. I.) Since the growth rate is decreased, there is a strong selective pressure to reduce stochasticity. This argument predicts the existence of biological mechanisms to reduce stochasticity, as are already known in many contexts (e.g. replication restart). In fact, the subtle signature of stochasticity suggests an interesting hypothesis: a significant number of mutants that are currently known to reduce growth rate may in fact generate this phenotype by increasing the level of stochasticity in the cell cycle duration. Single-cell experimental analysis must play a central role in understanding these phenomena.
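The fat-tail argument can be made concrete with a short sketch. It assumes a hypothetical two-point distribution of C periods (a typical value plus a rare, much longer "stalled" value) and, as a simplification, holds the growth rate k fixed while the tail weight is varied. The exponential mean is taken in the form −(1/k) ln⟨exp(−kt)⟩; all names and numbers are illustrative.

```python
import math

def exponential_mean(values, weights, k):
    """Exponential mean -(1/k) * ln(sum_i w_i * exp(-k * t_i)) of a discrete PDF."""
    total = sum(weights)
    laplace = sum(w * math.exp(-k * t) for t, w in zip(values, weights)) / total
    return -math.log(laplace) / k

def arithmetic_mean(values, weights):
    """Ordinary weighted mean, for comparison."""
    return sum(w * t for t, w in zip(values, weights)) / sum(weights)

k = math.log(2.0) / 60.0            # hypothetical growth rate (60 min doubling)
c_typical, c_stalled = 40.0, 400.0  # hypothetical C periods, in minutes
for eps in (0.0, 0.01, 0.05):
    periods, weights = [c_typical, c_stalled], [1.0 - eps, eps]
    print(f"tail weight {eps:4.2f}: "
          f"exponential mean {exponential_mean(periods, weights, k):6.2f} min, "
          f"arithmetic mean {arithmetic_mean(periods, weights):6.2f} min")
```

Under these assumptions the tail raises the exponential-mean C period above the typical value, but by much less than it raises the arithmetic mean, because long lifetimes are exponentially down-weighted.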
ACKNOWLEDGMENTS
PAW acknowledges advice and comments from M. Cosentino-Lagomarsino, S. Iyer-Biswas, P. Levine, J. Mittler, R. Phillips, M. Transtrum, B. Traxler, I.M. Shelby, and H.K. Choi. This work was supported by NIH grant R01-GM128191.
Appendix A: Supplemental derivations
1. Derivation of the rate equation in the deterministic model
To obtain the cumulative creation and annihilation numbers in terms of the number density (Eq. 2), we integrate the number density at fixed age τ over all time t to obtain the cumulative number of cells that have ever entered (creation) or left (annihilation) age τ. They are equivalent due to the continuous nature of the deterministic model. If states were discrete, as in the stochastic model, then the cumulative creation and annihilation numbers would differ by the number of cells currently in the state τ.
To obtain Eq. 4 from Eq. 3, we divide both sides of Eq. 3 by δt δτ and replace with the equivalent , which leaves:
(A1) |
Taking the limit as δτ goes to 0 and using the definition of a derivative, we are left with:
(A2) |
which is Eq. 4 in the main text. Eqs. 5-6 follow from taking the partial time derivative of Eq. 2 and taking into account the consistency condition:
(A3) |
Conceptually, this consistency condition describes how cell division at age Tτ leads to twice as many daughter cells of age τ = 0.
2. Derivation of the solution in the deterministic model
In the deterministic model, we can assume that steady-state growth of the population is represented by an exponentially increasing time-dependence factor, $e^{kt}$, with a constant unknown growth rate k. This assumption holds in the long time limit, since only the fastest growing mode remains in exponential growth, while all others (smaller k) are diluted out. We thus stipulate that in the deterministic model, all cellular quantities must grow with this same time dependence. The number density is then a solution of the form:
$$n_\tau(t) = n_\tau(0)\, e^{kt}, \tag{A4}$$
where $n_\tau(0)$ represents the initial age distribution at t = 0. Plugging this into Eq. 7 for the τ > 0 case yields:
$$\frac{\partial n_\tau(0)}{\partial \tau} = -k\, n_\tau(0). \tag{A5}$$
This can then be integrated to yield the solution:
$$n_\tau(0) = n_0\, e^{-k\tau}, \tag{A6}$$
$$n_\tau(t) = n_0\, e^{k(t-\tau)}, \tag{A7}$$
where $n_0$ is a constant determined by the initial cell number. This equation appears in the main text as Eq. 11. To satisfy the τ = 0 case of Eq. 7, we must use the consistency condition (Eq. A3):
$$n_{\tau=0}(t) = 2\, n_{\tau=T_\tau}(t), \tag{A8}$$
$$n_0\, e^{kt} = 2\, n_0\, e^{k(t-T_\tau)}. \tag{A9}$$
Dividing both sides by $n_0 e^{kt}$ and solving for $T_\tau$ gives:
$$T_\tau = k^{-1} \ln 2. \tag{A10}$$
Therefore, the doubling time defined in Eq. 9 is equivalent to the cell cycle duration, as one would naïvely expect. Furthermore, this equation relates the growth rate k, a population measure, to the cell cycle duration Tτ, a single-cell measure.
3. Derivation of the PDF and CDF in the deterministic model
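To obtain the probability distribution function, $f_\tau(\tau)$, with respect to cell age (Eq. 12), we must normalize the number density at any fixed time t:
$$f_\tau(\tau) = \frac{n(\tau)}{\int_0^{T_\tau} d\tau'\, n(\tau')}, \tag{A11}$$
where $n(\tau) = n(\tau = 0)\, e^{-k\tau}$, which is just Eq. 11 with the fixed t factor absorbed into n(τ = 0). Evaluating the integral and replacing n(τ) with the expanded form yields:
$$f_\tau(\tau) = \frac{n(\tau = 0)\, e^{-k\tau}}{n(\tau = 0)\, k^{-1} \left(1 - e^{-kT_\tau}\right)} \tag{A12}$$
$$= \frac{k\, e^{-k\tau}}{1 - e^{-kT_\tau}}. \tag{A13}$$
Now recall that $T_\tau = T \equiv k^{-1} \ln 2$, from Eqs. 9-10. Plugging this $T_\tau$ into Eq. A13 gives Eq. 12:
$$f_\tau(\tau) = 2k\, e^{-k\tau}, \qquad 0 \le \tau \le T. \tag{A14}$$
To obtain the CDF, integrate with respect to τ from 0 to τ:
$$F_\tau(\tau) = \int_0^{\tau} d\tau'\, f_\tau(\tau') \tag{A15}$$
$$= 2\left(1 - e^{-k\tau}\right). \tag{A16}$$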
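As a numerical check on the deterministic results, the following sketch assumes the standard exponential-culture age density 2k exp(−kτ) on 0 ≤ τ ≤ T with T = ln 2 / k, i.e. the form implied by Eq. 11 together with the doubling-time relation. Initial ages are placed at evenly spaced quantiles of the corresponding CDF, every cell divides at exactly age T, and the resulting population size is compared with exp(kt). The function names and the choice T = 60 min are hypothetical.

```python
import math

def steady_state_ages(n_cells, t_cycle):
    """Ages at evenly spaced quantiles of the CDF F(tau) = 2 * (1 - exp(-k * tau))."""
    k = math.log(2.0) / t_cycle
    return [-math.log(1.0 - (i + 0.5) / (2.0 * n_cells)) / k
            for i in range(n_cells)]

def population_size(initial_ages, t_cycle, t):
    """Number of cells at time t if every cell divides at exactly age t_cycle."""
    return sum(2 ** math.floor((age + t) / t_cycle) for age in initial_ages)

t_cycle = 60.0  # hypothetical cell cycle duration, in minutes
ages = steady_state_ages(10_000, t_cycle)
for t in (0.0, 25.0, 60.0, 150.0):
    simulated = population_size(ages, t_cycle, t) / len(ages)
    predicted = 2.0 ** (t / t_cycle)  # exp(k * t) with k = ln(2) / t_cycle
    print(f"t = {t:6.1f} min: simulated N(t)/N(0) = {simulated:.4f}, "
          f"exp(kt) = {predicted:.4f}")
```

With this initial age distribution the population grows smoothly as exp(kt) even though each lineage divides in discrete steps; a different initial distribution (e.g. all cells at age zero) would instead produce a staircase.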
4. Derivation of the cumulative creation number in terms of N(t)
Consider the cumulative creation number Eq. 2 evaluated at τ = 0:
(A17) |
which is double the current number of cells N. The factor of two arises due to the cumulative nature of the creation number. To understand this intuitively, consider the total number of cells in each generation:
$$N + \frac{N}{2} + \frac{N}{4} + \frac{N}{8} + \dots \tag{A18}$$
which is a geometric series and can be summed to 2N, matching Eq. A17. We can now multiply by the τ-dependence term, e−kτ, to obtain:
(A19) |
More formally, we can use direct integration of Eq. 11:
(A20) |
(A21) |
(A22) |
where c is an integration constant. We now integrate nτ(t) over all τ to obtain the total number of cells at time t:
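$$N(t) = \int_0^{T_\tau} d\tau\, n_\tau(t) \tag{A23}$$
$$= \frac{n_0}{k}\, e^{kt} \left(1 - e^{-kT_\tau}\right) \tag{A24}$$
$$= \frac{n_0}{2k}\, e^{kt}. \tag{A25}$$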
Combining this with the consistency condition Eq. A17, we get:
(A26) |
Setting this equal to Eq. A22 allows us to set the integration constant c = 0. Eq. A21 then becomes Eq. A19 from above, which is Eq. 14 from the main text.
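The factor of two between the cumulative creation number at τ = 0 and the current cell number can be checked with a toy count. The sketch assumes a single founder cell and perfectly synchronous doublings, which is a simplification of the asynchronous culture treated above; the function name is hypothetical.

```python
def cumulative_births_ratio(generations):
    """Ratio of cells ever created to cells currently alive after synchronous doublings."""
    alive = 1          # the founder cell
    ever_created = 1   # the founder counts as one creation event
    for _ in range(generations):
        alive *= 2              # every cell is replaced by two newborn daughters
        ever_created += alive   # each daughter is a new creation event at age zero
    return ever_created / alive

for g in (0, 1, 5, 30):
    print(f"after {g:2d} generations: created / alive = {cumulative_births_ratio(g):.6f}")
```

The ratio equals 2 − 2^(−g) after g generations, so it approaches 2 in the long time limit, in line with the geometric-series argument.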
5. Derivation of the solution in the stochastic model
Clearly, Eqs. 33 and 34 can be combined recursively to generate a relation between the number of cells entering states 1 and j. First let us define the state-transition time PDF pj, describing the total time taken to transition from birth through state j:
(A27) |
(A28) |
where p is the lifetime PDF for the entire cell cycle. We then write an expression of the number arriving in state j:
(A29) |
As before, using this same condition at the end of the cell cycle gives rise to a consistency condition:
(A30) |
It follows that in steady-state exponential growth, the growth rate k must correspond to the solution to the equation:
$$2 \int_0^\infty dt\, p(t)\, e^{-kt} = 1, \tag{A31}$$
an equation that is well known [27].
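As an illustration of how the growth rate follows from the lifetime PDF, the sketch below solves the condition that twice the Laplace transform of the cell-cycle lifetime PDF equals one. It assumes, purely for illustration, a gamma-distributed cell cycle duration, for which the Laplace transform is (1 + k·scale)^(−shape) and the root is available in closed form; the bisection routine and all parameter values are hypothetical choices.

```python
import math

def laplace_gamma(k, shape, scale):
    """Laplace transform E[exp(-k * t)] of a gamma(shape, scale) lifetime PDF."""
    return (1.0 + k * scale) ** (-shape)

def solve_growth_rate(laplace, k_hi=10.0, tol=1e-12):
    """Bisection for the k > 0 satisfying 2 * laplace(k) = 1."""
    k_lo = 0.0
    while k_hi - k_lo > tol:
        k_mid = 0.5 * (k_lo + k_hi)
        if 2.0 * laplace(k_mid) > 1.0:
            k_lo = k_mid  # transform still too large, so the root lies above k_mid
        else:
            k_hi = k_mid
    return 0.5 * (k_lo + k_hi)

mean, cv = 60.0, 0.2          # hypothetical mean (minutes) and coefficient of variation
shape = 1.0 / cv ** 2
scale = mean / shape
k_numeric = solve_growth_rate(lambda k: laplace_gamma(k, shape, scale))
k_closed = (2.0 ** (1.0 / shape) - 1.0) / scale  # closed form for the gamma case
print(f"numeric k = {k_numeric:.6f} /min, closed form k = {k_closed:.6f} /min")
print(f"doubling time {math.log(2.0) / k_numeric:.2f} min vs. mean lifetime {mean:.2f} min")
```

For this example the doubling time comes out slightly shorter than the arithmetic-mean lifetime, as expected when short lifetimes are weighted more heavily.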
Let N≤j be the total number of cells in states i = 1…j. The dynamics of this quantity has a simple form due to the telescoping form of the dynamics equations (Eqs. 32-34), where the number entering the ith state exactly cancels the number leaving the (i − 1)th state:
(A32) |
(A33) |
To determine the overall normalization, we can sum up the cells in all states and set that sum equal to the total number of cells N(t):
(A34) |
From N≤j, we can compute the number in individual states:
(A35) |
(A36) |
In the long time limit, the fastest growing mode dominates the solution and therefore:
(A37) |
(A38) |
(A39) |
(A40) |
which also appear in the main text.
6. The exponential mean of a very narrow distribution
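To obtain Eq. 46, we begin with the exponential mean, written here as $\overline{t}$ for a lifetime PDF p(t):
$$\overline{t} \equiv -\frac{1}{k} \ln \int_0^\infty dt\, p(t)\, e^{-kt}, \tag{A41}$$
which is obtained by using the function g(t) = exp(−kt) in the generalized expectation equation (Eq. 43). We then use a series expansion of the exponential term, $e^{-kt} = 1 - kt + \tfrac{1}{2}k^2t^2 - \dots$, which yields:
$$\overline{t} = -\frac{1}{k} \ln \int_0^\infty dt\, p(t) \left[1 - kt + \tfrac{1}{2}k^2t^2 - \dots\right] \tag{A42}$$
$$= -\frac{1}{k} \ln \left[1 - k\langle t\rangle + \tfrac{1}{2}k^2\langle t^2\rangle - \dots\right], \tag{A43}$$
where $\langle\cdot\rangle$ denotes the arithmetic mean over p(t). We then use the series expansion $\ln(1+x) = x - \tfrac{1}{2}x^2 + \dots$, keeping only terms up to second order in k, since the distribution is very narrow:
$$\overline{t} \approx -\frac{1}{k} \left[-k\langle t\rangle + \tfrac{1}{2}k^2\langle t^2\rangle - \tfrac{1}{2}k^2\langle t\rangle^2\right] \tag{A44}$$
$$= \langle t\rangle - \frac{k}{2}\left(\langle t^2\rangle - \langle t\rangle^2\right). \tag{A45}$$
Using the definition of the variance, $\sigma^2 \equiv \langle t^2\rangle - \langle t\rangle^2$, we obtain Eq. 46:
$$\overline{t} \approx \langle t\rangle - \frac{k}{2}\,\sigma^2. \tag{A46}$$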
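The narrow-distribution limit can be checked numerically. The sketch below assumes a gamma-distributed state lifetime (an illustrative choice, not one made in the derivation), for which the exponential mean −(1/k) ln⟨exp(−kt)⟩ has the closed form (shape/k) ln(1 + k·scale), and compares it with the second-order approximation mean − k·variance/2. Function names and parameter values are hypothetical.

```python
import math

def exp_mean_gamma(k, mean, cv):
    """Exact exponential mean of a gamma lifetime PDF with given mean and CV."""
    shape = 1.0 / cv ** 2
    scale = mean / shape
    return shape * math.log1p(k * scale) / k

def exp_mean_narrow(k, mean, cv):
    """Second-order (narrow-distribution) approximation: mean - k * variance / 2."""
    variance = (cv * mean) ** 2
    return mean - 0.5 * k * variance

k = math.log(2.0) / 60.0  # hypothetical growth rate (60 min doubling)
mean = 40.0               # hypothetical mean state lifetime, in minutes
for cv in (0.05, 0.1, 0.2, 0.5):
    exact = exp_mean_gamma(k, mean, cv)
    approx = exp_mean_narrow(k, mean, cv)
    print(f"CV = {cv:4.2f}: exact {exact:8.4f}, approx {approx:8.4f}, "
          f"difference {exact - approx:+.4f} min")
```

The two values agree closely for small coefficients of variation and separate as the distribution broadens, which marks where the expansion ceases to be reliable.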
7. Derivation of the consistency condition in the stochastic model
In the deterministic model, Eq. 10 is a consistency condition that describes the naïve expectation that the duration of the cell cycle is equal to the doubling time of the population:
$$T_\tau = T. \tag{A47}$$
In the stochastic model, the consistency condition in terms of the Laplace transform is given by Eq. 38:
(A48) |
However, the mathematical equivalence is opaque for the moment. To make the equivalence clear, we use the exponential mean Eq. 45:
(A49) |
along with the relation between the PDF of lifetimes of individual state j and the PDF of times taken to transition from birth through state j:
(A50) |
which is Eq. 37 in the main text. Combining these two equations and letting j be the final state m, we obtain the exponential mean of the stochastic cell cycle duration:
(A51) |
(A52) |
If we use the consistency condition Eq. 38, we can also write:
(A53) |
(A54) |
where the second equality came from the definition of the doubling time (Eq. 9 in the main text). We thus recover Eq. 49 from the main text:
(A55) |
which corresponds to the consistency condition in the deterministic model, Eq. 10.
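The step that turns the Laplace-transform condition into a statement about exponential means relies on the fact that the Laplace transform of a convolution is a product, so exponential means of independent sequential state lifetimes add. The sketch below checks this additivity for two hypothetical discrete lifetime distributions; independence of the two lifetimes, the function names, and all numbers are assumptions made for illustration.

```python
import math
from itertools import product

def exponential_mean(dist, k):
    """Exponential mean -(1/k) * ln(sum_t p(t) * exp(-k * t)); dist maps lifetime -> probability."""
    return -math.log(sum(p * math.exp(-k * t) for t, p in dist.items())) / k

def convolve(dist_a, dist_b):
    """Distribution of the total lifetime of two independent sequential states."""
    total = {}
    for (t_a, p_a), (t_b, p_b) in product(dist_a.items(), dist_b.items()):
        total[t_a + t_b] = total.get(t_a + t_b, 0.0) + p_a * p_b
    return total

k = math.log(2.0) / 60.0          # hypothetical growth rate, per minute
state_1 = {20.0: 0.5, 30.0: 0.5}  # hypothetical lifetime PDF of the first state
state_2 = {10.0: 0.7, 40.0: 0.3}  # hypothetical lifetime PDF of the second state

combined = exponential_mean(convolve(state_1, state_2), k)
summed = exponential_mean(state_1, k) + exponential_mean(state_2, k)
print(f"exponential mean of total lifetime: {combined:.6f} min")
print(f"sum of state exponential means:     {summed:.6f} min")
```

When k is the growth rate that satisfies the consistency condition, the summed value over all states of the cell cycle is the doubling time, mirroring the deterministic relation.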
8. A generalization of the stochastic model
Like the deterministic model before it, it seems almost certain that the phenomenology of the stochastic model is more general than some of the assumptions made to motivate and derive it. In particular, the qualitative mechanism that makes the exponential-mean of state lifetime the determinative statistic would seem to depend only on the exponential enrichment of young cells in an exponential culture and not on the details of the sequential state structure of the stochastic model. We therefore offer a slightly more general derivation below.
In the generalized model, assume only that state or object j is created with wait time distribution p′ relative to the birth of a new cell and assume steady-state growth at rate k. Under these assumptions, replaces in Eqs. A29 and A40, even if k is not determined by Eq. A31 due to memory effects. Therefore, most of our results generalize in this new model if the suitable PDFs for the wait times replace the pj’s.
References
- [1] Willis L and Huang KC, Nat Rev Microbiol 15, 606 (2017).
- [2] Cooper S and Helmstetter CE, J Mol Biol 31, 519 (1968).
- [3] Bremer H and Churchward G, J Theor Biol 69, 645 (1977).
- [4] Wang P, Robert L, Pelletier J, Dang WL, Taddei F, Wright A, and Jun S, Curr Biol 20, 1099 (2010).
- [5] Robert L, Hoffmann M, Krell N, Aymerich S, Robert J, and Doumic M, BMC Biol 12, 17 (2014).
- [6] Wang X, Possoz C, and Sherratt DJ, Genes Dev 19, 2367 (2005).
- [7] Withers HL and Bernander R, J Bacteriol 180, 1624 (1998).
- [8] Rudolph CJ, Upton AL, Stockum A, Nieduszynski CA, and Lloyd RG, Nature 500, 608 (2013).
- [9] Bates D, Epstein J, Boye E, Fahrner K, Berg H, and Kleckner N, Mol Microbiol 57, 380 (2005).
- [10] Kuwada NJ, Traxler B, and Wiggins PA, Mol Microbiol 95, 64 (2015).
- [11] Adams DW and Errington J, Nat Rev Microbiol 7, 642 (2009).
- [12] The exact degree to which this is reduced depends on the ratio of τz/T as discussed below.
- [13] Lo T, Huang D, Merrikh H, and Wiggins PA, In preparation.
- [14] Stewart EJ, Madden R, Paul G, and Taddei F, PLoS Biol 3, e45 (2005).
- [15] The naming of the cumulative creation and annihilation numbers was motivated in relation to the creation and annihilation operators from quantum field theory.
- [16] Du S and Lutkenhaus J, Trends Microbiol 27, 781 (2019).
- [17] Ma X, Ehrhardt DW, and Margolin W, Proc Natl Acad Sci U S A 93, 12998 (1996).
- [18] Niki H, Yamaichi Y, and Hiraga S, Genes Dev 14, 212 (2000).
- [19] Lau IF, Filipe SR, Søballe B, Økstad O-A, Barre F-X, and Sherratt DJ, Mol Microbiol 49, 731 (2003).
- [20] Bird RE, Louarn J, Martuscelli J, and Caro L, J Mol Biol 70, 549 (1972).
- [21] Pritchard RH, Chandler MG, and Collins J, Mol Gen Genet 138, 143 (1975).
- [22] Lemon KP and Grossman AD, Mol Cell 6, 1321 (2000).
- [23] Wallden M, Fange D, Lundius EG, Baltekin Ö, and Elf J, Cell 166, 729 (2016).
- [24] Mangiameli SM, Merrikh CN, Wiggins PA, and Merrikh H, Elife 6 (2017), 10.7554/eLife.19848.
- [25] Mangiameli SM, Veit BT, Merrikh H, and Wiggins PA, PLoS Genet 13, e1006582 (2017).
- [26] Wiggins PA, Phillips R, and Nelson PC, Phys Rev E Stat Nonlin Soft Matter Phys 71, 021909 (2005).
- [27] Powell E, Microbiology 15 (1956).
- [28] Jafarpour F, Wright CS, Gudjonson H, Riebling J, Dawson E, Lo K, Fiebig A, Crosson S, Dinner AR, and Iyer-Biswas S, Phys. Rev. X 8, 021007 (2018).
- [29] Kolmogorov A, Atti Accad. Naz. Lincei 12, 388 (1930).
- [30] Jensen JLWV, Acta Mathematica 30, 175 (1906).
- [31] The exception are the replicating cell statistics Nrep cell. In this case, the implicit assumption that states are sequential cannot always be implemented in the stochastic model. For instance, consider the re-initiation of replication at the quarter cell positions late in the cell cycle in rapidly proliferating cells, as illustrated in Fig. 1A. If these events were modeled as independent, you will first see one replisome initiate at one quarter cell position and then the other. In this case one must compute the exponential means using the order statistics (e.g. [33]).
- [32] Merrikh H, Zhang Y, Grossman AD, and Wang JD, Nat Rev Microbiol 10, 449 (2012).
- [33] Cox DR and Hinkley DV, Theoretical Statistics (Chapman & Hall, 1974).
- [34] Lin J and Amir A, Cell Systems 5 (2017), 10.1016/j.cels.2017.08.015.
- [35] Personal communication from S. Iyer-Biswas.
- [36] Jarzynski C, Phys. Rev. Lett. 78, 2690 (1997).
- [37] Personal communication from S. Iyer-Biswas.
- [38] Merrikh H, Machón C, Grainger WH, Grossman AD, and Soultanas P, Nature 470, 554 (2011).