Skip to main content
Journal of the Royal Society Interface logoLink to Journal of the Royal Society Interface
. 2020 Jul 8;17(168):20200360. doi: 10.1098/rsif.2020.0360

Effects of cell cycle variability on lineage and population measurements of messenger RNA abundance

Ruben Perez-Carrasco 1,2,, Casper Beentjes 3, Ramon Grima 4,
PMCID: PMC7423421  PMID: 32634365

Abstract

Many models of gene expression do not explicitly incorporate a cell cycle description. Here, we derive a theory describing how messenger RNA (mRNA) fluctuations for constitutive and bursty gene expression are influenced by stochasticity in the duration of the cell cycle and the timing of DNA replication. Analytical expressions for the moments show that omitting cell cycle duration introduces an error in the predicted mean number of mRNAs that is a monotonically decreasing function of η, which is proportional to the ratio of the mean cell cycle duration and the mRNA lifetime. By contrast, the error in the variance of the mRNA distribution is highest for intermediate values of η consistent with genome-wide measurements in many organisms. Using eukaryotic cell data, we estimate the errors in the mean and variance to be at most 3% and 25%, respectively. Furthermore, we derive an accurate negative binomial mixture approximation to the mRNA distribution. This indicates that stochasticity in the cell cycle can introduce fluctuations in mRNA numbers that are similar to the effect of bursty transcription. Finally, we show that for real experimental data, disregarding cell cycle stochasticity can introduce errors in the inference of transcription rates larger than 10%.

Keywords: cell cycle, mRNA distribution, DNA replication, Master Equation, negative binomial

1. Introduction

Intrinsic noise in gene expression induces variability in the transcript number across a population of cells. Current microscopy techniques are able to capture this variability, which can be used to infer the kinetic parameters of transcription, thereby letting us quantify mechanisms in charge of the regulation of gene expression [13]. In order to make this inference possible, it is necessary to have an accurate stochastic dynamical model that is able to relate the details of the messenger RNA (mRNA) number distribution to the different transcriptional and post-transcriptional molecular mechanisms involved in mRNA processing. This has been extensively done by describing the dynamics of the system by means of the Master Equation, a Markovian description whose solution gives the probability of observing a certain number of mRNAs in a cell at a certain time [4]. Because the exact analytical solution of the Master Equation is only available for a few scenarios (e.g. [57]), the study of the probability distribution of mRNA transcript number is usually limited to calculating the moments of the distribution.

One particular mechanism that has been difficult to study analytically is the influence of the cell cycle on the distribution of mRNAs in a population of cells. The duration of the different phases of the cell cycle is stochastic, introducing noise not only in the time of mitosis when the molecular content is diluted but also in the time at which DNA is replicated, which in turn increases the mRNA production rate [3]. In addition, during mitosis, the cellular content is divided, leading to a stochastic transcript bipartition [8].

Owing to these different challenges, mathematical effort has been focused on limit cases, such as when the cell cycle duration is considered constant [6,7,9], or when DNA replication is omitted [10,11]. Other studies have considered the effect of the cell cycle on protein fluctuations [5,1214]; the analysis in this case is simplified because unlike mRNA, protein lifetimes are very long and hence degradation is mostly owing to dilution at cell division.

In addition, there are other factors beyond details of the cell cycle progression that can have a profound influence on transcript fluctuations. The symmetry of cellular division affects the number of transcripts in a cellular population. For instance, in a growing proliferating tissue, the continuous exponential appearance of young cells in a population introduces an asymmetry in the population cell age, favouring the proportion of cells at early stages of their cell cycle. This contrasts with the age structure of a homeostatic population where it is expected to find the cells equally distributed along their cell cycle [15,16]. Because the average number of mRNAs in a cell increases with the time position in the cell cycle, we expect to observe larger mRNA content for the same type of cell in a homeostatic population compared to a growing population. Similar discrepancies arise when mRNA distributions measured from snapshots of a growing cell population are compared with the temporal tracking of the expression levels of a single cell over time, apparently contradicting ergodicity between single cells and the population. While this effect has been formalized mathematically [11], its relevance to the distributions of mRNA, or to the inference of different kinetic parameters, remains a conundrum.

In this paper, we study the distribution of mRNA transcripts in single cells where expression can be bursty or non-bursty (both commonly observed, see for example [2]), with a cell cycle progression described as a number of stages having a stochastic duration. Our model also includes DNA replication and differentiates between population and lineage (single-cell trajectory) measurements of the mRNA distribution. Keeping the framework relevant to the experimental inference of kinetic parameters, we aim to answer the following question: how important is the inclusion of cell cycle variability for predicting the statistics of stochastic mRNA expression? With this objective in mind, we derive and analyse expressions for the error made in different observables of transcript abundance when a deterministic cell cycle (one of fixed length) is considered instead of a stochastic one. Furthermore, we apply our results to a genome-wide expression dataset to address the magnitude of the error made in the inference of the transcription rate when mathematical models with different cell cycle details are employed.

2. Model description

We consider a general model of stochastic gene expression that takes into account cell cycle variability (for an illustration, see figure 1a,b) with the following properties.

  • (i)

    The cell cycle is divided into N stages. The duration of each stage i is exponentially distributed with a rate ki. This implies that the total cell cycle duration follows a hypoexponential distribution. Note that the number of stages, in general, will be larger than the number of cell cycle phases. Each biological cell phase can, therefore, be described as the composition of a given number of stages resulting in hypoexponentially distributed cell cycle phases consistent with recent experiments [17]. The number and duration of the different stages can be chosen by fitting the experimental cell cycle duration distribution.

  • (ii)

    The length of the mitotic phase is negligible and hence it is assumed to occur instantaneously after the end of the N-th stage. This leads to binomial partitioning of the mRNA between mother and daughter cells, where each individual molecule is allocated in either daughter cell with the same probability.

  • (iii)

    There is bursty or constitutive transcription of mRNA with rate ri of producing mRNAs per unit of time, and a decay rate di. When the transcription is bursty, the burst size follows a geometric distribution with mean βi. All the parameters ri, di, βi can vary depending on the stage i along the cell cycle.

Figure 1.

Figure 1.

(a,b) Schematic of the general model where mRNA dynamics take into account details of the cell cycle (a) including DNA replication of a gene, phase duration variability and bipartition of mRNA content at mitosis. During each cell cycle stage (b) mRNA dynamics is described as a production term (constitutive or bursty), and a linear degradation. (c) The Erlang and hypoexponential distributions provide excellent fits to the experimental probability distributions of cell cycle durations of eight different cell types. The parameters of the Erlang distribution (k, N) are shown on the figure. The sources of the experimental data, together with the parameters of the hypoexponential distribution (k1, N1, k2, N2) with N1 stages of rate k1 and N2 stages of rate k2 are: B-cells (11.7, 132, 0.26, 1) [21], Rat1 fibroblasts (0.15, 1, 2.4, 50) [22], HeLa cells (0.71, 4, 20.0, 110) [23], NIH 3T3 fibroblasts (0.24, 2, 11.4, 98) [19], mammary epithelial cells [24] (0.32, 1, 11.4, 154), Escherichia coli (0.88, 16, 0.07, 1) [25], Saccharomyces cerevisae (0.04, 1, 1.4, 115) [26] and Synechococcus elongatus (4.18, 30, 4.18, 28) [27]. Fitted distributions correspond with the least-squares fit of the distance between the experimental histograms and the probability density functions using the Trust Region Reflective algorithm implemented in SciPy. (d) Comparison of stochastic mRNA trajectories between a case where cell cycle duration is constant (purple) or stochastic (green), for different degradation rates d. Arrows indicate stochastic division times. Stochastic cell cycle simulations use the Erlang model with a production rate per chromosome equal to r = 50d for a cell cycle with N = 4 stages, from which W = 3 stages occur prior to the gene replication (w = W/N = 3/4) indicated by dashed lines for the deterministic simulations.

This constitutes the general model studied in this manuscript. Detailing cell stage-specific rates of transcription and degradation is particularly relevant as it will bestow our model with the ability to accurately describe the dynamic nature of mRNA transcription [3,18]. In addition, for the sake of clarity of our analysis, we will also consider a particular case of the general model.

  • (i)

    All the cell stage rates are identical along the cell cycle ki = k. This implies that the total cell cycle duration follows an Erlang distribution. The number of cell cycle stages, in this case, can be easily determined from the best fit of an Erlang distribution to the experimental cell cycle duration [19]. In particular, the coefficient of variation (CV) of the Erlang distribution is equal to 1/N.

  • (ii)

    The degradation rate of the mRNA is independent of the cell cycle stage di = d.

  • (iii)

    There are W stages prior to DNA replication of the gene of interest and NW stages post-replication. The production rate of mRNA is proportional to the DNA content of the cell at each stage without dosage compensation, being ri = r for iW and ri = 2r for i > W. We can consider replication to be instantaneous because the time of replication of the locus containing the gene of interest is much shorter than the total duration of the S-phase [20]. If transcription is considered to be bursty, the average burst size is constant along the cycle βi = β.

Because in this particular scenario, the cell cycle duration follows an Erlang distribution, it will be referred hereon as the Erlang model to distinguish it from the general model. Despite the simplicity of the model, a fit of the Erlang distribution to eight different types of eukaryotic and prokaryotic cells showed a very good fit, capturing the variability of cell cycle duration (figure 1c).

Stochastic simulations of the model can be used to study the effect of changing parameter values on the mRNA transcript number (figure 1d). Alternatively, we can study analytically the evolution of the probability Pi(n, t) of finding a cell in stage i with n mRNAs at time t by using a Master Equation description, that for the general model with constitutive mRNA transcription (bursty case is detailed in appendix A) reads:

P1(n,t)t=k1P1(n,t)+kNPN(n,t)+r1(P1(n1,t)P1(n,t))+d1((n+1)P1(n+1,t)nP1(n,t)) 2.1

and

Pi(n,t)t=kiPi(n,t)+ki1Pi1(n,t)+ri(Pi(n1,t)Pi(n,t))+di((n+1)Pi(n+1,t)nPi(n,t)),i[2,N]. 2.2

The first and second terms in these equations describe the exit from, and entrance to, the present cell cycle stage. The third term models transcription and the fourth term mRNA decay. Note that binomial partitioning during mitosis is explicitly taken into account by the second term of equation (2.1). This process implies

PN(n,t)=m=0mn2mPN(m,t), 2.3

where we take the convention m choose n equals zero when n > m.

3. Factorial moments in cyclo-stationary conditions

Defining the generating function Gi=nznPi(n), the Master Equations equations (2.1)–(2.2) can be written as

G1(z,t)t=k1G1(z,t)+kNGN(1+z2,t)+r1(z1)G1(z,t)+d1(1z)zG1(z,t) 3.1

and

Gi(z,t)t=kiGi(z,t)+ki1Gi1(z,t)+ri(z1)Gi(z,t)+di(1z)zGi(z,t),i[2,N]. 3.2

From the definition of the generating function, it follows that the unnormalized ℓ-th factorial moment of the mRNA distribution in stage j is given by

(nj)nn(n1)(n+1)Pj(n)=Gj()(1), 3.3

where the superscript (ℓ) means differentiating ℓ times. Enforcing cyclo-stationary conditions (steady state for the mRNA distribution of each individual cell stage) by setting the time derivatives in equations (3.1) and (3.2) to zero, differentiating p times the resulting equations and using the definition of the factorial moments above, we obtain

0=k1(n1)p+kN(12)p(nN)p+r1p(n1)p1d1p(n1)p 3.4

and

0=ki(ni)p+ki1(ni1)p+rip(ni)p1dip(ni)p,i[2,N]. 3.5

Equation (3.5) can be brought into the form

(ni+1)p=fi(ni)p+gi,i[2,N], 3.6

where we have used the definitions

fi=kiki+1+pdi+1andgi=ri+1p(ni+1)p1ki+1+pdi+1. 3.7

Because these are first-order non-homogeneous recurrence relations with variable coefficients, their solution can be written as

(nj)p=δj(n1)p+θj,j[2,N], 3.8

where we have used the definitions

θj=δjm=1j1gmδm+1andδj=k=1j1fk. 3.9

Solving equation (3.8) for (nN)p and substituting in equation (3.4), after some simplification, we obtain

(n1)p=2r1p(n1)p1+kN(12)p1θN2(d1p+k1)kN(12)p1δN. 3.10

Note that the solution of the unnormalized p-th factorial moment depends on knowledge of the unnormalized (p − 1)-th factorial moment. Hence, because of this dependency, all factorial moments need knowledge of the zeroth order factorial moment (nj)0, which corresponds with the probability of finding the cell at stage j. By the definition of equation (3.3), we see that (nj)0 = Gj(1). Setting p = 0 in equations (3.4) and (3.5) one obtains

(ni)0=kij=1Nkj11. 3.11

Hence summarizing, equations (3.8), (3.10) and (3.11) together give the solution to the unnormalized p-th factorial moment of the mRNA numbers in cell stage j. Note that to obtain the normalized p-th factorial moment one divides the unnormalized p-th factorial moment by Gj(1)=nPj(n)=(nj)0.

The factorial moments for the general model with bursty transcription can be derived following the same steps. This procedure shows that the first factorial moment is equal to the constitutive case, whereas the factorial moments for higher orders in the bursty case are larger than in the constitutive case (see appendix A).

4. Lineage measurements

The moments of the distribution can be used to compute the mRNA distribution statistics for different tissues. For instance, the mean number of mRNAs can be calculated as the average along the cell cycle stages of the expected number of mRNAs at each stage ((ni)1/(ni)0) weighted by the probability πi of finding a cell in a tissue at a certain stage i. Following this methodology, the expressions for the mean and the variance are

n=i=1Nπi(ni)1(ni)0andσ2=i=1Nπi(ni)2(ni)0+n(1n). 4.1

We will start our analysis studying the scenario in which the mRNA content of a single cell is tracked in time at regular intervals and, after division, the tracking keeps following only one of the daughter cells. This scenario is equivalent to the mRNA distribution of the cells forming a homeostatic tissue, where after each division one of the cells leaves the population, keeping constant the number of cells in the tissue [15,16]. This scenario will be referenced in the text as the ‘lineage’ case, to differentiate it from the mRNA distribution across a growing proliferating population of cells, which will be referred to as the “population” case. In the lineage case, the probability πi of finding a cell at the i-th cell cycle stage corresponds with (ni)0 (equation (3.11)) being inversely proportional to the cell stage advance rate ki:

πi=(ni)0. 4.2

For the Erlang model, this is πi = 1/N, and the explicit expression for the mean transcript can be obtained by introducing equations (3.10), (3.11) and (4.2) in (4.1), obtaining

n=n^+(1w)n^n^η1(11+ηΔ)(1w)/Δ2(11+ηΔ)1/Δ, 4.3

where for the sake of clarity we have written the expression in terms of the CV of the cell cycle Δ=1/N. In addition, we have introduced the mean mRNA number in the absence of a cell cycle n^=r/d, the fraction of the cell cycle before DNA replication of the gene of interest w = W/N, and the non-dimensional parameter η = dT that compares the degradation time scale with the dilution time scale given by the average cycle duration T = N/k (table 1). Note that η is proportional to the ratio between the mRNA half-life t1/2 and the cell cycle duration T following η = Tln(2)/t1/2.

Table 1.

Description of the different parameters used to describe the cell cycle, mRNA dynamics, and their relationship in the Erlang model. (Parameters in shadowed rows can be derived from the rest of the parameters.)

meaning Erlang model
N number of cell stages
W cell stages prior to replication
ki rate of advance of cell cycle stage i k
ri transcription rate during stage i r if iW
2r if i > W
di mRNA degradation during stage i d
βi mean burst size during stage i β
w proportion of cell cycle
before gene replication W/N
n^ stationary average mRNA number
in absence of cell cycle r/d
T average cell cycle duration N/k
Δ squared coefficient of variation
of cell cycle duration 1/N
η mRNA degradation relative
to cell division rate dT

The first term of equation (4.3) corresponds to the classical scenario without cell cycle. The second term of equation (4.3) introduces the effect of DNA replication for the case in which the mRNA degradation time scale is much shorter than the cell cycle length (η → ∞). Finally, the third term in equation (4.3) describes the contribution when mRNA degradation occurs at a comparable time scale to the cell cycle duration through the parameter η which has measured values in the approximate range 0.5–8.0 for a variety of cell types (table 2). This latter contribution increases monotonically with the cell cycle variability Δ (see appendix B), and is minimal in the limit of Δ → 0 (deterministic cell cycle duration). In this deterministic limit, equation (4.3) reduces to the simpler form

n=limΔ0n=n^+(1w)n^n^η(1eη(1w)2eη), 4.4

which agrees with a different calculation using deterministic rate equations (see appendix C). Comparison of equations (4.3) and (4.4) allows us to quantify the relative error R made in the expected number of mRNA when the cell cycle variability is not considered in the description of the mRNA dynamics:

Rnnn=1η(2w)1eη(1w)2eηη(2w)1(1+ηΔ)(1w)/Δ2(1+ηΔ)1/Δ. 4.5

Note that R is only a function of η, w and Δ, and therefore independent of the mRNA production rate (figure 2a). The error is always positive (see appendix B) and increases with the cell cycle time variability Δ, reaching its maximum for Δ = 1 (which is the maximum Δ attainable for an Erlang process since N ≥ 1). Similarly, as the expression for the first moment is identical in the bursty and constitutive cases (see appendix A), the mean transcript number and its error are also independent of how bursty the transcription is.

Table 2.

Typical values of Δ, w and η for different cell types and their source. (Values of Δ = 1/N are obtained from the Erlang fitting (figure 1c). Range for η corresponds with the 99% confidence interval of the η distribution for each genomic dataset.)

E. coli S. cerevisae N1H3T3
Δ 0.14 [25] 0.03 [26] 0.08 [19]
w 0.15–0.50 [28] 0.35–0.55 [29] 0.40–0.80 [30]
η 0.41–8.4 [31] 0.89–15 [32] 0.65–6.4 [33]

Figure 2.

Figure 2.

Relative error made in the average number of mRNAs (R) and its variance (Rσ) when considering the cell cycle to be deterministic instead of Erlang distributed in a non-proliferating population or a lineage. Panels compare the theoretical results (lines) with stochastic simulations (circles). (a,b) Relative error R of the mean number of mRNA. (c) Genome-wide values of η for three different cell types. NIH3T3 mouse fibroblast data was obtained from [33]. Degradation rates for S. cerevisiae cultured in yeast extract peptone dextrose were obtained from [32] and its cell cycle duration from [34]. Stability data for the transcripts of E. coli cultured in Lysogeny broth were obtained from [31], while its cell cycle duration is taken from [35]. Averages are done over trajectories of duration t = 600Tmax (1/dT, 1/rT, 1). (a,b,f) Averages over 50 trajectories for all conditions except for n^=1, which shows an average of 500 trajectories. (d,e) Averages over 200 trajectories. Error bars indicate the standard error of the mean.

For a given cell type, the average time at which replication of a given gene occurs and the cell cycle duration variability can be considered constant (provided external conditions are not changed), and hence the value of the error R for different genes will be determined exclusively by η, which compares the mean cell cycle duration and mRNA lifetime, and can vary significantly from gene to gene [33]. The error decreases with η (figure 2a,b), vanishing for η ≫ 1 corresponding with the scenario where mRNA lifetime is much shorter than the cell cycle duration. On the other hand, the relative error R is maximum for low values of η, describing the case of stable mRNAs for which degradation rates are much lower than the proliferation rate of the cell (analytical expressions for the mRNA distribution for this case can be found following the method described in [5]):

limη0R=ΔΔ+(2w)+22w.

Interestingly, this maximal error depends on the properties of the cell cycle through w and Δ and it is maximized for intermediate levels of the DNA replication time w=220.6, which is comparable to biological values of the relative duration of the G1 phase for N1H1 3T3 cells (table 2) [3,30] (excluding cells which have arrested G1 phases), achieving a maximal relative error of R15%, corresponding to Δ = 1/2 and w=22.

The relative error can be more precisely estimated given data for specific types of cells. For example, the cell cycle duration distribution in NIH 3T3 mouse embryonic fibroblasts has been described by an Erlang distribution with CV2 ≃ 1/12 (which implies N = 12 effective cell cycle stages) [19] and the G1 phase occupies roughly a fraction w = 0.4 of the cell cycle, indicating the position of the earlier transcribed genes during S-phase [30]. The maximum relative error R for these parameters lies around 3% (figure 2b), while for most of the transcriptome (η ∼ 1, figure 2c) the relative error R1%, indicating that in these cases the cell cycle duration variability can be ignored if the mean mRNA is all that we are interested in.

Making use of the second-order moments of the distribution, we can extend the analysis to other statistic observables, allowing us to quantify the (relative) error in the variance, Rσ, of mRNA fluctuations made when neglecting cell-cycle variability:

Rσ=σ2σ2σ2, 4.6

where σ2* is the variance of the mRNA distribution in the deterministic cell cycle limit (Δ → 0). For the Erlang model, combining the expression for the variance (equation (4.1)) with the factorial moments (equations (3.10) and (3.11)), we obtain an error for the variance Rσ that is much larger than the one observed in the mean. Additionally, Rσ does not have a monotonic dependence on the degradation rate, but is maximal for intermediate values of the degradation rate (η ∼ 1, see figure 2d,e,f). Interestingly, this region of values of η corresponds with most of the transcripts genome-wide for different species (figure 2c). In particular, for the NIH 3T3 cells the error reaches Rσ25% (figure 2e), and can reach values as high as 80% for Δ = 1/2 (figure 2d). In contrast to the error in the mean, the error in the variance will depend on the transcription rate and the transcriptional burstiness. Analysis of Rσ for the bursty model shows that Rσ decreases with the burst size, reflecting that an increase in the variance owing to the bursty gene expression reduces the relative impact of the contribution from cell cycle variability (figure 2f). Nevertheless, despite this reduction, the error Rσ is still above 10% for many scenarios including both bursty and constitutive expression (figure 2d,e,f). Furthermore, in contrast to the error in the mean, Rσ depends on the DNA replication position w in such a way that genes replicating later in the cell cycle (larger w) not only show larger errors but also for a broader range of degradation rates (figure 2e).

5. Population measurements

When considering a proliferating population of cells, the continuous appearance of synchronized cells at an initial cell cycle stage establishes a different age distribution than the one derived in the lineage scenario (figure 3a). Specifically, after mitosis, one cell at stage N leaves the population to give rise to two cells at stage 1, enhancing the probability of finding cells in the population at initial stages of their cell cycle. The population values for the probability of observing a cell in the i-th cell stage πi, can be obtained by considering the evolution of the average number of cells in cell cycle stage i at time t, denoted by Ci(t) [19,36]:

dC1(t)dt=k1C1(t)+2kNCN(t) 5.1

and

dCi(t)dt=kiCi(t)+ki1Ci1(t)i=2,,N, 5.2

where the factor 2 in the first equation stands for cellular division: every time a cell divides (leaving stage N), two cells start at stage 1. In the lineage case this factor becomes 1. More generally, for cases with asymmetric division (after mitosis some cells leave the population with a certain probability) this factor 2 can be replaced by a factor α ∈ [0, 2] [16]. While for equations (5.1) and (5.2), the number of cells Ci(t) will grow in time, the relative cell stage distribution in the population will eventually reach a steady state for which we can write the ansatz Ci(t)/C1(t) ≡ λi. Specifically, for the Erlang case, introducing the definition of λi in equation (5.2) yields the relationship

λi=2(1i)/N, 5.3

that gives the explicit values for the probability πi of observing a cell in the population at stage i:

πi(t)=Ci(t)i=1NCi(t)=21/N12(i/N)1,i=1,,N, 5.4

differing from the lineage stage distribution (equation (4.2)), which for the Erlang case is constant (πi = 1/N). This discrepancy was confirmed by simulations (see inset of figure 3a).

Figure 3.

Figure 3.

(a) Comparison between the mRNA content over a single-cell trajectory in time (blue) with the mRNA distribution of a proliferating population (orange). Histograms for both mRNA distributions (right) compare the average of 100 trajectory realizations with the snapshot of a single population at t = 7T. Parameters used are T = 1 , N = 4, W = 2 , n^=50, η = 1. (Inset) Probability distribution πi of finding a cell at different cell cycle stages for single trajectory (blue) and a proliferating population (orange). Stochastic simulations for πi (circles) are compared with theoretical results (bars) obtained from equation (4.2) (that for the Erlang case is constant πi = 1/N) and from equation (5.4). (b,c) Relative error made in the average number of mRNAs (R) and its variance (Rσ) when considering the cell cycle to be deterministic instead of Erlang distributed in a growing proliferative population. Comparison includes theoretical results (lines) and stochastic simulations (circles). Simulations in (b) show the average of 5000 snapshots at a time 10 T and in (c) the average of 25000 snapshots at a time 10 T. Error bars indicate the standard error of the mean.

In addition to differences in πi, cells in the population case are also found more likely at earlier times inside each stage than in the lineage case. For the Erlang model, the distribution of times that each cell has been in its current cell stage follows an exponential distribution ∼Exp(k21/N) (see appendix D and [5]). Given the Markovian nature of the process, this effect is equivalent to reducing k and having a faster cell advance through the cell cycle. Therefore, using the expressions for πi from equation (5.4), and the new effective rates of cell stage advance kk21/N in equations (3.8), (3.10) and (3.11), allows us to obtain the factorial moments for population measurements. The mean number of mRNA in this scenario is

n=n^21wηΔ2Δ+ηΔ1. 5.5

It is straightforward to show that 〈n〉 increases monotonically with Δ (similar to the lineage case). The exactness of equation (5.5) is confirmed by stochastic simulations in figure 3. In the limit of a deterministic cell cycle, equation (5.5) reduces to the simpler form

n=limΔ0n=21wηn^η+ln(2). 5.6

This agrees with a different calculation using deterministic rate equations (see appendix C). Similar to the lineage case, this allows us to write explicitly an expression for the relative error R in the average number of mRNAs made when omitting the stochasticity of the cell cycle:

Rnnn=12Δ+ηΔ1Δ(η+ln(2)). 5.7

As in the lineage case, the error is a monotonic decreasing function of η, and increases with Δ reaching an error that is similar to the single-cell case (R20%) (figure 3b,c). Nevertheless, in contrast to the lineage case, the error is negative, indicating that the expected number of mRNA decreases with the variability of the cell cycle duration. Strikingly, the error is independent of w and thus independent of the relative duration of G1 and G2 phases. Analysis of the error in the variance, Rσ, results in similar observations to those of the lineage measurements, where Rσ depends on the transcription rate and the transcription burstiness, resulting in errors much larger than R (Rσ>50%) that peaks at intermediate values of the degradation rate corresponding to the most frequent values of η measured genome-wide for different species (figure 2c). As in the lineage case, the error Rσ depends on the replication position during the cell cycle w, so genes replicating later in the cell cycle show larger errors for broader ranges of mRNA stability.

6. Messenger RNA distribution approximation

The exact mRNA distribution of our model is known only for some limit cases such as η → 0 [5]. Nevertheless, for more general realistic cases, we can use the moment derivation to reconstruct an approximate distribution. In particular, our analysis provides analytical expressions for the moments of the distribution at each cell stage i. Exclusively using the first moments, we can approximate the total mRNA population as a mixture of N Poisson distributions P~(N,t)=iπiPois(ni), where the weights πi correspond to the probability of finding a cell at cell stage i obtained in equations (4.2) and (5.4). Similarly, including the second moments, we can describe the probability as a mixture of negative binomial distributions P~(n,t)=iπiNB(ni,σi), where each component NB(ni,σi2) is a negative binomial distribution with mean 〈ni and variance σi2 (see equation 4.1). Results for the lineage case, show that while the Poisson mixture failed to recover the distribution obtained from stochastic simulations in most scenarios (figure 4a), the negative binomial mixture resulted in a very good prediction, able to recover the broad tails and bimodality of the mRNA distribution. In order to accurately assess the goodness of the reconstructed distribution, we computed the Kullback–Leibler divergence of the negative binomial mixture P~(n,t) from the simulated exact distribution (figure 4b). We observed that the approximation only fails for regimes with very unstable mRNAs that are highly expressed. On the other hand, the approximation improves for larger values of N, closer to experimental values for the cell cycle duration variability (CV2 = 1/N = 1/12) [19] (compare left and right panels of figure 4b). Comparison of the distributions for bursty expression and population measurements, using their corresponding moments and stages distributions, πi, yielded an even better approximation with values of the Kullback–Leibler divergence orders of magnitude lower than the lineage case (figure 4c).

Figure 4.

Figure 4.

Approximation of the mRNA distribution as a mixture of negative binomials (NBs). (a) Comparison between the approximations for the mRNA distribution (Poisson mixture and negative binomial mixture) with simulations for four different cases: I (N = 2, η = 10, n^=50), II (N = 2, η = 1, n^=50), III (N = 12, η = 10, n^=50 ) and IV (N = 12, η = 1, n^=50). Case IV also includes the individual components of the NB mixture. (b) Kullback–Leibler divergence of the NB binomial mixture from stochastic simulations for lineage distributions of constitutive mRNA expression. (c) Comparison of the Kullback–Leibler divergence from simulations of the NB mixture for combinations of lineage/population measures and constitutive/bursty (β = 10) expression. For all the panels w = 1/2. Distributions for (a) and (b) result from trajectories over a time t = 6 × 104 Tmax (1/dT, 1/rT, 1), while for (c) we used t = 6 × 106T/d · max (1/dT, 1/rT, 1) for lineage measurements, and t = 20T for population measurements.

7. Genome-wide transcription rate inference error

Our results so far have been focused on analysing the errors that different models introduce on the mRNA statistics. Likewise, it is relevant to assess the error that different models introduce in the inference of biochemical parameters from experimental data. For this purpose, we analysed the genome-wide data from [33] and compared their transcription rate inference based on a lineage model with constant cell cycle and no replication (obtained by solving an expression equivalent to 〈n*〉 with w = 0 in equation (4.4), see appendix F) against different models incorporating stochastic cell cycle duration (see figure 5 and appendix F). Interestingly, because the average mRNA number is proportional to the transcription rate (see equations (4.3) and (5.5)), for given values of the cell cycle duration and gene replication, the relative error made when omitting cell cycle variation is a function depending only on the degradation rate through the parameter η (figure 5a). Because [33] reported no correlation between mRNA stability and transcription rate, this resulted in the absence of correlation between the error and the speed at which genes are transcribed (figure 5a). Additionally, in agreement with the error of the average mRNA number R, the error in the transcriptional rate estimate increases with the stability of the mRNA. When comparing the error expected for different models, small errors were observed for the lineage case with no DNA replication (figure 5b). Nevertheless, for more realistic scenarios, where the error is evaluated for a growing population case with DNA replication [33], more than 90% of the genes detected underestimate the transcription rate with an error bigger than 10% (figure 5b).

Figure 5.

Figure 5.

Relative error of inferred mean transcription rates with deterministic models at a genomic scale. (a) Relative error in the inferred transcription rate for the 5028 genes reported in [33], as a function of the relative degradation rate η and the reported transcription rate. The relative errors are calculated between different models with stochastic cell cycle and the deterministic cell cycle model without replication used in [33] (see appendix F). Discrepancies between models are a function of η (dotted line). (b) Histogram showing the number of genes for different levels of transcription rate error for three different stochastic cell cycle models. The stochastic cell cycle used is an Erlang model with k = 0.145 h−1 and N = 16 using the reported cell cycle duration and variability in [33] (see appendix F), while replication is considered to occur at the middle of the cell cycle (w = 1/2).

8. Discussion

Most of the models employed to study gene regulation ignore the effect that a detailed stochastic cell cycle description has on gene expression. The model and methodology developed in this paper not only allows one to analytically evaluate the role of features such as cell duration stochasticity or DNA replication in the transcript population, but also provides a straightforward way of discriminating the scenarios for which such details are relevant for the description of the system. This is of paramount importance when mathematical models are used to infer parameters from experimental data, where the precision of the information demands the use of the right level of abstraction [37].

Specifically, this approach contrasts with alternative strategies that either ignore cell cycle effects or fit mRNA populations to arbitrary population mixtures, impeding the inference of mechanistic information of the transcriptional parameters. This is of particular relevance for current data analysis where mRNA labelling techniques give access to mRNA abundance distributions in populations of cells. In order to extract mechanistic information of the transcriptional process from these distributions, it is paramount to link the details of the distribution to the properties of the different biomolecular mechanisms [3,38]. While in this paper, we analysed the error in the transcription rate estimation owing to neglecting cell cycle variability and replication, future work will address how taking into account such details may also affect the inference of other biochemical parameters such as gene activation and deactivation rates, or the mean burst size. The necessity of such a study becomes apparent from the mRNA distributions obtained, which can be approximated accurately by negative binomials in scenarios with constitutive gene expression, challenging the common practice to use negative binomial distributions as a signature of bursty transcription [1,38,39]. Experimental validation of these predictions will require inference of parameters using data obtained by live imaging techniques, such as labelling with mRNA aptamers such as MS2, Mango or Peppers [4042] capable of measuring individual mRNA dynamics. Specifically, trajectory data provided by live imaging techniques can be used together with the moments provided by our stochastic cell cycle length theory to infer the parameters of the model using Bayesian methods (such as ABC) [37]. Contrasting the resulting parameter posterior from a stochastic cell cycle model with those obtained from performing the same analysis under the assumption of a fixed cell cycle length can highlight the relevance of cell cycle variability for the estimation of transcription and other biochemical parameters.

One of the major assumptions used in the bulk of the paper is that all the stages of the cell cycle have the same rates of mRNA production per allele, burst size, and degradation. Nevertheless, our analysis allows the incorporation of different rates of production and degradation in the expression for the moments of the general model (equations (3.8)–(3.11)). Such an accurate description will be of paramount importance to understand the effect of stochastic cell cycle duration in genes related to cell-cycle progression [4345] or dosage compensation along the cell cycle [3,46]. Furthermore, additional experimental stochastic details of the different cell cycle phases can be incorporated by choosing accordingly the rates and number of stages of the general model [47].

Future extensions of the model can focus on incorporating more detailed descriptions of the stochastic dynamics of the different processes. These could include more realistic assumptions about how molecules are partitioned at cell division to effectively account for specific segregation mechanisms [8]. Another possible extension could describe transcriptional bursts by considering promoter switching dynamics explicitly [48] allowing us to investigate the effect of multiple promoter states on the mRNA dynamics [49].

Finally, further extension of the model should include protein regulation of mRNA abundance. Particular mechanisms include nuclease dynamics controlling mRNA turnover along the cell cycle [43,50], or cyclin-dependent kinases controlling cell-cycle advance [51]. In addition, incorporating the methodology developed in this paper to gene regulatory networks will provide a route to better understanding the stochastic details of the expression of genes with dynamics that can change in a time scale comparable to the cell cycle duration, such as circadian clock-related genes [52,53]. This is of special relevance in embryonic development, where the details of intrinsic noise are known to play a major role in the formation of spatial domains of gene expression in the patterning of embryonic tissues [5456].

Acknowledgements

We thank James Briscoe for feedback on the manuscript. R.P.-C. acknowledges support from the UCL Mathematics Clifford Fellowship. R.G. acknowledges support from the Leverhulme Trust (grant no. RPG-2018-423). C.B. acknowledges the Clarendon Fund and New College, Oxford, for funding.

Appendix A. Bursty messenger RNA transcription model

Considering the general model, we can introduce bursty mRNA transcription as a reaction with a burst rate νi at cell stage i. The number of mRNAs ℓ produced in a burst at cell stage i follows a geometric distribution ξi(ℓ) with average number βi of transcripts produced per burst, i.e. an average rate ri = νiβi of mRNAs produced per unit of time [57]. The explicit geometric probability distribution follows:

ξi()=11+βiβi1+βi. A 1

The resulting Master Equation reads

P1(n,t)t=k1P1(n,t)+kNPN(n,t)+r1β1=1n[P1(n,t)ξ1()]P1(n)(1ξ1(0))+d1((n+1)P1(n+1,t)nP1(n,t)) A 2

and

Pi(n,t)t=kiPi(n,t)+ki1Pi1(n,t)+riβi=1n[Pi(n,t)ξi()]Pi(n)(1ξi(0))+di((n+1)Pi(n+1,t)nPi(n,t)),i[2,N]. A 3

As in the constitutive case, we can use these system of differential equations to obtain the steady-state factorial moments of the distribution by introducing the generating function Gi=nznPi(n). In particular, the terms corresponding to bursty transcription follow the sum

n=0=1nznP(n,t)ξi()==1n=0znP(n,t)ξi()==1ξi()zG(z)=G(z)11+βi(1z)11+βi. A 4

This results in the system of differential equations

G1(z,t)t=k1G1(z,t)+kNGN(1+z2,t)+11+β1(1z)1r1β1G1(z,t)+d1(1z)zG1(z,t), A 5
Gi(z,t)t=kiGi(z,t)+ki1Gi1(z,t)+11+βi(1z)1riβiGi(z,t)+di(1z)zGi(z,t),i[2,N]. A 6

Enforcing the steady state by setting the time derivatives in equations (A 5) and (A 6) to zero, differentiating p times the resulting equations and using the definition of the factorial moments (ni)k, we obtain

0=k1(n1)p+kN(12)p(nN)p+r1j=0p1p!j!β1pj1(n1)jd1p(n1)p A 7

and

0=ki(ni)p+ki1(ni1)p+rij=0p1p!j!βipj1(ni)jdip(ni)p,i[2,N]. A 8

The normalization of the factorial moments (nj)0, obtained for p = 0 with the normalization condition jGj(1)=1, is the same as in the constitutive case, and only depends on the cell stage advance rates

(n1)0=(1+i=2Nj=2ikj1kj)1 A 9

and

(ni)0=(n1)0j=2ikj1kj,i[2,N]. A 10

Similarly to the constitutive case, equation (A 8) can be written in the form

(ni)p=fi1(ni1)p+g~i1,i[2,N], A 11

where we have used same definition for fi as in the constitutive production case, but g~i replaces gi, which instead of depending on the immediately lower order factorial moment p − 1, depends on all the moments lower than p:

fi=kiki+1+pdi+1,g~i=ri+1j=0p1p!j!βi+1pj1(ni+1)jki+1+pdi+1. A 12

These first-order non-homogeneous recurrence relations have the solution

(nj)p=δj(n1)p+θ~j,j[2,N], A 13

where we have used the definitions

θ~j=δjm=1j1g~mδm+1,δj=k=1j1fk. A 14

Substituting this solution in equation (A 7), we obtain

(n1)p=2r1j=0p1p!j!β1pj1(n1)j+kN(12)p1θ~N2(d1p+k1)kN(12)p1δN. A 15

Comparing these results, we can immediately see that the expected value of mRNAs is the same in the bursty case and the constitutive case considering the same average rate of mRNA production at cell stage i : ri = νiβi. Differences arise for higher moments. In particular, all the factorial moments of the bursty scenario with p > 1 are larger than the factorial moments of the constitutive case because

rij=0p1p!j!βipj1(ni)j=rip(ni)p1+rij=0p2p!j!βipj1(ni)j>pri(ni)p1. A 16

Appendix B. Monotonic dependence of mean messenger RNA on the coefficient of variation of the cell cycle duration for lineage observations

By equation (4.3), we have for η > 0

n=wn^+(1w)2n^n^η111+ηΔ(1w)/Δ211+ηΔ1/Δ=C+Df(Δ), B 1

where w is a fraction, C, D are constants (D is positive) and

f(Δ)=11+ηΔ(1w)/Δ211+ηΔ1/Δ. B 2

If we define x = (1 + ηΔ)1/Δ, we note that because Δ > 0 we have x ∈ (1, eη) and also x(Δ) is monotonically decreasing, which follows from

dxdΔ=(1+ηΔ)1+1/ΔΔ2>0(ηΔ(1+ηΔ)log(1+ηΔ))<0<0. B 3

We then note that using this transformation, we get

f(Δ)=g(x)=xw2x1, B 4

which satisfies

dg(x)dx=xw1(12x)2>0(2x(w1)w)<0<0. B 5

Using this, we find

df(Δ)dΔ=dxdΔ<0dg(x)dx<0>0, B 6

which proves strict monotonicity of 〈n〉 as a function of Δ > 0.

Appendix C. Alternative derivation of equations (4.4) and (5.6) from deterministic rate equations

Consider a cell cycle of fixed duration T with replication (and consequent doubling of transcription) occurring at time τ = wT (where w is a fraction). If the transcription rate before replication is r, the mRNA decay rate is d and n(t) is the deterministic estimate for the mean number of mRNA molecules at time t then a deterministic model for this process is

dn(t)dt=rdn(t),if0t<τ2rdn(t),ifτtT. C 1

In the cyclo-stationary limit, binomial partitioning (when cell division occurs at the end of the cell cycle) leads to the boundary condition 2n(0) = n(T). Note that t in this context means the cell age and not absolute time and hence it can only vary between 0 and T. Solving these differential equations, we obtain the solution

n(t)=n^(1+ed(τt)12eη),if0t<τ2n^(1+ed(τt+T)12eη),ifτtT, C 2

where n^=r/d. Let f(t) dt be the probability of observing a cell of age between t and t + dt, where dt is an infinitesimal time interval. It then follows that

f(t)dt=limNπjj=NtT, C 3

where πj is the probability of observing a cell in cell cycle stage j. Note that because Δ = 1/N, the limit of N → ∞ at constant T is the same as the limit of Δ → 0. Because T = N/k, in this limit, we have infinite cell stages N advancing with an infinite rate k, i.e. the cell spends an infinitesimal small time dt = 1/k at each stage. Knowing that πi = 1/N for lineage measurements, we have

f(t)=limNπjdt=1T. C 4

For the population case, we substitute i/N = t/T in equation (5.4) take limit of large N and finally use N = kT to obtain

f(t)=ln(2)T2(1t/T). C 5

Note that both equations (C 4) and (C 5) are well known and have been in common use for more than 40 years [58]. Finally, we obtain the mean number of mRNA averaged over the cell cycle n¯=0Tf(t)n(t)dt. For the lineage measurements, this yields

n¯=wn^+(1w)2n^n^η(1eη(1w)2eη), C 6

while for the population measurements, we obtain

n¯=21wηn^η+ln(2). C 7

These expressions agree exactly with equations (4.4) and (5.6) which were derived from a Master Equation approach in the limit of zero variability in the cell cycle duration for the case of lineage and population measurements, respectively.

Appendix D. Derivation of the distribution of cell stage durations in population measurements

We let Ci(t, τ) denote the number of cells in a population that are in cell stage i at time t that have been in that cell state for a duration τ. After a small time duration δ all the cells will either advance to an age τ + δ or advance to the next cell stage. Therefore, we can write the conservation equation

Ci(t+δ,τ+δ)=Ci(t,τ)Ci(t,τ)kiδ. D 1

Assuming that there is a stationary distribution for the stage age of the cell population at a stage i, pi(τ), we can write Ci(t, τ) as Ci(t, τ) = Ci(t)pi(τ); where Ci(t) is the number of cells at cell stage i. Introducing this factorization of Ci(t, τ) in equation (D 1), and taking the limit δ → 0, we get the relationship

dCi(t)dtpi(τ)+dpi(τ)dτCi(τ)=Ci(t)pi(τ)ki, D 2

where we have used the chain rule to compute the derivative of dNi(x, x)/dx|t,τ. Because the probability of finding a cell in a certain stage i, πi, is constant in time, the number of cells at a given stage has to grow with the same rate as the population, therefore dCi(t)/dt = KCi(t), where K is the growth rate of the population. Introducing this equality in equation (D 2), we get an equation for pi(τ):

Kpi(τ)+dpi(τ)dτ=pi(τ)ki. D 3

That gives

pi(τ)=(K+ki)e(K+ki)τ. D 4

In the Erlang distributed model, ki = k is constant, and the rate of growth of the population can be calculated from the conservation equation for the total number of cells C(t):

C(t+δ)=C(t)+CN(t)kNδ=C(t)+C(t)kNπNδ, D 5

where N is the number of stages of the cell cycle. From this equation, we obtain that the rate of exponential growth of the population is K = kNπN. Using the value of πN from equation (5.4), we obtain that for the Erlang model, the stage age distribution of cell cycle stage i is

pi(τ)=k21/Nek21/Nτ. D 6

Appendix E. Computational analysis

The simulations for the general cell cycle model (including Erlang distributed times), were made using a custom made publicly available Gillespie algorithm, where cell cycle stages are treated as one extra reaction (https://github.com/2piruben/langil/tree/master/examples/CellCycleVariability). After the last stage of the cell cycle is completed, the cell cycle time is reset and the number of mRNAs is reduced by sampling a binomial distribution B(n, 1/2) where n is the number of mRNAs before cell division (see algorithm 1).

graphic file with name rsif20200360-i1.jpg

On the other hand, to simulate a cell cycle where the different stages have deterministic duration, the Gillespie algorithm has been modified to take into account if a deterministic cell stage change would take place before the next stochastic reaction time (see algorithm 2). The rest of the details of the algorithm are the same as in the general cell cycle model.

graphic file with name rsif20200360-i2.jpg

To obtain statistics from lineage measurements, each trajectory was sampled by choosing evenly distributed time points. For population measurements, several simulations are run in parallel, one for each cell. After each cell division event, a new cell is introduced in the simulation containing the remaining mRNA from the binomial partition of the mother cell. In order to achieve a steady-state behaviour with deterministic cell cycles it was necessary to initiate each replicate following the corresponding age distribution (equation (C 4) or equation (C 5)). Statistics from the population measurements are calculated across all the cells at a particular time snapshot.

Appendix F. Inference of transcription rates and error calculation

Using the expression for the average number of mRNAs in the lineage measurements given by equation (4.3), we can write the transcription rate parameter r for the Erlang model as a function of the average number of mRNAs observed and the rest of the parameters of the model:

r=nηT2w1η1(11+ηΔ)(1w)/Δ2(11+ηΔ)1/Δ1. F 1

Similarly, we can write an expression for r for the population case using equation (5.5):

r=nηT2Δ+ηΔ121wηΔ. F 2

We can use both equations (F 1) and (F 2) to obtain the average transcription rate r¯ along the cell cycle for lineage or population cases:

r¯=rw+2r(1w). F 3

In the limit of a deterministic cell cycle duration and no replication rexp=r¯(Δ0,w=1), we recover the expression used in [33], which returns a value of transcription rate for each gene given the measured decay rate and average number of mRNA transcripts. By contrast, in order to compute r¯ in a general case we need to evaluate the cell cycle duration variability Δ. The cell cycle duration reported in [33] was 19.9 h < 27.5 h < 33.6 h, and hence we choose a standard deviation of (33.619.9)/2=6.85h. The corresponding number of effective states N can be obtained from the CV of the cell cycle length N = 1/Δ = 1/CV2 ≃ 16.

Introducing the calculated value of Δ in equations (F 1)–(F 3), we can evaluate r¯ for lineage and population cases for different DNA replication positions along the cell cycle. In the text, we study a case without replication w = 1 and a case with replication at the middle of the cell cycle w = 1/2.

In order to evaluate how our predictions differ from the reported transcription rates rexp, we compute the relative error ɛ:

ε=rexpr¯rexp. F 4

Note that because r¯ is linear in the average transcript number 〈n〉, the resulting error is independent of 〈n〉. Therefore, differences in the error ɛ among the different genes reported in [33] will only depend on their degradation rate.

Data accessibility

The code used to obtain the results is freely available at the repository https://github.com/2piruben/langil/tree/master/examples/CellCycleVariability.

Authors' contributions

R.P.-C., C.B. and R.G. contributed equally in the research and writing of the manuscript.

Competing interests

We declare we have no competing interest.

Funding

No funding has been received for this article.

References

  • 1.Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. 2006. Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 4, e309 ( 10.1371/journal.pbio.0040309) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Zenklusen D, Larson DR, Singer RH. 2008. Single-RNA counting reveals alternative modes of gene expression in yeast. Nat. Struct. Mol. Biol. 15, 1263–1271. ( 10.1038/nsmb.1514) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Skinner SO, Xu H, Nagarkar-Jaiswal S, Freire PR, Zwaka TP, Golding I. 2016. Single-cell analysis of transcription kinetics across the cell cycle. Elife 5, 1–24. ( 10.7554/eLife.12175) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Van Kampen N. 2011. Stochastic processes in physics and chemistry. Amsterdam, The Netherlands: Elsevier Science: North-Holland Personal Library. [Google Scholar]
  • 5.Beentjes CHL, Perez-Carrasco R, Grima R. 2020. Exact solution of stochastic gene expression models with bursting, cell cycle and replication dynamics. Phys. Rev. E 101, 032403 ( 10.1103/PhysRevE.101.032403) [DOI] [PubMed] [Google Scholar]
  • 6.Cao Z, Grima R. 2020. Analytical distributions for detailed models of stochastic gene expression in eukaryotic cells.. Proc. Natl Acad. Sci. USA 117, 4682–4692. ( 10.1073/pnas.1910888117) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Johnston IG, Jones NS. 2015. Closed-form stochastic solutions for non-equilibrium dynamics and inheritance of cellular components over many cell divisions. Proc. R. Soc. A 471, 20150050 ( 10.1098/rspa.2015.0050) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Huh D, Paulsson J. 2011. Random partitioning of molecules at cell division. Proc. Natl Acad. Sci. USA 108, 15004–15009. ( 10.1073/pnas.1013171108) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Dessalles R, Fromion V, Robert P. 2020. Models of protein production along the cell cycle: an investigation of possible sources of noise. PLoS ONE 15, 1–25. ( 10.1371/journal.pone.0226016) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Schwabe A, Bruggeman FJ. 2014. Contributions of cell growth and biochemical reactions to nongenetic variability of cells. Biophys. J. 107, 301–313. ( 10.1016/j.bpj.2014.05.004) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Thomas P. 2017. Making sense of snapshot data: ergodic principle for clonal cell populations. J. R. Soc. Interface 14, 20170467 ( 10.1098/rsif.2017.0467) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Soltani M, Singh A. 2016. Effects of cell-cycle-dependent expression on random fluctuations in protein levels. R. Soc. Open Sci. 3, 160578 ( 10.1098/rsos.160578) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Soltani M, Vargas-Garcia CA, Antunes D, Singh A. 2016. Intercellular variability in protein levels from stochastic expression and noisy cell cycle processes. PLOS Comput. Biol. 12, e1004972 ( 10.1371/journal.pcbi.1004972) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Je¸drak J, Kwiatkowski M, Ochab-Marcinek A. 2019. Exactly solvable model of gene expression in a proliferating bacterial cell population with stochastic protein bursts and protein partitioning. Phys. Rev. E 99, 042416 ( 10.1103/PhysRevE.99.042416) [DOI] [PubMed] [Google Scholar]
  • 15.Nowakowski RS, Lewin SB, Miller MW. 1989. Bromodeoxyuridine immunohistochemical determination of the lengths of the cell cycle and the DNA-synthetic phase for an anatomically defined population. J. Neurocytol. 18, 311–318. ( 10.1007/BF01190834) [DOI] [PubMed] [Google Scholar]
  • 16.Hannezo E, Prost J, Joanny J-F. 2014. Growth, homeostatic regulation and stem cell dynamics in tissues. J. R. Soc. Interface 11, 20130895 ( 10.1098/rsif.2013.0895) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Gavagnin E, Vittadello ST, Guanasingh G, Haass NK, Simpson MJ, Rogers T, Yates CA. 2020. Synchronised oscillations in growing cell populations are explained by demographic noise. bioRxiv, 987032 ( 10.1011/2020.03.13.987032) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Battich N, Beumer J, de Barbanson B, Krenning L, Baron CS, Tanenbaum ME, Clevers H, van Oudenaarden A. 2020. Sequencing metabolically labeled transcripts in single cells reveals mRNA turnover strategies. Science (80-.) 367, 1151–1156. ( 10.1126/science.aax3072) [DOI] [PubMed] [Google Scholar]
  • 19.Yates CA, Ford MJ, Mort RL. 2017. A multi-stage representation of cell proliferation as a Markov process. Bull. Math. Biol. 79, 2905–2928. ( 10.1007/s11538-017-0356-4) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Kelly T, Callegari AJ. 2019. Dynamics of DNA replication in a eukaryotic cell. Proc. Natl Acad. Sci. USA 116, 4973–4982. ( 10.1073/pnas.1818680116) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Duffy KR, Wellard CJ, Markham JF, Zhou JHS, Holmberg R, Hawkins ED, Hasbold J, Dowling MR, Hodgkin PD. 2012. Activation-induced B cell fates are selected by intracellular stochastic competition. Science (80-. ) 335, 338–341. ( 10.1126/science.1213230) [DOI] [PubMed] [Google Scholar]
  • 22.Hölzel M, Kohlhuber F, Schlosser I, Hölzel D, Lüscher B, Eick D. 2001. Myc/Max/Mad regulate the frequency but not the duration of productive cell cycles. EMBO Rep. 2, 1125–1132. ( 10.1093/embo-reports/kve251) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Hurwitz C, Tolmach LJ. 1969. Time-lapse cinemicrographic studies of X-irradiated HeLa S3 Cells: I. Cell progression and cell disintegration. Biophys. J. 9, 607–633. ( 10.1016/S0006-3495(69)86407-6) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Tyson DR, Garbett SP, Frick PL, Quaranta V. 2012. Fractional proliferation: a method to deconvolve cell population dynamics from single-cell data. Nat. Methods 9, 923–928. ( 10.1038/nmeth.2138) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Brenner N, Braun E, Yoney A, Susman L, Rotella J, Salman H. 2015. Single-cell protein dynamics reproduce universal fluctuations in cell populations. Eur. Phys. J. E 38, 102 ( 10.1140/epje/i2015-15102-8) [DOI] [PubMed] [Google Scholar]
  • 26.Talia SD, Skotheim JM, Bean JM, Siggia ED, Cross FR. 2007. The effects of molecular noise and size control on variability in the budding yeast cell cycle. Nature 448, 947–951. ( 10.1038/nature06072) [DOI] [PubMed] [Google Scholar]
  • 27.Martins BM, Tooke AK, Thomas P, Locke JC. 2018. Cell size control driven by the circadian clock and environment in cyanobacteria. Proc. Natl Acad. Sci. USA 115, E11 415–E11 424. ( 10.1073/pnas.1811309115) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Bates D, Kleckner N. 2005. Chromosome and replisome dynamics in E. coli: loss of sister cohesion triggers global chromosome movement and mediates chromosome segregation. Cell 121, 899–911. ( 10.1016/j.cell.2005.04.013) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Koren A, Soifer I, Barkai N. 2010. MRC1-dependent scaling of the budding yeast DNA replication timing program. Genome Res. 20, 781–790. ( 10.1101/gr.102764.109) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Hahn AT, Jones JT, Meyer T. 2009. Quantitative analysis of cell cycle phase durations and PC12 differentiation using fluorescent biosensors. Cell Cycle 8, 1044–1052. ( 10.4161/cc.8.7.8042) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Selinger DW. 2003. Global RNA Half-life analysis in Escherichia coli reveals positional patterns of transcript degradation. Genome Res. 13, 216–223. ( 10.1101/gr.912603) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Wang Y, Liu CL, Storey JD, Tibshirani RJ, Herschlag D, Brown PO. 2002. Precision and functional specificity in mRNA decay. Proc. Natl Acad. Sci. USA 99, 5860–5865. ( 10.1073/pnas.092538799) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M. 2011. Global quantification of mammalian gene expression control. Nature 473, 337–342. ( 10.1038/nature10098) [DOI] [PubMed] [Google Scholar]
  • 34.Fred S. 2002. Getting started with yeast. Methods Enzymol. 350, 3–41. [DOI] [PubMed] [Google Scholar]
  • 35.Liang ST, Ehrenberg M, Dennis P, Bremer H. 1999. Decay of rpIN and lacZ mRNA in Escherichia coli. J. Mol. Biol. 288, 521–538. ( 10.1006/jmbi.1999.2710) [DOI] [PubMed] [Google Scholar]
  • 36.Vittadello ST, McCue SW, Gunasingh G, Haass NK, Simpson MJ. 2019. Mathematical models incorporating a multi-stage cell cycle replicate normally-hidden inherent synchronization in cell proliferation. J. R. Soc. Interface 16, 20190382 ( 10.1098/rsif.2019.0382) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Warne DJ, Baker RE, Simpson MJ. 2019. Simulation and inference algorithms for stochastic biochemical reaction networks: from basic concepts to state-of-the-art. J. R. Soc. Interface 16, 20180943 ( 10.1098/rsif.2018.0943) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Ham L, Brackston RD, Stumpf MPH. 2020. Extrinsic noise and Heavy-Tailed laws in gene expression. Phys. Rev. Lett. 124, 108101 ( 10.1103/PhysRevLett.124.108101) [DOI] [PubMed] [Google Scholar]
  • 39.So LH, Ghosh A, Zong C, Sepúlveda LA, Segev R, Golding I. 2011. General properties of transcriptional time series in Escherichia coli. Nat. Genet. 43, 554–560. ( 10.1038/ng.821) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Cawte AD, Unrau PJ, Rueda DS. 2020. Live cell imaging of single RNA molecules with fluorogenic Mango II arrays. Nat. Commun. 11, 1283 ( 10.1038/s41467-020-14932-7) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Tutucci E, Vera M, Biswas J, Garcia J, Parker R, Singer RH. 2018. An improved MS2 system for accurate reporting of the mRNA life cycle. Nat. Methods 15, 81–89. ( 10.1038/nmeth.4502) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Chen X. et al. 2019. Visualizing RNA dynamics in live cells with bright and stable fluorescent RNAs. Nat. Biotechnol. 37, 1287–1293. ( 10.1038/s41587-019-0249-1) [DOI] [PubMed] [Google Scholar]
  • 43.Tourrière H, Chebli K, Tazi J. 2002. mRNA degradation machines in eukaryotic cells. Biochimie 84, 821–837. ( 10.1016/S0300-9084(02)01445-1) [DOI] [PubMed] [Google Scholar]
  • 44.Higareda-Mendoza AE, Pardo-Galván MA. 2010. Expression of human eukaryotic initiation factor 3f oscillates with cell cycle in A549 cells and is essential for cell viability. Cell Div. 5, 1–13. ( 10.1186/1747-1028-5-10) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Sabi R, Tuller T. 2019. Novel insights into gene expression regulation during meiosis revealed by translation elongation dynamics. npj Syst. Biol. Appl. 5, 12 ( 10.1038/s41540-019-0089-0) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Padovan-Merhar O. et al. 2015. Single mammalian cells compensate for differences in cellular volume and DNA copy number through independent global transcriptional mechanisms. Mol. Cell 58, 339–352. ( 10.1016/j.molcel.2015.03.005) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Dowling MR, Kan A, Heinzel S, Zhou JH, Marchingo JM, Wellard CJ, Markham JF, Hodgkin PD. 2014. Stretched cell cycle model for proliferating lymphocytes. Proc. Natl Acad. Sci. USA 111, 6377–6382. ( 10.1073/pnas.1322420111) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Suter DM, Molina N, Gatfield D, Schneider K, Schibler U, Naef F. 2011. Mammalian genes are transcribed with widely different bursting kinetics. Science (80-.) 332, 472–474. ( 10.1126/science.1198817) [DOI] [PubMed] [Google Scholar]
  • 49.Zhang J, Zhou T. 2014. Promoter-mediated transcriptional dynamics. Biophys. J. 106, 479–488. ( 10.1016/j.bpj.2013.12.011) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Gill T, Cai T, Aulds J, Wierzbicki S, Schmitt ME. 2004. RNase MRP cleaves the CLB2 mRNA to promote cell cycle progression: novel method of mRNA degradation. Mol. Cell Biol. 24, 945–953. ( 10.1128/MCB.24.3.945-953.2004) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Cross FR. 2003. Two redundant oscillatory mechanisms in the yeast cell cycle. Dev. Cell 4, 741–752. ( 10.1016/S1534-5807(03)00119-9) [DOI] [PubMed] [Google Scholar]
  • 52.Ouyang Y, Andersson CR, Kondo T, Golden SS, Johnson CH. 1998. Resonating circadian clocks enhance fitness in cyanobacteria. Proc. Natl Acad. Sci. USA 95, 8660–8664. ( 10.1073/pnas.95.15.8660) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Xu Y, Mori T, Johnson CH. 2003. Cyanobacterial circadian clockwork: roles of KaiA, KaiB and the KaiBC promoter in regulating KaiC. EMBO J. 22, 2117–2126. ( 10.1093/emboj/cdg168) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Zoller B, Little SC, Gregor T. 2018. Diverse spatial expression patterns emerge from unified kinetics of transcriptional bursting. Cell 175, 835–847.e25. ( 10.1016/j.cell.2018.09.056) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Oates AC. 2011. What’s all the noise about developmental stochasticity? Development 138, 601–607. ( 10.1242/dev.059923) [DOI] [PubMed] [Google Scholar]
  • 56.Lammers NC, Galstyan V, Reimer A, Medin SA, Wiggins CH, Garcia HG. 2020. Multimodal transcriptional control of pattern formation in embryonic development. Proc. Natl Acad. Sci. USA 117, 836–847. ( 10.1073/pnas.1912500117) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Yu J, Xiao J, Ren X, Lao K, Xie XS. 2006. Probing gene expression in live cells, one protein molecule at a time. Science 311, 1600–16033. ( 10.1126/science.1119623) [DOI] [PubMed] [Google Scholar]
  • 58.Berg OG. 1978. A model for the statistical fluctuations of protein numbers in a microbial population. J. Theor. Biol. 71, 587–603. ( 10.1016/0022-5193(78)90326-0) [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The code used to obtain the results is freely available at the repository https://github.com/2piruben/langil/tree/master/examples/CellCycleVariability.


Articles from Journal of the Royal Society Interface are provided here courtesy of The Royal Society

RESOURCES