A unified framework for measuring selection on cellular lineages and traits

Shunpei Yamauchi; Takashi Nozoe; Reiko Okura; Edo Kussell; Yuichi Wakamoto

doi:10.7554/eLife.72299

eLife. 2022 Dec 6;11:e72299.

Editors: Armita Nourmohammad, Aleksandra M Walczak

PMCID: PMC9725751 PMID: 36472074

Abstract

Intracellular states probed by gene expression profiles and metabolic activities are intrinsically noisy, causing phenotypic variations among cellular lineages. Understanding the adaptive and evolutionary roles of such variations requires clarifying their linkage to population growth rates. Extending a cell lineage statistics framework, here we show that a population’s growth rate can be expanded by the cumulants of a fitness landscape that characterize how fitness distributes in a population. The expansion enables quantifying the contribution of each cumulant, such as variance and skewness, to population growth. We introduce a function that contains all the essential information of cell lineage statistics, including mean lineage fitness and selection strength. We reveal a relation between fitness heterogeneity and population growth rate response to perturbation. We apply the framework to experimental cell lineage data from bacteria to mammalian cells, revealing distinct levels of growth rate gain from fitness heterogeneity across environments and organisms. Furthermore, third or higher order cumulants’ contributions are negligible under constant growth conditions but could be significant in regrowing processes from growth-arrested conditions. We identify cellular populations in which selection leads to an increase of fitness variance among lineages in retrospective statistics compared to chronological statistics. The framework assumes no particular growth models or environmental conditions, and is thus applicable to various biological phenomena for which phenotypic heterogeneity and cellular proliferation are important.

Research organism: E. coli, S. pombe, Other

Introduction

Growth rates of cellular populations are physiological quantities directly linked to the fitness of cellular organisms. To understand the roles of biological processes and reactions within cells, including modulation of gene expression and metabolic states, one must characterize how they are eventually channeled into an increase or maintenance of population growth rates.

As documented by many single-cell studies, phenotypic states of individual cells in cellular populations are heterogeneous and often correlate with fitness variations among cellular lineages (Balázsi et al., 2011; Elowitz et al., 2002; Kelly and Rahn, 1932; Powell, 1956; Wakamoto et al., 2005; Wang et al., 2010; Cerulus et al., 2016; Susman et al., 2018). Fitness heterogeneity within a population biases the contributions of ancestral cells to the number of descendants, a bias broadly referred to as ‘selection’ (Leibler and Kussell, 2010). This bias makes the relations between cellular lineages and populations nontrivial. For example, an intriguing consequence of intra-population selection is a growth rate gain, a phenomenon in which a cell population’s growth rate becomes greater than the mean division rate of isolated single-cell lineages (Powell, 1956; Hashimoto et al., 2016; Rochman et al., 2018). Recent progress in single-cell measurements has enabled high-throughput acquisition of cellular lineage trees and of the historical dynamics in each lineage (Stewart et al., 2005; Wang et al., 2010; Hashimoto et al., 2016). However, the theory and methods of cellular lineage statistics needed to quantify fitness differences among phenotypic states and intra-population selection are still being established (Nozoe et al., 2017; García-García et al., 2019; Levien et al., 2020; Genthon and Lacoste, 2020; Genthon and Lacoste, 2021).

Growth of cellular populations can be described using the ensemble of individual cells’ growth histories (Leibler and Kussell, 2010). A theoretical approach that regards a cell lineage (history) as a basic unit of analysis has offered illuminating insights into population dynamics. For example, it has provided the formula for untangling selection from responses (Leibler and Kussell, 2010), population response to age-specific changes in mortality and fecundity (Wakamoto et al., 2012), fluctuation relations of fitness (Kobayashi and Sughiyama, 2015; Genthon and Lacoste, 2020), and relations between cell size growth rate and population growth rate (Thomas, 2007; Lin and Amir, 2017).

Employing this cell history-based formulation of population dynamics, we have previously proposed a method of cellular lineage statistics that allows quantification of fitness landscapes and selection strength for any traits of cellular lineages (Nozoe et al., 2017). Here, we extend this statistical framework and show that population growth rates can be expanded by the cumulants that represent various properties of fitness distributions, such as variance and skewness, in a population. We apply the framework to experimental single-cell lineage data of bacteria, yeast, and mammalian cells to quantify their condition-dependent growth heterogeneity and its contribution to population growth rate. We also apply this framework to measuring the fitness landscapes for a growth-regulating sigma factor in E. coli and identify the conditions where its continuum and non-genetic expression heterogeneity correlates with lineage fitness in cellular populations.

Examples of biological questions

Before detailing the theoretical and experimental results, we first present several biological questions for which cell lineage statistics could provide essential insights.

Growth rate gain

Growth of individual cells is heterogeneous in a cellular population even under constant environmental conditions (Stewart et al., 2005; Wakamoto et al., 2005; Wang et al., 2010; Hashimoto et al., 2016). Whether genetic or non-genetic, such growth heterogeneity inevitably enables selection within a cellular population. Growth heterogeneity can raise a population’s growth rate above the mean replication (division) rate of individual cells, an effect known as ‘growth rate gain’ (Hashimoto et al., 2016). Since population growth rate is one of the critical quantities that determine long-term evolutionary success, it is interesting to ask to what extent growth heterogeneity contributes to population growth rate and how the contributions change depending on cellular phenotypes, genotypes (e.g. species), and environmental conditions. Answering this question may uncover how each organism exploits inherent stochasticity for population growth.

As we detail below, a measure of selection strength, $S_{KL}^{(1)} [D]$ , can quantify the growth rate gain from growth heterogeneity. Furthermore, we show that one can quantitatively decompose $S_{KL}^{(1)} [D]$ into the contributions of distinct characteristics of growth heterogeneity, such as variance and skewness of fitness distributions. In this study, we apply the cell lineage statistics framework to single-cell lineage data and unravel how the growth rate gain changes across environments and organisms.

Selection in changing environments

When a population of cells faces environmental changes, the responses of individual cells can be uniform or heterogeneous (Lambert et al., 2014; Julou et al., 2020). In one scenario, individual cells might respond to an environmental change uniformly and contribute to the future population nearly equally with respect to the number of descendants. In another scenario, only a tiny fraction of the cell population could respond to an environmental change, and the descendants of the responders might dominate the entire future population. In this case, the selection within a population is intense, and the nature of a population’s response depends exclusively on these rare cell lineages. Typically, the responses of real cell populations would fall between these two extremes; to understand their response and adaptation strategies, it is therefore critical to ask how strongly selection occurs within cellular populations in response to environmental changes.

The framework enables such quantification by evaluating the selection strength $S_{KL}^{(1)} [D]$ of responding cell populations. Importantly, quantifying the selection strength $S_{KL}^{(1)} [D]$ requires only the information of division counts in cellular lineages. Hence, the selection strength is measurable even for complex processes where clarifying the transitions of environmental conditions around cells is technically challenging. We indeed analyze cellular populations of E. coli regrowing from an early or late stationary phase and characterize distinct levels of selection depending on the duration of stationary phase.

Correlations between cellular lineage traits and fitness

Since various traits of individual cells, such as expression levels of particular genes (Elowitz et al., 2002), are heterogeneous in cellular populations, it is natural to ask how strongly trait heterogeneity correlates with the fitness of individual cell lineages. Quantifying such correlations will allow us to understand which traits are under strong selection and potentially crucial for long-term evolution.

The cell lineage statistics framework quantifies relationships between traits and fitness using fitness landscapes $h (x)$ . Additionally, the overall correlation between the heterogeneity of traits and that of fitness can be quantified by the relative selection strength $S_{rel} [X]$ . In this study, we measure $h (x)$ and $S_{rel} [X]$ for a growth-regulation sigma factor in E. coli to unravel whether its continuum expression level heterogeneity is correlated with the fitness heterogeneity of single cell lineages.

Clarifying trait and fitness correlations based on individual-cell-based analyses is difficult when growth and traits fluctuate rapidly over time and when the traits affect growth with delays. In such circumstances, instantaneous correlations between traits and growth might not report their relations correctly. On the other hand, the cell-lineage-based analysis of this framework can take the whole dynamics of traits in cell lineages into account. For example, if we expect that absolute expression levels are important for fitness, the expression level averaged in each cell lineage can be employed as the lineage trait, and its fitness landscape and selection strength are measurable. If large fluctuations affect cell fates and contribute to diversification of cell lineage fitness within a cellular population (Purvis and Lahav, 2013), the variances of expression levels can be taken as lineage traits, and one can evaluate their fitness landscape $h (x)$ and relative selection strength $S_{rel} [X]$ . Therefore, the assumption of a cell lineage as a unit of selection can significantly extend the choice of traits, including time-dependent properties, and can provide insights into cellular dynamics that cannot be gained without the lineage-based formulation of fitness and selection.

Theoretical background

First, we briefly review the analytical framework of cell lineage statistics introduced in Nozoe et al., 2017. This framework allows us to quantitatively infer fitness differences associated with distinct states of cellular lineage traits and selection within a growing cell population from empirical single-cell lineage tree data. Time-lapse single-cell measurements provide cellular growth and division information in the form of lineage trees (Figure 1, Stewart et al., 2005). We regard a lineage $σ$ as a cell history traceable back from a descendant cell at the final time point $t = τ$ (Figure 1B). For the case of cellular growth shown in Figure 1A, 22 cell lineages exist in the trees.

Figure 1. — (A) Time-lapse images of a growing microcolony of *Escherichia coli* expressing green fluorescent protein (GFP) from plasmids. Scale bars, 5 μm. (B) Cellular lineage trees for the microcolony in A. Bifurcations in the trees represent cell divisions. $σ$ denotes cell lineage labels. $D (σ)$ shows the number of cell divisions in each lineage. $P_{cl} (σ)$ and $P_{rs} (σ)$ are chronological and retrospective probabilities defined in the main text.

We assign two types of probability weight to cellular lineages. One is the retrospective probability, in which we assign equal weight $P_{rs} (σ) := 1 / N_{τ}$ to all lineages, where $N_{τ}$ is the number of cells at the final time point $t = τ$ . $P_{rs} (σ)$ represents the probability of selecting the history of a cell present at the endpoints of lineage trees. The other is the chronological probability, in which we assign the weight $P_{cl} (σ) := 2^{- D (σ)} / N_{0}$ to the lineages, where $D (σ)$ is the number of cell divisions on lineage $σ$ and $N_{0}$ is the initial number of cells at $t = 0$ . $P_{cl} (σ)$ represents the probability of choosing lineage $σ$ by descending the tree from one of the ancestor cells at $t = 0$ and selecting one branch with probability $1 / 2$ at every cell division. $P_{rs} (σ)$ and $P_{cl} (σ)$ generally differ when the number of cell divisions varies among the cell lineages, as shown in Figure 1B.
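As a minimal sketch of these two weightings, the following Python snippet computes the chronological and retrospective probabilities from division counts. The toy tree (one ancestor and four endpoint lineages with 1, 2, 3, and 3 divisions) and the function name `lineage_probabilities` are illustrative assumptions, not data or code from this study. The sketch also assumes that every lineage is tracked to the final time point without cell loss, so that the chronological weights sum to one.

```python
import numpy as np

def lineage_probabilities(division_counts, n_initial):
    """Return (P_cl, P_rs) for each lineage, given its division count D(sigma)."""
    d = np.asarray(division_counts, dtype=float)
    n_final = d.size                          # N_tau: one lineage per endpoint cell
    p_rs = np.full(n_final, 1.0 / n_final)    # equal weight for every endpoint lineage
    p_cl = 2.0 ** (-d) / n_initial            # weight halves at every division
    return p_cl, p_rs

# Hypothetical toy tree: one ancestor (N_0 = 1), four lineages at t = tau.
division_counts = [1, 2, 3, 3]
p_cl, p_rs = lineage_probabilities(division_counts, n_initial=1)
print("P_cl:", p_cl)
print("P_rs:", p_rs)
```

In this toy tree the least-divided lineage carries half of the chronological weight but only a quarter of the retrospective weight, which illustrates how the two probabilities diverge when division counts vary.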

We define retrospective and chronological probabilities for a lineage trait $X$ as $Q_{r s} (x) := \sum_{σ : X (σ) = x} P_{r s} (σ)$ and $Q_{c l} (x) := \sum_{σ : X (σ) = x} P_{c l} (σ)$ , where $X (σ)$ is the value of trait $X$ for lineage $σ$ . Here, we regard any measurable quantity associated with cellular lineages as a lineage trait $X$ . For example, time-averaged expression levels and production rates of a drug-resistance protein were analyzed as lineage traits in the experiments of Nozoe et al., 2017. Intuitively, $Q_{cl} (x)$ and $Q_{rs} (x)$ represent the probabilities of finding the lineage trait value $X = x$ before and after selection, respectively.

Using these retrospective and chronological distributions, we define the fitness landscape for lineage trait $X$ as

$$h (x) := τ Λ + \ln \frac{Q_{rs} (x)}{Q_{cl} (x)}, \tag{1}$$

where $Λ := \frac{1}{τ} \ln \frac{N_{τ}}{N_{0}}$ is the population growth rate. This definition relates the relative difference of the retrospective probability from the chronological probability to fitness. $h (x)$ becomes greater than $τ Λ$ if the lineage trait state $X = x$ is overrepresented in the retrospective probability relative to chronological probability and vice versa. Furthermore, if none of the states of lineage trait $X$ are overrepresented nor underrepresented, $h (x)$ becomes constant across the states and equals $τ Λ$ for all $x$ . The fitness landscape $h (x)$ thus represents fitness differences mapped on the lineage trait space of $X$ (see Figure 2 and Box 1).
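The definition in Equation 1 can be sketched in a few lines of Python. The function name `fitness_landscape` and the toy division counts are hypothetical and serve only to illustrate the calculation; the sketch assumes complete lineage trees with no cell loss.

```python
import math
from collections import defaultdict

def fitness_landscape(traits, division_counts, n_initial):
    """Estimate h(x) = tau*Lambda + ln(Q_rs(x) / Q_cl(x)) from lineage data."""
    n_final = len(division_counts)              # N_tau: one lineage per endpoint cell
    tau_lambda = math.log(n_final / n_initial)  # tau * Lambda = ln(N_tau / N_0)
    q_cl = defaultdict(float)
    q_rs = defaultdict(float)
    for x, d in zip(traits, division_counts):
        q_cl[x] += 2.0 ** (-d) / n_initial      # chronological weight of this lineage
        q_rs[x] += 1.0 / n_final                # retrospective weight of this lineage
    return {x: tau_lambda + math.log(q_rs[x] / q_cl[x]) for x in q_cl}

# Hypothetical toy tree: one ancestor, four lineages with 1, 2, 3, 3 divisions.
division_counts = [1, 2, 3, 3]
h_division = fitness_landscape(division_counts, division_counts, n_initial=1)
for d, h in sorted(h_division.items()):
    print(d, h / math.log(2))   # h(d) expressed in units of ln 2
```

Using the division count itself as the trait, the sketch returns values close to $d \ln 2$ , consistent with the landscape quoted for $D$ in the Results.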

Figure 2. — (A) Non-uniform fitness landscape and broad trait distribution. The gray distribution represents a chronological distribution of lineage trait $x$ ; the cyan distribution represents a retrospective distribution of lineage trait $x$ ; and the black dashed line represents a fitness landscape. Due to the non-uniform fitness landscape and the broad chronological distribution, there is trait fitness heterogeneity for selection to act on. The retrospective distribution therefore shifts significantly from the chronological distribution, and the selection strength is large ( $S [X] > 0$ ). (B) Non-uniform fitness landscape and narrow trait distribution. Due to the lack of trait heterogeneity, there is little fitness heterogeneity for selection to act on. The retrospective distribution shifts only slightly from the chronological distribution, and the selection strength is small ( $S [X] \approx 0$ ). (C) Uniform fitness landscape. When the fitness landscape is constant ( $= τ Λ$ ) across the lineage trait state $x$ , there can be no trait fitness heterogeneity regardless of whether the trait distribution itself is narrow or broad. The selection strength is therefore zero ( $S [X] = 0$ ).

Box 1. A glossary of the terms.

Here, we provide intuitive and illustrative explanations of the essential quantities in the cell lineage statistics and discuss their similarities and differences compared to the common usage in evolutionary biology.

Fitness: In evolutionary biology, fitness refers to the expected per capita contribution of individuals of a particular trait (usually a genotype) to the future population (Futuyma, 2010). For example, if a set of $N_{0}$ individuals with trait $X$ produce $N_{1}$ descendants on average in the future population, the fitness of this trait would be $N_{1} / N_{0}$ . Since proliferation usually proceeds multiplicatively, the logarithm of fitness, $\ln (N_{1} / N_{0})$ , is also often referred to as ‘fitness’. Analogously, in our framework we define fitness for cell lineage traits as the expected contribution of lineages with a given trait value to the future population. For each cell lineage $σ$ , the number of cell divisions occurring along the lineage, $D (σ)$ , is used to estimate the expected contribution of each lineage to the future population.

Fitness landscape: In evolutionary biology, fitness landscapes are visual representations of relationships between reproductive abilities (fitness) and genotypes (Futuyma, 2010), where the height along the landscape corresponds to fitness. Since “genotype space” is vast and usually difficult to construct or visualize, fitness landscapes are often referred to as a metaphorical or conceptual tool for understanding complex evolutionary processes. For practical applications, however, fitness landscapes are often mapped on a low dimensional allele frequency space or a phenotypic space. Analogously, in our framework fitness landscapes are mapped on cell lineage trait spaces. However, they are different in that there is no assumption of genotypic differences underlying different trait states. Furthermore, the landscapes are directly measurable using cellular lineage trees and trait dynamics in each lineage.

For a cell lineage trait $X$ , we define its fitness landscape to be a function $h (x)$ that reports the expected reproductive success of lineages having trait value $X = x$ . Each lineage $σ$ having trait value $x$ contributes $2^{D (σ)}$ lineages to the future population, and by summing over lineages sharing the same trait value, we estimate the expected reproductive success of the trait and measure its fitness landscape. If differences in $X$ correlate with division count heterogeneity among cell lineages, $h (x)$ varies across the trait space of $X$ ; if differences in $X$ are uncorrelated with division count heterogeneity, $h (x)$ is constant over the entire space of $X$ (Figure 2).

Selection: The term selection refers to processes in which the frequencies of individuals with different traits change due to differences in their fitness (Futuyma, 2010). In evolutionary biology, selection is usually assessed based on changes in the distribution of traits between two points in time, which requires an accurate measure of fitness and a model to determine whether the observed changes were the result of trait fitness differences. In our cell lineage statistics framework, we measure selection by determining whether the observed distribution of lineage traits (i.e. the retrospective distribution) differs from the distribution expected in the absence of fitness differences (i.e. the chronological distribution). The key advantage of lineage-based analysis is the ability to construct the chronological distribution explicitly; this distribution is the natural ‘null hypothesis’ against which selection can be tested in a model-independent manner.

Selection strength: $S [X]$ (i.e., $S_{J F} [X]$ , $S_{K L}^{(1)} [X]$ , or $S_{K L}^{(2)} [X]$ ) is a quantitative measure that reports how strongly differences in the states of cell lineage trait $X$ are correlated with cell lineage fitness, taking the distributions of $X$ into account. The selection strength in our framework is measured by differences in the fitness measures or by differences between chronological and retrospective distributions (Equations 2–4). One can prove that these different definitions are mathematically equivalent.

The three situations depicted in Figure 2 would help us to gain an intuitive understanding of the properties and meanings of selection strength. When $X$ is correlated with fitness, a fitness landscape $h (x)$ becomes non-uniform, as mentioned above. When the states of lineage trait $X$ are heterogeneous and distributed widely within a population, the selection causes a significant difference between chronological and retrospective distributions due to the biased representation of trait states by selection. Therefore, the selection strength becomes large ( $S [X] > 0$ , Figure 2A). In the second situation, $h (x)$ is again non-uniform, but the distribution of $x$ is narrow. In this case, there is almost no effective trait heterogeneity in the population on which selection can act. Consequently, the overall extent of selection becomes small, i.e., selection strength becomes small ( $S [X] \approx 0$ , Figure 2B). Finally, when $h (x)$ is uniform over the observed state of $x$ , selection can neither overrepresent nor underrepresent any states, no matter how the trait $x$ distributes in a population. Therefore, the chronological and retrospective distributions become identical, and the selection strength becomes zero ( $S [X] = 0$ , Figure 2C).

These examples show that $S [X]$ can gauge to what extent selection acts on a lineage trait $X$ , considering both shapes of fitness landscapes and distributions of lineage traits in a population. Therefore, if $X$ is a trait of interest, quantifying $S [X]$ or the relative strength of selection $S [X] / S [D]$ determines how strongly the heterogeneity of $X$ is correlated with fitness differences of cell lineages.

In evolutionary biology, various measures are used to quantify how strongly selection acts in a population of interest. For example, the ‘coefficient of selection’ measures a relative difference in fitness of each genotype from that of the fittest genotype (Futuyma, 2010). This measure is useful when considering the selection against a particular reference genotype. The overall intensity of selection in a population can be quantified by changes in mean fitness before and after selection, variances of fitness before selection, changes in the mean of log fitness, and Jeffreys divergence between trait distributions before and after selection (Frank, 2012). Therefore, our definitions of selection strength follow the standard measures for the overall selection in evolutionary biology both conceptually and mathematically but are different in that the mean fitness and distributions of chronological and retrospective statistics are used.

Cumulants: In Results, we consider the contributions of the cumulants of a fitness landscape to population growth. The cumulants of a probability distribution are a set of quantities that characterize the distribution. For a discrete probability distribution $P (x)$ , its cumulant generating function is defined as

$$K (ξ) := \ln E [e^{ξ X}] = \ln \sum_{x} e^{ξ x} P (x), \tag{6}$$

and the $n$ -th order cumulant $κ_{n}$ is obtained by evaluating the $n$ -th order derivative of $K (ξ)$ at $ξ = 0$ , i.e.,

$$κ_{n} := \left. \frac{d^{n} K (ξ)}{d ξ^{n}} \right|_{ξ = 0} . \tag{7}$$

Notably, the first few cumulants correspond to important statistical quantities. The first-order cumulant $κ_{1}$ corresponds to the mean $⟨ X ⟩ := E [X] = \sum_{x} x P (x)$ , and the second-order cumulant $κ_{2}$ corresponds to the variance $V a r [X] := E [X^{2}] - E [X]^{2} = \sum_{x} x^{2} P (x) - {(\sum_{x} x P (x))}^{2}$ . The skewness of a distribution is usually defined as $E [{(\frac{X - E [X]}{\sqrt{V a r [X]}})}^{3}]$ , and this quantity can be expressed as $κ_{3} / κ_{2}^{\frac{3}{2}}$ using the third-order cumulant. Since $κ_{2}$ is positive, the sign of $κ_{3}$ determines the direction of the skewness: When $κ_{3} > 0$ , the distribution is skewed to the right with a long right tail; when $κ_{3} < 0$ , the distribution is skewed to the left with a long left tail.

One can also define ‘selection strength’ using $Q_{rs} (x)$ and $Q_{cl} (x)$ as

$$S_{JF} [X] := J [Q_{cl} (X), Q_{rs} (X)] = ⟨ h (X) ⟩_{rs} - ⟨ h (X) ⟩_{cl}, \tag{2}$$

where $J [Q_{c l} (X), Q_{r s} (X)] := \sum_{x} (Q_{c l} (x) - Q_{r s} (x)) \ln \frac{Q_{c l} (x)}{Q_{r s} (x)}$ is Jeffreys divergence, and $⟨ h (X) ⟩_{r s} := \sum_{x} h (x) Q_{r s} (x)$ and $⟨ h (X) ⟩_{c l} := \sum_{x} h (x) Q_{c l} (x)$ are the retrospective and chronological mean fitness for lineage trait $X$ . Jeffreys divergence measures dissimilarity between two probability distributions. Therefore, $S_{JF} [X]$ measures dissimilarity between the chronological and retrospective distributions caused by selection. Notably, one can link this dissimilarity to the difference in the mean fitness, as shown in Equation 2. Since Jeffreys divergence is non-negative, the retrospective mean fitness (mean fitness after selection) is equal to or greater than the chronological mean fitness (mean fitness before selection).

This measure of selection strength quantifies how strongly differences in the states of lineage trait $X$ correlate with the differences in lineage fitness. Therefore, one can unravel which traits correlate with lineage fitness strongly by evaluating this for traits of interest.

Likewise, we can define two alternative selection strength measures:

$$S_{KL}^{(1)} [X] := D_{KL} [Q_{cl} (X) \| Q_{rs} (X)] = τ Λ - ⟨ h (X) ⟩_{cl}, \tag{3}$$

$$S_{KL}^{(2)} [X] := D_{KL} [Q_{rs} (X) \| Q_{cl} (X)] = ⟨ h (X) ⟩_{rs} - τ Λ, \tag{4}$$

where $D_{KL} [Q_{cl} (X) \| Q_{rs} (X)] := \sum_{x} Q_{cl} (x) \ln \frac{Q_{cl} (x)}{Q_{rs} (x)}$ and $D_{KL} [Q_{rs} (X) \| Q_{cl} (X)] := \sum_{x} Q_{rs} (x) \ln \frac{Q_{rs} (x)}{Q_{cl} (x)}$ are the Kullback-Leibler divergences between the two distributions. Note that $S_{KL}^{(1)} [X] + S_{KL}^{(2)} [X] = S_{JF} [X]$ .
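The three divergence-based measures can be evaluated directly from a pair of trait distributions, as in the following sketch. The function name `selection_strengths` and the two example distributions are hypothetical; the sketch assumes that every trait value has nonzero probability under both distributions, so that all logarithms are finite.

```python
import numpy as np

def selection_strengths(q_cl, q_rs):
    """Return (S_JF, S_KL1, S_KL2) for two distributions on the same trait values.

    Assumes every trait value has nonzero probability under both distributions.
    """
    q_cl = np.asarray(q_cl, dtype=float)
    q_rs = np.asarray(q_rs, dtype=float)
    log_ratio = np.log(q_cl / q_rs)
    s_kl1 = np.sum(q_cl * log_ratio)            # D_KL[Q_cl || Q_rs]
    s_kl2 = -np.sum(q_rs * log_ratio)           # D_KL[Q_rs || Q_cl]
    s_jf = np.sum((q_cl - q_rs) * log_ratio)    # Jeffreys divergence
    return s_jf, s_kl1, s_kl2

# Hypothetical distributions over three trait values (e.g. d = 1, 2, 3).
q_cl = [0.50, 0.25, 0.25]
q_rs = [0.25, 0.25, 0.50]
s_jf, s_kl1, s_kl2 = selection_strengths(q_cl, q_rs)
print(s_jf, s_kl1, s_kl2)
```

For any valid input the first returned value should equal the sum of the other two, up to floating-point error, which gives a quick consistency check on an implementation.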

These three selection strength measures share common properties: they are always non-negative and report the overall correlations between trait states and fitness. We exclusively used $S_{JF} [X]$ as the selection strength measure in our previous study (Nozoe et al., 2017). However, $S_{KL}^{(1)} [X]$ , $S_{KL}^{(2)} [X]$ , and their difference $S_{KL}^{(2)} [X] - S_{KL}^{(1)} [X]$ each have a distinct biological meaning, as we detail in Results. We evaluate both $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ for the empirical lineage data of various organisms and use them to unravel distinct effects of selection on fitness variances. The meanings and roles of the different selection strength measures are clarified in this study.

Importantly, division count $D$ is also a lineage trait, and its selection strength sets the upper bound for the selection strength of any lineage trait, irrespective of the choice of selection strength measure, as discussed in Appendix 3. Therefore, the selection strength relative to that of $D$ is bounded between 0 and 1 and evaluates how strongly the heterogeneity of $X$ correlates with the division count heterogeneity in a given cellular population. This relative measure is useful when comparing the strength of correlations between lineage traits and fitness across conditions. In this study, we define relative selection strength as

$$S_{rel} [X] := \frac{S_{KL}^{(1)} [X]}{S_{KL}^{(1)} [D]}, \tag{5}$$

and use it in the analysis.
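A minimal sketch of Equation 5 follows, assuming complete lineage trees. The helper `s_kl1`, the two-state trait labels, and the toy division counts are all made up for illustration and do not correspond to any dataset in this study.

```python
import math
from collections import defaultdict

def s_kl1(traits, division_counts, n_initial):
    """S_KL^(1)[X] = sum_x Q_cl(x) ln(Q_cl(x) / Q_rs(x)) from lineage data."""
    n_final = len(division_counts)
    q_cl, q_rs = defaultdict(float), defaultdict(float)
    for x, d in zip(traits, division_counts):
        q_cl[x] += 2.0 ** (-d) / n_initial   # chronological weight
        q_rs[x] += 1.0 / n_final             # retrospective weight
    return sum(q_cl[x] * math.log(q_cl[x] / q_rs[x]) for x in q_cl)

# Hypothetical toy data: four lineages from one ancestor, each labeled with
# a made-up two-state trait ("low" or "high").
division_counts = [1, 2, 3, 3]
trait = ["low", "high", "low", "high"]

s_x = s_kl1(trait, division_counts, n_initial=1)            # selection on X
s_d = s_kl1(division_counts, division_counts, n_initial=1)  # selection on D
s_rel = s_x / s_d
print(s_rel)
```

Because the trait labels here only partly track the division counts, the resulting ratio lies strictly between 0 and 1.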

All of the quantities introduced above are measurable without relying on any growth models. Thus, this cell lineage statistics framework is applicable to a wide range of single-cell lineage data.

Results

Growth rate gain and cumulant expansion of population growth rate

To quantify contributions of growth heterogeneity to population growth, we first rewrite the definition of the selection strength $S_{KL}^{(1)} [X]$ (Equation 3) as follows:

$$τ Λ = ⟨ h (X) ⟩_{cl} + S_{KL}^{(1)} [X] . \tag{8}$$

This shows that population growth rates can be decomposed into chronological mean fitness and selection strength. In particular, when we take division count $D$ as a lineage trait, its fitness landscape is $\tilde{h} (d) = d \ln 2$ (Appendix 3), and ${⟨ \tilde{h} (D) ⟩}_{cl} / τ$ represents the mean division rate of cellular lineages without selection. $S_{KL}^{(1)} [D] / τ$ , thus, represents growth rate gain caused by the growth heterogeneity among the cellular lineages in a cellular population. Therefore, evaluating $S_{KL}^{(1)} [D] / τ Λ$ from single-cell lineage data provides information on the contribution of growth heterogeneity to population growth.
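The growth rate gain can be estimated from division counts alone, as the following sketch illustrates. The function name `growth_rate_gain` and the toy inputs are assumptions made for illustration; the calculation uses the relation $\tilde{h} (d) = d \ln 2$ stated above and assumes complete lineage trees.

```python
import numpy as np

def growth_rate_gain(division_counts, n_initial):
    """Return S_KL^(1)[D] / (tau*Lambda), the fractional growth rate gain."""
    d = np.asarray(division_counts, dtype=float)
    tau_lambda = np.log(d.size / n_initial)      # ln(N_tau / N_0)
    p_cl = 2.0 ** (-d) / n_initial               # chronological lineage weights
    mean_h_cl = np.sum(p_cl * d * np.log(2.0))   # <h(D)>_cl with h(d) = d ln 2
    return (tau_lambda - mean_h_cl) / tau_lambda

# Hypothetical toy trees, each grown from a single ancestor.
gain = growth_rate_gain([1, 2, 3, 3], n_initial=1)          # heterogeneous divisions
gain_uniform = growth_rate_gain([2, 2, 2, 2], n_initial=1)  # synchronous divisions
print(gain, gain_uniform)
```

When all lineages divide the same number of times, the gain vanishes; any heterogeneity in division counts makes it positive.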

To further examine the connections between the disparate selection measures and elucidate their meaning, we define a function of a variable $ξ$ as

$$K_{X} (ξ) := \ln ⟨ e^{ξ h (X)} ⟩_{cl} = \ln \sum_{x} e^{ξ h (x)} Q_{cl} (x) . \tag{9}$$

This is the cumulant generating function (cgf) of $h (x)$ with respect to the chronological distribution $Q_{cl}$ . We have $K_{X} (0) = 0$ , and from the definition of fitness landscape $h (x)$ (Equation 1), we find

$$K_{X} (1) = τ Λ . \tag{10}$$
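The step from Equation 1 to Equation 10 can be spelled out as follows. This is a short sketch that uses only the definition of $h (x)$ and the normalization of $Q_{rs}$ :

```latex
\begin{aligned}
e^{h (x)} \, Q_{cl} (x) &= e^{τ Λ} \, \frac{Q_{rs} (x)}{Q_{cl} (x)} \, Q_{cl} (x) = e^{τ Λ} \, Q_{rs} (x), \\
K_{X} (1) &= \ln \sum_{x} e^{h (x)} \, Q_{cl} (x) = \ln \Big( e^{τ Λ} \sum_{x} Q_{rs} (x) \Big) = τ Λ .
\end{aligned}
```

The same calculation shows that weighting $Q_{cl}$ by $e^{h (x)}$ and normalizing recovers $Q_{rs}$ .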

When the radius of convergence of the Taylor expansion of $K_{X} (ξ)$ around $ξ = 0$ is at least 1, $K_{X} (1)$ can be expressed as the series using the cumulants of a fitness landscape as

$$K_{X} (1) = \sum_{n = 1}^{\infty} \frac{κ_{n}^{(X)}}{n!}, \tag{11}$$

where $κ_{n}^{(X)} := {\frac{d^{n} K_{X} (ξ)}{d ξ^{n}} |}_{ξ = 0}$ is the $n$ -th order cumulant, satisfying $κ_{1}^{(X)} = {⟨ h (X) ⟩}_{cl}$ , and $κ_{2}^{(X)} = Var {[h (X)]}_{cl} = {⟨ h {(X)}^{2} ⟩}_{cl} - {⟨ h (X) ⟩}_{cl}^{2}$ . Hence,

$$τ Λ = \sum_{n = 1}^{\infty} \frac{κ_{n}^{(X)}}{n!}, \tag{12}$$

which shows that population growth rates can be expanded by the cumulants of a fitness landscape for any lineage trait $X$ . Additionally, since $κ_{1}^{(X)} = {⟨ h (X) ⟩}_{cl}$ , comparing Equations 8 and 12 yields

$$S_{KL}^{(1)} [X] = \sum_{n = 2}^{\infty} \frac{κ_{n}^{(X)}}{n!} . \tag{13}$$

Therefore, $S_{KL}^{(1)} [X]$ represents the total contribution of second and higher order cumulants to population growth.

The cumulant expansion allows us to quantify the relative contributions of various statistical features of fitness distributions to population growth, such as mean, variance, and skewness. We define the cumulative contribution up to the $n$ -th order cumulant as

$$W_{n}^{(X)} := \frac{1}{τ Λ} \sum_{k = 1}^{n} \frac{κ_{k}^{(X)}}{k!}, \tag{14}$$

and note that $W_{n}^{(X)}$ converges to 1 as $n \to \infty$ . In particular, $W_{1}^{(X)} = \frac{{⟨ h (X) ⟩}_{cl}}{τ Λ}$ and $W_{2}^{(X)} = \frac{1}{τ Λ} ({⟨ h (X) ⟩}_{cl} + \frac{1}{2} Var {[h (X)]}_{cl})$ . We measure $W_{n}^{(D)}$ for various cellular species under steady and non-steady environments in the experimental sections below.
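The cumulative contributions up to fourth order can be sketched numerically as follows. The function name `cumulative_contributions` and the toy landscape are hypothetical. The sketch obtains the cumulants from central moments via the standard relations $κ_{2} = μ_{2}$ , $κ_{3} = μ_{3}$ , and $κ_{4} = μ_{4} - 3 μ_{2}^{2}$ , and it assumes that the supplied landscape and chronological distribution are mutually consistent, so that $K_{X} (1) = τ Λ$ .

```python
import math
import numpy as np

def cumulative_contributions(h, q_cl):
    """Return [W_1, W_2, W_3, W_4] for a fitness landscape h and distribution Q_cl."""
    h = np.asarray(h, dtype=float)
    q = np.asarray(q_cl, dtype=float)
    tau_lambda = np.log(np.sum(q * np.exp(h)))   # K_X(1) = tau * Lambda
    mean = np.sum(q * h)
    # Central moments of h under the chronological distribution.
    mu = {k: np.sum(q * (h - mean) ** k) for k in (2, 3, 4)}
    # Standard moment-to-cumulant relations up to fourth order.
    kappa = [mean, mu[2], mu[3], mu[4] - 3.0 * mu[2] ** 2]
    terms = [k_n / math.factorial(n) for n, k_n in enumerate(kappa, start=1)]
    return np.cumsum(terms) / tau_lambda

# Hypothetical toy example: trait D with h(d) = d ln 2 for d = 1, 2, 3.
h = np.log(2.0) * np.array([1, 2, 3])
q_cl = np.array([0.50, 0.25, 0.25])
w = cumulative_contributions(h, q_cl)
print(w)
```

In this toy example the first-order term accounts for 87.5% of population growth, and the partial sums approach 1 as higher orders are included.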

The function $K_{X} (ξ)$ defined in Equation 9 is useful because it yields various fitness and selection measures by simple algebraic calculation, as shown in Table 1. In general, evaluating $K_{X} (ξ)$ and its derivatives at $ξ = 0$ and $ξ = 1$ gives information on chronological and retrospective statistics, respectively (Appendix 3). Therefore, $K_{X} (ξ)$ contains complete information on the fitness distributions in both chronological and retrospective statistics.

Table 1. Relationships between $K_{X} (ξ)$ and quantities in cellular lineage statistics.

	Quantities in lineage statistics	Symbol	Correspondence to $K_{X} (ξ)$
Fitness	Population growth	$τ Λ$	$K_{X} (1)$
	Chronological mean fitness	${⟨ h (X) ⟩}_{cl}$	$K_{X}^{'} (0)$
	Retrospective mean fitness	${⟨ h (X) ⟩}_{rs}$	$K_{X}^{'} (1)$
	Chronological fitness variance	$Var {[h (X)]}_{cl}$	$K_{X}^{''} (0)$
	Retrospective fitness variance	$Var {[h (X)]}_{rs}$	$K_{X}^{''} (1)$
Selection strength	Jeffreys divergence bet. $Q_{cl} (X)$ and $Q_{rs} (X)$	$S_{JF} [X]$	$K_{X}^{'} (1) - K_{X}^{'} (0)$
	KL divergence of $Q_{c l} (X)$ from $Q_{r s} (X)$	$S_{KL}^{(1)} [X]$	$K_{X} (1) - K_{X}^{'} (0)$
	KL divergence of $Q_{r s} (X)$ from $Q_{c l} (X)$	$S_{KL}^{(2)} [X]$	$K_{X}^{'} (1) - K_{X} (1)$
Growth rate gain/loss	Growth rate gain	$S_{K L}^{(1)} [D] / τ Λ$	$1 - K_{D}^{'} (0) / K_{D} (1)$
	Additional growth rate loss upon perturbation	$- S_{KL}^{(2)} [D] / τ Λ$	$1 - K_{D}^{'} (1) / K_{D} (1)$
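The correspondences in Table 1 can be checked numerically. The sketch below assumes $K_{X} (ξ) = \ln \sum_{x} Q_{cl} (x) e^{ξ h (x)}$ , the form consistent with Equations 12 and 17, and uses the fact that the first and second derivatives of such a function are the mean and variance of $h$ under the exponentially tilted distribution. The distribution and the function names are hypothetical.

```python
import math


def k_and_derivatives(fitness, q_cl, xi):
    """Return K(xi), K'(xi), K''(xi) for K(xi) = ln sum_x Q_cl(x) exp(xi h(x)).

    K' and K'' are the mean and variance of h under the tilted distribution
    Q_cl(x) exp(xi h(x)) / exp(K(xi)).
    """
    weights = [q * math.exp(xi * h) for h, q in zip(fitness, q_cl)]
    total = sum(weights)
    mean = sum(w * h for w, h in zip(weights, fitness)) / total
    var = sum(w * (h - mean) ** 2 for w, h in zip(weights, fitness)) / total
    return math.log(total), mean, var


def lineage_statistics(fitness, q_cl):
    """Collect the Table 1 quantities from K at xi = 0 and xi = 1."""
    _, mean_cl, var_cl = k_and_derivatives(fitness, q_cl, 0.0)
    tau_lambda, mean_rs, var_rs = k_and_derivatives(fitness, q_cl, 1.0)
    return {
        "tau_lambda": tau_lambda,          # K(1)
        "mean_cl": mean_cl,                # K'(0)
        "mean_rs": mean_rs,                # K'(1)
        "var_cl": var_cl,                  # K''(0)
        "var_rs": var_rs,                  # K''(1)
        "S_JF": mean_rs - mean_cl,         # K'(1) - K'(0)
        "S_KL1": tau_lambda - mean_cl,     # K(1) - K'(0)
        "S_KL2": mean_rs - tau_lambda,     # K'(1) - K(1)
        "gain": 1.0 - mean_cl / tau_lambda,
    }


# Hypothetical chronological distribution of division counts.
d_values = [2, 3, 4, 5]
q_cl = [0.1, 0.3, 0.4, 0.2]
stats = lineage_statistics([d * math.log(2) for d in d_values], q_cl)
for name, value in stats.items():
    print(f"{name:>10s} = {value:.4f}")
```

Evaluating the tilted distribution at $ξ = 1$ reproduces the retrospective distribution, which is why a single function suffices for both kinds of statistics.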


Difference in the selection strength measures reveals the effect of selection on fitness variance

The difference between the two selection strength measures $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ is determined by the higher order cumulants by the relation

S_{K L}^{(2)} [X] - S_{K L}^{(1)} [X] = \sum_{n = 3}^{\infty} \frac{κ_{n}^{(X)}}{n!} (n - 2)

(15)

(Appendix 3). When fourth or higher order cumulants are negligible, the sign of the third-order fitness cumulant $κ_{3}^{(X)}$ , which reflects the skewness of the fitness distribution, determines which selection strength measure is greater.
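Equation 15 can be recovered in a few lines from the Taylor expansion of $K_{X} (ξ)$ around $ξ = 0$ together with the correspondences in Table 1. The following is a short sketch of that calculation under the assumption that the expansion converges at $ξ = 1$ ; it is not a reproduction of Appendix 3.

```latex
\begin{aligned}
K_X(\xi) &= \sum_{n=1}^{\infty} \frac{\kappa_n^{(X)}}{n!}\,\xi^{n},
\qquad
K_X'(\xi) = \sum_{n=1}^{\infty} \frac{\kappa_n^{(X)}}{(n-1)!}\,\xi^{n-1}, \\
S_{KL}^{(2)}[X] - S_{KL}^{(1)}[X]
  &= \bigl(K_X'(1) - K_X(1)\bigr) - \bigl(K_X(1) - K_X'(0)\bigr) \\
  &= \sum_{n=1}^{\infty} \frac{\kappa_n^{(X)}}{n!}\,(n-2) + \kappa_1^{(X)} \\
  &= \sum_{n=3}^{\infty} \frac{\kappa_n^{(X)}}{n!}\,(n-2)
  \;\approx\; \frac{\kappa_3^{(X)}}{6}
  \quad \text{when } \kappa_{n \geq 4}^{(X)} \text{ are negligible.}
\end{aligned}
```

The $n = 1$ term of the sum cancels against $κ_{1}^{(X)}$ and the $n = 2$ term vanishes, so the variance does not contribute to the difference.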

The relations among the fitness and selection strength measures can be graphically depicted by plotting $K_{X}^{'} (ξ)$ in the interval $0 \leq ξ \leq 1$ (Figure 3A). $S_{KL}^{(1)} [X]$ corresponds to the area between $y = {⟨ h (X) ⟩}_{cl}$ and $y = K_{X}^{'} (ξ)$ ; and $S_{KL}^{(2)} [X]$ corresponds to the area between $y = K_{X}^{'} (ξ)$ and $y = {⟨ h (X) ⟩}_{rs}$ (Figure 3A). Therefore, the skewness of fitness distribution primarily determines the convexity of $K_{X}^{'} (ξ)$ (Figure 3B–G).

Figure 3. — (A) Graphical representation of various fitness and selection strength measures by $K_{X}^{'} (ξ)$ -plot. Blue curve represents $K_{X}^{'} (ξ)$ . The area between the horizontal axis and $K_{X}^{'} (ξ)$ in the interval $0 \leq ξ \leq 1$ outlined in red corresponds to population growth $τ Λ$ . The gray and hatched regions correspond to ${⟨ h (X) ⟩}_{cl}$ and $S_{KL}^{(1)} [X]$ , respectively. The area between $K_{X}^{'} (ξ)$ and $y = {⟨ h (X) ⟩}_{rs}$ corresponds to $S_{KL}^{(2)} [X]$ . (**B-D**) Representative shapes of $K_{X}^{'} (ξ)$ depending on $κ_{3}^{(X)}$ . Assuming that the contributions from fourth or higher-order cumulants are negligible, $K_{X}^{'} (ξ)$ becomes convex downward when $κ_{3}^{(X)} > 0$ (B); a straight line when $κ_{3}^{(X)} = 0$ (C); and convex upward when $κ_{3}^{(X)} < 0$ (D). (**E-G**) Relationships between third-order fitness cumulant and skewness of chronological distribution $Q_{cl} (h)$ .

The difference between the two selection strength measures can reveal the effect of selection on fitness variances. The slope of the tangent lines to $K_{X}^{'} (ξ)$ at $ξ = 0$ and 1 corresponds to the chronological and retrospective fitness variances, respectively (Table 1). Therefore, when $K_{X}^{'} (ξ)$ is convex upward in the interval $0 \leq ξ \leq 1$ ( $κ_{3}^{(X)} < 0$ , i.e., $S_{K L}^{(1)} [X] > S_{K L}^{(2)} [X]$ , as in Figure 3D), the effect of selection is to decrease the lineage fitness variance in the retrospective distribution relative to the chronological distribution, whereas if $K_{X}^{'} (ξ)$ is convex downward ( $κ_{3}^{(X)} > 0$ , i.e., $S_{K L}^{(1)} [X] < S_{K L}^{(2)} [X]$ , as in Figure 3B), selection increases the fitness variance. We indeed find cases of both kinds of behavior in the experimental lineage data, as will be seen below. Therefore, one can probe the effect of selection on fitness variances by comparing the two selection strength measures $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ .

Significant differences between $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ indicate non-negligible contributions of higher-order cumulants. In such circumstances, the fitness distributions are far from Gaussian with significant skews or multiple peaks. Therefore, higher-order cumulants can also be used to probe the existence of sub-populations in cellular populations.
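The link between skew, the ordering of the two selection strength measures, and the change in fitness variance can be illustrated with two hypothetical two-point distributions of division counts, one with a small fast-dividing subpopulation and one with a small non-dividing subpopulation. This is a toy sketch with made-up numbers, using $Q_{rs} (d) = 2^{d} Q_{cl} (d) / e^{τ Λ}$ .

```python
import math

LN2 = math.log(2)


def selection_summary(division_counts, q_cl):
    """Return S_KL^(1), S_KL^(2) and the fitness variances for h(d) = d ln 2."""
    h = [d * LN2 for d in division_counts]
    growth = sum(q * math.exp(v) for v, q in zip(h, q_cl))      # exp(tau*Lambda)
    tau_lambda = math.log(growth)
    q_rs = [q * math.exp(v) / growth for v, q in zip(h, q_cl)]  # retrospective
    mean_cl = sum(q * v for v, q in zip(h, q_cl))
    mean_rs = sum(q * v for v, q in zip(h, q_rs))
    var_cl = sum(q * (v - mean_cl) ** 2 for v, q in zip(h, q_cl))
    var_rs = sum(q * (v - mean_rs) ** 2 for v, q in zip(h, q_rs))
    return {"S1": tau_lambda - mean_cl, "S2": mean_rs - tau_lambda,
            "var_cl": var_cl, "var_rs": var_rs}


# Hypothetical populations: most lineages at d = 0 with a fast minority
# (positive skew), and most lineages at d = 4 with a slow minority
# (negative skew).
right_skewed = selection_summary([0, 4], [0.9, 0.1])
left_skewed = selection_summary([0, 4], [0.1, 0.9])

for name, s in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(f"{name}: S1 = {s['S1']:.4f}, S2 = {s['S2']:.4f}, "
          f"Var_cl = {s['var_cl']:.4f}, Var_rs = {s['var_rs']:.4f}")
```

In the positively skewed case $S_{KL}^{(2)} [D]$ exceeds $S_{KL}^{(1)} [D]$ and the retrospective variance is larger than the chronological one; the negatively skewed case shows the opposite ordering.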

Population growth rate under fitness perturbations

We mentioned above that the selection strength measure $S_{KL}^{(1)} [D]$ represents growth rate gain caused by fitness heterogeneity. Likewise, another selection strength measure $S_{KL}^{(2)} [D]$ represents a different consequence of fitness heterogeneity, that is, additional loss of growth rate under fitness perturbations.

From (Equation 1), and taking division count as a lineage trait, one can express population growth rate as

Λ = \frac{1}{τ} \ln \sum_{d} e^{\tilde{h} (d)} Q_{c l} (d) .

(16)

We now consider the response of population growth rate to perturbations that cause lineage fitness to change from $D (σ) \ln 2$ to $(1 - ϵ) D (σ) \ln 2$ , and rewrite the population growth rate as

Λ (ϵ) := \frac{1}{τ} \ln \sum_{d} e^{(1 - ϵ) \tilde{h} (d)} Q_{c l} (d) .

(17)

We have $Λ (0) = Λ$ , and note that $Λ (ϵ) = \frac{1}{τ} K_{D} (1 - ϵ)$ from (Equation 9). Differentiating $Λ (ϵ)$ with respect to $ϵ$ , and evaluating at $ϵ = 0$ , we find

{\frac{d Λ (ϵ)}{d ϵ} |}_{ϵ = 0} = - \frac{{⟨ \tilde{h} (D) ⟩}_{rs}}{τ}

(18)

(see Appendix 3). This relation shows that the change of population growth rate for small $ϵ$ is proportional to the retrospective mean fitness of the unperturbed population. Since ${⟨ \tilde{h} (D) ⟩}_{rs} = τ Λ + S_{KL}^{(2)} [D]$ (Equation 4), the relative change of population growth rate is

{\frac{1}{Λ} \frac{d Λ (ϵ)}{d ϵ} |}_{ϵ = 0} = - (1 + \frac{S_{KL}^{(2)} [D]}{τ Λ}) .

(19)

Therefore, a population with higher selection strength will exhibit a greater change in population growth rate upon perturbation. The selection strength measure $S_{KL}^{(2)} [D]$ represents additional loss of population growth rate due to division count heterogeneity before perturbation.

As we see below, one manifestation of $ϵ$ occurs via a cell removal operation. Consider the removal of a branch in the genealogical tree just after each cell division with the probability of $1 - 2^{- ϵ}$ ( $ϵ > 0$ ) (Figure 4A). In this case, the probability that a cell remains in the population after a cell division is $2^{- ϵ}$ , and the growth of cell lineages that originally divided $d$ times will be effectively reduced by the factor ${(2^{- ϵ})}^{d}$ . Consequently, the number of cell lineages that reach the end time point will also be effectively reduced from $N_{0} (\sum_{d} 2^{d} Q_{cl} (d))$ to $N_{0} (\sum_{d} 2^{(1 - ϵ) d} Q_{cl} (d))$ . Therefore, the population growth rate under this branch removal operation is given by (Equation 17), and the relative change of population growth rate is

\frac{Δ Λ}{Λ} := \frac{Λ (ϵ) - Λ}{Λ} = - (1 + \frac{S_{K L}^{(2)} [D]}{τ Λ}) ϵ + O (ϵ^{2}) .

(20)
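Equation 20 can be checked directly against Equation 17 for any given chronological distribution. The sketch below evaluates $Λ (ϵ)$ exactly and compares the relative change with the linear prediction; the distribution `q_cl` and the time window `tau` are hypothetical values chosen for illustration.

```python
import math


def growth_rate(division_counts, q_cl, tau, eps=0.0):
    """Lambda(eps) from Equation 17 with lineage fitness d ln 2."""
    total = sum(q * 2 ** ((1.0 - eps) * d) for d, q in zip(division_counts, q_cl))
    return math.log(total) / tau


def predicted_relative_slope(division_counts, q_cl):
    """Coefficient of eps in Equation 20: -(1 + S_KL^(2)[D] / (tau*Lambda))."""
    growth = sum(q * 2**d for d, q in zip(division_counts, q_cl))
    tau_lambda = math.log(growth)
    mean_rs = sum(q * 2**d * d * math.log(2)
                  for d, q in zip(division_counts, q_cl)) / growth
    s_kl2 = mean_rs - tau_lambda
    return -(1.0 + s_kl2 / tau_lambda)


# Hypothetical chronological distribution and observation window (hr).
d_values = [3, 4, 5, 6, 7]
q_cl = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
tau = 5.0

slope = predicted_relative_slope(d_values, q_cl)
base = growth_rate(d_values, q_cl, tau)
for eps in (0.01, 0.05, 0.1):
    removal_prob = 1.0 - 2.0 ** (-eps)   # per-division branch removal probability
    exact = (growth_rate(d_values, q_cl, tau, eps) - base) / base
    print(f"eps = {eps:.2f}  removal prob = {removal_prob:.4f}  "
          f"exact = {exact:.5f}  linear = {slope * eps:.5f}")
```

Because $S_{KL}^{(2)} [D] > 0$ for any heterogeneous population, the predicted slope is steeper than $- 1$ , that is, the growth rate falls by more than the fraction $ϵ$ .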

Figure 4. — (A) Scheme of random cell removal. Here, we consider the situation where cells were removed probabilistically after each cell division. Red crosses represent cell removal positions in the tree. The lineages after cell removal points disappear from the tree. Consequently, the number of cells at the end time point decreases. (B) Generation time distributions used in the simulation. We assumed that cellular generation time follows gamma distributions in the simulation. We set the shape parameter to either 1 ( $g_{1} (x)$ ), 2 ( $g_{2} (x)$ ), or 5 ( $g_{3} (x)$ ). (**C-E**) Population growth rate changes by cell removal perturbation. Gray points show the relative changes in population growth rate $Δ Λ / Λ := (Λ (ϵ) - Λ (0)) / Λ (0)$ . Cell removal probability was set to $1 - 2^{- ϵ}$ in each condition of perturbation strength $ϵ$ . Broken red lines represent the theoretical prediction $Δ Λ / Λ \approx - \frac{{⟨ \tilde{h} (D) ⟩}_{rs}}{τ Λ} ϵ = - (1 + \frac{S_{KL}^{(2)} [D]}{τ Λ}) ϵ$ . The lines of $Δ Λ / Λ = - ϵ$ (blue) and $- \frac{{⟨ \tilde{h} (D) ⟩}_{cl}}{τ Λ} ϵ$ (green) are shown for reference. The generation time distributions used in the simulation are $g_{1} (x)$ for C, $g_{2} (x)$ for D, and $g_{3} (x)$ for E.

We validated this relation by simulating population growth with and without the cell removal operation (Figure 4B–E and Figure 4—figure supplement 1). The result confirmed that the relative changes of population growth rates by the probabilistic removal of cells followed $- (1 + \frac{S_{KL}^{(2)} [D]}{τ Λ}) ϵ$ in all the conditions (Figure 4C–E). We also tested this relation for cell populations with positive mother-daughter correlations of division intervals, which are often found for eukaryotic cells (Nozoe and Kussell, 2020; Seita et al., 2021; Mosheiff et al., 2018; Kuchen et al., 2020). We confirmed that the response relation was valid irrespective of the strength of mother-daughter correlations (Figure 4—figure supplement 1), which shows that the relation is general and independent of the specific dynamics of the cell division process.

Applications to models

In Appendices 1 and 2, we calculate the exact form of $K_{D} (ξ)$ for analytically tractable models. We derive chronological and retrospective mean fitness, selection strength, and the cumulants of fitness landscapes from $K_{D} (ξ)$ to observe how the framework works. In particular, we show the analytical calculation for a cellular population in which cells divide with gamma-distributed uncorrelated interdivision times in Appendix 2 to understand the effect of inherent stochasticity on population growth. This analysis yields two conclusions: (1) contrary to what one might expect from the central limit theorem, the contribution of higher-order cumulants to population growth remains even in the long-term limit, and (2) the shape of the generation time distribution influences the cell population’s long-term growth rate by constantly introducing selection within the population. Therefore, the details of inherent stochasticity of interdivision times are essential for the long-term population growth rate.

Experimental evaluation of contributions of growth heterogeneity to population growth

Next, we apply this framework of cell lineage statistics to experimental single-cell lineage data of various organisms. The list includes bacterial cells (Escherichia coli and Mycobacterium smegmatis), unicellular eukaryotic cells (Schizosaccharomyces pombe), and mammalian cancer cells (L1210 mouse leukemia cells). This analysis aims to unravel whether the extent of growth rate gain from growth heterogeneity depends on the organisms and environments under constant growth conditions. As summarized in Tables 2 and 3, we used cellular lineage data newly obtained in this study as well as other existing datasets (Nozoe et al., 2017; Wakamoto et al., 2013; Nakaoka and Wakamoto, 2017; Seita et al., 2021). The E. coli and S. pombe data include several culture conditions to compare cumulants’ contributions to population growth across environments. The E. coli data were obtained using either an agarose pad or the microchamber array microfluidic device, yielding genealogical tree information such as the one shown in Figure 1. The S. pombe and L1210 cell data were obtained with mother machine microfluidic devices (Wang et al., 2010), which provide isolated cell lineage information but discard tree information due to their cell exclusion scheme. We assumed that these isolated cell lineages would follow chronological statistics and evaluated chronological distributions and selection strength according to the method described in Materials and methods. All the data analyzed in this section were taken from cell populations growing at approximately constant rates.

Table 2. Summary of cellular species, culture conditions, and observation setup used in the experiments in Figure 5.

Species	Label	Strain	Medium	Temperature (°C)	Device	Reference
E. coli	rpoS-mcherry glucose_30°C	MG1655 F3 rpoS-mcherry /pUA66-PrpsL-gfp	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma)	30	Microchamber array	This study
E. coli	rpoS-mcherry glucose_37°C	MG1655 F3 rpoS-mcherry /pUA66-PrpsL-gfp	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma)	37	Microchamber array	This study
E. coli	rpoS-mcherry glycerol_37°C	MG1655 F3 rpoS-mcherry /pUA66-PrpsL-gfp	M9 minimal medium +0.1%(v/v) glycerol +1/2 MEM amino acids solution (Sigma)	37	Microchamber array	This study
E. coli	f3nw-sm	F3NW	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma) +0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG)	37	Agar pad	Nozoe et al., 2017
E. coli	f3nw+sm	F3NW	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma) +0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) +100 μg/ml streptomycin	37	Agar pad	Nozoe et al., 2017
E. coli	f3ptn001-sm	F3/pTN001	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma) +0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG)	37	Agar pad	Nozoe et al., 2017
E. coli	f3ptn001+sm	F3/pTN001	M9 minimal medium +0.2%(w/v) glucose +1/2 MEM amino acids solution (Sigma) +0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) +200 μg/ml streptomycin	37	Agar pad	Nozoe et al., 2017
M. smegmatis	mc²155 7H9	mc²155	Middlebrook 7H9 medium +0.5% albumin +0.2% glucose +0.085% NaCl+0.5% glycerol +0.05% Tween-80	37	Membrane cover	Wakamoto et al., 2013
S. pombe	EMM28	HN0025	Edinburgh minimal medium +2% (w/v) glucose	28	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	EMM30	HN0025	Edinburgh minimal medium +2%(w/v) glucose	30	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	EMM32	HN0025	Edinburgh minimal medium +2%(w/v) glucose	32	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	EMM34	HN0025	Edinburgh minimal medium +2%(w/v) glucose	34	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	YE28	HN0025	Yeast extract medium +3%(w/v) glucose	28	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	YE30	HN0025	Yeast extract medium +3%(w/v) glucose	30	Mother machine	Nakaoka and Wakamoto, 2017
S. pombe	YE34	HN0025	Yeast extract medium +3%(w/v) glucose	34	Mother machine	Nakaoka and Wakamoto, 2017
L1210 mouse leukemia cell	L1210 RPMI-1640	L1210 (ATCC CCL-219)	RPMI-1640 medium (Wako)+10% fetal bovine serum (Biosera) under 5% CO₂ atmosphere	37	Mother machine	Seita et al., 2021


Table 3. Summary of the data used in the analysis in Figure 5.

$t_{start}$ and $t_{end}$ are the start and end times of the analysis time window of length $τ$ .

Species	Label	$τ$ (hr)	$t_{start}$ (hr)	$t_{end}$ (hr)	$N_{0}$	$N_{τ}$
E. coli	rpoS-mcherry glucose_37°C	5	0.95	5.95	163	3989
E. coli	rpoS-mcherry glucose_30°C	8	0.95	8.95	197	6173
E. coli	rpoS-mcherry glycerol_37°C	6.5	0.95	7.45	253	5825
E. coli	f3nw-sm	5	0	5	305	4343
E. coli	f3nw+sm	5	0	5	291	3164
E. coli	f3ptn001-sm	5	0	5	984	9229
E. coli	f3ptn001+sm	5	0	5	977	7429
M. smegmatis	mc²155 7H9	10	1.75	11.75	39	311
S. pombe	EMM28	167	0	167	1148	-
S. pombe	EMM30	131	0	131	963	-
S. pombe	EMM32	123.5	0	123.5	883	-
S. pombe	EMM34	152	0	152	1078	-
S. pombe	YE28	108	0	108	1177	-
S. pombe	YE30	90	0	90	866	-
S. pombe	YE34	78	0	78	863	-
L1210 mouse leukemia cell	L1210 RPMI-1640	60	0	60	474	-
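For the datasets with tree information, the population growth rate over the analysis window follows from the cell numbers in Table 3 through $N_{τ} = N_{0} e^{τ Λ}$ . The short sketch below applies this to the rows that report $N_{τ}$ ; the mother machine rows are omitted because $N_{τ}$ is not defined for them. The doubling time $\ln 2 / Λ$ is added only for orientation and is not a quantity reported in the table.

```python
import math

# (label, tau in hr, N_0, N_tau) copied from Table 3.
TABLE3_TREE_DATA = [
    ("E. coli rpoS-mcherry glucose_37°C", 5.0, 163, 3989),
    ("E. coli rpoS-mcherry glucose_30°C", 8.0, 197, 6173),
    ("E. coli rpoS-mcherry glycerol_37°C", 6.5, 253, 5825),
    ("E. coli f3nw-sm", 5.0, 305, 4343),
    ("E. coli f3nw+sm", 5.0, 291, 3164),
    ("E. coli f3ptn001-sm", 5.0, 984, 9229),
    ("E. coli f3ptn001+sm", 5.0, 977, 7429),
    ("M. smegmatis mc²155 7H9", 10.0, 39, 311),
]


def population_growth(tau, n0, n_tau):
    """Return (tau*Lambda, Lambda) from N_tau = N_0 * exp(tau * Lambda)."""
    tau_lambda = math.log(n_tau / n0)
    return tau_lambda, tau_lambda / tau


growth_rates = {}
for label, tau, n0, n_tau in TABLE3_TREE_DATA:
    tau_lambda, lam = population_growth(tau, n0, n_tau)
    growth_rates[label] = lam
    print(f"{label:36s} tau*Lambda = {tau_lambda:.3f}  "
          f"Lambda = {lam:.3f} /hr  doubling time = {math.log(2) / lam:.2f} hr")
```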


First, we evaluated the first-order cumulants’ contributions $W_{1}^{(D)} = \frac{κ_{1}^{(D)}}{τ Λ} = \frac{{⟨ \tilde{h} (D) ⟩}_{cl}}{τ Λ}$ (Equation 14), finding that $W_{1}^{(D)} < 1$ for all the samples and conditions (Figure 5A). This result confirms that the chronological mean fitness cannot fully account for the population growth rates; the division count heterogeneity present even in constant environments contributes to increasing the population growth rate. However, the extent of the contributions was different: $W_{1}^{(D)}$ for S. pombe was consistently closer to 1 than those for the other cell types except for one condition (EMM, 34 °C), suggesting that S. pombe’s growth is less heterogeneous under most culture conditions.
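The evaluation of $W_{1}^{(D)}$ and $W_{2}^{(D)}$ from tree data can be sketched as follows, assuming that each lineage with $D$ divisions carries the chronological weight $2^{- D} / N_{0}$ , as in the definition of chronological probability earlier in the framework. The toy tree (two founder cells, seven lineages) is hypothetical, and the function names are illustrative; the estimation procedure actually used for the experimental data is described in Materials and methods.

```python
import math
from collections import defaultdict


def chronological_distribution(division_counts, n0):
    """Q_cl(d): each lineage with D divisions carries weight 2**(-D) / N_0."""
    q = defaultdict(float)
    for d in division_counts:
        q[d] += 2.0 ** (-d) / n0
    return dict(sorted(q.items()))


def first_two_contributions(division_counts, n0):
    """Return (tau*Lambda, W_1, W_2) with lineage fitness d ln 2 (Equation 14)."""
    q_cl = chronological_distribution(division_counts, n0)
    tau_lambda = math.log(len(division_counts) / n0)
    mean = sum(q * d * math.log(2) for d, q in q_cl.items())
    var = sum(q * (d * math.log(2) - mean) ** 2 for d, q in q_cl.items())
    return tau_lambda, mean / tau_lambda, (mean + var / 2.0) / tau_lambda


# Hypothetical tree: founder A yields lineages with 1, 2, 2 divisions;
# founder B yields four lineages with 2 divisions each.
lineage_division_counts = [1, 2, 2, 2, 2, 2, 2]
n0 = 2

q_cl = chronological_distribution(lineage_division_counts, n0)
tau_lambda, w1, w2 = first_two_contributions(lineage_division_counts, n0)
print("Q_cl =", q_cl)
print(f"tau*Lambda = {tau_lambda:.4f}, W_1 = {w1:.4f}, W_2 = {w2:.4f}")
```

The chronological weights sum to 1 only when every branch of the tree is followed to the end time point, which is a useful sanity check before computing the cumulants.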

Figure 5. — (A) Contributions of the cumulants of a fitness landscape to population growth. $W_{1}^{(D)}$ and $W_{2}^{(D)}$ were evaluated for the experimental cell lineage data from *E. coli* (red), *M. smegmatis* (blue), *S. pombe* (green), and L1210 mouse leukemia cells (yellow). The *E. coli* rpoS-mcherry data were newly obtained in this study (see Materials and methods). The other data were taken from literature: *E. coli* f3nw and f3ptn001 from Nozoe et al., 2017; *M. smegmatis* from Wakamoto et al., 2013; *S. pombe* from Nakaoka and Wakamoto, 2017; and L1210 from Seita et al., 2021. Circles and triangles represent $W_{1}^{(D)}$ and $W_{2}^{(D)}$ , respectively. Error bars represent the two standard deviation ranges estimated by resampling the cellular lineages (see Materials and methods). (B) Relationship between $S_{KL}^{(1)} [D] / τ Λ$ and $S_{KL}^{(2)} [D] / τ Λ$ . Colors correspond to the cellular species as in A. The *S. pombe* data were further categorized into two groups: Circles for the EMM conditions; and triangles for the YE conditions. (**C-E**) Representative chronological distributions of division count, $Q_{cl} (D)$ . (**F-H**) Graphical representation of $K_{D}^{'} (ξ)$ . F for *S. pombe* EMM30; G for L1210 RPMI-1640; and H for *E. coli* f3nw-sm.

We next evaluated $W_{2}^{(D)} = \frac{κ_{1}^{(D)} + κ_{2}^{(D)} / 2}{τ Λ}$ , finding that $W_{2}^{(D)} \approx 1$ for most of the conditions (Figure 5A). This result indicates small contributions of the third or higher-order cumulants to population growth. Consistent with this result, $S_{KL}^{(1)} [D]$ and $S_{KL}^{(2)} [D]$ were almost identical in most conditions (Figure 5B). Note that $S_{KL}^{(2)} [D] - S_{KL}^{(1)} [D]$ depends only on the third or higher order cumulants (Equation 15). The chronological distributions $Q_{cl} (D)$ of these samples were nearly symmetric in most cases; however, under the conditions where the deviations of $W_{2}^{(D)}$ from 1 are larger, such as S. pombe in EMM medium and L1210, the distributions were skewed slightly (Figure 5C–E and Figure 5—figure supplement 1). Such distribution skew was reflected in the convexity directions of $K_{D}^{'} (ξ)$ -plots (Figure 5F–H and Figure 5—figure supplement 2). These results imply that cellular populations of S. pombe in EMM medium and of L1210 contain small subpopulations that follow distinct division statistics. In fact, it was previously demonstrated that the L1210 cell populations contain slow-cycling cell lineages that can survive for longer durations under exposure to an anticancer drug (Seita et al., 2021). Therefore, this analysis confirms that the differences in the two selection strength measures can be used for detecting subpopulations in cellular populations.

In S. pombe EMM medium conditions, $K_{D}^{'} (ξ)$ was convex downward in the interval $0 \leq ξ \leq 1$ except for EMM 34°C (Figure 5F and Figure 5—figure supplement 2). Therefore, under certain conditions selection can increase fitness variance in the retrospective distributions relative to chronological distributions among cellular lineages.

The contributions of higher order cumulants become significant in the regrowth from a late stationary phase

We further applied the framework to the cell lineage data of E. coli populations regrowing from an early or late stationary phase. This analysis aims to uncover how strongly selection occurs upon environmental changes and whether the selection strength can differ under identical conditions depending on the conditions before regrowth. To conduct time-lapse observations of regrowing cell populations, we used a microfluidic device equipped with microchambers etched on a glass coverslip. We sampled E. coli cells either from an early or late stationary phase batch culture and enclosed the cells into the microchambers by a semipermeable membrane (Inoue et al., 2001; Hashimoto et al., 2016). We switched flowing media from stationary-phase conditioned medium to fresh medium at the start of time-lapse measurements and recorded the growth and division of individual cells (Figure 6A, see Materials and Methods).

The growth curves reconstructed by counting the number of cells at each time point showed lags in regrowth (Figure 6B). The lag time was shorter for the populations from the early stationary phase. The lineage tree structures in the cell populations were markedly different between the conditions (Figure 6C and D). The tree structures were more uniform for the early stationary phase sample with multiple divisions in most cell lineages (Figure 6C), whereas those for the late stationary phase sample were more heterogeneous, with 90% of cells showing no divisions within the observation time (Figure 6D).

We analyzed these data and found $W_{1}^{(D)} = 0.95 \pm 0.02$ for the population from the early stationary phase and $W_{1}^{(D)} = 0.27 \pm 0.04$ for the population from the late stationary phase (Figure 6E). Therefore, the chronological mean fitness, ${⟨ \tilde{h} (D) ⟩}_{cl}$ , explains only 27% of the growth rate of the population regrowing from the late stationary phase. In other words, significantly strong selection occurred in the regrowth from the late stationary phase. We also found that $W_{2}^{(D)} \approx 1$ for the population from the early stationary phase, as observed for the E. coli populations growing at constant rates. In contrast, $W_{2}^{(D)}$ for the population from the late stationary phase was $0.61 \pm 0.04$ , and $W_{n}^{(D)}$ converged to 1 only after taking the cumulants up to approximately 10th-order into account (Figure 6E). This indicates a skew of the fitness distribution and validates the existence of subpopulations following distinct division statistics in the population from the late stationary phase in this time scale of regrowth (Figure 6F). Reflecting the extreme skew to the right of the chronological distributions $Q_{cl} (D)$ (Figure 6F), $S_{KL}^{(2)} [D]$ was significantly greater than $S_{KL}^{(1)} [D]$ for the late stationary phase sample (Figure 6G).
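The slow convergence of $W_{n}^{(D)}$ for a strongly right-skewed distribution can be reproduced qualitatively with a toy example. The sketch below uses a hypothetical two-point distribution in which 90% of the chronological weight has no division and 10% divides four times; it is not the experimental distribution of Figure 6F. To avoid large factorials at high orders, it works directly with the scaled cumulants $κ_{n} / n!$ , obtained as the Taylor coefficients of $K_{D} (ξ)$ from the relation $M^{'} = K_{D}^{'} M$ , where $M = e^{K_{D}}$ .

```python
import math


def scaled_cumulants(values, probs, order):
    """Return [kappa_n / n! for n = 1..order], the Taylor coefficients of
    K(xi) = ln sum_x Q_cl(x) exp(xi * h(x)) around xi = 0."""
    # a[n] = m_n / n!: Taylor coefficients of M(xi) = exp(K(xi)), with a[0] = 1.
    a = [sum(p * v**n for v, p in zip(values, probs)) / math.factorial(n)
         for n in range(order + 1)]
    c = [0.0] * (order + 1)
    for n in range(1, order + 1):
        # From M' = K' M:  n a_n = sum_{k=1}^{n} k c_k a_{n-k}.
        c[n] = a[n] - sum(k * c[k] * a[n - k] for k in range(1, n)) / n
    return c[1:]


def cumulative_contributions(division_counts, q_cl, order):
    """Return [W_1, ..., W_order] for lineage fitness d ln 2 (Equation 14)."""
    fitness = [d * math.log(2) for d in division_counts]
    tau_lambda = math.log(sum(q * 2**d for d, q in zip(division_counts, q_cl)))
    partial, w = 0.0, []
    for term in scaled_cumulants(fitness, q_cl, order):
        partial += term
        w.append(partial / tau_lambda)
    return w


# Hypothetical right-skewed distribution: most lineages do not divide.
W_skewed = cumulative_contributions([0, 4], [0.9, 0.1], order=30)
for n in (1, 2, 3, 4, 6, 10, 20, 30):
    print(f"W_{n} = {W_skewed[n - 1]:.4f}")
```

In this toy case the mean explains only about 30% of $τ Λ$ , and the partial sums oscillate around 1 for several orders before settling, in qualitative agreement with the behavior described for the late stationary phase sample.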

These results indicate that the levels of selection in the regrowing processes strongly depend on the durations under stationary phase conditions. Therefore, the ability to quickly resume growth under favorable conditions is gradually lost in most cells in the stationary phase; only a fraction of cells in the population can contribute to the future cell population. However, we also remark that preserving such non-growing cell lineages can be beneficial when cell populations are exposed to harsh environments in unpredictable manners (Kussell and Leibler, 2005).

Lineage statistics reveal condition-dependent fitness landscapes and selection strength for a growth-regulating sigma factor

RpoS is a sigma factor that controls the transcription of a large set of genes (10% of the genome) in E. coli (Battesti et al., 2011). High RpoS expression usually correlates with growth suppression; RpoS is induced when cells enter stationary phases or encounter stress conditions, such as starvation, low pH, oxidative stress, high temperature, or osmotic stress. Elevated RpoS expression provokes the intracellular programs to shut down growth and resist the stress (Battesti et al., 2011). However, it remains poorly understood how the continuum heterogeneity of RpoS expression levels is linked to the lineage fitness and selection in exponentially growing cellular populations. We therefore applied the lineage statistics framework to the single-cell time-lapse data of an E. coli strain expressing an RpoS-mCherry fusion protein from the native chromosomal locus and green fluorescent protein (GFP) from a low copy plasmid.

We quantified the time-scaled fitness landscapes $h (X) / τ$ and relative selection strength $S_{rel} [X]$ (Equation 5) under three growth conditions, taking the time-averaged mean fluorescent intensity of RpoS-mCherry or GFP along each cell lineage (proxies of time-averaged intracellular concentrations) as $X$ (Figure 7). Since fluorescent intensity is a trait that takes continuous values, we binned the intensity values with the bin sizes around which selection strength values are relatively stable (Materials and methods). Furthermore, since the calculation of relative selection strength from empirical data always gives positive values, we compared the relative selection strength values with those calculated from the data in which the correspondences between division counts and trait values were randomized to confirm the confidence levels (Figure 7—figure supplement 1).
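The binning and randomization steps described above can be sketched as follows. The code assumes the fitness landscape $h (x) = τ Λ + \ln (Q_{rs} (x) / Q_{cl} (x))$ and takes the relative selection strength to be $S_{KL}^{(1)} [X] / S_{KL}^{(1)} [D]$ ; both are defined outside this section (Equation 5 for the latter), so these forms should be read as assumptions of the sketch. The lineage data, the single bin edge, and all function names are hypothetical.

```python
import bisect
import math
import random
from collections import defaultdict


def trait_selection(division_counts, trait_bins, n0):
    """Return (h, S1_X, S1_D) for a binned lineage trait.

    h maps each bin x to tau*Lambda + ln(Q_rs(x) / Q_cl(x)).
    """
    n_tau = len(division_counts)
    tau_lambda = math.log(n_tau / n0)
    q_cl, q_rs = defaultdict(float), defaultdict(float)
    mean_h_d = 0.0
    for d, x in zip(division_counts, trait_bins):
        w_cl = 2.0 ** (-d) / n0            # chronological weight of the lineage
        q_cl[x] += w_cl
        q_rs[x] += 1.0 / n_tau             # retrospective weight of the lineage
        mean_h_d += w_cl * d * math.log(2)
    h = {x: tau_lambda + math.log(q_rs[x] / q_cl[x]) for x in q_cl}
    s1_x = tau_lambda - sum(q_cl[x] * h[x] for x in q_cl)
    s1_d = tau_lambda - mean_h_d
    return h, s1_x, s1_d


# Hypothetical lineages: division counts and time-averaged intensities.
division_counts = [1, 2, 2, 2, 2, 2, 2]
intensity = [9.0, 4.0, 5.0, 3.0, 4.5, 6.5, 2.5]
n0 = 2
bin_edges = [5.0]                          # bin 0: below 5.0, bin 1: 5.0 and above
bins = [bisect.bisect_right(bin_edges, v) for v in intensity]

h, s1_x, s1_d = trait_selection(division_counts, bins, n0)
s_rel = s1_x / s1_d

# Control: break the correspondence between trait values and division counts.
shuffled = bins[:]
random.Random(0).shuffle(shuffled)
_, s1_x_shuffled, _ = trait_selection(division_counts, shuffled, n0)
s_rel_shuffled = s1_x_shuffled / s1_d

print("h(x) per bin:", {x: round(v, 4) for x, v in sorted(h.items())})
print(f"S_rel = {s_rel:.4f}, shuffled control = {s_rel_shuffled:.4f}")
```

Since $S_{KL}^{(1)} [X]$ is a KL divergence, it is non-negative even for shuffled data with finite sampling, which is why a randomized control is needed to judge whether an observed value is meaningful.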

Figure 7. — (A) Fitness landscapes for the time-averaged concentration (mean fluorescent intensity) for RpoS-mCherry. The time-averaged mean fluorescent intensity of RpoS-mCherry was adopted as a lineage trait $X$ and changes in fitness were plotted against the trait values $x$ . Fitness landscapes were scaled by the lineage length (observation duration) $τ$ . Error bars represent the two standard deviation ranges estimated by resampling the cellular lineages. (B) Fitness landscapes for the time-averaged concentration for GFP. The time-averaged mean fluorescent intensity of GFP was adopted as a lineage trait $X$ and changes in fitness were plotted against the trait values $x$ . (C) Relative selection strength for the time-averaged concentrations of RpoS-mCherry (red) and GFP (green). (**D, E**) Chronological distributions $Q_{cl} (x)$ for the time-averaged concentrations of RpoS-mCherry (D) and GFP (E). (**F-H**) Cumulative contributions of fitness cumulants to population growth, $W_{n}^{(X)}$ , assuming that $X$ is either time-averaged concentration of RpoS-mCherry (red) or time-averaged concentration of GFP (green). Error bars represent the two standard deviation ranges estimated by resampling the cellular lineages. Panel F is for the Glucose-37°C condition; Panel G for the Glucose-30°C condition; and Panel H for the Glycerol-37°C condition.

The result shows that the fitness landscapes and selection strength of RpoS expression level differ significantly among the growth conditions (Figure 7). Under the glucose-37°C condition, the fitness landscapes of RpoS-mCherry and GFP expression were both decreasing functions (Figure 7A and B). Thus, high expression of RpoS-mCherry and of GFP in an exponentially growing population are both linked with lower lineage fitness. However, while the fitness landscape of GFP expression was nearly constant and showed a significant decrease of fitness only at high expression levels, the fitness landscape of RpoS-mCherry decreased steadily in the observed expression range (Figure 7A and B). Consequently, the relative selection strength for RpoS-mCherry was 2.6-fold larger than that for GFP (Figure 7C).

Under the glucose-30°C and glycerol-37°C conditions, the fitness landscapes for RpoS-mCherry level were also decreasing functions and close to each other but significantly downshifted from that for the glucose-37°C condition (Figure 7A). This result reveals that cells could have different fitness for the same expression levels of RpoS, depending on the growth conditions. The selection strength for RpoS-mCherry was larger than that for GFP under the glucose-37°C and glucose-30°C conditions (Figure 7C), indicating that the heterogeneity of RpoS expression in a population correlates with the lineage fitness more strongly than that of GFP under those conditions. On the other hand, the relative selection strength of RpoS-mCherry under the glycerol-37°C condition was the smallest among the three conditions and not significantly different from that of GFP (Figure 7C). This is due to the relatively flat fitness landscapes in the central ranges of the distributions $Q_{cl} (x)$ (Figure 7A and B) and the smaller variations of $x$ in the population (Figure 7D and E). These results reveal that the continuum heterogeneity of RpoS expression level in a population does correlate with the lineage fitness, but its contribution to selection depends on growth conditions. In other words, the heterogeneity in the RpoS-mCherry expression levels can barely correlate with fitness heterogeneity under some conditions.

We also evaluated the contributions of fitness cumulants for RpoS-mCherry expression to the population growth rate. Under all the conditions, $W_{1}^{(X)}$ was lower than 1 (Figure 7F–H). Therefore, the contributions of the higher-order fitness cumulants are non-negligible. However, the deviation of $W_{1}^{(X)}$ from 1 for RpoS-mCherry under the glycerol-37°C condition was small (Figure 7H). Hence, in this growth condition, RpoS-mCherry expression barely correlated with fitness heterogeneity in the population.

Importantly, this analysis can simultaneously reveal the changes in fitness landscapes (Figure 7A) and chronological distributions (Figure 7D). Interestingly, the distributions of the RpoS-mCherry expression levels are close between the Glucose-37°C and the Glycerol-37°C conditions, but the fitness landscapes are close between the Glucose-30°C and the Glycerol-37°C conditions. These results imply that the distributions and the fitness landscapes may vary independently in different conditions. Therefore, cells can potentially modulate the selection strength in each environment either by changing the fitness landscape or by changing the distribution of expression levels.

Discussion

Growth and division of individual cells are intrinsically variable, which causes division count heterogeneity among cellular lineages in a population. Such heterogeneity is ubiquitous across prokaryotic and eukaryotic cells, and its statistical properties could depend on the mechanisms and regulations determining cell division timings. Notably, division count heterogeneity influences population growth rate and, consequently, a population’s survival and evolutionary success. Therefore, understanding what statistical features are produced among cellular lineages and how these features contribute to population growth is essential for unraveling each organism’s survival and evolutionary strategy.

This report presents a cell lineage statistics framework to uncover the linkage between fitness distributions and population growth rate. We reveal that a population’s growth rate can be expanded by the cumulants of a fitness landscape for any lineage trait. The cumulant expansion allows us to quantify the contribution of each fitness cumulant, such as variance and skewness, to population growth rate. Applying this framework to the experimental cell lineage data revealed the cumulants’ contributions to population growth for various organisms and environmental conditions. In particular, higher-order cumulants became significant in the regrowth of E. coli from a late stationary phase. We remark that the cumulant expansion of population growth rate is valid only when all the cumulants are finite and when the Taylor expansion of $K_{X} (ξ)$ around $ξ = 0$ also converges at $ξ = 1$ . However, all the experimental data examined in this study exhibited stable convergence, including in the regrowth condition from the late stationary phase.

An advantage of this framework is its independence from any growth and division models. The mechanisms driving the growth and division of individual cells are diverse among organisms. For example, the properties of cellular growth and division, such as whether a cell’s size increases exponentially or linearly and whether cell size regulation follows sizer or adder models, could depend on cell types, organisms, and environmental conditions (Jun et al., 2018; Kohram et al., 2021). Therefore, any model assumptions restrict applicability and necessitate model validation before application. The model independence of the framework presented here comes from the definitions of two essential quantities: the chronological and retrospective probabilities. Quantifying these probabilities requires only the information of the numbers of cells at initial and end time points and of division counts on each cellular lineage. Consequently, this formalism can be applied even to non-stationary conditions without modifications. However, we also remark that this independence from the details other than cell lineage structures imposes a limitation on the framework because it cannot report any potential influences from factors such as heterogeneous environments around cells and non-quantified traits. Furthermore, the fitness landscape $h (x)$ and the relative selection strength $S_{rel} [X]$ evaluate only the correlations between the trait and fitness, not causal relationships. However, causal traits should have large selection strength values, and this framework helps narrow down the candidates for essential traits. Most importantly, division statistics is the focal information that connects molecular details underlying cellular growth and division to population growth. Regulatory mechanisms can influence population growth only by modulating the division statistics in a cellular population.

Growth heterogeneity in a cellular population plays a critical role in its adaptation and survival against stressful conditions. In antibiotic persistence, bacterial cell populations often harbor small populations of non-growing or slow-growing cells which can survive under antibiotic exposures (Balaban et al., 2004). Such structures of growth heterogeneity can be investigated in a unified manner by the selection strength measures introduced here. For example, the differences in $S_{KL}^{(1)} [D] / τ Λ$ among organisms can reveal the distinct levels of the overall growth heterogeneity of these organisms. Furthermore, the differences between $S_{KL}^{(1)} [D]$ and $S_{KL}^{(2)} [D]$ characterize the structure of growth heterogeneity: If $S_{K L}^{(1)} [D] > S_{K L}^{(2)} [D]$ , the distribution of lineage fitness is skewed negatively, and the cell population harbors small subpopulations of slow-growing cell lineages; on the contrary, if $S_{K L}^{(1)} [D] < S_{K L}^{(2)} [D]$ , the population harbors small populations of fast-growing cell lineages. Untangling the linkage between the structures of growth heterogeneity and their adaptability would help us understand the adaptive strategies of various organisms.

In general, heredity is also crucial for the growth and evolution of a population. The role of the heredity of a particular trait might be unravelled by taking the correlation length as a lineage trait $X$ and quantifying its selection strength. Since the modes of heredity can also be important targets of natural selection (Rivoire and Leibler, 2014), such measurements might provide insights into the evolution of heredity.

We remark that the distribution of interdivision time (generation time) influences the long-term growth rate, as demonstrated by the analytical model in Appendix 2. Therefore, statistical properties of generation time, such as distribution shapes and transgenerational correlations, can contribute to organisms’ evolutionary success by constantly introducing selection within a population. Unlike the central limit theorem, the contributions of higher-order cumulants can remain even in the long-term limit. Importantly, even when cell division processes seem purely stochastic, different states in some traits might underlie these variations in generation times. In such cases, $h (x)$ and $S_{rel} [X]$ for these traits can still unravel the correlations between the trait values and fitness.

This framework is applicable even to cell populations growing under non-constant environmental conditions. We indeed utilized this framework to analyze the regrowth of growth-arrested cells from the stationary phase conditions. The selection strength contributions to population growth, $S_{KL}^{(1)} [D] / τ Λ$ , were below 10% in most cases under constant growth conditions. Nevertheless, it became over 70% in the regrowth of E. coli from the late stationary phase. While increased selection in non-constant environments may not be surprising itself, it is intriguing to ask how its contribution changes quantitatively depending on the conditions of environmental changes, such as nutrient upshift and downshift. The selection strength contribution in the regrowth from the early stationary phase was only 5%. This result clearly shows that how strongly selection acts in regrowing processes depends on stationary phase incubation durations. However, we also remark that the differences in the selection strength values depend on the time window and might be valid only in this time scale. Clarifying the differences in the selection strength in longer time scales requires the detail of their lag time distributions, which we did not measure in this study.

We identified the cellular populations in which selection acts to increase fitness variance in the retrospective statistics compared with the chronological statistics (Figures 5F and 6G and Figure 5—figure supplement 2). When a decrease in fitness variance by selection is mentioned in evolutionary biology, an upper bound and inheritance of fitness across the generations of individuals are usually assumed. In such circumstances, selection drives the fitness distribution toward the maximum value, and the selection eventually causes fitness variance to decrease. However, even in this process, a decrease is not assured for every step; whether selection reduces fitness variance at each step depends on the fitness distribution at that time. Likewise, whether the fitness variance increases or decreases in the retrospective distribution depends on the shape of the fitness distribution before selection, that is, chronological distribution. Such conditions are graphically recognized by the downward convexity of $K_{D}^{'} (ξ)$ (Figure 3). When the fourth or higher order fitness cumulants are negligible, the convexity of $K_{D}^{'} (ξ)$ is determined primarily by the skewness of $Q_{cl} (d)$ ; positive skew of $Q_{cl} (d)$ with a long right tail makes $K_{D}^{'} (ξ)$ convex downward and $Var {[\tilde{h} (D)]}_{rs}$ greater than $Var {[\tilde{h} (D)]}_{cl}$ . This consequence is intuitively understandable since the right tail of $Q_{cl} (d)$ is accentuated in proportion to $e^{D}$ by selection, which leads to greater variance of $Q_{rs} (d)$ . On the other hand, when the skew is negative with the long left tail, the effect of applying $e^{D}$ is to diminish the tail and compress the distribution toward the fittest lineages. It is of note that greater fitness variance in the retrospective statistics is possible even in the long-term limit, as demonstrated by the model in Appendix 2.

We showed that division count heterogeneity among cellular lineages has dual facets: increasing population growth rate while sensitizing populations to perturbations. These two effects are quantitatively represented by $S_{KL}^{(1)} [D] / τ Λ$ and $S_{KL}^{(2)} [D] / τ Λ$ , respectively. Therefore, the difference between these selection strength measures gauges which aspect of growth heterogeneity is more significant in the population. Even though $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ are different in general, the analysis revealed that they were nearly identical in most of the cellular populations growing at constant rates (Figure 5). This result might suggest that, from a practical viewpoint, the contribution of higher-order cumulants becomes negligible under steady growth conditions, and the significant difference between $S_{K L}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ could be used as a probe for the non-stationarity of the population growth. This speculation must be examined experimentally using various organisms and cell types across diverse environmental conditions.

This framework is premised on complete lineage tree information. However, many methods of single-cell measurements continuously exclude cells from observation areas and provide only a part of the tree information. Therefore, extending this framework so that one can infer both chronological and retrospective probabilities from incomplete tree information is an essential future research direction. In this study, we calculated the fitness landscapes and selection strength measures for the cell lineage data obtained with the mother machine devices, assuming that these cell lineages would follow the chronological statistics. Such a simple approach is not yet available for larger scale lineage tree data obtainable with the other single-cell measurement devices such as dynamics cytometer (Hashimoto et al., 2016) and chemoflux (Lambert et al., 2014). Furthermore, it has been shown that the inference precision of population growth rate has non-monotonic dependence on the length of cell lineages obtained with mother machine devices (Levien et al., 2020). Even though the difficulties to overcome are present, a comprehensive framework may permit a unified treatment of cellular lineage data obtained using various single-cell measurement methods.

Phenotypic heterogeneity is widely observed in diverse cellular systems, including both prokaryotic and eukaryotic cells. It is often considered that phenotypic heterogeneity allows bet-hedging against unpredictable environments and promotes the survival of cellular population (Kussell and Leibler, 2005). However, quantitative evaluation of correlations between the traits of interest and fitness is usually an intricate problem. The cell lineage statistical framework described in this study offers a straightforward procedure applicable to any cellular genealogical data, which are now becoming increasingly available for various biological phenomena, including cancer metastasis (Quinn et al., 2021) and stem cell differentiation (Filipczyk et al., 2015; Frieda et al., 2017; Chow et al., 2021). Another important advantage of this framework is that it allows decomposing a population growth rate into chronological fitness and selection strength. It is thus intriguing to apply this framework to long-term evolutionary dynamics and quantify how the contributions of chronological mean fitness and selection underlie the transitions of population growth rate. Such analysis might clarify the crucial roles of phenotypic heterogeneity in facilitating evolution.

Materials and methods

Key resources table.

Reagent type (species) or resource	Designation	Source or reference	Additional information
Recombinant DNA reagent	pUA66-PrpsL-gfp (plasmid)	Zaslaver et al., 2006
Strain, strain background (Escherichia coli)	MG1655 F3	Wakamoto lab	MG1655ΔfliCΔfimAΔflu
Strain, strain background (Escherichia coli)	MG1655 F3 rpoS-mcherry/pUA66-PrplS-gfp	Wakamoto lab	MG1655ΔfliCΔfimAΔflu rpoS-mcherry/pUA66-PrplS-gfp


Microfabrication of microchamber array

We constructed and used a microchamber array for conducting single-cell time-lapse observation under controlled environmental conditions. A microchamber is a well etched on a glass coverslip. We used two types of microchamber arrays. One is an array of microchambers whose dimensions are 70 μm (w) × 55 μm (h) × 1 μm (d). Each of these microchambers has a 21-μm×7-μm pillar in the middle for supporting the membrane. We used this microchamber array for the exponential-phase experiment of E. coli. The other is an array of microchambers whose dimensions are 40 μm (w) × 30 μm (h) × 1 μm (d). We used this type of microchamber array for the stationary-phase-regrowth experiment in Figure 6. We fabricated these microchamber arrays following procedures similar to those described in Hashimoto et al., 2016; Inoue et al., 2001.

The photomasks for the microchamber array were created by laser drawing (DDB-201-TW, Neoark) on mask blanks (CBL4006Du-AZP, CLEAN SURFACE TECHNOLOGY). The photoresist on mask blanks was developed in NMD-3 (Tokyo Ohka Kogyo). The uncovered chromium (Cr)-layer was removed in MPM-E30 (DNP Fine Chemicals), and the remaining photoresist was removed by acetone. Lastly, the slide was rinsed in MilliQ water and air-dried.

The microchamber array was created in glass coverslips by chemical etching. First, we coated a 1,000-angstrom Cr-layer on a clean coverslip (NEO Micro glass, No. 1., 24 mm × 60 mm, Matsunami) by evaporative deposition and AZP1350 (AZ Electronic Materials) by spin-coating on the Cr-layer. We transferred the photomask patterns using a mask aligner (MA-20, Mikasa). After developing the photoresist in NMD-3 and the Cr-layer in MPM-E30, the coverslip was soaked in buffered hydrofluoric acid solution (110-BHF, Morita Kagaku Kogyo) for 14 minutes 20 seconds at 23°C for glass etching. The etching reaction was stopped by soaking the coverslip in milliQ water. The remaining photoresist and the Cr-layer were removed by acetone and MPM-E30, respectively.

Fabrication of PDMS pad

We used a polydimethylsiloxane (PDMS) pad to flow culture medium and control the environmental conditions around the cells in the microchamber array. The PDMS pad was designed to have a square bubble-trap groove, which prevents interference with bright-field microscopic imaging by air bubbles in flowing media.

To create a mold for the bubble-trap groove, we spin-coated SU-8 3050 (Kayaku Advanced Materials) on a silicon wafer (ID 447, $ϕ$ = 76.2 mm, University Wafer) and baked it at 95°C for 2 hr on a hot plate. The SU-8 layer was exposed to UV light on a mask aligner using a photomask and postbaked at 95°C for 2 hr. After cooling down to room temperature, the SU-8 photoresist was developed in the SU-8 developer (Kayaku Advanced Materials) and rinsed with isopropanol (Wako).

Part A and Part B of PDMS resin (SYLGARD 184 Silicone Elastomer Kit, DOW SILICONES) were mixed at 10:1 and poured onto the SU-8 mold. The air bubbles were removed under reduced pressure for 30 min. The PDMS was cured at 65°C for 1 hour, and a 20 mm × 20 mm square PDMS pad was cut out using a blade. We punched out two holes ( $ϕ$ = 2 mm) in the PDMS pad for the inlet and outlet, and 10-cm silicone tubes (SR-1554, Tigers Polymer Corp., outer $ϕ$ = 2 mm, inner $ϕ$ = 1 mm) were inserted into the holes. The tubes were fixed to the holes by gluing a small amount of PDMS around the tubes at the holes. This PDMS pad was washed in isopropanol by sonication and autoclaved for the single-cell measurements.

Chemical decoration of coverslip and cellulose membrane

We washed the microfabricated coverslips by sonication in contaminon (Wako), ethanol (Wako), and 0.1 M NaOH solution (Wako). The washed coverslips were rinsed in milliQ water by sonication and dried at 140°C for 30 min. The washed coverslip was soaked in 1% (v/v) 3-(2-aminoethylaminopropyl)trimethoxysilane solution (Shinetsu Kagaku Kogyo) for 30 min and incubated at 140°C for 30 min to create an amino group on the glass surface. The treated coverslip was washed in milliQ water for 15 min and dried at 140°C for 30 min. 1 mg NHS-LC-LC-Biotin (Funakoshi) was dissolved in 25 μl dimethyl sulfoxide and dispersed in 1 ml phosphate buffer (0.1 mM, pH8.0). A total of 200 μl of this biotin solution was placed on the coverslip and incubated at room temperature for 4 hr. The biotin solution was removed by soaking the coverslip in milliQ water.

We prepared a streptavidin-decorated cellulose membrane to enclose cells in the microchamber array while retaining a flexible environmental control. First, a 3 cm × 3 cm square cellulose membrane (Spectra/Por7 Pre-treated RC Tubing MWCO:25kD) was cut out and washed in milliQ water for 10 min. The membrane was incubated in a 0.1 M NaIO₄ solution with gentle shaking for 4 hr at 25°C. After the wash in milliQ water, the treated membrane was incubated in a 500-μl solution of streptavidin hydrazide (Funakoshi) (10 μg/ml, dissolved in 0.1 mM phosphate buffer (pH7.0)) with gentle shaking for 14 hr at 25°C. The membrane was again washed in milliQ water and stored at 4°C.

E. coli strains

We used two E. coli strains: MG1655 and MG1655 F3 rpoS-mcherry (MG1655 ΔfliCΔfimAΔflu rpoS-mcherry/pUA66-PrplS-gfp). MG1655 was used in the regrowth experiment from the stationary phases (Figure 6). MG1655 F3 rpoS-mcherry was used for analyzing the growth in constant environments (Figures 5 and 7). In MG1655 F3 rpoS-mcherry, the three genes, fliC, fimA, and flu, were deleted, and mcherry gene was inserted downstream of rpoS gene to express RpoS-mCherry translational fusion protein. This strain also expresses green fluorescent protein (GFP) from a low-copy plasmid, pUA66-PrplS-gfp, taken from a comprehensive library of fluorescent transcriptional reporters (Zaslaver et al., 2006).

Culture conditions and sample preparation (exponential growth)

We used MG1655 F3 rpoS-mcherry E. coli strain and cultured the cells in M9 minimal medium (Difco) supplemented with 1/2 MEM amino acids solution (SIGMA) and 0.2% (w/v) glucose or glycerol as a carbon source. We set the cultivation temperature either at 37°C or 30°C.

To prepare E. coli cells for single-cell observation, we first inoculated a glycerol stock into a 3-ml culture medium and incubated it with shaking overnight under the same conditions of culture medium and temperature as those used in the time-lapse measurement. 30 μl of the overnight culture was inoculated in a 3-ml fresh medium and incubated with shaking until the optical density at $λ$ = 600 nm reached 0.1-0.3. This exponential-phase culture was diluted to OD₆₀₀ = 0.05, and 0.5 μl of the diluted cell suspension was spotted on the microchamber array on a biotin-decorated coverslip. A 5-mm × 5-mm streptavidin-decorated cellulose membrane was placed gently on the cell suspension on the coverslip, and excess cell suspension was removed with a clean filter paper. A small piece of agar pad made with the culture medium and 1.5% (w/v) agar was placed on the cellulose membrane to maintain the culture conditions around the cells until tight streptavidin-biotin bonding was formed between the coverslip and the membrane. After 5-min incubation, the agar pad was removed, and the PDMS pad for medium perfusion was attached on the coverslip via a square-frame two-sided seal (Frame-Seal Incubation Chambers, Bio-rad). We immediately filled the device with the fresh medium and connected it to a syringe pump on the microscope stage.

Culture conditions and sample preparation (regrowth from stationary phases)

We used E. coli MG1655 strain and cultured the cells in Luria-Bertani (LB) medium at 37°C. To prepare the cells for the time-lapse experiment, a glycerol stock of this strain was inoculated into a 2 ml LB medium and cultured with shaking for 15 hours. The cell culture was diluted in 50 ml fresh LB medium to OD₆₀₀ = 0.005 and again cultured with shaking as a pre-culture. For preparing the early-stationary-phase conditioned medium, 7 ml pre-culture cell suspension at 8 hr (OD₆₀₀ ≈ 4.3) was spun down at 2600 G for 12 min. The supernatant was filtered through a 0.22-μm filter. For preparing cells for time-lapse observation, a 10-μl pre-culture cell suspension at 8 hr was mixed with 240 μl early-stationary-phase conditioned medium. One μl of this diluted cell suspension was placed on the microchamber array on a biotin-decorated glass coverslip. A 5-mm × 5-mm streptavidin-decorated cellulose membrane was placed gently on the cell suspension on the coverslip, and an excess cell suspension was removed by a clean filter paper. A small piece of a conditioned medium agar pad made with 1.5% (w/v) agar was placed on a cellulose membrane to maintain the early stationary phase condition during the incubation. After 5-min incubation, the conditioned medium agar pad was removed, and the PDMS pad for medium perfusion was attached on the coverslip via a square-frame two-sided seal. We immediately filled the device with the conditioned medium and connected it to a syringe pump. We maintained the chamber filled with the conditioned medium until we started the time-lapse observation. The conditioned medium was flushed away immediately before starting the time-lapse measurement by flowing fresh LB medium. After flowing 2 ml fresh LB medium at 32 ml/hr, the flow rate was decreased and maintained at 2 ml/hr throughout the time-lapse measurement.

We followed the same procedures for the late stationary phase sample except that we sampled the cells and prepared the conditioned medium from a 24-hr pre-culture cell suspension (OD₆₀₀ ≈ 3.0).

Time-lapse measurements and image analysis

We used Nikon Ti-E inverted microscope equipped with Plan Apo $λ$ 100× phase contrast objective (NA1.45), ORCA-R2 cooled CCD camera (Hamamatsu Photonics), Thermobox chamber (Tokai hit, TIZHB), and LED excitation light source (Thorlabs, DC2100). The microscope was controlled by Micromanager (Edelstein et al., 2014). In the exponential phase experiments, we monitored 25-30 microchambers in parallel in one measurement and acquired the phase-contrast, RpoS-mCherry fluorescence, and GFP fluorescence images from each position with a 3-min interval. We repeated the time-lapse measurement for each culture condition three times. In the regrowth experiment from the stationary phases, we monitored 150-250 microchambers in parallel with a 3-min interval and acquired only phase-contrast images.

We analyzed the time-lapse images by ImageJ (Schneider et al., 2012). We extracted the information of cell size (projected cell area), RpoS-mCherry fluorescence mean intensity, and GFP fluorescence mean intensity of individual cells along with division timings on each cell lineage for the exponential phase experiment. We extracted only division timings on each cellular lineage for the regrowth experiments from the stationary phases and used this information for further analysis.

Data analysis

Distributions and selection strength measures for division count

We calculated the distributions and selection strength measures of $D$ as follows. With the list of division counts $\{D\}$ for each lineage $σ$ , the chronological and retrospective probabilities were evaluated as $P_{cl} (σ) = 2^{- D (σ)} / N_{0}$ and $P_{rs} (σ) = 1 / N_{τ}$ , respectively, where $N_{0}$ is the number of cells at $t = 0$ and $N_{τ}$ is that at $t = τ$ . From these probabilities, the chronological and retrospective distributions of $D$ were obtained by summing the lineage probabilities for each division count, that is,

Q_{cl} (d) = \sum_{σ : D (σ) = d} P_{cl} (σ),

(21)

Q_{rs} (d) = \sum_{σ : D (σ) = d} P_{rs} (σ) .

(22)

The selection strength measures, $S_{KL}^{(1)} [D]$ and $S_{KL}^{(2)} [D]$ , were calculated as

S_{KL}^{(1)} [D] = \sum_{d \in D_{supp}} Q_{cl} (d) \ln \frac{Q_{cl} (d)}{Q_{rs} (d)},

(23)

S_{KL}^{(2)} [D] = \sum_{d \in D_{supp}} Q_{rs} (d) \ln \frac{Q_{rs} (d)}{Q_{cl} (d)},

(24)

where $D_{supp}$ is the support of both chronological and retrospective probabilities with respect to $D$ , which is common between the two probabilities.
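As an illustration of Equations 21-24, the following minimal Python sketch computes the two distributions and the two selection strength measures from a list of division counts. The function name, the input format (one division count per lineage), and the toy tree are assumptions made for this example; they are not taken from the deposited analysis code.

```python
import math
from collections import defaultdict


def division_count_statistics(division_counts, n0):
    """Return Q_cl, Q_rs, S_KL^(1)[D], and S_KL^(2)[D].

    division_counts: one division count D(sigma) per lineage, i.e. per cell
                     present at t = tau (hypothetical input format).
    n0:              number of cells at t = 0.
    """
    n_tau = len(division_counts)
    q_cl = defaultdict(float)
    q_rs = defaultdict(float)
    for d in division_counts:
        q_cl[d] += 2.0 ** (-d) / n0  # P_cl(sigma) = 2^{-D(sigma)} / N_0
        q_rs[d] += 1.0 / n_tau       # P_rs(sigma) = 1 / N_tau
    support = sorted(q_cl)           # both distributions share this support
    s1 = sum(q_cl[d] * math.log(q_cl[d] / q_rs[d]) for d in support)
    s2 = sum(q_rs[d] * math.log(q_rs[d] / q_cl[d]) for d in support)
    return dict(q_cl), dict(q_rs), s1, s2


# Toy tree grown from a single ancestor (made-up data): two lineages with
# 2 divisions and four lineages with 3 divisions.
q_cl, q_rs, s1, s2 = division_count_statistics([2, 2, 3, 3, 3, 3], n0=1)
print(q_cl, q_rs, s1, s2)
```

In this toy tree the chronological weights sum to one because the list contains every lineage of a complete tree; with incomplete trees the normalization would need separate care.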

Distributions and selection strength measures for time-averaged fluorescence intensity of RpoS-mCherry and GFP

We obtained the mean fluorescence intensity of RpoS-mCherry and GFP along with the genealogical trees in the time-lapse measurements of E. coli MG1655 F3 rpoS-mcherry strain. We analyzed the time-averaged fluorescence intensity of RpoS-mCherry and GFP as a lineage trait $X$ and evaluated their distributions, fitness landscapes, and selection strength measures (Figure 7). For each cell lineage, the time-averaged fluorescence intensity was calculated as

X (σ) = \frac{1}{N + 1} \sum_{i = 0}^{N} x_{σ} (t_{i}),

(25)

where $t_{i} = t_{start} + i Δ t$ min ( $t_{start}$ is the start time of the cell lineage; $Δ t = 3$ min is the time-lapse interval), and $x_{σ} (t_{i})$ is the mean fluorescence intensity at time $t_{i}$ .

Generally, bin sizes for the fluorescence intensity affect the selection strength values. However, one can usually find the ranges of bin sizes where the results are relatively insensitive to the choice (Nozoe et al., 2017). Following an empirical rule, we set the bin width $Δ X$ to

Δ X = 0.4 \times IQR (\{X\}),

(26)

where $IQR (\{X\})$ is the interquartile range of the set of $X (σ)$ from all the cell lineages. Then, the interval was defined as $I_{x, Δ X} = [x - \frac{Δ X}{2}, x + \frac{Δ X}{2}]$ for $x = \min (\{X\}), \min (\{X\}) + Δ X, \dots, \min (\{X\}) + (L - 1) Δ X$ , where $L$ is the total number of bins given by $L = ⌊ \frac{\max (\{X\}) - \min (\{X\})}{Δ X} ⌋ + 2$ .

We calculated the chronological and retrospective probability distributions of $X$ by

Q_{c l} (x) = \sum_{σ : X (σ) \in I_{x, Δ X}} \frac{2^{- D (σ)}}{N_{0}},

(27)

Q_{r s} (x) = \sum_{σ : X (σ) \in I_{x, Δ X}} \frac{1}{N_{τ}} .

(28)

The fitness landscape $h (x)$ was evaluated by

h (x) = \ln \left[ \frac{N_{τ}}{N_{0}} \frac{Q_{rs} (x)}{Q_{cl} (x)} \right] .

(29)

The selection strength measures were evaluated by

S_{KL}^{(1)} [X] = \sum_{l = 0}^{L - 1} Q_{cl} (\min (\{X\}) + l Δ X) \ln \frac{Q_{cl} (\min (\{X\}) + l Δ X)}{Q_{rs} (\min (\{X\}) + l Δ X)},

(30)

S_{KL}^{(2)} [X] = \sum_{l = 0}^{L - 1} Q_{rs} (\min (\{X\}) + l Δ X) \ln \frac{Q_{rs} (\min (\{X\}) + l Δ X)}{Q_{cl} (\min (\{X\}) + l Δ X)} .

(31)
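The binning and the trait-based quantities in Equations 26-31 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name and the toy data are hypothetical, the quartiles use the "inclusive" interpolation of `statistics.quantiles` (the quartile convention is not specified in the text), and the interquartile range is assumed to be nonzero. Empty bins are skipped in the sums and reported as `None` in the fitness landscape.

```python
import math
import statistics


def trait_landscape(traits, division_counts, n0):
    """Bin a lineage trait X and evaluate Q_cl, Q_rs, h(x), S_KL^(1), S_KL^(2)."""
    n_tau = len(traits)
    q1, _, q3 = statistics.quantiles(traits, n=4, method="inclusive")
    dx = 0.4 * (q3 - q1)                           # bin width, Equation 26
    x_min = min(traits)
    n_bins = math.floor((max(traits) - x_min) / dx) + 2
    q_cl = [0.0] * n_bins
    q_rs = [0.0] * n_bins
    for x, d in zip(traits, division_counts):
        l = math.floor((x - x_min) / dx + 0.5)     # bin centred at x_min + l * dx
        q_cl[l] += 2.0 ** (-d) / n0                # Equation 27
        q_rs[l] += 1.0 / n_tau                     # Equation 28
    centers = [x_min + l * dx for l in range(n_bins)]
    h = [math.log(n_tau / n0 * rs / cl) if cl > 0 else None
         for cl, rs in zip(q_cl, q_rs)]            # Equation 29
    s1 = sum(cl * math.log(cl / rs) for cl, rs in zip(q_cl, q_rs) if cl > 0)
    s2 = sum(rs * math.log(rs / cl) for cl, rs in zip(q_cl, q_rs) if cl > 0)
    return centers, q_cl, q_rs, h, s1, s2


# Made-up trait values paired with made-up division counts for six lineages.
traits = [1.0, 1.2, 2.0, 2.1, 2.2, 2.4]
counts = [2, 2, 3, 3, 3, 3]
centers, q_cl, q_rs, h, s1, s2 = trait_landscape(traits, counts, n0=1)
for c, value in zip(centers, h):
    print(round(c, 3), value)
```

A quick consistency check on any output is that the chronological average of $e^{h (x)}$ over occupied bins returns $N_{τ} / N_{0}$ .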

Cumulant generating functions and cumulants

To plot the differential of the cumulant generating functions in Figure 5F-H, we evaluated $K_{D}^{'} (ξ) = \frac{\sum_{d \in D_{s u p p}} (d \ln 2) 2^{ξ d} Q_{c l} (d)}{\sum_{d \in D_{s u p p}} 2^{ξ d} Q_{c l} (d)}$ by changing $ξ$ from 0 to 1 with the step size 0.01.
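A minimal Python sketch of this evaluation is given below. The function name and the two-point toy distribution are assumptions for illustration only. By construction, the value at $ξ = 0$ is the chronological mean of $d \ln 2$ , and the value at $ξ = 1$ is the same mean taken under the distribution reweighted by $2^{d}$ .

```python
import math


def cgf_derivative(q_cl, xi):
    """Evaluate K_D'(xi) from a chronological distribution {d: Q_cl(d)}."""
    num = sum(d * math.log(2) * 2.0 ** (xi * d) * p for d, p in q_cl.items())
    den = sum(2.0 ** (xi * d) * p for d, p in q_cl.items())
    return num / den


# Made-up chronological distribution of division counts.
q_cl = {2: 0.5, 3: 0.5}
xis = [i / 100 for i in range(101)]          # xi from 0 to 1 in steps of 0.01
curve = [cgf_derivative(q_cl, xi) for xi in xis]
print(curve[0], curve[-1])
```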

We calculated the cumulative contributions of fitness cumulants to the population growth $W_{n}^{(X)}$ (Figures 5A—7F-H) using a Julia package, JuliaDiff/TaylorSeries.jl (Benet and Sanders, 2019; Benet and Sanders, 2021).

Error estimations by resampling method

To evaluate the error ranges of the quantities calculated in the analysis, we created 20,000 randomly resampled datasets for each condition and reported the means and two standard deviation ranges in the results.

For the datasets of colony growth (E. coli and M. smegmatis), $N_{τ}$ lineages were randomly sampled with replacement according to the probability weight $P_{rs} (σ)$ for each resampled dataset. In each resampled dataset, the initial number of cells was estimated as ${\hat{N}}_{0} = \sum_{σ \in {σ}_{resampled}} 2^{- D (σ)}$ .

For the datasets taken using the mother machines (S. pombe and L1210), we randomly sampled N₀ lineages with an equal weight, which corresponds to the chronological probability in this setting. $N_{τ}$ was estimated as ${\hat{N}}_{τ} = \sum_{σ \in {σ}_{resampled}} 2^{D (σ)}$ .
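The colony-growth resampling can be sketched in Python as below. This is a simplified illustration, not the deposited code: the function name is hypothetical, the number of resamples is reduced from 20,000 to keep the example fast, and only $τ Λ = \ln (N_{τ} / {\hat{N}}_{0})$ is reported. The mother-machine case would be analogous, sampling lineages with equal weight and estimating ${\hat{N}}_{τ}$ from $2^{D (σ)}$ instead.

```python
import math
import random
import statistics


def bootstrap_tau_lambda(division_counts, n_resamples=2000, seed=0):
    """Resample lineages retrospectively; return (mean, 2 SD) of tau * Lambda."""
    rng = random.Random(seed)
    n_tau = len(division_counts)
    estimates = []
    for _ in range(n_resamples):
        # P_rs(sigma) = 1 / N_tau, so retrospective sampling is uniform.
        sample = rng.choices(division_counts, k=n_tau)
        n0_hat = sum(2.0 ** (-d) for d in sample)   # estimated initial cell number
        estimates.append(math.log(n_tau / n0_hat))
    return statistics.fmean(estimates), 2.0 * statistics.stdev(estimates)


# Made-up division counts for six lineages of a single colony.
mean_tl, two_sd = bootstrap_tau_lambda([2, 2, 3, 3, 3, 3])
print(mean_tl, two_sd)
```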

Simulating the effect of cell removal on population growth rates

We simulated cell population growth with cell removal using a custom C script. The gamma distributions were adopted as generation time distributions. We assigned the shape parameter to $k =$ 1, 2, or 5 and the scale parameter to $θ = 2^{1 / k} - 1$ . The perturbation strength $ϵ$ was changed from 0 to 0.2 with the interval 0.01.
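One way to read this choice of scale parameter, assuming independent and identically distributed generation times with density $g (t)$ , is through the Euler-Lotka equation. With each daughter cell retained with probability $2^{- ϵ}$ , the long-term growth rate $Λ (ϵ)$ would satisfy

2^{1 - ϵ} \int_{0}^{\infty} g (t) e^{- Λ (ϵ) t} d t = 1 .

For a gamma density with shape $k$ and scale $θ$ , the integral equals $(1 + θ Λ (ϵ))^{- k}$ , so that

Λ (ϵ) = \frac{2^{(1 - ϵ) / k} - 1}{θ} .

With $θ = 2^{1 / k} - 1$ , this gives $Λ (0) = 1$ for every $k$ , so the three shape parameters would share the same unperturbed growth rate. This reading is an inference from the stated parameters under the independence assumption rather than a statement made in the text.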

As a pre-run, we started a simulation from a newborn cell and assigned its generation time randomly according to a pre-defined gamma probability distribution. We assumed that this cell divided into two daughter cells at the end of the generation. Each daughter cell was removed with probability $1 - 2^{- ϵ}$ and assigned with generation time from the same pre-defined probability distribution if it escaped removal. Repeating this procedure, we let the population grow until all of the remaining cell lineages in the population exceed the maximum duration $T_{\max} = 8.0$ . The time to the next division of each cell lineage at $T_{\max}$ was exported as the first division time in the main simulation. This pre-run was repeated 1000 cycles to export a sufficiently sizable list of first division times.

In the main simulation, we started from a progenitor cell with its division time randomly assigned from the first division time list exported in the pre-run. For the daughter cells born from the first divisions and their descendants, the assignment of generation time and the cell removal were done as in the pre-run. We stopped further production of daughter cells in each lineage if it exceeded $T_{\max} = 8.0$ . We repeated this main simulation 1,000 cycles starting from different progenitor cells. The number of cell divisions in each cell lineage until $T_{\max}$ was exported for analysis.

We calculated the population growth rate at each perturbation strength as

Λ (ϵ) = \frac{1}{T_{\max}} \ln \frac{N (T_{\max}, ϵ)}{1000},

(32)

where $N (T_{\max}, ϵ)$ is the number of cell lineages at $T_{\max}$ when the perturbation strength was $ϵ$ . The chronological and retrospective mean fitness of division count without cell removal was calculated as

⟨ \tilde{h} (D) ⟩_{cl} = \sum_{σ = 1}^{N (T_{\max}, 0)} \frac{(D (σ) \ln 2) 2^{- D (σ)}}{1000},

(33)

⟨ \tilde{h} (D) ⟩_{rs} = \sum_{σ = 1}^{N (T_{\max}, 0)} \frac{D (σ) \ln 2}{N (T_{\max}, 0)} .

(34)
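The simulation and Equations 32-34 can be illustrated with the following Python sketch. It is a deliberately simplified stand-in for the C code, under assumptions that differ from the procedure above: there is no pre-run (every progenitor starts as a newborn cell), and $T_{\max}$ and the number of progenitors are reduced to 4.0 and 200 so that the example runs quickly. All function and variable names are hypothetical.

```python
import math
import random


def simulate_division_counts(k, eps, t_max, n_progenitors, seed=0):
    """Grow lineage trees with gamma generation times and random cell removal.

    Returns the division count of every lineage that survives until t_max.
    """
    rng = random.Random(seed)
    theta = 2.0 ** (1.0 / k) - 1.0      # scale parameter used in the text
    p_keep = 2.0 ** (-eps)              # each daughter is removed with 1 - 2^{-eps}
    counts = []
    for _ in range(n_progenitors):
        # Each stack entry: (time of the cell's next division, divisions so far).
        stack = [(rng.gammavariate(k, theta), 0)]
        while stack:
            t_div, d = stack.pop()
            if t_div > t_max:           # lineage reaches t_max without dividing again
                counts.append(d)
                continue
            for _ in range(2):          # two daughters per division
                if rng.random() < p_keep:
                    stack.append((t_div + rng.gammavariate(k, theta), d + 1))
    return counts


def growth_rate(counts, t_max, n_progenitors):
    """Equation 32 with the number of progenitors as the initial cell number."""
    return math.log(len(counts) / n_progenitors) / t_max


T_MAX, N_PROG = 4.0, 200                # reduced from 8.0 and 1,000 for speed
counts0 = simulate_division_counts(k=2, eps=0.0, t_max=T_MAX, n_progenitors=N_PROG)
counts1 = simulate_division_counts(k=2, eps=0.2, t_max=T_MAX, n_progenitors=N_PROG)
rate0 = growth_rate(counts0, T_MAX, N_PROG)
rate1 = growth_rate(counts1, T_MAX, N_PROG)

# Chronological and retrospective mean fitness without removal (Equations 33, 34).
h_cl = sum(d * math.log(2) * 2.0 ** (-d) for d in counts0) / N_PROG
h_rs = sum(d * math.log(2) for d in counts0) / len(counts0)
print(rate0, rate1, h_cl, T_MAX * rate0, h_rs)
```

Because the progenitors here are newborn rather than drawn from a pre-run, the unperturbed growth rate over this short window is expected to deviate somewhat from the long-term value.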

When simulating the cell population with mother-daughter correlation in generation time, we randomly assigned the generation time from the gamma probability distribution with shape parameter $\frac{r τ_{m} / θ + k (1 - r)}{1 - r^{2}}$ and scale parameter $(1 - r^{2}) θ$ , where $τ_{m}$ is the generation time of the mother cell and $r$ is the correlation coefficient of generation time between neighboring generations. The stationary distribution of this transition probability approximates the gamma distribution with shape parameter $k$ and scale parameter $θ$ to good precision with identical first and second-order moments irrespective of the parameters $k$ , $θ$ , and $r$ . In Figure 4—figure supplement 1, we fixed $k = 2$ and $θ = \sqrt{2} - 1$ and set $r$ to 0, 0.2, 0.4, or 0.6.
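The transition rule for correlated generation times can be checked numerically with a short Python sketch. The code below follows a single lineage for many generations and compares the sample mean, variance, and lag-1 correlation with $k θ$ , $k θ^{2}$ , and $r$ . The function name, the chain length, the seed, and the choice $r = 0.4$ are assumptions for this example.

```python
import math
import random
import statistics


def next_generation_time(tau_m, k, theta, r, rng):
    """Draw a daughter's generation time given the mother's generation time tau_m."""
    shape = (r * tau_m / theta + k * (1.0 - r)) / (1.0 - r ** 2)
    scale = (1.0 - r ** 2) * theta
    return rng.gammavariate(shape, scale)


rng = random.Random(1)
k, theta, r = 2.0, math.sqrt(2.0) - 1.0, 0.4
tau = k * theta                      # start the chain at the stationary mean
chain = []
for _ in range(50000):
    tau = next_generation_time(tau, k, theta, r, rng)
    chain.append(tau)

mean = statistics.fmean(chain)
var = statistics.pvariance(chain)
# Lag-1 autocovariance between consecutive generations along the lineage.
cov = sum((a - mean) * (b - mean) for a, b in zip(chain, chain[1:])) / (len(chain) - 1)
lag1 = cov / var
print(mean, var, lag1)
```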

Data and code availability

The raw data obtained in this study, the Matlab codes for data analysis, and the C code for simulation have been deposited in GitHub repositories: https://github.com/Wakamoto-lab/LineageAnalysis (copy archived at swh:1:rev:1865d167f1c24625c98d3c493a9a180b1aa2035d; Yamauchi, 2021), https://github.com/Wakamoto-lab/LineageAnalysis-Julia (copy archived at swh:1:rev:e22fbce8a713582a18fbe2bcc57dc9078090f121; Nozoe and Wakamoto, 2021), and https://github.com/Wakamoto-lab/LineageSimulation (copy archived at swh:1:rev:ef1166620396835168ca9061851898993a091976; Wakamoto, 2021).

Acknowledgements

We thank Tetsuya J Kobayashi and the members of the Wakamoto Lab for discussion. This work was supported by JST CREST Grant Number JPMJCR1927 (YW); JST ERATO Grant Number JPMJER1902 (YW); NIH Grant Number R01-GM097356 (EK); and Japan Society for the Promotion of Science KAKENHI Grant Number 17H06389 (YW), 19H03216 (YW), and 21K20672 (TN).

Appendix 1

Analytical calculations of fitness measures, selection strength, and the cumulants of a fitness landscape

To observe how the framework works, we show the exact form of $K_{D} (ξ)$ for a class of discrete probability distributions containing Poisson, binomial and negative binomial distributions. Let $\bar{D}$ and $\bar{D} ϕ$ denote the mean and the variance of $Q_{cl} (D)$ respectively (i.e., $ϕ$ is the Fano factor of division counts). When $Q_{cl} (D)$ is Poisson, binomial or negative binomial distributions, $\bar{D}$ and $ϕ$ uniquely determine the form of probability distribution: $ϕ = 1$ for Poisson; $ϕ < 1$ for binomial; and $ϕ > 1$ for negative binomial (Appendix 1—figure 1A). Then, $K_{D} (ξ)$ for these distributions have a closed form

K_{D} (ξ) = \begin{cases} \bar{D} \frac{\ln (2^{ξ} (1 - ϕ) + ϕ)}{1 - ϕ}, & ϕ \neq 1 \\ \bar{D} (2^{ξ} - 1), & ϕ = 1 \end{cases}

(35)

Appendix 1—figure 1. (A) Chronological division count distributions. $ϕ = 0.3$ and $ϕ = ϕ_{0} (= 0.5857 \dots)$ are binomial, $ϕ = 1$ is Poisson and $ϕ = 1.6$ is negative binomial. $\bar{D} = 20 (1 - ϕ_{0})$ is fixed. (B) Cumulative contributions of fitness cumulants. Parameter values are given in the legend of panel A. (C) The relation between two selection strength measures. Binomial (blue curve), Poisson (closed black circle) and negative binomial (orange curve) are indicated on the single curve plotted using Equations 37 and 38 within the range of $0 < ϕ < 2$ . The point where $S_{KL}^{(1)} [D] = S_{KL}^{(2)} [D]$ ( $ϕ = ϕ_{0}$ ) is indicated by the open black circle. The grey dotted line corresponds to $S_{KL}^{(1)} [D] = S_{KL}^{(2)} [D]$ . (D) Convexity of $K_{D}^{'} (ξ)$ . Y-axis shows a rescaling of $K_{D}^{'} (ξ)$ according to $(K_{D}^{'} (ξ) - K_{D}^{'} (0)) / (K_{D}^{'} (1) - K_{D}^{'} (0))$ . The same values of $ϕ$ as in A are used; $ϕ = 0.3$ (blue), $ϕ = ϕ_{0}$ (orange), $ϕ = 1$ (green) and $ϕ = 1.6$ (red). The grey dotted line indicates the case that $K_{D}^{'} (ξ)$ is a linear function of $ξ$ .

The derivation of this closed form is given in Appendix 3. We then immediately obtain

τ Λ = K_{D} (1) = \begin{cases} \bar{D} \frac{\ln (2 - ϕ)}{1 - ϕ}, & ϕ \neq 1 \\ \bar{D}, & ϕ = 1 \end{cases}

(36)

Since $\lim_{ϕ \to 2} K_{D} (1) = \infty$ , $0 < ϕ < 2$ is the range that the Fano factor of division counts can take within this scheme.

Using (Equation 35) allows us to calculate the cumulative contribution of cumulants of a fitness landscape $W_{n}^{(D)}$ (Equation 14). Plotting $W_{n}^{(D)}$ shows that the contribution of higher-order cumulants becomes significant when $ϕ$ is large (Appendix 1—figure 1B). Also, evaluating the values of the derivative of Equation 35 at $ξ = 0$ and $ξ = 1$ , we have

\frac{S_{KL}^{(1)} [D]}{τ Λ} = 1 - \frac{K_{D}^{'} (0)}{K_{D} (1)} = 1 - \frac{(1 - ϕ) \ln 2}{\ln (2 - ϕ)},

(37)

\frac{S_{KL}^{(2)} [D]}{τ Λ} = \frac{K_{D}^{'} (1)}{K_{D} (1)} - 1 = \frac{2 (1 - ϕ) \ln 2}{(2 - ϕ) \ln (2 - ϕ)} - 1 .

(38)

Therefore, $S_{KL}^{(1)} [D] / τ Λ$ and $S_{KL}^{(2)} [D] / τ Λ$ depend only on the Fano factor $ϕ$ . In particular, $S_{KL}^{(1)} [D] = S_{KL}^{(2)} [D]$ has two roots, $ϕ = 0$ and $ϕ = ϕ_{0} (= 0.5857 \dots)$ ; $S_{K L}^{(1)} [D] > S_{K L}^{(2)} [D]$ if $0 < ϕ < ϕ_{0}$ and $S_{K L}^{(1)} [D] < S_{K L}^{(2)} [D]$ if $ϕ_{0} < ϕ < 2$ (Appendix 1—figure 1C). Plotting $K_{D}^{'} (ξ)$ confirms that the direction of convexity changes around $ϕ_{0}$ (Appendix 1—figure 1D). These analyses demonstrate how one can extract detailed information regarding selection in populations from $Q_{cl} (D)$ .
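The closed forms above can be checked numerically. The following Python sketch evaluates Equation 35, differentiates it by central finite differences, and forms the two ratios of Equations 37 and 38. The function names, the step size, and the value $\bar{D} = 10$ are choices made for this illustration only.

```python
import math


def K_D(xi, D_bar, phi):
    """Closed form of K_D(xi) in Equation 35 (valid for 0 < phi < 2)."""
    if abs(phi - 1.0) < 1e-12:
        return D_bar * (2.0 ** xi - 1.0)
    return D_bar * math.log(2.0 ** xi * (1.0 - phi) + phi) / (1.0 - phi)


def dK_D(xi, D_bar, phi, step=1e-6):
    """Central finite-difference approximation of K_D'(xi)."""
    return (K_D(xi + step, D_bar, phi) - K_D(xi - step, D_bar, phi)) / (2.0 * step)


def selection_ratios(D_bar, phi):
    """Return (S_KL1 / tau Lambda, S_KL2 / tau Lambda) from K_D and its derivative."""
    tau_lambda = K_D(1.0, D_bar, phi)
    s1 = 1.0 - dK_D(0.0, D_bar, phi) / tau_lambda
    s2 = dK_D(1.0, D_bar, phi) / tau_lambda - 1.0
    return s1, s2


for phi in (0.3, 1.0, 1.6):
    s1, s2 = selection_ratios(10.0, phi)
    print(f"phi={phi}: S1/(tau Lambda)={s1:.4f}, S2/(tau Lambda)={s2:.4f}")
```

Changing $\bar{D}$ should leave both ratios unchanged, in line with the statement that they depend only on $ϕ$ .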

Appendix 2

Long-term limit for gamma-distributed uncorrelated generation times

To understand how inherent stochasticity affects long-term population growth rate and selection, we consider a cellular population in which cells divide stochastically following a probability distribution of generation times (interdivision times).

Let $g (x)$ and $z$ denote the probability density function of generation time $x$ and the mean number of offspring per generation, respectively. We assume that the generation time correlation between parent and offspring can be ignored; i.e., $g (x)$ gives the probability density that an offspring’s generation time becomes $x$ . The Malthusian parameter $λ$ is the real root of the so-called Euler-Lotka equation (Fisher, 1930):

z \int_{0}^{\infty} g (x) e^{- λ x} d x = 1.

(39)

We remark that (Equation 39) also holds for correlated generation times such as Markov models (Lebowitz and Rubinow, 1974) by reinterpreting $g (x)$ as the probability distribution of generation times of parent cells across a steadily growing population. In such cases, $g (x)$ depends on $z$ , and we cannot treat $g (x)$ in (Equation 43) as independent of $z = 2^{ξ}$ . Here, we ignore any transgenerational correlations in generation time to illustrate the effect of the variation in generation time on $K_{D} (ξ)$ and selection strength measures with simple calculations. For this purpose, we further choose gamma distributions as $g (x)$ , i.e.,

g (x) = \frac{x^{α - 1} e^{- x / θ}}{Γ (α) θ^{α}}, x \geq 0,

(40)

where $α > 0$ is a shape parameter and $θ > 0$ is a scale parameter. In this case, the Malthusian parameter is

λ = \frac{z^{1 / α} - 1}{θ} .

(41)
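As a quick numerical check of Equation 41, the following Python sketch evaluates the left-hand side of the Euler-Lotka equation (Equation 39) by a midpoint rule at the closed-form root. The integration cutoff, the grid size, and the function names are assumptions made for this illustration, and the quadrature is intended for $α \geq 1$ , where the gamma density is bounded at $x = 0$ .

```python
import math


def gamma_pdf(x, alpha, theta):
    """Gamma density of Equation 40 with shape alpha and scale theta."""
    return x ** (alpha - 1.0) * math.exp(-x / theta) / (math.gamma(alpha) * theta ** alpha)


def euler_lotka_lhs(lam, z, alpha, theta, x_max=40.0, n=100_000):
    """Midpoint-rule approximation of z * integral of g(x) exp(-lam x) over [0, x_max]."""
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += gamma_pdf(x, alpha, theta) * math.exp(-lam * x)
    return z * total * dx


def malthusian_parameter(z, alpha, theta):
    """Closed-form root of the Euler-Lotka equation for gamma g(x) (Equation 41)."""
    return (z ** (1.0 / alpha) - 1.0) / theta


alpha, theta, z = 2.0, math.sqrt(2.0) - 1.0, 2.0
lam = malthusian_parameter(z, alpha, theta)
print(f"lambda = {lam:.6f}, Euler-Lotka LHS = {euler_lotka_lhs(lam, z, alpha, theta):.6f}")
```

If Equation 41 is the root, the printed left-hand side should be close to 1 up to quadrature error.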

The probability distribution of division count $Q_{cl} (D)$ , in this case, is known as the gamma count distribution (Winkelmann, 1995). Although no closed-form expression of the corresponding cumulant generating function is known, it has a simple limiting form for $τ \to \infty$ as shown below. We define the rescaled cumulant generating function by

{\tilde{K}}_{D} (ξ) := \lim_{τ \to \infty} \frac{K_{D} (ξ)}{τ} .

(42)

Since ${\tilde{K}}_{D} (ξ)$ represents the population growth rate, or Malthusian parameter with the mean number of offspring $z = 2^{ξ}$ , we have

2^{ξ} \int_{0}^{\infty} g (x) e^{- {\tilde{K}}_{D} (ξ) x} d x = 1 .

(43)

When $g$ is a gamma distribution with a shape parameter $α$ and a scale parameter $θ$ , we obtain

{\tilde{K}}_{D} (ξ) = \frac{2^{ξ / α} - 1}{θ},

(44)

and

{\tilde{K}}_{D}^{'} (ξ) = \frac{2^{ξ / α} \ln 2}{α θ} .

(45)

Note that $α = 1$ corresponds to the case where division counts follow the Poisson distribution with mean $θ^{- 1}$ per unit time. The scaled key quantities derived from ${\tilde{K}}_{D} (ξ)$ are as follows.

Λ = {\tilde{K}}_{D} (1) = \frac{2^{1 / α} - 1}{θ},

(46)

{\tilde{K}}_{D}^{'} (0) = \frac{\ln 2}{α θ},

(47)

{\tilde{K}}_{D}^{'} (1) = \frac{2^{1 / α} \ln 2}{α θ},

(48)

{\tilde{S}}_{K L}^{(1)} [D] := {\tilde{K}}_{D} (1) - {\tilde{K}}_{D}^{'} (0),

(49)

and

{\tilde{S}}_{K L}^{(2)} [D] := {\tilde{K}}_{D}^{'} (1) - {\tilde{K}}_{D} (1) .

(50)

Hence,

\frac{{\tilde{S}}_{KL}^{(1)} [D]}{Λ} = 1 - \frac{{\tilde{K}}_{D}^{'} (0)}{{\tilde{K}}_{D} (1)} = 1 - \frac{\ln 2}{α (2^{1 / α} - 1)},

(51)

and

\frac{{\tilde{S}}_{KL}^{(2)} [D]}{Λ} = \frac{{\tilde{K}}_{D}^{'} (1)}{{\tilde{K}}_{D} (1)} - 1 = \frac{2^{1 / α} \ln 2}{α (2^{1 / α} - 1)} - 1 .

(52)

${\tilde{S}}_{K L}^{(2)} [D] > {\tilde{S}}_{K L}^{(1)} [D]$ always holds for $0 < α < \infty$ because

\begin{aligned} \frac{{\tilde{S}}_{K L}^{(2)} [D] - {\tilde{S}}_{K L}^{(1)} [D]}{Λ} & = \frac{(γ - 2) e^{γ} + γ + 2}{e^{γ} - 1} \\ & = \frac{1}{e^{γ} - 1} \sum_{n \geq 3} \frac{(n - 2) γ^{n}}{n!} > 0, \end{aligned}

(53)

where $γ = α^{- 1} \ln 2 > 0$ and the Taylor expansion of $e^{γ}$ is used; every term of the sum is positive for $γ > 0$ .
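The ordering of the two scaled selection strengths can also be checked numerically. The sketch below evaluates Equations 51 and 52 for several shape parameters; the function name and the list of $α$ values are arbitrary choices for this illustration.

```python
import math


def scaled_selection_ratios(alpha):
    """Return (S1 / Lambda, S2 / Lambda) from Equations 51 and 52."""
    g = math.log(2.0) / alpha      # gamma = ln 2 / alpha
    e = math.expm1(g)              # 2**(1/alpha) - 1
    s1 = 1.0 - g / e
    s2 = g * (e + 1.0) / e - 1.0
    return s1, s2


for alpha in (0.5, 1.0, 2.0, 5.0, 20.0):
    s1, s2 = scaled_selection_ratios(alpha)
    print(f"alpha={alpha}: S1/Lambda={s1:.5f}, S2/Lambda={s2:.5f}, difference={s2 - s1:.5f}")
```

For $α = 1$ the two ratios reduce to $1 - \ln 2$ and $2 \ln 2 - 1$ , and the difference shrinks as $α$ grows, that is, as the generation time distribution becomes narrower relative to its mean.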

Since the Taylor expansion of ${\tilde{K}}_{D} (ξ)$ at $ξ = 0$ is

{\tilde{K}}_{D} (ξ) = \frac{2^{ξ / α} - 1}{θ} = \sum_{n \geq 1} \frac{ξ^{n}}{n!} \frac{1}{θ} {(\frac{\ln 2}{α})}^{n},

(54)

the time-scaled $n$ -th order fitness cumulant is

{\tilde{κ}}_{n} := \lim_{τ \to \infty} \frac{κ_{n}}{τ} = \frac{1}{θ} {(\frac{\ln 2}{α})}^{n}, n = 1, 2, \dots

(55)

Therefore,

W_{n} = \frac{1}{Λ} \sum_{m = 1}^{n} \frac{{\tilde{κ}}_{m}}{m!} = \frac{\sum_{m = 1}^{n} \frac{1}{m!} {(\frac{\ln 2}{α})}^{m}}{2^{1 / α} - 1} .

(56)

These results show that, contrary to what one might expect from the central limit theorem, higher-order cumulants remain even in the long-term limit. Selection strength also remains in the long-term limit, which means that inherent stochasticity of generation times continuously introduces selection within a cellular population. Importantly, the time-scaled cumulants and the selection strength depend on $α$ . Therefore, the shape of generation time distributions influences the long-term population growth rate and selection. Since ${\tilde{S}}_{KL}^{(2)} [D]$ is always greater than ${\tilde{S}}_{KL}^{(1)} [D]$ , the fitness variance is larger in the retrospective distribution than in the chronological distribution.
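To make the role of $α$ concrete, the following sketch evaluates the cumulative contribution $W_{n}$ of Equation 56 for a few shape parameters. The function name and the chosen values of $α$ and $n$ are assumptions of this example.

```python
import math


def W_n(n, alpha):
    """Cumulative contribution of the first n fitness cumulants (Equation 56)."""
    g = math.log(2.0) / alpha
    partial = sum(g ** m / math.factorial(m) for m in range(1, n + 1))
    return partial / math.expm1(g)  # denominator is 2**(1/alpha) - 1


for alpha in (0.25, 1.0, 4.0):
    values = ", ".join(f"W_{n}={W_n(n, alpha):.3f}" for n in (1, 2, 3, 4))
    print(f"alpha={alpha}: {values}")
```

Small $α$ (broad generation time distributions) leaves a large share of the growth rate to higher-order cumulants, whereas for large $α$ the first cumulant alone accounts for most of it.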

Appendix 3

The properties of the selection strength of division count

Below we derive several important properties of the selection strength of division count. We focus on the selection strength measure $S_{KL}^{(1)}$ and write it as $S$ in this section for conciseness. However, the conclusions are likewise valid for $S_{JF}$ and $S_{KL}^{(2)}$ .

The most detailed description of cellular lineage statistics is based on individual lineages $σ$ . From the definitions of $P_{cl} (σ)$ and $P_{rs} (σ)$ in the main text, the relation

P_{rs} (σ) = P_{cl} (σ) e^{D (σ) \ln 2 - τ Λ}

(57)

holds. We define the selection strength of cellular lineages as

\begin{aligned} S [σ] := & \sum_{σ} P_{c l} (σ) \ln \frac{P_{c l} (σ)}{P_{r s} (σ)} \\ = & \sum_{σ} P_{c l} (σ) \ln \frac{P_{c l} (σ)}{P_{c l} (σ) e^{D (σ) \ln 2 - τ Λ}} \\ = & τ Λ - ⟨ D (σ) \ln 2 ⟩_{c l}, \end{aligned}

(58)

where ${⟨ D (σ) \ln 2 ⟩}_{cl} = \sum_{σ} (D (σ) \ln 2) P_{cl} (σ)$ .

From the definition of fitness landscape (Equation 1),

\begin{aligned} \tilde{h} (d) = & τ Λ + \ln \frac{Q_{r s} (d)}{Q_{c l} (d)} \\ = & τ Λ + \ln \frac{\sum_{σ : D (σ) = d} P_{r s} (σ)}{\sum_{σ : D (σ) = d} P_{c l} (σ)} \\ = & τ Λ + \ln \frac{\sum_{σ : D (σ) = d} P_{c l} (σ) e^{D (σ) \ln 2 - τ Λ}}{\sum_{σ : D (σ) = d} P_{c l} (σ)} \\ = & d \ln 2. \end{aligned}

(59)

On the other hand,

\begin{aligned} ⟨ D (σ) \ln 2 ⟩_{c l} = & \sum_{σ} (D (σ) \ln 2) P_{c l} (σ) = \sum_{d} \sum_{σ : D (σ) = d} (D (σ) \ln 2) P_{c l} (σ) \\ = & \sum_{d} (d \ln 2) \sum_{σ : D (σ) = d} P_{c l} (σ) = \sum_{d} \tilde{h} (d) Q_{c l} (d) \\ = & ⟨ \tilde{h} (D) ⟩_{c l} . \end{aligned}

(60)

This proves that the chronological mean fitness of cellular lineages equals the chronological mean fitness of division count.

Since $S [D] = τ Λ - {⟨ \tilde{h} (D) ⟩}_{cl}$ and $S [σ] = τ Λ - {⟨ D (σ) \ln 2 ⟩}_{cl}$ (Equations 3; 58),

S [D] = S [σ]

(61)

also holds. This result shows that the selection strength of $D$ is equivalent to the selection strength of cellular lineages despite $D$ being a coarse-grained lineage trait.

Another important property of $S [D]$ is that it sets the upper bound for the selection strength of any lineage trait. We now consider the joint probability distributions of $D$ and a lineage trait $X$ , which we write as $Q_{cl} (d, x)$ and $Q_{rs} (d, x)$ . We define the joint selection strength as

S [D, X] := \sum_{d} \sum_{x} Q_{c l} (d, x) \ln \frac{Q_{c l} (d, x)}{Q_{r s} (d, x)} .

(62)

Using $Q_{cl} (d, x) = Q_{cl} (d | x) Q_{cl} (x)$ and $Q_{rs} (d, x) = Q_{rs} (d | x) Q_{rs} (x)$ ,

\begin{aligned} S [D, X] = & \sum_{x} (\sum_{d} Q_{c l} (d | x)) Q_{c l} (x) \ln \frac{Q_{c l} (x)}{Q_{r s} (x)} + \sum_{d} \sum_{x} Q_{c l} (d, x) \ln \frac{Q_{c l} (d | x)}{Q_{r s} (d | x)} \\ = & S [X] + S [D | X], \end{aligned}

(63)

where $S [D | X] := \sum_{d} \sum_{x} Q_{c l} (d, x) \ln \frac{Q_{c l} (d | x)}{Q_{r s} (d | x)}$ , and we used $\sum_{d} Q_{c l} (d | x) = 1$ .

Likewise, $S [D, X]$ can also be decomposed as

S [D, X] = S [D] + S [X | D] .

(64)

However, $S [X | D] = 0$ because

\begin{aligned} h (d, x) := & τ Λ + \ln \frac{Q_{r s} (d, x)}{Q_{c l} (d, x)} \\ = & τ Λ + \ln \frac{\sum_{\begin{matrix} σ : D (σ) = d, \\ X (σ) = x \end{matrix}} P_{r s} (σ)}{\sum_{\begin{matrix} σ : D (σ) = d, \\ X (σ) = x \end{matrix}} P_{c l} (σ)} \\ = & d \ln 2 = \tilde{h} (d), \end{aligned}

(65)

and

\begin{aligned} S [X | D] := & \sum_{d} \sum_{x} Q_{c l} (d, x) \ln \frac{Q_{c l} (x | d)}{Q_{r s} (x | d)} \\ = & \sum_{d} \sum_{x} Q_{c l} (d, x) \ln \frac{Q_{c l} (d, x) Q_{r s} (d)}{Q_{r s} (d, x) Q_{c l} (d)} \\ = & \sum_{d} \sum_{x} Q_{c l} (d, x) \left\{ \tilde{h} (d) - h (d, x) \right\} = 0 \end{aligned}

(66)

from (Equation 1) and (Equation 65). This leads to

S [D] = S [X] + S [D | X]

(67)

from (Equation 63) and (Equation 64). Furthermore, $S [D | X] \geq 0$ from Jensen’s inequality. Thus,

S [D] \geq S [X] .

(68)

Equality holds when $D$ is a deterministic function of $X$ . This inequality shows that $S [D]$ ( $= S [σ]$ ) sets the upper bound for the selection strength of any lineage trait $X$ .
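Both properties can be illustrated on a small example. The sketch below uses a hypothetical tree descending from a single ancestor with five surviving lineages. It assumes that the chronological probability of a lineage is $2^{- D (σ)}$ and that the retrospective probability is uniform over lineages, which satisfies Equation 57 with $τ Λ = \ln 5$ ; the division counts, trait labels, and function names are invented for the illustration.

```python
import math
from collections import defaultdict

# Hypothetical lineages from one ancestor: (division count D, trait X).
# The division counts satisfy sum(2**-D) == 1, as required for a complete binary tree.
LINEAGES = [(2, "a"), (2, "b"), (2, "a"), (3, "a"), (3, "b")]


def lineage_weights(lineages):
    """Assumed chronological (2**-D) and retrospective (uniform) lineage probabilities."""
    n = len(lineages)
    p_cl = [2.0 ** (-d) for d, _ in lineages]
    p_rs = [1.0 / n for _ in lineages]
    return p_cl, p_rs


def marginal(weights, labels):
    """Sum lineage probabilities over lineages sharing the same label."""
    out = defaultdict(float)
    for w, label in zip(weights, labels):
        out[label] += w
    return dict(out)


def kl(p, q):
    """Kullback-Leibler divergence sum_i p_i ln(p_i / q_i) over shared keys."""
    return sum(p[key] * math.log(p[key] / q[key]) for key in p)


def selection_strengths(lineages):
    """Return (S[sigma], S[D], S[X]) for the toy tree."""
    p_cl, p_rs = lineage_weights(lineages)
    ids = list(range(len(lineages)))
    d_labels = [d for d, _ in lineages]
    x_labels = [x for _, x in lineages]
    s_sigma = kl(marginal(p_cl, ids), marginal(p_rs, ids))
    s_d = kl(marginal(p_cl, d_labels), marginal(p_rs, d_labels))
    s_x = kl(marginal(p_cl, x_labels), marginal(p_rs, x_labels))
    return s_sigma, s_d, s_x


s_sigma, s_d, s_x = selection_strengths(LINEAGES)
print(f"S[sigma]={s_sigma:.6f}, S[D]={s_d:.6f}, S[X]={s_x:.6f}")
```

In this example two lineages share $D = 3$ , yet coarse-graining to $D$ loses nothing, while the trait $X$ , which is only loosely tied to $D$ , carries a much smaller selection strength.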

The cumulant generating function $K_{X} (ξ)$ provides both chronological and retrospective fitness cumulants

In the main text, we introduced the cumulant generating function of $h (x)$ with respect to the chronological distribution $Q_{cl} (x)$ ,

K_{X} (ξ) := \ln ⟨ e^{ξ h (x)} ⟩_{c l} = \ln \sum_{x} e^{ξ h (x)} Q_{c l} (x) .

(69)

This function can also be written as

K_{X} (ξ) = \sum_{n = 1}^{\infty} \frac{κ_{n}^{(X)}}{n!} ξ^{n}

(70)

when the fitness cumulants $κ_{n}^{(X)}$ are all finite, and the Taylor expansion converges at $ξ$ . Also,

κ_{n}^{(X)} = {\frac{d^{n} K_{X} (ξ)}{d ξ^{n}} |}_{ξ = 0} .

(71)

Below we prove that $K_{X} (ξ)$ also gives the fitness cumulants on the retrospective distributions.

We define a cumulant generating function on the retrospective probability as

R_{X} (ξ) := \ln ⟨ e^{ξ h (x)} ⟩_{r s} = \ln \sum_{x} e^{ξ h (x)} Q_{r s} (x) .

(72)

This function can be expanded by the fitness cumulants of the retrospective statistics $ρ_{n}^{(X)}$ as

R_{X} (ξ) = \sum_{n = 1}^{\infty} \frac{ρ_{n}^{(X)}}{n!} ξ^{n} .

(73)

Therefore,

ρ_{n}^{(X)} = {\frac{d^{n} R_{X} (ξ)}{d ξ^{n}} |}_{ξ = 0} .

(74)

For example, $ρ_{1}^{(X)} = {⟨ h (X) ⟩}_{rs}$ and $ρ_{2}^{(X)} = Var {[h (X)]}_{rs} = {⟨ h {(X)}^{2} ⟩}_{rs} - {⟨ h (X) ⟩}_{rs}^{2}$ .

Inserting $Q_{rs} (x) = e^{h (x) - τ Λ} Q_{cl} (x)$ into (Equation 72),

\begin{aligned} R_{X} (ξ) = & \ln \sum_{x} e^{ξ h (x)} (e^{h (x) - τ Λ} Q_{c l} (x)) \\ = & - τ Λ + \ln \sum_{x} e^{(ξ + 1) h (x)} Q_{c l} (x) \\ = & - τ Λ + K_{X} (ξ + 1) . \end{aligned}

(75)

Hence,

\frac{d^{n} R_{X} (ξ)}{d ξ^{n}} = \frac{d^{n} K_{X} (ξ + 1)}{d ξ^{n}},

(76)

for $n \geq 1$ . This relation proves that evaluating $\frac{d^{n} K_{X} (ξ)}{d ξ^{n}}$ at $ξ = 1$ gives the $n$ -th order fitness cumulant on the retrospective statistics; i.e.,

ρ_{n}^{(X)} = {\frac{d^{n} K_{X} (ξ)}{d ξ^{n}} |}_{ξ = 1} .

(77)

Furthermore, this leads to

ρ_{n}^{(X)} = \sum_{k = n}^{\infty} \frac{κ_{k}^{(X)}}{(k - n)!},

(78)

from (Equation 70) and (Equation 77). Similarly, evaluating (Equation 76) at $ξ = - 1$ gives

κ_{n}^{(X)} = {\frac{d^{n} R_{X} (ξ)}{d ξ^{n}} |}_{ξ = - 1} = \sum_{k = n}^{\infty} \frac{ρ_{k}^{(X)} (- 1)^{k - n}}{(k - n)!} .

(79)

Analogously to (Equation 12), we can also expand the population growth rate in terms of the retrospective cumulants, by evaluating (Equation 75) at $ξ = - 1$ ,

τ Λ = K_{X} (0) - R_{X} (- 1) = \sum_{n = 1}^{\infty} \frac{(- 1)^{n - 1} ρ_{n}^{(X)}}{n!} .

(80)

For example, when the fitness distribution is Gaussian for the chronological statistics,

\begin{aligned} ⟨ h (X) ⟩_{r s} = & ρ_{1}^{(X)} = κ_{1}^{(X)} + κ_{2}^{(X)} \\ = & ⟨ h (X) ⟩_{c l} + V a r [h (X)]_{c l}, \end{aligned}

(81)

\begin{aligned} V a r [h (X)]_{r s} = & ρ_{2}^{(X)} = κ_{2}^{(X)} \\ = & V a r [h (X)]_{c l}, \end{aligned}

(82)

since $κ_{n}^{(X)} = 0$ for all $n \geq 3$ .

These results confirm that the function $K_{X} (ξ)$ contains the information of both chronological and retrospective statistics.
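The relations above can be verified on a small discrete example. In the following sketch, the four fitness values and chronological probabilities are hypothetical, and derivatives of $K_{X} (ξ)$ are approximated by central finite differences with an arbitrarily chosen step.

```python
import math

# Hypothetical fitness values h(x) and chronological probabilities Q_cl(x).
H = [d * math.log(2.0) for d in (0, 1, 2, 3)]
Q_CL = [0.1, 0.4, 0.3, 0.2]


def K(xi):
    """Chronological cumulant generating function (Equation 69)."""
    return math.log(sum(q * math.exp(xi * h) for h, q in zip(H, Q_CL)))


TAU_LAMBDA = K(1.0)
# Retrospective probabilities Q_rs(x) = exp(h(x) - tau Lambda) Q_cl(x).
Q_RS = [q * math.exp(h - TAU_LAMBDA) for h, q in zip(H, Q_CL)]


def R(xi):
    """Retrospective cumulant generating function (Equation 72)."""
    return math.log(sum(q * math.exp(xi * h) for h, q in zip(H, Q_RS)))


def derivative(f, x, order, eps=1e-4):
    """Central finite-difference derivative of order 1 or 2."""
    if order == 1:
        return (f(x + eps) - f(x - eps)) / (2.0 * eps)
    if order == 2:
        return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / eps ** 2
    raise ValueError("only orders 1 and 2 are implemented")


mean_rs = sum(q * h for h, q in zip(H, Q_RS))
var_rs = sum(q * (h - mean_rs) ** 2 for h, q in zip(H, Q_RS))
print(f"R(0.3) = {R(0.3):.6f}, K(1.3) - tau Lambda = {K(1.3) - TAU_LAMBDA:.6f}")
print(f"retrospective mean = {mean_rs:.6f}, K'(1) = {derivative(K, 1.0, 1):.6f}")
print(f"retrospective variance = {var_rs:.6f}, K''(1) = {derivative(K, 1.0, 2):.6f}")
```

Each printed pair should agree up to finite-difference error, illustrating Equation 75 and the $n = 1, 2$ cases of Equation 77.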

Relationships between fitness cumulants and selection strength measures

In the main text, we have shown that the selection strength $S_{KL}^{(1)} [X]$ corresponds to the contribution of the second or higher-order fitness cumulants to population growth, i.e.,

S_{K L}^{(1)} [X] = \sum_{n = 2}^{\infty} \frac{κ_{n}^{(X)}}{n!},

(83)

or alternatively, by substituting (Equation 79) and (Equation 80) we obtain

\begin{aligned} S_{K L}^{(1)} [X] & = \sum_{n = 1}^{\infty} \frac{ρ_{n}^{(X)} (- 1)^{n - 1}}{n!} - \sum_{n = 1}^{\infty} \frac{ρ_{n}^{(X)} (- 1)^{n - 1}}{(n - 1)!} \\ & = \sum_{n = 2}^{\infty} \frac{ρ_{n}^{(X)} (- 1)^{n}}{n!} (n - 1) . \end{aligned}

(84)

Similar expressions can also be found for $S_{KL}^{(2)} [X]$ . Since $S_{KL}^{(2)} [X] = {⟨ h (X) ⟩}_{rs} - τ Λ$ (Equation 4), substituting (Equation 80) yields

S_{K L}^{(2)} [X] = \sum_{n = 2}^{\infty} \frac{(- 1)^{n} ρ_{n}^{(X)}}{n!},

(85)

or alternatively, by substituting (Equation 78) and (Equation 12) we obtain

\begin{aligned} S_{K L}^{(2)} [X] & = \sum_{n = 1}^{\infty} \frac{κ_{n}^{(X)}}{(n - 1)!} - \sum_{n = 1}^{\infty} \frac{κ_{n}^{(X)}}{n!} \\ & = \sum_{n = 2}^{\infty} \frac{κ_{n}^{(X)}}{n!} (n - 1) . \end{aligned}

(86)

These expressions show that both $S_{KL}^{(1)} [X]$ and $S_{KL}^{(2)} [X]$ can be expanded by either the chronological or the retrospective fitness cumulants.

The difference between these two selection strength measures is

S_{K L}^{(2)} [X] - S_{K L}^{(1)} [X] = \sum_{n = 3}^{\infty} \frac{κ_{n}^{(X)}}{n!} (n - 2) = \sum_{n = 3}^{\infty} \frac{ρ_{n}^{(X)} (- 1)^{n - 1}}{n!} (n - 2)

(87)

from (Equation 83) to (Equation 86). Thus, it depends only on the third or higher-order fitness cumulants.

Finally, another selection strength measure $S_{JF} [X]$ can also be expanded by the fitness cumulants as

S_{J F} [X] = S_{K L}^{(1)} [X] + S_{K L}^{(2)} [X] = \sum_{n = 2}^{\infty} \frac{κ_{n}^{(X)}}{(n - 1)!} = \sum_{n = 2}^{\infty} \frac{ρ_{n}^{(X)} (- 1)^{n}}{(n - 1)!}

(88)

from (Equation 83) to (Equation 86). When the chronological fitness distribution is Gaussian ( $κ_{n}^{(X)} = 0$ for all $n \geq 3$ ),

\begin{matrix} S_{KL}^{(1)} [X] = S_{KL}^{(2)} [X] = \frac{κ_{2}^{(X)}}{2} = \frac{Var {[h (X)]}_{cl}}{2}, \\ S_{JF} [X] = κ_{2}^{(X)} = Var {[h (X)]}_{cl} . \end{matrix}

(89)
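As a consistency check of these expansions, the sketch below applies them to the time-scaled cumulants of the gamma generation time model in Appendix 2 (Equation 55) and compares the truncated series with the closed forms of Equations 46–50. The truncation order, the parameter values, and the function names are assumptions of this example.

```python
import math


def series_values(alpha, theta, n_max=60):
    """Growth rate and selection strengths from truncated cumulant series."""
    g = math.log(2.0) / alpha
    kappa = {n: g ** n / theta for n in range(1, n_max + 1)}  # Equation 55
    growth = sum(k / math.factorial(n) for n, k in kappa.items())
    s1 = sum(k / math.factorial(n) for n, k in kappa.items() if n >= 2)            # Equation 83
    s2 = sum(k * (n - 1) / math.factorial(n) for n, k in kappa.items() if n >= 2)  # Equation 86
    s_jf = sum(k / math.factorial(n - 1) for n, k in kappa.items() if n >= 2)      # Equation 88
    return growth, s1, s2, s_jf


def closed_values(alpha, theta):
    """The same quantities from the closed forms in Equations 46-50."""
    g = math.log(2.0) / alpha
    growth = math.expm1(g) / theta
    s1 = growth - g / theta
    s2 = g * math.exp(g) / theta - growth
    return growth, s1, s2, s1 + s2


for alpha in (0.5, 1.0, 3.0):
    print(alpha, series_values(alpha, 1.0), closed_values(alpha, 1.0))
```

Because the cumulants here are time-scaled, the sums give the time-scaled growth rate and selection strengths rather than $τ Λ$ and $S_{KL} [X]$ themselves.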

Analytical calculations of $K_{D} (ξ)$ and related relations given specific form of division count distributions

Here we derive (Equations 35–38) in Appendix 1. We begin with the case where $Q_{cl} (D)$ follows a Poisson distribution. Let $\bar{D}$ denote the chronological mean division count.

Q_{cl} (D) = \frac{{\bar{D}}^{D} e^{- \bar{D}}}{D!}

(90)

By the definition of $K_{D} (ξ)$ ,

K_{D} (ξ) = \ln \sum_{D \geq 0} 2^{ξ D} \frac{{\bar{D}}^{D} e^{- \bar{D}}}{D!} = \bar{D} (2^{ξ} - 1)

(91)

By the Taylor expansion of $2^{ξ} = e^{ξ \ln 2}$ , the $n$ -th order cumulant is $κ_{n}^{(D)} = \bar{D} {(\ln 2)}^{n}$ . Since

τ Λ = K_{D} (1) = \bar{D},

(92)

we derive

W_{n}^{(D)} = \sum_{m = 1}^{n} \frac{{(\ln 2)}^{m}}{m!} .

(93)

For example, $W_{1}^{(D)} = 0.693$ , $W_{2}^{(D)} = 0.933$ , $W_{3}^{(D)} = 0.989$ , and $W_{4}^{(D)} = 0.998$ . The first order derivative of $K_{D} (ξ)$ is

K_{D}^{'} (ξ) = \bar{D} 2^{ξ} \ln 2,

(94)

and thereby we have

{⟨ D ⟩}_{cl} \ln 2 = K_{D}^{'} (0) = \bar{D} \ln 2,

(95)

{⟨ D ⟩}_{rs} \ln 2 = K_{D}^{'} (1) = 2 \bar{D} \ln 2,

(96)

\frac{S_{KL}^{(1)}}{τ Λ} = 1 - \frac{K_{D}^{'} (0)}{K_{D} (1)} = 1 - \ln 2 ≃ 0.31,

(97)

and

\frac{S_{KL}^{(2)}}{τ Λ} = \frac{K_{D}^{'} (1)}{K_{D} (1)} - 1 = 2 \ln 2 - 1 ≃ 0.39 .

(98)

Next we derive $K_{D} (ξ)$ for binomial and negative binomial distributions. Let $\bar{D}$ and $\bar{D} ϕ$ denote the mean and the variance of $Q_{cl} (D)$ . When $Q_{cl} (D)$ is binomial,

Q_{cl} (D) = \binom{D_{\max}}{D} p^{D} {(1 - p)}^{D_{\max} - D}, D = 0, 1, \dots, D_{\max}

(99)

where $D_{\max}$ and $p$ satisfy $\bar{D} = D_{\max} p$ and $\bar{D} ϕ = D_{\max} p (1 - p)$ ; namely $D_{\max} = \bar{D} / (1 - ϕ)$ and $p = 1 - ϕ$ . Therefore,

\begin{aligned} K_{D} (ξ) & = \ln \sum_{D = 0}^{D_{\max}} 2^{ξ D} \binom{D_{\max}}{D} p^{D} {(1 - p)}^{D_{\max} - D} \\ & = D_{\max} \ln (2^{ξ} p + 1 - p) \\ & = \frac{\bar{D}}{1 - ϕ} \ln (2^{ξ} (1 - ϕ) + ϕ) . \end{aligned}

(100)

When $Q_{cl} (D)$ is negative binomial,

Q_{cl} (D) = \frac{Γ (α + D)}{Γ (α) D!} p^{D} {(1 - p)}^{α}, D = 0, 1, \dots

(101)

where $α$ and $p$ satisfy $\bar{D} = α p / (1 - p)$ and $\bar{D} ϕ = α p / {(1 - p)}^{2}$ ; namely $α = \bar{D} / (ϕ - 1)$ and $p = 1 - ϕ^{- 1}$ . Therefore,

\begin{aligned} K_{D} (ξ) & = \ln \sum_{D \geq 0} 2^{ξ D} \frac{Γ (α + D)}{Γ (α) D!} p^{D} {(1 - p)}^{α} \\ & = - α \ln (\frac{1 - 2^{ξ} p}{1 - p}) \\ & = \frac{\bar{D}}{1 - ϕ} \ln (2^{ξ} (1 - ϕ) + ϕ) . \end{aligned}

(102)

(Equation 102) is exactly the same as (Equation 100) as a function of $\bar{D}, ϕ,$ and $ξ$ . In addition, (Equation 91) is the limiting form of (Equation 100) and (Equation 102) as $ϕ \to 1$ . Thus, (Equation 35) in Appendix 1 represents $K_{D} (ξ)$ for Poisson, binomial, or negative binomial $Q_{cl} (D)$ .
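The agreement between the closed form and the defining sums can be checked directly. The sketch below sums $2^{ξ D} Q_{cl} (D)$ over a binomial and a negative binomial distribution and compares the logarithm with Equation 35. The parameter values ( $\bar{D} = 7$ , $ϕ = 0.3$ and $\bar{D} = 6$ , $ϕ = 1.6$ ), the truncation of the negative binomial sum, and the function names are choices made for this illustration.

```python
import math


def K_closed(xi, D_bar, phi):
    """Closed form of K_D(xi) for phi != 1 (Equation 35)."""
    return D_bar * math.log(2.0 ** xi * (1.0 - phi) + phi) / (1.0 - phi)


def K_binomial(xi, D_bar, phi):
    """Direct sum over a binomial Q_cl(D); D_bar / (1 - phi) must be an integer."""
    D_max = round(D_bar / (1.0 - phi))
    p = 1.0 - phi
    total = sum(
        2.0 ** (xi * D) * math.comb(D_max, D) * p ** D * (1.0 - p) ** (D_max - D)
        for D in range(D_max + 1)
    )
    return math.log(total)


def K_negative_binomial(xi, D_bar, phi, D_cut=2000):
    """Truncated sum over a negative binomial Q_cl(D); requires 2**xi * p < 1."""
    a = D_bar / (phi - 1.0)
    p = 1.0 - 1.0 / phi
    total = 0.0
    for D in range(D_cut):
        # Work in log space so that large gamma functions do not overflow.
        log_term = (
            math.lgamma(a + D) - math.lgamma(a) - math.lgamma(D + 1)
            + D * math.log(p) + a * math.log(1.0 - p) + xi * D * math.log(2.0)
        )
        total += math.exp(log_term)
    return math.log(total)


for xi in (0.0, 0.5, 1.0):
    print(xi, K_binomial(xi, 7.0, 0.3), K_closed(xi, 7.0, 0.3))
    print(xi, K_negative_binomial(xi, 6.0, 1.6), K_closed(xi, 6.0, 1.6))
```

The negative binomial sum converges only when $2^{ξ} p < 1$ , which for $ξ = 1$ is equivalent to the condition $ϕ < 2$ noted in Appendix 1.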

The Taylor expansion of (Equation 35) is obtained as follows:

\begin{aligned} \frac{K_{D} (ξ)}{\bar{D}} & = \sum_{m \geq 1} \frac{{(ϕ - 1)}^{m - 1}}{m} {(2^{ξ} - 1)}^{m} \\ & = \sum_{m \geq 1} \frac{{(ϕ - 1)}^{m - 1}}{m} \sum_{k \geq 0} \binom{m}{k} {(- 1)}^{m - k} \sum_{n \geq 0} \frac{{(k ξ \ln 2)}^{n}}{n!} \\ & = \sum_{n \geq 0} \frac{{(ξ \ln 2)}^{n}}{n!} c_{n} (ϕ), \end{aligned}

(103)

where

c_{n} (ϕ) = \sum_{m \geq 1} \frac{{(ϕ - 1)}^{m - 1}}{m} \sum_{k = 0}^{m} k^{n} \binom{m}{k} {(- 1)}^{m - k} .

(104)

For the first five terms, for example, we have

c_{1} (ϕ) = 1,

(105a)

c_{2} (ϕ) = ϕ,

(105b)

c_{3} (ϕ) = ϕ (2 ϕ - 1),

(105c)

c_{4} (ϕ) = ϕ (6 ϕ^{2} - 6 ϕ + 1),

(105d)

c_{5} (ϕ) = ϕ (24 ϕ^{3} - 36 ϕ^{2} + 14 ϕ - 1) .

(105e)

$κ_{n}^{(D)} = \bar{D} c_{n} (ϕ) {(\ln 2)}^{n}$ gives the $n$ -th order cumulant.
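The coefficients can be reproduced from Equation 104 with a few lines of Python. The sketch relies on the fact that the inner alternating sum $\sum_{k = 0}^{m} k^{n} \binom{m}{k} {(- 1)}^{m - k}$ counts the surjections from an $n$ -element set onto an $m$ -element set and therefore vanishes for $m > n$ , so the outer sum can be truncated at $m = n$ . The function names and test values of $ϕ$ are assumptions of this example.

```python
import math


def c_n(n, phi):
    """Coefficient c_n(phi) of Equation 104, with the outer sum truncated at m = n."""
    total = 0.0
    for m in range(1, n + 1):
        inner = sum(k ** n * math.comb(m, k) * (-1) ** (m - k) for k in range(m + 1))
        total += (phi - 1.0) ** (m - 1) / m * inner
    return total


# Closed forms listed in Equations 105a-105e.
CLOSED_FORMS = {
    1: lambda phi: 1.0,
    2: lambda phi: phi,
    3: lambda phi: phi * (2 * phi - 1),
    4: lambda phi: phi * (6 * phi ** 2 - 6 * phi + 1),
    5: lambda phi: phi * (24 * phi ** 3 - 36 * phi ** 2 + 14 * phi - 1),
}

for phi in (0.3, 1.0, 1.6):
    row = ", ".join(f"c_{n}={c_n(n, phi):.4f}" for n in range(1, 6))
    print(f"phi={phi}: {row}")
```

For $ϕ = 1$ every coefficient equals 1, which recovers the Poisson cumulants $κ_{n}^{(D)} = \bar{D} {(\ln 2)}^{n}$ .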

The first order derivative of (Equation 35) is

K_{D}^{'} (ξ) = \frac{2^{ξ} \bar{D} \ln 2}{2^{ξ} (1 - ϕ) + ϕ},

(106)

and thereby we obtain

{⟨ D ⟩}_{cl} \ln 2 = K_{D}^{'} (0) = \bar{D} \ln 2,

(107)

{⟨ D ⟩}_{rs} \ln 2 = K_{D}^{'} (1) = \frac{2 \bar{D} \ln 2}{2 - ϕ},

(108)

\frac{S_{KL}^{(1)} [D]}{τ Λ} = 1 - \frac{K_{D}^{'} (0)}{K_{D} (1)} = 1 - \frac{(1 - ϕ) \ln 2}{\ln (2 - ϕ)},

(109)

and

\frac{S_{KL}^{(2)} [D]}{τ Λ} = \frac{K_{D}^{'} (1)}{K_{D} (1)} - 1 = \frac{2 (1 - ϕ) \ln 2}{(2 - ϕ) \ln (2 - ϕ)} - 1 .

(110)

(Equation 109) and (Equation 110) are equal if and only if

\frac{(2 - ϕ) \ln (2 - ϕ)}{(4 - ϕ) (1 - ϕ)} = \frac{\ln 2}{2}

(111)

This equation has two roots, $ϕ = 0$ and $ϕ = ϕ_{0} = 0.5857 \dots$ , and LHS > RHS if and only if $0 < ϕ < ϕ_{0}$ .
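The nontrivial root can be located numerically, for instance by bisection on the difference between the two sides of Equation 111. The bracketing interval, the tolerance, and the function names below are choices made for this sketch. Substituting $ϕ = 2 - \sqrt{2}$ into Equation 111 shows that it satisfies the equation exactly, which is consistent with the quoted value $0.5857 \dots$ .

```python
import math


def lhs_minus_rhs(phi):
    """Difference between the two sides of Equation 111 (for 0 < phi < 2, phi != 1)."""
    lhs = (2.0 - phi) * math.log(2.0 - phi) / ((4.0 - phi) * (1.0 - phi))
    return lhs - math.log(2.0) / 2.0


def bisect(f, lo, hi, tol=1e-13):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


phi_0 = bisect(lhs_minus_rhs, 0.3, 0.9)
print(f"phi_0 = {phi_0:.10f}, 2 - sqrt(2) = {2.0 - math.sqrt(2.0):.10f}")
```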

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Yuichi Wakamoto, Email: cwaka@mail.ecc.u-tokyo.ac.jp.

Armita Nourmohammad, University of Washington, United States.

Aleksandra M Walczak, CNRS LPENS, France.

Funding Information

This paper was supported by the following grants:

Japan Science and Technology Agency JPMJCR1927 to Yuichi Wakamoto.
Japan Science and Technology Agency JPMJER1902 to Yuichi Wakamoto.
National Institute of General Medical Sciences R01-GM097356 to Edo Kussell.
Japan Society for the Promotion of Science 17H06389 to Yuichi Wakamoto.
Japan Society for the Promotion of Science 19H03216 to Yuichi Wakamoto.
Japan Society for the Promotion of Science 21K20672 to Takashi Nozoe.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Software, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.

Conceptualization, Software, Formal analysis, Validation, Investigation, Methodology, Writing - original draft, Writing - review and editing.

Resources, Investigation.

Conceptualization, Funding acquisition, Validation, Writing - original draft, Writing - review and editing.

Conceptualization, Software, Supervision, Funding acquisition, Validation, Investigation, Visualization, Writing - original draft, Project administration, Writing - review and editing.

Additional files

Transparent reporting form

elife-72299-transrepform1.docx (246.6 KB, docx)

Data availability

All data generated or analyzed during this study and the Matlab codes for data analysis have been deposited in a GitHub repository (https://github.com/Wakamoto-lab/LineageAnalysis; copy archived at swh:1:rev:1865d167f1c24625c98d3c493a9a180b1aa2035d).

The following dataset was generated:

Yamauchi S, Nozoe T, Okura R, Kussell E, Wakamoto Y. 2021. LineageAnalysis. GitHub. LineageAnalysis

The following previously published datasets were used:

Nozoe T, Kussell E, Wakamoto Y. 2018. Data from: Inferring fitness landscapes and selection on phenotypic states from single-cell genealogical data. Dryad Digital Repository.

Nakaoka H, Wakamoto Y. 2018. Data from: Aging, mortality, and the fast growth trade-off of Schizosaccharomyces pombe. Dryad Digital Repository.

Seita A, Nakaoka H, Okura R, Wakamoto Y. 2021. Data from: Intrinsic growth heterogeneity of mouse leukemia cells underlies differential susceptibility to a growth-inhibiting anticancer drug. Dryad Digital Repository.

References

Balaban NQ, Merrin J, Chait R, Kowalik L, Leibler S. Bacterial persistence as a phenotypic switch. Science. 2004;305:1622–1625. doi: 10.1126/science.1099390. [DOI] [PubMed] [Google Scholar]
Balázsi G, van Oudenaarden A, Collins JJ. Cellular decision making and biological noise: from microbes to mammals. Cell. 2011;144:910–925. doi: 10.1016/j.cell.2011.01.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
Battesti A, Majdalani N, Gottesman S. The rpos-mediated general stress response in Escherichia coli. Annual Review of Microbiology. 2011;65:189–213. doi: 10.1146/annurev-micro-090110-102946. [DOI] [PMC free article] [PubMed] [Google Scholar]
Benet L, Sanders DP. TaylorSeries.jl: taylor expansions in one and several variables in julia. Journal of Open Source Software. 2019;4:1043. doi: 10.21105/joss.01043. [DOI] [Google Scholar]
Benet L, Sanders DP. TaylorSeries.jl. Zenodo. 2021 doi: 10.5281/zenodo.2601941. [DOI]
Cerulus B, New AM, Pougach K, Verstrepen KJ. Noise and epigenetic inheritance of single-cell division times influence population fitness. Current Biology. 2016;26:1138–1147. doi: 10.1016/j.cub.2016.03.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
Chow K-HK, Budde MW, Granados AA, Cabrera M, Yoon S, Cho S, Huang T-H, Koulena N, Frieda KL, Cai L, Lois C, Elowitz MB. Imaging cell lineage with a synthetic digital recording system. Science. 2021;372:eabb3099. doi: 10.1126/science.abb3099. [DOI] [PubMed] [Google Scholar]
Edelstein AD, Tsuchida MA, Amodaj N, Pinkard H, Vale RD, Stuurman N. Advanced methods of microscope control using μmanager software. Journal of Biological Methods. 2014;1:e10. doi: 10.14440/jbm.2014.36. [DOI] [PMC free article] [PubMed] [Google Scholar]
Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297:1183–1186. doi: 10.1126/science.1070919. [DOI] [PubMed] [Google Scholar]
Filipczyk A, Marr C, Hastreiter S, Feigelman J, Schwarzfischer M, Hoppe PS, Loeffler D, Kokkaliaris KD, Endele M, Schauberger B, Hilsenbeck O, Skylaki S, Hasenauer J, Anastassiadis K, Theis FJ, Schroeder T. Network plasticity of pluripotency transcription factors in embryonic stem cells. Nature Cell Biology. 2015;17:1235–1246. doi: 10.1038/ncb3237. [DOI] [PubMed] [Google Scholar]
Fisher RA. The Genetical Theory of Natural Selection. Oxford: Oxford University Press; 1930. [DOI] [Google Scholar]
Frank SA. Natural selection. V. how to read the fundamental equations of evolutionary change in terms of information theory. Journal of Evolutionary Biology. 2012;25:2377–2396. doi: 10.1111/jeb.12010. [DOI] [PubMed] [Google Scholar]
Frieda KL, Linton JM, Hormoz S, Choi J, Chow K-HK, Singer ZS, Budde MW, Elowitz MB, Cai L. Synthetic recording and in situ readout of lineage information in single cells. Nature. 2017;541:107–111. doi: 10.1038/nature20777. [DOI] [PMC free article] [PubMed] [Google Scholar]
Futuyma DJ. Evolutionary Biology. Sinauer Associates, Inc; 2010. [Google Scholar]
García-García R, Genthon A, Lacoste D. Linking lineage and population observables in biological branching processes. Physical Review E. 2019;99:1–12. doi: 10.1103/PhysRevE.99.042413. [DOI] [PubMed] [Google Scholar]
Genthon A, Lacoste D. Fluctuation relations and fitness landscapes of growing cell populations. Scientific Reports. 2020;10:1–13. doi: 10.1038/s41598-020-68444-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
Genthon A, Lacoste D. Universal constraints on selection strength in lineage trees. Physical Review Research. 2021;3:023187. doi: 10.1103/PhysRevResearch.3.023187. [DOI] [Google Scholar]
Hashimoto M, Nozoe T, Nakaoka H, Okura R, Akiyoshi S, Kaneko K, Kussell E, Wakamoto Y. Noise-driven growth rate gain in clonal cellular populations. PNAS. 2016;113:3251–3256. doi: 10.1073/pnas.1519412113. [DOI] [PMC free article] [PubMed] [Google Scholar]
Inoue I, Wakamoto Y, Moriguchi H, Okano K, Yasuda K. On-Chip culture system for observation of isolated individual cells. Lab on a Chip. 2001;1:50–55. doi: 10.1039/b103931h. [DOI] [PubMed] [Google Scholar]
Julou T, Zweifel L, Blank D, Fiori A, van Nimwegen E. Subpopulations of sensorless bacteria drive fitness in fluctuating environments. PLOS Biology. 2020;18:e3000952. doi: 10.1371/journal.pbio.3000952. [DOI] [PMC free article] [PubMed] [Google Scholar]
Jun S, Si F, Pugatch R, Scott M. Fundamental principles in bacterial physiology-history, recent progress, and the future with focus on cell size control: a review. Reports on Progress in Physics. Physical Society. 2018;81:056601. doi: 10.1088/1361-6633/aaa628. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kelly CD, Rahn O. The growth rate of individual bacterial cells. Journal of Bacteriology. 1932;23:147–153. doi: 10.1128/jb.23.2.147-153.1932. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kobayashi TJ, Sughiyama Y. Fluctuation relations of fitness and information in population dynamics. Physical Review Letters. 2015;115:238102. doi: 10.1103/PhysRevLett.115.238102. [DOI] [PubMed] [Google Scholar]
Kohram M, Vashistha H, Leibler S, Xue B, Salman H. Bacterial growth control mechanisms inferred from multivariate statistical analysis of single-cell measurements. Current Biology. 2021;31:955–964. doi: 10.1016/j.cub.2020.11.063. [DOI] [PubMed] [Google Scholar]
Kuchen EE, Becker NB, Claudino N, Höfer T. Hidden long-range memories of growth and cycle speed correlate cell cycles in lineage trees. eLife. 2020;9:e51002. doi: 10.7554/eLife.51002. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kussell E, Leibler S. Phenotypic diversity, population growth, and information in fluctuating environments. Science. 2005;309:2075–2078. doi: 10.1126/science.1114383. [DOI] [PubMed] [Google Scholar]
Lambert G, Kussell E, Kussel E. Memory and fitness optimization of bacteria under fluctuating environments. PLOS Genetics. 2014;10:e1004556. doi: 10.1371/journal.pgen.1004556. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lebowitz JL, Rubinow SI. A theory for the age and generation time distribution of a microbial population. Journal of Mathematical Biology. 1974;1:17–36. doi: 10.1007/BF02339486. [DOI] [Google Scholar]
Leibler S, Kussell E. Individual histories and selection in heterogeneous populations. PNAS. 2010;107:13183–13188. doi: 10.1073/pnas.0912538107. [DOI] [PMC free article] [PubMed] [Google Scholar]
Levien E, GrandPre T, Amir A. Large deviation principle linking lineage statistics to fitness in microbial populations. Physical Review Letters. 2020;125:048102. doi: 10.1103/PhysRevLett.125.048102. [DOI] [PubMed] [Google Scholar]
Lin J, Amir A. The effects of stochasticity at the single-cell level and cell size control on the population growth. Cell Systems. 2017;5:358–367. doi: 10.1016/j.cels.2017.08.015. [DOI] [PubMed] [Google Scholar]
Mosheiff N, Martins BMC, Pearl-Mizrahi S, Grünberger A, Helfrich S, Mihalcescu I, Kohlheyer D, Locke JCW, Glass L, Balaban NQ. Inheritance of cell-cycle duration in the presence of periodic forcing. Physical Review X. 2018 doi: 10.1103/PhysRevX.8.021035. [DOI] [Google Scholar]
Nakaoka H, Wakamoto Y. Aging, mortality, and the fast growth trade-off of Schizosaccharomyces pombe. PLOS Biology. 2017;15:e2001109. doi: 10.1371/journal.pbio.2001109. [DOI] [PMC free article] [PubMed] [Google Scholar]
Nozoe T, Kussell E, Wakamoto Y. Inferring fitness landscapes and selection on phenotypic states from single-cell genealogical data. PLOS Genetics. 2017;13:e1006653. doi: 10.1371/journal.pgen.1006653. [DOI] [PMC free article] [PubMed] [Google Scholar]
Nozoe T, Kussell E. Cell cycle heritability and localization phase transition in growing populations. Physical Review Letters. 2020;125:268103. doi: 10.1103/PhysRevLett.125.268103. [DOI] [PMC free article] [PubMed] [Google Scholar]
Nozoe T, Wakamoto Y. LineageAnalysis-julia. Software Heritage. 2021. swh:1:rev:e22fbce8a713582a18fbe2bcc57dc9078090f121. https://archive.softwareheritage.org/swh:1:dir:3f8515a3a85a898cf42435bd3badd5c837d23587;origin=https://github.com/Wakamoto-lab/LineageAnalysis-Julia;visit=swh:1:snp:e342b9260c581a7e3a1caf204a8a052f29647f26;anchor=swh:1:rev:e22fbce8a713582a18fbe2bcc57dc9078090f121
Powell EO. Growth rate and generation time of bacteria, with special reference to continuous culture. Journal of General Microbiology. 1956;15:492–511. doi: 10.1099/00221287-15-3-492. [DOI] [PubMed] [Google Scholar]
Purvis JE, Lahav G. Encoding and decoding cellular information through signaling dynamics. Cell. 2013;152:945–956. doi: 10.1016/j.cell.2013.02.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
Quinn JJ, Jones MG, Okimoto RA, Nanjo S, Chan MM, Yosef N, Bivona TG, Weissman JS. Single-cell lineages reveal the rates, routes, and drivers of metastasis in cancer xenografts. Science. 2021;371:eabc1944. doi: 10.1126/science.abc1944. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rivoire O, Leibler S. A model for the generation and transmission of variations in evolution. PNAS. 2014;111:E1940–E1949. doi: 10.1073/pnas.1323901111. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rochman ND, Popescu DM, Sun SX. Ergodicity, hidden bias and the growth rate gain. Physical Biology. 2018;15:036006. doi: 10.1088/1478-3975/aab0e6. [DOI] [PMC free article] [PubMed] [Google Scholar]
Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nature Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089. [DOI] [PMC free article] [PubMed] [Google Scholar]
Seita A, Nakaoka H, Okura R, Wakamoto Y. Intrinsic growth heterogeneity of mouse leukemia cells underlies differential susceptibility to a growth-inhibiting anticancer drug. PLOS ONE. 2021;16:e0236534. doi: 10.1371/journal.pone.0236534. [DOI] [PMC free article] [PubMed] [Google Scholar]
Stewart EJ, Madden R, Paul G, Taddei F. Aging and death in an organism that reproduces by morphologically symmetric division. PLOS Biology. 2005;3:e45. doi: 10.1371/journal.pbio.0030045. [DOI] [PMC free article] [PubMed] [Google Scholar]
Susman L, Kohram M, Vashistha H, Nechleba JT, Salman H, Brenner N. Individuality and slow dynamics in bacterial growth homeostasis. PNAS. 2018;115:E5679–E5687. doi: 10.1073/pnas.1615526115. [DOI] [PMC free article] [PubMed] [Google Scholar]
Thomas P. Single-cell histories in growing populations: relating physiological variability to population growth. bioRxiv. 2017. doi: 10.1101/100495. [DOI]
Wakamoto Y, Ramsden J, Yasuda K. Single-cell growth and division dynamics showing epigenetic correlations. The Analyst. 2005;130:311–317. doi: 10.1039/b409860a. [DOI] [PubMed] [Google Scholar]
Wakamoto Y, Grosberg AY, Kussell E. Optimal lineage principle for age-structured populations. Evolution; International Journal of Organic Evolution. 2012;66:115–134. doi: 10.1111/j.1558-5646.2011.01418.x. [DOI] [PubMed] [Google Scholar]
Wakamoto Y, Dhar N, Chait R, Schneider K, Signorino-Gelo F, Leibler S, McKinney JD. Dynamic persistence of antibiotic-stressed mycobacteria. Science. 2013;339:91–95. doi: 10.1126/science.1229858. [DOI] [PubMed] [Google Scholar]
Wakamoto Y. LineageSimulation. Software Heritage. 2021. swh:1:rev:ef1166620396835168ca9061851898993a091976. https://archive.softwareheritage.org/swh:1:dir:c5afe35f0203c0b58ec96a15f89bb88182e26c4c;origin=https://github.com/Wakamoto-lab/LineageSimulation;visit=swh:1:snp:6e2e48d9653006dc988484d143c8685f59d23583;anchor=swh:1:rev:ef1166620396835168ca9061851898993a091976
Wang P, Robert L, Pelletier J, Dang WL, Taddei F, Wright A, Jun S. Robust growth of Escherichia coli. Current Biology. 2010;20:1099–1103. doi: 10.1016/j.cub.2010.04.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
Winkelmann R. Duration dependence and dispersion in count-data models. Journal of Business & Economic Statistics. 1995;13:467–474. doi: 10.1080/07350015.1995.10524620. [DOI] [Google Scholar]
Yamauchi S. LineageAnalysis. Software Heritage. 2021. swh:1:rev:1865d167f1c24625c98d3c493a9a180b1aa2035d. https://archive.softwareheritage.org/swh:1:dir:1d0239681f886fdda988fd6004edadca0850fa10;origin=https://github.com/Wakamoto-lab/LineageAnalysis;visit=swh:1:snp:891a7efe029cb0dcaf72b1a020e9ae47e3fd0097;anchor=swh:1:rev:1865d167f1c24625c98d3c493a9a180b1aa2035d
Zaslaver A, Bren A, Ronen M, Itzkovitz S, Kikoin I, Shavit S, Liebermeister W, Surette MG, Alon U. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nature Methods. 2006;3:623–628. doi: 10.1038/nmeth895. [DOI] [PubMed] [Google Scholar]

eLife. doi: 10.7554/eLife.72299.sa0

Editor's evaluation

Armita Nourmohammad ¹

This manuscript presents a general statistical framework to infer selection on a quantitative trait, based on measurements of the values of this trait along related cell lineages. The manuscript provides both a detailed explanation of the mathematical underpinnings of the method and an illustration of its application to existing and new cell lineage datasets. This is a general framework and is not tailored to particular growth models or environmental conditions, making it applicable to broad examples of exponentially growing populations.

eLife. doi: 10.7554/eLife.72299.sa1

Decision letter

Editor: Armita Nourmohammad¹

Reviewed by: Srividya Iyer-Biswas²

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "A unified framework for measuring selection on cellular lineages and traits" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Srividya Iyer-Biswas (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1. The authors should restructure the paper so it is accessible to the broad audience of eLife. The main issues to keep in mind:

i. Better clarify the biological questions that the approach can address and the biological insights that it can provide. Tangible connections to specific biological systems would strengthen the work.

ii. Better structure the results and present the main points in an accessible language to those not interested in the mathematical intricacies.

2. Better clarify the definitions and measures used for fitness, selection, evolutionary advantages, etc. in the paper and contrast them with the common notion of fitness as the instantaneous growth rate. See the comments regarding the definitions of selection strength by reviewers #1 and #2.

3. Discuss the impact of inherent stochasticity in division events on the results and the outcome of lineage evolution.

4. Discuss how the mathematical formalism and results acquired for the stationary state can be applied (or modified) to address non-stationary conditions that are more relevant for the processes discussed in the manuscript. See comment 3 by reviewer #3.

The reviews below contain a number of other suggestions that we encourage you to consider.

Reviewer #1 (Recommendations for the authors):

– In the Introduction it is mentioned that « cell population's growth rate becomes greater than the mean division rate ». Can the formalism presented here describe this in a simple way?

– It could be useful for a broader audience to explicitly explain how skewness and cumulants are related.

– It could be good to define W1 and W2 in the theoretical background section.

– Figure 7: I recommend stating explicitly what x is.

– How do definitions depend on the discrete or continuous nature of x? Practically, do we need to, say in Figure 7, bin x to compute different functions? How is this done?

Reviewer #2 (Recommendations for the authors):

The main concepts were previously proposed and published by the same authors (ref 13). The new clarification and applications are welcome but the scope of their novelty and impact is not totally obvious. Currently, the paper tends to read as a technical paper with new observations that are intriguing but not totally understood (e.g. the difference between stationary and non-stationary growth). It would benefit from a clarification of the biological questions that the approach can address and the biological insights that it can provide.

Selection is most commonly thought to act on traits and, in constant environments, to be quantified by the instantaneous growth. This quantity is often identified with fitness, with a straightforward relation to adaptation at the population level. The reduction of fitness variance that the authors mention (in the abstract and line 260) is derived in this context. The paper takes a different perspective and it would be helpful to contrast it more clearly with this more usual approach: Why and when is it justified to define selection at the level of lineage trait? Why should we be interested in multiple definitions of selection strength in this context? What can we expect to learn?

Further explanation of the questions that the formalism is intended to address is needed, or the paper may appear as a formal exercise to solve a problem that the authors artificially created, i.e. introducing multiple measures of selection strength on an unusual quantity. Further explanation of the biological insights that the framework can provide is also needed if the intent is to reach readers interested in applications and not only mathematical technicalities.

Some questions along those lines:

Why multiple definitions of selection strength? Is it just a matter of quantifying the difference between $Q_{rs}(x)$ and $Q_{cl}(x)$, which cannot be reduced to a scalar quantity? What information is in principle contained in the difference? What is the biological interpretation of the observation that it can be reduced to 2 numbers in many cases (when higher-order cumulants are negligible)?

In applications, it appears that interpretable conclusions are mainly drawn from two quantities: S_KL(D) for global selection and S_rel(X) for selection of a specific trait. In the current understanding of the approach, are these the quantities that one should compute to reach biological insights in practical applications?

Can we use the approach to rule out that a trait is under selection? If so, what would be the statistical evidence?

How critical is the formalism: can the authors derive a biological conclusion that would not be accessible without it?

The application of the method to non time-invariant conditions (regrowth, changing environments) is not completely clear. The results should depend on the time window, and important information pertaining to selection should be contained in the time evolution. The empirical observation that $S_{KL}^{(2)}$ differs from $S_{KL}^{(1)}$ in this context is intriguing, but its origins and implications are unclear.

The authors stress that their model is independent of mechanisms, which makes it broadly applicable. But only correlations can be assessed, which may limit the identification of what drives selection.

What is the relationship to adaptation and evolution? The abstract raises the question but no further mention of adaptation is made in the rest of the paper.

Heredity is generally as important as selection: is it within or beyond the scope of the framework?

Reviewer #3 (Recommendations for the authors):

I recommend addressing the specific concerns raised through appropriate discussions and clarifications in the manuscript text.

eLife. 2022 Dec 6;11:e72299. doi: 10.7554/eLife.72299.sa2

Author response

Essential revisions:

1. The authors should restructure the paper so it is accessible to the broad audience of eLife. The main issues to keep in mind:

i. Better clarify the biological questions that the approach can address and the biological insights that it can provide. Tangible connections to specific biological systems would strengthen the work.

ii. Better structure the results and present the main points in an accessible language to those not interested in the mathematical intricacies.

We thank the reviewers for the suggestion. To address this issue, we restructured the manuscript significantly as follows.

First, we added the section titled "Examples of biological questions" immediately after the Introduction. We present three examples of fundamental biological questions for which this framework of cell lineage statistics is indispensable. This section also clarifies which quantities one needs to evaluate from experimental data to gain insights into each problem.

Second, we significantly reduced the mathematical descriptions in the main text and limited them only to the essential equations required for understanding the meanings of the quantities and interpreting the experimental results. We also moved most of the contents for model applications to Appendices to minimize the usage of equations and to present only the essential conclusions in the main text.

Third and lastly, we added more intuitive and plain explanations of the quantities where we first introduced them in the main text. Furthermore, in each application to the experimental data, we explained the biological insights gained by the results in more detail.

We hope these modifications and clarifications have made this manuscript more accessible to the broad audience of eLife. Please also see our replies to reviewers #1 and #2.

2. Better clarify the definitions and measures used for fitness, selection, evolutionary advantages, etc. in the paper and contrast them with the common notion of fitness as the instantaneous growth rate. See the comments regarding the definitions of selection strength by reviewers #1 and #2.

We now include a glossary of the terms as Box 1 alongside a figure and explain the terms of fitness, fitness landscape, selection, selection strength, and cumulants. We also explain their standard usage in evolutionary biology and clarify their similarities and differences in this framework.

In addition, we clarify when the lineage-based consideration of fitness becomes indispensable in the new section, “Examples of biological questions,” referring to specific biological questions (L77-89).

The shared and distinct properties of the different selection strength measures introduced in this study are also clarified in the revised manuscript (L136-142 and Results). Please see our replies to the specific comments by reviewers #1 and #2.

3. Discuss the impact of inherent stochasticity in division events on the results and the outcome of lineage evolution.

We now clarify the impact of inherent stochasticity in interdivision time (generation time) on overall selection strength ($S_{KL}^{(1)}[D]$) and relative selection strength for a lineage trait ($S_{rel}[X]$) in the Discussion (L519-526). We also explain the impact of stochasticity on the long-term growth rate of the population and selection (L327-334), referring to the results from the analytical model in Appendix 2. We demonstrate that the statistical properties of stochasticity influence both growth and selection even in the long-term limit. Please also see our replies to reviewers #1 and #3.

4. Discuss how the mathematical formalism and results acquired for the stationary state can be applied (or modified) to address non-stationary conditions that are more relevant for the processes discussed in the manuscript. See comment 3 by reviewer #3.

We now explicitly state in the text that this formalism can be applied to non-stationary conditions without modifications because evaluating fitness and selection requires only the information of division counts and trait dynamics in cell lineages (L62-65 and L492-493). However, we also clarify a limitation of this framework from this evaluation scheme: it cannot report any potential influences from uncharacterized factors, such as heterogeneous environments around cells and non-quantified traits (L493-496).

We also clarify the importance of the time windows for the results, especially when applied to non-stationary conditions (L536-539). Please see our replies to the comments by reviewers #1 and #3.

The reviews below contain a number of other suggestions that we encourage you to consider.

Reviewer #1 (Recommendations for the authors):

– In the Introduction it is mentioned that « cell population's growth rate becomes greater than the mean division rate ». Can the formalism presented here describe this in a simple way?

Yes, Equation 8 with X = D describes this effect. We now explain explicitly that the selection strength measure $S_{KL}^{(1)} [D]$ corresponds to growth rate gain from growth (fitness) heterogeneity (L243-244).
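
As a rough numerical illustration of this point (a sketch under stated assumptions, not the authors' code), the snippet below takes a hypothetical toy tree and assumes the usual definitions of this lineage-statistics framework: each lineage with division count D descending from one of N0 ancestors carries chronological weight 2^(-D)/N0, the lineage fitness is h(D) = D ln 2, and τΛ = ln(N_τ/N0). The division counts are invented for illustration.

```python
import math

# Hypothetical toy tree from a single ancestor (N0 = 1): division counts of
# the four lineages present at the end of the observation window.
division_counts = [1, 2, 3, 3]
n0 = 1

n_tau = len(division_counts)
tau_lambda = math.log(n_tau / n0)  # tau * population growth rate

# Chronological weights 2^(-D) / N0; they sum to 1 for a complete binary tree.
w_cl = [2.0 ** (-d) / n0 for d in division_counts]

# Chronological mean of the lineage fitness h(D) = D ln 2.
mean_h_cl = sum(w * d * math.log(2) for w, d in zip(w_cl, division_counts))

# Growth rate gain, assumed here to be S_KL^(1)[D] = tau*Lambda - <h(D)>_cl.
s_kl1 = tau_lambda - mean_h_cl

print(f"tau*Lambda = {tau_lambda:.4f}")
print(f"<h(D)>_cl  = {mean_h_cl:.4f}")
print(f"S_KL1[D]   = {s_kl1:.4f}")
```

In this toy case the gain equals 0.25 ln 2 (about 0.173), so the population growth τΛ exceeds the chronological mean lineage fitness even though no lineage was removed from the tree.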

– It could be useful for a broader audience to explicitly explain how skewness and cumulants are related.

We appreciate your suggestion. We now explain "cumulants" in Box 1 and clarify what first, second, and third-order cumulants represent. We also explain how the third-order cumulant is related to the skewness of a distribution.
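
For readers who prefer a computational view, the following minimal sketch shows the standard relation between low-order cumulants and skewness: the first three cumulants are the mean, the variance, and the third central moment, and skewness is the third cumulant divided by the variance raised to the power 3/2. The function names and sample values are hypothetical, and population (not bias-corrected) estimators are used for simplicity.

```python
import numpy as np

def first_three_cumulants(values, weights=None):
    """Return (kappa1, kappa2, kappa3) of a weighted sample.

    kappa1 is the mean, kappa2 the variance, and kappa3 the third
    central moment (population estimators, no bias correction).
    """
    x = np.asarray(values, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    k1 = np.sum(w * x)
    k2 = np.sum(w * (x - k1) ** 2)
    k3 = np.sum(w * (x - k1) ** 3)
    return float(k1), float(k2), float(k3)

def skewness(values, weights=None):
    """Skewness = kappa3 / kappa2^(3/2)."""
    _, k2, k3 = first_three_cumulants(values, weights)
    return k3 / k2 ** 1.5

symmetric = [1, 2, 3]            # no skew
right_tailed = [0, 0, 0, 0, 5]   # a few large values: positive skew

print(skewness(symmetric), skewness(right_tailed))
```

The optional `weights` argument allows the same functions to be applied with chronological or retrospective lineage weights, if those are available.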

– It could be good to define W1 and W2 in the theoretical background section.

We again appreciate your suggestion. The definitions of W₁ and W₂ become clear only after introducing the cumulant expansion. We therefore still believe it is more appropriate to define W₁ and W₂ in the Results section. Instead of showing the definitions in the Theoretical background section, we explicitly wrote the relations of W₁ and W₂ to $\langle h(X) \rangle_{cl}$ and $\mathrm{Var}[h(X)]_{cl}$ in Results after the definition of $W_n$ (L262-263).

– Figure 7: I recommend stating explicitly what x is.

We clarified in the legend of Figure 7A and B which quantities were adopted as lineage traits and what x represents.

– How do definitions depend on the discrete or continuous nature of x? Practically, do we need to, say in Figure 7, bin x to compute different functions? How is this done?

Thank you for pointing out an important practical issue. If x is a trait that takes continuous values, we need to bin the values of x to calculate fitness landscapes and selection strength from experimental data. The bin width does affect the results, but we can usually find a range of bin widths in which the results are relatively insensitive to the choice (see ref. 13). In this study, we set the bin widths for the time-averaged RpoS-mCherry and GFP fluorescence intensities to 0.4 × the respective interquartile ranges of the data from all the conditions, following a rule that works empirically. We added explanations of this to Materials and methods (L795-796).
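
The binning step can be sketched as follows. This is an illustrative example only, not the published LineageAnalysis code: the function name, the toy data, and the choice of starting the bins at the minimum trait value are assumptions, and the interquartile range is taken from the single toy dataset rather than from all conditions. It assumes chronological weights 2^(-D)/N0, retrospective weights 1/N_τ, and h(x) = τΛ + ln[Q_rs(x)/Q_cl(x)].

```python
import numpy as np

def binned_fitness_landscape(trait, division_counts, n0, bin_factor=0.4):
    """Bin a continuous lineage trait and estimate Q_cl(x), Q_rs(x), h(x)."""
    x = np.asarray(trait, dtype=float)
    d = np.asarray(division_counts)

    # Bin width = bin_factor x interquartile range of the trait values.
    q25, q75 = np.percentile(x, [25, 75])
    width = bin_factor * (q75 - q25)
    idx = np.floor((x - x.min()) / width).astype(int)
    n_bins = idx.max() + 1

    # Chronological and retrospective lineage weights (assumed definitions).
    w_cl = 2.0 ** (-d) / n0
    w_rs = np.full(len(d), 1.0 / len(d))
    q_cl = np.bincount(idx, weights=w_cl, minlength=n_bins)
    q_rs = np.bincount(idx, weights=w_rs, minlength=n_bins)

    tau_lambda = np.log(len(d) / n0)
    with np.errstate(divide="ignore", invalid="ignore"):
        h = tau_lambda + np.log(q_rs / q_cl)  # NaN where a bin is empty

    centers = x.min() + (np.arange(n_bins) + 0.5) * width
    return centers, q_cl, q_rs, h

# Hypothetical toy data: one ancestor, four lineages.
trait = [0.21, 0.48, 0.93, 1.07]
division_counts = [1, 2, 3, 3]
centers, q_cl, q_rs, h = binned_fitness_landscape(trait, division_counts, n0=1)
print(centers, q_cl, q_rs, h)
```

Rerunning with several values of `bin_factor` is a simple way to check whether the resulting quantities are insensitive to the bin width.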

Reviewer #2 (Recommendations for the authors):

The main concepts were previously proposed and published by the same authors (ref 13). The new clarification and applications are welcome but the scope of their novelty and impact is not totally obvious. Currently, the paper tends to read as a technical paper with new observations that are intriguing but not totally understood (e.g. the difference between stationary and non-stationary growth). It would benefit from a clarification of the biological questions that the approach can address and the biological insights that it can provide.

We appreciate the thoughtful advice. To clarify new biological questions that this approach can address, we added a section titled "Examples of biological questions" immediately after the Introduction. This new section presents three examples of fundamental biological questions and explains why this cell lineage analysis framework is indispensable.

In addition, we also mention new insights gained by the analyses in the experimental Results sections. Although gaining a more profound understanding of the results still requires further investigations, we believe the added information would clarify the types of biological questions for which this framework becomes valuable.

Selection is most commonly thought to act on traits and, in constant environments, to be quantified by the instantaneous growth. This quantity is often identified with fitness, with a straightforward relation to adaptation at the population level. The reduction of fitness variance that the authors mention (in the abstract and line 260) is derived in this context. The paper takes a different perspective and it would be helpful to contrast it more clearly with this more usual approach: Why and when is it justified to define selection at the level of lineage trait? Why should we be interested in multiple definitions of selection strength in this context? What can we expect to learn?

We again thank the reviewer for an insightful suggestion. We now explain when a lineage-based analysis of traits and selection becomes indispensable in the "Examples of biological questions" section (L77-89). We also added discussions on the reduction of fitness variance, clarifying the differences and similarities of the contexts (L540-548). Please also see our reply to comment 4 of reviewer #1.

The lineage-based analysis is required especially when growth and traits fluctuate rapidly over time and when the traits affect growth with delays. Under these conditions, instantaneous correlations between traits and growth might not report their relations correctly. On the other hand, the cell lineage-based analysis of this framework can take the whole dynamics of traits in cell lineages into account. For example, if we expect that absolute expression levels are essential for fitness, the expression level averaged in each cell lineage can be employed as the lineage trait. Furthermore, when large fluctuations are expected to affect cell fates and promote or suppress growth (ref. 24, for example), variances of expression levels along the cell lineages can be taken as lineage traits. Therefore, assuming a cell lineage as a unit of selection can significantly extend the choice of traits, including time-dependent properties.
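
As a small illustration of how such lineage traits could be computed, the sketch below takes a hypothetical array of expression time series (one row per lineage, one column per time point) and derives two candidate lineage traits: the time-averaged level and the variance along each lineage. The data are invented for illustration.

```python
import numpy as np

# Hypothetical expression levels: rows are lineages, columns are time points.
expression = np.array([
    [10.0, 10.0, 10.0, 10.0],   # steady expression
    [5.0, 15.0, 5.0, 15.0],     # same average, strongly fluctuating
])

# Lineage trait 1: time-averaged expression level along each lineage.
mean_trait = expression.mean(axis=1)

# Lineage trait 2: variance of the expression level along each lineage.
var_trait = expression.var(axis=1)

print(mean_trait)  # identical for the two lineages
print(var_trait)   # distinguishes the fluctuating lineage
```

The two toy lineages share the same time average but differ in variance, which is the kind of time-dependent property that a per-cell snapshot cannot capture.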

We also appreciate your question on why we need multiple selection strength measures. As we now explain explicitly in the main text, these measures share a similar property of reporting the overall correlations between traits and fitness (L136-137). However, they also have critical differences regarding additional selection effects they represent: $S_{KL}^{(1)}$ for growth rate gain, $S_{KL}^{(2)}$ for additional loss of growth rate under perturbations, and their difference $S_{KL}^{(2)} - S_{KL}^{(1)}$ for the effect of selection on fitness variance. We restructured the sections in Results and clarified these important meanings of the different selection strength measures.

Further explanation of the questions that the formalism is intended to address is needed, or the paper may appear as a formal exercise to solve a problem that the authors artificially created, i.e. introducing multiple measures of selection strength on an unusual quantity. Further explanation of the biological insights that the framework can provide is also needed if the intent is to reach readers interested in applications and not only mathematical technicalities.

As we stated in our reply to comment 1, we included a new section titled "Examples of biological questions" to clarify the biological questions we intended to address using this framework. We also explained in more detail what insights we gained from the experimental data analyses.

Some questions along those lines:

Why multiple definitions of selection strength? Is it just a matter of quantifying the difference between $Q_{rs}(x)$ and $Q_{cl}(x)$, which cannot be reduced to a scalar quantity? What information is in principle contained in the difference? What is the biological interpretation of the observation that it can be reduced to 2 numbers in many cases (when higher-order cumulants are negligible)?

As we stated in our reply to comment 2, the selection measures $S_{KL}^{(1)}$ and $S_{KL}^{(2)}$ represent distinct selection effects (growth rate gain or additional loss of growth rate under perturbations) on cellular populations. Furthermore, the difference of the selection strength measures, $S_{KL}^{(2)} - S_{KL}^{(1)}$, represents the effect of selection on fitness variances.

The situations where the contributions of higher-order cumulants are significant indicate that the fitness distributions are far from Gaussian due to significant skew or multiple peaks. Therefore, the higher-order cumulants can suggest the existence of sub-populations in the cellular populations. We now explicitly explain this role of higher-order cumulants in the Results and Discussion (L291-294 and L508-514).

In applications, it appears that interpretable conclusions are mainly drawn from two quantities: S_KL(D) for global selection and S_rel(X) for selection of a specific trait. In the current understanding of the approach, are these the quantities that one should compute to reach biological insights in practical applications?

We now clarify which quantities one should compute depending on the type of biological question in the new section, "Examples of biological questions." When one aims to know how strong the overall selection is in the population and how much growth rate gain is obtained from fitness heterogeneity, one needs to quantify $S_{KL}^{(1)}[D]$. On the other hand, if one needs to know how state differences of a trait of interest correlate with fitness and selection, one should compute the chronological distribution $Q_{cl}(x)$, the fitness landscape $h(x)$, and the relative selection strength $S_{rel}[X]$, as those quantities have different meanings. For example, when we find a significant change in $S_{rel}[X]$, two extreme scenarios can be considered: (a) $h(x)$ is unchanged, but $Q_{cl}(x)$ changes; (b) the distribution $Q_{cl}(x)$ is unchanged, but the fitness landscape $h(x)$ changes. In reality, $h(x)$ and $Q_{cl}(x)$ can change simultaneously depending on the traits and conditions. Therefore, evaluating all of these quantities is essential for understanding the underlying biological processes. We now explain this important point, referring to the experimental results (L458-464).

Can we use the approach to rule out that a trait is under selection? If so, what would be the statistical evidence?

Yes, a trait whose selection strength $S_{rel}[X]$ is zero is not under selection in a population. However, practically speaking, computing $S_{rel}[X]$ for any empirical lineage data will always yield a positive value. Therefore, it is helpful to compare the measured $S_{rel}[X]$ with the values computed from lineage data in which the correspondence between division counts and trait values is randomized (L427-430). If the trait is under selection, the measured $S_{rel}[X]$ will be significantly larger than the randomized values.
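
The randomization check described here can be sketched as a permutation test. The code below is a hedged illustration, not the authors' implementation: the function names, the toy data, and the number of shuffles are assumptions, and it takes $S_{rel}[X] = S_{KL}^{(1)}[X] / S_{KL}^{(1)}[D]$ with $S_{KL}^{(1)}$ computed as the Kullback-Leibler divergence of $Q_{cl}$ from $Q_{rs}$ over already-binned (discrete) trait labels.

```python
import numpy as np

def s_kl1(labels, division_counts, n0):
    """D_KL(Q_cl || Q_rs) over discrete labels (complete trees assumed)."""
    d = np.asarray(division_counts)
    w_cl = 2.0 ** (-d) / n0                  # chronological lineage weights
    w_rs = np.full(len(d), 1.0 / len(d))     # retrospective lineage weights
    _, inv = np.unique(np.asarray(labels), return_inverse=True)
    q_cl = np.bincount(inv, weights=w_cl)
    q_rs = np.bincount(inv, weights=w_rs)
    return float(np.sum(q_cl * np.log(q_cl / q_rs)))

def s_rel(labels, division_counts, n0):
    """Relative selection strength S_KL1[X] / S_KL1[D] (assumed definition)."""
    return s_kl1(labels, division_counts, n0) / s_kl1(division_counts, division_counts, n0)

def randomized_s_rel(labels, division_counts, n0, n_shuffles=200, seed=0):
    """S_rel after shuffling trait labels relative to division counts."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    return np.array([s_rel(rng.permutation(labels), division_counts, n0)
                     for _ in range(n_shuffles)])

# Hypothetical data: 8 ancestors, each producing lineages with D = 1, 2, 3, 3.
n0 = 8
division_counts = np.tile([1, 2, 3, 3], n0)
labels = np.where(division_counts >= 3, "high", "low")  # trait tied to D

observed = s_rel(labels, division_counts, n0)
null = randomized_s_rel(labels, division_counts, n0)
p_value = (np.sum(null >= observed) + 1) / (len(null) + 1)
print(observed, null.mean(), p_value)
```

The shuffled values are small but positive, which is why a comparison with the randomized distribution, and not with zero, is the informative test.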

We now include a figure showing the levels of $S_{rel} [X]$ of randomized data for the experiments in which we evaluated the RpoS-mCherry and GFP expression levels under different culture conditions (Figure 7 – supplement 1).

How critical is the formalism: can the authors derive a biological conclusion that would not be accessible without it?

Related to our reply to comment 2, this formalism is critical if one needs to evaluate fitness and selection for traits with significant temporal fluctuations or delayed effects on fitness. Furthermore, this cell lineage-based framework is indispensable if we consider traits that can be defined only for cell lineages, not for individuals, such as the "variableness" of expression levels. Consequently, this framework significantly extends the choice of traits, including time-dependent properties.

We now explain this important point more clearly early in the manuscript in the section, "Examples of biological questions" (L77-89).

The application of the method to non time-invariant conditions (regrowth, changing environments) is not completely clear. The results should depend on the time window, and important information pertaining to selection should be contained in the time evolution. The empirical observation that $S_{KL}^{(2)}$ differs from $S_{KL}^{(1)}$ in this context is intriguing, but its origins and implications are unclear.

We appreciate another insightful comment. It is correct that the time window affects the selection strength values. The purpose of this analysis of the regrowth of E. coli was to examine whether the selection strength differed significantly depending on how long cell populations were kept under stationary phase conditions. For this purpose, it is appropriate to compare the results between the conditions (regrowth from the early stationary phase or the late stationary phase) with the time window for regrowth fixed. Therefore, the results are valid at least on this time scale, but clarifying the selection on longer time scales requires a more detailed characterization of lag time distributions under both conditions.

The significantly larger $S_{KL}^{(2)}$ than $S_{KL}^{(1)}$ indicates a strong positive skew of the division count distribution and the existence of distinct subpopulations. Although these conclusions are already clear from the lineage trees (Figure 7D) and the division count distribution (Figure 7F), the selection strength measures permit quantitative evaluation and comparison of the differences in a unified manner.

We added sentences to the Discussion to explain that our conclusions from the analyses of regrowth depend on the time window (L536-539). In addition, we explained the implications of the larger $S_{KL}^{(2)}$ than $S_{KL}^{(1)}$ (L603-611).
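
A toy calculation can make this asymmetry concrete. The sketch below is illustrative only: it assumes $S_{KL}^{(1)}[D] = D_{KL}(Q_{cl} \| Q_{rs})$ and $S_{KL}^{(2)}[D] = D_{KL}(Q_{rs} \| Q_{cl})$ with chronological weights 2^(-D)/N0, and the two division count datasets are hypothetical. The "regrowth-like" case mimics a population in which most ancestors stay arrested while one resumes division.

```python
import numpy as np

def selection_strengths(division_counts, n0):
    """Return (S_KL1[D], S_KL2[D]) for a set of complete lineage trees."""
    d = np.asarray(division_counts)
    w_cl = 2.0 ** (-d) / n0                  # chronological lineage weights
    w_rs = np.full(len(d), 1.0 / len(d))     # retrospective lineage weights
    _, inv = np.unique(d, return_inverse=True)
    q_cl = np.bincount(inv, weights=w_cl)
    q_rs = np.bincount(inv, weights=w_rs)
    s1 = float(np.sum(q_cl * np.log(q_cl / q_rs)))
    s2 = float(np.sum(q_rs * np.log(q_rs / q_cl)))
    return s1, s2

# Hypothetical balanced case: one ancestor, lineages with D = 1, 2, 3, 3.
s1_bal, s2_bal = selection_strengths([1, 2, 3, 3], n0=1)

# Hypothetical regrowth-like case: 10 ancestors, 9 never divide (D = 0),
# one divides three times synchronously (8 lineages with D = 3).
s1_skew, s2_skew = selection_strengths([0] * 9 + [3] * 8, n0=10)

print(s1_bal, s2_bal)     # equal in the balanced toy case
print(s1_skew, s2_skew)   # S_KL2 exceeds S_KL1 in the skewed toy case
```

In the skewed toy case the chronological division count distribution has a long right tail (0.9 at D = 0 and 0.1 at D = 3), and the second measure is clearly larger than the first.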

The authors stress that their model is independent of mechanisms, which makes it broadly applicable. But only correlations can be assessed, which may limit the identification of what drives selection.

It is true that our approach can assess only correlations between traits and fitness. Revealing causal traits requires additional experiments and analyses in each phenomenon. Nevertheless, we believe this framework is valuable for narrowing down the candidates of traits whose heterogeneity influences fitness and selection; such causal traits should have large selection strength. We now discuss these limitations and utilities of this framework in Discussion (L496-499).

What is the relationship to adaptation and evolution? The abstract raises the question but no further mention of adaptation is made in the rest of the paper.

We thank the reviewer for pointing out the lack of explanation of the linkage to adaptation. We added a paragraph to the Discussion in which we discuss the implications of the results from the experimental data analyses for adaptation and evolution. We remark on the importance of growth heterogeneity structures within cellular populations for adaptation and discuss the advantage of this framework in characterizing such structures (L503-514).

Heredity is generally as important as selection: is it within or beyond the scope of the framework?

Again, we thank the reviewer for an insightful comment. We agree that heredity is also important for the growth and evolution of a population. A straightforward application of this framework to this problem is to take the correlation length (e.g., the half-life of the autocorrelation of a particular trait) along a cell lineage as a lineage trait X. We now discuss the possibility of such an application, citing a reference that revealed that modes of heredity could also be targets of natural selection (L515-518).
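
One possible way to turn this idea into a number is sketched below. The estimator is an assumption made for illustration (the response does not specify one): it computes the normalized autocorrelation of a trait time series along one lineage and reports the first lag at which it drops to 0.5 or below. The function name and the toy series are hypothetical, and the series must not be constant.

```python
import numpy as np

def autocorrelation_half_life(series, dt=1.0):
    """First lag (in units of dt) at which the autocorrelation is <= 0.5.

    Returns NaN if the autocorrelation never drops to 0.5 within the series.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:]  # lags 0 .. n-1
    acf = acf / acf[0]                             # normalize so acf[0] = 1
    below = np.nonzero(acf <= 0.5)[0]
    return float(below[0] * dt) if below.size else float("nan")

t = np.arange(80)
fast = np.where(t % 2 == 0, 1.0, -1.0)   # trait flips at every time point
slow = np.sin(2 * np.pi * t / 40)        # trait varies slowly

half_life_fast = autocorrelation_half_life(fast)
half_life_slow = autocorrelation_half_life(slow)
print(half_life_fast, half_life_slow)
```

Each lineage then receives one half-life value, which can be binned and analyzed like any other lineage trait X.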

Reviewer #3 (Recommendations for the authors):

I recommend addressing the specific concerns raised through appropriate discussions and clarifications in the manuscript text.

We thank the reviewer for the constructive suggestion. As noted in our replies above, we addressed the specific concerns by adding discussions and clarifications to the text.

Associated Data


Data Citations

Yamauchi S, Nozoe T, Okura R, Kussell E, Wakamoto Y. 2021. LineageAnalysis. Github. LineageAnalysis
Nozoe T, Kussell E, Wakamoto Y. 2018. Data from: Inferring fitness landscapes and selection on phenotypic states from single-cell genealogical data. Dryad Digital Repository.
Nakaoka H, Wakamoto Y. 2018. Data from: Aging, mortality, and the fast growth trade-off of Schizosaccharomyces pombe. Dryad Digital Repository.
Seita A, Nakaoka H, Okura R, Wakamoto Y. 2021. Data from: Intrinsic growth heterogeneity of mouse leukemia cells underlies differential susceptibility to a growth-inhibiting anticancer drug. Dryad Digital Repository.

Supplementary Materials

Transparent reporting form

elife-72299-transrepform1.docx (246.6 KB, docx)

Data Availability Statement

The following dataset was generated:

Yamauchi S, Nozoe T, Okura R, Kussell E, Wakamoto Y. 2021. LineageAnalysis. Github. LineageAnalysis

The following previously published datasets were used:

Nozoe T, Kussell E, Wakamoto Y. 2018. Data from: Inferring fitness landscapes and selection on phenotypic states from single-cell genealogical data. Dryad Digital Repository.

Nakaoka H, Wakamoto Y. 2018. Data from: Aging, mortality, and the fast growth trade-off of Schizosaccharomyces pombe. Dryad Digital Repository.

