Statistical inference of the rates of cell proliferation and phenotypic switching in cancer

Einar Bjarki Gunnarsson; Jasmine Foo; Kevin Leder

doi:10.1016/j.jtbi.2023.111497

Author manuscript; available in PMC: 2024 Jul 7.

Published in final edited form as: J Theor Biol. 2023 Apr 21;568:111497. doi: 10.1016/j.jtbi.2023.111497


PMCID: PMC10372878 NIHMSID: NIHMS1894227 PMID: 37087049

Abstract

Recent evidence suggests that nongenetic (epigenetic) mechanisms play an important role at all stages of cancer evolution. In many cancers, these mechanisms have been observed to induce dynamic switching between two or more cell states, which commonly show differential responses to drug treatments. To understand how these cancers evolve over time, and how they respond to treatment, we need to understand the state-dependent rates of cell proliferation and phenotypic switching. In this work, we propose a rigorous statistical framework for estimating these parameters, using data from commonly performed cell line experiments, where phenotypes are sorted and expanded in culture. The framework explicitly models the stochastic dynamics of cell division, cell death and phenotypic switching, and it provides likelihood-based confidence intervals for the model parameters. The input data can be either the fraction of cells or the number of cells in each state at one or more time points. Through a combination of theoretical analysis and numerical simulations, we show that when cell fraction data is used, the rates of switching may be the only parameters that can be estimated accurately. On the other hand, using cell number data enables accurate estimation of the net division rate for each phenotype, and it can even enable estimation of the state-dependent rates of cell division and cell death. We conclude by applying our framework to a publicly available dataset.

Keywords: Mathematical modeling, maximum likelihood estimation, parameter identifiability, phenotypic switching, epigenetics, cancer evolution

1. Introduction

Cancer evolution has long been understood to be a genetic process. However, recent evidence suggests an equally important role for non-genetic forces, including epigenetic mechanisms and the inherent stochasticity in gene transcription and translation [1, 2, 3, 4, 5, 6]. These mechanisms are heritable and reversible, and they can enable cells to dynamically switch between two or more phenotypic states. Such switching dynamics have been observed e.g. in lung cancer [7, 8, 9], melanoma [10, 11, 12], glioblastoma [13, 14], leukemia [15, 16], colon cancer [17, 18, 19, 20] and breast cancer [21, 22, 23, 24]. The different phenotypes commonly show differential responses to drug treatments, which enhances the adaptability of the cancer under treatment and significantly increases the probability of treatment resistance [25].

Unraveling how the cancer-specific rates of cell division, cell death and phenotypic switching shape tumor evolution over time is crucial to furthering our understanding of the disease and to informing new treatment strategies. For example, in a two-phenotype cancer where one type is drug-sensitive and the other is drug-tolerant, the change in phenotypic proportions during the initial stages of treatment can be explained by a combination of sensitive cells dying, drug tolerant cells proliferating, and cells switching between sensitivity and tolerance. Disentangling the relative rates at which these events occur can help us to better understand how resistance arises, how it evolves over time, and how best to combat it [25].

Our current quantitative understanding of the rates of cell proliferation and phenotypic switching in cancer is largely derived from cell line experiments. In these experiments, live cells are commonly sorted into phenotypes, e.g. based on gene expression profiles or cell morphologies, isolated subpopulations are expanded in culture, and phenotypic proportions are tracked over time (Fig. 1). These isolated subpopulations have been observed to give rise to all other phenotypes over time, with proportions between types eventually converging to the constant proportions observed in the parental population [21, 17, 19, 20, 23, 24, 14].

Figure 1: The dynamics of phenotypic switching are commonly interrogated by sorting live cells into isolated phenotypic subpopulations and expanding these subpopulations in culture [21, 17, 19, 20, 23, 24, 14]. By tracking the evolution of phenotypic proportions over time and applying mathematical models of phenotypic switching, it becomes possible to estimate the quantitative parameters of the process [21, 26, 27, 28, 29, 22, 12, 30].

To explain this behavior, simple mathematical models of phenotypic switching have been proposed, and these models have been used to estimate the rates at which cells switch between states [21, 26, 27, 28, 29, 22, 12, 30]. These works are reviewed in Section 2 below. Previous estimation methods have been deterministic in nature, and they have generally derived their estimates from data on the fraction of cells in each state at each time point. If the total size of the cell population is measured at the same time points, as e.g. in [30], one obtains data on the number of cells in each state at each time point. We will show that when cell fraction data is used, the rates of phenotypic switching may be the only parameters that can be estimated accurately. In contrast, using cell number data enables accurate estimation of the net cell division rate for each phenotype, and it can even enable estimation of the state-dependent rates of cell division and cell death. Understanding how growth rates vary between types is as important as understanding the rates of phenotypic switching, especially in the context of treatment response. Not only do the growth rates influence the phenotypic composition of the population, they also control the evolution of the tumor burden over time.

Our goal in this work is to develop a statistically rigorous framework for estimating the rates of cell proliferation and phenotypic switching in cancer. In contrast to previous approaches, our framework explicitly models the stochastic dynamics of cell division, cell death and phenotypic switching, it provides likelihood-based confidence intervals for the model parameters, and it enables estimation both from cell fraction and cell number data. We also use our framework to analyze the identifiability of model parameters and how it depends on the input data. This important topic has not been addressed by previous works.

The rest of the paper is organized as follows. In Section 2, we review prior estimation methods. In Section 3.1, we introduce our stochastic model of cell division, cell death and phenotypic switching. In Section 3.2, we state our assumptions on the cell line experiments conducted and the data collected. In Sections 3.3 and 3.4, we propose statistical models for cell number and cell fraction data, respectively, and describe how parameter estimates and confidence intervals are computed. In Section 4.1, we present theoretical analysis of the identifiability of parameters under each model. In Section 4.2, we conduct numerical experiments for the case of two phenotypes, and in Section 4.3, we apply our framework to a publicly available dataset. We conclude by discussing limitations of the framework as well as avenues for improvement (Section 5). For simplicity, the development of the estimation framework in the main text is focused on the case of experiments started by isolated subpopulations. General starting conditions are treated in full detail in the appendices.

2. Review of prior estimation methods

At the single-cell level, phenotypic switching has commonly been modeled by a discrete-time Markov chain with $K \geq 2$ states, where $K$ is the number of phenotypes. In each time step, a cell in state $j$ transitions to state $k \neq j$ with probability $p_{j k}$, and it remains in state $j$ with probability $p_{j j} = 1 - \sum_{k \neq j} p_{j k}$. The transition probabilities are collected into the $K \times K$ transition matrix $P = (p_{j k})$. The evolution of the Markov chain is determined by $P$ and the initial distribution $q = (q_{1}, \dots, q_{K})$, where $q_{j}$ is the probability that a cell starts in state $j$. If we let $q^{(ℓ)}$ denote the cell state distribution after $ℓ \geq 1$ time steps, then $q^{(ℓ)} = q P^{ℓ}$.

Say we conduct $K$ cell line experiments starting with $N$ cells in each experiment and known initial cell state distributions $q_{1}, \dots, q_{K}$. The initial distributions are collected into a $K \times K$ matrix $Q$, where $q_{i}$ is the $i$-th row vector. Each experiment is run for $ℓ \geq 1$ time steps, at which point the fraction of cells in each state is recorded. Let $f_{i j}^{(ℓ)}$ be the observed fraction of cells in state $j$ under the $i$-th initial condition. The observations at the $ℓ$-th time step under the $i$-th initial condition are collected into a vector $f_{i}^{(ℓ)} = (f_{i 1}^{(ℓ)}, \dots, f_{i K}^{(ℓ)})$, and all observations at the $ℓ$-th time step are collected into a $K \times K$ matrix $F^{(ℓ)} = (f_{i j}^{(ℓ)})$. If there are multiple replicates $r = 1, \dots, R$, we let $F^{(ℓ), r}$ denote the data from the $r$-th replicate.

Now assume that the starting population $N$ is large, that there is no cell division or cell death, and that each cell switches between states according to the above Markov model. In this case, by the strong law of large numbers, the model-predicted distribution between cell states $Q P^{ℓ}$ after $ℓ$ time steps can be approximated by the experimentally observed cell-state fractions $F^{(ℓ)}$. If we simply equate these two matrices, we can obtain an estimate $P_{ℓ}$ of the transition matrix $P$ by inverting the matrix $Q$ of initial distributions and taking an $ℓ$-th matrix root, $P_{ℓ} = {(Q^{- 1} F^{(ℓ)})}^{1 / ℓ}$. Here, we assume that $Q$ is invertible, which is e.g. the case when experiments are started with isolated subpopulations.
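The single-time-point estimate $P_{ℓ} = {(Q^{- 1} F^{(ℓ)})}^{1 / ℓ}$ can be sketched in a few lines of Python. This is an illustration only, not the code used in [21] or [26]; the function name `estimate_transition_matrix` is hypothetical, and discarding a negligible imaginary part of the matrix root is an assumption made here for convenience.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def estimate_transition_matrix(Q, F, ell):
    """Return (Q^{-1} F)^{1/ell}, the single-time-point estimate of P.

    Q   : K x K matrix of initial distributions (rows q_i), assumed invertible
    F   : K x K matrix of observed cell-state fractions after ell time steps
    ell : number of time steps
    """
    # Solve Q X = F instead of forming Q^{-1} explicitly.
    root = fractional_matrix_power(np.linalg.solve(Q, F), 1.0 / ell)
    # With noisy data the root may be complex or non-stochastic; here we only
    # drop an imaginary part that is numerically negligible (assumption).
    return np.real_if_close(root, tol=1000)

# Noise-free check: isolated subpopulations (Q = I) observed after 3 steps.
P_true = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
Q = np.eye(2)
F = Q @ np.linalg.matrix_power(P_true, 3)
P_hat = estimate_transition_matrix(Q, F, 3)
```

With real data, the matrix root need not be a stochastic matrix, which is one reason a regularization step is needed in practice.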

This simple estimation idea was applied by Gupta et al. [21] to investigate phenotypic switching between stem-like, basal and luminal cell states in breast cancer, using data from a single time point. A multiple-time-point extension has since been implemented in the R package CellTrans [26]. Say that cell state fractions are experimentally observed at time steps $m_{1}, \dots, m_{L}$ for $L \geq 1$ . CellTrans first computes an estimate $P_{m_{ℓ}}$ of the transition matrix for each time step as above, and then returns a final estimate as the average across time steps:

\hat{P} : = (1 / L) \sum_{ℓ = 1}^{L} P_{m_{ℓ}} .

(1)

CellTrans also involves a regularization step to ensure that $\hat{P}$ is stochastic. CellTrans is used on publicly available datasets in [26] and it has been applied more recently in [31, 32, 33].

Cell populations in culture typically change in size over time. If all phenotypes grow at the same rate, and cell growth occurs deterministically at the end of each time step, the constant-sized Markov model can be used to describe the evolution of cell state fractions. Both Gupta et al. [21] and Su et al. [12] have applied an augmented version of the Markov model intended to capture proliferation differences between types. In the augmented model, during a single time step, each type-$j$ cell first grows deterministically to a population of size $Λ_{j j}$, and a fraction $p_{j k}$ of cells then switch to type-$k$. The growth factors $Λ_{j j}$ are collected into a diagonal proliferation matrix $Λ$, and the product $Λ P$, after being normalized to produce cell fractions as opposed to cell numbers, is used to predict the distribution between cell states. In both Gupta et al. [21] and Su et al. [12], the matrix $Λ$ is found by randomly sampling candidate parameter values and selecting the values that best fit the experimental data.

TRANSCOMPP [27] is a more systematic version of the aforementioned method. In TRANSCOMPP, the diagonal proliferation matrix $Λ$ and the transition matrix $P$ are estimated by minimizing the sum of squared errors between the model prediction and the data,

\min_{Λ, P} \sum_{i = 1}^{I} \sum_{ℓ = 1}^{L} \sum_{r = 1}^{R} {‖f_{i}^{(m_{ℓ}), r} - {(q_{i} {(Λ P)}^{m_{ℓ}} 1^{T})}^{- 1} q_{i} {(Λ P)}^{m_{ℓ}}‖}^{2} .

(2)

Note that this problem only determines the growth factors relative to one another, $Λ_{j j} / Λ_{11}$ for $j = 2, \dots, K$ . TRANSCOMPP is implemented in MATLAB, and it includes a stochastic resampling procedure for estimating the distributions of the transition probability estimates. The stochastic resampling is performed on single-cell measurements of cell phenotypes, if available, or on data generated from a user-defined distribution of cell state fractions.
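To make the objective (2) concrete, the following Python sketch evaluates the normalized prediction $q_{i} {(Λ P)}^{m}$ and the resulting sum of squared errors. It is not the TRANSCOMPP implementation (which is in MATLAB); the function names and the dictionary layout of `data` are assumptions made for illustration.

```python
import numpy as np

def predicted_fractions(q, Lam, P, m):
    """Normalized row vector q (Lam P)^m: predicted cell-state fractions."""
    v = q @ np.linalg.matrix_power(Lam @ P, m)
    return v / v.sum()

def sse(Lam, P, Q, data):
    """Sum of squared errors as in (2).

    data maps (i, m) -> list of replicate fraction vectors observed at time
    step m for initial condition i (hypothetical layout, 0-based i).
    """
    total = 0.0
    for (i, m), replicates in data.items():
        pred = predicted_fractions(Q[i], Lam, P, m)
        total += sum(np.sum((np.asarray(f) - pred) ** 2) for f in replicates)
    return total

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
Lam = np.diag([1.5, 1.1])
Q = np.eye(2)
f_pred = predicted_fractions(Q[0], Lam, P, 3)
f_scaled = predicted_fractions(Q[0], 3.0 * Lam, P, 3)  # same fractions
```

Multiplying $Λ$ by a positive constant leaves the normalized prediction unchanged, which is why only the ratios $Λ_{j j} / Λ_{11}$ are determined by this problem.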

In modeling switching between HER2+ and HER2− states in breast cancer, Li and Thirumalai [28] employ a deterministic continuous-time model. Their model assumes symmetric and asymmetric cell divisions, which through reparametrization leads to the same dynamics as symmetric cell divisions and switching between types. Li and Thirumalai assume equal rates of asymmetric division for the two types (or equivalently, equal rates of switching between types), and they show that if experiments are started with isolated subpopulations, the slopes of the cell fraction trajectories at time 0 can be used to estimate these rates. They also show that the equilibrium proportion between types can be used to estimate the difference in symmetric division rate between the two types. The proportion between phenotypes in the parental population is used as an estimate of the equilibrium proportion. We have made use of these insights in our identifiability analysis in Section 4.1.2 below.

Finally, in their investigation of epithelial to mesenchymal transition in breast cancer, Devaraj and Bose [30, 34] employ a discrete-time model where cells divide, die and switch between types. Their model includes a separate state for dead cells to facilitate estimation of death rates as well as division rates. We have used the same idea in Section 4.2.3 below to improve the identifiability of birth and death rates under our framework. Their model furthermore assumes that the rates of birth, death and switching are time-dependent. Devaraj and Bose derive difference equations for the change in the number of cells in each state between time points. They then propose a multi-objective optimization problem to estimate the model parameters from data on cell state fractions and the total number of alive and dead cells at each time point. Their parameter fitting procedure minimizes the least squares error between the model predictions and the data across the different time points, while ensuring that parameters do not vary too drastically between time periods.

3. Models and methods

In this section, we propose statistical models for cell number and cell fraction data, which are based on a multitype branching process model of the cell population dynamics [35]. To simplify the discussion, we will focus on the case where all experiments are started from isolated subpopulations of cells. We emphasize however that the estimation framework can be applied to any set of starting conditions, as is outlined in more detail in Appendix A.

3.1. Multitype branching process model

3.1.1. Model definition and model parameters

To model the cell population dynamics, we employ a multitype branching process model in continuous time, with $K \geq 2$ types [35]. In the model, a type- $j$ cell divides into two cells at rate $b_{j} \geq 0$ , it dies at rate $d_{j} \geq 0$ , and it switches to type- $k$ at rate $ν_{j k} \geq 0$ for $k \neq j$ , independently of all other cells. This means that in an infinitesimally short time interval of length $Δ t > 0$ , a type- $j$ cell divides with probability $b_{j} Δ t$ , it dies with probability $d_{j} Δ t$ , and it switches to type $k$ with probability $ν_{j k} Δ t$ . The multitype branching process model captures a variety of switching dynamics previously observed in the literature (Fig. 2).
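The model definition translates directly into a Gillespie-type simulation, in which each cell independently divides, dies or switches at the stated rates. The sketch below is one possible implementation and is not taken from the paper; the function name `simulate` and its argument layout are assumptions, and types are indexed from 0 as is usual in Python.

```python
import numpy as np

def simulate(b, d, nu, z0, t_end, rng):
    """Simulate the multitype branching process up to time t_end.

    b, d : length-K birth and death rates
    nu   : K x K switching rates (diagonal entries are ignored)
    z0   : initial cell numbers per type
    Returns the vector of cell numbers at time t_end.
    """
    b = np.asarray(b, dtype=float)
    d = np.asarray(d, dtype=float)
    nu = np.array(nu, dtype=float)
    np.fill_diagonal(nu, 0.0)
    z = np.array(z0, dtype=np.int64)
    K = len(z)
    # Per-cell event rates by type: [division, death, switch to type 0..K-1].
    per_cell = np.hstack([b[:, None], d[:, None], nu])
    t = 0.0
    while True:
        rates = per_cell * z[:, None]
        total = rates.sum()
        if total == 0.0:          # extinct, or no event can occur
            return z
        t += rng.exponential(1.0 / total)
        if t > t_end:
            return z
        flat = rng.choice(rates.size, p=rates.ravel() / total)
        j, event = divmod(flat, K + 2)
        if event == 0:
            z[j] += 1             # division: one cell becomes two
        elif event == 1:
            z[j] -= 1             # death
        else:
            z[j] -= 1             # switch from type j to type event - 2
            z[event - 2] += 1

rng = np.random.default_rng(0)
z_switch_only = simulate([0, 0], [0, 0], [[0, 0.5], [0.3, 0]], [50, 0], 2.0, rng)
z_birth_only = simulate([1, 1], [0, 0], np.zeros((2, 2)), [5, 0], 1.0, rng)
```

Averaging many such runs gives Monte Carlo estimates of the moments of the process, which can be compared with the closed-form expressions used for estimation.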

Figure 2: The multitype branching process model captures a variety of switching dynamics previously observed in the literature. **(a)** A two-type model captures e.g. the dynamics between HER2+ and HER2− cell states in Brx-82 and Brx-142 breast cancer cells [23]. **(b)** A three-type model captures e.g. the dynamics between stem-like, basal and luminal cell states in SUM149 and SUM159 breast cancer cells [21]. **(c)** A four-type model captures e.g. the dynamics between CD24^Low/ALDH^High, CD24^Low/ALDH^Low, CD24^High/ALDH^High and CD24^High/ALDH^Low cell states in GBC02, SCC029B and SCC070 oral cancer cells [32].

We allow $ν_{j k} = 0$ for some $j$ and $k$, which means that a type-$j$ cell is not able to switch directly to type-$k$. However, in our exposition, we assume that the model is irreducible, in that each cell type is accessible from any other cell type, possibly through intermediate types. In mathematical terms, this means that for each $j, k = 1, \dots, K$ with $k \neq j$, there exist an integer $r \geq 0$ and indices $m_{1}, \dots, m_{r} \in \{1, \dots, K\}$ so that $ν_{j m_{1}} ν_{m_{1} m_{2}} \dots ν_{m_{r} k} > 0$. Our estimation framework can also be applied to reducible switching models, as we discuss in Appendix B below.

For $j = 1, \dots, K$, we define $λ_{j} : = b_{j} - d_{j}$ as the net birth rate of a type-$j$ cell. We collect the growth parameters into $1 \times K$ vectors $b = (b_{1}, \dots, b_{K})$, $d = (d_{1}, \dots, d_{K})$ and $λ = (λ_{1}, \dots, λ_{K})$. We also define $λ^{[- j]} : = λ - λ_{j} 1$ as the vector of net birth rates relative to $λ_{j}$, with $λ_{k}^{[- j]} = λ_{k} - λ_{j}$ for $k \neq j$ and $λ_{j}^{[- j]} = 0$. We finally define the $K \times K$ matrix $A$ with $a_{j j} : = λ_{j} - \sum_{k \neq j} ν_{j k}$ for $j = 1, \dots, K$ and $a_{j k} : = ν_{j k}$ for $k \neq j$ as the infinitesimal generator of the model, where $a_{j k}$ is the net rate at which a cell of type $j$ produces a cell of type $k$.

3.1.2. Random processes and their moments

If the branching process is started by $N$ cells of type- $j$ , the state of the process at time $t \geq 0$ is encoded in the $1 \times K$ random vector of cell numbers $Z^{(j)} (t; N) = (Z_{1}^{(j)} (t; N), \dots, Z_{K}^{(j)} (t; N))$ . On the event $\{\sum_{k = 1}^{K} Z_{k}^{(j)} (t; N) \neq 0\}$ that the cell population is still alive at time $t$ , we let $Δ^{(j)} (t; N)$ denote the corresponding random vector of cell fractions, i.e.

Δ_{i}^{(j)} (t; N) : = Z_{i}^{(j)} (t; N) / (\sum_{k = 1}^{K} Z_{k}^{(j)} (t; N)), i = 1, \dots, K .

If the process is started by a single type- $j$ cell, we write $Z^{(j)} (t) : = Z^{(j)} (t; 1)$ , and we define the associated mean vector and covariance matrix by

\begin{array}{l} m^{(j)} (t) : = E [Z^{(j)} (t)], \\ Σ^{(j)} (t) : = E [{(Z^{(j)} (t) - m^{(j)} (t))}^{T} (Z^{(j)} (t) - m^{(j)} (t))], t \geq 0. \end{array}

(3)

We also define the $K \times K$ matrix $M (t)$ with row vectors $m^{(j)} (t)$ as the mean matrix for the process at time $t$ . It can be shown that $M (t)$ is given by the matrix exponential $\exp (t A) : = \sum_{k = 0}^{\infty} \frac{t^{k}}{k!} A^{k}$ [35]. Note that $A$ and $M (t)$ depend on the birth rates $b$ and the death rates $d$ only through the net birth rates $λ$ .
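As a minimal sketch, the generator $A$ and the mean matrix $M (t) = \exp (t A)$ can be computed as follows. The function names `generator` and `mean_matrix` are hypothetical, and the numerical values are chosen purely for illustration.

```python
import numpy as np
from scipy.linalg import expm

def generator(lam, nu):
    """Infinitesimal generator A with a_jk = nu_jk for k != j and
    a_jj = lam_j - sum_{k != j} nu_jk."""
    A = np.array(nu, dtype=float)
    np.fill_diagonal(A, 0.0)
    A[np.diag_indices_from(A)] = np.asarray(lam, dtype=float) - A.sum(axis=1)
    return A

def mean_matrix(lam, nu, t):
    """Mean matrix M(t) = exp(tA); row j is m^{(j)}(t) (0-based j)."""
    return expm(t * generator(lam, nu))

lam = np.array([0.3, -0.1])          # net birth rates (illustrative values)
nu = np.array([[0.0, 0.2],
               [0.5, 0.0]])          # switching rates (illustrative values)
A = generator(lam, nu)               # [[0.1, 0.2], [0.5, -0.6]]
M1 = mean_matrix(lam, nu, 1.0)
```

Only `lam` enters the computation, reflecting that $A$ and $M (t)$ depend on $b$ and $d$ only through the net birth rates.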

3.1.3. Long-run behavior

In the branching process model with irreducible switching dynamics, all subpopulations eventually grow at the same exponential rate $σ$ . This applies both to individual trajectories of the model (when the population does not go extinct) and its mean behavior. In mathematical terms, if the process is started by a single type- $j$ cell, there exists a real number $σ$ , positive $1 \times K$ vectors $β = (β_{1}, \dots, β_{K})$ and $γ = (γ_{1}, \dots, γ_{K})$ , and a nonnegative random variable $W$ with mean $E [W] = β_{j}$ , so that

\lim_{t \to \infty} e^{- σ t} Z^{(j)} (t) = W γ, almost surely,

(4)

and

\lim_{t \to \infty} e^{- σ t} m^{(j)} (t) = β_{j} γ .

(5)

See e.g. Sections V.7.1-V.7.4 and Theorem 2 in Section V.7.5 of [35]. In other words, the number of type- $k$ cells at time $t$ is approximately $W γ_{k} e^{σ t}$ almost surely when $t$ is large, and the mean number of type- $k$ cells is approximately $β_{j} γ_{k} e^{σ t}$ . It follows that if we define

{\bar{γ}}_{k} : = γ_{k} / (\sum_{m = 1}^{K} γ_{m}), k = 1, \dots, K,

(6)

then given that the population does not go extinct, ${\bar{γ}}_{k}$ is the long-run proportion of type- $k$ cells in the population, independently of the initial condition. Thus, in the long run, cell proportions tend towards an equilibrium distribution given by $\bar{γ}$ , which is consistent with the experimental observations discussed in the introduction.
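Since $M (t) = \exp (t A)$, the limit (5) implies that $σ$ is the eigenvalue of $A$ with the largest real part and that $γ$ is a corresponding left eigenvector, which is a standard fact for irreducible models of this kind. Under that reading, the equilibrium proportions can be sketched numerically as below; the function name `long_run` is an assumption.

```python
import numpy as np

def long_run(A):
    """Return (sigma, gamma_bar): the dominant eigenvalue of A and the
    left eigenvector normalized to sum to one (equilibrium proportions)."""
    A = np.asarray(A, dtype=float)
    eigvals, left_vecs = np.linalg.eig(A.T)  # columns: left eigenvectors of A
    idx = np.argmax(eigvals.real)
    sigma = eigvals[idx].real
    # The dominant eigenvector has entries of one sign; abs() fixes the sign.
    gamma = np.abs(left_vecs[:, idx].real)
    return sigma, gamma / gamma.sum()

A = np.array([[0.1, 0.2],
              [0.5, -0.6]])                  # illustrative generator
sigma, gamma_bar = long_run(A)
```

The vector `gamma_bar` can be compared with the normalized rows of $M (t)$ for large $t$ to see the convergence of cell proportions from any starting type.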

3.2. Experimental assumptions and notation for experimental data

In the development of our estimation framework, we assume that each experiment returns measurements from a single time point only, meaning that the experimental sample is discarded once measurements are taken (endpoint data). In this case, techniques such as flow cytometry or fluorescence-activated cell sorting (FACS) can be used to identify phenotypes at the experimental endpoints. Sometimes, the data collected is sequential, meaning that a single experiment returns measurements from multiple time points. This can for example be the case when phenotypes are tagged with fluorescent dyes and tracked over time using time-lapse microscopy (live-cell imaging) [36, 37]. In Section 4.2.4, we show that our endpoint-data statistical framework can also yield reasonable estimates for sequential data. In Appendix A.2, we discuss what would be required to rigorously extend the framework to sequential data.

In the main text, we assume that each experiment is started by an isolated subpopulation, and we let $N_{j}$ be the number of starting cells for the experiment started only by type-$j$ cells. We assume that $N_{j}$ is large, which is generally the case for the experiments discussed in the introduction (Section 1). Furthermore, let $0 < t_{1} < t_{2} < \dots < t_{L}$ with $L \geq 1$ denote the time points at which data is collected, and let $R \geq 1$ be the number of experimental replicates performed. The data collected in each experiment is either a vector $n_{j, ℓ, r} = (n_{j, ℓ, r, 1}, \dots, n_{j, ℓ, r, K})$ of cell numbers or a vector $f_{j, ℓ, r} = (f_{j, ℓ, r, 1}, \dots, f_{j, ℓ, r, K})$ of cell fractions. Here, $n_{j, ℓ, r, k}$ is the number of type-$k$ cells in the $r$-th replicate of the experiment started only by type-$j$ cells and ended at the $ℓ$-th time point, and $f_{j, ℓ, r, k}$ is the corresponding cell fraction.

3.3. Estimation for cell number data

Our statistical framework for cell number data is rooted in a central limit theorem for the vector $Z^{(j)} (t; N)$ of cell numbers at time $t$ . More precisely, by decomposing the branching process ${(Z^{(j)} (s; N))}_{s \geq 0}$ into i.i.d. processes started by single type- $j$ cells, we can show that as $N \to \infty$ ,

N^{- 1 / 2} (Z^{(j)} (t; N) - N m^{(j)} (t)) \overset{d}{\to} 𝓝 (0, Σ^{(j)} (t)) .

(7)

The details are provided in Appendix C.1, where we also show that the covariance matrix $Σ^{(j)} (t)$ is given by

\begin{matrix} Σ^{(j)} (t) = 2 \int_{0}^{t} {(M (t - τ))}^{T} diag (b ⊙ m^{(j)} (τ)) (M (t - τ)) d τ \\ + diag (m^{(j)} (t)) - {(m^{(j)} (t))}^{T} m^{(j)} (t) . \end{matrix}

(8)
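Expression (8) can be evaluated numerically by quadrature over $τ$. The following sketch uses Gauss-Legendre nodes; the function name `covariance`, the number of nodes and the choice of quadrature rule are assumptions made here, not part of the paper's implementation.

```python
import numpy as np
from scipy.linalg import expm

def covariance(j, t, b, A, n_nodes=40):
    """Sigma^{(j)}(t) from (8) for starting type j (0-based).

    b : length-K birth rates,  A : K x K infinitesimal generator.
    """
    b = np.asarray(b, dtype=float)
    A = np.asarray(A, dtype=float)
    # Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, t].
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    taus, weights = 0.5 * t * (x + 1.0), 0.5 * t * w
    integral = np.zeros(A.shape)
    for tau, wt in zip(taus, weights):
        M_rest = expm((t - tau) * A)      # M(t - tau)
        m_tau = expm(tau * A)[j]          # m^{(j)}(tau), row j of M(tau)
        integral += wt * (M_rest.T @ np.diag(b * m_tau) @ M_rest)
    m_t = expm(t * A)[j]
    return 2.0 * integral + np.diag(m_t) - np.outer(m_t, m_t)

# Check without switching: type 0 is a linear birth-death process with
# b = 1.0 and d = 0.4, whose variance is (b + d) / lam * e^{lam t} (e^{lam t} - 1).
A = np.diag([0.6, 0.5])
Sig = covariance(0, 1.0, [1.0, 0.7], A)
```

Unlike the mean, the covariance depends on the birth rates $b$ directly and not only through $λ$, which is what later makes $b$ identifiable from second-moment information.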

When the starting cell number $N$ is large, the central limit theorem (7) allows us to approximate the distribution of $Z^{(j)} (t; N)$ by a multivariate normal distribution as follows:

Z^{(j)} (t; N) \approx N m^{(j)} (t) + 𝓝 (0, N Σ^{(j)} (t)) .

(9)

Based on this approximation, we propose the following statistical model for the experimental data $n_{j, ℓ, r}$ :

n_{j, ℓ, r} \sim \underbrace{N_{j} m^{(j)} (t_{ℓ})}_{\text{mean behavior}} + \underbrace{𝓝 (0, N_{j} Σ^{(j)} (t_{ℓ}))}_{\text{variability in population dynamics}} + \underbrace{𝓝 (0, E_{j, ℓ}^{num})}_{\text{measurement error}} .

(10)

The first two terms capture the mean and variance of the branching process model dynamics, while the final term captures experimental measurement error, which is independent of the branching process. We assume that the $K \times K$ covariance matrix $E_{j, ℓ}^{num}$ associated with measurement error can be written as a function of the branching process model parameters and additional error parameters $ω_{num} = (ω_{1}, \dots, ω_{M_{num}})$ for some $M_{num} \geq 0$ . A simple example is $E_{j, ℓ}^{num} = ω^{2} I$ for some $ω > 0$ , where the measurement error is assumed to be of equal magnitude for all data points, and to be uncorrelated between cell types. Another simple example is $E_{j, ℓ}^{num} = ω^{2} {(diag (N_{j} m^{(j)} (t_{ℓ})))}^{2}$ , where the measurement error is assumed to scale with mean experimental outcomes.
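Because the two noise terms in (10) are independent Gaussians, each observation is multivariate normal with mean $N_{j} m^{(j)} (t_{ℓ})$ and covariance $N_{j} Σ^{(j)} (t_{ℓ}) + E_{j, ℓ}^{num}$. A minimal sketch of the resulting log-likelihood for a single experimental condition, assuming the first example error model $E_{j, ℓ}^{num} = ω^{2} I$, is given below. The function name and argument layout are hypothetical, and the mean vector and covariance matrix are passed in precomputed.

```python
import numpy as np
from scipy.stats import multivariate_normal

def loglik_numbers(obs, N, m, Sigma, omega):
    """Log-likelihood of replicate cell-number vectors under model (10)
    for one starting type j and one time point t_l.

    obs   : R x K array of replicate cell numbers n_{j,l,r}
    N     : number of starting cells N_j
    m     : mean vector m^{(j)}(t_l)
    Sigma : covariance matrix Sigma^{(j)}(t_l)
    omega : measurement error scale, with E^num = omega^2 I (assumed)
    """
    mean = N * np.asarray(m, dtype=float)
    cov = N * np.asarray(Sigma, dtype=float) + omega**2 * np.eye(len(mean))
    return float(np.sum(multivariate_normal.logpdf(np.atleast_2d(obs), mean=mean, cov=cov)))

m = [1.2, 0.3]
Sigma = [[0.5, 0.1],
         [0.1, 0.2]]
ll = loglik_numbers([[120.0, 30.0]], 100, m, Sigma, 2.0)
```

The full log-likelihood would sum such terms over all starting types and time points, with `m` and `Sigma` recomputed from the candidate parameters at each evaluation.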

To compute parameter estimates from the statistical model (10), we use a maximum likelihood approach, due to its simplicity and desirable large-sample properties like consistency and asymptotic efficiency [38]. More precisely, the statistical model (10) is used to derive a likelihood function, which is the probability of observing the experimental data as a function of the model parameters, and point estimates for the parameters are computed by maximizing the likelihood function. We also derive a likelihood-based confidence interval for each model parameter $θ$ , which is obtained by inverting the likelihood-ratio test for the given parameter, i.e. collecting all values $θ_{0}$ for which the null hypothesis $θ = θ_{0}$ is accepted under the likelihood-ratio test [39, 40, 41, 42, 43]. The confidence interval is determined by the profile log-likelihood for $θ$ , as is further discussed in Appendix A.2.
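The inversion of the likelihood-ratio test can be illustrated on a grid of candidate values. The sketch below assumes the usual asymptotic chi-square cutoff with one degree of freedom and approximates the maximized log-likelihood by the maximum over the grid; the helper name `lr_interval` and the toy Gaussian example are ours and are not part of the paper's framework.

```python
import numpy as np
from scipy.stats import chi2

def lr_interval(profile_loglik, grid, level=0.95):
    """Grid-based likelihood-ratio confidence interval.

    Keeps every theta0 in `grid` for which
    2 * (max profile log-likelihood - profile_loglik(theta0))
    does not exceed the chi-square quantile with 1 degree of freedom.
    """
    grid = np.asarray(grid, dtype=float)
    values = np.array([profile_loglik(th) for th in grid])
    keep = 2.0 * (values.max() - values) <= chi2.ppf(level, df=1)
    return grid[keep].min(), grid[keep].max()

# Toy example: mean of Gaussian data with known unit variance, where the
# interval should match sample mean +/- 1.96 / sqrt(n) up to the grid spacing.
x = np.array([0.8, 1.1, 1.3, 0.9, 0.9])
lo, hi = lr_interval(lambda th: -0.5 * np.sum((x - th) ** 2),
                     np.linspace(-1.0, 3.0, 4001))
```

In the multi-parameter setting, `profile_loglik` would maximize the likelihood over all remaining parameters for each fixed value of $θ$.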

3.4. Estimation for cell fraction data

For cell fraction data, we propose a similar maximum likelihood estimation framework, rooted in a central limit theorem for the vector $Δ^{(j)} (t; N)$ of cell fractions at time $t$ . To state the central limit theorem, we define the $1 \times K$ vector $p^{(j)} (t)$ and the $K \times K$ matrix $S^{(j)} (t)$ by

\begin{array}{l} p^{(j)} (t) : = {(m^{(j)} (t) 1^{T})}^{- 1} m^{(j)} (t), \\ S^{(j)} (t) : = {(m^{(j)} (t) 1^{T})}^{- 2} {(I - 1^{T} p^{(j)} (t))}^{T} Σ^{(j)} (t) (I - 1^{T} p^{(j)} (t)) . \end{array}

(11)

Using arguments of Yakovlev and Yanev [44], we can show that as $N \to \infty$ ,

N^{1 / 2} (Δ^{(j)} (t; N) - p^{(j)} (t)) \overset{d}{\to} 𝓝 (0, S^{(j)} (t)) .

(12)
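The quantities in (11) are simple matrix expressions in $m^{(j)} (t)$ and $Σ^{(j)} (t)$. A minimal sketch, with the hypothetical function name `fraction_moments` and illustrative inputs, is:

```python
import numpy as np

def fraction_moments(m, Sigma):
    """Return (p, S) from (11), given the mean vector m^{(j)}(t) and the
    covariance matrix Sigma^{(j)}(t) of the cell numbers."""
    m = np.asarray(m, dtype=float)
    K = len(m)
    total = m.sum()                            # m 1^T, the mean total size
    p = m / total
    J = np.eye(K) - np.outer(np.ones(K), p)    # I - 1^T p (every row of 1^T p is p)
    S = J.T @ np.asarray(Sigma, dtype=float) @ J / total**2
    return p, S

p, S = fraction_moments([2.0, 1.0], np.eye(2))
```

Because cell fractions sum to one, every row of $S^{(j)} (t)$ sums to zero, so the limiting normal distribution in (12) is degenerate along the direction of the all-ones vector.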

The details are provided in Appendix C.2, where we also show that the mean function $p^{(j)} (t)$ can be written solely as a function of the switching rates ${(ν_{i k})}_{k \neq i}$ and the relative net birth rates $λ^{[- 1]}$ . The choice of type-1 as a reference phenotype is arbitrary. Based on the central limit theorem (12), we propose the following statistical model for the experimental data $f_{j, ℓ, r}$ :

f_{j, ℓ, r} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, N_{j}^{- 1} S^{(j)} (t_{ℓ})) + 𝓝 (0, E_{j, ℓ}^{frac}) .

(13)

As for cell number data, we assume that the $K \times K$ covariance matrix $E_{j, ℓ}^{frac}$ associated with measurement error can be written as a function of the branching process model parameters and additional error parameters $ω_{frac} = (ω_{1}, \dots, ω_{M_{frac}})$ for some $M_{frac} \geq 0$ .

Note that in the statistical model (13), the variability term $N_{j}^{- 1} S^{(j)} (t_{ℓ})$ decreases with the initial population size $N_{j}$ . Thus, if a large $N_{j}$ is coupled with a large measurement error, the third term in (13) will dominate the second term. When applying the framework to real cell fraction datasets, this can potentially allow us to simplify the model in (13) so that it only includes the first and third term:

f_{j, ℓ, r} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, E_{j, ℓ}^{frac}) .

(14)

We discuss this point further in Section 4.3 and the discussion section (Section 5).

As for cell number data, from the statistical model (13) (and the simpler version (14)), it is straightforward to derive a likelihood function, maximum likelihood estimates and likelihood-based confidence intervals, as is discussed in more detail in Appendix A.3.

4. Results

4.1. Structural identifiability analysis

We begin by analyzing the structural identifiability of the statistical models (10) and (13). Informally, structural identifiability refers to whether a parameter can be estimated accurately given an infinite amount of noise-free data. More precisely, a parameter is structurally identifiable if complete knowledge of the model distribution uniquely determines the value of the parameter, in the absence of any measurement noise [45, 46].

To demonstrate the structural identifiability of a parameter, it is sufficient to show that knowledge of the statistical moments of the model distribution implies knowledge of the parameter. By considering the moments, we can adopt techniques from systems biology used for the analysis of deterministic models based on ordinary differential equations [47]. In particular, we will assume that we know the behavior of the mean functions $m^{(j)} (t)$ and $p^{(j)} (t)$ and the covariance functions $Σ^{(j)} (t)$ and $S^{(j)} (t)$ close to time 0 (more precisely, their derivatives at 0), and we will analyze to what extent the model parameters can be extracted from this information. In other words, we are interested in the following question: If we conduct experiments started from isolated subpopulations, and perfect observations are made of the first two statistical moments of the model close to time 0, can we identify the model parameters?

This analysis serves two purposes. First, it ascertains whether in this idealized setting, the model parameters can be extracted uniquely from short-term observations of the population dynamics. Second, the analysis indicates how much information is required to estimate each model parameter accurately, which yields valuable insights into how comparatively difficult it is to estimate the parameters from more limited data.

4.1.1. Cell number data

In the following proposition, we show that for cell number data, the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rates $λ$ can be recovered uniquely from knowledge of the mean functions $m^{(j)} (t)$ close to time 0, while the birth rates $b$ can be recovered from the covariance matrices $Σ^{(j)} (t)$ .

Proposition 1.

For each $j = 1, \dots, K$ , the switching rates $ν_{j k}$ , $k \neq j$ , and the net birth rate $λ_{j}$ are uniquely determined by ${\frac{d}{d t} m^{(j)} (t)|}_{t = 0}$ .
For each $j = 1, \dots, K$ , if the switching rates $ν_{j k}$ , $k \neq j$ , and the net birth rate $λ_{j}$ are known, the birth rate $b_{j}$ is uniquely determined by ${({\frac{d}{d t} Σ^{(j)} (t)|}_{t = 0})}_{j j}$ .

Proof. Appendix D. □

Proposition 1 establishes the structural identifiability of all model parameters for cell number data. The process of extracting the parameters as suggested by Proposition 1 can be thought of as follows: If we want to know $ν_{j k}$ for some $k \neq j$ , we can simply plot the mean function $M_{j k} (t) = E [Z_{k}^{(j)} (t)]$ and compute its slope at 0. If we want to know the birth rate $b_{j}$ , we can plot the variance function ${(Σ^{(j)} (t))}_{j j} = Var [Z_{j}^{(j)} (t)]$ and compute its slope at 0.

It is important to note that we are not suggesting to use this approach to estimate parameters from real data. Instead, we are establishing theoretically that there is sufficient information in the distribution of the data close to time 0 to determine all model parameters uniquely. In particular, we can in theory predict the entire evolutionary trajectory of the population from short-term observations of the initial population dynamics.
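Purely to illustrate the statement of Proposition 1, and not as an estimation procedure, the sketch below recovers the switching rates and net birth rates from a finite-difference slope of the mean matrix at time 0. It uses the facts that the derivative of $M (t)$ at 0 equals $A$ and that the $j$-th row of $A$ sums to $λ_{j}$. The formula in `birth_rate_from_variance_slope` is our own differentiation of (8) at $t = 0$ and should be read as an assumption rather than a quotation of Appendix D; all function names are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

def rates_from_mean_slopes(M_of_t, h=1e-6):
    """Recover (nu, lam) from the slope of the mean matrix at time 0.

    M_of_t : callable returning the K x K mean matrix M(t)
    """
    K = M_of_t(0.0).shape[0]
    slope = (M_of_t(h) - np.eye(K)) / h    # approximates the generator A
    nu = slope.copy()
    np.fill_diagonal(nu, 0.0)              # off-diagonal entries are nu_jk
    lam = slope.sum(axis=1)                # row sums of A are the lam_j
    return nu, lam

def birth_rate_from_variance_slope(var_slope, lam_j, nu_row_sum):
    """b_j from the slope of Var[Z_j^{(j)}(t)] at 0, using
    d/dt Sigma_jj(0) = 2 b_j - a_jj (our derivation from (8))."""
    return 0.5 * (var_slope + lam_j - nu_row_sum)

A = np.array([[0.1, 0.2],
              [0.5, -0.6]])                # illustrative generator
nu_hat, lam_hat = rates_from_mean_slopes(lambda t: expm(t * A))
```

For a single type without switching, the variance slope at 0 is $b + d$, and the second helper then returns $(b + d + λ) / 2 = b$, as expected.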

4.1.2. Cell fraction data

In the following proposition, we show that for cell fraction data, only the switching rates ${(ν_{i k})}_{k \neq i}$ can be recovered from the slopes of the mean functions $p^{(j)} (t)$ at time 0. The net birth rate differences $λ^{[- 1]}$ can be recovered from the curvatures of the mean functions at time 0 or from the equilibrium proportions $\bar{γ}$ between cell types if they are known. We are not able to learn any more parameters from the mean functions, since $p^{(j)} (t)$ can be written solely as a function of ${(ν_{i k})}_{k \neq i}$ and $λ^{[- 1]}$ . The slopes of the covariance functions $S^{(j)} (t)$ depend only on ${(ν_{i k})}_{k \neq i}$ , meaning that they provide no extra information on the model parameters.

Proposition 2.

For $j = 1, \dots, K$ , the switching rates $ν_{j k}$ , $k \neq j$ , are uniquely determined by ${\frac{d}{d t} p^{(j)} (t)|}_{t = 0}$ .
If the switching rates ${(ν_{i k})}_{k \neq i}$ are known, the net birth rate differences $λ^{[- 1]}$ are uniquely determined by (i) ${\frac{d^{2}}{d t^{2}} p^{(j)} (t)|}_{t = 0}$ for $j = 1, \dots, K$ or (ii) the equilibrium proportions $\bar{γ}$ .
For $j = 1, \dots, K$ , ${\frac{d}{d t} S^{(j)} (t)|}_{t = 0}$ depends only on the switching rates $ν_{j k}$ for $k \neq j$ .

Proof. Appendix E. □
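A short calculation shows why the slopes of the mean fractions carry no information on the net birth rates. It is sketched here under the assumption that the generator has off-diagonal entries $A_{j k} = ν_{j k}$ and diagonal entries $A_{j j} = λ_{j} - \sum_{k \neq j} ν_{j k}$ , so that $e_{j} A 1^{T} = λ_{j}$ . Differentiating $p^{(j)} (t) = {(m^{(j)} (t) 1^{T})}^{- 1} m^{(j)} (t)$ and using $m^{(j)} (0) = e_{j}$ gives

$${\frac{d}{d t} p^{(j)} (t)\Big|}_{t = 0} = e_{j} A - (e_{j} A 1^{T}) e_{j} = e_{j} A - λ_{j} e_{j}.$$

The $k$ -th entry of this vector is $ν_{j k}$ for $k \neq j$ , and the $j$ -th entry is $- \sum_{k \neq j} ν_{j k}$ . The net birth rate $λ_{j}$ cancels, so only the switching rates appear in the slopes.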

As for the remaining model parameters, the net birth rate $λ_{1}$ and the birth rates $b$ , they require information on the curvatures of the covariance functions $S^{(j)} (t)$ at time 0 at the least. We will not analyze the structural identifiability of these parameters further. Proposition 2 indicates that one should not expect to be able to estimate these parameters accurately from cell fraction data, which is confirmed by numerical experiments in Section 4.2.2.
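The statement that the mean fractions depend on the net birth rates only through their differences can be checked numerically. The sketch below assumes the two-type generator $A = [[λ_{1} - ν_{12}, ν_{12}], [ν_{21}, λ_{2} - ν_{21}]]$ and uses hypothetical parameter values; adding the same constant to $λ_{1}$ and $λ_{2}$ multiplies every entry of $exp (t A)$ by a common factor, which cancels when the row is normalised.

```python
def expm2(A, t, terms=40):
    """exp(t*A) for a 2x2 matrix via a truncated Taylor series."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, terms):
        T = [[sum(T[i][q] * A[q][k] for q in range(2)) * t / m
              for k in range(2)] for i in range(2)]
        R = [[R[i][k] + T[i][k] for k in range(2)] for i in range(2)]
    return R


def mean_fractions(lam1, lam2, nu12, nu21, t, j):
    """p^(j)(t): row j of exp(t*A), normalised to sum to one (j is 0 or 1)."""
    A = [[lam1 - nu12, nu12], [nu21, lam2 - nu21]]
    row = expm2(A, t)[j]
    total = sum(row)
    return [x / total for x in row]


# Two hypothetical regimes with the same difference lambda_2 - lambda_1 = 0.2
# but different values of lambda_1.
p_a = mean_fractions(0.1, 0.3, 0.05, 0.02, t=3.0, j=0)
p_b = mean_fractions(0.6, 0.8, 0.05, 0.02, t=3.0, j=0)
print(p_a, p_b)  # the two fraction vectors agree up to rounding error
```

Because the two regimes produce the same mean fractions at every time, no amount of data on the mean fractions alone can separate them, which is why $λ_{1}$ must be inferred from higher-order information.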

4.1.3. Comparison

The results of our identifiability analysis are summarized in Table 2. Our analysis indicates that the switching rates ${(ν_{i k})}_{k \neq i}$ and net birth rates $λ$ are easy to estimate for cell number data, using information only on the mean behavior of the population. The birth rates $b$ are harder to estimate, since they require second moment information, but they may still be obtainable with sufficient data, as we discuss further in Section 4.2.3. For cell fraction data, the switching rates ${(ν_{i k})}_{k \neq i}$ are easy to estimate using the mean behavior of the population. The net birth rate differences $λ^{[- 1]}$ can also be estimated from the mean, but they require more information. The remaining model parameters are unlikely to be obtainable from real datasets.

Table 2:

Summary of the structural identifiability analysis of Propositions 1 and 2. For cell number data, the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rates $λ$ are identifiable from the slopes (first derivatives) of the mean functions $m^{(j)} (t)$ (first moments) at time 0. The birth rates $b$ are identifiable from the slopes of the covariance functions $Σ^{(j)} (t)$ (second moments). For cell fraction data, only the switching rates ${(ν_{i k})}_{k \neq i}$ are identifiable from the slopes of the mean functions $p^{(j)} (t)$ , while the net birth rate differences $λ^{[- 1]}$ can be determined from their curvatures (second derivatives). In contrast to cell number data, the slopes of the covariance functions $S^{(j)} (t)$ for cell fraction data provide no extra information on the model parameters.

Moment	Derivative	Cell number data	Cell fraction data
1	1	$λ$ , ${(ν_{i k})}_{k \neq i}$	${(ν_{i k})}_{k \neq i}$
1	2	–	$λ^{[- 1]}$
2	1	$b$	${(ν_{i k})}_{k \neq i}$


4.2. Numerical experiments

Next, we apply our maximum likelihood framework to computer-generated data. In all cases, we assume that experiments are conducted from isolated initial conditions, and we assume no measurement noise, i.e. $E_{j, ℓ}^{num} = 0$ and $E_{j, ℓ}^{frac} = 0$ . For simplicity, we only consider a model with two cell types, $K = 2$ . Our goal is to assess how comparatively difficult it is to estimate the different model parameters depending on what data is collected.

4.2.1. Implementation in MATLAB

Our estimation framework has been implemented in MATLAB codes which are available at https://github.com/egunnars/phenotypic_switching_inference/. The framework returns (i) a maximum likelihood estimate and (ii) a likelihood-based confidence interval for each parameter, using the sequential quadratic programming (sqp) solver in MATLAB. Before solving the maximum likelihood problem, we compute initial parameter estimates from a simpler model, which we use to initialize the optimization and to rescale the model parameters so that they are of similar magnitude. In most cases, we have found it sufficient to solve the maximum likelihood problem once, starting from the simple estimates. However, our MATLAB codes provide the option to solve the problem several times using different initial guesses. Details of the implementation are provided in Appendix F.

4.2.2. Estimation across a wide range of biologically realistic regimes

In Appendix G.1, we provide a simple illustration of the output of our estimation framework for a single artificial dataset. For a more thorough evaluation of estimation accuracy, we generated 10,000 artificial datasets for $K = 2$ cell types. We first generated 100 biologically realistic parameter regimes and then generated 100 datasets for each regime. To generate the parameter regimes, we sampled birth and death rates uniformly between 0 and 1, and sampled switching rates log-uniformly between 10⁻³ and 10⁻¹. We considered both regimes where the two phenotypes have positive net birth rates $(λ_{1}, λ_{2} > 0)$ and regimes where one phenotype has a negative net birth rate $(λ_{1} < 0, λ_{2} > 0)$ . The latter regimes are relevant to the dynamics of anti-cancer treatment response, where one phenotype is drug-sensitive and the other is drug-tolerant. We assumed $I = 2$ isolated initial conditions, $L = 6$ time points and $R = 3$ replicates. Further details of the data generation are provided in Appendix H.
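The sampling of parameter regimes described above can be sketched as follows. This is an illustration only: the authors' actual procedure is given in Appendix H, and the use of rejection sampling to enforce the sign constraints on the net birth rates, the function name and the dictionary layout are all assumptions made for this example.

```python
import random


def sample_regime(rng, drug_sensitive_type1=False):
    """Draw one hypothetical two-type parameter regime.

    Birth and death rates are uniform on [0, 1]; switching rates are
    log-uniform on [1e-3, 1e-1]. Draws are rejected until the net birth
    rates have the requested signs (an assumed way to enforce them).
    """
    while True:
        b = [rng.uniform(0.0, 1.0) for _ in range(2)]
        d = [rng.uniform(0.0, 1.0) for _ in range(2)]
        lam = [b[k] - d[k] for k in range(2)]
        if drug_sensitive_type1:
            accepted = lam[0] < 0 < lam[1]
        else:
            accepted = lam[0] > 0 and lam[1] > 0
        if accepted:
            break
    # Log-uniform draw: sample the base-10 exponent uniformly, then exponentiate.
    nu12, nu21 = (10 ** rng.uniform(-3.0, -1.0) for _ in range(2))
    return {"b": b, "d": d, "lam": lam, "nu12": nu12, "nu21": nu21}


rng = random.Random(0)
regimes = [sample_regime(rng) for _ in range(50)]
regimes += [sample_regime(rng, drug_sensitive_type1=True) for _ in range(50)]
```

Sampling the exponent uniformly spreads the switching rates evenly across the two orders of magnitude, rather than concentrating them near the upper end as a uniform draw on the original scale would.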

For each dataset, we used our framework to compute maximum likelihood estimates (MLEs) for all model parameters. In this way, we obtained 100 estimates of each parameter under each parameter regime, which we used to compute the coefficient of variation (CV) for the maximum likelihood estimator of the parameter. The CV is the sample standard deviation of the estimates as a proportion of their sample mean, and it measures the percentage error in the estimation.
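The CV computation is simple enough to state in a few lines. The sketch below uses Python's standard library; taking the absolute value of the mean is an assumption made here so that the CV stays positive for parameters such as a negative net birth rate, and the sample values are hypothetical.

```python
import statistics


def coefficient_of_variation(estimates):
    """Sample standard deviation of the estimates divided by |sample mean|."""
    mean = statistics.mean(estimates)
    return statistics.stdev(estimates) / abs(mean)


# Hypothetical estimates of one parameter across simulated datasets.
estimates = [0.048, 0.052, 0.050, 0.047, 0.055]
cv = coefficient_of_variation(estimates)
accurate = cv < 0.25  # compare against a 25% threshold
```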

The results are shown in Figure 3. A horizontal line is drawn at 25% CV to indicate whether parameters can be estimated with reasonable accuracy. Note that for the switching rates ${(ν_{i k})}_{k \neq i}$ , the median CV for cell fraction data is about twice as large as for cell number data. The median CV for the net birth rate difference $λ_{2} - λ_{1}$ is an order of magnitude larger for cell fraction data than cell number data, and it is two orders of magnitude larger for the net birth rate $λ_{1}$ . The birth rates $b$ can in many cases be estimated reasonably well for cell number data, whereas they are never estimated accurately for cell fraction data. These results are very much in line with our identifiability analysis in Section 4.1.

Note that for cell fraction data, the estimation error for the net birth rate difference $λ_{2} - λ_{1}$ exceeds the 25% threshold CV for several parameter regimes. This occurs when $λ_{2} - λ_{1}$ is small in magnitude, more precisely when it is smaller than 0.1 in regimes where the birth rates lie between 0.1 and 1. Note in contrast that for cell number data, the estimation error for $λ_{2} - λ_{1}$ never exceeds the 25% threshold. This indicates that for cell fraction data, it may be difficult to distinguish the net birth rate difference $λ_{2} - λ_{1}$ from 0 unless it is relatively pronounced. We discuss this point further in Section 4.3 below.

In Appendix G.2, we show how our framework can be used to investigate questions related to experimental design. In particular, we consider the question of whether experimental efforts should be prioritized to collect data from more time points (either in between or after the previous time points) or to perform more experimental replicates.

4.2.3. Improving identifiability of the rates of cell division and cell death

For cell number data, even though the birth rates $b$ can be estimated reasonably well in many cases by Section 4.2.2, they are estimated much less accurately than the net birth rates $λ$ and the switching rates ${(ν_{i k})}_{k \neq i}$ . In Figure 4a, we show that as the number of replicates is increased from 3 to 20 or above, the accuracy in the estimation becomes more acceptable. However, even with 100 replicates, the birth rates $b$ are estimated less accurately than the net birth rates $λ$ with 3 replicates (see Figure 3).

Figure 4: Two ways of improving the estimation accuracy for the birth rates $b$ when cell number data is used. In **(a)**, we show how the estimation accuracy for the birth rate $b_{1}$ improves as the number of experimental replicates is increased. In **(b)**, we compare the estimation accuracy for the birth rate $b_{1}$ and the net birth rate $λ_{1}$ depending on whether data on the number of dead cells at each time point is included in the estimation or not.

As we mentioned in the introduction, data on the number of cells in each state at each time point can be obtained by measuring the fraction of cells in each state and the total number of cells at each time point. In addition, it is often possible to measure the number of dead cells at each time point, see e.g. [30]. If this data is obtained, we can augment our mathematical model by introducing a new cell state, which cells transition into upon death (Figure 5). In Figure 4b, we show that if we apply our estimation framework to this model, the birth rates $b$ become as easy to estimate as the net birth rates $λ$ . Thus, if data is collected on the number of live and dead cells at each time point, it becomes possible to estimate all model parameters accurately using our framework.

Figure 5: Augmentation of the mathematical model for when data is available on the number of dead cells at each time point. In that case, instead of cells being lost from the model upon dying (left panel), they transition into a new state (right panel).

It should be noted that data collection on the number of dead cells is confounded by the fact that dead cells are eventually cleared from the system. This can potentially be addressed by introducing a clearance rate for dead cells in the augmented model, i.e. by introducing a death rate for the type-3 cells in the right panel of Figure 5.
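One way to write the augmented model for $K = 2$ is sketched below. It is offered as an illustration rather than as the authors' exact formulation: type 3 denotes dead cells, $c \geq 0$ is a hypothetical clearance rate (with $c = 0$ if dead cells persist), and the generator is assumed to have off-diagonal entries equal to the per-cell rates at which one type produces another. Under these assumptions, the mean dynamics are governed by

$$A^{aug} = \begin{pmatrix} λ_{1} - ν_{12} & ν_{12} & d_{1} \\ ν_{21} & λ_{2} - ν_{21} & d_{2} \\ 0 & 0 & - c \end{pmatrix}.$$

The slope at time 0 of the mean number of dead cells, in an experiment started by a single type- $j$ cell, is then $d_{j}$ . The death rates, and hence the birth rates $b_{j} = λ_{j} + d_{j}$ , therefore enter the first moments of the data rather than only the second moments, which is consistent with the improvement seen in Figure 4b.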

4.2.4. Estimation using endpoint data vs. sequential data

We conclude by examining how well our estimation framework applies to sequential data, when data is collected at multiple time points in the same experiment (Section 3.2). In Figure 6, we see that for cell number data, the CV for each parameter approximately doubles when applying our framework to sequential data vs. endpoint data. However, it remains true that the switching rates ${(ν_{i k})}_{k \neq i}$ and net birth rates $λ$ can be estimated with good accuracy. For cell fraction data, the difference in the estimation error for ${(ν_{i k})}_{k \neq i}$ and $λ_{2} - λ_{1}$ is even smaller. Together, these results indicate that our framework can yield reasonable estimates for sequential data. At the same time, for cell number data in particular, there can be a significant benefit to developing a method tailored to sequential data, both in terms of deriving reliable point estimates and robust confidence intervals.
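The distinction between the two data types can be made concrete with a small simulation. The sketch below is a standard Gillespie simulation of a two-type birth-death-switching process, written for illustration and not taken from the authors' code; the function name and all parameter values are hypothetical. Sequential data corresponds to one trajectory recorded at several time points, while endpoint data corresponds to an independent trajectory for each time point.

```python
import random

# Change in (n1, n2) for each event, in the order:
# type-1 division, death, switch to type 2; type-2 division, death, switch to type 1.
DELTAS = [(1, 0), (-1, 0), (-1, 1), (0, 1), (0, -1), (1, -1)]


def simulate(n0, b, d, nu, times, rng):
    """Simulate cell counts and record them at the increasing observation times.

    n0: starting counts [n1, n2]; b, d: per-cell division and death rates;
    nu: [nu12, nu21]; rng: a random.Random instance.
    """
    n = list(n0)
    t = 0.0
    observed = []
    for t_obs in times:
        while True:
            rates = [b[0] * n[0], d[0] * n[0], nu[0] * n[0],
                     b[1] * n[1], d[1] * n[1], nu[1] * n[1]]
            total = sum(rates)
            if total <= 0.0:
                break  # no cells left, or all rates are zero
            t_next = t + rng.expovariate(total)
            if t_next > t_obs:
                break  # by memorylessness, the overshooting wait can be discarded
            t = t_next
            step1, step2 = rng.choices(DELTAS, weights=rates)[0]
            n[0] += step1
            n[1] += step2
        t = t_obs
        observed.append((n[0], n[1]))
    return observed


rng = random.Random(1)
b, d, nu = [0.5, 0.4], [0.2, 0.3], [0.05, 0.02]
times = [1.0, 2.0, 3.0]

# Sequential data: the same population is observed at every time point.
sequential = simulate([100, 0], b, d, nu, times, rng)
# Endpoint data: a fresh, independent experiment for each time point.
endpoint = [simulate([100, 0], b, d, nu, [t], rng)[0] for t in times]
```

In the sequential case, observations at successive time points are correlated because they share a history, whereas the endpoint observations are independent, which is the setting the likelihood is derived for.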

Figure 6: Comparison of estimation error depending on whether our framework is applied to endpoint data or sequential data. The blue dots show the estimation error when endpoint data is used, i.e. when experiments from different time points are independent, and the red dots show the error when sequential data is used, i.e. when data is collected at multiple time points in the same experiment. Panel **(a)** shows the comparison for cell number data and panel **(b)** for cell fraction data. Even though our framework is derived for endpoint data, it provides reasonable estimation accuracy for sequential data.

4.3. Application: Transition between stem and non-stem cell states in SW620 colon cancer

To give an example of how our estimation framework can be used to analyze real experimental data, we conclude by applying it to a publicly available cell fraction dataset. We use data collected by Yang et al. [17] and made available in Tables S2 and S3 of Wang et al. [20], on the dynamics between stem-like (type-1) and non-stem (type-2) cells in SW620 colon cancer. In Yang et al. [17], the two cell types were sorted based on expression of the CD133 cell-surface antigen marker. Isolated subpopulations were expanded and phenotypic proportions were tracked for 24 days, with data collected every other day. This dataset has previously been analyzed using the CellTrans estimation method [26] (Section 2).

Since data on individual experimental replicates is not available, we use data on the mean cell fraction across replicates as input to our estimation framework. We first consider the statistical model (13) and the simpler version (14) with $E_{j, ℓ}^{frac} = ω^{2} I$ for all $j$ , $ℓ$ , which we refer to as Models I and Ia, respectively:

Model I: $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, N_{j}^{- 1} S^{(j)} (t_{ℓ})) + 𝓝 (0, ω^{2} I)$ .
Model Ia: $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, ω^{2} I)$ .

Note that Model I has seven parameters ( $d_{1}$ , $d_{2}$ , $λ_{1}$ , $λ_{2} - λ_{1}$ , $ν_{12}$ , $ν_{21}$ , $ω$ ), while Model Ia has four parameters ( $λ_{2} - λ_{1}$ , $ν_{12}$ , $ν_{21}$ , $ω$ ). In Table 3, we show parameter estimates and 95% confidence intervals for the two models, which turn out to be very similar. By the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), which assess the quality of model fit relative to model complexity, the simpler Model Ia is preferred for this dataset (Appendix I). The codes used to compute the estimates in Table 3 are available at https://github.com/egunnars/phenotypic_switching_inference/.

Table 3:

Comparison of model fit quality, point estimates and confidence intervals for the statistical models $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, N_{j}^{- 1} S^{(j)} (t_{ℓ})) + 𝓝 (0, ω^{2} I)$ (Model I) and $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, ω^{2} I)$ (Model Ia) applied to publicly available cell fraction data from Yang et al. [17].

Model	AIC	BIC	$ν_{21}$	$ν_{12}$	$λ_{2} - λ_{1}$
I	−113.4	−105.2	0.154 CI: [0.111, 0.212]	0.057 CI: [0.036, 0.087]	0.080 CI: [−0.058, 0.219]
Ia	−119.4	−114.7	0.157 CI: [0.115, 0.213]	0.057 CI: [0.037, 0.088]	0.084 CI: [−0.054, 0.218]
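For readers unfamiliar with the two criteria, the standard definitions are AIC $= 2 k - 2 \ln \hat{L}$ and BIC $= k \ln n - 2 \ln \hat{L}$ , where $k$ is the number of parameters, $n$ the number of observations and $\hat{L}$ the maximized likelihood. The sketch below evaluates them for Models I and Ia. The log-likelihood of 63.7 and the count of $n = 24$ observations are not reported in this section; they are back-calculated here from the entries of Table 3 and should be treated as assumptions.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n_obs) - 2 * log_lik


# Assumed inputs, inferred from Table 3 rather than taken from the paper.
LOG_LIK = 63.7
N_OBS = 24

for name, k in [("I", 7), ("Ia", 4)]:
    print(name, round(aic(LOG_LIK, k), 1), round(bic(LOG_LIK, k, N_OBS), 1))
```

With near-identical maximized likelihoods, the criteria differ between the two models almost entirely through the penalty on the three extra parameters of Model I, which is why the simpler Model Ia is preferred.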


The CIs under Model Ia show that while the point estimates for $ν_{21}$ and $ν_{12}$ are 0.157 and 0.057, respectively, the true value of $ν_{21}$ may range between 0.115 and 0.213, and the true value of $ν_{12}$ may range between 0.037 and 0.088. Since the two CIs do not overlap, $ν_{21} > ν_{12}$ at the 5% level of significance, but there is considerable uncertainty as to the true values. The CI for $λ_{2} - λ_{1}$ is even wider, which is in line with our earlier observations that this parameter is more difficult to estimate from cell fraction data than the switching rates, especially when $λ_{2} - λ_{1}$ is relatively small in magnitude (Sections 4.1.2 and 4.2.2). In fact, the CI for $λ_{2} - λ_{1}$ includes zero, meaning that it is plausible that $λ_{1} = λ_{2}$ .

In the CellTrans paper [26], it is assumed that the two phenotypes have the same growth rate, based on data from Wang et al. [20]. We can build this assumption into the estimation by solving the MLE problem for Models I/Ia under the constraint $λ_{2} - λ_{1} = 0$ (Appendix F). We refer to this as Models II/IIa:

Model II: $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, N_{j}^{- 1} S^{(j)} (t_{ℓ})) + 𝓝 (0, ω^{2} I), λ_{2} - λ_{1} = 0$ .
Model IIa: $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, ω^{2} I), λ_{2} - λ_{1} = 0$ .

Estimation results for Models II/IIa are shown in Table 4, and a visual comparison between the estimates for Models Ia and IIa is shown in Figure 7. The assumption $λ_{1} = λ_{2}$ has a noticeable effect on both the point estimates of $ν_{21}$ and $ν_{12}$ and their confidence intervals. For example, the ratio $ν_{21} / ν_{12}$ is 2.7 under Model Ia, while it is 1.9 under Model IIa. In other words, switching from type-2 to type-1 happens about three times as often as switching from type-1 to type-2 under Model Ia, while it happens about two times as often under Model IIa. Furthermore, under Model IIa, the length of the CI for $ν_{21}$ is reduced by a half compared to Model Ia, meaning that Model IIa significantly restricts the plausible values of $ν_{21}$ .

Table 4:

Comparison of model fit quality, point estimates and confidence intervals for the statistical models $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, N_{j}^{- 1} S^{(j)} (t_{ℓ})) + 𝓝 (0, ω^{2} I)$ , $λ_{2} - λ_{1} = 0$ (Model II) and $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, ω^{2} I)$ , $λ_{2} - λ_{1} = 0$ (Model IIa) applied to publicly available cell fraction data from Yang et al. [17].

Model	AIC	BIC	$ν_{21}$	$ν_{12}$
II	−114.0	−107.0	0.131 CI: [0.110, 0.161]	0.071 CI: [0.057, 0.089]
IIa	−119.8	−116.3	0.134 CI: [0.112, 0.162]	0.072 CI: [0.059, 0.090]


Figure 7: Visual comparison of point estimates and 95% confidence intervals for the statistical model $f_{j, ℓ} \sim p^{(j)} (t_{ℓ}) + 𝓝 (0, ω^{2} I)$ (Model Ia) and the same model with $λ_{2} - λ_{1} = 0$ (Model IIa) applied to publicly available cell fraction data from Yang et al. [17].

In the CellTrans paper [26], the same dataset is used to estimate switching probabilities of $p_{21} = 0.1030$ and $p_{12} = 0.0545$ , based on a discrete-time Markov model with a time step of $Δ t =$ one day. We also solved the TRANSCOMPP problem (2) (see Section 2) with $Δ t =$ one day to obtain the estimates $p_{21} = 0.136$ and $p_{12} = 0.054$ for the switching probabilities and $Λ_{22} / Λ_{11} = 1.079$ for the ratio between the growth factors of the two phenotypes, which translates to a growth rate difference of $r_{2} - r_{1} = 0.076$ if we set $Λ_{22} = e^{r_{2} Δ t}$ and $Λ_{11} = e^{r_{1} Δ t}$ . In the CellTrans and TRANSCOMPP models, type switches are synchronized between all cells in the population, and they occur at discrete time steps. In our continuous-time model, the time steps are infinitesimally small, and each cell has a certain probability of switching, proliferating and dying during each step, independently of other cells (Section 3.1.1). If we shorten the time step to $Δ t = 1 / 10$ day, the switching probabilities become 0.0111 and 0.0059 under CellTrans, which translates to continuous-time rates of ${\tilde{p}}_{21} = 0.111$ and ${\tilde{p}}_{12} = 0.059$ . These estimates fall at the lower limits of our CIs for $ν_{21}$ and $ν_{12}$ under Models II/IIa (Table 4). Under TRANSCOMPP, the switching probabilities become 0.0154 and 0.0057 for $Δ t = 1 / 10$ day, which translates to continuous-time rates of ${\tilde{p}}_{21} = 0.154$ and ${\tilde{p}}_{12} = 0.057$ , and the difference in growth rates becomes ${\tilde{r}}_{2} - {\tilde{r}}_{1} = 0.083$ . These estimates are very similar to the point estimates of Models I/Ia (Table 3).
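The conversions used in this comparison are elementary and can be written out as follows. This is a sketch with hypothetical function names: a per-step switching probability is divided by the step length to obtain an approximate rate, which is reasonable when the probability is small, and the growth factor ratio is converted using $Λ_{k k} = e^{r_{k} Δ t}$ .

```python
import math


def prob_to_rate(p, dt):
    """Approximate continuous-time rate from a per-step probability p over a step dt."""
    return p / dt


def growth_ratio_to_rate_difference(ratio, dt):
    """r_2 - r_1 from Lambda_22 / Lambda_11, using Lambda_kk = exp(r_k * dt)."""
    return math.log(ratio) / dt


# CellTrans probabilities quoted above for a time step of 1/10 day.
celltrans_rates = [prob_to_rate(p, 0.1) for p in (0.0111, 0.0059)]
# TRANSCOMPP growth factor ratio quoted above for a time step of one day.
rate_difference = growth_ratio_to_rate_difference(1.079, 1.0)
```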

The estimates of CellTrans and TRANSCOMPP are consistent with our estimates in that they fall within the 95% confidence intervals produced by our framework, if the time step is taken to be sufficiently small. Our framework complements these methods for cell fraction data by providing continuous-time estimates and enabling a rigorous analysis of the estimates and the uncertainty involved. For example, the CIs provided by our framework reveal how uncertain the value of $λ_{2} - λ_{1}$ is compared to $ν_{21}$ and $ν_{12}$ , and that $λ_{2} - λ_{1}$ cannot be distinguished from zero using this dataset. If assumptions such as $λ_{1} = λ_{2}$ or $ν_{12} = ν_{21}$ can be made, it is easy to incorporate them into the estimation and to assess their effect on point estimates and confidence intervals (Appendix F). In this case, our analysis shows that the assumption $λ_{1} = λ_{2}$ significantly restricts the plausible values of $ν_{21}$ and $ν_{12}$ , which may underestimate the true uncertainty in the estimation, given that the claim $λ_{1} = λ_{2}$ is subject to statistical error. We discuss the differences between our approach and these two methods, and the importance of quantifying the uncertainty in the estimation, in more detail in the following section.

5. Discussion

In this work, we have proposed a maximum likelihood framework for estimating the rates of cell proliferation and phenotypic switching in cancer. In contrast to previous approaches, the framework explicitly models the stochastic dynamics of cell division, cell death and phenotypic switching, it provides likelihood-based confidence intervals for the model parameters, and it enables estimation from data on the fraction of cells or the number of cells in each state at each time point. An implementation of the framework in MATLAB with sample scripts is available at https://github.com/egunnars/phenotypic_switching_inference/.

We have also used our framework to analyze the identifiability of model parameters. Through a combination of theoretical and numerical investigation and application to real data, we have seen that when cell fraction data is used, the switching rates ${(ν_{i k})}_{k \neq i}$ may be the only parameters that can be estimated accurately, while the net birth rate differences $λ^{[- 1]}$ can also be estimated reasonably accurately if they are sufficiently large. Including information on the total size of the population at each time point yields significantly better estimates of $λ^{[- 1]}$ , and it also enables accurate estimation of the net birth rates $λ$ . Finally, if enough experimental replicates are performed, or if data is collected on the number of dead cells at each time point, it even becomes possible to estimate the birth rates $b$ and death rates $d$ accurately.

In a previous work, we discussed how knowledge of the model parameters ${(ν_{i k})}_{k \neq i}$ , $λ$ , $b$ can enhance our understanding of resistance evolution in cancer and inform the design of combination treatments of anti-cancer agents and epigenetic drugs [25]. Together, these parameters shape the evolution of phenotypic proportions and the total tumor burden over time, each of which is relevant to the dynamics of tumor recurrence. Our current work shows that it is not possible to estimate the net birth rates $λ$ or the birth rates $b$ accurately from cell fraction data, it indicates what data is required to obtain these parameters, and it offers a rigorous approach to parameter estimation and uncertainty quantification once the data has been acquired. In the context of anti-cancer drug response, uncertainty quantification is crucial for assessing how treatment affects the model parameters and for evaluating the robustness of any treatment recommendations. For example, there is evidence that both chemotherapies and targeted agents can induce phenotypic switching from drug-sensitivity to drug-tolerance [22, 12, 48], where the level of induction determines the optimal dose under continuous drug treatment [49, 50, 51]. In this context, it is important to confirm that an estimated induction of drug-tolerance is statistically significant, and to assess how accurately the induction level can be estimated, before dose changes for established treatment protocols can be recommended.

In our application to a publicly available cell fraction dataset, we compared estimates from our framework to estimates produced by CellTrans [26] and TRANSCOMPP [27]. CellTrans is based on a discrete-time Markov chain model, and it provides estimates for the probabilities of switching between phenotypes during a single time step, for the case where all types grow at the same rate. TRANSCOMPP is based on a similar model, except it also provides estimates of the relative growth rates of the different phenotypes, and it includes a stochastic resampling method for estimating the distributions of transition probability estimates using single-cell measurements. Each method can only be applied to cell fraction data. For the dataset studied in Section 4.3, CellTrans and TRANSCOMPP produce estimates similar to our framework, when the time step is taken to be sufficiently small. We expect that this will usually be the case for datasets with few experimental replicates or a large measurement error, since our baseline statistical model (13) incorporates second moment information which is likely to be distorted in such datasets. However, we believe that even for these datasets, the continuous-time estimates provided by our framework better reflect the asynchronous nature of cell state switching, division and death, and they have the benefit of not being affected by an arbitrary choice of time step. More importantly, our framework provides likelihood-based confidence intervals for the parameters ${(ν_{i k})}_{k \neq i}$ and $λ^{[- 1]}$ , which is crucial to assess the quality of the estimation. Finally, our framework is unique in that it enables estimation from cell number data. It should be noted that for cell number data in particular, the appropriate measurement error model may vary between specific applications, as is discussed below.

There are several limitations of the estimation framework, which represent avenues for future development and improvement. First, our framework assumes that the cell population can be decomposed into discrete phenotypes, which can be identified using known biomarkers. Second, our multitype branching process model assumes that the lifetime of a cell is exponentially distributed, meaning that the rate at which a cell divides or dies is independent of its age. It is possible to model non-exponential lifetimes using our framework by assuming that each phenotype transitions through a number of internal states, each at an exponential rate, before dividing or dying. This will however increase the number of parameters in the model, which will require more data to obtain accurate estimates. Another approach would be to employ age-dependent branching processes, which would also add parameters to the model [35]. A third limitation of our framework is that it ignores any potential cell-to-cell interactions. Incorporating such interactions likely requires estimation methods tailored to specific applications, depending on the specific nature of the interactions.
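To illustrate the internal-state construction mentioned above, suppose, as an assumption made for this sketch, that a cell passes through $s$ internal stages, each lasting an independent exponential time with rate $s μ$ , before dividing or dying. The symbols $s$ , $μ$ and $T$ are introduced here and do not appear in the model of Section 3.1. The total lifetime $T$ is then Erlang distributed, with

$$E [T] = \frac{1}{μ}, \qquad Var [T] = \frac{1}{s μ^{2}}, \qquad \frac{\sqrt{Var [T]}}{E [T]} = \frac{1}{\sqrt{s}}.$$

Setting $s = 1$ recovers the exponential lifetime of the current model, while larger values of $s$ give lifetimes that are more concentrated around the mean $1 / μ$ , at the cost of introducing the number of stages as an additional quantity to be chosen or estimated for each phenotype.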

Fourth, the branching process model assumes that cells are allowed to grow uninterrupted for the duration of the experiments. This does not address the effect of passaging in longer-duration experiments. One potential way to address passaging is to keep track of cell state proportions and seeding densities for each passage, and to consider each passage as a new experiment with new initial conditions. In other words, instead of viewing a long experiment involving serial passaging as a single experiment with a single initial condition, it can be viewed as a collection of shorter experiments with different initial conditions. However, our framework currently assumes that initial conditions are known, while uncertainty is assigned to all subsequent time points. In reality, the initial conditions are subject to measurement error, and it may become important to model this error for the case of repeated passaging.

Fifth, our framework currently models measurement error as an additive Gaussian noise with a general covariance matrix. We have suggested simple ways of choosing the covariance matrix both for cell number and cell fraction data, but further exploration of appropriate choices is warranted. Ideally, the determination of an appropriate measurement error model should be driven by the particular dataset being analyzed [52]. Depending on the application, it may also become necessary to develop a more sophisticated error model than the additive Gaussian model. For example, for cell number data, if the measurement error is proportional to the population size, it may become necessary to model it as a multiplicative term rather than an additive term, or to build the experimental cell counting procedure more explicitly into the statistical model. We plan to address this in future work.

Sixth, we have focused on estimation from experiments started with isolated subpopulations of each phenotype, as this is a common experimental design, and we have analyzed parameter identifiability in this setting. Understanding to what extent the model parameters, or some combinations of the parameters, can be estimated from more limited data is an interesting avenue for future investigation. For example, if we are interested in estimating parameters from clinical data, the data will likely contain much less information than we have assumed here, and it will become necessary to analyze what parameters are identifiable and how identifiability can be improved, e.g. by combining data from similar patients.

Finally, we believe our framework can be useful for the design of cell line experiments aimed at deciphering the dynamics of phenotypic switching. For example, preliminary experiments can first be conducted, from which initial parameter estimates and confidence intervals can be derived. Based on the confidence intervals, one can construct a set of likely values for the parameters, which can be used to evaluate the expected improvement in estimation accuracy depending on the experimental design (see e.g. [53]). Once good experimental designs have been identified, one can evaluate whether the expected improvement in estimation accuracy justifies the additional experimental resources. If this is the case, additional experiments can be performed and the process can be repeated. In a future work, we plan to develop a tool for the optimal selection of experimental designs, to facilitate more efficient utilization of experimental resources.

Table 1:

Notation for the stochastic model of Section 3.1.

Symbol	Dimension	Description
$K$	1	Number of types
$b_{j}$	1	Division rate of type- $j$ cells
$d_{j}$	1	Death rate of type- $j$ cells
$ν_{j k}$	1	Rate of switching from type- $j$ to type- $k$
$λ_{j}$	1	Net birth rate of type- $j$ cells, $λ_{j} = b_{j} - d_{j}$
$λ_{k}^{[- j]}$	1	Net birth rate relative to $λ_{j}$ , $λ_{k}^{[- j]} = λ_{k} - λ_{j}$

$A$	$K \times K$	Infinitesimal generator of the branching process model
$M (t)$	$K \times K$	Mean matrix at time $t$ , $M (t) = exp (t A)$
$\bar{γ}$	$1 \times K$	Equilibrium proportions between cell types

$Z^{(j)} (t; N)$	$1 \times K$	Vector of cell numbers at time $t$ , started by $N$ type- $j$ cells
$Z^{(j)} (t)$	$1 \times K$	$Z^{(j)} (t) : = Z^{(j)} (t; 1)$
$m^{(j)} (t)$	$1 \times K$	$m^{(j)} (t) : = E [Z^{(j)} (t; 1)] = e_{j} M (t)$
$Σ^{(j)} (t)$	$K \times K$	$Σ^{(j)} (t) : = E [{(Z^{(j)} (t) - m^{(j)} (t))}^{T} (Z^{(j)} (t) - m^{(j)} (t))]$

$Δ^{(j)} (t; N)$	$1 \times K$	Vector of cell fractions at time $t$ , started by $N$ type- $j$ cells
$Δ^{(j)} (t)$	$1 \times K$	$Δ^{(j)} (t) : = Δ^{(j)} (t; 1)$
$p^{(j)} (t)$	$1 \times K$	$p^{(j)} (t) : = {(m^{(j)} (t) 1^{T})}^{- 1} m^{(j)} (t)$
$Q^{(j)} (t)$	$K \times K$	$Q^{(j)} (t) : = I - 1^{T} p^{(j)} (t)$
$S^{(j)} (t)$	$K \times K$	$S^{(j)} (t) : = {(m^{(j)} (t) 1^{T})}^{- 2} {(Q^{(j)} (t))}^{T} Σ^{(j)} (t) Q^{(j)} (t)$


A framework to estimate parameters of phenotypic switching is developed.
Identifiability of parameters depending on what data is collected is investigated.
Cell fraction data may only enable accurate estimation of switching rates.
Cell number data enables accurate estimation of growth rates.

Acknowledgments.

EBG and JF were supported in part by NIH grant R01CA241137. EBG and KL were supported in part by NSF grant CMMI-1552764. JF was supported in part by NSF grants DMS-1349724 and DMS-2052465. JF and KL were supported in part by the Research Council of Norway R&D Grant 309273. EBG was supported in part by the Norwegian Centennial Chair grant and the Doctoral Dissertation Fellowship from the University of Minnesota.

A. Estimation framework

In Sections 3.3 and 3.4 of the main text, we described our estimation framework for the simple case where all experiments are started from isolated subpopulations. We also omitted the details regarding the computation of point estimates and confidence intervals. In this section, we develop the estimation framework in full detail for general starting conditions.

A.1. Notation for experimental data

For the general case, we assume that each experiment is started with a known initial condition, encoded by the $1 \times K$ vector $n = (n_{1}, \dots, n_{K})$ of starting cell numbers of each type. We let $I \geq 1$ denote the number of distinct initial conditions and $n_{i} = (n_{i 1}, \dots, n_{i K})$ denote the $i$ -th initial condition. We assume that for each $i = 1, \dots, I$ and $j = 1, \dots, K$ , either $n_{i j} = 0$ or $n_{i j}$ is large, which is generally the case for the experiments discussed in the introduction (Section 1).

We define $N_{i} : = \sum_{k = 1}^{K} n_{i k}$ as the total number of starting cells in the $i$ -th condition, and $f_{i} = (f_{i 1}, \dots, f_{i K})$ as the vector of starting cell fractions, with $f_{i j} : = n_{i j} / N_{i}$ . As in Section 3.2 of the main text, we let $L \geq 1$ be the number of time points at which data is collected, and we let $0 < t_{1} < t_{2} < \dots < t_{L}$ denote the time points. Finally, we let $R \geq 1$ be the number of experimental replicates performed.

The data collected in each experiment is either a vector $n_{i, ℓ, r} = (n_{i, ℓ, r, 1}, \dots, n_{i, ℓ, r, K})$ of cell numbers or a vector $f_{i, ℓ, r} = (f_{i, ℓ, r, 1}, \dots, f_{i, ℓ, r, K})$ of cell fractions. Here, $n_{i, ℓ, r, k}$ is the number of type- $k$ cells in the $r$ -th replicate of the experiment started by the $i$ -th initial condition and ended at the $ℓ$ -th time point, and $f_{i, ℓ, r, k}$ is the corresponding cell fraction.

A.2. Estimation for cell number data

We now develop the estimation framework for cell number data. For the general case, the starting vector $f_{i}$ of cell fractions can be any $1 \times K$ vector $α$ with $α_{k} \geq 0$ for $k = 1, \dots, K$ and $\sum_{k = 1}^{K} α_{k} = 1$ . In expression (3) of the main text, we defined the mean function $m^{(j)} (t)$ and the covariance matrix $Σ^{(j)} (t)$ for an isolated initial condition. We extend these definitions to a general vector $α$ of starting cell fractions as follows:

\begin{array}{l} m^{α} (t) : = α M (t) = \sum_{j = 1}^{K} α_{j} m^{(j)} (t), \\ Σ^{α} (t) : = \sum_{j = 1}^{K} α_{j} Σ^{(j)} (t) . \end{array}

(15)
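As an illustration of definition (15), the following Python sketch builds the generator $A$ from the rates (using $a_{j k} = ν_{j k}$ for $k \neq j$ and $a_{j j} = λ_{j} - \sum_{k \neq j} ν_{j k}$), approximates $M (t) = \exp (t A)$ by a scaled Taylor series, and evaluates $m^{α} (t) = α M (t)$. The function names `generator`, `expm` and `mean_vector`, and all numerical rates, are assumptions made for this example; they are not taken from the MATLAB implementation in Appendix F.

```python
import math

def generator(b, d, nu):
    """Generator A: a_jk = nu[j][k] for k != j, a_jj = (b_j - d_j) - sum_k nu[j][k]."""
    K = len(b)
    A = [[0.0] * K for _ in range(K)]
    for j in range(K):
        out = sum(nu[j][k] for k in range(K) if k != j)
        for k in range(K):
            A[j][k] = nu[j][k] if k != j else (b[j] - d[j]) - out
    return A

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def expm(A, t, squarings=10, terms=20):
    """exp(t*A): Taylor series of the scaled matrix, followed by repeated squaring."""
    K = len(A)
    S = [[t * A[i][j] / 2 ** squarings for j in range(K)] for i in range(K)]
    E = [[float(i == j) for j in range(K)] for i in range(K)]  # running sum, starts at I
    term = [row[:] for row in E]                               # current term S^n / n!
    for n in range(1, terms + 1):
        term = [[x / n for x in row] for row in matmul(term, S)]
        E = [[E[i][j] + term[i][j] for j in range(K)] for i in range(K)]
    for _ in range(squarings):
        E = matmul(E, E)
    return E

def mean_vector(alpha, A, t):
    """m^alpha(t) = alpha M(t): expected cell numbers per starting cell."""
    return matmul([alpha], expm(A, t))[0]

# Hypothetical two-type example (rates per day).
b, d = [1.0, 0.6], [0.4, 0.5]
nu = [[0.0, 0.05], [0.02, 0.0]]
A = generator(b, d, nu)
m_mix = mean_vector([0.5, 0.5], A, 2.0)
m1 = mean_vector([1.0, 0.0], A, 2.0)
m2 = mean_vector([0.0, 1.0], A, 2.0)
```

In this sketch, `m_mix` agrees with the average of `m1` and `m2`, which is the linearity in $α$ expressed by (15).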

Then, based on a generalized version of the central limit theorem (7), which is stated and proved as Proposition 3 in Appendix C.1, we propose the following extension of the statistical model (10) in the main text:

n_{i, ℓ, r} \sim \underbrace{N_{i} m^{f_{i}} (t_{ℓ})}_{\text{mean behavior}} + \underbrace{𝓝 (0, N_{i} Σ^{f_{i}} (t_{ℓ}))}_{\text{variability in population dynamics}} + \underbrace{𝓝 (0, E_{i, ℓ}^{num})}_{\text{measurement error}} .

(16)

The vectors $n_{i, ℓ, r}$ and $n_{j, m, s}$ are assumed independent for $(i, ℓ, r) \neq (j, m, s)$ , and they are assumed i.i.d. for $(i, ℓ) = (j, m)$ and $r \neq s$ . This implies that data from distinct time points come from distinct experiments (endpoint data). We assume endpoint data since the central limit theorem (CLT) in Proposition 3 holds for the distribution of cell numbers at a fixed time point $t$ . Developing an analogous statistical model for sequential data requires extending the CLT to a process-level or functional CLT. We plan to address this in future work.

Note that in the statistical model (16), the mean behavior $N_{i} m^{f_{i}} (t_{ℓ})$ of the model depends only on the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rates $λ$ , while the variance term $N_{i} Σ^{f_{i}} (t_{ℓ})$ depends on ${(ν_{i k})}_{k \neq i}$ , $λ$ and also the birth rates $b$ by (8). It is therefore natural to parametrize the first two terms in (16) by $b$ , $λ$ , ${(ν_{i k})}_{k \neq i}$ instead of the primary parameters $b$ , $d$ , ${(ν_{i k})}_{k \neq i}$ . As stated in the main text, we assume that the $K \times K$ covariance matrix $E_{i, ℓ}^{num}$ associated with measurement error can be written as a function of $b$ , $λ$ , ${(ν_{i k})}_{k \neq i}$ and added error parameters $ω_{num} = (ω_{1}, \dots, ω_{M_{num}})$ for some $M_{num} \geq 0$ . We let $θ_{num}$ be the complete $1 \times (K (K + 1) + M_{num})$ vector of model parameters including the error parameters.

From the statistical model (16), it is straightforward to derive the following likelihood function:

\begin{array}{l} 𝓛_{num} (θ_{num} | (n_{i, ℓ, r})) \\ = \prod_{i = 1}^{I} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K} \det (N_{i} Σ^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{num}))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (n_{i, ℓ, r} - N_{i} m^{f_{i}} (t_{ℓ})) {(N_{i} Σ^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{num})}^{- 1} {(n_{i, ℓ, r} - N_{i} m^{f_{i}} (t_{ℓ}))}^{T}) . \end{array}

(17)

We next define the negative double log-likelihood,

l_{num} (θ_{num}) : = - 2 \log 𝓛_{num} (θ_{num} | (n_{i, ℓ, r})) .

(18)
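To make (17) and (18) concrete, the sketch below evaluates the negative double log-likelihood of independent Gaussian replicates that share one mean vector and one covariance matrix. In the model (16), the mean would be $N_{i} m^{f_{i}} (t_{ℓ})$ and the covariance $N_{i} Σ^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{num}$; here both are passed in as hypothetical numbers. The function names and the Cholesky-based evaluation are choices made for this example, not a description of the authors' code.

```python
import math

def cholesky(S):
    """Lower-triangular L with L L^T = S, for a symmetric positive definite S."""
    K = len(S)
    L = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def neg2_loglik_term(x, mean, cov):
    """-2 log density of N(mean, cov) at x: K log(2 pi) + log det(cov) + quadratic form."""
    K = len(x)
    L = cholesky(cov)
    # Forward substitution solves L y = (x - mean); the quadratic form equals y . y.
    y = []
    for i in range(K):
        r = x[i] - mean[i] - sum(L[i][k] * y[k] for k in range(i))
        y.append(r / L[i][i])
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(K))
    return K * math.log(2.0 * math.pi) + logdet + sum(v * v for v in y)

def neg2_loglik(data, mean, cov):
    """Sum over independent replicates, as in the product over r in (17)."""
    return sum(neg2_loglik_term(x, mean, cov) for x in data)

# Hypothetical cell-number replicates for one (i, l) pair with K = 2.
mean = [1200.0, 300.0]
cov = [[900.0, -100.0], [-100.0, 400.0]]
replicates = [[1210.0, 290.0], [1185.0, 310.0], [1202.0, 305.0]]
value = neg2_loglik(replicates, mean, cov)
```

Summing `neg2_loglik` over all initial conditions and time points would give $l_{num}$ for one candidate parameter vector; an optimizer would then vary the parameters that determine `mean` and `cov`.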

The maximum likelihood estimate ${\hat{θ}}_{num}$ for the parameter vector $θ_{num}$ is obtained by minimizing $l_{num} (θ_{num})$ over a set of feasible parameters $Θ_{num}$ :

{\hat{θ}}_{num} : = {argmin}_{θ_{num} \in Θ_{num}} l_{num} (θ_{num}) .

(19)

In the feasible set $Θ_{num}$ , we restrict the parameter values so that $ν_{i k} \geq 0$ , $b \geq 0$ and $λ \leq b$ . Further restrictions can be made depending on the context, see e.g. Appendix B.

A $1 - α$ likelihood-based confidence interval $[θ_{num, i}^{-}, θ_{num, i}^{+}]$ for the $i$ -th model parameter $θ_{num, i}$ can be obtained by collecting all values $θ$ for which the null hypothesis $θ_{num, i} = θ$ is accepted under the likelihood-ratio test. To describe how the confidence interval is obtained, we define the negative double profile log-likelihood for $θ_{num, i}$ as

{\tilde{l}}_{num, i} (θ) : = \min_{θ_{num} \in Θ_{num} : θ_{num, i} = θ} l_{num} (θ_{num}) .

Note that ${\tilde{l}}_{num, i} (θ)$ is computed by fixing the $i$ -th parameter to the value $θ$ and minimizing the negative double log-likelihood (18) over the remaining parameters. The $1 - α$ confidence interval for $θ_{num, i}$ derived from the likelihood-ratio test is given by

[θ_{num, i}^{-}, θ_{num, i}^{+}] = \{θ : {\tilde{l}}_{num, i} (θ) - l_{num} ({\hat{θ}}_{num}) \leq χ_{1, 1 - α}^{2}\},

(20)

where ${\hat{θ}}_{num}$ is the maximum likelihood estimate defined by (19) and $χ_{1, 1 - α}^{2}$ is the $(1 - α)$ -th quantile of the $χ^{2}$ -distribution with one degree of freedom. Rather than computing the endpoints $θ_{num, i}^{-}$ and $θ_{num, i}^{+}$ directly from (20), one can obtain them by solving the following two constrained optimization problems:

\begin{array}{l} θ_{num, i}^{-} = \min_{θ_{num} \in Θ_{num}} \{θ_{num, i} : l_{num} (θ_{num}) \leq l_{num} ({\hat{θ}}_{num}) + χ_{1, 1 - α}^{2}\}, \\ θ_{num, i}^{+} = \max_{θ_{num} \in Θ_{num}} \{θ_{num, i} : l_{num} (θ_{num}) \leq l_{num} ({\hat{θ}}_{num}) + χ_{1, 1 - α}^{2}\} . \end{array}

(21)

We refer to e.g. [39, 40, 41, 42, 43] for further details.
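The construction in (20) can be illustrated on a toy model that is much simpler than the branching process model. The sketch below assumes i.i.d. normal data with unknown mean (the parameter of interest) and unknown variance (a nuisance parameter). For this toy model the profile over the variance has a closed form, so the interval endpoints can be found by bisection on the profile instead of solving the constrained problems (21). The data values and function names are made up for the example.

```python
import math

CHI2_1_95 = 3.841458820694124  # 0.95 quantile of the chi-square distribution, 1 d.o.f.

def profile_neg2_loglik(mu, data):
    """Negative double log-likelihood of N(mu, s2) data, minimized over s2 for fixed mu."""
    n = len(data)
    s2 = sum((x - mu) ** 2 for x in data) / n  # minimizer over the nuisance parameter
    return n * math.log(2.0 * math.pi * s2) + n

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profile_interval(data, threshold=CHI2_1_95):
    """Endpoints of {mu : profile(mu) - profile(mu_hat) <= threshold}, as in (20)."""
    mu_hat = sum(data) / len(data)
    l_hat = profile_neg2_loglik(mu_hat, data)

    def excess(mu):
        return profile_neg2_loglik(mu, data) - l_hat - threshold

    # Widen the bracket until the profile exceeds the threshold on both sides.
    width = 1.0
    while excess(mu_hat - width) < 0 or excess(mu_hat + width) < 0:
        width *= 2.0
    return bisect(excess, mu_hat - width, mu_hat), bisect(excess, mu_hat, mu_hat + width)

data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.2]
lower, upper = profile_interval(data)
```

In the full framework each profile evaluation is itself an optimization over the remaining parameters, which is one reason the formulation (21) is convenient in practice.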

Our estimation framework is based on solving the optimization problems in (19) and (21) using the sqp solver in MATLAB. The implementation is described in Appendix F.

A.3. Estimation for cell fraction data

For cell fraction data, we begin by extending the definitions $p^{(j)} (t)$ and $S^{(j)} (t)$ from (11) in the main text to a general vector $α$ of starting cell fractions:

\begin{array}{l} p^{α} (t) : = {(m^{α} (t) 1^{T})}^{- 1} m^{α} (t), \\ Q^{α} (t) : = I - 1^{T} p^{α} (t), \\ S^{α} (t) : = {(m^{α} (t) 1^{T})}^{- 2} {(Q^{α} (t))}^{T} Σ^{α} (t) Q^{α} (t) . \end{array}

(22)

Then, based on a generalized version of the central limit theorem (12), which is stated and proved as Proposition 4 in Appendix C.2, we propose the following extension of the statistical model (13) in the main text:

f_{i, ℓ, r} \sim p^{f_{i}} (t_{ℓ}) + 𝓝 (0, N_{i}^{- 1} S^{f_{i}} (t_{ℓ})) + 𝓝 (0, E_{i, ℓ}^{frac}) .

(23)

Note that the mean behavior $p^{f_{i}} (t_{ℓ})$ depends only on ${(ν_{i k})}_{k \neq i}$ and $λ^{[- 1]}$ , while the variance term $N_{i}^{- 1} S^{f_{i}} (t_{ℓ})$ depends on all model parameters ${(ν_{i k})}_{k \neq i}$ , $λ^{[- 1]}$ , $λ_{1}$ , $d$ . The choice of type-1 as a reference phenotype is arbitrary, and we use $d$ as opposed to $b$ as we found it to perform well numerically. As stated in the main text, we assume that the $K \times K$ covariance matrix $E_{i, ℓ}^{frac}$ associated with measurement error can be written as a function of $d$ , $λ_{1}$ , $λ^{[- 1]}$ , ${(ν_{i k})}_{k \neq i}$ and added error parameters $ω_{frac} = (ω_{1}, \dots, ω_{M_{frac}})$ for some $M_{frac} \geq 0$ . We let $θ_{frac}$ denote the complete $1 \times (K (K + 1) + M_{frac})$ vector of model parameters including the error parameters.

When deriving a likelihood function for the statistical model (23), we note that the last coordinate of $f_{i, ℓ, r}$ provides no new information over the first $K - 1$ coordinates, since the coordinates always sum to one. In the likelihood function, we therefore only consider the first $K - 1$ coordinates, which we can accomplish by multiplying $f_{i, ℓ, r}$ by the $K \times (K - 1)$ matrix $B$ with 1 on the diagonal and 0 off it. In this way, we obtain the following likelihood:

\begin{array}{l} 𝓛_{frac} (θ_{frac} ∣ (f_{i, ℓ, r})) \\ = \prod_{i = 1}^{I} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K - 1} \det (B^{T} (N_{i}^{- 1} S^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{frac}) B))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (f_{i, ℓ, r} - p^{f_{i}} (t_{ℓ})) B {(B^{T} (N_{i}^{- 1} S^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{frac}) B)}^{- 1} B^{T} {(f_{i, ℓ, r} - p^{f_{i}} (t_{ℓ}))}^{T}) . \end{array}

(24)
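The role of the matrix $B$ in (24) can be seen in a small numerical sketch. Because cell fractions sum to one, their covariance matrix annihilates the all-ones vector and is singular, whereas the reduced matrix $B^{T} S B$ is generally invertible. The covariance used below is a hypothetical multinomial-type stand-in, not $S^{f_{i}} (t_{ℓ})$ itself, and the helper names are assumptions.

```python
def selection_matrix(K):
    """K x (K-1) matrix B with ones on the diagonal and zeros elsewhere."""
    return [[1.0 if i == j else 0.0 for j in range(K - 1)] for i in range(K)]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det(X):
    """Determinant by Laplace expansion along the first row (adequate for small K)."""
    if len(X) == 1:
        return X[0][0]
    total = 0.0
    for c in range(len(X)):
        minor = [row[:c] + row[c + 1:] for row in X[1:]]
        total += (-1) ** c * X[0][c] * det(minor)
    return total

# Hypothetical stand-in covariance for a fraction vector: diag(p) - p^T p.
p = [0.5, 0.3, 0.2]
S = [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(3)] for i in range(3)]

B = selection_matrix(3)
S_reduced = matmul(matmul(transpose(B), S), B)  # B^T S B, a 2 x 2 matrix
f_reduced = matmul([[0.45, 0.35, 0.20]], B)     # keeps the first K - 1 fractions
```

Here `det(S)` vanishes up to rounding while `det(S_reduced)` is strictly positive, so only the reduced covariance can be inverted in the Gaussian density.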

As for cell number data, we define the negative double log-likelihood,

l_{frac} (θ_{frac}) : = - 2 \log 𝓛_{frac} (θ_{frac} | (f_{i, ℓ, r})),

(25)

and obtain the maximum likelihood estimate for $θ_{frac}$ by solving

{\hat{θ}}_{frac} : = {argmin}_{θ_{frac} \in Θ_{frac}} l_{frac} (θ_{frac}) .

(26)

In the feasible set $Θ_{frac}$ , we restrict the parameter values so that $ν_{i k} \geq 0$ , $d \geq 0$ , $λ_{1} \geq - d_{1}$ and $(λ_{j} - λ_{1}) + d_{j} + λ_{1} \geq 0$ for $j = 2, \dots, K$ . Further restrictions can be made depending on the context, see e.g. Section 4.3 and Appendix F. The computation of confidence intervals proceeds as described for cell number data.

For the simplified model (14), we proceed as above except we remove all terms involving $N_{i}^{- 1} S^{f_{i}} (t_{ℓ})$ .

B. Estimation for reducible switching dynamics

In the main text, we have assumed that the switching dynamics are irreducible, meaning that it is possible to switch between any pair of phenotypes, possibly through intermediate types. In this section, we show how our framework can be applied to the case of reducible switching dynamics. For simplicity, we will consider one particular model shown in Figure 8. This model has been applied e.g. to the dynamics of epigenetic gene silencing under recruitment of chromatin regulators [36] and the evolution of epigenetically-driven drug resistance in cancer, where drug-sensitive cells (type-1) first acquire a transiently resistant phenotype (type-2) and then evolve to stable epigenetic resistance (type-3) [25].

Say that experiments are conducted from isolated initial conditions, and say first that cell number data is collected. For the model in Figure 8, the distribution of the data vector $n_{3, ℓ, r}$ is degenerate, since $n_{3, ℓ, r, j} = 0$ for $j = 1, 2$ . As a result, the covariance matrix $Σ^{(3)} (t_{ℓ})$ is singular for all $ℓ = 1, \dots, L$ , and the likelihood function in (17) is not defined. To resolve this issue, we set $C_{1} = C_{2} = I$ and $C_{3} = e_{3}^{T}$ , where $e_{3}$ is the $1 \times K$ third unit vector. By Proposition 3, $n_{3, ℓ, r} C_{3} = n_{3, ℓ, r, 3}$ has a normal distribution, which is nondegenerate. We therefore modify the likelihood function in (17) to

\begin{array}{l} 𝓛_{num} (θ_{num} | {(n_{i, ℓ, r})}_{i, ℓ, r}) \\ = \prod_{i = 1}^{3} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K} \det (C_{i}^{T} (N_{i} Σ^{(i)} (t_{ℓ}) + E_{i, ℓ}^{num}) C_{i}))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (n_{i, ℓ, r} - N_{i} m^{(i)} (t_{ℓ})) C_{i} {(C_{i}^{T} (N_{i} Σ^{(i)} (t_{ℓ}) + E_{i, ℓ}^{num}) C_{i})}^{- 1} C_{i}^{T} {(n_{i, ℓ, r} - N_{i} m^{(i)} (t_{ℓ}))}^{T}) . \end{array}

From this likelihood function, maximum likelihood estimates and confidence intervals can be computed as described in Appendix A, where we restrict the set of feasible parameters $Θ_{num}$ so that $ν_{13} = ν_{31} = ν_{32} = 0$ . By our analysis in Section 4.1.1, all model parameters are structurally identifiable for this example.

To accommodate model structures such as the one discussed here, the above modified likelihood function is implemented in our MATLAB codes (Appendix F). By taking $C_{i} = I$ for each $i = 1, \dots, I$ , we recover the original likelihood function in (17).

If cell fraction data is collected, there is no value in conducting experiments starting only from type-3 cells. We therefore use the likelihood function

\begin{array}{l} 𝓛_{frac} (θ_{frac} | {(f_{i, ℓ, r})}_{i, ℓ, r}) \\ = \prod_{i = 1}^{2} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K - 1} \det (B^{T} (N_{i}^{- 1} S^{(i)} (t_{ℓ}) + E_{i, ℓ}^{frac}) B))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (f_{i, ℓ, r} - p^{(i)} (t_{ℓ})) B {(B^{T} (N_{i}^{- 1} S^{(i)} (t_{ℓ}) + E_{i, ℓ}^{frac}) B)}^{- 1} B^{T} {(f_{i, ℓ, r} - p^{(i)} (t_{ℓ}))}^{T}), \end{array}

where we only include experiments started by type-1 and type-2 cells, respectively. By our analysis in Section 4.1.2, the switching rates $ν_{12}$ , $ν_{21}$ and $ν_{23}$ , and the net birth rate differences $λ_{2} - λ_{1}$ and $λ_{3} - λ_{2}$ , are structurally identifiable in this case. An example of a model structure where it becomes necessary to modify the above likelihood function for cell fraction data is given in Appendix F.

C. Central limit theorems

In the main text, we stated the central limit theorems (CLTs) (7) and (12) for cell number and cell fraction data, respectively, for the simple case where all experiments are started from isolated subpopulations. Here, we state and prove the CLTs for general starting conditions.

C.1. Cell number data

For cell number data, we begin by modifying the notation developed for the branching process model in Section 3.1.2 to facilitate analysis of general starting conditions. In particular, for a general $1 \times K$ vector $n = (n_{1}, \dots, n_{K})$ of starting cell numbers of each type, we let $Z^{n} (t) = (Z_{1}^{n} (t), \dots, Z_{K}^{n} (t))$ denote the random vector of cell numbers at time $t$ . We state and prove a CLT for $Z^{n} (t)$ when the total number of starting cells is sent to infinity (Proposition 3). More precisely, we fix a vector $α$ of starting cell fractions with $α_{i} \geq 0$ for $i = 1, \dots, K$ and $\sum_{i = 1}^{K} α_{i} = 1$ , write the vector of starting cell numbers as $n = ⌊N α⌋$ , and send $N \to \infty$ . Note that some coordinates of the vector $α$ are allowed to be 0. In the $N \to \infty$ regime, the starting condition $n = ⌊N α⌋$ will therefore include either no cells or a large number of cells of any given type. This is consistent with our assumptions on the vectors $n_{1}, \dots, n_{I}$ of experimental starting conditions (Appendix A.1).

We establish Proposition 3 more generally for linear transformations $Z^{n} (t) C$ of $Z^{n} (t)$ , which gives a CLT for $Z^{n} (t)$ by taking $C = I$ . The more general version allows us to obtain a CLT for cases where we do not observe the full vector $Z^{n} (t)$ . For example, if we set $C : = 1^{T}$ , then $Z^{n} (t) C = \sum_{k = 1}^{K} Z_{k}^{n} (t)$ is the total number of cells at time $t$ . The more general version also becomes useful when estimating from models with reducible switching dynamics, as we discuss in Appendix B.

Proposition 3.

Let $α$ be a $1 \times K$ vector with $α_{i} \geq 0$ for $i = 1, \dots, K$ and $\sum_{i = 1}^{K} α_{i} = 1$ . Let $J \geq 1$ be any integer. Then, for any $K \times J$ matrix $C$ , as $N \to \infty$ ,

N^{- 1 / 2} (Z^{⌊N α⌋} (t) C - N m^{α} (t) C) \overset{d}{\to} 𝓝 (0, C^{T} Σ^{α} (t) C) .

Here, the covariance matrix $Σ^{(j)} (t)$ is given by

\begin{matrix} Σ^{(j)} (t) = 2 \int_{0}^{t} {(M (t - τ))}^{T} diag (b ⊙ m^{(j)} (τ)) (M (t - τ)) d τ \\ + diag (m^{(j)} (t)) - {(m^{(j)} (t))}^{T} m^{(j)} (t) . \end{matrix}

Proof. First note that we can write

Z^{⌊N α⌋} (t) = \sum_{j = 1}^{K} Z^{⌊N α_{j}⌋ e_{j}} (t),

(27)

where ${(Z^{⌊N α_{j}⌋ e_{j}} (s))}_{s \geq 0}$ for $j = 1, \dots, K$ are independent branching processes started with $⌊N α_{j}⌋$ cells of type- $j$ , respectively. For each process, we can write

Z^{⌊N α_{j}⌋ e_{j}} (t) = \sum_{m = 1}^{⌊N α_{j}⌋} Z^{(j), m} (t),

where ${(Z^{(j), m} (s))}_{s \geq 0}$ for $m = 1, \dots, ⌊N α_{j}⌋$ are i.i.d. copies of the branching process ${(Z^{(j)} (s))}_{s \geq 0}$ started by a single type- $j$ cell. Set

W^{⌊N α_{j}⌋ e_{j}} (t) : = N^{- 1 / 2} (Z^{⌊N α_{j}⌋ e_{j}} (t) - N α_{j} m^{(j)} (t)) .

(28)

Let $J \geq 1$ and let $C$ be a $K \times J$ matrix. By the standard (multivariate) central limit theorem, as $N \to \infty$ ,

W^{⌊N α_{j}⌋ e_{j}} (t) C \overset{d}{\to} 𝓝 (0, α_{j} C^{T} Σ^{(j)} (t) C),

where $Σ^{(j)} (t)$ is the covariance matrix for $Z^{(j)} (t)$ . We can then conclude from (27) that as $N \to \infty$ ,

N^{- 1 / 2} (Z^{⌊N α⌋} (t) C - N m^{α} (t) C) \overset{d}{\to} 𝓝 (0, C^{T} Σ^{α} (t) C) .

It remains to derive the given expression for the covariance matrix $Σ^{(j)} (t)$ . To that end, let $D^{(j)} (t)$ be the matrix of second factorial moments of $Z^{(j)} (t)$ ,

D_{k ℓ}^{(j)} (t) : = E [Z_{k}^{(j)} (t) (Z_{ℓ}^{(j)} (t) - δ_{k ℓ})],

where $δ_{k ℓ}$ is the Kronecker delta. Let $s = (s_{1}, \dots, s_{K})$ be a $K$ -dimensional vector of real numbers and set $h_{j} : = b_{j} + d_{j} + \sum_{k \neq j} ν_{j k}$ for $j = 1, \dots, K$ . Furthermore, let

u^{(j)} (s) : = b_{j} s_{j}^{2} + d_{j} + \sum_{k \neq j} ν_{j k} s_{k} - h_{j} s_{j}, 0 \leq s \leq 1

be the infinitesimal generating function for $Z^{(j)} (t)$ , and let

F^{(j)} (s, t) : = E [s^{Z^{(j)} (t)}] = E [\prod_{k = 1}^{K} s_{k}^{Z_{k}^{(j)} (t)}], 0 \leq s \leq 1, t \geq 0,

be the probability generating function for $Z^{(j)} (t)$ . With this notation, we can write the Kolmogorov forward equation for $Z^{(j)} (t)$ as

\frac{\partial}{\partial t} F^{(j)} (s, t) = \sum_{i = 1}^{K} u^{(i)} (s) \frac{\partial}{\partial s_{i}} F^{(j)} (s, t) .

Then, for $k$ , $ℓ = 1, \dots, K$ ,

\begin{array}{l} \frac{\partial}{\partial t} (\frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{ℓ}} F^{(j)} (s, t)) \\ = \sum_{i = 1}^{K} (\frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{ℓ}} u^{(i)} (s) \frac{\partial}{\partial s_{i}} F^{(j)} (s, t) + \frac{\partial}{\partial s_{ℓ}} u^{(i)} (s) \frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{i}} F^{(j)} (s, t) \\ + \frac{\partial}{\partial s_{k}} u^{(i)} (s) \frac{\partial}{\partial s_{ℓ}} \frac{\partial}{\partial s_{i}} F^{(j)} (s, t) + u^{(i)} (s) \frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{ℓ}} \frac{\partial}{\partial s_{i}} F^{(j)} (s, t)) . \end{array}

(29)

Now,

\begin{array}{l} \frac{\partial}{\partial s_{k}} u^{(i)} (s) = \begin{cases} 2 b_{i} s_{i} - h_{i}, & k = i, \\ ν_{i k}, & k \neq i, \end{cases} \\ \frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{ℓ}} u^{(i)} (s) = δ_{k i} δ_{ℓ i} 2 b_{i} . \end{array}

Let $A$ be the infinitesimal generator and $M (t)$ be the mean matrix as defined in Sections 3.1.1 and 3.1.2. Since

\begin{array}{l} a_{i k} = {\frac{\partial}{\partial s_{k}} u^{(i)} (s)|}_{s = 1}, \\ M_{j k} (t) = {\frac{\partial}{\partial s_{k}} F^{(j)} (s, t)|}_{s = 1}, \\ D_{k ℓ}^{(j)} (t) = {\frac{\partial}{\partial s_{k}} \frac{\partial}{\partial s_{ℓ}} F^{(j)} (s, t)|}_{s = 1}, \end{array}

and $u^{(i)} (1) = 0$ , we can conclude from (29) that

\begin{array}{l} \frac{d}{d t} D_{k ℓ}^{(j)} (t) = \sum_{i = 1}^{K} (δ_{k i} δ_{ℓ i} 2 b_{i} M_{j i} (t) + a_{i ℓ} D_{k i}^{(j)} (t) + a_{i k} D_{ℓ i}^{(j)} (t)) \\ = \sum_{i = 1}^{K} a_{i k} D_{i ℓ}^{(j)} (t) + \sum_{i = 1}^{K} D_{k i}^{(j)} (t) a_{i ℓ} + δ_{k ℓ} 2 b_{k} M_{j k} (t) . \end{array}

In the second step, we use that $D_{i ℓ}^{(j)} (t) = D_{ℓ i}^{(j)} (t)$ . This yields a Lyapunov matrix differential equation,

\frac{d}{d t} D^{(j)} (t) = A^{T} D^{(j)} (t) + D^{(j)} (t) A + 2 diag (b ⊙ m^{(j)} (t)),

(30)

with initial condition $D^{(j)} (0) = 0$ . The solution is

\begin{array}{l} D^{(j)} (t) = 2 \exp (t A^{T}) (\int_{0}^{t} \exp (- τ A^{T}) diag (b ⊙ m^{(j)} (τ)) \exp (- τ A) d τ) \exp (t A) \\ = 2 \int_{0}^{t} {(M (t - τ))}^{T} diag (b ⊙ m^{(j)} (τ)) (M (t - τ)) d τ, \end{array}

and the expression (8) for $Σ^{(j)} (t)$ follows from the fact that

Σ^{(j)} (t) = D^{(j)} (t) + diag (m^{(j)} (t)) - {(m^{(j)} (t))}^{T} m^{(j)} (t) .

(31)

□
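The covariance matrix $Σ^{(j)} (t)$ can also be obtained numerically without evaluating the integral in Proposition 3. The sketch below instead integrates the Lyapunov equation (30) together with $\frac{d}{d t} m^{(j)} (t) = m^{(j)} (t) A$ using a classical fourth-order Runge-Kutta scheme, and then applies (31). The step count, the function name `covariance`, and the example rates are assumptions. The single-type check at the end uses the classical variance formula for a linear birth-death process started from one cell.

```python
import math

def covariance(j, b, A, t, steps=2000):
    """Sigma^{(j)}(t) from (30)-(31): RK4 on m' = m A, D' = A^T D + D A + 2 diag(b*m)."""
    K = len(b)

    def rhs(m, D):
        dm = [sum(m[i] * A[i][k] for i in range(K)) for k in range(K)]
        dD = [[sum(A[i][k] * D[i][l] + D[k][i] * A[i][l] for i in range(K))
               + (2.0 * b[k] * m[k] if k == l else 0.0)
               for l in range(K)] for k in range(K)]
        return dm, dD

    def shift(m, D, dm, dD, h):
        return ([m[k] + h * dm[k] for k in range(K)],
                [[D[k][l] + h * dD[k][l] for l in range(K)] for k in range(K)])

    m = [1.0 if k == j else 0.0 for k in range(K)]  # m^{(j)}(0) = e_j
    D = [[0.0] * K for _ in range(K)]               # D^{(j)}(0) = 0
    h = t / steps
    for _ in range(steps):
        k1 = rhs(m, D)
        k2 = rhs(*shift(m, D, *k1, h / 2))
        k3 = rhs(*shift(m, D, *k2, h / 2))
        k4 = rhs(*shift(m, D, *k3, h))
        dm = [(k1[0][k] + 2 * k2[0][k] + 2 * k3[0][k] + k4[0][k]) / 6 for k in range(K)]
        dD = [[(k1[1][k][l] + 2 * k2[1][k][l] + 2 * k3[1][k][l] + k4[1][k][l]) / 6
               for l in range(K)] for k in range(K)]
        m, D = shift(m, D, dm, dD, h)
    # (31): Sigma = D + diag(m) - m^T m
    return [[D[k][l] + (m[k] if k == l else 0.0) - m[k] * m[l] for l in range(K)]
            for k in range(K)]

# Hypothetical two-type example: net birth rates 0.6 and 0.1, switching 0.05 and 0.02.
Sigma1 = covariance(0, [1.0, 0.6], [[0.55, 0.05], [0.02, 0.08]], 2.0)

# Single-type sanity check: Var = (b + d) / (b - d) * e^{lam t} (e^{lam t} - 1).
lam1, b1, d1, t1 = 0.6, 1.0, 0.4, 1.0
closed_form = (b1 + d1) / lam1 * math.exp(lam1 * t1) * (math.exp(lam1 * t1) - 1.0)
numeric = covariance(0, [b1], [[lam1]], t1)[0][0]
```

Integrating the differential equation avoids repeated matrix exponentials inside a quadrature rule; whether this is preferable to the integral formula depends on $K$ and on the accuracy required.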

C.2. Cell fraction data

For cell fraction data, we similarly begin by modifying the notation developed for the branching process model in Section 3.1.2. In particular, for a $1 \times K$ vector $n = (n_{1}, \dots, n_{K})$ of starting cell numbers of each type, we let $Δ^{n} (t)$ denote the random vector of cell fractions at time $t$ on the event $\{\sum_{k = 1}^{K} Z_{k}^{n} (t) \neq 0\}$ , i.e.

Δ_{i}^{n} (t) : = Z_{i}^{n} (t) / (\sum_{k = 1}^{K} Z_{k}^{n} (t)), i = 1, \dots, K .

We now state and prove a central limit theorem for $Δ^{n} (t)$ (Proposition 4). As for cell number data, the CLT is established for linear transformations $Δ^{n} (t) C$ of $Δ^{n} (t)$ . We note that the CLT has already been established for the case of an isolated large starting population by Yakovlev and Yanev [44]. We extend their argument to more general starting conditions by fixing the vector $α$ of starting cell proportions and sending the total population size $N$ to infinity. We also provide a simplified expression for the covariance matrix $S^{α} (t)$ and show that the mean function $p^{α} (t)$ can be written solely in terms of ${(ν_{i k})}_{k \neq i}$ and $λ^{[- 1]}$ .

Proposition 4.

Let $α$ be a $1 \times K$ vector with $α_{i} \geq 0$ for $i = 1, \dots, K$ and $\sum_{i = 1}^{K} α_{i} = 1$ . Let $J \geq 1$ be any integer. Then, for any $K \times J$ matrix $C$ , as $N \to \infty$ ,

N^{1 / 2} (Δ^{⌊N α⌋} (t) C - p^{α} (t) C) \overset{d}{\to} 𝓝 (0, C^{T} S^{α} (t) C) .

Here, the mean function $p^{α} (t)$ can be written solely as a function of the switching rates ${(ν_{i k})}_{k \neq i}$ and the relative net birth rates $λ^{[- 1]}$ .

Proof. Recall from (27) that we can write

Z^{⌊N α⌋} (t) = \sum_{j = 1}^{K} Z^{⌊N α_{j}⌋ e_{j}} (t),

where ${(Z^{⌊N α_{j}⌋ e_{j}} (s))}_{s \geq 0}$ for $j = 1, \dots, K$ are independent branching processes started with $⌊N α_{j}⌋$ cells of type- $j$ , respectively. Define

U^{⌊N α⌋} (t) : = \sum_{k = 1}^{K} Z_{k}^{⌊N α⌋} (t) = \sum_{j = 1}^{K} \sum_{k = 1}^{K} Z_{k}^{⌊N α_{j}⌋ e_{j}} (t)

as the total population size at time $t$ and note that

Δ_{i}^{⌊N α⌋} (t) = \frac{Z_{i}^{⌊N α⌋} (t)}{U^{⌊N α⌋} (t)} = \frac{\sum_{j = 1}^{K} Z_{i}^{⌊N α_{j}⌋ e_{j}} (t)}{U^{⌊N α⌋} (t)} .

We can therefore write

\begin{array}{l} \sqrt{N} (Δ_{i}^{⌊N α⌋} (t) - p_{i}^{α} (t)) \\ = \frac{\sqrt{N}}{U^{⌊N α⌋} (t)} (\sum_{j = 1}^{K} ((1 - p_{i}^{α} (t)) Z_{i}^{⌊N α_{j}⌋ e_{j}} (t) - p_{i}^{α} (t) \sum_{k \neq i} Z_{k}^{⌊N α_{j}⌋ e_{j}} (t))) . \end{array}

Note that by definition,

p^{α} (t) = {(m^{α} (t) 1^{T})}^{- 1} m^{α} (t) = {(\sum_{j = 1}^{K} \sum_{k = 1}^{K} α_{j} m_{k}^{(j)} (t))}^{- 1} \sum_{j = 1}^{K} α_{j} m^{(j)} (t) .

It follows that

\begin{array}{l} \sum_{j = 1}^{K} N α_{j} ((1 - p_{i}^{α} (t)) m_{i}^{(j)} (t) - p_{i}^{α} (t) \sum_{k \neq i} m_{k}^{(j)} (t)) \\ = N (\sum_{j = 1}^{K} α_{j} m_{i}^{(j)} (t) - p_{i}^{α} (t) \sum_{j = 1}^{K} \sum_{k = 1}^{K} α_{j} m_{k}^{(j)} (t)) \\ = 0. \end{array}

We can therefore write

\begin{array}{l} \sqrt{N} (Δ_{i}^{⌊N α⌋} (t) - p_{i}^{α} (t)) \\ = \frac{N}{U^{⌊N α⌋} (t)} (\sum_{j = 1}^{K} ((1 - p_{i}^{α} (t)) W_{i}^{⌊N α_{j}⌋ e_{j}} (t) - p_{i}^{α} (t) \sum_{k \neq i} W_{k}^{⌊N α_{j}⌋ e_{j}} (t))), \end{array}

where the vector $W^{⌊N α_{j}⌋ e_{j}} (t)$ is defined as in (28). In vector form, this becomes

\sqrt{N} (Δ^{⌊N α⌋} (t) - p^{α} (t)) = \frac{N}{U^{⌊N α⌋} (t)} \sum_{j = 1}^{K} W^{⌊N α_{j}⌋ e_{j}} (t) Q^{α} (t),

where $Q^{α} (t)$ is defined as in (22). By the strong law of large numbers, $U^{⌊N α⌋} (t) / N \to m^{α} (t) 1^{T}$ almost surely as $N \to \infty$ . Let $J \geq 1$ and let $C$ be a $K \times J$ matrix. By the standard (multivariate) central limit theorem, as $N \to \infty$ ,

W^{⌊N α_{j}⌋ e_{j}} (t) Q^{α} (t) C \overset{d}{\to} 𝓝 (0, α_{j} C^{T} {(Q^{α} (t))}^{T} Σ^{(j)} (t) Q^{α} (t) C) .

Writing $Σ^{α} (t) = \sum_{j = 1}^{K} α_{j} Σ^{(j)} (t)$ , it finally follows from Slutsky’s theorem that

\sqrt{N} (Δ^{⌊N α⌋} (t) C - p^{α} (t) C) \overset{d}{\to} 𝓝 (0, {(m^{α} (t) 1^{T})}^{- 2} C^{T} {(Q^{α} (t))}^{T} Σ^{α} (t) Q^{α} (t) C) .

It remains to show that $p^{α} (t)$ can be written solely as a function of the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rate differences $λ^{[- 1]}$ . To this end, we define for any $j = 1, \dots, K$ :

A^{[- j]} : = A - λ_{j} I,

(32)

where $I$ is the $K \times K$ identity matrix, and

M^{[- j]} (t) : = \exp (t A^{[- j]}) = \sum_{k = 0}^{\infty} (t^{k} / k!) {(A^{[- j]})}^{k}, t \geq 0.

(33)

Note that $A^{[- j]}$ and $M^{[- j]} (t)$ only depend on ${(ν_{i k})}_{k \neq i}$ and $λ^{[- j]}$ . It is easy to see that

M (t) = e^{λ_{j} t} M^{[- j]} (t),

for $j = 1, \dots, K$ , from which it follows that

\begin{array}{l} p^{α} (t) = {(α M (t) 1^{T})}^{- 1} α M (t) \\ = {(α M^{[- 1]} (t) 1^{T})}^{- 1} (α M^{[- 1]} (t)), t \geq 0. \end{array}

(34)

This completes the proof. □
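The identity $M (t) = e^{λ_{j} t} M^{[- j]} (t)$ used in the last step of the proof, and the resulting cancellation in (34), can be spelled out as follows. The only fact used is that the identity matrix commutes with $A^{[- j]}$ , so the exponential of the sum factorizes.

```latex
\begin{aligned}
M (t) &= \exp (t A) = \exp \bigl( t A^{[- j]} + t λ_{j} I \bigr) \\
&= \exp (t λ_{j} I) \exp \bigl( t A^{[- j]} \bigr) && \text{since } I \text{ commutes with } A^{[- j]} \\
&= e^{λ_{j} t} M^{[- j]} (t), \\
p^{α} (t) &= \frac{e^{λ_{1} t} \, α M^{[- 1]} (t)}{e^{λ_{1} t} \, α M^{[- 1]} (t) 1^{T}}
= \frac{α M^{[- 1]} (t)}{α M^{[- 1]} (t) 1^{T}} .
\end{aligned}
```

The scalar factor $e^{λ_{1} t}$ appears in both numerator and denominator, which is why adding a common constant to all net birth rates leaves the expected cell fractions unchanged.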

D. Proof of Proposition 1

Proof of Proposition 1.

Since $M (t) = \exp (t A) = \sum_{k = 0}^{\infty} (1 / k!) t^{k} A^{k}$ , we have $\frac{d}{d t} M (t) = AM (t)$ . By taking $t = 0$ and noting that $M (0) = I$ , we obtain
${\frac{d}{d t} M (t)|}_{t = 0} = A .$

If ${\frac{d}{d t} M (t)|}_{t = 0}$ is known, we can recover the switching rate $ν_{j k}$ for $k \neq j$ by recalling that $a_{j k} = ν_{j k}$ . We can then recover $λ_{j}$ for $j = 1, \dots, K$ by recalling that $a_{j j} = λ_{j} - \sum_{k \neq j} ν_{j k}$ .
Recall that $m^{(j)} (t) = e_{j} M (t)$ . By (31) in the proof of Proposition 3, we can write
$\begin{array}{l} \frac{d}{d t} Σ^{(j)} (t) \\ = \frac{d}{d t} D^{(j)} (t) + diag (e_{j} A M (t)) - A^{T} {(M (t))}^{T} e_{j}^{T} e_{j} M (t) - {(M (t))}^{T} e_{j}^{T} e_{j} A M (t), \end{array}$
where $D^{(j)} (t)$ is the matrix of second factorial moments of $Z^{(j)} (t)$ . Next, by taking $t = 0$ in (30) and noting that $D^{(j)} (0) = 0$ and $m^{(j)} (0) = e_{j}$ for all $j = 1, \dots, K$ , we see that
${\frac{d}{d t} D^{(j)} (t)|}_{t = 0} = 2 b_{j} e_{j}^{T} e_{j} .$

It follows that

{\frac{d}{d t} Σ^{(j)} (t)|}_{t = 0} = 2 b_{j} e_{j}^{T} e_{j} + diag (e_{j} A) - {(e_{j}^{T} e_{j} A)}^{T} - e_{j}^{T} e_{j} A .

(35)

For each $j = 1, \dots, K$ , if the switching rates $ν_{j k}$ for $k \neq j$ and the net birth rate $λ_{j}$ are known, the birth rate $b_{j}$ can be recovered from ${({\frac{d}{d t} Σ^{(j)} (t)|}_{t = 0})}_{j j}$ using this expression. □
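For concreteness, the recovery of $b_{j}$ from (35) can be written out by reading off the $(j, j)$ entry of each term, using $a_{j j} = λ_{j} - \sum_{k \neq j} ν_{j k}$ . This is a worked restatement of the final step of the proof rather than an additional result.

```latex
\begin{aligned}
\Bigl( \tfrac{d}{d t} Σ^{(j)} (t) \big|_{t = 0} \Bigr)_{j j}
&= 2 b_{j} + a_{j j} - a_{j j} - a_{j j} = 2 b_{j} - a_{j j} \\
&= 2 b_{j} - λ_{j} + \sum_{k \neq j} ν_{j k}, \\
b_{j} &= \tfrac{1}{2} \Bigl( \bigl( \tfrac{d}{d t} Σ^{(j)} (t) \big|_{t = 0} \bigr)_{j j} + λ_{j} - \sum_{k \neq j} ν_{j k} \Bigr) .
\end{aligned}
```

For $K = 1$ the first line reduces to $2 b - λ = b + d$ , which matches the initial rate of variance growth of a single-type linear birth-death process.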

E. Proof of Proposition 2

Proof of Proposition 2. We begin by establishing some notation. First, define $Q^{(j)} (t) : = Q^{e_{j}} (t)$ and $Q^{(j)} : = Q^{(j)} (0) = I - 1^{T} e_{j}$ , with $Q^{e_{j}} (t)$ defined as in (22). Also define

V : = A - diag (λ)

(36)

as the infinitesimal generator $A$ with the net birth rates $λ$ removed from the diagonal. Let $v^{(j)}$ denote the $j$ -th row vector of $V$ with coordinates $v_{k}^{(j)} = ν_{j k}$ for $k \neq j$ and $v_{j}^{(j)} = - \sum_{k \neq j} ν_{j k}$ , and note that

v^{(j)} = e_{j} V = e_{j} A^{[- j]},

(37)

where $A^{[- j]}$ is defined as in (32). Also note that $v^{(j)} 1^{T} = 0$ . In the proof, we will rely on the following basic facts:

\begin{array}{l} e_{j} Q^{(j)} = e_{j} (I - 1^{T} e_{j}) = 0, \\ v^{(j)} Q^{(j)} = v^{(j)} (I - 1^{T} e_{j}) = v^{(j)} . \end{array}

(38)

Since $p^{(j)} (t) = {(e_{j} \exp (t A) 1^{T})}^{- 1} (e_{j} \exp (t A))$ , we can write
$\begin{array}{l} \frac{d}{d t} p^{(j)} (t) = {(e_{j} \exp (t A) 1^{T})}^{- 1} (e_{j} A \exp (t A)) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} A \exp (t A) 1^{T}) (e_{j} \exp (t A)) . \end{array}$ (39)

Since $\exp (0) = I$ , $e_{j} 1^{T} = 1$ and $e_{j} A 1^{T} = λ_{j}$ , we obtain by (37),
${\frac{d}{d t} p^{(j)} (t)|}_{t = 0} = e_{j} (A - λ_{j} I) = e_{j} A^{[- j]} = v^{(j)} .$ (40)

Since the $k$ -th coordinate of $v^{(j)}$ is $ν_{j k}$ for $k \neq j$ , we can recover $ν_{j k}$ from the $k$ -th coordinate of ${\frac{d}{d t} p^{(j)} (t)|}_{t = 0}$ .

(i) Using (39), we begin by writing

\begin{array}{l} \frac{d^{2}}{d t^{2}} p^{(j)} (t) = {(e_{j} \exp (t A) 1^{T})}^{- 1} (e_{j} A^{2} \exp (t A)) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} A \exp (t A) 1^{T}) (e_{j} A \exp (t A)) \\ + 2 {(e_{j} \exp (t A) 1^{T})}^{- 3} {(e_{j} A \exp (t A) 1^{T})}^{2} (e_{j} \exp (t A)) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} A^{2} \exp (t A) 1^{T}) (e_{j} \exp (t A)) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} A \exp (t A) 1^{T}) (e_{j} A \exp (t A)) . \end{array}

Since $\exp (0) = I$, $e_{j} 1^{T} = 1$, $e_{j} A 1^{T} = λ_{j}$, $v^{(j)} = e_{j} A^{[- j]}$ and $Q^{(j)} = I - 1^{T} e_{j}$, we obtain

\begin{array}{l} {\frac{d^{2}}{d t^{2}} p^{(j)} (t)|}_{t = 0} = 2 λ_{j} e_{j} (λ_{j} I - A) + e_{j} A^{2} (I - 1^{T} e_{j}) \\ = - 2 λ_{j} v^{(j)} + e_{j} A^{2} Q^{(j)} . \end{array}

Recalling that $A = A^{[- j]} + λ_{j} I$ by (32), we can write

\begin{array}{l} e_{j} A^{2} = e_{j} {(A^{[- j]})}^{2} + 2 λ_{j} e_{j} A^{[- j]} + λ_{j}^{2} e_{j} \\ = v^{(j)} A^{[- j]} + 2 λ_{j} v^{(j)} + λ_{j}^{2} e_{j} . \end{array}

(41)

Since $e_{j} Q^{(j)} = 0$ and $v^{(j)} Q^{(j)} = v^{(j)}$ by (38), it follows that

e_{j} A^{2} Q^{(j)} = v^{(j)} A^{[- j]} Q^{(j)} + 2 λ_{j} v^{(j)} Q^{(j)} = v^{(j)} A^{[- j]} Q^{(j)} + 2 λ_{j} v^{(j)},

(42)

which implies

{\frac{d^{2}}{d t^{2}} p^{(j)} (t)|}_{t = 0} = v^{(j)} A^{[- j]} Q^{(j)} .

(43)

It is straightforward to verify that for $i \neq j$,

\begin{array}{l} {(v^{(j)} A^{[- j]} Q^{(j)})}_{i} \\ = ν_{j i} (λ_{i} - λ_{j}) - ν_{j i} (\sum_{k \neq j} ν_{j k}) - ν_{j i} (\sum_{ℓ \neq i} ν_{i ℓ}) + \sum_{m \neq j, m \neq i} ν_{j m} ν_{m i} . \end{array}

If ${(ν_{i k})}_{k \neq i}$ and ${\frac{d^{2}}{d t^{2}} p^{(j)} (t)|}_{t = 0}$ are known, we can therefore use (43) to get an equation for $λ_{i} - λ_{j}$ of the form $ν_{j i} (λ_{i} - λ_{j}) = C$ for some constant $C$ . If $ν_{j i} \neq 0$ , we immediately obtain the value of $λ_{i} - λ_{j}$ . If $ν_{j i} = 0$ , then by our assumption of irreducibility, there exist integers $n_{1}, \dots, n_{k}$ so that $ν_{n_{0} n_{1}} ν_{n_{1} n_{2}} \dots ν_{n_{k} n_{k + 1}} > 0$ , where $n_{0} = j$ and $n_{k + 1} = i$ . For each $ℓ = 0, \dots, k$ , we can use the fact that $ν_{n_{ℓ} n_{ℓ + 1}} > 0$ to obtain the value of $λ_{n_{ℓ + 1}} - λ_{n_{ℓ}}$ . Since $λ_{n_{k + 1}} - λ_{n_{0}} = \sum_{ℓ = 0}^{k} (λ_{n_{ℓ + 1}} - λ_{n_{ℓ}})$ , we also obtain the value of $λ_{i} - λ_{j}$ .
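As a worked special case, take $K = 2$ and $j = 1$ , and compute the right-hand side of (43) directly from the definitions of $v^{(1)}$ and $A^{[- 1]}$ . The example is only meant to show how the equation $ν_{j i} (λ_{i} - λ_{j}) = C$ arises and what the constant looks like in the simplest setting.

```latex
\begin{aligned}
v^{(1)} &= (- ν_{12}, \; ν_{12}), \qquad
A^{[- 1]} = \begin{pmatrix} - ν_{12} & ν_{12} \\ ν_{21} & λ_{2} - λ_{1} - ν_{21} \end{pmatrix}, \\
\Bigl( \tfrac{d^{2}}{d t^{2}} p^{(1)} (t) \big|_{t = 0} \Bigr)_{2}
&= \bigl( v^{(1)} A^{[- 1]} Q^{(1)} \bigr)_{2}
= ν_{12} (λ_{2} - λ_{1}) - ν_{12} (ν_{12} + ν_{21}), \\
λ_{2} - λ_{1} &= \frac{1}{ν_{12}} \Bigl( \tfrac{d^{2}}{d t^{2}} p^{(1)} (t) \big|_{t = 0} \Bigr)_{2} + ν_{12} + ν_{21}
\qquad (ν_{12} \neq 0) .
\end{aligned}
```

When $λ_{1} = λ_{2}$ , the fraction of type-2 cells follows the two-state Markov chain solution $p_{2}^{(1)} (t) = \frac{ν_{12}}{ν_{12} + ν_{21}} (1 - e^{- (ν_{12} + ν_{21}) t})$ , whose second derivative at $t = 0$ is $- ν_{12} (ν_{12} + ν_{21})$ , in agreement with the second line.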

(ii) We know from (39) that

\begin{array}{l} \frac{d}{d t} p^{(j)} (t) = {(e_{j} \exp (t A) 1^{T})}^{- 1} (e_{j} A \exp (t A)) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} A \exp (t A) 1^{T}) (e_{j} \exp (t A)) . \end{array}

We also know from (5) that

\lim_{t \to \infty} e^{- σ t} \exp (t A) = β^{T} γ,

where $β$ and $γ$ are positive vectors. It follows that as $t \to \infty$,

\begin{array}{l} \frac{d}{d t} p^{(j)} (t) \to {(e_{j} β^{T} γ 1^{T})}^{- 1} (e_{j} A β^{T} γ) - {(e_{j} β^{T} γ 1^{T})}^{- 2} (e_{j} A β^{T} γ 1^{T}) (e_{j} β^{T} γ) \\ = {(e_{j} β^{T} \bar{γ} 1^{T})}^{- 1} (e_{j} A β^{T} \bar{γ}) - {(e_{j} β^{T} \bar{γ} 1^{T})}^{- 2} (e_{j} A β^{T} \bar{γ} 1^{T}) (e_{j} β^{T} \bar{γ}), \end{array}

where $\bar{γ}$ is the normalized version of $γ$, see (6). Since $e_{j} β^{T} = β_{j} > 0$ and $\bar{γ} 1^{T} = 1$, we obtain

\frac{d}{d t} p^{(j)} (t) \to β_{j}^{- 1} (e_{j} A β^{T} \bar{γ} - e_{j} A β^{T} \bar{γ}) = 0 .

(44)

On the other hand, by noting that $A$ and $\exp (t A)$ commute, we can rewrite the expression (39) for $\frac{d}{d t} p^{(j)} (t)$ as

\begin{array}{l} \frac{d}{d t} p^{(j)} (t) = {(e_{j} \exp (t A) 1^{T})}^{- 1} (e_{j} \exp (t A) A) \\ - {(e_{j} \exp (t A) 1^{T})}^{- 2} (e_{j} \exp (t A) A 1^{T}) (e_{j} \exp (t A)) . \end{array}

Since $A 1^{T} = λ^{T}$, $A = V + diag (λ)$ by (36), $\bar{γ} diag (λ) = λ diag (\bar{γ})$ and $\bar{γ} λ^{T} = λ {\bar{γ}}^{T}$, we get as $t \to \infty$,

\frac{d}{d t} p^{(j)} (t) \to \bar{γ} A - \bar{γ} A 1^{T} \bar{γ} = \bar{γ} (A - λ^{T} \bar{γ}) = \bar{γ} V + λ diag (\bar{γ}) - λ {\bar{γ}}^{T} \bar{γ} .

(45)

Combining (44) and (45), we obtain the following linear system for $λ$:

λ (diag (\bar{γ}) - {\bar{γ}}^{T} \bar{γ}) = - \bar{γ} V .

It is straightforward to verify that this system is solved by

λ = a + x 1, x \in ℝ,

for some vector $a$, which can be used to extract $λ^{[- 1]}$.
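As a worked illustration (ours, not part of the original proof), consider $K = 2$ and write $\bar{γ} = (\bar{γ}_{1}, \bar{γ}_{2})$ with $\bar{γ}_{1} + \bar{γ}_{2} = 1$. Assuming, as above, that $V$ is the switching matrix with off-diagonal entries $ν_{12}$, $ν_{21}$ and zero row sums, the linear system reduces to a single equation for $λ_{2} - λ_{1}$:

```latex
\begin{aligned}
diag (\bar{γ}) - {\bar{γ}}^{T} \bar{γ}
  &= \bar{γ}_{1} \bar{γ}_{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
  \qquad
  - \bar{γ} V = \left( \bar{γ}_{1} ν_{12} - \bar{γ}_{2} ν_{21}, \; \bar{γ}_{2} ν_{21} - \bar{γ}_{1} ν_{12} \right), \\
\bar{γ}_{1} \bar{γ}_{2} (λ_{2} - λ_{1})
  &= \bar{γ}_{2} ν_{21} - \bar{γ}_{1} ν_{12}
  \quad \Longrightarrow \quad
  λ_{2} - λ_{1} = \frac{ν_{21}}{\bar{γ}_{1}} - \frac{ν_{12}}{\bar{γ}_{2}} .
\end{aligned}
```

Only the difference $λ_{2} - λ_{1}$ is determined, which matches the free constant $x$ in the general solution.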

By the definition of $S^{(j)} (t)$ in (11),

\begin{array}{l} \frac{d}{d t} S^{(j)} (t) = \frac{d}{d t} {(e_{j} M (t) 1^{T})}^{- 2} {(Q^{(j)} (t))}^{T} Σ^{(j)} (t) Q^{(j)} (t) \\ + {(e_{j} M (t) 1^{T})}^{- 2} \frac{d}{d t} {(Q^{(j)} (t))}^{T} Σ^{(j)} (t) Q^{(j)} (t) \\ + {(e_{j} M (t) 1^{T})}^{- 2} {(Q^{(j)} (t))}^{T} \frac{d}{d t} Σ^{(j)} (t) Q^{(j)} (t) \\ + {(e_{j} M (t) 1^{T})}^{- 2} {(Q^{(j)} (t))}^{T} Σ^{(j)} (t) \frac{d}{d t} Q^{(j)} (t) . \end{array}

(46)

Since $Σ^{(j)} (0) = 0$ and $e_{j} M (0) 1^{T} = 1$ , we obtain

{\frac{d}{d t} S^{(j)} (t)|}_{t = 0} = {(Q^{(j)})}^{T} ({\frac{d}{d t} Σ^{(j)} (t)|}_{t = 0}) Q^{(j)} .

From (35) in the proof of Proposition 1, we know that

\begin{array}{l} {\frac{d}{d t} Σ^{(j)} (t)|}_{t = 0} = 2 b_{j} e_{j}^{T} e_{j} + diag (e_{j} A) - {(e_{j}^{T} e_{j} A)}^{T} - e_{j}^{T} e_{j} A \\ = diag (e_{j} A^{[- j]}) - {(e_{j}^{T} e_{j} A)}^{T} - e_{j}^{T} e_{j} A + (2 b_{j} + λ_{j}) e_{j}^{T} e_{j}, \end{array}

(47)

where in the second step, we write $A = A^{[- j]} + λ_{j} I$ . Since $e_{j} Q^{(j)} = 0$ and $e_{j} A^{[- j]} = v^{(j)}$ , we obtain

{\frac{d}{d t} S^{(j)} (t)|}_{t = 0} = {(Q^{(j)})}^{T} diag (e_{j} A^{[- j]}) Q^{(j)} = {(Q^{(j)})}^{T} diag (v^{(j)}) Q^{(j)} .

(48)

It is straightforward to verify that the $(j, k)$-th coordinate of ${(Q^{(j)})}^{T} diag (v^{(j)}) Q^{(j)}$ is $- ν_{j k}$. Thus, knowledge of the switching rates ${(ν_{i k})}_{k \neq i}$ follows immediately from knowledge of ${\frac{d}{d t} S^{(j)} (t)|}_{t = 0}$ for $j = 1, \dots, K$, but no other parameters can be extracted. □

F. Implementation in MATLAB

In this section, we give details on how our estimation framework is implemented in MATLAB.

F.1. Cell number data

The first step in the implementation for cell number data is to compute simple parameter estimates for the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rates $λ$ based on a deterministic population model. This model is obtained by ignoring the stochastic terms in the statistical model (10), i.e. by equating the data vector $n_{i, ℓ, r}$ with the mean prediction of (10):

n_{i, ℓ, r} = N_{i} m^{f_{i}} (t_{ℓ}) = n_{i} M (t_{ℓ}) .

(49)

Let $N$ be the $I \times K$ matrix with the initial conditions $n_{i}$ as row vectors, and let $N_{ℓ, r}$ be the $I \times K$ matrix with the data vectors $n_{i, ℓ, r}$ as row vectors. We can then write (49) in matrix form as

N_{ℓ, r} = N M (t_{ℓ}) = N \exp (t_{ℓ} A) .

(50)

Assuming $N$ has rank $K$ , we can solve for $A$ in (50) by first multiplying both sides by $N^{T}$ , then multiplying both sides by the inverse of $N^{T} N$ , and finally taking a matrix logarithm. We can thus obtain an estimate for the infinitesimal generator $A$ ,

A_{ℓ, r}^{*} : = (1 / t_{ℓ}) \log ({(N^{T} N)}^{- 1} N^{T} N_{ℓ, r}) .

We then compute a final estimate $A^{*}$ by averaging across time points and replicates:

A^{*} : = (1 / (L R)) \sum_{ℓ = 1}^{L} \sum_{r = 1}^{R} A_{ℓ, r}^{*} .

(51)
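The simple estimate (51) can be sketched in a few lines for $K = 2$. The Python code below is an illustration under stated assumptions, not the authors' MATLAB implementation: the generator `A_true` and the matrix `N` are hypothetical, the "data" are noise-free values of (50), and matrix functions are evaluated with Sylvester's formula, which here requires a $2 \times 2$ matrix with two distinct real eigenvalues.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[r][m] * Y[m][c] for m in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def matrix_function_2x2(M, f):
    """f(M) by Sylvester's formula; assumes two distinct real eigenvalues."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    mu1, mu2 = (tr + root) / 2.0, (tr - root) / 2.0
    return [[(f(mu1) * (M[r][c] - mu2 * (r == c))
              - f(mu2) * (M[r][c] - mu1 * (r == c))) / (mu1 - mu2)
             for c in range(2)] for r in range(2)]


def simple_generator_estimate(N, N_obs, t):
    """(1/t) * log((N^T N)^{-1} N^T N_obs) for K = 2."""
    Nt = transpose(N)
    M_hat = matmul(matmul(inverse_2x2(matmul(Nt, N)), Nt), N_obs)
    log_M = matrix_function_2x2(M_hat, math.log)
    return [[x / t for x in row] for row in log_M]


# Hypothetical generator A = V + diag(lambda), lambda = (0.3, 0.5).
A_true = [[0.28, 0.02], [0.04, 0.46]]
N = [[1000.0, 0.0], [0.0, 1000.0]]      # isolated initial conditions
time_points = [1.0, 2.0, 3.0]

estimates = []
for t in time_points:
    M_t = matrix_function_2x2([[a * t for a in row] for row in A_true],
                              math.exp)
    N_obs = matmul(N, M_t)              # noise-free "data" from (50)
    estimates.append(simple_generator_estimate(N, N_obs, t))

# Average across time points as in (51); a single replicate is used here.
A_star = [[sum(E[r][c] for E in estimates) / len(estimates)
           for c in range(2)] for r in range(2)]

nu12_hat, nu21_hat = A_star[0][1], A_star[1][0]   # off-diagonal entries
lam_hat = [sum(row) for row in A_star]            # row sums, since A 1^T = lambda^T
```

With noisy data the per-time-point estimates differ, and the matrix logarithm may fail to be real if the estimated mean matrix has non-positive or complex eigenvalues; the sketch does not handle that case.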

From $A^{*}$ , we can obtain estimates of the switching rates ${(ν_{i k})}_{k \neq i}$ and the net birth rates $λ$ . As indicated in Appendix B, we implement the following likelihood function in our codes:

\begin{array}{l} 𝓛_{num} (θ_{num} ∣ {(n_{i, ℓ, r})}_{i, ℓ, r}) \\ = \prod_{i = 1}^{I} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K} \det (C_{i}^{T} (N_{i} Σ^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{num}) C_{i}))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (n_{i, ℓ, r} - N_{i} m^{f_{i}} (t_{ℓ})) C_{i} {(C_{i}^{T} (N_{i} Σ^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{num}) C_{i})}^{- 1} C_{i}^{T} {(n_{i, ℓ, r} - N_{i} m^{f_{i}} (t_{ℓ}))}^{T}) . \end{array}

For each $i = 1, \dots, I$ , $C_{i}$ is a $K \times J_{i}$ matrix for some $1 \leq J_{i} \leq K$ , which can be used to reduce the dimension of the data vector $n_{i, ℓ, r}$ when necessary. This option can e.g. be useful for models with reducible switching dynamics, see Appendix B.

From the above likelihood function, we compute a negative double log-likelihood as in (18), and solve the MLE problem (19) using the sequential quadratic programming (sqp) solver in MATLAB. For the optimization, one must supply an initial guess $θ_{num}^{(0)}$ for the parameter vector $θ_{num}$ , and a set of feasible parameters $Θ_{num}$ of the form

Θ_{num} = \{θ_{num} : l \leq θ_{num} \leq u, G θ_{num} \leq h, G_{eq} θ_{num} = h_{eq}\} .

By default, we assume lower bounds of $0$ for the switching rates ${(ν_{i k})}_{k \neq i}$ and the birth rates $b$ , and we impose the inequality constraint $λ \leq b$ . The user is expected to provide lower bounds for the net birth rates $λ$ and upper bounds for all parameters, and they have the option to provide further inequality or equality constraints as necessary. This provides the opportunity to impose constraints such as $λ_{1} = λ_{2}$ (Section 4.3) or $ν_{13} = ν_{31} = ν_{32} = 0$ (Appendix B).

For the initial guess $θ_{num}^{(0)}$, we use the simple estimates for ${(ν_{i k})}_{k \neq i}$ and $λ$ computed from (51). An initial guess for the birth rate $b_{i}$ is generated as $|λ_{i}| / U$, where $U$ is uniformly distributed between 0 and 1. The idea is that if $λ_{i} > 0$, then in the absence of phenotypic switching, the survival probability of a single-cell derived clone of type $i$ is $q_{i} = λ_{i} / b_{i}$ [54]. Since we do not assume any information on $q_{i}$, we sample it uniformly between 0 and 1, and then use the initial guess for $λ_{i}$ to compute an initial guess for $b_{i}$.
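The random initial guess for the birth rates can be sketched as follows. This Python snippet is illustrative only: the function name and the net birth rate values are hypothetical, and the uniform variable is drawn from $(0, 1]$ to avoid a division by zero, a detail the text does not specify.

```python
import random


def initial_birth_rate_guess(net_birth_rate, rng):
    """Return |lambda_i| / U with U uniform on (0, 1]."""
    u = 1.0 - rng.random()      # rng.random() lies in [0, 1), so u is in (0, 1]
    return abs(net_birth_rate) / u


rng = random.Random(0)
lam_guess = [0.3, -0.1]         # hypothetical simple estimates from (51)
b_guess = [initial_birth_rate_guess(lam, rng) for lam in lam_guess]
```

By construction, each guess satisfies $b_{i} \geq |λ_{i}|$, so it is compatible with the default constraint $λ \leq b$.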

If data on the number of dead cells at each time point is available, the initial guesses for the birth rates can be improved as follows. As before, let $n_{i j}$ be the number of starting cells of type- $j$ under the $i$ -th initial condition. In the absence of phenotypic switching, the expected number of type- $j$ cells at time $t$ under the $i$ -th initial condition is given by $n_{i j} \exp (λ_{j} t)$ . If we assume that type- $j$ cells grow deterministically according to this function, the number of dead cells of type- $j$ that accumulate up until the first experimental timepoint $t_{1}$ is given by

d_{j} \int_{0}^{t_{1}} n_{i j} \exp (λ_{j} t) d t = d_{j} λ_{j}^{- 1} n_{i j} (\exp (λ_{j} t_{1}) - 1) .

Set $D_{i j} : = λ_{j}^{- 1} n_{i j} (\exp (λ_{j} t_{1}) - 1)$ and let $D = (D_{i j})$ denote the corresponding $I \times K$ matrix. Also, let $c$ denote the $1 \times I$ vector of the experimentally measured number of dead cells at time $t_{1}$ , averaged across the $R$ experimental replicates. We should then have

{dD}^{T} = c .

Assuming $D$ has rank $K$ , we can solve this equation for $d$ as follows:

d = c D {(D^{T} D)}^{- 1},

which gives an estimate for the vector of death rates $d$ . An estimate for the birth rates $b$ can then be computed as $b = λ + d$ .
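The least-squares step for the death rates can be illustrated for $K = 2$. The sketch below is not the authors' MATLAB code: the starting cell numbers, net birth rates and death rates are hypothetical, the dead-cell counts `c` are generated without noise from $d D^{T} = c$, and the function assumes $λ_{j} \neq 0$.

```python
import math


def accumulated_exposure(n_ij, lam_j, t1):
    """D_ij = n_ij * (exp(lam_j * t1) - 1) / lam_j (requires lam_j != 0)."""
    return n_ij * (math.exp(lam_j * t1) - 1.0) / lam_j


def estimate_death_rates(n, lam, c, t1):
    """Least-squares solution d = c D (D^T D)^{-1} of d D^T = c for K = 2."""
    I = len(n)
    D = [[accumulated_exposure(n[i][j], lam[j], t1) for j in range(2)]
         for i in range(I)]
    # G = D^T D (2 x 2) and cD = c D (1 x 2).
    G = [[sum(D[i][a] * D[i][b] for i in range(I)) for b in range(2)]
         for a in range(2)]
    cD = [sum(c[i] * D[i][a] for i in range(I)) for a in range(2)]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    G_inv = [[G[1][1] / det, -G[0][1] / det],
             [-G[1][0] / det, G[0][0] / det]]
    return [sum(cD[a] * G_inv[a][b] for a in range(2)) for b in range(2)]


# Hypothetical inputs for illustration.
n = [[900.0, 100.0], [200.0, 800.0]]    # n_ij: starting cells of type j, condition i
lam = [0.3, 0.5]                         # net birth rate estimates
d_true = [0.3, 0.5]
t1 = 1.0

# Noise-free dead-cell counts c_i = sum_j d_j D_ij.
c = [sum(d_true[j] * accumulated_exposure(n[i][j], lam[j], t1)
         for j in range(2)) for i in range(2)]

d_hat = estimate_death_rates(n, lam, c, t1)
b_hat = [l + d for l, d in zip(lam, d_hat)]     # b = lambda + d
```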

In addition to being used to initialize the optimization, the initial guess $θ_{num}^{(0)}$ is used to estimate the relative scales of the parameters ${(ν_{i k})}_{k \neq i}$ , $λ$ and $b$ . In particular, for the $i$ -th coordinate of the initial guess, we define the corresponding scale variable

s_{i}^{(0)} : = 10^{⌊\log_{10} |θ_{num, i}^{(0)}|⌋},

with $s_{i}^{(0)} : = 1$ if $θ_{num, i}^{(0)} = 0$ . For example, if the initial guesses are $b^{(0)} = (1.5, 1.2)$ for the birth rates, $λ^{(0)} = (0.3, 0.4)$ for the net birth rates, and $(ν_{12}^{(0)}, ν_{21}^{(0)}) = (0.05, 0.002)$ for the switching rates, the corresponding scale variables are (1, 1), (0.1, 0.1) and (0.01, 0.001), respectively. For a given parameter vector $θ_{num}$ , we define the transformed vector

{\tilde{θ}}_{num} : = θ_{num} ⊘ s^{(0)},

where $⊘$ denotes elementwise division. For the initial guesses $b^{(0)} = (1.5, 1.2)$, $λ^{(0)} = (0.3, 0.4)$ and $(ν_{12}^{(0)}, ν_{21}^{(0)}) = (0.05, 0.002)$, the corresponding transformed values are ${\tilde{b}}^{(0)} = (1.5, 1.2)$, ${\tilde{λ}}^{(0)} = (3, 4)$ and $({\tilde{ν}}_{12}^{(0)}, {\tilde{ν}}_{21}^{(0)}) = (5, 2)$. With this transformation, all nonzero coordinates of the initial guess have absolute value in $[1, 10)$. When we solve the MLE problem (19), we treat ${\tilde{θ}}_{num}$ as the parameter vector instead of $θ_{num}$, and solve

\min_{{\tilde{θ}}_{num} \in {\tilde{Θ}}_{num}} l_{num} ({\tilde{θ}}_{num} ⊙ s^{(0)}),

(52)

where ${\tilde{Θ}}_{num}$ is the transformed set of feasible parameters and $⊙$ denotes elementwise multiplication. The parameter scaling is applied to ensure that all model parameters are of a similar magnitude in the optimization.
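The scale variables and the transformed parameter vector can be computed as in the following Python sketch, which reproduces the numerical example from the text. The function name and the ordering of the coordinates in `theta0` are our own choices for illustration.

```python
import math


def scale_variables(theta0):
    """s_i = 10^floor(log10 |theta_i|), with s_i = 1 when theta_i = 0."""
    return [1.0 if x == 0 else 10.0 ** math.floor(math.log10(abs(x)))
            for x in theta0]


# Initial guesses from the text: b = (1.5, 1.2), lambda = (0.3, 0.4),
# (nu_12, nu_21) = (0.05, 0.002).
theta0 = [1.5, 1.2, 0.3, 0.4, 0.05, 0.002]
s0 = scale_variables(theta0)

theta0_tilde = [x / s for x, s in zip(theta0, s0)]       # elementwise division
theta0_back = [x * s for x, s in zip(theta0_tilde, s0)]  # elementwise multiplication
```

An optimizer would then work with the transformed vector and map it back through the elementwise product before evaluating the objective, as in (52).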

In most cases, we have found it sufficient to solve the optimization problem (52) once. However, in our codes, we provide an option to solve the problem multiple times, using (i) user-supplied initial guesses, (ii) initial guesses based on the simple estimates from (51), with new birth rates selected randomly each time, or (iii) randomly sampled initial guesses, using the parameter generation procedure described in Appendix H below.

The optimization problems (21) for the endpoints of the confidence intervals are solved in a similar way, except the initial guess is taken to be the maximum likelihood estimate.

F.2. Cell fraction data

The implementation for cell fraction data is similar with the following modifications. First of all, we parametrize the model in terms of the death rates $d$ , the net birth rate $λ_{1}$ and the net birth rate differences $λ^{[- 1]}$ , instead of the birth rates $b$ and net birth rates $λ$ (Section 3.4). Second, the initial guess for the MLE problem (26) is based on solving the following least squares problem, which minimizes the sum of squared errors between the mean prediction of the statistical model (13) and the data:

\{{\hat{λ}}^{[- 1]}, {({\hat{ν}}_{i k})}_{k \neq i}\} = {argmin}_{λ^{[- 1]}, {(ν_{i k})}_{k \neq i}} \sum_{i = 1}^{I} \sum_{ℓ = 1}^{L} \sum_{r = 1}^{R} {‖f_{i, ℓ, r} - p^{f_{i}} (t_{ℓ})‖}^{2} .

(53)

Note that this is a continuous-time version of the TRANSCOMPP problem (2). When solving (53), we need to supply an initial guess. If experiments are conducted from isolated initial conditions, we compute initial guesses for the switching rates ${(ν_{i k})}_{k \neq i}$ based on part (1) of Proposition 2, which shows how ${(ν_{i k})}_{k \neq i}$ can be estimated from the slopes of the mean functions $p^{(j)} (t)$ at time zero. We approximate the slopes of $p^{(j)} (t)$ at time zero using experimentally observed cell fractions at the first time point. The initial guesses for the remaining parameters are set to 0. If experiments are not conducted from isolated initial conditions, we randomly sample initial guesses as described in Appendix H below. The simple problem (53) returns estimates for ${(ν_{i k})}_{k \neq i}$ and $λ^{[- 1]}$ , which we supply as initial guesses to (26).
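For isolated initial conditions, the slope-based initial guess can be sketched as follows. This Python snippet is an illustration, not the authors' implementation: it assumes the slope at time zero is approximated by a forward difference from the known starting fraction (zero for every type other than the starting type) to the observed fraction at the first time point, and the fractions used are hypothetical.

```python
def switching_rate_guesses(first_fractions, t1):
    """Forward-difference guesses for the switching rates.

    first_fractions[j][k] is the observed fraction of type k+1 at time t1
    when the experiment starts from type j+1 cells only.
    """
    K = len(first_fractions)
    guesses = {}
    for j in range(K):
        for k in range(K):
            if k != j:
                # The fraction of type k+1 is 0 at time zero, so the slope
                # is approximated by f / t1; negative values are clipped.
                guesses[(j + 1, k + 1)] = max(first_fractions[j][k], 0.0) / t1
    return guesses


# Hypothetical cell fractions at the first time point t1 = 1.
first_fractions = [[0.98, 0.02],
                   [0.04, 0.96]]
nu_guess = switching_rate_guesses(first_fractions, 1.0)
```

The forward difference ignores curvature, so the guess degrades when the first time point is late relative to the switching and growth time scales; it only serves to initialize (53).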

In our codes, we implement the following likelihood function for cell fraction data:

\begin{array}{l} 𝓛_{frac} (θ_{frac} | {(f_{i, ℓ, r})}_{i, ℓ, r}) \\ = \prod_{i = 1}^{I} \prod_{ℓ = 1}^{L} \prod_{r = 1}^{R} {({(2 π)}^{K - 1} \det (C_{i}^{T} B^{T} (N_{i}^{- 1} S^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{frac}) B C_{i}))}^{- 1 / 2} \\ \cdot \exp (- \frac{1}{2} (f_{i, ℓ, r} - p^{f_{i}} (t_{ℓ})) B C_{i} {(C_{i}^{T} B^{T} (N_{i}^{- 1} S^{f_{i}} (t_{ℓ}) + E_{i, ℓ}^{frac}) B C_{i})}^{- 1} \\ C_{i}^{T} B^{T} {(f_{i, ℓ, r} - p^{f_{i}} (t_{ℓ}))}^{T}) . \end{array}

Recall from (24) that the matrix $B$ is applied to reduce the data vector $f_{i, ℓ, r}$ to a ( $K - 1$ )-dimensional vector. To accommodate reducible switching dynamics, the user is allowed to implement a further reduction in the data by specifying a $(K - 1) \times J_{i}$ matrix $C_{i}$ for each initial condition $i$ . This can for example be useful for the four-type model $(K = 4)$ displayed in Figure 9, in which case we would take $I = 3$ , $C_{1} = C_{2} = I$ and $C_{3} = e_{3}^{T}$ , and we would restrict the set of feasible parameters $Θ_{frac}$ so that $ν_{13} = ν_{14} = ν_{24} = ν_{31} = ν_{32} = ν_{41} = ν_{42} = ν_{43} = 0$ . Note that here, $I$ refers to the $(K - 1) \times (K - 1) = 3 \times 3$ identity matrix.

Figure 9: — An example of a four-type switching model where the likelihood function (24) for cell fraction data from the main text must be modified to avoid degeneracy issues. This model structure can e.g. arise in the context of epigenetically-driven drug resistance in cancer, where drug-sensitive (type-1) cells can acquire transient resistance (type-2), which then evolves gradually to stable resistance (type-4) in two steps [25].

G. Additional numerical results

This section contains additional numerical results to those discussed in Section 4.2 of the main text.

G.1. Illustrative example

For illustrative purposes, we show here a graphical depiction of the output of our estimation framework for a single dataset. We generated artificial cell number and cell fraction data by performing a stochastic simulation of the branching process model from Section 3.1. We then used the data to compute MLE estimates and confidence intervals for the model parameters. The data was generated assuming $K = 2$ cell types, $I = 2$ isolated initial conditions, $L = 6$ time points, and $R = 3$ replicates. Estimation results are shown in Figure 10.

Figure 10: — Graphical depiction of the output of our estimation framework. We first generated artificial cell-number and cell-fraction data by simulating the branching process model of Section 3.1 for $b_{1} = 0.6$, $d_{1} = 0.3$, $b_{2} = 1.0$, $d_{2} = 0.5$, $ν_{12} = 0.02$, $ν_{21} = 0.04$ and $N_{1} = N_{2} = 1{,}000$. Using this data, we computed maximum likelihood estimates and likelihood-based 95% confidence intervals (CIs) for the model parameters. For each parameter, the shaded region indicates the CI, the vertical bar inside the interval indicates the MLE estimate, and the arrow points to the true value of the parameter.
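A stochastic simulation of a two-type model with division, death and switching can be sketched with the Gillespie algorithm. The Python code below is a minimal illustration and not the authors' simulation code: it assumes that each cell divides, dies or switches type independently at constant rates, and it uses the parameter values of Figure 10 with an experiment started from type-1 cells only.

```python
import random


def simulate_two_type(counts, b, d, nu12, nu21, duration, rng):
    """Gillespie simulation over `duration`; returns the final [n1, n2]."""
    n = list(counts)
    t = 0.0
    while True:
        rates = [b[0] * n[0], d[0] * n[0], nu12 * n[0],
                 b[1] * n[1], d[1] * n[1], nu21 * n[1]]
        total = sum(rates)
        if total <= 0.0:
            return n                    # no further events are possible
        t += rng.expovariate(total)     # waiting time to the next event
        if t > duration:
            return n
        event = rng.choices(range(6), weights=rates)[0]
        if event == 0:
            n[0] += 1                   # type-1 division
        elif event == 1:
            n[0] -= 1                   # type-1 death
        elif event == 2:
            n[0] -= 1                   # switch from type 1 to type 2
            n[1] += 1
        elif event == 3:
            n[1] += 1                   # type-2 division
        elif event == 4:
            n[1] -= 1                   # type-2 death
        else:
            n[1] -= 1                   # switch from type 2 to type 1
            n[0] += 1


rng = random.Random(0)
b, d = (0.6, 1.0), (0.3, 0.5)
state = [1000, 0]                       # isolated type-1 initial condition
trajectory = []
for _ in range(6):                      # record at t = 1, ..., 6
    state = simulate_two_type(state, b, d, 0.02, 0.04, 1.0, rng)
    trajectory.append(list(state))

fractions = [[n1 / (n1 + n2), n2 / (n1 + n2)]
             for n1, n2 in trajectory if n1 + n2 > 0]
```

Restarting the simulation at each recording time is valid because the process is Markovian. Cell number data correspond to `trajectory` and cell fraction data to `fractions`; neither includes the measurement error terms of the statistical model.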

Note first the difference in scale between the switching rates and the rates involving cell division and death. This is typically the case, since epigenetic modifications can generally be retained for 10–10⁵ cell divisions [55, 3]. Also note that all model parameters are estimated more accurately for cell number data than cell fraction data, in that their confidence intervals are narrower for cell number data. Otherwise, the relative accuracy with which different model parameters can be estimated is in line with our identifiability analysis in Section 4.1.

G.2. Experimental design: Adding replicates vs. adding time points

In this section, we discuss how our framework can be used to evaluate to what extent additional data can improve parameter estimates and to identify experimental designs that best accomplish this goal. To illustrate this point, we compared the effect of (i) doubling the number of replicates from $R = 3$ to $R = 6$ (design 1), (ii) doubling the number of time points from $L = 6$ to $L = 12$, adding time points in between the previous time points (design 2), and (iii) doubling the number of time points, adding time points after the previous points (design 3) (Appendix H). We generated 10 parameter regimes and 100 datasets for each regime. The results are shown in Figure 11.

Figure 11: — Comparison of estimation error for different experimental designs when the number of data points is doubled. We generated 10 parameter regimes and 100 datasets for each regime. The blue dots represent estimation from datasets with $L = 6$ time points and $R = 3$ replicates. The red dots represent estimation from $L = 6$ time points and $R = 6$ replicates. The green and grey dots represent estimation from $L = 12$ time points and $R = 3$ replicates, where the extra time points are added in between and after the previous time points, respectively. Panel **(a)** shows estimation from cell number data and panel **(b)** shows estimation from cell fraction data.

For cell number data, the median CV for the switching rate $ν_{21}$ and the net birth rate $λ_{2}$ reduces by 26% and 27%, respectively, when the number of replicates is doubled (design 1) (Fig. 11a). This is consistent with the fact that the standard deviation of an MLE estimator can be expected to decrease with $1 / \sqrt{n}$ , where $n$ is the number of datapoints $(1 - 1 / \sqrt{2} = 0.29)$ [38]. Adding data from time points in between the previous time points (design 2) has a similar effect on the median CV. However, adding time points after the previous points (design 3) reduces the median CV of $ν_{21}$ and $λ_{2}$ by 23% and 16%, respectively, over adding replicates (design 1). We also note that the 10th percentile of the CV for $ν_{21}$ and $λ_{2}$ reduces by 26% and 42%, respectively, between design 1 and design 3, which indicates that the degree of improvement between design 1 and design 3 depends very much on the parameter regime.

For cell fraction data, the relative attractiveness of the three experimental designs is similar (Fig. 11b). However, in this case, the estimate for the net birth rate difference $λ_{2} - λ_{1}$ benefits significantly more from using design 3 than the estimate for the switching rate $ν_{21}$ . For example, the median CV for $ν_{21}$ reduces by 16% and the 10th percentile by 30% between design 1 and design 3, while the analogous reduction for $λ_{2} - λ_{1}$ is 59% and 53%, respectively.

In our structural identifiability analysis for cell fraction data (Section 4.1.2), we observed that it is more difficult to estimate $λ_{2} - λ_{1}$ than $ν_{21}$ from the initial population dynamics, and that $λ_{2} - λ_{1}$ can be identified from the equilibrium proportions $\bar{γ}$ if the switching rates ${(ν_{i k})}_{k \neq i}$ are known. The fact that adding more information on the long-run behavior of the population benefits the estimation of $λ_{2} - λ_{1}$ more than $ν_{21}$ is consistent with these insights. Of course, the results of Section 4.2.2 indicate that the estimation of $λ_{2} - λ_{1}$ can be improved even further by using cell number data as opposed to cell fraction data.

In general, Sections 4.2.2 and G.2 show how our framework can be used to evaluate the estimation accuracy that can be achieved by different experimental designs, depending e.g. on what data is collected, when it is collected, how many replicates are performed, etc.

H. Generation of artificial data

Here, we discuss how the artificial data was generated for the numerical experiments in Section 4.2. First, to generate each parameter regime, we sampled the birth rates $b$ and death rates $d$ uniformly at random on (0, 1), with the following caveats: The birth rates $b$ and net birth rates $λ$ were required to be larger than 0.01 in absolute value, and at least one of the net birth rates $λ_{1}, \dots, λ_{K}$ was required to be positive. Each switching rate $ν_{i j}$ was sampled as $10^{- 3 + 2 U}$, where $U$ is uniform between 0 and 1, meaning that it was sampled log-uniformly between 10⁻³ and 10⁻¹. The starting number of cells $N_{i}$ was chosen as $N_{i} = 10^{3}$, $N_{i} = 10^{4}$ or $N_{i} = 10^{5}$ for $i = 1, \dots, K$ based on the order of magnitude of the smallest switching rate. The experimental time points were selected as $t = 1, \dots, 6$.

In Section G.2, where the number of time points was doubled, the time points were taken as either $t = 0.5, 1, 1.5, 2, \dots, 6$ or $t = 1, 2, 3, \dots, 12$ , depending on whether the new time points were added in between or after the previous time points.
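The sampling scheme can be sketched in Python as follows. This is an illustration rather than the authors' MATLAB code: enforcing the caveats by rejection sampling is our assumption, and the function name is hypothetical.

```python
import random


def sample_regime(K, rng):
    """Sample birth rates, death rates and switching rates for K types."""
    while True:
        b = [rng.random() for _ in range(K)]
        d = [rng.random() for _ in range(K)]
        lam = [bi - di for bi, di in zip(b, d)]
        # Caveats from the text, enforced here by rejection sampling.
        if (all(bi > 0.01 for bi in b)
                and all(abs(l) > 0.01 for l in lam)
                and any(l > 0 for l in lam)):
            break
    # Log-uniform switching rates between 1e-3 and 1e-1.
    nu = {(i, k): 10.0 ** (-3.0 + 2.0 * rng.random())
          for i in range(1, K + 1) for k in range(1, K + 1) if i != k}
    return b, d, nu


rng = random.Random(0)
b, d, nu = sample_regime(2, rng)

base_times = [float(t) for t in range(1, 7)]      # t = 1, ..., 6
design2_times = [0.5 * k for k in range(1, 13)]   # t = 0.5, 1, ..., 6
design3_times = [float(t) for t in range(1, 13)]  # t = 1, 2, ..., 12
```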

Once the parameters were set, we performed stochastic simulations of the model in Section 3.1 to obtain the artificial datasets. The parameter regimes used to perform the simulations are available in the GitHub repository for the paper (https://github.com/egunnars/phenotypic_switching_inference/). The background MATLAB codes used to generate the parameter regimes and the artificial datasets, and to perform estimation on the artificial datasets, are also available in the same repository.

I. AIC and BIC

To evaluate model fit relative to model complexity in Section 4.3, we use the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). For a statistical model with parameters $θ$ and negative double log-likelihood $l (θ)$ , the AIC and BIC are given by

AIC = l (\hat{θ}) + 2 p,

BIC = l (\hat{θ}) + p \log (n),

where $\hat{θ}$ is the MLE estimate, $p$ is the number of parameters in the statistical model, and $n$ is the number of datapoints. When comparing two models, the model with the lower AIC or BIC is preferred, depending on which criterion is used. The BIC criterion generally favors simpler models, i.e. models with fewer parameters, to a greater extent than the AIC criterion.
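Both criteria are simple to compute once the negative double log-likelihood at the MLE is available. The Python sketch below uses hypothetical likelihood values, parameter counts and sample size for two competing models; the function names are ours.

```python
import math


def aic(neg2_loglik, num_params):
    """AIC = l(theta_hat) + 2 p, with l the negative double log-likelihood."""
    return neg2_loglik + 2 * num_params


def bic(neg2_loglik, num_params, num_datapoints):
    """BIC = l(theta_hat) + p log(n)."""
    return neg2_loglik + num_params * math.log(num_datapoints)


# Hypothetical comparison: a full model versus a constrained model.
n = 36
full = {"l": 100.0, "p": 6}
constrained = {"l": 103.0, "p": 5}

aic_full, aic_con = aic(full["l"], full["p"]), aic(constrained["l"], constrained["p"])
bic_full, bic_con = bic(full["l"], full["p"], n), bic(constrained["l"], constrained["p"], n)
```

In this made-up example the AIC prefers the full model (112 versus 113), while the BIC prefers the constrained model, because $\log (36) \approx 3.58$ exceeds the fit improvement of 3 gained from the extra parameter.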

Footnotes

Competing Interests Statement. The authors have no competing interests to declare.


References

[1].Brock Amy, Chang Hannah, and Huang Sui. Non-genetic heterogeneity - a mutation-independent driving force for the somatic evolution of tumours. Nature Reviews Genetics, 10(5):336, 2009. [DOI] [PubMed] [Google Scholar]
[2].Peter A Jones and Stephen B Baylin. The epigenomics of cancer. Cell, 128(4):683–692, 2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
[3].Brown Robert, Curry Edward, Magnani Luca, Charlotte S Wilhelm-Benartzi, and Jane Borley. Poised epigenetic states and acquired drug resistance in cancer. Nature Reviews Cancer, 14(11):747, 2014. [DOI] [PubMed] [Google Scholar]
[4].Flavahan William A, Gaskell Elizabeth, and Bernstein Bradley E. Epigenetic plasticity and the hallmarks of cancer. Science, 357(6348):eaal2380, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
[5].Salgia Ravi and Kulkarni Prakash. The genetic/non-genetic duality of drug ‘resistance’in cancer. Trends in cancer, 4(2):110–118, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
[6].Biswas Antara and De Subhajyoti. Drivers of dynamic intratumor heterogeneity and phenotypic plasticity. American Journal of Physiology-Cell Physiology, 320(5):C750–C760, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
[7].Sharma Sreenath V, Lee Diana Y, Li Bihua, Quinlan Margaret P, Takahashi Fumiyuki, Maheswaran Shyamala, McDermott Ultan, Azizian Nancy, Zou Lee, Fischbach Michael A, et al. A chromatin-mediated reversible drug-tolerant state in cancer cell subpopulations. Cell, 141(1):69–80, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
[8].Ramirez Michael, Rajaram Satwik, Steininger Robert J, Osipchuk Daria, Roth Maike A, Morinishi Leanna S, Evans Louise, Ji Weiyue, Hsu Chien-Hsiang, Thurley Kevin, et al. Diverse drug-resistance mechanisms can emerge from drug-tolerant cancer persister cells. Nature communications, 7:10690, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
[9].Hata Aaron N, Niederst Matthew J, Archibald Hannah L, Gomez-Caraballo Maria, Siddiqui Faria M, et al. Tumor cells can follow distinct evolutionary paths to become resistant to epidermal growth factor receptor inhibition. Nature Medicine, 22(3):262–269, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
[10].Roesch Alexander, Fukunaga-Kalabis Mizuho, Schmidt Elizabeth C, Zabierowski Susan E, Brafford Patricia A, Vultur Adina, Basu Devraj, Gimotty Phyllis, Vogt Thomas, and Herlyn Meenhard. A temporarily distinct subpopulation of slow-cycling melanoma cells is required for continuous tumor growth. Cell, 141(4):583–594, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
[11].Shaffer Sydney M, Dunagin Margaret C, Torborg Stefan R, Torre Eduardo A, Emert Benjamin, Krepler Clemens, Beqiri Marilda, Sproesser Katrin, Brafford Patricia A, Xiao Min, et al. Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature, 546(7658):431, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
[12].Su Yapeng, Wei Wei, Robert Lidia, Xue Min, Tsoi Jennifer, Angel Garcia-Diaz, Moreno Blanca Homet, Kim Jungwoo, Ng Rachel H, Lee Jihoon W, et al. Single-cell analysis resolves the cell state transition and signaling dynamics associated with melanoma drug-induced resistance. Proceedings of the National Academy of Sciences, 114(52):13679–13684, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
[13].Liau Brian B, Sievers Cem, Donohue Laura K, Gillespie Shawn M, Flavahan William A, Miller Tyler E, Venteicher Andrew S, Hebert Christine H, Carey Christopher D, Rodig Scott J, et al. Adaptive chromatin remodeling drives glioblastoma stem cell plasticity and drug tolerance. Cell stem cell, 20(2):233–246, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
[14].Neftel Cyril, Laffy Julie, Filbin Mariella G, Hara Toshiro, Shore Marni E, Rahme Gilbert J, Richman Alyssa R, Silverbush Dana, Shaw McKenzie L, Hebert Christine M, et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell, 178(4):835–849, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
[15].Pisco Angela Oliveira, Brock Amy, Zhou Joseph, Moor Andreas, Mojtahedi Mitra, Jackson Dean, and Huang Sui. Non-darwinian dynamics in therapy-induced cancer drug resistance. Nature communications, 4:2467, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
[16].Knoechel Birgit, Roderick Justine E, Williamson Kaylyn E, Zhu Jiang, Lohr Jens G, Cotton Matthew J, Gillespie Shawn M, Fernandez Daniel, Ku Manching, Wang Hongfang, et al. An epigenetic mechanism of resistance to targeted therapy in t cell acute lymphoblastic leukemia. Nature genetics, 46(4):364, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
[17].Yang G, Quan Y, Wang W, Fu Q, Wu J, Mei T, Li J, Tang Y, Luo C, Ouyang Q, et al. Dynamic equilibrium between cancer stem cells and non-stem cancer cells in human sw620 and mcf-7 cancer cell populations. British journal of cancer, 106(9):1512–1519, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
[18].Feng Jian-Ming, Miao Ze-Hong, Jiang Yi, Chen Yi, Li Jia-Xin, Tong Lin-Jiang, Zhang Jin, Huang Yi-Ran, and Ding Jian. Characterization of the conversion between cd133+ and cd133-cells in colon cancer sw620 cell line. Cancer biology & therapy, 13(14):1396–1406, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
[19].Geng Yue, Chandrasekaran Siddarth, Agastin Sivaprakash, Li Jiahe, and King Michael R. Dynamic switch between two adhesion phenotypes in colorectal cancer cells. Cellular and molecular bioengineering, 7(1):35–44, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
[20].Wang Weikang, Quan Yi, Fu Qibin, Liu Yu, Liang Ying, Wu Jingwen, Yang Gen, Luo Chunxiong, Ouyang Qi, and Wang Yugang. Dynamics between cancer cell subpopulations reveals a model coordinating with both hierarchical and stochastic concepts. PloS one, 9(1):e84654, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
[21].Gupta Piyush B, Fillmore Christine M, Jiang Guozhi, Shapira Sagi D, Tao Kai, Kuperwasser Charlotte, and Lander Eric S. Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell, 146(4):633–644, 2011. [DOI] [PubMed] [Google Scholar]
[22].Goldman Aaron, Majumder Biswanath, Dhawan Andrew, Ravi Sudharshan, Goldman David, Kohandel Mohammad, Majumder Pradip K, and Sengupta Shiladitya. Temporally sequenced anticancer drugs overcome adaptive resistance by targeting a vulnerable chemotherapy-induced phenotypic transition. Nature communications, 6:6139, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
[23].Jordan Nicole Vincent, Bardia Aditya, Wittner Ben S, Benes Cyril, Ligorio Matteo, Zheng Yu, Yu Min, Sundaresan Tilak K, Licausi Joseph A, Desai Rushil, et al. Her2 expression identifies dynamic functional states within circulating breast cancer cells. Nature, 537(7618):102–106, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
[24].Bhatia Sugandha, Monkman James, Blick Tony, Pinto Cletus, Waltham Mark, Nagaraj Shivashankar H, and Thompson Erik W. Interrogation of phenotypic plasticity between epithelial and mesenchymal states in breast cancer. Journal of clinical medicine, 8(6):893, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
[25].Gunnarsson Einar Bjarki, De Subhajyoti, Leder Kevin, and Foo Jasmine. Understanding the role of phenotypic switching in cancer drug resistance. Journal of Theoretical Biology, 490:110162, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
[26].Buder Thomas, Deutsch Andreas, Seifert Michael, and Voss-Böhme Anja. CellTrans: an R package to quantify stochastic cell state transitions. Bioinformatics and biology insights, 11:1177932217712241, 2017.
[27].Suhas Jagannathan N, Ihsan Mario O, Kin Xiao Xuan, Welsch Roy E, Clément Marie-Véronique, and Tucker-Kellogg Lisa. TRANSCOMPP: understanding phenotypic plasticity by estimating Markov transition rates for cell state transitions. Bioinformatics, 36(9):2813–2820, 2020.
[28].Li Xin and Thirumalai D. A mathematical model for phenotypic heterogeneity in breast cancer with implications for therapeutic strategies. Journal of the Royal Society Interface, 19(186):20210803, 2022.
[29].Zhou Joseph Xu, Pisco Angela Oliveira, Qian Hong, and Huang Sui. Nonequilibrium population dynamics of phenotype conversion of cancer cells. PloS one, 9(12):e110714, 2014.
[30].Devaraj Vimalathithan and Bose Biplab. Morphological state transition dynamics in EGF-induced epithelial to mesenchymal transition. Journal of clinical medicine, 8(7):911, 2019.
[31].Dirkse Anne, Golebiewska Anna, Buder Thomas, Nazarov Petr V, Muller Arnaud, Poovathingal Suresh, Brons Nicolaas HC, Leite Sonia, Sauvageot Nicolas, Sarkisjan Dzjemma, et al. Stem cell-associated heterogeneity in glioblastoma results from intrinsic tumor plasticity shaped by the microenvironment. Nature communications, 10(1):1–16, 2019.
[32].Vipparthi Kavya, Hari Kishore, Chakraborty Priyanka, Ghosh Subhashis, Patel Ankit Kumar, Ghosh Arnab, Biswas Nidhan Kumar, Sharan Rajeev, Arun Pattatheyil, Jolly Mohit Kumar, et al. Emergence of hybrid states of stem-like cancer cells correlates with poor prognosis in oral cancer. iScience, 25(5):104317, 2022.
[33].Chedere Adithya, Hari Kishore, Kumar Saurav, Rangarajan Annapoorni, and Jolly Mohit Kumar. Multi-stability and consequent phenotypic plasticity in AMPK-Akt double negative feedback loop in cancer cells. Journal of clinical medicine, 10(3):472, 2021.
[34].Devaraj Vimalathithan and Bose Biplab. The mathematics of phenotypic state transition: paths and potential. Journal of the Indian Institute of Science, 100(3):451–464, 2020.
[35].Athreya Krishna B and Ney Peter E. Branching processes. Courier Corporation, 2004.
[36].Bintu Lacramioara, Yong John, Antebi Yaron E, McCue Kayla, Kazuki Yasuhiro, Uno Narumi, Oshimura Mitsuo, and Elowitz Michael B. Dynamics of epigenetic regulation at the single-cell level. Science, 351(6274):720–724, 2016.
[37].Nam Arin, Mohanty Atish, Bhattacharya Supriyo, Kotnala Sourabh, Achuthan Srisairam, Hari Kishore, Srivastava Saumya, Guo Linlin, Nathan Anusha, Chatterjee Rishov, et al. Dynamic phenotypic switching and group behavior help non-small cell lung cancer cells evade chemotherapy. Biomolecules, 12(1):8, 2022.
[38].Casella George and Berger Roger L. Statistical inference. Cengage Learning, 2021.
[39].Neale Michael C and Miller Michael B. The use of likelihood-based confidence intervals in genetic models. Behavior genetics, 27(2):113–120, 1997.
[40].Fischer Samuel M and Lewis Mark A. A robust and efficient algorithm to find profile likelihood confidence intervals. Statistics and Computing, 31(4):1–17, 2021.
[41].Borisov Ivan and Metelkin Evgeny. Confidence intervals by constrained optimization—an algorithm and software package for practical identifiability analysis in systems biology. PLOS Computational Biology, 16(12):e1008495, 2020.
[42].Venzon DJ and Moolgavkar SH. A method for computing profile-likelihood-based confidence intervals. Journal of the Royal Statistical Society: Series C (Applied Statistics), 37(1):87–94, 1988.
[43].Raue Andreas, Kreutz Clemens, Maiwald Thomas, Bachmann Julie, Schilling Marcel, Klingmüller Ursula, and Timmer Jens. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 25(15):1923–1929, 2009.
[44].Yakovlev Andrei Y and Yanev Nikolay M. Relative frequencies in multitype branching processes. The annals of applied probability, 19(1):1–14, 2009.
[45].Rothenberg Thomas J. Identification in parametric models. Econometrica: Journal of the Econometric Society, pages 577–591, 1971.
[46].Browning Alexander P, Warne David J, Burrage Kevin, Baker Ruth E, and Simpson Matthew J. Identifiability analysis for stochastic differential equation models in systems biology. Journal of the Royal Society Interface, 17(173):20200652, 2020.
[47].Chis Oana-Teodora, Banga Julio R, and Balsa-Canto Eva. Structural identifiability of systems biology models: a critical comparison of methods. PloS one, 6(11):e27755, 2011.
[48].Russo Mariangela, Pompei Simone, Sogari Alberto, Corigliano Mattia, Crisafulli Giovanni, Puliafito Alberto, Lamba Simona, Erriquez Jessica, Bertotti Andrea, Gherardi Marco, et al. A modified fluctuation-test framework characterizes the population dynamics and mutation rate of colorectal cancer persister cells. Nature Genetics, 54(7):976–984, 2022.
[49].Greene James M, Gevertz Jana L, and Sontag Eduardo D. Mathematical approach to differentiate spontaneous and induced evolution to drug resistance during cancer treatment. JCO clinical cancer informatics, 3:1–20, 2019.
[50].Kuosmanen Teemu, Cairns Johannes, Noble Robert, Beerenwinkel Niko, Mononen Tommi, and Mustonen Ville. Drug-induced resistance evolution necessitates less aggressive treatment. PLoS computational biology, 17(9):e1009418, 2021.
[51].Angelini Erin, Wang Yue, Zhou Joseph Xu, Qian Hong, and Huang Sui. A model for the intrinsic limit of cancer therapy: Duality of treatment-induced cell death and treatment-induced stemness. PLOS Computational Biology, 18(7):e1010319, 2022.
[52].Benzekry Sébastien, Lamont Clare, Beheshti Afshin, Tracz Amanda, Ebos John ML, Hlatky Lynn, and Hahnfeldt Philip. Classical mathematical models for description and prediction of experimental tumor growth. PLoS computational biology, 10(8):e1003800, 2014.
[53].Steiert Bernhard, Raue Andreas, Timmer Jens, and Kreutz Clemens. Experimental design for parameter estimation of gene regulatory networks. PloS one, 7(7):e40052, 2012.
[54].Durrett Richard. Branching process models of cancer. In Branching Process Models of Cancer, pages 1–63. Springer, 2015.
[55].Niepel Mario, Spencer Sabrina L, and Sorger Peter K. Non-genetic cell-to-cell variability and the consequences for pharmacology. Current opinion in chemical biology, 13(5–6):556–561, 2009.

