Skip to main content
BMC Bioinformatics logoLink to BMC Bioinformatics
. 2023 Jun 2;24:228. doi: 10.1186/s12859-023-05269-1

A mixed-effects stochastic model reveals clonal dominance in gene therapy safety studies

Luca Del Core 1,2,, Danilo Pellin 3,, Ernst C Wit 1,4,✉,#, Marco A Grzegorczyk 1,✉,#
PMCID: PMC10239124  PMID: 37268887

Abstract

Background

Mathematical models of haematopoiesis can provide insights on abnormal cell expansions (clonal dominance), and in turn can guide safety monitoring in gene therapy clinical applications. Clonal tracking is a recent high-throughput technology that can be used to quantify cells arising from a single haematopoietic stem cell ancestor after a gene therapy treatment. Thus, clonal tracking data can be used to calibrate the stochastic differential equations describing clonal population dynamics and hierarchical relationships in vivo.

Results

In this work we propose a random-effects stochastic framework that allows to investigate the presence of events of clonal dominance from high-dimensional clonal tracking data. Our framework is based on the combination between stochastic reaction networks and mixed-effects generalized linear models. Starting from the Kramers–Moyal approximated Master equation, the dynamics of cells duplication, death and differentiation at clonal level, can be described by a local linear approximation. The parameters of this formulation, which are inferred using a maximum likelihood approach, are assumed to be shared across the clones and are not sufficient to describe situation in which clones exhibit heterogeneity in their fitness that can lead to clonal dominance. In order to overcome this limitation, we extend the base model by introducing random-effects for the clonal parameters. This extended formulation is calibrated to the clonal data using a tailor-made expectation-maximization algorithm. We also provide the companion graphic file with name 12859_2023_5269_Figa_HTML.jpg package RestoreNet, publicly available for download at https://cran.r-project.org/package=RestoreNet.

Conclusions

Simulation studies show that our proposed method outperforms the state-of-the-art. The application of our method in two in-vivo studies unveils the dynamics of clonal dominance. Our tool can provide statistical support to biologists in gene therapy safety analyses.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-023-05269-1.

Keywords: Stochastic reaction networks, Mixed-effects models, E-M algorithm, τ-Leaping, Gene therapy, Clonal dominance

Background

In gene therapy the correction of the defective gene(s) underlying the disease is, in principle, sufficient for inducing disease remission or even full recovery [1]. Since the blood system possesses a hierarchical structure with haematopoietic stem cells (HSCs) at its root [2], correction of large numbers of HSCs might be sufficient to eradicate a genetic disease [3, 4]. But genetic modification of large numbers of cells is associated with the higher probability of unintentional vector insertions near proto oncogenes, that may lead to insertional mutagenesis [57]. Insertional mutagenesis causes a significant change in clone fitness that can lead to the clones’ abnormal expansion and to an unbalanced contribution of different clones to blood cells production. Clonal dominance, characterised by the outgrowth of a small subset of clones (oligoclonality) or one clone (monoclonality) in the most extreme cases, poses serious concerns in the context of gene therapy clinical trials because they might represent the initial stage of a leukemic transformation and are in general considered negative predictors of long term therapeutic benefit.

Clonal dominance in malignant haematopoiesis has been previously identified as a consequence of a clonal competition that is corrupted by disease progression [8, 9]. However, clonal dominance has also been observed in normal haematopoiesis, even in the case of truly neutral clonal markers [1012]. Indeed, on the basis of various mathematical models, progression of monoclonality has been discussed also for normal (non-leukaemic) stem cell systems [1317]. While there is strong evidence for clonal selection inducing monoclonal systems in the crypts of the small intestine [1821], such a process has not been demonstrated for the haematopoietic system yet. There are several high-throughput systems that allow to quantitatively investigate those mechanisms. In gene therapy applications, clonal tracking is performed by using permanent molecular identifier integrated in the host cell genome. In pre-clinical animal studies, these are short fragments of random or semi-random DNA stretches called barcodes, whereas in clinical setting vector integration sites are in general used. After transplantation, all the progeny deriving through cell differentiation inherits the original labels, thus allowing computational modelling to unveil population dynamics and hierarchical relationships in vivo [2225].

Here we extend the work by [26, 27] and propose a random-effects cell differentiation network to detect the dynamics of clonal expansion from high dimensional clonal tracking data. In particular, starting from the definition of the Master equation [28], a set of Ito-type stochastic differential equations is derived to describe the first two-order moments of the process. We estimate the parameters of the Ito system from its Euler-Maruyama local linear approximation (LLA) [29] using a maximum likelihood approach. Although the base LLA model formulation has been shown to be effective in modelling cell differentiation [27], it has some limitations as it considers all clone trajectories to be iid realizations of the same underlying stochastic process, and does not take into account possible heterogeneous behaviour across the clones. Therefore, the base LLA formulation cannot be used to model clonal dominance. In this work we further increase the flexibility of the base LLA model to take into account for potential heterogeneity in clones’ behaviour in both duplication and differentiation rates. To this end we introduce random-effects for the clones inside the LLA formulation, providing a mixed-effects LLA model. Then, we use the inferred mixed-effects model to identify which clones are mainly expanding and in which cell compartments. Parameter inference in the mixed-effects formulation is performed by means of an expectation-maximization algorithm, for which we developed an efficient implementation in the graphic file with name 12859_2023_5269_Figb_HTML.jpg package RestoreNet. Our random-effects LLA formulation describes a stochastic process of clonal dominance on a network of cell lineages. We tested and validated our method in simulation studies, including a direct comparison with the state-of-the-art method GLS [27]. Subsequently, our method is applied to investigating the dynamics of clonal expansion in a in-vivo model of rhesus macaque haematopoiesis [23]. Finally, by analysing an in-vivo model of tumor prone mice, our method identifies the expected impact of vector genotoxicity on clonal dynamics [30].

Methods

An outline of our proposed stochastic framework is as follows. RestoreNet takes a clonal tracking dataset as input, along with a set of reactions coding for cellular duplication, death and differentiation. The system of stochastic differential equations describing the clonal dynamics are translated into a generalized linear model formulation, that possibly includes clone-specific random-effects on the dynamics parameters. Subsequently, the parameters are inferred and, if an event of clonal dominance is detected, a pie-chart shows the clones that are expanding and in which cell lineage. A graphical representation of the framework is provided in Fig. 1. This section contains a concise description of the stochastic formulation of clonal dominance and the corresponding inference method. A more detailed description of the stochastic model can be found in the Additional file 1.

Fig. 1.

Fig. 1

Schematic representation of the analysis: A three-dimensional clonal tracking dataset (left) is received as input from our proposed stochastic framework RestoreNet (middle). It mainly consists in two parts, such as a maximum likelihood step to infer the base LLA model, and an expectation-maximization step to infer the random-effects LLA formulation. Finally, a clonal piechart is returned, where each clone is identified by a pie whose slices are lineage specific and proportional to their expansion rates (right)

A stochastic model for cell differentiation

Consistently with the definition of a stochastic quasi-reaction network of Section 1.1 of the Additional file 1, we consider a Markov process

xt=(x1t,,xnt), 1

for a single clone and n cell types (i=1,,n) that evolve, in a time interval (t,t+Δt), according to a set of K distinct biochemical reactions whose net-effect vectors {vk}k=1K and hazard functions {hk(xt,θ)}k=1K are defined as

vk=(1i(k))(-1i(k))(-1i(k)2j(k))hk(xt,θ)=xi(k)tαi(k)duplicationxi(k)t2δi(k)deathxi(k)tλi(k)j(k)differentiation 2

where i(k) and j(k) are the cell types possibly involved in the k-th reaction, and 

j(k)O(i(k))={j|λi(k)j>0}, 3

where O(i) is called the offspring set of cell type i. The definitions of the hazard functions and the net-effects follow from the law of mass action, consistently with Eq. (7) of the Additional file 1. The hazard functions include a linear growth term xi(k)tαi(k) for cell lineage i(k) with a duplication rate parameter αi(k)>0, a quadratic term xi(k)t2δi(k) for cell death of lineage i(k) with a death rate parameter δi(k)>0, and a linear term xi(k)tλi(k)j(k) describing cell differentiation from cell lineage i(k) to cell lineage j(k)O(i(k)) with a differentiation rate λi(k)j(k)>0. The vector parameter

θ=α1,,αn,δ1,,δn,λ1O(1),,λnO(n), 4

appearing in the hazard functions, includes all the dynamic parameters, where λiO(i) is the vector of all the differentiation rates from cell lineage i to its offspring set O(i). Finally, we define the net-effect matrix and the hazard vector as

V=v1vKZn×K,h(xt,θ)=h1(xt,θ),,hK(xt,θ). 5

LLA formulation of clonal dominance

Let yt=(y1t,,ynt) be the vector of the measurements collected at time t for a n-dimensional counting process xt=(x1t,,xnt) obeying to a network of stochastic biochemical reactions defined by a net-effect matrix VZn×K, a vector parameter θRK and an hazard vector h(x,θ)=(h1(x,θ),,hK(x,θ)) and let

Δyt0ΔytT-1Δy=Mt0MtT-1Mθ+ε,εNnT0,Wt0(θ)WtT-1(θ)W(θ)+σ2InTΣ(θ,σ2), 6

be the local linear approximation of the Kramers-Moyal approximated Master equation (see Section 1.3 of the Additional file 1 for details) where

Δyt=Vi=1nyitr1ii=1nyitrKiΔtMtθ1θKθ+Vh1(yt,θ)h1(yt,θ)VΔtWt(θ)+σ2In1/2Δεt,ΔεtNn(0,In), 7

with σ2 being the measurement noise variance, Mtθ the mean drift, Wt(θ) the diffusion matrix, and Δyt=yt+Δt-yt is a finite-time increment of y in the time interval Δt. From Eq. (6) it can be seen that all clones share the same vector parameter θ. To infer the parameters of Eqs. (6)–(7) we developed a maximum likelihood algorithm which is fully described in Section 1.4 of the Additional file 1.

In some cases it may happen that the clones being analysed are drawn from a hierarchy of J different populations that possibly behave differently in terms of dynamics. In this case it might be of interest to quantify the population-average θ and the clonal-specific effects u around the average θ for the description of clone-specific dynamics. For achieving this goal, we extend the LLA formulation of Eq. (6) with a mixed-effects model [31] by introducing random-effects u for the J distinct clones on the vector parameter θ, leading to a random-effects stochastic reaction network (RestoreNet). The extended random-effects formulation becomes

Δy=M100MJMRnT×Jpu+ε,uNJp1Jθθu,IJτ1200τp2Δu,εNnT(0,Σ(θ,σ2)), 8

where M is the block-diagonal design matrix for the random-effects u centered in θ, each block Mj is clone-specific, and is the Kronecker product. As in the case of the null model of Eq. (6), we estimate σ2 based on data. From Section 1.5 of the Additional file 1 the conditional distribution of the random-effects u given the data Δy is

u|ΔyNJp(Eu|Δy;ψ[u],Vu|Δy;ψ(u)), 9

where

Eu|Δy;ψ[u]=Vu|Δy;ψ(u)MΣ-1(θ,σ2)Δy+Δu-1θu,Vu|Δy;ψ(u)=MΣ-1(θ,σ2)M+Δu-1-1, 10

and ψ=(θ,σ2,τ12,,τp2) is the vector of all the unknown parameters. Once the parameters are estimated (see next section for inference details), the conditional expectations Eu|Δy;ψ[u] can then be used as a proxy for the clone-specific dynamic parameters. This method allows to infer clone-specific dynamics by extremely reducing the problem dimensionality from J·p to 2·p+1 (J2).

Inference procedure

In order to infer the maximum likelihood estimator ψ^ for ψ=(θ,σ2,τ12,,τp2), we have developed an efficient expectation-maximization (E-M) algorithm where the collected cell increments Δy and the random-effects u take the roles of the observed and latent states respectively. The full analytical expression of the E-step function Q(ψ|ψ)=Eu|Δy;ψ[(Δy,u;ψ)] and its partial derivatives ψjQ(ψ|ψ) are available (see Section 1.5 of the Additional file 1). In the E-M algorithm we iteratively update the E-function Q(ψ|ψ) using the current estimate ψ of ψ and then we minimize the -Q(ψ|ψ) w.r.t. ψ. As the E-step function Q(ψ|ψ) is non-linear and the parameters are box-constrained, we used the L-BFGS-B algorithm from the optim() base R function for optimization, to which we provided the objective function, along with its gradient ψQ(ψ|ψ), as input. The E-M algorithm is iterated until a convergence criterion is met, that is when the relative errors of the E-step function Q(ψ|ψ) and the parameters ψ are lower than a predefined tolerance.

Once we get the E-M estimate ψ^ for the parameters we evaluate the goodness-of-fit of the mixed-effects model according to the conditional Akaike information criterion [32]. As every E-M algorithm, the choice of the starting point ψs is very important from a computational point of view. We chose ψs=(θs,σs2,τ12=0,,τp2=0) as a starting point where (θs,σs2) is the optimum found in the fixed-effects LLA formulation of Eq. (6). This is a reasonable choice since we want to quantify how the dynamics Eu|Δy;ψ^[u]j of each clone j departs from the average dynamics θs. With the help of simulation studies (see Results section), we empirically proved that this choice always led to a conditional expectation Eu|Δy;ψ^[u] consistent with the true clone-specific dynamic parameters θ. Computational details can be found in Section 1.5 of the Additional file 1. The pseudocode of the E-M algorithm is provided in Algorithm 3 of the Additional file 1. The maximum likelihood inference for the basal model and the expectation-maximization algorithm for the random-effects model are implemented in the graphic file with name 12859_2023_5269_Figc_HTML.jpg package RestoreNet, available for download at https://cran.r-project.org/package=RestoreNet.

Model selection

The fixed-effects model M0 is scored according to the corrected Akaike information criterion (AIC) [33] defined as

AIC(M0)=-2M0(θ,σ2|Δy)+2dpM0d-pM0-1, 11

where M0 is the log-likelihood of the null model M0, d=nT is the size of Δy, and pM0 the corresponding number of parameters. The random-effects model M1 is ranked with the conditional Akaike information criterion (cAIC) [33] defined as

cAIC(M1)=-2(Δy|u;ψ)+2(ρ+1), 12

where (Δy|u;ψ) is the conditional log-likelihood of the response measurements Δy given the random-effects u, ψ is the vector of all the unknown parameters, and ρ is the effective degrees of freedom of M1 [34] defined as the trace ρ=tr(H) of the hat matrix

H=MMMΣ-1(θ,σ2)MMΣ-1(θ,σ2)MMΣ-1(θ,σ2)MMΣ-1(θ,σ2)M+Δu-1MΣ-1(θ,σ2)MΣ-1(θ,σ2). 13

To measure the distance of the fixed-effects model M0 from the mixed-effects model M1 we use the the Kullback-Leibler (KL) divergence [35]

KLdiv(M0M1)=p(Δy)logp(Δy)q(Δy)d(Δy)=12tr(Σ1-1Σ0)-d+(μ1-μ0)Σ1-1(μ1-μ0)+log|Σ1||Σ0|, 14

where p and q are the multivariate Gaussian density functions of Eqs. (6) and (8), whose mean vector and covariance matrix are given by

μ0=Mθ^0,Σ0=Σ(θ^0,σ^02),μ1=Mθ^1+MEu|Δy;ψ^[u],Σ1=Σ(θ^1,σ^12), 15

where (θ^0,σ^02) and (θ^1,σ^12) are the parameter estimates for M0 and M1. To make model divergences comparable across different sized samples, we use the rescaled KL divergence KLdiv(M0M1)/d.

Results

In silico validation study

We simulated the dynamics of J=3 distinct clones in four synthetic cell types A, B, C, D following the differentiation network structure of Fig. 2. The net-effect matrix V and the hazard vector h(x,θ) were derived from Eq. (2). To simulate the clonal tracking data we used the τ-leaping Algorithm 1 of the Additional file 1, with a time lag τ=1, that has been run independently for each clone. We designed each simulation so that the first clone dominates lineage D and the third clone dominates lineage C with a sampling frequency T=100. The values that were used for the reaction parameters are reported in Table 1.

Fig. 2.

Fig. 2

Differentiation structure of four synthetic cell types A, B, C, D. Cell duplication, cell death and cell differentiation are indicated with green, red and grey arrows

Table 1.

For each synthetic clone (row) the parameter values (columns) used in the synthetic studies

αA αB αC αD δA δB δC δD λAB λAC λCD
c1 0.2 0.15 0.17 0.45 0.001 0.007 0.004 0.002 0.13 0.15 0.08
c2 0.2 0.15 0.17 0.09 0.001 0.007 0.004 0.002 0.13 0.15 0.08
c3 0.2 0.15 0.51 0.09 0.001 0.007 0.004 0.002 0.13 0.15 0.08

We first ran a single simulation under different magnitudes for the noise variance σ2. Then we fit the random-effects model of Eq. (8) to the simulated data using Algorithm 3 from Additional file 1. We reported in Fig. 3 the simulated trajectories and a scatterplot of the estimated conditional expectation Eu|Δy;ψ^[u] for the random-effects model against the true clone-specific parameters. In the same figure we also reported a piechart where each clone k is identified with a pie whose slices are lineage-specific and weighted with wkl, defined as the difference between the conditional expectations of the duplication and death parameters, that is

wkl=Eu|Δy;ψ^[uαlk]-Eu|Δy;ψ^[uδlk], 16

where uαlk and uδlk are the random-effects for duplication and death of clone k in cell lineage l. The diameter of the k-th pie is proportional to the Euclidean 2-norm of

wk=(wkl1,,wkln), 17

where n is the number of cell types. Therefore, the larger the diameter, the more the corresponding clone expanded into the lineage associated to the largest slice. The values of the estimated conditional expectations are reported in Table 2. The scatterplot of Fig. 3 clearly indicates a strong agreement between the true parameters and the conditional expectations Eu|Δy;ψ^[u]. In particular, as expected, as the noise variance σ2 increased, the parameter estimates gradually moved away from the diagonal, so that the precision decreased. Also, our model correctly detected the dominance of clones 1 and 3 in lineages D and C respectively, even for large values of σ2, as suggested by the pie-charts of Fig. 3 and by the values of Table 2.

Fig. 3.

Fig. 3

a Simulated trajectories. b Scatterplot between the clone-specific true parameters θtrue and the conditional expectation Eu|Δy;ψ^[u]. c Clonal pie-charts where each clone k is identified with a pie whose slices are lineage-specific and weighted according to Eq. (16). The diameter of the k-th pie is proportional to the Euclidean 2-norm of wk, as defined in Eq. (17). Each row refers to specific values of synthetic noise variance σ2

Table 2.

Conditional expectations Eu|Δy;ψ^[u] of the random-effects obtained from the estimated parameters ψ^ for each reaction rate (rows) under different magnitudes of the noice variance σ2 (outer columns) for each clone (inner columns)

σ2=0.1 σ2=1 σ2=10
c1 c2 c3 c1 c2 c3 c1 c2 c3
αA 0.198 0.198 0.199 0.183 0.191 0.198 0.151 0.139 0.127
αB 0.151 0.152 0.148 0.146 0.148 0.145 0.163 0.148 0.137
αC 0.171 0.168 0.509 0.163 0.168 0.518 0.166 0.175 0.649
αD 0.446 0.094 0.098 0.450 0.100 0.121 0.479 0.199 0.319
δA 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.000 0.001
δB 0.007 0.007 0.007 0.007 0.007 0.007 0.008 0.007 0.007
δC 0.004 0.004 0.004 0.004 0.004 0.004 0.004 0.005 0.005
δD 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.003 0.004
δAB 0.129 0.130 0.130 0.129 0.130 0.133 0.127 0.126 0.110
δAC 0.149 0.150 0.148 0.148 0.149 0.151 0.154 0.155 0.153
δCD 0.081 0.079 0.079 0.079 0.080 0.078 0.082 0.079 0.058

Subsequently, to check goodness-of-fit, we ran 100 independent simulations separately for each noise variance setting. After fitting both the base model of Eq. (6) and the random-effects model of Eq. (8), using Algorithms 2 and 3 of the Additional file 1, the latter always reached a significantly lower AIC compared to the null model, as suggested by the boxplots of Fig. 4. This result clearly indicates that our proposed random-effects stochastic reaction network was able to measure variation between clones in terms of differentiation dynamics and to detect events of clonal dominance.

Fig. 4.

Fig. 4

Boxplot of the AICs of the fixed-effects (base) and random-effects (re) models under a measurement noise level equal to 0.1 (a), 1 (b) and 10 (c)

Comparison with GLS method

We compared our proposed method with the state-of-the-art method GLS [27]. To this end, we have designed two different simulation studies. In the first simulation study all the clones shared the same vector parameter, while in the second study we induced the same clonal expansions of previous section. In both studies we used the differentiation network structure of Fig. 2 as the true generative model from which we simulated clonal trajectories, using the τ-leaping Algorithm 1 of the Additional file 1, with a time lag τ=1. The net-effect matrix V and the hazard vector h(x,θ) were derived from Eq. (2). For each simulation, we ran 100 independent simulations under different noise variance settings (σ2{0.1,1,10}). Subsequently we fit both our proposed method RestoreNet and the competitor method GLS. We reported the results in Fig. 5, showing boxplots of the relative errors between the true parameters and the estimated parameters provided by each method.

Fig. 5.

Fig. 5

Boxplots of the relative errors between the true parameters and the estimated parameters provided by each candidate method (x-axis) for simulation study 1 (a) and 2 (b) under each noise variance setting (columns)

Figure 5 clearly indicates that our proposed inference method RestoreNet overall outperformed the competitor method GLS. Indeed, while in the first simulation study (no clonal dominance) both methods provided similar parameter estimates, in the second simulation study (with clonal dominance) our proposed method RestoreNet provided better parameter estimates compared to GLS. This result suggests that our proposed method RestoreNet was able to infer a cell differentiation network with clone-specific parameters. In conclusion, results from this synthetic study show that our method outperformed the competitor one for the identification of clonal dominance.

Clonal dynamics in rhesus macaques

We analysed the cellular barcode data collected from an established hematopoietic stem cell model, previously used to investigate hematopoietic reconstitution in rhesus macaques [23]. Mobilized peripheral blood (MPB) CD34+ cells from three macaques were transduced with barcoded vectors and, following engraftment, myeloid Granulocytes (G), Monocytes (M), and lymphoid T, B, and Natural Killer (NK) cells were flow sorted for 9.5 months (ZH33), 6.5 months (ZH17), and 4.5 months (ZG66) [36]. The total numbers of clones collected are 1165 (ZH33), 1280 (ZH17), and 1291(ZG66). Further details on transduction protocols and culture conditions can be found in the original study.

Although the sample DNA amount was maintained constant during the whole experiment (200 ng for ZH33 and ZG66 or 500 ng for ZH17), the sample collected resulted in different magnitudes of total number of reads (see Table 2 from Additional file 1). This discrepancy made all the samples not directly comparable. Therefore we rescaled the barcode counts according to Eq. (34) of the Additional file 1 before analysis. We compared the base and the random-effects models on the rhesus macaques clonal tracking data. Since the CD34+ cells were not collected, we only estimated the duplication parameters αT, αB, αNK, αM, αG and the death parameters δT, δB, δNK, δM, δG of the lymphoid (T, B, NK) and myeloid (M, G) cells. Therefore the differentiation parameters were not considered in our model, and the net-effect matrix and the hazard vector were obtained from Eqs. (2)–(5) accordingly. Thus, the biochemical reactions were defined as

xTαT2·xT,xTδT,xBαB2·xB,xBδB,xNKαNK2·xNK,xNKδNK,xMαM2·xM,xMδM,xGαG2·xG,xGδG, 18

where the left and right columns list the duplication and death reactions, respectively. The corresponding model became effectively a birth/death model including 10 dynamic parameters, one duplication and death rate for each cell lineage. We fit both the fixed-effects model of Eq. (6) and the mixed-effects model of Eq. (8) separately to the data of each animal. To further remove bias, we focused our analyses on the clones that were recaptured at least 5 times. This resulted in a number of clones J equal to 481 (ZH33), 139 (ZH17), and 202 (ZG66), and in 6 (ZH33), 5 (ZH17), and 4 (ZG66) time points. We reported the results on model selection in Table 3, and the estimated parameters ψ^ in Table 4.

Table 3.

Comparison between fixed-effects M0 and mixed-effects M1 models: Number of parameters (p), AIC, KL divergence KLdiv(M0M1) and rescaled KL divergence KLdiv(M0M1)/d in each rhesus macaque

p AIC KLdiv(M0M1) KLdiv(M0M1)/d
ZH33 M0 11.00 81377.27
M1 434.16 38160.15 21062.95 1.87
ZH17 M0 11.00 336752.11
M1 478.43 29478.05 291854802.44 114228.89
ZG66 M0 11.00 31194.60
M1 410.92 21384.85 232030.37 83.77

Table 4.

Parameter estimated for the proposed mixed-effects model: Fixed-effects (θ) and variance (τ2) of the random-effects for both the duplication α and death δ parameters for each cell lineage and each rhesus macaque

ZH33 ZH17 ZG66
θ τ2 θ τ2 θ τ2
αT 0.813 1.176 2.246 1.051 1.081 2.702
αB 0.193 0.597 6.503 4.648 0.055 0.876
αNK 0.758 2.253 2.435 2.364 1.095 1.943
αG 0.197 0.403 10.931 53.216 0.847 1.318
αM 0.360 0.547 3.298 4.256 2.198 1.800
δT 0.155 0.074 0.172 0.741 0.039 0.059
δB 0.102 0.059 2.159 36.268 0.006 0.051
δNK 0.228 0.089 0.223 0.406 0.098 0.100
δG 0.039 0.029 13.211 70.756 0.018 0.017
δM 0.100 0.059 0.012 0.018 0.035 0.019

Using the estimated parameters ψ^, following Eq. (10), we computed the net conditional expectations of Eq. (16), which we used as a proxy for the clone-specific net-duplication αl-δl in each cell lineage l. The resulting values are reported in Fig. 6 in a box-plot fashion. Subsequenty, in Fig. 7 we proposed to use a weighted pie chart to visualize our findings at clonal level. Consistently with previous section, each pie, corresponding to a particular clone, was weighted by its net conditional expectations, as defined in Eq. (16).

Fig. 6.

Fig. 6

For each animal analyzed (ac), the boxplots of the conditional expectations Eu|Δy;ψ^[uαlk]-Eu|Δy;ψ^[uδlk] computed from the estimated parameters ψ^ for the clone-specific net-duplication αl-δl in each cell lineage l (different colors). The whiskers extend to the data extremes

Fig. 7.

Fig. 7

Estimated clonal pie-charts for the rhesus macaques ZH33 (a), ZH17 (b) and ZG66 (c): Each k-th clone is identified with a pie whose slices are lineage-specific and weighted according to Eq. (16). The diameter of the k-th pie is proportional to the Euclidean 2-norm of wk, as defined in Eq. (17). The legend scales are different across the three plot panels

As a result, according to the AIC values, in each animal the mixed-effects model (M1) outperformed the fixed-effects one (M0). This means that the clones did not follow the same average dynamics for the birth/death process. Instead, the dynamic of some clones departed from the average dynamics with a significant (random) effect. In particular, the conditional net-duplication rates Eu|Δy;ψ^[uαlk]-Eu|Δy;ψ^[uδlk] of Figs. 6 - 7 suggest events of clonal dominance in specific cell lineages. As an example, for the animals ZH33 and ZG66 we observed clonal expansions into NK cells. Whereas, for the animal ZH17 we observed clonal expansions into G and B cell lineages. Finally, for the animal ZG66 we also observed events of clonal dominance into M and T cell lineages. Furthermore, the weighted pie charts from Fig. 7 revealed different gradients of clonal dominance between the three rhesus macaques. As an example, by looking at the size of the pies, it is possible to observe an higher clonal dominance of NK cells in ZH33, and of G cells in ZH17, compared to the expansions of M, NK and T cells detected in ZG66, where the diameters of the clone-specific pies are rather similar. Not only the proposed mixed-effects model detected clonal dominance in certain cell types, it also detected which clones were responsible.

Genotoxic effects on clonal dynamics

We analyzed an in-vivo clonal tracking dataset previously used in [30] to investigate clonal diversity in tumor-prone mice under two different treatment conditions. Cdkn2a-/- tumor prone Lin- cells were ex-vivo transduced with a lentiviral vector expressing GFP under either spleen focus-forming virus (SFV) or PGK promoter/enhancer sequence. Cells are then transplanted into lethally irradiated wild-type mice. To recover enough DNA material, equal amounts of blood from two or three mice belonging to the same experimental group were pooled before cell sorting. Integration sites were then retrieved by polymerase chain reaction (PCR) at different time points from sorted T (CD3+) and B (CD19+) lymphocytes, from myeloid cells (CD11b+) and unsorted blood cells (total MNC). Clonal tracking samples were collected under heterogeneous technical conditions as reported in Table 2 of the Additional file 1. These confounding effects made the samples not directly comparable. Therefore we rescaled the samples following the description in Section 2.2 of the Additional file 1 before analysis. The total number of distinct clones collected were 45186 and 20471 for the PGK and SFV treatments respectively. To further remove bias, we focused our analyses on the top J=1000 most recaptured clones across lineages and time. The number of time-points T was equal to 7 (PGK) and 6 (SFV).

Next, we compared the fixed-effects model of Eq. (6) and the random-effects model of Eq. (8) on the rescaled clonal tracking data, so as to compare the dynamics of clonal dominance under the two viral vector conditions. Since the HSCs were not collected, we only estimated the duplication parameters αT, αB, αM and the death parameters δT, δB, δM of the lymphoid (T, B) and myeloid (M) cells. Therefore, in analogy to the previous section the differentiation parameters were not considered in our model, and the net-effect matrix and the hazard vector were obtained from Eqs. (2)–(5) accordingly. Therefore, the biochemical reactions were defined as

xTαT2·xT,xTδT,xBαB2·xB,xBδB,xMαM2·xM,xMδM, 19

where the left and right columns list the duplication and death reactions, respectively. We fit both the fixed-effects model of Eq. (6) and the mixed-effects model of Eq. (8) separately to the data of each vector treatment. Both models included six dynamic parameters, that is one scalar value for each combination of cell type with duplication and death reactions. We reported the results on model selection in Table 5, and the estimated parameters ψ^ in Table 6.

Table 5.

Comparison between fixed-effects M0 and mixed-effects M1 models: Number of parameters (p), AIC, KL divergence KLdiv(M0M1) and rescaled KL divergence KLdiv(M0M1)/d in each treatment group

p AIC KLdiv(M0M1) KLdiv(M0M1)/d
PGK M0 7.00 115997.43
M1 471.40 65083.07 17098.71 1.29
SFV M0 7.00 63520.89
M1 842.00 30147.56 52431.53 6.51

Table 6.

Parameter estimated for the proposed mixed-effects model: Fixed-effects (θ) and variance (τ2) of the random-effects for both the duplication α and death δ parameters for each cell lineage and each treatment group

PGK SFV
θ τ2 θ τ2
αM 0.058 1.014 1.287 5.781
αB 0.092 0.872 0.024 0.408
αT 0.632 2.625 3.367 2.824
δM 0.095 0.041 0.232 0.085
δB 0.079 0.028 0.156 0.080
δT 0.127 0.044 0.437 0.193

Then, from the estimated parameters ψ^ we computed the conditional expectations of Eq. (16), which we used as a proxy for the clone-specific net-duplication αl-δl in each cell lineage l. In analogy to the previous section, the resulting values are reported in Fig. 8 in a box-plot fashion, while in Fig. 9 we proposed to use a weighted pie chart to visualize our findings at clonal level.

Fig. 8.

Fig. 8

For each treatment group (a, b), the boxplots of the conditional expectations of Eq. (16) computed from the estimated parameters ψ^ for the clone-specific net-duplication αl-δl in each cell lineage l (different colors). The whiskers extend to the data extremes

Fig. 9.

Fig. 9

Estimated clonal pie-charts under the vector treatments PGK (a) and SFV (b): Each k-th clone is identified with a pie whose slices are lineage-specific and weighted according to Eq. (16). The diameter of the k-th pie is proportional to the Euclidean 2-norm of wk, as defined in Eq. (17). The legend scales are different across the two plot panels

As a result, according to the AIC values, under each treatment the mixed-effects model (M1) outperformed the fixed-effects one (M0). This means that the clones exhibited heterogeneity in their dynamics for the birth/death process. The dynamics of some clones departed from the average dynamics with a significant (random) effect. In particular, the conditional net-duplication rates of Eq. (16) from Figs. 8 - 9 suggest events of clonal dominance in specific cell lineages. For example, under the PGK treatment we observed clonal expansions into T cells. Whereas, under the SFV treatment we observed clonal expansions into M and T cell lineages with even higher conditional rates compared to PGK. Furthermore, the Kullback–Leibler divergence from Table 5 revealed a different gradient of clonal dominance between the two treatments, suggesting that the clonal expansions identified in the SFV case were more significant compared to PGK.

Discussion and conclusion

In this work we proposed a random-effects cell differentiation network which takes into account heterogeneity in the dynamics across the clones. Our framework extends the clone neutral local linear approximation of a stochastic quasi-reaction network, written in the Ito formulation, by introducing random-effects for the clones on the dynamics parameters to allow for clonal dominance. We used a maximum likelihood approach to infer the parameters of the base (fixed-effects only) model that are than used as initial values for the estimation of the random-effects model by means of an E-M algorithm. We tested our framework with a τ-leaping simulation study, showing accurate performance of the method in the identification of a clonal expansion and in the inference of the true parameters. Then, by means of an additional in-silico study, we have shown that our method outperforms the state-of-the-art method GLS [27]. Subsequently, the application of our proposed method on a rhesus macaque clonal tracking study revealed significant clonal dominance for specific cell types. Particularly interesting is that the NK clonal expansions detected by our model were already observed by former studies [23, 37, 38], and therefore our findings are consistent with those previously obtained. Indeed [37] described the oligoclonal expansions of NK cells and the long-term persistence of HSPCs and immature NK cells. Finally, our proposed method allowed to detect the expected impact of vector genotoxicity on clonal dynamics in a tumor-prone mice model of haematopoiesis, as already observed in a previous study [30].

The main approximation, in both the basal and random-effects formulations, is the piece-wise linearity of the process. In both cases we consider first a local linear approximation of the Kramers-Moyal approximated Master equation, which is then used to infer the process parameters either with or without random-effects. Although the linearity assumption makes all the computations easier, this approximation becomes poor as the time lag increments (the Δts) of the collected data increase. This can be addressed by introducing in the likelihood higher-order approximation terms than the ones considered by the Euler-Maruyama method. The Milstein approximation is a possible choice [39]. Another, completely different, approach is to employ extended Kalman filtering (EKF) which is suitable for non-linear state space formulations [40]. Also, our framework cannot consider false-negative errors or missing values of clonal tracking data. Also for this issue, an EKF formulation could be a possible extension. The frequentist-based inference step of our proposed E-M algorithm may be replaced by Bayesian alternatives. For example, the E-step function Q(ψ|ψ) could be replaced by a Metropolis-Hastings step [41, 42]. Alternatively, a variational Bayes method could be employed, where the unknown vector parameter ψ is treated as an additional latent variable [43]. Our future work will aim to extend the graphic file with name 12859_2023_5269_Figd_HTML.jpg package RestoreNet by including other types of reactions (besides cell duplication, cell death and cell differentiation).

Our tool can be considered as complementary to the classical Shannon entropy index [30] in detecting fast and uncontrolled growing of clones after a gene therapy treatment. Indeed, while the Shannon entropy measures the diversity of a population of clones as a whole, RestoreNet provides a clone-specific quantification of dominance in terms of conditional mean and variance of the expansion rates. Our proposed method provides a prototype model of clonal haematopoiesis whose parameters are calibrated to fit high-dimensional clonal tracking data. Our data-driven model can be integrated with those obtained with alternative approaches, where the unknown parameters are either set to experimentally-derived quantities, computed from the steady states, or based on independent studies [44, 45].

In conclusion, our proposed stochastic framework is able to detect deviant clonal behaviour relative to the average dynamics of haematopoiesis. This is an important aspect for gene therapy applications where is crucial to quickly detect any adverse event that may be related to clonal dominance. Therefore our tool can provide statistical support in gene therapy surveillance analyses. Our proposed method also has potential applications in other biomedical longitudinal studies with subject-specific dynamics, such as population infection dynamics [46, 47], population analysis of tumor development [48], and genetic regulatory networks [49]. Moreover, our proposed mixed-effects formulation of stochastic quasi-reaction networks can potentially be applied to more general, non-Markovian, classes of network models, such as stochastic hybrid systems with memory (SHSM). This more general class of models suits history-dependent biological systems, such as neural dynamics and immune responses [50, 51]. A mixed-effects formulation of dynamical systems may find room also in optimal investments problems, such as stochastic games in a continuous-time Markov regime-switching environment [52]. Indeed, if such models can be written in a Ito-type formulation, mixed-effects on sensible subjects (e.g. groups of investors in a market) can be incorporated.

Supplementary Information

12859_2023_5269_MOESM1_ESM.pdf (395.2KB, pdf)

Additional file 1. Detailed description of the methods

Acknowledgements

Not applicable

Author contributions

All authors analysed the data and wrote the paper. L.D.C. designed and implemented the stochastic framework. E.C.W. and M.A.G. equally supervised this work. All authors read and approved the final manuscript.

Funding

This publication is based on work from COST Action CA15109 (COSTNET), supported by COST (European Cooperation in Science and Technology). E.C.W. acknowledges support from the Fondazione Leonardo (514.7.010.098-4) and funding from the Swiss National Science Foundation (SNSF 188534). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Availability of data and materials

All data analysed in this work are available from earlier publications [23, 30]. The code that supports the findings of this study is openly available at https://github.com/delcore-luca/ClonalDominance. The stochastic framework is implemented in the graphic file with name 12859_2023_5269_Fige_HTML.jpg package RestoreNet publicly available for download at https://CRAN.R-project.org/package=RestoreNet.

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ernst C. Wit and Marco A. Grzegorczyk equally supervised this work

Change history

7/17/2023

The formatting of the figures and sentences were incorrect in the original publication. The article has been updated to rectify the errors.

Contributor Information

Luca Del Core, Email: luca.delcore@nottingham.ac.uk.

Danilo Pellin, Email: danilo_pellin@dfci.harvard.edu.

Ernst C. Wit, Email: ernst.jan.camiel.wit@usi.ch

Marco A. Grzegorczyk, Email: m.a.grzegorczyk@rug.nl

References

  • 1.Friedmann T, Roblin R. Gene therapy for human genetic disease? Science. 1972;175(4025):949–55. doi: 10.1126/science.175.4025.949. [DOI] [PubMed] [Google Scholar]
  • 2.Bryder D, Rossi DJ, Weissman IL. Hematopoietic stem cells: the paradigmatic tissue-specific stem cell. Am J Pathol. 2006;169(2):338–346. doi: 10.2353/ajpath.2006.060312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Kustikova OS, Wahlers A, Kühlcke K, Stähle B, Zander AR, Baum C, Fehse B. Dose finding with retroviral vectors: correlation of retroviral vector copy numbers in single cells with gene transfer efficiency in a cell population. Blood. 2003;102(12):3934–3937. doi: 10.1182/blood-2003-05-1424. [DOI] [PubMed] [Google Scholar]
  • 4.Fehse B, Kustikova O, Bubenheim M, Baum C. Pois (s) on-it’s a question of dose.... Gene Ther. 2004;11(11):879–881. doi: 10.1038/sj.gt.3302270. [DOI] [PubMed] [Google Scholar]
  • 5.Baum C, Düllmann J, Li Z, Fehse B, Meyer J, Williams DA, Von Kalle C. Side effects of retroviral gene transfer into hematopoietic stem cells. Blood J Am Soc Hematol. 2003;101(6):2099–2113. doi: 10.1182/blood-2002-07-2314. [DOI] [PubMed] [Google Scholar]
  • 6.Modlich U, Kustikova OS, Schmidt M, Rudolph C, Meyer J, Li Z, Kamino K, Von Neuhoff N, Schlegelberger B, Kuehlcke K, et al. Leukemias following retroviral transfer of multidrug resistance 1 (mdr1) are driven by combinatorial insertional mutagenesis. Blood. 2005;105(11):4235–4246. doi: 10.1182/blood-2004-11-4535. [DOI] [PubMed] [Google Scholar]
  • 7.Baum C, Kustikova O, Modlich U, Li Z, Fehse B. Mutagenesis and oncogenesis by chromosomal insertion of gene transfer vectors. Hum Gene Ther. 2006;17(3):253–263. doi: 10.1089/hum.2006.17.253. [DOI] [PubMed] [Google Scholar]
  • 8.Catlin SN, Guttorp P, Abkowitz JL. The kinetics of clonal dominance in myeloproliferative disorders. Blood. 2005;106(8):2688–2692. doi: 10.1182/blood-2005-03-1240. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Roeder I, Horn M, Glauche I, Hochhaus A, Mueller MC, Loeffler M. Dynamic modeling of imatinib-treated chronic myeloid leukemia: functional insights and clinical implications. Nat Med. 2006;12(10):1181–1184. doi: 10.1038/nm1487. [DOI] [PubMed] [Google Scholar]
  • 10.Müller-Sieburg CE, Cho RH, Thoman M, Adkins B, Sieburg HB. Deterministic regulation of hematopoietic stem cell self-renewal and differentiation. Blood J Am Soc Hematol. 2002;100(4):1302–1309. [PubMed] [Google Scholar]
  • 11.Roeder I, Kamminga LM, Braesel K, Dontje B, de Haan G, Loeffler M. Competitive clonal hematopoiesis in mouse chimeras explained by a stochastic model of stem cell organization. Blood. 2005;105(2):609–616. doi: 10.1182/blood-2004-01-0282. [DOI] [PubMed] [Google Scholar]
  • 12.Sieburg HB, Cho RH, Dykstra B, Uchida N, Eaves CJ, Muller-Sieburg CE. The hematopoietic stem compartment consists of a limited number of discrete stem cell subsets. Blood. 2006;107(6):2311–2316. doi: 10.1182/blood-2005-07-2970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Loeffler M, Birke A, Winton D, Potten C. Somatic mutation, monoclonality and stochastic models of stem cell organization in the intestinal crypt. J Theor Biol. 1993;160(4):471–491. doi: 10.1006/jtbi.1993.1031. [DOI] [PubMed] [Google Scholar]
  • 14.Loeffler M, Bratke T, Paulus U, Li Y, Potten C. Clonality and life cycles of intestinal crypts explained by a state dependent stochastic model of epithelial stem cell organization. J Theor Biol. 1997;186(1):41–54. doi: 10.1006/jtbi.1996.0340. [DOI] [PubMed] [Google Scholar]
  • 15.Loeffler M, Roeder I. Tissue stem cells: definition, plasticity, heterogeneity, self-organization and models-a conceptual approach. Cells Tissues Organs. 2002;171(1):8–26. doi: 10.1159/000057688. [DOI] [PubMed] [Google Scholar]
  • 16.Meineke FA, Potten CS, Loeffler M. Cell migration and organization in the intestinal crypt using a lattice-free model. Cell Prolif. 2001;34(4):253–266. doi: 10.1046/j.0960-7722.2001.00216.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Roeder I, Braesel K, Lorenz R, Loeffler M. Stem cell fate analysis revisited: interpretation of individual clone dynamics in the light of a new paradigm of stem cell organization. J Biomed Biotechnol. 2007;2007. [DOI] [PMC free article] [PubMed]
  • 18.Winton D, Blount M, Ponder B. A clonal marker induced by mutation in mouse intestinal epithelium. Nature. 1988;333(6172):463–466. doi: 10.1038/333463a0. [DOI] [PubMed] [Google Scholar]
  • 19.Park H-S, Goodlad RA, Wright NA. Crypt fission in the small intestine and colon. A mechanism for the emergence of g6pd locus-mutated crypts after treatment with mutagens. Am J Pathol. 1995;147(5):1416. [PMC free article] [PubMed] [Google Scholar]
  • 20.Bjerknes M, Cheng H. Modulation of specific intestinal epithelial progenitors by enteric neurons. Proc Natl Acad Sci. 2001;98(22):12497–12502. doi: 10.1073/pnas.211278098. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Potten CS, Booth C, Pritchard DM. The intestinal epithelial stem cell: the mucosal governor. Int J Exp Pathol. 1997;78(4):219–243. doi: 10.1046/j.1365-2613.1997.280362.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Biasco L, Pellin D, Scala S, Dionisio F, Basso-Ricci L, Leonardelli L, Scaramuzza S, Baricordi C, Ferrua F, Cicalese M, Giannelli S, Neduva V, Dow D, Schmidt M, Von Kalle C, Roncarolo M, Ciceri F, Vicard P, Wit E, Di Serio C, Naldini L, Aiuti A. In vivo tracking of human hematopoiesis reveals patterns of clonal dynamics during early and steady-state reconstitution phases. Cell Stem Cell. 2016;19(1):107–119. doi: 10.1016/j.stem.2016.04.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Wu C, Li B, Lu R, Koelle S, Yang Y, Jares A, Krouse A, Metzger M, Liang F, Loré K, Wu C, Donahue R, Chen IY, Weissman I, Dunbar C. Clonal tracking of rhesus macaque hematopoiesis highlights a distinct lineage origin for natural killer cells. Cell Stem Cell. 2014;14(4):486–499. doi: 10.1016/j.stem.2014.01.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Mazurier F, Gan OI, McKenzie JL, Doedens M, Dick JE. Lentivector-mediated clonal tracking reveals intrinsic heterogeneity in the human hematopoietic stem cell compartment and culture-induced stem cell impairment. Blood. 2004;103(2):545–52. doi: 10.1182/blood-2003-05-1558. [DOI] [PubMed] [Google Scholar]
  • 25.Biasco L, Rothe M, Schott JW, Schambach A. Integrating vectors for gene therapy and clonal tracking of engineered hematopoiesis. Hematol/Oncol Clin. 2017;31(5):737–752. doi: 10.1016/j.hoc.2017.06.009. [DOI] [PubMed] [Google Scholar]
  • 26.Pellin D. Stochastic modelling of dynamical systems in biology [phd thesis]. PhD thesis, University of Groningen; 2017.
  • 27.Pellin D, Biasco L, Aiuti A, Di Serio MC, Wit EC. Penalized inference of the hematopoietic cell differentiation network via high-dimensional clonal tracking. Appl Netw Sci. 2019;4(1):1–26. doi: 10.1007/s41109-019-0225-1. [DOI] [Google Scholar]
  • 28.Bailey NTJ. The elements of stochastic processes with applications to the natural sciences. Wiley Classics Library, Wiley; 1990. https://books.google.it/books?id=yHPnwl4QOfIC.
  • 29.Kloeden PE, Platen E. Numerical solution of stochastic differential equations. In: Stochastic modelling and applied probability. Springer; 2011. https://books.google.it/books?id=BCvtssom1CMC.
  • 30.Del Core L, Cesana D, Gallina P, Secanechia YNS, Rudilosso L, Montini E, Wit EC, Calabria A, Grzegorczyk MA. Normalization of clonal diversity in gene therapy studies using shape constrained splines. Sci Rep. 2022;12(1):3836. doi: 10.1038/s41598-022-05837-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Dobson AJ, Barnett AG. An Introduction to Generalized Linear Models. Chapman & Hall/CRC Texts in Statistical Science. CRC Press; 2018. https://books.google.it/books?id=kIhnDwAAQBAJ.
  • 32.Vaida F, Blanchard S. Conditional Akaike Information for mixed-effects models. Biometrika. 2005;92(2):351–370. doi: 10.1093/biomet/92.2.351. [DOI] [Google Scholar]
  • 33.Burnham KP, Anderson DR, Huyvaert KP. AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behav Ecol Sociobiol. 2011;65(1):23–35. doi: 10.1007/s00265-010-1029-6. [DOI] [Google Scholar]
  • 34.Müller S, Scealy JL, Welsh AH. Model selection in linear mixed models. Stat Sci. 2013;28(2):135–167. doi: 10.1214/12-STS410. [DOI] [Google Scholar]
  • 35.Kullback S, Leibler RA. On information and sufficiency. Ann Math Stat. 1951;22(1):79–86. doi: 10.1214/aoms/1177729694. [DOI] [Google Scholar]
  • 36.Lu R, Neff NF, Quake SR, Weissman IL. Tracking single hematopoietic stem cells in vivo using high-throughput sequencing in conjunction with viral genetic barcoding. Nat Biotechnol. 2011;29(10):928–933. doi: 10.1038/nbt.1977. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Wu C, Espinoza DA, Koelle SJ, Yang D, Truitt L, Schlums H, Lafont BA, Davidson-Moncada JK, Lu R, Kaur A, et al. Clonal expansion and compartmentalized maintenance of rhesus macaque NK cell subsets. Sci Immunol. 2018;3(29):9781. doi: 10.1126/sciimmunol.aat9781. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Wu C, Mortlock RD, Shin T, Cordes S, Fan X, Brenchley J, Allan DA, Hong SG, Dunbar CE. Tissue-resident clonal expansions of rhesus macaque NK cells. Blood. 2021;138:998. doi: 10.1182/blood-2021-146220. [DOI] [Google Scholar]
  • 39.Mil’shtejn GN. Approximate integration of stochastic differential equations. Theory Probab Its Appl. 1975;19(3):557–562. doi: 10.1137/1119062. [DOI] [Google Scholar]
  • 40.Jazwinski AH. Stochastic Processes and Filtering Theory. Courier Corporation; 2007.
  • 41.Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21(6):1087–1092. doi: 10.1063/1.1699114. [DOI] [Google Scholar]
  • 42.Hastings WK. Monte Carlo sampling methods using Markov chains and their applications; 1970.
  • 43.MacKay DJ, Mac Kay DJ. Information Theory. Inference and Learning Algorithms. Cambridge University Press; 2003.
  • 44.Ashcroft P, Manz MG, Bonhoeffer S. Clonal dominance and transplantation dynamics in hematopoietic stem cell compartments. PLoS Comput Biol. 2017;13(10):1–20. doi: 10.1371/journal.pcbi.1005803. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Pedersen RK, Andersen M, Stiehl T, Ottesen JT. Mathematical modelling of the hematopoietic stem cell-niche system: clonal dominance based on stem cell fitness. J Theor Biol. 2021;518:110620. doi: 10.1016/j.jtbi.2021.110620. [DOI] [PubMed] [Google Scholar]
  • 46.Liu D, Lu T, Niu X-F, Wu H. Mixed-effects state-space models for analysis of longitudinal dynamic systems. Biometrics. 2011;67(2):476–85. doi: 10.1111/j.1541-0420.2010.01485.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Nowak MA, Bangham CRM. Population dynamics of immune responses to persistent viruses. Science. 1996;272(5258):74–9. doi: 10.1126/science.272.5258.74. [DOI] [PubMed] [Google Scholar]
  • 48.Ribba B, Holford N, Magni P, Trocóniz I, Gueorguieva I, Girard P, Sarr C, Elishmereni M, Kloft C, Friberg L. A review of mixed-effects models of tumor growth and effects of anticancer drug treatment used in population analysis. CPT Pharmacomet Syst Pharmacol. 2014;3(5):113. doi: 10.1038/psp.2014.12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Schlitt T, Brazma A. Current approaches to gene regulatory network modelling. BMC Bioinform. 2007;8:1–22. doi: 10.1186/1471-2105-8-S6-S9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Gokgoz N, Öktem H. Modeling of tumor-immune system interaction with stochastic hybrid systems with memory: a piecewise linear approach. Adv Theory Nonlinear Anal its Appl. 2021;5(1):25–38. [Google Scholar]
  • 51.Weber G-W, Ugur O, Taylan P, Tezel A. On optimization, dynamics and uncertainty: a tutorial for gene-environment networks. Discret Appl Math. 2009;157(10):2494–2513. doi: 10.1016/j.dam.2008.06.030. [DOI] [Google Scholar]
  • 52.Savku E, Weber G-W. Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market. Ann Oper Res. 2022;312(2):1171–1196. doi: 10.1007/s10479-020-03768-5. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

12859_2023_5269_MOESM1_ESM.pdf (395.2KB, pdf)

Additional file 1. Detailed description of the methods

Data Availability Statement

All data analysed in this work are available from earlier publications [23, 30]. The code that supports the findings of this study is openly available at https://github.com/delcore-luca/ClonalDominance. The stochastic framework is implemented in the graphic file with name 12859_2023_5269_Fige_HTML.jpg package RestoreNet publicly available for download at https://CRAN.R-project.org/package=RestoreNet.


Articles from BMC Bioinformatics are provided here courtesy of BMC

RESOURCES