Statistics and Computing. 2017 Oct 24;28(5):1053–1072. doi: 10.1007/s11222-017-9779-x

Trajectory inference and parameter estimation in stochastic models with temporally aggregated data

Maria Myrto Folia 1, Magnus Rattray 1,
PMCID: PMC6096750  PMID: 30147250

Abstract

Stochastic models are of fundamental importance in many scientific and engineering applications. For example, stochastic models provide valuable insights into the causes and consequences of intra-cellular fluctuations and inter-cellular heterogeneity in molecular biology. The chemical master equation can be used to model intra-cellular stochasticity in living cells, but analytical solutions are rare and numerical simulations are computationally expensive. Inference of system trajectories and estimation of model parameters from observed data are important tasks and are even more challenging. Here, we consider the case where the observed data are aggregated over time. Aggregation of data over time is required in studies of single cell gene expression using a luciferase reporter, where the emitted light can be very faint and is therefore collected for several minutes for each observation. We show how an existing approach to inference based on the linear noise approximation (LNA) can be generalised to the case of temporally aggregated data. We provide a Kalman filter (KF) algorithm which can be combined with the LNA to carry out inference of system variable trajectories and estimation of model parameters. We apply and evaluate our method on both synthetic and real data scenarios and show that it is able to accurately infer the posterior distribution of model parameters in these examples. We demonstrate how applying standard KF inference to aggregated data without accounting for aggregation will tend to underestimate the process noise and can lead to biased parameter estimates.

Keywords: Linear noise approximation, Stochastic systems biology, Time aggregation, Kalman filter

Introduction

Stochastic differential equations (SDEs) are used to model the dynamics of processes that evolve randomly over time. SDEs have found a range of applications in finance (e.g. stock markets, Hull 2009), physics (e.g. statistical physics, Gardiner 2004) and biology (e.g. biochemical processes, Wilkinson 2011). Usually, the coefficients (model parameters) of SDEs are unknown and have to be inferred using observations from the systems of interest. Observations are typically partial (e.g. collected at discrete times for a subset of variables), corrupted by measurement noise, and may also be aggregated over time and/or space. Given these observed data, our task is to infer the process trajectory and estimate the model parameters.

A motivating example of stochastic aggregated data comes from biology and more specifically from luminescence bioimaging, where a luciferase reporter gene is used for studying gene expression inside a cell (Spiller et al. 2010). The luminescence intensity emitted from the luciferase experiments is collected from single cells and is integrated over a time period (in certain cases up to 30 min, Harper et al. 2011) and then recorded as a single data point. In this paper, we consider the problem of inferring SDE model parameters given temporally aggregated data of this kind.

Imaging data from single cells are highly stochastic due to the low number of reactant molecules and the inherent stochasticity of cellular processes such as gene transcription or protein translation. The chemical master equation (CME) is widely used to describe the evolution of biochemical reactions inside cells stochastically (Gillespie 1992). Exact inference with the CME is rarely possible and, even when it is, computationally prohibitive. In Golightly and Wilkinson (2005), the authors perform inference using a diffusion approximation of the CME, resulting in a nonlinear SDE. The linear noise approximation (LNA) (van Kampen 2007) has been used as an alternative approximation of the CME which is valid for a sufficiently large system (Komorowski et al. 2009; Fearnhead et al. 2014). Under the LNA, the system is decomposed into a deterministic and a stochastic part. The latter is described by a linear SDE of the following form:

dX_t = a_t X_t \, dt + b_t \, dW_t, \qquad (1)

where X_t is a d-dimensional process, a_t is a (d \times d) matrix-valued function, W_t is an m-dimensional Wiener process, and b_t is a (d \times m) matrix-valued function.

Given an initial condition X0=c, Eq. (1) has the following known solution (Arnold 1974):

X_t = \Phi_t c + \Phi_t \int_{t_0}^{t} \Phi_s^{-1} b_s \, dW_s, \qquad (2)

where \Phi_t is the fundamental matrix of the homogeneous equation dX_t = a_t X_t \, dt. Note that the stochastic integral on the right-hand side of Eq. (2) is a Gaussian process, as it is an integral of a non-random function with respect to W_t (Arnold 1974). If we further assume that the initial condition c is normally distributed or constant, Eq. (2) gives rise to a Gaussian process. Additionally, the solution of a linear SDE is a Markov process (Arnold 1974). These properties of linear SDEs (of the form of Eq. (1)) are highly desirable when carrying out inference.

The approaches above do not treat the aggregated nature of luciferase data in a principled way but instead assume that the data are proportional to the quantity of interest at the measurement time (Harper et al. 2011; Komorowski et al. 2009). Here, we build on the work of Komorowski et al. (2009) and Fearnhead et al. (2014) and extend it to the case of aggregated data. Since we are using the LNA, the problem is equivalent to a parameter inference problem for the time integral \int_{t_0}^{t} X_u \, du of a linear SDE of the form of Eq. (1). We follow a Bayesian approach, where the likelihood of our model is computed using a continuous-discrete Kalman filter (Särkkä 2006) and parameter inference is achieved using an MCMC algorithm. The paper is structured as follows: we first provide a description of the LNA as an approximation of the CME and introduce the integral of the LNA for treating temporally aggregated observations. We then describe a Kalman filter framework for performing inference with the LNA and its integral. Finally, we apply our method to three different examples. The Ornstein–Uhlenbeck process was chosen as a system whose exact solutions can be studied analytically. The Lotka–Volterra model was selected as an example of a nonlinear system with partial observations. The translation inhibition model was used to demonstrate our method on real data.

The linear noise approximation and its integral

The CME can be used to model biochemical reactions inside a cell. It is essentially a forward Kolmogorov equation for a Markov process that describes the evolution of a spatially homogeneous biochemical system over time.

Assume a biochemical reaction network consisting of N chemical species X_1, \ldots, X_N in a volume \Omega and v reactions R_1, \ldots, R_v. The usual notation for such a network is given below:

R_1: \; p_{11}X_1 + p_{12}X_2 + \cdots + p_{1N}X_N \rightarrow q_{11}X_1 + q_{12}X_2 + \cdots + q_{1N}X_N
R_2: \; p_{21}X_1 + p_{22}X_2 + \cdots + p_{2N}X_N \rightarrow q_{21}X_1 + q_{22}X_2 + \cdots + q_{2N}X_N
\vdots
R_v: \; p_{v1}X_1 + p_{v2}X_2 + \cdots + p_{vN}X_N \rightarrow q_{v1}X_1 + q_{v2}X_2 + \cdots + q_{vN}X_N

where X = (X_1, \ldots, X_N)^T represents the numbers of molecules of the chemical species and x = X/\Omega is the vector of concentrations. We denote by P the v \times N matrix with elements p_{ij} and by Q the v \times N matrix with elements q_{ij}. We define the stoichiometry matrix S as S = (Q - P)^T. The probability that reaction R_j takes place in [t, t+dt) is given by h_j(x, \Omega, t)\,\Omega\,dt, where h(x, \Omega, t) is the vector of reaction rates.
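
To make the notation concrete, the short sketch below (our own illustration, not taken from the paper) builds the stoichiometry matrix S = (Q − P)^T for the Lotka–Volterra network used later in the examples; the array names Pre and Post stand for P and Q.

```python
import numpy as np

# Reactant (P) and product (Q) matrices for the Lotka-Volterra network of (23):
# rows are the reactions R1-R3, columns are the species X1 (prey) and X2 (predator).
Pre = np.array([[1, 0],    # R1: X1      -> 2 X1
                [1, 1],    # R2: X1 + X2 -> 2 X2
                [0, 1]])   # R3: X2      -> (nothing)
Post = np.array([[2, 0],
                 [0, 2],
                 [0, 0]])

S = (Post - Pre).T         # stoichiometry matrix, shape (N species) x (v reactions)
print(S)                   # [[ 1 -1  0]
                           #  [ 0  1 -1]]
```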

The probability p(X,t) that the system is in state X at time t is given by the CME:

\frac{dp(X,t)}{dt} = \Omega \sum_{i=1}^{v} \left[ h_i(X - S^{(i)}, \Omega, t)\, p(X - S^{(i)}, t) - h_i(X, \Omega, t)\, p(X, t) \right]. \qquad (3)

However, as mentioned before, exact inference with the CME, even when possible, is computationally prohibitive. We use the LNA as an approximation of the CME due to its successful application in Komorowski et al. (2009) and Fearnhead et al. (2014). The state of the system X is expected to have a peak around the macroscopic value, of order \Omega, with fluctuations of order \Omega^{1/2}, such that X_t = \Omega \phi_t + \Omega^{1/2} \xi_t. In this way the system is decomposed into the deterministic part \phi_t and the stochastic part \xi_t. The LNA arises as a Taylor expansion of the CME in powers of the volume \Omega; for a detailed derivation the reader is referred to van Kampen (2007) and Elf and Ehrenberg (2003). Collecting terms of order \Omega^{1/2} gives the deterministic part of the system, namely the macroscopic rate equations for \phi_i, where i indexes the ith species:

\frac{d\phi_i}{dt} = S_i \, h(\phi_t, \Omega, t). \qquad (4)

Terms of order Ω0 give us the stochastic part of the system:

d\xi_t = A_t \xi_t \, dt + E_t \, dW_t, \qquad (5)

where A_t = S F_t^T with (F_t)_{ij} = \partial h_j(\phi_t, \Omega, t) / \partial \phi_i(t), and E_t E_t^T = S\, \mathrm{diag}(h(\phi_t, \Omega, t))\, S^T. Equation (5) is a linear SDE of the form of Eq. (1). Its solution is a Gaussian Markov process, provided that the initial condition is a constant or a Gaussian random variable. The ordinary differential equations (ODEs) that describe the mean and variance of this Gaussian process are given by Arnold (1974):

\frac{dm_t}{dt} = A_t m_t, \qquad (6)
\frac{dV_t}{dt} = V_t A_t^T + A_t V_t + E_t E_t^T. \qquad (7)

Note that if we set the initial condition m_0 = 0, then Eq. (6) gives m_t = 0 at all times. We will make the assumption that, at each observation point, m_t is reset to zero, since this can be beneficial for inference as discussed in Fearnhead et al. (2014) and Giagos (2010).

In what follows we will assume, without loss of generality, that the volume Ω=1, i.e. the number of molecules equals the concentration of molecules and thus,

X_t = \phi_t + \xi_t. \qquad (8)

Equation (8) is the sum of a deterministic and a Gaussian term; consequently, it is also normally distributed. Taking its expectation and variance, we have that X_t | X_0 \sim N(\phi_t + m_t, V_t), which, with the initial condition m_0 = 0, leads to X_t | X_0 \sim N(\phi_t, V_t).

We are now interested in the integral of Eq. (8), as this will allow us to model the aggregated data,

H_t = \int_{t_0}^{t} X_u \, du = \int_{t_0}^{t} \phi_u \, du + \int_{t_0}^{t} \xi_u \, du = I_t + Q_t. \qquad (9)

The deterministic part of this aggregated process is given by I_t and the stochastic part by Q_t. Consequently, we have the following ODEs:

\frac{dI_t}{dt} = \frac{d}{dt} \int_{t_0}^{t} \phi_u \, du = \phi_t, \qquad (10)
\frac{dQ_t}{dt} = \xi_t. \qquad (11)

Here, Q_t is also a Gaussian process (as it is the integral of a Gaussian process), so we need to compute its mean and variance. The ODEs for the mean, the variance and E[Q_t \xi_t^T] are given below; their proofs can be found in “Appendix A.1”:

\frac{dE[Q_t]}{dt} = E[\xi_t] = 0, \qquad (12)
\frac{d\mathrm{Var}[Q_t]}{dt} = E[Q_t \xi_t^T] + E[\xi_t Q_t^T], \qquad (13)
\frac{dE[Q_t \xi_t^T]}{dt} = E[Q_t \xi_t^T] A_t^T + V_t. \qquad (14)

Note that Q_t on its own is not Markovian, since knowledge of its current state is not sufficient to determine its future evolution. However, jointly with \xi_t it forms a bivariate Gaussian Markov process, characterised by the following linear SDE:

d \begin{pmatrix} \xi_t \\ Q_t \end{pmatrix} = \begin{pmatrix} A_t & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \xi_t \\ Q_t \end{pmatrix} dt + \begin{pmatrix} E_t \\ 0 \end{pmatrix} dW_t, \qquad \begin{pmatrix} \xi_0 \\ Q_0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \qquad (15)

From Eq. (15) we have that \xi_t, Q_t are jointly Gaussian and, consequently, their marginals are also normally distributed. Thus, according to Eq. (9), H_t | H_0, X_0 \sim N(\mu_t, \Sigma_t) with \mu_t = I_t and \Sigma_t = \mathrm{Var}[Q_t].
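
The moment equations above are what the filter integrates between observations. The following sketch (our own code, not part of the paper) propagates Eqs. (4), (7), (10), (13) and (14) over one inter-observation interval with scipy, specialised to a one-dimensional system with constant drift a and diffusion e (such as the OU example of a later section); in the general LNA, A_t and E_t depend on the macroscopic solution \phi_t.

```python
import numpy as np
from scipy.integrate import solve_ivp

def propagate_moments(phi0, V0, a, e, dt):
    """Moments of the LNA and of its integral at the end of one interval.

    phi : deterministic part,  dphi/dt = a*phi        (Eq. 4, linear case)
    V   : Var[xi_t],           dV/dt   = 2*a*V + e**2 (Eq. 7)
    I   : integral of phi,     dI/dt   = phi          (Eq. 10)
    C   : E[Q_t xi_t],         dC/dt   = a*C + V      (Eq. 14)
    G   : Var[Q_t],            dG/dt   = 2*C          (Eq. 13)
    I, C and G are restarted at zero at the beginning of each interval.
    """
    def rhs(t, y):
        phi, V, I, C, G = y
        return [a * phi, 2 * a * V + e**2, phi, a * C + V, 2 * C]

    sol = solve_ivp(rhs, (0.0, dt), [phi0, V0, 0.0, 0.0, 0.0], rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]

# Example: OU process dX = -4 X dt + 2 dW, aggregated over an interval of length 1
print(propagate_moments(phi0=1.0, V0=0.0, a=-4.0, e=2.0, dt=1.0))
```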

Kalman filter for the LNA and its integral

The classical filtering problem is concerned with the problem of estimating the state of a linear system given noisy, indirect or partial observations (Kalman 1960). In our case, the state is continuous and is described by Eq. (8) while the observations are collected at discrete time points with or without Gaussian noise. For this reason, we refer to it as the continuous-discrete filtering problem (Jazwinski 1970; Särkkä 2006).

First, we consider the case where observations are taken from the process X_t and not from its integral H_t. In that case, the observation process is given by y_t = P_t X_t + \epsilon_t, where \epsilon_t \sim N(0, R) accounts for technical noise. The observability matrix P_t is used to deal with the partial observability of the system; for example, if we have two species X_1, X_2 and we observe only X_1, then P = [1, 0].

Following the Kalman filter (KF) methodology, we need to define the following quantities:

  • Prior: p(X0).

  • Predictive distribution: p(Xt|y1:t-1), where y1:t-1 refers to the observations at discrete points up to time t-1.

  • Posterior or Update distribution: p(Xt|y1:t).

The predictive distribution is given by X_t | y_{1:t-1} \sim N(\mu_{1,t}^-, V_t^-), where \mu_{1,t}^- and V_t^- are found by integrating Eqs. (4) and (7) forward over [t-1, t], initialised at the posterior mean \mu_{1,t-1} and variance V_{t-1}. In our case, the mean of the stochastic part is initialised at 0, so \mu_{1,t} corresponds to the deterministic part \phi_t. By updating the deterministic solution at each observation point, we achieve a better estimate, as the ODE solution can become a poor approximation over long periods of time. The posterior distribution p(X_t | y_{1:t}) = N(\mu_{1,t}, V_t) corresponds to the standard posterior distribution of a discrete KF, and the updated \mu_{1,t} and V_t are given in “Appendix A.3”. This case has been thoroughly studied in Fearnhead et al. (2014).

We now consider the case where the state X_t is observed through the integrated process H_t, such that the observation process is given by y_t = P_t H_t + \epsilon_t with \epsilon_t \sim N(0, R). Again, we need to define a prior distribution as well as calculate the predictive and posterior distributions for the system that we are studying.

The predictive distribution of our system is given by

p\left( \begin{pmatrix} X_t \\ H_t \end{pmatrix} \,\Big|\, y_{1:t-1} \right) = N\left( \begin{pmatrix} \mu_{1,t}^- \\ \mu_{2,t}^- \end{pmatrix}, \begin{pmatrix} V_t^- & C_t^{-T} \\ C_t^- & \Sigma_t^- \end{pmatrix} \right),

where C_t = E[Q_t \xi_t^T]. For this step, we need to integrate forward the ODEs (4), (10), (7), (13) and (14) with the appropriate initial conditions, as seen in Algorithm 1. Note that the integrated process H_t needs to be reset to 0 at each observation point in order to correctly capture the ‘area under the graph’ of the underlying process X_t.

To compute the posterior distribution p(Xt|y1:t), we look at the joint distribution of (Ht,Xt,yt) conditioned on y1:t-1:

\begin{pmatrix} X_t \\ H_t \\ y_t \end{pmatrix} \Big|\, y_{1:t-1} \sim N\left( \begin{pmatrix} \mu_{1,t}^- \\ \mu_{2,t}^- \\ P_t \mu_{2,t}^- \end{pmatrix}, \begin{pmatrix} V_t^- & C_t^{-T} & C_t^{-T} P_t^T \\ C_t^- & \Sigma_t^- & \Sigma_t^- P_t^T \\ P_t C_t^- & P_t \Sigma_t^- & P_t \Sigma_t^- P_t^T + R_t \end{pmatrix} \right) \qquad (16)

Using the lemma in “Appendix A.2” with the corresponding blocks of the joint distribution (16), we can calculate the posterior mean and variance of p(X_t | y_{1:t}):

\mu_{1,t} = \mu_{1,t}^- + C_t^{-T} P_t^T \left(P_t \Sigma_t^- P_t^T + R_t\right)^{-1} \left(y_t - P_t \mu_{2,t}^-\right),
V_t = V_t^- - C_t^{-T} P_t^T \left(P_t \Sigma_t^- P_t^T + R_t\right)^{-1} P_t C_t^-. \qquad (17)
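
A direct numpy transcription of the update (17) might look as follows (a sketch under our own naming; mu1, V, mu2, Sigma and C are the predicted moments of X_t, H_t and their cross term from the prediction step, P is the observation matrix and R the measurement-noise covariance). The innovation mean and covariance computed here are also the ingredients of the likelihood discussed next.

```python
import numpy as np
from scipy.stats import multivariate_normal

def aggregated_update(y, mu1, V, mu2, Sigma, C, P, R):
    """Posterior moments of X_t given an aggregated observation y (Eq. 17)."""
    S_y = P @ Sigma @ P.T + R                    # innovation covariance
    gain = C.T @ P.T @ np.linalg.inv(S_y)        # gain acting on the state X_t
    resid = y - P @ mu2                          # innovation uses the *integrated* mean
    mu1_post = mu1 + gain @ resid
    V_post = V - gain @ P @ C
    # predictive log-density of y, i.e. log p(y_t | y_{1:t-1}); see Eq. (18)
    loglik = multivariate_normal.logpdf(y, mean=P @ mu2, cov=S_y)
    return mu1_post, V_post, loglik
```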

Since we are interested in parameter inference, we will need to compute the likelihood L(θ) of the system, where θ represents the parameter vector of the system:

L(\theta) = p(y_1 | \theta) \prod_{i=2}^{t} p(y_i | y_{1:i-1}, \theta). \qquad (18)

The individual terms of the likelihood are given by p(y_t | y_{1:t-1}) = N(P_t \mu_{2,t}^-, \; P_t \Sigma_t^- P_t^T + R_t). Parameter inference is then straightforward, either by using a numerical technique such as the Nelder–Mead algorithm to obtain the maximum likelihood (ML) parameters or by using a Bayesian method such as a Metropolis–Hastings (MH) algorithm. The general procedure for performing inference using aggregated data is summarised in Algorithm 1.
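
For instance, a bare-bones random-walk Metropolis–Hastings loop around the KF log-likelihood could be sketched as below (our own code; `kf_loglik` is a placeholder for the forward pass of Algorithm 1 evaluated at a log-parameter vector, and `log_prior` for the chosen prior).

```python
import numpy as np

def metropolis_hastings(kf_loglik, log_prior, theta0, n_iter=50_000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)                      # log-parameters
    logpost = kf_loglik(theta) + log_prior(theta)
    samples = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)   # random-walk proposal
        logpost_prop = kf_loglik(prop) + log_prior(prop)
        if np.log(rng.uniform()) < logpost_prop - logpost:       # MH accept/reject
            theta, logpost = prop, logpost_prop
        samples[i] = theta
    return samples
```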

The Ornstein–Uhlenbeck process

We first investigate the effect of integration in a one-dimensional, zero-mean OU process of the following form:

dX_t = -\alpha X_t \, dt + \sigma \, dW_t, \qquad (19)

where α is the drift or decay rate of the process and σ is the diffusion constant. Both of these parameters are assumed to be unknown, and we will try to infer them using the KF scheme that we have developed.

The OU process is a special case of a linear SDE (Eq. (1)), since its coefficients are time invariant, resulting in a stationary Gaussian–Markov process. Analytical solutions for both the OU and its integral exist (Gillespie 1996) and are presented in “Appendix A.4”. The results for the mean mt and variance Vt of the OU, where Δ=t-t0, are given below:

m_t = m_0 e^{-\alpha\Delta}, \qquad (20a)
V_t = e^{-2\alpha\Delta} V_0 + \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha\Delta}\right). \qquad (20b)

The integral of Eq. (19) is given by dY_t = X_t \, dt, and its mean, covariance with X_t, and variance are given below:

E[Y_t] = \frac{m_0}{\alpha}\left(1 - e^{-\alpha\Delta}\right), \qquad (21a)
\mathrm{Cov}(X_t, Y_t) = \frac{\sigma^2}{2\alpha^2} + \left(-\frac{\sigma^2}{\alpha^2} + \frac{V_0}{\alpha}\right) e^{-\alpha\Delta} + \left(\frac{\sigma^2}{2\alpha^2} - \frac{V_0}{\alpha}\right) e^{-2\alpha\Delta}, \qquad (21b)
\mathrm{Var}[Y_t] = \frac{\sigma^2}{\alpha^2}\Delta + \left(\frac{\sigma^2}{2\alpha^3} - \frac{V_0}{\alpha^2}\right)\left(1 - e^{-2\alpha\Delta}\right) + 2\left(-\frac{\sigma^2}{\alpha^3} + \frac{V_0}{\alpha^2}\right)\left(1 - e^{-\alpha\Delta}\right). \qquad (21c)
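
These closed-form moments can be used directly as the prediction step of KF2 in this example. A small sketch (our own function name) is:

```python
import numpy as np

def ou_integrated_moments(m0, V0, alpha, sigma, Delta):
    """Moments of the OU process and its integral over an interval Delta (Eqs. 20-21)."""
    ea, e2a = np.exp(-alpha * Delta), np.exp(-2 * alpha * Delta)
    m = m0 * ea                                                   # Eq. (20a)
    V = e2a * V0 + sigma**2 / (2 * alpha) * (1 - e2a)             # Eq. (20b)
    EY = m0 / alpha * (1 - ea)                                    # Eq. (21a)
    covXY = (sigma**2 / (2 * alpha**2)
             + (-sigma**2 / alpha**2 + V0 / alpha) * ea
             + (sigma**2 / (2 * alpha**2) - V0 / alpha) * e2a)    # Eq. (21b)
    varY = (sigma**2 / alpha**2 * Delta
            + (sigma**2 / (2 * alpha**3) - V0 / alpha**2) * (1 - e2a)
            + 2 * (-sigma**2 / alpha**3 + V0 / alpha**2) * (1 - ea))  # Eq. (21c)
    return m, V, EY, covXY, varY
```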

We are interested in inferring the parameters \alpha and \sigma given observations from Y_t at discrete times, where the interval \Delta between two observations is constant. We will compare two approaches. First, we will assume that the data come directly from X_t, ignoring their aggregated nature, and use the standard continuous-discrete KF, referred to as KF1. To make this comparison fairer, we normalise the observations by dividing by \Delta, which brings them close to an average value of the process, in an attempt to match the observations to data generated from the process X_t. In the second case, we will use the KF on the integrated process as in Algorithm 1, which we will refer to as KF2. The case of inferring the parameters of an OU process using non-aggregated data with an MCMC algorithm has already been studied in Mbalawata et al. (2013).

X_t reaches its stationary distribution, N(0, \sigma^2/(2\alpha)), after a time of order 1/\alpha (Gillespie 1992). However, the integrated process Y_t is non-stationary, since \mathrm{Var}[Y_t] \to \infty as \Delta \to \infty. This already shows us that the two processes behave differently.

Since we are going to use the normalised observations from Y_t with KF1, we take a look at the normalised process Z_t = Y_t / \Delta:

E[Z_t] = E[Y_t/\Delta] = \frac{1}{\Delta} E[Y_t] = \frac{m_0}{\alpha\Delta}\left(1 - e^{-\alpha\Delta}\right), \qquad (22a)
\mathrm{Var}[Z_t] = \mathrm{Var}[Y_t/\Delta] = \frac{1}{\Delta^2}\mathrm{Var}[Y_t] = \frac{\sigma^2}{\alpha^2\Delta} + \frac{1}{\Delta^2}\left(\frac{\sigma^2}{2\alpha^3} - \frac{V_0}{\alpha^2}\right)\left(1 - e^{-2\alpha\Delta}\right) + \frac{2}{\Delta^2}\left(-\frac{\sigma^2}{\alpha^3} + \frac{V_0}{\alpha^2}\right)\left(1 - e^{-\alpha\Delta}\right). \qquad (22b)

Taking the limit as \Delta \to \infty in Eq. (22) (using L'Hôpital's rule where needed), we can show that E[Z_t] \to 0 and \mathrm{Var}[Z_t] \to 0: the mean decays like 1/\Delta, and the dominant term of the variance, \sigma^2/(\alpha^2\Delta), also vanishes. So the normalised process again does not approach the stationary distribution of X_t.

We have generated aggregated data from the integral of an OU process with \alpha = 4 and \sigma = 2. To simulate data from Y_t, we first need to simulate data from X_t. This can be done in general by discretising the process and using the Euler–Maruyama algorithm; however, for the OU process we can also use an exact updating formula (see “Appendix A.6”). The aggregated data can then be collected using the discretised form Y_{t+dt} = Y_t + X_t \, dt, or a numerical integration method such as the trapezoidal rule over the indicated integration period. In “Appendix A.12” we have included plots of the OU process and the corresponding aggregated process.
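
A possible simulation routine for these aggregated data (our own sketch, using the exact update of “Appendix A.6” and a simple Riemann sum for the integral; no measurement noise) is:

```python
import numpy as np

def simulate_aggregated_ou(alpha=4.0, sigma=2.0, x0=1.0, Delta=1.0,
                           n_obs=100, n_sub=1000, seed=1):
    """Aggregated observations of an OU process over n_obs windows of length Delta."""
    rng = np.random.default_rng(seed)
    dt = Delta / n_sub
    decay = np.exp(-alpha * dt)
    sd = np.sqrt(sigma**2 / (2 * alpha) * (1 - np.exp(-2 * alpha * dt)))
    x, y_obs = x0, []
    for _ in range(n_obs):
        area = 0.0
        for _ in range(n_sub):
            area += x * dt                                   # Y_{t+dt} = Y_t + X_t dt
            x = x * decay + sd * rng.standard_normal()       # exact OU update (Eq. 50)
        y_obs.append(area)                                   # one aggregated observation per window
    return np.array(y_obs)

y = simulate_aggregated_ou()
print(y[:5])
```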

We tested inference using KF1 with normalised data and KF2 with aggregated data. Results of parameter estimation using a standard random walk MH algorithm are presented in Table 1. Improper uniform priors over the whole real line were used on the log parameters, and different time intervals \Delta were considered. For each interval \Delta, we sampled 100 observations from a single trajectory of an OU process with \alpha = 4 and \sigma = 2, aggregated over the specified \Delta. For this example, we assumed no observation noise. MCMC traceplots of \alpha and \sigma can be found in “Appendix A.13” (Figs. 6, 7), which indicate good mixing of the chain and fast convergence. All chains were run for 50K iterations and 30K were discarded as burn-in. To verify the validity of the results, we repeated the analysis on nine further datasets; an average over the ten datasets can be found in “Appendix A.7” (Table 5). As we can see, the estimates for KF1 deteriorate for larger \Delta. This is expected, since the aggregated process diverges further from the OU process as \Delta increases. Estimates remain good for KF2 even when \Delta is large, although they become more uncertain, as witnessed by the increased standard deviations. Filtering results for KF1 and KF2 with aggregated data, using the estimated parameters for \Delta = 1, are given in “Appendix A.14”.

Table 1.

Mean posterior ± 1 s.d. for α and σ using a Metropolis-Hastings algorithm

Δ KF α σ
0.1 KF1 3.023±0.235 1.891±0.135
0.5 KF1 1.905±0.141 1.256±0.095
1.0 KF1 1.420±0.102 0.868±0.068
2.0 KF1 1.022±0.075 0.540±0.044
0.1 KF2 4.022±0.295 2.113±0.159
0.5 KF2 4.092±0.335 2.311±0.206
1.0 KF2 3.865±0.368 2.234±0.240
2.0 KF2 3.704±0.513 2.082±0.307

Data were simulated from an OU process with α=4 and σ=2

Fig. 6.

MCMC traces of the posterior of α using a random walk MH for both KF1 and KF2. Ground truth for α=4

Fig. 7.

MCMC traces of the posterior of σ using a random walk MH for both KF1 and KF2. Ground truth for σ=2

Table 5.

Average of mean posterior ± 1 s.d. over 10 different datasets for α and σ using a Metropolis–Hastings algorithm

Δ KF α σ
0.1 KF1 3.081±0.258 1.670±0.209
0.5 KF1 1.956±0.153 1.199±0.125
1.0 KF1 1.493±0.112 0.799±0.088
2.0 KF1 1.064±0.090 0.485±0.046
0.1 KF2 4.171±0.417 1.974±0.208
0.5 KF2 4.121±0.377 2.068±0.257
1.0 KF2 4.123±0.445 2.083±0.283
2.0 KF2 4.208±0.783 2.091±0.371

Data were simulated from an OU process with α=4 and σ=2

It is of interest to investigate the inferred stationary variance of the OU process using KF1 and KF2. We have plotted the inferred stationary variances obtained by the MH algorithm for both KF1 and KF2 in Fig. 1. The boxplots are obtained using the average of 10 different datasets and correspond again to an OU process with \alpha = 4 and \sigma = 2, giving a stationary variance of \sigma^2/(2\alpha) = 0.5. When using the normalised aggregated data directly with KF1, we infer the wrong stationary variance of the underlying OU process, which tends to zero as \Delta becomes larger, consistent with the theoretical results from Eq. (22). Intuitively, we can attribute this behaviour to the fact that aggregated data have relatively smaller fluctuations, so that KF1 will tend to underestimate the process variance.

Fig. 1.

Boxplots of inferred stationary variance of the OU process for different Δ. The simulated OU process has α=4 and σ=2 corresponding to a stationary variance of 0.5, as indicated by the dotted horizontal line. The inferred stationary variance using KF1 tends to zero as Δ grows, but the stationary variance from KF2 is inferred correctly at all Δ. a Boxplots of inferred stationary variance for different Δ using KF1. b Boxplots of inferred stationary variance for different Δ using KF2

In this section, we have looked at an example of inferring the parameters of an SDE using aggregated data, and we have found that to obtain accurate results we need to explicitly model the aggregated process. As the observation intervals become larger, there is a greater mismatch between KF1 and KF2. In the next two sections, we will look at examples of more complex stochastic systems that must be approximated by the LNA and compare again inference results using KF1 and KF2.

Lotka–Volterra model

We are now going to look at a system of two species that interact with each other according to three reactions:

X_1 \xrightarrow{\;\theta_1\;} 2X_1 \qquad \text{(prey production)} \qquad (23a)
X_1 + X_2 \xrightarrow{\;\theta_2\;} 2X_2 \qquad \text{(predator production)} \qquad (23b)
X_2 \xrightarrow{\;\theta_3\;} \emptyset \qquad \text{(predator death)} \qquad (23c)

The model represented by the biochemical reaction network (23) is known as the Lotka–Volterra model, with X_1 representing the prey species and X_2 the predator species. Although simple, it has been used as a reference model (Boys et al. 2008; Fearnhead et al. 2014): it consists of two species, so the system can be observed partially through one of them, and it provides a simple example of a nonlinear system.

The LNA can be used to approximate the dynamics and the resulting ODEs can be found in “Appendix A.8”. We want to compare parameter estimation results using KF1 and KF2. We collected aggregated data from a Lotka–Volterra model using the Gillespie algorithm. We assumed a known initial population of 10 prey species and 100 predator species. The parameters of the system for producing the synthetic data were set to (θ1,θ2,θ3)=(0.5,0.0025,0.3), following (Boys et al. 2008). We have added Gaussian noise with standard deviation set to 3.0, and we assumed that the noise level was known for inference. Our goal was to infer the three parameters (θ1,θ2,θ3) of the system using aggregated observations solely from the predator population.
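
A sketch of this data-generation step (our own code; it uses the 20-min run length and 2-min aggregation windows described in the next paragraph, and integrates the piecewise-constant predator count over each window) is shown below.

```python
import numpy as np

def gillespie_lv(theta, x0=(10, 100), T=20.0, seed=2):
    """One SSA trajectory of model (23); returns event times and predator counts."""
    rng = np.random.default_rng(seed)
    th1, th2, th3 = theta
    x1, x2 = x0
    t, times, pred = 0.0, [0.0], [x2]
    while t < T:
        h = np.array([th1 * x1, th2 * x1 * x2, th3 * x2])   # hazards of (23a)-(23c)
        h0 = h.sum()
        if h0 == 0:                                         # both populations extinct
            break
        t += rng.exponential(1.0 / h0)                      # time to the next reaction
        j = rng.choice(3, p=h / h0)                         # which reaction fires
        if j == 0:
            x1 += 1                                         # prey birth
        elif j == 1:
            x1, x2 = x1 - 1, x2 + 1                         # predation
        else:
            x2 -= 1                                         # predator death
        times.append(t)
        pred.append(x2)
    return np.array(times), np.array(pred)

def aggregate(times, values, window=2.0, T=20.0, dt=0.001):
    """Integrate the piecewise-constant `values` over consecutive windows of length `window`."""
    grid = np.arange(0.0, T, dt)
    idx = np.searchsorted(times, grid, side="right") - 1    # value in force at each grid time
    return values[idx].reshape(-1, int(round(window / dt))).sum(axis=1) * dt

times, pred = gillespie_lv((0.5, 0.0025, 0.3))
print(aggregate(times, pred))                               # 10 aggregated predator observations
```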

The Gillespie algorithm was run for 20 min. Data were aggregated and collected every 2 min, resulting in 10 observations per sample. To infer the parameters, we assumed that we had 40 independent samples available. Since we assumed independence between the samples, we worked with the product of their likelihoods. In the ideal case of complete data from a stochastic kinetic model, the likelihood is conjugate to independent gamma priors for the rate constants (Wilkinson 2011). The choice of Ga(2, 10), with shape 2 and rate 10, gives a reasonable range for all three parameters and has also been used by Fearnhead et al. (2014). However, in this case the choice of prior is not important, as the data dominate the posterior: we have run the same experiment using uninformative exponential priors Exp(10^{-4}), which resulted in equivalent posterior distributions. Since all parameters must be positive, we worked with a log transformation. MCMC convergence in this example is relatively slow, and adaptive MCMC (Sherlock et al. 2010) was found to speed up convergence (see “Appendix A.9” for details). The adaptive MCMC was run for 30K iterations with 10K regarded as burn-in, and was initialised at random values sampled from uniform distributions. Parameter estimation results for all three parameters using adaptive MCMC are shown in Table 2, while Fig. 2 shows histograms of their posterior densities. The ground truth value for each parameter is indicated by a vertical blue line. We can see that only the posterior histograms corresponding to KF2 include the correct estimate for all three parameters in their support. In “Appendix A.15”, we have included traceplots of the MCMC runs for all three parameters, where we can see that the adaptive MCMC leads to fast convergence for both KF1 and KF2. In order to verify the validity of our results, we have run an extra 100 datasets, each consisting of 40 independent samples, and obtained point estimates from KF1 and KF2 using the Nelder–Mead algorithm. The results can be found in “Appendix A.10” and agree with our previous conclusion that inference with KF1 gives inaccurate estimates.

Table 2.

Mean posterior ± 1 s.d. for θ1,θ2,θ3 using an adaptive MCMC

θ Ground truth KF1 KF2
θ1 0.5 0.480±0.006 0.494±0.005
θ2 0.0025 0.0023±5×10^{-5} 0.0025±5×10^{-5}
θ3 0.3 0.243±0.010 0.298±0.010

Data were simulated from a Lotka–Volterra model according to the ground truth values

Fig. 2.

Posterior densities of θ1,θ2,θ3 from aggregated data using KF1 (red histogram) and KF2 (green histogram). a Posterior density of θ1. b Posterior density of θ2. c Posterior density of θ3

Assuming knowledge of the parameter values, we can also use the KF for trajectory inference. In Fig. 3, we demonstrate filtering results for the prey population assuming that we have aggregated data. We simulated a trajectory using θ1=0.5,θ2=0.0025,θ3=0.3 and sampled aggregated data every 2 min. Black lines represent the true trajectory of the populations. We see that the inferred credible region with KF1 does not contain the true underlying trajectory in many places. Note that red dots correspond to normalised (aggregated) observations for KF1 and aggregated observations for KF2, so they do not have the same values. In “Appendix A.16”, we include filtering results for the unobserved predator population.

Fig. 3.

Filtering results for the prey population. Red dots correspond to aggregated observations for KF2 and normalised observations for KF1. The black line represents the actual process. Purple lines represent the mean estimate and green lines 1 s.d. a Filtering results for the prey population using KF1. b Filtering results for the prey population using KF2

Translation inhibition model

In this example, we are interested in inferring the degradation rate of a protein from a translation inhibition experiment. We model the translation inhibition experiment by the following set of reactions where R stands for mRNA and P for protein:

R \xrightarrow{\;c_P\;} R + P \qquad \text{(translation)} \qquad (24a)
P \xrightarrow{\;d_P P/\Omega\;} \emptyset \qquad \text{(protein degradation)} \qquad (24b)

The LNA is used, again, as an approximation of the dynamics, and the resulting system of ODEs can be found in “Appendix A.11”. Before applying our method to real data from this system, we test its performance on synthetic data simulated using the Gillespie algorithm. We simulated 30 time series (corresponding to 30 different cells), assuming the following ground truth values for the kinetic parameters: c_P = 200 and d_P = 0.97. We further set the initial protein abundance m_0 to 400 molecules. We scaled the data by a factor k = 0.03, so that they are proportional to the original synthetic data, and added Gaussian noise with a variance of s = 0.1. For this study, we assumed that data were integrated over 30 min.

Again we use an adaptive MCMC algorithm (Sherlock et al. 2010). Non-informative exponential priors with mean 10^4 were placed on all parameters. We adopted the parameterisation used in Komorowski et al. (2009) and Finkenstädt et al. (2013), such that \tilde{c}_P = k \cdot c_P and \tilde{m}_0 = k \cdot m_0, and worked in the log parameter space. Parameter estimation results for the vector (c_P, d_P, s, k, m_0) using KF1 and KF2 are summarised in Table 3. As we can see, the degradation rate is successfully inferred by both approaches. However, using KF1 leads to an overestimation of m_0 and an underestimation of the noise level s, which corresponds to a smoother process than the underlying one. MCMC traces from both KF1 and KF2 are presented in Fig. 11.

Table 3.

Mean posterior ± 1 s.d. for (cP,dP,s,k,m0) using an adaptive MCMC

Parameter GT KF1 KF2
cp 200 254.152±23.3329 196.9065±25.6251
dp 0.97 0.9822±0.0364 0.9974±0.0433
s 0.1 0.0349±0.0251 0.0995±0.0093
k 0.03 0.0236±0.0017 0.0312±0.0039
m0 400 588.9959±44.0205 392.5980±49.0594

Data were simulated from a translation inhibition model according to the ground truth (GT) values

Fig. 11.

Adaptive MCMC traces of the posterior vector (cp,dp,s,k,m0) using synthetic data with KF1 and KF2

We then applied our model to single cell luciferase data from a subset of 11 pituitary cells (Harper et al. 2011). Parameter estimation results using the same adaptive MCMC are summarised in Table 4. The MCMC was run for 100K iterations, out of which 60K were discarded as burn-in. Again, we observe that, using KF1, we get a higher m_0 and a slightly lower noise level s. Posterior histograms of the degradation rates are shown in Fig. 4. A deterministic approach for fitting the data would give a degradation rate of around 1.02, and, as we can see, this value is included in both histograms of Fig. 4. To check convergence using the Gelman–Rubin statistic, we have run three different chains with different initialisations. MCMC traces for both KF1 and KF2 are shown in “Appendix A.18” (Figs. 12, 13), where we can see that the three chains are very close to each other, corresponding to a Gelman–Rubin statistic close to 1.

Table 4.

Mean posterior ± 1 s.d. for (cP,dP,s,k,m0) using adaptive MCMC with single cell data obtained from a subset of 11 pituitary cells from a translation inhibition experiment (Harper et al. 2011)

Parameter KF1 KF2
cp 217.2987±33.5441 169.9254±43.1153
dp 1.1020±0.0767 1.2037±0.1046
s 0.0026±0.0026 0.0081±0.0038
k 0.0255±0.0029 0.0373±0.0088
m0 449.7679±53.9760 278.2987±70.6582

Fig. 4.

Posterior histograms of degradation rate using KF1 and KF2

Fig. 12.

Three MCMC chains of the posterior vector (cp,dp,s,k,m0) using single cell data with KF1

Fig. 13.

Three MCMC chains of the posterior vector (cp,dp,s,k,m0) using single cell data with KF2

Discussion

We have presented a Bayesian framework for doing inference using aggregated observations from a stochastic process. Motivated by a systems biology example, we chose to use the LNA to approximate the dynamics of the stochastic system, leading to a linear SDE. We then developed a Kalman filter that can deal with integrated, partial and noisy data. We have compared our new inference procedure to the standard Kalman filter which has previously been applied in systems biology applications approximated using the LNA. Overall, we conclude that the aggregated nature of data should be considered when modelling data, as aggregation will tend to reduce fluctuations and therefore the stochastic contribution of the process may be underestimated.

In Sect. 4, we described the different properties of a stochastic process and its integral in the case of the Ornstein–Uhlenbeck process. We showed that one cannot simply treat the integrated observations as proportional to observations coming from the underlying unintegrated process when carrying out inference. As the aggregation time window increases, parameter estimates using this approach become less accurate and the inferred stationary variance of the process is underestimated. In contrast, our modified KF is able to accurately estimate the model parameters and stationary variance of the process.

In Sect. 5, we have demonstrated the ability of our method to give more accurate results in a Lotka–Volterra model given synthetic aggregated data. In Sect. 6, we looked at a real-world application with data from a translation inhibition experiment carried out in single cells. As the LNA depends on its deterministic part, and in a deterministic system integration is dealt with reasonably well using the simple proportionality constant approach, some of the system parameters, such as the degradation rate, can be inferred reasonably well by the standard non-aggregated data approach. However, neglecting the aggregated nature of the data does lead to a significantly larger estimate of the initial population of molecules even in this simple application. This is consistent with our observation that neglecting aggregation will tend to underestimate the scale of fluctuations as it is the number of molecules that determines the size of fluctuations in this example. In models where noise plays a more critical role, e.g. systems with noise-induced oscillations, the effect of parameter misspecification could have more serious consequences on model-based inferences.

Our proposed inference method can deal with the intrinsic noise inside a cell, measurement noise and temporal aggregation. However, cell populations are highly heterogeneous, and cell-to-cell variability has not been considered in our current inference scheme. It would be possible to deal with cell-to-cell variability using a hierarchical model (Finkenstädt et al. 2013) which could be combined with the integrated data Kalman Filter developed here.

All experiments were carried out on a cluster of 64bit Ubuntu machines with an i5-3470 CPU @ 3.20 GHz x 4 processor and 8 GB RAM. All scripts were run in Spyder (Anaconda 2.5.0, Python 2.7.11, Numpy 1.10.4). Code reproducing the results of the experiments can be found on GitHub https://github.com/maria-myrto/inference-aggregated.

Acknowledgements

MR was funded by the UK’s Medical Research Council (award MR/M008908/1).

Appendix

A.1 Mean and variance of the integrated process

We start by computing E[Qt], i.e. the mean of Qt. We know that:

d\xi_t = A_t \xi_t \, dt + E_t \, dW_t, \qquad (25)
dQ_t = \xi_t \, dt \;\;\Leftrightarrow\;\; Q_{t+dt} = Q_t + \xi_t \, dt. \qquad (26)

Averaging Eq. (26), dividing by dt and letting dt \to 0 gives us:

E[Q_{t+dt}] = E[Q_t] + E[\xi_t]\, dt \;\Rightarrow\; E[Q_{t+dt}] - E[Q_t] = E[\xi_t]\, dt \;\Rightarrow\; \frac{dE[Q_t]}{dt} = E[\xi_t] = 0. \qquad (27)

The mean of \xi_t is zero because we use the restarting LNA; since Q_t is also reset to zero at each observation point, it follows that E[Q_t] = 0.

We now need to compute the covariance between Q_t and \xi_t. Again E[Q_t] = 0 and E[\xi_t] = 0 since we are using the restarting LNA, and thus the covariance is given by E[Q_t \xi_t^T]. For our derivation, we need to use:

\xi_{t+dt}^T = \xi_t^T + \xi_t^T A_t^T \, dt + dW_t^T E_t^T. \qquad (28)

By multiplying Eqs. (26) and (28) we get:

Q_{t+dt}\,\xi_{t+dt}^T = (Q_t + \xi_t \, dt)(\xi_t^T + \xi_t^T A_t^T \, dt + dW_t^T E_t^T)
= Q_t \xi_t^T + Q_t \xi_t^T A_t^T \, dt + Q_t \, dW_t^T E_t^T + \xi_t \xi_t^T \, dt + \xi_t \xi_t^T A_t^T \, dt\, dt + \xi_t \, dW_t^T E_t^T \, dt. \qquad (29)

Averaging the result (29), retaining terms up to first order in dt, dividing by dt and letting dt \to 0, we get:

E[Q_{t+dt}\,\xi_{t+dt}^T] = E[Q_t \xi_t^T] + E[Q_t \xi_t^T] A_t^T \, dt + E[Q_t \, dW_t^T] E_t^T + E[\xi_t \xi_t^T]\, dt,
\frac{dE[Q_t \xi_t^T]}{dt} = E[Q_t \xi_t^T] A_t^T + E[\xi_t \xi_t^T],
\frac{dE[Q_t \xi_t^T]}{dt} = E[Q_t \xi_t^T] A_t^T + V_t. \qquad (30)

The variance of Q_t is given by \mathrm{Var}[Q_t] = E[Q_t Q_t^T], since E[Q_t] = 0. We have that:

Q_{t+dt}\,Q_{t+dt}^T = (Q_t + \xi_t \, dt)(Q_t + \xi_t \, dt)^T = Q_t Q_t^T + Q_t \xi_t^T \, dt + \xi_t Q_t^T \, dt + \xi_t \xi_t^T \, dt\, dt. \qquad (31)

By averaging (31), retaining terms up to first order in dt, dividing by dt and letting dt \to 0, we get:

E[Q_{t+dt}\,Q_{t+dt}^T] = E[Q_t Q_t^T] + E[Q_t \xi_t^T]\, dt + E[\xi_t Q_t^T]\, dt,
E[Q_{t+dt}\,Q_{t+dt}^T] - E[Q_t Q_t^T] = E[Q_t \xi_t^T]\, dt + E[\xi_t Q_t^T]\, dt,
\frac{d\mathrm{Var}[Q_t]}{dt} = E[Q_t \xi_t^T] + E[\xi_t Q_t^T]. \qquad (32)

A.2 Useful Gaussian identities

Let x and y be jointly Gaussian random vectors:

\begin{pmatrix} x \\ y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \begin{pmatrix} A & C \\ C^T & B \end{pmatrix} \right) \qquad (33)

Then, the marginal and conditional distributions of x (equivalently for y) are, respectively (Bishop 2007):

x \sim N(\mu_x, A) \qquad (34)
x \,|\, y \sim N\left(\mu_x + C B^{-1}(y - \mu_y),\; A - C B^{-1} C^T\right) \qquad (35)

A.3 Update equations of a discrete Kalman filter

Using the Gaussian Identities in A.2 we have

\begin{pmatrix} X_i \\ y_i \end{pmatrix} \Big|\, y_{1:(i-1)} \sim N\left( \begin{pmatrix} m_i^- \\ P m_i^- \end{pmatrix}, \begin{pmatrix} S_i^- & S_i^- P^T \\ P S_i^- & P S_i^- P^T + R \end{pmatrix} \right) \qquad (36)

Since we are working with Gaussians, we know that X_i | y_{1:i} \sim N(m_i, S_i), where the updated m_i and S_i are given in terms of the predicted m_i^- and S_i^- by:

m_i = m_i^- + S_i^- P^T \left(P S_i^- P^T + R\right)^{-1}\left(y_i - P m_i^-\right),
S_i = S_i^- - S_i^- P^T \left(P S_i^- P^T + R\right)^{-1} P S_i^-. \qquad (37)
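
For reference, Eq. (37) translates directly into a few lines of numpy (our own sketch; m_pred and S_pred are the predicted mean and covariance, P the observation matrix and R the noise covariance):

```python
import numpy as np

def kf_update(m_pred, S_pred, y, P, R):
    """Standard discrete Kalman filter update of Eq. (37)."""
    S_y = P @ S_pred @ P.T + R                 # innovation covariance
    K = S_pred @ P.T @ np.linalg.inv(S_y)      # Kalman gain
    m = m_pred + K @ (y - P @ m_pred)
    S = S_pred - K @ P @ S_pred
    return m, S
```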

A.4 Analytical solutions for the OU process and its integral

Given an OU process of the following form:

dX_t = -\alpha X_t \, dt + \sigma \, dW_t \qquad (38)

we can derive its solution according to the general theory for linear SDEs. Since the solution is a Gaussian process, we only need to determine its mean and variance, which are given by Eqs. (6) and (7). All the ODEs in this case are first-order linear ODEs with constant coefficients, so using, for example, an integrating factor, we can derive the following solution for an ODE of the form dx/dt + a x = g(t) with x(t_0) = x_0:

x_t = e^{-a(t-t_0)} x_0 + \int_{t_0}^{t} e^{-a(t-\tau)} g(\tau)\, d\tau. \qquad (39)

For the mean we get from Eq. (6):

\frac{dm_t}{dt} = -\alpha m_t, \quad m_{t_0} = m_0 \;\Rightarrow\; m_t = m_0 e^{-\alpha(t-t_0)} \qquad (40)

For the variance we have the following:

\frac{dV_t}{dt} = -2\alpha V_t + \sigma^2, \quad V_{t_0} = V_0
\Rightarrow V_t = e^{-2\alpha(t-t_0)} V_0 + \int_{t_0}^{t} e^{-2\alpha(t-\tau)} \sigma^2 \, d\tau
\Rightarrow V_t = e^{-2\alpha(t-t_0)} V_0 + \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha(t-t_0)}\right) \qquad (41)

For the solution of the integrated OU process dY_t/dt = X_t, we need to calculate its mean, covariance and variance, given by Eqs. (12), (13) and (14). The initial conditions for these ODEs are set to 0, since at each observation point the integrated process starts from 0. For clarity, we will use the integrals A, B and C from “Appendix A.5”.

First we find the mean:

\frac{dE_t}{dt} = m_t = m_0 e^{-\alpha(t-t_0)}, \quad E_{t_0} = 0
\Rightarrow E_t = \int_{t_0}^{t} m_0 e^{-\alpha(\tau-t_0)} \, d\tau \overset{A}{=} \frac{m_0}{\alpha}\left(1 - e^{-\alpha(t-t_0)}\right) \qquad (42)

For the covariance, we first calculate from Eq. (13):

\frac{dE[X_t Y_t]}{dt} = -\alpha E[X_t Y_t] + E[X_t^2], \quad E[X_{t_0} Y_{t_0}] = 0
\Rightarrow E[X_t Y_t] = \int_{t_0}^{t} E[X_\tau^2]\, e^{-\alpha(t-\tau)} \, d\tau, \qquad E[X_\tau^2] = V_\tau + m_\tau^2
\Rightarrow E[X_t Y_t] = \int_{t_0}^{t} \left[ \left(m_0^2 - \frac{\sigma^2}{2\alpha} + V_0\right) e^{-2\alpha(\tau-t_0)} + \frac{\sigma^2}{2\alpha} \right] e^{-\alpha(t-\tau)} \, d\tau
\overset{A,\,C}{=} \frac{\sigma^2}{2\alpha^2}\left(1 - e^{-\alpha(t-t_0)}\right) + \frac{1}{\alpha}\left(m_0^2 - \frac{\sigma^2}{2\alpha} + V_0\right)\left(e^{-\alpha(t-t_0)} - e^{-2\alpha(t-t_0)}\right) \qquad (43)

Now the covariance can be calculated from:

\mathrm{Cov}(X_t, Y_t) = E[X_t Y_t] - m_t E_t = \frac{\sigma^2}{2\alpha^2} + \left(-\frac{\sigma^2}{\alpha^2} + \frac{V_0}{\alpha}\right) e^{-\alpha(t-t_0)} + \left(\frac{\sigma^2}{2\alpha^2} - \frac{V_0}{\alpha}\right) e^{-2\alpha(t-t_0)} \qquad (44)

For the variance, we need to calculate:

\frac{dE[Y_t^2]}{dt} = 2E[X_t Y_t], \quad E[Y_{t_0}^2] = 0 \;\Rightarrow\; E[Y_t^2] = 2\int_{t_0}^{t} E[X_\tau Y_\tau] \, d\tau
\overset{(43),\,A,\,B}{=} \frac{\sigma^2}{\alpha^2}(t-t_0) - \frac{\sigma^2}{\alpha^3}\left(1 - e^{-\alpha(t-t_0)}\right) + \left(m_0^2 - \frac{\sigma^2}{2\alpha} + V_0\right)\left[\frac{2}{\alpha^2}\left(1 - e^{-\alpha(t-t_0)}\right) - \frac{1}{\alpha^2}\left(1 - e^{-2\alpha(t-t_0)}\right)\right] \qquad (45)

Now we can derive the variance:

\mathrm{Var}[Y_t] = E[Y_t^2] - E_t^2 = \frac{\sigma^2}{\alpha^2}(t-t_0) + \left(\frac{\sigma^2}{2\alpha^3} - \frac{V_0}{\alpha^2}\right)\left(1 - e^{-2\alpha(t-t_0)}\right) + 2\left(-\frac{\sigma^2}{\alpha^3} + \frac{V_0}{\alpha^2}\right)\left(1 - e^{-\alpha(t-t_0)}\right) \qquad (46)

A.5 Frequently used integrals for Appendix A.4

A = \int_{t_0}^{t} e^{-\alpha(\tau-t_0)} \, d\tau = \frac{1}{\alpha}\left(1 - e^{-\alpha(t-t_0)}\right) \qquad (47)
B = \int_{t_0}^{t} e^{-2\alpha(\tau-t_0)} \, d\tau = \frac{1}{2\alpha}\left(1 - e^{-2\alpha(t-t_0)}\right) \qquad (48)
C = \int_{t_0}^{t} e^{-\alpha(t-\tau)} e^{-2\alpha(\tau-t_0)} \, d\tau = \frac{1}{\alpha}\left(e^{-\alpha(t-t_0)} - e^{-2\alpha(t-t_0)}\right) \qquad (49)

A.6 Exact updating formula for the OU process

The OU process dX_t = -\alpha X_t \, dt + \sigma \, dW_t admits an exact update formula given by Gillespie (1992):

X_{t+dt} = X_t e^{-\alpha\, dt} + \sqrt{\frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha\, dt}\right)}\; N(0,1), \qquad (50)

A.7 Average over 10 datasets—OU example

See Table 5.

A.8 LNA for the Lotka–Volterra model

The Lotka–Volterra model (23) gives rise to the stoichiometry matrix,

S = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \qquad (51)

with transition rates,

h(X) = \begin{pmatrix} \theta_1 X_1 \\ \theta_2 X_1 X_2 \\ \theta_3 X_2 \end{pmatrix}. \qquad (52)

The following matrices need to be computed:

F = \begin{pmatrix} \theta_1 & \theta_2 y_2 & 0 \\ 0 & \theta_2 y_1 & \theta_3 \end{pmatrix}, \qquad (53)
S F^T = A = \begin{pmatrix} \theta_1 - \theta_2 y_2 & -\theta_2 y_1 \\ \theta_2 y_2 & \theta_2 y_1 - \theta_3 \end{pmatrix}, \qquad (54)
S\,\mathrm{diag}(h(y_t))\,S^T = E E^T = \begin{pmatrix} \theta_1 y_1 + \theta_2 y_1 y_2 & -\theta_2 y_1 y_2 \\ -\theta_2 y_1 y_2 & \theta_2 y_1 y_2 + \theta_3 y_2 \end{pmatrix}. \qquad (55)

The macroscopic rate equations are now given by:

\frac{dy_1}{dt} = \theta_1 y_1 - \theta_2 y_1 y_2 \qquad (56)
\frac{dy_2}{dt} = \theta_2 y_1 y_2 - \theta_3 y_2 \qquad (57)

For the diffusion terms, we only need to compute the variance of the resulting Gaussian process since we restart the stochastic part at each observation point in accordance with (Fearnhead et al. 2014).

\frac{dV}{dt} = V A^T + E E^T + A V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \begin{pmatrix} \theta_1 - \theta_2 y_2 & \theta_2 y_2 \\ -\theta_2 y_1 & \theta_2 y_1 - \theta_3 \end{pmatrix} + \begin{pmatrix} \theta_1 y_1 + \theta_2 y_1 y_2 & -\theta_2 y_1 y_2 \\ -\theta_2 y_1 y_2 & \theta_2 y_1 y_2 + \theta_3 y_2 \end{pmatrix} + \begin{pmatrix} \theta_1 - \theta_2 y_2 & -\theta_2 y_1 \\ \theta_2 y_2 & \theta_2 y_1 - \theta_3 \end{pmatrix} \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \qquad (58)

V is a symmetric matrix so V12=V21. So:

\frac{dV_{11}}{dt} = 2V_{11}(\theta_1 - \theta_2 y_2) - 2V_{12}\theta_2 y_1 + \theta_2 y_1 y_2 + \theta_1 y_1 \qquad (59)
\frac{dV_{12}}{dt} = V_{12}(\theta_2 y_1 - \theta_3 + \theta_1 - \theta_2 y_2) + V_{11}\theta_2 y_2 - \theta_2 y_1 V_{22} - \theta_2 y_1 y_2 \qquad (60)
\frac{dV_{22}}{dt} = 2V_{22}(\theta_2 y_1 - \theta_3) + 2V_{12}\theta_2 y_2 + \theta_2 y_1 y_2 + \theta_3 y_2 \qquad (61)

The integrated process dY_t = X_t \, dt follows Eqs. (10), (13) and (14). The deterministic part is given by:

\frac{dI_1}{dt} = y_1, \qquad \frac{dI_2}{dt} = y_2. \qquad (62)

The ODEs for its integrated variance and its covariance with the underlying process X_t are given below, where \mathrm{Cov}(Y X^T) = C_t = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} and \mathrm{Var}(Y) = G_t:

\frac{dC}{dt} = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} \begin{pmatrix} \theta_1 - \theta_2 y_2 & \theta_2 y_2 \\ -\theta_2 y_1 & \theta_2 y_1 - \theta_3 \end{pmatrix} + \begin{pmatrix} V_{11} & V_{12} \\ V_{12} & V_{22} \end{pmatrix}, \qquad (63)

so that:

\frac{dC_{11}}{dt} = (\theta_1 - \theta_2 y_2)C_{11} - \theta_2 y_1 C_{12} + V_{11} \qquad (64)
\frac{dC_{12}}{dt} = \theta_2 y_2 C_{11} + (\theta_2 y_1 - \theta_3)C_{12} + V_{12} \qquad (65)
\frac{dC_{21}}{dt} = (\theta_1 - \theta_2 y_2)C_{21} - \theta_2 y_1 C_{22} + V_{12} \qquad (66)
\frac{dC_{22}}{dt} = \theta_2 y_2 C_{21} + (\theta_2 y_1 - \theta_3)C_{22} + V_{22} \qquad (67)
\frac{dG}{dt} = C_t + C_t^T \qquad (68)
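
Collecting Eqs. (56)–(68) into a single right-hand side suitable for a numerical ODE solver gives, for example, the following sketch (our own code and state ordering; V is initialised at the posterior variance carried over from the previous update, while I, C and G restart at zero at each observation point):

```python
import numpy as np
from scipy.integrate import solve_ivp

def lv_lna_rhs(t, s, th1, th2, th3):
    (y1, y2, V11, V12, V22,
     I1, I2, C11, C12, C21, C22,
     G11, G12, G22) = s
    dy1 = th1 * y1 - th2 * y1 * y2                                          # Eq. (56)
    dy2 = th2 * y1 * y2 - th3 * y2                                          # Eq. (57)
    dV11 = 2 * V11 * (th1 - th2 * y2) - 2 * V12 * th2 * y1 \
           + th2 * y1 * y2 + th1 * y1                                       # Eq. (59)
    dV12 = V12 * (th2 * y1 - th3 + th1 - th2 * y2) \
           + V11 * th2 * y2 - th2 * y1 * V22 - th2 * y1 * y2                # Eq. (60)
    dV22 = 2 * V22 * (th2 * y1 - th3) + 2 * V12 * th2 * y2 \
           + th2 * y1 * y2 + th3 * y2                                       # Eq. (61)
    dI1, dI2 = y1, y2                                                       # Eq. (62)
    dC11 = (th1 - th2 * y2) * C11 - th2 * y1 * C12 + V11                    # Eq. (64)
    dC12 = th2 * y2 * C11 + (th2 * y1 - th3) * C12 + V12                    # Eq. (65)
    dC21 = (th1 - th2 * y2) * C21 - th2 * y1 * C22 + V12                    # Eq. (66)
    dC22 = th2 * y2 * C21 + (th2 * y1 - th3) * C22 + V22                    # Eq. (67)
    dG11, dG12, dG22 = 2 * C11, C12 + C21, 2 * C22                          # Eq. (68)
    return [dy1, dy2, dV11, dV12, dV22, dI1, dI2,
            dC11, dC12, dC21, dC22, dG11, dG12, dG22]

# One 2-min prediction step from a known initial state (10 prey, 100 predators):
s0 = [10.0, 100.0] + [0.0] * 12
sol = solve_ivp(lv_lna_rhs, (0.0, 2.0), s0, args=(0.5, 0.0025, 0.3), rtol=1e-8)
print(sol.y[:, -1])   # I2 and G22 give the predictive mean/variance of the aggregated predator data
```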

A.9 Adaptive MCMC

According to this specific adaptive MH scheme (Sherlock et al. 2010), the proposed state \theta^* is sampled from a mixture of Gaussians:

\theta^* \sim \begin{cases} N(\theta_t, \Sigma_0) & \text{w.p. } \delta \\ N(\theta_t, \lambda \Sigma_t) & \text{w.p. } 1-\delta \end{cases} \qquad (69)

\Sigma_t corresponds to the sample covariance of the chain up to iteration t and is estimated once enough samples have been accepted. The parameter \delta \in (0,1) is defined by the user; we have chosen a value of 0.05. The scaling factor \lambda can either be fixed (Roberts and Rosenthal 2009) or be tuned (Sherlock et al. 2010). This algorithm targets an acceptance rate of 0.3.
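
A compact sketch of this proposal follows (our own code; the warm-up rule of switching to the adaptive component only after 100 stored iterations, and the default scaling λ = 2.38²/d with d = 3, are our choices rather than the paper's):

```python
import numpy as np

def adaptive_proposal(theta_t, chain, Sigma0, lam=2.38**2 / 3, delta=0.05,
                      rng=np.random.default_rng(3)):
    """Draw a proposal from the mixture in Eq. (69)."""
    if len(chain) > 100 and rng.uniform() > delta:     # adaptive component, w.p. 1 - delta
        Sigma = lam * np.cov(np.asarray(chain).T)      # scaled sample covariance of the chain
    else:                                              # fixed component, w.p. delta (or warm-up)
        Sigma = Sigma0
    return rng.multivariate_normal(theta_t, Sigma)
```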

A.10 Nelder–Mead results for the LV model

See Table 6.

Table 6.

Nelder–Mead results for θ1,θ2,θ3. The median values across 100 datasets are shown in the third and fourth column for KF1 and KF2, respectively

θ Ground truth KF1 Median[LQ,UQ] KF2 Median[LQ,UQ]
θ1 0.5 0.48160 [0.47770,0.48651] 0.49746 [0.49278,0.50122]
θ2 0.0025 0.00227 [0.00222,0.00232] 0.00248[0.00244,0.00254]
θ3 0.3 0.24773 [0.23927,0.25797] 0.30047[0.29320,0.31061]

Lower and upper quartiles are shown in brackets

A.11 LNA for the translation inhibition model

The following model is being assumed where R and P stand for the (numbers of) gene mRNA and protein, respectively:

R \xrightarrow{\;c_P\;} R + P \qquad \text{(translation)} \qquad (70a)
P \xrightarrow{\;d_P P/\Omega\;} \emptyset \qquad \text{(protein degradation)} \qquad (70b)

The above equations result in the following stoichiometry matrix:

S = \begin{pmatrix} 1 & -1 \end{pmatrix}, \qquad (71)

and the transition rates are:

h(x, t) = \begin{pmatrix} c_P \\ d_P p \end{pmatrix}, \qquad (72)

The required matrices are calculated below:

F = \begin{pmatrix} 0 & d_P \end{pmatrix}, \qquad (73)
S F^T = A = -d_P, \qquad (74)
S\,\mathrm{diag}(h(y_t, \theta))\,S^T = E E^T = c_P + d_P p. \qquad (75)

The deterministic part is now given by:

\frac{dp}{dt} = c_P - d_P p \qquad (76)

The stochastic part is given by the (restarting) LNA, where we have dropped the dependence of M_t, V_t on time:

dM_p = -d_P M_p \, dt + \sqrt{c_P + d_P p}\; dW_t \qquad (77)

resulting in the following ODE for the stochastic variance:

\frac{dV_p}{dt} = -2 d_P V_p + c_P + d_P p \qquad (78)

For the integrated process, we get the following according to Eqs. (10), (13) and (14). First, the deterministic part is given by:

\frac{dI_p}{dt} = p, \qquad (79)

The stochastic part is given by:

\frac{dQ_p}{dt} = M_p, \qquad (80)

resulting in the following ODEs for its integrated variance and covariance with the unintegrated process:

\frac{d\,\mathrm{Cov}(Q_p M_p^T)}{dt} = -d_P\, \mathrm{Cov}(Q_p M_p^T) + V_p \qquad (81)
\frac{d\,\mathrm{Var}(Q_p)}{dt} = 2\, \mathrm{Cov}(Q_p M_p^T) \qquad (82)
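
These five ODEs can again be bundled into one right-hand side for a numerical solver; a minimal sketch (our own code; writing the 30-min window as 0.5 assumes the rates are per hour) is:

```python
from scipy.integrate import solve_ivp

def translation_lna_rhs(t, s, cP, dP):
    p, Vp, Ip, Cp, Gp = s
    dp_dt = cP - dP * p                  # Eq. (76), macroscopic protein level
    dVp_dt = -2 * dP * Vp + cP + dP * p  # Eq. (78), LNA variance
    dIp_dt = p                           # Eq. (79), integrated deterministic part
    dCp_dt = -dP * Cp + Vp               # Eq. (81), Cov(Q_p, M_p)
    dGp_dt = 2 * Cp                      # Eq. (82), Var(Q_p)
    return [dp_dt, dVp_dt, dIp_dt, dCp_dt, dGp_dt]

# propagate the moments over one 30-minute aggregation window
sol = solve_ivp(translation_lna_rhs, (0.0, 0.5), [400.0, 0.0, 0.0, 0.0, 0.0],
                args=(200.0, 0.97), rtol=1e-8)
print(sol.y[:, -1])
```
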
Fig. 5.

Simulated trajectories from an OU process with α=4 and σ=2, along with the corresponding aggregated process with an integration period of 2 min. For the aggregated process, we assumed observations every 2 min, which are indicated by red crosses. a OU process. b Aggregated process

A.12 OU and aggregated OU process

See Fig. 5.

A.13 OU traceplots

Figures 6 and 7 show samples of the OU parameters during the MCMC runs.

A.14 Filtering results for the OU process using aggregated data

See Fig. 8.

Fig. 8.

Filtering results from KF1 and KF2 for an OU process with α=4.0 and σ=2.0 (blue trace) using aggregated data over an integration period of Δ=1.0. Black lines correspond to the posterior mean estimate and green lines to 1 s.d.. For inference, we used the estimated parameters from A.7. a KF1 (Δ=1.0). b KF2 (Δ=1.0). (Color figure online)

A.15 MCMC traces from the LV experiment

See Fig. 9.

Fig. 9.

MCMC traceplots for the LV experiment using an adaptive MCMC algorithm. a MCMC traceplots for θ1. b MCMC traceplots for θ2. c MCMC traceplots for θ3

A.16 Filtering results for the predator population

See Fig. 10.

Fig. 10.

Filtering results for the predator population. Red dots correspond to aggregated observations, and the black line represents the actual process. Purple lines represent the mean estimate and green lines 1 s.d. a Filtering results for the predator population using KF1. b Filtering results for the predator population using KF2

A.17 MCMC traces for the translation inhibition example with synthetic data

See Fig. 11.

A.18 MCMC traces for the translation inhibition example with single cell data

See Figs. 12 and 13

References

  1. Arnold L. Stochastic Differential Equations: Theory and Applications. Hoboken: Wiley; 1974.
  2. Bishop C. Pattern Recognition and Machine Learning (Information Science and Statistics). New York: Springer; 2007.
  3. Boys R, Wilkinson D, Kirkwood T. Bayesian inference for a discretely observed stochastic kinetic model. Stat. Comput. 2008;18(2):125–135. doi: 10.1007/s11222-007-9043-x
  4. Elf J, Ehrenberg M. Fast evaluation of fluctuations in biochemical networks with the linear noise approximation. Genome Res. 2003;13(11):2475–2484. doi: 10.1101/gr.1196503
  5. Fearnhead P, Giagos V, Sherlock C. Inference for reaction networks using the linear noise approximation. Biometrics. 2014;70(2):457–466. doi: 10.1111/biom.12152
  6. Finkenstädt B, Woodcock DJ, Komorowski M, Harper CV, Davis JRE, White MRH, Rand DA. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: an application to single cell data. Ann. Appl. Stat. 2013;7(4):1960–1982. doi: 10.1214/13-AOAS669
  7. Gardiner C. Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences (Springer Series in Synergetics). 3rd edn. Berlin: Springer; 2004.
  8. Giagos V. Inference for auto-regulatory genetic networks using diffusion process approximations. Ph.D. Thesis, Lancaster University; 2010.
  9. Gillespie DT. Markov Processes: An Introduction for Physical Scientists. Boston: Academic Press; 1992.
  10. Gillespie DT. A rigorous derivation of the chemical master equation. Phys. A Stat. Mech. Appl. 1992;188:404–425. doi: 10.1016/0378-4371(92)90283-V
  11. Gillespie DT. Exact numerical simulation of the Ornstein–Uhlenbeck process and its integral. Phys. Rev. E. 1996;54:2084–2091. doi: 10.1103/PhysRevE.54.2084
  12. Golightly A, Wilkinson DJ. Bayesian inference for stochastic kinetic models using a diffusion approximation. Biometrics. 2005;61(3):781–788. doi: 10.1111/j.1541-0420.2005.00345.x
  13. Harper CV, Finkenstädt B, Woodcock DJ, Friedrichsen S, Semprini S, Ashall L, Spiller DG, Mullins JJ, Rand DA, Davis JRE, White MRH. Dynamic analysis of stochastic transcription cycles. PLoS Biol. 2011;9(4):e1000607. doi: 10.1371/journal.pbio.1000607
  14. Hull J. Options, Futures and Other Derivatives. Upper Saddle River: Pearson/Prentice Hall; 2009.
  15. Jazwinski AH. Stochastic Processes and Filtering Theory. Cambridge: Academic Press; 1970.
  16. Kalman RE. A new approach to linear filtering and prediction problems. J. Basic Eng. 1960;82(1):35–45. doi: 10.1115/1.3662552
  17. Komorowski M, Finkenstädt B, Harper CV, Rand DA. Bayesian inference of biochemical kinetic parameters using the linear noise approximation. BMC Bioinform. 2009;10(1):1–10. doi: 10.1186/1471-2105-10-343
  18. Mbalawata IS, Särkkä S, Haario H. Parameter estimation in stochastic differential equations with Markov chain Monte Carlo and non-linear Kalman filtering. Comput. Stat. 2013;28(3):1195–1223. doi: 10.1007/s00180-012-0352-y
  19. Roberts GO, Rosenthal JS. Examples of adaptive MCMC. J. Comput. Graph. Stat. 2009;18(2):349–367. doi: 10.1198/jcgs.2009.06134
  20. Särkkä S. Recursive Bayesian inference on stochastic differential equations. Ph.D. Thesis, Helsinki University of Technology; 2006.
  21. Sherlock C, Fearnhead P, Roberts GO. The random walk Metropolis: linking theory and practice through a case study. Stat. Sci. 2010;25(2):172–190. doi: 10.1214/10-STS327
  22. Spiller DG, Wood CD, Rand DA, White MRH. Measurement of single-cell dynamics. Nature. 2010;465(7299):736–745. doi: 10.1038/nature09232
  23. van Kampen N. Stochastic Processes in Physics and Chemistry. 3rd edn. Amsterdam: Elsevier; 2007.
  24. Wilkinson DJ. Stochastic Modelling for Systems Biology. 2nd edn. Chapman & Hall/CRC Mathematical and Computational Biology. Taylor & Francis; 2011.
