Infectious Disease Modelling. 2020 Jan 10;5:221–232. doi: 10.1016/j.idm.2019.12.007

Bayesian inference for dynamical systems

Weston C Roda 1
PMCID: PMC6994543  PMID: 32021948

Abstract

Bayesian inference is a common method for parameter estimation in dynamical systems. Despite its prevalent use, a formalized and detailed methodology is still needed. This paper presents a comprehensive methodology for dynamical system parameter estimation using Bayesian inference, covering the use of different distributions, Markov Chain Monte Carlo (MCMC) sampling, credible intervals for parameters, and prediction intervals for solutions. A logistic growth example illustrates the methodology.

Keywords: Bayesian, Inference, Model fitting, Data, Dynamical system, Mathematical model

1. Introduction

A common method for performing parameter estimation for dynamical systems is to use Bayesian inference (Ghasemi et al., 2011; Higham & Husmeier, 2013; Ma & Berndsen, 2014; Periwal et al., 2008; Vanlier, Tiemann, Hilbers, & van Riel, 2012). Despite the popularity of Bayesian inference for parameter estimation in dynamical systems, and the availability of useful computational manuals, a formalized and comprehensive methodology is still needed.

The methods described in this paper assume that the behaviors of the dynamical system of interest have been mathematically analyzed and that the solutions of the dynamical system are well-behaved. Additionally, if a numerical scheme is used to solve the dynamical system, it is assumed that the scheme is stable. The methodology is presented from a mathematical biology perspective and focuses on systems of ordinary differential equations (ODEs); however, the Bayesian inference methodology presented can be applied to other areas of applied mathematics and to other systems of differential equations, such as partial differential equations (PDEs). This paper provides a formalized methodology for dynamical system parameter estimation using Bayesian inference, covering the use of different distributions, Markov Chain Monte Carlo (MCMC) sampling, credible intervals for parameters, and prediction intervals for solutions. The methodology is illustrated with a logistic growth example.

2. Dynamical system

Assume that the dynamical system of interest can be described by the following autonomous ODE system (1), written as a vector differential equation:

x' = f(x),   (1)

where x = (x_1, …, x_k) and f = (f_1(x), …, f_k(x)), with the vector of initial conditions x_0 = (x_{10}, …, x_{k0}).

It is assumed that the unique solution vector, x(t), of system (1) exists and can be obtained either explicitly or using numerical approximation. If a numerical approximation method is used, it is assumed that the numerical approximation scheme is stable.

All the parameters in system (1) will be denoted by the vector β. If the initial conditions x_{10}, …, x_{k0} are also to be estimated, then let the initial conditions x_{10}, …, x_{k0} be contained in the vector β as well.

The dependence of the unique solution vector x on both time, t, and the vector of parameters, β, will be emphasized and the unique solution vector will be denoted as x(β,t).

3. Data

Suppose there are m time series data sets. It is important to ensure that the correct ODE model solution or combination of ODE model solutions is fit to the jth time series data set (j = 1, …, m).

Sometimes a data set is scaled differently than the model solutions, or the data set can be described by a summation of the ODE model solutions. In order to include these situations, we can use a linear combination of the ODE model solutions, a_{1j} x_1(β, t) + ⋯ + a_{kj} x_k(β, t), to fit to the jth time series data set. (The simpler case, where only the ith specific ODE model solution x_i(β, t) is to be fit to the jth time series data set, is included in the linear combination where a_{ij} = 1 and the other constants are zero.) If the nonzero vector of constants, a_j, will be estimated, then let the nonzero vector of constants, a_j, for j = 1, …, m, be contained in the vector

\nu = [\beta, a_1, \ldots, a_m]^T.

Also, if the jth data set can be described by a nonlinear combination of the ODE model solutions, then, similarly, let any estimated nonzero vector of constants, aj, be contained in vector

\nu = [\beta, a_1, \ldots, a_m]^T.

So, in general, we fit the function F(x_1(β, t_{ij}), …, x_k(β, t_{ij}), a_1, …, a_m) to the jth data set.

4. Distribution of data over time

The distribution of the observations over time for each jth data set must be chosen before fitting system (1) to the data. The following sections will describe the Gaussian, Poisson, Negative Binomial, and other distribution options.

4.1. Gaussian distribution

Let Y be a random variable from the Gaussian distribution with parameters μ and σ² = 1/τ > 0, Y ~ N(μ, σ²). The formulation of the Gaussian distribution is given by the following continuous probability density function (pdf), f(y) (Bain & Engelhardt, 1987):

f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(y-\mu)^2\right) = \sqrt{\frac{\tau}{2\pi}} \exp\left(-\frac{\tau}{2}(y-\mu)^2\right).   (2)

The mean, E[Y], of the Gaussian distribution is given by μ and the variance, Var[Y], of this distribution is given by σ² = 1/τ.

Assume that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}}, and that the probability of observing d_{ij} is given by the Gaussian distribution:

f(d_{ij}) = \sqrt{\frac{\tau_j}{2\pi}} \exp\left(-\frac{\tau_j}{2}(d_{ij}-\mu_{ij})^2\right),   (3)

where the mean μ_{ij} changes depending on the time t_{ij}, and the variance 1/τ_j is specific to the jth data set.

Given our assumption of fitting the function of the ODE model solutions and any necessary constants, F(x_1(β, t_{ij}), …, x_k(β, t_{ij}), a_1, …, a_m), to the jth time series data set, we set

E[D_{ij}] = \mu_{ij} = F(x_1(\beta, t_{ij}), \ldots, x_k(\beta, t_{ij}), a_1, \ldots, a_m).   (4)

Equation (4) can be thought of as a type of link function. In statistics, for generalized linear models (GLMs), a link function is defined as the function that transforms the mean of a distribution to a linear regression model (Montgomery, Peck, & Vining, 2006). Equation (4) equates the mean of the Gaussian distribution to the ODE model solutions.
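To make equation (4) concrete, the following sketch (not from the paper) computes the means μ_{ij} by solving an ODE system numerically at the observation times and forming a linear combination of the solution components; the right-hand side f, the weights a_j, and the helper names are illustrative assumptions.

```python
# A minimal sketch of equation (4): mu_ij from a numerical ODE solution.
# The model f(), the weights a_j, and all names below are illustrative.
import numpy as np
from scipy.integrate import solve_ivp

def f(t, x, beta):
    """Right-hand side f(x) of system (1); here a placeholder logistic model."""
    r, N = beta
    return [x[0] * (r - (r / N) * x[0])]

def model_mean(beta, a_j, x0, t_obs):
    """mu_ij = F(x_1(beta, t_ij), ..., x_k(beta, t_ij), a_1, ..., a_m),
    taken here to be the linear combination a_j . x(beta, t_ij)."""
    sol = solve_ivp(f, (0.0, t_obs[-1]), x0, args=(beta,), t_eval=t_obs)
    return np.asarray(a_j) @ sol.y          # means at the observation times

t_obs = np.linspace(0.0, 15.0, 16)
mu = model_mean(beta=(0.8, 3000.0), a_j=[1.0], x0=[3.0], t_obs=t_obs)
```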

4.2. Poisson distribution

Let Y be a random variable from the Poisson distribution with parameter μ>0, Y~POI(μ). The formulation of the Poisson distribution is given by the following discrete pdf, f(y) (Bain & Engelhardt, 1987):

f(y) = \frac{e^{-\mu}\mu^{y}}{y!},   (5)

where y = 0, 1, ….

The mean, E[Y], of the Poisson distribution is given by μ. For the Poisson distribution, the variance is equal to the mean, Var[Y]=E[Y]=μ.

Assume that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}} and that the probability of observing d_{ij} is given by the Poisson distribution:

f(d_{ij}) = \frac{e^{-\mu_{ij}}\mu_{ij}^{d_{ij}}}{d_{ij}!},   (6)

where the mean E[D_{ij}] = μ_{ij} changes depending on the time t_{ij}. Hence, the variance, Var[D_{ij}] = E[D_{ij}] = μ_{ij}, also changes over time.

Again, we will use equation (4) to equate the mean, E[Dij]=μij, to the ODE model solutions.

The Poisson distribution is used for count data of rare events. The fact that the variance depends on the mean is particularly useful since, in practice, count data observed over time generally express more variability at higher values than at lower values (Bolker, 2007). The restriction that the variance is strictly equal to the mean is commonly violated for many types of count data. Count data whose variance is larger than the mean are called overdispersed. The negative binomial distribution can be used for count data with overdispersion.

4.3. Negative binomial distribution

Let Y be a random variable from the negative binomial distribution with parameters 0 < p < 1 and r > 0, Y ~ NB(r, p). The formulation of the negative binomial distribution is given by the following discrete pdf, f(y) (Linden & Mantyniemi, 2011):

f(y) = \frac{\Gamma(y+r)}{y!\,\Gamma(r)} p^{r}(1-p)^{y},   (7)

where y = 0, 1, 2, ….

The interpretation of this formulation of the negative binomial distribution is that y is the number of failures before the rth success and p is the probability of success per trial (Linden & Mantyniemi, 2011).

The mean, E[Y], of the negative binomial distribution is given by μ = r(1 − p)/p and the variance, Var[Y], of this distribution is given by

\sigma^2 = \frac{r(1-p)}{p^2} = \frac{\mu}{p}.

For count data, the negative binomial distribution can be interpreted as having mean count E[Y] = μ with overdispersed variance Var[Y] = μ/p; since 0 < p < 1, Var[Y] > E[Y] (Bolker, 2007).

Assume that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}} and that the probability of observing d_{ij} is given by the negative binomial distribution:

f(d_{ij}) = \frac{\Gamma(d_{ij}+r_{ij})}{d_{ij}!\,\Gamma(r_{ij})} p_j^{r_{ij}}(1-p_j)^{d_{ij}},   (8)

where r_{ij} = p_j μ_{ij}/(1 − p_j), so that μ_{ij} = r_{ij}(1 − p_j)/p_j changes depending on the time t_{ij}, and p_j is specific to the jth data set. Hence, the variance, Var[D_{ij}] = μ_{ij}/p_j, also changes over time.

As before, we will use equation (4) to equate the mean, E[Dij]=μij, to the ODE model solutions.

4.4. Other distributions

It is seen from sections 4.1, 4.2, 4.3 that, in general, if the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}}, and the probability of observing d_{ij} is given by a distribution with pdf f(d_{ij}) and mean E[D_{ij}] = μ_{ij}, then equation (4) is used to equate the mean, E[D_{ij}] = μ_{ij}, to the ODE model solutions.

5. Likelihood function

In a dynamical system, the dependency of the solutions x_1, …, x_k on each other is built into the mathematical model itself. Assuming that the mathematical model correctly describes the data sets of interest, the data sets can be considered independent from each other. With m independent time series data sets, there will be m likelihood functions associated with each of the independent data sets and the combined likelihood function is given by

L(\theta) = C\,L_1(\theta)\cdots L_m(\theta),   (9)

where θ is the vector of parameters to estimate, and C is any positive constant not depending on θ used to simplify the likelihood function (Kalbfleisch, 1979).

5.1. Gaussian probability model for m data sets and combined likelihood function

Assume, for j = 1, …, m, that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}} and that the probability of observing d_{ij} is given by the Gaussian distribution in equation (3), where the mean μ_{ij} changes depending on the time t_{ij} and the variance 1/τ_j > 0 is specific to the jth data set. Then the probability of the observed counts D = {D_1, …, D_m} is given by

P(D \mid \theta) = \prod_{j=1}^{m}\prod_{i=1}^{n_j}\sqrt{\frac{\tau_j}{2\pi}}\exp\left(-\frac{\tau_j}{2}(d_{ij}-\mu_{ij})^2\right) = \left(\frac{1}{2\pi}\right)^{\sum_{j=1}^{m} n_j/2}\tau_1^{n_1/2}\cdots\tau_m^{n_m/2}\exp\left(-\frac{1}{2}\sum_{j=1}^{m}\tau_j\sum_{i=1}^{n_j}(d_{ij}-\mu_{ij})^2\right),   (10)

where equation (4) is used to equate the mean, μ_{ij}, to the ODE model solutions and

\theta = [\nu, \tau_1, \ldots, \tau_m]^T.

The Gaussian probability model is very beneficial for fitting since even poor initial guesses of the vector of parameters, θ, will still produce a nonzero probability.

The combined likelihood function is given by

L(\theta) = C\left(\frac{1}{2\pi}\right)^{\sum_{j=1}^{m} n_j/2}\tau_1^{n_1/2}\cdots\tau_m^{n_m/2}\exp\left(-\frac{1}{2}\sum_{j=1}^{m}\tau_j\sum_{i=1}^{n_j}(d_{ij}-\mu_{ij})^2\right) = \tau_1^{n_1/2}\cdots\tau_m^{n_m/2}\exp\left(-\frac{1}{2}\sum_{j=1}^{m}\tau_j\sum_{i=1}^{n_j}(d_{ij}-\mu_{ij})^2\right),   (11)

where C = (2\pi)^{\sum_{j=1}^{m} n_j/2} simplifies the likelihood function. The value of θ that maximizes P(D|θ) will also maximize L(θ) (Kalbfleisch, 1979).
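A minimal sketch (my own, not from the paper) of how the Gaussian log-likelihood corresponding to equation (11) might be coded; working on the log scale avoids numerical underflow. The helper mean_fn, which returns the means μ_{ij} from the ODE solution via equation (4), is an assumption.

```python
# Log of equation (11): sum_j [ (n_j/2) log tau_j - (tau_j/2) sum_i (d_ij - mu_ij)^2 ].
# `mean_fn(nu, j, t)` is an assumed helper implementing equation (4).
import numpy as np

def gaussian_log_likelihood(nu, taus, data, times, mean_fn):
    """data[j], times[j] hold the j-th series; taus[j] is the precision tau_j."""
    loglik = 0.0
    for j, (d_j, t_j, tau_j) in enumerate(zip(data, times, taus)):
        mu_j = mean_fn(nu, j, np.asarray(t_j))
        resid = np.asarray(d_j) - mu_j
        loglik += 0.5 * len(d_j) * np.log(tau_j) - 0.5 * tau_j * np.sum(resid**2)
    return loglik
```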

5.2. Poisson probability model for m data sets and combined likelihood function

Assume, for j = 1, …, m, that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}} and that the probability of observing d_{ij} is given by the Poisson distribution in equation (6), where the mean μ_{ij} (and hence the variance, μ_{ij}) changes depending on the time t_{ij}. Then the probability of the observed counts D = {D_1, …, D_m} is given by

P(D \mid \theta) = \prod_{j=1}^{m}\prod_{i=1}^{n_j}\frac{e^{-\mu_{ij}}\mu_{ij}^{d_{ij}}}{d_{ij}!} = \left(\frac{1}{d_{11}!\cdots d_{n_1 1}!}\right)\cdots\left(\frac{1}{d_{1m}!\cdots d_{n_m m}!}\right)\exp\left(-\sum_{j=1}^{m}\sum_{i=1}^{n_j}\mu_{ij}\right)\prod_{j=1}^{m}\mu_{1j}^{d_{1j}}\cdots\mu_{n_j j}^{d_{n_j j}},   (12)

where equation (4) is used to equate the mean, μij, to the ODE model solutions and θ=ν.

The combined likelihood function is given by

L(\theta) = C\left(\frac{1}{d_{11}!\cdots d_{n_1 1}!}\right)\cdots\left(\frac{1}{d_{1m}!\cdots d_{n_m m}!}\right)\exp\left(-\sum_{j=1}^{m}\sum_{i=1}^{n_j}\mu_{ij}\right)\prod_{j=1}^{m}\mu_{1j}^{d_{1j}}\cdots\mu_{n_j j}^{d_{n_j j}} = \exp\left(-\sum_{j=1}^{m}\sum_{i=1}^{n_j}\mu_{ij}\right)\prod_{j=1}^{m}\mu_{1j}^{d_{1j}}\cdots\mu_{n_j j}^{d_{n_j j}},   (13)

where C = (d_{11}!\cdots d_{n_1 1}!)\cdots(d_{1m}!\cdots d_{n_m m}!) simplifies the likelihood function.
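A corresponding sketch of the Poisson log-likelihood from equation (13), again dropping the constant factorial terms and assuming the same hypothetical mean_fn helper for equation (4).

```python
# Log of equation (13): sum_j sum_i [ d_ij * log(mu_ij) - mu_ij ].
import numpy as np

def poisson_log_likelihood(nu, data, times, mean_fn):
    loglik = 0.0
    for j, (d_j, t_j) in enumerate(zip(data, times)):
        mu_j = mean_fn(nu, j, np.asarray(t_j))      # means from the ODE solution
        loglik += np.sum(np.asarray(d_j) * np.log(mu_j) - mu_j)
    return loglik
```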

5.3. Negative binomial probability model for m data sets and combined likelihood function

Assume, for j = 1, …, m, that the jth time series data set is given by observations D_j = {d_{1j}, …, d_{n_j j}} with corresponding times T_j = {t_{1j}, …, t_{n_j j}} and that the probability of observing d_{ij} is given by the negative binomial distribution in equation (8), where the mean μ_{ij} (and hence the variance Var[D_{ij}] = μ_{ij}/p_j) changes depending on the time t_{ij}. Then the probability of the observed counts D = {D_1, …, D_m} is given by

P(D \mid \theta) = \prod_{j=1}^{m}\prod_{i=1}^{n_j}\frac{\Gamma(d_{ij}+r_{ij})}{d_{ij}!\,\Gamma(r_{ij})}p_j^{r_{ij}}(1-p_j)^{d_{ij}} = \left(\frac{1}{d_{11}!\cdots d_{n_1 1}!}\right)\cdots\left(\frac{1}{d_{1m}!\cdots d_{n_m m}!}\right)\frac{\Gamma(d_{11}+r_{11})\cdots\Gamma(d_{n_1 1}+r_{n_1 1})}{\Gamma(r_{11})\cdots\Gamma(r_{n_1 1})}\cdots\frac{\Gamma(d_{1m}+r_{1m})\cdots\Gamma(d_{n_m m}+r_{n_m m})}{\Gamma(r_{1m})\cdots\Gamma(r_{n_m m})}\,p_1^{\sum_{i=1}^{n_1} r_{i1}}\cdots p_m^{\sum_{i=1}^{n_m} r_{im}}(1-p_1)^{\sum_{i=1}^{n_1} d_{i1}}\cdots(1-p_m)^{\sum_{i=1}^{n_m} d_{im}},   (14)

where r_{ij} = p_j μ_{ij}/(1 − p_j) (equivalently, μ_{ij} = r_{ij}(1 − p_j)/p_j), equation (4) is used to equate the mean, μ_{ij}, to the ODE model solutions, and

\theta = [\nu, p_1, \ldots, p_m]^T.

The combined likelihood function is given by

L(\theta) = C\left(\frac{1}{d_{11}!\cdots d_{n_1 1}!}\right)\cdots\left(\frac{1}{d_{1m}!\cdots d_{n_m m}!}\right)\frac{\Gamma(d_{11}+r_{11})\cdots\Gamma(d_{n_1 1}+r_{n_1 1})}{\Gamma(r_{11})\cdots\Gamma(r_{n_1 1})}\cdots\frac{\Gamma(d_{1m}+r_{1m})\cdots\Gamma(d_{n_m m}+r_{n_m m})}{\Gamma(r_{1m})\cdots\Gamma(r_{n_m m})}\,p_1^{\sum_{i=1}^{n_1} r_{i1}}\cdots p_m^{\sum_{i=1}^{n_m} r_{im}}(1-p_1)^{\sum_{i=1}^{n_1} d_{i1}}\cdots(1-p_m)^{\sum_{i=1}^{n_m} d_{im}} = \frac{\Gamma(d_{11}+r_{11})\cdots\Gamma(d_{n_1 1}+r_{n_1 1})}{\Gamma(r_{11})\cdots\Gamma(r_{n_1 1})}\cdots\frac{\Gamma(d_{1m}+r_{1m})\cdots\Gamma(d_{n_m m}+r_{n_m m})}{\Gamma(r_{1m})\cdots\Gamma(r_{n_m m})}\,p_1^{\sum_{i=1}^{n_1} r_{i1}}\cdots p_m^{\sum_{i=1}^{n_m} r_{im}}(1-p_1)^{\sum_{i=1}^{n_1} d_{i1}}\cdots(1-p_m)^{\sum_{i=1}^{n_m} d_{im}},   (15)

where C = (d_{11}!\cdots d_{n_1 1}!)\cdots(d_{1m}!\cdots d_{n_m m}!) simplifies the likelihood function.
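A sketch of the negative binomial log-likelihood in equation (15), with r_{ij} computed from μ_{ij} and p_j; the gammaln function evaluates log Γ(·) stably, and mean_fn is again an assumed helper for equation (4).

```python
# Log of equation (15): sum_j sum_i [ logGamma(d_ij + r_ij) - logGamma(r_ij)
#                                     + r_ij*log(p_j) + d_ij*log(1 - p_j) ],
# with r_ij = p_j * mu_ij / (1 - p_j); constant d_ij! terms are dropped.
import numpy as np
from scipy.special import gammaln

def negbin_log_likelihood(nu, p, data, times, mean_fn):
    loglik = 0.0
    for j, (d_j, t_j, p_j) in enumerate(zip(data, times, p)):
        d_j = np.asarray(d_j, dtype=float)
        mu_j = mean_fn(nu, j, np.asarray(t_j))
        r_j = p_j * mu_j / (1.0 - p_j)
        loglik += np.sum(gammaln(d_j + r_j) - gammaln(r_j)
                         + r_j * np.log(p_j) + d_j * np.log(1.0 - p_j))
    return loglik
```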

6. Bayesian framework

The Bayesian framework is set up by first assuming a probability model for the observed data D given a p×1 vector of unknown parameters θ, which is P(D|θ). Then it is assumed that θ is randomly distributed from the prior distribution P(θ). Statistical inference for θ is based on the posterior distribution, P(θ|D). Using Bayes’ theorem we have

P(\theta \mid D) = \frac{P(D \mid \theta)P(\theta)}{P(D)} = \frac{P(D \mid \theta)P(\theta)}{\int_{\Omega}P(D \mid \theta)P(\theta)\,d\theta} \propto L(\theta)P(\theta) = \pi(\theta \mid D),   (16)

where Ω is the parameter space of θ and L(θ) is the likelihood function. P(D) = \int_{\Omega}P(D \mid \theta)P(\theta)\,d\theta is called the prior predictive distribution and it is the normalizing constant of the posterior distribution P(θ|D) (Chen, Shao, & Ibrahim, 2000). The unnormalized posterior distribution is given by π(θ|D) = L(θ)P(θ).

The Bayesian framework is particularly useful for the statistical inference that occurs in mathematical biology, since there is generally prior information about the unknown parameters in the literature.

6.1. Prior distribution

In biological applications there may exist literature regarding an appropriate prior distribution for a parameter of interest. In many cases, however, only a general range for the parameter is known from the literature, and the uniform distribution is chosen as its prior distribution.
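As an illustration (not from the paper), a uniform box prior and the resulting unnormalized log-posterior log π(θ|D) = log L(θ) + log P(θ) from equation (16) could be coded as in the sketch below; the bounds and function names are assumptions.

```python
# A minimal sketch of a uniform (box) log-prior and the unnormalized
# log-posterior of equation (16).  `log_likelihood` is any of the
# log-likelihood sketches above.
import numpy as np

def log_prior(theta, lower, upper):
    theta, lower, upper = map(np.asarray, (theta, lower, upper))
    if np.all((lower < theta) & (theta < upper)):
        return 0.0          # constant inside the box; normalizer cancels in MCMC
    return -np.inf          # zero prior probability outside the ranges

def log_posterior(theta, lower, upper, log_likelihood):
    lp = log_prior(theta, lower, upper)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(theta)
```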

7. Markov Chain Monte Carlo algorithms

Markov Chain Monte Carlo (MCMC) algorithms are designed to sample from, and to fully explore, the parameter space where the unnormalized posterior distribution is positive (Lynch, 2007). MCMC algorithms involve a process where a new vector of parameter values, θ^(t), is sampled from the posterior distribution based on the previous vector of parameter values, θ^(t−1). A successful MCMC algorithm results in a sample path (also called a chain or walker) that has arrived at a stationary process and covers the domain of the target unnormalized posterior distribution.

7.1. Metropolis-Hastings algorithm

The Metropolis-Hastings algorithm is one of the classic MCMC algorithms (Chen et al., 2000):

A starting point θ(0) is selected.

For every iteration t = 1, 2, …, T:

  • randomly select a proposal for θ^(t), γ, from the proposal distribution f(θ^(t) | θ^(t−1))

  • the proposal for θ^(t) is accepted with probability α = min{1, [π(γ|D) f(θ^(t−1)|γ)] / [π(θ^(t−1)|D) f(γ|θ^(t−1))]}

  • random sample u from U(0,1)

  • if u < α, the proposal is accepted and θ^(t) = γ.

  • If not, θ^(t) = θ^(t−1),

where π(θ|D) is the unnormalized posterior distribution.

7.1.1. Random-walk Metropolis-Hastings algorithm

If a symmetric proposal distribution is chosen in the Metropolis-Hastings Algorithm, then the proposal distribution randomly perturbs the current position of the vector of unknown parameters, θ(t1), and these algorithms are called Random-Walk Metropolis-Hastings algorithms (Lynch, 2007).

A symmetric proposal distribution has the property that f(γ|θ^(t−1)) = f(θ^(t−1)|γ), and this simplifies the acceptance probability to α = min{1, π(γ|D)/π(θ^(t−1)|D)}.
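A minimal sketch of the random-walk Metropolis-Hastings algorithm with a symmetric Gaussian proposal is given below; log_post is assumed to return the unnormalized log-posterior log π(θ|D), and the step sizes would need tuning in practice.

```python
# Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.
import numpy as np

def random_walk_mh(log_post, theta0, step_sizes, n_iter, seed=None):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    chain = np.empty((n_iter, theta.size))
    current_lp = log_post(theta)
    for t in range(n_iter):
        proposal = theta + step_sizes * rng.standard_normal(theta.size)
        proposal_lp = log_post(proposal)
        # accept with probability min(1, pi(gamma|D) / pi(theta^(t-1)|D))
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            theta, current_lp = proposal, proposal_lp
        chain[t] = theta
    return chain
```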

7.2. Affine invariant ensemble Markov Chain Monte Carlo algorithm

The affine invariant ensemble MCMC algorithm has been shown to perform better than the Metropolis-Hastings algorithm and other MCMC algorithms (Goodman & Weare, 2010). The algorithm uses K walkers, and the positions of the walkers are updated based on the present positions of the K walkers (Weikun, 2015, pp. 1–8). The following is the affine invariant ensemble MCMC algorithm:

A starting point θ_i^(0) is selected for each of the walkers, i = 1, 2, …, K.

For every iteration t = 1, 2, …, T:

For i = 1, 2, …, K:

  • randomly select a walker j from the K walkers such that j ≠ i

  • randomly choose z from the distribution f(z) ∝ 1/√z, 1/a ≤ z ≤ a

  • the proposal for θ_i^(t) is γ = θ_j^(t−1) + z(θ_i^(t−1) − θ_j^(t−1)) (Stretch Move)

  • the proposal for θ_i^(t) is accepted with probability α = min{1, z^{p−1} π(γ|D)/π(θ_i^(t−1)|D)}

  • random sample u from U(0,1). If u < α, the proposal is accepted and θ_i^(t) = γ. If not, θ_i^(t) = θ_i^(t−1),

where π(θ|D) is the unnormalized posterior distribution, a > 1 is adjusted to improve performance, and f(z) satisfies the symmetry condition f(1/z) = z f(z).

The equation θ_j^(t−1) + z(θ_i^(t−1) − θ_j^(t−1)) is the equation of a line parallel to the vector (θ_i^(t−1) − θ_j^(t−1)). By randomly choosing z, the stretch move in the algorithm moves to a vector position, γ, a certain distance up or down the line. Then the vector proposal, γ, is either accepted or rejected based on the acceptance probability, α.

The set of samples from each of the K walkers will converge to the unnormalized posterior distribution, π(θ|D). After running the method, the samples from the K walkers can be pooled to form a larger sample from the unnormalized posterior distribution, of size KT. Since samples from the first iterations are generally far from the highest-density region of the unnormalized posterior distribution, the first iterations are usually deleted from each of the K walkers; this deletion of the first iterations is called burn-in. Let H be the number of pooled samples remaining after burn-in.
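In practice, the affine invariant ensemble sampler does not have to be coded by hand: the emcee Python package implements the Goodman and Weare (2010) stretch move. A minimal usage sketch follows; the walker initialization and burn-in length are illustrative choices.

```python
# Running an affine invariant ensemble sampler via the emcee package.
import numpy as np
import emcee

def run_ensemble(log_post, theta0, n_walkers, n_iter, burn_in, seed=0):
    rng = np.random.default_rng(seed)
    ndim = len(theta0)
    # start the K walkers in a small ball around an initial guess
    p0 = np.asarray(theta0) + 1e-3 * rng.standard_normal((n_walkers, ndim))
    sampler = emcee.EnsembleSampler(n_walkers, ndim, log_post)
    sampler.run_mcmc(p0, n_iter)
    # pool the walkers after discarding burn-in: H samples from pi(theta|D)
    return sampler.get_chain(discard=burn_in, flat=True)
```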

8. Diagnostics

The samples from the MCMC provide a sample path. It is important to diagnose whether this sample path produces a sample from the target unnormalized posterior distribution, π(θ|D); in other words, whether the sample path converges to the target unnormalized posterior distribution. From the plot of the sample path, it is vital to verify that the sample path has arrived at a stationary process and that it covers the domain of the target unnormalized posterior distribution, π(θ|D).

The sample path for each parameter θ_i should be plotted. Ideally, the sample path for each parameter θ_i oscillates rapidly and displays no apparent trend; this indicates that the sample path has arrived at a stationary process. By inspecting the marginal posterior distribution π(θ_i|D) for each parameter θ_i, one should verify that the sample path covers the domain of the target unnormalized posterior distribution, π(θ|D).

A formalized test of the convergence of the MCMC sampling to the estimated unnormalized posterior distribution for each parameter θ_i is given by a general univariate comparison method (Gelman & Brooks, 1998). This method takes the length of the empirical 100(1−α)% interval of the pooled samples, S, and divides it by the average of the lengths of the empirical 100(1−α)% intervals of each of the K walkers, s_i, to obtain the potential scale reduction factor, η (Gelman & Brooks, 1998):

\eta = \frac{S}{\frac{1}{K}\sum_{i=1}^{K}s_i}.   (17)

When the potential scale reduction factor, η, is close to 1 for all the estimated parameters, this indicates that the MCMC sampling converged to the estimated posterior distribution for each parameter.
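A short sketch of how the potential scale reduction factor in equation (17) could be computed for one parameter from the post burn-in samples of the K walkers:

```python
# Potential scale reduction factor (17): pooled interval width divided by
# the mean per-walker interval width, for a single parameter theta_i.
import numpy as np

def scale_reduction_factor(chains, alpha=0.05):
    """chains: array of shape (K, n_samples) of post burn-in samples."""
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    pooled = np.concatenate(chains)
    S = np.subtract(*np.percentile(pooled, [hi, lo]))          # pooled width
    s = [np.subtract(*np.percentile(c, [hi, lo])) for c in chains]
    return S / np.mean(s)
```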

9. Credible intervals for parameters

For a unimodal, symmetric marginal posterior distribution, π(θ_i|D), for θ_i, the 95% credible interval for θ_i is given by the 2.5 and 97.5 percentiles of the marginal posterior distribution π(θ_i|D) (Chen et al., 2000).

9.1. Non-uniqueness

Non-uniqueness occurs when there is more than one solution vector θ that explains the data, D, equally well.

When there is non-uniqueness, the marginal posterior distribution, π(θi|D), for θi is constant over an interval and the credible interval for θi is given by the upper and lower limits of the interval (Chen et al., 2000).

The credible intervals resulting from non-uniqueness are still very beneficial since they are often more specific than the initial prior distributions specified for the parameters.

10. Posterior predictive distribution

Let D˜={D˜1,,D˜m} be future responses of interest for the m datasets. The posterior predictive distribution of D˜ is given by

P(\tilde{D} \mid D) = \int_{\Omega}P(\tilde{D} \mid \theta)\,P(\theta \mid D)\,d\theta,   (18)

where P(θ|D) is the posterior distribution and P(D˜|θ) is the same probability model for the data specified in the Bayesian framework (16).

To generate the posterior predictive distribution:

For each pooled sample t = 1, 2, …, H:

  • randomly sample D̃ from the probability distribution specified for the data, P(D|θ^(t)), at θ^(t),

where H is the number of samples from the unnormalized posterior distribution.

The 95% prediction interval for each data set D_j is found by determining the 2.5 and 97.5 percentiles of the posterior predictive distribution at each t_{ij}.

The posterior predictive mean is found by taking the mean of the posterior predictive distribution at each tij.
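A sketch of this procedure, assuming a hypothetical simulate helper that draws one replicate data set from P(D|θ^(t)) at the observation times (for example, negative binomial draws around the ODE solution):

```python
# Posterior predictive distribution, 95% prediction bands, and predictive mean.
# `samples` are the H pooled posterior samples; `simulate(theta, rng)` is an
# assumed helper returning one simulated data set at the observation times.
import numpy as np

def posterior_predictive(samples, simulate, seed=0):
    rng = np.random.default_rng(seed)
    draws = np.array([simulate(theta, rng) for theta in samples])
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)   # 95% prediction interval
    mean = draws.mean(axis=0)                                  # posterior predictive mean
    return mean, lower, upper
```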

11. An example: logistic growth

Assume there are three people infected with a virus in an isolated town of 3000 people. Furthermore, assume that the true model for the first 15 days of the virus spreading through the population is plotted in Fig. 1 and given by the following differential equation

\frac{dx}{dt} = x\left(r - \frac{r}{N}x\right),   (19)

where x_0 = 3, r = 0.8, and N = 3000.

Fig. 1. The true logistic growth model for the spread of the viral infection in the small town with x_0 = 3, r = 0.8, and N = 3000.

Now, this differential equation (19) can be solved analytically, and we obtain the logistic equation

x(\beta, t_i) = \frac{r x_0}{\frac{r}{N}x_0 + \left(r - \frac{r}{N}x_0\right)e^{-r t_i}},   (20)

where

\beta = [x_0, r, N]^T.

Now, assume that the town collects count data for the number of people infected with the virus. We generate this observed data by randomly sampling from the negative binomial distribution with mean given by (20) with x_0 = 3, r = 0.8, and N = 3000, and variance given by the mean divided by p, where p is chosen as 0.005. The generated observed data for the first 15 days of the virus spreading through the population are plotted in Fig. 2.

Fig. 2. The generated data for the spread of a viral infection in the small town.
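Data of this kind could be generated with a short script such as the following sketch, which draws negative binomial counts whose mean follows the logistic solution (20); the random seed and variable names are illustrative.

```python
# Synthetic counts: negative binomial draws with mean from equation (20),
# x0 = 3, r = 0.8, N = 3000, and p = 0.005 (so the variance is mean / p).
import numpy as np

def logistic_solution(t, x0=3.0, r=0.8, N=3000.0):
    """Equation (20)."""
    return r * x0 / ((r / N) * x0 + (r - (r / N) * x0) * np.exp(-r * t))

rng = np.random.default_rng(1)
p = 0.005
t_obs = np.arange(0, 16)                                 # days 0..15
mu = logistic_solution(t_obs)
data = rng.negative_binomial(p * mu / (1.0 - p), p)      # r_i = p*mu_i/(1-p), as in eq. (8)
```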

Now, we will use Bayesian inference to determine the following unknown vector of parameters

\theta = [\beta, p]^T = [x_0, r, N, p]^T.

In this scenario, equation (4) is E[D_i] = μ_i = x(β, t_i), and the negative binomial distribution (8) is chosen to describe the observed data.

The following uniform prior distributions are chosen for the parameters:

  • x0 with distribution U(1,50)

  • r with distribution U(0.1,2)

  • N with distribution U(100,6000)

  • p with distribution U(1 × 10⁻⁵, 1 × 10⁻¹).
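Putting the pieces together, the sketch below shows one possible implementation of the example fit (not the authors' code): the negative binomial log-likelihood with mean given by the logistic solution (20), the uniform priors above, and the affine invariant ensemble sampler from the emcee package with K = 8 walkers. The data generation, starting values, and burn-in length are illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln
import emcee

def logistic_mean(t, x0, r, N):
    """Equation (20)."""
    return r * x0 / ((r / N) * x0 + (r - (r / N) * x0) * np.exp(-r * t))

BOUNDS = [(1.0, 50.0), (0.1, 2.0), (100.0, 6000.0), (1e-5, 1e-1)]   # uniform priors

def log_post(theta, t_obs, data):
    if not all(lo < v < hi for v, (lo, hi) in zip(theta, BOUNDS)):
        return -np.inf                         # outside the uniform prior ranges
    x0, r, N, p = theta
    mu = logistic_mean(t_obs, x0, r, N)
    r_i = p * mu / (1.0 - p)                   # r_ij in equation (8)
    return np.sum(gammaln(data + r_i) - gammaln(r_i)
                  + r_i * np.log(p) + data * np.log(1.0 - p))

rng = np.random.default_rng(1)
t_obs = np.arange(0, 16)
data = rng.negative_binomial(0.005 * logistic_mean(t_obs, 3, 0.8, 3000) / 0.995, 0.005)

n_walkers, ndim = 8, 4
start = np.array([5.0, 0.5, 2000.0, 0.01]) * (1 + 1e-3 * rng.standard_normal((n_walkers, ndim)))
sampler = emcee.EnsembleSampler(n_walkers, ndim, log_post, args=(t_obs, data))
sampler.run_mcmc(start, 100_000)                              # T = 100,000 iterations
posterior = sampler.get_chain(discard=10_000, flat=True)      # pooled, post burn-in
intervals = np.percentile(posterior, [2.5, 50, 97.5], axis=0) # 95% credible intervals
```

The pooled samples in posterior can then be used to form the credible intervals, convergence diagnostics, and posterior predictive quantities described in sections 8, 9, and 10.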

The affine invariant ensemble MCMC algorithm is used with T = 100,000 iterations and K = 8 walkers. The potential scale reduction factor, η, for each parameter is:

  • η=0.9941 for x0

  • η=0.9977 for r

  • η=0.9963 for N

  • η=0.9987 for p.

All potential scale reduction factors are close to 1, indicating that the algorithm converged to the posterior distribution for each parameter.

The marginal unnormalized posterior distribution for each parameter is plotted in Fig. 3. The estimated parameters with 95% credible intervals are the following:

  • x0 is estimated to be 4.13 (1.68, 19.58),

  • r is estimated to be 0.690 (0.474, 0.834),

  • N is estimated to be 2.99 × 10³ (2.46 × 10³, 4.47 × 10³), and

  • p is estimated to be 0.0070 (0.0032, 0.0111).

Fig. 3. Marginal unnormalized posterior distribution for (a) x_0, (b) r, (c) N, and (d) p.

The true parameter values for x0, r, N, and p all lie within the 95% credible intervals.

Samples from the posterior predictive distribution and the posterior predictive mean are displayed in Fig. 4. The true model, best fit model (model with the highest unnormalized posterior probability), and posterior predictive mean are compared in Fig. 5. It is seen that the best fit model (red curve) lies very close to the posterior predictive mean (black curve) and is near the true model (blue curve). It is observed that the true model (blue curve) and all of the generated data (red circles) lie within the 95% prediction intervals (dashed black curves).

Fig. 4. Posterior predictive distribution with the posterior predictive mean.

Fig. 5. Best fit and true model for the spread of a viral infection in the small town with 95% prediction intervals.

Declaration of competing interest

I wish to confirm that there are no known conflicts of interest associated with these lecture notes.

Handling Editor: Dr. J Wu

Footnotes

Peer review under responsibility of KeAi Communications Co., Ltd.

References

  1. Bain L.J., Engelhardt M. Introduction to probability and mathematical statistics. 2nd ed. Brooks/Cole; 1987.
  2. Bolker B. Ecological models and data in R. Princeton University Press; Princeton, New Jersey: 2007.
  3. Chen M., Shao Q., Ibrahim J.G. Monte Carlo methods in Bayesian computation. Springer-Verlag; New York, New York: 2000.
  4. Gelman A., Brooks S.P. General methods for monitoring convergence of iterative simulations. J. Comput. Graph. Stat. 1998;7(4):434–455.
  5. Ghasemi O., Lindsey M.L., Yang T., Nguyen N., Huang Y., Jin Y.-F. Bayesian parameter estimation for nonlinear modelling of biological pathways. BMC Syst. Biol. 2011;5(Suppl 3):S9. doi: 10.1186/1752-0509-5-S3-S9.
  6. Goodman J., Weare J. Ensemble samplers with affine invariance. Comm. App. Math. Com. Sc. 2010;5(1):65–80.
  7. Higham C.F., Husmeier D. A Bayesian approach for parameter estimation in the extended clock gene circuit of Arabidopsis thaliana. BMC Bioinformatics. 2013;14(Suppl 10):S3. doi: 10.1186/1471-2105-14-S10-S3.
  8. Kalbfleisch J.G. Probability and statistical inference, Vol. 2: Statistical inference. Springer-Verlag New York, Inc.; 1979.
  9. Linden A., Mantyniemi S. Using the negative binomial distribution to model overdispersion in ecological count data. Ecology. 2011;92(7):1414–1421. doi: 10.1890/10-1831.1.
  10. Lynch S.M. Introduction to applied Bayesian statistics and estimation for social scientists. Springer; New York: 2007.
  11. Ma Y.Z., Berndsen A. How to combine correlated data sets - a Bayesian hyperparameter matrix method. Astron. Comput. 2014;5:45–56.
  12. Montgomery D.C., Peck E.A., Vining G.G. Introduction to linear regression analysis. John Wiley & Sons, Inc.; Hoboken, New Jersey: 2006.
  13. Periwal V., Chow C.C., Bergman R.N., Ricks M., Vega G.L., Sumner A.E. Evaluation of quantitative models of the effect of insulin on lipolysis and glucose disposal. Am. J. Physiol. Regul. Integr. Comp. Physiol. 2008;295:R1089–R1096. doi: 10.1152/ajpregu.90426.2008.
  14. Vanlier J., Tiemann C.A., Hilbers P.A.J., van Riel N.A.W. A Bayesian approach to targeted experiment design. Bioinformatics. 2012;28(8):1136–1142. doi: 10.1093/bioinformatics/bts092.
  15. Weikun C. A parallel implementation of MCMC. 2015.
