PLOS ONE. 2013 Dec 20;8(12):e82317. doi: 10.1371/journal.pone.0082317

Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics

Richard Dybowski 1,*, Trevelyan J McKinley 1, Pietro Mastroeni 1, Olivier Restif 1
Editor: Andrew J Yates2
PMCID: PMC3869703  PMID: 24376528

Abstract

Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered.

Introduction

Model comparison

Model-based inference is widely used in the life sciences to assess the plausibility of hypothesised biological mechanisms based on data from observations or experiments. One of the most common approaches to comparing competing models representing alternative hypotheses relies on Akaike's Information Criterion (AIC) [1]. For a given data set $D$, the plausibility of the candidate models $M_1, \ldots, M_m$ is assessed by calculating their respective AIC values, $\mathrm{AIC}_1, \ldots, \mathrm{AIC}_m$:

$$\mathrm{AIC}_i = -2 \ln p(D \mid \hat{\theta}_i, M_i) + 2 k_i \qquad (1)$$

In (1), $\hat{\theta}_i$ is the maximum likelihood estimate of the set of parameters associated with model $M_i$, and $k_i$ is the corresponding number of degrees of freedom. If $\mathrm{AIC}_i < \mathrm{AIC}_j$ then $M_i$ is more plausible than $M_j$, with respect to $D$, in the sense that the Kullback-Leibler divergence of $M_i$ from the true model is smaller [2].

An important drawback of the classic approach to model choice is that it is based on a single point estimate $\hat{\theta}$ of $\theta$, the uncertainty in $\hat{\theta}$ being ignored. In contrast, the Bayesian approach considers a probability distribution for $\theta$, with $p(\theta \mid D, M)$ expressing the uncertainty in $\theta$ given $D$ (for a model $M$).

Suppose that we wish to select a model from a set of candidate models $\{M_1, \ldots, M_m\}$ given our observation of data $D$. We can express this goal probabilistically by stating that the aim is to determine the most probable model: $\arg\max_i \, p(M_i \mid D)$.

From Bayes' theorem, we have

$$p(M_i \mid D) = \frac{p(D \mid M_i)\, p(M_i)}{p(D)} \qquad (2)$$

therefore, if $p(M_i)$ is known, or considered to be equal for all $M_i$, then the focus is on the model evidence $p(D \mid M_i)$.

If $\theta_i$ is the set of parameters associated with model $M_i$, the Bayesian approach to $p(D \mid M_i)$ is to integrate over all possible values of $\theta_i$:

$$p(D \mid M_i) = \int p(D \mid \theta_i, M_i)\, p(\theta_i \mid M_i)\, d\theta_i \qquad (3)$$

In addition to allowing for parameter uncertainty, (3) intrinsically penalizes against models that are better able to fit to observed data because of their complexity [3], thereby removing the need for an explicit complexity penalization term.

The integral of (3) can be estimated analytically or numerically. In analytical approaches, the integral is approximated by adopting simplifying assumptions, such as those used in the derivation of the Bayesian Information Criterion [4]. Numerical approaches are based on some form of Monte Carlo sampling, such as Gibbs sampling [5].

One approach to estimating the integral

$$\int L(\theta)\, \pi(\theta)\, d\theta$$

numerically, where $L(\theta)$ denotes the likelihood and $\pi(\theta)$ the prior, is to sample $\theta$ randomly from its prior,

$$\int L(\theta)\, \pi(\theta)\, d\theta \approx \frac{1}{n} \sum_{j=1}^{n} L(\theta_j), \qquad \theta_j \sim \pi(\theta); \qquad (4)$$

however, the prior $\pi(\theta)$ is often concentrated in places where the likelihood $L(\theta)$ is relatively low. This problem becomes more severe in high-dimensional parameter ($\theta$) spaces, or in problems where the likelihood function $L(\theta)$ is concentrated in a very small region.
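The prior-sampling estimate of (4) can be illustrated with a minimal Python sketch. The toy likelihood, the uniform prior on [0, 1] and all function names below are assumptions made for illustration only; they are not taken from the analysis in this paper.

```python
import math
import random


def evidence_by_prior_sampling(likelihood, sample_prior, n_samples, rng):
    """Average the likelihood over draws from the prior, as in Equation (4)."""
    total = 0.0
    for _ in range(n_samples):
        total += likelihood(sample_prior(rng))
    return total / n_samples


def toy_likelihood(theta, centre=0.5, width=0.01):
    """Hypothetical likelihood: a narrow Gaussian bump inside the prior range."""
    return math.exp(-0.5 * ((theta - centre) / width) ** 2)


rng = random.Random(0)
estimate = evidence_by_prior_sampling(
    toy_likelihood, lambda r: r.random(), 100_000, rng
)
# Closed-form value of the toy integral; the Gaussian tails outside [0, 1]
# are negligible for this width.
exact = 0.01 * math.sqrt(2.0 * math.pi)
print(estimate, exact)
```

Only a small fraction of the prior draws land where this likelihood is appreciable, which is the inefficiency described above; narrowing `width` or adding dimensions makes that fraction smaller still.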

To overcome this problem, Skilling [6], [7] proposed a means of estimating the evidence that, by design, samples $\theta$ sparsely from the regions of $\theta$ space where the likelihood $L(\theta)$ is low, and densely where $L(\theta)$ is high, by means of 'nested sampling', which is the focus of this paper. A recent addition to the Bayesian arsenal, nested sampling has been used in cosmology to compare alternative models of the universe against observed data [8]. Outside of physics, it has, so far, received little attention [9], [10].

Within-host dynamics of a bacterial infection

Quantitative research on infectious disease dynamics has undergone rapid development over the last two decades, motivated by concerns about emerging infections that can spread globally and about the evolution of pathogens resistant to existing control measures such as antimicrobials and vaccines. Bayesian computation has become the method of choice to fit stochastic dynamic models to epidemiological [11] or experimental datasets [12]. This is in large part due to the appeal of being able to produce measures of uncertainty and correlation for the model parameters based on their posterior probability distributions. Similarly, models for within-host dynamics of infection have more recently started to benefit from Bayesian inference approaches [13].

Salmonella enterica causes systemic diseases (typhoid and paratyphoid fever) [14], food-borne gastroenteritis and non-typhoidal septicaemia (NTS) [15] in humans and in many other animal species worldwide, and it also poses a serious problem for the food industry. The global burden of typhoid fever is estimated at ca. 22 million cases, with mortality estimated at ca. 200,000 deaths per year [14], [16]. Paratyphoid fever causes an estimated 5.4 million illnesses worldwide [16]. The high incidence of these diseases, which affect both travellers to and residents in endemic areas and threaten infants, children and immunodeficient patients, dictates the urgent need for more efficacious preventive and therapeutic measures.

In the mouse model of systemic infection, Salmonella reside and proliferate mainly within phagocytic cells of the spleen, liver, bone marrow and lymph nodes [17]–[19]. Observation of Salmonella by fluorescence microscopy in the tissues of mice has revealed that a key feature of systemic infections with wild-type bacteria is the presence, on average, of low bacterial numbers within individual phagocytes, irrespective of net bacterial growth rate and time since infection [20]–[23].

In an effort to understand the dynamics that underpin the intracellular numerical distributions of Salmonella within the host cells, and to capture the essential traits of the cell-to-cell spread of the bacteria, we have used mathematical model frameworks for the intensity of intracellular infection that link the quasi-stationary distribution of bacteria to bacterial and cellular demography. An example of this is the work done by Brown et al. [24], who compared the observed distribution $\{C_n\}$, where $C_n$ is the number of cells with $n$ bacteria, across 16 candidate infection models. The models under consideration were as follows: (a) one homogeneous model, in which, for every cell, burst occurred only when the number of bacteria $n$ in a cell reached a single burst threshold Inline graphic; (b) five heterogeneous models having a probability distribution of burst thresholds; and (c) eight stochastic models for which there is a probability that a given cell will undergo burst. Two datasets were analysed, one for a virulent strain of bacteria and the other for an attenuated strain. Brown et al. [24] computed the maximum likelihood estimates of the parameters of each model, and selected the 'best' model based on the corresponding AIC values.

In order to overcome the issues raised by AIC discussed above, we decided to re-analyse the datasets and re-assess the models within a Bayesian framework.

Methods

What follows is an elaboration of the description of nested sampling given by Skilling [6], [7].

Nested sampling

The expected value of a function $g(x)$ of a random variable $x$ is given by

$$E[g(x)] = \int g(x)\, f(x)\, dx$$

where $f(x)$ is the pdf of $x$. On comparing this expression with the target integral $\int L(\theta)\, \pi(\theta)\, d\theta$, it is clear that

$$\int L(\theta)\, \pi(\theta)\, d\theta = E_{\pi}[L(\theta)] \qquad (5)$$

that is to say, the expected value of the likelihood under the prior. The cumulative distribution function $F(x)$ with respect to a random variable $x$ is defined by

$$F(x) = \int_{-\infty}^{x} f(u)\, du$$

and, for a non-negative random variable, is related to the expectation $E[x]$ by

$$E[x] = \int_{0}^{\infty} \bigl(1 - F(x)\bigr)\, dx$$

[25]; consequently, from (5), we obtain the important relationship

$$\int L(\theta)\, \pi(\theta)\, d\theta = \int_{0}^{\infty} \bigl(1 - F(\lambda)\bigr)\, d\lambda \qquad (6)$$

where $L(\theta)$ is the likelihood, and $\lambda$ in the right-hand integral ranges over the values taken by $L(\theta)$. The reason why (6) is important is that the multivariate integral on the left-hand side has been equated to a univariate integral.

Since Inline graphic has a distribution defined by prior Inline graphic, and Inline graphic, it follows that Inline graphic has a probability distribution and thus a cumulative distribution function,

graphic file with name pone.0082317.e070.jpg (7)

which is present in the integrand of the right-hand integral of (6).

We can replace Inline graphic in (6) with a more accessible integral by the following steps. First, since the pdf of Inline graphic is connected to the pdf of Inline graphic via Inline graphic, we can write

graphic file with name pone.0082317.e075.jpg (8)

thus, from (6), (7) and (8), we can write

graphic file with name pone.0082317.e076.jpg (9)

It will be convenient to rewrite the inner integral of (9) as Inline graphic to give

$$\int L(\theta)\, \pi(\theta)\, d\theta = \int_{0}^{\infty} X(\lambda)\, d\lambda \qquad (10)$$

where $X(\lambda)$ is the probability of selecting $\theta$ from the prior $\pi(\theta)$ such that $L(\theta) > \lambda$:

$$X(\lambda) = \int_{L(\theta) > \lambda} \pi(\theta)\, d\theta \qquad (11)$$

Introducing the inverse function $L(X)$, hence $L(X(\lambda)) = \lambda$, we can rewrite the previous integral as

$$\int L(\theta)\, \pi(\theta)\, d\theta = \int_{0}^{1} L(X)\, dX \qquad (12)$$

where $L(X)$ is that likelihood $\lambda$ such that $X(\lambda) = X$ (cf. Equation (11)); for example, if $L(0.9) = 0.0042$ then 90% of $\theta$ drawn from the prior $\pi(\theta)$ will have likelihoods greater than 0.0042.

The algorithm

The main steps of the nested sampling technique are as follows. First, $N$ points $\theta_1, \ldots, \theta_N$ (i.e., parameter vectors) are sampled from the prior $\pi(\theta)$, and their corresponding likelihoods $L(\theta_1), \ldots, L(\theta_N)$ determined. The point $\theta^{*}$ having the smallest likelihood is determined and its likelihood $L_1$ is recorded. Furthermore, the probability $X_1$ that $L(\theta) > L_1$ is also recorded. Point $\theta^{*}$ is replaced by a new $\theta$ drawn from the prior $\pi(\theta)$ but restricted to those $\theta$ for which $L(\theta) > L_1$. In other words, a restricted prior is used: $\pi(\theta \mid L(\theta) > L_1)$. If $\Theta_0$ is the set of all possible $\theta$ then the set $\Theta_1 = \{\theta : L(\theta) > L_1\}$ is a subset of $\Theta_0$.

The above sequence of determining $\theta^{*}$ and the corresponding likelihood is performed on the new set of points, giving rise to $L_2$ and $X_2$. Point $\theta^{*}$ is replaced by a $\theta$ drawn from the new restricted prior $\pi(\theta \mid L(\theta) > L_2)$. In other words, $\theta$ is sampled from $\Theta_2 \subset \Theta_1$, for which $L(\theta) > L_2$.

This cycle is repeated until some stopping criterion has been reached. If this termination occurs at the $n$-th iteration then the resulting values of $L_i$ and $X_i$ will be

$$L_1 < L_2 < \cdots < L_n$$
$$1 > X_1 > X_2 > \cdots > X_n > 0$$

and the resulting sequence of $n$ subsets is

$$\Theta_1 \supset \Theta_2 \supset \cdots \supset \Theta_n$$

hence the term nested sampling.

Model evidence Inline graphic can be estimated from the recorded Inline graphic and Inline graphic values by means of the approximation

graphic file with name pone.0082317.e131.jpg (13)

where Inline graphic is the number of iterations used, and Inline graphic is a vertical rectangular segment under the curve of Figure 1.

Figure 1. The shaded area below the curve for Inline graphic is equal to Inline graphic.

Figure 1

See Equation (12).

Algorithm 1 (Table 1) describes the above process in pseudocode.

Table 1. Algorithm 1: The nested sampling algorithm.

Input: (a) likelihood function Inline graphic; (b) prior Inline graphic; (c) number Inline graphic of active parameter vectors in use during nested sampling.
Output: an estimate Inline graphic of Inline graphic.
1: Let Inline graphic be a set of Inline graphic parameter vectors Inline graphic
2: Inline graphic
3: Inline graphic
4: while terminating condition not satisfied do
5: Inline graphic
6: Inline graphic
7: Inline graphic
8: if Inline graphic then
9: Inline graphic ▹Estimated segment of Inline graphic
10: Inline graphic
11: Inline graphic ▹Restricted prior
12: Inline graphic
13: Inline graphic
return Inline graphic
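
The loop described in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this paper: the prior is uniform on [0, 1], the restricted prior is sampled by naive rejection from the full prior, the prior mass at iteration i is approximated by exp(-i/N), the run stops after a fixed number of iterations, and the final term for the remaining active points is a common refinement that is not part of the algorithm as described above. All names are hypothetical.

```python
import math
import random


def nested_sampling(likelihood, sample_prior, n_active, n_iter, rng):
    """Return a rough estimate of the integral of likelihood * prior."""
    active = [sample_prior(rng) for _ in range(n_active)]
    likes = [likelihood(theta) for theta in active]
    z_hat = 0.0
    x_prev = 1.0  # prior mass enclosed before the first iteration
    for i in range(1, n_iter + 1):
        # Find the active point with the smallest likelihood.
        worst = min(range(n_active), key=likes.__getitem__)
        l_star = likes[worst]
        # Assumed shrinkage of the enclosed prior mass: X_i ~ exp(-i / N).
        x_i = math.exp(-i / n_active)
        z_hat += l_star * (x_prev - x_i)  # rectangular segment
        x_prev = x_i
        # Replace the worst point by a prior draw with higher likelihood
        # (naive rejection; inefficient once the constrained region is small).
        while True:
            candidate = sample_prior(rng)
            l_candidate = likelihood(candidate)
            if l_candidate > l_star:
                break
        active[worst] = candidate
        likes[worst] = l_candidate
    # Refinement (an assumption of this sketch): spread the remaining prior
    # mass evenly over the points still active at termination.
    z_hat += x_prev * sum(likes) / n_active
    return z_hat


def toy_likelihood(theta, centre=0.5, width=0.05):
    """Hypothetical Gaussian-shaped likelihood on a uniform [0, 1] prior."""
    return math.exp(-0.5 * ((theta - centre) / width) ** 2)


rng = random.Random(0)
z_hat = nested_sampling(toy_likelihood, lambda r: r.random(), 100, 600, rng)
exact = 0.05 * math.sqrt(2.0 * math.pi)  # closed form for the toy problem
print(z_hat, exact)
```

The estimate is stochastic; repeating the run with different seeds gives a spread of values around the closed-form result for this toy problem.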

Practical adjustments to the algorithm

We will now consider how some of the aspects of Algorithm 1 can be implemented.

Segment Inline graphic used in (13) could be evaluated by the trapezoidal approach

graphic file with name pone.0082317.e158.jpg

but Sivia and Skilling [26] have found

graphic file with name pone.0082317.e159.jpg

to be adequate (line 9 in Algorithm 1).

Line 7 in Algorithm 1 assigned to $X_i$ the probability that $L(\theta) > L_i$, but an alternative approach is to replace this assignment with $X_i \leftarrow \exp(-i/N)$, where $N$ is the number of active points. This approximation of $X_i$ is derived as follows. Let $t_i$ denote the ratio $X_i / X_{i-1}$, with $X_0 = 1$. At the $i$th iteration we have

$$X_i = t_i X_{i-1}$$

and so

$$X_i = \prod_{k=1}^{i} t_k$$

therefore,

$$\ln X_i = \sum_{k=1}^{i} \ln t_k \qquad (14)$$

Now,

$$E[\ln t] = \int_{0}^{1} \ln t \; p(t)\, dt$$
$$= \int_{0}^{1} \ln t \; N t^{N-1}\, dt$$

[27]

$$= \bigl[ t^{N} \ln t \bigr]_{0}^{1} - \int_{0}^{1} t^{N-1}\, dt$$
$$= -\frac{1}{N}$$

therefore, from (14),

$$E[\ln X_i] = -\frac{i}{N}$$

Since the logarithm function is strictly increasing and concave, we have, from Jensen's inequality, that

$$\ln E[X_i] \geq E[\ln X_i]$$

and thus

$$E[X_i] \geq \exp(-i/N)$$

however, Sivia and Skilling [26, p. 186] drop the inequality and use the approximation

$$X_i \approx \exp(-i/N)$$
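
The expectation used in this derivation can be checked numerically. The sketch below assumes, as in the standard account of nested sampling, that the shrinkage ratio $t$ is distributed as the largest of $N$ independent uniform(0, 1) variates, which has density $N t^{N-1}$; the function and variable names are illustrative only.

```python
import math
import random


def mean_log_shrinkage(n_active, n_trials, rng):
    """Monte Carlo estimate of E[ln t], t being the largest of N uniforms."""
    total = 0.0
    for _ in range(n_trials):
        t = max(rng.random() for _ in range(n_active))
        total += math.log(t)
    return total / n_trials


rng = random.Random(1)
n_active = 20
simulated = mean_log_shrinkage(n_active, 50_000, rng)
expected = -1.0 / n_active  # the value -1/N obtained in the derivation
print(simulated, expected)
```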

As regards the termination of Algorithm 1, there is no rigorous criterion as to when the algorithm should be stopped, but Skilling [7] and Feroz and Hobson [28] have found

graphic file with name pone.0082317.e178.jpg

to be an effective stopping condition, where Inline graphic is the fraction of Inline graphic that will not significantly contribute to the estimate of Inline graphic (according to a user-defined value).

Chopin and Robert [29] have shown that the asymptotic variance of the nested sampling approximation typically grows linearly with the dimension of the parameter space.

Finally, there is the structure of the restricted priors. Each new point $\theta$ for a set $S$ of active points is sampled from the prior $\pi(\theta)$ conditioned on the restriction that $L(\theta) > L^{*}$, where $L^{*}$ is the lowest likelihood just recorded. Rather than searching across the entire $\theta$-space for such a point, it is more computationally efficient to restrict the search to a region $R$ that contains $S$. We have used rectangular cuboids for $R$.
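
One way to restrict the search to a rectangular cuboid is sketched below. It assumes a uniform prior, so that uniform draws inside the cuboid follow the restricted prior within that region, and it takes the cuboid to be the tightest axis-aligned box around the active points; whether the box should be enlarged to avoid missing part of the constrained region is not addressed here. The names and the toy likelihood are hypothetical.

```python
import math
import random


def bounding_box(points):
    """Tightest axis-aligned box enclosing a list of parameter vectors."""
    dims = range(len(points[0]))
    lower = [min(p[d] for p in points) for d in dims]
    upper = [max(p[d] for p in points) for d in dims]
    return lower, upper


def sample_in_box(likelihood, active_points, l_star, rng, max_tries=100_000):
    """Draw uniformly inside the box until the likelihood exceeds l_star."""
    lower, upper = bounding_box(active_points)
    for _ in range(max_tries):
        candidate = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        if likelihood(candidate) > l_star:
            return candidate
    raise RuntimeError("no acceptable point found inside the bounding box")


def toy_likelihood(theta):
    """Hypothetical two-parameter likelihood peaked at the origin."""
    return math.exp(-(theta[0] ** 2 + theta[1] ** 2))


rng = random.Random(2)
active = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(25)]
l_star = min(toy_likelihood(p) for p in active)
new_point = sample_in_box(toy_likelihood, active, l_star, rng)
box_lower, box_upper = bounding_box(active)
print(new_point)
```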

Incorporating the above points into Algorithm 1 leads to Algorithm 2 (Table 2). Before applying the algorithm to our experimental datasets, we tested it on a simple two-parameter likelihood function Inline graphic. The analyses and results are presented in Methods S1.

Table 2. Algorithm 2: An implementation of Algorithm 1 in which practical adjustments are included.

Input: (a) likelihood function Inline graphic; (b) prior Inline graphic; (c) number Inline graphic of active parameter vectors in use during nested sampling; (d) procedure for determining a region Inline graphic of parameter space that encloses a set of parameter vectors Inline graphic; (e) fraction Inline graphic of Inline graphic to be estimated.
Output: an estimate Inline graphic of Inline graphic.
1: Let Inline graphic be a set of Inline graphic parameter vectors Inline graphic
2: Inline graphic
3: Inline graphic
4: repeat
5: Inline graphic
6: Inline graphic
7: Inline graphic
8: if Inline graphic then
9: Inline graphic ▹Estimated segment of Inline graphic
10: Inline graphic
11: Inline graphic ▹Restricted prior
12: Inline graphic
13: Inline graphic
14: until Inline graphic ▹The stopping condition
return Inline graphic

The Salmonella models

The evidence was estimated by nested sampling with respect to two groups of models associated with within-host S. enterica infection, where each model $M$ provides an expression for the probability $q(n \mid \theta, M)$ that a host cell contains $n$ bacteria.

In the first group of models, infected cells are assumed to burst when the number of bacteria they contain reaches a fixed threshold Inline graphic. The probability distributions considered for Inline graphic are shown in Table 3.

Table 3. Probability distributions Inline graphic for the burst thresholds Inline graphic.

Model Distribution Parameters, θ
1 Inline graphic Inline graphic
2 Inline graphic Inline graphic, Inline graphic, Inline graphic
3 Inline graphic Inline graphic
4 Inline graphic Inline graphic, Inline graphic
5 Inline graphic Inline graphic, Inline graphic
6 Inline graphic Inline graphic

(1) Unimodal Kronecker, (2) bimodal Kronecker, (3) Poisson, (4) binomial, (5) negative binomial, and (6) geometric.

For the second group of models, the assumption is that, instead of pre-programmed burst thresholds Inline graphic, there is a burst rate Inline graphic that is a function of the number of bacteria Inline graphic in a cell. For these models, the general relationship is

graphic file with name pone.0082317.e244.jpg (15)

where Inline graphic. Furthermore, the rate of bacterial replication Inline graphic is assumed to be related to Inline graphic by

graphic file with name pone.0082317.e248.jpg (16)

where Inline graphic. As explained in Brown et al. [24], in the dynamic model, time can be re-scaled by the baseline replication rate Inline graphic, therefore this parameter cannot be estimated using the quasi-stationary distribution. For convenience, we set Inline graphic, so that the values of other parameters are relative to the baseline replication rate. The parameters of the eight stochastic models considered are shown in Table 4.

Table 4. Parameters used for the eight stochastic models based on (15) and (16).

Parameters, θ
Model   μ0    μ1    μ2    α0    αe
7       μ0    0     0     1     0
8       0     μ1    0     1     0
9       0     0     μ2    1     0
10      μ0    μ1    μ2    1     0
11      μ0    0     0     1     αe
12      0     μ1    0     1     αe
13      0     0     μ2    1     αe
14      μ0    μ1    μ2    1     αe

For each model, some of the parameters were set equal to constant values, which effectively removed those parameters from the model. The ranges of values considered were Inline graphic and Inline graphic.

Under the assumption that the number of host cells infected by Inline graphic bacteria reaches a quasi-stationary distribution, the probability Inline graphic that a cell contains Inline graphic bacteria can be derived for the 14 models [30]. For Model 1, we have the relationship

graphic file with name pone.0082317.e257.jpg (17)

For Models 2 to 6, the relationship is

graphic file with name pone.0082317.e258.jpg (18)

For Models 7 to 14, we have the recursive relationship

graphic file with name pone.0082317.e259.jpg (19)

where the infection rate constant Inline graphic is given by

graphic file with name pone.0082317.e261.jpg (20)

The value for q(1| Inline graphic, Inline graphic) can be handled as follows. Let

graphic file with name pone.0082317.e264.jpg (21)

so that (19) can be written as Inline graphic, then

graphic file with name pone.0082317.e266.jpg

but Inline graphic; therefore,

graphic file with name pone.0082317.e268.jpg

When bacterial replication is not dependent on Inline graphic, Inline graphic, in which case Inline graphic, but when replication is density dependent, (19) and (20) need to be solved self-consistently. This can be done by assuming an initial value for Inline graphic, computing Inline graphic from (19), updating Inline graphic using (20), and repeating this iteratively until Inline graphic no longer changes significantly. This process is shown in Algorithm 3 (Table 5).

Table 5. Algorithm 3: Estimation of Inline graphic using an iterative estimation of the infection rate constant Inline graphic.

Input: parameters Inline graphic for model Inline graphic.
Output: an estimate of probabilities Inline graphic.
1: Inline graphic Initial value for Inline graphic
2: Inline graphic
3: while: Inline graphic do
4: Inline graphic
5: Inline graphic
where Inline graphic ▹Equation (21)
6: Inline graphic ▹Estimate of Inline graphic
7: Inline graphic ▹Estimates of Inline graphic where Inline graphic
8: Inline graphic ▹Normalisation of the estimated probabilities
9: Inline graphic
return: Inline graphic
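
The self-consistent iteration of Algorithm 3 has the general shape sketched below. Because the specific forms of (19) and (20) are not reproduced here, the two maps are passed in as hypothetical callables (`q_from_rate` and `rate_from_q`), and the toy maps used in the demonstration are invented purely so that the example runs; they are not the Salmonella model.

```python
def solve_self_consistently(q_from_rate, rate_from_q, rate_init,
                            tol=1e-10, max_iter=1000):
    """Alternate between the two maps until the rate constant stops changing."""
    rate = rate_init
    for _ in range(max_iter):
        q = q_from_rate(rate)
        total = sum(q)
        q = [value / total for value in q]  # normalise the probabilities
        new_rate = rate_from_q(q)
        if abs(new_rate - rate) < tol:
            return new_rate, q
        rate = new_rate
    raise RuntimeError("iteration did not converge")


def toy_q_from_rate(rate, n_max=50):
    """Invented stand-in for (19): geometric-like weights for n = 1..n_max."""
    ratio = rate / (1.0 + rate)
    return [ratio ** (n - 1) for n in range(1, n_max + 1)]


def toy_rate_from_q(q):
    """Invented stand-in for (20): the rate depends on the first probability."""
    return 0.5 * (1.0 + q[0])


rate_hat, q_hat = solve_self_consistently(toy_q_from_rate, toy_rate_from_q, 1.0)
print(rate_hat)
```

For these toy maps the iteration is a contraction, so it settles within a few tens of steps; convergence for other maps is not guaranteed and would need to be checked.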

Likelihood function

With expressions for Inline graphic established for all the models, we can now determine the likelihood Inline graphic required for Algorithm 2. Following Brown et al. [30], we can express the likelihood function by a multinomial distribution:

graphic file with name pone.0082317.e298.jpg (22)
graphic file with name pone.0082317.e299.jpg (23)
graphic file with name pone.0082317.e300.jpg (24)

where Inline graphic is the observed distribution of Inline graphic (the number of cells with Inline graphic bacteria), and Inline graphic, if observations are assumed to be independent. García-Pérez [31] provides an algorithm for the accurate computation of multinomial probabilities.
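
A multinomial likelihood of this kind is usually evaluated on the log scale to avoid overflow in the factorials. The sketch below uses the standard multinomial form with the log-gamma function; it is not the algorithm of García-Pérez [31], and the function name and the small example are assumptions made for illustration.

```python
import math


def multinomial_log_likelihood(counts, probs):
    """Log of the multinomial probability of counts given cell probabilities.

    counts[k] is the number of cells observed in category k and probs[k] is
    the model probability of that category (the probabilities should sum to 1).
    """
    n_total = sum(counts)
    log_like = math.lgamma(n_total + 1)  # log of (total count)!
    for count, prob in zip(counts, probs):
        log_like -= math.lgamma(count + 1)  # log of count!
        if count > 0:
            if prob <= 0.0:
                return -math.inf  # an observed category the model forbids
            log_like += count * math.log(prob)
    return log_like


# Two cells, one in each of two equally likely categories:
# 2! / (1! 1!) * 0.5 * 0.5 = 0.5.
example = multinomial_log_likelihood([1, 1], [0.5, 0.5])
print(math.exp(example))
```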

As regards the prior Inline graphic for a model Inline graphic, it will be assumed to be uniform across the parameter space of interest for that model; consequently, the prior will be set equal to the reciprocal of the size of the parameter space. More precisely,

graphic file with name pone.0082317.e307.jpg

A continuation approach

The theory underlying nested sampling assumes that all the parameters for a model have continuous values; however, this will not necessarily be the case in practice. For example, the binomial model (Model 4) has a discrete parameter Inline graphic and a continuous parameter Inline graphic.

It is possible to formulate a theory of nested sampling for discrete parameters by replacing integrals with summations, but modifications to Algorithm 2 would be required to take account of the fact that, if Inline graphic is discrete, several points could occupy the same location in parameter space.

An alternative response to the presence of discrete parameters is to use a type of continuation approach [32]; in other words, if Inline graphic is a function defined only for integer values of Inline graphic, replace it with another function Inline graphic that takes real values, but for which Inline graphic when Inline graphic (or Inline graphic).

For Model 2, the Kronecker delta Inline graphic can be replaced with a narrow Gaussian function Inline graphic with Inline graphic. In the case of Model 1, continuation can be applied directly to (17) by allowing Inline graphic.

For those models using a factorial of a parameter (i.e., Models 4 and 5), we can replace the factorial $x!$ with $\Gamma(x + 1)$, since the gamma function $\Gamma$ is defined for real arguments.
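
As an illustration of this continuation, the sketch below extends the binomial coefficient to real arguments by replacing each factorial with a gamma function, evaluated through `math.lgamma` for numerical stability. The function name and the argument names `b` and `k` are illustrative and do not correspond to the paper's notation.

```python
import math


def log_binomial_coefficient(b, k):
    """Log of b! / (k! (b - k)!) with each factorial replaced by a gamma
    function, so that b and k may be real numbers with b >= k >= 0."""
    return (math.lgamma(b + 1.0)
            - math.lgamma(k + 1.0)
            - math.lgamma(b - k + 1.0))


at_integer = math.exp(log_binomial_coefficient(10, 3))       # agrees with 10 choose 3
between = math.exp(log_binomial_coefficient(10.5, 3))         # real-valued b
print(at_integer, between)
```

At integer arguments the continued function agrees with the ordinary binomial coefficient, and it varies smoothly in between, which is the property required of the continuation.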

The data

The data $D$ consisted of the number $C_n$ of mouse cells observed (via fluorescence microscopy) to contain $n$ S. enterica bacteria: Inline graphic. One dataset was used for a virulent bacterial strain (SL5560); another for an attenuated strain (SL3261). The infected cells were taken randomly from various locations in the liver. The observed $C_n$ values are shown in Table 6.

Table 6. The number Cn of cells containing n bacteria when virulent (SL5560) and attenuated (SL3261) strains of bacteria were used.

Cn
n Virulent Attenuated
1 655 1189
2 250 396
3 87 104
4 86 70
5 54 40
6 42 25
7 13 8
8 30 10
9 8 9
10 19 3
11 5 7
12 12 4
13 5 3
14 1 4
15 6 0
16 3 2
17 2 1
18 0 2
19 1 1
20 4 0
21 0 0
22 0 0
23 0 0
24 0 1
25 1 0
26 0 0
27 0 0
28 0 0
29 1 0

The data was pooled. If Inline graphic denotes the number of cells having Inline graphic bacteria on day Inline graphic then, for the virulent strain, Brown et al. [24] used Inline graphic, and for the attenuated strain they used Inline graphic.

Posterior model probabilities

If we assume that the set of candidate models is exhaustive, we can apply (2) to estimate the posterior probability $p(M_i \mid D)$ for each model. Furthermore, if $p(M_i)$ is assumed to be equal for all models, we can use

$$p(M_i \mid D) = \frac{p(D \mid M_i)}{\sum_{j} p(D \mid M_j)} \qquad (25)$$

There are 14 models, each arbitrarily having 10 estimates of $p(D \mid M_i)$, but it is impractical to systematically apply each of the $10^{14}$ possible combinations of these estimates to (25); therefore, the estimates were chosen randomly in order to obtain distributions for $p(M_i \mid D)$. The resulting distributions are shown in Figure 2.
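
The random-combination procedure can be sketched as follows, working with log-evidence values for numerical stability. The three models and their log-evidence estimates are invented for illustration; they are not the values reported in this paper, and the function names are hypothetical.

```python
import math
import random


def posterior_model_probabilities(log_evidences):
    """Normalise evidences into posterior probabilities under equal priors."""
    peak = max(log_evidences)  # subtract the maximum to avoid underflow
    weights = [math.exp(value - peak) for value in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]


def sample_probability_distributions(log_evidence_runs, n_draws, rng):
    """For each draw, pick one evidence estimate per model at random and
    convert the chosen combination into posterior model probabilities."""
    draws = []
    for _ in range(n_draws):
        combination = [rng.choice(runs) for runs in log_evidence_runs]
        draws.append(posterior_model_probabilities(combination))
    return draws


# Hypothetical log-evidence estimates: three models, three runs each.
runs = [
    [-27.2, -27.3, -27.1],
    [-28.0, -27.9, -28.1],
    [-30.1, -30.4, -29.9],
]
rng = random.Random(4)
draws = sample_probability_distributions(runs, 1000, rng)
equal_case = posterior_model_probabilities([0.0, 0.0])
print(draws[0])
```

Each element of `draws` is one set of posterior model probabilities; collecting a given model's entry across draws gives the kind of distribution summarised per model in Figure 2.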

Figure 2. Estimates of the posterior model probabilities Inline graphic when using data from (A) the attenuated strain and (B) the virulent strain.

Figure 2

An alternative approach to Bayesian model comparison is to use the Bayes factor $B_{ij} = p(D \mid M_i) / p(D \mid M_j)$. This provides a relative comparison of models $M_i$ and $M_j$ but not the absolute values of their posterior probabilities $p(M_i \mid D)$.

Results

The estimated model-evidence values Inline graphic obtained by nested sampling for each model are shown in Tables 7 and 8. The ranges are shown in Table 9.

Table 7. Median Inline graphic estimated for Models 1 to 6.

Model Distribution Attenuated Virulent
1 Inline graphic 77.59 38.56
2 Inline graphic 69.49 92.79
3 Inline graphic 53.75 34.09
4 Inline graphic 245.87 281.46
5 Inline graphic 30.26 34.18
6 Inline graphic 84.26 79.97

The highest model evidence Inline graphic (bold) and second highest model evidence (italic) models are highlighted.

Table 8. Median Inline graphic estimated for stochastic Models 7 to 14.

Parameters, θ
Model   μ0    μ1    μ2    α0    αe    Attenuated   Virulent
7       μ0    0     0     1     0     27.21        38.56
8       0     μ1    0     1     0     28.00        36.93
9       0     0     μ2    1     0     38.80        35.24
10      μ0    μ1    μ2    1     0     29.21        39.21
11      μ0    0     0     1     αe    27.32        34.27
12      0     μ1    0     1     αe    30.13        34.43
13      0     0     μ2    1     αe    41.25        34.60
14      μ0    μ1    μ2    1     αe    30.04        36.34

The highest model evidence Inline graphic (bold) and second highest model evidence (italic) models are highlighted.

Table 9. Inline graphic estimates for all models.

Attenuated Virulent
Model min median max min median max
1 77.56 77.59 77.63 38.55 38.56 38.58
2 69.38 69.49 69.66 92.66 92.79 92.88
3 53.71 53.75 53.79 34.07 34.09 34.10
4 245.83 245.87 245.91 281.36 281.46 281.50
5 29.93 30.26 30.52 34.16 34.18 34.24
6 84.23 84.26 84.30 79.93 79.97 80.01
7 27.19 27.21 27.24 38.52 38.56 38.58
8 27.94 28.00 28.02 36.88 36.93 36.97
9 38.78 38.80 38.85 35.20 35.24 35.98
10 29.06 29.21 29.39 38.66 39.21 43.12
11 27.28 27.32 27.38 34.24 34.27 34.28
12 29.99 30.13 30.36 34.39 34.43 34.50
13 40.93 41.25 41.84 34.53 34.60 34.63
14 29.86 30.04 30.48 36.19 36.34 39.65

With respect to the data from the attenuated strain, the most probable model was Model 7 (Inline graphic only) followed by Model 11 (Inline graphic and Inline graphic). With respect to the data from the virulent strain, the most probable model was Model 3 (Poisson) followed by Model 5 (negative binomial).

Parameter distributions

After having estimated the most probable model, $M^{*}$, it is of interest to estimate the posterior joint probability of the parameters $\theta$ with respect to $M^{*}$ and $D$: $p(\theta \mid D, M^{*})$.

From Bayes' theorem, we can write

$$p(\theta \mid D, M^{*}) = \frac{p(D \mid \theta, M^{*})\, p(\theta \mid M^{*})}{p(D \mid M^{*})} \qquad (26)$$

and the denominator of Eqn (26) can be estimated by nested sampling:

graphic file with name pone.0082317.e368.jpg (27)

Parameter estimation via reject sampling

Distribution Inline graphic can be estimated using reject sampling with approximation (27). As part of this process, the maximum of Inline graphic can be determined by performing Nelder-Mead simplex optimisation with respect to this distribution over parameter space.

The estimated parameter distributions obtained by reject sampling for Models 3, 5, 7 and 11, are shown in Figures 3, 4, 5, and 6, respectively. In each case, the sample size Inline graphic was 10000. The samples obtained by reject sampling were also used to construct density scatter plots (Figures 7 and 8), which provide a visualisation of the correlations between the parameters.
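
A minimal sketch of rejection (reject) sampling from a posterior is given below. It assumes a uniform prior, so that a prior draw is accepted with probability equal to its likelihood divided by the maximum likelihood; in this simplified form the normalising constant cancels and the evidence estimate is not needed. The maximum is known in closed form for the toy likelihood, whereas in practice it would be found numerically (for example by the Nelder-Mead optimisation mentioned above). All names and the toy likelihood are hypothetical.

```python
import random


def rejection_sample_posterior(likelihood, sample_prior, l_max, n_samples, rng):
    """Accept prior draws with probability likelihood(theta) / l_max."""
    samples = []
    while len(samples) < n_samples:
        theta = sample_prior(rng)
        if rng.random() * l_max <= likelihood(theta):
            samples.append(theta)
    return samples


def toy_likelihood(theta):
    """Hypothetical likelihood: 7 successes and 3 failures with success
    probability theta (binomial coefficient omitted, as it cancels)."""
    return theta ** 7 * (1.0 - theta) ** 3


rng = random.Random(3)
l_max = toy_likelihood(0.7)  # the toy likelihood peaks at theta = 7/10
posterior_sample = rejection_sample_posterior(
    toy_likelihood, lambda r: r.random(), l_max, 5000, rng
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
print(posterior_mean)  # the exact posterior is Beta(8, 4), with mean 2/3
```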

Figure 3. An estimate of the marginal probability distribution Inline graphic.

Figure 3

Inline graphic is data from the attenuated strain.

Figure 4. Estimates of the marginal probability distributions Inline graphic and Inline graphic.

Figure 4

Inline graphic is data from the attenuated strain.

Figure 5. An estimate of the marginal probability distribution Inline graphic.

Figure 5

Inline graphic is data from the virulent strain.

Figure 6. Estimates of the marginal probability distributions Inline graphic and Inline graphic.

Figure 6

Inline graphic is data from the virulent strain.

Figure 7. Density scatter plot of the estimated joint probability distribution Inline graphic.

Figure 7

Inline graphic is data from the attenuated strain.

Figure 8. Density scatter plot of the estimated joint probability distribution Inline graphic.

Figure 8

Inline graphic is data from the virulent strain.

Parameter estimation directly from nested sampling

The parameter sequence Inline graphic is produced during nested sampling. Can this set of parameters be regarded as a random sample from Inline graphic? Sivia and Skilling [26] proposed using Inline graphic for this purpose so long as it is weighted by Inline graphic, where Inline graphic, on the basis that Inline graphic. A theoretical justification for this is given by Chopin and Robert [29].

The appropriateness of regarding Inline graphic as a random sample from Inline graphic, was ascertained empirically using the Kolomogorov-Smirnov test, as follows.

The Kolmogorov-Smirnov statistic Inline graphic is given by

graphic file with name pone.0082317.e395.jpg

where Inline graphic is the cdf of the null-hypothesis pdf, and Inline graphic is the empirical cdf obtained from a sample Inline graphic:

graphic file with name pone.0082317.e399.jpg (28)

This definition can be generalized to a weighted Kolmogorov-Smirnov statistic by replacing (28) with a weighted cdf:

graphic file with name pone.0082317.e400.jpg

This allows us to take account of the weights Inline graphic on Inline graphic.
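The weighted statistic described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `weighted_ks`, the Beta(2, 5) null distribution and the uniform weights are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

def weighted_ks(sample, weights, null_cdf):
    """Two-sided Kolmogorov-Smirnov distance using a weighted empirical cdf."""
    sample = np.asarray(sample, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(sample)
    x = sample[order]
    w = weights[order] / weights.sum()
    cdf_after = np.cumsum(w)       # weighted empirical cdf just after each point
    cdf_before = cdf_after - w     # and just before each point
    f0 = null_cdf(x)               # null-hypothesis cdf at the sample points
    return max(np.max(cdf_after - f0), np.max(f0 - cdf_before))

# Hypothetical example: equal weights reduce to the ordinary statistic.
rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=500)
d = weighted_ks(x, np.ones_like(x), stats.beta(2, 5).cdf)
```

With equal weights the function should agree with the unweighted statistic, which gives a simple sanity check before applying it to weighted nested-sampling output.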

Applying this method to the toy model Inline graphic presented in Methods S1, a sample Inline graphic, with Inline graphic, was obtained by performing nested sampling for the evaluation of evidence Inline graphic, where Inline graphic. The corresponding sample Inline graphic was compared with the marginal beta distribution,

graphic file with name pone.0082317.e409.jpg

using the weighted Kolmogorov-Smirnov statistic Inline graphic. This statistic was equal to 0.01298. In order to obtain a frequentist Inline graphic-value for the statistic, an empirical probability distribution for Inline graphic was obtained by randomly selecting a set Inline graphic of Inline graphic values from Inline graphic and determining Inline graphic for the set, this being done 10000 times. On comparing 0.01298 with this empirical distribution, the Inline graphic-value for Inline graphic was found to be 0.0276. In contrast, when a sample of size Inline graphic was obtained by reject sampling from Inline graphic, the value of unweighted Inline graphic was 0.00630, which has a Inline graphic-value of 0.5772.
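The resampling procedure used to obtain the empirical distribution of the statistic can be sketched as below. The Beta(2, 5) null distribution, the sample size of 200, the reduced number of simulations and the observed value 0.05 are all placeholders for illustration, not the values used in the study.

```python
import numpy as np
from scipy import stats

def null_ks_distribution(null_dist, n, n_sim=10000, seed=0):
    """Simulate the Kolmogorov-Smirnov statistic for samples of size n drawn from null_dist."""
    rng = np.random.default_rng(seed)
    sims = np.sort(null_dist.rvs(size=(n_sim, n), random_state=rng), axis=1)
    f0 = null_dist.cdf(sims)
    k = np.arange(1, n + 1)
    d_plus = (k / n - f0).max(axis=1)          # empirical cdf above the null cdf
    d_minus = (f0 - (k - 1) / n).max(axis=1)   # empirical cdf below the null cdf
    return np.maximum(d_plus, d_minus)

def empirical_p_value(d_obs, d_null):
    """Fraction of simulated statistics at least as large as the observed one."""
    return float(np.mean(d_null >= d_obs))

null = stats.beta(2, 5)                        # hypothetical null distribution
d_null = null_ks_distribution(null, n=200, n_sim=2000)
p_value = empirical_p_value(0.05, d_null)      # 0.05 is a placeholder statistic
```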

As a result of this experiment, it was decided not to use Inline graphic for estimating parameter distributions.

Model checking

It does not follow that the most probable model from a set of candidate models is necessarily an acceptable model: the most probable model may be the least worst of a set of poor models. What is required is an assessment of the fit of the most probable models to the observed data.

A common approach to assessing the fit of a model to data is to use a Inline graphic-value with respect to some statistic Inline graphic, where Inline graphic is observed data. More formally, the classical Inline graphic-value is given by

graphic file with name pone.0082317.e428.jpg (29)

where Inline graphic is a possible future value, and the probability is taken over the distribution of Inline graphic given Inline graphic, a single parameter estimate.

A drawback of (29) is that it does not take account of the uncertainty in Inline graphic expressed by the posterior distribution Inline graphic. In contrast, the Bayesian posterior predictive Inline graphic-value [33], [34]

graphic file with name pone.0082317.e435.jpg (30)

overcomes the problem by using the posterior predictive distribution:

graphic file with name pone.0082317.e436.jpg
graphic file with name pone.0082317.e437.jpg

The posterior predictive distribution can be simulated by drawing Inline graphic values Inline graphic from Inline graphic, and then, for each Inline graphic, sampling a Inline graphic from Inline graphic. The resulting Inline graphic values of Inline graphic represent draws from Inline graphic.

In the context of the Salmonella study, Inline graphic was provided by the parameter estimates obtained for Inline graphic, Inline graphic was set to 10000, and Inline graphic was modelled as a multinomial distribution

graphic file with name pone.0082317.e451.jpg (31)

where Inline graphic is the total number of counts (cf. (22)).

In order to obtain Inline graphic values of Inline graphic drawn from Inline graphic, each Inline graphic drawn from Inline graphic is mapped to Inline graphic.
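The two-stage simulation described above (draw a parameter value, then draw replicated counts from a multinomial) might look like the following sketch. The function `cell_probs` is a hypothetical stand-in for the model-specific mapping from parameters to cell probabilities, here a renormalised truncated Poisson, and the gamma draws merely imitate a set of posterior samples.

```python
import numpy as np
from scipy import stats

def cell_probs(theta, k_max):
    """Hypothetical stand-in: Poisson probabilities over 1..k_max, renormalised."""
    k = np.arange(1, k_max + 1)
    p = stats.poisson.pmf(k, theta)
    return p / p.sum()

def posterior_predictive_counts(theta_draws, n_total, k_max, seed=0):
    """One multinomial replicate of the count vector per posterior draw."""
    rng = np.random.default_rng(seed)
    return np.array([rng.multinomial(n_total, cell_probs(t, k_max))
                     for t in theta_draws])

# Placeholder posterior sample; in practice these would be the retained draws.
rng = np.random.default_rng(42)
theta_draws = rng.gamma(shape=20.0, scale=0.1, size=1000)
reps = posterior_predictive_counts(theta_draws, n_total=300, k_max=10)
```

Each row of `reps` is one replicated data set, so column-wise summaries give the predictive distribution of each count.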

We used the Inline graphic-statistic for the test statistic Inline graphic [35]. The Inline graphic-statistic is proportional to the Kullback-Leibler measure of distribution divergence, and is given by

graphic file with name pone.0082317.e462.jpg (32)

where Inline graphic, and Inline graphic is the expected value for Inline graphic: Inline graphic.
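For illustration, the sketch below uses the standard form of the G-statistic, G = 2 Σ O_k ln(O_k / E_k) with E_k = n p_k, and estimates a posterior predictive p-value as the fraction of draws for which the replicated discrepancy is at least as large as the observed one. The Dirichlet draws stand in for model-derived cell probabilities and, like the function names, are assumptions of this example.

```python
import numpy as np

def g_statistic(observed, expected):
    """G = 2 * sum(O * ln(O / E)); cells with O = 0 contribute zero."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    mask = observed > 0
    return 2.0 * np.sum(observed[mask] * np.log(observed[mask] / expected[mask]))

def posterior_predictive_p(y_obs, prob_draws, seed=0):
    """Fraction of draws with G(y_rep, theta) >= G(y_obs, theta)."""
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs)
    n = int(y_obs.sum())
    exceed = 0
    for p in prob_draws:
        expected = n * p                      # expected counts under this draw
        y_rep = rng.multinomial(n, p)         # replicated data for this draw
        exceed += g_statistic(y_rep, expected) >= g_statistic(y_obs, expected)
    return exceed / len(prob_draws)

# Placeholder data and "posterior" cell probabilities (hypothetical values).
y_obs = np.array([30, 20, 10])
rng = np.random.default_rng(3)
prob_draws = rng.dirichlet([40.0, 40.0, 40.0], size=2000)
p_b = posterior_predictive_p(y_obs, prob_draws)
g_example = g_statistic([30, 20, 10], [20, 20, 20])
```

A value of `p_b` close to zero would indicate that the observed data are more discrepant from the model than most replicated data sets.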

Applying the above approach for estimating the distribution of Inline graphic under a given model Inline graphic, the posterior predictive Inline graphic-values for Inline graphic were found to be 0.005 for Model 7 and 0.006 for Model 11 (with respect to the attenuated strain), Inline graphic for Model 3 and Inline graphic for Model 5 (with respect to the virulent strain). This suggests a poor fit of the models to the data.

A visual representation of the fit of data to a model Inline graphic can be provided by comparing the observed count Inline graphic (the number of cells containing Inline graphic bacteria) to the distribution of Inline graphic possible count values Inline graphic obtained via (31). This visualisation is shown in Figures 9, 10, 11 and 12.
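A visual check of this kind rests on per-count predictive intervals, which can be computed from the replicated counts as in the sketch below. The replicated counts here are generated from fixed, made-up probabilities purely to make the example runnable; in practice they would be the posterior predictive replicates.

```python
import numpy as np

def predictive_intervals(rep_counts, level=0.95):
    """Equal-tailed interval for each count, taken across replicates (rows)."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(rep_counts, [tail, 1.0 - tail], axis=0)
    return lower, upper

def outside_interval(observed, lower, upper):
    """Boolean flag per count: True where the observed value falls outside."""
    observed = np.asarray(observed)
    return (observed < lower) | (observed > upper)

# Hypothetical replicates and observed counts.
rng = np.random.default_rng(1)
p = np.array([0.5, 0.25, 0.15, 0.1])
rep_counts = rng.multinomial(400, p, size=5000)
observed = np.array([230, 90, 50, 30])
lower, upper = predictive_intervals(rep_counts)
flags = outside_interval(observed, lower, upper)
```

Counts flagged as lying outside their interval point to the parts of the distribution that a model reproduces poorly.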

Figure 9. The observed number of cells with Inline graphic bacteria (blue) compared with 95% credibility intervals (red) predicted by Model 3 with respect to the virulent strain.

Figure 10. The observed number of cells with Inline graphic bacteria (blue) compared with 95% credibility intervals (red) predicted by Model 5 with respect to the virulent strain.

Figure 11. The observed number of cells with Inline graphic bacteria (blue) compared with 95% credibility intervals (red) predicted by Model 7 with respect to the attenuated strain.

Figure 12. The observed number of cells with Inline graphic bacteria (blue) compared with 95% credibility intervals (red) predicted by Model 11 with respect to the attenuated strain.

Discussion

The AIC is a common maximum-likelihood approach to model comparison, whereas nested sampling allows the model evidence Inline graphic to be approximated within a Bayesian framework and brings the advantages of that framework. These include integration across parameters; estimation of the posterior parameter distributions (with visualisation of parameter correlations); and estimation of the posterior predictive distributions for goodness-of-fit assessments of the models.

Under the assumptions used, the most probable models with respect to the virulent and attenuated strains of S. enterica were burst-threshold Model 3 (Poisson) and burst-rate Model 7 (Inline graphic only), respectively. The next two most probable models were burst-threshold Model 5 (negative binomial) and burst-rate Model 11 (Inline graphic plus Inline graphic), respectively. However, the Bayesian posterior predictive Inline graphic-values indicate that alternative models and/or a relaxation of the quasi-stationary assumption adopted by Brown et al. [24] should be considered. It may be the case that one of the candidate models is correct but the use of pooled data was detrimental.

Other assumptions of the underlying mechanistic model may also be wrong; in particular, the absence of bacterial death and the assumption that each released bacterium infects a new macrophage.

For both the attenuated and virulent strains, the data Inline graphic was recorded over a number of days following infection and then pooled, with Inline graphic. If time-dependent data is to be retained and nested sampling is to be applied then a method is required to estimate the likelihood function Inline graphic, where Inline graphic and Inline graphic is the number of cells containing Inline graphic bacteria on the Inline graphic-th day. Branching processes have been used to model a variety of biological systems [36], and we will investigate the potential of estimating Inline graphic through the use of Bellman-Harris processes to model within-host infection dynamics.

We have demonstrated that a visualisation of the marginal and joint posterior parameter distributions Inline graphic is readily obtainable once model evidence Inline graphic has been estimated by nested sampling. The estimated joint posterior distributions provided a visualisation of the correlations between the parameters. Through the use of a weighted Kolmogorov-Smirnov test, we also found that the parameter sequence Inline graphic resulting from nested sampling could not be regarded as a random sample from the posterior parameter distribution Inline graphic.

One drawback of Algorithm 2 is that the restricted priors will converge to a single mode when a likelihood is multi-modal, and this will cause the evidence Inline graphic to be underestimated. This issue can be resolved by implementing a multi-modal version of nested sampling, such as that proposed by Feroz et al. [37] for comparing cosmological models.

Supporting Information

Methods S1

Toy example of nested sampling.

(PDF)

Acknowledgments

We wish to thank Dr Andrew Grant and Dr Chris Coward for their helpful contributions during discussions.

Funding Statement

RD was funded by the Biotechnology and Biological Sciences Research Council (BBSRC) (grant number BB/I002189/1). TJM was funded by the Biotechnology and Biological Sciences Research Council (BBSRC) (grant number BB/I012192/1). OR was funded by the Royal Society. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Anderson D (2008) Model based inference in the life sciences: a primer on evidence. New York, NY: Springer Science+Business Media, LLC.
  • 2. Akaike H (1974) A new look at statistical model identification. IEEE Transactions on Automatic Control 19: 716–723.
  • 3. Bishop C (2006) Pattern Recognition and Machine Learning. New York: Springer.
  • 4. Schwarz G (1978) Estimating the dimension of a model. Annals of Statistics 6: 461–464.
  • 5. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721–741.
  • 6. Skilling J (2004) Nested sampling. AIP Conference Proceedings 735: 395–405.
  • 7. Skilling J (2006) Nested sampling for general Bayesian computation. Bayesian Analysis 1(4): 833–859.
  • 8. Mukherjee P, Parkinson D, Liddle A (2006) A nested sampling algorithm for cosmological model selection. Astrophysical Journal Letters 638: L51–L54.
  • 9. Murray I, Ghahramani Z, Mackay D, Skilling J (2006) Nested sampling for Potts models. In: Weiss Y, Scholkopf B, Platt J, editors. Advances in Neural Information Processing Systems (NIPS) 19. Cambridge, MA: MIT Press. pp. 947–954.
  • 10. Jasa T, Xiang N (2012) Nested sampling applied in Bayesian room-acoustics decay analysis. Journal of the Acoustical Society of America 132: 3251–3262.
  • 11. O'Neill P (2002) A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods. Mathematical Biosciences 180: 103–114.
  • 12. Charleston B, Bankowski B, Gubbins S, Chase-Topping M, Schley D, et al. (2011) Relationship between clinical signs and transmission of an infectious disease and the implications for control. Science 332: 726–729.
  • 13. Miller M, Raberg L, Read A, Savill N (2010) Quantitative analysis of immune response and erythropoiesis during rodent malarial infection. PLoS Computational Biology 6: e1000946.
  • 14. Crump J, Luby S, Mintz E (2004) The global burden of typhoid fever. Bulletin of the World Health Organization 82: 346–353.
  • 15. Mulholland E, Adegbola R (2005) Bacterial infections - a major cause of death among children in Africa. New England Journal of Medicine 352: 75–77.
  • 16. Crump J, Mintz E (2010) Global trends in typhoid and paratyphoid fever. Clinical Infectious Diseases 50: 241–246.
  • 17. Mastroeni P, Grant A, Restif O, Maskell D (2009) A dynamic view of the spread and intracellular distribution of Salmonella enterica. Nature Reviews Microbiology 7: 73–80.
  • 18. Mastroeni P, Grant A (2011) Spread of Salmonella enterica in the body during systemic infection: unravelling host and pathogen determinants. Expert Reviews in Molecular Medicine 13: e12.
  • 19. Richter-Dahlfors A, Buchan A, Finlay B (1997) Murine salmonellosis studied by confocal microscopy: Salmonella typhimurium resides intracellularly inside macrophages and exerts a cytotoxic effect on phagocytes in vivo. Journal of Experimental Medicine 186: 569–580.
  • 20. Grant A, Foster G, McKinley T, Brown S, Clare S, et al. (2009) Bacterial growth rate and host factors as determinants of intracellular bacterial distributions in systemic Salmonella enterica infections. Infection and Immunity 77: 5608–5611.
  • 21. Grant A, Morgan F, McKinley T, Foster G, Maskell D, et al. (2012) Attenuated Salmonella Typhimurium lacking the pathogenicity island-2 type 3 secretion system grow to high bacterial numbers inside phagocytes in mice. PLOS Pathogens 8: e1003070.
  • 22. Grant A, Sheppard M, Deardon R, Brown S, Foster G, et al. (2008) Caspase-3-dependent phagocyte death during systemic Salmonella enterica serovar Typhimurium infection of mice. Immunology 125: 28–37.
  • 23. Sheppard M, Webb C, Heath F, Mallows V, Emilianus R, et al. (2003) Dynamics of bacterial growth and distribution within the liver during Salmonella infection. Cellular Microbiology 5: 593–600.
  • 24. Brown S, Cornell S, Sheppard M, Grant A, Maskell D, et al. (2006) Intracellular demography and the dynamics of Salmonella enterica infections. PLoS Biology 4: e349.
  • 25. Dudewicz E, Mishra S (1988) Modern Mathematical Statistics. New York: John Wiley.
  • 26. Sivia D, Skilling J (2006) Data Analysis: A Bayesian Tutorial, 2nd edition. Oxford: Oxford University Press.
  • 27. Larson H (1982) Introduction to Probability Theory and Statistical Inference, 3rd edition. New York: John Wiley.
  • 28. Feroz F, Hobson M (2008) Multimodal nested sampling: an efficient and robust alternative to MCMC methods for astronomical data analysis. Monthly Notices of the Royal Astronomical Society 2: 449–463.
  • 29. Chopin N, Robert C (2010) Properties of nested sampling. Biometrika 97: 741–755.
  • 30. Brown S, Cornell S, Sheppard M, Grant A, Maskell D, et al. (2006) Protocol S1: Details of model constructions and statistical analyses for “Intracellular demography and the dynamics of Salmonella enterica infections”. PLoS Biology 4: e349.
  • 31. Garcia-Perez M (1999) MPROB: computation of multinomial probabilities. Behaviour Research Methods, Instruments and Computers 31: 701–705.
  • 32. Ng KM (2002) A Continuation Approach for Solving Nonlinear Optimization Problems with Discrete Variables. Ph.D. thesis, Department of Management Science and Engineering, Stanford University, Stanford, CA.
  • 33. Meng XL (1994) Posterior predictive p-values. The Annals of Statistics 22: 1142–1160.
  • 34. Gelman A, Carlin J, Stern H, Rubin D (1995) Bayesian Data Analysis. London: Chapman & Hall.
  • 35. Sokal R, Rohlf F (1995) Biometry, 3rd edition. New York: Freeman.
  • 36. Kimmel M, Axelrod D (2002) Branching Processes in Biology. New York: Springer-Verlag.
  • 37. Feroz F, Hobson M, Bridges M (2009) MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics. Monthly Notices of the Royal Astronomical Society 398: 1601–1614.
