Journal of the Royal Society Interface. 2014 Apr 6;11(93):20131093. doi: 10.1098/rsif.2013.1093

New model diagnostics for spatio-temporal systems in epidemiology and ecology

Max S Y Lau 1,3, Glenn Marion 3, George Streftaris 1,2, Gavin J Gibson 1,2
PMCID: PMC3928939  PMID: 24522782

Abstract

A cardinal challenge in epidemiological and ecological modelling is to develop effective and easily deployed tools for model assessment. The availability of such methods would greatly improve understanding, prediction and management of disease and ecosystems. Conventional Bayesian model assessment tools such as Bayes factors and the deviance information criterion (DIC) are natural candidates but suffer from important limitations because of their sensitivity and complexity. Posterior predictive checks, which use summary statistics of the observed process simulated from competing models, can provide a measure of model fit but appropriate statistics can be difficult to identify. Here, we develop a novel approach for diagnosing mis-specifications of a general spatio-temporal transmission model by embedding classical ideas within a Bayesian analysis. Specifically, by proposing suitably designed non-centred parametrization schemes, we construct latent residuals whose sampling properties are known given the model specification and which can be used to measure overall fit and to elicit evidence of the nature of mis-specifications of spatial and temporal processes included in the model. This model assessment approach can readily be implemented as an addendum to standard estimation algorithms for sampling from the posterior distributions, for example Markov chain Monte Carlo. The proposed methodology is first tested using simulated data and subsequently applied to data describing the spread of Heracleum mantegazzianum (giant hogweed) across Great Britain over a 30-year period. The proposed methods are compared with alternative techniques including posterior predictive checking and the DIC. Results show that the proposed diagnostic tools are effective in assessing competing stochastic spatio-temporal transmission models and may offer improvements in power to detect model mis-specifications. 
Moreover, the latent-residual framework introduced here extends readily to a broad range of ecological and epidemiological models.

Keywords: spatio-temporal model assessment, latent residuals, non-centred parametrization, Bayesian inference

1. Introduction

Stochastic spatio-temporal models are playing an increasingly important role in epidemiological and ecological studies relating to transmission of diseases [1,2], invasion of alien species [3] and population movements in response to climate change [4]. It is well known that the predicted dynamics of such systems can be extremely sensitive to the choice of model, with consequent implications for the design of control strategies [1,5–7], but as yet there is a lack of effective model assessment tools described in the literature. For example, studies of foot and mouth disease have cited the importance of selecting between a long-tailed spatial kernel (see §2) and a localized spatial kernel, but this model-choice problem is far from being resolved [5,8]. Further model-choice problems arise in relation to the parametric form of the distributions of incubation and infectious periods in models of measles [9–11], and in relation to diseases such as smallpox [12] and AIDS [13].

Bayesian model assessment techniques appear appealing [14–16], particularly because many of the above studies use Bayesian techniques for model fitting. However, it is well known that this approach is extremely sensitive to prior assumptions regarding the distribution of parameters within the competing models. Moreover, the computational challenges presented by Bayesian model-choice algorithms can be prohibitive. An alternative approach, based on the deviance information criterion (DIC) [17], is known to be problematic when applied to processes that are only partially observed [18], as in the case of most real-world systems. Furthermore, neither of these approaches leads to a diagnostic tool for assessing the fit of a model in absolute terms; rather they provide an assessment of the relative fit of the competing models. Posterior predictive checks [19], whereby summary statistics of the observed process are compared to their predictive distributions, do provide a measure of absolute fit, but appropriate statistics can be difficult to identify [20].

We address this gap in available methodology by pursuing the following objectives:

  • to develop a statistically sound framework for assessing stochastic spatio-temporal models, which can be readily implemented as an addendum to a Bayesian analysis and which avoids the sensitivity and complexity of Bayesian model assessment;

  • to illustrate how the approach can be targeted to assess particular aspects of a spatio-temporal stochastic model, here principally the choice of spatial kernel; and

  • to demonstrate the effectiveness of the approach using simulated data and to apply it to an ecological dataset describing the spread of an alien species throughout Great Britain.

The approach adopted (see §2) involves representing stochastic spatio-temporal models using appropriately designed non-centred parametrization schemes [21], from which latent-residual processes can be defined. The assessment of fit of a model to a given dataset is then achieved in the Bayesian framework by imputing these residuals, and testing them for compliance with their (known) sampling model using classical tests. The approach has its roots in the framework proposed in [19], and extended in [20,22]. The key innovation in this paper is to design the residual processes so that the resulting tests are sensitive to mis-specification of specific aspects of the model under consideration. In particular, we formulate tests for detecting mis-specification of the spatial transmission kernel as this aspect typically has major implications for control strategies, for example based on culling or removal of susceptibles in the neighbourhood of infected individuals.

2. Model and methods

2.1. Spatio-temporal stochastic model

We consider a broad class of spatio-temporal stochastic models exemplified by the SEIR epidemic model with susceptible (S), exposed (E), infectious (I) and removed (R) compartments. Suppose that we have a spatially distributed population indexed 1, 2, …. Let ξS(t), ξE(t), ξI(t) and ξR(t) denote the sets of indices of individuals who are in classes S, E, I and R, respectively, at time t, and let S(t), E(t), I(t) and R(t) be the respective numbers of individuals in these classes at time t. Then individual j ∈ ξS(t) becomes exposed during [t, t + dt) with probability

$$\Bigl\{\alpha + \beta \sum_{i \in \xi_I(t)} K(d_{ij}, \kappa)\Bigr\}\,dt + o(dt), \qquad (2.1)$$

where α represents a primary infection rate and β is a contact parameter. The term K(dij, κ) characterizes the dependence of the infectious challenge from infective i to j as a function of distance dij and is known as the spatial kernel function. Following exposure, the random times spent by individuals in classes E and I are modelled using a suitable distribution such as a Gamma or a Weibull distribution [12,23,24]. Specifically, we use a Gamma(μ, σ²) parametrized by the mean, μ, and variance, σ², for the random time x spent in class E with density function

$$f(x) = \frac{x^{a-1} e^{-x/b}}{\Gamma(a)\, b^{a}}, \quad x > 0, \qquad a = \mu^{2}/\sigma^{2},\; b = \sigma^{2}/\mu. \qquad (2.2)$$

For the random time x spent in class I, we use a Weibull(γ, η) parametrized by the shape and scale with density function

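$$f(x) = \frac{\gamma}{\eta}\Bigl(\frac{x}{\eta}\Bigr)^{\gamma-1} \exp\bigl\{-(x/\eta)^{\gamma}\bigr\}, \quad x > 0. \qquad (2.3)$$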

All sojourn times are assumed independent of each other given the model parameters.
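As a minimal sketch of the sojourn-time distributions described above, the following Python snippet converts a (mean, variance) pair into the (shape, scale) pair that NumPy's Gamma sampler expects, and draws Weibull times by rescaling NumPy's unit-scale Weibull variates. The helper name `gamma_shape_scale`, the seed and the sample size are illustrative assumptions; the values Gamma(5, 2.5) and Weibull(2, 2) are those quoted for the simulated example in §3.

```python
import numpy as np

def gamma_shape_scale(mean, var):
    """Convert a (mean, variance) parametrization to (shape, scale)."""
    return mean**2 / var, var / mean

rng = np.random.default_rng(1)  # seed chosen arbitrarily for illustration

# Time in class E: Gamma with mean 5 and variance 2.5.
shape, scale = gamma_shape_scale(5.0, 2.5)
latent = rng.gamma(shape, scale, size=100_000)

# Time in class I: Weibull with shape 2 and scale 2.
# rng.weibull draws with unit scale, so multiply by the scale.
infectious = 2.0 * rng.weibull(2.0, size=100_000)
```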

2.2. Latent residuals

Let Z denote the complete set of data (that is the time, nature and affected individual for all transitions between states) describing an epidemic generated randomly from the above model parametrized by θ. Then, as long as the sampling properties of the stochastic model described by equation (2.1) are preserved, we can consider Z to be generated in non-unique ways. In particular, we consider Z as a deterministic function hθ(·) of a random vector r, where the components of the latter are generated as a random sample from a Uniform(0, 1) distribution. That is

$$Z = h_{\theta}(r). \qquad (2.4)$$

This representation is essentially a functional model in the sense of [25] and is an illustration of the concept of generalized residuals proposed in [26]. Note that the selection of a residual process r and a function hθ(·) which together specify the model given by equation (2.1) can be effected in a multiplicity of ways. In §2.2.2, we consider how this selection may be done in order to facilitate the design of statistical tests based on r that can be sensitive to mis-specification of particular aspects of the model.

The particular construction we exploit involves a process r that can be partitioned into four independent random samples from Uniform(0, 1) and expressed as r = (r1, r2, r3, r4), where each rj, j = 1, 2, 3, 4, is a vector which determines events relating to a different aspect of the process. The process r1 defines a set of population-level thresholds from which the time of each subsequent exposure can be determined by considering the integrated infectious challenge. For the kth exposure (ordered temporally), r3k and r4k specify the quantile of the random sojourn times in class E and I, respectively. The residuals r2 determine the particular infectious contacts that generate each exposure. We now describe concisely how the epidemic process can be constructed through the residuals r1, r2 and r3. A full description of the residuals can be found in the electronic supplementary material.

2.2.1. Exposure time residuals

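We refer to the residuals r1 as exposure time residuals (ETRs). Starting from the (k − 1)th exposure event, we define the accumulated infectious challenge in the population by time t as

$$q_k(t) = \int_{t_{k-1}}^{t} \sum_{j \in \xi_S(u)} \Bigl\{\alpha + \beta \sum_{i \in \xi_I(u)} K(d_{ij}, \kappa)\Bigr\}\,du,$$

where tk−1 is the time of the (k − 1)th exposure event. The time of the kth exposure is then determined from

$$t_k = \inf\bigl\{t > t_{k-1} : 1 - \exp\{-q_k(t)\} \ge r_{1k}\bigr\}. \qquad (2.5)$$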
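The threshold construction for exposure times can be illustrated with a small Python sketch. It rests on two assumptions that are not taken from the text: that the total infectious challenge in the population stays constant at `total_rate` between consecutive events, and that the uniform residual is matched to the quantity 1 − exp(−accumulated challenge). Both function names are hypothetical.

```python
import math

def exposure_time_from_residual(r, t_prev, total_rate):
    """Time at which 1 - exp(-total_rate * (t - t_prev)) first reaches r.

    Assumes the population-level challenge is constant after t_prev.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("residual must lie in [0, 1)")
    return t_prev - math.log1p(-r) / total_rate

def residual_from_exposure_time(t, t_prev, total_rate):
    """Inverse map: recover the residual implied by an exposure time."""
    return -math.expm1(-total_rate * (t - t_prev))

# Round trip: a residual maps to a time and back to the same residual.
t_k = exposure_time_from_residual(0.5, t_prev=10.0, total_rate=2.0)
r_back = residual_from_exposure_time(t_k, t_prev=10.0, total_rate=2.0)
```

In a full simulation the challenge changes whenever an individual enters or leaves class I, so the integral would have to be accumulated piecewise between such events.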

2.2.2. Infection-link residuals

We refer to the residuals r2 as infection-link residuals (ILRs). Given that the kth exposure event occurs during (tk, tk + dt), and given the other transitions that have occurred prior to tk, the probability that the respective contact is between susceptible j ∈ ξS(tk) and infective i ∈ ξI(tk) is given by

$$p_{ij} = \frac{\beta K(d_{ij}, \kappa)}{\sum_{j' \in \xi_S(t_k)} \bigl\{\alpha + \beta \sum_{i' \in \xi_I(t_k)} K(d_{i'j'}, \kappa)\bigr\}}. \qquad (2.6)$$

Note that the primary infection process can be accommodated by adding a notional and permanently infectious individual which presents a challenge α to every susceptible. To generate the particular infection link from the residual, r2k, we arrange the pij in ascending order p(1), … , p(m), where m = S(tk)(I(tk) + 1) is the total number of ‘active’ links. The link responsible for the kth exposure is then determined from

$$s' = \min\Bigl\{s : \sum_{l=1}^{s} p_{(l)} \ge r_{2k}\Bigr\}, \qquad (2.7)$$

so that individual j becomes exposed due to contact with individual i, and pij = p(s′). The inclusion of the ordering operation in this process is motivated by our aim of designing tests that may be sensitive to mis-specification of the spatial kernel function in the model equation (2.1). Suppose that the modelled kernel function deviates from reality in a systematic way, for example by declining too rapidly or too slowly with distance. Then a heuristic argument (see the electronic supplementary material, Section 1.4) suggests that we may see a correspondingly systematic deviation from the U(0, 1) distribution when the residuals r2 are imputed (as described in §2.3.1) from observations, with a concentration, or scarcity, of residuals at the extremes, so that model mis-specification may be readily detected using standard tests of the fit of the imputed r2 to the uniform distribution.
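The ordering step can be sketched in Python as follows. This is an illustration under the assumption that the link is the first one, in ascending order of probability, at which the cumulative sum reaches the residual; the function name `select_link` and the toy probabilities are hypothetical.

```python
import numpy as np

def select_link(r, link_probs):
    """Pick a link from a Uniform(0, 1) residual r.

    Links are sorted by ascending probability; the chosen link is the
    first whose cumulative probability is >= r. Returns the index of
    the link in the original array and its 1-based rank.
    """
    p = np.asarray(link_probs, dtype=float)
    order = np.argsort(p, kind="stable")
    cum = np.cumsum(p[order])
    s = int(np.searchsorted(cum, r, side="left"))
    s = min(s, len(cum) - 1)  # guard against rounding in the last sum
    return int(order[s]), s + 1

# Toy example with three active links.
probs = [0.5, 0.1, 0.4]
link_low, rank_low = select_link(0.05, probs)
link_mid, rank_mid = select_link(0.30, probs)
link_high, rank_high = select_link(0.90, probs)
```

With this ordering, small residuals correspond to improbable links and large residuals to probable ones, which is why a systematically wrong kernel would be expected to push imputed residuals towards or away from the ends of the unit interval.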

2.2.3. Latent time residuals

We refer to the residuals r3 as latent time residuals (LTRs). For the kth exposure, define the accumulated pressure of becoming infectious by time t as

$$Q_k(t) = \int_{0}^{t - t_k} \frac{f(y)}{1 - F(y)}\,dy, \qquad (2.8)$$

where f(y) and F(y) are the density and cumulative distribution function of the latent period, respectively. The time of becoming infectious is then determined from

$$\inf\bigl\{t > t_k : 1 - \exp\{-Q_k(t)\} \ge r_{3k}\bigr\}. \qquad (2.9)$$

We remark that the time of recovery can be determined similarly by using r4k and an appropriate sojourn time distribution in class I.
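Because these residuals act as quantiles of the sojourn-time distribution, the map between a residual and a sojourn time is the distribution function and its inverse. The sketch below uses the Weibull infectious period, whose distribution function has a closed form, so only the standard library is needed; the function names are illustrative, and treating the residual as the value of the distribution function at the sojourn time is an assumption about the convention.

```python
import math

def weibull_cdf(x, shape, scale):
    """F(x) = 1 - exp(-(x / scale) ** shape) for x >= 0."""
    return -math.expm1(-((x / scale) ** shape))

def weibull_quantile(r, shape, scale):
    """Sojourn time whose distribution-function value equals r."""
    if not 0.0 <= r < 1.0:
        raise ValueError("residual must lie in [0, 1)")
    return scale * (-math.log1p(-r)) ** (1.0 / shape)

# A residual determines the time spent in class I, and observing that
# time lets the residual be recovered exactly.
time_in_I = weibull_quantile(0.25, shape=2.0, scale=2.0)
residual = weibull_cdf(time_in_I, shape=2.0, scale=2.0)
```

The same pattern would apply to the Gamma latent period, with a numerical routine for the Gamma distribution function in place of the closed form.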

2.3. Bayesian inference and model assessment

It is now standard practice to conduct Bayesian analyses of partially observed epidemics using the process of data augmentation supported by computational techniques such as Markov chain Monte Carlo (MCMC) methods [5,20,27,28]. Given partial data y, these approaches involve simulating from the joint posterior distribution π(θ, z | y), where z represents the complete epidemic data as above. As applied to fit models in this paper, this approach is described more fully in the electronic supplementary material. As the complete epidemic z is reconstructed, or ‘imputed’, it naturally lends itself to the residual-based testing methods now described.

Given a random draw (θ′, z′) from π(θ, z | y), it is generally straightforward to invert equation (2.4) to impute the corresponding residual r′ by sampling it uniformly from the set {r : hθ′(r) = z′} (see §2.3.1), the set of residuals mapped to z′ by hθ′(·). Therefore, a sample from the posterior distribution π(r | y) can easily be generated with a minor insert to an existing algorithm. On applying a classical test for consistency with the uniform distribution to the r′ (specifically, we use an Anderson–Darling hypothesis test [29]; see also the electronic supplementary material), a posterior distribution of p-values, π(P(r) | y), is generated, from which evidence against the modelling assumptions can be discerned. In Bayesian parlance, we note that the pair (θ, r) represents a non-centred parametrization.

2.3.1. Imputation of infection-link residuals

The imputation of r1 and r3 given z is straightforward by inverting the procedure specified by equation (2.5) and equation (2.9), respectively. Imputation of r2 is achieved by inverting equation (2.7) but, as the infection link is discrete and the space of residuals continuous, the imputation process warrants description here. The particular infection link for the kth exposure event is randomly chosen from the links between the corresponding exposed individual k and i ∈ ξI(tk) according to probabilities pik defined in equation (2.6). The ranking of this particular infection link, s′, is then determined among all links between j ∈ ξS(tk) and infective i ∈ ξI(tk). Finally, the residual r2k is imputed as a random draw

$$r_{2k} \sim U\Bigl(\sum_{l=1}^{s'-1} p_{(l)},\; \sum_{l=1}^{s'} p_{(l)}\Bigr). \qquad (2.10)$$
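The inversion from a discrete link to a continuous residual can be sketched as below. The sketch assumes the link has already been chosen, that links are ranked in ascending order of probability, and that the residual is drawn uniformly between the cumulative probabilities on either side of the chosen link; `impute_link_residual` is a hypothetical name.

```python
import numpy as np

def impute_link_residual(link_index, link_probs, rng):
    """Draw a residual uniformly on the interval that maps to a link.

    link_index refers to the position of the chosen link in link_probs.
    """
    p = np.asarray(link_probs, dtype=float)
    order = np.argsort(p, kind="stable")
    # cum[s] is the total probability of the s lowest-ranked links.
    cum = np.concatenate(([0.0], np.cumsum(p[order])))
    s = int(np.nonzero(order == link_index)[0][0])  # 0-based rank
    return float(rng.uniform(cum[s], cum[s + 1]))

rng = np.random.default_rng(7)  # arbitrary seed
probs = [0.5, 0.1, 0.4]
# Link 2 has rank 2, so its residual must fall between 0.1 and 0.5.
draws = [impute_link_residual(2, probs, rng) for _ in range(1000)]
```

If the fitted model is correct and links are chosen with these probabilities, the imputed residuals across events should look like a Uniform(0, 1) sample, which is what the subsequent tests examine.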

Bayesian residuals have been used in other contexts [30]. In [20], it was shown that Bayesian latent residuals based on Sellke thresholds [31] could be used to assess the fit of spatio-temporal models. However, the specific approach is problematic when the epidemic is small as thresholds must be imputed even for uninfected individuals. By contrast, the construction proposed here requires residuals to be imputed only for each infection event and avoids this shortcoming. Moreover, as the components of the residual process r each relate to a different aspect of the stochastic model, it is plausible that testing for mis-specification of a given aspect may be best achieved by considering only the relevant component of r. In particular, mis-specification of a spatial kernel or the latent period distribution may be assessed by examining the posterior samples of r2 (ILR) and r3 (LTR), respectively. We stress again the importance for the detection of mis-specified spatial kernels of the ordering operation (see §2.2.2) in the construction of the ILR, which is included with the expectation that it leads to systematic, detectable and interpretable deviations from U(0, 1) in the imputed residuals. This issue is discussed further in §3.1 and the electronic supplementary material, Section 1.4. As described in §§3 and 4, this hoped-for sensitivity is indeed achieved.

2.4. Interpretation of latent-residual tests

Posterior distributions of p-values arising from a classical test applied to a latent process have been exploited in [20,22,32]. For completeness, we include some comments on the statistical interpretation of such distributions in the Bayesian context. To the Bayesian observer of data y, π(P(r) | y) represents their posterior belief regarding the p-value that a classical observer of r would compute. Should this distribution be concentrated on small values, the Bayesian would infer that the classical observer may reject the hypothesis that r was generated as a random sample from a U(0, 1) distribution. The latter hypothesis is a key assumption for the functional-model representation given in equation (2.4) so that the classical observer would likewise question the validity of this model. Therefore, the Bayesian observer can extract from π(P(r) | y) summaries, for example π(P(r) < 0.05 | y) (as used here), and interpret them as measures of evidence against their model assumptions.
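One way to approximate the summary described above, the posterior probability that the test rejects at the 5% level, is sketched below. The sketch is not the authors' implementation: it computes the Anderson–Darling statistic for a fully specified U(0, 1) null directly and compares it with the standard asymptotic 5% critical value of 2.492 rather than computing p-values. The function names and the synthetic residual sets are assumptions for illustration.

```python
import numpy as np

AD_CRIT_5PCT = 2.492  # asymptotic 5% critical value, fully specified null

def anderson_darling_uniform(u):
    """Anderson-Darling statistic for H0: u is a sample from U(0, 1)."""
    u = np.sort(np.clip(np.asarray(u, dtype=float), 1e-12, 1 - 1e-12))
    n = u.size
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))

def rejection_fraction(residual_draws):
    """Fraction of posterior residual sets rejected at the 5% level."""
    stats = np.array([anderson_darling_uniform(u) for u in residual_draws])
    return float(np.mean(stats > AD_CRIT_5PCT))

rng = np.random.default_rng(0)  # arbitrary seed
# Stand-ins for imputed residual sets: 200 "posterior draws" of 200 values.
well_specified = rng.uniform(size=(200, 200))
# Beta(0.5, 0.5) piles mass at both ends of the unit interval.
mis_specified = rng.beta(0.5, 0.5, size=(200, 200))

frac_ok = rejection_fraction(well_specified)
frac_bad = rejection_fraction(mis_specified)
```

In an actual analysis each row would be one set of residuals imputed from a single MCMC draw, so the rows would be dependent through the data rather than independent as in this toy example.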

3. Simulated example

To test the methodology, we apply it to analyse spatio-temporal epidemics simulated in a population of size N = 1000, whose locations are sampled independently from a uniform distribution over a square region, between times t = 0 and t = tmax = 50. We assume that the entire population is susceptible at time 0, that the epidemic evolves according to equation (2.1) with α = 0.001, β = 3, Inline graphic and that the sojourn times in classes E and I follow Gamma(5, 2.5) and Weibull(2, 2) distributions, respectively. The observations y constitute only the precise times and locations of transitions from E to I and from I to R that occur during the observation period. Figure 1 illustrates the spatio-temporal progression of a typical realization of y.

Figure 1. An illustration of (a subset of) the observed data y in the simulation (replicate 1) in the 2000 × 2000 square area in the form of a sequence of ‘snapshots’ of the system state at particular times. (a–e) Black and grey dots represent the individuals in class I and R, respectively, at times t = 10, 15, 20, 25, 30. It is assumed that the locations of all other individuals are known but these are not shown in the interests of clarity.

To assess our model-testing framework, we fit to the simulated data y a model with the correct structure (Case A), and three further models in which the spatial kernels (Cases B and C) and the latent period distribution (Case D) have been mis-specified, respectively. Specifically, we consider

  • Case A: the spatial kernel used in the simulation, and the latent period is distributed as Gamma(μ, σ²);

  • Case B: a mis-specified (Cauchy-type) spatial kernel, and the latent period is distributed as Gamma(μ, σ²);

  • Case C: a mis-specified (power-law) spatial kernel, and the latent period is distributed as Gamma(μ, σ²); and

  • Case D: the spatial kernel used in the simulation, and the latent period is distributed as Exp(μ).

A Weibull infectious period is fitted in all cases. Uniform priors, which should be constrained to bounded regions to ensure a proper posterior distribution, are specified for all model parameters estimated in the following analyses.

In each case, we use standard MCMC and data augmentation to generate a sample from π(θ, z | y) from which (see §2) we impute posterior samples of the ILRs (r2) and of the LTRs (r3). In addition, we impute posterior samples of r1. The Anderson–Darling test for consistency with the uniform distribution is applied to each sample of the residuals. Posterior distributions of the model parameters are described in detail in the electronic supplementary material.

Table 1 shows the values of π(P(rj) < 0.05 | y), j = 1, 2, 3, from three independently simulated replicates, y, of the epidemic. From the values for r2 and r3, it appears that these posterior summaries systematically give evidence against the model when the spatial kernel and the latent period have been mis-specified, respectively. On the other hand, the value for r1 suggests no evidence against the model specifications in Cases B, C and D. Note that, for ease of comparison, we only present the value of π(P(rj) < 0.05 | y) for relevant cases. Values in cases not presented range from 3 to 6% and, therefore, suggest no evidence against the respective model specification.

Table 1. Values of π(P(rj) < 0.05 | y), j = 1, 2, 3, estimated from 1500 posterior samples of the corresponding components of r for three replicate epidemics simulated from the specified model and analysed using four different model assumptions (Case A, the correct model structure; Case B, a mis-specified (Cauchy-type) spatial kernel; Case C, a mis-specified (power-law) spatial kernel; Case D, a mis-specified latent period distribution). All values are percentages.

              r1 (ETR)                  r2 (ILR)                  r3 (LTR)
              Case B  Case C  Case D    Case A  Case B  Case C    Case A  Case D
replicate 1   8       5       3         6       100     80        4       99
replicate 2   5       5       4         6       100     75        4       97
replicate 3   6       6       4         5       100     76        5       100

3.1. Diagnosing model mis-specification

We now illustrate the insights our approach offers for understanding the causes of model inadequacy. Specifically, having observed considerable evidence against a model from the measure π(P(r2) < 0.05 | y), we show that the pattern of residuals r2 can suggest the manner in which the fitted model may be deficient. Consider two (symmetric) scenarios. In Scenario I, the epidemic is simulated from a kernel K(d, κ2) = d^(−2.8) and fitted with a kernel K(d, κ1) = exp(−κ1d); in Scenario II, the epidemic is simulated from a kernel K(d, κ1) = exp(−0.03d) and fitted with a kernel K(d, κ2) = d^(−κ2). Under the assumption that the fitted model is correct, the imputed ILR should resemble samples from U(0, 1). To highlight any systematic deviations from this null hypothesis, figure 2 presents the histogram formed from the subset of the ILR processes that produce small p-values (<0.05), revealing a symmetry between the two scenarios. Scenario I and Scenario II, respectively, lead to a concentration or a scarcity of residuals at the extremes of the unit interval. This symmetry suggests that the nature of the incompatibility of the spatial kernel may be diagnosed from systematically different deviations of the distribution of the ILR from U(0, 1). This is discussed in more detail in the electronic supplementary material, Section 1.4.

Figure 2. The distributions of a subset of imputed r2 whose P(r2) < 0.05 under two scenarios. (a) Scenario I and (b) Scenario II.

3.2. Comparison with common Bayesian model checking techniques

One common tool for model checking is the DIC [17]. The model with smallest DIC corresponds to the best model and, conventionally, models whose DIC exceeds that of the best model by more than 10 units are considered to display substantial evidence of poor fit. A key limitation of this approach is that it is known to be problematic when applied to processes that are only partially observed, as in the case of most real-world systems, where the DIC cannot be uniquely defined [18]. Following [18], we compute two versions of DIC,

$$\mathrm{DIC}_1 = -4\,E_{\theta, X}\bigl[\log f(y, X \mid \theta) \mid y\bigr] + 2\,E_{X}\bigl[\log f(y, X \mid \hat{\theta}(y, X)) \mid y\bigr] \qquad (3.1)$$

and

$$\mathrm{DIC}_2 = -4\,E_{\theta, X}\bigl[\log f(y \mid X, \theta) \mid y\bigr] + 2\,E_{X}\bigl[\log f(y \mid X, \hat{\theta}(y, X)) \mid y\bigr], \qquad (3.2)$$

where X and y represent the unobserved and observed data, respectively, and θ̂(y, X) is often estimated by posterior point estimates, for example the posterior mean, which is used here (note that DIC1 and DIC2 are referred to as DIC4 and DIC8 in [18]). The quantities f(y, X | θ) and f(y | X, θ) represent contributions to the likelihood from both the observation model and the process model in the first case, and the observation model alone in the second. Note that calculation of each version of the DIC requires expectations of these quantities which can be estimated using MCMC techniques. The main difference between DIC1 and DIC2 is that the first takes the unobserved data into account. However, there is no absolute theoretical justification for a preference of one definition over another.

Table 2 shows that, although both versions of DIC can differentiate the relative goodness-of-fit between Case A and Case C as well as that between Case A and Case D, DIC2 misleadingly suggests that the fit for Case B is better than that for Case A. Note that the DIC is not a direct measure of model adequacy and only measures the relative goodness-of-fit between two models. Moreover, as shown in table 2, the ranking of models can also vary between different versions of DIC.

Table 2. DIC computed for Cases A–D (replicate 1).

        Case A      Case B      Case C      Case D
DIC1    10 357.52   11 561.90   10 542.75   11 525.69
DIC2    5754.594    4982.372    5937.58     6897.95

A further approach to model checking is to compare the posterior predictive distribution of summary statistics with their observed values [20,28]. In the electronic supplementary material, Section 6, we consider posterior predictive checks based on common spatial autocorrelation measures including Moran's I and Geary's c indices [33], where application to simulated data shows that these measures are insensitive to the choice of model. Such summary statistics are based only on partially observed data and therefore reflect the behaviour of the competing models averaged over the missing data. By contrast, our method is based on the full posterior distribution of (imputed) unobserved data and other model parameters, which may explain its higher sensitivity to model mis-specification.

We further consider the performance of DIC and posterior predictive checks in an application to British floristic atlas data in §4.

4. Case study: spread of giant hogweed in Great Britain

Invasive alien species represent a major threat to ecosystems and cause significant environmental and financial loss worldwide [34]. Heracleum mantegazzianum (giant hogweed) causes significant problems in Great Britain and has rapidly spread since 1970 [27]. We apply our testing framework to British floristic atlas data which assess the presence of giant hogweed over a square lattice of 10 × 10 km resolution in 1970, 1987 and 2000. In total, 2838 such squares are considered to be habitable for the giant hogweed (see [27]). These are classified as susceptible or colonized at the given survey times according to the absence or presence of giant hogweed in the lattice. These data are well suited to testing our methodology. Detection of the species is relatively easy because of its height (more than 2 m), so that the number of false absences in the dataset should be limited. Moreover, the data give ‘snapshots’ of the distribution at three distinct times (from 1970 to 2000) and over a large region making them particularly suitable for inferring the spatio-temporal transmission mechanism.

We first represent the lattice of square regions as a lattice of points where the position of a point is given by the centre of the square which it represents. Figure 3 shows the snapshots of the spread of giant hogweed in Great Britain taken at three distinct times (1970, 1987 and 2000). In the light of the aggressive nature of giant hogweed, we assume that, once colonized, sites can immediately start to colonize other sites and remain colonized. In the terminology of the epidemic model, we consider, therefore, the E and I classes to be a single class and dispense with the recovery class R from our model. Effectively, we fit an S–I (susceptible–infectious) model to the presence/absence data and use our model assessment methods to compare the goodness-of-fit of several formulations, discussed in detail in the electronic supplementary material. In summary, the models differ in the choice of spatial kernel and with regard to the inclusion of terms quantifying the suitability of each site, j, for the species. Suitability is represented by a measure cj ∈ [0, 1], where cj are taken from an earlier analysis [27] in which an extensive range of covariates including average temperature, altitude and other factors were considered in their estimation. With suitability included, the instantaneous rate at which a susceptible site j becomes colonized equation (2.1) is moderated by a factor cj.

Figure 3. Snapshots of the spread of giant hogweed in Great Britain taken at three distinct times: (a) 1970, (b) 1987 and (c) 2000. Black dots represent the colonized sites.

Full model specifications and posterior estimates of the model parameters are described in the electronic supplementary material. Specifically, we consider three forms of spatial kernel, each with heterogeneous or homogeneous suitabilities, giving rise to six models:

  • Model 1 (M1, kernel A): K(dij, κ1) = exp(−κ1dij), heterogeneous suitabilities, cj;

  • Model 2 (M2, kernel B): K(dij, κ2) = 1/(1 + dij/κ2), heterogeneous suitabilities, cj;

  • Model 3 (M3, kernel C): K(dij, κ3) = dij^(−κ3), heterogeneous suitabilities, cj;

  • Model 4 (M4, kernel A): K(dij, κ1) = exp(−κ1dij), homogeneous suitabilities, cj = 1;

  • Model 5 (M5, kernel B): K(dij, κ2) = 1/(1 + dij/κ2), homogeneous suitabilities, cj = 1; and

  • Model 6 (M6, kernel C): K(dij, κ3) = dij^(−κ3), homogeneous suitabilities, cj = 1.
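The three kernel families listed above can be written as small Python functions to make their tail behaviour easy to compare. This is a sketch only: the function names are illustrative, the power-law kernel is assumed to have a negative exponent so that it declines with distance, and the parameter values in the example are arbitrary rather than fitted estimates.

```python
import math

def kernel_exponential(d, kappa1):
    """Kernel A: exponentially bounded, exp(-kappa1 * d)."""
    return math.exp(-kappa1 * d)

def kernel_cauchy(d, kappa2):
    """Kernel B: Cauchy-form, 1 / (1 + d / kappa2), as written in the text."""
    return 1.0 / (1.0 + d / kappa2)

def kernel_power_law(d, kappa3):
    """Kernel C: power law d ** (-kappa3); requires d > 0."""
    return d ** (-kappa3)

# Relative weight of a site 50 km away versus one 10 km away
# (10 km is the minimum spacing on the lattice; parameters are arbitrary).
ratios = {
    "A": kernel_exponential(50.0, 0.2) / kernel_exponential(10.0, 0.2),
    "B": kernel_cauchy(50.0, 5.0) / kernel_cauchy(10.0, 5.0),
    "C": kernel_power_law(50.0, 3.0) / kernel_power_law(10.0, 3.0),
}
```

With suitabilities included, the challenge to site j computed from any of these kernels would additionally be multiplied by cj.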

These models are fitted to the data using Bayesian methods as described in the electronic supplementary material. For the simple SI formulation, the residual process reduces to r = (r1, r2) and specifies exposure times and infection links. We apply three tests to imputed values of these residuals for each of the models. As with the simulated example, we investigate π(P(r1) < 0.05 | y) and π(P(r2) < 0.05 | y) arising from an Anderson–Darling test applied to the respective subset of residuals. We also consider a combined test, with p-value Inline graphic, based on a test statistic

4.

whose distribution under the modelling assumptions is Inline graphic, and report Inline graphic for each model. Conclusions arising from the various tests are now presented.

4.1. Model assessment and implications for control strategies

From table 3, we first note that, regardless of whether dependence on suitability is included, π(P(r2) < 0.05 | y) and the corresponding combined-test summary are larger for the models with Cauchy-form kernel (kernel B, M2 and M5) than for the respective models with exponentially bounded kernel (kernel A, M1 and M4) or with power-law kernel (kernel C, M3 and M6), suggesting that the Cauchy kernel typically provides a poorer fit. When dependence on suitability is not included (M4, M5 and M6), the fact that π(P(r2) < 0.05 | y) = 0.35 for M4 (kernel A) and 0.54 for M6 (kernel C) suggests there are substantial probabilities that the U(0, 1) hypothesis for the imputed residuals would be rejected by the classical observer and calls these models into question. By contrast, the results for M1, M2 and M3 present no evidence against the model with exponentially bounded kernel (A) and power-law kernel (C), while there remains a substantial posterior probability of rejection (0.82) for the model with Cauchy-form kernel (B). Figure 4 presents samples from π(P(r2) | y) for M1 and M2, highlighting the evidence against kernel B. It is clear from other results in table 3 that the evidence against a given model arises from the ILR residuals r2: the test based on r1 alone presents little evidence against any of the models M1–M6, while the evidence arising from the combined test is typically weaker than that from the tests of r2 alone.

Table 3. Values of π(P(rj) < 0.05 | y), j = 1, 2, and of the corresponding summary for the combined test, estimated from 1500 posterior samples of ILR and ETR computed from the giant hogweed data under different model assumptions regarding the spatial kernel and dependence on suitability of sites. All values are percentages.

                            r1 (ETR)         r2 (ILR)         combined test
spatial kernel              A    B    C      A    B    C      A    B    C
heterogeneous suitability   13   7    11     4    82   4      11   74   10
homogeneous suitability     5    6    3      35   100  54     24   100  28

Figure 4.


Posterior distributions of the p-values from testing the sets of posterior samples of ILR imputed from MCMC chains (1500 samples in each case) when fitting SI models, representing heterogeneous suitability, to the giant hogweed data with (a) kernel A (model M1) and (b) kernel B (model M2).

In summary, we find evidence that the dispersion mechanism for hogweed cannot be represented adequately by the Cauchy dispersal kernel, while no evidence against the exponentially bounded kernel and power-law kernel is found as long as habitat heterogeneity is accommodated. Figure S8 (see electronic supplementary material) shows that the long-tail behaviour of M2 tends to induce a scarcity of residuals at the left end and a concentration of residuals at the right end of the unit interval. Although the ILR residuals Inline graphic were constructed with assessment of spatial kernels in mind, comparison of Inline graphic between models with and without the dependence on suitability (i.e. comparison between M1 and M4 and that between M3 and M6) highlights the potential for the method to detect mis-specification of other aspects of the model. This is not surprising given the key role of Inline graphic and the suitabilities in the construction of the colonization links.

Giant hogweed spreads its seed mostly through wind, water and human activities [35]. Localized dispersal typically involves wind or animal activities. Human activities, such as soil transport and the transport of seeds adhering to car tyres, are mainly responsible for long-distance dispersal. Understanding the relative importance of short- and long-distance dispersal provides valuable insight for devising appropriate control strategies [36,37]. Our results, together with figure 5, suggest that hogweed spreads mainly via a nearest-neighbour mechanism. Given this highly localized dispersal mechanism, and the constraints imposed by the lattice structure of the hogweed data (§4), which forces a minimum distance of 10 km between two sites (in contrast to the more general continuous space of the simulated example), the difference between an exponentially bounded kernel and a power-law kernel becomes negligible (see also figure 5). Hence, the two models display similar goodness-of-fit. This suggests that control measures (for instance, education programmes to increase public awareness of, and participation in, prevention and reporting [38], and field surveys with subsequent eradication measures [35,39]) may be most effectively deployed by focusing implementation on the neighbourhood of known colonizations.

Figure 5.


Estimated spatial kernels from fitting M1–M3 to the giant hogweed data with kernel parameters set to posterior means. Transmissibility is expressed relative to the amplitude of the respective kernel at 10 km (the minimum distance between two sites in the dataset).
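The normalization used in figure 5, in which each kernel is expressed relative to its own amplitude at the 10 km minimum distance, can be sketched as follows. This is only an illustration: the functional forms chosen for the three kernel families, the function names and the parameter values are hypothetical stand-ins, not the paper's definitions or its posterior means.

```python
import math

MIN_DISTANCE_KM = 10.0  # minimum distance between two lattice sites in the dataset


def exponential_kernel(d, alpha):
    """Assumed exponentially bounded form: exp(-alpha * d)."""
    return math.exp(-alpha * d)


def cauchy_kernel(d, alpha):
    """Assumed Cauchy form: 1 / (1 + (d / alpha)^2)."""
    return 1.0 / (1.0 + (d / alpha) ** 2)


def power_law_kernel(d, alpha):
    """Assumed power-law form: d^(-alpha)."""
    return d ** (-alpha)


def relative_transmissibility(kernel, alpha, d, d_ref=MIN_DISTANCE_KM):
    """Kernel value at distance d relative to its value at the reference distance."""
    return kernel(d, alpha) / kernel(d_ref, alpha)


# Hypothetical parameter values, chosen only to show the qualitative contrast.
illustrative = {
    "exponential": (exponential_kernel, 0.3),
    "cauchy": (cauchy_kernel, 5.0),
    "power law": (power_law_kernel, 3.0),
}

for name, (kernel, alpha) in illustrative.items():
    curve = [relative_transmissibility(kernel, alpha, d) for d in (10.0, 20.0, 50.0)]
    print(name, [f"{value:.2e}" for value in curve])
```

Because every curve equals 1 at 10 km by construction, differences between kernels show up only in how quickly the relative transmissibility decays beyond nearest-neighbour distances.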

4.2. Comparison with deviance information criterion and posterior predictive checks

As in §3, we compute two versions of DIC, DIC1 and DIC2 (see equations (3.1) and (3.2)), for M1 and M2 and for the corresponding models M4 and M5, which do not take suitability into account. From table 4, we note that, although both versions of DIC can differentiate the relative goodness-of-fit between M1 and M2, they implausibly rank M4 (exponentially bounded kernel without suitability) as the second-best model (DIC1) or the best model (DIC2).

Table 4.

DIC computed for M1, M2, M4 and M5.

model M1 M2 M4 M5
DIC1 7404.8 7442.9 7422.0 7890.1
DIC2 968.3 1027.4 522.2 1194.8
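The ranking implied by table 4 can be checked directly. The short Python sketch below simply orders the models by each DIC value as reported in the table (smaller is better); the dictionary layout and the helper name `rank_models` are illustrative choices.

```python
# DIC values copied from table 4.
dic_table = {
    "DIC1": {"M1": 7404.8, "M2": 7442.9, "M4": 7422.0, "M5": 7890.1},
    "DIC2": {"M1": 968.3, "M2": 1027.4, "M4": 522.2, "M5": 1194.8},
}


def rank_models(scores):
    """Return model names ordered from smallest (preferred) to largest DIC."""
    return sorted(scores, key=scores.get)


for criterion, scores in dic_table.items():
    print(criterion, rank_models(scores))
```

Both criteria place M1 ahead of M2, but M4 is ranked second by DIC1 and first by DIC2, which is the behaviour questioned in the text.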

In §3 and in the electronic supplementary material, Section 6, we have shown that summary statistics quantifying spatial autocorrelation may be less sensitive to model mis-specification than our latent-residual approach. We therefore focus on more intuitive summary statistics based on the number of colonizations. We also adopt a conservative approach, running forward simulations of the fitted model using point estimates of parameters, here the posterior mean. Moreover, these simulations are conditioned on the colonized sites observed in 1970. We focus on comparing models M1 and M2 (which include dependence on suitability and for which our residual analysis shows a significant difference in goodness-of-fit) and examine the following predictive outcomes: the predictive distribution of the number of colonized sites, at the end of the observation period (2000), within annular regions centred on a given location (see the electronic supplementary material, figure S7, for a representation of the regions) and the numbers of reported colonized sites at the second and third observation times (1987 and 2000). Figure 6 and table 5 compare predicted distributions and the actual observations.

Figure 6.


Distribution of the number of colonized sites within each ring region (see text) at the final observation time as predicted by models (a) M1 and (b) M2 (the shaded area represents the 95% two-sided interval of the predicted number of colonized sites from 1000 simulations) and the observed data (shown by the dotted black line). The displacements are measured from the centre of the ring regions.

Table 5.

Predicted and reported new colonized sites at second and third snapshots (1987 and 2000). The reported numbers are followed by the two-sided 95% credible intervals enclosed in brackets (1000 simulations for each model).

model M1 M2
1987 334 (311, 375) 334 (310, 381)
2000 412 (368, 434) 412 (388, 460)

Figure 6 and table 5 suggest that, as in §3.2, model checks based on these apparently reasonable summary statistics may be insensitive to the choice of model. Figure 5 shows that M1 and M2 represent very different transmission mechanisms, with kernel B exhibiting a strong propensity for long-range transmission. Figure 7 shows that the estimates of transmission rates can differ when different kernels are fitted. Nevertheless, the predictive distributions of the summary statistics appear consistent with the observed values for both M1 and M2.
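The check summarized in table 5 amounts to asking whether each observed count falls inside the central 95% interval of the simulated counts. A minimal Python sketch follows, assuming a simple order-statistic rule for the interval; the helper names are hypothetical, and the intervals themselves are copied from table 5 rather than recomputed from simulations.

```python
import math


def predictive_interval(simulated_counts, level=0.95):
    """Central interval from simulated counts via order statistics (assumed rule)."""
    ordered = sorted(simulated_counts)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


def covers(interval, observed):
    """True when the observed count lies inside the interval (inclusive)."""
    lower, upper = interval
    return lower <= observed <= upper


# Observed counts and 95% intervals as reported in table 5.
table5 = {
    1987: {"observed": 334, "M1": (311, 375), "M2": (310, 381)},
    2000: {"observed": 412, "M1": (368, 434), "M2": (388, 460)},
}

coverage = {
    (year, model): covers(row[model], row["observed"])
    for year, row in table5.items()
    for model in ("M1", "M2")
}
print(coverage)
```

Every entry is covered for both models, so this check alone cannot separate M1 from M2 even though the residual-based tests do.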

Figure 7.


Posterior distributions of transmission rates from a colonized site to a susceptible site from fitting M1 and M2 to the giant hogweed data.

5. Discussion

Earlier work [40] has championed the view that Bayesian and classical reasoning are natural approaches to follow for parameter estimation and model criticism, respectively, and therefore should be used in combination. We have presented a statistical framework that combines classical and Bayesian reasoning in testing for mis-specifications of a spatio-temporal model by investigating the consistency of imputed latent residuals with a known sampling distribution using a classical hypothesis test. In particular, we have shown how such residuals can be tailored so as to be sensitive to mis-specifications of the spatial kernel or the sojourn-time distributions within a spatio-temporal model. Analysis of simulated epidemics and data on the spread of an invasive species in the UK demonstrates how our model-testing framework can be used to diagnose the model fit and how the results can be interpreted in practice. In §3.1, we also discuss how one might perform the complementary diagnosis of the type of incompatibility of a spatial kernel by examining deviations (from the null hypothesis that the fitted model is correct) in the distribution of residuals Inline graphic after observing considerable evidence against the model.

Predicted dynamics of epidemiological and ecological systems can be extremely sensitive to the choice of model, with consequent implications for the design of control strategies [1–7]. For example, choices of temporal distributions influence both epidemic final size and the persistence of disease, and have important implications for devising effective control strategies which target symptomatic subjects and the timing of infectiousness [6,12,13]. There are examples of such effects related to the parametric form of incubation and infectious period distributions in models of measles [9–11], smallpox [12] and AIDS [13]. It is well known that spatial transmission mechanisms are difficult to assess in practice yet have major implications for optimal control strategies. Studies of animal and plant diseases such as foot-and-mouth disease and citrus canker have cited the importance of selecting between a long-tailed spatial kernel and a localized spatial kernel when devising the most appropriate culling strategy [5,8,41]. Therefore, we believe that the methodology presented here, based on ILRs, is a novel and potentially powerful tool for diagnosing mis-specification of a spatial kernel which can provide valuable insights to modellers in practice. Moreover, we remark that the principles introduced here should be readily extendable, allowing the construction of analogous residuals for a wide range of processes included in models in ecology and epidemiology.

We also believe the approach offers several advantages over alternatives. Bayesian model assessment approaches, for example Bayes factors, are known to be sensitive to selection of prior distributions and are challenging computationally [42,43]. Moreover, they allow only relative comparison of competing models, a disadvantage shared by information criteria measures, for example the DIC [17]. The latter is also problematic when dealing with partially observed processes [18], the norm in epidemiological studies, where the DIC is not uniquely defined. By contrast, the tests based on latent residuals offer an assessment of model discrepancy in absolute terms. Posterior predictive checks that use only partially observed data may be insensitive to the model choice (as shown in §§3 and 4.2) even if summary statistics are appropriately chosen. A key feature of the proposed tests is that they can be easily embedded within any Bayesian analysis of a spatio-temporal system that makes use of data augmentation. Also, in contrast to other approaches, for example posterior predictive checks, our method uses the full posterior distribution of unobserved data and model parameters, and may offer a higher sensitivity to model mis-specifications. As it is common practice to conduct Bayesian analyses of partially observed epidemics using data augmentation supported by computational techniques, for example MCMC methods [5,20,27,28], the framework represents a potentially valuable addendum to the model-testing toolkit used in epidemiological and ecological studies.

Acknowledgements

The distribution data were obtained from the National Biodiversity Network Gateway and are compiled from numerous sources including the Countryside Council for Wales, Bristol Regional Environmental Records Centre, the Scottish Wildlife Trust and Scottish Borders Biological Records Centre (for details see www.nbn.org.uk/). We also thank Dr Stephen Catterall for helping with the hogweed data and providing suitability estimates used in §4. Finally, we are thankful to the Editor and three referees for their constructive reviews of the manuscript, which have helped us to introduce some further insights into the paper.

Funding statement

We acknowledge financial support from Heriot–Watt University and the Scottish Government's Rural and Environment Science and Analytical Services Division (RESAS).

References

1. Ster IC, Singh BK, Ferguson NM. 2009. Epidemiological inference for partially observed epidemics: the example of the 2001 foot and mouth epidemic in Great Britain. Epidemics 1, 21–34. (doi:10.1016/j.epidem.2008.09.001)
2. Boender GJ, Meester R, Gies E, De Jong M. 2007. The local threshold for geographical spread of infectious diseases between farms. Prev. Vet. Med. 82, 90–101. (doi:10.1016/j.prevetmed.2007.05.016)
3. Cook AR, Marion G, Butler A, Gibson GJ. 2007. Bayesian inference for the spatio-temporal invasion of alien species. Bull. Math. Biol. 69, 2005–2025. (doi:10.1007/s11538-007-9202-4)
4. Walters RJ, Hassall M, Telfer MG, Hewitt GM, Palutikof JP. 2006. Modelling dispersal of a temperate insect in a changing climate. Proc. R. Soc. B 273, 2017–2023. (doi:10.1098/rspb.2006.3542)
5. Ferguson NM, Donnelly CA, Anderson RM. 2001. The foot-and-mouth epidemic in Great Britain: pattern of spread and impact of interventions. Science 292, 1155–1160. (doi:10.1126/science.1061020)
6. Fraser C, Riley S, Anderson RM, Ferguson NM. 2004. Factors that make an infectious disease outbreak controllable. Proc. Natl Acad. Sci. USA 101, 6146–6151. (doi:10.1073/pnas.0307506101)
7. Filipe JAN, Maule MM. 2004. Effects of dispersal mechanisms on spatio-temporal development of epidemics. J. Theor. Biol. 226, 125–141. (doi:10.1016/S0022-5193(03)00278-9)
8. Keeling MJ, Woolhouse MEJ, May RM, Davies G, Grenfell BT. 2003. Modelling vaccination strategies against foot-and-mouth disease. Nature 421, 136–142. (doi:10.1038/nature01343)
9. Ferguson NM, May RM, Anderson RM. 1997. Measles: persistence and synchronicity in disease dynamics. In Spatial ecology: the role of space in population dynamics and interspecific interactions, vol. 30, pp. 137–157. Princeton, NJ: Princeton University Press.
10. Bolker B, Grenfell B. 1995. Space, persistence and dynamics of measles epidemics. Phil. Trans. R. Soc. Lond. B 348, 309–320. (doi:10.1098/rstb.1995.0070)
11. Keeling MJ, Grenfell BT. 1997. Disease extinction and community size: modeling the persistence of measles. Science 275, 65–67. (doi:10.1126/science.275.5296.65)
12. Streftaris G, Gibson GJ. 2004. Bayesian inference for stochastic epidemics in closed populations. Stat. Model. 4, 63–75. (doi:10.1191/1471082X04st065oa)
13. Muñoz A, Sabin C, Phillips A. 1997. The incubation period of AIDS. AIDS 11, S69–S76.
14. Jeffreys H. 1935. Some tests of significance, treated by the theory of probability. Proc. Camb. Phil. Soc. 31, 203–222. (doi:10.1017/S030500410001330X)
15. Jeffreys H. 1961. Theory of probability. Cary, NC: Oxford University Press.
16. Draper D. 1995. Assessment and propagation of model uncertainty. J. R. Stat. Soc. B 57, 45–97.
17. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. 2002. Bayesian measures of model complexity and fit. J. R. Stat. Soc. B 64, 583–639. (doi:10.1111/1467-9868.00353)
18. Celeux G, Forbes F, Robert CP, Titterington DM. 2006. Deviance information criteria for missing data models. Bayesian Anal. 1, 651–673. (doi:10.1214/06-BA122)
19. Meng XL. 1994. Posterior predictive p-values. Ann. Stat. 22, 1142–1160. (doi:10.1214/aos/1176325622)
20. Gibson GJ, Otten W, Filipe JA, Cook A, Marion G, Gilligan CA. 2006. Bayesian estimation for percolation models of disease spread in plant populations. Stat. Comput. 16, 391–402. (doi:10.1007/s11222-006-0019-z)
21. Papaspiliopoulos O, Roberts GO, Sköld M. 2007. A general framework for the parametrization of hierarchical models. Stat. Sci. 22, 59–73. (doi:10.1214/088342307000000014)
22. Streftaris G, Gibson GJ. 2004. Bayesian analysis of experimental epidemics of foot-and-mouth disease. Proc. R. Soc. Lond. B 271, 1111–1118. (doi:10.1098/rspb.2004.2715)
23. Valleron AJ, Boelle PY, Will R, Cesbron JY. 2001. Estimation of epidemic size and incubation time based on age characteristics of vCJD in the United Kingdom. Science 294, 1726–1728. (doi:10.1126/science.1066838)
24. Anderson RM. 1988. The epidemiology of HIV infection: variable incubation plus infectious periods and heterogeneity in sexual activity. J. R. Stat. Soc. B 151, 66–93. (doi:10.2307/2982185)
25. Dawid AP, Stone M. 1982. The functional-model basis of fiducial inference. Ann. Stat. 10, 1054–1067. (doi:10.1214/aos/1176345970)
26. Cox DR, Snell EJ. 1968. A general definition of residuals. J. R. Stat. Soc. B 30, 248–275.
27. Catterall S, Cook AR, Marion G, Butler A, Hulme PE. 2012. Accounting for uncertainty in colonisation times: a novel approach to modelling the spatio-temporal dynamics of alien invasions using distribution data. Ecography 35, 901–911. (doi:10.1111/j.1600-0587.2011.07190.x)
28. Cook AR, Otten W, Marion G, Gibson GJ, Gilligan CA. 2007. Estimation of multiple transmission rates for epidemics in heterogeneous populations. Proc. Natl Acad. Sci. USA 104, 20 392–20 397. (doi:10.1073/pnas.0706461104)
29. Lewis PAW. 1961. Distribution of the Anderson–Darling statistic. Ann. Math. Stat. 32, 1118–1124. (doi:10.1214/aoms/1177704850)
30. Albert JH, Chib S. 1993. Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88, 669–679. (doi:10.1080/01621459.1993.10476321)
31. Sellke T. 1983. On the asymptotic distribution of the size of a stochastic epidemic. J. Appl. Probab. 20, 390–394. (doi:10.2307/3213811)
32. Streftaris G, Gibson GJ. 2012. Non-exponential tolerance to infection in epidemic systems: modeling, inference, and assessment. Biostatistics 13, 580–593. (doi:10.1093/biostatistics/kxs011)
33. Getis A. 1991. Spatial interaction and spatial autocorrelation: a cross product approach. Environ. Plan. A 23, 1269–1277. (doi:10.1068/a231269)
34. Pimentel D, Zuniga R, Morrison D. 2005. Update on the environmental and economic costs associated with alien-invasive species in the United States. Ecol. Econ. 52, 273–288. (doi:10.1016/j.ecolecon.2004.10.002)
35. Pyšek P. 2007. Ecology and management of giant hogweed (Heracleum mantegazzianum). Wallingford, UK: CABI.
36. Dawe NK, White ER. 1979. Giant cow parsnip (Heracleum mantegazzianum) on Vancouver Island, British Columbia. Can. Field Nat. 93, 82–83.
37. Pergl J, Müllerovà J, Perglovà I, Herben I, Pyšek P. 2011. The role of long-distance seed dispersal in the local population dynamics of an invasive plant species. Divers. Distrib. 17, 725–738. (doi:10.1111/j.1472-4642.2011.00771.x)
38. Bhowmik PC. 2005. Characteristics, significance, and human dimension of global invasive weeds. In Invasive plants: ecological and agricultural aspects, pp. 251–268. Basel, Switzerland: Birkhäuser.
39. Sampson C, Waal LD, Child LE, Wade PM, Brock JH. 1994. Cost and impact of current control methods used against Heracleum mantegazzianum (giant hogweed) and the case for instigating a biological control programme. In Ecology and management of invasive riverside plants, pp. 55–65. Chichester, UK: John Wiley and Sons Ltd.
40. Box GEP. 1980. Sampling and Bayes’ inference in scientific modelling and robustness. J. R. Stat. Soc. A 143, 383–430. (doi:10.2307/2982063)
41. Gottwald TR, Graham JH, Schubert TS. 2002. Citrus canker: the pathogen and its impact. Plant Health Prog. (doi:10.1094/PHP-2002-0812-01-RV)
42. Kass RE, Raftery AE. 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773–795. (doi:10.1080/01621459.1995.10476572)
43. Han C, Carlin BP. 2001. Markov chain Monte Carlo methods for computing Bayes factors. J. Am. Stat. Assoc. 96, 1122–1132. (doi:10.1198/016214501753208780)
