Abstract
In recent years there has been growing availability of individual-level spatio-temporal disease data, particularly due to the use of modern communication devices with GPS tracking functionality. These detailed data have proven useful for inferring disease transmission at a more refined level than was previously possible. However, there remains a lack of statistically sound frameworks to model the underlying transmission dynamics in a mechanistic manner. Such a development is particularly crucial for enabling a general epidemic predictive framework at the individual level. In this paper we propose a new statistical framework for mechanistically modelling individual-to-individual disease transmission in a landscape with heterogeneous population density. Our methodology is first tested using simulated datasets, validating our inferential machinery. The methodology is subsequently applied to data that describe a regional Ebola outbreak in Western Africa (2014-2015). Our results show that the methods are able to obtain estimates of key epidemiological parameters that are broadly consistent with the literature, while revealing a significantly shorter distance of transmission. More importantly, in contrast to existing approaches, we are able to perform a more general model prediction that takes into account the susceptible population. Finally, our results show that, given reasonable scenarios, the framework can be an effective surrogate for susceptible-explicit individual models which are often computationally challenging.
Author summary
Availability of individual-level, spatio-temporal disease data (e.g. GPS locations of infected individuals) has been increasing in recent years, primarily due to the increased use of modern communication devices such as mobile phones. Such data create invaluable opportunities for researchers to study disease transmission on a more refined individual-to-individual level, facilitating the design of potentially more effective control measures. However, the growing availability of such precise data has not been accompanied by the development of statistically sound mechanistic frameworks. Developing such frameworks is an essential step for systematically extracting maximal information from data, in particular, evaluating the efficacy of individually-targeted control strategies and enabling forward epidemic prediction at the individual level. In this paper we develop a novel statistical framework that overcomes a few key limitations of existing approaches, providing machinery that can be used to infer the history of partially observed outbreaks and, more importantly, to produce a more comprehensive epidemic prediction. Our framework may also be a good surrogate for more computationally challenging individual-based models.
Introduction
Epidemiological data collected by traditional public health surveillance often contain relatively coarse spatial and temporal information on infected individuals. In recent years, the amount and resolution of the spatio-temporal data have increased vastly due to the advent of ‘digital epidemiology’ along with the increased use of modern communication devices [1], particularly through the use of mobile phones which drastically improves the tracking of human contacts [2–4]. Such data provide unprecedented opportunities for dissecting disease spread at a more localized, individual-to-individual level. The recent West Africa Ebola outbreak (Fig 1) well demonstrated the increasing availability of such data, and, in particular, the GPS location data collected during the outbreak have been shown to be useful in identifying superspreaders and quantifying the impact of superspreading during the outbreak [4].
However, the growing availability of these more precise spatio-temporal data has not been accompanied by development of statistically sound mechanistic frameworks for modelling the underlying individual-to-individual transmission process. Developing such methods is an essential step for systematically extracting maximal information from such data, in particular, evaluating the efficacy of individually-targeted control strategies and enabling forward epidemic prediction at the individual level.
Conventional compartmental models (e.g. SEIR) require an explicit account of the complete contact process which specifies both the successful contacts (i.e. the infected in class E), and, more challengingly [6], the unsuccessful contacts (i.e. who has remained susceptible in class S). Representing unsuccessful contacts at the individual level is computationally challenging due to the need to build an explicit contact network among essentially all individuals in the population. One may consider adapting mechanistic compartmental disease models to accommodate these data. Important examples of these approaches include: 1) a patch-level approach that aggregates data points within pre-defined grids/patches [7–9], and 2) a transmission-network-based approach which is essentially a partial-likelihood approach that considers only the infected individuals and ignores the unsuccessful contacts [4, 10–12]. Fig 2 presents a schematic illustration of these two approaches. Although the patch-level approach conforms to the desirable SEIR-type mechanistic framework, in which both the successful infectious contacts (E) and unsuccessful contacts (S) are represented, at least on the patch level, the aggregation of data points can be arbitrary and it inevitably degrades the data resolution necessary for inferring, for example, the individual-to-individual transmission. The transmission-network-based (partial-likelihood) approach, on the other hand, preserves the ‘point nature’ of the data but fails to conform to the mechanistic framework by completely ignoring the general (susceptible) population and its relation to the infected class. Although the latter has been shown to be useful for sampling the relations among infections (e.g. the transmission tree), it is inadequate for the purposes of complete forward epidemic prediction which needs to take into account the general (susceptible) population [4].
Spatio-temporal point processes (see an introduction in [5]) may also appear to be natural candidates for individual spatial data. However, it is not straightforward to integrate them with a mechanistic compartmental disease model such as the SEIR (Susceptible-Exposed-Infectious-Recovered) model. In particular, it is difficult to formulate conditional intensities for a spatio-temporal point process directly for the observations in a way that respects the mechanistic modelling assumptions. If one observes the transitions made by individuals from the E to I classes and from the I to R classes then it may be natural to consider a marked spatio-temporal point process where points represent the transitions from E to I and marks quantify the subsequent sojourn time in the I class. Calculation of intensities conditional on the observation history, necessary for the construction of the likelihood, is difficult because the transitions from S to E are unobserved. Other approaches which do not utilize the full likelihood (e.g., the contact-type partial-likelihood approach [13, 14] and the likelihood-free ABC approach [15]) may also be pursued. There have also been advances towards more efficient parameter inference for certain classes of spatial models; for example, [16] proposes a double Metropolis-Hastings sampler for certain spatial models with intractable normalizing constants. Nevertheless, there is still a need to develop new statistical frameworks which allow for both full-likelihood-based model inference and, importantly, a statistically and biologically interpretable forward-prediction machinery that naturally integrates with mechanistic disease models and the general susceptible population.
In this paper, we develop a framework that aims to accommodate individual-level spatio-temporal data, both in a mechanistic manner and accounting for the general (susceptible) population. The approach taken can be viewed as being rooted in spatio-temporal point processes. In essence, we view the process of transmission (transitions from S to E) as a marked spatio-temporal point process where the marks are bivariate and specify the subsequent sojourn times in the E and I classes for the respective exposed individual. For this formulation the conditional intensity becomes tractable as described in Models and Methods. We then exploit ideas that are standard in Bayesian computation, in particular data augmentation, to accommodate the lack of observation of transmission events.
We focus on epidemic outbreaks that are mainly attenuated by a time-varying transmissibility e.g. due to controls or seasonal changes of transmissibility, which is also the case for the recent West Africa Ebola outbreak [17, 18]. We also allow the occurrence of infections to be moderated both by the distance dependency of spatial infectivity and the effect of spatially heterogeneous (susceptible) population density. Such a framework enables a machinery that can be used to infer system parameters from the history of outbreaks and, more importantly, to predict the future dynamics of an epidemic. Our work represents a key generalization and extension of the work in [4, 19], notably by accounting for the effect of heterogeneous population density and considering a broader class of disease models.
Our methodology is first tested using simulated examples. We also compare our framework with the conventional, and often computationally challenging, individual-based SEIR model (which takes into account each individual in the population explicitly). Finally, it is applied to the Ebola outbreak data (Fig 1 and Ebola Outbreak Data), demonstrating its relevance to realistic epidemics of major current importance.
Models and methods
The mechanistic transmission model
We model spatio-temporal transmission, in continuous time and space and over a heterogeneous landscape with varying population density. The framework we apply to model transmission is closely related to the contact distribution model [20]. Consider the situation where there are n(t′) infectious individuals at time t′ among an entirely susceptible population. A new infection occurs as the first event in a non-homogeneous Poisson process with a time-varying rate n(t′) × β(t) with
$$\beta(t) = \beta \exp(-\omega t) \qquad (1)$$
for t ≥ t′, where β represents the baseline transmissibility (i.e. the baseline intensity) of an infectious individual in the absence of control measures. Multiple-level baseline transmissibility βi, i = 1, 2, … may also be considered, for example, to represent heterogeneous transmissibility among different age groups (see later Example: Application to the Ebola Outbreak Data). The parameter ω quantifies the efficacy of controls that serve to reduce disease transmissibility [21, 22]. Note that primary/background infection can be accommodated by adding a permanent infectious source presenting an additional rate α (i.e. the total Poisson rate becomes α + β × exp(−ωt)).
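The waiting time to the next infection under an exponentially decaying rate can be sampled by inverting the cumulative hazard. The following is a minimal sketch, not the authors' implementation: it assumes the decay form β × exp(−ωt) quoted in the note above, holds the number of infectious individuals fixed until the next event, and omits the background rate α. The function names are hypothetical.

```python
import math
import random

def transmissibility(t, beta, omega):
    """Per-individual transmissibility, assumed to be beta * exp(-omega * t)."""
    return beta * math.exp(-omega * t)

def next_infection_time(t0, n_infectious, beta, omega, rng=random):
    """Sample the first event time after t0 of a Poisson process with rate
    n_infectious * beta * exp(-omega * t), by inverting the cumulative hazard.

    Returns math.inf when no further event occurs, which is possible for
    omega > 0 because the total hazard over [t0, infinity) is finite.
    """
    target = rng.expovariate(1.0)  # unit-rate exponential threshold
    scale = n_infectious * beta
    if omega == 0.0:
        return t0 + target / scale  # constant rate: ordinary exponential wait
    # Hazard accumulated on [t0, t]: (scale / omega) * (exp(-omega*t0) - exp(-omega*t))
    remaining = math.exp(-omega * t0) - target * omega / scale
    if remaining <= 0.0:
        return math.inf
    return -math.log(remaining) / omega
```

In a full simulation the count of infectious individuals changes at each transition, so the draw would be repeated from the time of the most recent event.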
The source of infection of the newly infected/exposed individual is randomly chosen from the n(t′) infectious individuals. It is assumed that the probability of the new infection being at a certain distance r and direction θ away from the source of infection, is determined by the movement patterns of infectious individuals and the density of the susceptible population. Specifically, G = (r, θ) is drawn from a density,
$$g(r, \theta; \eta, s) = f(r; \eta) \times h(\theta \mid r, s) \qquad (2)$$
where s is the population density across the study area. Following Eq 2, the distance r is first drawn from f(r; η), a monotonically decreasing density function that specifies the likelihood of spatial movement over distance [23–25]. Specifically, we assume r follows an Exponential(η) distribution, i.e.,
$$f(r; \eta) = \eta \exp(-\eta r) \qquad (3)$$
Given r, the position of the new infection is determined by a subsequent random draw θ from h(θ | r, s), the density of θ corresponding to the circle with radius r centered at the source of infection. When population density is homogeneous, θ may be drawn uniformly from 0 to 2π; that is, given the homogeneous population density, there is no a priori belief that one part of the circle (i.e. an arc) is more susceptible to the occurrence of new infection than another. We consider a more general scenario with varying population density s. A natural approach in specifying h(θ | r, s) is to use the population density along the circumference of the circle, denoted by s(r, θ), to account for the effect of the heterogeneous landscape, so that
$$h(\theta \mid r, s) = \frac{s(r, \theta)}{\int s(r, \theta')\, dl'} \qquad (4)$$
where l′ is the arc length corresponding to an arbitrary angle θ′ and the integral is taken over the circumference. It is noted that, when the source of infection is the primary/background, r and θ become irrelevant, and the density in Eq 2 reduces to the (normalized) population density so that the probability of the new infection occurring in a neighbourhood of a particular point is proportional to the population density at that position.
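The two-stage draw described above (distance first, then direction weighted by population density along the circle) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: Exponential(η) is taken to be parameterised by its rate, the density along the circumference is approximated at equally spaced angles, and `density` is a hypothetical callable standing in for a population density surface.

```python
import math
import random

def sample_displacement(source_xy, eta, density, n_angles=360, rng=random):
    """Draw the position of a new infection around a given source.

    r is drawn from an exponential distribution with rate eta (assumed
    parameterisation). theta is then drawn with probability proportional to
    the population density at n_angles equally spaced points on the circle
    of radius r, a discrete stand-in for the density along the circumference.
    Returns (r, theta, (x, y)), or None if the whole circle is unpopulated.
    """
    x0, y0 = source_xy
    r = rng.expovariate(eta)
    thetas = [2.0 * math.pi * k / n_angles for k in range(n_angles)]
    weights = [density(x0 + r * math.cos(th), y0 + r * math.sin(th))
               for th in thetas]
    if sum(weights) <= 0.0:
        return None  # caller decides how to handle this, e.g. by redrawing r
    theta = rng.choices(thetas, weights=weights, k=1)[0]
    return r, theta, (x0 + r * math.cos(theta), y0 + r * math.sin(theta))
```

With a constant `density` the weights are equal and θ is effectively uniform on the angular grid, matching the homogeneous case in the text.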
Subsequently, the newly infected individual is assumed to spend random times in classes E and I which are modelled using an appropriate distribution such as a Gamma or a Weibull distribution. Specifically, following [4], we use a Gamma(γ, λ) distribution with mean γ and s.d. λ for the random time x spent in class E, and for the random time x spent in class I we use an Exponential distribution with mean φ [4]. All sojourn times are assumed independent of each other given the model parameters.
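Because the Gamma distribution for the time in class E is specified by its mean and standard deviation, it has to be converted to a shape and scale before most samplers can use it. A minimal sketch of that conversion, with hypothetical function names:

```python
import math
import random

def gamma_shape_scale(mean, sd):
    """Convert a (mean, s.d.) description of a Gamma distribution to (shape, scale).

    Uses mean = shape * scale and sd = sqrt(shape) * scale.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def sample_latent_period(mean, sd, rng=random):
    """Draw one sojourn time in class E from the Gamma distribution above."""
    shape, scale = gamma_shape_scale(mean, sd)
    return rng.gammavariate(shape, scale)  # second argument is the scale
```

For example, a mean of 6 days with a standard deviation of 3 days corresponds to shape 4 and scale 1.5.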
In S1 Text, we also provide a concise description of the algorithm for simulating from the described model.
Complete-data likelihood
Let T be the duration of the observation period, and let χE ⊇ χI ⊇ χR denote the sets of individuals who have entered class E, class I and class R by T respectively. Also, let E = (…, Ej, …) denote the exposure times for j ∈ χE, I = (…, Ij, …) denote the times of becoming infectious for j ∈ χI and R = (…, Rj, …) denote the times of recovery or removal for j ∈ χR. The densities of the sojourn times in class E and class I are denoted by fE and fI respectively, with their corresponding cumulative distribution functions denoted by FE and FI. Also, as previously defined, n(t) is the total number of infectious individuals at time t. Finally, for j ∈ χE, let ψ = (…, ψj, …) denote the collection of sources of infection for infected individuals, and G = (…, Gj, …) denote their positions relative to the sources of infection where Gj = (rj, θj).
Assuming complete data z = (E, I, R, G, ψ) and model parameters Θ = (α, β, γ, λ, φ, η, ω), we can express the likelihood as
(5) |
Here denotes χE with the earliest exposure excluded. The contribution to the likelihood arising from the infection of j by the particular source ψj is given by
(6) |
The first two lines in Eq 5 together represent the contribution to the likelihood arising from the observed sequence of exposure events. The third and fourth lines represent the contribution to the likelihood of the sojourn times in class E and I respectively for the exposed individuals.
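For Poisson-type models, the contribution of a sequence of attributed events to the log-likelihood is conventionally the sum of the log rates at the event times minus the integrated total rate over the observation window. The sketch below illustrates only this temporal part under that convention. Whether Eq 5 takes exactly this form is an assumption, the spatial and sojourn-time terms are omitted, n(t) is treated as piecewise constant, ω > 0 is assumed, and all function names are hypothetical.

```python
import math

def integrated_intensity(change_points, alpha, beta, omega, T):
    """Integrate alpha + n(t) * beta * exp(-omega * t) over [0, T].

    change_points: list of (time, n) pairs sorted by time, meaning n(t) = n
    from that time until the next change point (n(t) = 0 before the first).
    """
    total = alpha * T
    for idx, (start, n) in enumerate(change_points):
        if start >= T:
            break
        end = change_points[idx + 1][0] if idx + 1 < len(change_points) else T
        end = min(end, T)
        # Closed-form integral of n * beta * exp(-omega * t) on [start, end]
        total += n * beta / omega * (math.exp(-omega * start) - math.exp(-omega * end))
    return total

def exposure_log_likelihood(exposures, change_points, alpha, beta, omega, T):
    """Temporal log-likelihood of exposure events with known sources.

    exposures: list of (time, from_background) pairs; the rate attributed to
    an event is alpha for the background source and beta * exp(-omega * time)
    for an infectious individual.
    """
    log_lik = -integrated_intensity(change_points, alpha, beta, omega, T)
    for time, from_background in exposures:
        rate = alpha if from_background else beta * math.exp(-omega * time)
        log_lik += math.log(rate)
    return log_lik
```

The closed-form integral is what keeps this term cheap to evaluate inside an MCMC loop, since n(t) only changes at the imputed or observed transition times.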
For mathematical clarity, we have so far discussed a general case where the population density along the circumference is assumed to be continuous. In practice, however, the data of population density over a study area is often provided in a discrete form, mostly on the grid level [26] (see also Fig 1). We describe how this special case may be handled practically in S1 Text and S1 Fig.
Statistical inference
We conduct Bayesian inference of partially observed epidemics using the process of data augmentation supported by Markov chain Monte Carlo methods [4, 27–29]. Given observed partial data y, including times of symptom onset and times of death, the inference involves sampling from the joint posterior distribution π(Θ, z|y) ∝ L(Θ; z)π(Θ), where z represents the complete data and π(Θ) represents the prior distribution of the model parameters, so that the unobserved components of z are reconstructed, or ‘imputed’. We use weak uniform priors U(0, 100). It is noted that, in analyzing the Ebola outbreak data (see Example: Application to the Ebola Outbreak Data) where z = (E, I, R, G, ψ), in addition to the parameters in Θ = (α, β, γ, λ, φ, η, ω), the exposure times E and the sources of infection ψ (i.e. the transmission tree) are unobserved and are also to be inferred [4, 27].
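As an illustration of how a single parameter might be updated within such a scheme, the following sketch performs one random-walk Metropolis step under a U(0, 100) prior. It is a generic textbook update, not the authors' sampler. The data-augmentation moves for exposure times and sources of infection are not shown, and `log_lik` is a hypothetical callable returning the complete-data log-likelihood as a function of the parameter being updated.

```python
import math
import random

def metropolis_update(value, log_lik, step, lower=0.0, upper=100.0, rng=random):
    """One random-walk Metropolis update for a scalar parameter.

    The prior is uniform on (lower, upper), so proposals outside that range
    have zero posterior density and are rejected outright. Inside the range
    the prior cancels and only the likelihood ratio matters.
    """
    proposal = value + rng.gauss(0.0, step)
    if not lower < proposal < upper:
        return value
    log_ratio = log_lik(proposal) - log_lik(value)
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return proposal
    return value
```

In a full sampler this update would be cycled over each component of Θ and interleaved with proposals that change the imputed parts of z.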
Results
Validation of model inference
In this section we test our methodology using simulated datasets. Ten independent epidemics are simulated from the model described in Models and Methods, parameterized by a set of model parameter values arising from fitting to the Ebola outbreak data (see Example: Application to the Ebola Outbreak Data). The same observation period, geographical area and population density as the Ebola data are considered. Fig 3a shows an exemplar simulated epidemic. Similar to the application to the Ebola outbreak data, we also consider age-specific baseline transmissibility of an infectious individual, i.e. β1 for age less than 15 and β2 for age greater than or equal to 15. Subsequently, we fit our model to each of the simulated epidemics and obtain the posterior samples of the model parameters. Fig 3b suggests that the model parameters can be accurately estimated, with the inferred posterior distributions clustering around the true parameter values. We also test with another set of simulated datasets in which we assume a different distribution of population density, and obtain similar accuracy in parameter estimation (S2 Fig).
Comparison with individual-based SEIR model
Conventional SEIR models, which require an explicit account of the contact network among all subjects, have proven to be useful in studying patch-level disease transmission (Fig 2a), e.g. among farms, towns and cities [7, 27]. While these models are not theoretically restricted to the patch level, they are often computationally challenging for individual-level data arising from moderate- to large-size populations. Although these models are not preferable in the scenario considered in the paper, they may be utilized to generate ‘reference’ epidemics that can be subsequently used for further assessing our framework.
In this section we perform simulation studies to understand how our framework may capture the temporal and spatial dynamics of the epidemics generated from the SEIR model. We focus on simulations from an individual-based and susceptible-explicit SEIR model, in a heterogeneous landscape, that give rise to epidemics in which around 5% of a study population becomes infected (within 50 days of the initial infection). We note that the prevalence we consider is significantly higher than that found in the recent Ebola outbreak and matches more closely other, more transmissible viruses such as influenza [30]. We consider simpler scenarios with no control measures and known latent period distribution. Details of the SEIR model are given in S1 Text. Fig 4 suggests that our framework can capture key temporal and spatial dynamics of the epidemic simulated from the individual-based SEIR model. Similar results are observed in testing with another set of simulated epidemics (S3 Fig), in which we consider a scenario with a different population density distribution and a fatter tail in the spatial transmission distance.
We also perform a comparison between the run-time of our model inference and that of performing full individual-based SEIR model inference, which suggests that ours can be about 780 times faster (see also S1 Text).
Example: Application to the Ebola outbreak data
Ebola outbreak data
We also apply our methodology to a dataset describing Ebola transmission in the community, collected from the Safe and Dignified Burials (SDB) programme conducted by the International Federation of Red Cross and Red Crescent Societies (IFRC), between Oct 20, 2014 and March 30, 2015 in Western Area (which comprises the capital Freetown and its surrounding area) in Sierra Leone. The dataset contains mobile-phone-reported GPS locations of where the bodies of 200 fatalities who tested positive for Ebola were collected (Fig 1). Age, sex, time of burial (which was usually performed within 24h of death) and symptom-onset time were also recorded. Population density data were obtained from [26].
The same dataset was previously analyzed in [4], using a transmission-network-based (partial-likelihood) approach (Fig 2b). Although it was shown that such an approach is useful for inferring key epidemiological quantities (e.g. basic reproductive number R0) and sampling the summary topology of the transmission tree among the observed cases, it does not consider the general (susceptible) population; as a result it cannot be used to establish a relation between infections and the general population, something that is necessary if more general model-based forward predictions are to be made. In this section we compare our results with the findings of the previous analysis. In particular, we show how a model-based, forward prediction may be made using our methodology. We consider age-specific baseline transmissibility, i.e. β1 for age less than 15 and β2 for age greater than or equal to 15. In the forward simulation, the distribution of age (group) for a new infection is assumed to be the empirical distribution of the age groups of the observed data (which may also be estimated from more general demographic data).
Model estimates
Reproductive number
A key epidemiological parameter is the so-called basic reproductive number R0, or its time-dependent variant, the effective reproductive number Reff, which quantifies the average number of secondary cases generated by a given infection [31–33]. In our framework the transmission tree is imputed, from which we can compute R0 and Reff as summary statistics. We estimate R0 to be 2.0 with 95% C.I. [1.8, 2.2] (Fig 5a), which is slightly lower than the estimate of 2.39 in [4]. The estimate of Reff (Fig 5b) is also broadly consistent with that found in [4] and in the literature (e.g., [31]). It is also noted that the degree of super-spreading is commonly characterized using a dispersion parameter k summarized from the transmission tree [4, 34, 35]. Estimated values for k are 0.47 and 0.37, using our methodology and that used in [4] respectively, both indicating significant super-spreading (k < 1), albeit to a lesser extent (i.e. higher k) here.
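Summaries of this kind can be read off an imputed transmission tree by counting the secondary cases attributed to each individual. The sketch below is one possible way to do this and is not taken from the paper: it averages offspring counts over all supplied cases and uses a method-of-moments estimate of the negative binomial dispersion parameter k (variance = mean + mean²/k). The estimator actually used by the authors is not stated in this section, and the function names are hypothetical.

```python
def offspring_counts(sources, cases):
    """Count secondary cases per individual.

    sources: dict mapping each case to its source of infection, with None
    for cases attributed to the primary/background source. Every non-None
    source is assumed to appear in `cases`.
    """
    counts = {case: 0 for case in cases}
    for case, src in sources.items():
        if src is not None:
            counts[src] += 1
    return counts

def mean_and_dispersion(counts):
    """Return the mean offspring number and a moment estimate of k."""
    values = list(counts.values())
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    # No overdispersion relative to Poisson: k is effectively infinite
    k = mean ** 2 / (var - mean) if var > mean else float("inf")
    return mean, k
```

Applying these functions to each posterior sample of the tree would give a posterior distribution for both summaries rather than a single point estimate.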
Age-specific transmissibility and distance of transmission
In [4], it was found that certain age groups tend to be more transmissible, in particular infected individuals younger than 15 or older than 45 years. Using our methodology, although we find no significant difference among subgroups of those older than 15, there is still clear evidence that cases younger than 15 tend to be the most transmissible (Fig 5c). In fact, this age group was also found to be the most transmissible in [4]. The median distance of transmission is estimated to be 0.85km [0.01, 6.15], which is about one third of the estimate of 2.51km found in [4]. Such a discrepancy may reflect the fact that the heterogeneous (susceptible) population is now taken into account, with the presence of many disease-free areas reducing the likelihood of long-range transmission. A shorter distance of transmission may also be potentially more accurate, considering that the pathogen may have spread predominantly through caregiving within the community, e.g., through family contacts [36]. Estimates of other model parameters are given in S1 Table, showing broad consistency with the literature [4, 37, 38]. However, it is noted that our estimate of the mean infectious period is lower than that obtained from cases detected within the clinical care system (e.g. a mean infectious period of 8d estimated for patients who received clinical care [39]). As discussed in [4], this discrepancy potentially highlights systematic differences between community-based cases and cases notified in clinical care systems, where community-based cases may have progressed more rapidly.
A more general model prediction
In contrast to a transmission-network-based approach [4], our framework establishes a relation between infections and the general (susceptible) population. Specifically, it proposes a mechanism for how a new infection, beyond the set of observed infected individuals, can arise among the general (susceptible) population. This in turn allows us to perform a more general forward simulation without conditioning on the set of observed cases. Fig 6 shows the (posterior predictive) distributions of some temporal and spatial summary statistics of the epidemics simulated from the estimated model, from which it can be discerned that the model can generate epidemics that are consistent with the observed one. We also show out-of-sample predictive performance for the epidemic curve over the second half of the epidemic duration (Fig 6b). It is noted that in assessing the spatial fit, besides using a relatively crude global measure (i.e. Moran’s I index [7]), we also consider Ripley’s L function [40, 41], which is much more informative for characterizing clustering/dispersion of point data.
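For readers unfamiliar with Ripley's L function, a naive estimator can be sketched as below. This is a simplified illustration only: it applies no edge correction, and the estimator and any correction used for Fig 6 are not specified in this section. The function name and arguments are hypothetical.

```python
import math

def ripley_l(points, distance, area):
    """Naive Ripley's L at a given distance, without edge correction.

    K(d) = area / (n * (n - 1)) * number of ordered pairs within distance d,
    and L(d) = sqrt(K(d) / pi). Under complete spatial randomness L(d) is
    approximately d; larger values indicate clustering at that scale.
    """
    n = len(points)
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= distance:
                pairs += 1
    k_value = area * pairs / (n * (n - 1))
    return math.sqrt(k_value / math.pi)
```

Evaluating the function over a range of distances for both observed and simulated case locations gives curves that can be compared directly, which is the sense in which L is more informative than a single global index.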
Discussion
More precise individual-level spatio-temporal data have become increasingly available in recent years due to the advent of ‘digital epidemiology’ [1]. One key challenge is how we may extract maximal information from such data, especially through concurrent development of new statistical methods, as existing approaches suffer from certain limitations (see Introduction). In particular, as SEIR-type models can be computationally challenging for individual-level spatio-temporal data, new frameworks are needed to accommodate such data in a mechanistic manner. The recent Ebola outbreak in West Africa (2014-2016) highlights the need, in particular, for a statistically sound and computationally efficient framework that is both able to integrate individual temporal and spatial information and, more importantly, perform a more general forward prediction which needs to take into account the general susceptible population [4].
In this paper, we have proposed a novel mechanistic framework to address the research gap. Application to the Ebola outbreak data shows broad consistency of key epidemiological quantities with a previous analysis using a transmission-network-based partial-likelihood approach [4], despite a significantly lower, and potentially more accurate [36], median value of estimated distance of transmission (0.85km vs 2.51km). We have shown that our methods can be used in predictive mode to simulate epidemics (among the general population) that are consistent with the observed temporal and spatial patterns of the real outbreak, enabling a more general epidemic predictive framework. We also tested our model inference using simulated examples. Our model was also compared to the more explicit (but computationally challenging) individual-based SEIR model, showing that our model can be a reasonable and computationally-efficient surrogate.
There are a few simplifying assumptions made in our paper. For example, we have focused on epidemic outbreaks that are mainly attenuated by a time-varying transmissibility e.g. due to controls or seasonal changes of transmissibility. Should susceptible depletion play a key role in attenuating the epidemics, our framework may be modified accordingly—e.g., for a given region, adding a component that specifies the decreased likelihood of occurrence of new infections with increased density of existing infections, to mimic the effect of susceptible depletion. Nevertheless, the effect of susceptible depletion may only be significant on a very local scale such as that of the individual household. Moreover, it does not appear to be a determining factor in controlling the recent Ebola outbreak, at least on the ‘global’ scale [17] (Fig 6). We have considered random movement patterns of infectious individuals that may be reasonably abstracted by a monotonically decreasing density function [23, 24]. For future work, this assumption may be relaxed to model more complicated scenarios, such as spread of splash-dispersed fungal pathogens [43] in which the spreading distance may also depend on the susceptible population. In this case, one may modify the density for the distance by also taking into account the distribution of susceptible population in the annuli along the radius of the circle centered at a particular source of infection.
The transmission rate of an infectious case in our model is independent of the (local) susceptible population density. This assumption may be relaxed to allow for more “localized” transmission rates. For example, a model taking into account the heterogeneity of the susceptible population more explicitly may be obtained by allowing the infection rate for each case to be dependent on the local density of susceptibles by taking an appropriate weighted average of the latter with respect to the kernel function, at the expense of increased computational complexity. When spatial heterogeneity is present at a scale that is fine with respect to the range of transmission, then such an average may exhibit little variability over cases. Nevertheless, we note the ability of our approach to identify a kernel that matches that identified when the full SEIR model is fitted. Moreover, our model appears to be reasonable for the case of the Ebola outbreak (Fig 6).
We have considered scenarios in which the entire population is susceptible, an assumption which generally holds for newly emerging infections. Vaccination, for instance, decreases the proportion of susceptibles among the general population, and has an important impact on the geographical spread of viruses (e.g. [44]). The effect of vaccination can be readily incorporated into our framework, for example, by reducing the (effective) susceptible population in proportion to the vaccination rate in a particular region. The Ebola dataset we analyzed is likely to be subject to underreporting, which may have resulted in, for example, a biased (lower) estimate of the degree of superspreading [4]. Future work which takes underreporting into account explicitly may be considered. We hope that our proposed framework can provide an essential step for the systematic modelling of the increasingly available individual-level disease data.
Supporting information
Data Availability
The authors do not own the burial dataset used in this paper, and cannot make it freely available. Inquiries regarding use of the data can be directed to the International Federation of Red Cross and Red Crescent Societies at http://www.ifrc.org/en/Contact-us/. Other data, including the computer code, are freely available on GitHub at https://github.com/msylau.
Funding Statement
We thank the Bill & Melinda Gates Foundation (OPP1091919), the RAPIDD programme of the Science and Technology Directorate Department of Homeland Security and the Fogarty International Centre, National Institutes of Health (NIH), and UK Medical Research Council (MRC) for their financial support. HA acknowledges financial support from the European Food Safety Authority under contract OC/EFSA/AHAW/2013/01 - CT01. SF was also supported by a MRC Career Award in Biostatistics (MR/K021680/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Salathe M, Bengtsson L, Bodnar TJ, Brewer DD, Brownstein JS, Buckee C, et al. Digital epidemiology. PLoS Comput Biol. 2012;8(7):e1002616 doi: 10.1371/journal.pcbi.1002616 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. Bengtsson L, Lu X, Thorson A, Garfield R, Von Schreeb J. Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: a post-earthquake geospatial study in Haiti. PLoS Med. 2011;8(8):e1001083 doi: 10.1371/journal.pmed.1001083 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Robertson C, et al. Mobile Phone–based Infectious Disease Surveillance System, Sri Lanka-Volume 16, Number 10,October 2010-Emerging Infectious Disease journal-CDC. 2010; [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Lau MS, Dalziel BD, Funk S, McClelland A, Tiffany A, Riley S, et al. Spatial and temporal dynamics of superspreading events in the 2014–2015 West Africa Ebola epidemic. Proceedings of the National Academy of Sciences. 2017;114(9):2337–2342. doi: 10.1073/pnas.1614595114
- 5. Daley DJ, Vere-Jones D. An introduction to the theory of point processes, vol. 1. New York: Springer; 2003.
- 6. Kenah E, Britton T, Halloran ME, Longini IM Jr. Molecular infectious disease epidemiology: survival analysis and algorithms linking phylogenies to transmission trees. PLoS Comput Biol. 2016;12(4):e1004869. doi: 10.1371/journal.pcbi.1004869
- 7. Lau MS, Marion G, Streftaris G, Gibson GJ. New model diagnostics for spatio-temporal systems in epidemiology and ecology. Journal of The Royal Society Interface. 2014;11(93):20131093. doi: 10.1098/rsif.2013.1093
- 8. Catterall S, Cook AR, Marion G, Butler A, Hulme PE. Accounting for uncertainty in colonisation times: a novel approach to modelling the spatio-temporal dynamics of alien invasions using distribution data. Ecography. 2012;35(10):901–911. doi: 10.1111/j.1600-0587.2011.07190.x
- 9. Jewell CP, Keeling MJ, Roberts GO. Predicting undetected infections during the 2007 foot-and-mouth disease outbreak. Journal of the Royal Society Interface. 2009;6(41):1145–1151. doi: 10.1098/rsif.2008.0433
- 10. Haydon DT, Chase-Topping M, Shaw D, Matthews L, Friar J, Wilesmith J, et al. The construction and analysis of epidemic trees with reference to the 2001 UK foot–and–mouth outbreak. Proceedings of the Royal Society of London B: Biological Sciences. 2003;270(1511):121–127. doi: 10.1098/rspb.2002.2191
- 11. Cottam EM, Thébaud G, Wadsworth J, Gloster J, Mansley L, Paton DJ, et al. Integrating genetic and epidemiological data to determine transmission pathways of foot-and-mouth disease virus. Proceedings of the Royal Society B: Biological Sciences. 2008;275(1637):887–895. doi: 10.1098/rspb.2007.1442
- 12. Morelli MJ, Thébaud G, Chadœuf J, King DP, Haydon DT, Soubeyrand S. A Bayesian inference framework to reconstruct transmission trees using epidemiological and genetic data. PLoS Comput Biol. 2012;8(11):e1002768. doi: 10.1371/journal.pcbi.1002768
- 13. Diggle PJ. Spatio-temporal point processes, partial likelihood, foot and mouth disease. Statistical methods in medical research. 2006;15(4):325–336. doi: 10.1191/0962280206sm454oa
- 14. Diggle PJ, Kaimi I, Abellana R. Partial-Likelihood Analysis of Spatio-Temporal Point-Process Data. Biometrics. 2010;66(2):347–354. doi: 10.1111/j.1541-0420.2009.01304.x
- 15. Diggle PJ, Gratton RJ. Monte Carlo methods of inference for implicit statistical models. Journal of the Royal Statistical Society Series B (Methodological). 1984; p. 193–227.
- 16. Liang F. A double Metropolis–Hastings sampler for spatial models with intractable normalizing constants. Journal of Statistical Computation and Simulation. 2010;80(9):1007–1022. doi: 10.1080/00949650902882162
- 17. Kucharski AJ, Camacho A, Flasche S, Glover RE, Edmunds WJ, Funk S. Measuring the impact of Ebola control measures in Sierra Leone. Proceedings of the National Academy of Sciences. 2015;112(46):14366–14371. doi: 10.1073/pnas.1508814112
- 18. Chowell G, Nishiura H. Transmission dynamics and control of Ebola virus disease (EVD): a review. BMC medicine. 2014;12(1):1. doi: 10.1186/s12916-014-0196-0
- 19. MacCalman L, McKendrick IJ, Denwood M, Gibson G, Catterall S, Innocent G, et al. MAPRA: Modelling Animal Pathogens: Review and Adaptation. EFSA Journal. 2016;13.
- 20. Mollison D. Spatial contact models for ecological and epidemic spread. Journal of the Royal Statistical Society Series B (Methodological). 1977;39(3):283–326.
- 21. Barbarossa MV, Dénes A, Kiss G, Nakata Y, Röst G, Vizi Z. Transmission dynamics and final epidemic size of Ebola virus disease outbreaks with varying interventions. PloS one. 2015;10(7):e0131398. doi: 10.1371/journal.pone.0131398
- 22. Drake JM, Kaul RB, Alexander LW, O'Regan SM, Kramer AM, Pulliam JT, et al. Ebola cases and health system demand in Liberia. PLoS Biol. 2015;13(1):e1002056. doi: 10.1371/journal.pbio.1002056
- 23. Salje H, Lessler J, Paul KK, Azman AS, Rahman MW, Rahman M, et al. How social structures, space, and behaviors shape the spread of infectious diseases using chikungunya as a case study. Proceedings of the National Academy of Sciences. 2016; p. 201611391.
- 24. Vazquez-Prokopec GM, Bisanzio D, Stoddard ST, Paz-Soldan V, Morrison AC, Elder JP, et al. Using GPS technology to quantify human mobility, dynamic contacts and infectious disease dynamics in a resource-poor urban environment. PloS one. 2013;8(4):e58802. doi: 10.1371/journal.pone.0058802
- 25. Keeling MJ, Woolhouse ME, Shaw DJ, Matthews L, Chase-Topping M, Haydon DT, et al. Dynamics of the 2001 UK foot and mouth epidemic: stochastic dispersal in a heterogeneous landscape. Science. 2001;294(5543):813–817. doi: 10.1126/science.1065973
- 26. Population density in Sierra Leone. http://www.worldpop.org.uk.
- 27. Lau MS, Marion G, Streftaris G, Gibson G. A systematic Bayesian integration of epidemiological and genetic data. PLoS Comput Biol. 2015;11(11):e1004633. doi: 10.1371/journal.pcbi.1004633
- 28. Parry M, Gibson GJ, Parnell S, Gottwald TR, Irey MS, Gast TC, et al. Bayesian inference for an emerging arboreal epidemic in the presence of control. Proceedings of the National Academy of Sciences. 2014;111(17):6258–6262. doi: 10.1073/pnas.1310997111
- 29. Gibson GJ, Renshaw E. Estimating parameters in stochastic compartmental models using Markov chain methods. Mathematical Medicine and Biology. 1998;15(1):19–40. doi: 10.1093/imammb/15.1.19
- 30. NIH Fact Sheets: Influenza. https://report.nih.gov/NIHfactsheets/ViewFactSheet.aspx?csid=133.
- 31. Fisman D, Khoo E, Tuite A. Early epidemic dynamics of the West African 2014 Ebola outbreak: estimates derived with a simple two-parameter model. PLoS Currents Outbreaks. 2014. doi: 10.1371/currents.outbreaks.89c0d3783f36958d96ebbae97348d571
- 32. Camacho A, Kucharski A, Aki-Sawyerr Y, White MA, Flasche S, Baguelin M, et al. Temporal Changes in Ebola Transmission in Sierra Leone and Implications for Control Requirements: a Real-time Modelling Study. PLoS Curr. 2015;7. doi: 10.1371/currents.outbreaks.406ae55e83ec0b5193e30856b9235ed2
- 33. Weitz JS, Dushoff J. Modeling post-death transmission of Ebola: challenges for inference and opportunities for control. Scientific reports. 2015;5. doi: 10.1038/srep08751
- 34. Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz W. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438(7066):355–359. doi: 10.1038/nature04153
- 35. Althaus CL. Ebola superspreading. Lancet Infect Dis. 2015;15(5):507–8. doi: 10.1016/S1473-3099(15)70135-0
- 36. Chowell G, Cleaton JM, Viboud C. Elucidating Transmission Patterns From Internet Reports: Ebola and Middle East Respiratory Syndrome as Case Studies. Journal of Infectious Diseases. 2016;214(suppl 4):S421–S426. doi: 10.1093/infdis/jiw356
- 37. Chretien JP, Riley S, George DB. Mathematical modeling of the West Africa Ebola epidemic. eLife. 2015; p. e09186. doi: 10.7554/eLife.09186
- 38. Stadler T, Kühnert D, Rasmussen DA, du Plessis L. Insights into the early epidemic spread of Ebola in Sierra Leone provided by viral sequence data. PLoS Currents Outbreaks. 2014;10.
- 39. Bah EI, Lamah MC, Fletcher T, Jacob ST, Brett-Major DM, Sall AA, et al. Clinical presentation of patients with Ebola virus disease in Conakry, Guinea. New England Journal of Medicine. 2015;372(1):40–47. doi: 10.1056/NEJMoa1411249
- 40. Ripley BD. The second-order analysis of stationary point processes. Journal of applied probability. 1976;13(02):255–266. doi: 10.1017/S0021900200094328
- 41. Ripley BD. Statistical inference for spatial processes. Cambridge University Press; 1991.
- 42. Baddeley A, Turner R, et al. Spatstat: an R package for analyzing spatial point patterns. Journal of statistical software. 2005;12(6):1–42. doi: 10.18637/jss.v012.i06
- 43. Madden L. Effects of rain on splash dispersal of fungal pathogens. Canadian Journal of Plant Pathology. 1997;19(2):225–230. doi: 10.1080/07060669709500557
- 44. Bolker B, Grenfell B. Impact of vaccination on the spatial correlation and persistence of measles dynamics. Proceedings of the National Academy of Sciences. 1996;93(22):12648–12653. doi: 10.1073/pnas.93.22.12648