Table 2.
model type | spatial method | description | advantages | disadvantages | application |
---|---|---|---|---|---|
statistical or machine learning | spatial covariate | inclusion of a covariate that aims to describe spatial connectivity within a regression model. For example, incidence of surrounding regions, distance between observations, or number of people moving between regions. The covariate is treated as a fixed effect and included in the model like any other covariate | compatible with all statistical or machine learning methods; relatively quick and simple to fit; allows human and vector movement to be included; interpretation of coefficients is often simpler than other methods; allows models to 'borrow strength' from connected regions, improving precision | models assume that the relationship between the outcome and spatial covariates is stationary and isotropic; inclusion of a large number of spatial covariates increases risk of overfitting and multicollinearity within a model; user must specify which regions/observations are connected prior to model fitting, which does not allow other connections to be explored | exploratory tool for statistical or machine learning studies carried out on a small scale where few spatial connections are expected. Statistical or machine learning modelling studies where spatial connectivity is assumed to arise from human movement |
statistical | local regression models | local regression models are fitted to each region using data from nearby regions, weighted by distance. Also known as geographically weighted regression (GWR). Coefficients are calculated separately for each regression model | relatively simple to carry out and interpret; useful exploratory tool to understand how the relationship between covariates and the outcome differs across space; does not assume these relationships are stationary | does not provide a global model to make interpretations about a region as a whole; only allows distance-based spatial connectivity to be included | exploratory tool to generate hypotheses about how relationships differ across space. Cannot be used to make inferences about regions as a whole. Only appropriate when studying areal data |
statistical | random effects and fields | random effects or fields with a spatially structured covariance function are included in a regression model to account for additional correlation or heterogeneity arising from spatial connectivity. Users must choose an appropriate spatial structure before fitting the model, usually assuming that regions are connected if and only if they are adjacent (areal data) or that connections decay exponentially as the distance between them increases (individual-level data) | relatively easy to obtain connectivity data (if using structure based on adjacency or distance); does not assume stationarity in the model; allows connections between a large number of observations without issues of overfitting associated with other statistical methods; increasing number of methods and software developed to make model-fitting process simpler | more complex to fit and interpret than other statistical models; random effects require an appropriate spatial structure to be defined before the model is fitted; structures identified in this review only allow models to account for connectivity between neighbours or close regions (other connectivity has not been explored) | statistical models where spatial connectivity is expected to exist between nearby regions. Can be carried out in small- or large-scale studies. Recommended for established diseases rather than newly emerging settings, as it requires large amounts of data for precise estimates |
machine learning | movement matrix | movement matrices reflecting the movement of humans around a network are used to weight connections between hidden layers of a neural network | allows complex, dynamic connectivity structures to be explored; allows human movement to be included in a machine learning framework | requires human movement data (or a representative proxy), which can be difficult to obtain; inclusion of the matrix in the hidden layer of neural networks means the impact of this movement is difficult to observe; computationally intensive | inclusion in a neural network where human mobility is known to drive transmission. Studies that require accurate predictions based on a large amount of data, where quantifying this process is not the focus |
mechanistic | spatial parameter | spatial parameters are included in mechanistic model equations, either to account for a spatial process or to update populations within each disease compartment of the model. Examples include diffusion parameters allowing hosts and vectors to move across a region or mosquito abundance that borrows information from connected regions | models can be fitted with few data and used to make causal inferences; parameters can borrow information from other regions about processes to take account of shared characteristics; less computationally intensive to fit than other mechanistic approaches; can be used within any mechanistic model | requires knowledge and information regarding the underlying process of transmission; parameters assume that the impact of spatial coefficients on transmission is stationary within a compartmental model, making them inappropriate on a large scale | models aiming to make causal inferences about the underlying process of transmission. Able to fit models where few data are available, making it useful for newly emerging diseases or areas with low transmission. More appropriate in small-scale studies where stationarity can be assumed |
mechanistic | movement matrix | movement matrices that reflect the movement of hosts and/or vectors around a network are included within a mechanistic model. These allow interaction between hosts and vectors in different locations and update the population at each node of the network | allows complex, dynamic connectivity structures to be explored; results can be extrapolated beyond the data used to fit them and causal inferences can be made; provides a more 'realistic' reflection of human and vector behaviour; models can be fit with relatively few data | adequate movement data are difficult to obtain; the complex nature of these models means computation can be difficult and time consuming; inferences can only be made about the setting the model is parameterized to reflect; requires the population being studied to be split into nodes in networks | models taking account of human and/or vector movement or other complex connectivity structures. Able to fit models where few data exist as well as large amounts, useful for newly emerging diseases. Able to study the process of transmission or causal structures. Works well with agent-based or metapopulation mechanistic models where the population is described using a network |
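
To make the connectivity structures in Table 2 concrete, the following Python sketch builds an adjacency-based weight matrix, an exponential distance-decay weight matrix, a spatial covariate (the weighted mean incidence of connected regions), and a single movement-matrix update of node populations. It is a minimal illustration only: the four-region layout, the incidence values, the decay rate and all function names are hypothetical and are not taken from any study covered by the table.

```python
import math

# Hypothetical toy data: four regions arranged in a line.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
incidence = [10.0, 20.0, 30.0, 40.0]
# Regions are treated as connected if and only if they share a border.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def adjacency_matrix(neighbours, n):
    """Binary weights: W[i][j] = 1 if regions i and j are adjacent, else 0."""
    W = [[0.0] * n for _ in range(n)]
    for i, adjacent in neighbours.items():
        for j in adjacent:
            W[i][j] = 1.0
    return W


def exponential_decay_matrix(coords, rate):
    """Distance-based weights: W[i][j] = exp(-rate * distance), 0 on the diagonal."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = math.exp(-rate * math.dist(coords[i], coords[j]))
    return W


def row_normalise(W):
    """Scale each row to sum to 1 (rows with no connections stay at 0)."""
    normalised = []
    for row in W:
        total = sum(row)
        normalised.append([w / total if total > 0 else 0.0 for w in row])
    return normalised


def spatial_lag(W, values):
    """Spatial covariate: weighted mean of the values in connected regions."""
    return [sum(w * v for w, v in zip(row, values)) for row in row_normalise(W)]


def move_population(populations, M):
    """One movement step; M[i][j] is the fraction of node i that moves to node j."""
    n = len(populations)
    return [sum(populations[i] * M[i][j] for i in range(n)) for j in range(n)]


# Spatial covariate under the two assumed connectivity structures.
adjacency_lag = spatial_lag(adjacency_matrix(neighbours, 4), incidence)
decay_lag = spatial_lag(exponential_decay_matrix(centroids, rate=1.0), incidence)

# Movement matrix for a two-node network; each row sums to 1.
movement = [[0.9, 0.1], [0.2, 0.8]]
updated = move_population([100.0, 0.0], movement)
```

In this sketch the adjacency structure only ever uses bordering regions, whereas the distance-decay structure gives every other region a non-zero weight that shrinks with distance. Either resulting vector could be entered into a regression model as a fixed-effect spatial covariate, while the movement step corresponds to the population update at each node of a metapopulation network.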