2022 Jan 19;50:100593. doi: 10.1016/j.spasta.2022.100593

Bayesian disease mapping: Past, present, and future

Ying C MacNab 1
PMCID: PMC8769562  PMID: 35075407

Abstract

On the occasion of the Spatial Statistics’ 10th Anniversary, I reflect on the past and present of Bayesian disease mapping and look into its future. I focus on some key developments of models, and on recent evolution of multivariate and adaptive Gaussian Markov random fields and their impact and importance in disease mapping. I reflect on Bayesian disease mapping as a subject of spatial statistics that has advanced to date, and continues to grow, in scope and complexity alongside increasing needs of analytic tools for contemporary health science research, such as spatial epidemiology, population and public health, and medicine. I illustrate (potential) utility and impact of some of the disease mapping models and methods for analysing and monitoring communicable disease such as the COVID-19 infection risks during an ongoing pandemic.

Keywords: Bayesian hierarchical models, (Adaptive) Conditional autoregressive models, Empirical Bayes, Linear coregionalization, (Multidimensional) Gaussian Markov random fields, Bayesian disease mapping

1. Introduction

On the occasion of the Spatial Statistics’ 10th Anniversary, I reflect on the past and present of Bayesian disease mapping, now part of an established branch of spatial statistics for (areal) lattice data.

Disease mapping has many connotations. For some, typically in the contemporary contexts of the geography of disease and of population and public health, it concerns small-area mapping of disease incidence and mortality rates or relative risks, commonly for a population under study and as part of research into the geographical distribution of disease. For Haviland (1871, 1888), for example, mapping observed disease and cancer incidence and mortality rates for the counties of England served to display the geographic distribution of a disease and to illustrate possible associations between disease and the environment; see Arnold-Forster (2020, 2021) for recent research into Haviland’s pioneering work on disease and cancer mapping.

Maps of rare diseases (such as cancer) based on crude rates or relative risks typically lack stability in the illustrated “geographic distribution of disease”. The early efforts made to statistically ascertain localities (such as counties) of elevated or lowered disease incidence and mortality rates and risks gave rise to a new topic in spatial statistics for lattice data, known as disease mapping, a topic of my main focus here. Crude rates of a rare disease calculated for (small) geographic areas such as counties and health administrative regions are typically subject to high chance variations. Earlier statistical research of disease mapping focused on the development of empirical Bayes (EB) models and related estimation procedures for stabilizing (i.e. smoothing) maps of disease rates or relative risks (Tsutakawa et al., 1985, Clayton and Kaldor, 1987, Tsutakawa, 1988, Manton et al., 1989, Cressie, 1993, to name a few). What was notably absent during the earlier years of EB disease mapping is the statistical inference of relative risks, inference in the sense that elevated or lowered disease risks are statistically ascertained.

The seminal works of Besag et al. (1991), Clayton and Bernardinelli (1992), Bernardinelli and Montomoli (1992), and Clayton et al. (1993) influenced and shaped a paradigm shift from the frequentist to the Bayesian approach to disease mapping. Known as Bayesian disease mapping, the paradigm has Bayesian hierarchical models at its core, with Markov chain Monte Carlo (MCMC) simulation methods providing the computational tools for Bayesian estimation, learning, and inference of all unknown quantities and parameters. Since the 1980s, the research agenda has broadened from providing tools for producing smoothed rate or risk maps to establishing a methodological paradigm with progressive, and systematically developed, ideas, models, and methods for studying spatial and spatiotemporal variations, patterns, and clusters of disease and health outcome risks in populations of geographical regions, and for identifying (risk and protective) factors that might explain those variations, patterns, and clusters. The body of Bayesian disease mapping models and related methods for estimation, learning, and inference has grown remarkably, from the early development of Bayesian hierarchical models for univariate disease mapping to multidimensional disease mapping based on space, time, multivariate, and multi-array data (e.g. data indexed by space, time, multiple diseases, and age group and/or gender strata).

The richness of the disease mapping models and their wide range of health science applications are well documented in books (Lawson et al., 1999, Banerjee et al., 2004, Waller and Gotway, 2004, Cressie and Wikle, 2015, Lawson, 2018, Martinez-Beneito and Botella-Rocamora, 2019) and review articles (Richardson et al., 2004, Best et al., 2005, Wakefield, 2007, Ugarte et al., 2009, Lee, 2011, MacNab, 2011, MacNab, 2018), in addition to a large quantity of research papers. Here, I review key developments of Bayesian disease mapping, with a focus on recent evolution of multivariate and adaptive Gaussian Markov random fields and their impact and importance in Bayesian disease mapping. I illustrate (potential) utility and impact of some of the disease mapping models and methods for analysing communicable disease such as the COVID-19 infection risks in the context of pandemic intervention and monitoring. I reflect on Bayesian disease mapping as a subject of spatial statistics that has advanced to date, and continues to grow, in scope and complexity alongside increasing needs of analytic tools for contemporary health science research, such as spatial epidemiology, population and public health, and medicine.

2. Empirical Bayes disease mapping since the 1980s

While the mapping of disease and cancer rates or risks began more than a century ago, disease mapping gained the attention of the statistics community during the 1980s, when cancer registries made population-level data more readily available for small geographical (e.g. administrative) areas of a region or country. A common goal of the researchers was to develop model-based procedures for stabilized cancer risk maps, from which environmental determinants of different types of cancer and their possible etiologic factors may be hypothesized for further investigation (Tsutakawa et al., 1985, Clayton and Kaldor, 1987, Tsutakawa, 1988, Manton et al., 1989, to name a few).

The 1980s also marked the rise of popularity of EB approaches to formulate and estimate models of multi-parameters, models in which the number and dimension of unknown parameters may equal or exceed those of the data (see Carlin and Louis, 2000a, Carlin and Louis, 2000b and references therein). Disease mapping models of the past and present are models of this nature.

2.1. An example of disease mapping model: Poisson model for count data

To facilitate further discussion, let us consider a typical disease mapping setting where we study areal-level cancer mortality risks by modelling observed area-specific death counts, denoted $\boldsymbol{O} = (O_1, O_2, \ldots, O_n)$, where $n$ is the number of geographic areas, such as wards, counties, or local health administrative areas of a region or a country under study. For brevity, and without essential loss of generality, we will consider the simplest situation where the death counts are compared to a set of expected counts, denoted $\boldsymbol{E} = (E_1, E_2, \ldots, E_n)$, which is often derived from a set of area- and age-specific at-risk populations and the corresponding age-specific reference rates, a procedure known as age-standardization (Clayton and Kaldor, 1987, Manton et al., 1989). The resulting relative risks, often defined as $RR_i = O_i/E_i$, $\forall i$, are the standardized mortality ratios (SMRs). This is a well known method used in studies of disease epidemiology and in cancer research (Breslow and Day, 1980). In disease mapping, this approach is often taken when areal-level age-specific data are not available or the age-specific counts are extremely small.

Working with rare events such as cancer mortality, the observed death counts may be viewed as Poisson events and modelled as follows:

$$O_i \sim \text{Poisson}(\phi_i), \quad \phi_i = E_i \varphi_i, \quad \forall i, \tag{1}$$

where $\phi_i$, $\forall i$, is the Poisson intensity, and $\varphi_i$ is the unknown relative risk.

In frequentist disease mapping, the unknown relative risks in (1) are fixed parameters to be estimated from the data. One often assumes that the mortality counts are randomly varying (i.e. statistically independent) Poisson events over geographic areas. Under this assumption, the maximum likelihood estimates (MLEs) of the relative risks, denoted $\{\hat{\varphi}_i, \forall i\}$, are the previously mentioned SMRs:

$$\hat{\varphi}_i = O_i/E_i, \quad \text{with associated standard errors} \quad \mathrm{se}(\hat{\varphi}_i) = \sqrt{\hat{\varphi}_i/E_i}, \quad \forall i. \tag{2}$$

It is well known that the MLEs (2) may lack stability and efficiency when small counts are modelled. The independence assumption is often invalid due to possible spatial correlation, e.g., the tendency that areas of close proximity (say, neighbouring areas) have similar mortality risks.
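The instability of the SMRs for small expected counts can be seen in a minimal sketch. It assumes the independent Poisson model (1) and takes the standard error as $\sqrt{\hat{\varphi}_i/E_i}$, which follows from $\mathrm{Var}(O_i) = E_i\varphi_i$. The counts and the function name `smr_with_se` are hypothetical and chosen only for illustration.

```python
import math

def smr_with_se(observed, expected):
    """Return (SMR, standard error) pairs under the independent Poisson model."""
    results = []
    for o, e in zip(observed, expected):
        smr = o / e
        se = math.sqrt(smr / e)  # algebraically equal to sqrt(o) / e
        results.append((smr, se))
    return results

# Hypothetical data: three areas with the same SMR but very different expected counts.
observed = [2, 20, 200]
expected = [1.0, 10.0, 100.0]
estimates = smr_with_se(observed, expected)

for (smr, se), e in zip(estimates, expected):
    print(f"E = {e:6.1f}   SMR = {smr:.2f}   se = {se:.3f}")
```

All three areas share the same SMR, yet the standard error shrinks as the expected count grows, so a map of crude SMRs mixes well-determined and poorly determined values.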

In addition, the Poisson distribution has the unusual characteristic that the intensity $\phi_i$ quantifies both the mean and the variance of the distribution. In real-life applications, this variance-equals-mean assumption may be invalid. For example, disease mapping data may exhibit extra-Poisson variation, i.e. $\mathrm{Var}(O_i) > \mathrm{E}(O_i)$, also known as Poisson over-dispersion (Breslow, 1984, Clayton and Kaldor, 1987, Besag et al., 1991; Breslow and Clayton, 1993).

2.2. Empirical Bayes disease mapping

Model (1) is a multi-parameter model, where the dimension of the parameters equals the dimension of the data. Taking an EB approach to this problem, one would place a prior on $\boldsymbol{\varphi}$, denoted $f(\boldsymbol{\varphi} \mid \boldsymbol{\theta})$, where $\boldsymbol{\theta}$ is a (vector of) hyper-parameter(s) to be estimated from data. This approach can facilitate borrowing information and risk smoothing and, with certain priors, account for Poisson over-dispersion. Many EB methods for disease mapping have been proposed since the 1980s (Tsutakawa et al., 1985, Clayton and Kaldor, 1987, Tsutakawa, 1988, Manton et al., 1989, Cressie, 1993, Leroux et al., 1999, to name a few). Here, the basic ideas are outlined in the following two illustrative examples, both taken from, and presented in detail in, Clayton and Kaldor (1987).

Since relative risks are non-negative, one prior option for $\boldsymbol{\varphi}$ is the gamma distribution, $\varphi_i \sim \text{gamma}(u, v)$, $\forall i$, where $u$ and $v$ are the respective scale and shape parameters. This is a non-spatial prior, where the Poisson–gamma mixing leads to a negative binomial distribution for $O_i$, $\forall i$, with expectation and variance functions:

$$\mathrm{E}(O_i) = E_i v/u, \quad \mathrm{Var}(O_i) = E_i v/u + E_i^2 v/u^2, \quad \forall i, \quad u > 0, \; v > 0. \tag{3}$$

The negative binomial distribution is well known for its use in modelling Poisson over-dispersion (McCullagh and Nelder, 1983), as illustrated in expression (3), where the variance of $O_i$ is greater than its mean. Once the hyper-parameters $u$ and $v$ are estimated from the data, say, by the MLEs of the negative binomial likelihood, denoted $\hat{u}$ and $\hat{v}$, the EB estimate of $\varphi_i$ is readily derived from its posterior expectation, $\mathrm{E}(\varphi_i \mid O_i) = (O_i + v)/(E_i + u)$, $\forall i$. Replacing $u$ and $v$ in $\mathrm{E}(\varphi_i \mid O_i)$ by $\hat{u}$ and $\hat{v}$ leads to the EB estimate

$$\tilde{\varphi}_i = \frac{O_i + \hat{v}}{E_i + \hat{u}}. \tag{4}$$

In addition to modelling Poisson over-dispersion, another feature of this method is that the EB estimate $\tilde{\varphi}_i$ is a compromise between the SMR $O_i/E_i$ and the estimated mean of $\varphi_i$, $\hat{v}/\hat{u}$. This method leads to “global smoothing” of relative risks (see Clayton and Kaldor, 1987 for additional details).
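The compromise can be made explicit with a short sketch: the plug-in estimate $(O_i + \hat{v})/(E_i + \hat{u})$ equals a weighted average of the SMR and the prior mean $\hat{v}/\hat{u}$, with weight $E_i/(E_i + \hat{u})$ on the SMR. The data and the values of `u_hat` and `v_hat` below are hypothetical; in practice they would be estimated, for example by maximizing the negative binomial likelihood, a step this sketch does not perform.

```python
def negbin_moments(e, u, v):
    """Marginal mean and variance of O_i under Poisson-gamma mixing."""
    mean = e * v / u
    variance = mean + e * e * v / (u * u)
    return mean, variance

def eb_gamma_estimates(observed, expected, u_hat, v_hat):
    """Plug-in EB estimates (O_i + v) / (E_i + u) with their shrinkage weights."""
    prior_mean = v_hat / u_hat
    out = []
    for o, e in zip(observed, expected):
        weight = e / (e + u_hat)  # weight placed on the SMR
        estimate = (o + v_hat) / (e + u_hat)
        # The same estimate written as a weighted average of SMR and prior mean.
        blended = weight * (o / e) + (1.0 - weight) * prior_mean
        assert abs(estimate - blended) < 1e-12
        out.append((estimate, weight))
    return out

# Hypothetical counts and hypothetical hyper-parameter estimates (prior mean 1.1).
observed = [0, 3, 30, 120]
expected = [1.5, 2.0, 25.0, 100.0]
u_hat, v_hat = 4.0, 4.4

results = eb_gamma_estimates(observed, expected, u_hat, v_hat)
for (est, w), o, e in zip(results, observed, expected):
    print(f"SMR = {o / e:.2f}   EB = {est:.2f}   weight on SMR = {w:.2f}")

mean, variance = negbin_moments(25.0, u_hat, v_hat)
print(f"marginal mean = {mean:.2f}, marginal variance = {variance:.2f}")
```

Areas with small expected counts are pulled strongly toward the prior mean, while areas with large expected counts stay close to their SMRs. The second function evaluates expression (3) and shows the marginal variance exceeding the marginal mean.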

However, global smoothing may be suboptimal when data are spatially distributed and correlated. An alternative approach is to formulate models that allow for spatial correlation and enable spatial smoothing. In Clayton and Kaldor (1987), a conditionally formulated Gaussian Markov random field (GMRF, Besag, 1974), named CAR for conditional autoregression, was proposed as a spatial prior for the ensemble of log relative risks $\boldsymbol{\psi} = (\psi_1, \ldots, \psi_n)$:

$$\boldsymbol{\psi} \sim \text{GMRF}(\boldsymbol{\mu}, \boldsymbol{\Omega}), \quad \boldsymbol{\Omega} = \sigma^{-2}(\boldsymbol{I}_n - \lambda \boldsymbol{W}), \tag{5}$$

where $\boldsymbol{\Omega}$ is the precision matrix, $\psi_i = \log(\varphi_i)$, $\forall i$, $\sigma$ is a scale parameter, $\lambda$ is a spatial dependence parameter, and $\boldsymbol{W}$ is the well known adjacency matrix for a given lattice (of a map) and its neighbourhood system (e.g. two areas are neighbours if they share common border(s)).

This CAR model postulates symmetric spatial dependence, accounts for spatial correlation, facilitates spatial smoothing, and enables borrowing strength for risks prediction and inference. The key features of this conditionally formulated GMRF can be understood and explained via the conditional mean and precision functions of its full conditionals:

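$$\mathrm{E}(\psi_i \mid \boldsymbol{\psi}_{-i}) = \lambda \sum_{j \sim i} \psi_j, \quad \mathrm{Prec}(\psi_i \mid \boldsymbol{\psi}_{-i}) = \tau, \tag{6}$$

where $\tau = \sigma^{-2}$, $\boldsymbol{\psi}_{-i} = (\psi_1, \ldots, \psi_{i-1}, \psi_{i+1}, \ldots, \psi_n)$, and $j \sim i$ indicates that areas $j$ and $i$ are neighbours. The area-specific conditional expectation of $\psi_i$ given $\boldsymbol{\psi}_{-i}$, $\forall i$, is proportional to the total risk of its neighbouring areas, regulated by a positive spatial dependence parameter $\lambda$. The conditional risk prediction variance is regulated by the scale parameter $\sigma$.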
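The link between the precision matrix in (5) and the full conditionals in (6) can be checked numerically. The sketch below assumes a zero mean vector and uses the general GMRF identity that the conditional mean of $\psi_i$ is $-\sum_{j \neq i} \Omega_{ij}\psi_j/\Omega_{ii}$ and the conditional precision is $\Omega_{ii}$. The four-area chain map, the parameter values, and the function names are hypothetical.

```python
def car_precision(adjacency, lam, sigma):
    """Precision matrix sigma^-2 (I - lam * W) of the CAR in expression (5)."""
    n = len(adjacency)
    tau = sigma ** -2
    return [[tau * ((1.0 if i == j else 0.0) - lam * adjacency[i][j])
             for j in range(n)] for i in range(n)]

def full_conditional(precision, psi, i):
    """Conditional mean and precision of psi[i] given the rest (zero-mean GMRF)."""
    q_ii = precision[i][i]
    mean = -sum(precision[i][j] * psi[j]
                for j in range(len(psi)) if j != i) / q_ii
    return mean, q_ii

# Hypothetical map: four areas in a row, each bordering the next.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
lam, sigma = 0.3, 0.5
psi = [0.2, -0.1, 0.4, 0.0]

omega = car_precision(W, lam, sigma)
cond_mean, cond_prec = full_conditional(omega, psi, 1)

# Expression (6): lam times the sum over the neighbours of area 1 (areas 0 and 2).
direct = lam * (psi[0] + psi[2])
print(cond_mean, direct, cond_prec)
```

Both routes give the same conditional mean, and the conditional precision equals $\tau = \sigma^{-2}$ for every area, as stated in (6).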

Unlike the Poisson–gamma mixing, the Poisson-log-normal mixing does not lead to an analytically known distribution for $\boldsymbol{O}$. In Clayton and Kaldor (1987), an EM algorithm (Dempster et al., 1977) was used for a two-stage estimation of $\boldsymbol{\psi}$ (and $\boldsymbol{\varphi}$) and its hyper-parameters.

The EB estimation is computationally straightforward for the first example and more complex and involved for the second. In an influential paper, Breslow and Clayton (1993) popularized several EB methods for inference in generalized linear mixed models (GLMMs), a model class that contains the majority of the disease mapping models developed to date.

The main limitation of the EB estimation of $\boldsymbol{\varphi}$ is that while the point estimates are nearly unbiased, the EB standard errors typically understate the associated estimation uncertainties. This can be readily observed from the “plug-in” EB estimation of the posterior (log) relative risks in the first example, where the uncertainties associated with the estimation of the hyper-parameters may not be adequately accounted for. This is the well known “naive” EB estimation issue for nearly all EB approaches to estimation of $\boldsymbol{\psi}$ (Carlin and Gelfand, 1990, Carlin and Gelfand, 1991, Carlin and Louis, 2000a, MacNab et al., 2004, Ainsworth and Dean, 2006, MacNab and Lin, 2009).
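One way to see why plug-in standard errors tend to be too small is the law of total variance. The decomposition below is a sketch that treats the hyper-parameter $\boldsymbol{\theta}$ as uncertain given the data, which is an assumption made here for illustration rather than part of the EB formulation itself:

```latex
\mathrm{Var}(\varphi_i \mid \boldsymbol{O})
  = \mathrm{E}_{\boldsymbol{\theta} \mid \boldsymbol{O}}
      \left[ \mathrm{Var}(\varphi_i \mid \boldsymbol{O}, \boldsymbol{\theta}) \right]
  + \mathrm{Var}_{\boldsymbol{\theta} \mid \boldsymbol{O}}
      \left[ \mathrm{E}(\varphi_i \mid \boldsymbol{O}, \boldsymbol{\theta}) \right].
```

The naive EB variance, $\mathrm{Var}(\varphi_i \mid \boldsymbol{O}, \hat{\boldsymbol{\theta}})$, approximates only the first term. The second term, which is non-negative, reflects how much the posterior mean moves as the hyper-parameters vary and is omitted by the plug-in step.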

Into the 1990s and 2000s, some progress was made in improving EB posterior estimation of relative risks, mainly to provide better standard error estimates by bootstrap methods (Carlin and Gelfand, 1990, MacNab et al., 2004) or via analytic adjustment (Ainsworth and Dean, 2006, MacNab and Lin, 2009). With adequately calibrated standard errors that quantify risk uncertainties, EB methods for posterior inference of relative risks may be used to statistically ascertain elevated or lowered disease risks.

3. Bayesian disease mapping from the 1990s: A paradigm shift

Over the broad landscape of statistical science, the 1990s saw the rise of Bayesian statistics and a new trend of Bayesian hierarchical approaches to formulating complex models, with hierarchical Bayesian methods for posterior estimation and inference of all unknown quantities and parameters, typically implemented by Markov chain Monte Carlo (MCMC) simulations (Gelfand and Smith, 1990, Smith and Gelfand, 1992, Gelman et al., 1995, Gilks et al., 1998). Amid the MCMC movement into the 1990s and the new 21st century, a shift from the EB to the fully Bayesian (FB) approach to disease mapping was taking place (Bernardinelli and Montomoli, 1992, Gilks et al., 1998, Lawson et al., 1999, Gelman et al., 1995, Congdon, 2001).

The move from EB to FB disease mapping was an analytically natural transition. The two are analytically comparable in the sense that the latter also produces posterior estimates for the prior parameters, but does so by placing non-informative or weakly informative priors on them, an approach that facilitates “data-driven” posterior estimation and inference of the prior parameters (Carlin and Louis, 2000a, MacNab et al., 2004).

Analytic and conceptual transitions take place when the fully Bayesian approach constructs models hierarchically (Cressie and Wikle, 2015) and within a Bayesian hierarchical inferential framework. Within this framework, the prior distribution, such as the previously mentioned gamma or CAR prior, can be postulated as an underlying process model for the underlying risks $\boldsymbol{\psi}$ that give rise to the observed disease incidence and mortality counts. Statistical inference uses Bayes’ rule to compute (approximate) posterior distributions for all unknown quantities and parameters. Conceptually, one can explore meaningful and intuitively appealing prior and hyper-prior options within a Bayesian hierarchical inferential framework for model formulation and Bayesian posterior estimation, learning, and inference of relative risks and parameters of interest and importance. This approach provides a cohesive and principled way to quantify risk uncertainties and uncertainties of all unknown parameters.
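To make the contrast with the plug-in EB estimate concrete, here is a minimal sketch of a fully Bayesian version of the Poisson–gamma example, fitted by a Gibbs sampler. Several choices are assumptions made purely for illustration and are not taken from the works cited above: the shape parameter `v` is held fixed, the hyper-parameter `u` receives a gamma hyper-prior with hypothetical values `a` and `b`, and the data are invented. With these choices both full conditionals are gamma distributions, so each step is a direct draw.

```python
import random

def gibbs_poisson_gamma(observed, expected, v=2.0, a=0.5, b=0.5,
                        n_iter=3000, burn_in=1000, seed=1):
    """Posterior means of the relative risks with a hyper-prior on u."""
    rng = random.Random(seed)
    n = len(observed)
    u = 1.0  # starting value for the hyper-parameter
    sums = [0.0] * n
    kept = 0
    for it in range(n_iter):
        # phi_i | u, O_i ~ Gamma(shape = O_i + v, rate = E_i + u);
        # random.gammavariate takes (shape, scale), hence the reciprocal.
        phi = [rng.gammavariate(o + v, 1.0 / (e + u))
               for o, e in zip(observed, expected)]
        # u | phi ~ Gamma(shape = a + n * v, rate = b + sum(phi))
        u = rng.gammavariate(a + n * v, 1.0 / (b + sum(phi)))
        if it >= burn_in:
            kept += 1
            for i in range(n):
                sums[i] += phi[i]
    return [s / kept for s in sums]

# Hypothetical counts, chosen only for illustration.
observed = [0, 3, 30, 120]
expected = [1.5, 2.0, 25.0, 100.0]
posterior_means = gibbs_poisson_gamma(observed, expected)
print([round(m, 2) for m in posterior_means])
```

Because `u` is redrawn at every iteration, the draws of each relative risk average over hyper-parameter uncertainty instead of conditioning on a single estimate, which is the feature that distinguishes the FB analysis from the plug-in EB estimate.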

With the arrival of the freely available BUGS/WinBUGS software (Spiegelhalter et al., 2003) for Bayesian hierarchical analysis and later the GeoBUGS add-on module (Thomas et al., 2004) to WinBUGS for Bayesian analysis of spatial data, in addition to various publicly available R packages, e.g. R-INLA (Rue et al., 2017) and CARBayes (Lee, 2013), for (approximate) Bayesian analysis of spatial areal data, the fully Bayesian approach to disease mapping model formulation, estimation, and inference became mainstream and has remained so to this day.

4. Multidimensional disease mapping

Multidimensional disease mapping is a broad topic that encompasses spatiotemporal disease mapping (Held and Besag, 1998, Held, 2000, MacNab and Gustafson, 2007, Ugarte et al., 2017), multivariate disease mapping (Kim et al., 2001, Jin et al., 2007, Greco and Trivisano, 2009, Martinez-Beneito, 2013, MacNab, 2018), and multi-array disease mapping, say, when space, time, age (and gender, etc.) stratifications are all considered in a disease mapping model (Held and Besag, 1998, MacNab, 2003a, Martinez-Beneito et al., 2017, Martinez-Beneito and Botella-Rocamora, 2019). The literature on this topic is quite rich (see Lawson, 2018, Martinez-Beneito and Botella-Rocamora, 2019 and references therein). Here, I briefly reflect on common ideas and models that characterize cross-dimensional dependencies and interactions.

Without essential loss of generality, let us begin with adding another dimension to the Poisson data model (1), in which one has a matrix of log relative risks $\boldsymbol{\psi} = (\psi_{ij})$, where $j = 1, 2, \ldots, T$ for spatiotemporal disease mapping or $j = 1, 2, \ldots, J$ for mapping of $J$ diseases.

Spatiotemporal disease mapping models presented in the literature focused mainly on characterizing the presence of space and time risk interactions, in addition to model components of risks in space and time, respectively (Waller et al., 1997, Held and Besag, 1998, Held, 2000). These models are often motivated to facilitate borrowing information and risk smoothing within and between space and time. They are also natural extensions of the spatial risk models.

Some of the models are richly parameterized to hypothesize space and time risk interactions, in addition to decomposing $\boldsymbol{\psi}$ into (parameterized) components of space and time. Typical examples are the models of Waller et al. (1997) and Held (2000), two highly cited and influential papers. For example, Held (2000) proposed a general model formulation

$$\psi_{it} = \alpha_t + \varphi_t + \phi_i + \eta_i + \delta_{it}, \quad i = 1, 2, \ldots, n, \quad t = 1, 2, \ldots, T, \tag{7}$$

and considered types of space–time interactions and associated prior options for $\{\delta_{it}, \forall i, t\}$. These spatiotemporal (interaction) models may be highly parameterized and, in this situation, additional identification constraints may be imposed (Held, 2000).

Alternatively, one can take a parsimonious way to parameterize spatiotemporal models. For example, the following model formulation was considered in the literature (MacNab and Dean, 2001):

$$\psi_{it} = \alpha_t + \theta_i + \delta_{it}, \quad \forall i, t, \tag{8}$$

in which functional analysis may be used to hierarchically model space × time interactions. For instance, as illustrated in MacNab and Dean (2001), MacNab and Gustafson (2007), and MacNab (2007), space–time interactions $\delta_{it}$ can be modelled via spatially varying splines (e.g. B-splines, P-splines, and smoothing splines), in addition to modelling the time components $\alpha_t$ by a fixed spline for a non-linear risk trend. In Ugarte et al. (2010), Lee and Durbán (2011), Ugarte et al. (2017), and Feng (2021), space–time interactions are modelled by tensor-based 2d or 3d P-splines or thin plate splines. Lawson (2018) and MacNab (2018) also considered parsimonious spatiotemporal model parameterizations. They illustrate modelling linear risk trends $\alpha_t = \beta t$ and $\delta_{it} = \delta_i t_j$, which can be viewed as a special (degenerate) case of the previously mentioned spline-based approaches for modelling non-linear risk trends.

Modelling spatially varying non-linear risk trends via spatially varying random walk (RW), auto-regressive (AR), or CAR processes is also a viable option (Held, 2000, Ugarte et al., 2017, MacNab, 2018, Sahu and Böhning, 2021); see Section 9 for an example. Spatiotemporal disease mapping models were mainly developed for posterior prediction of the unobservable relative risks, although a few have been developed for short-term forecasts of disease rates (Etxeberria et al., 2015, Sahu and Böhning, 2021).

Without much difficulty, the spatiotemporal models (7) and (8) can be extended further for multi-array disease mapping. Adding age-effects is one example. Age-effects could be included in (7) or (8) as additional components (Held and Besag, 1998, Goicoa et al., 2016, Martinez-Beneito et al., 2017) or modelled by a spline for non-linear age-effects (MacNab, 2003a). Multivariate spatiotemporal disease mapping is another direction (Tzala and Best, 2008, Jack et al., 2019, Vicente et al., 2020, Vicente et al., 2021).

Multivariate disease mapping has been an active area of research since the early 2000s. The main motivation is to develop methods for jointly modelling diseases that share common risk factors, enabling more effective borrowing of information, and facilitating spatial and cross-spatial risk smoothing. Two main approaches to multivariate disease mapping are considered in the disease mapping literature. One is to model $\psi_{ij}$ as a sum of two components:

$$\psi_{ij} = \theta_i + \delta_{ij}, \quad \forall i, j, \tag{9}$$

where $\theta_i$ is a component shared by all diseases. This is the shared component model (SCM) proposed by Held and Best (2001). SCMs typically hypothesize positive cross-variable local dependencies and correlations (MacNab, 2010). The other, more flexible approach is to place a multivariate GMRF prior on $\text{vec}(\boldsymbol{\psi})$ or $\text{vec}(\boldsymbol{\psi}^\top)$, which I discuss in Section 7.
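Why a shared component tends to imply positive cross-disease dependence can be seen from a short calculation. It rests on a simplifying assumption introduced here only for illustration: the disease-specific terms $\delta_{ij}$ are mutually independent across diseases and independent of $\theta_i$. Under that assumption, for two diseases $j \neq j'$ in the same area $i$,

```latex
\mathrm{Cov}(\psi_{ij}, \psi_{ij'})
  = \mathrm{Cov}(\theta_i + \delta_{ij},\; \theta_i + \delta_{ij'})
  = \mathrm{Var}(\theta_i) \;\geq\; 0 .
```

The within-area covariance between diseases is carried entirely by the shared term and cannot be negative in this simplified setting, which is one reason a multivariate GMRF prior offers more flexibility.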

5. Disease mapping with areal-level covariates: Ecological regression

In the context of disease mapping, including a regression part in a disease mapping model can serve several important and practical purposes. One important purpose of a spatial regression model is to quantify disease risk variations explained and unexplained by the included risk (and/or protective) factors (Clayton et al., 1993, MacNab, 2003b, MacNab, 2003c, MacNab, 2004).

Known as ecological modelling and analysis, ecological regression can serve to identify areal-level characteristics (e.g. environmental exposure, socio-economic deprivation, contextual characteristics) that influence the disease risks. Most of the ecological models developed in the disease mapping literature were designed for this purpose. They are commonly motivated by, and applied to, research into the (social and/or physical) environmental determinants of health (Clayton et al., 1993, MacNab, 2004, Lee and Sarran, 2015, Cameletti et al., 2019, Huang et al., 2021) or population-level health inequality (Lee and Lawson, 2016, Jack et al., 2019).

Ecological regression faces several methodological challenges. One of them is the long-standing problem of ecological bias, which refers to the situation in which the degree of association between risk exposure and disease at the areal level differs from that at the individual level, due to within-area variability in exposures and confounders (Clayton et al., 1993, Wakefield and Salway, 2001, Wakefield, 2007, Lawson, 2018).

Another analytic challenge is spatial confounding, commonly discussed as collinearity between fixed effects and random effects in a spatial GLMM, and in the presence of unmeasured confounding. The literature on ecological studies offers two main approaches to the issue. One is to deconfound the fixed and random effects (Reich et al., 2006, Hughes and Haran, 2013, Ugarte et al., 2017, MacNab, 2018, Zimmerman and Hoef, 2021). Another is to formulate a transformed GMRF or MGMRF (Prates et al., 2015, Hughes, 2015, Prates et al., 2021) so that the fixed and random effects can be modelled independently. Named TGMRF or TMGMRF, the basic idea of this approach is to model the relative risks $\boldsymbol{\psi}$, and their spatiotemporal or multivariate extensions, via a copula-inspired model structure that allows fixed and random components to be estimated independently: the fixed effects are modelled on the univariate margins, whereas the spatial dependencies of the random effects are modelled via a GMRF or MGMRF spatial dependence matrix. One limitation of this approach is that the regression part in the model is designed to explain marginal variations, not spatial dependence. In traditional spatial regression GLMMs, the included covariates may (partially) explain risk variation or spatial risk dependence or both; see MacNab, 2003b, MacNab, 2003c, MacNab, 2004, MacNab, 2009, MacNab, 2010, and Section 9, for illustrative examples.

Of the deconfounding approaches, earlier proposals are known as the restricted spatial regression (RSR) method (Reich et al., 2006, Hughes and Haran, 2013, Ugarte et al., 2017, Zimmerman and Hoef, 2021). The key idea of the RSR approach is to re-parameterize an $n$-dimensional spatial prior and project the original Gaussian spatial prior onto a Gaussian spatial prior on the sub-space orthogonal to the column space of the fixed effects model matrix (Zimmerman and Hoef, 2021). Very recently, Zimmerman and Hoef (2021) examined the RSR approach, named it deconfounding spatial confounding, and called the method into question. In the context of the spatial linear mixed model and its frequentist inference, they showed that the RSR approach can lead to underestimation of the uncertainties associated with the estimation of the fixed effects and is generally inferior to the original spatial regression model.
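The projection step behind RSR can be illustrated with a small sketch. It is not an implementation of any of the cited RSR methods, which re-parameterize the prior itself. It only shows what projecting a vector onto the sub-space orthogonal to the column space of the fixed effects model matrix does, assuming for simplicity that the model matrix holds an intercept and one covariate, in which case the projection $\boldsymbol{I}_n - \boldsymbol{X}(\boldsymbol{X}^\top \boldsymbol{X})^{-1}\boldsymbol{X}^\top$ reduces to taking least-squares residuals. The function name and data are hypothetical.

```python
def residualize(z, x):
    """Project z onto the orthogonal complement of span{1, x}."""
    n = len(z)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # Least-squares residuals are the projection onto the orthogonal complement.
    return [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]

# Hypothetical covariate and a hypothetical draw of spatial random effects.
covariate = [0.1, 0.4, 0.5, 0.9, 1.3]
random_effects = [0.05, 0.30, 0.20, 0.65, 0.80]

restricted = residualize(random_effects, covariate)
print(sum(restricted))
print(sum(r * x for r, x in zip(restricted, covariate)))
```

Both printed sums are zero up to rounding error: the restricted effects are orthogonal to the intercept and to the covariate, so they can no longer compete with the fixed effects for the same variation.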

An alternative to the RSR method is to formulate a spatially misaligned ecological model, misaligned in the sense that the spatial prior of the random effects and the included covariates vary at different spatial scales and/or configurations (Lee and Sarran, 2015, Prates et al., 2019, Azevedo et al., 2021). This approach to deconfounding is also consistent with the work of Paciorek (2010) on spatial confounding, which offered insight into the importance of scale for spatial-confounding bias and the precision of spatial regression estimators. In Paciorek (2010), analytic and simulation results suggested that bias can be reduced when fitting a spatial model in which the variation of the covariate is at a scale smaller than the scale of the unmeasured confounding.

Errors-in-covariate is another important issue in ecological regression, where areal-level covariates are commonly measured imprecisely, or where only surrogates or proxies are available. In the Bayesian disease mapping literature, the main approach is to formulate measurement models for the covariates and incorporate them into a hierarchically formulated Bayesian spatial regression model (Bernardinelli et al., 1998, Wakefield and Morris, 1999, MacNab, 2009, MacNab, 2010, Lawson, 2018).

6. Conditionally formulated GMRFs

Conditionally formulated GMRFs have played important roles in disease mapping and ecological regression (Clayton and Kaldor, 1987, Besag et al., 1991). They are formulated to model spatial dependencies and spatially structured correlation and covariance functions (MacNab, 2011, MacNab, 2016a, MacNab, 2016b, MacNab, 2018). Formulating a GMRF via full conditionals is conceptually appealing for articulating the idea of spatial smoothing, as illustrated earlier for the CAR (5) expressed by its full conditionals (Expression (6)). Analytically, the GMRF full conditionals also facilitate coding the powerful Gibbs sampler as a computational tool for posterior estimation of the GMRFs via MCMC simulations; see [10] for an accessible explanation of the Gibbs sampler.

Several CARs have been proposed since the CAR (6). Three have stood the test of time and are commonly considered prior options for disease mapping. They are the well known intrinsic CAR (iCAR, Besag et al., 1991), proper CAR (pCAR, Cressie, 1993, Sun et al., 1999), and Leroux et al. CAR (LCAR, Leroux et al., 1999); see Table 1 for their full conditionals and associated precision matrices.

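Table 1.

Three CARs commonly used in Bayesian disease mapping: $\boldsymbol{W} = (w_{ij})$, with $w_{ij} = 1$ when the $i$th and $j$th areas are neighbours (denoted $i \sim j$) and $w_{ij} = 0$ otherwise; $w_{i+} = \sum_{j \sim i} w_{ij}$; $\boldsymbol{D}_w = \text{diag}(w_{1+}, \ldots, w_{n+})$. For BYM, the covariance matrix $\boldsymbol{\Sigma}$ is presented, with $\boldsymbol{\psi}_s \sim \text{iCAR}(\sigma_s^2)$ and $\boldsymbol{\psi}_h \sim \text{IIDN}(\sigma_h^2)$. For MBYM, the covariance matrix $\boldsymbol{\Sigma}$ is presented, with $\boldsymbol{\psi}_s \sim \text{iCAR}(\sigma^2)$ and $\boldsymbol{\psi}_h \sim \text{IIDN}(\sigma^2)$. $\boldsymbol{\psi}_s \perp \boldsymbol{\psi}_h$: $\boldsymbol{\psi}_s$ and $\boldsymbol{\psi}_h$ are (assumed) statistically independent. For all models in Table 1: $0 < \lambda < 1$.

| Model | $\mathrm{E}(\psi_i \mid \boldsymbol{\psi}_{-i})$ | $\mathrm{Var}(\psi_i \mid \boldsymbol{\psi}_{-i})$ | $\boldsymbol{\Omega}$ |
|---|---|---|---|
| iCAR($\sigma^2$) (Besag et al., 1991) | $\sum_{j \sim i} \psi_j / w_{i+}$ | $\sigma^2 / w_{i+}$ | $\sigma^{-2}(\boldsymbol{D}_w - \boldsymbol{W})$ |
| pCAR($\lambda, \sigma^2$) (Cressie, 1993) | $\lambda \sum_{j \sim i} \psi_j / w_{i+}$ | $\sigma^2 / w_{i+}$ | $\sigma^{-2}(\boldsymbol{D}_w - \lambda \boldsymbol{W})$ |
| LCAR($\lambda, \sigma^2$) (Leroux et al., 1999) | $\lambda \sum_{j \sim i} \psi_j / (1 - \lambda + \lambda w_{i+})$ | $\sigma^2 / (1 - \lambda + \lambda w_{i+})$ | $\sigma^{-2}(\lambda(\boldsymbol{D}_w - \boldsymbol{W}) + (1 - \lambda)\boldsymbol{I}_n)$ |
| BYM($\sigma_s, \sigma_h$) (Besag et al., 1991) | $\boldsymbol{\psi} = \boldsymbol{\psi}_s + \boldsymbol{\psi}_h$, $\boldsymbol{\psi}_s \perp \boldsymbol{\psi}_h$ | | $\boldsymbol{\Sigma} = \sigma_s^2(\boldsymbol{D}_w - \boldsymbol{W})^{-1} + \sigma_h^2 \boldsymbol{I}_n$ |
| MBYM($c, \sigma$) (MacNab, 2011) | $\boldsymbol{\psi} = \sqrt{\lambda}\,\boldsymbol{\psi}_s + \sqrt{1 - \lambda}\,\boldsymbol{\psi}_h$, $\boldsymbol{\psi}_s \perp \boldsymbol{\psi}_h$ | | $\boldsymbol{\Sigma} = \sigma^2(\lambda(\boldsymbol{D}_w - \boldsymbol{W})^{-1} + (1 - \lambda)\boldsymbol{I}_n)$ |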

The pCAR($\lambda, \sigma$) and LCAR($\lambda, \sigma$) are two-parameter, full-rank GMRFs, where $\lambda$ and $\sigma$ are the respective spatial and non-spatial parameters: $\lambda$ is a spatial dependence or weight or smoothing parameter and $\sigma$ is a (non-spatial) Gaussian scale parameter (Cressie, 1993, Leroux et al., 1999). Typically defined on an irregular lattice of an areal map, the pCAR, with its conditional expectation and precision functions

$$\mathrm{E}(\psi_i \mid \boldsymbol{\psi}_{-i}) = \lambda \sum_{j \sim i} \psi_j / w_{i+}, \quad \mathrm{Prec}(\psi_i \mid \boldsymbol{\psi}_{-i}) = \tau w_{i+}, \quad \tau = \sigma^{-2}, \quad \forall i, \tag{10}$$

is a slight modification of the CAR (6), where $\{w_{i+}, \forall i\}$ in (10) is a set of scaling factors. Consequently, when $\lambda \in (0, 1)$, the pCAR conditional expectation $\mathrm{E}(\psi_i \mid \boldsymbol{\psi}_{-i})$, $\forall i$, is proportional to an average of neighbourhood risks, rather than the sum of the neighbourhood risks as defined in the CAR (6). The pCAR (10) conditional precision is proportional to $w_{i+}$, which implies that the greater the number of neighbours, the higher the conditional precision. That is, borrowing information from a higher number of neighbours could lead to higher precision of posterior risk prediction. This could be a plausible assumption in the context of disease mapping, where an areal map under study typically constitutes an irregular lattice with varying sizes of neighbourhoods over the map.
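The pCAR conditionals can be recovered numerically from the precision matrix $\sigma^{-2}(\boldsymbol{D}_w - \lambda \boldsymbol{W})$ listed in Table 1. The sketch below assumes a zero-mean GMRF and uses the general identities that the conditional mean of $\psi_i$ is $-\sum_{j \neq i}\Omega_{ij}\psi_j/\Omega_{ii}$ and the conditional variance is $1/\Omega_{ii}$. The irregular four-area map, the parameter values, and the function names are hypothetical.

```python
def pcar_precision(W, lam, sigma):
    """Precision matrix sigma^-2 (D_w - lam * W) of the pCAR."""
    tau = sigma ** -2
    n = len(W)
    degrees = [sum(row) for row in W]  # neighbourhood sizes w_{i+}
    return [[tau * ((degrees[i] if i == j else 0.0) - lam * W[i][j])
             for j in range(n)] for i in range(n)]

def conditional_moments(Q, psi, i):
    """Conditional mean and variance of psi[i] in a zero-mean GMRF."""
    mean = -sum(Q[i][j] * psi[j]
                for j in range(len(psi)) if j != i) / Q[i][i]
    return mean, 1.0 / Q[i][i]

# Hypothetical irregular map: area 0 borders areas 1, 2, 3; areas 1 and 2
# border each other; area 3 borders only area 0.
W = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
lam, sigma = 0.9, 1.0
psi = [0.0, 0.3, -0.1, 0.6]

Q = pcar_precision(W, lam, sigma)
mean0, var0 = conditional_moments(Q, psi, 0)  # three neighbours
mean3, var3 = conditional_moments(Q, psi, 3)  # one neighbour
print(mean0, var0)
print(mean3, var3)
```

Area 0, with three neighbours, has conditional mean equal to $\lambda$ times the average of its neighbours' values and conditional variance $\sigma^2/3$. Area 3, with a single neighbour, has conditional variance $\sigma^2$, in line with expression (10).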

Another feature of the pCAR, which is not often discussed in the literature, is that it postulates asymmetric conditional dependency (i.e. direct influence) of $\psi_j$ on $\psi_i$, quantified by $\lambda/w_{i+}$, versus $\psi_i$ on $\psi_j$, quantified by $\lambda/w_{j+}$, provided $w_{i+} \neq w_{j+}$, $i \sim j$, and $0 < \lambda < 1$. That is, the direct influence of area $j$ on its neighbouring area $i$ is inversely proportional to the neighbourhood size of area $i$. Put another way, an area with more neighbours is less influenced by a neighbour with fewer neighbours than the other way round. This could also be an intuitively plausible assumption, consistent with the conditional precision function $\mathrm{Prec}(\psi_i \mid \boldsymbol{\psi}_{-i})$ in (10); one might expect that an area with higher precision of risk prediction should be less influenced by an area with lower precision of (predicted) risk. The spatial dependence parameter $\lambda$ in the pCAR is often called a spatial smoothing parameter; it regulates smoothly varying relative risks over the map.

To facilitate discussion, I name the coefficients of the conditional autoregression in (10) the coefficient of influence functions, or simply influence functions, denoted

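$$\mathrm{Influence}(j,i)^{\mathrm{pCAR}} = \lambda / w_{i+}, \quad j \sim i.$$

The LCAR conditionals also allow asymmetric spatial dependencies, provided $\mathrm{Influence}(j,i) \neq \mathrm{Influence}(i,j)$, $i \sim j$, where

$$\mathrm{Influence}(j,i)^{\mathrm{LCAR}} = \lambda / (1 - \lambda + \lambda w_{i+}), \quad j \sim i.$$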

That is, the LCAR influence function is a non-linear but increasing function of $\lambda$. The spatial dependence parameter $\lambda$ plays three roles. Besides regulating spatial dependence, it is a spatial weight or smoothing parameter: it weights a precision matrix of the iCAR($\sigma$) and a precision matrix of $n$ IIDN($\sigma$) (normal) variates, a mixing of purely local smoothing and global smoothing (Congdon, 2008). In addition, the LCAR conditional variances are non-linear decreasing functions of $\lambda$ (see Table 1); that is, $\lambda$ also partially controls the risk prediction variance of $\psi_i$ conditional on $\psi_j$, $j \sim i$ (MacNab, 2011). As noted in MacNab (2018), the LCAR parameterization is an “entangled” spatial and non-spatial parameterization and, as a result, LCAR has limited options for multivariate generalization (see Section 7).
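The asymmetry of the pCAR and LCAR influence functions can be checked numerically. The sketch below is only an illustration under stated assumptions: the 4-area adjacency matrix `W`, the value `lam = 0.8`, and the function names are hypothetical and do not come from the paper. It evaluates the two influence formulas given above for a hub area (three neighbours) and a peripheral area (one neighbour).

```python
# Hypothetical 4-area map (an assumption for illustration): area 0 borders
# areas 1, 2 and 3; areas 1 and 2 also border each other.
W = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
w_plus = [sum(row) for row in W]  # neighbourhood sizes w_{i+}


def influence_pcar(lam, i):
    """pCAR influence of a neighbour j on area i: lambda / w_{i+}."""
    return lam / w_plus[i]


def influence_lcar(lam, i):
    """LCAR influence of a neighbour j on area i: lambda / (1 - lambda + lambda * w_{i+})."""
    return lam / (1.0 - lam + lam * w_plus[i])


lam = 0.8  # illustrative spatial dependence parameter in (0, 1)

# Areas 0 and 3 are neighbours with different neighbourhood sizes (3 versus 1).
pcar_3_on_0 = influence_pcar(lam, 0)
pcar_0_on_3 = influence_pcar(lam, 3)
lcar_3_on_0 = influence_lcar(lam, 0)
lcar_0_on_3 = influence_lcar(lam, 3)

print(f"pCAR: Influence(3,0) = {pcar_3_on_0:.3f}, Influence(0,3) = {pcar_0_on_3:.3f}")
print(f"LCAR: Influence(3,0) = {lcar_3_on_0:.3f}, Influence(0,3) = {lcar_0_on_3:.3f}")
```

In both models the hub area 0 receives a smaller direct influence from area 3 than it exerts on area 3, which is the asymmetry described in the text.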

The iCAR($\sigma$) is a GMRF with a singular precision matrix of rank $n-1$. It is the limiting distribution of the pCAR($\lambda,\sigma$) and LCAR($\lambda,\sigma$) (when $\lambda \to 1$). The iCAR($\sigma$) is typically considered as a spatial prior for modelling spatially structured risk variation or clustered heterogeneity. It is commonly used when the $n$-vector of log relative risks $\boldsymbol{\psi}$ is modelled as the sum of two components: $\boldsymbol{\psi}_s \sim$ iCAR($\sigma_s^2$), for modelling spatially structured heterogeneity or effects of omitted covariates that are spatially varying, and $\boldsymbol{\psi}_h \sim$ IIDN($\sigma_h^2$), for modelling extra-Poisson variation or effects of omitted covariates that are randomly varying. This is the well-known Besag, York, and Mollié (BYM) model (Besag et al., 1991), widely used in Bayesian disease mapping and ecological regression. This model is highly parameterized. The BYM is typically used as a prior in Bayesian disease mapping, where estimation and inference of the relative risks may be implemented with (weakly) informative priors for (functions of) the scale parameters $\sigma_s^2$ and/or $\sigma_h^2$ (Thomas et al., 2004, MacNab, 2011).

To improve computational efficiency and parameter identification, MacNab (2011) proposed a modification to the BYM, named MBYM: $\boldsymbol{\psi} = \sqrt{\lambda}\,\boldsymbol{\psi}_s + \sqrt{1-\lambda}\,\boldsymbol{\psi}_h$, $\boldsymbol{\psi}_s \sim$ iCAR($\sigma^2$), $\boldsymbol{\psi}_h \sim$ IIDN($\sigma^2$), where $\lambda \in (0,1)$ is a weight parameter, weighting between the covariance matrix of the iCAR and the covariance matrix of the IIDN; see Table 1 for the BYM and MBYM covariance matrices.

The LCAR has gained popularity due in part to the analytic result that when $\lambda = 0$, the LCAR reduces to independent and identical Gaussian priors for the log relative risks, whereas the pCAR, when $\lambda = 0$, reduces to $n$ independent Gaussian priors with prior precisions proportional to $w_{i+}$, $\forall i$ (MacNab, 2003c, MacNab, 2011, Lee, 2011).
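The two limiting cases can be verified by building the precision matrices directly. The following is a minimal sketch, assuming the standard forms $\tau(D_w - \lambda W)$ for the pCAR (as implied by (10)) and $\tau(\lambda(D_w - W) + (1-\lambda)I_n)$ for the LCAR; the 4-area map, the value of `sigma`, and the function names are hypothetical.

```python
def pcar_precision(W, lam, sigma):
    """pCAR precision matrix tau * (D_w - lam * W), with tau = sigma^-2."""
    n = len(W)
    tau = sigma ** -2
    w_plus = [sum(row) for row in W]
    return [[tau * ((w_plus[i] if i == j else 0.0) - lam * W[i][j])
             for j in range(n)] for i in range(n)]


def lcar_precision(W, lam, sigma):
    """LCAR precision matrix tau * (lam * (D_w - W) + (1 - lam) * I_n)."""
    n = len(W)
    tau = sigma ** -2
    w_plus = [sum(row) for row in W]
    return [[tau * (lam * ((w_plus[i] if i == j else 0.0) - W[i][j])
                    + (1.0 - lam) * (1.0 if i == j else 0.0))
             for j in range(n)] for i in range(n)]


# Hypothetical 4-area map: area 0 borders areas 1, 2, 3; areas 1 and 2 are neighbours.
W = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
sigma = 0.5  # illustrative scale parameter, so tau = 4

# lambda = 0: both priors become independent, but with different precisions.
Q_pcar0 = pcar_precision(W, 0.0, sigma)
Q_lcar0 = lcar_precision(W, 0.0, sigma)
diag_pcar0 = [Q_pcar0[i][i] for i in range(4)]  # proportional to w_{i+}
diag_lcar0 = [Q_lcar0[i][i] for i in range(4)]  # identical across areas

# lambda = 1: both precision matrices have zero row sums (the singular iCAR limit).
row_sums_pcar1 = [sum(row) for row in pcar_precision(W, 1.0, sigma)]
row_sums_lcar1 = [sum(row) for row in lcar_precision(W, 1.0, sigma)]

print(diag_pcar0, diag_lcar0)
```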

However, compared to LCAR, pCAR has its own advantages. Its spatial and non-spatial parameters play separate and different roles: one regulates spatial dependency, and the other controls non-spatial variance. With its rich options of multivariate generalization, the pCAR, and some of its adaptive and multivariate extensions, have theoretical and practical appeal for modelling and interpreting spatial dependencies (MacNab, 2018, MacNab, 2020, MacNab, 2021a, MacNab, 2021b; see Sections 7, 8).

Of the five CAR models, none has been shown to outperform the others in all disease mapping situations. This is consistent with the analytic results that the iCAR, pCAR, and LCAR have different spatial dependence and correlation functions (see details in MacNab, 2011, MacNab, 2014), and the five CAR models may play nuanced roles in Bayesian disease mapping applications. Bayesian sensitivity analysis under various prior and hyper-prior options for Bayesian estimation, learning, and inference is an important part of Bayesian disease mapping application (MacNab, 2011). Together with goodness-of-fit and complexity assessment, say, based on the deviance information criterion (the deviance, pD, and DIC scores, [125]; also see Section 9), it remains a viable and commonly taken approach to model evaluation, comparison, and selection.
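As a reminder of how the deviance, pD, and DIC scores relate, the sketch below computes them from posterior draws of Poisson means. It is an illustration only: the counts `y` and the draws `mu_draws` are made-up numbers, the function names are hypothetical, and plugging in the posterior mean of the Poisson means is one common choice of plug-in estimate, not necessarily the one used in the analyses reported here.

```python
import math


def poisson_deviance(y, mu):
    """Deviance taken as -2 * Poisson log-likelihood of counts y given means mu."""
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mu))
    return -2.0 * loglik


def dic_summary(y, mu_draws):
    """Return (posterior mean deviance, pD, DIC) from posterior draws of the means."""
    d_bar = sum(poisson_deviance(y, mu) for mu in mu_draws) / len(mu_draws)
    # Plug-in deviance evaluated at the posterior mean of each area's Poisson mean.
    mu_bar = [sum(col) / len(mu_draws) for col in zip(*mu_draws)]
    p_d = d_bar - poisson_deviance(y, mu_bar)
    return d_bar, p_d, d_bar + p_d


# Made-up data: observed counts for three areas and three posterior draws.
y = [12, 7, 30]
mu_draws = [
    [10.0, 8.0, 28.0],
    [13.0, 6.5, 31.0],
    [11.5, 7.5, 33.0],
]
d_bar, p_d, dic = dic_summary(y, mu_draws)
print(f"mean deviance = {d_bar:.3f}, pD = {p_d:.3f}, DIC = {dic:.3f}")
```

Lower DIC indicates a better trade-off between fit (mean deviance) and complexity (pD) among the models being compared.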

7. Multivariate GMRFs for multidimensional disease mapping

In the broad context of multidimensional disease mapping discussed earlier, multivariate GMRFs play important roles in modelling multidimensional spatial dependencies and cross-covariances, enabling borrowing of information over multiple dimensions, and facilitating spatial and cross-spatial smoothing. The five commonly used disease mapping models in Table 1 have multivariate extensions presented in the literature. A list of illustrative models is presented in Table 2, where their joint covariance matrices are given.

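Table 2.

A selection of multivariate CARs. $\boldsymbol{\psi}$ is an $n \times K$ matrix, for $K$-variate CARs.

Model | $\boldsymbol{\psi}$ | $\Sigma_M$
MiCAR($\Sigma$) (G+V 2003)$^1$ | $\mathrm{vec}(\boldsymbol{\psi})$ | $(D_w - W)^{-1} \otimes \Sigma$
MpCAR($c, \Sigma$) (G+V 2003)$^1$ | $\mathrm{vec}(\boldsymbol{\psi})$ | $(D_w - cW)^{-1} \otimes \Sigma$
MLCAR($c, \Sigma$) (M+G 2007)$^2$ | $\mathrm{vec}(\boldsymbol{\psi})$ | $(c(D_w - W) + (1-c)I_n)^{-1} \otimes \Sigma$
MLCAR($\boldsymbol{c}, \Sigma$) (MacNab, 2018) | $\mathrm{vec}(\boldsymbol{\psi})$ | $(I_n \otimes A)\,\mathrm{Bdiag}[(c_k(D_w - W) + (1-c_k)I_n)^{-1}]\,(I_n \otimes A)^\top$
MpCAR($C_s, \Sigma$) (Jin et al., 2007) | $\mathrm{vec}(\boldsymbol{\psi})$ | $(A \otimes I_n)\,S(C_s)^{-1}\,(A \otimes I_n)^\top$
MpCAR($C, \Sigma$) (G+T 2009)$^3$ | $\mathrm{vec}(\boldsymbol{\psi})$ | $(A \otimes I_n)\,S(C)^{-1}\,(A \otimes I_n)^\top$
MpCAR($C, \boldsymbol{\rho}, \boldsymbol{\sigma}$) (Sain et al., 2010) | $\mathrm{vec}(\boldsymbol{\psi})$ | $(\boldsymbol{\sigma}^{2} \otimes I_n)\,\tilde{S}(\boldsymbol{\rho}, C)^{-1}\,(\boldsymbol{\sigma}^{2} \otimes I_n)$
MultBYM($\Sigma_s, \Sigma_h$) (Thomas et al., 2004) | $\mathrm{vec}(\boldsymbol{\psi}) = \mathrm{vec}(\boldsymbol{\psi}_s) + \mathrm{vec}(\boldsymbol{\psi}_h)$ | $(D_w - W)^{-1} \otimes \Sigma_s + I_n \otimes \Sigma_h$

$^1$: Gelfand and Vounatsou (2003). $^2$: MacNab and Gustafson (2007). $^3$: Greco and Trivisano (2009). $\Sigma$ is a $K \times K$ covariance matrix. $\mathrm{Bdiag}[(c_k(D_w - W) + (1-c_k)I_n)^{-1}] = \mathrm{Bdiag}[(c_1(D_w - W) + (1-c_1)I_n)^{-1}, \ldots, (c_K(D_w - W) + (1-c_K)I_n)^{-1}]$. $S(C_s) = D_m \otimes I_K - (W \otimes C_s)$, $S(C) = D_m \otimes I_K - (W_U \otimes C + W_U^\top \otimes C^\top)$, $\tilde{S}(\boldsymbol{\rho}, C) = I_n \otimes \boldsymbol{\rho} + S(C)$, where $D_m = \mathrm{diag}(m_1, \ldots, m_n)$, $C_s = C_s^\top$, $C_s$ is a symmetric matrix or diagonal matrix, $C \neq C^\top$, $W_U$ is the upper triangular part of $W$; $AA^\top = \Sigma$. $\boldsymbol{\psi}_s \sim$ MiCAR($\Sigma_s$), $\boldsymbol{\psi}_h \sim$ IIDN($\Sigma_h$); $\boldsymbol{\psi}_s$ and $\boldsymbol{\psi}_h$ are (assumed) statistically independent. $\boldsymbol{\psi}_s \sim$ MiCAR($\Sigma$), $\boldsymbol{\psi}_h \sim$ IIDN($\Sigma$); $\boldsymbol{\psi}_{ks}$ and $\boldsymbol{\psi}_{kh}$, $\forall k$, are (assumed) statistically independent.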

For mapping two or more diseases, for example, conditionally formulated multivariate GMRFs, such as the 2-fold CAR of Kim et al. (2001), the proper multivariate CARs of Gelfand and Vounatsou (2003), and the coregionalized MpCARs of Jin et al. (2007), were proposed for modelling multivariate disease risks that are correlated at co-locations as well as within and across neighbourhoods (i.e. cross-correlation between diseases at neighbouring locations). It is worth mentioning that the three groups of authors took three different approaches to arrive at their MCARs, all of which are multivariate generalizations of the pCAR. Working with a matrix of $\boldsymbol{\psi}$, one has a variety of ways to construct MGMRFs; see [89] for a discussion and references therein.

Since the influential works of Besag (1974) and Mardia (1988), MGMRFs are often formulated via full conditionals (e.g. Kim et al., 2001, Gelfand and Vounatsou, 2003, Sain and Cressie, 2007, Sain et al., 2011). However, as shown in MacNab (2018), formulation and implementation of MGMRFs face several challenges, most notably the entanglement of spatial and non-spatial parameterizations, compliance with the symmetry requirement, and enforcement of positivity. With the exception of separable MCARs, neither the Besag (1974) nor the Mardia (1988) framework readily enables multivariate generalizations of all CARs.

One example is the LCAR, for which non-separable multivariate generalization of full conditionals is not readily derived, due in part to its entangled spatial and non-spatial parameterizations. The MLCAR(c ,Σ ) presented in Table 2 was derived via linear transformation of independent LCARs (MacNab, 2016a), a method known as coregionalization (Jin et al., 2007).

Linear transformation is a well-known and powerful idea for multivariate generalizations of univariate distributions and models. In a series of recent papers (MacNab, 2016a, MacNab, 2016b, MacNab, 2018, MacNab, 2021a, MacNab, 2021b), building on the works of Gelfand et al. (2004), Jin et al. (2007), and Greco and Trivisano (2009), MacNab develops a coregionalization framework that connects and unifies the various lines of MGMRF development. The framework enables systematic development, categorization, and implementation of a broad range of multivariate models via linear or spatially varying coregionalization, with extensions to locally adaptive models. The coregionalization methods readily facilitate multivariate generalizations of (any) GMRF and contain the major MCARs proposed in the literature.

To briefly explain the key ideas, consider a $p$-variate latent MGMRF construction $\mathrm{vec}(\boldsymbol{\eta}) \sim \mathrm{MCAR}(0, S(C))$ with the following precision matrix (Greco and Trivisano, 2009, MacNab, 2018)

$$S(C) = D_m \otimes I_p - (W_U \otimes C + W_U^\top \otimes C^\top), \quad C \neq C^\top, \tag{11}$$

provided $S(C) > 0$, where $C$ is a $p \times p$ full matrix of spatial dependence parameters, $D_m = \mathrm{diag}(m_1, m_2, \ldots, m_n)$, $\{m_i, \forall i\}$ are area-specific scaling factors, $W_U$ is the upper triangular part of $W$, and $W$ is the “neighbourhood” connectivity or weight matrix. $S(C)$ is a spatial dependence matrix for locally dependent MGMRF constructions. Expression (11) simplifies to $S(C_s) = D_m \otimes I_p - (W \otimes C_s)$ when $C_s$ is a symmetric matrix or a diagonal matrix of spatial dependence parameters. It also reduces to a separable model when $C = cI_p$.

Without the “entanglement” challenges, MGMRFs of construction (11) can be readily formulated within the Besag (1974) or Mardia (1988) framework (MacNab, 2018).
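To make construction (11) concrete, the sketch below assembles $S(C)$ with a hand-written Kronecker product for a small example. All inputs are assumptions chosen for illustration: a hypothetical 4-area map, $p = 2$, scaling factors $m_i = w_{i+}$, and arbitrary matrices `C` and `C_s`. Strict diagonal dominance is used here only as a simple sufficient check for $S(C) > 0$; it is not a condition stated in the paper.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def transpose(M):
    return [list(col) for col in zip(*M)]


def identity(p):
    return [[1.0 if k == l else 0.0 for l in range(p)] for k in range(p)]


def spatial_dependence_matrix(W, C, m):
    """S(C) = D_m (x) I_p - (W_U (x) C + W_U' (x) C'), as in expression (11)."""
    n, p = len(W), len(C)
    W_U = [[W[i][j] if j > i else 0 for j in range(n)] for i in range(n)]
    D_m = [[m[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    first = kron(D_m, identity(p))
    upper = kron(W_U, C)
    lower = kron(transpose(W_U), transpose(C))
    size = n * p
    return [[first[a][b] - (upper[a][b] + lower[a][b]) for b in range(size)]
            for a in range(size)]


def is_symmetric(M, tol=1e-12):
    return all(abs(M[a][b] - M[b][a]) <= tol
               for a in range(len(M)) for b in range(len(M)))


def is_strictly_diagonally_dominant(M):
    return all(M[a][a] > sum(abs(M[a][b]) for b in range(len(M)) if b != a)
               for a in range(len(M)))


# Hypothetical 4-area map and illustrative parameter values.
W = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
m = [sum(row) for row in W]      # assumption: m_i = w_{i+}
C = [[0.3, 0.1], [0.2, 0.3]]     # asymmetric: C != C'
S = spatial_dependence_matrix(W, C, m)

# With a symmetric C_s, (11) should coincide with D_m (x) I_p - W (x) C_s.
C_s = [[0.3, 0.1], [0.1, 0.2]]
S_sym = spatial_dependence_matrix(W, C_s, m)
D_m = [[m[i] if i == j else 0.0 for j in range(4)] for i in range(4)]
direct = [[x - y for x, y in zip(r1, r2)]
          for r1, r2 in zip(kron(D_m, identity(2)), kron(W, C_s))]
max_diff = max(abs(S_sym[a][b] - direct[a][b]) for a in range(8) for b in range(8))

print(is_symmetric(S), is_strictly_diagonally_dominant(S), max_diff)
```

The resulting matrix is symmetric even though `C` is not, because the two Kronecker terms are transposes of each other.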

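Linear transformation of $\mathrm{vec}(\boldsymbol{\eta})$, denoted

$$\mathrm{vec}(\boldsymbol{\psi}) = (I_n \otimes A)\,\mathrm{vec}(\boldsymbol{\eta}), \tag{12}$$

leads to a coregionalization MGMRF (cMGMRF) with covariance matrix (Jin et al., 2007, Greco and Trivisano, 2009, MacNab, 2016a, MacNab, 2016b, MacNab, 2018):

$$\Sigma_{\mathrm{vec}(\boldsymbol{\psi})}^{\mathrm{cMGMRF}}(C, A) = (I_n \otimes A)\,S(C)^{-1}\,(I_n \otimes A)^\top, \tag{13}$$

where $A$ is a $K \times K$ coregionalization coefficients matrix, $AA^\top = \Sigma > 0$, and $\Sigma$ is a covariance matrix.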
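A quick numerical check of (12) and (13) in the separable case may help. Inverting (13) gives the precision matrix $(I_n \otimes A^{-\top})\,S(C)\,(I_n \otimes A^{-1})$; when $C = cI_p$ and hence $S(C) = (D_m - cW) \otimes I_p$, this should equal $(D_m - cW) \otimes \Sigma^{-1}$. The sketch below verifies this identity for a hypothetical 4-area map with $p = 2$, $m_i = w_{i+}$, and arbitrary values of `c` and `A`; all of these inputs and the helper names are assumptions made for illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical inputs: 4-area map, m_i = w_{i+}, common dependence parameter c.
W = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
n = len(W)
c = 0.9
base = [[(sum(W[i]) if i == j else 0.0) - c * W[i][j] for j in range(n)]
        for i in range(n)]                      # D_m - c W

A = [[1.0, 0.0], [0.5, 2.0]]                    # coregionalization coefficients
Sigma = matmul(A, transpose(A))                 # Sigma = A A'

# Precision implied by (13): (I_n (x) A^{-T}) S(C) (I_n (x) A^{-1}).
S = kron(base, identity(2))                     # separable S(C) with C = c I_p
A_inv = inv2(A)
precision = matmul(matmul(kron(identity(n), transpose(A_inv)), S),
                   kron(identity(n), A_inv))

# Separable form: (D_m - c W) (x) Sigma^{-1}.
expected = kron(base, inv2(Sigma))
max_diff = max(abs(precision[a][b] - expected[a][b])
               for a in range(2 * n) for b in range(2 * n))
print(max_diff)
```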

Notice that (12) and (13) exemplify one way to decompose and mix spatial and non-spatial components that characterizes and parameterizes cross-spatial interactions. Alternatively, letting the latent fields be $\mathrm{vec}(\boldsymbol{\eta}) \sim \mathrm{MVN}(0, I_n \otimes \Sigma)$, a spatially varying (non-stationary) coregionalization (SVC) construction may be formulated via

$$\mathrm{vec}(\boldsymbol{\psi}) = H(C)\,\mathrm{vec}(\boldsymbol{\eta}), \tag{14}$$

to have the following covariance matrix (MacNab, 2018):

$$\Sigma_{\mathrm{vec}(\boldsymbol{\psi})}(C, \Sigma) = H(C)\,(I_n \otimes \Sigma)\,H(C)^\top, \tag{15}$$

where $H(C)H(C)^\top = S(C)^{-1}$. In SVC (15) the latent variables are only correlated at the same locations. SVC constructions with latent (M)GMRF can also be readily formulated.

The decomposition and mixing schemes of (12) and (14) are typically used to produce valid covariance and cross-covariance functions for cMGMRFs. The precision matrices of the cMGMRFs lose their appeal for interpreting cross-spatial dependencies (MacNab, 2018, MacNab, 2021a, MacNab, 2021b).

As an alternative to (12), (14), a cMGMRF can be formulated via

$$\mathrm{vec}(\boldsymbol{\psi}) = (I_n \otimes \boldsymbol{\sigma})\,\mathrm{vec}(\boldsymbol{\eta}) \tag{16}$$

to have the following precision matrix:

$$\Omega_{\mathrm{vec}(\boldsymbol{\psi})}^{\mathrm{MGMRF}}(C, \boldsymbol{\rho}, \boldsymbol{\sigma}) = (I_n \otimes \boldsymbol{\sigma}^{-1})\,\tilde{S}(\boldsymbol{\rho}, C)\,(I_n \otimes \boldsymbol{\sigma}^{-1}), \tag{17}$$

where $\tilde{S}(\boldsymbol{\rho}, C) = I_n \otimes \boldsymbol{\rho} + S(C)$, $\boldsymbol{\rho}$ is a $p \times p$ non-spatial (within-location) partial correlation matrix, and $\boldsymbol{\sigma}$ is a diagonal matrix of $p$ scale parameters (MacNab, 2018, MacNab, 2020, MacNab, 2021a). Notice that $\tilde{S}(\boldsymbol{\rho}, C)$ is a precision (dependence) matrix of a latent MGMRF; its full conditionals are readily available (Sain et al., 2011, MacNab, 2018, MacNab, 2020, MacNab, 2021a). It models spatial dependencies by $S(C)$ and non-spatial dependencies by $\boldsymbol{\rho}$. A main feature of this approach to separating and mixing the cMGMRF spatial and non-spatial components, with its spatial and non-spatial parameterizations, is that the cMGMRF preserves the appeal of interpreting cross-spatial dependencies in the latent MGMRF with precision matrix $\tilde{S}(\boldsymbol{\rho}, C)$; also see MacNab, 2018, MacNab, 2021a for connections between the cMGMRFs (13), (17).

With options of parameterization for $S(C)$, $\boldsymbol{\rho}$, $A$, $\Sigma$, the three constructions contain all major proposals of MGMRFs to date and much more; they are rich options of multivariate generalizations of pCAR. These constructions may be parameterized to build multivariate spatial and spatiotemporal smoothers, as well as multivariate spatial and spatiotemporal covariance models. cMGMRFs of constructions (13), (15) could be parameterized for latent spatial components and factor analysis, including, but not limited to, principal spatial components analysis and dimensionality reduction (see MacNab, 2018, MacNab, 2021a for illustrative examples).

Of the cMGMRF (17) construction, options of parameterizations for the spatial and non-spatial components may be considered for their intuitively plausible characterizations and interpretations of conditional spatial dependencies within and between variables, as well as local interactions in space and time (Sain et al., 2011, MacNab, 2018, MacNab, 2020, MacNab, 2021a, MacNab, 2021b, Prates et al., 2021).

A $p$-variate GMRF that allows for asymmetric cross-covariance functions can be formulated by linear coregionalization of a latent MGMRF with a full matrix $C \neq C^\top$ of spatial and cross-spatial dependence parameters, or by spatially varying coregionalization of $p$ independent GMRFs; the latter is a non-stationary adaptive MGMRF with adaptive non-spatial parameterization (Martinez-Beneito, 2020, MacNab, 2021a). In addition, as I show in Section 8, non-stationary GMRFs that allow for asymmetric pair-wise spatial dependencies can be formulated via adaptive parameterization.

8. Adaptive CARs

GMRFs are undirected graphical models, commonly defined for a lattice system of nodes and edges (Rue and Held, 2005). For the GMRFs commonly used in disease mapping (say, those in Table 1), the edges are defined by the $W$ matrix of a given map and its neighbourhood definition. These GMRFs often have a single spatial and/or scale parameter. In the context of disease mapping, they postulate pair-wise interactions of neighbouring risks. They typically lead to smoothly varying posterior relative risk predictions and lack the flexibility to allow within- and between-neighbourhood risk heterogeneities that likely happen in parts of the map (MacNab et al., 2006, Brewer and Nolan, 2007).

To mitigate the mentioned limitations, adaptive CARs have been proposed with applications to analysis of imaging data (Brezger et al., 2007), analysis of correlated data (Reich and Hodges, 2008), and disease mapping (Lu and Carlin, 2005, MacNab et al., 2006, MacNab, 2018, Lu et al., 2007, Brewer and Nolan, 2007, Congdon, 2008, Lee and Mitchell, 2012, Rushworth et al., 2017, Gao and Bradley, 2019, Corpas-Burgos and Martinez-Beneito, 2020, among others).

Adaptive CARs proposed to date can be broadly grouped under two motivating directions. One is to hierarchically formulate an adaptive CAR (GMRF) with an unknown adjacency matrix to be modelled and estimated from data. This is the direction taken by Lu and Carlin (2005), Lu et al. (2007), Lee and Mitchell (2012), Rushworth et al. (2017), and Gao and Bradley (2019), among others. With the exception of Gao and Bradley (2019), these authors proposed adaptive iCAR, CAR (5), and LCAR in which the non-zero elements of $W = (w_{ij}, i \sim j)$ were modelled as Bernoulli random variates or random variates on the unit interval (i.e. $w_{ij} \in (0,1)$, $i \sim j$, Rushworth et al., 2017). In Gao and Bradley (2019), an adaptive iCAR was formulated for an unknown adjacency matrix $W$, where $w_{ij}$, $i \neq j$, were modelled as Bernoulli random variates. This line of research is primarily concerned with adjacency modelling for boundary analysis and detection (Lu et al., 2007, Lee and Mitchell, 2012, Gao and Bradley, 2019) and/or analysis of spatial discontinuity in disease risk maps (Lee et al., 2014, Rushworth et al., 2017, Sahu and Böhning, 2021).

An adaptive LCAR was also proposed in Baptista et al. (2016), in which the $W$ matrix is replaced by a similarity matrix $S = (S_{ij})$, where the elements $S_{ij}$, $i \sim j$, are functions of covariates that characterize risk similarities over a map.

Formulating adaptively parameterized CARs over a given $W$ is the second approach. Under this approach, adaptive CARs have been proposed for each of the CARs presented in Table 1; the key proposals are presented in Table 3. This group of adaptive CARs may be used to model locally varying (asymmetric) spatial dependencies, interactions, and heterogeneities. These adaptive CARs might capture micro-level phenomena, such as locally varying (asymmetric) risk dependencies, influences, and heterogeneities, that typically manifest as macro-level phenomena such as spatial risk heterogeneities and discontinuity. They can be broadly sub-grouped into three categories. One contains CARs with adaptive spatial (dependence) parameters; they were typically motivated to model locally varying spatial dependencies (MacNab et al., 2006, Congdon, 2008, MacNab, 2018, MacNab, 2021a). Another includes CARs with adaptive scale or precision parameters that serve as spatial weights for modelling weighted pair-wise risk interactions and locally varying risk heterogeneities (Corpas-Burgos and Martinez-Beneito, 2020). CARs with adaptive spatial and scale parameters make up the third category. Here, I highlight the key features of the three categories of adaptive CARs.

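Table 3.

Key options of adaptively parameterized CAR in the literature.

Model | $\mathrm{Influence}(j,i)$ | $\mathrm{Var}(\psi_i \mid \boldsymbol{\psi}_{-i})$ | $\Sigma$
pCAR(a) | $c_i / w_{i+}$ | $\sigma^2 / w_{i+}$ |
pCAR(b) (MacNab, 2018) | $\lambda c_i c_j / w_{i+}$ | $\sigma^2 / w_{i+}$ | $\sigma^2 (D_w - \lambda W^c)^{-1}$
LCAR(a) (MacNab et al., 2006) | $1 / (c_i^{-1} + (w_{i+} - 1))$ | $\sigma^2 / w_{i+}(c_i)$ |
LCAR(b) (Congdon, 2008) | $c_j / (c_i^{-1} + (w_{i+} - 1))$ | $\sigma^2 / w_{i+}(c_i)$ | $\sigma^2 (\boldsymbol{c}(D_w - W)\boldsymbol{c} + (I_n - \boldsymbol{c}))^{-1}$
LCAR(c) (MacNab, 2018) | $c_j / (c_i^{-1/2} + (w_{i+} - 1))$ | $\sigma^2 / w_{i+}(c_i)$ | $\sigma^2 (\boldsymbol{c}^{1/2}(D_w - W)\boldsymbol{c}^{1/2} + (I_n - \boldsymbol{c}))^{-1}$
iCAR(a) (C-B 2020)$^2$ | $c_j / \sum_{j \sim i} c_j$ | $\sigma^2 / w_{i+}^{c}$ | $\sigma^2 (D_w^{c} - W^{c})^{-1}$
pCAR(c) | $\lambda c_j / \sum_{j \sim i} c_j$ | $\sigma^2 / w_{i+}^{c}$ | $\sigma^2 (D_w^{c} - \lambda W^{c})^{-1}$
LCAR(d) (C-B 2020)$^2$ | $\lambda c_i c_j / w_{i+}(c, \lambda)$ | $\sigma^2 / w_{i+}(c, \lambda)$ | $\sigma^2 (\lambda (D_w^{c} - W^{c}) + (1-\lambda) I_n)^{-1}$
LCAR(e) (C-B 2020)$^2$ | $\lambda c_j / w_{i+}(\tilde{c}, \lambda)$ | $\sigma^2 / (c_i w_{i+}(\tilde{c}, \lambda))$ | $\sigma^2 (\lambda (D_w^{c} - W^{c}) + (1-\lambda) \boldsymbol{c})^{-1}$
iCAR(b) (MacNab, 2018) | $c_j / (c_i w_{i+})$ | $(\sigma s_i)^2 / w_{i+}$ | $\tilde{\boldsymbol{\sigma}} (D_w - W)^{-1} \tilde{\boldsymbol{\sigma}}$
pCAR(d) (MacNab, 2018) | $\lambda c_j / (c_i w_{i+})$ | $(\sigma s_i)^2 / w_{i+}$ | $\tilde{\boldsymbol{\sigma}} (D_w - \lambda W)^{-1} \tilde{\boldsymbol{\sigma}}$
LCAR(f) (MacNab, 2018) | $\lambda c_j / (c_i w_{i+}^{\lambda})$ | $(\sigma s_i)^2 / w_{i+}^{\lambda}$ | $\tilde{\boldsymbol{\sigma}} (\lambda (D_w - W) + (1-\lambda) I_n)^{-1} \tilde{\boldsymbol{\sigma}}$
pCAR(e) (MacNab, 2018) | $\lambda c_i c_j (\sigma_i \sigma_j)^{-1} / w_{i+}$ | $\sigma_i^2 / w_{i+}$ | $\boldsymbol{\sigma} (D_w - \lambda W^{c})^{-1} \boldsymbol{\sigma}$
iCAR, Brezger et al. (2007) | $\tilde{w}_{ij} / \tilde{w}_{i+}$ | $\sigma^2 / \tilde{w}_{i+}$ | $\sigma^2 (D_{\tilde{w}} - \tilde{W})^{-1}$, $\tilde{w}_{ij} = c_{ij} w_{ij}$
iCAR, Brewer (2007)$^1$ | $\tau_{ij} / w_{i+}^{\tau}$ | $1 / w_{i+}^{\tau}$ | $(D_w^{\tau} - W^{\tau})^{-1}$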

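Brewer (2007)$^1$: Brewer and Nolan (2007) and Reich and Hodges (2008); C-B 2020$^2$: Corpas-Burgos and Martinez-Beneito (2020), with $c_i^{1/2}$ replaced by $c_i$. $\boldsymbol{b}_{-i} = c(b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n)$; $W = (w_{ij})$, $w_{ij} = 1$ when $i \sim j$ and $w_{ij} = 0$ otherwise; $w_{i+} = \sum_{j \sim i} w_{ij}$, $D_w = \mathrm{diag}(w_{1+}, \ldots, w_{n+})$; $w_{i+}(c_i) = 1 - c_i + c_i w_{i+}$; $w_{i+}(c_i) = c_i^{-1/2} + (w_{i+} - 1)$; $W^{\tau} = (w_{ij}^{\tau})$, $w_{ij}^{\tau} = w_{ij}\tau_{ij}$, $w_{i+}^{\tau} = \sum_{j \sim i} w_{ij}^{\tau}$, $D_w^{\tau} = \mathrm{diag}(w_{1+}^{\tau}, \ldots, w_{n+}^{\tau})$; $\tau_{ij} = \tau_i \tau_j / (\tau_i + \tau_j)$, $\forall i$, in Brewer and Nolan (2007); $\tau_{ij} = \tau_i \tau_j$, $\tau_i = \sigma_i^{-2}$, $\forall i$, in Reich and Hodges (2008). $W^{c} = (w_{ij}^{c})$, $w_{ij}^{c} = w_{ij} c_i c_j$, $w_{i+}^{c} = \sum_{j \sim i} w_{ij}^{c}$, $D_w^{c} = \mathrm{diag}(w_{1+}^{c}, \ldots, w_{n+}^{c})$; $w_{i+}(c, \lambda) = 1 - \lambda + \lambda w_{i+}^{c}$, $w_{i+}(\tilde{c}, \lambda) = 1 - \lambda + \lambda \sum_{j \sim i} c_j$; $\boldsymbol{c} = \mathrm{diag}(c_1, c_2, \ldots, c_n)$, $\boldsymbol{\sigma} = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$. For iCAR(b), pCAR(d), and LCAR(f): $\tilde{\boldsymbol{\sigma}} = \mathrm{diag}(\tilde{\sigma}_1, \tilde{\sigma}_2, \ldots, \tilde{\sigma}_n)$, $\tilde{\sigma}_i = \sigma s_i$, $c_i = s_i^{-1}$, $\forall i$. $E(\psi_i \mid \boldsymbol{\psi}_{-i}) = \sum_{j \sim i} \mathrm{Influence}(j,i)\,\psi_j$.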

8.1. Adaptive CARs of the category I

There are important differences between the adaptive pCARs and LCARs of this category. A common feature of the adaptive pCARs is that the adaptive parameters $\{c_i, \forall i\}$ control spatial dependencies in the conditional mean functions, whereas the scale parameter $\sigma$ regulates (log relative) risk prediction variances in the conditional variance functions (see Table 3). The notable differences between the pCARs are their influence functions, as illustrated here for the pCAR(a) and pCAR(b):

$$\mathrm{Influence}(j,i)^{\mathrm{pCAR(a)}} = \frac{c_i}{w_{i+}}, \qquad \mathrm{Influence}(j,i)^{\mathrm{pCAR(b)}} = \frac{\lambda c_i c_j}{w_{i+}}. \tag{18}$$

In pCAR(a), $\boldsymbol{c}$ quantifies how each area is directly influenced by its neighbours: the higher the $c_i$, the more the prediction of $\psi_i$ is directly influenced by its neighbouring $\{\psi_j, j \sim i\}$. When $c_i$ is near zero, prediction of $\psi_i$ is not directly influenced by its neighbouring $\{\psi_j, j \sim i\}$.

The pCAR(b) is a variant of the adaptive pCAR proposed in MacNab (2018). Here, the pCAR(b) has an additional spatial dependence parameter $\lambda$ that characterizes neighbourhood risk dependence under the new weight matrix $W^c$ (see Table 3). Similar to its non-adaptive counterpart, its coefficients of influence imply asymmetric pair-wise local dependencies and influences, provided $w_{i+} \neq w_{j+}$: an area with a higher number of neighbours is less influenced by a neighbour that has a lower number of neighbours. Two neighbours exert the same influence on each other if and only if they have the same number of neighbours.

Similar to the LCAR(a), the adaptive pCAR(a) full conditionals do not lead to a GMRF due to noncompliance with the GMRF symmetry requirement. Nevertheless, the adaptive pCAR(a) or LCAR(a) can be used as a prior in a Bayesian hierarchical model for characterizing locally varying spatial dependencies and heterogeneities (which I illustrate in Section 9).
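The symmetry requirement can be illustrated by assembling the matrix implied by a set of full conditionals, with $Q_{ii} = 1/\mathrm{Var}(\psi_i \mid \boldsymbol{\psi}_{-i})$ and $Q_{ij} = -\mathrm{Influence}(j,i)\,Q_{ii}$ for $j \sim i$, and checking whether it is symmetric. The sketch below does this for the pCAR(a) and pCAR(b) conditionals of Table 3. The map, the adaptive parameters `c`, and the values of `lam` and `sigma` are hypothetical.

```python
# Hypothetical 4-area map and illustrative parameter values.
W = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
n = len(W)
w_plus = [sum(row) for row in W]
c = [0.2, 0.9, 0.5, 0.7]   # adaptive spatial parameters c_i
lam, sigma = 0.9, 1.0


def cond_var(i):
    """Conditional variance shared by pCAR(a) and pCAR(b): sigma^2 / w_{i+}."""
    return sigma ** 2 / w_plus[i]


def influence_a(j, i):
    """pCAR(a): c_i / w_{i+} (does not depend on j)."""
    return c[i] / w_plus[i]


def influence_b(j, i):
    """pCAR(b): lambda * c_i * c_j / w_{i+}."""
    return lam * c[i] * c[j] / w_plus[i]


def implied_precision(influence):
    """Matrix implied by the full conditionals: Q_ii = 1/Var_i, Q_ij = -Influence(j,i)/Var_i."""
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = 1.0 / cond_var(i)
        for j in range(n):
            if W[i][j]:
                Q[i][j] = -influence(j, i) / cond_var(i)
    return Q


def is_symmetric(Q, tol=1e-12):
    return all(abs(Q[i][j] - Q[j][i]) <= tol for i in range(n) for j in range(n))


Q_a = implied_precision(influence_a)
Q_b = implied_precision(influence_b)
print("pCAR(a) symmetric:", is_symmetric(Q_a))
print("pCAR(b) symmetric:", is_symmetric(Q_b))
```

For pCAR(a) the off-diagonal entries reduce to $-c_i/\sigma^2$ in row $i$ and $-c_j/\sigma^2$ in row $j$, so symmetry fails whenever neighbouring areas have different $c_i$.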

In the LCARs, the adaptive spatial parameters serve dual roles of regulating locally varying spatial dependencies and heterogeneities (see Table 3). For example, $\mathrm{Influence}(j,i)^{\mathrm{LCAR(a)}} = c_i / (1 + c_i (w_{i+} - 1))$, $j \sim i$, is a non-linear increasing function of $c_i$, while the corresponding conditional variance function is a non-linear decreasing function of $c_i$.

As illustrated in Table 3, the LCAR(b) and LCAR(c) are comparable GMRFs. While the LCAR(c) reduces to its non-adaptive LCAR, the LCAR(b) does not. Compared to the LCAR(a), the two have more complex influence functions for interpreting locally varying spatial dependencies and influences (given in Table 3). An alternative interpretation of the two is that, as adaptive GMRFs, their adaptive parameters are mixing parameters that allow adaptive mixing of spatial and non-spatial smoothing (Congdon, 2008).

8.2. Adaptive CARs of the category II

Adaptive CARs of this category began with proposals of adaptive iCAR, primarily motivated for modelling spatial discontinuity and heterogeneity. The basic idea is to replace the connectivity matrix $W$ in iCAR by an adaptive parameterization of a weight matrix of non-negative elements, denoted $\tilde{W} = (\tilde{w}_{ij}, i \sim j)$ in Table 3, where $\tilde{w}_{ij}$, $i \sim j$, models locally weighted pair-wise interaction between $\psi_i$ and $\psi_j$. In Brezger et al. (2007), for example, the elements $\{\tilde{w}_{ij}, i \sim j\}$ were estimated via a gamma prior. Brewer and Nolan (2007), Reich and Hodges (2008), and Corpas-Burgos and Martinez-Beneito (2020) proposed more parsimonious parameterizations of $\tilde{W}$: $\tilde{w}_{ij} = \tau_i \tau_j / (\tau_i + \tau_j)\, w_{ij}$ in Brewer and Nolan (2007), $\tilde{w}_{ij} = (\sigma_i \sigma_j)^{-1} w_{ij}$ in Reich and Hodges (2008), and $\tilde{w}_{ij} = (c_i c_j)^{1/2} w_{ij}$ in Corpas-Burgos and Martinez-Beneito (2020). The Corpas-Burgos and Martinez-Beneito (2020) adaptive iCAR can be viewed as a re-parameterization of the Reich and Hodges (2008) adaptive iCAR, by letting $\sigma_i = \sigma c_i^{-1/2}$, where $\sigma$ is an additional scale parameter. The iCAR(a) in Table 3 is the Corpas-Burgos and Martinez-Beneito (2020) adaptive iCAR equivalent, with $c_i^{1/2}$ replaced by $c_i$.

With an adaptive iCAR, one can derive its extensions of adaptive LCAR or pCAR. For example, Corpas-Burgos and Martinez-Beneito (2020) extended their adaptive iCAR to two adaptive LCARs; the LCAR(d) and LCAR(e) in Table 3 (as extensions of the iCAR(a)) are the equivalents. The pCAR(c) in Table 3 is a pCAR extension of the iCAR(a).

Similar adaptive iCAR, pCAR and LCAR of this category can be formulated via linear transformation of non-adaptive CARs. For example, let

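$$\boldsymbol{\psi} = \tilde{\boldsymbol{\sigma}} \tilde{\boldsymbol{\psi}}, \quad \tilde{\boldsymbol{\psi}} \sim \mathrm{CAR}, \tag{19}$$

where $\tilde{\boldsymbol{\sigma}} = \mathrm{diag}(\tilde{\sigma}_1, \tilde{\sigma}_2, \ldots, \tilde{\sigma}_n)$, and $\tilde{\boldsymbol{\psi}} \sim$ iCAR(1) or $\tilde{\boldsymbol{\psi}} \sim$ LCAR($\lambda$) or $\tilde{\boldsymbol{\psi}} \sim$ pCAR($\lambda$). Linear transformation (19), named linear (model of) coregionalization in MacNab (2018), leads to an adaptive iCAR, pCAR, or LCAR. The adaptive iCAR(b), pCAR(d), and LCAR(f) in Table 3 are examples.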
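The effect of transformation (19) on the conditionals can be traced through the precision matrix: if $\tilde{\boldsymbol{\psi}}$ has precision $D_w - W$ (iCAR(1)), then $\boldsymbol{\psi} = \tilde{\boldsymbol{\sigma}}\tilde{\boldsymbol{\psi}}$ has precision $\tilde{\boldsymbol{\sigma}}^{-1}(D_w - W)\tilde{\boldsymbol{\sigma}}^{-1}$. The sketch below, using a hypothetical map and made-up values for `sigma` and `s`, reads the influence coefficients and conditional variances off that matrix and compares them with the iCAR(b) row of Table 3, where $\tilde{\sigma}_i = \sigma s_i$ and $c_i = s_i^{-1}$.

```python
# Hypothetical 4-area map and illustrative scale parameters.
W = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
n = len(W)
w_plus = [sum(row) for row in W]
sigma = 0.5
s = [1.0, 2.0, 0.5, 1.5]
sigma_tilde = [sigma * si for si in s]   # coefficients of coregionalization
c = [1.0 / si for si in s]               # c_i = 1 / s_i

# Precision of the latent iCAR(1): D_w - W.
Q_latent = [[(w_plus[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

# psi = diag(sigma_tilde) psi_tilde, so the precision is rescaled on both sides.
Q = [[Q_latent[i][j] / (sigma_tilde[i] * sigma_tilde[j]) for j in range(n)]
     for i in range(n)]


def influence_from_precision(Q, j, i):
    """Coefficient of psi_j in E(psi_i | rest): -Q_ij / Q_ii."""
    return -Q[i][j] / Q[i][i]


# Compare with the iCAR(b) entries: c_j / (c_i w_{i+}) and (sigma s_i)^2 / w_{i+}.
influence_errors = [
    abs(influence_from_precision(Q, j, i) - c[j] / (c[i] * w_plus[i]))
    for i in range(n) for j in range(n) if W[i][j]
]
variance_errors = [
    abs(1.0 / Q[i][i] - (sigma * s[i]) ** 2 / w_plus[i]) for i in range(n)
]
print(max(influence_errors), max(variance_errors))
```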

A common feature of this category of CARs is that the adaptive parameters serve dual roles: They regulate both the spatial risk dependencies in the conditional mean functions and risk heterogeneities in the conditional variance functions (see Table 3). A noteworthy difference among them is their influence functions, as illustrated here for the iCAR(a) and iCAR(b):

$$\mathrm{Influence}(j,i)^{\mathrm{iCAR(a)}} = \frac{c_j}{\sum_{j \sim i} c_j}, \quad j \sim i, \qquad \mathrm{Influence}(j,i)^{\mathrm{iCAR(b)}} = \frac{c_j}{c_i w_{i+}}, \quad j \sim i. \tag{20}$$

The influence function $\mathrm{Influence}(j,i)^{\mathrm{iCAR(a)}}$ in (20) quantifies the influence of area $j$ on area $i$, relative to the other neighbours of area $i$, in terms of its direct influence on the prediction of $\psi_i$. When $c_j$ is near zero, area $j$ exerts near-zero direct influence on its neighbours, but it is not necessarily “disconnected” from its neighbours, as was suggested by Corpas-Burgos and Martinez-Beneito (2020). This is because area $j$ could still be directly influenced by its neighbours, say, when $c_i > 0$ and $i \sim j$, $\mathrm{Influence}(i,j)^{\mathrm{iCAR(a)}} = c_i / \sum_{i \sim j} c_i > 0$.

On the other hand, $\mathrm{Influence}(j,i)^{\mathrm{iCAR(b)}}$ quantifies the direct influence of area $j$ on area $i$, which is partially controlled and partially determined by three quantities. It is an increasing function of $c_j$ but a decreasing function of $c_i$: for given $c_i$, the higher the $c_j$, the greater the direct influence area $j$ exerts on area $i$. The greater the $c_i$ and/or the higher the $w_{i+}$, the less area $i$ is influenced by its neighbours. Area $j$ exerts greater influence on area $i$ than area $i$ exerts on area $j$ if and only if $c_j^2 w_{j+} > c_i^2 w_{i+}$.
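The properties of the two influence functions in (20) can be checked on a small example. The sketch below uses a hypothetical 4-area map and made-up adaptive parameters; the function names are also assumptions. It verifies that the iCAR(a) coefficients sum to one over the neighbours of each area, that an area with near-zero $c_j$ can still be fully influenced by its neighbour, and that the iCAR(b) comparison agrees with the condition $c_j^2 w_{j+} > c_i^2 w_{i+}$ for every pair of neighbours.

```python
# Hypothetical 4-area map: area 0 borders areas 1, 2, 3; areas 1 and 2 are neighbours.
W = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
n = len(W)
neighbours = [[j for j, w in enumerate(row) if w] for row in W]
w_plus = [len(nb) for nb in neighbours]


def influence_icar_a(c, j, i):
    """iCAR(a): c_j divided by the sum of c_k over the neighbours k of area i."""
    return c[j] / sum(c[k] for k in neighbours[i])


def influence_icar_b(c, j, i):
    """iCAR(b): c_j / (c_i * w_{i+})."""
    return c[j] / (c[i] * w_plus[i])


c = [0.5, 1.0, 2.0, 1.5]  # illustrative adaptive parameters

# iCAR(a): the coefficients over the neighbours of each area sum to one.
row_sums = [sum(influence_icar_a(c, j, i) for j in neighbours[i]) for i in range(n)]

# iCAR(b): j influences i more than i influences j exactly when c_j^2 w_{j+} > c_i^2 w_{i+}.
pairs = [(i, j) for i in range(n) for j in neighbours[i]]
iff_holds = all(
    (influence_icar_b(c, j, i) > influence_icar_b(c, i, j))
    == (c[j] ** 2 * w_plus[j] > c[i] ** 2 * w_plus[i])
    for i, j in pairs
)

# Near-zero c_3: area 3 barely influences area 0, yet area 0 still influences area 3.
c_small = [1.0, 1.0, 1.0, 1e-6]
out_3_on_0 = influence_icar_a(c_small, 3, 0)
in_0_on_3 = influence_icar_a(c_small, 0, 3)
print(row_sums, iff_holds, out_3_on_0, in_0_on_3)
```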

The LCAR(e) and LCAR(f) share a common feature that, when λ=0, they both reduce to adaptive independent Gaussian distributions defined by a diagonal variance matrix diag(σ2c ).

It is worth mentioning that the adaptive LCARs in the Category I are similar to the adaptive CARs of this category in the sense that their adaptive parameters regulate both spatial dependencies and heterogeneities.

In addition, in the context of Bayesian hierarchical modelling, the Rushworth et al. (2017) adaptive LCAR may be viewed as an adaptive CAR of this category. For example, an LCAR (extension) of the adaptive iCAR($\sigma, \tilde{W}$) with $\tilde{w}_{ij} = c_{ij}$ is a Rushworth et al. (2017) adaptive LCAR equivalent. It is readily seen that this LCAR (and the Rushworth et al. model) may be excessively parameterized. To gain identifiability, the LCAR can be simplified by letting $\tilde{w}_{ij} = c_i c_j$, $c_i \in (0,1)$, $\forall i$, which is the Corpas-Burgos and Martinez-Beneito (2020) adaptive LCAR(d) in Table 3 (with $c_i^{1/2}$ replaced by $c_i$, $c_i \in (0,1)$, $\forall i$).

8.3. Adaptive CARs of the category III

Adaptive CARs of this category can be derived via linear transformation of adaptive CARs of the Category I. The adaptive pCAR and LCAR proposed in MacNab (2018) are examples; they can be readily derived via linear transformation (19), where $\tilde{\boldsymbol{\psi}} \sim$ pCAR($\boldsymbol{c}$) or $\tilde{\boldsymbol{\psi}} \sim$ LCAR($\boldsymbol{c}$), and pCAR($\boldsymbol{c}$) and LCAR($\boldsymbol{c}$) are adaptive CARs without a scale parameter.

As the pCAR(e) in Table 3 illustrates, the coefficients of influence (in CARs of this category) are functions of adaptive spatial and scale (or precision) parameters; the adaptive scale parameters regulate risk heterogeneities in the conditional variance functions.

The coregionalized adaptive pCAR offers the flexibility for modelling locally varying spatial (Markovian) dependencies in the latent components ψ ~, regulated by an adaptive spatial dependence parameterization in its conditional mean functions. Linear transformation (19) introduces locally varying risk heterogeneities to ψ , regulated by the adaptively parameterized coefficients of coregionalization σ ~.

In the context of disease mapping, CARs of this category may be excessively parameterized. However, they may be useful when additional information, such as covariates, is available to infer these parameters (MacNab, 2018).

9. An illustrative example: Spatial and spatiotemporal modelling of COVID-19 infection risks

The purpose of this illustrative example is to show (potential) utilities of some of the disease mapping models and methods for analysing and monitoring communicable disease such as the COVID-19 infection risks during an ongoing epidemic or pandemic. I focus on potentials of the adaptive CAR models presented herein, in the context of Bayesian mapping of spatial and spatiotemporal COVID-19 infection risks, without or with covariates.

The overall analytic approach aims at small area pandemic monitoring by describing and understanding locally varying spatial risk dependencies, asymmetric local risk influence functions, within- and between-neighbourhood risk heterogeneities, and the resulting spatial risk discontinuities.

Details on models and related Bayesian methods for parameter estimation, learning, and inference, details of the data analysed, and additional results and discussions will be presented elsewhere. Here, for brevity, is a cursory report that outlines two sets of analysis. The first is spatial modelling of county-level aggregates of daily infection cases for the eighty-seven counties of Minnesota, USA, over the period of 2020/01/22–2021/02/14, named the cumulative period (data) hereafter. The second is spatiotemporal modelling of weekly aggregates of infection cases for the period of 2020/09/30–2021/01/20, a period during which the State-level 7-day averaged number of new cases exceeded 1000, named the peak period (data) hereafter (the State’s first major infection wave). I illustrate spatiotemporal modelling of county-level weekly aggregates over 17 weeks. The covariates presented here are tentative and only used for illustrative purposes; they are census-based aggregates and will be explained and further discussed in a separate paper.

9.1. Spatial analysis of the cumulative period data

The Poisson data model (1) with a regression part was fitted, in which the vector of log relative risks $\boldsymbol{\psi} = (\psi_1, \ldots, \psi_{87})^\top$ was modelled by each of the adaptive CARs in Table 3. Overall, the adaptive CARs led to comparable goodness-of-fit among the spatial models. Fig. 1 presents posterior estimates of the adaptive parameters and relative risks for two illustrative models; the results from the LCAR(a) illustrate spatial discontinuity, and those from the pCAR(b) illustrate locally varying neighbourhood risk dependencies. Fig. 1 and Table 4 suggest that the included (time-invariant) covariates explained a modest (but noteworthy) amount of the neighbourhood risk dependencies, modelled by the spatial dependence parameter $\lambda$; they also suggest that the included covariates explained a modest amount of the risk variability, modelled by the scale parameter $\sigma$.

Fig. 1.

Posterior estimates of the county-specific (87 counties) adaptive parameters and relative risks for indicated models. Red line or dot: without covariate, blue line: with five covariates. The Minnesota county-level COVID-19 cumulative period data. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 4.

Posterior estimates, median and standard deviation (sd), of the model parameters for the adaptive pCAR(b). The Minnesota county-level COVID-19 cumulative period data.

Parameter | Without covariate: Median | Without covariate: sd | With covariates: Median | With covariates: sd
Intercept | 0.02 | 0.01 | 0.02 | 0.02
Private transportation to work | | | 0.88 | 0.68
Age 55–64 | | | −4.32 | 0.96
Education less than high school | | | 3.94 | 0.70
College education | | | 0.95 | 0.60
Unemployment | | | −4.56 | 1.35
λ | 0.91 | 0.12 | 0.67 | 0.28
σ | 0.40 | 0.03 | 0.32 | 0.03

9.2. Spatiotemporal analysis of the peak period data

The Poisson data model (1) was extended to a spatiotemporal data model

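$$O_{it} \sim \mathrm{Poisson}(\phi_{it}), \quad \phi_{it} = E_{it}\varphi_{it}, \quad \forall i, t, \tag{21}$$

where

$$\log(\varphi_{it}) = \beta_{0t} + \beta_{1t}X_{i1} + \cdots + \beta_{pt}X_{ip} + \psi_{it}, \quad p = 5, \tag{22}$$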

the regression part in Eq. (22) has time-varying coefficients for the included (time-invariant) covariates (p=5).
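The data model (21) with the log-linear predictor (22) can be written out as a short likelihood evaluation. The sketch below is a toy illustration with made-up numbers: two areas, three weeks, and two covariates rather than the five used in the analysis; the function names and all input values are hypothetical and are not taken from the Minnesota data.

```python
import math


def relative_risk(beta_t, x_i, psi_it):
    """Relative risk from (22): exp(beta_0t + sum_k beta_kt * X_ik + psi_it)."""
    eta = beta_t[0] + sum(b * x for b, x in zip(beta_t[1:], x_i)) + psi_it
    return math.exp(eta)


def poisson_loglik(O, E, beta, X, psi):
    """Log-likelihood of (21): O_it ~ Poisson(E_it * relative risk_it)."""
    total = 0.0
    for i in range(len(O)):
        for t in range(len(O[i])):
            mean = E[i][t] * relative_risk(beta[t], X[i], psi[i][t])
            total += O[i][t] * math.log(mean) - mean - math.lgamma(O[i][t] + 1)
    return total


# Toy inputs (all made up): rows index areas i, columns index weeks t.
O = [[5, 9, 14], [2, 3, 8]]                     # observed weekly counts
E = [[6.0, 8.0, 10.0], [3.0, 4.0, 5.0]]         # expected counts
X = [[0.2, -0.1], [-0.3, 0.4]]                  # time-invariant covariates per area
beta = [[0.0, 0.5, -0.2],                       # week-specific (beta_0t, beta_1t, beta_2t)
        [0.1, 0.4, -0.1],
        [0.3, 0.2, 0.0]]
psi = [[0.05, 0.00, 0.10], [-0.05, -0.10, 0.20]]  # spatial random effects psi_it

loglik = poisson_loglik(O, E, beta, X, psi)
baseline_risk = relative_risk([0.0, 0.0, 0.0], X[0], 0.0)
print(f"log-likelihood = {loglik:.3f}, baseline relative risk = {baseline_risk}")
```

Because the regression coefficients are indexed by week, the same covariate values `X[i]` can contribute differently to the relative risk at different times.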

The data were analysed via three sets of spatiotemporal models. I began with modelling the weekly data as time independent, and with weekly (i.e. time-varying) adaptive CARs, say,

$$\boldsymbol{\psi}_t \sim \text{adaptive CAR}(\boldsymbol{c}_t, \sigma_t), \tag{23}$$

$\boldsymbol{\psi}_t = (\psi_{1t}, \psi_{2t}, \ldots, \psi_{nt})^\top$, $\boldsymbol{c}_t = (c_{1t}, c_{2t}, \ldots, c_{nt})^\top$, $\forall t$, where CAR (23) is an example of adaptive CARs of the Category I. The main purpose of the analysis was to describe and understand weekly changes in (i) locally varying spatial risk dependencies, (ii) asymmetric neighbourhood risk influences, and (iii) spatial risk heterogeneities and discontinuities.

As anticipated, and illustrated in Fig. 2 for indicated adaptive CARs, the spatial patterns of the adaptive parameters and relative risks varied over the 17 weeks. The included covariates explained a modest amount of the variation in the adaptive parameters and the relative risks during week 7, which was among the peak weeks of this period. The observed patterns were also consistent with the posterior estimates of time-varying regression coefficients presented in Fig. 3, which illustrates a likely scenario of time-varying effects of covariates, such as the small area and neighbourhood characteristics included in the model. Fig. 3 also indicates that the included covariates led to modest reductions in the time-varying scale parameters over the peak weeks (weeks 7 to 10).

Fig. 2.

Illustrations of posterior estimates of county-specific (87 counties) adaptive parameters and relative risks for the indicated models. For the adaptive parameter estimates: Red lines—without covariate, blue lines—with covariates. For the relative risk estimates: Red dots: without covariate, blue lines: with covariates. The Minnesota county-level COVID-19 weekly data during peak period. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 3.

Posterior estimates of the week-specific (17 weeks) regression coefficients and scale parameters for the indicated models. For the regression coefficients: dashed line and dots: posterior median, solid lines: lower and upper limits of 95% credible intervals. For the scale parameters: Red line: without covariate, blue line: with covariates. The Minnesota county-level COVID-19 weekly data during peak period. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 4 suggests modest variations in the estimated coefficients of influence for a selection of counties, using the results from the adaptive pCAR(d) (a model that had one of the lowest DIC scores; see Table 5). It also suggests some modest changes in the local risk influences from week 7 to week 17, and minor or modest changes from unadjusted to adjusted risk influences.

Fig. 4.

Posterior estimates of the coefficients of influence for indicated counties, calculated using the posterior medians of the adaptive parameters in pCAR(d) (see Table 3). Red dots: Unadjusted risk influences—estimated coefficients of risk influence based on indicated ST model without covariate; blue dots: Adjusted risk influences—estimated coefficients of risk influence based on indicated ST model with covariates. The Minnesota county-level COVID-19 weekly data during peak period.

Table 5.

DIC results for indicated models. The Minnesota COVID-19 peak period data.

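| Model | Without covariate: Dbar | Without covariate: pD | Without covariate: DIC | With covariates: Dbar | With covariates: pD | With covariates: DIC |
|---|---|---|---|---|---|---|
| (1) pCAR(a) | 10693 | 1237 | 11930 | 10687 | 1242 | 11929 |
| (2) pCAR(b) | 10687 | 1243 | 11930 | 10682 | 1247 | 11929 |
| (3) pCAR(c) | 10688 | 1223 | 11911 | 10690 | 1225 | 11915 |
| (4) pCAR(d) | 10692 | 1212 | 11904 | 10689 | 1214 | 11903 |
| (5) LCAR(a) | 10704 | 1227 | 11931 | 10692 | 1230 | 11922 |
| (6) LCAR(b) | 10694 | 1227 | 11921 | 10682 | 1230 | 11912 |
| (7) LCAR(e) | 10677 | 1228 | 11905 | 10676 | 1227 | 11903 |
| (8) LCAR(f) | 10697 | 1208 | 11905 | 10693 | 1212 | 11905 |
| (9) SVC LCAR TpCAR1 | 10706 | 1222 | 11928 | 10706 | 1215 | 11921 |
| (10) SVC LCAR TpCAR2 | 10671 | 1121 | 11792 | 10675 | 1120 | 11795 |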
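Under the standard definition, DIC is the sum of the posterior mean deviance (Dbar) and the effective number of parameters (pD). The short sketch below transcribes Table 5, checks that every cell satisfies DIC = Dbar + pD, and ranks the models by DIC. The helper names `dic_is_consistent` and `rank_by_dic` are illustrative only and are not part of any package.

```python
# Rows of Table 5: model -> ((Dbar, pD, DIC) without covariate,
#                            (Dbar, pD, DIC) with covariates).
TABLE5 = {
    "pCAR(a)": ((10693, 1237, 11930), (10687, 1242, 11929)),
    "pCAR(b)": ((10687, 1243, 11930), (10682, 1247, 11929)),
    "pCAR(c)": ((10688, 1223, 11911), (10690, 1225, 11915)),
    "pCAR(d)": ((10692, 1212, 11904), (10689, 1214, 11903)),
    "LCAR(a)": ((10704, 1227, 11931), (10692, 1230, 11922)),
    "LCAR(b)": ((10694, 1227, 11921), (10682, 1230, 11912)),
    "LCAR(e)": ((10677, 1228, 11905), (10676, 1227, 11903)),
    "LCAR(f)": ((10697, 1208, 11905), (10693, 1212, 11905)),
    "SVC LCAR TpCAR1": ((10706, 1222, 11928), (10706, 1215, 11921)),
    "SVC LCAR TpCAR2": ((10671, 1121, 11792), (10675, 1120, 11795)),
}


def dic_is_consistent(table):
    """Return True if DIC == Dbar + pD in every cell of the table."""
    return all(
        dbar + p_d == dic
        for fits in table.values()
        for (dbar, p_d, dic) in fits
    )


def rank_by_dic(table, with_covariates=False):
    """Model names sorted from lowest (best) to highest DIC."""
    col = 1 if with_covariates else 0
    return sorted(table, key=lambda name: table[name][col][2])


consistent = dic_is_consistent(TABLE5)
ranking = rank_by_dic(TABLE5)
```

On these numbers the check passes for all twenty cells, and the ranking without covariates places SVC LCAR TpCAR2 first and pCAR(d) second.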

For the second set of analyses, the spatiotemporal model defined by Eqs. (21) and (22) was fitted with eight options of time-invariant adaptive CAR parameterization. An example of the adaptive CARs of Category I is

$$\boldsymbol{\psi}_t \sim \text{adaptive CAR}(\boldsymbol{c}, \sigma), \quad \forall\, t, \qquad (24)$$

which is a simplification of Eq. (23). Table 5 presents the DIC results for the indicated adaptive CARs (1)–(8). Time- or variable-invariant adaptive CARs were previously used in Bayesian spatiotemporal models (Reich and Hodges, 2008, Rushworth et al., 2017, Sahu and Böhning, 2021) or multivariate spatial models (Corpas-Burgos and Martinez-Beneito, 2020). In this study, one intention of the analysis was to compare the estimated time-invariant adaptive parameters with their time-varying counterparts. Fig. 5 illustrates the results for the pCAR(a). The patterns displayed by the estimated time-invariant spatial dependence parameters (the plots on the left) are notably different from those of the time-varying parameters for week 7 (the plots on the right). They convey different and important information: the estimated time-invariant $\boldsymbol{c}$ may serve to explain and inform locally varying neighbourhood risk influences and spatial discontinuity over the first wave of the pandemic, while the weekly estimates of the adaptive parameters may serve to explain retrospectively, or to monitor on an ongoing basis, the temporal dynamics of locally varying risk dependencies, varying neighbourhood risk influences, and spatial discontinuities.

Fig. 5.

Posterior estimates (median and standard deviation) of county-specific (87 counties) adaptive parameters for the indicated models. Red dots: Unadjusted risk influences—estimated coefficients of risk influence based on indicated ST model without covariate; blue dots: Adjusted risk influences—estimated coefficients of risk influence based on indicated ST model with covariates. The Minnesota COVID-19 weekly data during peak period.

The eight adaptive CAR models led to comparable and consistent posterior estimates of relative risks, although modest differences were observed. Overall, the CARs of Category II performed slightly better than the CARs of Category I. The LCARs also had slightly smaller DIC scores than their pCAR counterparts. Of note is that the LCAR(e) (Corpas-Burgos and Martinez-Beneito, 2020) and the pCAR(d) and LCAR(f) of linear models of coregionalization (MacNab, 2018), as labelled in Table 5, led to comparable DIC results, although the pCAR(d) and LCAR(f) were shown to be more efficient, as indicated by their smaller pD scores and posterior relative risk standard deviations (not presented here). The overall DIC comparison, and the estimated adaptive parameters and relative risks, consistently suggested notable degrees of locally varying spatial dependencies, asymmetric local risk influences, and spatial discontinuities.

The third set of analyses concerned CARs of Category III, which are illustrative examples of exploring and explaining adaptive spatiotemporal risk interactions via an SVC model with spatially adaptive, area-specific temporal CAR processes:

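$$\mathrm{vec}(\boldsymbol{\psi}) = \left(\boldsymbol{I}_T \otimes \mathrm{diag}(\tilde{\boldsymbol{\sigma}})\right)\mathrm{vec}(\boldsymbol{\eta}), \qquad (25)$$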

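where $\tilde{\boldsymbol{\sigma}} = (\sigma s_1, \ldots, \sigma s_n)$, $\log(\boldsymbol{s}) \sim \mathrm{LCAR}(\lambda_s, \boldsymbol{W})$, and the vector of latent variables $\mathrm{vec}(\boldsymbol{\eta})$ is modelled by a spatially adaptive temporal pCAR: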

$$\mathrm{vec}(\boldsymbol{\eta}) \sim \mathrm{pCAR}(\boldsymbol{c}, \boldsymbol{W}_T), \qquad (26)$$

where $\boldsymbol{W}_T$ is a 17 by 17 first-order temporal connectivity matrix linking $t$ to $t+1$ for “TpCAR1” and linking $t$ to $t+1$ and/or $t-1$ for “TpCAR2”. The last model, with the TpCAR2, led to the best DIC with notably improved statistical efficiency, as indicated by a notably reduced pD (in Table 5) and smaller posterior relative risk standard deviations (not presented here). The two sets of adaptive parameters $\boldsymbol{s}$ and $\boldsymbol{c}$ in the SVC models convey important information on asymmetric spatial and temporal risk dependencies and influences, which will be presented and discussed in a separate paper.
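The Kronecker form of Eq. (25) can be checked numerically. The sketch below assumes that the latent effects are arranged as an n by T matrix whose columns are weeks, so that vec() stacks the weeks one after another. It also builds two candidate 17 by 17 temporal connectivity matrices. The one-sided form for TpCAR1 and the two-sided form for TpCAR2 are my reading of the description above, not a specification taken from the paper, and all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 87, 17  # counties and weeks, as in the text

# Hypothetical area-specific scales sigma * s_i and latent effects eta (n x T).
sigma = 0.3
s = np.exp(rng.normal(0.0, 0.2, size=n))  # stand-in for log(s) ~ LCAR
sigma_tilde = sigma * s
eta = rng.normal(size=(n, T))

# Eq. (25): vec() stacks columns (weeks), hence order="F".
scale = np.kron(np.eye(T), np.diag(sigma_tilde))  # (nT x nT) block diagonal
vec_psi = scale @ eta.flatten(order="F")

# Equivalent, cheaper form: multiply row i of eta by sigma_tilde[i].
psi = sigma_tilde[:, None] * eta

# First-order temporal connectivity matrices (assumed forms).
W_T1 = np.eye(T, k=1)                     # week t linked to week t+1 only
W_T2 = np.eye(T, k=1) + np.eye(T, k=-1)   # week t linked to t+1 and t-1
```

The row-scaling form avoids building the nT by nT matrix, which matters as n or T grows. Only the two-sided matrix is symmetric.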

10. Discussion and looking into the future

The importance and popularity of Bayesian disease mapping in spatial epidemiology, population and public health, and medicine are seen in the large and still growing literature on spatial and spatiotemporal small area mapping of risks of COVID-19 infection and related health outcomes, and on applications of CAR models in COVID-19 related research. Identification of high-risk areas and of patterns of disease incidence and mortality can provide valuable information for public health planning, such as priority setting for allocating funds and localized disease prevention or intervention. In addition to mapping disease and illness, novel disease mapping models have been developed for spatial modelling of child growth (Gelfand and Vounatsou, 2003), geographical mapping of gene frequencies (Gelfand and Vounatsou, 2003), mapping health care utilization rates (Nathoo and Ghosh, 2012), and vaccination coverage mapping (Utazi et al., 2018), among others.

The (still ongoing) COVID-19 pandemic, being an unprecedented global crisis, also offers unparalleled opportunities for the statistics community to put the established and newly developed disease mapping methods and tools, particularly the commonly used CAR, MCAR, and adaptive (M)CAR models, to rigorous assessment and test, to improve the existing ideas, methods, and tools, and to broaden the subject further. The colossal amounts of COVID-19 related data available around the world are intrinsically spatial, spatiotemporal, multivariate (e.g. infection, hospitalization, mortality, uptake of vaccination, etc.), and multi-array (e.g. they concern spatial, temporal, and multivariate outcomes, in addition to other dimensions such as age, gender, risk/protective factors, and countries, for the varying public health intervention policies, updates of preventative measures, and availabilities of vaccines, among others). Making these data accessible to statistical analysis is critically important in our efforts to better understand the current (and future) pandemic and its impact on population and public health and to inform pandemic intervention policies and strategies.

We see in issues of Spatial Statistics, and in (bio)statistical journals elsewhere, that innovative (multivariate) spatial and spatiotemporal disease mapping models have been developed to estimate, explain, and map the risks or spread of COVID-19 infection or related mortality (to name some of the most recent: Li and Dey, 2021, Slater et al., 2021, Huang et al., 2021, Feng, 2021, Lee et al., 2021, Sahu and Böhning, 2021) and for short-term forecasts of infection or infection-related mortality rates (Sahu and Böhning, 2021).

Disease mapping models are widely known for modelling rare and non-communicable diseases such as heart disease and cancer. CARs are commonly motivated for borrowing information and modelling smoothly varying disease risks as effects of omitted covariates. However, typical Bayesian disease mapping models are hierarchically formulated Bayesian spatial GLMMs that allow a variety of data models to be considered and applied. For example, in addition to the Poisson model (1) for rare events, we also have the options of binomial models for non-rare events (Martuzzi and Elliott, 1996, MacNab, 2003b), frailty models for time-to-event data (Carlin and Banerjee, 2003, Lawson, 2018), and Gaussian models for continuous data (MacNab, 2021b).

When non-rare events are studied, (M)CARs and adaptive (M)CARs may be considered to allow for nuanced (intelligent) smoothing and for characterization and learning of (cross) spatial risk dependencies and/or (cross) spatial risk correlations, as well as spatial risk discontinuity. For instance, areal data with adequate numbers of events may contain useful information to inform a more flexible spatial dependence parameterization, perhaps aided by additional covariates. The works of Congdon (2008) and Rushworth et al. (2017), and the present case study of spatiotemporal modelling of peak period COVID-19 infection risks, are examples.

Risks of communicable diseases such as the COVID-19 infection are quite likely to be spatially varying and to exhibit spatial discontinuity. This is due to the nature of communicable disease outbreak and transmission in close proximity, differences in implementation and/or uptake of public health intervention policies and preventive measures, etc., in addition to the underlying health inequality that can partially influence the risks of infection or infection-related hospitalization or mortality. For these reasons, adaptive CARs might be more plausible options for modelling small area COVID-19 related data, with or without associated risk (or protective) factors. The work of Sahu and Böhning (2021) presents an example.

Here, I briefly illustrated another example of using adaptive spatial and spatiotemporal models to explore, explain, and monitor locally varying (cross) spatial dependencies, asymmetric local risk influences, and spatial discontinuities. Monitoring and identification of clusters of high local risks and risk influences could offer timely and valuable information that might assist and guide targeted public health interventions such as local restrictions and lockdowns, perhaps avoiding unnecessary state- or nation-wide restrictions and lockdowns.

In this article I gave a more detailed review on formulating adaptive CARs via options of adaptive spatial dependence or weight parameterization for a known connectivity matrix $\boldsymbol{W}$. One attraction of this approach is that the resulting models have clear interpretations of locally varying spatial risk dependencies and neighbourhood risk influences. Estimation and inference of the model parameters lead to estimation and inference of spatial dependence (local influence) functions, with associated estimation uncertainties. The adaptive spatial dependence parameterization discussed herein may mitigate the impact of a potentially unrealistic assumption of local risk dependencies defined by a given connectivity matrix, by allowing location-specific spatial dependence parameters to be zero, thereby enabling characterization of spatial risk discontinuity, as illustrated in Congdon (2008). In addition, some of the recent proposals of adaptive parameterizations of multivariate CARs are natural extensions of the univariate adaptive CARs; the adaptive MGMRFs discussed in MacNab, 2018, MacNab, 2021a are examples.

The approach of allowing the connectivity matrix $\boldsymbol{W}$ to be modelled as (binary) random variates within a CAR formulation leads to an alternative class of adaptive models. These models were initially motivated for, and were shown to be effective in, adjacency modelling and boundary detection (Lu and Carlin, 2005, Lu et al., 2007, Lee and Mitchell, 2012). While adaptive CARs of this class were also used for modelling spatial risk discontinuity (Lee et al., 2014, Sahu and Böhning, 2021), they lack the flexibility of modelling locally varying spatial dependencies that is offered by adaptive spatial dependence parameterization. Nevertheless, the two classes of adaptive CARs are useful tools for their intended purposes and may be considered as complementary approaches in real-life applications. Adaptive CARs that accommodate both adaptive spatial dependence (or weight) parameterization and adjacency modelling are possible options to be further explored and developed.

The disease mapping community is poised to embrace new challenges and opportunities brought by the rise of data science and a (big) rich data era. Earlier disease mapping literature presented alternative ideas of modelling structured spatial discontinuities. Examples are hidden Markov models (Green and Richardson, 2002) and mixture models (Denison and Holmes, 2001, Fernandez and Green, 2002). These earlier ideas and methods have shown limited advantage in mapping rare disease. However, they may regain their recognition and appeal in the context of modelling (big) data of rich content, information, and structure. Together with (adaptive) CARs and their multivariate extensions, these models and related methods for estimation and inference may be further explored and better understood for their roles in disease mapping applications.

Recent advances in multivariate CAR/GMRFs, a topic of my focus here and the subject of a (new) comprehensive book (Martinez-Beneito and Botella-Rocamora, 2019), provide ideas, models, methods, and tools for jointly modelling multidimensional spatial data in general, and the COVID-19 infection and related data in particular. Recent proposals of the INLA-SPDE approach to Bayesian analysis of spatial data over high resolution grids mark a new frontier of Bayesian disease mapping and its applications (Lindgren et al., 2011, Moraga et al., 2017, Utazi et al., 2018).

In addition, recent proposals of directed acyclic graph auto-regressive models in the context of (multivariate) disease mapping (Datta et al., 2019, Gao et al., 2021) offer alternative ways to model spatial dependencies. In the context of multivariate disease mapping, Liang (2019) proposed MCARs in which the non-spatial dependencies were characterized by a directed acyclic graph representation. The fusion of the Datta et al. (2019) and Liang (2019) ideas might broaden the scope and complexity of Bayesian disease mapping and widen its applicability in applied scientific (e.g. health science) research. The data science and big data movements are also inspiring new ideas of Bayesian disease mapping methods for handling data of high dimensions, e.g. areal data of large n and/or p and/or T. A recent example is the idea of “divide and conquer” (Orozco-Acosta et al., 2021) for scalable approximate Bayesian inference of disease mapping of large n using INLA.

Perhaps one of the most urgent needs for advancing Bayesian disease mapping and its broad applications is a devoted effort towards improved or new methods of Bayesian estimation, learning, and inference for (adaptive and/or multivariate) CARs. INLA offers a solution to computation for models with large n but a small or modest number of (prior) model parameters (Rue et al., 2017). Recent proposals of Gibbs sampling of GMRFs on large lattices (Marcotte and Allard, 2018, Brown et al., 2021) offer ideas and options to be explored, adapted, and improved. Alternative computational options that might be fruitfully explored and furthered are the hybrid or Hamiltonian Monte Carlo sampling methods (Girolami and Calderhead, 2011, Zhang et al., 2018, Hoffman et al., 2021) and variational inference (Blei et al., 2017, Tan, 2020).

Acknowledgement

This research was funded in part by a discovery grant (RGPIN 238660-13) from the Natural Sciences and Engineering Research Council of Canada.

References

  1. Ainsworth L.M., Dean C.B. Approximate inference for disease mapping. Comput. Statist. Data Anal. 2006;50(10):2552–2570.
  2. Arnold-Forster A. Mapmaking and mapthinking: cancer as a problem of place in nineteenth-century England. Social History of Med. 2020;33(2):463–488. doi: 10.1093/shm/hky059.
  3. Arnold-Forster A. Oxford Scholarship online book; 2021. The Cancer Problem: Malignancy in Nineteenth-Century Britain.
  4. Azevedo D.R., Prates M.O., Bandyopadhyay D. Mspock: Alleviating spatial confounding in multivariate disease mapping models. J. Agric. Biol. Environ. Stat. 2021:1–28.
  5. Banerjee S., Carlin B.P., Gelfand A.E. Chapman and Hall/CRC; 2004. Hierarchical Modeling and Analysis for Spatial Data.
  6. Baptista H., Mendes J.M., MacNab Y.C., Xavier M., Almeida J.C. A Gaussian random field model for similarity-based smoothing in Bayesian disease mapping. Stat Methods Med Res. 2016;25(4):1166–1184. doi: 10.1177/0962280216660407.
  7. Bernardinelli L., Montomoli C. Empirical Bayes versus fully Bayesian analysis of geographical variation in disease risk. Stat. Med. 1992;11(8):983–1007. doi: 10.1002/sim.4780110802.
  8. Bernardinelli L., Pascutto C., Best N.G., Gilks W.R. Disease mapping with errors in covariates. Stat. Med. 1998;16(7):741–752. doi: 10.1002/(sici)1097-0258(19970415)16:7<741::aid-sim501>3.0.co;2-1.
  9. Besag J. Spatial interaction and the statistical analysis of lattice systems (with discussions) J. R. Stat. Soc. Ser. B Stat. Methodol. 1974;36:192–236.
  10. Besag J., York J., Mollie A. Bayesian image restoration, with two applications in spatial statistics. Ann Inst Stat Math. 1991;43:1–20.
  11. Best N., Richardson S., Thomson A. A comparison of Bayesian spatial models for disease mapping. Stat. Methods Med. Res. 2005;14(1):35–59. doi: 10.1191/0962280205sm388oa.
  12. Blei D.M., Kucukelbir A., McAuliffe J.D. Variational inference: a review for statisticians. J. Amer. Statist. Assoc. 2017;112(518):859–877.
  13. Breslow N.E. Extra-Poisson variation in log-linear models. Appl. Stat. 1984;33(1):38–44.
  14. Breslow N.E., Clayton D.G. Approximate inference in generalized linear mixed models. J. Amer. Statist. Assoc. 1993;88(421):9–25.
  15. Breslow N.E., Day N.E. International Agency for Research on Cancer; Lyon: 1980. Statistical Methods in Cancer Research. Volume I - the Analysis of Case-Control Studies.
  16. Brewer M.J., Nolan A.J. Variable smoothing in Bayesian intrinsic autoregressions. Environmetrics. 2007;18(8):841–857.
  17. Brezger A., Fahrmeir L., Hennerfeind A. Adaptive Gaussian Markov random fields with applications in human brain mapping. Appl Stat. 2007;56(3):327–345.
  18. Brown A.D., McMahan C.S., Watson S.C. Sampling strategies for fast updating of Gaussian Markov random fields. Amer. Statist. 2021;75(1):52–65. doi: 10.1080/00031305.2019.1595144.
  19. Cameletti M., Gómez-Rubio V., Blangiardo M. Bayesian modelling for spatially misaligned health and air pollution data through the INLA-SPDE approach. Spatial Stat. 2019;31:2211–6753.
  20. Carlin B.P., Banerjee S. In: Bayesian Statistics. Bernardo J.M., Bayarri M.J., Berger J.O., Dawid A.P., Heckerman D., Smith A.E.M., West M., editors. vol. 7. Oxford University Press; Oxford: 2003. Hierarchical multivariate CAR models for spatio-temporally correlated survival data (with discussion) pp. 45–63.
  21. Carlin B.P., Gelfand A.E. Approaches for empirical Bayes confidence intervals. J. Amer. Statist. Assoc. 1990;85(409):105–114.
  22. Carlin B.P., Gelfand A.E. A sample reuse method for accurate parametric empirical Bayes confidence intervals. J. R. Stat. Soc. Ser. B Stat. Methodol. 1991;53(1):189–200.
  23. Carlin B.P., Louis T.A. Chapman and Hall/CRC; 2000. Bayes and Empirical Bayes Methods for Data Analysis.
  24. Carlin B.P., Louis T.A. Empirical Bayes: Past, present and future. J. Amer. Statist. Assoc. 2000;95(452):1286–1289.
  25. Clayton D., Bernardinelli L. In: Geographical and Environment Epidemiology: Methods for Small Area Studies. Elliott P., Cuzick J., English D., Stern R., editors. Oxford University Press; Oxford: 1992. Bayesian methods for mapping disease risk. pp. 205–220.
  26. Clayton D.G., Bernardinelli L., Montomoli C. Spatial correlation in ecological analysis. Int. J. Epidemiol. 1993;22(6):1193–1202. doi: 10.1093/ije/22.6.1193.
  27. Clayton D.G., Kaldor J. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics. 1987;43:671–681.
  28. Congdon P. Wiley; 2001. Bayesian Statistical Modelling.
  29. Congdon P. A spatially adaptive conditional autoregressive prior for area health data. Stat. Methodol. 2008;5(6):1572–3127.
  30. Corpas-Burgos F., Martinez-Beneito M.A. On the use of adaptive spatial weight matrices from disease mapping multivariate analyses. Stoch Environ Res Risk Assess. 2020;34:531–544.
  31. Cressie N. Wiley; New York: 1993. Statistics for Spatial Data (Revised Edition)
  32. Cressie N., Wikle C.K. Wiley; New York: 2015. Statistics for Spatio-Temporal Data.
  33. Datta A., Banerjee S., Hodges J.S., Gao L. Spatial disease mapping using directed acyclic graph auto-regressive (DAGAR) models. Bayesian Anal. 2019;14(4):1221–1244. doi: 10.1214/19-ba1177.
  34. Dempster A.P., Laird N.M., Rubin D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 1977;39(1):1–22.
  35. Denison D.G.T., Holmes C.C. Bayesian partitioning for estimating disease risks. Biometrics. 2001;57:143–149. doi: 10.1111/j.0006-341x.2001.00143.x.
  36. Etxeberria J., Ugarte M.D., Goicoa T., Militino A.F. On predicting cancer mortality using ANOVA-type P-spline models. REVSTAT-Statistical J. 2015;13(1):21–40.
  37. Feng C. Spatial–temporal generalized additive model for modeling COVID-19 mortality risk in Toronto, Canada. Spatial Stat. 2021 doi: 10.1016/j.spasta.2021.100526.
  38. Fernandez C., Green P.J. Modelling spatially correlated data via mixture: A Bayesian approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 2002;64(4):805–826.
  39. Gao H., Bradley J.R. Bayesian analysis of areal data with unknown adjacencies using the stochastic edge mixed effects model. Spatial Stat. 2019;31:2211–6753.
  40. Gao L., Datta A., Banerjee S. UCLA: Biostatistics. 2021. Multivariate directed acyclic graph auto-regressive (MDAGAR) models for spatial diseases mapping. Retrieved from https://escholarship.org/uc/item/88c7t942.
  41. Gelfand A.E., Schmidt A.M., Banerjee S., Sirmans C.F. Nonstationary multivariate process modelling through spatially varying coregionalization (with discussion) Test. 2004;13:263–312.
  42. Gelfand A.E., Smith A.F.M. Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 1990;85(410):398–409.
  43. Gelfand A.E., Vounatsou P. Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics. 2003;4:11–25. doi: 10.1093/biostatistics/4.1.11.
  44. Gelman A., Carlin J.B., Stern H.S., Dunson D.B. Chapman and Hall/CRC; 1995. Bayesian Data Analysis.
  45. Gilks W.R., Richardson S., Spiegelhalter D.J. Chapman and Hall/CRC; 1998. Markov Chain Monte Carlo in Practice.
  46. Girolami M., Calderhead B. Riemann manifold Langevin and Hamiltonian Monte Carlo methods (with discussions) J. R. Stat. Soc. Ser. B Stat. Methodol. 2011;73:123–214.
  47. Goicoa T., Ugarte M.D., Etxeberria J., Militino A.F. Age–space–time CAR models in Bayesian disease mapping. Stat. Med. 2016;30:2391–2405. doi: 10.1002/sim.6873.
  48. Greco F.P., Trivisano C. A multivariate CAR model for improving the estimation of relative risks. Stat. Med. 2009;28 doi: 10.1002/sim.3577. 1707-1224.
  49. Green P.J., Richardson S. Hidden Markov models and disease mapping. J. Amer. Statist. Assoc. 2002;97(460):1–16.
  50. Haviland A. The geographical distribution of disease in England and Wales. The British Med J. 1871;7:5–6.
  51. Haviland A. The geographical distribution of cancerous disease in the British isles. Lancet. 1888;March 3:412–414.
  52. Held L. Bayesian modelling of inseparable space–time variation in disease risk. Stat. Med. 2000;19(17–18):2555–2567. doi: 10.1002/1097-0258(20000915/30)19:17/18<2555::aid-sim587>3.0.co;2-#.
  53. Held L., Besag J. Modelling risk from a disease in time and space. Stat. Med. 1998;17(18):2045–2060. doi: 10.1002/(sici)1097-0258(19980930)17:18<2045::aid-sim943>3.0.co;2-p.
  54. Held L., Best N.G. A shared component model for detecting joint and selective clustering of two diseases. J. R. Stat. Soc. Ser. C. Appl. Stat. 2001;164(1):73–85.
  55. Hoffman, M.D., Radul, A., Sountsov, P., 2021. An Adaptive MCMC Scheme for Setting Trajectory Lengths in Hamiltonian Monte Carlo. In: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR: Volume 130.
  56. Huang G., Patrick E., Brown P.E. Population-weighted exposure to air pollution and COVID-19 incidence in Germany. Spatial Stat. 2021;41:2211–6753. doi: 10.1016/j.spasta.2020.100480.
  57. Hughes J. Copcar: A flexible regression model for areal data. J. Comput. Graph. Statist. 2015;24:733–755. doi: 10.1080/10618600.2014.948178.
  58. Hughes J., Haran M. Dimension reduction and alleviation of confounding for spatial generalized linear mixed effects models. J. the R Stat Soc B (Methodology) 2013;75(1):139–159.
  59. Jack E., Lee D., Dean N. Estimating the changing nature of Scotland’s health inequalities by using a multivariate spatiotemporal model. J. the R Stat Soc. Ser A, (Statistics in Society) 2019;182(3):1061–1080. doi: 10.1111/rssa.12447.
  60. Jin X., Carlin B.P., Banerjee S. Order-free co-regionalized areal data models with application to multiple-disease mapping. J. R. Stat. Soc. Ser. B Stat. Methodol. 2007;69(5):817–838. doi: 10.1111/j.1467-9868.2007.00612.x.
  61. Kim H., Sun D., Tsutakawa R.K. A bivariate Bayes method for improving the estimates of mortality rates with a twofold conditional autoregressive model. J. Amer. Statist. Assoc. 2001;96(456):1506–1521.
  62. Lawson A.B. Chapman and Hall/CRC; 2018. Bayesian Disease Mapping Hierarchical Modeling in Spatial Epidemiology (Third Edition)
  63. Lawson A., Biggeri A., Bohning D., Lesaffre E., Viel J.-F., Bertollini R. Wiley; 1999. Disease Mapping and Risk Assessment for Public Health.
  64. Lee D. A comparison of conditional autoregressive models used in Bayesian disease mapping. Spatial and Spatio-Temporal Epidemiol. 2011;2(2):79–89. doi: 10.1016/j.sste.2011.03.001.
  65. Lee D. CARBayes: an R package for Bayesian spatial modeling with conditional autoregressive priors. J. Stat. Softw. 2013;55(13):1–24.
  66. Lee D.J., Durbán M. P-spline ANOVA-type interaction models for spatio-temporal smoothing. Stat Modell. 2011;11(1):49–69.
  67. Lee D., Lawson A. Quantifying the spatial inequality and temporal trends in maternal smoking rates in glasgow. Ann Appl Stat. 2016;10(3):1427–1446. doi: 10.1214/16-AOAS941. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Lee D., Mitchell R. Boundary detection in disease mapping studies. Biostatistics. 2012;13(3):415–426. doi: 10.1093/biostatistics/kxr036. [DOI] [PubMed] [Google Scholar]
  69. Lee D., Robertson C., Marques C. Quantifying the small-area spatio-temporal dynamics of the Covid-19 pandemic in Scotland during a period with limited testing capacity. Spatial Stat. 2021 doi: 10.1016/j.spasta.2021.100508. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Lee D., Rushworth A., Sahu S.K. A Bayesian localized conditional autoregressive model for estimating the health effects of air pollution. Biometrics. 2014;70(2):419–429. doi: 10.1111/biom.12156. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Lee D., Sarran C. Controlling for unmeasured confounding and spatial misalignment in long-term air pollution and health studies. Environmetrics. 2015;26(2):477–487. doi: 10.1002/env.2348. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Leroux B.G., Lei X., Breslow N. In: Statistical Models in Epidemiology, the Environment and Clinical Trials. Halloran M.E., Berry D., editors. Springer; New York: 1999. Estimation of disease rates in small areas: a new mixed model for spatial dependence. 135178. [Google Scholar]
  73. Li X., Dey D.K. Estimation of COVID-19 mortality in the United States using spatio-temporal conway maxwell Poisson model. Spatial Stat. 2021 doi: 10.1016/j.spasta.2021.100542. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Liang Y. Graph-based multivariate conditional autoregressive models. Stat Theory and Related Fields. 2019;3(2):158–169. [Google Scholar]
  75. Lindgren F., Rue H., Lindstrom J. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 2011;73(4):423–498. [Google Scholar]
  76. Lu H., Carlin B.P. Bayesian areal wombling for geographical boundary analysis. Geograph Anal. 2005;37(3):265–285. [Google Scholar]
  77. Lu H., Reilly C.S., Banerjee S., Carlin B.P. Bayesian areal Wombling via adjacency modelling. Environ Ecol Stat. 2007;14:433–452. [Google Scholar]
  78. MacNab Y.C. A Bayesian hierarchical model for accident and injury surveillance. Accid. Anal. Prev. 2003;35(1):91–102. doi: 10.1016/s0001-4575(01)00093-8. [DOI] [PubMed] [Google Scholar]
  79. MacNab Y.C. Hierarchical Bayesian spatial modelling of small-area rates of non-rare disease. Stat. Med. 2003;22(10):1761–1773. doi: 10.1002/sim.1463. [DOI] [PubMed] [Google Scholar]
  80. MacNab Y.C. Hierarchical Bayesian modelling of spatially correlated health service outcome and utilization rates. Biometrics. 2003;59(2):305–316. doi: 10.1111/1541-0420.00037. [DOI] [PubMed] [Google Scholar]
  81. MacNab Y.C. Bayesian spatial and ecological models for small-area accident and injury analysis. Accid. Anal. Prev. 2004;36(6):1019–1028. doi: 10.1016/j.aap.2002.05.001. [DOI] [PubMed] [Google Scholar]
  82. MacNab Y.C. Spline smoothing in Bayesian disease mapping. Environmetrics. 2007;18(7):727–744. [Google Scholar]
  83. MacNab Y.C. Bayesian multivariate disease mapping and ecological regression with errors in covariates. Stat. Med. 2009;28(9):1369–1385. doi: 10.1002/sim.3547. [DOI] [PubMed] [Google Scholar]
  84. MacNab Y.C. On Bayesian shared component disease mapping and ecological regression with errors in covariates. Stat. Med. 2010;29(11):1239–1249. doi: 10.1002/sim.3875. [DOI] [PubMed] [Google Scholar]
  85. MacNab Y.C. On Gaussian Markov random fields and Bayesian disease mapping. Stat Methods Med Res. 2011;20(1):49–68. doi: 10.1177/0962280210371561. [DOI] [PubMed] [Google Scholar]
  86. MacNab Y.C. On identification in Bayesian disease mapping and ecological-spatial regression. Stat Methods Med Res. 2014;23(2):134–155. doi: 10.1177/0962280212447152. [DOI] [PubMed] [Google Scholar]
  87. MacNab Y.C. Linear models of coregionalization for multivariate lattice data: a general framework for coregionalized multivariate CAR models. Stat. Med. 2016;35:3827–3850. doi: 10.1002/sim.6955. [DOI] [PubMed] [Google Scholar]
  88. MacNab Y.C. Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free MCARs. Stat Methods Med Res. 2016;25(4):1118–1144. doi: 10.1177/0962280216660419. [DOI] [PubMed] [Google Scholar]
  89. MacNab Y.C. Some recent work on multivariate Gaussian Markov random fields (with discussions). TEST. 2018;27(3):497–541. [Google Scholar]
  90. MacNab Y.C. Bayesian estimation of multivariate Gaussian Markov random fields with constraint. Stat. Med. 2020;39(30):4767–4788. doi: 10.1002/sim.8752. [DOI] [PubMed] [Google Scholar]
  91. MacNab, Y.C., 2021. On coregionalized multivariate Gaussian Markov random fields – Part I: Construction and parameterization. Manuscript under peer-review.
  92. MacNab, Y.C., 2021. On coregionalized multivariate Gaussian Markov random fields – Part II: Bayesian estimation and inference, with simulation and case studies. Manuscript under peer-review.
  93. MacNab Y.C., Dean C.B. Autoregressive spatial smoothing and temporal spline smoothing for mapping rates. Biometrics. 2001;57(3):949–956. doi: 10.1111/j.0006-341x.2001.00949.x. [DOI] [PubMed] [Google Scholar]
  94. MacNab Y.C., Farrell P., Gustafson P., Wen S. Estimation in Bayesian disease mapping. Biometrics. 2004;60:865–873. doi: 10.1111/j.0006-341X.2004.00241.x. [DOI] [PubMed] [Google Scholar]
  95. MacNab Y.C., Gustafson P. Regression B-spline smoothing in Bayesian disease mapping: with an application to patient safety surveillance. Stat. Med. 2007;26(24):4455–4474. doi: 10.1002/sim.2868. [DOI] [PubMed] [Google Scholar]
  96. MacNab Y.C., Kmetic A., Gustafson P., Sheps S. An innovative application of Bayesian disease mapping methods to patient safety research: the Canadian iatrogenic injury study. Stat. Med. 2006;25(23):3960–3980. doi: 10.1002/sim.2507. [DOI] [PubMed] [Google Scholar]
  97. MacNab Y.C., Lin Y. On empirical Bayes penalized quasi-likelihood inference in GLMMs and in disease mapping and ecological modeling. Comput. Statist. Data Anal. 2009;53(8):2950–2967. [Google Scholar]
  98. Manton K.G., Woodbury M.A., Stallard E., Riggan W.B., Creason J.P., Pellom A.C. Empirical Bayes procedures for stabilizing maps of U.S. cancer mortality rates. J. Amer. Statist. Assoc. 1989;84(407):637–650. doi: 10.1080/01621459.1989.10478816. [DOI] [PubMed] [Google Scholar]
  99. Marcotte D., Allard D. Gibbs sampling on large lattice with GMRF. Comput. Geosci. 2018;111:190–199. [Google Scholar]
  100. Mardia K.V. Multi-dimensional multivariate Gaussian Markov random fields with application to image processing. J. Multivariate Anal. 1988;24:265–284. [Google Scholar]
  101. Martinez-Beneito M.A. A general modeling framework for multivariate disease mapping. Biometrika. 2013;100(3):539–553. [Google Scholar]
  102. Martinez-Beneito M.A. Some links between conditional and coregionalized multivariate Gaussian Markov random fields. Spatial Stat. 2020;40 [Google Scholar]
  103. Martinez-Beneito M.A., Botella-Rocamora P. CRC Press; 2019. Disease Mapping: From Foundations To Multidimensional Modeling. [Google Scholar]
  104. Martinez-Beneito M.A., Botella-Rocamora P., Banerjee S. Towards a multidimensional approach to Bayesian disease mapping. Bayesian Anal. 2017;12(1):239–259. doi: 10.1214/16-BA995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Martuzzi M., Elliott P. Empirical Bayes estimation of small area prevalence of non-rare conditions. Stat. Med. 1996;15(17):1867–1873. doi: 10.1002/(SICI)1097-0258(19960915)15:17<1867::AID-SIM398>3.0.CO;2-2. [DOI] [PubMed] [Google Scholar]
  106. McCullagh P., Nelder J.A. Chapman and Hall; 1983. Generalized Linear Models (Second Edition) [Google Scholar]
  107. Moraga P., Cramb S.M., Mengersen K.L., et al. A geostatistical model for combined analysis of point-level and area-level data using INLA and SPDE. Spatial Stat. 2017;21:27–41. [Google Scholar]
  108. Nathoo S., Ghosh P. Skew-elliptical spatial random effect modeling for areal data with application to mapping health utilization rates. Stat. Med. 2012;32(2):290–306. doi: 10.1002/sim.5504. [DOI] [PubMed] [Google Scholar]
  109. Orozco-Acosta E., Adin A., Ugarte M.D. Scalable Bayesian modelling for smoothing disease risks in large spatial data sets using INLA. Spatial Stat. 2021;41 [Google Scholar]
  110. Paciorek C.J. The importance of scale for spatial-confounding bias and precision of spatial regression estimators. Statist. Sci. 2010;25(1):107–125. doi: 10.1214/10-STS326. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Prates M.O., Assuncao R.M., Rodrigues E.C. Alleviating spatial confounding for areal data problems by displacing the geographical centroids. Bayesian Anal. 2019;14:623–647. [Google Scholar]
  112. Prates M., Azevedo D., MacNab Y.C., Willig M. 2021. Non-separable spatio-temporal models via transformed Gaussian Markov random fields. Under peer review. [Google Scholar]
  113. Prates M.O., Dey D.K., Willig M.R., Yan J. Transformed Gaussian Markov random fields and spatial modeling of species abundance. Spatial Stat. 2015;14:382–399. [Google Scholar]
  114. Reich B.J., Hodges J.S. Modeling longitudinal periodontal data: A spatially-adaptive model with tools for specifying priors and checking fit. Biometrics. 2008;64:790–799. doi: 10.1111/j.1541-0420.2007.00956.x. [DOI] [PubMed] [Google Scholar]
  115. Reich B.J., Hodges J.S., Zadnik V. Effects of residual smoothing on the posterior of the fixed effects in disease-mapping models. Biometrics. 2006;62(4):1197–1206. doi: 10.1111/j.1541-0420.2006.00617.x. [DOI] [PubMed] [Google Scholar]
  116. Richardson S., Thomson T., Best N., Elliott P. Interpreting posterior relative risk estimates in disease-mapping studies. Environ Health Perspect. 2004;112(9):1016–1025. doi: 10.1289/ehp.6740. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Rue H., Held L. Chapman & Hall; New York: 2005. Gaussian Markov Random Fields - Theory and Applications. [Google Scholar]
  118. Rue H., Riebler A., Sørbye S.H., Illian J.B., Simpson D.P., Lindgren F.K. Bayesian computing with INLA: A review. Annu. Rev. Stat. Appl. 2017;4(1):395–421. [Google Scholar]
  119. Rushworth A., Lee D., Sarran C. An adaptive spatiotemporal smoothing model for estimating trends and step changes in disease risk. Appl. Stat. 2017;66(1):141–157. [Google Scholar]
  120. Sahu S.K., Böhning D. Bayesian spatio-temporal joint disease mapping of Covid-19 cases and deaths in local authorities of England. Spatial Stat. 2021 doi: 10.1016/j.spasta.2021.100519. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Sain S.R., Cressie N.A.C. A spatial analysis of multivariate lattice data. J. Econometrics. 2007;140:226–259. [Google Scholar]
  122. Sain S.R., Furrer R., Cressie N.A.C. A spatial analysis of multivariate output from regional climate models. Ann. Appl. Stat. 2011;5(1):150–175. [Google Scholar]
  123. Slater J.J., Brown P.E., Rosenthal J.S., Mateu J. Capturing spatial dependence of COVID-19 case counts with cellphone mobility data. Spatial Stat. 2021 doi: 10.1016/j.spasta.2021.100540. In press. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Smith A.F.M., Gelfand A.E. Bayesian statistics without tears: a sampling–resampling perspective. Amer. Statist. 1992;46(2):84–88. [Google Scholar]
  125. Spiegelhalter D.J., Best N.G., Carlin B.P., van der Linde A. Bayesian measures of model complexity and fit (with discussion) J. R. Stat. Soc. Ser. B Stat. Methodol. 2002;64(4):583–639. [Google Scholar]
  126. Spiegelhalter D., Thomas A., Best N., Lunn D. 2003. WinBUGS User manual. [Google Scholar]
  127. Sun D., Tsutakawa R.K., Speckman P.L. Posterior distribution of hierarchical models using CAR(1) distributions. Biometrika. 1999;86:341–350. [Google Scholar]
  128. Tan L.S.L. Use of model reparametrization to improve variational Bayes. J. R. Stat. Soc. Ser. B Stat. Methodol. 2020;83(1):30–57. [Google Scholar]
  129. Thomas A., Best N., Lunn D., Arnold R., Spiegelhalter D. 2004. GeoBUGS User manual. [Google Scholar]
  130. Tsutakawa R.K. Mixed model for analyzing geographic variability in disease rates. J. Amer. Statist. Assoc. 1988;83:37–42. doi: 10.1080/01621459.1988.10478562. [DOI] [PubMed] [Google Scholar]
  131. Tsutakawa R.K., Shoop G.L., Marienfeld C.J. Empirical Bayes estimation of cancer mortality rates. Stat. Med. 1985;4(2):201–212. doi: 10.1002/sim.4780040210. [DOI] [PubMed] [Google Scholar]
  132. Tzala E., Best N. Bayesian latent variable modelling of multivariate spatio-temporal variation in cancer mortality. Stat. Methods Med. Res. 2008;17(1):97–118. doi: 10.1177/0962280207081243. [DOI] [PubMed] [Google Scholar]
  133. Ugarte M.D., Adin A., Goicoa T. One-dimensional, two-dimensional, and three dimensional B-splines to specify space–time interactions in Bayesian disease mapping: Model fitting and model identifiability. Spatial Stat. 2017;22(2):451–468. [Google Scholar]
  134. Ugarte M.D., Goicoa T., Ibanez B., Militino A.F. Evaluating the performance of spatio-temporal Bayesian models in disease mapping. Environmetrics. 2009;20(6):647–665. [Google Scholar]
  135. Ugarte M.D., Goicoa T., Militino A.F. Spatio-temporal modeling of mortality risks using penalized splines. Environmetrics. 2010;21(3–4):270–289. [Google Scholar]
  136. Utazi C.E., Thorley J., Alegana V.A., Ferrari M.J., Nilsen K., Takahashi S., Metcalf C.J.E., Lessler J., Tatem A.J. A spatial regression model for the disaggregation of areal unit based data to high-resolution grids with application to vaccination coverage mapping. Stat. Methods Med. Res. 2018;28(10–11):3226–3241. doi: 10.1177/0962280218797362. [DOI] [PMC free article] [PubMed] [Google Scholar]
  137. Vicente G., Goicoa T., Ugarte M.D. Bayesian inference in multivariate spatio-temporal areal models using INLA: analysis of gender-based violence in small areas. Stochastic Environ Res Risk Assess. 2020;34:1421–1440. [Google Scholar]
  138. Vicente G., Goicoa T., Ugarte M.D. Multivariate Bayesian spatio-temporal P-spline models to analyze crimes against women. Biostatistics. 2021 doi: 10.1093/biostatistics/kxab042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  139. Wakefield J., Morris S. Spatial dependence and errors-in-variables in environmental epidemiology. Bayesian Statistics. 1999;6:657–684. [Google Scholar]
  140. Wakefield J. Disease mapping and spatial regression with count data. Biostatistics. 2007;8(2):158–183. doi: 10.1093/biostatistics/kxl008. [DOI] [PubMed] [Google Scholar]
  141. Wakefield J., Salway R. A statistical framework for ecological and aggregate studies. J. R. Stat. Soc. Ser. A Stat. Soc. 2001;164:119–137. [Google Scholar]
  142. Waller L.A., Carlin B.P., Xia H., Gelfand A.E. Hierarchical spatio-temporal mapping of disease rates. J. Amer. Statist. Assoc. 1997;92(438):607–617. [Google Scholar]
  143. Waller L.A., Gotway C.A. Wiley; 2004. Applied Spatial Statistics for Public Health Data. [Google Scholar]
  144. Zhang C., Shahbaba B., Zhao H. Variational Hamiltonian Monte Carlo via score matching. Bayesian Anal. 2018;13(2):485–506. doi: 10.1214/17-ba1060. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Zimmerman D.L., Ver Hoef J.M. On deconfounding spatial confounding in linear models. Amer. Statist. 2021 doi: 10.1080/00031305.2021.1946149. [DOI] [Google Scholar]

Articles from Spatial Statistics are provided here courtesy of Elsevier