Author manuscript; available in PMC: 2018 Dec 1.
Published in final edited form as: Environmetrics. 2017 Sep 25;28(8):e2465. doi: 10.1002/env.2465

Spatiotemporal multivariate mixture models for Bayesian model selection in disease mapping

AB Lawson A, R Carroll A,*, C Faes B, RS Kirby C, M Aregay A, K Watjou B
PMCID: PMC5722237  NIHMSID: NIHMS898673  PMID: 29230091

Abstract

Researchers often wish to simultaneously explore the behavior of, and estimate overall risk for, multiple related diseases with varying rarity while accounting for potential spatial and/or temporal correlation. In this paper, we propose a flexible class of multivariate spatio-temporal mixture models to fill this role. Further, these models offer flexibility with the potential for model selection as well as the ability to accommodate lifestyle, socio-economic, and physical environmental variables with spatial, temporal, or both structures. Here, we explore the capability of this approach via a large scale simulation study and examine a motivating data example involving three cancers in South Carolina. The results, which focus on four model variants, suggest that all models can recover the simulation ground truth and display improved model fit over two baseline Knorr-Held spatio-temporal interaction model variants in a real data application.

Keywords: Poisson, mixture model, model selection, shared components, McMC

1. Introduction

Recent studies have demonstrated the importance of mixture model selection methods for use in distinguishing the most appropriate linear predictor for a given set of data in the disease mapping framework. These methods are often used in place of variable or traditional model selection as they offer many advantages. These advantages include but are not limited to incorporating collinear predictors (Garcia et al, 2010; George and Clyde, 2004; Hoeting et al, 2002; Rockova and George, 2014; Scheel et al, 2014; Bondell et al, 2010), reducing the necessary amount of modeling parameters (Lee et al, 2014; Carroll et al, 2016a; Carroll et al, 2016b), and offering the possibility of a final model fit (Carroll et al, 2016c). In particular, the Bayesian paradigm furnishes a reasonable approach to multivariate spatial (Torabi, 2016) and spatio-temporal small area health studies (Waller et al, 1997; Wikle et al, 2001), where a range of diseases with common etiology or exposure responses can be modeled together with a correlation structure between diseases.

These mixture models follow a similar structure to the BayeSTDetect model proposed by Li et al, 2012, though the implication is quite different. In that setting, the goal is to detect departure patterns in small area health data. In application, they use these methods to determine if a new government policy alters the patterns in diagnosis of chronic pulmonary disease. Instead, our methods capitalize on the ability to detect changes in patterns and apply it to the question of whether a particular spatial or spatio-temporal (ST) unit is better explained by strictly spatially varying information, ST varying information, or a mixture of the two. Further, we apply these models in the multivariate setting. Thus, these multivariate mixture model selection methods are the focus of this paper.

The motivating data for this project involve three related cancers in the state of South Carolina: oral cavity and pharynx (OCPCa), lung and bronchus (LBCa), and melanoma cancer of the skin (MCaS). These cancers have some common risk factors but vary in rarity. OCPCa and LBCa are linked through tobacco smoke exposure, while OCPCa and MCaS are linked through ultraviolet light exposure. OCPCa is the least common of these with a 0.0001 rate of disease while MCaS and LBCa are among the most common cancer types with rates of 0.0008 and 0.0003 respectively (U.S. Cancer Statistics Working Group, 2015). We believe that a multivariate mixture model of the type explored in this paper will allow for better modeling of the rarer disease, OCPCa, by the sharing of information with the more common diseases, LBCa and MCaS, with which it shares either behavioral or environmental risk factors.

This methodology is explored via a large-scale simulation study in addition to the real data case study involving the motivating data. The aims of the simulation study are to 1) assess ground truth recovery and 2) fit models in the presence of misspecification. Thus, the proposed scenarios offer information about how well a given model fits data simulated from that same model and how well the different fitted methods perform when the ground truth is not the same. Further, the fitted models offer situations such that the goal is either to simply model ST variation or model different types of covariates in addition to underlying ST variation.

The paper is developed as follows. First, we describe the statistical methods used in the fitted models, present our motivating and simulated data, outline our methods for model comparison, and discuss computational considerations. Next, we summarize the results related to the simulation study as well as the motivating data. Finally, we discuss and draw conclusions based on these results.

2. Methods

This paper focusses on the context of disease mapping in I predefined small areas across J units of time for K diseases. For all simulated data and fitted model scenarios defined in the following sections, we make the common assumptions of a Poisson data model for the ith area at time j for disease k (Lawson, 2013).

$$y_{ijk} \mid \mu_{ijk} \sim Pois(\mu_{ijk}), \qquad \log(\mu_{ijk}) = \log(e_{ijk}) + \log(\theta_{ijk})$$

where the outcome, $y_{ijk}$, is an observed aggregated count of disease, $\mu_{ijk}$ is the mean of the Poisson distribution, $e_{ijk}$ is the expected rate of disease, and $\theta_{ijk}$ is the relative risk. We are interested in modeling the log relative risk, $\log(\theta_{ijk})$. The expected rate, $e_{ijk}$, is assumed known and is defined as the product of a fixed disease rate and the recorded population of the area of interest, which can be obtained from a public health resource.
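The data model can be illustrated with a minimal Python sketch. The function names, the population of 50,000, and the relative risk of 1.2 are illustrative assumptions and are not taken from the paper; only the log-linear relationship and the Poisson likelihood follow the definitions above.

```python
import math


def expected_count(rate, population):
    """Expected value e: a fixed disease rate times the recorded population."""
    return rate * population


def poisson_mean(expected, relative_risk):
    """Poisson mean implied by log(mu) = log(e) + log(theta)."""
    return math.exp(math.log(expected) + math.log(relative_risk))


def poisson_log_pmf(y, mu):
    """Log probability of observing count y under Pois(mu)."""
    return y * math.log(mu) - math.lgamma(y + 1) - mu


# Hypothetical area: disease rate 0.0001, population 50,000 (assumed values).
e = expected_count(0.0001, 50_000)
mu = poisson_mean(e, relative_risk=1.2)  # assumed relative risk
log_lik = poisson_log_pmf(7, mu)         # log likelihood of an observed count of 7
```

A relative risk above one scales the expected value upward, so an area with $\theta_{ijk} > 1$ has more cases than its population alone would suggest.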

In this study, a mixture model structure for the log relative risk is proposed. A general definition of this model is as follows:

$$\log(\theta_{ijk}) = a_{0k} + \sum_h p_{ijkh} M_{ijkh}, \qquad \sum_h p_{ijkh} = 1; \quad p_{ijkh} \in [0, 1]$$

where $a_{0k} \sim N(0, \tau_a^{-1})$ is an intercept, $p_{ijkh}$ is a mixture parameter, and $M_{ijkh}$ is a mixture component corresponding to either a spatial or spatio-temporal functional form. The mixture parameter can be assumed to be fixed or varying across space and time. Additionally, spatial, temporal, or both dependencies could be assumed for this mixture parameter. The specific options used for the mixture components and mixture parameter are explained in the following sections.

2.1. Specific Models

In what follows we describe our specific modeling strategy. For these fitted models, we assume that $h \in \{S, ST\}$, and the mixture components are defined such that mixing occurs between $M_{ik}^{S}$, a spatial component, and $M_{ijk}^{ST}$, a ST component that also accommodates temporal-only effects. This structure was demonstrated to be appropriate in previous studies (Carroll et al, 2016c; Carroll et al, 2017), and provides enough flexibility without extra parameterization. Additionally, it is parsimonious and thus reduces the number of mixture parameters necessary for estimation to one. We can simplify the general definition and propose a two component mixture of the form:

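$$\log(\theta_{ijk}) = a_{0k} + p_{ijk} M_{ik}^{S} + (1 - p_{ijk}) M_{ijk}^{ST} \qquad (1)$$
$$M_{ik}^{S} = \mathbf{X}_i \boldsymbol{\beta}_k^{S} + u_{ik} + v_i \qquad (2)$$
$$M_{ijk}^{ST} = \mathbf{X}_{ij} \boldsymbol{\beta}_{jk}^{ST} + X_j \beta_k^{T} + \gamma_j + \phi_{ijk} \qquad (3)$$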
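As a minimal sketch of how (1), (2), and (3) combine for a single area, time point, and disease, the following Python functions evaluate the two components and mix them. All function names and numeric values are hypothetical and chosen only for illustration; in the fitted models these quantities are estimated rather than fixed.

```python
def spatial_component(x, beta, u, v):
    """M^S from (2): spatial covariates plus uncorrelated (u) and correlated (v) effects."""
    return sum(xi * bi for xi, bi in zip(x, beta)) + u + v


def st_component(x_st, beta_st, x_t, beta_t, gamma, phi):
    """M^ST from (3): ST covariates, one temporal covariate, random walk, interaction."""
    return sum(xi * bi for xi, bi in zip(x_st, beta_st)) + x_t * beta_t + gamma + phi


def mixture_log_relative_risk(a0, p, m_spatial, m_st):
    """log(theta) from (1): intercept plus a p-weighted mix of the two components."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("mixture weight p must lie in [0, 1]")
    return a0 + p * m_spatial + (1.0 - p) * m_st


# Hypothetical standardized covariates, coefficients, and random effects.
m_s = spatial_component([0.5, -1.0], [0.2, 0.1], u=0.05, v=-0.02)
m_st = st_component([1.0], [0.3], x_t=0.5, beta_t=-0.2, gamma=0.1, phi=0.0)
log_rr = mixture_log_relative_risk(a0=-0.1, p=0.25, m_spatial=m_s, m_st=m_st)
```

With p close to one the unit is explained mostly by the spatial component, and with p close to zero mostly by the ST component, which is what allows the mixture parameter to act as a model selection device.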

In our Bayesian framework, we assume prior distributions for all fixed effect parameters and random effects. In the components defined in (1), (2), and (3), bold symbols denote vectors, such that each predictor has its own parameter estimate. However, in our example we only have one temporal predictor. Thus, this term reduces to $X_j \beta_k^{T}$, where $\beta_k^{T}$ is a single parameter associated with the temporally varying predictor $X_j$. In these definitions, $\mathbf{X}_i$ and $\mathbf{X}_{ij}$ represent the $i$th and $ij$th values of the spatial or ST covariates, respectively. Note that predictors are standardized (zero mean and variance of one) for model fitting. For the spatial random effects, the uncorrelated heterogeneous term, $u_{ik} \sim N(0, \tau_{u_k}^{-1})$, accompanied by the correlated heterogeneous (CH) term, defined as $v_i \sim N\left(\frac{1}{n_i}\sum_{i \sim l} v_l, \frac{1}{n_i \tau_v}\right)$, creates a convolution term. The CH term follows an intrinsic conditional autoregressive (ICAR) model (Besag and Green, 1993; Besag et al, 1991) where $n_i$ is the number of neighbors for county $i$, and $i \sim l$ indicates that the two counties $i$ and $l$ are neighbors ($i \neq l$). We notate this prior distribution subsequently as $CAR(\tau_*^{-1})$ where $\tau_*$ is the precision of the effect that is distributed ICAR. The spatio-temporal component includes a type of temporal random walk term, $\gamma_j \sim N(\gamma_{j-1}, \tau_\gamma^{-1})$, and an uncorrelated ST interaction term, $\phi_{ijk} \sim N(0, \tau_{\phi_{jk}}^{-1})$. Further, the absence of a $k$ subscript indicates that the parameter is a shared effect; thus, the CH and temporal random walk terms are common among the multiple diseases (Knorr-Held and Best, 2001). This sharing indicates that the underlying temporal and spatial correlation is the same for all diseases.
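The ICAR conditional distribution for the CH term can be sketched in a few lines of Python. The three-area adjacency structure, the current values, and the precision below are hypothetical; the function only evaluates the conditional mean (the average of the neighboring values) and the conditional variance (one over the number of neighbors times the precision) stated above.

```python
def icar_conditional(i, values, neighbors, tau):
    """Conditional mean and variance of v_i given the values of its neighbors."""
    nbrs = neighbors[i]
    n_i = len(nbrs)
    if n_i == 0:
        raise ValueError("area has no neighbors, so the ICAR conditional is undefined")
    mean = sum(values[l] for l in nbrs) / n_i
    variance = 1.0 / (n_i * tau)
    return mean, variance


# Hypothetical map: area 0 borders areas 1 and 2, which do not border each other.
neighbors = {0: [1, 2], 1: [0], 2: [0]}
values = [0.0, 0.2, -0.4]  # assumed current values of v
cond_mean, cond_var = icar_conditional(0, values, neighbors, tau=2.0)
```

Areas with more neighbors have a smaller conditional variance, so their effects are pulled more strongly toward the local average.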

We assume non-informative prior distributions for the regression parameters such that $\boldsymbol{\beta}_k^{S} \sim N(0, \tau_{\beta_k^{S}}^{-1})$, $\beta_k^{T} \sim N(0, \tau_{\beta_k^{T}}^{-1})$, and $\boldsymbol{\beta}_{jk}^{ST} \sim N(0, \tau_{\beta_k^{ST}}^{-1})$. Finally, all preceding and subsequent prior distributions for standard deviations of parameters are such that $\tau_*^{-1/2} \sim Unif(0, C)$. It was found that C = 4 was essentially non-informative through sensitivity analysis that included comparisons to uniform distributions with larger ranges (Gelman, 2006). Further, other studies have shown that relative risks are not overly influenced by prior distributions placed on the variance parameters (Bernardinelli et al, 1995).

In the following sections, we describe our four fitted models named F1 up to F4, and they are summarized for the univariate (k = 1) and multivariate (k = K) cases in Table 1. The descriptions outlined in these sections illustrate how the models differ largely via the structure of the mixture parameter, but we also provide some options for mixture component specification. So far, we have described what will henceforth be referred to as ‘PRED’ models. Alternatively, we also fit ‘RE’ models wherein the mixture components only incorporate random effects such that $M_{ik}^{S} = u_{ik} + v_i$ and $M_{ijk}^{ST} = \gamma_j + \phi_{ijk}$. The associated likelihood, prior distributions, and posteriors for all models utilized in this paper are included in supplemental section A.1.

Table 1.

Summary of fitted models F1 up to F4.

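| Model | Formula | Mixture parameter |
| --- | --- | --- |
| Univariate | | |
| F1 | $\log\theta_{ij} = a_0 + p_i M_i^{S} + (1 - p_i) M_{ij}^{ST}$ | $logit(p_i) = z_i + \alpha_i$; $z_i \sim Norm(0, \tau_z)$; $\alpha_i \sim Norm(0, \tau_\alpha)$ |
| F2 | $\log\theta_{ij} = a_0 + p_i M_i^{S} + (1 - p_i) M_{ij}^{ST}$ | $logit(p_i) = z_i + \alpha_i$; $z_i \sim CAR(\tau_z)$; $\alpha_i \sim Norm(0, \tau_\alpha)$ |
| F3 | $\log\theta_{ij} = a_0 + p_{ij} M_i^{S} + (1 - p_{ij}) M_{ij}^{ST}$ | $logit(p_{ij}) = z_{ij} + \alpha_{ij}$; $z_{ij} \sim CAR(\tau_{z_j})$; $\alpha_{ij} \sim Norm(0, \tau_\alpha)$ |
| F4 | $\log\theta_{ij} = a_0 + p_{ij} M_i^{S} + (1 - p_{ij}) M_{ij}^{ST}$ | $logit(p_{ij}) = (z_{ij} + w_j)/2 + \alpha_{ij}$; $z_{ij} \sim CAR(\tau_z)$; $w_j \sim RW(1)(\tau_{w_j})$; $\alpha_{ij} \sim Norm(0, \tau_\alpha)$ |
| Multivariate | | |
| F1 | $\log\theta_{ijk} = a_{0k} + p_{ik} M_{ik}^{S} + (1 - p_{ik}) M_{ijk}^{ST}$ | $logit(p_{ik}) = z_{ik} + \alpha_{ik}$; $z_{ik} \sim Norm(0, \tau_{z_k})$; $\alpha_{ik} \sim Norm(0, \tau_{\alpha_k})$ |
| F2 | $\log\theta_{ijk} = a_{0k} + p_{ik} M_{ik}^{S} + (1 - p_{ik}) M_{ijk}^{ST}$ | $logit(p_{ik}) = z_{ik} + \alpha_{ik}$; $z_{ik} \sim CAR(\tau_{z_k})$; $\alpha_{ik} \sim Norm(0, \tau_{\alpha_k})$ |
| F3 | $\log\theta_{ijk} = a_{0k} + p_{ijk} M_{ik}^{S} + (1 - p_{ijk}) M_{ijk}^{ST}$ | $logit(p_{ijk}) = z_{ijk} + \alpha_{ijk}$; $z_{ijk} \sim CAR(\tau_{z_{jk}})$; $\alpha_{ijk} \sim Norm(0, \tau_{\alpha_k})$ |
| F4 | $\log\theta_{ijk} = a_{0k} + p_{ijk} M_{ik}^{S} + (1 - p_{ijk}) M_{ijk}^{ST}$ | $logit(p_{ijk}) = (z_{ijk} + w_{jk})/2 + \alpha_{ijk}$; $z_{ijk} \sim CAR(\tau_{z_{jk}})$; $w_{jk} \sim RW(1)(\tau_{w_{jk}})$; $\alpha_{ijk} \sim Norm(0, \tau_{\alpha_k})$ |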

2.1.1. Spatially varying mixture parameter

The models described in this section only allow for a spatially varying mixture parameter, $p_{ijk} = p_{ik}$. This leads to a log relative risk model which is specified such that:

$$\log(\theta_{ijk}) = a_{0k} + p_{ik} M_{ik}^{S} + (1 - p_{ik}) M_{ijk}^{ST}, \qquad logit(p_{ik}) = z_{ik} + \alpha_{ik} \qquad (4)$$

where $\alpha_{ik} \sim Norm(0, \tau_{\alpha_k}^{-1})$ acts as an intercept and creates a vertical shift in the distribution associated with the $p_{ik}$ term. Two fitted models arise from this formulation, and they differ in the way the $z_{ik}$ term is specified.
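To make the role of the shift term concrete, the following Python sketch maps the logit-scale sum back to a mixture weight in [0, 1]. The function names and input values are hypothetical; the sketch assumes only the logit link given in (4).

```python
import math


def inv_logit(x):
    """Numerically stable inverse of the logit function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def mixture_weight(z, alpha):
    """Mixture parameter p from logit(p) = z + alpha."""
    return inv_logit(z + alpha)


p_centered = mixture_weight(z=0.0, alpha=0.0)  # no structure, no shift
p_shifted = mixture_weight(z=0.0, alpha=2.0)   # assumed positive shift toward the spatial component
```

A positive shift moves the weight toward the spatial component for every area at once, while the structured term moves individual areas around that shifted center.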

F1 model

This model assumes an uncorrelated linkage between the spatial and spatio-temporal mixture components. Thus, $z_{ik}$ is defined as $z_{ik} \sim N(0, \tau_{z_k}^{-1})$. While the inclusion of both $\alpha_{ik}$ and $z_{ik}$ appears redundant in this formulation, both are included for consistency with the subsequent fitted models; however, this could potentially lead to identifiability issues for this model. This is the simplest of the modeling alternatives.

F2 model

This model is an extension of F1 in that it assumes a correlated spatial structure within the mixture parameter via an ICAR distribution on the $z_{ik}$ term ($z_{ik} \sim CAR(\tau_{z_k}^{-1})$). Additionally, this is the first model that illustrates the ability of $\alpha_{ik}$ to create a shift in the ICAR structure of $p_{ik}$; thus, $logit(p_{ik})$ could potentially no longer be centered at zero. We examined models with and without this shifting parameter and determined that the fit improved when it was included.

2.1.2. Spatio-temporally varying mixture parameter

The models described in this section offer a spatio-temporally varying mixture parameter. This leads to a log relative risk model which is specified as in ( 1 ). As with the spatially varying mixture parameter specification, two more fitted models arise from this formulation that differ in how pijk is specified.

F3 model

This model employs a definition of $p_{ijk}$ which is nearly identical to that of (4). It is defined such that $logit(p_{ijk}) = z_{ijk} + \alpha_{ijk}$. Additionally, while F3 allows the mixture parameter to vary across space and time, the correlation remains only spatial. However, the precisions associated with the $z_{ijk}$ term are allowed to vary over time, creating a time labeled ICAR distribution, i.e. $z_{ijk} \sim CAR(\tau_{z_{jk}}^{-1})$. The shift parameter is also allowed to vary spatio-temporally but remains uncorrelated such that $\alpha_{ijk} \sim Norm(0, \tau_{\alpha_k}^{-1})$.

F4 model

This model continues to allow the mixture parameters to vary across space and time. But, instead of only spatial correlation as seen in F3, a temporal random walk of order one (RW(1)) term is also implemented in the mixture parameter structure. Thus, the definition of the mixture parameter is $logit(p_{ijk}) = (z_{ijk} + w_{jk})/2 + \alpha_{ijk}$ where $z_{ijk}$ and $\alpha_{ijk}$ are as in F3, $w_{jk}$ is the RW(1) term, and the sum of $z_{ijk}$ and $w_{jk}$ is divided by two so that the variance of this term as a whole is on the same scale as in the other fitted models. The definition of the RW(1) term, notated as $w_{jk} \sim RW(1)(\tau_{w_{jk}}^{-1})$, considers previous and subsequent time points by reducing the ICAR model to a single dimension (Thomas et al, 2014; Fahrmeir and Lang, 2001). The BUGS code for accomplishing this is included in the supplemental materials embedded within the example code for fitted model F4 as w1 and w2, and the formula is written as follows:

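$$w_{jk} \sim \begin{cases} N\left(w_{j+1,k}, \frac{1}{\tau_{w_{jk}}}\right) & \text{for } j = 1 \\ N\left(\frac{w_{j-1,k} + w_{j+1,k}}{2}, \frac{1}{2\tau_{w_{jk}}}\right) & \text{for } j = 2, \ldots, J-1 \\ N\left(w_{j-1,k}, \frac{1}{\tau_{w_{jk}}}\right) & \text{for } j = J \end{cases} \qquad (5)$$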
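The three cases in (5) can be sketched as a small Python function that returns the conditional mean and variance for one time point. This is not the BUGS code from the supplemental materials; the function name, the zero-based indexing, the single precision value, and the example series are assumptions made for illustration.

```python
def rw1_conditional(j, w, tau):
    """Conditional mean and variance of w[j] under an RW(1) prior (zero-based j)."""
    last = len(w) - 1
    if last < 1:
        raise ValueError("an RW(1) term needs at least two time points")
    if j == 0:                       # first time point: only a later neighbor
        return w[1], 1.0 / tau
    if j == last:                    # final time point: only an earlier neighbor
        return w[last - 1], 1.0 / tau
    # interior time points: average of both neighbors, half the variance
    return (w[j - 1] + w[j + 1]) / 2.0, 1.0 / (2.0 * tau)


w = [0.0, 1.0, 4.0]  # hypothetical current values for J = 3 time points
first = rw1_conditional(0, w, tau=2.0)
middle = rw1_conditional(1, w, tau=2.0)
final = rw1_conditional(2, w, tau=2.0)
```

The interior case has two neighbors and therefore half the conditional variance of the end points, which mirrors the role of the neighbor count in the spatial ICAR model.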

2.1.3. Approaches to Multivariate Modeling

Within our mixture modeling approach, we must choose how to describe the multivariate nature of the disease incidence. One approach is to consider cross-correlation between diseases and to assume multivariate spatial correlation models. One such model that is commonly proposed is the MCAR model (Banerjee, 2016; Gelfand and Vounatsou, 2003). Essentially this model proposes that between disease cross correlation be accommodated within the CAR specification for the separate diseases. Alternative approaches that could be considered include multivariate normal specifications (Banerjee, 2016; Gelfand and Vounatsou, 2003; Martinez-Beneito et al, 2016; MacNab, 2016). While these approaches could be attractive, there are disadvantages to their use, chiefly heavy parameterization and computational inefficiency. Our goal here is to propose a more flexible and parsimonious approach to multivariate modeling as well as model selection, whereby we assume that different diseases can have a common shared component or components (Knorr-Held and Best, 2001; Corberan-Vallet, 2012). Component sharing means assuming common parameters between diseases that are assumed to be linked by certain risk factors and allows the estimation of a field which may provide evidence for the spatial distribution of common but unobserved etiological factors. This can also reduce the need to specify separate, disease specific components of risk and hence, be more parsimonious. We adopt this approach in what follows. In this paper we share the correlated spatial and temporal random effects, but other choices could be made as well.

For the motivating data example, we employ the RE and PRED versions of the multivariate mixture models described above for these three diseases and also fit each disease using comparable univariate mixture models. Additionally, we offer a comparison with three different “standard” multivariate models using Knorr-Held model formulations. The first (KH) is the classic Knorr-Held model whereby $\log(\theta_{ijk}) = a_{0k} + u_{ik} + v_i + \gamma_j + \phi_{ijk}$ such that the parameters are described in the same way as the mixture models above. Next, we impose a Knorr-Held type model (KH+) where there is more temporal structure imposed in the $\phi_{ijk}$ term via $\phi_{ijk}^{T} \sim N(\phi_{i(j-1)k}, \tau_{\phi_j})$. Finally, we also use a Knorr-Held model (KHM) where $\gamma_j$ is shared between diseases in the same way as above but the correlated spatial random effect ($\mathbf{v}' = (\mathbf{v}_1, \ldots, \mathbf{v}_n)$ for $K \times 1$ $\mathbf{v}_i$) follows a multivariate CAR model with $K \times K$ precision matrix $\Omega$ such that $\mathbf{v}_i \sim N\left(\frac{1}{n_i}\sum_{i \sim l} \mathbf{v}_l, \Omega\right)$ and $\Omega$ has a non-informative Wishart prior distribution with three degrees of freedom and parameter matrix $R$.

2.1.4. Identifiability

As seen in past model selection frameworks, when there are multiple parameters with spatial and temporal structure, identification of one in the presence of another could be problematic and outweigh the benefits of dependence in the mixing weights. However, using the mixture modeling structure rather than a selection parameter assigned to the several linear predictor alternatives offers an improvement in terms of identifiability in that there are not multiple parameters of each type across the linear predictor alternatives. Nevertheless, there is still some potential for identification issues in this structure since spatial and temporal correlation is imposed in both the spatial random effects and the mixture parameter, depending on the fitted model, and this should be something to consider when determining the best modeling option. Ultimately, our goal is to determine the best fitting model and obtain the overall risk, thus identification issues are not a focus of this paper.

2.2. Motivating and Simulated Data

2.2.1. Motivating Data

OCPCa, MCaS, and LBCa incidences for the state of South Carolina, USA were gathered from the South Carolina Community Assessment Network (SCAN) data sets (Cancer Incidence, 1996–2009), and Supplementary Table 3A displays the frequencies and percentages of data that are zero counts as well as censored. SCAN performed a type of censoring of the data whereby an observed count between 1 and 4 inclusive is given the value “<5” and an observed count between 5 and 10 inclusive is given the value 10. We perform an imputation a priori whereby the new outcome is approximately:

$$y^{*}_{ijk} = \begin{cases} Pois(e_{ijk}) & \text{for } y_{ijk} = \text{“<5”} \\ Pois(e_{ijk}) & \text{for } y_{ijk} = 10 \\ y_{ijk} & \text{else} \end{cases}$$

where $y^{*}_{ijk}$ is the newly imputed outcome, $e_{ijk}$ is the expected rate of disease, and $y_{ijk}$ is the original outcome from SCAN. One additional structure we imposed restricted the Poisson distributions such that the imputed incidence falls in the appropriate range based on the censored value from the original data set. Example R code for accomplishing this, histograms of the raw and imputed distributions per cancer type, and the resulting Poisson likelihood are included in the supplementary materials (sections A.6 and A.2, respectively). For OCPCa and LBCa, this worked well as the imputed values appear to fit within the distribution naturally. For MCaS, however, the upper bounds imposed by the definitions of the censored values seem to cause a slight issue, as there are small peaks in the distribution at 4 and 10. However, it is not uncommon for distributions to have peaks such as these and the full distribution of MCaS does not appear unusual; thus, we continued with this as the assumed distribution. For comparison, the full distributions, both imputed and censored, are available in Supplemental Figures A.2.1 and A.2.2.
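The range-restricted imputation can be sketched in Python as a draw from a Poisson distribution truncated to the censoring interval. This is not the R code from supplementary section A.6; the function names, the seed, the expected value of 3.2, and the example records are hypothetical, and the truncation is implemented by sampling directly from the renormalized probabilities on the allowed range.

```python
import math
import random


def _log_pois(y, mu):
    """Log Poisson probability of count y given mean mu."""
    return y * math.log(mu) - math.lgamma(y + 1) - mu


def sample_truncated_poisson(mu, low, high, rng):
    """Draw from Pois(mu) restricted to the integers low..high inclusive."""
    support = list(range(low, high + 1))
    log_w = [_log_pois(y, mu) for y in support]
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]  # rescaled to avoid underflow
    return rng.choices(support, weights=weights, k=1)[0]


def impute_count(reported, expected, rng):
    """Replace a censored value with a draw from its range; keep other counts."""
    if reported == "<5":
        return sample_truncated_poisson(expected, 1, 4, rng)
    if reported == 10:
        return sample_truncated_poisson(expected, 5, 10, rng)
    return reported


rng = random.Random(2017)      # fixed seed so the sketch is reproducible
reported = ["<5", 10, 0, 23]   # hypothetical records for one area
imputed = [impute_count(r, expected=3.2, rng=rng) for r in reported]
```

Because every draw is confined to the censoring interval, imputed values can pile up at the interval bounds when the expected value lies outside the interval, which is consistent with the small peaks at 4 and 10 noted for MCaS.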

Additionally, predictors collected across space, time, or both have been included in the study as potential risk factors of either spatial, temporal, or ST structure. The demographic predictors come from the Area Health Resources Files, 2003 data set while the environmental predictors come from the National Oceanic and Atmospheric Administration, 2015, South Carolina Department of Health and Environmental Control (SCDHEC), 2014, and the North America Land Data Assimilation System, 2013. The selected spatial only varying predictors are proportion of persons with health insurance (pHI), median household radon level (radon), and proportion of African American population (pAA); the two socioeconomic covariates are census data and were thus collected in the year 2000 while the measure of radon is a county level average of in home test kit results analyzed by the SCDHEC laboratory. The temporally only varying predictor is statewide average annual rainfall. The selected ST predictors include: average daily sunlight (sun), unemployment rate of those 16 years or older (UER), and proportion of persons in poverty (pppov). The proportion forms of the predictors were calculated using a number of persons measure from the data source and the county level populations acquired from the South Carolina Community Assessment Network (SCAN) data sets (Cancer Incidence, 1996–2009).

The six suggested predictors were selected and assigned to be spatial, temporal, or ST based on three criteria: 1) Availability - some are census measures so they are not collected annually, 2) Reasonable amount of collinearity - some predictors are highly collinear, e.g. poverty rate and median income, thus only one is included, and 3) The presence of spatial, temporal, or both variation - some appear to have nearly the same measures from one year to the next, and they are included as spatial rather than ST covariates. Further, these covariates are important in the prediction of OCPCa, MCaS, and LBCa incidences. Specifically, in relation to significance, the National Cancer Institute lists age and race/ethnicity associations for all three types of cancers (National Institutes of Health, 2015a; National Institutes of Health, 2015b; American Cancer Society, 2015a; National Institutes of Health, 2015c). Sunlight exposure also has a suggested relationship with all three types of cancers (Giovannucci, 2005); however, it is most notable with respect to OCPCa and MCaS (American Cancer Society, 2015b; Ananthaswamy, 2001; PDQ Adult Treatment Board, 2016). In general, disadvantaged individuals are less likely to have insurance and more likely to be unemployed, live in poverty, and have a higher incidence of cancer; these three predictors, proportion of persons with health insurance, unemployment rate, and proportion of persons in poverty, are somewhat correlated in these data (American Cancer Society, 2015a; National Cancer Institute, 2010).

2.2.2. Simulated Data

The goal of the simulation study presented here is to determine how well the fitted models perform under different scenarios. The same set of real demographic and environmental predictors as mentioned in Section 2.2.1 (spatial only: pHI, radon, and pAA; temporal only: rainfall; ST: sun, UER, and pppov) is utilized in the simulated data. The time period is J = 14 years, 1996–2009 and the number of spatial units (I) is 46, corresponding to the South Carolina counties.

For each of the simulation models, we assume K = 3 and simulate one scenario for three different diseases: ‘A,’ ‘B,’ and ‘C’ with some shared components. There is no separate univariate simulation for diseases; the univariate versus multivariate structure is only defined by the fitted model. The components to be shared are the CH and the temporal random walk terms as presented in the fitted models (Section 2.1). Additionally, the predictors are shared, but the parameter estimates associated with each predictor do vary between diseases. Further, these shared random effects, predictors, and fixed effect parameter estimates are also assumed to be constant across the 100 simulated data sets generated under each scenario for better assessment of ground truth recovery. The remainder of the parameters vary between diseases as well as between simulated data sets, and the rate of disease ‘A’ is 0.001 while diseases ‘B’ and ‘C’ are rarer and assume the rates of 0.00005 and 0.0001 respectively. These rates indicate that disease ‘A’ is present in 1 out of every 1000 people, disease ‘B’ is present in 1 out of every 20000 people, and disease ‘C’ is present in 1 out of every 10000 people.

A description of the specifications and variations in the simulation study is contained in the Supplemental Simulated Data Section (A.3). Ultimately, simulation ground truths differ in their assumptions about the mixture parameter and the contents of the mixture components so that all fitted models are thoroughly explored.

2.3. Model Comparison Tools

We compared these models in multiple ways. The first goodness of fit (GoF) measure is the deviance information criterion (DIC) (Spiegelhalter et al, 2002), calculated using the deviance measures from the log likelihood of the Poisson distribution:

$$DIC_k = \bar{D}_k + \frac{var(D_k)}{2}, \qquad D_k = -2 \sum_i \sum_j \left[ y_{ijk} \log(\mu_{ijk}) - \log(y_{ijk}!) - \mu_{ijk} \right]$$

An additional measure is the WAIC, which makes use of the posterior predictive distribution as described by Watanabe et al, 2010 and Gelman et al, 2014 such that

$$WAIC_k = -2\left(lpd_{WAIC_k} - pD_{WAIC_k}\right)$$
$$lpd_{WAIC_k} = \sum_i \sum_j \log\left(mean(ppd_{ijk})\right)$$
$$ppd_{ijk} = \exp\left(y_{ijk} \log(\mu_{ijk}) - \log(y_{ijk}!) - \mu_{ijk}\right)$$
$$pD_{WAIC_k} = \sum_i \sum_j var\left(\log(ppd_{ijk})\right)$$

where mean() and var() refer to the mean and variance across the MCMC samples. For the multivariate setting, both DIC and WAIC are calculated for the overall model ($\sum_k DIC_k$ and $\sum_k WAIC_k$) as well as for each disease. A final measure of GoF is the mean squared predictive error (MSPE), defined such that $MSPE_k = \sum_i \sum_j (y_{ijk} - \hat{y}_{ijk})^2 / n$. All these measures are averaged over the whole of the simulated data sets.
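The three GoF measures can be sketched in Python from a set of posterior draws of the Poisson means for one disease. The data layout (one list of means per MCMC draw, flattened over areas and time points), the function names, and the example numbers are assumptions for illustration; the variance across draws is taken here to be the sample variance.

```python
import math
from statistics import mean, variance


def log_pois(y, mu):
    """Log Poisson probability of count y given mean mu."""
    return y * math.log(mu) - math.lgamma(y + 1) - mu


def waic(y, mu_draws):
    """Return (WAIC, pD_WAIC); mu_draws[s][c] is the mean for cell c in draw s."""
    lpd = 0.0
    p_d = 0.0
    for cell, y_obs in enumerate(y):
        log_ppd = [log_pois(y_obs, draw[cell]) for draw in mu_draws]
        lpd += math.log(mean([math.exp(v) for v in log_ppd]))
        p_d += variance(log_ppd)
    return -2.0 * (lpd - p_d), p_d


def dic(y, mu_draws):
    """Mean deviance plus half the variance of the deviance across draws."""
    deviances = [
        -2.0 * sum(log_pois(y_obs, mu) for y_obs, mu in zip(y, draw))
        for draw in mu_draws
    ]
    return mean(deviances) + variance(deviances) / 2.0


def mspe(y, y_hat):
    """Mean squared predictive error over all cells."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


# Hypothetical counts for three cells and three posterior draws of their means.
y = [2, 0, 5]
mu_draws = [[2.1, 0.4, 4.6], [1.8, 0.6, 5.3], [2.4, 0.5, 4.9]]
waic_value, p_d_waic = waic(y, mu_draws)
dic_value = dic(y, mu_draws)
```

Both criteria penalize variability across draws, but WAIC does so cell by cell while DIC uses the variability of the total deviance, which is one reason the two can rank models differently.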

An additional method we utilize for model evaluation and comparison considers the recovery of the mixture parameter as well as random effect ground truths. For each estimate we calculate the bias ($\bar{\hat{\lambda}} - \lambda$), the variance, and the mean squared error (MSE) ($MSE(\hat{\lambda}) = E((\bar{\hat{\lambda}} - \lambda)^2)$) in relation to the ground truth, and we relate the bias to the variance of that estimate. In these calculations, $\bar{\hat{\lambda}} = \frac{1}{100}\sum_{d=1}^{100} \hat{\lambda}_d$ is the average over the 100 simulated data sets, where $\hat{\lambda}_d$ is the posterior mean estimate of $\lambda$ from data set $d$ and $\lambda$ is the known simulation ground truth, which is the same for all simulated data sets. Finally, we record the precision estimates for the random effect parameters and assess them for accuracy relating to the ground truth.
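A minimal Python sketch of these recovery summaries follows. The function name and the example estimates are hypothetical, and the MSE is computed here as the average squared deviation of each data set's estimate from the ground truth, which is one common reading of the definition and an assumption on our part.

```python
from statistics import mean, pvariance


def recovery_summary(estimates, truth):
    """Bias, variance, and MSE of posterior mean estimates against a known truth."""
    avg = mean(estimates)                       # average over simulated data sets
    bias = avg - truth
    var = pvariance(estimates)                  # spread of estimates around their average
    mse = mean([(est - truth) ** 2 for est in estimates])
    return bias, var, mse


# Hypothetical posterior means from three simulated data sets, truth = 1.5.
bias, var, mse = recovery_summary([1.0, 2.0, 3.0], truth=1.5)
```

Under this reading the MSE equals the variance plus the squared bias, so a bias versus variance plot shows which of the two drives the error for a given fitted model.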

2.4. Statistical Computation

These analyses were accomplished using R version 3.2.4. Specifically, the package R2WinBUGS which calls the Bayesian inference software WinBUGS from the freely available statistical processing software R was utilized for inference. Additionally, to decrease computation time, another R package, snowfall, was implemented for parallelizing code (Knaus, 2013). Finally, the R package fillmap, which is available via GitHub, was utilized for producing maps (Carroll, 2016d). All simulation computation was performed on a Windows 7 Professional Dell Precision T7500 dual processor server, and running F1 up to F4 for all 100 data sets under each simulated data scenario took roughly 5 days.

We considered the computational alternatives for these types of models and determined the ideal sampling algorithms to be a slice updater for standard deviation parameters and a metnormal updater for the fixed and random effect estimates as well as the estimates associated with the mixture parameter. Conveniently, these are available in WinBUGS; thus, we performed these analyses via the R package R2WinBUGS which calls WinBUGS from R (Thomas et al, 2014; Carroll et al, 2015; Lunn et al, 2013; Thomas et al, 2006; R Core Team, 2015). For simulated data, we ran each of the 2 chains for 50000 iterations and sampled 2500 of them. Ultimately, this was to ensure convergence for all simulated data sets as some did indicate issues. With the motivating data case study, we can greatly reduce these values based on the convergence of each fitted model individually. We still sampled 2500 from each chain for the motivating data case study. Furthermore, when these models are fitted, the initial values supplied to each chain of the MCMC are the expected values of the associated prior distributions. Finally, for convergence, we referred to $\hat{R}$ values, the Brooks-Gelman-Rubin convergence diagnostic as defined by Gelman and Rubin, 1992, and trace plots for a subset of the simulated data sets. Example BUGS code for implementing the different fitted models is included in the supplementary materials.

A demonstrative example of these methods using the case study data described in Section 4 is available via the R package shiny (Chang et al, 2016). This example utilizes stored MCMC results to avoid long wait times and possible convergence issues that could arise from running MCMC in the background of the shiny application. It is accessible via GitHub from user carrollrm within repository MixModShiny (use the call: runGitHub("MixModShiny", "carrollrm")).

3. Results

3.1. Simulation Results

Based on the GoF measures in Supplemental Tables A.4.1 and A.4.2, multivariate models, which are indicated with a grey shading, offer a better fit in certain situations but this is not always the case. Further, the best fitting model is typically as expected based on the simulation ground truth. Specifically, S1Aft shows that RE models are best fitting, while S3At shows that PRED models are best. S1Aft performs best with F2, and F4 models typically show improvements for “t” suffix simulation scenarios. However, F3 never offers the best fit.

Figure 1 displays bias versus variance plots for the mixture parameter, CH random effect, and the RW(1) random effect. The plots with the mixture parameter show that less bias is present in the multivariate setting and more bias is present with F1 model fits. Further, models with the most bias, variance, or both are typically PRED models in the univariate setting while the RE and PRED models perform about the same in the multivariate setting. Also, as expected, F3 and F4 present more variance than F1 and F2. The CH random effect plots illustrate that, in general, the bias and variance measures are quite small. Additionally, less bias and variance exist in the multivariate setting while more bias and variance are detected in the RE models. The RW(1) random effect plots illustrate that there is more bias in RE models with large outlier bias and variance estimates for F1 RE models. In general, there is more variance in PRED models as well as more bias and variance in the univariate setting.

Figure 1.


3.2. Results for Multivariate Modeling of Cancers in South Carolina

The GoF results for these model scenarios are displayed in Table 2. The full table of results is included as Table A.5.1 in the supplementary materials, and these indicate that WAIC and DIC continue to give differing results. Since the simulation study suggests that WAIC is the better measure to use for these models, we will continue based on that assumption. In the univariate setting, F2 RE is best fitting for MCaS, and F3 RE produces a slightly smaller, comparable WAIC for OCPCa. For the multivariate models, F3 RE produces the lowest overall WAIC, but that model is not the best fitting when considering the individual diseases. F2 PRED is the best fitting for OCPCa while F4 RE is the best for MCaS and LBCa. When considering LBCa, the results for the univariate and multivariate cases indicate that the same fitted model is best fitting and that the univariate is slightly better than the multivariate. The other two diseases fit best under different modeling scenarios in the univariate and multivariate settings. When comparing the best fitting models between the univariate and multivariate fits, the results for MCaS indicate that the multivariate fit is best while the results for OCPCa indicate the opposite.

Table 2.

GoF measures for the case study data. Bold indicates the lowest value for comparable models within the univariate and multivariate cases.

Disease  Fitted Model  Univariate WAIC  Univariate pDWAIC  Multivariate WAIC  Multivariate pDWAIC
OCPCa    F1            3212.38          102.74             3450.55            131.32
OCPCa    F2            3204.27          95.03              3446.10            130.56
OCPCa    F3            3196.91          101.39             3471.25            163.55
OCPCa    F4            3197.54          95.96              3604.93            201.53
OCPCa    KH            3223.60          105.27             3482.81            163.04
OCPCa    KH+           3473.61          229.13             3493.14            156.84
OCPCa    KHM           ---              ---                3474.00            160.35
MCaS     F1            3850.89          195.90             3815.31            182.49
MCaS     F2            3785.76          165.16             3848.12            190.28
MCaS     F3            3808.01          199.01             3846.18            218.29
MCaS     F4            3791.33          175.25             3757.62            160.52
MCaS     KH            3941.36          251.21             3857.35            233.16
MCaS     KH+           4046.05          289.99             3839.63            206.18
MCaS     KHM           ---              ---                3837.88            225.37
LBCa     F1            5231.87          526.14             4515.12            163.90
LBCa     F2            4423.68          127.72             4455.13            136.38
LBCa     F3            4436.70          148.72             4446.84            159.85
LBCa     F4            4415.70          124.55             4419.83            131.13
LBCa     KH            4500.47          161.92             4491.91            164.48
LBCa     KH+           4693.60          247.31             4556.21            169.03
LBCa     KHM           ---              ---                4447.19            141.11
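For readers unfamiliar with how WAIC and its effective number of parameters are obtained from MCMC output, the following is a minimal sketch for a Poisson likelihood. It assumes the variance-based form described by Gelman et al (2014), WAIC = -2(lppd - pD), which may not match the exact computation used to produce Table 2. The function names and the example draws are hypothetical.

```python
import math
from statistics import variance

def poisson_logpmf(y, mu):
    """Log of the Poisson probability mass function for count y and mean mu > 0."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def waic(y, mu_draws):
    """Return (WAIC, pD) from observed counts and posterior draws of the Poisson means.

    y: list of n observed counts
    mu_draws: list of S >= 2 posterior draws, each a list of n positive means
    """
    lppd = 0.0
    p_d = 0.0
    for i, y_i in enumerate(y):
        loglik = [poisson_logpmf(y_i, draw[i]) for draw in mu_draws]
        # Log of the posterior-mean likelihood, computed stably via log-sum-exp
        top = max(loglik)
        lppd += top + math.log(sum(math.exp(v - top) for v in loglik) / len(loglik))
        # Posterior (sample) variance of the log-likelihood for this observation
        p_d += variance(loglik)
    return -2.0 * (lppd - p_d), p_d

# Hypothetical counts for three area-year cells and three posterior draws of their means
y_obs = [2, 0, 5]
draws = [[2.1, 0.8, 4.6], [1.9, 1.1, 5.3], [2.4, 0.9, 4.9]]
waic_value, p_d_value = waic(y_obs, draws)
```

Because both terms are sums over observations, the same calculation can be restricted to one disease or to a subset of years by passing only the corresponding cells.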

Supplemental Figure A.5.5 displays the estimates of the temporal random walk effect for F2RE and F4RE per disease as well as the shared estimate from the multivariate fit. These estimates show both differences and similarities when comparing the two fitted models’ results. The estimate shared between diseases in the multivariate setting is very different here and this is likely why some diseases fit better with one versus the other. Additionally, the estimates associated with the MCaS outcome appear to have consistently steeper slopes than the others in these two sets of results with F4RE being the steeper of the two. The intercepts of the estimates associated with both LBCa and MCaS are lower for F4RE compared to F2RE. Lastly, the estimates associated with OCPCa show little change over time and are very similar with both fitted models. Further, the increase over time in estimates associated with LBCa and MCaS as well as the small amount of change in those for OCPCa directly reflect the increase in incidence over time for the first two and steadiness of the latter; this is displayed in Supplemental Figure A.2.3. All of this explains why the MSPE measures, displayed in Supplemental Figures A.5.1–A.5.4, show poor fit associated with OCPCa for the later years in the multivariate RE setting.

Next, we explored fitting these data for only the first 9 years, as these are the years for which the models perform better for OCPCa. This exercise is performed with fitted model F2RE only. Supplemental Table A.5.2 displays the WAIC and pDWAIC estimates calculated for the first 9 years of the study time, and these indicate that fitting only the first 9 years improves the fit. Additionally, the 9-year fits show that the multivariate setting offers a better fit than the univariate setting. For the models that were fit over the entire 14-year study time, the estimates were summed over only the first 9 years. Supplemental Figure A.5.21 displays the MSPE estimates in comparison to those from fitting all 14 years, and this plot shows that the MSPE measures are much more consistent between the univariate and multivariate settings when only the first 9 years of the data are used. Supplemental Figure A.5.22 shows the mixture parameter estimates for this additional fit. These estimates differ from those of the first fits, which is expected because the data are different. However, there are also differences in the estimates produced for OCPCa in the univariate and multivariate settings. Supplemental Figures A.5.23 and A.5.24 show the spatial random effect estimates, and these display resemblances between the univariate and multivariate settings.
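The year-by-year MSPE comparison can be sketched as follows. This assumes MSPE for a given year is the squared difference between observed and predicted counts averaged over areas; the paper's exact definition may differ. The function name and the toy values are hypothetical.

```python
def mspe_by_year(observed, predicted):
    """Return one mean squared prediction error per year, averaged over areas.

    observed, predicted: nested lists indexed as [area][year]
    """
    n_areas = len(observed)
    n_years = len(observed[0])
    return [
        sum((observed[i][j] - predicted[i][j]) ** 2 for i in range(n_areas)) / n_areas
        for j in range(n_years)
    ]

# Hypothetical counts and predictions for two areas over three years
obs = [[3, 5, 4], [1, 2, 6]]
pred = [[2.0, 5.0, 6.0], [1.0, 4.0, 3.0]]
mspe = mspe_by_year(obs, pred)

# Restrict a summary to the earliest years, as done for the first 9 of 14 years
early_years = 2
mspe_early = mspe[:early_years]
```

Slicing the per-year values in this way mirrors how summaries from the full-period fits are limited to the early years for comparison with the shorter fits.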

Figure 2 displays the overall risk (θ_ijk) for the 9-year model fits for univariate OCPCa and the multivariate model that includes all three cancers of interest; note that only years 1996, 2000, and 2004 are displayed for brevity. This measure is the most interpretable as identifiability is no longer an issue for the sum of all random effects. Ultimately, these estimates suggest that risk for all cancers is increasing across the 9 years considered but the spatial distribution of risk, while consistent within a given disease, varies across the three considered. The risk associated with OCPCa appears to be increased in the eastern portion of the state, the risk for MCaS is high in the eastern, coastal counties as well as in the western mountain counties, and the risk for LBCa is more scattered with high risk in the northern portion of the state. Supplemental Figures A.5.18–A.5.20 display these overall risk measures for years 1996, 2000, 2004, and 2008 with model F2RE using the entire study time. When comparing these estimates, risk appears to be the same for LBCa and MCaS while the risk associated with OCPCa appears to closely resemble that of MCaS.

Figure 2.


4. Discussion

The results in Section 3 suggest that multivariate ST models show significant improvements over the Knorr-Held models and have a role to play in etiological investigation and public health surveillance. The best model of the proposed F1 up to F4 depends on the data and outcome of interest. In the simulation study, we were able to appropriately recover the best models as expected.

The simulation study also inspected recovery of the ground truth via bias and variance calculations, maps of spatial or ST parameters, and random effect precision estimates. While WAIC and DIC did not clearly indicate improvements in multivariate over univariate modeling, some of the other measures did. In particular, the bias versus variance plots as well as the ground truth recovery evaluations displayed some strong improvements in the multivariate setting. Further, the recovery of the true fixed effect parameter estimates and random effects is often at least the same if not better in the multivariate setting when the model is misspecified. Thus, the multivariate setting offers an advantage in being more robust.

Computationally these models offer technically advanced inference at relatively low computational cost. The real data example fits using MCaS in the univariate setting took the following amounts of time in minutes on a standard laptop: 7, 21, 28, and 86 for F1, F2, F3, and F4 respectively. Similarly, the multivariate times were: 31, 33, 67, and 73. If multivariate normal or multivariate CAR methods were imposed on the mixture parameter or random effects, it is likely that this computation time would increase. However, these computational times still pose a problem for interactive apps such as shiny without using stored MCMC posterior estimates.

These methods are not without issues. As discussed in Section 2.1.4, identifiability issues could arise depending on the parameters included. Further, collinearity between geo-referenced fixed effects and random effects could occur (Reich et al, 2006), and if the focus is on assessment of the fixed effects models, then solutions should be sought for this. We did not note any issues between the fixed and random effects, but there was evidence of correlation between the mixture parameters and the spatial random effects (Supplemental Table A.5.3). However, our focus was on the overall assessment of risk, and we accomplished this by including the necessary fixed and random effects to account for the variation in the outcomes of interest. Comparisons with the Knorr-Held models demonstrated that the mixture parameters were important in terms of model prediction and goodness of fit. Another issue uncovered in this exploration involves the potential problems that can arise when one of the diseases of interest differs from the others. A further issue that this exploration aims to address is the ideal method for examining mixture model GoF and recovery. While DIC and WAIC typically agree in the individual disease results as well as across univariate and multivariate models, the total DIC and WAIC measures for the multivariate setting (presented in Supplemental Table A.5.2) show variation in which model is deemed best fitting. For these total measures, WAIC typically returns the expected choice for best model.

If the focus is on parameter estimation of fixed effects, treating random effects as nuisance parameters, then some consideration should be given to the issue of collinearity between geo-referenced fixed effects and correlated heterogeneity (National Cancer Institute, 2010). While orthogonalization can be attempted in some cases (Hodges and Reich, 2010), a simpler solution is to consider a multi-stage approach where the fixed effects are estimated first and the correlated spatial effect is then estimated from the residuals. Following that, the fixed effects are re-estimated with ‘plug-in’ estimates of the correlated effect. This was proposed by Lawson et al (2012). Some residual bias will likely be present in this approach, but it does provide a way to prioritize the fixed effect estimation. In this case, it may also be of interest to incorporate only the random effects in the mixture, as this will lead to improved interpretation of the fixed effects.
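The three stages described above can be illustrated with a deliberately simplified sketch. It is not the procedure of Lawson et al (2012): stage 2 replaces a fitted correlated (CAR) effect with a plain neighbor average of log residuals, and the Poisson regression has a single covariate fitted by Newton-Raphson. All data, names, and the adjacency structure are hypothetical.

```python
import math

def fit_poisson(y, x, offset, iters=50):
    """Newton-Raphson fit of log(mu_i) = offset_i + b0 + b1 * x_i; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
        # Score vector
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information matrix (2 x 2) and its determinant
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def smooth_residuals(y, mu, neighbors):
    """Stand-in for a correlated spatial effect: centered neighbor average of log residuals."""
    raw = [math.log((yi + 0.5) / (mi + 0.5)) for yi, mi in zip(y, mu)]
    smoothed = [
        sum(raw[k] for k in [i] + nb) / (len(nb) + 1)
        for i, nb in enumerate(neighbors)
    ]
    center = sum(smoothed) / len(smoothed)
    return [s - center for s in smoothed]

# Hypothetical data for six areas arranged along a line
y = [9, 15, 6, 22, 5, 12]                      # observed counts
expected = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0]  # expected counts
x = [0.1, 0.4, -0.3, 0.8, -0.5, 0.2]           # one geo-referenced covariate
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
log_e = [math.log(e) for e in expected]

# Stage 1: estimate the fixed effects alone
b0_1, b1_1 = fit_poisson(y, x, log_e)
mu_1 = [math.exp(o + b0_1 + b1_1 * xi) for o, xi in zip(log_e, x)]

# Stage 2: estimate a spatially smooth effect from the stage 1 residuals
u_hat = smooth_residuals(y, mu_1, neighbors)

# Stage 3: re-estimate the fixed effects with the plug-in spatial effect as an offset
offset_3 = [o + u for o, u in zip(log_e, u_hat)]
b0_3, b1_3 = fit_poisson(y, x, offset_3)
mu_3 = [math.exp(o + b0_3 + b1_3 * xi) for o, xi in zip(offset_3, x)]
```

Comparing b1_1 with b1_3 shows how much the covariate estimate moves once the plug-in spatial effect is held fixed, which is the quantity of interest when fixed effect estimation is prioritized.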

In conclusion, these multivariate ST mixture models offer a useful and informative option for modeling multiple diseases with spatial, temporal, or both structuring. Here, improvements were noted when comparing the multivariate to the univariate models, and when comparing our proposed alternative models to the classically used Knorr-Held models, both in the simulation study and with the three cancers in South Carolina. These methods offer the ability to produce an appropriate estimate of overall risk for a rare disease by borrowing information from related, more common ones.

Supplementary Material

Supp info

Acknowledgments

This research was supported in part by funding under grant NIH R01CA172805.

Footnotes

Conflict of Interest

The authors declare no conflict of interest.

References

  1. American Cancer Society. Cancer facts & figures 2015. Atlanta, GA: 2015a. [Accessed 14 January 2016]. http://seer.cancer.gov/statfacts/html/lungb.html.
  2. American Cancer Society. Do we know what causes melanoma skin cancer? Atlanta, GA: 2015b. [Accessed 28 January 2016]. http://www.cancer.org/cancer/skincancer-melanoma/detailedguide/melanoma-skin-cancer-what-causes.
  3. Ananthaswamy HN. Sunlight and skin cancer. J Biomed Biotechnol. 2001;1(2):49. doi: 10.1155/S1110724301000122.
  4. Area Health Resource Files (AHRF) Rockville, MD: US Department of Health and Human Services, Health Resources and Services Administration, Bureau of Health Workforce; 2003. [Accessed 13 June 2015]. http://ahrf.hrsa.gov/
  5. Banerjee S. Multivariate spatial models. In: Lawson AB, Banerjee S, Haining RP, Ugarte MD, editors. Handbook of spatial epidemiology. Boca Raton, FL: CRC Press; 2016. pp. 375–94.
  6. Bernardinelli L, Clayton D, Pascutto C, Montomoli C, Ghislandi M, Songini M. Bayesian analysis of space-time variation in disease risk. Stat Med. 1995;14(21–22):2433–43. doi: 10.1002/sim.4780142112.
  7. Besag J, York J, Mollié A. Bayesian image restoration, with two applications in spatial statistics. Ann Inst Stat Math. 1991;43(1):1–20. doi: 10.1007/bf00116466.
  8. Besag J, Green PJ. Spatial Statistics and Bayesian Computation. J Roy Stat Soc B. 1993;55(1):25–37. URL: http://www.jstor.org/stable/2346064.
  9. Bondell HD, Krishna A, Ghosh SK. Joint variable selection for fixed and random effects in linear mixed-effects models. Biometrics. 2010;66(4):1069–77. doi: 10.1111/j.1541-0420.2010.01391.x.
  10. Carroll R, Lawson AB, Faes C, Kirby RS, Aregay M, Watjou K. Comparing INLA and OpenBUGS for hierarchical Poisson modeling in disease mapping. Spat Spatiotemporal Epidemiol. 2015;14–15:45–54. doi: 10.1016/j.sste.2015.08.001.
  11. Carroll R, Lawson AB, Faes C, Kirby RS, Aregay M, Watjou K. Spatially-dependent Bayesian model selection for disease mapping. Stat Methods Med Res. 2016a. doi: 10.1177/0962280215627298.
  12. Carroll R, Lawson AB, Faes C, Kirby RS, Aregay M, Watjou K. Bayesian model selection methods in modeling small area colon cancer incidence. Ann Epidemiol. 2016b;26(1):43–9. doi: 10.1016/j.annepidem.2015.10.011.
  13. Carroll R, Lawson AB, Faes C, Kirby RS, Aregay M, Watjou K. Spatio-temporal Bayesian model selection for disease mapping. Environmetrics. 2016c;27(8):466–478. doi: 10.1002/env.2410.
  14. Carroll R. fillmap: Create maps with SpatialPolygons objects. R package version 0.0.0.9000. 2016d. https://github.com/carrollrm/fillmap.
  15. Carroll R, Lawson AB, Faes C, Kirby RS, Aregay M, Watjou K. Space-time variation of respiratory cancers in South Carolina: A flexible multivariate mixture modeling approach to risk estimation [Special Issue] Ann Epidemiol. 2017;27:42–51. doi: 10.1016/j.annepidem.2016.08.014.
  16. Chang W, Cheng J, Allaire JJ, Xie Y, McPherson J. shiny: Web application framework for R. R package version 0.13.2. 2016. https://CRAN.R-project.org/package=shiny.
  17. Corberan-Vallet A. Prospective surveillance of multivariate spatial disease data. Stat Methods Med Res. 2012;21(5):457–77. doi: 10.1177/0962280212446319.
  18. Fahrmeir L, Lang S. Bayesian inference for generalized additive mixed models based on Markov random field priors. J R Stat Soc C. 2001;50(2):201–20. doi: 10.1111/1467-9876.00229.
  19. Garcia RI, Ibrahim JG, Zhu H. Variable selection for regression models with missing data. Stat Sin. 2010;20(1):149–65.
  20. Gelfand AE, Vounatsou P. Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics. 2003;4(1):11–25. doi: 10.1093/biostatistics/4.1.11.
  21. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Stat Sci. 1992;7(4):457–72. doi: 10.1214/ss/1177011136.
  22. Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Anal. 2006;1(3):515–33.
  23. Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for Bayesian models. Stat Comp. 2014;24(6):997–1016. doi: 10.1007/s11222-013-9416-2.
  24. George EI, Clyde M. Model uncertainty. Stat Sci. 2004;19(1):81–94. doi: 10.1214/088342304000000035.
  25. Giovannucci E. The epidemiology of vitamin D and cancer incidence and mortality: A review (United States) Cancer Causes Control. 2005;16(2):83–95. doi: 10.1007/s10552-004-1661-4.
  26. Hodges JS, Reich BJ. Adding spatially-correlated errors can mess up the fixed effect you love. Am Stat. 2010;64(4):325–34. doi: 10.1198/tast.2010.10052.
  27. Hoeting JA, Raftery AE, Madigan D. Bayesian variable and transformation selection in linear regression. J Comput Graph Stat. 2002;11(3):485–507. doi:10.1.1.35.1365.
  28. Knaus J. snowfall: Easier cluster computing (based on snow) R package version 1.84–6. 2013. http://CRAN.R-project.org/package=snowfall.
  29. Knorr-Held L. Bayesian modeling of inseparable space-time variation in disease risk. Stat Med. 2000;19(17–18):2555–67. doi: 10.1002/1097-0258(20000915/30)19:17/18<2555::aid-sim587>3.0.co;2-#.
  30. Knorr-Held L, Best NG. A shared component model for detecting joint and selective clustering of two diseases. J Roy Stat Soc A. 2001;164(1):73–85. doi: 10.1111/1467-985x.00187.
  31. Lawson AB, Choi J, Cai B, Hossain M, Kirby RS, Liu J. Bayesian 2-Stage Space-Time Mixture Modeling With Spatial Misalignment of the Exposure in Small Area Health Data. J Agric Biol Environ Stat. 2012;17(3):417–41. doi: 10.1007/s13253-012-0100-3.
  32. Lawson AB. Bayesian disease mapping: Hierarchical modeling in spatial epidemiology. 2. Boca Raton, FL: CRC Press; 2013.
  33. Lee KJ, Jones GL, Caffo BS, Bassett SS. Spatial Bayesian variable selection models on functional magnetic resonance imaging time-series data. Bayesian Anal. 2014;9(3):699–732. doi: 10.1214/14-BA873.
  34. Li G, Best N, Hansell AL, Ahmed I, Richardson S. BaySTDetect: detecting unusual temporal patterns in small area data via Bayesian model choice. Biostatistics. 2012;13(4):695–710. doi: 10.1093/biostatistics/kxs005.
  35. Lunn D, Jackson C, Best N, Thomas A, Spiegelhalter D. The BUGS book: A practical introduction to Bayesian analysis. 1. Boca Raton, FL: CRC Press; 2013.
  36. MacNab YC. Linear models of coregionalization for multivariate lattice data: a general framework for coregionalized multivariate CAR models. Stat Med. 2016;35(21):3827–50. doi: 10.1002/sim.6955.
  37. Martinez-Beneito MA, Botella-Rocamora P, Banerjee S. Towards a multidimensional approach to Bayesian disease mapping. Bayesian Anal. 2016. doi: 10.1214/16-ba995. In print.
  38. National Cancer Institute. Cancer health disparities. Rockville, MD: 2008. [Accessed 14 January 2016]. https://www.cancer.gov/research/areas/disparities.
  39. National Institutes of Health. SEER stat fact sheets: Lung and bronchus cancer. Rockville, MD: 2015a. [Accessed 14 January 2016]. https://seer.cancer.gov/statfacts/html/lungb.html.
  40. National Institutes of Health. SEER stat fact sheets: Melanoma of the skin. Rockville, MD: 2015b. [Accessed 14 January 2016]. https://seer.cancer.gov/statfacts/html/melan.html.
  41. National Institutes of Health. SEER stat fact sheets: Oral cavity and pharynx cancer. Rockville, MD: National Institutes of Health; 2015c. [Accessed 27 May 2016]. https://seer.cancer.gov/statfacts/html/oralcav.html.
  42. National Oceanic and Atmospheric Administration. Climate at a Glance. Asheville, NC: National Centers for Environmental Information; [Accessed 20 January 2016]. http://www.ncdc.noaa.gov/cag/
  43. North America Land Data Assimilation System (NLDAS) Daily Sunlight (insolation) for years 1979–2011 on CDC WONDER Online Database. Centers for Disease Control and Prevention; 2013. [Accessed 27 January 2016]. http://wonder.cdc.gov/NASA-INSOLAR.html.
  44. PDQ Adult treatment editorial board. PDQ lip and oral cavity cancer treatment. Bethesda, MD: National Cancer Institute; 2016. [Accessed 27 May 2016]. http://www.cancer.gov/types/head-and-neck/patient/lip-mouth-treatment-pdq.
  45. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2015. http://www.R-project.org/
  46. Reich BJ, Hodges JS, Zadnik V. Effects of residual smoothing on the posterior of the fixed effects in disease-mapping models. Biometrics. 2006;62(4):1197–206. doi: 10.1111/j.1541-0420.2006.00617.x.
  47. Rockova V, George EI. Negotiating multicollinearity with spike-and-slab priors. Metron. 2014;72(2):217–29. doi: 10.1007/s40300-014-0047-y.
  48. Scheel I, Ferkingstad E, Frigessi A, Haug O, Hinnerichsen M, Meze-Hausken E. A Bayesian hierarchical model with spatial variable selection: The effect of weather on insurance claims. J R Stat Soc C. 2013;62(1):85–100. doi: 10.1111/j.1467-9876.2012.01039.x.
  49. Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit. J Roy Statist Soc B. 2002;64(4):583–639. doi: 10.1111/1467-9868.00353.
  50. South Carolina Community Assessment Network. Cancer Incidence. Columbia, SC: South Carolina Department of Health and Environmental Control; 1996–2009. [Accessed 2 January 2016]. http://scangis.dhec.sc.gov/scan/index.aspx.
  51. South Carolina Department of Health and Environmental Control. Average in home radon concentrations (pCi/L) Columbia, SC: 2014. http://www.scdhec.gov/images/Radon/Radon2014%20(1).jpg.
  52. Thomas A, O’Hara B, Ligges U, Sturtz S. Making BUGS Open. R News. 2006;6(1):12–7. http://cran.r-project.org/doc/Rnews/
  53. Thomas A, Best N, Lunn D, Arnold R, Spiegelhalter D. GeoBUGS user manual. Version 3.2.3. 2014. http://www.openbugs.net/Manuals/GeoBUGS/Manual.html.
  54. Torabi M. Hierarchical multivariate mixture generalized linear models for the analysis of spatial data: An application to disease mapping. Biom J. 2016;58(5):1138–50. doi: 10.1002/bimj.201500248.
  55. U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2012 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute; 2015. [Accessed 27 May 2016]. www.cdc.gov/uscs.
  56. Waller LA, Carlin BP, Xia H, Gelfand AE. Hierarchical Spatio-Temporal Mapping of Disease Rates. J Am Stat Assoc. 1997;92(438):607–17. doi: 10.1080/01621459.1997.10474012.
  57. Watanabe S. Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res. 2010;11(Dec):3571–94. doi:10.1.1.407.7976.
  58. Wikle CK, Milliff RF, Nychka D, Berliner LM. Spatiotemporal hierarchical Bayesian modeling: Tropical ocean surface winds. J Am Stat Assoc. 2001;96(454):382–97. http://www.jstor.org/stable/2670277.
