Spatiotemporal signal detection using continuous shrinkage priors

An-Ting Jhuang; Montserrat Fuentes; Dipankar Bandyopadhyay; Brian J Reich

doi:10.1002/sim.8514

Author manuscript; available in PMC: 2021 Aug 27.

Published before final editing as: Stat Med. 2020 Feb 27:10.1002/sim.8514. doi: 10.1002/sim.8514

PMCID: PMC7561003 NIHMSID: NIHMS1634844 PMID: 32106341

Abstract

Periodontal disease (PD) is a chronic inflammatory disease that affects the gum tissue and bone supporting the teeth. Although tooth-site level PD progression is believed to be spatio-temporally referenced, the whole-mouth average periodontal pocket depth (PPD) has been commonly used as an indicator of the current/active status of PD. This leads to imminent loss of information, and imprecise parameter estimates. Despite the availability of statistical methods that accommodate spatiotemporal information for responses collected at the tooth-site level, the enormity of longitudinal databases derived from oral health practice-based settings renders them unscalable for application. To mitigate this, we introduce a Bayesian spatiotemporal model to detect problematic/diseased tooth-sites dynamically inside the mouth for any subject obtained from large databases. This is achieved via a spatial continuous sparsity-inducing shrinkage prior on spatially varying linear-trend regression coefficients. A low-rank representation captures the nonstationary covariance structure of the PPD outcomes, and facilitates the relevant Markov chain Monte Carlo computing steps applicable to thousands of study subjects. Application of our method to both simulated data and to a rich database of electronic dental records from the HealthPartners® Institute reveals improved prediction performance, compared with alternative models with usual Gaussian priors for regression parameters and conditionally autoregressive specification of the covariance structure.

Keywords: nonstationary covariance, periodontal disease, shrinkage priors, space-time disease surveillance

1. INTRODUCTION

Disease surveillance, which consists of an ongoing systematic collection, collation, analysis, and interpretation of data to establish patterns of a chronic disease progression leading to dissemination of informed action for its prevention and control,¹ has remained an active area of epidemiological research. For example, oral health surveillance techniques² are widely used to detect and prevent periodontal disease (PD), a chronic inflammatory disease that affects the gum tissues and bone supporting the teeth. In the United States, during 2009 and 2010, 47.2% of adults aged ≥ 30 years had some form of PD, affecting approximately 64.7 million people.³ Furthermore, PD is associated with a number of comorbid diseases, such as cancer,⁴ cardiovascular diseases,⁵ inflammatory bowel diseases,⁶ and so on, and hence can only increase the cost and disease burden if not properly detected.

Under various temporal settings (such as in clinical studies, or practice-based), oral health clinicians often use partial- or whole-mouth averages⁷ of tooth-site level periodontal pocket depth (PPD) to detect a subject’s active/current PD status⁸,⁹ at an observed time-point. The PPD, recorded in whole millimeters, is defined as the distance from the gingival margin to the epithelial attachment.¹⁰ In addition to the loss of information by taking averages, the corresponding surveillance tools developed mostly ignore the hypothesized spatial referencing¹¹ of PD progression inside the mouth, given that the level of PD for a group of proximal sites can be different from those that are located distally. The ability to detect specific regions (say, tooth-sites) in the mouth where PD is rapidly progressing compared with others can lead to quicker interventions with treatments (such as scaling and root planing), medication, and surgery. This early detection and flagging of anomalies can enhance coordination and control activities, especially for chronic PD.

It is believed that the incorporation of spatial information strengthens the power of surveillance, and can localize outbreaks of a disease or characterize variations in regional patterns.¹² Methods for space-time disease surveillance can be broadly classified into test-based and model-based approaches. The tools to test space-time interactions include the Knox test,¹³ Mantel’s test,¹⁴ the k nearest neighbor test,¹⁵ and the computationally heavy cumulative sum methods¹⁶ for detecting change-points. An alternative popular method that detects disease clusters in space and time and provides more information than the space-time interaction tests is the scan statistic,¹⁷ which also inspired a Bayesian test to detect unusual temporal patterns in small area data.¹⁸ However, compared with the test-based methods, model-based approaches estimate disease risk, and thus provide better insight into etiology, spread, prediction, and control of the disease. Under Poisson assumptions for count data in specified time and areas, there exist a number of methods, such as the residual-based approach,¹⁹ exponentially weighted moving average,²⁰ hidden Markov models,²¹ and so on. Recent surveillance techniques in this era of big data can now incorporate data from a variety of sources, such as electronic health records, mobile phone call records, geographically tagged tweets, and so on.²²

Contrary to these geographically aggregated data methods, oral health surveillance for PD mostly considered subject-level exploration for a cross-sectional spatial setting under various complex scenarios, such as complex covariance,²³,²⁴ informative missingness,¹¹ and non-Gaussian responses.²⁵,²⁶ While the whole-mouth average may be sufficient for studying population effects of systematic treatment, we focus on site-level modeling to quickly detect local changes to guide site-specific treatments. In this article, our motivation for developing short-term spatiotemporal PD surveillance comes from an observational database in a dental practice-based setting, maintained by the HealthPartners® (HP) Institute (henceforth, HP data) located in suburban Minneapolis, Minnesota. Although Bayesian methods exist under nonstationary space-time dependence assumptions,²⁷ their computational scalability in light of the enormity of the HP database is questionable. In addition, the assumed spatial association inside a mouth can be nonstationary, because the association among posteriorly located tooth-sites in the molars can be different from that in the anterior incisors. We set forward to address this problem via a Bayesian spatiotemporal proposal that can detect problematic sites in the mouth via the spatial horseshoe (SHS)²⁸ prior on the site-specific linear-trend coefficients.

Bayesian sparsity-inducing regressions can be broadly classified into two categories: (a) the (discrete) mixture “spike-and-slab” priors,²⁹ which place a point mass at zero and an absolutely continuous prior on the remaining nonzero elements of the parameter vector, and (b) the (continuous) shrinkage priors³⁰ with absolutely continuous shrinkage on the entire parameter vector. Although the spike-and-slab models are theoretically attractive, their discrete indicators give rise to poor mixing and slow convergence, often complicating the full exploration of the posterior via Markov chain Monte Carlo (MCMC) techniques.³¹⁻³³ On the other hand, the “global-local” shrinkage prior models are computationally elegant because they model the posterior inclusion probabilities directly, thereby adjusting to sparsity via global shrinkage, and identifying signals via local shrinkage.³⁴,³⁵

As the first spatial continuous shrinkage prior, our SHS proposal²⁸ reflects the (realistic) prior belief that there are usually only a few unhealthy sites in one’s mouth during a short period of time, while simultaneously incorporating spatial dependence in the signal at nearby observations. That article explores some of its nice theoretical properties, such as high concentration around zero for sparsity, and heavy tails to avoid excessive shrinkage. In this article, we extend this approach to the multisubject spatiotemporal setting, and develop a low-rank representation to capture the nonstationary spatial covariance structure of the HP data with reduced computing time.

The rest of the article proceeds as follows. We describe the motivating HP dataset in Section 2. In Section 3, we introduce our spatiotemporal model for sparse signal detection, and the low-rank representation. We present the Bayesian inferential setup through prior specifications, and related MCMC-based computing details in Section 4. In Section 5, we apply our method to the HP dataset, and summarize our findings. We conduct a simulation study to evaluate the prediction performance of the proposed model in Section 6. Finally, we conclude with a brief discussion in Section 7.

2. MOTIVATING HP DATA

The longitudinal HP dataset consists of information on periodontal health collected for 25 763 subjects from routine dental practices located in suburban Minneapolis. The study period we selected was 2007 to 2014, and the subjects were at least 18 years old as of January 1, 2007. PPD was measured at six prespecified sites for each tooth, excluding the wisdom teeth (third molars), of each subject via a periodontal probe, and recorded as integer values (in mm). Figure 1 illustrates the tooth numbering system and measured locations for each tooth, excluding wisdom teeth, that is, tooth numbers 1, 16, 17, and 32. Maxillary (upper) and mandibular (lower) are the jaw indicators, while buccal and lingual represent the cheek-side, and the side closest to the tongue, respectively. The left side of the plots in this article represents the right side of the individual, and vice versa.

FIGURE 1. Mean and SD of periodontal pocket depth (PPD) in millimeters across all subjects and their visits during the first 2 years. Tooth numbering system is described in numbers and texts for the 28 teeth (2–15 and 18–31), excluding the four third molars (wisdom teeth). There are six locations measured for each tooth, numbered from the buccal to the lingual surface, and the mesial to the distal surface. Note that increments of 0.5 mm reflect the smallest clinically meaningful increment

The objective of our analysis is to flag unhealthy sites at an early stage before severe PD progression. Hence, we restricted our analysis to data collected only during the first 2 years for each subject. This short period also makes the assumption of a linear change of PPD in time reasonable. Although each examining oral clinician provided a recommended follow-up time for periodontal checkups at each time point for each subject, there is no reason to believe that subjects will abide by those recommendations in this practice-based setting. This leads to a longitudinal database with irregular observation times, and we exclude subjects with fewer than four visits in the first 2 years. This leaves 7279 subjects satisfying the above conditions, with the number of visits ranging from 4 to 8. Figure 1 presents the mean and SD of the recorded PPD for these subjects. The average ranged between 1.1 and 3.0 mm, with higher values in the posterior located sites (molars) than the anterior, confirming previous findings.³⁶ The variation of PPD exhibits a similar pattern, with the SD ranging from 0.7 to 1.8 mm. Although the database is not publicly available, it can be requested via a relevant data use agreement with the HP.

3. MODEL DESCRIPTION

3.1. Spatiotemporal model

Denote y_ijk as the recorded PPD in millimeters for subject i = 1, … , n_p at visit j = 1, … , n_vi and at site k = 1, … , n_s, where n_p = 7279 is the number of subjects, n_s = 168 is the number of sites, and n_vi is the number of visits for subject i. With PPD recorded as an integer, we define $y_{i j k}^{*}$ as a latent variable for subject i at site k for the jth visit related to the observed PPD as $y_{i j k} = \max \{[y_{i j k}^{*}], 0\}$ , where [x] is the nearest integer to x. We use the data at the visits in the first 2 years, and assume a latent linear trend of PPD in time. We specify the complete data model for subject i as:

y_{i j k}^{*} = α_{i k} + β_{i k} t_{i j} + ε_{i j k},

(1)

where α_ik is the baseline PPD at site k, β_ik is the slope at site k, t_ij is the years since baseline for the jth visit, and ε_ijk is the random error. We account for missing observations using standard Bayesian missing data methods, assuming the data are missing at random (MAR).³⁷
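As a minimal sketch of Equation (1) and the rounding link for a single site, the following Python snippet simulates latent and recorded PPD values. The visit times, parameter values, and function names are hypothetical, and rounding ties upward is an assumption (the text only says "nearest integer").

```python
import math
import random

def round_half_up(x):
    """Nearest integer to x; ties go up (an assumption, ties are not specified in the text)."""
    return math.floor(x + 0.5)

def observed_ppd(y_star):
    """Map a latent PPD value to the recorded integer PPD, max{[y*], 0}."""
    return max(round_half_up(y_star), 0)

def simulate_site(alpha, beta, times, sigma_eps, rng):
    """Draw latent and recorded PPD for one site under the linear trend of Equation (1)."""
    latent = [alpha + beta * t + rng.gauss(0.0, sigma_eps) for t in times]
    return latent, [observed_ppd(v) for v in latent]

rng = random.Random(1)
times = [0.0, 0.5, 1.0, 1.5]  # hypothetical visit times, in years since baseline
latent, observed = simulate_site(alpha=2.0, beta=0.8, times=times, sigma_eps=0.3, rng=rng)
```

The recorded values are non-negative integers even when the latent trend or the noise pushes the latent value below zero.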

To capture the prior belief that only a few sites may have deteriorated during the study period, we assume that the slope β_ik marginally follows a horseshoe prior.³⁰ The horseshoe prior can be written hierarchically as

β_{i k} ∣ λ_{i k} ~ N (0, λ_{i k}^{2}), λ_{i k} ~ C^{+} (0, 1),

(2)

where λ_ik is the prior SD, and follows the standard half-Cauchy distribution on the positive reals. Marginally over λ_ik, the prior for β_ik has a mass concentration near zero with heavy tails. The shape of the density shrinks null signals toward zero and avoids shrinking the true signals. This property facilitates separating signals from the noise.
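The two properties described above (mass near zero and heavy tails) can be checked by simulation. The sketch below draws from the hierarchy in Equation (2) and compares it with a standard normal; the cutoffs 0.1 and 5 and the sample size are arbitrary illustrative choices, not values from the article.

```python
import math
import random

def draw_horseshoe(rng):
    """One draw of beta from Equation (2): lambda ~ C+(0, 1), beta | lambda ~ N(0, lambda^2)."""
    lam = abs(math.tan(math.pi * (rng.random() - 0.5)))  # |standard Cauchy| is half-Cauchy
    return rng.gauss(0.0, lam)

def share(draws, predicate):
    """Fraction of draws that satisfy the predicate."""
    return sum(1 for x in draws if predicate(x)) / len(draws)

rng = random.Random(0)
n = 20000
horseshoe = [draw_horseshoe(rng) for _ in range(n)]
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]

near_zero_hs = share(horseshoe, lambda x: abs(x) < 0.1)
near_zero_gauss = share(gaussian, lambda x: abs(x) < 0.1)
tail_hs = share(horseshoe, lambda x: abs(x) > 5.0)
tail_gauss = share(gaussian, lambda x: abs(x) > 5.0)
```

Under these assumptions the horseshoe draws should place more mass both very close to zero and far out in the tails than the Gaussian draws.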

Define the vector of slopes for subject i as $β_{i} = {(β_{i 1}, \dots, β_{i n_{s}})}^{T}$ . To incorporate spatial dependence into a multivariate horseshoe prior, we propose to set β_ik = λ_ik ζ_ik, where λ_ik is a shrinkage parameter with half-Cauchy prior and ζ_ik is normal. Spatial shrinkage is induced by the spatial process model for $λ_{i} = {(λ_{i 1}, \dots, λ_{i n_{s}})}^{T}$ . We propose a Gaussian copula model³⁸ that preserves the marginal half-Cauchy distribution as in Equation (2), and captures spatial dependence, such that

λ_{i k} = f (δ_{i k}),

(3)

where δ_ik is the kth variable of the latent process $δ_{i} = {(δ_{i 1}, \dots, δ_{i n_{s}})}^{T}$ , $f (\cdot) = F_{C^{+}}^{- 1} [Φ (\cdot)]$ is the half-Cauchy link function, $F_{C^{+}}^{- 1} (\cdot)$ is the inverse cumulative distribution function of the half-Cauchy distribution, and Φ(⋅) is the standard normal cumulative distribution function. The model can be expressed as

y_{i j}^{*} = α_{i} + t_{i j} \cdot f (δ_{i}) ⊙ ζ_{i} + ε_{i j},

(4)

where $y_{i j}^{*} = {(y_{i j 1}^{*}, \dots, y_{i j n_{s}}^{*})}^{T}$ is the vector of latent variables for subject i at the jth visit, $α_{i} = {(α_{i 1}, \dots, α_{i n_{s}})}^{T}$ is the vector of baseline PPD for subject i, the operator ⊙ defines the pointwise vector product, δ_i is the spatial latent vector, $ζ_{i} = {(ζ_{i 1}, \dots, ζ_{i n_{s}})}^{T}$ is the normal vector, and $ε_{i j} = {(ε_{i j 1}, \dots, ε_{i j n_{s}})}^{T}$ is the vector of random error.
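The half-Cauchy link has a closed form: the standard half-Cauchy distribution function is F(x) = (2/π) arctan(x), so f(δ) = tan(π Φ(δ) / 2). A minimal Python sketch follows; the function names and the toy values of δ and ζ are illustrative only.

```python
import math
from statistics import NormalDist

def half_cauchy_link(delta):
    """f(delta) = F^{-1}_{C+}[Phi(delta)] = tan(pi * Phi(delta) / 2)."""
    u = NormalDist().cdf(delta)          # standard normal CDF
    return math.tan(0.5 * math.pi * u)   # inverse CDF of the standard half-Cauchy

def slopes(delta, zeta):
    """Pointwise product f(delta) * zeta, the slope term in Equation (4)."""
    return [half_cauchy_link(d) * z for d, z in zip(delta, zeta)]

# Toy latent values for three sites (hypothetical numbers).
delta = [-2.0, 0.0, 2.0]
zeta = [0.3, 0.3, 0.3]
beta = slopes(delta, zeta)
```

Because f is increasing and maps δ = 0 to the half-Cauchy median of 1, large negative δ shrinks the slope toward zero and large positive δ leaves room for a strong signal.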

3.2. Low-rank representation

We use a low-rank representation to capture the complex spatial dependence of the PPD responses, and to facilitate computing for the vectors α_i, δ_i, and ζ_i. Let Q be an n_s × L basis function matrix that determines the covariance of PPD among sites. We set α_i = Qa_i, δ_i = Qd_i, and ζ_i = Qz_i, such that the model becomes

y_{i j}^{*} = Q a_{i} + t_{i j} \cdot f (Q d_{i}) ⊙ Q z_{i} + ε_{i j},

(5)

where a_i = (a_i1, … , a_iL)^T is the vector related to baseline PPD for subject i, d_i = (d_i1, … , d_iL)^T is the vector related to spatial latent vector, and z_i = (z_i1, … , z_iL)^T is the vector related to slope.

We use principal component analysis (PCA)³⁹ to form the basis function matrix Q. PCA does not merely increase computational efficiency, but provides interpretable decomposition of our data. Denote the total visit times as $N_{v} = \sum_{i = 1}^{n_{p}} n_{v i}$ and S as the n_s × n_s sample covariance matrix of the N_v response vectors $y_{i j} = {(y_{i j 1}, \dots, y_{i j n_{s}})}^{T}$ for i = 1, … , n_p and j = 1, … , n_vi. The eigen decomposition of S is $S = \tilde{Q} \tilde{D} {\tilde{Q}}^{T}$ , where $\tilde{Q}$ is the matrix of ordered eigenvectors q_(i) in the ith column, and $\tilde{D}$ is the diagonal matrix with the ordered eigenvalues ${\tilde{d}}_{(1)} \geq \dots \geq {\tilde{d}}_{(n_{s})}$ . We take the basis matrix Q to be the first L columns of $\tilde{Q}$ . The choice of L depends on the proportion of explained variation, $\sum_{l = 1}^{L} {\tilde{d}}_{(l)} / \sum_{k = 1}^{n_{s}} {\tilde{d}}_{(k)}$ .
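The construction of Q can be sketched with NumPy as follows. This is an illustration under stated assumptions: the function name `pca_basis`, the toy data, and the 70% target are placeholders, and the real analysis uses the N_v response vectors over 168 sites.

```python
import numpy as np

def pca_basis(Y, target):
    """Return (Q, leading eigenvalues, L) for the rows of Y (one response vector per row).

    L is the smallest number of eigenvectors whose eigenvalues account for at
    least `target` of the total variation.
    """
    S = np.cov(Y, rowvar=False)                    # n_s x n_s sample covariance
    evals, evecs = np.linalg.eigh(S)               # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]                # reorder to descending
    evals, evecs = evals[order], evecs[:, order]
    explained = np.cumsum(evals) / np.sum(evals)
    L = min(int(np.searchsorted(explained, target)) + 1, len(evals))
    return evecs[:, :L], evals[:L], L

# Toy data: 200 response vectors over 6 sites with unequal variances.
rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 6)) * np.array([3.0, 2.0, 1.0, 0.5, 0.5, 0.5])
Q, d_lead, L = pca_basis(Y, target=0.70)
```

The columns of Q are orthonormal, so Q^T Q is the L × L identity matrix.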

In this article, we assume the vectors α_i and β_i are both expanded using the same basis function matrix Q. However, it is possible to have a different basis for different model components. For example, one option is to perform PCA on the sample covariance of least squares estimates of α_i and β_i, and use them as the basis function for α_i and β_i.

4. BAYESIAN INFERENCE

4.1. Prior specification

We select multivariate normal priors for the vectors related to baseline PPD a_i, the slope z_i and the latent d_i, such that $a_{i} ~ N (μ_{a}, σ_{a i}^{2} Σ_{a}), d_{i} ~ N (0, σ_{d i}^{2} Σ_{d})$ , and $z_{i} ~ N (0, γ_{i}^{2} Σ_{z})$ . The mean of a_i is nonzero to capture the overall mean spatial trend, and assigned a noninformative prior as $μ_{a} ~ N (0, 100^{2} I_{L})$ . The priors for the variance parameters $σ_{a i}^{2}$ , $σ_{d i}^{2}$ and $γ_{i}^{2}$ are the uninformative inverse gamma distribution, IG(0.1, 0.1). The random error ε_ijk follows an independent and identical normal prior with zero mean and variance $σ_{ε}^{2}$ , with an uninformative inverse gamma hyperprior IG(0.1, 0.1) for $σ_{ε}^{2}$ .

We select inverse Wishart priors for the covariance matrices Σ_a and Σ_z. The covariance of the intercepts and slopes across subjects may not be the same as the sample covariances, and our model allows for this due to the assignment of an inverse Wishart prior for Σ_a and Σ_z. We hope that the basis matrix Q captures the main features, and allow Σ_a to specify the best covariance in the span of Q. If Q is full rank, this model spans all possible covariance matrices for a_i and z_i, and is thus a flexible model in this limiting sense. The slopes β_i are the product of the two terms, f(δ_i) and ζ_i, which can pose difficulty in estimating the scale of both δ_i and ζ_i. We therefore fix Σ_d at D, the diagonal matrix with the first L eigenvalues ${\tilde{d}}_{(1)}, \dots, {\tilde{d}}_{(L)}$ . Furthermore, to preserve the half-Cauchy marginal distribution for λ_i, we modify the link function to be $f (δ_{i k}) = F_{C^{+}}^{- 1} [Φ (δ_{i k} / \sqrt{w_{k}})]$ , where w_k is the kth diagonal element of QDQ^T. This produces an identifiable model that is still quite flexible, with the slope process β_i nonstationary such that $Cov (β_{i k}, β_{i k^{'}}) = λ_{i k} λ_{i k^{'}} Cov (ζ_{i k}, ζ_{i k^{'}})$ depending on the two sites k and k^′ for k, k^′ = 1, … , n_s and k ≠ k^′.
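The reason for dividing by √w_k can be checked numerically: with d_i ~ N(0, D), the kth element of δ_i = Qd_i has variance w_k, so the standardized value is standard normal and the link returns a half-Cauchy variable. The sketch below uses a stand-in orthonormal Q and stand-in eigenvalues, not quantities estimated from the HP data.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n_s, L = 8, 3                                        # toy sizes, not the 168 sites of the HP data
Q, _ = np.linalg.qr(rng.standard_normal((n_s, L)))   # stand-in orthonormal basis
D_diag = np.array([4.0, 2.0, 1.0])                   # stand-in leading eigenvalues

w = (Q ** 2) @ D_diag                                # kth diagonal element of Q D Q^T

d = rng.standard_normal((20000, L)) * np.sqrt(D_diag)  # each row is a draw of d_i ~ N(0, D)
delta = d @ Q.T                                      # each row is delta_i = Q d_i
standardized = delta / np.sqrt(w)

# Modified link: lambda_ik = tan(pi / 2 * Phi(delta_ik / sqrt(w_k))).
phi = 0.5 * (1.0 + np.vectorize(math.erf)(standardized / math.sqrt(2.0)))
lam = np.tan(0.5 * np.pi * phi)

site_variances = standardized.var(axis=0)            # should be close to 1 at every site
site_medians = np.median(lam, axis=0)                # half-Cauchy median is 1
```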

In summary, we formulate the priors

a_{i} ∣ μ_{a}, Σ_{a} ~ N (μ_{a}, σ_{a i}^{2} Σ_{a}),
d_{i} ∣ Σ_{d} ~ N (0, D),
z_{i} ∣ Σ_{z} ~ N (0, γ_{i}^{2} Σ_{z}),
μ_{a} ~ N (0, 100^{2} I_{L}),
σ_{a i}^{2}, γ_{i}^{2}, σ_{ε}^{2} ~ IG (0.1, 0.1),
Σ_{a}^{- 1}, Σ_{z}^{- 1} ~ Wishart (L, D^{- 1}),

(6)

where L is the degrees of freedom.

4.2. Computing details

We perform MCMC sampling using R. We implement blocked Metropolis-Hastings (MH) sampling⁴⁰ for the vectors d_i and z_i. The full conditional distributions for these parameters are

P (d_{i} ∣ \cdot) \propto [\prod_{j = 1}^{n_{v_{i}}} P (y_{i j}^{*} ∣ d_{i})] \times P (d_{i}) \propto exp [- \frac{1}{2} \sum_{j = 1}^{n_{v_{i}}} {(y_{i j}^{*} - μ_{i j})}^{T} {(σ_{ε}^{2} I)}^{- 1} (y_{i j}^{*} - μ_{i j}) - \frac{1}{2} d_{i}^{T} D^{- 1} d_{i}], and
P (z_{i} ∣ \cdot) \propto [\prod_{j = 1}^{n_{v_{i}}} P (y_{i j}^{*} ∣ z_{i})] \times P (z_{i}) \propto exp [- \frac{1}{2} \sum_{j = 1}^{n_{v_{i}}} {(y_{i j}^{*} - μ_{i j})}^{T} {(σ_{ε}^{2} I)}^{- 1} (y_{i j}^{*} - μ_{i j}) - \frac{1}{2 γ_{i}^{2}} z_{i}^{T} Σ_{z}^{- 1} z_{i}],

(7)

where the latent mean vector is μ_ij = Qa_i + t_ij f(Qd_i) ⊙ Qz_i. We use Gaussian candidate distributions N(0, D) and $N (0, γ_{i}^{2} Σ_{z})$ . We tune the blocked MH algorithm of d_i and z_i via D and Σ_z to attain acceptance probability near 40%. We monitor convergence using trace plots of several representative parameters.
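To make the blocked update for d_i concrete, here is a minimal Python sketch of one Metropolis-Hastings step targeting the first density in Equation (7). It is not the authors' R implementation. It assumes a random-walk proposal whose step is scaled by the square roots of the diagonal of D (one possible reading of the Gaussian candidate described above), uses the link standardized by w_k from Section 4.1, and runs on toy data with hypothetical sizes and values.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def link(delta, w):
    """Half-Cauchy link applied elementwise to delta / sqrt(w)."""
    u = 0.5 * (1.0 + _erf(delta / np.sqrt(2.0 * w)))
    return np.tan(0.5 * np.pi * u)

def log_target_d(d, y_star, t, Q, a, z, D_diag, sigma2_eps, w):
    """Log of the unnormalized full conditional of d_i in Equation (7)."""
    beta = link(Q @ d, w) * (Q @ z)            # site-specific slopes
    mu = Q @ a + np.outer(t, beta)             # row j holds mu_ij
    resid = y_star - mu
    return -0.5 * np.sum(resid ** 2) / sigma2_eps - 0.5 * np.sum(d ** 2 / D_diag)

def mh_step(d, log_target, scale, rng):
    """One random-walk MH update of the whole block d_i (assumed proposal form)."""
    proposal = d + scale * rng.standard_normal(d.shape)
    log_ratio = log_target(proposal) - log_target(d)
    if rng.uniform() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return d, False

# Toy problem: 6 sites, 2 basis functions, 4 visits (all values hypothetical).
rng = np.random.default_rng(0)
n_s, L = 6, 2
Q, _ = np.linalg.qr(rng.standard_normal((n_s, L)))
D_diag = np.array([2.0, 1.0])
w = (Q ** 2) @ D_diag
t = np.array([0.0, 0.5, 1.0, 1.5])
a, z = np.array([3.0, 0.5]), np.array([0.4, -0.2])
d_true = np.array([0.5, -0.3])
y_star = (Q @ a + np.outer(t, link(Q @ d_true, w) * (Q @ z))
          + 0.3 * rng.standard_normal((len(t), n_s)))

def target(d):
    return log_target_d(d, y_star, t, Q, a, z, D_diag, 0.09, w)

d, accepted, n_iter = np.zeros(L), 0, 500
for _ in range(n_iter):
    d, ok = mh_step(d, target, 0.5 * np.sqrt(D_diag), rng)
    accepted += ok
acceptance_rate = accepted / n_iter
```

In practice the step multiplier (0.5 here) would be tuned until the acceptance rate is near the 40% mentioned above; the update for z_i has the same structure with the prior term replaced by the one in the second density.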

Gibbs sampling is used for the remaining parameters: the vectors $y_{i j}^{*}$ , a_i, μ_a, the parameters $σ_{a i}^{2}, γ_{i}^{2}, σ_{ε}^{2}$ , and the inverse covariance matrices $Σ_{a}^{- 1}$ , $Σ_{z}^{- 1}$ . Given the priors in Equation (6), the full conditional distributions used for the Gibbs updates are given below. Define the latent mean as μ_ijk = α_ik + t_ij β_ik. The latent PPD for subject i at visit j and site k, $y_{i j k}^{*}$ , follows a truncated normal distribution with mean μ_ijk, variance $σ_{ε}^{2}$ , lower bound min{0, y_ijk − 0.5} and upper bound y_ijk + 0.5, that is,

y_{i j k}^{*} ∣ y_{i j k}, μ_{i j k}, σ_{ε}^{2} ~ TN [μ_{i j k}, σ_{ε}^{2}, min (0, y_{i j k} - 0.5), y_{i j k} + 0.5],

(8)

where TN(μ, σ², l, u) is the truncated normal density with the mean μ, the variance σ², the lower bound l and the upper bound u. The low-rank vector of baseline PPD a_i follows a multivariate normal posterior distribution,

a_{i} ∣ \cdot ~ N {W_{a i} Q^{T} [\frac{1}{σ_{ε}^{2}} \sum_{j = 1}^{n_{v_{i}}} (y_{i j}^{*} - t_{i j} β_{i}) + \frac{1}{σ_{a i}^{2}} Σ_{a}^{- 1} μ_{a}], W_{a i}},

(9)

where $W_{a i} = {(\frac{1}{σ_{a i}^{2}} Σ_{a}^{- 1} + \frac{n_{v_{i}}}{σ_{ε}^{2}} Q^{T} Q)}^{- 1}$ . In the next layer, the mean μ_a, the scale parameters $σ_{a i}^{2}, γ_{i}^{2}, σ_{ε}^{2}$ , and the scale matrices $Σ_{a}^{- 1}$ , $Σ_{z}^{- 1}$ all have the full conditionals given below, where $N_{v} = \sum_{i = 1}^{n_{p}} n_{v_{i}}$ denotes the total number of visits.

μ_{a} ∣ \cdot ~ N [{(\sum_{i = 1}^{n_{p}} \frac{1}{σ_{a i}^{2}} Σ_{a}^{- 1} + \frac{1}{100^{2}} I)}^{- 1} (\sum_{i = 1}^{n_{p}} \frac{1}{σ_{a i}^{2}} Σ_{a}^{- 1} a_{i}), {(\sum_{i = 1}^{n_{p}} \frac{1}{σ_{a i}^{2}} Σ_{a}^{- 1} + \frac{1}{100^{2}} I)}^{- 1}],
σ_{a i}^{2} ∣ \cdot ~ IG [0.1 + \frac{1}{2}, 0.1 + \frac{1}{2} {(a_{i} - μ_{a})}^{T} Σ_{a}^{- 1} (a_{i} - μ_{a})],
γ_{i}^{2} ∣ \cdot ~ IG (0.1 + \frac{1}{2}, 0.1 + \frac{1}{2} z_{i}^{T} Σ_{z}^{- 1} z_{i}),
σ_{ε}^{2} ∣ \cdot ~ IG [0.1 + \frac{N_{v} n_{s}}{2}, 0.1 + \frac{1}{2} \sum_{i = 1}^{n_{p}} \sum_{j = 1}^{n_{v_{i}}} \sum_{k = 1}^{n_{s}} {(y_{i j k}^{*} - μ_{i j k})}^{2}],
Σ_{a}^{- 1} ∣ \cdot ~ Wishart {n_{p} + L, {[\sum_{i = 1}^{n_{p}} \frac{1}{σ_{a i}^{2}} (a_{i} - μ_{a}) {(a_{i} - μ_{a})}^{T} + D^{- 1}]}^{- 1}},
Σ_{z}^{- 1} ∣ \cdot ~ Wishart [n_{p} + L, {(\sum_{i = 1}^{n_{p}} \frac{1}{γ_{i}^{2}} z_{i} z_{i}^{T} + D^{- 1})}^{- 1}] .

(10)
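The truncated normal update in Equation (8) can be drawn by inverting the normal distribution function. The sketch below uses only the Python standard library; the bounds are transcribed from Equation (8) as printed, the helper names are hypothetical, and the inverse-CDF approach is a simple choice that can lose accuracy far in the tails.

```python
import random
from statistics import NormalDist

def latent_bounds(y):
    """Lower and upper bounds for the latent PPD, transcribed from Equation (8)."""
    return min(0.0, y - 0.5), y + 0.5

def sample_truncated_normal(mu, sigma, lower, upper, rng):
    """Draw from N(mu, sigma^2) restricted to [lower, upper] by inverse-CDF sampling."""
    dist = NormalDist(mu, sigma)
    p_lo, p_hi = dist.cdf(lower), dist.cdf(upper)
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep u strictly inside (0, 1) for inv_cdf
    return dist.inv_cdf(u)

rng = random.Random(0)
y_obs, mu_ijk, sigma_eps = 3, 2.8, 0.5     # hypothetical recorded PPD and current parameters
lower, upper = latent_bounds(y_obs)
draws = [sample_truncated_normal(mu_ijk, sigma_eps, lower, upper, rng) for _ in range(1000)]
```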

We generate 10 000 samples and discard the first 2000 as burn-in for data analysis in Section 5.
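As an illustration of the Gibbs updates, the draw of a_i in Equation (9) can be sketched with NumPy. Two assumptions are made here: Q^T is applied to the data term only, so that both terms in the mean have length L, and all inputs are toy values rather than quantities from the HP analysis.

```python
import numpy as np

def draw_a_i(y_star, t, beta, Q, mu_a, Sigma_a_inv, sigma2_a, sigma2_eps, rng):
    """Draw a_i from its multivariate normal full conditional; also return mean and covariance."""
    n_v = y_star.shape[0]
    prior_prec = Sigma_a_inv / sigma2_a
    W = np.linalg.inv(prior_prec + (n_v / sigma2_eps) * (Q.T @ Q))
    resid_sum = (y_star - np.outer(t, beta)).sum(axis=0)   # sum over visits of y*_ij - t_ij beta_i
    mean = W @ (Q.T @ resid_sum / sigma2_eps + prior_prec @ mu_a)
    W = 0.5 * (W + W.T)                                    # remove round-off asymmetry
    return rng.multivariate_normal(mean, W), mean, W

# Toy check: noiseless data and a vague prior should recover the true a_i.
rng = np.random.default_rng(0)
n_s, L = 6, 2
Q, _ = np.linalg.qr(rng.standard_normal((n_s, L)))
t = np.array([0.0, 0.5, 1.0, 1.5])
a_true = np.array([3.0, -0.5])
beta = 0.2 * np.ones(n_s)
y_star = Q @ a_true + np.outer(t, beta)
a_draw, a_mean, a_cov = draw_a_i(y_star, t, beta, Q, np.zeros(L), np.eye(L),
                                 sigma2_a=100.0, sigma2_eps=1e-6, rng=rng)
```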

5. APPLICATION: HP DATA

In this section, we apply the proposed model in Section 3 to the HP data described in Section 2.

5.1. Model comparisons

We fit the model to the visits during the first 2 years for all 7279 subjects simultaneously, and evaluate the prediction of PPD at the next visit for each subject. We compare models with varying flexibility of shrinkage across space and different covariances. We consider two priors (Gaussian and SHS) for the slopes β_i, and two covariances (the sample covariance and the conditionally autoregressive (CAR) covariance⁴¹) across space, via the basis function matrix Q. The Gaussian β_i has a constant shrinkage parameter across space, that is, λ_ik = 1 for all subjects i = 1, … , n_p and sites k = 1, … , n_s. By contrast, the SHS prior for the slopes allows spatially varying shrinkage, β_i = f(δ_i) ⊙ ζ_i. Regarding the basis function matrix Q, we consider a low-rank representation of the sample covariance, or of the CAR covariance⁴¹ with the first-order neighbors. Here, a site neighbors the one or two sites on the same buccal/lingual side of the same tooth on the same side of the same jaw, the site on the tooth’s opposite buccal/lingual side, and the site directly above/below on the opposite jaw. Therefore, the four most posterior sites in the buccal side have two neighbors, the other sites in the buccal side have three, and all others have four. Consider the site at location 5 of tooth 15 in Figure 1 as an example. Its four neighbors are locations 4 and 6 on the same lingual side of tooth 15, location 2 on tooth 15’s buccal side, and location 5 of tooth 18 directly below on the opposite jaw. The CAR covariance is proportional to (M − ρA)⁻¹, where M is the diagonal matrix with the elements m₁, … , m_ns indicating the number of neighbors for sites 1, … , n_s, ρ is the spatial dependence parameter, and A is the adjacency matrix, with A_ij = 1 if sites i and j are neighbors and A_ij = 0, otherwise. The spatial dependence parameter ρ does not quantify the correlation between neighbors; however, correlations generally increase with ρ. We set ρ = 0.99, which gives moderate spatial dependence.⁴² The numbers of eigenvectors, L = 11 and L = 53, in the basis function matrix Q for the sample and CAR covariances are chosen for 70% and 90% explained variation in the sample covariance, respectively. We also compared ρ = 0.5 and ρ = 0.9, and found no substantial improvement.
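The CAR covariance (M − ρA)⁻¹ is easy to compute once an adjacency matrix is available. The sketch below uses a hypothetical path graph of six sites instead of the tooth-site neighbor structure described above, purely to show the construction and how the neighbor correlation responds to ρ.

```python
import numpy as np

def car_covariance(A, rho):
    """Covariance proportional to (M - rho * A)^{-1}, with M the diagonal matrix of neighbor counts."""
    M = np.diag(A.sum(axis=1))
    return np.linalg.inv(M - rho * A)

def to_correlation(cov):
    """Rescale a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Hypothetical adjacency: six sites on a line, each adjacent to its immediate neighbors.
n = 6
A = np.zeros((n, n))
for k in range(n - 1):
    A[k, k + 1] = A[k + 1, k] = 1.0

corr_high = to_correlation(car_covariance(A, rho=0.99))
corr_low = to_correlation(car_covariance(A, rho=0.5))
neighbor_corr_high = corr_high[0, 1]
neighbor_corr_low = corr_low[0, 1]

# A basis for the CAR model can then be taken from the leading eigenvectors.
evals, evecs = np.linalg.eigh(car_covariance(A, rho=0.99))
Q_car = evecs[:, ::-1][:, :2]   # two leading eigenvectors, as an example
```

In this toy graph the correlation between adjacent sites is larger for ρ = 0.99 than for ρ = 0.5, but in neither case is it equal to ρ itself.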

Table 1 presents the prediction results, based on 100 MCMC iterations. For both L = 11 and L = 53 basis functions, the SHS model with the low-rank representation of the sample covariance produces the smallest predicted mean squared error (MSE) for the observed y_ijk. Compared with L = 11, the MSE is smaller with L = 53 for all models, and with L = 53 the MSE of the SHS model based on the sample covariance is roughly half the MSE of the Gaussian CAR model. Coverage is close to the nominal level 95% for all models. Using a Dell Optiplex 9020 computer with 64-bit Windows 10, Intel i7-4790 3.6 GHz processor and 32 GB RAM, the computing times (in minutes) for the Gaussian (SHS) models are approximately 17 (23) and 34 (46), for L = 11 and 53, respectively.

TABLE 1.

HP data analysis results

Statistic	Model	Covariance	L = 11 Estimate	L = 11 SE	L = 53 Estimate	L = 53 SE
100×MSE	Gaussian	Sample	89.57	1.23	54.82	0.87
100×MSE	Gaussian	CAR	101.31	1.37	84.08	0.91
100×MSE	SHS	Sample	83.85	2.02	43.85	0.99
100×MSE	SHS	CAR	94.36	1.44	60.11	1.02
Coverage (%)	Gaussian	Sample	93.72	0.12	94.79	0.13
Coverage (%)	Gaussian	CAR	93.86	0.11	96.15	0.08
Coverage (%)	SHS	Sample	93.64	0.12	94.31	0.15
Coverage (%)	SHS	CAR	94.10	0.11	95.40	0.12
Computing time (min)	Gaussian	Sample	16.89	–	34.72	–
Computing time (min)	Gaussian	CAR	17.01	–	34.26	–
Computing time (min)	SHS	Sample	22.63	–	45.57	–
Computing time (min)	SHS	CAR	23.90	–	47.13	–

Note: Comparison of prediction accuracy between the Gaussian and spatial horseshoe (SHS) models using the low-rank representation of the sample covariance and conditional autoregressive (CAR) covariance, with the number of basis functions L = 11, 53. Methods are compared using mean squared error (MSE), coverage %, and computing time (in minutes) for 100 MCMC iterations.

5.2. Interpreting eigenvectors

Figures 2 and 3 illustrate the first to fourth and the fifth to eighth eigenvectors of the sample covariance, respectively. We interpret the first eigenvector as the overall mean of PPD; the second as a weighted average of PPD with more emphasis on the posterior teeth; the third puts more weight on the teeth in the mandibular side (ie, lower jaw); and the fourth puts more weight on the posterior teeth but more anterior part than the second eigenvector. The other four eigenvectors in Figure 3 exhibit several local features.

FIGURE 2. The first four eigenvectors of the sample covariance

Similarly, Figures A1 and A2 (in Appendix A1) present the first to fourth eigenvectors, and the fifth to seventh and twelfth eigenvectors, of the CAR covariance in a tooth map, respectively. The first 11 eigenvectors of the CAR covariance change horizontally, from the posterior to the anterior to the posterior region, and are nearly identical for both jaws. Starting from the twelfth eigenvector, there are differences between the maxillary side (ie, upper jaw) and the mandibular side, and between the buccal side and the lingual side. The first eigenvector serves as the overall mean of PPD. The second eigenvector puts more emphasis on the left-posterior region and decreases toward the right-posterior region. The third eigenvector is similar to the second in the sample covariance. The remaining eigenvectors in the CAR covariance depict varying characteristics in subregions of a mouth.

5.3. Summary of the fitted models

In this subsection, we summarize the fit of the Gaussian and SHS models with L = 53 eigenvectors to the HP data. To avoid excessive false positives, we implement the Bayesian spatial false discovery rate (BSFDR) procedure⁴³ with rate 0.01 to control for multiple testing. We consider the one-sided null and alternative hypotheses H₀ : β_ik ≤ 0 and H₁ : β_ik > 0, for i = 1, …, 7279 and k = 1, …, 168. We reject the null if the posterior probability of the alternative exceeds the threshold T. The BSFDR procedure determines T, such that the false discovery rate is approximately 0.01. The critical probabilities are T = 94.95% for the Gaussian models and T = 96.97% for the SHS models. The proportions of sites for which $H_{0_{i k}}$ is rejected across subjects are 4.81% and 7.29% for the Gaussian and SHS models. Hence, the SHS model appears to be more powerful.
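To show how a posterior-probability threshold T can be tied to a target false discovery rate, here is a minimal Python sketch of a generic Bayesian FDR rule: reject the hypotheses with the largest posterior probabilities for as long as the average of 1 − P(H₁ | data) over the rejected set stays at or below the target. This is a stand-in for intuition only; the BSFDR procedure cited above may differ in its details, and the probabilities used here are made up.

```python
def bayes_fdr_threshold(post_probs, rate):
    """Smallest posterior probability that can be rejected while the estimated FDR stays <= rate.

    The estimated FDR of a rejection set is the mean of (1 - p) over that set.
    Returns None when no hypothesis can be rejected.
    """
    ordered = sorted(post_probs, reverse=True)
    running_error = 0.0
    threshold = None
    for count, p in enumerate(ordered, start=1):
        running_error += 1.0 - p
        if running_error / count <= rate:
            threshold = p       # rejecting the `count` largest keeps the estimated FDR in bounds
        else:
            break               # the running mean only grows from here on
    return threshold

# Made-up posterior probabilities P(beta_ik > 0 | Y) for six sites.
probs = [0.999, 0.995, 0.99, 0.9, 0.5, 0.1]
T = bayes_fdr_threshold(probs, rate=0.01)
rejected = [p for p in probs if T is not None and p >= T]
```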

Figures 4 and 5 plot the fitted results with L = 53 eigenvectors for the two subjects (henceforth, labelled “Subject 1” and “Subject 2”) with the greatest difference in the posterior mean of β_ik between the Gaussian and SHS models. The posterior mean for Subject 1 in the SHS model is larger in teeth 5, 6, 14, 15, and 18 than in the Gaussian model. The map of posterior probability, P(β_ik > 0|Y), indicates that the PPD in the left side of the mouth has increased significantly within the first 2 years. Compared with the Gaussian model, SHS finds more significant deterioration of PPD in the buccal side of the lower jaw and in the middle of the right upper jaw (eg, teeth 2, 5, and 6). For Subject 2, the posterior means are larger in the left maxillary side for the SHS model compared with the Gaussian model. The map of posterior probability shows similar results. Comparing models, we find more significant sites and stronger spatial clustering of the signal in the SHS model, compared with the Gaussian model.

FIGURE 4. Posterior mean of β_k and posterior probability P(β_1k > 0|Y), k = 1, …, 168, for Subject 1 in the Gaussian and spatial horseshoe models with L = 53 eigenvectors

FIGURE 5. Posterior mean of β_k and posterior probability P(β_2k > 0|Y), k = 1, …, 168, for Subject 2 in the Gaussian and spatial horseshoe models with L = 53 eigenvectors

Figure 6 illustrates that compared with the Gaussian model, the density of the posterior means of β_ik (combining all subjects) from the SHS model has higher concentration around zero, with heavier tails. Moreover, Figure 7 plots the average rejection rate among all subjects by teeth. Aside from teeth in the posterior region, the SHS model also detects progression of teeth in the mandibular side and a few in the right-maxillary side of the mouth. In addition, Figure 8 presents the correlation among the basis functions for the covariances Σ_a and Σ_z using the low-rank representation of the sample covariance and CAR covariance. It is not surprising that almost all basis functions are unrelated due to the eigendecomposition of the two covariances. The only exception is the first and second basis functions in Σ_a, where we observe a weak negative correlation using the sample covariance.

FIGURE 6. Density plot (left) and quantile-quantile plot (right) of the posterior mean of site-specific linear-trend coefficient β_ik for subjects i = 1, …, 7279 and k = 1, …, 168 from the Gaussian and spatial horseshoe models with the low-rank representation of the sample covariance under 90% explained variation

FIGURE 7. The average rejection rate across subjects by teeth from the Gaussian and spatial horseshoe models with L = 53 eigenvectors. The rejection rule is available in the beginning of Section 5.3

FIGURE 8. Posterior mean of the correlation matrix corresponding to the covariances Σ_a and Σ_z from the spatial horseshoe model using the sample covariance (left column) and conditional autoregressive (CAR) covariance (right column) in the basis function matrix Q. The diagonal values are all 1, and removed for better illustration

Fitting our model to the entire dataset of 7279 subjects is time-consuming. However, this is required only once, offline, to estimate the population parameters. Fitting the model to a single subject, as would be done in practice, is fast. Using the low-rank representation of the sample covariance with L = 11, the computing times are 0.27 and 0.36 minutes for the Gaussian and SHS models, respectively. With L = 53, they are 0.63 and 1.17 minutes, respectively.

6. SIMULATION STUDY

In this section, we conduct a brief simulation study to examine the benefits of using shrinkage priors to detect increases in PD. For all simulations, we restrict the spatial domain to one jaw (ie, the 84 sites on 14 teeth) and generate data for 50 subjects. For each subject, the intercepts are generated from the CAR model (defined in Section 5.1), (α_i1, …, α_i84)^T ~ Normal(31, 0.5²S), where S = (M − 0.99A)⁻¹, A is the 84 × 84 adjacency matrix whose (u, v) element equals one if sites u and v are adjacent and zero otherwise (including the diagonal), and M is the diagonal matrix whose uth diagonal element equals the number of sites adjacent to site u. The slopes β_ik are generated to be the same within a tooth, and independent across teeth and subjects. The slopes for a tooth are assigned the value β₀ with probability π₀, and 0 with probability 1 − π₀. Given the slopes and intercepts, the data are generated as y*_ijk ~ Normal(α_ik + (j − 1)β_ik, 1) for time steps j = 1, …, 5. Therefore, unlike in the real data analysis, the simulated data are not integer-valued. The simulations vary by the effect size β₀ ∈ {0.50, 1.00} and the proportion of nonnull slopes π₀ ∈ {0.05, 0.20}. For each combination of these factors, we generate 100 datasets.
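The data-generating steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the adjacency structure is not specified in this section, so `chain_adjacency` assumes a simple chain in which each site neighbors only the next one along the jaw, and all function and argument names are hypothetical. The intercept mean is taken as printed in the text.

```python
import numpy as np


def chain_adjacency(n_sites):
    """Hypothetical adjacency: site u is adjacent to sites u - 1 and u + 1."""
    A = np.zeros((n_sites, n_sites))
    idx = np.arange(n_sites - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A


def simulate_dataset(beta0, pi0, rng, n_subjects=50, n_teeth=14,
                     sites_per_tooth=6, n_visits=5, rho=0.99,
                     intercept_mean=31.0, intercept_scale=0.5):
    """Simulate y[i, j, k] ~ Normal(alpha[i, k] + (j - 1) * beta[i, k], 1)."""
    n_sites = n_teeth * sites_per_tooth            # 84 sites on one jaw
    A = chain_adjacency(n_sites)                   # assumed adjacency
    M = np.diag(A.sum(axis=1))                     # number of neighbors
    S = np.linalg.inv(M - rho * A)                 # CAR covariance
    S = (S + S.T) / 2.0                            # remove round-off asymmetry

    # Intercepts: Normal(intercept_mean, intercept_scale^2 * S) per subject.
    L = np.linalg.cholesky(S)
    z = rng.standard_normal((n_subjects, n_sites))
    alpha = intercept_mean + intercept_scale * z @ L.T

    # Slopes: beta0 with probability pi0 per tooth, shared by its sites.
    nonnull = rng.random((n_subjects, n_teeth)) < pi0
    beta = np.repeat(np.where(nonnull, beta0, 0.0), sites_per_tooth, axis=1)

    # Responses at visits j = 1, ..., n_visits with unit error variance.
    t = np.arange(n_visits)                        # equals j - 1
    mean = alpha[:, None, :] + t[None, :, None] * beta[:, None, :]
    y = mean + rng.standard_normal(mean.shape)
    return y, alpha, beta, S


rng = np.random.default_rng(1)
y, alpha, beta, S = simulate_dataset(beta0=0.5, pi0=0.20, rng=rng)
print(y.shape)  # (subjects, visits, sites)
```

Repeating the call 100 times for each of the four (β₀, π₀) combinations would reproduce the layout of the study design, up to the assumed adjacency.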

For each dataset, we fit three models. The first model is the Gaussian model with δ_i set to zero (“Gaussian”). The second model is the horseshoe model that uses data from all five visits (“HS5”), and the third model is the horseshoe model that uses data from only the first four visits (“HS4”). For each model, we use the full CAR covariance to determine the latent-factor structure, that is, Q and D are set to the eigenvectors and eigenvalues, respectively, of S. For each model, we use the priors given in Section 4.1, and generate 5000 MCMC samples after discarding 1000 as burn-in. This gives estimates of the posterior means β̂_ik and the posterior probabilities that β_ik is positive, denoted q_ik. We conclude that the slope is positive if q_ik > 0.9. Table 2 reports the MSE of β̂_ik (averaged over tooth-sites and subjects) and the Type I error and power (also averaged over tooth-sites and subjects) of the test for a positive slope.
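The summaries described above can be computed from posterior draws along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: the draws are assumed to be stored in an array of shape (draws, subjects, sites), and the names `car_basis` and `summarize_slopes` are hypothetical. The MCMC sampler itself is not shown.

```python
import numpy as np


def car_basis(S):
    """Eigenvectors Q and eigenvalues D of S, largest eigenvalue first."""
    vals, vecs = np.linalg.eigh(S)       # eigh returns ascending order
    return vecs[:, ::-1], vals[::-1]


def summarize_slopes(beta_draws, beta_true, threshold=0.9):
    """MSE, Type I error, and power for the rule 'flag if q_ik > threshold'.

    beta_draws: posterior draws, shape (n_draws, n_subjects, n_sites)
    beta_true : true slopes, shape (n_subjects, n_sites); nonnull slopes
                are positive, as in the simulation design.
    """
    beta_hat = beta_draws.mean(axis=0)           # posterior means
    q = (beta_draws > 0).mean(axis=0)            # P(beta_ik > 0 | data)
    flagged = q > threshold
    null = beta_true == 0
    return {
        "mse": float(np.mean((beta_hat - beta_true) ** 2)),
        "type1_error": float(flagged[null].mean()) if null.any() else float("nan"),
        "power": float(flagged[~null].mean()) if (~null).any() else float("nan"),
    }


# Toy check with one subject and two sites (one null, one nonnull slope).
beta_true = np.array([[0.0, 1.0]])
draws = np.stack([beta_true + d for d in (-0.1, 0.0, 0.1)])
print(summarize_slopes(draws, beta_true))
```

Averaging these three quantities over the 100 replicate datasets, and multiplying by 100, would give entries on the scale used in Table 2.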

TABLE 2.

Summary of the simulation study

Statistic	Effect size (β₀)	Proportion nonnull (π₀)	Gaussian	HS5	HS4
MSE	0.5	0.05	19.6 (0.3)	1.4 (0.1)	2.4 (0.1)
MSE	0.5	0.20	19.6 (0.3)	3.2 (0.1)	4.7 (0.1)
MSE	1.0	0.05	19.6 (1.5)	1.5 (0.1)	3.1 (0.0)
MSE	1.0	0.20	19.7 (0.3)	3.8 (0.1)	7.5 (0.1)
Type I error	0.5	0.05	1.1 (0.1)	0.3 (0.1)	0.4 (0.1)
Type I error	0.5	0.20	1.1 (0.4)	0.4 (0.1)	0.4 (0.1)
Type I error	1.0	0.05	1.1 (0.1)	0.4 (0.1)	0.4 (0.1)
Type I error	1.0	0.20	1.1 (0.1)	0.8 (0.1)	0.6 (0.1)
Power	0.5	0.05	12.1 (0.5)	23.0 (0.4)	8.5 (0.3)
Power	0.5	0.20	11.9 (0.5)	22.6 (0.2)	8.4 (0.1)
Power	1.0	0.05	48.7 (1.0)	90.9 (0.2)	58.7 (0.5)
Power	1.0	0.20	48.0 (0.9)	91.0 (0.1)	58.5 (0.2)


Note: The competing models are the Gaussian model, and the horseshoe (“HS”) model that uses data from four (HS4), or five (HS5) visits. The simulations vary depending on the effect size β₀ and proportion of nonnull slopes π₀. MSE, Type I error and power are multiplied by 100, and standard errors are given in parentheses.

The MSE is dramatically smaller for the HS prior than for the Gaussian prior, especially when the proportion of nonnull slopes is low (π₀ = 0.05). All three methods are conservative, with Type I error less than 0.05 in all cases. The HS prior that uses the full dataset has roughly twice the power of the Gaussian prior in every setting (eg, 90.9% vs 48.7% for β₀ = 1.00 and π₀ = 0.05). In fact, for the larger effect size (β₀ = 1.00), the horseshoe prior that uses only the first four visits is more powerful than the Gaussian model that uses all five visits (about 59% vs 48%), although this is not the case for the smaller effect size (β₀ = 0.50; about 8% vs 12%).

7. DISCUSSION

In this article, we propose a spatiotemporal model for detecting local changes in PD. We place the SHS prior on the site-specific linear time trend for each subject. We introduce a low-rank representation to reduce the computational load and to obtain a nonstationary spatial covariance, which suits the HP data and provides more flexibility. The empirical results show improved prediction compared with alternatives that rely on the usual Gaussian priors for the regression parameters and a CAR specification for the covariance structure. R code for fitting the proposed model is available on request from the corresponding author.

A potential limitation of our model is the assumption that PPD changes linearly in time. We believe this is reasonable because we use PPD responses collected at the subjects’ dental visits within a 2-year period. The linear time trend can be modified (via splines or other functional structures) to suit a longer study duration.⁴⁴ One possibility is to allow for a higher order trend at each site, but to let the same shrinkage parameter λ_ik appear in the prior SD of all terms so that they shrink toward the static-mean model, following developments⁴⁵ for nonspatial data. A second restriction is that, although some spatiotemporal dependence is induced by the random slopes and intercepts, the errors are assumed to be independent. We think this is sufficient for the HP data because PPDs were measured independently across subjects, visits, and sites. However, this may not hold for other datasets. Although we found the eigendecomposition based on the sample covariance matrix to yield better results than the spatial CAR model, we have yet to explore more sophisticated parametric correlation structures.²³,⁴⁶ In addition, our shrinkage prior for the slopes is symmetric. While negative regression coefficients are plausible, as PPD can decrease with interventions such as improvements in dental hygiene, increasing PPD is more common and more relevant for disease monitoring, and so an asymmetric shrinkage prior could prove useful.

Relying on previous oral health studies,¹¹,²⁵ one may be tempted to consider informative missingness, or the “missing-not-at-random” scenario, within a spatiotemporal setup. Missing teeth are indicative of poor periodontal health. Hence, a region with many missing teeth is likely to have higher PPD at its nonmissing sites, an observation that can be attributed to spatial clustering. However, in the present analysis, we feel the working MAR assumption is reasonable, given that subjects rarely lose teeth during the short follow-up time we consider, and thus the periodontal health assessment can rely on changes in PPD over time at nonmissing sites. Furthermore, our current flagging algorithm considers only the baseline PPD, whereas other covariates (sociodemographic, behavioral, and so on) may also influence signal detection. All of these are important avenues for future research and will be considered elsewhere.

ACKNOWLEDGEMENTS

This work was supported by grant R01DE024984 from the National Institutes of Health. The authors thank B.S.M., B.D.R., S.K., and B.A.R. for providing the HealthPartners dataset, and the context behind this work.

Funding information

Foundation for the National Institutes of Health, Grant/Award Number: R01-DE024984-01A1

APPENDIX

A1. Plots of estimated parameters

FIGURE A1 — The first four eigenvectors of the conditional autoregressive (CAR) covariance

FIGURE A2 — The fifth to seventh and the twelfth eigenvectors of the conditional autoregressive (CAR) covariance

Footnotes

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of this article.

REFERENCES

1. World Health Organization. Global early warning system for major animal diseases, including Zoonoses (GLEWS) http://www.who.int/zoonoses/outbreaks/glews/en/; 2007.
2. Beltrán-Aguilar ED, Malvitz DM, Lockwood SA, Rozier R, Gary TSL. Oral health surveillance: past, present, and future challenges. J Public Health Dentist. 2003;63:141–149.
3. Eke PI, Dye BA, Wei L, Thornton-Evans GO, Genco RJ. Prevalence of periodontitis in adults in the United States: 2009 and 2010. J Dental Res. 2012;91:914–920.
4. Cheng Y-SL, Jordan L, Chen H-S, et al. Chronic periodontitis can affect the levels of potential oral cancer salivary mRNA biomarkers. J Periodontal Res. 2017;52:428–437.
5. Persson GR, Persson RE. Cardiovascular disease and periodontitis: an update on the associations and risk. J Clinical Periodontology. 2008;35:362–379.
6. Vavricka SR, Manser CN, Hediger S, et al. Periodontitis and gingivitis in inflammatory bowel disease: a case-control study. Inflammatory Bowel Diseases. 2013;19:2768–2777.
7. Tran DT, Gay I, Du Xianglin L, et al. Assessment of partial-mouth periodontal examination protocols for periodontitis surveillance. J Clinical Periodontology. 2014;41:846–852.
8. Page RC, Eke PI. Case definitions for use in population-based surveillance of periodontitis. J Periodontology. 2007;78:1387–1399.
9. Michalowicz HJS, Philstrom BL. Is change in probing depth a reliable predictor of change in clinical attachment loss? J Am Dental Assoc. 2013;144:171–178.
10. Bandyopadhyay LVH, Abanto-Valle CA, Ghosh P. Linear mixed models for skew-normal/independent bivariate responses with an application to periodontal disease. Stat Med. 2010;29:2643–2655.
11. Reich BJ, Bandyopadhyay D. A latent factor model for spatial data with informative missingness. The Annals of Applied Statistics. 2010;4:439–459.
12. Unkel S, Farrington CP, Garthwaite PH, Robertson C, Andrews N. Statistical methods for the prospective detection of infectious disease outbreaks: a review. J Royal Stat Soc Ser A (Stat Soc). 2012;175:49–82.
13. Knox EG, Bartlett MS. The detection of space-time interactions. J Royal Stat Soc Ser C (Appl Stat). 1964;13:25–30.
14. Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Res. 1967;27:209–220.
15. Jacquez GM. A k nearest neighbor test for space-time interaction. Stat Med. 1996;15:1935–1949.
16. Rogerson PA, Ikuho Y. Monitoring change in spatial patterns of disease: comparing univariate and multivariate cumulative sum approaches. Stat Med. 2004;23:2195–2214.
17. Kulldorff M, Heffernan R, Hartman J, Assunção R, Mostashari F. A space–time permutation scan statistic for disease outbreak detection. PLOS Med. 2005;2:e59.
18. Li G, Best N, Hansell AL, Ahmed I, Richardson S. BaySTDetect: detecting unusual temporal patterns in small area data via Bayesian model choice. Biostatistics. 2012;13:695–710.
19. Vidal Rodeiro CL, Lawson AB. Monitoring changes in spatio-temporal maps of disease. Biomet J. 2006;48:463–480.
20. Zhou H, Lawson AB. EWMA smoothing and Bayesian spatial modeling for health surveillance. Stat Med. 2008;27:5907–5928.
21. Watkins RE, Eagleson S, Veenendaal B, Wright G, Plant AJ. Disease surveillance using a hidden Markov model. BMC Med Inform Decis Mak. 2009;9:39.
22. Lee EC, Asher JM, Goldlust S, Kraemer JD, Lawson AB, Bansal S. Mind the scales: harnessing spatial big data for infectious disease surveillance and inference. J Infect Diseas. 2016;214:S409–S413.
23. Reich BJ, Hodges JS, Carlin BP. Spatial analyses of periodontal data using conditionally autoregressive priors having two classes of neighbor relations. J Am Stat Assoc. 2007;102:44–55.
24. Jin IH, Yuan Y, Bandyopadhyay D. A Bayesian hierarchical spatial model for dental caries assessment using non-Gaussian Markov random fields. Ann Appl Stat. 2016;10:884–905.
25. Reich BJ, Bandyopadhyay D, Bondell HD. A nonparametric spatial model for periodontal data with non-random missingness. J Am Stat Assoc. 2013;108:820–831.
26. Cai B, Bandyopadhyay D. Bayesian semiparametric variable selection with applications to periodontal data. Stat Med. 2017;36:2251–2264.
27. Reich BJ, Hodges JS. Modeling longitudinal spatial periodontal data: a spatially-adaptive model with tools for specifying priors and checking fit. Biometrics. 2008;64:790–799.
28. Jhuang A-T, Fuentes M, Jones JL, et al. Spatial signal detection using continuous shrinkage priors. Technometrics. 2019;61:494–506.
29. Ishwaran H, Rao JS. Spike and slab variable selection: frequentist and Bayesian strategies. Ann Stat. 2005;33:730–773.
30. Carvalho CM, Polson NG, Scott JG. The Horseshoe estimator for sparse signals. Biometrika. 2010;97:465–480.
31. Goldsmith J, Huang L, Crainiceanu CM. Smooth scalar-on-image regression via spatial Bayesian variable selection. J Comput Graphical Stat. 2014;23:46–64.
32. Boehm Vock LF, Reich BJ, Fuentes M, Dominici F. Spatial variable selection methods for investigating acute health effects of fine particulate matter components. Biometrics. 2015;71:167–177.
33. Ročková V, George EI. The Spike-and-Slab LASSO. J Am Stat Assoc. 2018;113:431–444.
34. Polson NG, Scott JG. Shrink globally, act locally: sparse Bayesian regularization and prediction. Bayesian Stat. 2010;9:501–538.
35. Bhadra A, Datta J, Polson NG, Willard B. Default Bayesian analysis with global-local shrinkage priors. Biometrika. 2016;103:955–969.
36. Quteish TDSM. Periodontal reasons for tooth extraction in an adult population in Jordan. J Oral Rehabilitat. 2003;30:110–112.
37. Little RJA, Rubin DB. Statistical Analysis with Missing Data. 3rd ed. Hoboken, NJ: John Wiley & Sons; 2019.
38. Nelsen RB. An Introduction to Copulas. New York, NY: Springer; 2006.
39. Pearson K. LIII. On lines and planes of closest fit to systems of points in space. London Edinburgh Dublin Philosoph Mag J Sci. 1901;2(11):559–572.
40. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. Boca Raton, FL: Chapman and Hall/CRC; 2013.
41. Banerjee S, Carlin BP, Gelfand AE. Hierarchical Modeling and Analysis for Spatial Data. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC; 2014.
42. Gelfand AE, Vounatsou P. Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics. 2003;4:11–15.
43. Sun W, Reich BJ, Tony CT, Guindani M, Schwartzman A. False discovery control in large-scale spatial multiple testing. J Royal Stat Soc Ser B (Stat Methodol). 2015;77:59–83.
44. Corberán-Vallet A, Lawson AB. Chapter 27: Spatial health surveillance. In: Lawson AB, Banerjee S, Haining RP, Ugarte MD, eds. Handbook of Spatial Epidemiology. Boca Raton, FL: Chapman and Hall/CRC; 2016:501–519.
45. Wei R, Reich BJ, Hoppin JA, Ghosal S. Sparse Bayesian additive nonparametric regression with application to health effects of pesticides mixtures. Statistica Sinica. 2020;30:55–79.
46. Mancl LA, Leroux BG. Efficiency of regression estimates for clustered data. Biometrics. 1996;52:500–511.

