Bayesian geostatistical modelling with informative sampling locations

D Pati; B J Reich; D B Dunson

doi:10.1093/biomet/asq067

Biometrika. 2011 Mar;98(1):35–48. doi: 10.1093/biomet/asq067

PMCID: PMC3744635 PMID: 23956461

Summary

We consider geostatistical models that allow the locations at which data are collected to be informative about the outcomes. A Bayesian approach is proposed, which models the locations using a log Gaussian Cox process, while modelling the outcomes conditionally on the locations as Gaussian with a Gaussian process spatial random effect and adjustment for the location intensity process. We prove posterior propriety under an improper prior on the parameter controlling the degree of informative sampling, demonstrating that the data are informative. In addition, we show that the density of the locations and mean function of the outcome process can be estimated consistently under mild assumptions. The methods show significant evidence of informative sampling when applied to ozone data over Eastern U.S.A.

Keywords: Cox process, Gaussian process, Joint model, Point pattern, Posterior consistency, Preferential sampling

1. Introduction

Geostatistical models focus on inferring a continuous spatial process based on data observed at finitely many locations, with the locations typically assumed to be noninformative. As noted by Diggle et al. (2010), this assumption is commonly violated in point-referenced spatial data, as it is not unusual to collect data at locations thought to have a large or small value for the outcome. For example, in monitoring of air pollution, one may place more monitors at locations believed to have a high value of ozone or another pollutant, while in studying distribution of animal species one may systematically look in locations thought to commonly contain the species of interest. Diggle et al. (2010) proposed a shared latent process model to adjust for bias due to informative sampling locations. Their analysis was implemented using a Monte Carlo approach for maximum likelihood estimation.

We follow a Bayesian approach using a model related to those described by R. Menezes in an unpublished 2005 Ph.D thesis from Universidad de Santiago de Compostela, Ho & Stoyan (2008) and Diggle et al. (2010). The locations are modelled using a log Gaussian Cox process (Møller et al., 2001), with the intensity function included as a spatially varying predictor in the outcome model, which also includes spatial random effects drawn from a Gaussian process. A parameter a controls the degree of informative sampling, and the sampling locations are ignorable in the special case in which a = 0, while a > 0 implies a tendency to take more observations at spatial locations having relatively high outcome values. This model modifies shared random effects models for joint modelling of longitudinal and event time data (Radcliffe et al., 2004) and for accommodating informative missingness (Wu & Follmann, 1999).

To our knowledge, we are the first to develop a Bayesian approach to the informative locations problem in geostatistical modelling. However, adapting recently proposed models to the Bayesian paradigm is relatively straightforward, and our primary contribution is studying the theoretical properties of the model. In particular, it is not obvious that the data contain information about the informativeness of the sampling locations, and one may wonder to what extent the prior is driving the results even in large samples. We address this concern by proving that the posterior is proper under a noninformative prior on a. In addition, one can consistently estimate a, the density of the sampling locations and the mean function of the outcome process. This result extends recent work showing posterior consistency in Gaussian process regression models (Choi & Schervish, 2007; Choi, 2007). Proofs are provided in the Appendix.

2. Model for spatial data with informative sampling

Our objective is to estimate the spatial surface μ(s) ∈ 𝕉, for all s ∈ 𝒟 ⊂ 𝕉², based on observations y₁, . . . , y_n at locations s₁, . . . , s_n ∈ 𝒟. We propose the joint model

y_{i} | s_{i} \sim N {η (s_{i}) + a ξ (s_{i}), σ^{2}}, p (s_{i}) = \frac{exp {ξ (s_{i})}}{\int_{𝒟} exp {ξ (s)} d s} (i = 1, \dots, n),

(1)

where the observations are independent across locations s_i given ξ(s) and η(s), and p(s) is the location density. Assuming the locations are a realization of an inhomogeneous Poisson process with log intensity ξ(s), the mean surface is characterized as μ(s) = η(s) + aξ(s), where η(s) is a baseline surface and aξ(s) is an adjustment due to informative sampling. Letting x(s) denote a vector of spatial covariates, ξ(s) = x(s)^T β_ξ + ξ_r (s) and η(s) = x(s)^T β_η + η_r (s), where β_ξ and β_η are regression coefficients and ξ_r (s) and η_r (s) are zero-mean residual processes.

The log sampling density is treated as a latent covariate to adjust for informative sampling, with a > 0 implying that samples are more likely to be taken in areas with a large response. Setting the coefficient in β_ξ corresponding to the intercept to zero for identifiability,

E (y_{i} | s_{i}) = x {(s_{i})}^{T} β * + a ξ_{r} (s_{i}) + η_{r} (s_{i}) (i = 1, \dots, n),

(2)

where β* = aβ_ξ + β_η. Therefore, accounting for informative sampling is only necessary when there is an association between the spatial surface of interest and the sampling density that cannot be explained by the shared spatial covariates x(s).
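As a worked step, using only the definitions of ξ(s) and η(s) given above, expression (2) follows by substituting the two regression decompositions into the mean of model (1) and collecting the covariate terms:

```latex
\begin{aligned}
E(y_i \mid s_i) &= \eta(s_i) + a\,\xi(s_i) \\
  &= x(s_i)^{T}\beta_{\eta} + \eta_r(s_i) + a\{x(s_i)^{T}\beta_{\xi} + \xi_r(s_i)\} \\
  &= x(s_i)^{T}(a\beta_{\xi} + \beta_{\eta}) + a\,\xi_r(s_i) + \eta_r(s_i).
\end{aligned}
```

Only the combination β* = aβ_ξ + β_η enters the outcome mean, so the covariate part of the sampling intensity is absorbed into the regression coefficients and a is identified through the residual process ξ_r alone.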

The residuals ξ_r (s) ∼ Π_{ξ_r} and η_r(s) ∼ Π_{η_r} are assigned independent zero-mean Gaussian process priors with Matérn covariance functions (Stein, 1999),

c (h | ψ) = \frac{τ^{2}}{2^{ν - 1} Γ (ν)} {(\frac{2 ν^{1 / 2} h}{ρ})}^{ν} 𝒦_{ν} (\frac{2 ν^{1 / 2} h}{ρ}), h = ‖ s - s^{'} ‖,

(3)

where ψ = (τ², ρ, ν) and 𝒦_ν denotes the modified Bessel function of the second kind of order ν. The Matérn covariance has three parameters: τ² > 0 controls the variance, ρ > 0 controls the spatial range of the correlation and ν > 0 controls the smoothness of the process. Special cases include the exponential covariance c(h | ψ) = τ² exp(−2^{1/2}h/ρ) with ν = 1/2, and the squared exponential covariance c(h | ψ) = τ² exp(−2h²/ρ²) in the limit ν → ∞.
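For half-integer ν the Bessel function in (3) reduces to elementary functions. The following is a minimal sketch, not the authors' code, that evaluates (3) for ν = 1/2 and ν = 3/2 only; the function name and the parameter values are illustrative assumptions.

```python
import math

def matern_half_integer(h, tau2, rho, nu):
    """Matern covariance (3) for nu = 1/2 or 3/2, using closed forms.

    Uses the scaling z = 2 * sqrt(nu) * h / rho from (3). Other values of nu
    need a modified Bessel function, which the standard library lacks.
    """
    z = 2.0 * math.sqrt(nu) * h / rho
    if nu == 0.5:
        return tau2 * math.exp(-z)              # exponential covariance
    if nu == 1.5:
        return tau2 * (1.0 + z) * math.exp(-z)  # once-differentiable case
    raise ValueError("closed form only implemented for nu in {0.5, 1.5}")

# Covariance at distance h = rho for the two smoothness values (illustrative).
for nu in (0.5, 1.5):
    print(nu, matern_half_integer(h=0.2, tau2=1.0, rho=0.2, nu=nu))
```

At a fixed distance and range, the smoother ν = 3/2 case retains more correlation than the exponential case.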

3. Theoretical properties

3.1. Weak posterior consistency

In this section, we obtain posterior consistency of the parameters of our model under fixed-domain asymptotics. Consider the joint model defined in §2, with 𝒟 = [0, 1]² without loss of generality and Π_{ξ_r}, Π_{η_r} Gaussian processes on 𝒞(𝒟), the space of continuous functions on 𝒟. Letting c(h | ψ_ξ) and c(h | ψ_η) denote the covariance functions for ξ_r and η_r, respectively, we choose independent bounded hyperpriors for $τ_{ξ}^{2}$ , $τ_{η}^{2}$ , ν_ξ and ν_η while letting ρ_ξ ∼ π_ξ and ρ_η ∼ π_η, where the supports of both π_η and π_ξ are 𝕉⁺. We choose a proper prior on 𝕉 for a, β_ξ ∼ N(β_0ξ, Σ_0ξ), β_η ∼ N(β_0η, Σ_0η) and σ² ∼ Inv-Ga (α_σ, β_σ).

Assumption 1. The prior ζ ∼ Π satisfies the prior positivity condition Π(ζ : ‖ζ − ζ₀‖_∞ < ∊) > 0 for all ∊ > 0 and for any ζ₀ ∈ 𝒞(𝒟).

van der Vaart & van Zanten (2009) showed that Assumption 1 holds for Gaussian process priors with squared exponential covariance under mild conditions and, in an unpublished 2005 Ph.D. thesis from Carnegie Mellon University, T. Choi provided a set of sufficient conditions on the Matérn covariance kernel for the same setting.

Assumption 2. The covariates are uniformly bounded, so there exists an M > 0 such that ‖x(s)‖ ⩽ M for all s ∈ 𝒟.

Theorem 1. Under models (1)–(2) with priors chosen as described in §3 and Assumptions 1–2, the posterior distribution Π[ξ_r, η_r, a, β_ξ, β_η, σ | {(y_i, s_i), i = 1, . . . , n}] is weakly consistent.

Theorem 1 does not imply that the hyperparameters in the covariance kernel are consistently estimated, though we do take into account uncertainty in these parameters and do not assume that the priors are well specified. It is typically not possible to consistently estimate all the parameters in the Matérn covariance (Zhang, 2004).

3.2. Posterior propriety of a

Under models (1)–(2), the parameter a controls the degree of informative sampling. The uniform improper prior, π_a(a) ∝ 1, provides a noninformative choice. Theorem 2 shows that this prior leads to a proper posterior, implying that the data are informative about a.

Letting s = (s₁, . . . , s_n), y = (y₁, . . . , y_n)^T, $ξ_{r}^{n}$ = {ξ_r(s₁), . . . , ξ_r(s_n)}^T, and $η_{r}^{n}$ = {η_r(s₁), . . . , η_r(s_n)}^T, we have $ξ_{r}^{n}$ ∼ N(0, $Σ_{ξ}^{n}$) and $η_{r}^{n}$ ∼ N(0, $Σ_{η}^{n}$), where $Σ_{ξ}^{n}$(s, s^′) = c(‖s − s^′‖ | ψ_ξ) and $Σ_{η}^{n}$(s, s^′) = c(‖s − s^′‖ | ψ_η) for s, s^′ ∈ 𝒟. Let c(h | ψ_ξ) = $τ_{ξ}^{2}$ exp(−2^{1/2}h^p/ρ_ξ) and c(h | ψ_η) = $τ_{η}^{2}$ exp(−2^{1/2}h^p/ρ_η) for 0 < p ⩽ 2. We assume independent bounded priors on τ_ξ and τ_η and independent discrete uniform priors on ρ_ξ and ρ_η. Let β_ξ ∼ N(β_0ξ, Σ_0ξ), β_η ∼ N(β_0η, Σ_0η) and σ² ∼ π(σ²). Here we focus on powered exponential covariance functions rather than Matérn to simplify calculations. A similar result should hold for Matérn covariance functions if the priors on the hyperparameters have bounded support.

Theorem 2. With the above prior specifications, the marginal posterior distribution of a, p(a | y, s) is proper, provided n ⩾ 2 and E_π (σ) < ∞.

When the conditions of Theorem 2 are satisfied, the joint posterior is also proper.

4. Computational details

The exact density for the sample locations in (1) is not available analytically, so an approximation is required. In point process modelling, the integral is often approximated as the sum over a fine grid. Letting t₁, . . . , t_M ∈ 𝒟 be a rectangular grid covering 𝒟 with cell area Δ, we have

\int_{𝒟} exp {ξ (s)} d s \approx Δ \sum_{j = 1}^{M} exp {ξ (t_{j})} .

(4)

This approximation yields a tractable posterior, but requires computationally expensive matrix inversions, which we limit using a kernel convolution approximation to the process.
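The grid approximation in (4) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the grid uses cell midpoints on [0, 1]², and the log intensity ξ(s) = s₁ + s₂ is a hypothetical choice made only because its integral, (e − 1)², is known exactly.

```python
import math

def grid_integral(xi, m):
    """Approximate the integral of exp{xi(s)} over [0, 1]^2 as in (4).

    Uses an m x m grid of cell midpoints, so M = m * m and each cell has
    area delta = 1 / M.
    """
    delta = 1.0 / (m * m)
    total = 0.0
    for i in range(m):
        for j in range(m):
            t = ((i + 0.5) / m, (j + 0.5) / m)   # midpoint of cell (i, j)
            total += math.exp(xi(t))
    return delta * total

# Hypothetical log intensity with a known integral: (e - 1)^2.
xi = lambda s: s[0] + s[1]
exact = (math.e - 1.0) ** 2
for m in (5, 30):
    print(m * m, grid_integral(xi, m), exact)
```

The error shrinks as the grid is refined, and the cost of evaluating ξ at every grid point grows linearly in M.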

Let δ(s) be a zero-mean Gaussian process with covariance c(h | ψ). A process convolution (Higdon, 2002) lets

δ (s) = \int_{𝒟} K_{ψ} (s - u) d W (u),

(5)

where W is Brownian motion and K_ψ is a kernel with parameters ψ. The kernel corresponding to the Matérn covariance is

K_{ψ} (u) = τ \frac{Γ {(ν + 1)}^{1 / 2} ν^{ν / 4 + 1 / 4} | u |^{ν / 2 - 1 / 2}}{π^{1 / 2} Γ (ν / 2 + 1 / 2) Γ {(ν)}^{1 / 2} ρ^{ν / 2 + 1 / 2}} 𝒦_{ν / 2 + 1 / 2} (\frac{2 ν^{1 / 2} | u |}{ρ}) .

The kernel convolution representation of the Gaussian process in (5) is often used to motivate dimension reduction for the spatial process. Let ϕ₁, . . . , ϕ_N be a grid of spatial knots. Then, for large N,

δ (s) \approx \sum_{j = 1}^{N} K_{ψ} (s - φ_{j}) w_{j},

(6)

where w_j ∼ N(0, 1). Applying kernel convolution to ξ(s) and η(s) yields

\begin{aligned} y_{i} \mid s_{i} &\sim N\left\{ x(s_{i})^{T} β^{*} + \sum_{j=1}^{N} K_{ψ_{η}}(s_{i} - φ_{j}) u_{j} + a \sum_{j=1}^{N} K_{ψ_{ξ}}(s_{i} - φ_{j}) υ_{j},\; σ^{2} \right\}, \\ p(s_{i}) &= \frac{\exp\{ x(s_{i})^{T} β_{ξ} + \sum_{j=1}^{N} K_{ψ_{ξ}}(s_{i} - φ_{j}) υ_{j} \}}{\sum_{l=1}^{M} \exp\{ x(t_{l})^{T} β_{ξ} + \sum_{j=1}^{N} K_{ψ_{ξ}}(t_{l} - φ_{j}) υ_{j} \}}, \end{aligned}

(7)

where u_j, υ_j ∼ N(0, 1). Selecting the number of grid points M and knots N is discussed in §5 and §6.
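To make the structure of (6) and (7) concrete, the sketch below simulates locations and outcomes from a discretized version of the model. It is illustrative only and rests on several assumptions: a Gaussian kernel stands in for the Matérn kernel (which needs a Bessel function), there are no covariates and β* = 0, knots and grid points are placed at cell midpoints, locations are restricted to the grid points, and all parameter values are arbitrary.

```python
import math
import random

def gaussian_kernel(d2, tau, rho):
    # Stand-in kernel (assumption): the paper uses the Matern kernel instead.
    return tau * math.exp(-d2 / (2.0 * rho ** 2))

def process_value(s, knots, weights, tau, rho):
    """Kernel convolution approximation (6): sum_j K(s - phi_j) w_j."""
    return sum(
        gaussian_kernel((s[0] - k[0]) ** 2 + (s[1] - k[1]) ** 2, tau, rho) * w
        for k, w in zip(knots, weights)
    )

def square_grid(m, lo, hi):
    """m x m grid of cell midpoints on [lo, hi]^2."""
    step = (hi - lo) / m
    pts = [lo + (i + 0.5) * step for i in range(m)]
    return [(x, y) for x in pts for y in pts]

rng = random.Random(1)
knots = square_grid(15, -0.2, 1.2)          # N = 225 knots
v = [rng.gauss(0.0, 1.0) for _ in knots]    # weights for xi
u = [rng.gauss(0.0, 1.0) for _ in knots]    # weights for eta
grid = square_grid(30, 0.0, 1.0)            # M = 900 integration points

a, sigma, tau, rho = 1.0, 1.0, 0.1, 0.2     # illustrative values
xi = [process_value(t, knots, v, tau, rho) for t in grid]
peak = max(xi)                              # subtracted for numerical stability
unnorm = [math.exp(x - peak) for x in xi]
total = sum(unnorm)
probs = [w / total for w in unnorm]         # discretized location density

# Draw n locations (grid cells), then outcomes with mean eta + a * xi.
n = 250
cells = rng.choices(range(len(grid)), weights=probs, k=n)
y = [
    process_value(grid[c], knots, u, tau, rho) + a * xi[c] + rng.gauss(0.0, sigma)
    for c in cells
]
```

With a > 0, cells with large ξ are both sampled more often and given a larger mean, which is the preferential sampling mechanism the model is designed to capture.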

We use a combination of Gibbs and Metropolis sampling for posterior computation. Assuming conjugate normal and inverse gamma priors, and reparameterization so that u_j ∼ N(0, $τ_{η}^{2}$) and υ_j ∼ N(0, $τ_{ξ}^{2}$), the full conditionals for β^*, a, $τ_{η}^{2}$, $τ_{ξ}^{2}$ and the vector (u₁, . . . , u_N)^T are conjugate and we use Gibbs sampling. The correlation parameters ρ_η and ρ_ξ and the smoothness parameters ν_η and ν_ξ are updated with Metropolis sampling, tuned to have an acceptance rate near 0.4. The sampling density parameters υ_j are updated using blocked Metropolis sampling to account for posterior correlation between coefficients for nearby knots. We used 10 blocks, with knots allocated to blocks using k-means clustering implemented by the kmeans function in R. For the simulation study in §5, we generated 5000 samples and discarded the first 1000 as burn-in. For the analysis of the ozone data in §6, we generated 20 000 samples and discarded the first 5000. Convergence was monitored using trace plots of the deviance as well as several parameters.
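The text does not say how the Metropolis proposals were tuned. One common option, shown here purely as an assumption, is to adapt the proposal scale during burn-in so that the acceptance rate drifts towards the target and then to freeze it. The sketch uses a scalar parameter and a standard normal log density as a toy target; the function name and tuning rule are hypothetical.

```python
import math
import random

def tuned_metropolis(log_post, x0, n_burn, n_keep, target=0.4, seed=0):
    """Random-walk Metropolis for a scalar parameter.

    During burn-in the log proposal scale is nudged up after an acceptance
    and down after a rejection, so the acceptance rate drifts towards
    `target`. The scale is frozen afterwards so the kept draws come from a
    valid Markov chain.
    """
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    log_step = 0.0
    draws, accepted = [], 0
    for t in range(1, n_burn + n_keep + 1):
        prop = x + math.exp(log_step) * rng.gauss(0.0, 1.0)
        lp_prop = log_post(prop)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        ok = math.log(1.0 - rng.random()) < lp_prop - lp
        if ok:
            x, lp = prop, lp_prop
        if t <= n_burn:
            log_step += ((1.0 if ok else 0.0) - target) / math.sqrt(t)
        else:
            draws.append(x)
            accepted += ok
    return draws, accepted / n_keep

# Toy target (assumption): a standard normal log density.
draws, rate = tuned_metropolis(lambda x: -0.5 * x * x, 0.0, 1000, 4000)
print(round(rate, 2), sum(draws) / len(draws))
```

The burn-in and retained sample sizes mirror the 1000 and 4000 iterations of the simulation study; in the full sampler each of ρ_η, ρ_ξ, ν_η and ν_ξ would need its own proposal scale.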

5. Simulation study

We conduct a simulation study to illustrate the effect of failing to account for informative sampling on spatial interpolation, and determine the amount of data needed to reliably identify informative sampling. We assume 𝒟 = [0, 1]² and no spatial covariates, x(s) = 1 for all s. We generate data using model (7) with an equally spaced grid of N = 225 knots on [−0.2, 1.2]² and a Matérn kernel. We generate S = 50 datasets from each of four simulation scenarios: (i) n = 250, a = 0, ρ = 0.2; (ii) n = 250, a = 1, ρ = 0.2; (iii) n = 250, a = 1, ρ = 0.5 and (iv) n = 500, a = 1, ρ = 0.2, with σ = 1, E{μ(s)} = 0, ν = 2.0 and τ = 0.1 under all scenarios. For each simulated dataset, we fit three models. The noninformative sampling model sets a = 0, the plug-in model sets ξ(s) = ξ̂(s) to account for informative locations and the full model implements the approach of §4. In the plug-in analysis, the location density is estimated using kernel density estimation in R’s KernSur function in the GenKern package with default settings. GenKern gives a bivariate kernel density estimate that uses Gaussian kernels with bandwidth chosen using a direct plug-in approach to approximate the asymptotically optimal bandwidth.

In the kernel convolution model we use the same grid of N = 225 knots that was used to generate the data, and approximate the integral using a square grid of M = 900 points t₁, . . . , t_M covering [0, 1]². Motivated by Rodrigues & Diggle (2010), we used an equally spaced grid of 225 knots on [−0.2, 1.2]². The simulation study results show that irrespective of the number and position of the sampling locations, the Gaussian process can be well approximated with 225 knots. Following Lee et al. (2005), the grid spacings are chosen to be no larger than the standard deviation of the kernel in the convolution representation. We use diffuse normal priors for β^* and a and the covariance parameters have priors σ², $τ_{ξ}^{2}$, $τ_{η}^{2}$ ∼ Inv-Ga(0.01, 0.01), $ρ_{ξ}^{2}$, $ρ_{η}^{2}$ ∼ U(0, 2), and $ν_{ξ}^{2}$, $ν_{η}^{2}$ ∼ U(0, 30).

Table 1 reports bias, mean-squared error, mean absolute deviation, and coverage probability, each averaged over the grid of M spatial locations t₁, . . . , t_M. The coverage probability is the proportion of the M grid locations for which the posterior 95% interval for μ(t_j) covers the true value. For the plug-in model and the full model, we also report the power for a in Table 1 which is defined to be the proportion of datasets for which the posterior 95% credible interval for a excludes zero.
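The summaries described above can be computed as in the following sketch. The sign convention for bias (estimate minus truth) and the reading of mean absolute deviation as the mean absolute error are assumptions, since the text does not spell them out, and all numbers are made up for illustration.

```python
def interpolation_summary(truth, post_mean, lower, upper):
    """Bias, MSE, MAD and 95% coverage averaged over the M grid points."""
    m = len(truth)
    err = [est - tru for est, tru in zip(post_mean, truth)]
    return {
        "bias": sum(err) / m,
        "mse": sum(e * e for e in err) / m,
        "mad": sum(abs(e) for e in err) / m,
        "cp": sum(lo <= tru <= hi for lo, tru, hi in zip(lower, truth, upper)) / m,
    }

def power_for_a(intervals):
    """Share of datasets whose 95% interval for a excludes zero."""
    return sum(not (lo <= 0.0 <= hi) for lo, hi in intervals) / len(intervals)

# Made-up numbers, for illustration only.
truth = [0.0, 1.0, 2.0, 3.0]
post_mean = [0.5, 1.0, 1.5, 3.0]
lower = [-0.5, 0.5, 0.8, 2.0]
upper = [1.5, 1.5, 1.9, 4.0]
summary = interpolation_summary(truth, post_mean, lower, upper)
power = power_for_a([(0.2, 1.8), (-0.3, 1.1), (0.5, 2.0), (-1.0, 0.4)])
print(summary, power)
```

In the simulation study these quantities would be computed once per dataset and then averaged over the S = 50 datasets in each design.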

Table 1.

Simulation study results

Design	Model	mse (×10²)	mad (×10²)	Bias (×10²)	cp (×10²)	Power for a (×10²)
(i)	nis	33.1 (2.8)	41.3 (0.6)	2.0 (1.3)	93.0 (1.0)	–
	Plug-in	32.2 (1.7)	41.3 (0)	2.5 (1.3)	93.0 (1.0)	10.0
	Full	31.9 (1.2)	41.5 (0.7)	2.5 (1.3)	93.0 (1.0)	10.0
(ii)	nis	49.4 (5.0)	50.0 (1.1)	−25.8 (1.3)	90.0 (1.0)	–
	Plug-in	39.2 (5.5)	44.8 (0.9)	−13.9 (1.3)	91.0 (1.0)	74.0
	Full	32.9 (2.8)	43.2 (0.8)	−7.5 (1.6)	93.0 (1.0)	80.0
(iii)	nis	13.2 (1.1)	28.1 (1.8)	−8.3 (1.4)	94.0 (1.0)	–
	Plug-in	12.1 (0.8)	27.1 (1.8)	−3.1 (1.4)	94.0 (1.0)	40.0
	Full	10.8 (0.7)	25.3 (1.4)	−2.0 (1.3)	95.0 (1.0)	50.0
(iv)	nis	25.6 (1.1)	36.9 (0.7)	−15.3 (1.2)	92.0 (1.0)	–
	Plug-in	20.9 (0.8)	33.9 (0.5)	−7.2 (1.1)	92.0 (1.0)	88.0
	Full	19.1 (0.6)	32.6 (0.4)	−0.8 (1.0)	94.0 (1.0)	98.0

nis, noninformative sampling; mse, mean squared error; mad, mean absolute deviation; cp, coverage probability.

All three methods perform similarly when sampling is not informative. In this case, the informative sampling methods rarely identify a as significant and reduce to the usual geostatistical model. The noninformative sampling model has high mean squared error and negative bias in the remaining designs with informative sampling. The two methods that allow for informative sampling reduce mean squared error compared with the noninformative sampling model. The informative sampling models also reduce bias, although some bias remains, especially for design (ii). In all cases, the full model improves on the plug-in approach. The mean squared error of the noninformative sampling model relative to the full model is smaller for design (iii), with a large spatial range (0.132/0.108 = 1.22), and for design (iv), with a larger sample size (0.256/0.191 = 1.34), than for design (ii) (0.494/0.329 = 1.50), so it seems that accounting for informative sampling is most important for small datasets with considerable spatial variation.
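The relative mean squared errors quoted in this comparison can be recomputed directly from the mse column of Table 1, as in this short check (the dictionary layout is only an illustrative choice):

```python
# MSE (x 10^2) from Table 1: (noninformative sampling, full model).
mse = {"(ii)": (49.4, 32.9), "(iii)": (13.2, 10.8), "(iv)": (25.6, 19.1)}

relative_mse = {design: nis / full for design, (nis, full) in mse.items()}
for design, ratio in relative_mse.items():
    print(design, round(ratio, 2))
```

The ordering of the three ratios, with design (ii) largest and design (iii) smallest, is what supports the conclusion about small datasets with short spatial range.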

To analyse sensitivity to the prior for a, we redid simulation design (ii) with a = 1 and ρ = 0.2 and used four different priors for a: N(1, 1), N(0, 1), N(0, 10²) and an improper prior. In summary, the mean squared prediction error and predictive coverage are insensitive to the hyperparameters of the prior on a for n = 150 and n = 200. Even for a sample size as small as n = 50, differences between priors are small. However, the N(0, 10²) prior and the informative prior N(1, 1) lead to better power for a than the others when n = 50 and 100. The minimum sample size needed to swamp the prior for a is about 150 in this example.

6. Analysis of Eastern United States ozone data

With the increasing concern about air pollution and climate change, building predictive models for ozone is an important area. It is often the case that the monitoring locations are informative about the ozone surface and hence it is important to account for informative sampling. We analyse the median daily ozone for June–August 2007 for n = 631 observations in Eastern U.S.A. The data are plotted in Fig. 1(a). There is a clear association between the sampling density and the response, as there are more monitors placed in areas with high ozone, such as Atlanta and New England, than in areas with low ozone, such as Mississippi and West Virginia. We fit a generalized additive model to the median ozone values and the kernel density estimate of the log sampling density using locally weighted scatterplot smoothing as shown in Fig. 1(b). The linear fit is entirely contained within the generalized additive model 95% confidence intervals for all values of the log sampling density estimate, supporting the log-linear model in (1).

Fig. 1 — Plots of the ozone data. (a) The ozone data in parts per billion and the monitor locations (points). (b) The estimated log sampling density against the response. Log sampling density versus median ozone (circles), gamfit with 95% intervals (dashed line), linear fit (solid line).

To apply a stationary spatial model, we first project the spatial locations to a two-dimensional surface using the Mercator projection, and then scale them to the unit square coordinate-wise by subtracting the minimum and dividing by the range of the observation locations. We fit the informative sampling model with a 30 × 30 grid of knots on [−0.2, 1.2]² in the kernel convolution approximation in (6) and a 50 × 50 grid of points on [0, 1]² in the integral approximation in the sampling density (4). Points outside the convex hull of the observation locations or outside the continental United States were discarded from integral approximation to the sampling density, leaving M = 1077. Kernel convolution knots not within 0.1 of an integral approximation knot were discarded, leaving N = 490.
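The projection and rescaling step can be sketched as below. The spherical Mercator formula on a unit sphere is assumed, since the text does not state which variant was used, and the monitor coordinates are hypothetical values chosen only to exercise the code.

```python
import math

def mercator(lon_deg, lat_deg):
    """Spherical Mercator projection on a unit sphere."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    return lam, math.log(math.tan(math.pi / 4.0 + phi / 2.0))

def to_unit_square(points):
    """Subtract the minimum and divide by the range, coordinate by coordinate."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    xr, yr = max(xs) - x0, max(ys) - y0
    return [((x - x0) / xr, (y - y0) / yr) for x, y in points]

# Hypothetical monitor coordinates (longitude, latitude) in degrees.
monitors = [(-84.4, 33.7), (-71.1, 42.4), (-90.2, 32.3), (-80.0, 38.5)]
scaled = to_unit_square([mercator(lon, lat) for lon, lat in monitors])
print(scaled)
```

Because each coordinate is rescaled separately, distances in the two directions are stretched by different factors unless the projected region happens to be square.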

We include a second-order spatial trend as predictors in x(s), that is, linear and quadratic terms for rescaled latitude and longitude and their interaction. We compare the noninformative sampling, plug-in and full models described in §5. The posteriors for several parameters are summarized in Table 2. The spatial processes for both the mean and the sampling density are fairly smooth. The posterior 95% intervals for ν_ξ and ν_η exclude the exponential covariance (ν = 0.5) for all three models.

Table 2.

Mean and 95% intervals for the ozone data

Parameters	nis	Plug-in	Full
a	–	4.43 (2.16, 6.46)	3.21 (2.12, 4.25)
σ	4.68 (4.37, 5.03)	4.70 (4.38, 5.04)	4.78 (4.47, 5.12)
τ_η	0.17 (0.14, 0.27)	0.15 (0.13, 0.19)	0.17 (0.13, 0.21)
ρ_η	0.06 (0.05, 0.16)	0.06 (0.04, 0.10)	0.06 (0.05, 0.10)
ν_η	3.95 (0.92, 6.42)	3.46 (1.53, 5.52)	12.6 (0.74, 28.8)
τ_ξ	–	–	0.05 (0.04, 0.06)
ρ_ξ	–	–	0.07 (0.04, 0.13)
ν_ξ	–	–	10.7 (0.74, 28.77)

nis, noninformative sampling.

The 95% intervals for a under both the plug-in model (2.16, 6.46) and the fully Bayesian model (2.12, 4.25) exclude zero, indicating an informative sampling scheme. The scale of the posterior of a is not comparable between the two models, because the plug-in density estimate has been standardized to have mean zero and variance one. The effect of accounting for informative sampling is illustrated in Fig. 2. The difference in predicted values between the noninformative sampling and full models in Fig. 2(c) is largest in Northern Pennsylvania and West Virginia. These areas have relatively few monitors and are near areas with high ozone. The differences between the noninformative sampling and plug-in predictions in Fig. 2(d) are also positive in these areas, though the differences are not nearly as large in the plug-in analysis. This may be because the plug-in estimates do not appropriately account for uncertainty in estimation, and hence may lead to some attenuation of the estimated surface.

Fig. 2 — The effect of accounting for informative sampling. (a) Posterior mean predicted values; (b) log sampling density from the full model; (c) the difference in posterior mean predicted values of the noninformative sampling model and full model and (d) the plug-in model.

Finally, we refit the model with different priors and different knot locations to test for sensitivity to these assumptions. We fit the model with 20 × 20 and 40 × 40 initial grids of knots in the kernel convolution approximation. After removing knots outside the domain of interest, we obtain N = 206 and N = 876 knots, respectively. The results were fairly similar to those for the original 30 × 30 grid. In all cases the posterior of a was separated from zero, the posterior median being 3.31 and 2.85 for N = 206 and N = 876 knots, respectively, and the largest difference between the noninformative sampling and full models was in Northern Pennsylvania and West Virginia.

7. Discussion

We have focused on a simple model for informative locations, which assumes that the outcomes are conditionally independent of the locations, given the mean process μ(s) and the spatial location density p(s). In addition, we include a single parameter a controlling the informativeness of the sampling process. These simplifying assumptions certainly make the theory and computation more tractable. However, to characterize the data from a broader variety of applications more realistically, it may be necessary to generalize the models. There are several interesting directions in this regard. First, it is straightforward conceptually to replace the constant a with a spatially varying coefficient a(s), which is assigned a Gaussian process prior. This generalization allows the informativeness of the sampling locations to vary spatially; for example, in certain regions, e.g., near cities, monitors may be placed without regard to the outcome, while in the rural areas, monitors may be placed at sites likely to have high values of ozone. It is an open question whether one can consistently estimate a(s) in this extended model without very restrictive assumptions. However, a simple adjustment for informative sampling may be preferable to more complicated models that require rich datasets for reliable estimation.

Acknowledgments

This research was partially supported by the National Institute of Environmental Health Sciences of the National Institutes of Health. The authors would like to thank Mr. Avishek Chakraborty and Mr. Anirban Bhattacharya for their helpful comments.

Appendix.

Proof of Theorem 1. Let ϕ = (ξ_r, η_r, β_ξ, β_η, a, σ) and ϕ₀ = (ξ_0r, η_0r, β_ξ0, β_η0, a₀, σ₀) be a fixed set of parameters in 𝒞(𝒟) × 𝒞(𝒟) × 𝕉 × 𝕉⁺. Clearly (y_i, s_i) ∼ f (y, s | ϕ), where

f(y, s \mid φ) = f(y \mid s, φ)\, p(s \mid φ) = \frac{1}{\sqrt{2πσ^{2}}} \exp\left[ -\frac{\{y - μ(s)\}^{2}}{2σ^{2}} \right] \frac{\exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}}{\int_{𝒟} \exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}\, ds}.

Here μ(s) = x(s)^T(aβ_ξ + β_η) + aξ_r(s) + η_r(s). Let μ₀(s) = x(s)^T(a₀β_ξ0 + β_η0) + a₀ξ_0r(s) + η_0r(s). Define Λ(ϕ₀, ϕ) = log {f(y, s | ϕ₀)/f(y, s | ϕ)} and K(ϕ₀, ϕ) = E_ϕ₀{Λ(ϕ₀, ϕ)}. Then, following Schwartz (1965), it is enough to show that for all ∊ > 0,

(Π_{ξ_{r}} \times Π_{η_{r}} \times π_{β_{ξ}} \times π_{β_{η}} \times π_{σ} \times π_{a}) {ϕ : K (ϕ_{0}, ϕ) < ∊} > 0 .

We calculate K (ϕ₀, ϕ) using the following equation:

\begin{aligned} K(φ_{0}, φ) &= E_{φ_{0}}\{Λ(φ_{0}, φ)\} = E_{φ_{0}}\left\{ \log \frac{f(y, s \mid φ_{0})}{f(y, s \mid φ)} \right\} \\ &= \frac{1}{2} \log \frac{σ^{2}}{σ_{0}^{2}} + E_{φ_{0}}\left[ -\frac{\{y - μ_{0}(s)\}^{2}}{2σ_{0}^{2}} \right] - E_{φ_{0}}\left[ -\frac{\{y - μ(s)\}^{2}}{2σ^{2}} \right] \\ &\quad - E_{φ_{0}}\{x(s)^{T}(β_{ξ} - β_{ξ0}) + ξ_{r}(s) - ξ_{0r}(s)\} + \log\left[ \frac{\int_{𝒟} \exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}\, ds}{\int_{𝒟} \exp\{x(s)^{T} β_{ξ0} + ξ_{0r}(s)\}\, ds} \right] \\ &= \frac{1}{2} \log \frac{σ^{2}}{σ_{0}^{2}} - \frac{1}{2}\left(1 - \frac{σ_{0}^{2}}{σ^{2}}\right) + \frac{1}{2σ^{2}} \int_{𝒟} \{μ_{0}(s) - μ(s)\}^{2} p(s)\, ds \\ &\quad - \int_{𝒟} \{x(s)^{T}(β_{ξ} - β_{ξ0}) + ξ_{r}(s) - ξ_{0r}(s)\} p(s)\, ds \\ &\quad + \log\left[ \frac{\int_{𝒟} \exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}\, ds}{\int_{𝒟} \exp\{x(s)^{T} β_{ξ0} + ξ_{0r}(s)\}\, ds} \right]. \end{aligned}

For each δ > 0, define

B_{δ} = \{φ : ‖ξ_{r} - ξ_{0r}‖_{\infty} < δ,\ ‖η_{r} - η_{0r}‖_{\infty} < δ,\ ‖β_{ξ} - β_{ξ0}‖ < δ,\ ‖β_{η} - β_{η0}‖ < δ,\ |a - a_{0}| < δ,\ |σ/σ_{0} - 1| < δ\}.

Take b₁ = ‖μ₀ − μ‖_∞ and b₂ = σ/σ₀. Let g₁(b₁, b₂) = log b₂ − ( $b_{2}^{2}$ − 1)/(2 $b_{2}^{2}$ ) + $b_{1}^{2}$ /(2 $σ_{0}^{2} b_{2}^{2}$ ). Clearly g₁(b₁, b₂) is continuous at b₁ = 0 and b₂ = 1 and g₁(0, 1) = 0. We have

b_{1} ⩽ M ‖ (a β_{ξ} + β_{η}) - (a_{0} β_{ξ 0} + β_{η 0}) ‖ + ‖ {a ξ_{r} (s) + η_{r} (s)} - {a_{0} ξ_{0 r} (s) + η_{0 r} (s)} ‖

and

\begin{aligned} K(ϕ_{0}, ϕ) ⩽{} & g_{1}(b_{1}, b_{2}) - \int_{𝒟} \{x(s)^{T}(β_{ξ} - β_{ξ0}) + ξ_{r}(s) - ξ_{0r}(s)\} p(s)\, ds \\ & + \log\left[ \frac{\int_{𝒟} \exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}\, ds}{\int_{𝒟} \exp\{x(s)^{T} β_{ξ0} + ξ_{0r}(s)\}\, ds} \right]. \end{aligned}

For ∊ > 0, there exists a δ₁ > 0 such that for all ϕ ∈ B_δ₁,

\frac{1}{2} log \frac{σ^{2}}{σ_{0}^{2}} - \frac{1}{2} (1 - \frac{σ_{0}^{2}}{σ^{2}}) + \frac{1}{2 σ^{2}} \int_{𝒟} {μ_{0} (s) - μ (s)}^{2} p (s) d s < \frac{∊}{3} .

There also exists δ₂ > 0 such that for all ϕ ∈ B_δ₂, |x(s)^T(β_ξ − β_ξ0) + ξ_r(s) − ξ_0r(s)| < ∊/3 uniformly for all s ∈ 𝒟. If we define h_ϕ(s) = exp{x(s)^T β_ξ + ξ_r(s)}, then ϕ ↦ ∫_𝒟 h_ϕ(s) ds is a continuous function and hence ϕ ↦ log{∫_𝒟 h_ϕ(s) ds} is also a continuous function. So, there exists a δ₃ > 0 such that

ϕ \in B_{δ_{3}} \Rightarrow log {\int_{𝒟} h_{ϕ} (s) d s} - log {\int_{𝒟} h_{ϕ_{0}} (s) d s} < \frac{∊}{3} .

Choosing δ = min{δ₁, δ₂, δ₃}, ϕ ∈ B_δ implies K (ϕ₀, ϕ) < ∊. From T. Choi’s unpublished 2005 Ph.D thesis, it follows that with the priors specified in §3.1

(Π_{ξ_{r}} \times Π_{η_{r}} \times π_{β_{ξ}} \times π_{β_{η}} \times π_{σ} \times π_{a}) (B_{δ}) > 0 .

Hence,

(Π_{ξ_{r}} \times Π_{η_{r}} \times π_{β_{ξ}} \times π_{β_{η}} \times π_{σ} \times π_{a}) {ϕ : K (ϕ_{0}, ϕ) < ∊} > 0 .

Proof of Theorem 2. The prior specifications on ρ_ξ, ρ_η, τ_ξ and τ_η enable one to bound any quadratic forms and determinants involving $Σ_{ξ}^{n}$ and $Σ_{η}^{n}$ by fixed quantities. Hence, in showing that the posterior p(a | y, s) is proper, it is enough to treat ρ_ξ, ρ_η, τ_ξ and τ_η as constants. Without loss of generality, we can work with 𝒟 = [0, 1]² by the projection argument described in §6. Following Benes et al. (2003), we consider the grid approximation of the infinite-dimensional Gaussian process {ξ_r(s) : s ∈ 𝒟}, denoted by ξ_r. Let $𝒟 = \cup_{j = 1}^{J} I_{j}$, with {I_j} denoting a segmentation of 𝒟 into contiguous regions of equal area Δ = J⁻¹ ∫_𝒟 ds. Choose J sufficiently large such that at most one s_i lies within any I_j. The infinite-dimensional Gaussian process, ξ_r, can be approximated by a finite-dimensional vector $ξ_{r}^{J} = {(ξ_{r}^{* 1}, \dots, ξ_{r}^{* J})}^{T}$, corresponding to the choice of arbitrary points $s_{1}^{*}, \dots, s_{J}^{*}$ within I₁, . . . , I_J, respectively, such that ξ_r(s_i) = $ξ_{r}^{* j}$ if s_i ∈ I_j. Thus $ξ_{r}^{J}$ ∼ N(0, $Σ_{ξ}^{* J}$), where ${(Σ_{ξ}^{* J})}_{i j} = c (‖ s_{i}^{*} - s_{j}^{*} ‖ | ψ)$. Define the true posterior p^true($ξ_{r}^{n}$ | s) and the approximated posterior p^J($ξ_{r}^{n}$ | s) as follows:

p^{true}(ξ_{r}^{n} \mid s) \propto p^{true}(ξ_{r}^{n}, s) = E\left\{ \int \exp\{x(s)^{T} β_{ξ} + ξ_{r}(s)\}\, ds \mid ξ_{r}^{n} \right\}^{-n} \exp\{-0.5\, (ξ_{r}^{n})^{T} (Σ_{ξ}^{n})^{-1} ξ_{r}^{n}\}

and

p^{J}(ξ_{r}^{n} \mid s) \propto p^{J}(ξ_{r}^{n}, s) = \left[ Δ \sum_{j=1}^{J} \exp\{x(s_{j}^{*})^{T} β_{ξ} + ξ_{r}(s_{j}^{*})\} \right]^{-n} \exp\{-0.5\, (ξ_{r}^{J})^{T} (Σ_{ξ}^{*J})^{-1} ξ_{r}^{J}\}.

Marginalizing out $η_{r}^{n}$ , we have y|s, ξ_r, a, σ², β_η, β_ξ ∼ N(Xβ* + a $ξ_{r}^{n}$ , σ²I_n + $Σ_{η}^{n}$ ), where X^T = {x(s₁) ⋯ x(s_n)}. The true posterior of ( $ξ_{r}^{n}$ , a, σ², β_ξ, β_η) is

p^{true} (ξ_{r}^{n}, a, β_{ξ}, β_{η}, σ^{2} | y, s) \propto p (y | s, ξ_{r}, a, σ^{2}, β_{ξ}, β_{η}) p^{true} (ξ_{r}^{n}, s) π (σ^{2}) π (β_{ξ}) π (β_{η}) .

Benes et al. (2003) showed that, under these assumptions, for a fixed s ∈ 𝒟ⁿ, the expectation of any bounded function with respect to p^J($ξ_{r}^{n}$ | s) converges to the corresponding expectation with respect to p^true($ξ_{r}^{n}$ | s) as J tends to infinity. Hence, there exists a J such that the expectation of the bounded function with respect to p^J($ξ_{r}^{n}$ | s) is greater than the corresponding expectation with respect to (1/2) p^true($ξ_{r}^{n}$ | s). Thus, in order to show propriety of the true posterior of ($ξ_{r}^{n}$, a, σ², β_ξ, β_η), which involves p^true($ξ_{r}^{n}$ | s), it is enough to show the propriety of the approximated posterior p^J($ξ_{r}^{n}$, a, β_ξ, β_η, σ² | y, s). The approximated posterior of ($ξ_{r}^{n}$, a, σ², β_ξ, β_η) is

\begin{aligned} p^{J}(ξ_{r}^{n}, a, β_{ξ}, β_{η}, σ^{2} \mid y, s) &= C \exp\{-0.5\, (y - Xβ^{*} - aξ_{r}^{n})^{T} (σ^{2} I_{n} + Σ_{η}^{n})^{-1} (y - Xβ^{*} - aξ_{r}^{n})\} \\ &\quad \times \exp\{-0.5\, (ξ_{r}^{J})^{T} (Σ_{ξ}^{*J})^{-1} ξ_{r}^{J}\}\, π(β_{ξ})\, π(β_{η}) \\ &\quad \times π(σ^{2})\, \frac{\exp[\sum_{i=1}^{n} \{x(s_{i})^{T} β_{ξ} + ξ_{r}(s_{i})\}]}{Δ^{n} [\sum_{j=1}^{J} \exp\{x(s_{j}^{*})^{T} β_{ξ} + ξ_{r}^{*j}\}]^{n}}, \end{aligned}

where C is a constant. As exp{x(s_i)^Tβ_ξ + ξ_r (s_i)} < $\sum_{j = 1}^{J} exp {x {(s_{j}^{*})}^{T} β_{ξ} + ξ_{r}^{* j}}$ for all i = 1, . . . , n,

\frac{exp {\sum_{i = 1}^{n} x {(s_{i})}^{T} β_{ξ} + ξ_{r} (s_{i})}}{{[\sum_{j = 1}^{J} exp {x {(s_{j}^{*})}^{T} β_{ξ} + ξ_{r}^{* j}}]}^{n}} < 1 .

After integrating out $ξ_{r}^{J}$ , excluding $ξ_{r}^{n}$ , we are left with

\begin{aligned} p(ξ_{r}^{n}, a, β_{ξ}, β_{η}, σ^{2} \mid y, s) &⩽ C_{1} \exp\{-0.5\, (y - Xβ^{*} - aξ_{r}^{n})^{T} (σ^{2} I_{n} + Σ_{η}^{n})^{-1} (y - Xβ^{*} - aξ_{r}^{n})\} \\ &\quad \times \exp\{-0.5\, (ξ_{r}^{n})^{T} (Σ_{ξ}^{n})^{-1} ξ_{r}^{n}\}\, π(β_{ξ})\, π(β_{η})\, π(σ^{2}), \end{aligned}

where C₁ > 0 is a constant and $Σ_{ξ}^{n}$ is the variance–covariance matrix of $ξ_{r}^{n}$ constructed out of $Σ_{ξ}^{* J}$ . Setting Z = (y − Xβ^*)/a, Σ = ( $Σ_{η}^{n}$ + σ²I_n)/a² and Ω_η = {( $Σ_{ξ}^{n}$ )⁻¹ + Σ⁻¹}⁻¹ and completing quadratic forms yield

\begin{array}{rcl} p(ξ_{r}^{n}, a, β_{ξ}, β_{η}, σ^{2} \mid y, s) & ⩽ & C_{2} \exp\{- 0.5 (ξ_{r}^{n} - Ω_{η} Σ^{-1} Z)^{T} Ω_{η}^{-1} (ξ_{r}^{n} - Ω_{η} Σ^{-1} Z)\} \\ & & \times \exp\{- 0.5 (Z^{T} Σ^{-1} Z - Z^{T} Σ^{-1} Ω_{η} Σ^{-1} Z)\} π(β_{ξ}) π(β_{η}) π(σ^{2}), \end{array}

where C₂ > 0 is another constant. Next we state a useful lemma from matrix algebra.

Lemma 1. If A and B are positive definite matrices of the same dimension, then so is A − A(A + B)⁻¹A.

Proof. We have

A - A {(A + B)}^{- 1} A = A {(A + B)}^{- 1} B = {B^{- 1} (A + B) A^{- 1}}^{- 1} = {(B^{- 1} + A^{- 1})}^{- 1} .

The conclusion follows from the fact that the sum and inverses of positive definite matrices of the same dimension are also positive definite.
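As a numerical sanity check of Lemma 1, the following sketch compares both sides of the identity on randomly generated positive definite matrices and inspects the smallest eigenvalue of the result. The helper names `random_spd` and `lemma1_sides`, the dimension and the seed are illustrative assumptions, not part of the original proof.

```python
import numpy as np

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix (illustrative helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def lemma1_sides(A, B):
    """Return both sides of A - A (A + B)^{-1} A = (A^{-1} + B^{-1})^{-1}."""
    lhs = A - A @ np.linalg.solve(A + B, A)
    rhs = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))
    return lhs, rhs

rng = np.random.default_rng(0)  # arbitrary seed
A, B = random_spd(4, rng), random_spd(4, rng)
lhs, rhs = lemma1_sides(A, B)

# The two expressions should agree up to rounding error.
identity_holds = np.allclose(lhs, rhs)
# Positive definiteness: the smallest eigenvalue of the symmetrised matrix is positive.
min_eigenvalue = np.linalg.eigvalsh((lhs + lhs.T) / 2).min()
print(identity_holds, min_eigenvalue > 0)
```

A check of this kind only illustrates the lemma for particular matrices; the algebraic argument above is what establishes it in general.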

From Lemma 1, applied with A = Σ⁻¹ and B = ( $Σ_{ξ}^{n}$ )⁻¹ so that Ω_η = (A + B)⁻¹, we have Z^TΣ⁻¹Z − Z^TΣ⁻¹Ω_ηΣ⁻¹Z ⩾ 0, so that

\begin{array}{rcl} p(ξ_{r}^{n}, a, β_{ξ}, β_{η}, σ^{2} \mid y, s) & ⩽ & C_{2} \exp\{- 0.5 (ξ_{r}^{n} - Ω_{η} Σ^{-1} Z)^{T} Ω_{η}^{-1} (ξ_{r}^{n} - Ω_{η} Σ^{-1} Z)\} \\ & & \times π(β_{ξ}) π(β_{η}) π(σ^{2}) . \end{array}
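The completing-the-square step and the sign of the discarded term can be checked numerically. The sketch below uses randomly generated stand-ins for Σ, $Σ_{ξ}^{n}$ , Z and $ξ_{r}^{n}$ ; all variable names, dimensions and the seed are assumptions made for illustration only.

```python
import numpy as np

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix (illustrative helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(1)  # arbitrary seed
n = 5
Sigma = random_spd(n, rng)      # stand-in for (Sigma_eta^n + sigma^2 I_n) / a^2
Sigma_xi = random_spd(n, rng)   # stand-in for Sigma_xi^n
Z = rng.standard_normal(n)      # stand-in for (y - X beta*) / a
xi = rng.standard_normal(n)     # stand-in for xi_r^n

Sigma_inv = np.linalg.inv(Sigma)
Omega = np.linalg.inv(np.linalg.inv(Sigma_xi) + Sigma_inv)
centre = Omega @ Sigma_inv @ Z

# Sum of the two quadratic forms before completing the square.
before = (Z - xi) @ Sigma_inv @ (Z - xi) + xi @ np.linalg.solve(Sigma_xi, xi)
# Term that does not involve xi; Lemma 1 implies it is nonnegative.
remainder = Z @ Sigma_inv @ Z - Z @ Sigma_inv @ Omega @ Sigma_inv @ Z
# Completed square plus remainder.
after = (xi - centre) @ np.linalg.solve(Omega, xi - centre) + remainder

print(abs(before - after), remainder)
```

Since the remainder is nonnegative, dropping its exponential factor can only increase the bound, which is the step taken in the display above.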

Integrating out $ξ_{r}^{n}$ first and then β_ξ and β_η,

p (a, σ^{2} | y, s) ⩽ C_{3} | {(Σ_{ξ}^{n})}^{- 1} + a^{2} {(Σ_{η}^{n} + σ^{2} I_{n})}^{- 1} |^{- (1 / 2)} .

Write A = $Σ_{ξ}^{n}$ and B = $Σ_{η}^{n}$ . Then

| A^{- 1} + a^{2} {(B + σ^{2} I)}^{- 1} | = \frac{| I + a^{2} A {(B + σ^{2} I)}^{- 1} |}{| A |} = \frac{| a^{2} A + σ^{2} I + B |}{| σ^{2} I + B | | A |} .
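The determinant identity can be verified on arbitrary inputs. In this sketch the matrices `A` and `B` and the scalars `a2` (for a²) and `s2` (for σ²) are randomly chosen or hypothetical values, used only to illustrate the algebra.

```python
import numpy as np

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix (illustrative helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(2)  # arbitrary seed
n = 4
A, B = random_spd(n, rng), random_spd(n, rng)
a2, s2 = 1.7, 0.4               # hypothetical values of a^2 and sigma^2
I = np.eye(n)

# Left-hand side: |A^{-1} + a^2 (B + sigma^2 I)^{-1}|
lhs = np.linalg.det(np.linalg.inv(A) + a2 * np.linalg.inv(B + s2 * I))
# Right-hand side: |a^2 A + sigma^2 I + B| / (|sigma^2 I + B| |A|)
rhs = np.linalg.det(a2 * A + s2 * I + B) / (np.linalg.det(s2 * I + B) * np.linalg.det(A))

identity_holds = bool(np.isclose(lhs, rhs))
print(identity_holds)
```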

Now we state a useful result from matrix algebra.

Proposition 1. If A and B are nonnegative definite matrices of the same dimension, then |A + B| ⩾ |A| + |B|, with strict inequality holding in the case of positive definite matrices.
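Proposition 1 can be illustrated by simulation. The sketch below draws random positive definite pairs of dimension 2 to 5 and records the gap |A + B| − (|A| + |B|); the helper names, the dimensions and the number of trials are assumptions chosen for illustration.

```python
import numpy as np

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix (illustrative helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def determinant_gap(A, B):
    """Return |A + B| - (|A| + |B|), which Proposition 1 says is nonnegative."""
    return np.linalg.det(A + B) - (np.linalg.det(A) + np.linalg.det(B))

rng = np.random.default_rng(3)  # arbitrary seed
gaps = [
    determinant_gap(random_spd(n, rng), random_spd(n, rng))
    for n in (2, 3, 4, 5)
    for _ in range(50)
]
smallest_gap = min(gaps)
print(smallest_gap > 0)
```

A positive smallest gap over the trials is consistent with the strict inequality for positive definite matrices, although a simulation of this kind is not a proof.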

Using Proposition 1, we get

\begin{array}{rcl} \left(\frac{|a^{2} A + σ^{2} I + B|}{|σ^{2} I + B|}\right)^{-1/2} & ⩽ & \left(\frac{|a^{2} A| + |σ^{2} I + B|}{|σ^{2} I + B|}\right)^{-1/2} \\ & = & \left\{1 + \frac{a^{2n} |A|}{\prod_{i=1}^{n} (σ^{2} + b_{i})}\right\}^{-1/2} \\ & ⩽ & \left\{1 + \frac{a^{2n} |A|}{(σ^{2} + b_{n})^{n}}\right\}^{-1/2}, \end{array}

where 0 < b₁ ⩽ b₂ ⩽ ⋯ ⩽ b_n are the eigenvalues of B. By Minkowski’s inequality, there is a constant c_n > 0, depending only on n, such that

\left\{1 + \frac{a^{2n} |A|}{(σ^{2} + b_{n})^{n}}\right\}^{-1/2} ⩽ \frac{(σ^{2} + b_{n})^{n/2}}{c_{n} (a^{2} |A|^{1/n} + σ^{2} + b_{n})^{n/2}} .
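Writing t = a²|A|^{1/n}/(σ² + b_n), the bound above reads (1 + tⁿ)^{-1/2} ⩽ (1 + t)^{-n/2}/c_n. The text does not give a value for c_n; the sketch below assumes c_n = 2^{-(n-1)/2}, one admissible choice that follows from convexity of x ↦ xⁿ, and checks the inequality on a grid of t values.

```python
def lhs(t, n):
    """Left-hand side (1 + t^n)^(-1/2), with t = a^2 |A|^(1/n) / (sigma^2 + b_n)."""
    return (1.0 + t ** n) ** -0.5

def rhs(t, n):
    """Right-hand side (1 + t)^(-n/2) / c_n, with the ASSUMED constant c_n = 2^(-(n-1)/2)."""
    c_n = 2.0 ** (-(n - 1) / 2.0)
    return 1.0 / (c_n * (1.0 + t) ** (n / 2.0))

# Grid of nonnegative t values from 0 to 100 (illustrative range).
grid = [k / 10.0 for k in range(0, 1001)]

# Allow a tiny relative tolerance because equality holds exactly at t = 1.
violations = [
    (t, n)
    for n in range(2, 7)
    for t in grid
    if lhs(t, n) > rhs(t, n) * (1.0 + 1e-12)
]
print(len(violations))
```

Any other positive constant for which the inequality holds would serve equally well, since c_n is absorbed into the constants in the next step.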

Set |A|^{1/n} = k₁ and b_n = k₂. We assume n ⩾ 2. Then, ignoring constants,

\begin{array}{rcl} \int_{-\infty}^{\infty} \frac{(σ^{2} + b_{n})^{n/2}}{(a^{2} |A|^{1/n} + σ^{2} + b_{n})^{n/2}} \, da & = & \int_{-\infty}^{\infty} \frac{1}{\{1 + a^{2} k_{1} / (σ^{2} + k_{2})\}^{n/2}} \, da \\ & ⩽ & \int_{-\infty}^{\infty} \frac{1}{1 + a^{2} k_{1} / (σ^{2} + k_{2})} \, da \\ & = & π \left(\frac{σ^{2} + k_{2}}{k_{1}}\right)^{1/2} . \end{array}
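The integral bound can be checked numerically. With c = k₁/(σ² + k₂), the substitution a = tan(θ)/√c turns the integral of (1 + ca²)^{-n/2} over the real line into (1/√c) times the integral of cos^{n−2}(θ) over (−π/2, π/2), which the sketch below evaluates with a midpoint rule. The function name and the values of σ², k₁ and k₂ are hypothetical.

```python
import math

def integral_over_a(n, c, num_points=20000):
    """Approximate the integral of (1 + c a^2)^(-n/2) over the real line, for n >= 2.

    Uses a = tan(theta) / sqrt(c), which maps the integrand to cos(theta)^(n - 2) / sqrt(c)
    on (-pi/2, pi/2), followed by a midpoint rule.
    """
    h = math.pi / num_points
    total = 0.0
    for k in range(num_points):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += math.cos(theta) ** (n - 2)
    return total * h / math.sqrt(c)

sigma2, k1, k2 = 0.7, 1.3, 2.1  # hypothetical values of sigma^2, |A|^(1/n) and b_n
c = k1 / (sigma2 + k2)
bound = math.pi * math.sqrt((sigma2 + k2) / k1)

# The case n = 2 attains the bound; larger n gives a smaller integral.
values = {n: integral_over_a(n, c) for n in (2, 3, 4, 10)}
print(bound, values)
```

The bound grows like σ as σ² increases, which is why finiteness of E_π(σ) is the condition needed in the final step.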

Now, since E_π (σ) < ∞,

\int_{0}^{\infty} {(\frac{σ^{2} + k_{2}}{k_{1}})}^{(1 / 2)} π (d σ^{2}) < \infty .

By Fubini’s Theorem, p(a | Y, s) is integrable.

References

Benes V, Bodlák K, Møller J, Waagepetersen RP. Bayesian analysis of log Gaussian Cox process models for disease mapping. In: The ISI Int Conf Environ Statist Health. Univ Santiago de Compostela; 2003. pp. 95–105.
Choi T. Alternative posterior consistency results in nonparametric binary regression using Gaussian process priors. J Statist Plan Infer. 2007;137:2975–83.
Choi T, Schervish M. On posterior consistency in nonparametric regression problems. J Mult Anal. 2007;98:1969–87.
Diggle P, Menezes R, Su T. Geostatistical inference under preferential sampling (with discussion). Appl Statist. 2010;59:191–232.
Higdon D. Space and space-time modelling using process convolutions. Quant Methods Curr Environ Issues. 2002:37–56.
Ho L, Stoyan D. Modelling marked point patterns by intensity-marked Cox processes. Statist Prob Lett. 2008;78:1194–9.
Lee H, Higdon D, Calder C, Holloman C. Efficient models for correlated data via convolutions of intrinsic processes. Statist Mod. 2005;5:53–74.
Møller J, Syversveen A, Waagepetersen R. Log Gaussian Cox processes. Scand J Statist. 2001;25:451–82.
Radcliffe SJ, Guo W, Ten Have T. Joint modelling of longitudinal and survival data via a common frailty. Biometrics. 2004;60:892–9. doi: 10.1111/j.0006-341X.2004.00244.x.
Rodrigues A, Diggle P. A class of convolution-based models for spatio-temporal processes with non-separable covariance structure. Scand J Statist. 2010;37:553–67.
Schwartz L. On Bayes procedures. Z. Wahrsch. Verw. Gebiete. 1965;4:10–26.
Stein ML. Interpolation of Spatial Data: Some Theory for Kriging. New York: Springer Series in Statistics; 1999.
van der Vaart A, van Zanten J. Adaptive Bayesian estimation using a Gaussian random field with inverse Gamma bandwidth. Ann Statist. 2009;37:2655–75.
Wu M, Follmann D. Use of summary measures to adjust for informative missingness in repeated measures data with random effects. Biometrics. 1999;55:75–84. doi: 10.1111/j.0006-341x.1999.00075.x.
Zhang H. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. J Am Statist Assoc. 2004;99:250–61.

