Author manuscript; available in PMC: 2009 Nov 20.
Published in final edited form as: Environmetrics. 2008 Sep 26;20(5):575–594. doi: 10.1002/env.957

An autoregressive point source model for spatial processes

Jacqueline M. Hughes-Oliver1,*, Tae-Young Heo2, Sujit K. Ghosh1
PMCID: PMC2779585  NIHMSID: NIHMS145489  PMID: 19936263

Abstract

We suggest a parametric modeling approach for nonstationary spatial processes driven by point sources. Baseline near-stationarity, which may be reasonable in the absence of a point source, is modeled using a conditional autoregressive (CAR) Markov random field. Variability due to the point source is captured by our proposed autoregressive point source (ARPS) model. Inference proceeds according to the Bayesian hierarchical paradigm, and is implemented using Markov chain Monte Carlo (MCMC) methods. The parametric approach allows a formal test of effectiveness of the point source. Application is made to a real dataset on electric potential measurements in a field containing a metal pole and the finding is that our approach captures the pole’s impact on small-scale variability of the electric potential process.

Keywords: Bayesian inference, correlation nonstationarity, heterogeneity, hierarchical model, random effect, variance nonstationarity

1. INTRODUCTION

Stochastic processes across spatial domains are often modeled by decomposing them into trend and stationary error processes. However, it has become increasingly clear that an assumption of stationarity of the error process is driven more by mathematical convenience than by reality, and more practitioners are choosing to replace this convenient assumption with the more realistic assumption of nonstationarity; see, for example, Sampson and Guttorp (1992), Haas (1995), Cressie and Majure (1997), Higdon et al. (1999), Fuentes (2002), and Wikle (2004). Nonstationary error models are difficult to develop and estimate because deviations from stationarity can occur in many ways. They require direct or indirect methods for describing variance and correlation as they change over the index set of the process, while guaranteeing that linear transformations of the process will have non-negative variances. Auxiliary information, if available, can ease the difficulty associated with modeling nonstationary error processes.

In this paper, we consider the case where a point source impacts the stochastic process of interest. Auxiliary information provided by a point source and the resulting impact on the covariance function of a process has received only minimal attention. In epidemiology, relatively recent activities have led to a large body of literature relating point sources to disease “hot spots.” The primary goal is to test whether or not clusters of disease cases in space and/or time are significant, after accounting for chance variations. In their nomenclature, a “focused” test is used to detect significant clustering around a point source exposure to a pollutant that is presumed to increase disease risk. These efforts typically model disease incidence using the Cox process based on inhomogeneous Poisson processes. A few key references are Diggle and Rowlingson (1994), Bithell (1995), Lawson (1995, 2001), Lawson and Waller (1996), and Diggle et al. (1997). Hughes-Oliver et al. (1998a, 1998b) and Hughes-Oliver and Gonzalez-Farias (1999) discuss the effects of point sources on a deposition process in semiconductor manufacturing and on electric potential measurements in a field.

We propose a hierarchical Bayesian approach to an extension of the process decomposition model suggested by Hughes-Oliver and Gonzalez-Farias (1999). There are three major benefits afforded by the Bayesian paradigm. First, it is able to account for uncertainty in parameter estimates when evaluating prediction uncertainty; frequentist approaches are highly criticized for their inability to deal with this issue. Second, Bayesian inference does not require derivation of asymptotic properties (as sample size increases) of estimators because results are based entirely on simulated observations from relevant posterior distributions. Third, the Bayesian paradigm allows the incorporation of prior information that may not be as easily incorporated in a frequentist approach. There are also benefits of the process decomposition model. This model provides a parametric method for decomposing the observed process into a trend surface, a baseline error process, and an additional error process (associated with the point source) that may be viewed as a “shock” to the baseline. Nonstationarity is easily modeled by this additional error process since no restrictions are placed on the forms of their covariance functions.

We also introduce a newly created autoregressive point source (ARPS) process capturing the effect of a point source on the error component of a process. This process is attractive for at least three reasons. First, it allows site-specific variances to be a function of proximity to source, where no restriction is placed on definition of distance; that is, any relevant distance metric may be used. Second, site-to-site correlations may be a function of proximity to source. Third, and possibly most attractive, this model is parametric and lends itself to testing the statistical significance of point source impacts on the error component of a process. We use the data from Hughes-Oliver and Gonzalez-Farias (1999) to illustrate our proposed method for modeling the effect of a point source.

Section 2 contains a description of the process decomposition approach. Section 3 introduces our approach, the so-called ARPS process, for modeling the effect of a point source on the covariance structure. The full, overall covariance structure is explored in Section 4. Section 5 contains modeling details for electric potential in a field containing a metal pole. Concluding remarks are presented in Section 6.

2. PROCESS DECOMPOSITION

Suppose {Y(s) : s ∈ D ⊂ R²} is the stochastic process of interest and D is the index set, such that Y(s) is the response at site s. That is, s = (x, y), where x may be longitude and y may be latitude. Covariates for this process are given as {X(s) : s ∈ D ⊂ R², X is q-dimensional}. These q covariates may be fixed or random and will often include distance to the point source located at P. Given covariates, the Y(·) process is decomposed into trend and error as

Y(s) = μ(s) + Z(s),   (1)

where μ(s) = f(X(s), β) (the trend) is the expectation of Y(s) conditioned on covariates X(s) at site s, and Z(s) (the error or detrended response) has zero expectation. The effect of the point source on trend or large-scale variation is modeled by judicious choice of the function f(·,·). This is commonly done using a linear or nonlinear function where the functional form of f(·,·) is completely known and only the vector β is unknown. Generalized additive models (GAMs; Hastie and Tibshirani, 1990) may also be used as a first-stage approximation of the trend assuming errors are independent and identically distributed; see Holland et al. (2000) for an application of GAMs to environmental data.

It is also possible, however, that the point source affects the covariance or small-scale variation; see Hughes-Oliver et al. (1998a, 1998b), and Hughes-Oliver and Gonzalez-Farias (1999) for discussions and datasets. This effect leads to covariances that are functions of the distances (and possibly angles) between the sites and source. The resulting nonstationarity of Y(s), where the covariance is site-dependent, is modeled using Z(s).

Suppose that Z(s) is the result of a baseline zero-mean process Z0(s) that has been “shocked” by the zero-mean point source process Z1(s) in such a way that

Z(s)=Z0(s)+Z1(s). (2)

Suppose further that processes Z0 and Z1 are independent and that Covi(·,·) is the covariance function for Zi, that is, Covi(s, t) = Cov(Zi(s), Zi(t)). The covariance function for Z is then Cov(s, t) = Cov0(s, t) + Cov1(s, t).

Any valid covariance may be assigned to the Cov_i, and this automatically guarantees that the covariance of Z is valid; thus linear transformations of the Z process will have non-negative variances. The choice of Cov_i should be motivated by the data in such a way that the baseline process is well represented and parameters of Cov_i measure the strength of the source’s influence. Specific features may be built into the Cov_i in a variety of ways. Suppose the effect of source P on error process Z is such that the detrended response is the same at all sites equidistant from P; note this does not imply that the effect of the point source on the response is necessarily constant at all sites equidistant from P, only that the errors are constant. Then the required two-dimensional covariance Cov_1(s, t) may be built from a one-dimensional covariance r1(d_s, d_t), where d_s ≡ ‖s − P‖ is the Euclidean distance between site s and the source. This is because {Z1(s) : s ∈ D ⊂ R²} may be written as a process {Z1*(d_s) : d_s ∈ R} on the real line; see Hughes-Oliver and Gonzalez-Farias (1999) for details. Another approach to building Cov_1(s, t) is to explore the local behavior of the process around the source by means of a conditional autoregressive (CAR) model based on special neighbor relationships, as in Heo and Hughes-Oliver (2004). In the next section, we propose a spatial autoregressive random effects model for Z1 based on a point source.
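To make this construction concrete, the following minimal sketch (not from the paper) builds a covariance matrix that depends on the sites only through their distances to the source. The exponential form chosen for r1 and the parameters sigma2 and phi are illustrative assumptions; any valid one-dimensional covariance could be substituted.

```python
import numpy as np

def cov1_from_source(sites, source, sigma2=1.0, phi=2.0):
    """Covariance whose (i, j) entry depends on sites i and j only
    through their Euclidean distances d_i, d_j to the point source.
    Hypothetical one-dimensional covariance: r1(d_s, d_t) =
    sigma2 * exp(-|d_s - d_t| / phi)."""
    d = np.linalg.norm(sites - source, axis=1)  # d_s = ||s - P||
    return sigma2 * np.exp(-np.abs(d[:, None] - d[None, :]) / phi)

sites = np.array([[0.0, 1.0], [1.0, 0.0], [3.0, 4.0]])
source = np.array([0.0, 0.0])
C = cov1_from_source(sites, source)
# The first two sites are equidistant from the source, so under this
# construction their errors are perfectly correlated, as the text describes.
print(np.round(C, 3))
```

Because r1 here is the covariance of a valid one-dimensional (Ornstein–Uhlenbeck-type) process, the resulting matrix is automatically positive semidefinite.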

3. AUTOREGRESSIVE POINT SOURCE MODELING

Building on the process decomposition model in Equations (1) and (2), we propose using an autoregressive random effects model to capture the effect of the point source on error process Z1. For this point source at P, we create a sequence of concentric regions R_k ⊂ D for k = 1, …, r, such that P ∈ R_r and the outer boundary of R_k is the inner boundary of R_{k−1} for k = 2, …, r, where R_k ∩ R_{k′} = ∅ for all k ≠ k′; see Figure 1 for an illustration of such concentric regions. Next, we assume that the (random) spatial effect due to point source P for the region R_k is captured by α_k. In other words, {R1, …, Rr} forms a partition of the index set D and the process Z1 is constant across each of these regions. It is important to realize that this assumption of constancy applies to the error process and not the response. In fact, the trend term of Equation (1) allows the mean response to vary within any particular region R_k. Finally, the random effects α1, …, αr arise as an AR(1) process. This assumption is based on the fact that the spatial effect due to region R_k depends on the two neighboring regions R_{k−1} and R_{k+1}. More specifically, the error process caused by the point source P may be written as

Z_1(s) = \sum_{k=1}^{r} \alpha_k I(s \in R_k),
\alpha_1 \sim N(0, \sigma_1^2), \quad \sigma_1^2 > 0,
\alpha_k \mid \alpha_{k-1} \sim N(\psi \alpha_{k-1}, \sigma_2^2), \quad \sigma_2^2 > 0, \quad k = 2, \ldots, r.   (3)

Figure 1. Illustration of regions for the autoregressive point source process.

In the ensuing discussion, we will refer to the model in Equation (3) as ARPS(ψ, σ1², σ2²), where ARPS stands for autoregressive point source.
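A minimal simulation sketch of Equation (3) may help fix ideas: the region effects α form an AR(1) chain, and every site inherits the effect of the region containing it. The parameter values, region labels, and seed below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_arps(region_index, psi, sigma1_sq, sigma2_sq):
    """Simulate Z1 under ARPS(psi, sigma1_sq, sigma2_sq).
    region_index[i] in {0, ..., r-1} labels the region R_k of site i."""
    r = region_index.max() + 1
    alpha = np.empty(r)
    alpha[0] = rng.normal(0.0, np.sqrt(sigma1_sq))        # alpha_1
    for k in range(1, r):                                 # AR(1) recursion
        alpha[k] = rng.normal(psi * alpha[k - 1], np.sqrt(sigma2_sq))
    return alpha[region_index]   # Z1 is constant within each region

regions = np.array([0, 0, 1, 1, 2, 2])   # two sites in each of r = 3 regions
z1 = simulate_arps(regions, psi=1.3, sigma1_sq=0.001, sigma2_sq=0.0006)
# Sites sharing a region share exactly the same realized random effect.
```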

To understand the impact of Equation (3) on the covariance Cov_1(·,·), let Z1 = [Z1(s1), …, Z1(sn)]′ be the vector of responses from the point source process over all n sampling sites. Then Σ1 ≡ Var(Z1) has (i, j)th element

\sum_{k=1}^{r} \sum_{m=1}^{r} I(s_i \in R_k)\, I(s_j \in R_m)\, \mathrm{Cov}(\alpha_k, \alpha_m),

where, assuming m ≥ k,

\mathrm{Cov}(\alpha_k, \alpha_m) = \psi^{m-k} \left( \sigma_1^2\, \psi^{2(k-1)} + \sigma_2^2 \sum_{q=0}^{k-2} \psi^{2q} \right).   (4)

Suppose Z̃1 contains the reordered entries of Z1 such that Z̃1 = [Z′_{11}, Z′_{12}, …, Z′_{1r}]′, where Z_{1k} corresponds to the vector for all n_k sites in region R_k. Then, the variance–covariance matrix for Z̃1 is the n × n matrix

\tilde{\Sigma}_1 = \mathrm{Var}(\tilde{Z}_1) =
\begin{bmatrix}
\delta_{1,1} J_{n_1 \times n_1} & \delta_{1,2} J_{n_1 \times n_2} & \delta_{1,3} J_{n_1 \times n_3} & \cdots & \delta_{1,r} J_{n_1 \times n_r} \\
\delta_{1,2} J_{n_2 \times n_1} & \delta_{2,2} J_{n_2 \times n_2} & \delta_{2,3} J_{n_2 \times n_3} & \cdots & \delta_{2,r} J_{n_2 \times n_r} \\
\delta_{1,3} J_{n_3 \times n_1} & \delta_{2,3} J_{n_3 \times n_2} & \delta_{3,3} J_{n_3 \times n_3} & \cdots & \delta_{3,r} J_{n_3 \times n_r} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\delta_{1,r} J_{n_r \times n_1} & \delta_{2,r} J_{n_r \times n_2} & \delta_{3,r} J_{n_r \times n_3} & \cdots & \delta_{r,r} J_{n_r \times n_r}
\end{bmatrix}
=
\begin{bmatrix}
\delta_{1,1} J_{n_1 \times n_1} & \psi \delta_{1,1} J_{n_1 \times n_2} & \psi^2 \delta_{1,1} J_{n_1 \times n_3} & \cdots & \psi^{r-1} \delta_{1,1} J_{n_1 \times n_r} \\
\psi \delta_{1,1} J_{n_2 \times n_1} & \delta_{2,2} J_{n_2 \times n_2} & \psi \delta_{2,2} J_{n_2 \times n_3} & \cdots & \psi^{r-2} \delta_{2,2} J_{n_2 \times n_r} \\
\psi^2 \delta_{1,1} J_{n_3 \times n_1} & \psi \delta_{2,2} J_{n_3 \times n_2} & \delta_{3,3} J_{n_3 \times n_3} & \cdots & \psi^{r-3} \delta_{3,3} J_{n_3 \times n_r} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\psi^{r-1} \delta_{1,1} J_{n_r \times n_1} & \psi^{r-2} \delta_{2,2} J_{n_r \times n_2} & \psi^{r-3} \delta_{3,3} J_{n_r \times n_3} & \cdots & \delta_{r,r} J_{n_r \times n_r}
\end{bmatrix},

where J is a matrix with all elements equal to 1, δ_{1,1} = σ1², δ_{k,k} = ψ²δ_{k−1,k−1} + σ2² for k = 2, 3, …, r, and δ_{k,m} = ψδ_{k,m−1} for k = 1, …, r − 1 and m = k + 1, …, r. If ψ = ±1, the variance–covariance matrix Var(Z̃1) takes a simpler form, with δ_{k,m} = σ1² + (k − 1)σ2² if ψ = 1 and δ_{k,m} = (−1)^{m−k}[σ1² + (k − 1)σ2²] if ψ = −1, for k ≤ m. Note that Var(Z̃1) is always singular when r < n. Also note that there is relatively little information available for estimating σ1², so in a Bayesian framework one needs to either specify a very informative prior for σ1² or impose a constraint such as a relationship between σ1² and one or more of the other parameters; we develop this idea more fully in Section 4.
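The block structure above can be assembled numerically. The sketch below (with illustrative parameter values, not from the paper) builds Σ̃1 from the δ recursions and confirms the singularity noted in the text when r < n.

```python
import numpy as np

def arps_cov(n_per_region, psi, sigma1_sq, sigma2_sq):
    """Var(Z1~) for ARPS: delta_{1,1} = sigma1_sq,
    delta_{k,k} = psi^2 delta_{k-1,k-1} + sigma2_sq,
    delta_{k,m} = psi * delta_{k,m-1} for m > k, expanded into
    delta_{k,m} J_{n_k x n_m} blocks over the regions."""
    r = len(n_per_region)
    delta = np.zeros((r, r))
    delta[0, 0] = sigma1_sq
    for k in range(1, r):
        delta[k, k] = psi**2 * delta[k - 1, k - 1] + sigma2_sq
    for k in range(r):
        for m in range(k + 1, r):
            delta[k, m] = psi * delta[k, m - 1]
            delta[m, k] = delta[k, m]          # symmetry
    reps = np.repeat(np.arange(r), n_per_region)
    return delta[np.ix_(reps, reps)]           # expand J blocks

S = arps_cov([2, 2, 2], psi=0.5, sigma1_sq=1.0, sigma2_sq=0.25)
# r = 3 regions but n = 6 sites, so rank(Sigma_1~) = r < n: singular.
print(np.linalg.matrix_rank(S))   # -> 3
```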

Alternative expressions for the ARPS model allow it to benefit from insights commonly gained for mixed-effects models. Specifically, we can write Z1 as

Z_1 = \Delta \alpha,

where α = (α1, …, αr)′, Δ = ((δ_{ik})), and δ_{ik} = I(s_i ∈ R_k) for i = 1, …, n, k = 1, …, r. Consequently,

\Sigma_1 = \Delta C \Delta',

where C = ((c_{km})) and c_{km} = Cov(α_k, α_m) is as specified in Equation (4), for k, m = 1, …, r. Because rank(Σ1) = rank(C) = r, it is clear that Σ1 is singular when r < n.
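This mixed-model factorization is easy to verify numerically. Below is a sketch (illustrative region labels and parameter values, not from the paper) that forms the incidence matrix Δ, fills C from Equation (4), and checks the rank identity.

```python
import numpy as np

regions = np.array([0, 0, 1, 2, 2])      # region label of each of n = 5 sites
n, r = len(regions), 3

# Incidence matrix: delta_ik = I(s_i in R_k)
Delta = np.zeros((n, r))
Delta[np.arange(n), regions] = 1.0

# C = Var(alpha) from Equation (4): Cov(alpha_k, alpha_m) =
# psi^{|m-k|} * Var(alpha_{min(k,m)}) under the AR(1) recursion.
psi, s1, s2 = 0.5, 1.0, 0.25
var = [s1]
for k in range(1, r):
    var.append(psi**2 * var[-1] + s2)
C = np.array([[psi**abs(m - k) * var[min(k, m)] for m in range(r)]
              for k in range(r)])

Sigma1 = Delta @ C @ Delta.T
# rank(Sigma_1) = rank(C) = r, so Sigma_1 is singular whenever r < n.
print(np.linalg.matrix_rank(Sigma1), np.linalg.matrix_rank(C))  # -> 3 3
```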

The ARPS model in Equation (3) allows for two types of nonstationarity that we loosely call variance nonstationarity and correlation nonstationarity. Variance nonstationarity is nonconstant variance, and correlation nonstationarity is correlation that is not a function only of site-to-site distance. If σ1² > σ1²ψ² + σ2², then variance increases as sites are further from the source. On the other hand, variance is constant when σ1² = σ1²ψ² + σ2² and decreases when σ1² < σ1²ψ² + σ2². If ψ ≠ 0, then correlation is a function of both site-to-site and site-to-source distances, no matter what relationships exist between σ1², σ2², and ψ. Consequently, we test the impact of point source P on small-scale variability by testing the null hypothesis that

ψ = 0 and σ1² = σ2²

against an appropriate alternative; under these conditions, the process exhibits neither variance nonstationarity nor correlation nonstationarity. Using slightly different notation, let

\eta = \frac{\sigma_1^2}{\sigma_1^2 \psi^2 + \sigma_2^2}.

Then,

  • η > 1 and ψ = 0 implies there is no correlation and the point source causes a single increase in variance from R2 to R1 (variance is the same in R2, …, Rr);

  • η > 1 and ψ ≠ 0 implies variance increases as sites are further from source P and correlation is a function of site-to-source distances;

  • η = 1 and ψ = 0 implies that variance is constant and there is no correlation, that is, the point source does not affect small-scale variability of the process Y;

  • η = 1 and ψ ≠ 0 implies that variance is constant and correlation is a function of site-to-source distances;

  • η < 1 and ψ = 0 implies there is no correlation and the point source causes a single decrease in variance from R2 to R1 (variance is the same in R2,…, Rr); and

  • η < 1 and ψ ≠ 0 implies variance decreases as sites are further from source P and correlation is a function of site-to-source distances.

We can thus test the impact of point source P on small-scale variability by testing the null hypothesis that ψ = 0 and η = 1 against any of the above bulleted alternatives.

4. RESTRICTIONS ON THE PARAMETER SPACE

Process decomposition, in particular Equation (2), separates the overall error process into two component processes, one representing the baseline (when no source acts on the system) and the other representing the point source. Point source process Z1 was discussed in Section 3 with a comment made that restrictions are necessary for one of its parameters. This need is discussed here.

The baseline process Z0 is assumed to include a measurement error (ME) component and a residual component. In other words, using Σ0 to represent Var(Z0), where Z0 = [Z0(s1), …, Z0(sn)]′, we get

\Sigma_0 = \Sigma_{00} + \Sigma_{01} = \sigma_e^2 I_n + \Sigma_{01},

and consequently

\Sigma = \sigma_e^2 I_n + \Sigma_{01} + \Sigma_1

is the overall covariance matrix of the Y process based on Equations (1) and (2).

Let Z̃0 = [Z′_{01}, Z′_{02}, …, Z′_{0r}]′ follow the order of Z̃1 as described in Section 3. If the Z0 process is i.i.d. (making Σ01 the zero matrix) and Z1 is ARPS(1, σ1², σ2²), then

\Sigma = \sigma_e^2 I_n + \tilde{\Sigma}_1 = \sigma_e^2 I_n +
\begin{bmatrix}
\sigma_1^2 J_{n_1 \times n_1} & \sigma_1^2 J_{n_1 \times n_2} & \sigma_1^2 J_{n_1 \times n_3} & \cdots & \sigma_1^2 J_{n_1 \times n_r} \\
\sigma_1^2 J_{n_2 \times n_1} & (\sigma_1^2 + \sigma_2^2) J_{n_2 \times n_2} & (\sigma_1^2 + \sigma_2^2) J_{n_2 \times n_3} & \cdots & (\sigma_1^2 + \sigma_2^2) J_{n_2 \times n_r} \\
\sigma_1^2 J_{n_3 \times n_1} & (\sigma_1^2 + \sigma_2^2) J_{n_3 \times n_2} & (\sigma_1^2 + 2\sigma_2^2) J_{n_3 \times n_3} & \cdots & (\sigma_1^2 + 2\sigma_2^2) J_{n_3 \times n_r} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma_1^2 J_{n_r \times n_1} & (\sigma_1^2 + \sigma_2^2) J_{n_r \times n_2} & (\sigma_1^2 + 2\sigma_2^2) J_{n_r \times n_3} & \cdots & [\sigma_1^2 + (r-1)\sigma_2^2] J_{n_r \times n_r}
\end{bmatrix}.

In this case, the scale parameter σe² always appears with scale parameter σ1² as σe² + σ1². To improve identifiability, we can place a restriction on σe², σ1², or both. A commonly used restriction in time series analysis is one that guarantees stationarity of the one-dimensional autoregressive process, namely σ1² = σ2²/(1 − ψ²). This restriction would force η = 1 and consequently not allow variances of the Z1 process to differ based on distance from the source. Finding this unacceptable, we settled on the alternative restriction σ1² = σe². This is quite reasonable given that σe² represents measurement variability of the baseline process Z0. Region R1, whose variance is σ1², is farthest from the source and is thus least affected by that source. Depending on the strength of impact of the source and the relative size of study region D, we argue that R1 should exhibit features very similar to the baseline process Z0, thus supporting the restriction σ1² = σe². We use this restriction throughout.

5. MODELING ELECTRIC POTENTIAL IN A FIELD CONTAINING A METAL POLE

5.1. Dataset and candidate models

To illustrate some details of our approach, we now describe a real dataset from Hughes-Oliver and Gonzalez-Farias (1999) in which a point source acts as a catalyst for the response. Quoting these authors (pp. 63–64), “measurements [of electric potential] are taken at sites falling on a regular grid, as shown in [Figure 2], where the sites are one meter apart in both the vertical and horizontal directions. …[electric potential] is expected to be fairly constant across the field, but an existing metal pole affects the measuring device so that the constant pattern in the field is not observable. It is in this sense that we consider the metal pole to be a point source. … [electric potential] appears to be a function only of distance to the point source, and because the contours are approximately circular, there is no apparent need for rotating or rescaling the axes.”

Figure 2. Contours of electric potential measurements, g, in a field containing a metal pole. The measurement sites (●) fall on a regular grid, with spacings of one meter in both directions. The metal pole is located at (12, 33.4) in the original coordinate system, and at (0, 0) in the translated coordinate system. The contours represent the following sample percentiles: 0.7, 1, 2.5, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90.

Let Y(s) denote the following transformed value of the electric potential g(s) at site s:

Y(s) = \frac{46300 - g(s)}{4630}.

This simple rescaling moderates extreme behavior in our Markov chain Monte Carlo (MCMC) techniques discussed below. Another view of the data is given in Figure 3, where both electric potential and transformed electric potential are plotted as a function of distance from the point source. This figure clearly suggests that the mean and variance of Y(s) are decreasing functions of distance from the point source. Using d(s) to represent Euclidean distance between s and source P, we use Equation (1) with

μ(s)=β0+β1d(s)

to describe the trend surface and

Z(s)=Z00(s)+Z01(s)+Z1(s)

to describe the error process as in Equation (2) for Z0(s) = Z00(s) + Z01(s). In this error process, Z00 is an ME process and Z01 captures the effect of spatial proximity in the baseline process; Z00 and Z01 together represent the baseline process Z0 discussed in Sections 2 and 4. The process Z1 captures the effect of the point source, based on r = 10 concentric regions from the metal pole. ARPS regions R1, …, R10 are indicated in Figure 3, along with boxplots of the transformed electric potential y within these regions. The processes Z00, Z01, and Z1 all have mean zero and are assumed independent, conditioned on the mean function μ(·).

Figure 3. Electric potential (plotted as bullets) as a function of distance between measurement sites and the metal pole. Locations of ARPS regions R1, …, R10 and boxplots of transformed electric potential within these regions are also shown.

We begin by assuming normality of the conditional Y process. Normality can easily be replaced by a fat-tailed distribution like the Student’s t, but this would require an adjustment in interpretation of the variance–covariance matrix by incorporating a scale factor; we do not pursue this here. The Z00 process is assumed to follow a normal distribution with constant variance σe²; the Z01 process is assumed to follow a normal CAR model; and the Z1 process follows an ARPS model. More specifically, suppose D is an index set containing sites s1, …, sn on the regular grid of Figure 2, where n = 160. Then, the Bayesian hierarchical model may be written as follows:

\begin{aligned}
Y(s_j) &\sim \mathrm{Normal}(Y^*(s_j), \sigma_e^2), \quad j = 1, \ldots, n, \\
Y^*(s_j) &= \beta_0 + \beta_1 d(s_j) + Z_{01}(s_j) + Z_1(s_j), \\
Z_{01}(\cdot) &\sim \mathrm{CAR}(\rho, \sigma_c^2), \\
Z_1(\cdot) &\sim \mathrm{ARPS}(\psi, \sigma_1^2 = \sigma_e^2, \sigma_2^2).
\end{aligned}   (5)

The ME variance is σe². While the CAR(1, σc²) process is more popular, we prefer the more general CAR(ρ, σc²) process (Sun et al., 2000) for several reasons. First, because −1 ≤ ρ ≤ 1, the former is actually a special case of the latter. Second, the value of ρ indicates the strength of spatial correlation. And third, when −1 < ρ < 1, CAR(ρ, σc²) yields a positive definite covariance structure for Z01, unlike the semidefinite covariance offered by CAR(1, σc²). The CAR(ρ, σc²) process specifies conditional distributions as

Z_{01}(s_j) \mid \{Z_{01}(s_k), k \neq j\} \sim \mathrm{Normal}\left( \rho \sum_{k=1}^{n} b_{jk} Z_{01}(s_k), \; \varepsilon_j^2 \right),

where b_{jj} = 0, b_{jk} = (1/N_j) I(locations s_j and s_k are neighbors), N_j is the number of neighbors of s_j, ε_j² = σc²/N_j, and σc² is a scale parameter measuring smoothness. The overall variance of the Z01 process is obtained from Σ01⁻¹ = M⁻¹(I − ρB), where B = ((b_{jk}))_{j,k=1}^{n} and M is a diagonal matrix with elements ε1², …, εn². Carlin and Banerjee (2002) argue that negative smoothness parameters are not desirable, so they use 0 < ρ < 1.
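The precision matrix M⁻¹(I − ρB) is straightforward to construct for a regular grid with rook neighbors. The sketch below is illustrative (a small 4 × 4 grid and round-number parameters, not the paper's 160-site grid or posterior values); it also checks that the precision is positive definite for |ρ| < 1, i.e., that the prior is proper.

```python
import numpy as np

def car_precision(nx, ny, rho, sigma_c_sq):
    """Sigma_01^{-1} = M^{-1}(I - rho*B) for an nx-by-ny grid with
    rook (up/down/left/right) neighbors: b_jk = (1/N_j) I(j ~ k),
    b_jj = 0, and M = diag(sigma_c_sq / N_j)."""
    n = nx * ny
    adj = np.zeros((n, n))
    for j in range(n):
        x, y = j % nx, j // nx
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= x + dx < nx and 0 <= y + dy < ny:
                adj[j, (y + dy) * nx + (x + dx)] = 1.0
    N = adj.sum(axis=1)              # N_j: number of neighbors of site j
    B = adj / N[:, None]             # b_jk = (1/N_j) I(j ~ k)
    M_inv = np.diag(N / sigma_c_sq)  # inverse of M = diag(sigma_c_sq/N_j)
    return M_inv @ (np.eye(n) - rho * B)

Q = car_precision(4, 4, rho=0.85, sigma_c_sq=0.002)
# For |rho| < 1 this precision is positive definite (proper prior),
# unlike the intrinsic CAR(1, sigma_c^2) case.
print(np.all(np.linalg.eigvalsh(Q) > 0))   # -> True
```

Note that M⁻¹(I − ρB) simplifies to (1/σc²)(diag(N) − ρ·A) with A the 0/1 adjacency matrix, which makes the symmetry of the precision explicit.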

We use proper priors for all hyperparameters, θ = (β0, β1, σe², ρ, σc², ψ, σ2²), as outlined below:

\begin{aligned}
\beta_0 &\sim \mathrm{Normal}(0, 10\,000) \\
\beta_1 &\sim \mathrm{Normal}(0, 10\,000) \\
\sigma_e^2 &\sim \mathrm{InverseGamma}(0.001, 0.001) \\
\rho &\sim \mathrm{Uniform}(0, 1) \\
\sigma_c^2 &\sim \mathrm{InverseGamma}(0.01, 0.01) \\
\psi &\sim \mathrm{Uniform}(-0.1, 2) \\
\sigma_2^2 &\sim \mathrm{InverseGamma}(0.001, 0.001).
\end{aligned}

The six modeling scenarios considered here are summarized in Table 1. Model 1 recognizes no spatial patterns in small-scale variability, neither in the baseline process nor due to the point source; Model 1 uses ordinary least squares estimation. Models 2, 4, and 6 all include point source covariance models. For Model 2, no spatial proximity modeling is used for the baseline process. Models 3 and 4 both fit the singular CAR model, while Models 5 and 6 fit the more general CAR model.

Table 1.

Modeling cases for electric potential in a field

Model                            CAR(1, τ)    CAR(ρ, τ)    ARPS(ψ, σ1², σ2²)
(1) trend + ME                   —            —            —
(2) trend + ME + ARPS            —            —            x
(3) trend + ME + CAR1            x            —            —
(4) trend + ME + CAR1 + ARPS     x            —            x
(5) trend + ME + CARρ            —            x            —
(6) trend + ME + CARρ + ARPS     —            x            x

Posterior distributions are obtained using MCMC in the software WinBUGS Version 1.3 (Spiegelhalter et al., 2000). This version of WinBUGS provides a function for CAR(ρ, σc²), but only for deterministic ρ. We hard-coded the cases where ρ was allowed to be random, namely, Models 5 and 6. WinBUGS offers many useful convergence measures that can be conveniently summarized using the CODA module (Best et al., 1996) running under R or S-Plus.

Using this Bayesian hierarchical approach, we implement a Gibbs sampler method to sample from the full conditional distributions of all these parameters. While all hyperpriors are proper, Models 2, 4, and 6 contain the singular ARPS and Models 3 and 4 contain the singular CAR(1, σc²). Presence of the ME component, however, ensures that the overall covariance of Y, given by Σ, is a positive definite matrix and hence all posterior distributions are proper. Concerns raised by Sun et al. (1999, 2000, 2001) are not applicable here. In Table 2, we report summaries of the posterior densities for each of the six models, and in Table 3 we report model diagnostics. Results are based on three chains of 70 000 iterations each after a burn-in period of 30 000 iterations.

Table 2.

Summaries of posterior densities from modeling electric potential

Parameter   Mean        MC error    2.5%        Median      97.5%

Model 1
β0         −0.0400     1.397E-5    −0.0477     −0.0400     −0.0324
β1          0.8059     3.904E-5     0.7848      0.8059      0.8271
σe²         0.001326   3.143E-7     0.001062    0.001315    0.001653

Model 2
β0         −0.0211     2.796E-5    −0.0299     −0.0211     −0.0123
β1          0.7281     1.036E-4     0.6985      0.7281      0.7579
σe²         0.001011   3.097E-7     0.000806    0.001003    0.001266
ψ           1.330      3.213E-3     0.5619      1.3430      1.9280
η           0.4737     1.955E-3     0.2265      0.4259      1.0580
σ2²         0.000586   1.710E-6     0.000179    0.000472    0.001671

Model 3
β0         −0.0224     4.356E-5    −0.0312     −0.0224     −0.0137
β1          0.7339     1.712E-4     0.7012      0.7340      0.7660
σe²         0.000599   1.211E-6     0.000330    0.000593    0.000906
σc²         0.001775   3.568E-6     0.001051    0.001725    0.002776

Model 4
β0         −0.0191     4.190E-5    −0.0282     −0.0191     −0.0100
β1          0.7201     1.637E-4     0.6866      0.7202      0.7534
σe²         0.000665   1.173E-6     0.000392    0.000661    0.000959
σc²         0.001454   3.927E-6     0.000815    0.001398    0.002420
ψ           1.1470     4.478E-3     0.2223      1.182       1.866
η           0.5262     3.107E-3     0.2037      0.4479      1.384
σ2²         0.000588   1.498E-6     0.000180    0.000476    0.001671

Model 5
β0         −0.0292     1.067E-4    −0.0433     −0.0295     −0.0127
β1          0.7633     2.263E-4     0.7267      0.7632      0.8001
σe²         0.000596   1.779E-6     0.000279    0.000588    0.000962
ρ           0.8527     8.813E-4     0.5397      0.8817      0.9934
σc²         0.002221   6.378E-6     0.001143    0.002147    0.003705

Model 6
β0         −0.0204     5.191E-5    −0.0307     −0.0204     −0.0098
β1          0.7252     1.719E-4     0.6928      0.7253      0.7574
σe²         0.000554   1.527E-6     0.000265    0.000549    0.000878
ρ           0.4643     1.339E-3     0.0330      0.4630      0.9254
σc²         0.002090   5.583E-6     0.001116    0.002031    0.003378
ψ           1.3320     2.858E-3     0.5457      1.3470      1.9250
η           0.3801     1.299E-3     0.1706      0.3494      0.7814
σ2²         0.000592   1.645E-6     0.000180    0.000477    0.001690

Table 3.

Diagnostic measures from modeling electric potential

Model                           DIC (Rank)   pD    pθ    D(θ̄)   D̄     SSE (Rank)
(1) trend + ME                  −606 (6)      3     3    −612   −609   0.208 (6)
(2) trend + ME + ARPS           −642 (5)     10    15    −662   −652   0.158 (5)
(3) trend + ME + CAR1           −660 (2)     82   164    −824   −742   0.093 (3)
(4) trend + ME + CAR1 + ARPS    −648 (4)     74   176    −796   −722   0.103 (4)
(5) trend + ME + CARρ           −656 (3)     89   165    −834   −745   0.092 (2)
(6) trend + ME + CARρ + ARPS    −664 (1)     90   177    −845   −755   0.086 (1)

Several checks were made on convergence of the MCMC, including running three separate chains using different starting values and observing Gelman and Rubin’s (1992) diagnostic measure R; all values were very close to unity, indicating good mixing of the chains. Trace and posterior density plots for all three chains obtained for Model 6 (the most complicated model considered) are given in Figure 4. Additional details on the effect of multiple starting values are provided in Figure 5 for ψ of Model 6, where ψ is the parameter of greatest interest in capturing effect of the point source on the error process. Recall from Section 3 that if ψ ≠ 0 then correlation is a function of both site-to-site and site-to-source distances. There is strong evidence of convergence.

Figure 4. MCMC trace and posterior density plots for Model 6 using three chains.

Figure 5. MCMC results for ψ of Model 6. Panel (a) is the autocorrelation function of ψ for each chain. Panel (b) is the posterior histogram of ψ for each chain. Panel (c) shows side-by-side boxplots of posterior values for ψ from each chain.

Recent developments in Bayesian modeling of random effects suggest that results can be very sensitive to the priors on the associated variances (Gelman, 2004; Spiegelhalter et al., 2004, chapter 5). To investigate the effect of priors on Model 6, we considered two additional priors for σc² and σ2², leading to nine combinations of priors for these two parameters based on individual priors of uniform(0, .1) for σc or σ2 (most noninformative), inverse gamma(0.001, 0.001) for σc² or σ2², and inverse gamma(0.01, 0.01) for σc² or σ2² (most informative). Figure 6 shows histograms of posterior simulations of σ2 from the nine combinations of priors. Inference on σ2 clearly differs according to the prior specification for σ2 (but not for σc), with the most informative prior leading to larger values for σ2, but none of the inverse gamma priors considered here constrain posterior inference to the degree presented in Gelman et al. (2003, Appendix C). The parameter σ2 is not of direct interest when determining the impact of ARPS, however, so we once again focus attention on ψ.

Figure 6. Histograms of posterior simulations of σ2 from models with nine different prior distributions: σc distributed as uniform(0, .1) (top row), σc² distributed as inverse gamma(.001, .001) (middle row), σc² distributed as inverse gamma(.01, .01) (bottom row); σ2 distributed as uniform(0, .1) (left column), σ2² distributed as inverse gamma(.001, .001) (middle column), σ2² distributed as inverse gamma(.01, .01) (right column).

Figure 7 shows posterior densities of ψ for each of the nine combinations considered. Given a prior for σ2² (σ2), selection of the prior for σc² (σc) has little effect on the posterior distribution of ψ. On the other hand, the choice of prior for σ2² (σ2) has a big effect on the model. Vague or noninformative priors allow the data to dominate, and this leads to a stronger effect, or large values, of the point source parameter ψ. As expected, very informative priors that place little prior probability on a strong ARPS process lead to small values of ψ. It is important to note, however, that the 95% credible sets for ψ never include 0, which suggests that the ARPS process adds an important component to the model for these data.

Figure 7. Posterior densities of ψ obtained from nine different combinations of priors for σc² (σc) and σ2² (σ2) in Model 6. Label (a) indicates all three densities corresponding to σ2 having a uniform(0, .1) prior, the most non-informative prior considered. Likewise, labels (b) and (c) indicate all three densities corresponding to σ2² having inverse gamma(.001, .001) or inverse gamma(.01, .01) priors, respectively. Vertical reference lines indicate 2.5th posterior percentiles for ψ.

For all models, the 2.5th posterior percentile of β1 exceeds 0.68, suggesting that electric potential increases at sites further from the pole. Posterior regions for β0, β1, and σe² overlap in significant ways across all six models. The posterior mean of σe² under Models 1 and 2 is almost double the values under the other models, suggesting that the residuals of Models 1 and 2 contain spatial variation that is later captured by the CAR process. The fitted ARPS process is most impressive when the CAR process is not modeled, as expected. In the presence of the strong spatial structure created by a degenerate CAR(1, σc²), ARPS parameters are closer to their null values, with posterior means of 1.147 for ψ (null value is 0) and 0.5262 for η (null value is 1) in Model 4. When the strength of spatial dependence is estimated using CAR(ρ, σc²) with random ρ, as in Model 6, posterior means are 1.332 for ψ, 0.3801 for η, and 0.4643 for ρ.

5.2. Model comparison

Many techniques have been suggested for choosing among competing models. Akaike’s Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors are among the most often used, but these methods fail in our models because of the random effects and some improper priors (Spiegelhalter et al., 2002). Bayes factors are not interpretable with a CAR(1, σc²) prior for spatial effects (Han and Carlin, 2001). This CAR(1, σc²) prior creates n additional parameters that cannot be counted as n free parameters, thus making AIC and BIC inapplicable. For our model comparisons, we use the Deviance Information Criterion (DIC) proposed by Spiegelhalter et al. (2002). The sum of squared errors (SSE) is used as a secondary model comparison criterion. SSE due to prediction is computed as

\mathrm{SSE} = E\left\{ \sum_{j=1}^{n} [Y(s_j) - \hat{Y}(s_j)]^2 \,\Big|\, Y(\cdot) \right\},

where the expectation is taken with respect to the posterior predictive distribution of Ŷ(·). See Gelfand and Ghosh (1998) for a decision-theoretic justification of this quantity as a model-choice criterion.
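Given posterior-predictive draws from an MCMC run, this criterion amounts to averaging the squared prediction error over draws. The sketch below uses mock draws in place of real MCMC output; the sample size, noise level, and seed are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0.0, 1.0, size=20)                 # observed responses
# yhat_draws[t, j]: t-th posterior-predictive draw of Yhat at site j
# (mock draws standing in for replicates generated within an MCMC run)
yhat_draws = y + rng.normal(0.0, 0.3, size=(1000, 20))

sse_draws = ((y - yhat_draws) ** 2).sum(axis=1)   # SSE for each draw
SSE = sse_draws.mean()        # posterior (predictive) mean of SSE
```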

DIC, like AIC and BIC, balances model adequacy against model complexity. Model adequacy is measured by the posterior mean D̄ = Eθ|y{D(θ)} of the deviance D(θ), where D(θ) = −2 log f(y|θ) + 2 log h(y) and h(y) is unaffected by the model. Model complexity is measured by the effective number of parameters, pD = D̄ − D(θ̄), where D(θ̄) is the deviance evaluated at the posterior mean, θ̄, of the parameters. Consequently, DIC = D̄ + pD, and small values are desirable. DIC is effective in high-dimensional or very complicated models because of the way the effective number of parameters is determined.
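Given MCMC output, D̄ and pD can be assembled directly from deviance evaluations. A minimal sketch (names are illustrative, and the deviance function itself is model-specific):

```python
import numpy as np

def dic(deviance_draws, deviance_at_post_mean):
    """Deviance Information Criterion from MCMC output.

    deviance_draws        : D(theta^(m)) at each posterior draw m
    deviance_at_post_mean : D(theta_bar), deviance at the posterior mean

    Returns (DIC, pD), where pD = Dbar - D(theta_bar) and DIC = Dbar + pD.
    """
    d_bar = float(np.mean(deviance_draws))
    p_d = d_bar - deviance_at_post_mean
    return d_bar + p_d, p_d
```

Smaller DIC indicates a better trade-off of fit against complexity.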

In Table 3, we report DIC and its components for each of the six models. Multiple runs yield differences of at most 1.9 in DIC. We also report pθ, the total number of parameters, and the posterior mean of SSE for all six models. Models 1 and 2 are clearly inadequate. Model 4 is better than Models 1 and 2 but not as good as Models 3, 5, and 6. Model 6 is the best model by all measures.

Model 6 has many interesting features. While the CAR process enforces unequal variances only to address edge effects caused by differing numbers of neighbors, the ARPS process explicitly fits unequal variances as suggested by the data in Figure 3. Figure 8(a) shows fitted variances (obtained as the diagonal elements of Σ evaluated at the posterior mean, θ̄, of the parameters) for Models 3, 5, and 6. The curve for Model 6 is much more variable than the curves for Models 3 and 5. Figure 8(b) shows fitted variances only for Model 5, and it is clear that these fitted variances are not consistent with the patterns in the data displayed in Figure 3. Corner sites (site numbers 1, 20, 141, and 160) have the largest variances, followed by sites adjacent to corners, then other edge sites, and finally interior sites. Figure 8(c) shows fitted variances only for Model 3, and again, these fitted variances are clearly not consistent with the data. In both Models 3 and 5, fitted variances are a function of the number of neighbors and they increase away from the pole. There are much stronger spatial patterns and smoothness in Model 3 (ρ = 0.99 and σc2 = 0.0018 in the CAR) than in Model 5 (ρ = 0.85 and σc2 = 0.0022).
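The edge-effect pattern described for Models 3 and 5 can be reproduced with a small sketch, assuming the standard CAR covariance σc2(D − ρW)−1 with W the rook-neighbour adjacency matrix of the grid and D the diagonal matrix of neighbour counts (the grid dimensions below are illustrative, not the 160-site layout of the paper):

```python
import numpy as np

def car_variances(nrow, ncol, rho, sigma2_c):
    """Fitted variances sigma_c^2 * diag((D - rho W)^{-1}) for a proper CAR
    model on an nrow x ncol grid with rook (up/down/left/right) neighbours."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for i in range(nrow):
        for j in range(ncol):
            k = i * ncol + j
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < nrow and 0 <= jj < ncol:
                    W[k, ii * ncol + jj] = 1.0
    D = np.diag(W.sum(axis=1))                     # neighbour counts
    return sigma2_c * np.diag(np.linalg.inv(D - rho * W))

v = car_variances(4, 5, rho=0.85, sigma2_c=0.0022)
# corner sites (2 neighbours) receive the largest fitted variances,
# interior sites (4 neighbours) the smallest
```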

Figure 8. Fitted variances from Models 3 (modified), 5, and 6. A modification is necessary for Model 3 because the fitted model is singular; we replace ρ = 1 with ρ = 0.99 to obtain invertible matrices that yield variances. Panel (a) shows variances for all models, while panel (b) shows variances only for Model 5 and panel (c) shows variances only for modified Model 3. Observation site numbers are used as plotting symbols in panels (b) and (c), where site 1 is in the bottom left corner of Figure 2 and site 160 is in the top right corner.

A second interesting feature of Model 6 is the relationship between correlation and site-to-source distances. Figure 9 shows correlation clouds from Σ evaluated at θ̄ as a function of site-to-site and site-to-source distances for Models 3, 5, and 6. Figure 9(a) shows that for Model 5, correlations between sites closest to the pole (ARPS region R10) are in general smaller than correlations between sites farthest from the pole (ARPS region R1). There is, however, no clear separation between these groups of correlations. For Model 3, Figure 9(b) shows a greater degree of separation of correlations according to distance from the pole, but the separation is still small. On the other hand, Model 6, as shown in Figure 9(c), clearly has correlations that are a function of distance to the pole. Moreover, correlations between sites in R10 are largest.

Figure 9. Fitted correlations from Models 3 (modified), 5, and 6. A modification is necessary for Model 3 because the fitted model is singular; we replace ρ = 1 with ρ = 0.99 to obtain an invertible matrix that yields correlations. Panels (a), (b), and (c) show correlations for Models 5, 3, and 6, respectively. All panels display correlations between all pairs of sites where both sites are either in ARPS region R1 or R10. Panel (c) also displays correlations between all pairs of sites where both sites are in ARPS region R2.

Third, focusing only on the ARPS process, we consider the one-step correlations from regions R1 to R10, defined as

$$\mathrm{Corr}(\alpha_1,\alpha_2)=\frac{\psi\sigma_1^2}{\sqrt{\sigma_1^2\left(\sigma_1^2\psi^2+\sigma_2^2\right)}},\qquad
\mathrm{Corr}(\alpha_k,\alpha_{k+1})=\psi\sqrt{\frac{\sigma_1^2\psi^{2(k-1)}+\sigma_2^2\sum_{q=0}^{k-2}\psi^{2q}}{\sigma_1^2\psi^{2k}+\sigma_2^2\sum_{q=0}^{k-1}\psi^{2q}}}\quad\text{for }k=2,3,\ldots,9.$$

Evaluating at θ̄ for Model 6, we obtain Corr(αk, αk+1) for k = 1, 2, …, 9 as 0.790, 0.908, 0.954, 0.975, 0.987, 0.993, 0.996, 0.998, and 0.999. These values change quickly from k = 1 to k = 3 and are primarily responsible for the differences seen in Figure 9(c).
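These one-step correlations follow from the AR(1)-type recursion of the ARPS variances, Var(αk+1) = ψ2 Var(αk) + σ22, and can be computed iteratively. A minimal sketch (the parameter values are illustrative, not the Model 6 posterior means):

```python
import numpy as np

def arps_one_step_corr(psi, sig2_1, sig2_2, r=10):
    """Corr(alpha_k, alpha_{k+1}) for k = 1, ..., r-1, using
    Var(alpha_1) = sig2_1 and Var(alpha_{k+1}) = psi^2 Var(alpha_k) + sig2_2,
    so that Corr = psi * sqrt(Var(alpha_k) / Var(alpha_{k+1}))."""
    var_k = sig2_1
    corrs = []
    for _ in range(r - 1):
        var_next = psi ** 2 * var_k + sig2_2
        corrs.append(psi * np.sqrt(var_k / var_next))
        var_k = var_next
    return corrs

c = arps_one_step_corr(psi=1.3, sig2_1=0.05, sig2_2=0.04)
# with psi > 1 the variances grow with k, so the one-step
# correlations increase toward 1, as observed for Model 6
```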

Finally, as evidenced by the posterior density intervals for ψ and η given in Table 2, as well as additional results not presented here, there is clear evidence of statistical significance of the ARPS model. More specifically, the posterior interval for ψ does not contain the value 0 and the posterior interval for η does not contain the value 1.

6. CONCLUDING REMARKS

Our newly proposed ARPS process can improve models of stochastic phenomena across spatial domains because it takes advantage of auxiliary information provided by point sources. The result is simpler, more interpretable, parametric, yet realistic nonstationary modeling of error processes. By restricting the initial variance σ12 of the ARPS process to equal the ME variance σe2, we strengthen the interpretation that ARPS captures extra variation due to the point source, and that sites far from this source are well represented by the baseline error process. Additionally, we comment on the interplay between the ARPS process, the commonly used CAR(1,σc2) process, and the more desirable CAR(ρ,σc2) process. This is all done in a Bayesian hierarchical framework where key objectives include simple tests for, and clear interpretations of, the impact of a point source.

A complete analysis of electric potential in a field containing a metal pole clearly shows the advantages of our approach. With ARPS, we capture the observed trend of decreasing variances away from the pole; this finding is consistent with Hughes-Oliver and Gonzalez-Farias (1999). We also achieve greater flexibility in modeling correlations within and across regions at various distances from the pole. Posterior density regions for the parameters ψ and η clearly exclude their null values, implying that ARPS captures a statistically significant impact of the pole on the small-scale variability of the electric potential process. Model selection is based on both DIC and SSE.

The ARPS error process may, at first glance, appear unrealistically simple because it assigns the same random effect αk across an entire region Rk, but it is actually a very flexible model. Regions R1, …, Rr may be chosen to represent expert knowledge of the process being monitored (regions could, for example, be irregularly shaped and defined by geographic features such as rivers, valleys, or plateaus in the monitored area), with the only substantive requirement being enough data to fit a reliable model with the specified number of parameters. Because the model is parametric, we can test the contribution of ARPS toward improving the fit to the observed data, as demonstrated for the electric potential data. Also, by letting r approach infinity, we can approximate any stochastic process by a sequence of simple stochastic processes, much as a measurable function is approximated by a sequence of simple functions (Billingsley, 1986, pp. 63–65).

There are, however, several issues currently under investigation. One deals with dynamic regionalization for the ARPS process, where we observe a spatio-temporal process in which the effect of a source can change over time. This is particularly relevant for spatio-temporal processes affected by meteorological conditions. Regionalization for purely spatial processes, on the other hand, may be done in a straightforward manner, with model fits from different choices of R1, …, Rr compared for adequacy. Ideally, we would like r to be as large as possible, but increasing r increases the number of parameters and reduces the reliability of the model. Cost-benefit analyses, possibly using cross-validation, may be necessary for selecting r.

Another issue under current investigation is simultaneously accounting for the impact of multiple sources. Ranking the impact of different sources is particularly difficult when sites are affected by multiple sources in very different ways. The field of source apportionment (see, for example, Henry, 1997; Henry et al., 1997; Park et al., 2001; Christensen and Sain, 2002; Park et al., 2002) ranks the impact of multiple sources by effectively partitioning the observed variance matrix into factors that represent different categories of sources. Bayesian hierarchical modeling of multiple ARPS processes is a fundamentally different approach from the standard techniques of source apportionment. This will be the subject of future manuscripts.

REFERENCES

  1. Best N, Cowles MK, Vines K. Convergence Diagnostics and Output Analysis Software for Gibbs Sampling Output. Cambridge: MRC Biostatistics Unit, Institute of Public Health; 1996.
  2. Billingsley P. Probability and Measure. 2nd ed. New York: Wiley; 1986.
  3. Bithell JF. The choice of test for detecting raised disease risk near a point source. Statistics in Medicine. 1995;14:2309–2322. doi: 10.1002/sim.4780142104.
  4. Carlin BP, Banerjee S. Hierarchical multivariate CAR models for spatio-temporally correlated survival data (with discussion). In: Bernardo JM, Bayarri MJ, Berger JO, Dawid AP, Heckerman D, Smith AFM, West M, editors. Bayesian Statistics 7. Oxford: Oxford University Press; 2002.
  5. Christensen WF, Sain SR. Accounting for dependence in a flexible multivariate receptor model. Technometrics. 2002;44:328–337.
  6. Cressie NAC, Majure JJ. Spatio-temporal statistical modeling of livestock waste in streams. Journal of Agricultural, Biological, and Environmental Statistics. 1997;2:24–47.
  7. Diggle PJ, Rowlingson BS. A conditional approach to point process modeling of elevated risk. Journal of the Royal Statistical Society Series A–Statistics in Society. 1994;157:433–440.
  8. Diggle P, Morris S, Elliot P, Shaddick G. Regression modelling of disease risk in relation to point sources. Journal of the Royal Statistical Society Series A–Statistics in Society. 1997;160:491–505.
  9. Fuentes M. Spectral methods for nonstationary spatial processes. Biometrika. 2002;89:197–210.
  10. Gelfand AE, Ghosh SK. Model choice: a minimum posterior predictive loss approach. Biometrika. 1998;85:1–11.
  11. Gelman A. Prior distributions for variance parameters in hierarchical models. 2004. Unpublished manuscript [accessed April 2004], available at http://www.stat.columbia.edu/~gelman/research/unpublished/tau5.pdf.
  12. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences (with discussion). Statistical Science. 1992;7:457–511.
  13. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. 2nd ed. London: Chapman and Hall; 2003.
  14. Haas TC. Local predictions of a spatio-temporal process with an application to wet sulphate deposition. Journal of the American Statistical Association. 1995;90:1189–1199.
  15. Han C, Carlin B. MCMC methods for computing Bayes factors: a comparative review. Journal of the American Statistical Association. 2001;96:1122–1132.
  16. Hastie TJ, Tibshirani RJ. Generalized Additive Models. New York: Chapman and Hall; 1990.
  17. Henry RC. History and fundamentals of multivariate air quality receptor models. Chemometrics and Intelligent Laboratory Systems. 1997;37:37–42.
  18. Henry RC, Spiegelman CH, Collins JF, Park ES. Reported emissions of organic gases are not consistent with observations. Proceedings of the National Academy of Sciences, USA. 1997;94:6596–6599. doi: 10.1073/pnas.94.13.6596.
  19. Heo TY, Hughes-Oliver JM. A simple statistical air dispersion model and its prediction. Presented at the Spring ENAR Meetings of The Biometric Society; March 2004.
  20. Higdon D, Swall J, Kern J. Non-stationary spatial modeling. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, editors. Bayesian Statistics 6. Oxford: Oxford University Press; 1999. pp. 761–768.
  21. Holland DM, DeOliveira V, Cox LH, Smith RL. Estimation of regional trends in sulfur dioxide over the eastern United States. Environmetrics. 2000;11:373–393.
  22. Hughes-Oliver JM, Gonzalez-Farias G. Parametric covariance models for shock-induced stochastic processes. Journal of Statistical Planning and Inference. 1999;77:51–72.
  23. Hughes-Oliver JM, Gonzalez-Farias G, Lu JC, Chen D. Parametric nonstationary correlation models. Statistics & Probability Letters. 1998a;40:267–268.
  24. Hughes-Oliver JM, Lu JC, Davis JC, Gyurcsik RS. Achieving uniformity in a semiconductor fabrication process using spatial modeling. Journal of the American Statistical Association. 1998b;93:36–45.
  25. Lawson AB. MCMC methods for putative pollution source problems in environmental epidemiology. Statistics in Medicine. 1995;14:2473–2485. doi: 10.1002/sim.4780142115.
  26. Lawson AB. Statistical Methods in Spatial Epidemiology. New York: Wiley; 2001.
  27. Lawson AB, Waller LA. A review of point pattern methods for spatial modelling of events around sources of pollution. Environmetrics. 1996;7:471–487.
  28. Park ES, Guttorp P, Henry RC. Multivariate receptor modeling for temporally correlated data by using MCMC. Journal of the American Statistical Association. 2001;96:1171–1183.
  29. Park ES, Spiegelman CH, Henry RC. Bilinear estimation of pollution source profiles and amounts by using multivariate receptor models. Environmetrics. 2002;13:775–809.
  30. Sampson PD, Guttorp P. Nonparametric estimation of nonstationary spatial covariance structure. Journal of the American Statistical Association. 1992;87:108–119.
  31. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Chichester: Wiley; 2004.
  32. Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B. 2002;64(4):583–639.
  33. Spiegelhalter DJ, Thomas A, Best NG. WinBUGS Version 1.3 User Manual. Cambridge: Medical Research Council Biostatistics Unit; 2000. Available from http://www.mrc-bsu.cam.ac.uk/bugs.
  34. Sun D, Tsutakawa RK, He ZQ. Propriety of posteriors with improper priors in hierarchical linear mixed models. Statistica Sinica. 2001;11:77–95.
  35. Sun D, Tsutakawa RK, Kim H, He ZQ. Spatio-temporal interaction with disease mapping. Statistics in Medicine. 2000;19:2015–2035. doi: 10.1002/1097-0258(20000815)19:15<2015::aid-sim422>3.0.co;2-e.
  36. Sun D, Tsutakawa RK, Speckman PL. Posterior distribution of hierarchical models using CAR(1) distributions. Biometrika. 1999;86:341–350.
  37. Wikle CK. Hierarchical models in environmental science. International Statistical Review. 2004. In press.
