Author manuscript; available in PMC: 2012 Jul 12.
Published in final edited form as: J Am Stat Assoc. 2012 Jan 1;105(492):1394–1402. doi: 10.1198/jasa.2010.ap09323

Estimating Individual-Level Risk in Spatial Epidemiology Using Spatially Aggregated Information on the Population at Risk

Peter J Diggle 1, Yongtao Guan 2, Anthony C Hart 3, Fauzia Paize 4, Michelle Stanton 5
PMCID: PMC3395722  NIHMSID: NIHMS385815  PMID: 22798701

Abstract

We propose a novel alternative to case-control sampling for the estimation of individual-level risk in spatial epidemiology. Our approach uses weighted estimating equations to estimate regression parameters in the intensity function of an inhomogeneous spatial point process, when information on risk-factors is available at the individual level for cases, but only at a spatially aggregated level for the population at risk. We develop data-driven methods to select the weights used in the estimating equations and show through simulation that the choice of weights can have a major impact on efficiency of estimation. We develop a formal test to detect non-Poisson behavior in the underlying point process and assess the performance of the test using simulations of Poisson and Poisson cluster point processes. We apply our methods to data on the spatial distribution of childhood meningococcal disease cases in Merseyside, U.K. between 1981 and 2007.

Keywords: Estimating equations, Inhomogeneous spatial-point processes, Meningococcal disease

1. INTRODUCTION

A fundamental problem in spatial epidemiology is to understand the relationship between the risk that an individual will experience a particular health outcome and the values of one or more risk factors associated with that individual. A widely used study design to address problems of this kind is the case-control study (Breslow and Day 1980). In its simplest form, a case-control study requires data consisting of the values of all risk factors under consideration for each individual in the study region who experiences the health outcome (cases), and for a random sample of individuals not experiencing the health outcome (controls). In contrast, an ecological study uses data in a spatially aggregated form, consisting of the numbers of cases, numbers of individuals at risk (denominators), and average values of risk factors in each of a set of spatial units that partition the study region. The attraction of the ecological study design is that the required data are often routinely available, for example, from national disease registries or censuses, whereas the individual-level data required by the case-control study design typically need to be acquired from scratch, often at considerable expense. However, it is well known that the effects estimated in ecological and case-control studies are different, a phenomenon known as ecological bias (Greenland and Morgenstern 1989; Greenland and Robins 1994). In this paper, we consider the problem of estimating individual-level effects when data on risk factors are available at the individual level for cases, but only at the spatially aggregated level for the population at risk. In many applications, the risk factors are available only for a (random) subset of the cases. Our proposed methods can still be applied, with the understanding that the interpretation of the (less interesting) intercept term in the log-linear intensity model (see Section 3) will be different.

As a specific example, we consider data relating to 864 cases of childhood meningococcal disease in the metropolitan county of Merseyside, U.K., over the period January 1, 1981 to December 31, 2007. The data available for each case include their age, sex, residential location as derived from their full postcode, and a measure of social deprivation for their residential location, the Townsend score (Townsend, Phillimore, and Beattie 1988). The corresponding data for the population at risk are available for each of the approximately 5000 census output areas that make up the Merseyside study region. Within the study period, censuses took place in the years 1981, 1991, and 2001. The scientific objectives are to investigate the effect of deprivation on the risk of meningococcal disease while accounting for the potentially confounding effects of age and sex, and to assess whether there is any residual spatial clustering of cases after accounting for these effects.

In Section 2 of the paper we review earlier work on methods for combining individual level and spatially aggregated data, and explain why these are unsuitable for our application. In Section 3 we give the details of our proposed approach, which uses a set of weighted estimating equations for the regression parameters of interest. In Section 4, we propose data-driven methods to select the weights so as to obtain reasonably efficient estimators. Our approach for inference assumes that case locations form an inhomogeneous spatial Poisson process. In Section 5, we therefore develop a method to test for residual clustering. In Section 6 we assess the performance of the proposed methods through simulations. In Section 7 we describe the application to the Merseyside meningococcal disease data. Section 8 is a concluding discussion. Some technical details are given in an Appendix.

2. LITERATURE REVIEW

Several authors have previously suggested methods for combining individual level and spatially aggregated epidemiological data to alleviate the well-known limitations of ecological studies that use only spatially aggregated data.

Prentice and Sheppard (1995) and Sheppard and Prentice (1995) proposed augmenting spatially aggregated data with individual-level data from a sample of the population at risk. This allowed them to develop unbiased estimating equations for the effects of individual-level risk factors, while appropriately exploiting the additional information provided by the spatially aggregated data. Wakefield (2004) proposed an approach that combines ecological data with cohort data. A practical limitation of both of these proposals is the additional expense entailed in generating the individual-level data. For the Merseyside childhood meningococcal disease data introduced in Section 1, it would not be feasible to generate retrospectively a random sample from the population at risk, because the data were observed over a 27-year period.

Best, Ickstadt, and Wolpert (2000) considered the more general problem of combining data at disparate spatial resolutions and proposed a sophisticated parametric model in which all of the data are related to a latent, spatially continuous stochastic process representing unexplained spatial variation in risk. They fit their model to data relating childhood respiratory disorders to exposure to traffic-related pollutants, using a computationally intensive Markov chain Monte Carlo (MCMC) implementation of Bayesian inference. Jackson, Best, and Richardson (2006) also used MCMC methods to fit their proposed hierarchical model for combining small area and individual-level data on exposures and health outcomes.

MCMC has had a major impact on statistical practice, chiefly through its ability to handle models whose complexity precludes fitting by analytical or non-Monte Carlo numerical methods. However, applications of MCMC algorithms often need careful tuning to generate reliably reproducible results. Although such algorithms have been commonly used for modeling areal spatial data using free software such as WinBUGS, their use in the current setting is still challenging. We believe that it is useful to develop methods that avoid the need for MCMC implementation.

Haneuse and Wakefield (2007, 2008a, 2008b) noted that the cohort-based approach of Wakefield (2004) is inherently inefficient for investigation of rare outcomes, and proposed a hybrid design in which ecological data are supplemented either with case-control data or with case data alone. We note that meningococcal disease is an extremely rare disease. Specifically, in our application the average number of cases per year is only 32, from a population at risk of the order of several hundred thousand.

An important practical distinction between methods that require individual-level data from cases only, or from both cases and noncases, is that the former are typically available at little or no additional cost because the individuals concerned will have undergone detailed clinical investigation, whereas the latter require additional data to be collected as an integral component of the overall research design. As argued earlier, this would not be feasible in applications such as ours, for which the available data consist of historical case records.

The methods proposed in Wakefield (2004) and in Haneuse and Wakefield (2007, 2008a, 2008b) are currently restricted to binary risk factors. Although binary risk factors could represent different levels of a continuous risk factor, for the Merseyside childhood meningococcal disease data we prefer to include such risk factors (e.g., age and deprivation score) directly in the model.

Our proposed method is, to the best of our knowledge, the first that can be used to fit models at the individual level with both discrete and continuous risk factors, requiring only case data at the individual level, and avoiding the need for computationally intensive MCMC methods.

3. METHODOLOGY

Let si, i = 1, …, N, be the spatial locations of a population of N individuals at risk in a spatial region D, among whom the first M members are cases, with M ≪ N. For the ith member of the population, write Xi = {Xi1, …, Xip} for the value of an associated p-dimensional covariate vector, with Xi1 = 1 for all i. We assume that Xi is observed for each case, i = 1, …, M, but not for noncases, i = M + 1, …, N. However, an aggregated quantity

$$\tilde{\mu}_{jk} = \sum_{i=1}^{N} X_{ij}\, I(s_i \in D_k) \tag{1}$$

is available, where the subregions Dk: k = 1, …, K, form a partition of the study region D and I(·) is an indicator function. Note that μ̃1k is the number of individuals at risk in Dk.

We treat the (unobserved) locations of the N individuals in the population at risk as a realization of an inhomogeneous spatial Poisson process with intensity λ0(s) (Diggle 2003), representing a spatially varying population density. Let f [X(s)′β] be the probability that an individual at location s is a case and assume, temporarily, that cases occur independently; this is often a reasonable assumption for noncontagious diseases. Then, the case locations also follow an inhomogeneous spatial Poisson process, with intensity λ(s; β) = λ0(s)f [X(s)′β]. Our first goal is to make inferences about the parameter vector β based on the observed covariates for the cases, that is, X(si): i = 1, …, M, and the spatially aggregated covariates for all members of the population at risk in each sub-region, that is, μ̃jk: j = 1, …, p and k = 1, …, K.

To motivate our method, we assume that each Xij can be considered as the value at si of a spatially continuous process {Xj(s): s ∈ D}. Then, μ̃jk is an unbiased estimator for

$$\mu_{jk} = \int_{D_k} X_j(s)\, \lambda_0(s)\, ds. \tag{2}$$

Note that (2) can be reexpressed as

$$\mu_{jk} = \int_{D_k} \frac{X_j(s)}{f[X(s)'\beta_0]}\, \lambda(s; \beta_0)\, ds,$$

where β0 is the true but unknown value of β. Now define

$$\hat{\mu}_{jk}(\beta) = \sum_{i=1}^{M} \frac{X_j(s_i)}{f[X(s_i)'\beta]}\, I(s_i \in D_k).$$

Then, μ̂jk(β0) is also an unbiased estimator for μjk. Hence, we have a set of unbiased estimating equations for β,

$$U_j(\beta; W) = \sum_{k=1}^{K} w_k \left[\tilde{\mu}_{jk} - \hat{\mu}_{jk}(\beta)\right] = 0, \quad j = 1, \ldots, p, \tag{3}$$

where W = {wk: k = 1, …, K} is any set of predefined weights. The estimator β̂ obtained by solving (3) is consistent for β, following established theoretical results for estimating equations (Crowder 1986). Typically, μ̃jk converges to μjk much faster than μ̂jk because N ≫ M. With a slight abuse of notation, we replace μ̃jk by μjk from now on.
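As an illustration of how (3) might be solved numerically, the following is a minimal sketch that assumes the log-linear choice f(·) = exp(·) and strictly positive weights. The function names, the damped Newton iteration, the starting value, and the synthetic data are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def estimating_function(beta, X_cases, region, mu, w):
    """U(beta; W): element j is sum_k w_k [mu_jk - mu_hat_jk(beta)], with f = exp."""
    inv_f = np.exp(-X_cases @ beta)                    # 1 / f[X(s_i)'beta] per case
    return w @ mu - (w[region] * inv_f) @ X_cases

def solve_beta(X_cases, region, mu, w, n_iter=100, tol=1e-10):
    """Damped Newton iterations for U(beta; W) = 0 (an illustrative solver)."""
    wi, target = w[region], w @ mu

    def objective(b):                                  # convex; its gradient is U
        with np.errstate(over="ignore"):
            return target @ b + wi @ np.exp(-X_cases @ b)

    beta = np.zeros(X_cases.shape[1])
    beta[0] = np.log(wi.sum() / target[0])             # intercept-only root as a start
    for _ in range(n_iter):
        inv_f = np.exp(-X_cases @ beta)
        U = target - (wi * inv_f) @ X_cases
        if np.max(np.abs(U)) <= tol * np.max(np.abs(target)):
            break
        U1 = (X_cases * (wi * inv_f)[:, None]).T @ X_cases   # first-derivative matrix
        step = np.linalg.solve(U1, U)
        t, q0 = 1.0, objective(beta)
        while objective(beta - t * step) > q0 and t > 1e-8:  # halve until no increase
            t *= 0.5
        beta = beta - t * step
    return beta

# Synthetic check (all values hypothetical): N individuals in K subregions.
rng = np.random.default_rng(0)
N, K = 200_000, 50
region_all = rng.integers(0, K, size=N)
x = rng.normal(size=N) + 0.5 * np.sin(region_all)
X_all = np.column_stack([np.ones(N), x])
beta_true = np.array([-4.0, 0.5])
is_case = rng.random(N) < np.exp(X_all @ beta_true)    # case probability f[X'beta]
mu = np.zeros((K, 2))
np.add.at(mu, region_all, X_all)                       # aggregated totals mu_jk
beta_hat = solve_beta(X_all[is_case], region_all[is_case], mu, np.ones(K))
print(beta_hat)
```

Only the case covariates, the subregion containing each case, and the aggregated totals enter the solver, which mirrors the data requirements described above.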

For ease of presentation, we suppose that the parametric part of λ(s) can be described by a log-linear model, that is, f (·) = exp(·). This log-linear specification is widely used, but our methods apply equally to other choices for f (·). Now, write U(β; W) for the p-element vector with jth element Uj(β; W), for j = 1, …, p. Also, let U(1)(β; W) be the p by p matrix of first derivatives of elements of U(β; W) with respect to elements of β. Note that

$$U^{(1)}(\beta; W) = \sum_{k=1}^{K} w_k \left[\sum_{i=1}^{M} X(s_i) X(s_i)' \exp[-X(s_i)'\beta]\, I(s_i \in D_k)\right],$$

hence $E[U^{(1)}(\beta_0; W)] = \sum_{k=1}^{K} w_k A_k = A(W)$, say, where

$$A_k = \int_{D_k} \lambda_0(s)\, X(s) X(s)'\, ds. \tag{4}$$

Following standard arguments involving Taylor series expansions, and under suitable regularity conditions (Guyon 1995), (β̂ − β0) is asymptotically Normally distributed with mean vector zero and approximate covariance matrix

$$\Sigma(W) = [A(W)]^{-1} B(W) [A(W)]^{-1}, \tag{5}$$

where

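$$B(W) = \sum_{k=1}^{K} w_k^2 \int_{D_k} \lambda_0(s)\, X(s) X(s)' \exp[-X(s)'\beta_0]\, ds + \sum_{k=1}^{K} \sum_{l=1}^{K} w_k w_l \int_{D_k} \int_{D_l} \lambda_0(s_1) \lambda_0(s_2)\, X(s_1) X(s_2)'\, [g(s_1, s_2) - 1]\, ds_1\, ds_2. \tag{6}$$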

In (6), g(s1, s2) is the pair correlation function of the spatial point process that generates the cases. For any Poisson process, g(s1, s2) = 1 (Møller and Waagepetersen 2004), in which case only the first term of (6) remains. Then, A(W) and B(W) can be estimated consistently by

$$\hat{A}(W) = \sum_{k=1}^{K} w_k \left\{\sum_{i=1}^{M} X(s_i) X(s_i)' \exp[-X(s_i)'\hat{\beta}]\, I(s_i \in D_k)\right\}$$

and

$$\hat{B}(W) = \sum_{k=1}^{K} w_k^2 \left\{\sum_{i=1}^{M} X(s_i) X(s_i)' \exp[-2X(s_i)'\hat{\beta}]\, I(s_i \in D_k)\right\},$$

respectively. For non-Poisson processes, B(W) also depends on the pair correlation function g(·), but in either case the covariance matrix of β̂ is affected by the choice of the weights W. In the next section we develop data-driven methods to choose the weights, while in Section 5 we revisit the Poisson assumption.
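The plug-in estimators of A(W) and B(W) under the Poisson assumption translate directly into a few matrix products. The sketch below is illustrative only: the function name and the intercept-only check are assumptions, chosen because in that special case the sandwich formula (5) reduces algebraically to 1/M.

```python
import numpy as np

def sandwich_covariance(beta_hat, X_cases, region, w):
    """Plug-in estimate of Sigma(W) = A^{-1} B A^{-1} under the Poisson assumption."""
    inv_f = np.exp(-X_cases @ beta_hat)        # exp[-X(s_i)'beta_hat] per case
    wi = w[region]                             # w_k for the subregion holding case i
    A_hat = (X_cases * (wi * inv_f)[:, None]).T @ X_cases
    B_hat = (X_cases * (wi * inv_f)[:, None] ** 2).T @ X_cases
    A_inv = np.linalg.inv(A_hat)
    return A_inv @ B_hat @ A_inv

# Intercept-only check with hypothetical numbers: M cases among N individuals.
M, N = 40, 10_000
X_cases = np.ones((M, 1))
region = np.zeros(M, dtype=int)
beta_hat = np.array([np.log(M / N)])           # root of (3) in this special case
cov = sandwich_covariance(beta_hat, X_cases, region, np.ones(1))
print(cov)                                     # algebraically equal to 1 / M
```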

4. CHOICE OF WEIGHTS

4.1 A General Framework

We now propose a general framework to guide the choice of weights. Let tr[Σ(W)] denote the trace of the covariance matrix Σ(W). Our aim is to find a set of weights W that yields the smallest possible value of tr[Σ(W)]. A similar idea was used in Bevilacqua et al. (2009), who proposed minimizing the trace of a so-called Godambe information matrix in order to choose the value of a single tuning parameter in their composite likelihood estimation. Here, we consider the more complicated problem of choosing a potentially large number of weights wk: k = 1, …, K; for example, in our Merseyside meningitis application, K = 4586.

Write $B(W) = \sum_{k=1}^{K} w_k^2 B_k$, where B(W) is defined in (6). In the Poisson case, note that

$$B_k = \int_{D_k} \lambda_0(s)\, X(s) X(s)' \exp[-X(s)'\beta_0]\, ds. \tag{7}$$

Now, take the derivative of tr[Σ(W)] with respect to wk to obtain

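$$T_k(W) = \frac{\partial\, \mathrm{tr}[\Sigma(W)]}{\partial w_k} = 2\, \mathrm{tr}\left\{A(W)^{-2} \left[w_k B_k - A_k A(W)^{-1} B(W)\right]\right\},$$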

where Ak is as defined in (4). To obtain the “optimal” weights, we follow Guan and Shen (2010) and solve Tk(W) = 0 for k = 1, …, K. In general, closed-form solutions are difficult to obtain without imposing additional restrictions. We suppose that

$$A_k = c_k B_k \tag{8}$$

for some constants ck: k = 1, …, K. Note that Ak and Bk are both p × p matrices. Thus, condition (8) simply requires that all corresponding elements of Ak and Bk are proportional. Such a condition is satisfied if the parametric part of the intensity function is constant over each subregion Dk. Under condition (8), we have

$$T_k(W) = 2\, \mathrm{tr}\left[A(W)^{-2} B_k A(W)^{-1} \sum_{l=1}^{K} w_l B_l (w_k c_l - c_k w_l)\right],$$

which is equal to zero if wk = ck, for k = 1, …, K. If Tk(W) = 0 has a unique root, the resulting weights are then the optimal weights, in the sense that they minimize tr[Σ(W)] under the condition (8). Furthermore, with wk = ck, (8) implies Σ(W) = A(W)⁻¹ because of (4), (5), and (6). Note that when (8) holds, estimation of A(W), and hence of Σ(W), is straightforward. In the next two subsections, we try to find values of ck such that (8) holds approximately, in the Poisson and non-Poisson cases.
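The last two claims can be checked in a few lines. The following is a short worked derivation, assuming the Poisson-case form B(W) = Σk wk² Bk and using only (5) and (8).

```latex
\begin{aligned}
w_k c_l - c_k w_l &= c_k c_l - c_k c_l = 0
  && \text{(setting } w_k = c_k \text{ for all } k\text{, so } T_k(W) = 0\text{)},\\
A(W) &= \sum_{k=1}^{K} w_k A_k = \sum_{k=1}^{K} c_k (c_k B_k) = \sum_{k=1}^{K} c_k^2 B_k = B(W)
  && \text{(by (8))},\\
\Sigma(W) &= A(W)^{-1} B(W) A(W)^{-1} = A(W)^{-1} A(W) A(W)^{-1} = A(W)^{-1}
  && \text{(by (5))}.
\end{aligned}
```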

4.2 The Poisson Process Case

Let X̄k be the average of the covariates {X(s): s ∈ Dk}. Under the working assumption that X(s) ≈ X̄k for s ∈ Dk, it follows from (7) that Ak ≈ exp(X̄k′β0)Bk, implying the choice of weights

$$w_k = \exp(\bar{X}_k'\beta_0). \tag{9}$$

In practice, X̄k is easily obtained from μjk: j = 1, …, p, since its jth element is μjk/μ1k, and β0 can be replaced by a consistent estimator based on a convenient set of predefined weights, for example equal weights wk = 1 for all k. This working assumption strictly renders our approach redundant, in the sense that if it were true then covariate information would automatically be available for every individual and a standard case-control analysis would be straightforward. Nevertheless, the resulting choice of weights (9) is intuitively appealing. To see why, imagine that λ0(s) and X(s) were both known for all s ∈ D. Then, maximum likelihood estimation could be used to estimate β. Specifically, the jth element of the likelihood score function would be

$$\tilde{U}_j(\beta) = \sum_{k=1}^{K} \sum_{i=1}^{M} X_j(s_i)\, I(s_i \in D_k) - \sum_{k=1}^{K} \int_{D_k} \lambda_0(s)\, X_j(s) \exp[X(s)'\beta]\, ds.$$

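Note that, up to an overall change of sign that does not affect its solution, the jth estimating equation based on the proposed weights (9) is

$$-U_j(\beta; W) = \sum_{k=1}^{K} \exp(\bar{X}_k'\beta_0) \sum_{i=1}^{M} \frac{X_j(s_i)\, I(s_i \in D_k)}{\exp[X(s_i)'\beta]} - \sum_{k=1}^{K} \exp(\bar{X}_k'\beta_0) \int_{D_k} \lambda_0(s)\, X_j(s)\, ds.$$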

The striking similarity between Ũj(β) and Uj(β; W) suggests that the weights (9) essentially try to make Uj(β; W) close to Ũj(β). This is appealing because maximum likelihood estimators are known to be asymptotically efficient. In general, the smaller the subregions Dk: k = 1, …, K become, the closer Uj(β; W) will be to Ũj(β) and the more efficient the estimator β̂ will be. As |Dk| → 0, k = 1, …, K, A(W) and B(W) both converge to

$$\sum_{k=1}^{K} \int_{D_k} \lambda_0(s)\, X(s) X(s)' \exp[X(s)'\beta_0]\, ds = \int_{D} \lambda_0(s)\, X(s) X(s)' \exp[X(s)'\beta_0]\, ds$$

for sufficiently smooth X(·). Thus, the limit of the covariance matrix Σ(W) becomes

$$\Sigma(W) = \left\{\int_{D} \lambda_0(s)\, X(s) X(s)' \exp[X(s)'\beta_0]\, ds\right\}^{-1},$$

which is the limiting covariance matrix of the maximum likelihood estimator.
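The weights (9) require only the aggregated totals and a preliminary estimate of β. A minimal sketch follows; the function name and the numerical values are hypothetical, and the first column of `mu` is assumed to hold the population counts μ1k.

```python
import numpy as np

def poisson_weights(mu, beta_prelim):
    """Weights (9): w_k = exp(X_bar_k' beta), with X_bar_k = mu_k / mu_1k."""
    X_bar = mu / mu[:, [0]]                  # column 0 holds mu_1k, the population count
    return np.exp(X_bar @ beta_prelim)

# Hypothetical aggregated data for K = 2 subregions and p = 2 covariates.
mu = np.array([[100.0, 50.0],
               [200.0, -100.0]])
beta_prelim = np.array([-2.0, 1.0])          # e.g. from a first fit with w_k = 1
w = poisson_weights(mu, beta_prelim)
print(w)                                     # exp(-1.5) and exp(-2.5)
```

In a two-step fit, these weights would replace the equal weights in a second solution of the estimating equations.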

4.3 The Non-Poisson Process Case

In the non-Poisson case, we assume that the process is second-order intensity reweighted stationary (Baddeley, Møller, and Waagepetersen 2000), hence g(s1, s2) = g(s1 − s2). Define

$$B_{kl} = \int_{D_k} \int_{D_l} \lambda_0(s_1) \lambda_0(s_2)\, X(s_1) X(s_2)'\, [g(s_1 - s_2) - 1]\, ds_1\, ds_2.$$

Following the results in the Poisson process case, (8) holds approximately if, for some constants ck,

$$A_k c_k \approx \sum_{l=1}^{K} \frac{w_l}{w_k} B_{kl}. \tag{10}$$

To see when this may be so, suppose that g(t) = 1 if ||t|| > r for some small value r, where ||·|| is Euclidean distance. If also the spatial variation in X(·) is sufficiently smooth, and assuming that wk ≈ wl if dkl < r, where dkl is the distance between suitably defined centres of Dk and Dl, then

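$$\sum_{l=1}^{K} \frac{w_l}{w_k} B_{kl} \approx \int_{D_k} \lambda_0(s_1)\, X(s_1) X(s_1)' \left[\int_{D} \lambda_0(s_2) [g(s_1 - s_2) - 1]\, ds_2\right] ds_1.$$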

Thus, (10) holds if ∫D λ0(s2)[g(s1 − s2) − 1] ds2 is approximately constant for all s1 ∈ Dk. This in turn holds if the spatial variation in λ0(·) is sufficiently smooth, for example, λ0(s) ≈ λ̄0k = μ1k/|Dk| for all s ∈ Dl if dkl < r. A rough approximation for ck is then $c_k \approx \bar{\lambda}_{0k} \int_{\|t\| \le r} [g(t) - 1]\, dt$, which leads to the weights

$$w_k = \frac{\exp(\bar{X}_k'\beta_0)}{1 + \bar{\lambda}_{0k} \exp(\bar{X}_k'\beta_0) \int_{\|t\| \le r} [g(t) - 1]\, dt}. \tag{11}$$

For a clustered process, the weights assigned by (11) to regions with high values of exp(X̄k′β0) are smaller than those used in the Poisson process case, because typically g(t) > 1 for such processes. This change is intuitively reasonable because the numbers of individuals in such regions are highly variable by comparison with a Poisson process with the same intensity λ0(·), suggesting that their contributions to the estimating equation should be downweighted.
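The downweighting in (11) can be seen in a small numerical sketch. Here `excess` stands for an externally supplied value of the integral of g(t) − 1 over ||t|| ≤ r; the function name, the areas, and all numbers are hypothetical.

```python
import numpy as np

def clustered_weights(mu, area, beta_prelim, excess):
    """Weights (11); `excess` stands for the integral of g(t) - 1 over ||t|| <= r."""
    X_bar = mu / mu[:, [0]]                  # average covariates per subregion
    f_bar = np.exp(X_bar @ beta_prelim)      # exp(X_bar_k' beta)
    lam_bar = mu[:, 0] / area                # lambda_bar_0k = mu_1k / |D_k|
    return f_bar / (1.0 + lam_bar * f_bar * excess)

mu = np.array([[100.0, 50.0],
               [200.0, -100.0]])
area = np.array([1.0, 1.0])
beta_prelim = np.array([-2.0, 1.0])
w_poisson = clustered_weights(mu, area, beta_prelim, excess=0.0)   # reduces to (9)
w_cluster = clustered_weights(mu, area, beta_prelim, excess=0.05)  # clustered case
print(w_cluster / w_poisson)                 # ratios below one
```

With `excess = 0` the weights coincide with (9), and any positive value shrinks them, most strongly where the population density times exp(X̄k′β) is largest.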

5. ASSESSING NON–POISSON BEHAVIOR

In practice, nonparametric estimation of the pair correlation function g(·) in conjunction with a nonparametric specification for λ0(·) is difficult; see, for example, Baddeley, Møller, and Waagepetersen (2000) or Diggle et al. (2007). Hence, a pragmatic strategy is to conduct inference about β under the assumption that the point process is Poisson, but to include within the analysis a diagnostic test of this assumption. In principle, we might wish to detect different kinds of non-Poisson behavior, but in the epidemiological setting the usual concern is to detect residual spatial clustering.

To achieve this, consider the statistic

$$G_{kl}(\beta) = \sum_{i_1, i_2 = 1}^{M} \frac{I(s_{i_1} \in D_k,\ s_{i_2} \in D_l,\ i_1 \neq i_2)}{\exp[X(s_{i_1})'\beta]\, \exp[X(s_{i_2})'\beta]}. \tag{12}$$

Note that if the process is Poisson, then

$$E[G_{kl}(\beta_0)] = \int_{D_k} \int_{D_l} \lambda_0(s_1) \lambda_0(s_2)\, ds_1\, ds_2 = \mu_{1k}\mu_{1l}, \tag{13}$$

where μ1k is as defined in (1). Let dkl be the distance between suitably defined centres of Dk and Dl, for example, their centroids. Motivated by (13), we define the statistic

$$\hat{g}(u; \beta) = \frac{\sum_{k,l=1}^{K} \kappa[(d_{kl} - u)/h]\, G_{kl}(\beta)}{\sum_{k,l=1}^{K} \kappa[(d_{kl} - u)/h]\, \mu_{1k}\mu_{1l}},$$

where κ(·) is a kernel function and h is the bandwidth. If the case process is Poisson, we expect to find ĝ(u; β̂) ≈ 1 because of (13) and the consistency of β̂ for β0. If the process is clustered, we expect to find ĝ(u; β̂) > 1 when u is small. Conversely, if the process is spatially regular, we expect to find ĝ(u; β̂) < 1 when u is small. A plot of the estimated function ĝ(u; β̂) therefore gives a visual assessment of spatial clustering or regularity.
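The statistic ĝ(u; β) can be computed without looping over pairs of cases, because the double sum in (12) factorizes into within-subregion sums. The sketch below assumes a uniform kernel and Euclidean distances between centroids; both choices, and all names and numbers, are illustrative assumptions.

```python
import numpy as np

def box_kernel(x):
    """Uniform kernel on [-1, 1] (an illustrative choice of kappa)."""
    return (np.abs(x) <= 1.0).astype(float)

def g_hat(u, h, beta, X_cases, region, mu1, centroids, kernel=box_kernel):
    a = np.exp(-X_cases @ beta)                          # 1 / exp[X(s_i)'beta]
    K = len(mu1)
    S = np.bincount(region, weights=a, minlength=K)      # sum of a_i within D_k
    Q = np.bincount(region, weights=a**2, minlength=K)   # sum of a_i^2 within D_k
    G = np.outer(S, S) - np.diag(Q)                      # G_kl, excluding pairs i1 == i2
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    kw = kernel((d - u) / h)
    return np.sum(kw * G) / np.sum(kw * np.outer(mu1, mu1))

# Tiny hypothetical example: two subregions, three cases, intercept-only model.
X_cases = np.ones((3, 1))
region = np.array([0, 0, 1])
mu1 = np.array([10.0, 20.0])
centroids = np.array([[0.0, 0.0], [1.0, 0.0]])
g0 = g_hat(0.0, 0.5, np.zeros(1), X_cases, region, mu1, centroids)
g1 = g_hat(1.0, 0.5, np.zeros(1), X_cases, region, mu1, centroids)
print(g0, g1)
```

For a large number of subregions the K × K distance matrix becomes memory-hungry, so a practical implementation would probably restrict attention to pairs of subregions within a maximum distance.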

To develop a formal test, define

$$G(\beta) = \sum_{k=1}^{K} G_{kk}(\beta),$$

where Gkk(β) is defined by (12). Note that $G(\beta) = \left[\sum_{k=1}^{K} \mu_{1k}^2\right] \hat{g}(0; \beta)$ if the kernel function κ(x) = 0 except at x = 0. We consider only u = 0 because small distances typically provide most of the information about non-Poisson behavior. We apply a Taylor series expansion to G(β̂) and obtain

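$$G(\hat{\beta}) \approx G(\beta_0) - 2\left[\sum_{k=1}^{K} \mu_{1k}\mu_k\right]'(\hat{\beta} - \beta_0) \approx G(\beta_0) + 2\left[\sum_{k=1}^{K} \mu_{1k}\mu_k\right]' A(W)^{-1} U(\beta_0; W),$$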

where μk = {μjk: j = 1, …, p} is a p-element vector. See the Appendix for details.

Recall that the covariance matrix for β̂ is as given in (5). To estimate the variance of G(β̂), we therefore only need to estimate the variance of G(β0) and the covariance between U(β0; W) and G(β0). Under the null hypothesis, it follows from straightforward algebra that the former is

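$$2\sum_{k=1}^{K} \int_{D_k}\int_{D_k} \lambda_0(s_1)\lambda_0(s_2) \exp[-X(s_1)'\beta_0] \exp[-X(s_2)'\beta_0]\, ds_1\, ds_2 + 4\sum_{k=1}^{K} (\mu_{1k})^2 \int_{D_k} \lambda_0(s) \exp[-X(s)'\beta_0]\, ds,$$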

while the latter is

$$-2\sum_{k=1}^{K} w_k \mu_{1k} \int_{D_k} \lambda_0(s)\, X(s) \exp[-X(s)'\beta_0]\, ds.$$

See the Appendix for details. Both terms can be estimated relatively easily, for example, by

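$$2\sum_{k=1}^{K} \sum_{i_1, i_2 = 1}^{M} I(s_{i_1}, s_{i_2} \in D_k,\ i_1 \neq i_2) \exp\{-2[X(s_{i_1}) + X(s_{i_2})]'\hat{\beta}\} + 4\sum_{k=1}^{K} (\mu_{1k})^2 \sum_{i=1}^{M} I(s_i \in D_k) \exp[-2X(s_i)'\hat{\beta}] \tag{14}$$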

and

$$-2\sum_{k=1}^{K} w_k \mu_{1k} \sum_{i=1}^{M} I(s_i \in D_k)\, X(s_i) \exp[-2X(s_i)'\hat{\beta}], \tag{15}$$

respectively. Combining (5), (14), and (15), the variance of G(β̂) can then be estimated straightforwardly. Assuming asymptotic normality for G(β̂), we can then apply a standard Wald-type test.
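One way the pieces might be assembled into a test statistic is sketched below. The variance is built from the Taylor expansion above as Var[G(β0)] + 4m′Σ(W)m + 4m′A(W)⁻¹Cov[U(β0; W), G(β0)], with m = Σk μ1k μk, and G(β̂) is centred at Σk μ1k², its null expectation from (13). This assembly, the function name, and the toy data are assumptions of the sketch rather than the authors' implementation.

```python
import numpy as np
from math import erfc, sqrt

def clustering_test(beta_hat, X_cases, region, mu, w):
    """Wald-type z statistic and two-sided p-value for G(beta_hat)."""
    K = mu.shape[0]
    a = np.exp(-X_cases @ beta_hat)                    # 1 / exp[X(s_i)'beta_hat]
    wi, mu1 = w[region], mu[:, 0]
    S = np.bincount(region, weights=a, minlength=K)
    Q = np.bincount(region, weights=a**2, minlength=K)
    R = np.bincount(region, weights=a**4, minlength=K)
    G = np.sum(S**2 - Q)                               # G(beta_hat) = sum_k G_kk
    A = (X_cases * (wi * a)[:, None]).T @ X_cases      # A_hat(W)
    B = (X_cases * (wi * a)[:, None] ** 2).T @ X_cases # B_hat(W)
    var_G0 = 2.0 * np.sum(Q**2 - R) + 4.0 * np.sum(mu1**2 * Q)   # estimate (14)
    cov_UG = -2.0 * ((wi * mu1[region] * a**2) @ X_cases)        # estimate (15)
    m = mu1 @ mu                                       # sum_k mu_1k mu_k
    Am = np.linalg.solve(A, m)                         # A^{-1} m
    var_G = var_G0 + 4.0 * (Am @ B @ Am) + 4.0 * (Am @ cov_UG)
    z = (G - np.sum(mu1**2)) / np.sqrt(var_G)          # nan if var_G is not positive
    return z, erfc(abs(z) / sqrt(2.0))

# Algebra check (hypothetical): one subregion, intercept-only model.
M, N = 50, 10_000
X_cases = np.ones((M, 1))
region = np.zeros(M, dtype=int)
mu = np.array([[float(N)]])
z, p_value = clustering_test(np.array([np.log(M / N)]), X_cases, region, mu, np.ones(1))
print(z, p_value)
```

In this degenerate check the leading variance terms cancel exactly and the statistic reduces to −√(M / (2(M − 1))), which gives a quick way to confirm the signs of the three variance components.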

6. SIMULATIONS

We now show the performance of our method using simulations of inhomogeneous Poisson and Poisson cluster processes. We generated simulated realizations on square regions D = [0, n] × [0, n] with n = 1 or n = 2. The intensity function for both processes was λ(s; α, β) = λ0(s) exp[α + βX(s)], where λ0(s) = exp[β*X*(s)], β = β* = 0.5, and X(s) and X*(s) are independent realizations of a stationary, isotropic Gaussian process with covariance exp(−γu) for γ = 5 or 10. Note that a smaller value of γ yields a smoother covariate surface. The average number of events per realization was 200 and 800 for n = 1 and n = 2, respectively. For the Poisson cluster process, we used a stationary Poisson process with intensity 100 to generate the parent locations and a radially symmetric Gaussian distribution with standard deviation σ in each coordinate direction to generate the offspring locations relative to their parents (Diggle 2003; Waagepetersen 2007). We used values of σ = 0.02 and 0.04 to correspond to relatively strong and weak clustering, respectively.
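For readers who wish to experiment, the following sketch generates a clustered, inhomogeneous point pattern by independently thinning a parent-offspring process. It is not the simulation code used here: the deterministic covariate sin(2πx) stands in for the Gaussian process X(s), the mean number of offspring per parent is arbitrary, β is assumed non-negative, and parents are generated only inside the region, so the pattern is slightly sparse near the boundary.

```python
import numpy as np

def simulate_cluster_process(rng, n=1.0, parent_intensity=100.0,
                             mean_offspring=4.0, sigma=0.02, beta=0.5):
    """Thinned parent-offspring process on [0, n] x [0, n] (illustrative only)."""
    n_parents = rng.poisson(parent_intensity * n * n)
    parents = rng.uniform(0.0, n, size=(n_parents, 2))
    counts = rng.poisson(mean_offspring, size=n_parents)
    offspring = np.repeat(parents, counts, axis=0)     # one row per offspring
    offspring = offspring + rng.normal(scale=sigma, size=offspring.shape)
    inside = np.all((offspring >= 0.0) & (offspring <= n), axis=1)
    offspring = offspring[inside]
    covariate = np.sin(2.0 * np.pi * offspring[:, 0])  # stand-in for X(s)
    keep_prob = np.exp(beta * (covariate - 1.0))       # exp(beta X) / max exp(beta X)
    keep = rng.random(len(offspring)) < keep_prob
    return offspring[keep], covariate[keep]

rng = np.random.default_rng(1)
points, x_at_points = simulate_cluster_process(rng)
print(points.shape)
```

Smaller values of `sigma` give tighter clusters, corresponding to the stronger clustering scenario described above.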

We assume that X(s), but not X*(s), is observed at every simulated event location. Aggregated covariate information is observed in the form of μ1k = ∫Dk λ0(s) ds and μ2k = ∫Dk λ0(s)X(s) ds, where the Dk: k = 1, …, K, are equal sub-squares that partition D. For the 1 × 1 region D, we used K = 25, 100, 400, and for the 2 × 2 region, K = 100, 400, 1600. To estimate the parameters α and β we used both equal weights and the data-driven weights determined by the methods described in Section 4. Note that the estimator based on equal weights does not change with K. For the cluster process, we used max[K̂(r) − πr², 0] as an approximation to ∫||t||≤r[g(t) − 1] dt, where K̂(r) is an estimate of the inhomogeneous K-function (Baddeley, Møller, and Waagepetersen 2000) and r = 3σ. For comparison, we also implemented two other methods on each simulated data-set: Poisson maximum likelihood estimation under the unrealistic assumption that both λ0(s) and X(s) are completely observed for all s ∈ D; and an ecological analysis using only spatially aggregated data. For the latter, we assume that λ0(s) and X(s) are both constant within each subregion; the estimates λ̂0(s) = μ1k/|Dk| and X̂(s) = μ2k/μ1k, for s ∈ Dk, are then used in place of λ0(s) and X(s) to form the Poisson likelihood.

Table 1 gives the empirical mean squared errors (MSEs) of the resulting estimators for β from 1000 simulations. Even in the best scenario, with K = 400n² subregions, the estimator based on the ecological-data-only analysis has a substantially larger MSE than the other estimators in all cases. The large MSE is mostly due to a large bias of the estimator, which becomes more severe with smaller K. In contrast, all other estimators are approximately unbiased, and the empirical biases (not shown) are consistent with this. As would be expected, for all of the estimators the MSE decreases as the study region becomes larger. Also, the proposed estimator based on the data-driven weights is more efficient than that based on equal weights, and the MSE of the weighted estimator decreases as K increases. In the Poisson process case, the MSE approaches that of the maximum likelihood estimator with known λ0(s), as is consistent with the theoretical comparison given in Section 4.2.

Table 1.

Mean squared errors, multiplied by 10,000, from 1000 simulations of various estimators for the regression coefficient β

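| Process | γ | n | EST1 | EST2, K = 25n² | EST2, K = 100n² | EST2, K = 400n² | EST3 | EST4 |
|---|---|---|---|---|---|---|---|---|
| Poisson | 5 | 1 | 98 | 81 | 74 | 72 | 71 | 192 |
| Poisson | 5 | 2 | 23 | 16 | 15 | 15 | 14 | 81 |
| Poisson | 10 | 1 | 105 | 78 | 68 | 61 | 57 | 325 |
| Poisson | 10 | 2 | 20 | 16 | 15 | 14 | 13 | 221 |
| Cluster 1 | 5 | 1 | 226 | 207 | 202 | 197 | 203 | 353 |
| Cluster 1 | 5 | 2 | 45 | 37 | 36 | 35 | 36 | 107 |
| Cluster 1 | 10 | 1 | 157 | 139 | 130 | 127 | 125 | 417 |
| Cluster 1 | 10 | 2 | 37 | 33 | 32 | 31 | 30 | 240 |
| Cluster 2 | 5 | 1 | 260 | 232 | 218 | 215 | 263 | 457 |
| Cluster 2 | 5 | 2 | 50 | 42 | 41 | 40 | 48 | 116 |
| Cluster 2 | 10 | 1 | 177 | 159 | 147 | 144 | 159 | 461 |
| Cluster 2 | 10 | 2 | 42 | 38 | 36 | 35 | 39 | 265 |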

NOTE: EST1 and EST2 are the proposed estimators based on equal weights and the “optimal” weights, respectively; EST3 is the maximum likelihood estimator with λ0(s) and X(s) known for all s ∈ D; and EST4 is the estimator from the ecological analysis when K = 400n², where K is the number of subregions Dk and n defines the region size. Clusters 1 and 2 are the inhomogeneous Poisson cluster processes with σ = 0.04 and 0.02, respectively. γ determines the smoothness of the covariates (see Section 6).

In the cluster process case with σ = 0.02, the MSE of the weighted estimator is consistently smaller than that of the Poisson maximum likelihood estimator. This indicates the benefit of accounting for non-Poisson behavior. When σ = 0.04, the weighted estimator can still out-perform the maximum likelihood estimator, but to a lesser extent because the spatial clustering is now weaker.

Table 2 presents the empirical size and power of the proposed test for non-Poisson behavior from 1000 simulations, using a nominal significance level of 10%. The empirical size is close to the nominal size in all cases. The empirical power generally increases when the study region becomes larger, when the clustering strength becomes stronger, and when the covariate process becomes less smooth. When K increases, so that the Dk become smaller, the power generally increases for the case σ = 0.02, but first increases and then decreases for the case σ = 0.04. Taken together, these results suggest that the test is most powerful when the size of each subregion Dk is well matched to the scale of the spatial clustering.

Table 2.

Empirical size and power from 1000 simulations for the proposed test for residual spatial clustering, at the 10% nominal significance level

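| Process | γ | n | K = 25n² | K = 100n² | K = 400n² |
|---|---|---|---|---|---|
| Poisson | 5 | 1 | 0.089 | 0.094 | 0.109 |
| Poisson | 5 | 2 | 0.101 | 0.093 | 0.120 |
| Poisson | 10 | 1 | 0.083 | 0.100 | 0.134 |
| Poisson | 10 | 2 | 0.111 | 0.089 | 0.096 |
| Cluster 1 | 5 | 1 | 0.490 | 0.686 | 0.571 |
| Cluster 1 | 5 | 2 | 0.890 | 0.993 | 0.971 |
| Cluster 1 | 10 | 1 | 0.562 | 0.760 | 0.571 |
| Cluster 1 | 10 | 2 | 0.943 | 0.997 | 0.968 |
| Cluster 2 | 5 | 1 | 0.629 | 0.981 | 0.998 |
| Cluster 2 | 5 | 2 | 0.966 | 1.000 | 1.000 |
| Cluster 2 | 10 | 1 | 0.746 | 0.983 | 0.997 |
| Cluster 2 | 10 | 2 | 0.992 | 1.000 | 1.000 |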

NOTE: Regression parameters are estimated by the proposed estimation method using equal weights. Clusters 1 and 2 are the inhomogeneous Poisson cluster process with σ = 0.04, 0.02, respectively, n denotes the region size, γ determines the smoothness of the covariates (see Section 6) and K is the number of subregions Dk.

7. APPLICATION: CHILDHOOD MENINGOCOCCAL DISEASE IN MERSEYSIDE, U.K.

7.1 Individual-Level Data

Individual-level data were obtained for all meningococcal disease patients admitted to Alder Hey Children’s Hospital, Liverpool, U.K. during the period January 1, 1981 to December 31, 2007. Alder Hey is the primary hospital in Merseyside for childhood illnesses. All patients admitted to Alder Hey during this time period were aged 16 or under. Data for each case included date of admission, full unit post-code of residence, age, and sex. Each post-code was converted to a grid reference using the online tool GeoConvert, developed by the Census Dissemination Unit at the University of Manchester, U.K. (http://cdu.mimas.ac.uk).

Although data were available for all meningococcal disease cases aged 16 or under, for reasons explained below, we analyze only cases aged 0 to 14 years old, inclusive. Patients who were admitted to Alder Hey but resided outside the Merseyside boundary were also excluded from the subsequent analysis, leaving 864 cases for the analysis.

7.2 Area-Level Data

Spatially aggregated control data from the 1981, 1991, and 2001 censuses for Merseyside were obtained from the Census Dissemination Unit via their website (http://casweb.mimas.ac.uk). The finest spatial resolution at which data were available was enumeration district (1981, 1991) or output area (2001). The average size of an enumeration district in England and Wales is approximately 450 residents, whereas the recommended size of each output area is approximately 300 residents. Population counts for each small area are available for each year of age and each sex. For the purposes of this analysis, the population of interest consists of the age range 0–15 years.

The Townsend score, a measure of social deprivation (Townsend, Phillimore, and Beattie 1988), was derived for each small area. This uses census variables relating to unemployment, car ownership, owner occupation, and overcrowding.

Owing to boundary changes across the three censuses of interest (1981, 1991, 2001), spatially aggregated census data are not strictly comparable across censuses. To resolve this, we needed to harmonize the boundaries such that all population counts correspond to the 2001 census output area level; the Merseyside study region consists of 4586 such output areas. We therefore redistributed counts from the 1981 and 1991 censuses at enumeration district level among the 2001 output areas in proportion to their respective areas of overlap.
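The area-proportional redistribution amounts to multiplying the source counts by a matrix of overlap shares. The sketch below uses a hypothetical overlap matrix; in practice the overlap areas would come from a GIS intersection of the two sets of boundaries, which is not shown.

```python
import numpy as np

def redistribute_counts(source_counts, overlap_area):
    """Split each source-zone count among target zones in proportion to overlap area.

    overlap_area[i, j] is the area shared by source zone i and target zone j.
    """
    shares = overlap_area / overlap_area.sum(axis=1, keepdims=True)
    return source_counts @ shares

# Hypothetical example: 2 enumeration districts mapped onto 3 output areas.
source_counts = np.array([100.0, 60.0])
overlap_area = np.array([[2.0, 2.0, 0.0],
                         [0.0, 1.0, 2.0]])
target_counts = redistribute_counts(source_counts, overlap_area)
print(target_counts)                         # [50. 70. 40.]
```

The redistribution preserves the overall total, since each row of shares sums to one.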

Midyear population estimates are available from the Office of National Statistics (ONS) at local authority level by sex and quinary age group (0 years, 1–4 years, 5–9 years, 10–14 years, 15–19 years, etc.) from 1981 to 2007 (Office of National Statistics 2004). To obtain population estimates at output area level for each year, we assumed that the annual percentage changes applied equally to all output areas within each local authority.

As midyear estimates of the variables required to calculate the Townsend score were not available, we estimated Townsend scores at output area level for each year by linearly interpolating the values calculated in each of the three census years and extrapolating at a constant level from 2001 onwards.
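The interpolation rule for the Townsend score can be illustrated for a single output area. The census-year scores below are made-up values; `numpy.interp` interpolates linearly between the supplied points and holds the last value constant beyond the final census year, which matches the rule described above.

```python
import numpy as np

census_years = np.array([1981, 1991, 2001])
census_scores = np.array([2.0, 1.0, 3.0])    # hypothetical scores for one output area
years = np.arange(1981, 2008)                # 1981, ..., 2007
scores = np.interp(years, census_years, census_scores)

print(scores[years == 1986])                 # halfway between 2.0 and 1.0
print(scores[years == 2007])                 # held at the 2001 value
```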

7.3 Results

We denote the years 1981 to 2007 by t = 1, 2, …, 27 and write X(s, t) for the vector of covariates associated with a child living at s in year t. Each X(s, t) includes age, sex, Townsend score, and indicator variables for the five local authorities that make up the county of Merseyside, namely Knowsley, Liverpool, St. Helens, Sefton, and Wirral. We model the case intensity as

$$\lambda(s, t) = \lambda_0(s, t) \exp[X(s, t)'\beta + h(t)],$$

where λ0(s, t) is the spatially varying population density at year t and h(t) describes the temporal trend in risk. We specify h(t) as a cubic spline with knots at t = 4, 8, 12, 16, and 20 with t = 1 for the year 1981. These knots are selected based on visual examination of the histogram of the yearly counts. We have also tried other choices for the knots and obtained very similar results.

We first fit the above model to our data using equal weights, to obtain estimates β̂e and ĥe(t). We then calculate weights w(s, t) = exp[X̄k(t)′β̂e + ĥe(t)], where X̄k(t) is the vector of average covariates at year t in the output area Dk that includes s. Table 3 gives the estimates for the regression coefficients and their associated standard error estimates on the assumption that the case process is an inhomogeneous Poisson process. Figure 1 plots the fitted temporal trend in risk, exp[ĥ(t)].

Table 3. Estimates of covariate effects in the model for meningococcal disease risk in Merseyside, U.K.

Covariate     Estimate   Std. error   95% CI
Knowsley      −9.74      0.35         (−10.43, −9.06)
Liverpool     −9.39      0.33         (−10.03, −8.76)
St. Helens    −11.38     0.51         (−12.38, −10.38)
Sefton        −9.74      0.37         (−10.46, −9.02)
Wirral        −11.26     0.43         (−12.10, −10.42)
Age           −0.21      0.01         (−0.24, −0.18)
Gender        0.25       0.13         (−0.00, 0.50)
Townsend      0.09       0.02         (0.05, 0.14)

NOTE: Figures in parentheses are 95% confidence intervals.

Figure 1. Estimated temporal trend component in the model for meningococcal disease risk in Merseyside, U.K.

The formal test of the Poisson assumption described in Section 4 yields a test statistic of −0.8009, corresponding to a two-sided p-value of 0.4232. Figure 2 shows the estimated pair correlation function ĝ(u; β̂) for separation distances u = 1, 2, …, 10 km, obtained using a bandwidth of 1 km. The estimate is approximately constant, taking a value reasonably close to one throughout the plotted range, again consistent with the Poisson assumption.
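The quoted p-value is what one obtains by referring the test statistic to a standard normal distribution. The sketch below assumes that this is the reference distribution; the function name is hypothetical.

```python
import math

def two_sided_normal_p(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1), via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_value = two_sided_normal_p(-0.8009)
print(round(p_value, 4))
```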

Figure 2. Estimated pair correlation function for the point process of case locations of meningococcal disease cases in Merseyside, U.K.

The results support the hypothesis of a positive association between area-level social deprivation and individual-level meningococcal disease risk. After adjusting for the effects of age, sex, and local authority, a unit increase in the Townsend score at output-area level increases the risk by a factor of 1.1 (95% confidence interval 1.0493 to 1.1466).

The results also revealed a significant local authority effect, with highest risk in Liverpool and lowest in St. Helens after controlling for other risk factors. The local authorities that constitute Merseyside vary greatly in terms of their social composition and degree of urbanization.

Age, as anticipated, is a significant risk factor for the disease, with risk multiplied by a factor of 0.8146 (95% confidence interval 0.7928 to 0.8370) for each year of increase in age after controlling for other risk factors. We find no significant association between meningococcal disease risk and sex.
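The multiplicative factors quoted for the Townsend score and for age are exponentiated regression coefficients. The sketch below shows the conversion, assuming a Wald-type interval of the estimate plus or minus 1.96 standard errors; the function name is hypothetical. Applied to the rounded entries of Table 3, it reproduces the quoted factors only approximately, since those were presumably computed from unrounded estimates.

```python
import math

def rate_ratio(estimate, std_error, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald interval."""
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return math.exp(estimate), lower, upper

rr_townsend = rate_ratio(0.09, 0.02)   # rounded values from Table 3
rr_age = rate_ratio(-0.21, 0.01)
```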

The estimated temporal trend in risk over the study period indicates a large increase over the period 1995 to 2000 and a decrease thereafter. The decrease is likely to be due in part to the introduction of the MenC vaccine in November 1999, after which the incidence of serogroup C meningococcal disease in the target age groups began to decline (Trotter et al. 2004).

PCR testing was introduced as a meningococcal disease confirmation test in October 1996. Following this, there were improvements in the ascertainment of meningococcal disease cases, which may explain this increase (Carrol et al. 2000; Gray et al. 2006).

We are unable to explain the secondary peak in estimated risk centred on 1988. There is a known relationship between previous viral upper respiratory tract infections, including influenza, and meningococcal disease risk (Stuart et al. 1996; Jensen et al. 2004). During the 1989/90 winter season, an influenza epidemic in England and Wales was followed by an increase in the national incidence of meningococcal disease cases. Although no direct evidence of causality is available, Cartwright and Jones (1991) demonstrated a possible association between the two events. However, we do not have data relating to the Merseyside area that would allow us to investigate the relationship between local incidence of flu or flu-like illnesses and meningococcal disease risk.

With regard to our main finding of an association with deprivation, earlier ecological studies in Gwent, S. Wales at enumeration district level (Fone et al. 2003), in the eastern region of England at local authority ward level (Williams et al. 2004) and in North East Thames at local authority ward level (Jones et al. 1997) have also reported a significant positive association between the Townsend score and meningococcal disease risk. However, in each of these studies the target for inference was the ecological, rather than individual, association (Greenland and Morgenstern 1989). Also, none of the earlier studies allowed for temporal trends in risk.

Haynesa and Gale (1999) recognized that the relationship between health and deprivation is not uniform across the U.K., but varies according to the geographical type of area. In particular, the relationship between deprivation measures such as the Townsend score and health tends to be weaker in rural areas than in urban areas (Townsend, Phillimore, and Beattie 1988). For example, car ownership contributes to the Townsend score but is more weakly related to deprivation in rural than in urban areas. We therefore tested for interaction between Townsend score and the indicator variables for the local authorities, but the result was not significant at the 5% level.

8. DISCUSSION

We have proposed a method for analyzing spatially referenced, individual-level health outcomes that does not require individual-level control data. Instead, the method uses small-area level information on the population at risk, consisting of numbers of individuals and average values of relevant covariates for each small-area unit. The method delivers estimates of individual-level risk-factor effects, an inferential procedure based on the assumption that case locations are a realization of an inhomogeneous Poisson point process model, and a diagnostic test for departure from the Poisson assumption. The ability to handle this kind of data structure is essential for our application to the Merseyside childhood meningococcal disease data.

Simulation studies show that our method works well in practice. Firstly, it yields estimators that can be competitive in terms of efficiency by comparison with maximum likelihood estimation under the unrealistic assumption that the continuous spatial variation in the density of the population at risk is known exactly. Secondly, the diagnostic test has the correct size and detects residual spatial clustering of cases, with power which depends in a sensible way on the strength of the residual clustering.

Our analysis of the Merseyside childhood meningococcal disease data adds to the evidence of a positive association between small-area level social deprivation and disease risk and provides, for the first time, an estimate of the size of the individual-level effect.

In some situations there is more information than simply the mean of each predictor in each subregion. For example, statistics related to higher-order moments of each predictor, such as its standard deviation, may be available. Each higher-order moment will lead to a new set of estimating equations. The availability of such information would allow us to consider higher-order models, for example models with linear and quadratic terms in each of the predictors. Our proposed methods can be directly applied in such cases by treating each higher-order term as a new predictor. However, when only a linear model is fitted, it is not obvious how best to incorporate into the analysis the additional estimating equations that are generated by the higher-order moments; in particular, it is difficult to develop intuitively appealing weights.

One potential use of the higher-order information would be to refine the choice of the weights. Our current proposal amounts to expanding exp[X(s)′β0] at X(s) = X̄k for s ∈ Dk. Alternatively, we might consider

$$\exp[X(s)'\beta_0]\approx\exp(\bar{X}_k'\beta_0)\{1+\beta_0'[X(s)-\bar{X}_k]+\beta_0'[X(s)-\bar{X}_k][X(s)-\bar{X}_k]'\beta_0\}.$$

If a covariance matrix Sk is available for the predictors in Dk, we can instead define wk = exp(X̄k′β̂)(1 + β̂′Skβ̂). These new weights are likely to be useful when the covariates show a high level of spatial variation within subregions. However, in our simulations we found that they made almost no difference to the resulting estimates of β.
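The two weight choices, exp(X̄k′β̂) and exp(X̄k′β̂)(1 + β̂′Skβ̂), can be compared numerically for a single subregion. This is a minimal sketch; the function names and the toy values of X̄k, Sk, and β̂ are hypothetical.

```python
import numpy as np

def mean_based_weight(xbar_k, beta_hat):
    """Weight using only the subregion mean of the covariates."""
    return np.exp(xbar_k @ beta_hat)

def covariance_adjusted_weight(xbar_k, S_k, beta_hat):
    """Weight that also uses the within-subregion covariance matrix S_k."""
    return np.exp(xbar_k @ beta_hat) * (1.0 + beta_hat @ S_k @ beta_hat)

xbar_k = np.array([1.0, 2.0])          # hypothetical covariate means in D_k
S_k = np.array([[1.0, 0.5],
                [0.5, 2.0]])           # hypothetical covariance matrix in D_k
beta_hat = np.array([0.1, -0.2])       # hypothetical coefficient estimates

w_mean = mean_based_weight(xbar_k, beta_hat)
w_cov = covariance_adjusted_weight(xbar_k, S_k, beta_hat)
```

The ratio of the two weights is 1 + β̂′Skβ̂, so they differ appreciably only when within-subregion covariate variation is large relative to the size of the coefficients.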

Acknowledgments

Yongtao Guan’s research was supported in part by NSF grant DMS-0845368 and by NIH/NIDCR grant UL1 DE019586. Michelle Stanton’s research was supported by the award of a research studentship by the U.K. Engineering and Physical Sciences Research Council. Data on childhood meningococcal disease in Merseyside, U.K., were collected by Alistair Thomson, Omnia Marzouk, Andrew Riordan, Paul Baines, Enitan Carrol, Scott Hackett, and Niten Makwana, Alder Hey Children’s Hospital, Merseyside.

APPENDIX: TECHNICAL DETAILS

Taylor series expansion for G(β̂)

Recall that

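$$G(\beta)=\sum_{k=1}^{K}G_{kk}(\beta),$$

where

$$G_{kk}(\beta)=\sum_{i_1,i_2=1}^{M}\frac{I(s_{i_1}\in D_k,\ s_{i_2}\in D_k,\ i_1\neq i_2)}{\exp[X(s_{i_1})'\beta]\exp[X(s_{i_2})'\beta]}.$$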

A Taylor series expansion gives

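$$\begin{aligned}
G(\hat{\beta})&\approx G(\beta_0)-2\sum_{k=1}^{K}\sum_{i_1,i_2=1}^{M}\frac{I(s_{i_1},s_{i_2}\in D_k,\ i_1\neq i_2)\,X(s_{i_1})'}{\exp[X(s_{i_1})'\beta_0]\exp[X(s_{i_2})'\beta_0]}(\hat{\beta}-\beta_0)\\
&\approx G(\beta_0)+2\sum_{k=1}^{K}\sum_{i_1,i_2=1}^{M}\frac{I(s_{i_1},s_{i_2}\in D_k,\ i_1\neq i_2)\,X(s_{i_1})'}{\exp[X(s_{i_1})'\beta_0]\exp[X(s_{i_2})'\beta_0]}\times A(W)^{-1}U(\beta_0;W)\\
&\approx G(\beta_0)+2\Big[\sum_{k=1}^{K}\mu_{1k}\mu_k'\Big]A(W)^{-1}U(\beta_0;W),
\end{aligned}$$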

where μk = {μjk: j = 1, …, p} is a p-element vector, and the last approximation is due to the fact that under the null hypothesis of a Poisson process,

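$$E\Big\{\sum_{k=1}^{K}\sum_{i_1,i_2=1}^{M}\frac{I(s_{i_1},s_{i_2}\in D_k,\ i_1\neq i_2)\,X(s_{i_1})}{\exp[X(s_{i_1})'\beta_0]\exp[X(s_{i_2})'\beta_0]}\Big\}=\sum_{k=1}^{K}\mu_{1k}\mu_k.$$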

Expression for the variance of G(β0)

Under the null hypothesis, it follows from Campbell’s theorem (Daley and Vere-Jones 1988) that

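$$\operatorname{var}[G_{kk}(\beta_0)]=2\int_{D_k}\int_{D_k}\frac{\lambda_0(s_1)\lambda_0(s_2)}{\exp[X(s_1)'\beta_0]\exp[X(s_2)'\beta_0]}\,ds_1\,ds_2+4\int_{D_k}\int_{D_k}\int_{D_k}\frac{\lambda_0(s_1)\lambda_0(s_2)\lambda_0(s_3)}{\exp[X(s_1)'\beta_0]}\,ds_1\,ds_2\,ds_3.$$

Because ∫Dk λ0(s) ds = μ1k, the last term is equal to

$$4(\mu_{1k})^2\int_{D_k}\frac{\lambda_0(s)}{\exp[X(s)'\beta_0]}\,ds.$$

This immediately yields the variance of G(β0), since the Gkk(β0) for disjoint subregions Dk are independent under the Poisson assumption and hence var[G(β0)] = Σ_{k=1}^{K} var[Gkk(β0)].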
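The mean and variance of Gkk(β0) under the Poisson null can be checked by simulation. The sketch below is an illustrative toy case, not the paper's simulation study: it assumes a single subregion Dk = [0, 1], a constant population density λ0, and a single covariate X(s) = s, and all names and parameter values are hypothetical. In this setting the expected value is (μ1k)², and the variance is given by the expression above.

```python
import numpy as np

LAM0, BETA = 50.0, 1.0   # hypothetical population density and coefficient (BETA > 0)

def simulate_G(rng):
    """One draw of G_kk(beta0) for a Poisson process on D_k = [0, 1] with X(s) = s."""
    lam_max = LAM0 * np.exp(BETA)              # upper bound on the case intensity
    n = rng.poisson(lam_max)
    s = rng.uniform(0.0, 1.0, size=n)
    # Thinning: keep each proposed point with probability intensity / lam_max.
    keep = rng.uniform(size=n) < np.exp(BETA * (s - 1.0))
    a = np.exp(-BETA * s[keep])                # 1 / exp[X(s_i) * beta0]
    # Sum of a_i1 * a_i2 over ordered pairs with i1 != i2.
    return a.sum() ** 2 - np.sum(a ** 2)

rng = np.random.default_rng(0)
draws = np.array([simulate_G(rng) for _ in range(4000)])

mu1 = LAM0                                          # integral of lambda0 over D_k
inv_int = LAM0 * (1.0 - np.exp(-BETA)) / BETA       # integral of lambda0 / exp(X * beta0)
mean_theory = mu1 ** 2
var_theory = 2.0 * inv_int ** 2 + 4.0 * mu1 ** 2 * inv_int
```

Comparing `draws.mean()` with `mean_theory` and `draws.var()` with `var_theory` gives a quick sanity check of the two expressions.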

Expression for the covariance of G(β0) and U(β0; W)

Recall that U(β; W) is a p-element vector with jth element Uj(β; W): j = 1, …, p, where

$$U_j(\beta;W)=\sum_{k=1}^{K}w_k[\mu_{jk}-\hat{\mu}_{jk}(\beta)]=\sum_{k=1}^{K}w_k\mu_{jk}-\sum_{k=1}^{K}w_k\sum_{i=1}^{M}\frac{X_j(s_i)}{\exp[X(s_i)'\beta]}I(s_i\in D_k).$$

Under the null hypothesis, it follows from Campbell’s theorem that

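$$\operatorname{cov}[G_{kk}(\beta_0),U_j(\beta_0;W)]=-2w_k\int_{D_k}\int_{D_k}\frac{\lambda_0(s_1)\lambda_0(s_2)X_j(s_1)}{\exp[X(s_1)'\beta_0]}\,ds_1\,ds_2=-2w_k\mu_{1k}\int_{D_k}\frac{\lambda_0(s)X_j(s)}{\exp[X(s)'\beta_0]}\,ds.$$

This immediately implies that

$$\operatorname{cov}[G(\beta_0),U(\beta_0;W)]=-2\sum_{k=1}^{K}w_k\mu_{1k}\int_{D_k}\frac{\lambda_0(s)X(s)}{\exp[X(s)'\beta_0]}\,ds.$$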

Contributor Information

Peter J. Diggle, Email: p.diggle@lancaster.ac.uk, School of Health and Medicine, Lancaster University, Lancaster, U.K. and Adjunct Professor, Department of Biostatistics, Johns Hopkins University School of Public Health, Baltimore.

Yongtao Guan, Email: yongtao.guan@yale.edu, Division of Biostatistics, Yale University, New Haven, CT 06520-8034.

Anthony C. Hart, Division of Medical Microbiology, University of Liverpool, Liverpool, U.K. (deceased September 2007).

Fauzia Paize, Email: fauziap@liv.ac.uk, Institute of Child Health, University of Liverpool, Alder Hey Children’s NHS Foundation Trust, Liverpool, U.K.

Michelle Stanton, Email: m.stanton@lancaster.ac.uk, School of Health and Medicine, Lancaster University, Lancaster, U.K.

References

  1. Baddeley AJ, Møller J, Waagepetersen R. Non- and Semi-Parametric Estimation of Interaction in Inhomogeneous Point Patterns. Statistica Neerlandica. 2000;54 (3):329–350.
  2. Best N, Ickstadt K, Wolpert R. Spatial Poisson Regression for Health and Exposure Data Measured at Disparate Resolutions. Journal of the American Statistical Association. 2000;95:1076–1088.
  3. Bevilacqua M, Mateu J, Porcu E, Zhang H, Zini A. Weighted Composite Likelihood-Based Tests for Space-Time Separability of Covariance Functions. Statistics and Computing. 2009;20:283–293.
  4. Breslow N, Day N. Statistical Methods in Cancer Research: The Analysis of Case-Control Studies. Lyon: International Agency for Research on Cancer; 1980.
  5. Carrol E, Thomson A, Riordan F, Fellick J, Shears P, Sills J, Hart C. Increasing Microbiological Confirmation and Changing Epidemiology of Meningococcal Disease on Merseyside, England. Clinical Microbiology and Infection. 2000;6:259–262. doi: 10.1046/j.1469-0691.2000.00078.x.
  6. Cartwright KAV, Jones DM. Influenza A and Meningococcal Disease. The Lancet. 1991;338:554–557. doi: 10.1016/0140-6736(91)91112-8.
  7. Crowder M. On Consistency and Inconsistency of Estimating Equations. Econometric Theory. 1986;2:305–330.
  8. Daley DJ, Vere-Jones D. An Introduction to the Theory of Point Processes. New York: Springer-Verlag; 1988.
  9. Diggle PJ. Statistical Analysis of Spatial Point Patterns. New York: Academic Press; 2003.
  10. Diggle PJ, Gómez-Rubio V, Brown PE, Chetwynd AG, Gooding S. Second-Order Analysis of Inhomogeneous Spatial Point Processes Using Case-Control Data. Biometrics. 2007;63 (2):550–557. doi: 10.1111/j.1541-0420.2006.00683.x.
  11. Fone DL, Harries J, Lester N, Nehaul L. Meningococcal Disease and Social Deprivation: A Small Area Geographical Study in Gwent, UK. Epidemiology and Infection. 2003;130:53–58. doi: 10.1017/s095026880200794x.
  12. Gray SJ, Trotter CL, Ramsay ME, Guiver M, Fox AJ, Borrow R, Mallard RH, Kaczmarski EB. Epidemiology of Meningococcal Disease in England and Wales 1993/94 to 2003/04: Contribution and Experiences of the Meningococcal Reference Unit. Journal of Medical Microbiology. 2006;55:887–896. doi: 10.1099/jmm.0.46288-0.
  13. Greenland S, Morgenstern H. Ecological Bias, Confounding and Effect Modification. International Journal of Epidemiology. 1989;18:269–274. doi: 10.1093/ije/18.1.269.
  14. Greenland S, Robins J. Ecological Studies: Biases, Misconceptions and Counterexamples. American Journal of Epidemiology. 1994;139:747–760. doi: 10.1093/oxfordjournals.aje.a117069.
  15. Guan Y, Shen Y. A Weighted Estimating Equation Approach for Inhomogeneous Spatial Point Processes. Biometrika. 2010 to appear.
  16. Guyon X. Random Fields on a Network: Modeling, Statistics, and Applications. New York: Springer-Verlag; 1995.
  17. Haneuse S, Wakefield J. Hierarchical Models for Combining Ecological and Case-Control Data. Biometrics. 2007;63:128–136. doi: 10.1111/j.1541-0420.2006.00673.x.
  18. Haneuse S, Wakefield J. The Combination of Ecological and Case-Control Data. Journal of the Royal Statistical Society, Ser B. 2008a;70:73–93. doi: 10.1111/j.1467-9868.2007.00628.x.
  19. Haneuse S, Wakefield J. Geographic-Based Ecological Correlation Studies Using Supplemental Case-Control Data. Statistics in Medicine. 2008b;27:864–887. doi: 10.1002/sim.2979.
  20. Haynesa R, Gale S. Mortality, Long-Term Illness and Deprivation in Rural and Metropolitan Wards of England and Wales. Health and Place. 1999;5:301–312. doi: 10.1016/s1353-8292(99)00020-9.
  21. Jackson C, Best N, Richardson S. Improving Ecological Inference Using Individual-Level Data. Statistics in Medicine. 2006;25:2136–2159. doi: 10.1002/sim.2370.
  22. Jensen ES, Lundbye-Christensen S, Samuelsson S, Sørensen HT, Schønheyder HC. A 20-Year Ecological Study of the Temporal Association Between Influenza and Meningococcal Disease. European Journal of Epidemiology. 2004;19:181–187. doi: 10.1023/b:ejep.0000017659.80903.5f.
  23. Jones IR, Urwin G, Feldman RA, Banatvala N. Social Deprivation and Bacterial Meningitis in North East Thames Region: Three Year Study Using Small Area Statistics. British Medical Journal. 1997;314:794. doi: 10.1136/bmj.314.7083.794.
  24. Møller J, Waagepetersen R. Statistical Inference and Simulation for Spatial Point Process. New York: Chapman & Hall; 2004.
  25. Office of National Statistics. A Short Guide to Population Estimates. 2004 available at http://www.statistics.gov.uk/downloads/theme_population/Short_Guide_revision_Nov_04_final.pdf.
  26. Prentice R, Sheppard L. Aggregate Data Studies of Disease Risk Factors. Biometrika. 1995;82:113–125.
  27. Sheppard L, Prentice R. On the Reliability and Precision of Within- and Between-Population Estimates of Relative Rate Parameters. Biometrics. 1995;51:853–863.
  28. Stuart JM, Robinson PM, Cartwright K, Noah ND. Antibiotic Prescribing During an Outbreak of Meningococcal Disease. Epidemiology and Infection. 1996;117:103–105. doi: 10.1017/s0950268800001187.
  29. Townsend P, Phillimore P, Beattie A. Health and Deprivation: Inequality and the North. London: Routledge; 1988.
  30. Trotter CL, Andrews NJ, Kaczmarski EB, Miller E, Ramsay ME. Effectiveness of Meningococcal Serogroup C Conjugate Vaccine 4 Years After Introduction. The Lancet. 2004;364:365–367. doi: 10.1016/S0140-6736(04)16725-1.
  31. Waagepetersen RP. An Estimating Function Approach to Inference for Inhomogeneous Neyman–Scott Processes. Biometrics. 2007;63 (1):252–258. doi: 10.1111/j.1541-0420.2006.00667.x.
  32. Wakefield J. Ecological Inference for 2 × 2 Tables (with discussion). Journal of the Royal Statistical Society, Ser A. 2004;167:385–445.
  33. Williams CJ, Willocks LJ, Lake IR, Hunter PR. Geographic Correlation Between Deprivation and Risk of Meningococcal Disease: An Ecological Study. BMC Public Health. 2004;4:30. doi: 10.1186/1471-2458-4-30.
