Author manuscript; available in PMC: 2014 Mar 1.
Published in final edited form as: Scand Stat Theory Appl. 2012 Jul 26;40(1):119–137. doi: 10.1111/j.1467-9469.2012.00795.x

Decomposition of Variance for Spatial Cox Processes

Abdollah Jalilian 1, Yongtao Guan 2, Rasmus Waagepetersen 3
PMCID: PMC3626311  NIHMSID: NIHMS359873  PMID: 23599558

Abstract

Spatial Cox point processes are a natural framework for quantifying the various sources of variation governing the spatial distribution of rain forest trees. We introduce a general criterion for variance decomposition for spatial Cox processes and apply it to specific Cox process models with additive or log linear random intensity functions. We moreover consider a new and flexible class of pair correlation function models given in terms of normal variance mixture covariance functions. The proposed methodology is applied to point pattern data sets of locations of tropical rain forest trees.

Keywords: additive random intensity, composite likelihood, Cox process, Matérn covariance function, normal variance mixture, pair correlation function, variance component

1 Introduction

The spatial distributions of tropical rain forest trees are influenced by many factors including e.g. spatially varying environmental conditions, seed dispersal, infectious diseases, and gap creation by hurricanes. One natural question is, loosely speaking, ‘how much of the variation’ in the spatial distribution of a tropical rain forest tree species can be attributed to each of these factors? In this paper we study methods for addressing this question with a particular focus on the contribution of the environment.

The most fundamental summaries of variation are the variances of counts of trees in bounded regions. A generalized linear mixed model (GLMM) for such counts, with a variance component for each of the sources of variation, might be a starting point for decomposing the variance. However, with this approach one loses the fine-scale information contained in extensive tropical rain forest data sets which include locations of individual trees and not just numbers of trees in certain regions (e.g. Condit, 1998). Moreover, conclusions obtained from a fitted GLMM may in general be strongly dependent on the sizes and shapes of the regions used to create the count data.

A more natural and flexible approach is to model the individual tree locations as a spatial point process (e.g. Møller and Waagepetersen, 2004) and then derive summaries of variation from a fitted spatial point process model. With this approach one can compute variances and covariances of counts in arbitrary regions and therefore recover all summaries that would be obtained from a GLMM approach. In particular, Cox processes (e.g. Møller and Waagepetersen, 2004) provide a useful framework for integrating different sources of variation.

This paper introduces a general criterion for decomposition of variance for Cox processes. The criterion is applied to specific Cox process models with either additive or log linear random intensity functions. The additive model is appealing in the context of variance decomposition but has not been well studied in the point process literature. For either model, the resulting variance decomposition depends on the assumed pair correlation function which characterizes the spatial correlation in the Cox process. It is therefore important to have a wide class of pair correlation functions to choose from. To this end we consider a new class of shot-noise Cox processes with normal variance mixture pair correlation functions. These functions are closely related to widely used counterparts in geostatistics and hence serve as a bridge to further connect the theories of geostatistics and point processes.

1.1 A tropical rain forest data example

Waagepetersen and Guan (2009) fitted so-called inhomogeneous Thomas processes to locations of three tree species, namely Acalypha diversifolia (528 trees), Lonchocarpus heptaphyllus (836 trees), and Capparis frondosa (3299 trees) that were alive in 1995 in the 1000 m by 500 m Barro Colorado Island plot (Condit et al., 1996; Condit, 1998; Hubbell and Foster, 1983). The significant covariates in the fitted Thomas models were elevation and potassium for Acalypha and Capparis and nitrogen and phosphorus for Lonchocarpus. The point patterns of tree locations and the covariate potassium are shown in Figure 1.

Figure 1. Locations of Acalypha, Lonchocarpus, and Capparis trees and image of interpolated potassium content in the surface soil (from top to bottom).

In Section 7 we study decomposition of variance for these data sets. We moreover employ a much broader class of Cox processes than the inhomogeneous Thomas processes used in Waagepetersen and Guan (2009).

2 Background on spatial point processes

Let $X$ be a point process on $\mathbb{R}^2$ and let $N(B)$, for any bounded $B \subseteq \mathbb{R}^2$, denote the number of points in $X \cap B$. The first- and second-order moments of the counts $N(B)$ are determined by the intensity function $\rho$ and the second-order product density $\rho^{(2)}$ of $X$, which are functions defined on $\mathbb{R}^2$ and $\mathbb{R}^2 \times \mathbb{R}^2$, respectively (see Møller and Waagepetersen, 2004). More precisely,

$$\mathbb{E}N(B) = \int_B \rho(u)\,\mathrm{d}u$$

and

$$\mathbb{E}N(A)N(B) = \int_{A \cap B} \rho(u)\,\mathrm{d}u + \int_A\int_B \rho^{(2)}(u, v)\,\mathrm{d}u\,\mathrm{d}v$$

for bounded $A, B \subseteq \mathbb{R}^2$. The pair correlation function $g$ is given by $g(u, v) = \rho^{(2)}(u, v)/[\rho(u)\rho(v)]$.

For a Poisson process, the counts in disjoint regions are independent and Poisson distributed. A Cox process is driven by a random intensity function $\Lambda = \{\Lambda(u) : u \in \mathbb{R}^2\}$. Conditional on $\Lambda = \lambda$, the Cox process becomes a Poisson process with intensity function $\lambda$. For a Cox process, $\rho(u) = \mathbb{E}\Lambda(u)$ and $\rho^{(2)}(u, v) = \mathbb{E}[\Lambda(u)\Lambda(v)]$.

3 Decomposition of variance

To quantify the various sources of variation in a spatial point pattern we consider the contribution of each source to the total variation of a count N(B) for a region B. According to the previous section,

$$\mathbb{V}\mathrm{ar}\,N(B) = \int_B \rho(u)\,\mathrm{d}u + \int_B\int_B \rho(u)\rho(v)[g(u, v) - 1]\,\mathrm{d}u\,\mathrm{d}v. \tag{1}$$

The first term on the right-hand side of (1) is the variance of N(B) for a Poisson process with intensity function ρ(·). The second term is the increase (or decrease) in variance due to possible attraction (or repulsion) between points corresponding to g > 1 (or g < 1).
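As a rough numerical illustration of (1), the following sketch approximates the two terms of Var N(B) for a square region by a midpoint rule. It assumes a stationary process with constant intensity and an exponential pair correlation function of the form g(r) = 1 + a exp(−r/η), as in (14); the function names, the grid size and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def count_variance(rho, side, pcf, n=15):
    """Midpoint-rule approximation of the two terms of Var N(B) in (1)
    for B = [0, side]^2, constant intensity rho and an isotropic pair
    correlation function pcf(r). Returns (Poisson term, extra term)."""
    h = side / n
    centres = [((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n)]
    poisson_part = rho * side ** 2
    extra = 0.0
    for (x1, y1) in centres:
        for (x2, y2) in centres:
            r = math.hypot(x1 - x2, y1 - y2)
            extra += pcf(r) - 1.0
    # each pair of grid cells carries weight h^2 * h^2
    extra *= rho ** 2 * h ** 4
    return poisson_part, extra

def exponential_pcf(a, eta):
    """g(r) = 1 + a * exp(-r / eta); a plays the role of sigma_0^2 / rho_0^2."""
    return lambda r: 1.0 + a * math.exp(-r / eta)

# Hypothetical values: 0.001 points per unit area on a 50 x 50 square.
poisson_part, extra = count_variance(0.001, 50.0, exponential_pcf(2.0, 10.0))
```

With g identically equal to one the second term vanishes and the Poisson variance is recovered; with g > 1 the second term is positive, which is the extra-Poisson variance discussed above.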

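For a Cox process $X$ driven by a random intensity function $\Lambda$, we can decompose a count $N(B)$ as $N(B) = I_\Lambda(B) + \hat N_\Lambda(B)$ where

$$\hat N_\Lambda(B) = \mathbb{E}[N(B)|\Lambda] = \int_B \Lambda(u)\,\mathrm{d}u$$

is the minimum mean square error predictor of $N(B)$ given $\Lambda$ and $I_\Lambda(B) = N(B) - \hat N_\Lambda(B)$ is an innovation process in the terminology of Baddeley et al. (2005). Then $I_\Lambda(B)$ and $\hat N_\Lambda(B)$ are uncorrelated processes with

$$\mathbb{E}I_\Lambda(B) = 0, \quad \mathbb{C}\mathrm{ov}[I_\Lambda(A), I_\Lambda(B)] = \int_{A \cap B} \rho(u)\,\mathrm{d}u$$

and

$$\mathbb{C}\mathrm{ov}[\hat N_\Lambda(A), \hat N_\Lambda(B)] = \int_A\int_B \rho(u)\rho(v)[g(u, v) - 1]\,\mathrm{d}u\,\mathrm{d}v$$

since $\mathbb{C}\mathrm{ov}[\Lambda(u), \Lambda(v)] = \rho(u)\rho(v)[g(u, v) - 1]$. The process $I_\Lambda$ may be viewed as a ‘nugget’ process in geostatistical terminology since $I_\Lambda(A)$ and $I_\Lambda(B)$ are uncorrelated when $A$ and $B$ are disjoint. The spatially correlated sources of variation causing the extra-Poisson variance in (1) enter via $\hat N_\Lambda$.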

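In Section 4 we model $\Lambda$ in terms of a random covariate process $Z$. To quantify how much of the spatially structured variation is due to $Z$ we further decompose $\hat N_\Lambda(B) = I_{\Lambda,Z}(B) + \hat N_{\Lambda,Z}(B)$ into uncorrelated components

$$\hat N_{\Lambda,Z}(B) = \mathbb{E}[\hat N_\Lambda(B)|Z] = \int_B \mathbb{E}[\Lambda(u)|Z]\,\mathrm{d}u$$

and

$$I_{\Lambda,Z}(B) = \hat N_\Lambda(B) - \hat N_{\Lambda,Z}(B) = \int_B \{\Lambda(u) - \mathbb{E}[\Lambda(u)|Z]\}\,\mathrm{d}u.$$

If all random variation in $\Lambda$ is due to $Z$ then $\mathbb{V}\mathrm{ar}\,I_{\Lambda,Z}(B) = 0$. Further, $\mathbb{V}\mathrm{ar}\,\hat N_{\Lambda,Z}(B) = 0$ if $\Lambda$ is independent of $Z$. The quantity

$$R^2(B) = \frac{\mathbb{V}\mathrm{ar}\,\hat N_{\Lambda,Z}(B)}{\mathbb{V}\mathrm{ar}\,\hat N_\Lambda(B)} = \frac{\mathbb{V}\mathrm{ar}\int_B \mathbb{E}[\Lambda(u)|Z]\,\mathrm{d}u}{\mathbb{V}\mathrm{ar}\left[\int_B \Lambda(u)\,\mathrm{d}u\right]} \tag{2}$$

thus describes the proportion of the variance of $\int_B \Lambda(u)\,\mathrm{d}u$ explained by $Z$. For sufficiently small $B$, $R^2(B)$ may be approximated by

$$R^2 = \frac{\mathbb{V}\mathrm{ar}\,\mathbb{E}[\Lambda(u_B)|Z]}{\mathbb{V}\mathrm{ar}\,\Lambda(u_B)}, \quad u_B \in B \tag{3}$$

which in the stationary case does not depend on $B$. The quantity $R^2$ can also be viewed as an analogue of the $R^2$ statistic for linear regression. To see this connection, we define the expected ‘sums of squares’

$$SSR = \mathbb{E}\int_S I_R^2(u)\,\mathrm{d}u, \quad SSE = \mathbb{E}\int_S I_E^2(u)\,\mathrm{d}u$$

and

$$SST = SSR + SSE$$

where $I_R(u) = \hat\Lambda_{|Z}(u) - \rho(u)$, $I_E(u) = \Lambda(u) - \hat\Lambda_{|Z}(u)$, and $\hat\Lambda_{|Z}(u) = \mathbb{E}[\Lambda(u)|Z]$. Then, in the case of a stationary random intensity function $\Lambda$, the ratio $SSR/SST$ (known as $R^2$ for linear regression) coincides with (3). We discuss our $R^2$ criterion in relation to specific models in Section 4. Note that $R^2$ is not a goodness-of-fit criterion. In particular, $R^2 = 0$ whenever $\Lambda$ is independent of $Z$.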

4 Models for rain forest data

Let $X$ be the process that generates the point pattern of locations for a particular tree species. We assume that $X$ is a Cox process driven by a random intensity function $\Lambda$ where $\Lambda$ depends on a stationary process $Z = \{Z(u) : u \in S\}$, $Z(u) = [Z_1(u), \ldots, Z_p(u)]$, of $p$ observed covariates and a non-negative process $\Lambda_0 = \{\Lambda_0(u) : u \in S\}$ representing unobserved sources of variation. We assume that $\Lambda$ depends on $Z$ through $\tilde Z = \{\beta_0 + \beta_{1:p}Z(u)\}_{u \in S}$ for some parameter $\beta = (\beta_0, \beta_{1:p}) = (\beta_0, \beta_1, \ldots, \beta_p) \in \mathbb{R}^{p+1}$ and that $\tilde Z$ and $\Lambda_0$ are independent second-order stationary processes.

We further let $\rho_0 = \mathbb{E}\Lambda_0(u)$ and $g_0(u - v) = \mathbb{E}[\Lambda_0(u)\Lambda_0(v)]/\rho_0^2$. Then $\rho_0$ and $g_0$ become respectively the intensity and the pair correlation function of a Cox process driven by the random intensity function $\Lambda_0$. Note that

$$g_0(u - v) = c_0(u - v)/\rho_0^2 + 1 \tag{4}$$

where $c_0$ is the covariance function for $\Lambda_0$. Moreover $\sigma_0^2 = \mathbb{V}\mathrm{ar}[\Lambda_0(u)]$ is equal to $\rho_0^2[g_0(0) - 1]$. In Section 5 we discuss parametric models for $g_0$ or equivalently $c_0$.

To quantify how much variation is due to Z, it is appropriate to model Z as a random field. However, since Z is observed, it is often convenient to base parameter estimation on X|Z where Z is then treated as a fixed quantity, see Section 6.

4.1 An additive model

Variance decomposition is straightforward for the following additive model:

$$\Lambda(u) = \tilde Z(u) + \Lambda_0(u) = \beta_0 + \beta_{1:p}Z(u) + \Lambda_0(u). \tag{5}$$

Thus, $X$ can be considered as a superposition of two independent Cox processes with random intensity functions $\tilde Z$ and $\Lambda_0$. A drawback of this model is that $\Lambda$ is not positive for all values of $\beta$ and $Z$. This is probably why (5) has not attracted much interest in the point process literature; Best et al. (2000) is one notable exception. In Section 4.2 we discuss an alternative log linear model for which positivity of $\Lambda$ is guaranteed.

Conditional on Z, the intensity function of X becomes

$$\rho(u|Z; \beta) = \rho_0 + \beta_0 + \beta_{1:p}Z(u).$$

In practice, a negative estimate of the intensity function may be obtained. This did not, however, happen in our data examples (Section 7). Possible problems with fitted negative intensity functions may be mitigated by enforcing linear constraints of the type $\rho_0 + \beta_0 + \beta_{1:p}Z(u_i) \ge 0$ for a finite set of locations $u_i$, e.g. including the points of $X$. This is for instance possible with the R procedure constrOptim().

Further, given Z, the second-order product density and the pair correlation function become

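$$\rho^{(2)}(u, v|Z; \beta) = \rho(u|Z; \beta)\rho(v|Z; \beta) + c_0(u - v)$$

and $g(u, v|Z; \beta) = 1 + c_0(u - v)/[\rho(u|Z; \beta)\rho(v|Z; \beta)]$.

For (5) we obtain a straightforward decomposition of the variance of $\Lambda(u)$ into the sum of $\sigma_Z^2 = \mathbb{V}\mathrm{ar}\,\tilde Z(u)$ and $\sigma_0^2 = \mathbb{V}\mathrm{ar}\,\Lambda_0(u)$. Hence (3) becomes

$$R^2 = \frac{\sigma_Z^2}{\sigma_Z^2 + \sigma_0^2}.$$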
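The additive-model ratio is simple enough to check by hand. The sketch below evaluates it for the Matérn fits of the additive model, using the estimates of the two variances listed in Table 1 and reading the powers of ten in that table as 10^-6 (an assumption about the table's garbled exponents). The function and variable names are illustrative only.

```python
def r2_additive(sigma2_z, sigma2_0):
    """R^2 for the additive model (5): share of Var Lambda(u) due to the covariates."""
    return sigma2_z / (sigma2_z + sigma2_0)

# (sigma_Z^2 estimate, sigma_0^2 estimate) for the additive Matern fits in Table 1
fits = {
    "Acalypha": (0.11e-6, 6.1e-6),
    "Lonchocarpus": (0.29e-6, 2.8e-6),
    "Capparis": (4.06e-6, 28.8e-6),
}
r2 = {name: round(r2_additive(sz, s0), 2) for name, (sz, s0) in fits.items()}
```

Rounded to two decimals, these inputs give 0.02, 0.09 and 0.12, in line with the R2 column reported for the same rows of Table 1.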

4.2 A log linear model

The multiplicative log linear random intensity function

$$\Lambda(u) = \exp[\tilde Z(u) + \log\Lambda_0(u)] = \Lambda_0(u)\exp[\beta_0 + \beta_{1:p}Z(u)] \tag{6}$$

is always non-negative. This model has an appealing interpretation in terms of location dependent thinning (Waagepetersen, 2007) where $X$ is obtained by independent thinning of a Cox process driven by $\Lambda_0$ and the probability of retaining a point at $u$ is proportional to $\exp[\tilde Z(u)]$. Hence in the tropical rain forest context, the covariates may be regarded as influencing the survival of plants in a stationary process of seedlings.

4.2.1 Intensity and pair correlation

Without loss of generality we assume in case of (6) that $\rho_0 = \mathbb{E}\Lambda_0(u) = 1$. Conditional on $Z$, the intensity function of $X$ is

$$\rho(u|Z; \beta) = \mathbb{E}[\Lambda(u)|Z] = \exp[\tilde Z(u)],$$

the pair correlation function $g(u - v|Z) = g_0(u - v)$ coincides with $g_0$, and

$$\rho^{(2)}(u, v|Z) = \rho(u|Z; \beta)\rho(v|Z; \beta) + \rho(u|Z; \beta)\rho(v|Z; \beta)c_0(u - v).$$

Unconditionally, the intensity and the pair correlation function become

$$\rho(u) = \rho_{\exp Z} = \mathbb{E}\exp[\tilde Z(u)] \quad\text{and}\quad g(u - v) = g_0(u - v)\,g_{\exp Z}(u - v)$$

where $\rho_{\exp Z}$ and $g_{\exp Z}$ are the intensity and the pair correlation function of a Cox process with random intensity function $\exp[\tilde Z(\cdot)]$.

4.2.2 Decomposition of variance

For a Cox process with random intensity function (6),

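$$\mathbb{V}\mathrm{ar}[\Lambda(u)] = \sigma_{\exp Z}^2 + \sigma_0^2[\sigma_{\exp Z}^2 + \rho_{\exp Z}^2]$$

where

$$\sigma_{\exp Z}^2 = \mathbb{V}\mathrm{ar}\,\mathbb{E}[\Lambda(u)|Z] = \mathbb{V}\mathrm{ar}\exp[\tilde Z(u)].$$

According to (3), $R^2$ becomes

$$R^2 = \frac{\sigma_{\exp Z}^2}{\sigma_{\exp Z}^2 + \sigma_0^2[\sigma_{\exp Z}^2 + \rho_{\exp Z}^2]} = \frac{g_{\exp Z}(0) - 1}{g_{\exp Z}(0)g_0(0) - 1}.$$

A related approach is to consider the proportion of variance of $\log\Lambda$ explained by $\tilde Z$. If both $\tilde Z$ and $\log\Lambda_0$ are Gaussian then (Møller et al., 1998)

$$\mathbb{V}\mathrm{ar}\log\Lambda(u) = \mathbb{V}\mathrm{ar}\,\tilde Z(u) + \mathbb{V}\mathrm{ar}\log\Lambda_0(u) = \log g_0(0) + \log g_{\exp Z}(0).$$

The proportion of variance for $\log\Lambda$ explained by $Z$ then becomes

$$\frac{\log g_{\exp Z}(0)}{\log g_0(0) + \log g_{\exp Z}(0)}$$

which is related to $R^2$ by the approximation $\log(x) \approx x - 1$.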
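The two expressions for R2 in the log linear case, and the related log-scale proportion, can be compared numerically. The sketch below assumes the convention of Section 4.2.1 that the intensity of the unobserved field equals one, so that g0(0) = 1 + σ0² and g_expZ(0) = 1 + σ²_expZ/ρ²_expZ. The input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def r2_log_linear(sigma2_expz, rho_expz, sigma2_0):
    """R^2 for the log linear model (6), written in terms of variances."""
    return sigma2_expz / (sigma2_expz + sigma2_0 * (sigma2_expz + rho_expz ** 2))

def r2_from_pcf(g_expz_0, g0_0):
    """The same R^2, written in terms of pair correlation values at lag zero."""
    return (g_expz_0 - 1.0) / (g_expz_0 * g0_0 - 1.0)

def log_scale_proportion(g_expz_0, g0_0):
    """Proportion of Var log Lambda explained by the covariates (Gaussian case)."""
    return math.log(g_expz_0) / (math.log(g0_0) + math.log(g_expz_0))

# Hypothetical inputs, not estimates from the paper.
rho_expz, sigma2_expz, sigma2_0 = 2.0, 0.4, 0.5
g_expz_0 = 1.0 + sigma2_expz / rho_expz ** 2   # pair correlation of exp(Z~) at 0
g0_0 = 1.0 + sigma2_0                          # uses rho_0 = 1

r2_a = r2_log_linear(sigma2_expz, rho_expz, sigma2_0)
r2_b = r2_from_pcf(g_expz_0, g0_0)
r2_log = log_scale_proportion(g_expz_0, g0_0)
```

Here `r2_a` and `r2_b` agree up to floating point error, while `r2_log` differs from them because log(x) ≈ x − 1 is only accurate when the pair correlation values are close to one.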

5 Models for g0

Our estimation procedure in Section 6 requires the specification of parametric models for the intensity function and the second-order product density of $X|Z$. Hence it remains to specify a parametric model for the pair correlation function $g_0$, or equivalently (cf. (4)), for the covariance function $c_0$ of $\Lambda_0$. A given covariance function $c_0$ is not necessarily the covariance function of a non-negative random field, and $g_0 = c_0/\rho_0^2 + 1$ hence might not be the pair correlation function of a Cox process. In the literature on Cox processes, pair correlation models have therefore been obtained from explicit constructions of non-negative random fields. The two most popular constructions are log Gaussian and shot-noise fields leading to log Gaussian Cox processes (Møller et al., 1998) and shot-noise Cox processes (Møller, 2003). The most common examples of Neyman-Scott and Poisson-cluster processes are special cases of shot-noise Cox processes.

5.1 Log Gaussian Cox processes

A log Gaussian random field is obtained by exponentiating a Gaussian random field Y and the pair correlation function for the corresponding Cox process is simply the exponentiated covariance function of Y, i.e.

$$g_0(u - v) = \exp[c(u - v)] \tag{7}$$

where $c(u - v) = \mathbb{C}\mathrm{ov}[Y(u), Y(v)]$ (Møller et al., 1998).

5.2 Shot-noise Cox and cluster processes

A shot-noise field is in the simplest form obtained by a sum of positive kernel functions k(· − u) scaled by a parameter ξ > 0 and centered around points u of a homogeneous Poisson point process with intensity κ > 0. A Cox process driven by a shot-noise field can equivalently be considered as a Poisson-cluster process given by a superposition of clusters where for each parent point u, a Poisson number of offspring is dispersed independently around u according to the density k(· − u). The resulting pair correlation function becomes

$$g_0(r) = 1 + \kappa\xi^2\int_{\mathbb{R}^2} k(u)k(u + r)\,\mathrm{d}u. \tag{8}$$

Suppose $g_0$ is a function so that $c_0(\cdot) = g_0(\cdot) - 1$ is an absolutely integrable covariance function. Further, let $\tilde k$ denote the inverse Fourier transform of the square root of the spectral density for $c_0$. Then $g_0$ has a representation of the form (8) (see e.g. pages 65 and 489 in Chilès and Delfiner, 1999) with $k = \tilde k$. However, $g_0$ is not a pair correlation function of a shot-noise Cox process unless $\tilde k$ is non-negative. In general for a given $g_0$, this property of $\tilde k$ is not easy to verify.

The usual approach in the literature on Poisson-cluster and shot-noise Cox processes (e.g. Stoyan et al., 1995; Diggle, 2003; Møller, 2003; Illian et al., 2008; Hellmund et al., 2008) is in fact to specify first k and then obtain g0 rather than the other way around. It is, however, remarkable that in the literature on Poisson-cluster/shot-noise Cox processes essentially only two specific examples of k are considered. For the so-called modified Thomas process, k is a Gaussian density with standard deviation ω; in which case c0 is a Gaussian covariance function

$$c_0(r) = \sigma_0^2\exp[-(\|r\|/\eta)^2] \tag{9}$$

where $\sigma_0^2 = \kappa\xi^2/(\pi\eta^2)$ and $\eta = 2\omega$. For the Matérn cluster process, $k$ is the density of a uniform distribution on a disc of radius $r_M$, say. For the Matérn cluster process, $c_0$ has a less simple expression (e.g. Stoyan and Stoyan, 1995) than for the modified Thomas process but an important feature is that $c_0(r) = 0$ for $r > r_M$. Hence, the modified Thomas process and the Matérn cluster model can only produce light tailed or extremely light tailed covariance functions and this is often not appropriate e.g. for tropical rain forest data. In Wiegand et al. (2007) and Tanaka et al. (2008) more general processes are constructed using modified Thomas processes as building blocks but again only light tailed correlation structures are obtained. Tanaka et al. (2008) also consider an inverse-power type kernel $k$ but this does not admit a closed form expression for the pair correlation function.
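The relation between the Gaussian kernel of the modified Thomas process and the Gaussian covariance function (9) can be checked numerically. The sketch below approximates the integral of k(u)k(u + r) on a grid and compares it with exp[−(r/η)²]/(πη²) for η = 2ω, which is (9) with κξ² set to one. The grid size, the integration box and the parameter values are arbitrary illustrative choices.

```python
import math

def gaussian_kernel(x, y, omega):
    """Bivariate Gaussian density with standard deviation omega in each coordinate."""
    return math.exp(-(x * x + y * y) / (2.0 * omega ** 2)) / (2.0 * math.pi * omega ** 2)

def kernel_autoconvolution(r, omega, half_width=6.0, n=120):
    """Midpoint-rule approximation of  int k(u) k(u + r) du  for the lag (r, 0),
    integrating over the box [-half_width * omega, half_width * omega]^2."""
    lo = -half_width * omega
    h = 2.0 * half_width * omega / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        for j in range(n):
            y = lo + (j + 0.5) * h
            total += gaussian_kernel(x, y, omega) * gaussian_kernel(x + r, y, omega)
    return total * h * h

def gaussian_cov_shape(r, eta):
    """exp[-(r / eta)^2] / (pi * eta^2): the form (9) with kappa * xi^2 = 1."""
    return math.exp(-(r / eta) ** 2) / (math.pi * eta ** 2)

omega = 1.5
numeric = kernel_autoconvolution(2.0, omega)
closed_form = gaussian_cov_shape(2.0, 2.0 * omega)
```

The two numbers agree to high precision, which illustrates why η = 2ω and why the variance scales as 1/(πη²).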

In Section 5.3 we consider a more flexible class of normal variance mixture shot-noise fields where the kernel k and the pair correlation function are given in terms of a normal variance mixture. These normal variance mixtures provide flexibility for modeling of both shape at the origin and tail behavior of the pair correlation function.

5.3 Normal variance mixtures

Our starting point for obtaining flexible kernel functions $k$ for shot-noise Cox processes is the class of normal variance mixtures

$$f(r) = \int_0^\infty \varphi(r; s)h(s)\,\mathrm{d}s \tag{10}$$

where $\varphi(\cdot; s)$ is the density of a zero-mean two-dimensional Gaussian vector with covariance matrix $sI$ and $h$ is some probability density on $\mathbb{R}_+$. Any function of the form (10) is a positive definite function on $\mathbb{R}^2$ (e.g. Schlather, 1999). If anisotropy is required, the identity matrix $I$ may be replaced by a suitable non-diagonal correlation matrix.

One can easily verify that if $h$ is a convolution $h = \tilde h * \tilde h$ then so is $f$, $f = k * k$ where

$$k(u) = \int_0^\infty \varphi(u; s)\tilde h(s)\,\mathrm{d}s \tag{11}$$

which is non-negative. Hence, for the shot-noise field with kernel $k$ we obtain $c_0(r) = \kappa\xi^2 f(r)$. A wide class of covariance functions is obtained by choosing $h$ in the class of generalized inverse Gaussian distributions (e.g. Jørgensen, 1982) which includes gamma, inverse gamma and inverse Gaussian distributions as special cases. The resulting class of normal variance mixtures is the class of generalized hyperbolic distributions (Barndorff-Nielsen, 1977, 1978). By Barndorff-Nielsen and Halgreen (1977) any generalized inverse Gaussian distribution is infinitely divisible and hence any generalized hyperbolic density can be represented as a convolution. However, for a generalized inverse Gaussian density $h$ it is not always easy to identify the convolution density $\tilde h$. We discuss below some special cases of generalized hyperbolic distributions where $\tilde h$ and hence $k$ are explicitly known. A concise overview of multivariate generalized hyperbolic distributions is given in Sections 2.1–2.2 of Hu (2005).

5.3.1 Variance gamma/Bessel

A gamma density

$$h(s; \alpha, \beta) = \beta^\alpha s^{\alpha - 1}\exp(-\beta s)/\Gamma(\alpha), \quad s > 0,$$

with parameters α, β > 0 is a convolution of h(·; α/2, β) with itself. The densities f and k given by (10) and (11) become so-called variance gamma or Bessel densities (in the one-dimensional case an early reference is McKay, 1932). Specifically k is of the form

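$$k(u) = \frac{1}{\pi 2^{\nu' + 1}\eta^2\Gamma(\nu' + 1)}(\|u\|/\eta)^{\nu'}K_{\nu'}(\|u\|/\eta) \tag{12}$$

where $\nu' = \alpha/2 - 1$, $\eta = (2\beta)^{-1/2}$, and $K_\nu$ is a modified Bessel function of the second kind. The covariance function for the corresponding shot-noise field becomes

$$c_0(r) = \sigma_0^2\frac{(\|r\|/\eta)^\nu K_\nu(\|r\|/\eta)}{2^{\nu - 1}\Gamma(\nu)} \tag{13}$$

where $\nu = \alpha - 1$ and $\sigma_0^2 = \kappa\xi^2/(4\pi\eta^2\nu)$. The resulting Matérn pair correlation function is

$$g_0(r) = 1 + \sigma_0^2\frac{(\|r\|/\eta)^\nu K_\nu(\|r\|/\eta)}{2^{\nu - 1}\Gamma(\nu)\rho_0^2}$$

where $\rho_0 = \kappa\xi$.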

The smoothness parameter ν controls the shape of the pair correlation function and gives additional flexibility in the modeling. For instance, ν = 1/2 yields the exponential model

$$g_0(r) = 1 + \sigma_0^2\exp(-\|r\|/\eta)/\rho_0^2 \tag{14}$$

which offers more slowly decaying correlations than the Thomas pair correlation function obtained with (9). The Gaussian covariance function (9) may be viewed as a limiting case of a Matérn covariance function when ν → ∞ (Stein, 1999, page 50).

For any fixed ν (9.7.2. in Abramowitz and Stegun, 1965)

$$K_\nu(x) = \frac{\exp(-x)}{\sqrt{2x/\pi}}\,(1 + O(1/x)), \tag{15}$$

so in the tails $\|r\| \to \infty$, the Matérn covariance function behaves as $\|r\|^{\nu - 1/2}\exp(-\|r\|)$. By Corollary 2 in Yu (2011) the Matérn covariance function is log concave for $\nu \ge 1/2$. Moreover, by the same corollary, $c_0(tr_0)$ is log convex as a function of $t > 0$ for any $r_0 \in \mathbb{R}^2$ when $\nu \le 1/2$.
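To make the Matérn formulas concrete, the sketch below evaluates (13) using only the Python standard library. Since the standard library has no Bessel functions, K_ν is approximated here through the integral representation K_ν(x) = ∫₀^∞ exp(−x cosh t) cosh(νt) dt with a trapezoidal rule; this is a swapped-in numerical device for illustration, not the authors' implementation, and the step count and cut-off are arbitrary choices.

```python
import math

def bessel_k(nu, x, n=4000):
    """Approximate K_nu(x) for x > 0 by applying the trapezoidal rule to
    int_0^inf exp(-x cosh t) cosh(nu t) dt, truncated where the integrand
    is negligible."""
    t_max = math.acosh(1.0 + 700.0 / x)

    def integrand(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)

    h = t_max / n
    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for i in range(1, n):
        total += integrand(i * h)
    return total * h

def matern_cov(r, sigma2_0, eta, nu):
    """Matern covariance function (13) at distance r >= 0."""
    if r == 0.0:
        return sigma2_0
    x = r / eta
    return sigma2_0 * x ** nu * bessel_k(nu, x) / (2.0 ** (nu - 1.0) * math.gamma(nu))

# nu = 1/2 should reproduce the exponential model behind (14)
value = matern_cov(1.3, 2.0, 1.0, 0.5)
exponential = 2.0 * math.exp(-1.3)

# leading term of the tail approximation (15) at a moderately large argument
tail_ratio = bessel_k(1.0, 20.0) / (math.exp(-20.0) / math.sqrt(2.0 * 20.0 / math.pi))
```

With ν = 1/2 the computed value matches σ0² exp(−r/η), and `tail_ratio` is close to one, consistent with the 1 + O(1/x) factor in (15).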

5.3.2 Cauchy

An inverse gamma density with parameters α, β > 0 is of the form

$$h(s; \alpha, \beta) = \beta^\alpha s^{-\alpha - 1}\exp(-\beta/s)/\Gamma(\alpha), \quad s > 0.$$

It is in general not easy to identify the corresponding convolution density but the case $\alpha = 1/2$ is an exception. With $\alpha = 1/2$ the Laplace transform of $h$ is simply $L(t) = \exp(-2\sqrt{-\beta t})$, $t \le 0$, so that $h(\cdot; 1/2, \beta)$ is a convolution of $h(\cdot; 1/2, \beta/4)$ with itself. The corresponding kernel function becomes a Cauchy density

$$k(u) = \frac{1}{2\pi\omega^2}[1 + (\|u\|/\omega)^2]^{-3/2}$$

with $\omega = \sqrt{\beta/2}$ and the covariance function becomes

$$c_0(r) = \sigma_0^2[1 + (\|r\|/\eta)^2]^{-3/2} \tag{16}$$

with $\sigma_0^2 = \kappa\xi^2/(2\pi\eta^2)$ and $\eta = 2\omega$. The Cauchy covariance function is polynomially decreasing and hence more suitable than the Matérn model for modeling of a slowly decaying covariance. The Cauchy covariance is log concave in a neighbourhood around the origin.

5.3.3 Normal inverse Gauss

The density of an inverse Gauss distribution is given by

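$$h(s; \mu, \lambda) = \left(\frac{\lambda}{2\pi s^3}\right)^{1/2}\exp\left[-\frac{\lambda(s - \mu)^2}{2\mu^2 s}\right]$$

with $\mu, \lambda > 0$. The inverse Gauss density $h(\cdot; \mu, \lambda)$ is the convolution of $h(\cdot; \mu/2, \lambda/4)$ with itself. The resulting normal inverse Gauss kernel is

$$k(u) = \frac{(\gamma/4)^{1/2}\exp\left(\sqrt{\gamma/4}\right)K_{3/2}\left(\sqrt{\gamma/4 + (\|u\|/\eta)^2}\right)}{\eta^2\pi^{3/2}\sqrt{2}\,[\gamma/4 + (\|u\|/\eta)^2]^{3/4}}$$

where $\gamma = (\lambda/\mu)^2$ and $\eta = \mu/\sqrt{\lambda}$ and

$$c_0(r) = \sigma_0^2\frac{K_{3/2}\left(\sqrt{\gamma + (\|r\|/\eta)^2}\right)\gamma^{3/4}}{K_{3/2}(\sqrt{\gamma})\,[\gamma + (\|r\|/\eta)^2]^{3/4}}$$

where $\sigma_0^2 = \kappa\xi^2\gamma^{1/2}\exp(\sqrt{\gamma})K_{3/2}(\sqrt{\gamma})/(\eta^2\pi^{3/2}\sqrt{2}\,\gamma^{3/4})$. By the asymptotic formula (15), the normal inverse Gauss covariance behaves as $\exp(-\|r\|)/\|r\|^2$ in the tails. The parameter $\gamma$ may be viewed as a kind of ‘offset’ parameter for the squared distance $\|r\|^2$.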

Remark

The convolution representations of the Matérn and Cauchy covariance functions above are not new and were found using Fourier transforms in Matérn (1986) (see Table 1, page 30 - a detailed derivation for the Matérn class is moreover provided in Jonsdottir et al., 2011). However, these representations and the shot-noise/Poisson-cluster processes with the corresponding kernels have not been considered previously in the point process literature. Moreover, the normal variance mixture perspective leads to straightforward simulation of offspring locations when the resulting shot-noise Cox process is viewed as a Poisson-cluster process. The Matérn and the normal inverse Gauss covariance functions both have three parameters. However, the Matérn class may seem preferable since the parameter ν allows for modeling of a range of kernel and covariance shapes varying from log convex to log concave.
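The remark on simulation can be illustrated as follows: to draw an offspring displacement from a normal variance mixture kernel, first draw a variance S from the mixing density of the kernel and then a zero-mean Gaussian vector with covariance matrix S·I. The sketch below does this for the variance gamma kernel (gamma mixing with shape α/2 and rate β) and for the Cauchy kernel (inverse gamma mixing with shape 1/2 and scale β/4), following the convolution representations in Sections 5.3.1 and 5.3.2. Function names, the seed and the parameter values are illustrative assumptions.

```python
import math
import random

def offspring_displacement(draw_variance, rng):
    """One displacement from a normal variance mixture kernel as in (11):
    draw S from the mixing law, then a N(0, S * I) vector in the plane."""
    s = draw_variance(rng)
    sd = math.sqrt(s)
    return rng.gauss(0.0, sd), rng.gauss(0.0, sd)

def variance_gamma_mixing(alpha, beta):
    """Mixing law for the kernel (12): gamma with shape alpha / 2 and rate beta.
    random.gammavariate takes (shape, scale), hence scale = 1 / beta."""
    return lambda rng: rng.gammavariate(alpha / 2.0, 1.0 / beta)

def cauchy_mixing(beta):
    """Mixing law for the Cauchy kernel: inverse gamma with shape 1/2 and
    scale beta / 4, drawn as scale / Gamma(1/2, 1)."""
    return lambda rng: (beta / 4.0) / rng.gammavariate(0.5, 1.0)

rng = random.Random(1)
vg = [offspring_displacement(variance_gamma_mixing(3.0, 0.5), rng) for _ in range(20000)]
cauchy = [offspring_displacement(cauchy_mixing(2.0), rng) for _ in range(20000)]
```

A full Poisson-cluster simulation would add these displacements to parent points from a homogeneous Poisson process with intensity κ, with a Poisson number of offspring per parent. Two simple sanity checks are that the mean squared displacement under the variance gamma kernel is close to α/β, and that about half of the Cauchy displacements fall within distance ω√3 of the parent, where ω = √(β/2).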

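Table 1. Estimates of β, ψ and R2 and maximal values of the composite likelihoods CL1 and CL2 for the log linear and additive models with Gaussian, Cauchy, Matérn and LG-Matérn covariance functions. For LG-Matérn, abusing notation, ψ̂ denotes the parameter estimate for the covariance function of the Gaussian field Y. The estimate ρ̂ is given by N(W)/|W|, and C is the constant listed for each species. The first part gives, for each species and model, β̂1:2, the maximal CL1 and the estimate of σ²expZ (log linear model) or σ²Z (additive model). The second part gives ψ̂ = (σ̂0², η̂, ν̂), CL2(ψ̂|β̂) − C and R2 for each covariance function.

| Species | ρ̂ | C | Model | β̂1:2 | CL1(β̂) | σ̂²expZ or σ̂²Z |
|---|---|---|---|---|---|---|
| Acalypha | 1056×10⁻⁶ | −6291053.0 | log linear | (0.02, 0.005) | −4117.7 | 0.13×10⁻⁶ |
| | | | additive | (15.5, 4.9)×10⁻⁶ | −4119.7 | 0.11×10⁻⁶ |
| Lonchocarpus | 1672×10⁻⁶ | −6168628.5 | log linear | (−0.03, −0.16) | −6117.6 | 0.45×10⁻⁶ |
| | | | additive | (−38.2, −193.3)×10⁻⁶ | −6121.9 | 0.29×10⁻⁶ |
| Capparis | 6598×10⁻⁶ | −5089810.54 | log linear | (0.03, 0.004) | −19693.0 | 4.84×10⁻⁶ |
| | | | additive | (193.2, 24.8)×10⁻⁶ | −19700.1 | 4.06×10⁻⁶ |

| Species | Model | Covariance | σ̂0² | η̂ | ν̂ | CL2 − C | R2 |
|---|---|---|---|---|---|---|---|
| Acalypha | log linear | Gaussian | 13.7 | 4.4 | ∞ | 27438.39 | 0.01 |
| | | Cauchy | 11.9 | 4.9 | – | 28744.97 | 0.01 |
| | | Matérn | 8.5 | 4.7 | 0.69 | 28507.05 | 0.01 |
| | | LG-Matérn | 2.4 | 5.6 | 1.02 | 28675.13 | 0.01 |
| | additive | Gaussian | 11.6×10⁻⁶ | 4.1 | ∞ | 0 | 0.01 |
| | | Cauchy | 8.1×10⁻⁶ | 5.2 | – | 1724.32 | 0.01 |
| | | Matérn | 6.1×10⁻⁶ | 5.8 | 0.56 | 1128.50 | 0.02 |
| | | LG-Matérn | 6.1×10⁻⁶ | 5.8 | 0.56 | 1128.50 | 0.02 |
| Lonchocarpus | log linear | Gaussian | 1.1 | 28.4 | ∞ | 82006.98 | 0.11 |
| | | Cauchy | 1.8 | 18.4 | – | 82174.76 | 0.07 |
| | | Matérn | 2.0 | 14.0 | 0.65 | 82326.85 | 0.06 |
| | | LG-Matérn | 1.1 | 14.8 | 0.86 | 82344.08 | 0.06 |
| | additive | Gaussian | 1.5×10⁻⁶ | 36.6 | ∞ | 0 | 0.17 |
| | | Cauchy | 2.1×10⁻⁶ | 27.4 | – | 934.78 | 0.12 |
| | | Matérn | 2.8×10⁻⁶ | 23.1 | 0.41 | 702.26 | 0.09 |
| | | LG-Matérn | 2.8×10⁻⁶ | 23.1 | 0.41 | 702.26 | 0.09 |
| Capparis | log linear | Gaussian | 0.25 | 69.8 | ∞ | 5012.70 | 0.28 |
| | | Cauchy | 0.43 | 43.1 | – | 5223.48 | 0.18 |
| | | Matérn | 0.76 | 48.2 | 0.22 | 5342.78 | 0.11 |
| | | LG-Matérn | 0.59 | 49.7 | 0.26 | 5361.99 | 0.11 |
| | additive | Gaussian | 10.0×10⁻⁶ | 70.2 | ∞ | 0 | 0.29 |
| | | Cauchy | 15.0×10⁻⁶ | 48.7 | – | 285.51 | 0.21 |
| | | Matérn | 28.8×10⁻⁶ | 51.6 | 0.21 | 466.02 | 0.12 |
| | | LG-Matérn | 28.8×10⁻⁶ | 51.6 | 0.21 | 466.02 | 0.12 |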

6 Parameter estimation

In practice we consider a parametric model $c_0(\cdot; \psi)$ for the covariance function of $\Lambda_0$ where $c_0$ could for instance be the Matérn covariance function (13) with $\psi = (\sigma_0^2, \eta, \nu)$. Given $(X, Z)$ observed within $W \subseteq \mathbb{R}^2$, we then obtain a plug-in estimate for $R^2$ in (3) after estimating the three parameters $\beta$, $\psi$, and $\sigma_Z^2$ (or $\sigma_{\exp Z}^2$).

6.1 Composite likelihood estimation

The inference about β and ψ is based on X|Z. The regression parameter β can be estimated by the first-order composite log likelihood function (CL1) (Schoenberg, 2005; Waagepetersen, 2007)

$$CL_1(\beta) = \sum_{u \in X}\log\rho(u|Z; \beta) - \int_W \rho(u|Z; \beta)\,\mathrm{d}u. \tag{17}$$

This is the log likelihood function in the case where X is a Poisson process with intensity function ρ(·|Z; β). The Berman-Turner quadrature scheme can be used to approximate the integral in (17) both for the log linear and the additive model, see e.g. Baddeley and Turner (2000).
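A minimal sketch of how (17) might be evaluated for the log linear model with a single covariate is given below. For simplicity the integral is approximated by a midpoint rule on a regular grid, which is a simplified stand-in for the Berman-Turner scheme mentioned above and not the scheme itself. The function name, the covariate, the point pattern and the window are all hypothetical.

```python
import math

def cl1_log_linear(beta, points, covariate, window, n=50):
    """First-order composite log likelihood (17) for the intensity
    rho(u | Z; beta) = exp(beta[0] + beta[1] * Z(u)) on a rectangular window.
    The integral is approximated by a midpoint rule on an n x n grid."""
    (x0, x1), (y0, y1) = window
    hx, hy = (x1 - x0) / n, (y1 - y0) / n

    def log_rho(x, y):
        return beta[0] + beta[1] * covariate(x, y)

    data_term = sum(log_rho(x, y) for x, y in points)
    integral = 0.0
    for i in range(n):
        for j in range(n):
            integral += math.exp(log_rho(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy))
    return data_term - integral * hx * hy

# Hypothetical data: five points on the unit square, covariate Z(x, y) = x.
points = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.5), (0.7, 0.3), (0.9, 0.8)]
window = ((0.0, 1.0), (0.0, 1.0))
value = cl1_log_linear((math.log(5.0), 0.0), points, lambda x, y: x, window)
```

With the slope fixed at zero, (17) reduces to n·β0 − exp(β0)|W|, which is maximized at β0 = log(n/|W|); this gives a simple check of the implementation. In practice the function would be passed to a numerical optimizer over β.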

For a Cox process specified by the log linear random intensity function (6) one may subsequently estimate ψ using minimum contrast estimation based on the K-function, see Waagepetersen (2007) and Waagepetersen and Guan (2009). However, for the additive model the K-function is not well-defined since the pair correlation function is not translation invariant (Baddeley et al., 2000). As suggested by Guan (2006) and Waagepetersen (2007), ψ can instead be estimated using a second-order composite likelihood function

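$$CL_2(\psi|\beta) = \sum_{u, v \in X}^{\neq} w(u, v)\log\rho^{(2)}(u, v|Z; \beta, \psi) - \int_{W^2} w(u, v)\rho^{(2)}(u, v|Z; \beta, \psi)\,\mathrm{d}u\,\mathrm{d}v \tag{18}$$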

based on the second-order product density where w is a weight function, see e.g. (20) and (21) below. The integral term in (18) can also be approximated by a variant of the Berman-Turner scheme. The second-order composite likelihood can be evaluated for both the log linear model and the additive model and the maximized CL2 can be used as a criterion for model selection, see Section 7.

Instead of maximizing the right-hand side of (18) with respect to both β and ψ, a computationally simpler approach is to obtain β̂ by maximizing (17) and then ψ̂ by maximizing CL2(·|β̂). Moreover, an equivalent version of CL2(ψ|β̂) is given by

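$$CL_2(\psi|\hat\beta) = \sum_{u, v \in X}^{\neq} w(u, v)\log g(u, v|Z; \hat\beta, \psi) - \int_{W^2} w(u, v)\,\mathbb{C}\mathrm{ov}[\Lambda(u), \Lambda(v)|Z; \hat\beta, \psi]\,\mathrm{d}u\,\mathrm{d}v \tag{19}$$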

which is obtained from CL2(ψ|β̂) by subtracting the second-order composite likelihood of a Poisson process with intensity function ρ(·|Z; β̂). More stable convergence results were obtained using (19) instead of (18) when using a standard implementation of the Nelder-Mead algorithm for maximizing the second-order composite likelihood.

In simulation studies and applications to real data we discovered that the second-order composite likelihood estimates can be quite sensitive to the accuracy of the Berman-Turner quadrature scheme used to approximate the double integral in (19). This is especially the case when c0 is steep at zero like for (13) with a small ν or small η. Regarding the weight function w,

$$w(u, v) = 1[\|u - v\| \le t]/[\pi t^2] \tag{20}$$

is the standard choice which ensures that only t-close pairs of points are used in the composite likelihood. In the data example in Section 7 we also considered

$$w(u, v) = 1[\|u - v\| \le t]/[\pi t^2\rho(u|Z, \hat\beta)\rho(v|Z, \hat\beta)]. \tag{21}$$

For log linear models, this w implies a simplification of the double integral since the intensity function is eliminated from the integrand.
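The simplification can be sketched as follows for the log linear model with ρ0 = 1, where g = g0 = 1 + c0 and Cov[Λ(u), Λ(v)|Z] = ρ(u|Z)ρ(v|Z)c0(u − v). With the weight (21) the fitted intensities cancel in the integral term of (19), leaving only the indicator and c0. The code below uses the exponential covariance (the Matérn model with ν = 1/2) and a plain midpoint rule in place of a Berman-Turner type scheme; the names and numerical values are illustrative assumptions.

```python
import math

def exponential_c0(r, sigma2_0, eta):
    """Exponential covariance, i.e. the Matern model (13) with nu = 1/2."""
    return sigma2_0 * math.exp(-r / eta)

def cl2_pair_term(points, rho_hat, sigma2_0, eta, t):
    """Sum over ordered pairs of distinct points of w(u, v) * log g0(u - v),
    with w as in (21); rho_hat[i] is the fitted intensity at points[i]."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        for j, (x2, y2) in enumerate(points):
            if i == j:
                continue
            d = math.hypot(x1 - x2, y1 - y2)
            if d <= t:
                w = 1.0 / (math.pi * t ** 2 * rho_hat[i] * rho_hat[j])
                total += w * math.log(1.0 + exponential_c0(d, sigma2_0, eta))
    return total

def cl2_integral_term(window, sigma2_0, eta, t, n=20):
    """Midpoint-rule approximation of the double integral in (19) under (21):
    the integrand reduces to 1[|u - v| <= t] * c0(u - v) / (pi * t^2)."""
    (x0, x1), (y0, y1) = window
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    cells = [(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
             for i in range(n) for j in range(n)]
    total = 0.0
    for (xa, ya) in cells:
        for (xb, yb) in cells:
            d = math.hypot(xa - xb, ya - yb)
            if d <= t:
                total += exponential_c0(d, sigma2_0, eta)
    return total * (hx * hy) ** 2 / (math.pi * t ** 2)

def cl2_log_linear(points, rho_hat, window, sigma2_0, eta, t, n=20):
    """Second-order composite likelihood in the form (19) with weight (21)."""
    return (cl2_pair_term(points, rho_hat, sigma2_0, eta, t)
            - cl2_integral_term(window, sigma2_0, eta, t, n))
```

As noted above, the result can be sensitive to the accuracy of the quadrature when c0 is steep at zero, so the grid size n in such a sketch would need to be chosen with care.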

6.2 Estimation of environmental variances

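In practice $Z$ is observed on a grid $G = \{u_i\}_{i = 1, \ldots, M}$ covering $W$. Since we assume that $Z$ is a stationary process, simple estimators of $\sigma_Z^2$ and $\sigma_{\exp Z}^2$ are given by

$$\hat\sigma_Z^2 = \frac{1}{M}\sum_{u \in G}[\hat Z(u) - \hat\rho_Z]^2 \tag{22}$$

and

$$\hat\sigma_{\exp Z}^2 = \frac{1}{M}\sum_{u \in G}\{\exp[\hat Z(u)] - \hat\rho_{\exp Z}\}^2 \tag{23}$$

where

$$\hat\rho_Z = \frac{1}{M}\sum_{u \in G}\hat Z(u), \quad \hat\rho_{\exp Z} = \frac{1}{M}\sum_{u \in G}\exp[\hat Z(u)], \tag{24}$$

and $\hat Z(u) = \hat\beta Z(u)$ denotes the fitted regression term.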
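The estimators (22)–(24) are plain grid averages, as the following sketch shows. It takes the fitted regression term evaluated on the grid as input; the function name and the dictionary keys are illustrative, and the two-cell example grid is hypothetical.

```python
import math

def environmental_variances(z_hat):
    """Estimators (22)-(24) computed from the fitted regression term
    evaluated at the M grid locations."""
    m = len(z_hat)
    exp_z = [math.exp(z) for z in z_hat]
    rho_z = sum(z_hat) / m                       # first average in (24)
    rho_expz = sum(exp_z) / m                    # second average in (24)
    sigma2_z = sum((z - rho_z) ** 2 for z in z_hat) / m          # (22)
    sigma2_expz = sum((e - rho_expz) ** 2 for e in exp_z) / m    # (23)
    return {"rho_Z": rho_z, "rho_expZ": rho_expz,
            "sigma2_Z": sigma2_z, "sigma2_expZ": sigma2_expz}

# Hypothetical grid with two cells.
estimates = environmental_variances([0.0, math.log(3.0)])
```

For this toy grid exp of the fitted term takes the values 1 and 3, so the exponential-scale mean is 2 and the exponential-scale variance is 1. These quantities are the inputs to the plug-in estimates of R2 for the additive and log linear models.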

6.3 Discussion of theoretical properties of estimates

One may be concerned about the joint properties of the composite likelihood estimates and (22)–(24). A fully detailed asymptotic analysis is outside the scope of this paper and we here only outline the essential arguments. We may view the estimate $\hat\theta = (\hat\beta, \hat\psi)$ as a solution of an estimating equation $u(\theta) = 0$ where $u$ is obtained by concatenating the gradients of the log composite likelihoods (17) and (18). Note that $u$ is conditionally unbiased, $\mathbb{E}[u(\theta)|Z] = 0$. Moreover the estimates (22)–(24) are obtained in a simple manner from spatial averages of the form $A(\hat\beta) = M^{-1}\sum_{u \in G} f(Z(u), \hat\beta)$ where $f(z, \beta)$ is either $\beta z$, $(\beta z)^2$, $\exp(\beta z)$ or $\exp(2\beta z)$.

Under appropriate mixing conditions for $\Lambda_0$ and $Z$ one may verify using techniques as in Waagepetersen and Guan (2009) that asymptotically (increasing window size $|W|$), $|W|^{1/2}(\hat\theta - \theta)$ is distributed as $u(\theta)S^{-1}$ for a matrix $S = -\mathbb{E}\,\mathrm{d}u(\beta)/\mathrm{d}\beta$ and that $u(\theta)$ is asymptotically normal. Hence $|W|^{1/2}(\hat\theta - \theta)$ is asymptotically zero-mean normal. Moreover,

$$|W|^{1/2}(A(\hat\beta) - \mathbb{E}A(\beta)) = |W|^{1/2}(A(\beta) - \mathbb{E}A(\beta)) + |W|^{1/2}(\hat\beta - \beta)\,\mathbb{E}\frac{\mathrm{d}}{\mathrm{d}\beta}A(\beta) + o_P(1).$$

By the mixing conditions, the first term on the right-hand side converges in distribution to a normal distribution. It thus follows that the spatial average $A(\hat\beta)$ is a $|W|^{1/2}$-consistent estimate of $\mathbb{E}A(\beta)$ but the replacement of $\beta$ with $\hat\beta$ introduces additional error in the estimation, quantified by the second term on the right-hand side of the above equation. This error is asymptotically independent of the error given by the first term since

$$\mathbb{C}\mathrm{ov}_\theta[A(\beta), u(\theta)] = \mathbb{E}_\theta\mathbb{C}\mathrm{ov}_\theta[A(\beta), u(\theta)|Z] + \mathbb{C}\mathrm{ov}_\theta\{\mathbb{E}_\theta[A(\beta)|Z], \mathbb{E}_\theta[u(\theta)|Z]\} = 0$$

by the unbiasedness of u(·) given Z and since A(β) is constant given Z.

To summarize, under appropriate mixing conditions, the composite likelihood estimates and the estimates (22)–(24) all become consistent and asymptotically normal.

7 Decomposition of variance for tropical rain forest data example

In this section we return to the data example in Section 1.1. Waagepetersen and Guan (2009) fitted inhomogeneous Thomas models with random intensity functions of the form (6) and a Gaussian covariance function (9) for $\Lambda_0$. Using the same covariates as in Waagepetersen and Guan (2009), we fit for each of the species the additive model (5) and the log linear model (6) with $c_0$ being the Gaussian covariance function (9), the Matérn covariance function (13), and the Cauchy covariance function (16). Note that these covariance functions have very distinct behaviors both at the origin and in the tails, see also Figure 3. As in Myllymäki and Penttinen (2009) we also consider a pair correlation function of the form (7) where $Y$ is a Gaussian random field with a Matérn covariance function. We denote this covariance function LG-Matérn since it corresponds to the case of a log Gaussian random intensity function $\Lambda_0$. In this case $\sigma_0^2 = \exp[\mathbb{V}\mathrm{ar}\,Y(u)] - 1$. As measures of fit for these models, we use the maximal values of the composite likelihoods CL1 and CL2.

Figure 3. The various fitted covariance functions c0 in case of the log linear model for Acalypha (top left), Lonchocarpus (top right), and Capparis (bottom left).

Regarding the weight function w we tried both (20) and (21) with t = 125 in (18). The integrals in the composite likelihoods CL1 and CL2 were approximated using a Berman-Turner quadrature scheme consisting of data points and 200 × 100 dummy points over the observation window W = [0, 1000] × [0, 500]. The ranking of the models according to CL2 did not depend on the choice of w-function (except for a single swap of ranks between Cauchy and Matérn in case of the additive model for Lonchocarpus). However, for the log linear models we in general obtained somewhat smaller estimates of σ02 with (20) than with (21) and vice versa for the additive models. According to a model check for the best fitting log linear models (see below) the estimates obtained with (21) gave the best fit. In the following we restrict attention to the results obtained with (21).

Table 1 shows the parameter estimates, the maximal composite likelihood values, and the estimated $R^2$ for each species and model.

Considering first $\beta_{1:2}$ for each species, the regression parameters have similar signs and relative magnitudes for the log linear model and the additive model. However, the maximal first-order composite likelihood CL1 is always larger for the log linear model. The fitted regression terms $\hat{Z}$ are shown in Figure 2 for each species.

Figure 2.

The fitted regression term, $\hat{Z}$, for Acalypha under the additive model and the log linear model, and for Lonchocarpus and Capparis under the log linear model (from top to bottom).

Considering CL2 and comparing log linear and additive models for each species, the log linear models always yield the largest CL2 values. Regarding the choice of covariance function, the best fit for Acalypha is obtained with the Cauchy covariance function while for Lonchocarpus and Capparis, the LG-Matérn covariance function performs better, followed by the Matérn covariance function. Both for the log linear and the additive models, the smallest CL2 is obtained with the Gaussian covariance function. This suggests that the fast decaying pair correlation function obtained with the Gaussian covariance function is not suitable for the tropical rain forest data. For the additive model, the LG-Matérn and the Matérn covariance functions are almost identical because in this case $\hat{\sigma}_0^2$ is small and exp(x) − 1 ≈ x when x ≈ 0.
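The near-equality of the two covariance functions for small variance can be checked numerically. The sketch below uses the standard log Gaussian Cox process identity c0(r) = exp(c_Y(r)) − 1, where c_Y is the covariance function of Y, and compares the normalised shape of c0 with that of c_Y. For simplicity it assumes an exponential covariance (the Matérn case with ν = 1/2) with a hypothetical scale η = 20; these choices and the function names are illustrative and are not the fitted models of the paper.

```python
import math

def exp_cov(r, sigma2, eta):
    """Exponential covariance sigma2 * exp(-r / eta); assumed parameterisation."""
    return sigma2 * math.exp(-r / eta)

def lg_cov(r, var_y, eta):
    """c0(r) = exp(c_Y(r)) - 1 for a log Gaussian random intensity."""
    return math.expm1(exp_cov(r, var_y, eta))

def max_shape_gap(var_y, eta=20.0, r_max=125):
    """Largest gap between c0(r) / c0(0) and c_Y(r) / c_Y(0) for r = 0, ..., r_max."""
    sigma02 = lg_cov(0.0, var_y, eta)  # equals exp(var_y) - 1
    return max(
        abs(lg_cov(r, var_y, eta) / sigma02 - exp_cov(r, 1.0, eta))
        for r in range(r_max + 1)
    )

gap_small = max_shape_gap(0.05)  # small variance: shapes nearly coincide
gap_large = max_shape_gap(2.0)   # large variance: shapes clearly differ
print(gap_small, gap_large)
```

The gap shrinks roughly in proportion to Var Y(u), which is consistent with the two fits being nearly indistinguishable when the estimated variance is small.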

Regarding $R^2$, the estimates vary considerably across models. However, the overall qualitative conclusion is stable: the contribution of the environment is smallest for Acalypha and largest for Capparis. This may be linked to the different modes of seed dispersal of the species; see Waagepetersen and Guan (2009). The $R^2$ values obtained with the best fitting models are 0.01, 0.06, and 0.11 for Acalypha, Lonchocarpus, and Capparis, respectively.

The fitted covariance functions for the log linear models and for all species are shown in Figure 3. For all species, the Gaussian covariance function differs markedly from the other three covariance functions both at the origin and in the tail. For Lonchocarpus and Capparis, the fitted Matérn and LG-Matérn covariance functions appear rather similar.

We used a non-parametric estimate of the g0-function (see e.g. Chapter 4 in Møller and Waagepetersen, 2004) as a summary statistic for model assessment and computed pointwise 90% confidence bands for this statistic using simulations from the best fitting log linear models according to CL2. The non-parametric estimates of c0(·) = g0(·) − 1 and 90% pointwise confidence bands are shown in Figure 4 and do not provide evidence against the fitted models.
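The pointwise bands can be computed from simulated curves as in the following Python sketch. It assumes that each simulation has already been reduced to a non-parametric estimate of c0 evaluated on a common grid of distances; the curves below are random stand-ins, and the empirical-quantile rule in `pointwise_band` is one simple convention among several, not necessarily the one used for Figure 4.

```python
import math
import random

def pointwise_band(curves, level=0.90):
    """Pointwise lower/upper band and mean from equally long simulated curves."""
    n = len(curves)
    k = round((1.0 - level) / 2.0 * n)  # number of curves cut off in each tail
    lower, upper, mean = [], [], []
    for values in zip(*curves):         # loop over distances r
        s = sorted(values)
        lower.append(s[k])
        upper.append(s[n - 1 - k])
        mean.append(sum(s) / n)
    return lower, upper, mean

# Stand-in "simulations": a smooth curve plus noise (purely illustrative).
random.seed(1)
r_grid = range(0, 126, 5)
curves = [
    [0.5 * math.exp(-r / 20.0) + random.gauss(0.0, 0.05) for r in r_grid]
    for _ in range(100)
]
lower, upper, mean = pointwise_band(curves)
```

An observed estimate that stays between `lower` and `upper` at the distances of interest gives no pointwise evidence against the fitted model; note that such bands are not simultaneous over r.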

Figure 4.

Non-parametric estimates of c0 (solid line) and 90% pointwise confidence bands (gray area) obtained from 100 simulations under the best fitting models. The red dashed line shows the mean of the simulated non-parametric estimates and the blue dotted line shows the parametric estimate of c0. Top left: Acalypha, top right: Lonchocarpus, and bottom left: Capparis.

8 Discussion

In this paper we introduced a method for decomposing variance for spatial Cox processes. As can be seen from our data analysis, the results can be sensitive to the choice of the pair correlation function. Fortunately, the flexible class of normal variance mixture pair correlation functions allows us to compare results from a wide range of pair correlation functions. We can then report results obtained with the best fitting model according to the second-order composite likelihood criterion.

Concerning the second-order composite likelihood estimation, further studies regarding numerical implementation and choice of w-function seem needed. For the Matérn model, the joint estimation of $(\sigma_0^2, \eta, \nu)$ is computationally demanding and simulation studies indicate that the statistical properties of the estimates can be poor. For routine use, a more feasible approach is to maximize only with respect to $\sigma_0^2$ and η for each ν in a moderate collection of ν values.
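The profiling strategy suggested above can be sketched as follows. The function `toy_cl2` is a hypothetical stand-in for the second-order composite likelihood, and the inner grid search replaces whatever numerical optimiser would be used in practice; only the outer loop over a moderate collection of ν values reflects the approach described in the text.

```python
def profile_over_nu(objective, nu_grid, sigma2_grid, eta_grid):
    """For each nu, maximise objective over (sigma2, eta); return the overall best."""
    best = None
    for nu in nu_grid:
        # Inner maximisation with nu held fixed (grid search as a simple stand-in).
        value, sigma2, eta = max(
            (objective(s, e, nu), s, e) for s in sigma2_grid for e in eta_grid
        )
        if best is None or value > best[0]:
            best = (value, sigma2, eta, nu)
    return best

def toy_cl2(sigma2, eta, nu):
    """Hypothetical smooth objective with its maximum at (1.0, 20.0, 0.5)."""
    return -((sigma2 - 1.0) ** 2 + ((eta - 20.0) / 10.0) ** 2 + (nu - 0.5) ** 2)

best = profile_over_nu(
    toy_cl2,
    nu_grid=[0.25, 0.5, 1.0, 2.0],
    sigma2_grid=[0.5, 1.0, 1.5, 2.0],
    eta_grid=[10.0, 20.0, 30.0],
)
print(best[1:])  # (sigma2, eta, nu) at the best profile value
```

Keeping ν on a small grid reduces each inner problem to two parameters, which is the computational saving the text points to.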

The Thomas process has enjoyed much popularity in the point process literature. However, at least for the tropical rain forest data considered in this paper, it seems inferior to models with a more slowly decaying pair correlation function. For all the data examples considered, the log linear model provided a better fit than the additive model. This is perhaps not so surprising since the log linear model has an appealing interpretation in terms of survival of seedlings. On the other hand, variance decomposition is more straightforward for the additive model than for the log linear model.

Our proposed variance decomposition procedure assumes a stationary random environment. This may, for example, not be tenable for Acalypha because the fitted regression term in W shows a trend from right to left. Although this does not necessarily invalidate the stationarity assumption for the environment, it at least implies that W may be too small to allow a precise estimate of its variance. Further research is required to handle the case where the environment is not stationary but perhaps satisfies a weaker assumption of intrinsic stationarity.

Acknowledgments

We thank two anonymous referees for constructive comments which helped to improve the manuscript.

Abdollah Jalilian and Rasmus Waagepetersen’s research was supported by the Danish Natural Science Research Council, grant 09-072331 ‘Point process modeling and statistical inference’, and by Centre for Stochastic Geometry and Advanced Bioimaging, funded by a grant from the Villum Foundation. Yongtao Guan’s research was supported by NSF grant DMS-0845368 and by NIH grant 1R01DA029081.

The BCI forest dynamics research project was made possible by National Science Foundation grants to Stephen P. Hubbell: DEB-0640386, DEB-0425651, DEB-0346488, DEB-0129874, DEB-00753102, DEB-9909347, DEB-9615226, DEB-9615226, DEB-9405933, DEB-9221033, DEB-9100058, DEB-8906869, DEB-8605042, DEB-8206992, DEB-7922197, support from the Center for Tropical Forest Science, the Smithsonian Tropical Research Institute, the John D. and Catherine T. MacArthur Foundation, the Mellon Foundation, the Celera Foundation, and numerous private individuals, and through the hard work of over 100 people from 10 countries over the past two decades. The plot project is part of the Center for Tropical Forest Science, a global network of large-scale demographic tree plots.

The BCI soils data set was collected and analyzed by J. Dalling, R. John and K. Harms with support from NSF DEB021104, 021115, 0212284, 0212818 and OISE 0314581 and OISE 031458, STRI and CTFS.

Contributor Information

Abdollah Jalilian, Department of Statistics, Shahid Beheshti University, Evin, Tehran, 19839-63113, Iran.

Yongtao Guan, Division of Biostatistics, Yale University, New Haven, Connecticut 06520-8034, U.S.A.

Rasmus Waagepetersen, Department of Mathematical Sciences, Aalborg University, Fredrik Bajersvej 7G, DK-9220 Aalborg, Denmark.

References

  1. Abramowitz M, Stegun I. Handbook of Mathematical Functions. 9th printing. Dover; New York: 1965.
  2. Baddeley A, Turner R. Practical maximum pseudolikelihood for spatial point patterns. Aust N Z J Stat. 2000;42:283–322.
  3. Baddeley AJ, Møller J, Waagepetersen R. Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Stat Neerl. 2000;54:329–350.
  4. Baddeley AJ, Turner R, Møller J, Hazelton M. Residual analysis for spatial point processes (with discussion). J R Stat Soc Ser B Stat Methodol. 2005;67:617–666.
  5. Barndorff-Nielsen O. Exponentially decreasing distributions for the logarithm of particle size. Proc R Soc Lond Ser A Math Phys Eng Sci. 1977;353:401–419.
  6. Barndorff-Nielsen OE. Hyperbolic distributions and distributions on hyperbolae. Scand J Stat. 1978;5:151–157.
  7. Barndorff-Nielsen OE, Halgreen C. Infinite divisibility of the hyperbolic and generalized inverse Gaussian distributions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete. 1977;38:309–312.
  8. Best NG, Ickstadt K, Wolpert RL. Spatial Poisson regression for health and exposure data measured at disparate resolutions. J Amer Statist Assoc. 2000;95:1076–1088.
  9. Chilès J-P, Delfiner P. Probability and Statistics. Wiley; 1999. Geostatistics - modeling spatial uncertainty.
  10. Condit R. Tropical Forest Census Plots. Springer-Verlag and R. G. Landes Company; Berlin, Germany and Georgetown, Texas: 1998.
  11. Condit R, Hubbell SP, Foster RB. Changes in tree species abundance in a neotropical forest: impact of climate change. Journal of Tropical Ecology. 1996;12:231–256.
  12. Diggle PJ. Statistical analysis of spatial point patterns. 2nd ed. Arnold; 2003.
  13. Guan Y. A composite likelihood approach in fitting spatial point process models. J Amer Statist Assoc. 2006;101:1502–1512.
  14. Hellmund G, Prokešová M, Jensen EBV. Lévy-based Cox point processes. Adv in Appl Probab. 2008;40:603–629.
  15. Hu W. PhD thesis. Florida State University; Tallahassee, FL, USA: 2005. Calibration of multivariate generalized hyperbolic distributions using the EM algorithm, with applications in risk management, portfolio optimization and portfolio credit risk.
  16. Hubbell SP, Foster RB. Diversity of canopy trees in a neotropical forest and implications for conservation. In: Sutton SL, Whitmore TC, Chadwick AC, editors. Tropical Rain Forest: Ecology and Management. Blackwell Scientific Publications; Oxford: 1983. pp. 25–41.
  17. Illian J, Penttinen A, Stoyan H, Stoyan D. Statistics in Practice. Wiley; 2008. Statistical analysis and modelling of spatial point patterns.
  18. Jonsdottir KY, Rønn-Nielsen A, Mouridsen K, Jensen EBV. Research Report. Vol. 2. Centre for Stochastic Geometry and Advanced Bioimaging; 2011. Lévy-based modelling in brain imaging.
  19. Jørgensen B. Lecture Notes in Statistics. Springer; 1982. Statistical properties of the generalized inverse Gaussian distribution.
  20. Matérn B. Number 36 in Lecture Notes in Statistics. Springer-Verlag; Berlin: 1986. Spatial Variation.
  21. McKay AT. A Bessel function distribution. Biometrika. 1932;24:39–44.
  22. Møller J. Shot noise Cox processes. Adv in Appl Probab. 2003;35:614–640.
  23. Møller J, Syversveen AR, Waagepetersen RP. Log Gaussian Cox processes. Scand J Stat. 1998;25:451–482.
  24. Møller J, Waagepetersen RP. Statistical inference and simulation for spatial point processes. Chapman and Hall/CRC; Boca Raton: 2004.
  25. Myllymäki M, Penttinen A. Conditionally heteroscedastic intensity-dependent marking of log Gaussian Cox processes. Stat Neerl. 2009;63:450–473.
  26. Schlather M. Introduction to positive definite functions and to unconditional simulation of random fields. ST 99-10. Lancaster University; 1999.
  27. Schoenberg FP. Consistent parametric estimation of the intensity of a spatial-temporal point process. J Statist Plann Inference. 2005;128:79–93.
  28. Stein ML. Spatial interpolation, some theory for kriging. Springer Verlag; New York: 1999.
  29. Stoyan D, Kendall WS, Mecke J. Stochastic Geometry and Its Applications. 2nd ed. Wiley; Chichester: 1995.
  30. Stoyan D, Stoyan H. Fractals, Random Shapes and Point Fields. Wiley; Chichester: 1995.
  31. Tanaka U, Ogata Y, Stoyan D. Parameter estimation and model selection for Neyman-Scott point processes. Biom J. 2008;50:43–57. doi: 10.1002/bimj.200610339.
  32. Waagepetersen R. An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics. 2007;63:252–258. doi: 10.1111/j.1541-0420.2006.00667.x.
  33. Waagepetersen R, Guan Y. Two-step estimation for inhomogeneous spatial point processes. J R Stat Soc Ser B Stat Methodol. 2009;71:685–702.
  34. Wiegand T, Gunatilleke S, Gunatilleke N, Okuda T. Analyzing the spatial structure of a Sri Lankan tree species with multiple scales of clustering. Ecology. 2007;88:3088–3102. doi: 10.1890/06-1350.1.
  35. Yu Y. ArXiv e-prints. 2011. On Normal Variance-Mean Mixtures.
