Author manuscript; available in PMC: 2019 Mar 22.
Published in final edited form as: J Appl Stat. 2017 Feb 11;45(3):568–585. doi: 10.1080/02664763.2017.1288200

Spatially explicit survival modeling for small area cancer data

G Onicescu a,*, A Lawson b, J Zhang c, Mulugeta Gebregziabher b, Kristin Wallace b, J M Eberth c
PMCID: PMC6429959  NIHMSID: NIHMS1503667  PMID: 30906096

Abstract

In this paper we propose a novel Bayesian statistical methodology for spatial survival data. Our methodology broadens the definition of the survival, density and hazard functions by explicitly modeling the spatial dependency using direct derivations of these functions and their marginals and conditionals. We also derive spatially dependent likelihood functions. Finally we examine the applications of these derivations with geographically augmented survival distributions in the context of the Louisiana Surveillance, Epidemiology, and End Results (SEER) registry prostate cancer data.

Keywords: Bayesian hierarchical models, Markov chain Monte Carlo, prostate cancer, spatial, kernel convolution

1. Introduction

The analysis of time to event data, also called survival analysis, has numerous applications in various fields, including medicine, public health and epidemiology [4, 24, 30]. A person’s geographical location can also play a role in their survival [6] as it is often correlated with common risk factors of disease (such as environmental pollution, access to healthcare and local water and soil composition) which may impact disease outcomes. The use of spatial analysis in population based studies has grown considerably in the past ten years because of advances in software development, data acquisition and dissemination, and the increasing availability of geographically-referenced data. However, to extract useful patterns from spatial data new statistical methodologies need to be developed.

The use of survival modeling for cancer outcomes has increased over the past decade. Geographical context has been exploited [5, 6, 10, 14, 15, 21, 34–38] and cure rate models have been proposed with spatial components [12, 23]. It is clear that spatial referencing is important in understanding the variation in cancer outcomes. While some spatial data are available at the individual level through residential addresses, in many instances the exact address location is not available. Instead, mostly for confidentiality reasons, these point data are aggregated to larger administrative units, such as counties or parishes, to yield counts [26].

Random effects are commonly used to account for confounding within models for spatial disparities. While this approach is reasonable, it does not include the geographical risk in the definition of the survival measures and, therefore, has limited interpretability.

Spatial confounding is likely present in many of the applied contexts in which residuals are spatially correlated [28]. In addition, spatial structure in the covariates is very common in applications and complicates the problem because the covariate and the residual spatial structure compete to explain variability in the response [28, 33]. Therefore, methodologies that avoid the use of spatial random effects are needed, especially in the context of correlation between the covariates and the spatial structure.

The probability of getting a cancer diagnosis by a specific time can also be extended to a spatial domain or location. To address this issue, it is important to consider the probability density, survival, and hazard functions as defined for both temporal and spatial domains. Our objective is to develop and apply a hierarchical Bayesian approach for geographically augmented spatial time to event data by explicitly modeling the spatial dependency using direct derivations of the density, survival and hazard functions in the context of independent space and time. To illustrate our approach, we use time to event data on prostate cancer (PrCa), which has been shown to have marked county-level variation within the US [37, 38]. An alternative development of the methodology assuming dependency between space and time was presented in [27].

2. Definitions and Notations

We implement the model in the Bayesian framework, which is based on specifying a probability model for the observed data, given a vector of unknown parameters, leading to the likelihood function. Then we assume that the vector of unknown parameters is random and has a prior distribution. Inference is based on the posterior distribution, which is proportional to the likelihood multiplied by the prior. We begin by defining the generalized probability density function (pdf), which is allowed to have both a temporal and spatial component. While dependence between space and time can also be considered, we develop and present in this paper the case when the space and time components are independent.

For the probability density function (pdf), space is defined on the real plane ℝ² = ℝ × ℝ, while the temporal component is defined on the positive real line (0, ∞). Hence f is defined on ℝ × ℝ × (0, ∞), with f(s, t) = f1(s) f2(t), where f1(s) and f2(t) are independent space and time density functions, s being a location on the surface area defined by latitude and longitude and t being a time. Next, we define the cumulative distribution functions for the temporal component only, the spatial component only and the space and time components combined. In our notation, the indices t, s and (s, t) refer to the domain of integration: temporal, spatial and spatio-temporal respectively.

Let As be a spatial area, s ∈ As, and let t* ∈ (0, ∞) be a duration time. We define

$$F_t(t^*) = \int_0^{t^*} f_2(t)\,dt, \tag{1}$$

where Ft(t*) is the probability that the time random variable takes a value less than or equal to t*,

$$F_s(A_s) = \int_{A_s} f_1(s)\,ds, \tag{2}$$

where Fs(As) is the marginal probability over the space As,

$$F_t(s,t^*) = \int_0^{t^*} f(s,t)\,dt = \int_0^{t^*} f_1(s) f_2(t)\,dt = f_1(s)\,F_t(t^*), \tag{3}$$

where Ft(s, t*) is the probability that the time random variable takes a value less or equal to t* at the spatial location s,

$$F_s(A_s,t^*) = \int_{A_s} f(s,t^*)\,ds = \int_{A_s} f_1(s) f_2(t^*)\,ds = f_2(t^*) \int_{A_s} f_1(s)\,ds = f_2(t^*)\,F_s(A_s), \tag{4}$$

where Fs(As, t*) is the probability that the time random variable takes the value t* over the domain As,

$$F_{s,t}(A_s,t^*) = F_s(A_s)\,F_t(t^*), \tag{5}$$

and Fs,t(As, t*) is the joint probability that the time random variable takes a value less than or equal to t* over the domain As.

The corresponding survival and hazard functions are defined as follows:

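$$S_t(s,t^*) = 1 - F_t(s,t^*), \tag{6}$$

$$S_{s,t}(A_s,t^*) = P(s \in A_s,\ t > t^*) = F_s(A_s)\,\bigl(1 - F_t(t^*)\bigr), \tag{7}$$

$$h_t(s,t^*) = \frac{f(s,t^*)}{S_t(s,t^*)}, \tag{8}$$

$$h_{s,t}(A_s,t^*) = \frac{\int_{A_s} f(s,t^*)\,ds}{S_{s,t}(A_s,t^*)} = \frac{F_s(A_s,t^*)}{F_s(A_s)\,\bigl(1 - F_t(t^*)\bigr)}, \tag{9}$$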

where St(s, t*) is the cumulative probability of surviving beyond time t* for a person residing in location s, Ss,t(As, t*) is the joint cumulative probability of surviving beyond time t* over the area As, ht(s, t*) is the probability of instantaneous failure at time t* for a person residing at location s and hs,t(As, t*) is the joint probability of instantaneous failure at time t* over the area As.
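As a quick check on definition (9), the spatial factor cancels under the independence assumption, so the area-level hazard coincides with the purely temporal hazard whenever Fs(As) > 0. The derivation below is a sketch that uses only equations (2), (4) and (7):

```latex
h_{s,t}(A_s,t^*)
  = \frac{F_s(A_s,t^*)}{F_s(A_s)\,\bigl(1-F_t(t^*)\bigr)}
  = \frac{f_2(t^*)\,F_s(A_s)}{F_s(A_s)\,\bigl(1-F_t(t^*)\bigr)}
  = \frac{f_2(t^*)}{1-F_t(t^*)}, \qquad F_s(A_s) > 0 .
```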

The format of our data consists of a time to event variable for each individual (t), the parish of residence (As), as well as individual and parish level covariates. As is the case with most publicly available data, we do not have access to the exact address location for each individual and therefore we develop our spatial survival methodology in the context of parish level aggregated data using an area likelihood.

Let n be the number of subjects; ti, i = 1, …, n, the survival times, independent and identically distributed with density function f(s, t); mi = (mi1, …, mip), i = 1, …, n, a vector of covariates for subject i, p being the number of covariates; and ci, i = 1, …, n, the censoring times. Define

$$y_i = \min(c_i, t_i), \qquad v_i = \begin{cases} 1 & \text{if } t_i \le c_i, \\ 0 & \text{if } t_i > c_i. \end{cases}$$

Let y = (y1, y2, … , yn)′ and v= (v1,v2, … ,vn)′. Let 𝜃 be the vector of parameters of interest. Let A1, … , AL be a partition of the study region and nl be the number of people in area Al. The area likelihood is defined as follows:

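$$\begin{aligned}
L_A(\theta \mid n, y, v, m_i)
 &= \prod_{l=1}^{L}\prod_{i=1}^{n_l} \bigl[F_s(A_l, y_i, m_i)\bigr]^{v_i}\,\bigl[S_{s,t}(A_l, y_i, m_i)\bigr]^{1-v_i} \\
 &= \prod_{l=1}^{L}\prod_{i=1}^{n_l} \Bigl[f_2(y_i, m_i)\int_{A_l} f_1(s)\,ds\Bigr]^{v_i}\,\Bigl[F_s(A_l, m_i)\,\bigl(1 - F_t(y_i, m_i)\bigr)\Bigr]^{1-v_i} \\
 &= \prod_{l=1}^{L}\prod_{i=1}^{n_l} \Bigl[f_2(y_i, m_i)\int_{A_l} f_1(s)\,ds\Bigr]^{v_i}\,\Bigl[\int_{A_l} f_1(s)\,ds\,\Bigl(1 - \int_0^{y_i} f_2(t, m_i)\,dt\Bigr)\Bigr]^{1-v_i},
\end{aligned} \tag{10}$$

where Fs(Al, yi, mi) and related measures are modified to display the covariate dependence.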

The temporal component probability density function for the ith observation, i = 1, …, n, follows the Weibull(μ, λi) distribution, which is a commonly used distribution for time to event data. It can model a decreasing, constant or increasing failure rate over time if its shape parameter μ is less than, equal to or greater than 1 respectively.

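We define the spatial partitions A1, …, AL to be exhaustive and unique within the study region. Using the Weibull temporal density and assuming independent space and time, the area likelihood is defined as follows:

$$L_A(\theta \mid n, y, v) = \prod_{l=1}^{L}\prod_{i=1}^{n_l} \Bigl[\mu\lambda_i\, y_i^{\mu-1} e^{-y_i^{\mu}\lambda_i} \int_{A_l} f_1(s)\,ds\Bigr]^{v_i}\,\Bigl[\int_{A_l} f_1(s)\,ds\,\Bigl(1 - \int_0^{y_i} \mu\lambda_i\, t^{\mu-1} e^{-t^{\mu}\lambda_i}\,dt\Bigr)\Bigr]^{1-v_i}. \tag{11}$$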

The covariates were linked to the log(λi) parameter, log(λi) = β0 + β′mi, where β0 is the intercept and β = (β1, …, βp)′ is a vector of regression parameters.

We refer to this spatially explicit model using Weibull distributed time to event and independent space and time assumption as the Spatio-Temporal Survival Unstructured Weibull (STSU- Weibull) model.
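The log of the area likelihood (11) can be sketched in a few lines. This is an illustrative implementation, not the authors' Julia code: the function names and the argument `area_prob` (standing in for the integral of f1 over each area, however it is obtained) are assumptions. It relies on the closed form of the Weibull survival function under this parameterization, 1 − Ft(y) = exp(−λ y^μ).

```python
import math

def weibull_log_density(y, mu, lam):
    """log f2(y) for f2(y) = mu * lam * y**(mu - 1) * exp(-lam * y**mu)."""
    return math.log(mu) + math.log(lam) + (mu - 1.0) * math.log(y) - lam * y ** mu

def weibull_log_survival(y, mu, lam):
    """log(1 - F_t(y)) = -lam * y**mu, the closed form of the integral in (11)."""
    return -lam * y ** mu

def log_area_likelihood(y, v, area, covariates, beta0, beta, mu, area_prob):
    """Sum of log contributions over subjects.

    y: observed times, v: event indicators (1 = event, 0 = censored),
    area: index of each subject's area, covariates: one list per subject,
    area_prob: hypothetical per-area spatial probabilities F_s(A_l).
    """
    total = 0.0
    for y_i, v_i, l_i, m_i in zip(y, v, area, covariates):
        # log(lambda_i) = beta0 + beta' m_i
        lam = math.exp(beta0 + sum(b * m for b, m in zip(beta, m_i)))
        if v_i == 1:
            temporal = weibull_log_density(y_i, mu, lam)
        else:
            temporal = weibull_log_survival(y_i, mu, lam)
        # The spatial factor enters both the event and the censored term.
        total += math.log(area_prob[l_i]) + temporal
    return total
```

Working on the log scale avoids numerical underflow when the product in (11) runs over many subjects.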

3. Spatial model

The term Fs(Al) =∫Al f1(s)ds that appears in the definition of the likelihood function can be computed by constructing the spatial process. Modeling the spatial dependence structure is of fundamental importance to all spatially referenced data. There are a wide variety of choices that can be used to specify the spatial model. One approach is to assume a geostatistical model whereby the spatial component is assumed to follow a Gaussian process with spatial dependence defined by a covariance function, usually assumed to be second order stationary, with the covariance between any two locations depending on the distance between them [16, chapter 3]. Spatial modeling using the covariance function is computationally restrictive due to the necessity for inversion of a potentially large positive definite matrix.

One alternative to directly specifying the covariance function is to assume a process convolution model [22], which is based on the idea that any stationary Gaussian process can be expressed as the convolution of a white noise process x(s) with a specified kernel k(s). The advantage of convolution based models lies in their computational simplification. In addition, they always induce valid covariance functions and, due to their nonparametric nature, have considerable flexibility versus a fully parametric approach.

Kernel convolutions have been widely used in the spatial literature [9, 13]. Special cases such as Gaussian component mixture (GSM) have also been proposed [25]. To construct the spatial model we used a process convolution with a Gaussian kernel function, allowed to vary over the study area. As described in Higdon et al. [22], the model for the spatial process is determined by specifying the white noise process x(s) and the smoothing kernel k(s). We have chosen the Gaussian kernel k(s) ∝ exp(−½‖s‖²) (‖s‖ being the Euclidean norm of s), since it induces a covariance matrix which is a function of the squared distance between two spatial locations and gradually dies off with increased distance. Specifically, denoting by d = s − s′ the displacement vector between the locations s and s′, the Gaussian kernel induces the covariance c(d) = Cov(z(s), z(s′)) ∝ exp(−½‖d/√2‖²) [22].

The Gaussian process z(s) can be constructed over a spatial region 𝕊 as follows [8, 22]:

$$z(s) = \int_{\mathbb{S}} k(u - s)\,x(u)\,du, \tag{12}$$

for s ∈ 𝕊. Since the above integration cannot be explicitly solved, the integral can be approximated by a finite sum:

$$z(s) = \sum_{j=1}^{n_g} x_j\,k(w_j - s), \qquad x_j \sim N(0, \sigma^2), \tag{13}$$

where wj, j = 1, … , ng are the grid points, ng is the total number of grid points over the area.

Variations of the above formula can be used by restricting the domain of the white noise process [22]. To ensure that the white noise process is positive, we used the following formulas:

$$z(s) = \sum_{j=1}^{n_g} e^{x_j}\,k(w_j - s), \qquad x_j \sim N(0, \sigma^2), \tag{14}$$

$$z_A = \int_A z(s)\,ds \approx \sum_{s \in A}\sum_{j=1}^{n_g} e^{x_j}\,k(w_j - s). \tag{15}$$

Since we had access only to the number of deaths and total population of interest in each parish, we computed the coordinates of the centroid of each parish and approximated the above formula with the following:

$$z_A \approx p_A \sum_{j=1}^{n_g} e^{x_j}\,k(w_j - c_A), \tag{16}$$

where pA is the percentage of deaths in area A and cA is the centroid of parish A.

For each parish Al, l = 1, …, L, the kernel was computed at the differences between the centroid of the parish and each grid point in the whole polygon area. We stored these values in the kernel matrix Klj, l = 1, …, L and j = 1, …, ng, of dimension L × ng. zAl was therefore calculated by multiplying the number of deaths for parish Al by the scalar product of the lth row of the kernel matrix and the exponentiated white noise vector (e^{x1}, …, e^{x_ng})′.

The integral over an area of the spatial pdf function is defined as the ratio between zA and zAT:

$$z_{A_T} = \sum_{l=1}^{L} z_{A_l}, \qquad \frac{z_A}{z_{A_T}} = \int_A f_1(s)\,ds, \tag{17}$$

L being the total number of partitions of the study region. It is to be noted that using this construction, the risk probability is split between the parishes included in the study area, and, therefore, the value of the parish level risk probability and related measures are dependent on the number of parishes in the study.
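The centroid approximation (16) and the normalization (17) can be illustrated with a small sketch. The function names, the use of plain coordinate tuples and the generic `weights` argument (standing in for pA) are assumptions made for illustration; the kernel is the unnormalized Gaussian kernel described above.

```python
import math

def gaussian_kernel(dx, dy):
    """Unnormalised Gaussian kernel, k(s) proportional to exp(-0.5 * ||s||^2)."""
    return math.exp(-0.5 * (dx * dx + dy * dy))

def kernel_matrix(centroids, grid_points):
    """L x n_g matrix with entries k(w_j - c_l), one row per area centroid."""
    return [[gaussian_kernel(wx - cx, wy - cy) for (wx, wy) in grid_points]
            for (cx, cy) in centroids]

def area_probabilities(weights, kernel, white_noise):
    """Return z_{A_l} / z_{A_T} for every area, following (16) and (17)."""
    exp_x = [math.exp(x) for x in white_noise]  # positive white noise, as in (14)
    z = [p * sum(k * e for k, e in zip(row, exp_x))
         for p, row in zip(weights, kernel)]
    z_total = sum(z)  # z_{A_T}
    return [z_l / z_total for z_l in z]
```

Because only the ratio zAl/zAT enters the likelihood, any constant common to all the weights (for example, using death counts rather than percentages) cancels in the normalization.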

The area likelihood becomes:

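$$L_A(\mu, \lambda \mid n, y, v) = \prod_{l=1}^{L}\prod_{i=1}^{n_l} \Bigl[\mu\lambda_i\, y_i^{\mu-1} e^{-y_i^{\mu}\lambda_i}\,\frac{z_{A_l}}{z_{A_T}}\Bigr]^{v_i}\,\Bigl[\frac{z_{A_l}}{z_{A_T}}\Bigl(1 - \int_0^{y_i} \mu\lambda_i\, t^{\mu-1} e^{-t^{\mu}\lambda_i}\,dt\Bigr)\Bigr]^{1-v_i}. \tag{18}$$

The likelihood can be factored into a spatial and a temporal product. The spatial component (zAl/zAT)^{vi} in the first term can be grouped with the spatial component (zAl/zAT)^{1−vi} in the second term, their product being equal to zAl/zAT.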

4. Prior distributions

Since our models are implemented in the Bayesian framework, we assigned prior distributions to each parameter. For the coefficients β0, …, βp we used Gaussian prior distributions N(0, σ0²), …, N(0, σp²) respectively, where each of the standard deviation hyperparameters σ0, …, σp was assumed to have a Uniform(0, 10) distribution. For the white noise effects xk, k = 1, …, ng, ng being the total number of grid points in the state, we used zero mean Gaussian prior distributions N(0, Σx), with Σx diagonal and each variance equal to 5². For the log of the shape parameter of the Weibull distribution, log(μ), we assigned a Normal distribution with zero mean and variance 0.1². Since μ is a power term, we used a small variance for the prior distribution of log(μ).

Sensitivity analyses were performed on the distributional assumptions. The prior distribution for the standard deviation hyperparameters σ0, …, σp was changed to Uniform(0, 15), increasing its upper bound; the white noise effects were given zero mean Gaussian prior distributions N(0, Σx) with Σx diagonal and each variance equal to 10²; and a N(0, 0.5²) distribution was assumed for the log shape parameter of the Weibull distribution. The results showed that the parameter estimates were not influenced by these changes. The prior distributions for the parameters are defined as follows:

Regression coefficients: βj ~ N(0, σj²), σj ~ Uniform(0, 10), j = 0, …, p;

White noise process: xk ~ N(0, 5²), k = 1, …, ng;

Weibull log shape parameter: log(μ) ~ N(0, 0.1²).
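For concreteness, the joint log prior density implied by these three specifications can be sketched as follows. The function and argument names are hypothetical; the default values are the ones listed above.

```python
import math

def normal_logpdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (x / sd) ** 2

def log_prior(beta, sigma, x, log_mu, sigma_upper=10.0, x_sd=5.0, log_mu_sd=0.1):
    """Joint log prior for coefficients, their standard deviations,
    the white noise effects and the Weibull log shape parameter."""
    # sigma_j ~ Uniform(0, sigma_upper): zero density outside the interval.
    if any(s <= 0.0 or s >= sigma_upper for s in sigma):
        return -math.inf
    lp = -len(sigma) * math.log(sigma_upper)
    # beta_j ~ N(0, sigma_j^2)
    lp += sum(normal_logpdf(b, s) for b, s in zip(beta, sigma))
    # x_k ~ N(0, x_sd^2)
    lp += sum(normal_logpdf(xk, x_sd) for xk in x)
    # log(mu) ~ N(0, log_mu_sd^2)
    lp += normal_logpdf(log_mu, log_mu_sd)
    return lp
```

Under this sketch, the sensitivity analysis settings correspond to calling `log_prior(..., sigma_upper=15.0, x_sd=10.0, log_mu_sd=0.5)`.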

5. Computational approach and software

Advances in computing power and software have made Markov chain Monte Carlo (MCMC) [20, 31] one of the most important computational tools in Bayesian biostatistics. For our approach we used MCMC implemented via a Metropolis-Hastings algorithm [11] for sampling from the posterior distribution of the parameters.

The model is specified as follows:

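$$L_A(\mu, \lambda \mid n, y, v) = \prod_{l=1}^{L}\prod_{i=1}^{n_l} \Bigl[\mu\lambda_i\, y_i^{\mu-1} e^{-y_i^{\mu}\lambda_i}\,\frac{z_{A_l}}{z_{A_T}}\Bigr]^{v_i}\,\Bigl[\frac{z_{A_l}}{z_{A_T}}\Bigl(1 - \int_0^{y_i} \mu\lambda_i\, t^{\mu-1} e^{-t^{\mu}\lambda_i}\,dt\Bigr)\Bigr]^{1-v_i}, \tag{19}$$

$$z_A \approx p_A \sum_{j=1}^{n_g} e^{x_j}\,k(w_j - c_A), \qquad z_{A_T} = \sum_{l=1}^{L} z_{A_l}, \tag{20}$$

where pA is the percentage of deaths in area A, wj are the grid points, cA is the centroid of parish A and ng is the total number of grid points; k(s) ∝ exp(−½‖s‖²) (‖s‖ being the Euclidean norm of s) is the Gaussian kernel; and the prior distributions are:

$$\beta_j \sim N(0, \sigma_j^2), \quad \sigma_j \sim \text{Uniform}(0, 10), \quad j = 0, \ldots, p, \qquad x_k \sim N(0, 5^2), \quad k = 1, \ldots, n_g, \qquad \log(\mu) \sim N(0, 0.1^2).$$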

The algorithm consists of the following steps, repeated T times, where T is the number of iterations of the chain.

Our vector of parameters of interest θ consists of the coefficients and their standard deviations, the log shape parameter of the Weibull distribution log(μ) and the white noise process xk, k = 1, …, ng, ng being the total number of grid points in the state. We used symmetric proposal distributions for all the parameters.

Step 0: Assign starting values θ0 for θ .

For t = 1, … ,T do the following steps:

Step 1: Propose new values θ′ from symmetric proposal distributions h(θ ).

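Step 2: Calculate log(α) = min(0, R), where

$$R = \log\left(\frac{L_A(y \mid \theta')\,g(\theta')}{L_A(y \mid \theta)\,g(\theta)}\right),$$

θ is the current value of the chain, LA is the area likelihood function and g is the prior distribution.

Step 3: Update θt = θ′ with probability α; otherwise retain the current value, θt = θt−1.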
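Steps 0 to 3 can be sketched as a random-walk Metropolis sampler. This is a minimal illustration, not the authors' Julia implementation: the Gaussian proposal, the single shared step size and the toy standard Normal target are assumptions, and `log_post` stands in for the log of LA(y | θ) g(θ).

```python
import math
import random

def metropolis(log_post, theta0, step, n_iter, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    theta = list(theta0)                       # Step 0: starting values
    current = log_post(theta)
    samples, accepted = [], 0
    for _ in range(n_iter):
        # Step 1: propose theta' from a symmetric distribution centred at theta.
        proposal = [th + rng.gauss(0.0, step) for th in theta]
        candidate = log_post(proposal)
        # Step 2: log(alpha) = min(0, R), R being the log posterior ratio.
        log_alpha = min(0.0, candidate - current)
        # Step 3: accept with probability alpha, otherwise keep the current value.
        if math.log(1.0 - rng.random()) < log_alpha:
            theta, current = proposal, candidate
            accepted += 1
        samples.append(list(theta))
    return samples, accepted / n_iter

def toy_log_post(theta):
    """Hypothetical target: independent standard Normal components."""
    return -0.5 * sum(t * t for t in theta)
```

A call such as `metropolis(toy_log_post, [0.0], 1.0, 20000, random.Random(1))` returns the draws and the acceptance rate. Because the proposal is symmetric, the proposal densities cancel and only the posterior ratio appears in R.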

The program was run on a high performance desktop computer with a 64-bit operating system, 24 gigabytes of random access memory, eight 64-bit Xeon processor cores and two 250 GB solid state disks (SSD) for high performance disk input/output (I/O).

The large number of observations and the complexity of our models prohibit the use of conventional posterior sampling software such as WinBUGS or OpenBUGS. Therefore for the implementation of the Metropolis-Hastings algorithm we used Julia version 0.2 [7]. Julia is an open source high-level programming language for technical computing. It approaches and often matches the performance of C and it is faster than many other programming languages. Preliminary data analysis was performed using R [29] version 3.2.0 and SAS software, version 9.2, SAS Institute Inc., Cary, NC. Graphical displays were performed using R [29].

Two models were used: the first was not adjusted for any covariates, and the second was adjusted for individual covariates (race, marital status, stage, grade, age at diagnosis) and parish level variables (mean family income and number of accredited cancer centers). The unadjusted model was run for 22000 iterations with the first 10000 iterations discarded as burn-in, while the model adjusted for covariates was run for 25000 iterations with the first 15000 discarded as burn-in.

Julia was used for running the simulations for the STSU-Weibull model and the R package BRugs [32] was used for running the simulations for the random effects model.

6. Goodness of fit and convergence diagnosis

We chose the best fitting model based on the deviance information criterion (DIC) [31], which is widely used for Bayesian models. The effective number of parameters (pD) reflects the model complexity. It was computed from the posterior variance of the deviance, based on the estimator proposed by Gelman et al. [17, 19]. This estimator is easily computed.

Let G be the number of samples used for estimation, D̄ the average deviance and D̂(θg) the deviance computed at the sample parameter value θg, g = 1, … ,G.

pD was estimated as half the variance of the deviance:

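$$\hat{p}_D = \frac{1}{2}\,\frac{1}{G-1}\sum_{g=1}^{G}\bigl(\hat{D}(\theta_g) - \bar{D}\bigr)^2 = \frac{1}{2}\,\widehat{\mathrm{Var}}(D). \tag{21}$$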

Compared to other proposals for estimating the effective number of parameters, this measure has the advantage of always being positive, and it also gives the correct estimate for large sample sizes [19]. The DIC was computed as the sum of the average deviance D̄ and the effective number of parameters:

$$\mathrm{DIC} = \bar{D} + \hat{p}_D. \tag{22}$$
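Equations (21) and (22) translate directly into code. The sketch below assumes the deviance has already been evaluated at each retained posterior draw; the function name is hypothetical.

```python
def dic_from_deviances(deviances):
    """Return (DIC, pD) from deviances evaluated at G posterior draws.

    pD is half the sample variance of the deviance, as in (21),
    and DIC is the mean deviance plus pD, as in (22).
    """
    g = len(deviances)
    if g < 2:
        raise ValueError("at least two deviance values are needed")
    mean_dev = sum(deviances) / g
    var_dev = sum((d - mean_dev) ** 2 for d in deviances) / (g - 1)
    p_d = 0.5 * var_dev
    return mean_dev + p_d, p_d
```

For example, deviances of 10, 12 and 14 have mean 12 and sample variance 4, giving pD = 2 and DIC = 14.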

Lower values of DIC indicate a better fit of the model. Spiegelhalter et al. [31] suggest that models within 1–2 units of the best model deserve consideration, while models with 3–7 units or more difference in DIC have considerably less support. For computational simplicity, a single chain was run for each model. Convergence was first assessed informally by visual examination of the trace plots. Furthermore, we constructed two chains from the remaining iterations after discarding the burn-in and used Gelman and Rubin’s convergence diagnosis [18] to formally assess convergence.
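The convergence check can be illustrated with the basic Gelman and Rubin potential scale reduction factor. The text does not say how the two chains were constructed, so splitting the post burn-in draws into two halves is an assumption of this sketch, as are the function names.

```python
import math

def split_in_two(draws):
    """Split a single chain into two halves of equal length (assumed construction)."""
    half = len(draws) // 2
    return [draws[:half], draws[half:2 * half]]

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one scalar parameter."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

Values close to 1 suggest that the two halves explore the same distribution, while values well above 1 indicate that the chain has not yet mixed.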

7. Application

7.1. Prostate cancer

Prostate cancer is an important public health problem in the US, with an estimated 233,000 new cases in 2014, leading to 29,480 deaths. About one male in six will be diagnosed with PrCa in his lifetime. It occurs mainly in older men, with the mean age of diagnosis being 66 [1]. Risk factors include but are not limited to age, race, family history/genetics and geographical location. Spatial referencing is important in understanding the variation in PrCa outcomes [37, 38].

We examined cancer registry data from the SEER Louisiana registry for the years 2007 through 2010. SEER registry data has been used previously in the development of spatial survival methods [5, 6, 10, 14]. The data included prostate cancer cases. For stage of diagnosis, we used SEER historic stage which was available for our selected years and was defined as having two categories: localized/regional and distant. Localized refers to a cancer confined to the prostate with no penetration of the capsule. Regional indicates a cancer that involves the regional lymph nodes and/or penetration of the prostatic capsule with or without direct extension beyond the limits of the prostatic capsule into the surrounding organs or tissues. Distant means that the cancer has spread to parts of the body remote from the primary tumor. For PrCa, the SEER staging system combines localized and regional cases into one stage group.

We further selected only observations with complete dates available and excluded 437 subjects with survival time zero, considered unknown. For our time to event outcome we used the time to death from any cause, as deaths from prostate cancer alone were too infrequent. In addition, because prostate tumors are often slow growing and men with the disease commonly die of other causes, there is likely to be a certain number of men who die of causes other than prostate cancer but who mistakenly have their underlying cause of death attributed to prostate cancer merely because they were labeled as having the disease as a result of screening [Feurer et al, 1999]. The all cause mortality endpoint depends only on accurate ascertainment of deaths and when they occur, and therefore eliminates attribution bias (i.e. incorrect labeling of death from other causes as death from prostate cancer). The model was adjusted for variables that were available and thought a priori to be associated with the vital outcome. The individual-level patient data used in this study include: race (African-American versus White and Other), marital status at diagnosis (grouped into married versus single and separated/divorced/widowed), stage at diagnosis (SEER historic stage A grouped into localized/regional versus distant), grade (grouped into grades 1 and 2 versus 3 and 4) and age (age at diagnosis in years). Observations with missing values for the covariates were excluded from the analysis.

Parish level variables included mean family income and the number of accredited cancer centers. Data for the parish specific covariate “mean family income” were downloaded from the US Census American Community Survey (ACS) 2011, 5 year estimates [3]. The number of Commission on Cancer accredited cancer centers was obtained from the American College of Surgeons website [2].

7.2. Results

Table 1 describes the demographic characteristics of the subjects. Age at diagnosis ranged from 34 to 98 years, with a mean of 66 (sd=9.41). Mean time to death or censoring was 22.60 months (sd=13.58), ranging from 1 to 47. The majority of the subjects were white (67%) versus 32.52% black and only 0.48% other races. The majority of the cancer stages were localized/regional (95.94%) with only 4.06% distant. Of all subjects, 92.82% were alive by the end of the study period, while 7.83% died of any cause. Most cancers were grade 2 (44.65%) and 3 (54.18%) with only 0.87% Grade 1 and 0.30% Grade 4. Due to the small number in grade categories 1 and 4, for estimation we grouped the grade variable and compared grades 3 and 4 versus grades 1 and 2. Regarding marital status, 73.08% of the subjects were married, 12.77% were single and 14.15% were separated, divorced or widowed.

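Table 1.

Demographic characteristics

| Individual-level characteristic | Mean +/− SD | Range |
|---|---|---|
| Age at diagnosis | 66.12 +/− 9.41 | 34–98 |
| Age at diagnosis, N. missing | 2 | |
| Survival time (months) | 22.60 +/− 13.58 | 1–47 |

| Individual-level characteristic | N (%) |
|---|---|
| Marital status: Single | 1590 (12.77%) |
| Marital status: Married | 9097 (73.08%) |
| Marital status: Separated/Divorced/Widowed | 1761 (14.15%) |
| Marital status: N. missing | 1387 |
| Race: White | 9258 (67.00%) |
| Race: Black | 4494 (32.52%) |
| Race: Other | 66 (0.48%) |
| Race: N. missing | 17 |
| Grade: I | 116 (0.87%) |
| Grade: II | 5957 (44.65%) |
| Grade: III | 7229 (54.18%) |
| Grade: IV | 40 (0.30%) |
| Stage: Distant | 554 (4.06%) |
| Stage: Localized/Regional | 13101 (95.94%) |
| Stage: N. missing | 180 |
| All causes death indicator: Alive | 12842 (92.82%) |
| All causes death indicator: Died | 993 (7.83%) |

| Parish-level characteristic | Mean (SD) | Range |
|---|---|---|
| Mean family income (in dollars) | $65,279.86 ($10,229.80) | $48,436–$92,161 |
| No. cancer centers | 0.5 | 0–7 |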

A fixed number of grid points (1000) was generated over the rectangle enclosing the Louisiana polygon area. The total number of grid points was chosen so as to have enough grid points in each parish while keeping the computation time at a feasible level. Out of the total grid points, we further selected the ng=571 grid points that were inside the study region. The number of grid points per parish ranged from 2 to 31, with a median of 7.
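The grid construction can be sketched as follows: lay a regular grid over the bounding rectangle of the study region and keep the points that fall inside the polygon. The ray-casting test, the cell-centred grid and the function names are assumptions made for illustration; the text does not state how the points were generated or filtered.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def grid_inside(polygon, nx, ny):
    """Centres of an nx-by-ny grid over the bounding rectangle that lie inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    points = []
    for i in range(nx):
        for j in range(ny):
            x = x_min + (i + 0.5) * (x_max - x_min) / nx
            y = y_min + (j + 0.5) * (y_max - y_min) / ny
            if point_in_polygon(x, y, polygon):
                points.append((x, y))
    return points
```

The same membership test, applied parish by parish, would give the number of grid points per parish.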

7.2.1. Model selection and goodness of fit assessment

Two models were considered. Model 1 was the unadjusted model. Model 2 was adjusted for the following individual and parish level covariates: race, marital status, stage, grade, age at diagnosis, mean family income (in dollars) and number of accredited cancer centers. In Table 2, DIC and pD values are given for the models. The results show that the adjusted model had the lowest DIC, suggesting that the adjusted model fits the data much better than the unadjusted one. Therefore we chose the adjusted model as the best fitting model.

Table 2.

Parameter estimates and 95% credible intervals (CI).

| Model | DIC | pD |
|---|---|---|
| Model 1^a | 124545.1 | 14.14 |
| Model 2^b | 122627.7 | 16.75 |

a Model 1 is unadjusted.

b Model 2 is adjusted for race, marital status, stage, grade, age at diagnosis, mean family income and number of accredited cancer centers.

7.2.2. Estimation

Covariates were linked via the log of the scale parameter of the Weibull distribution. Higher coefficients for the covariates indicate lower survival rates associated with increased values of the covariate. The posterior means and 95% credible intervals (CI) of the coefficients from our best fitting model are shown in Table 3. After controlling for other covariates in the model, black race versus white or other races was associated with lower survival, with a 0.39 (CI=(0.23, 0.54)) increase in the log scale parameter. Being married was associated with higher survival, with a 0.40 decrease in the log scale parameter (coefficient −0.40, CI=(−0.55, −0.24)). Distant stage versus localized/regional was associated with lower survival, with a 1.92 (CI=(1.73, 2.12)) increase in the log scale parameter. Higher age at diagnosis was associated with lower survival. Age at diagnosis was standardized for computational simplicity and was used mainly for adjustment purposes. The parish level variables mean family income and number of cancer centers did not show an association with the outcome.
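One way to read these coefficients, offered as an illustrative sketch rather than a result reported by the authors: under the Weibull density in (11) with a common shape μ, the temporal hazard is μλi t^{μ−1}, so with log(λi) = β0 + β′mi a one-unit increase in covariate j, holding the others fixed, multiplies the hazard by exp(βj). Applying this to the posterior means quoted above (rounded to two decimals):

```latex
\frac{\mu\,\bigl(e^{\beta_j}\lambda_i\bigr)\,t^{\mu-1}}{\mu\,\lambda_i\,t^{\mu-1}} = e^{\beta_j},
\qquad
e^{0.39} \approx 1.48, \quad e^{-0.40} \approx 0.67, \quad e^{1.92} \approx 6.82 .
```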

Table 3.

Parameter estimates and 95% credible intervals (CI)

| Covariate | Estimate | 95% CI^a |
|---|---|---|
| Intercept | −6.96 | (−7.25, −6.69)* |
| Race: White and Other races | Ref. | |
| Race: Blacks | 0.39 | (0.23, 0.54)* |
| Marital status: Not Married | Ref. | |
| Marital status: Married | −0.40 | (−0.55, −0.24)* |
| Stage: Localized/Regional | Ref. | |
| Stage: Distant | 1.92 | (1.73, 2.12)* |
| Grade: 1 and 2 | Ref. | |
| Grade: 3 and 4 | 0.23 | (0.047, 0.40)* |
| Age at diagnosis (standardized) | 0.71 | (0.64, 0.79)* |
| Mean family income (standardized) | 0.011 | (−0.071, 0.088) |
| No. cancer centers | −0.0052 | (−0.041, 0.035) |

a * indicates that the credible interval does not include zero.

Figure 1 shows the spatial probabilities zA/zAT for each parish A, with higher probabilities indicating higher risk of death. A color progression is used to depict the risk probabilities, with darker colors indicating higher risk. Some parishes with higher risk are in the South and North-Western part of the state. Parishes in the North-East have lower risk. Figure 2 displays the model based temporal survival St(t*) = 1 − Ft(t*) for the reference categories, which is slightly curved at the earliest times and has an almost linear trend afterwards. The spatially explicit model based temporal survival has estimates and a range of values similar to those of the survival plot estimated by the traditional random effects model and the Kaplan-Meier survival plot.

Fig 1.

Map of model based estimated spatial risk probabilities

Fig 2.

Estimated model based temporal only survival curves St(t*) = 1 − Ft(t*), reference categories

8. Simulated Comparison

To evaluate the benefit of the proposed spatial method, we carried out a simulation study comparing the proposed model with a model that has a standard contextual spatial random effect structure. We considered two simulation scenarios: in the first we included five covariates and set the true coefficients to 1, while in the second we included three covariates and set the coefficients to 1.5.

8.1. Simulation using the STSU-Weibull model:

We generated 50 datasets, using a Weibull(μ, λi) distribution for the time to event. We recognize that 50 is a small number of datasets, but computational costs prevented us from including more. We assumed that μ=1, leading to the particular case of the exponential distribution. For the simulations, we used the same STSU-Weibull model and prior distributions described in the previous sections. The covariates were linked to the log(λi) parameter, log(λi) = Mβ, where β = (β0, … ,β5)′ is the vector of unknown coefficients and M = (1, mi) = (1, mi1, …, mi5) is the corresponding matrix of covariates including the intercept. The coefficients (β0, … ,β5) were set to 1, considered to be the true value.

Covariates were simulated using various distributions. We simulated two dichotomous variables, mi1 ~ Bernoulli(0.5) and mi2 ~ Bernoulli(0.5), and one continuous normally distributed variable, mi3 ~ Normal(0, 1). The variables mi4 and mi5 were each simulated as continuous spatially correlated variables using a stationary isotropic covariance model, the corresponding covariance function depending only on the distance between the two points. This is implemented in R [29] in the RMgauss function in the RandomFields package. The variable mi4 was generated using the RMgauss function with a mean trend of 0 and variance 4, while the mi5 variable was generated with mean 0, variance 10 and scale parameter 2. Censoring times were simulated from U(0,2).

In the second scenario we included only the intercept, dichotomous variable, the continuous normally distributed variable and a continuous spatially correlated variable, simulated with a mean trend of 0 and variance 2. Censoring times were simulated from U(0.2,1).
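Survival times with the Weibull parameterization used in (11) can be drawn by inverting the distribution function Ft(t) = 1 − exp(−λ t^μ). The sketch below is illustrative only: the function names are hypothetical, and the spatially correlated covariates are taken as given rather than generated here.

```python
import math
import random

def weibull_inverse_cdf(u, mu, lam):
    """Solve 1 - exp(-lam * t**mu) = u for t, with 0 <= u < 1."""
    return (-math.log(1.0 - u) / lam) ** (1.0 / mu)

def simulate_subject(covariates, beta, mu, censor_low, censor_high, rng):
    """Return (y, v): observed time and event indicator for one subject.

    beta[0] is the intercept, so log(lambda) = beta[0] + sum_j beta[j] * m_j.
    Censoring times are uniform on (censor_low, censor_high).
    """
    log_lam = beta[0] + sum(b * m for b, m in zip(beta[1:], covariates))
    t = weibull_inverse_cdf(rng.random(), mu, math.exp(log_lam))
    c = rng.uniform(censor_low, censor_high)
    return min(t, c), 1 if t <= c else 0
```

With μ = 1 this reduces to exponential survival times, the case used in the simulations described above.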

8.2. Simulation using Weibull model with correlated and uncorrelated random effects:

Let n be the total number of subjects, nj the number of subjects in parish j, tij the survival time for subject i in parish j, M = (1, mij) = (1, mij1, …, mij5) a matrix of covariates for subject i in parish j including the intercept term, k the number of covariates, and cij the censoring time, i = 1, … ,n and j=1, … ,64. Define

$$y_{ij} = \min(c_{ij}, t_{ij}), \qquad v_{ij} = \begin{cases} 1 & \text{if } t_{ij} \le c_{ij}, \\ 0 & \text{if } t_{ij} > c_{ij}. \end{cases}$$

The temporal component pdf for the ith observation in parish j, i = 1, … ,n, j = 1, … ,64 follows the Weibull(μ, λij) distribution. In order to be able to make the comparison, for the random effects model we have used the same data as the one used in the STSU-Weibull model simulation. Covariates and random effects were linked to the log(λij) parameter as follows:

$$\log(\lambda_{ij}) = M\beta + W_{j1} + W_{j2}, \qquad i = 1, \ldots, n,$$

where β = (β0, … , β5) is the vector of unknown coefficients, M is the matrix of covariates including the intercept, Wj1 are the uncorrelated spatial random effects and Wj2 are the correlated spatial random effects. We assumed the same Normal prior distributions for the coefficients as in the STSU-Weibull model. For the random effects we assumed that Wj1 follows an independent Normal distribution, $W_{j1} \sim N(0, v_1^2)$, where $v_1^2$ denotes the variance of the uncorrelated spatial random effect. For the correlated spatial random effect Wj2 we assumed a conditional autoregressive (CAR) model. Specifically, $W_{j2} \sim \text{Normal}\left(\sum_{k \in p_j} W_{k2} / n_{p_j},\; v_2^2 / n_{p_j}\right)$, where $v_2^2$ is the variance, $p_j$ is the set of neighbors of parish j and $n_{p_j}$ is the number of parishes in that set. Uniform(0, 100) priors were assumed for the standard deviations v1 and v2.
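The CAR conditional distribution can be made concrete with a small Python sketch. It computes the conditional mean and variance for one area from its neighbors, reading the prior as the usual intrinsic CAR form (neighbor average as the mean, variance divided by the number of neighbors). The function name `car_conditional`, the four-area map and the numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np


def car_conditional(w, neighbors, j, v2_sq):
    """Conditional mean and variance of W_j2 given its neighbors."""
    nbrs = neighbors[j]
    n_pj = len(nbrs)                        # number of neighbors of area j
    mean = sum(w[k] for k in nbrs) / n_pj   # average of neighboring effects
    var = v2_sq / n_pj                      # variance shrinks with more neighbors
    return mean, var


# Hypothetical toy map: four areas arranged in a line, 0-1-2-3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
w = np.array([0.2, -0.1, 0.4, 0.0])
mean_1, var_1 = car_conditional(w, neighbors, j=1, v2_sq=0.5)
```

An area with many neighbors is therefore smoothed more strongly toward the local average than an area on the edge of the map.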

We considered two scenarios, in which the coefficients (β0, … ,β5) were set to 1 and to 1.5, respectively, taken as the true values.

8.3. Simulation Results and Comparison

Table 4 shows the results from the first simulation scenario, in which the true value of the coefficients was set to 1. We display the mean and 95% CI of the estimated coefficients over the 50 simulations, as well as the mean square error (MSE) for each coefficient, calculated as the sum of the squared differences between the estimated coefficient at each iteration and the true value, here set to 1. While all the coefficients in both models were close to the true value of 1, the majority of the coefficient estimates were closer to it in the STSU-Weibull model than in the random effects model.
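As a hedged illustration of how such summaries can be computed for one coefficient, the Python sketch below takes the estimates from the replicate data sets and returns their mean, a percentile interval and the MSE. Two choices here are assumptions: the 95% interval is taken as the 2.5% and 97.5% percentiles across replicates, and the MSE is the mean of the squared errors (replace `np.mean` with `np.sum` to follow the sum-of-squares wording literally). The function name `summarize_replicates` is hypothetical.

```python
import numpy as np


def summarize_replicates(estimates, truth):
    """Mean, 2.5%/97.5% percentiles and MSE of one coefficient over replicates."""
    est = np.asarray(estimates, dtype=float)
    lower, upper = np.percentile(est, [2.5, 97.5])
    mse = np.mean((est - truth) ** 2)       # mean of squared errors
    return {"mean": est.mean(), "lower": lower, "upper": upper, "mse": mse}


# Toy example with four made-up estimates of a coefficient whose true value is 1.
out = summarize_replicates([0.9, 1.1, 1.0, 1.2], truth=1.0)
```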

Table 4. Mean coefficient estimates obtained from simulated data when the true coefficients are 1.

| Coefficient | STSU-Weibull^a mean (95% CI) | UH+CH random effects^b mean (95% CI) | STSU-Weibull MSE | UH+CH random effects MSE |
|---|---|---|---|---|
| β0 | 0.9960 (0.9113, 1.0797) | 0.9958 (0.9052, 1.0851) | 0.1005 | 0.1125 |
| β1 | 1.0065 (0.9323, 1.0806) | 1.0066 (0.9325, 1.0808) | 0.0396 | 0.0403 |
| β2 | 1.0010 (0.9274, 1.0747) | 1.0011 (0.9267, 1.0757) | 0.0774 | 0.0766 |
| β3 | 0.9980 (0.9593, 1.0366) | 0.9980 (0.9591, 1.0368) | 0.0241 | 0.0240 |
| β4 | 1.0021 (0.9751, 1.0290) | 1.0012 (0.9678, 1.0345) | 0.0098 | 0.0122 |
| β5 | 0.9994 (0.9656, 1.0314) | 0.9994 (0.9611, 1.0395) | 0.0168 | 0.0198 |

a. STSU-Weibull is the Spatial Temporal Survival Uncorrelated model with Weibull distributed time to event.

b. UH+CH random effects is the Weibull model with unstructured and correlated random effects.

Table 5 displays the results from the second simulation scenario, in which the true value of the coefficients was set to 1.5.

Table 5. Mean coefficient estimates obtained from simulated data when the true coefficients are 1.5.

| Coefficient | STSU-Weibull^a mean (95% CI) | UH+CH random effects^b mean (95% CI) | STSU-Weibull MSE | UH+CH random effects MSE |
|---|---|---|---|---|
| β0 | 1.5003 (1.4390, 1.5610) | 1.5006 (1.409, 1.537) | 0.0504 | 0.0505 |
| β1 | 1.4927 (1.4138, 1.5709) | 1.4933 (1.454, 1.615) | 0.0743 | 0.0732 |
| β2 | 1.5040 (1.4604, 1.5477) | 1.5047 (1.484, 1.572) | 0.0251 | 0.0258 |
| β3 | 1.5037 (1.4649, 1.5424) | 1.5045 (1.484, 1.572) | 0.0154 | 0.0173 |

a. STSU-Weibull is the Spatial Temporal Survival Uncorrelated space-time model with Weibull distributed time to event.

b. UH+CH random effects is the Weibull model with unstructured and correlated random effects.

Figure 3 displays the map of the structured and unstructured random effects, the estimated risk probabilities from the simulated STSU-Weibull model from the first scenario (true coefficient value of 1), as well as the mean percentage of deaths over the 50 simulations and a map of the correlated spatial covariate. We notice a similar pattern in the maps of the estimated risk probabilities, the percentage of deaths and the spatially correlated variable, while the map of random effects shows a more residual, unstructured effect.

Fig 3.


Simulation comparison map: A) map of uncorrelated plus correlated random effects from contextual random effects models fitted in R2WinBUGS; B) map of estimated risk probabilities from the STSU-Weibull model fitted in Julia; C) map of averaged percentage number of deaths by parish from the simulated data; D) map of spatially structured covariate

In the simulation studies, the percent censoring was not fixed, but rather induced by the distributional assumptions and the true values of the coefficients. In the first simulation scenario, the death rate ranged from 11% to 13% with a mean of 12%, while in the second scenario it ranged from 19% to 21%, with a mean of 20%. In the simulations, the difference in censoring rates did not seem to affect the performance of our model.

9. Discussion

In this paper, we proposed a Bayesian methodology for directly modeling the spatial dependency in the specification of the survival, density and hazard functions. We applied the model to analyze the SEER PrCa data in Louisiana for the years 2007 through 2010. Our outcome is death from any cause, which could include non-cancer-related deaths or deaths from other cancers. As is the case in any older population, some deaths could be unrelated to cancer, such as those from accidents or other age-related diseases. Because mortality due to prostate cancer is very limited, information on the vital outcome of PrCa is also limited.

The results from our study indicate that there is a marked geographical pattern in the survival probability of PrCa. This could be due to the presence of environmental factors, such as pollution or exposure to chemicals from the soil and water. Although we have not fully examined access to care, we did look at the association of the outcome with the number of cancer centers, but we did not find an association.

An assumption of our model is independence between the spatial and temporal components. In the context of an AFT model allowing for dependent space and time components, qualitatively similar results were obtained for covariate effects and spatial risk estimates [27], suggesting that the model is robust to distributional assumptions. Although we do not use this approach here, an alternative would be to include the spatial covariates in the spatial component, through the definition of the spatial kernel.

A number of extensions of this methodology are worth noting. First, while the method was developed using a Weibull distribution for the time to event, other distributional assumptions can be made. For example, more complex parametric models such as the generalized gamma or lognormal models can be employed. However, the survival and hazard functions for some parametric models may be complicated and expressed in terms of integrals, and may therefore not be computationally feasible.

Our model assumes that the risk probability is split between the parishes included in the study area. Therefore, the values of the parish-level risk probability and related measures depend on the number of parishes in the study.

Compared to the results from the Weibull random effects convolution model presented in Appendix 1, the Weibull spatially explicit model obtained similar coefficient estimates. However, one of the main advantages of the spatially explicit model is the increased interpretability of the risk estimates, which range between 0 and 1, versus the random effects, which have limited interpretability due to their unlimited range.

In simulated models we obtained similar but overall better estimates in the STSU-Weibull model versus the random effects model. Since our spatial field was constructed using a spatially correlated covariate, the random effects were only a residual factor and showed a relatively different pattern than the estimated risk probabilities obtained from the STSU-Weibull model and the percentage number of deaths or the spatially correlated covariate. This suggests that the STSU-Weibull is more appropriate than the random effects models, especially for models containing spatially structured covariates. We have set the number of simulations to 50 due to computational reasons. The high computational time for our model was mostly due to our choice of specification of the spatial process using kernel convolutions, which requires estimation of the white noise process random effects. However, the spatially explicit methodology is not limited to this specification of the spatial process and can be further developed using alternative approximate methods that can be computationally more efficient.

A further complication in the analysis of spatial data is the presence of censoring for outcomes, which is very frequent in survival studies. A possible future extension of survival models, either spatially explicit or traditional random effects models, is to allow the censoring mechanism to depend on covariates and have a spatial structure. This could be accomplished by using a joint logistic regression model for the death versus censoring indicator, allowing dependence on covariates and including a spatial structure via the use of random effects.

Another extension of our model, which we have not considered here, is to relate covariates to the spatial risk probabilities, which would allow the covariates to also have a direct effect on the risk of death.

With respect to the generality of our model, it can be used for a variety of diseases for which spatial survival is of interest. In addition, the model can be applied to other geographical areas. The advantage of our model lies in the inclusion of geographical risk in the definition of the survival measures, which enhances the interpretability of the results.


Funding. This work was supported by the National Institutes of Health [R03 Grant No. CA176702–01A1].


10. Appendix 1: Weibull spatial model using correlated and uncorrelated random effects using SEER data:

10.1. Definition and notations

Let n be the total number of subjects, nj be the number of subjects in county j and tij be the survival time for subject i in county j, M=(1,mij)=(1,mij1,...,mij5), a matrix of covariates for subject i in county j including the intercept term, k=number of covariates, cij = censoring time, i = 1, … ,n and j=1, … ,64,

$$y_{ij} = \min(c_{ij}, t_{ij}), \qquad v_{ij} = \begin{cases} 1 & \text{if } t_{ij} \le c_{ij} \\ 0 & \text{if } t_{ij} > c_{ij}. \end{cases}$$

The temporal component probability density function for the ith observation in county j, i = 1, … ,n, j = 1, … ,64, follows the Weibull(μ, λij) distribution, which is commonly used for time to event data. It can model a decreasing, constant or increasing failure rate over time when its shape parameter μ is less than, equal to, or greater than 1, respectively. Covariates and random effects were linked to the log(λij) parameter as follows:

$$\log(\lambda_{ij}) = M\beta + W_{j1} + W_{j2}, \qquad i = 1, \ldots, n,$$

where β = (β0, … ,β5) is the vector of unknown coefficients, Wj1 are the uncorrelated spatial random effects and Wj2 are the correlated spatial random effects.
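To show how this linear predictor enters the likelihood, the following Python sketch evaluates a right-censored Weibull log-likelihood. It assumes the parameterization f(t) = μ λ t^(μ−1) exp(−λ t^μ), the convention used by WinBUGS; the paper's own definition is given in an earlier section, so this choice should be treated as an assumption. Under it the hazard is μ λ t^(μ−1), which is decreasing, constant or increasing in t for μ below, equal to or above 1. The function name `weibull_censored_loglik` is hypothetical, and `eta` stands for the per-subject value of Mβ + Wj1 + Wj2.

```python
import numpy as np


def weibull_censored_loglik(y, v, eta, mu):
    """Right-censored Weibull log-likelihood with log(lambda) = eta."""
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    eta = np.asarray(eta, dtype=float)
    log_hazard = np.log(mu) + eta + (mu - 1.0) * np.log(y)  # log h(y)
    cum_hazard = np.exp(eta) * y ** mu                      # -log S(y)
    # Events contribute log h(y) + log S(y); censored cases only log S(y).
    return float(np.sum(v * log_hazard - cum_hazard))


# Toy check: with mu = 1 and eta = 0 this is a unit-rate exponential model.
ll = weibull_censored_loglik(y=[1.0, 2.0], v=[1, 0], eta=[0.0, 0.0], mu=1.0)
```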

10.2. Prior distributions

For each of the coefficients β0, … , βp, we used Gaussian prior distributions N(0, σ0), … , N(0, σp), respectively, where the hyperparameters σ0, … , σp were each assumed to have a Uniform(0, 10) distribution. For the random effects we assumed that Wj1 follows an independent Normal distribution, $W_{j1} \sim N(0, \tau_1^2)$, where $\tau_1^2$ denotes the variance of the uncorrelated spatial random effect. For the correlated spatial random effect Wj2 we assumed a conditional autoregressive (CAR) model. Specifically, $W_{j2} \sim \text{Normal}\left(\frac{\sum_k W_{k2} c_{jk}}{\sum_k c_{jk}},\; \tau_2^2\right)$, where $\tau_2^2$ denotes the variance of the spatially correlated random effect,

$$\tau_2^2 = \frac{\psi^2}{\sum_k c_{jk}}.$$

10.3. Results

In the random effects models we considered the same covariates as in the STSU-adjusted models: race, marital status, stage, grade, age at diagnosis, mean family income and number of accredited cancer centers. The model was run for 40000 iterations, the first 30000 being discarded as burn-in.
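The burn-in step can be illustrated with a short Python sketch that discards the first 30000 draws of a chain and summarizes the remaining 10000. The chain below is made-up random noise standing in for sampler output, the equal-tailed percentile interval is an assumption about how the credible intervals were formed, and the function name `posterior_summary` is hypothetical.

```python
import numpy as np


def posterior_summary(chain, burn_in=30000):
    """Posterior mean and 95% credible interval after discarding burn-in."""
    kept = np.asarray(chain, dtype=float)[burn_in:]
    lower, upper = np.percentile(kept, [2.5, 97.5])
    return kept.mean(), (lower, upper)


# Hypothetical chain of 40000 draws standing in for sampler output.
rng = np.random.default_rng(0)
chain = rng.normal(loc=0.42, scale=0.08, size=40000)
mean, (lower, upper) = posterior_summary(chain)
```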

Table 6 shows the estimated coefficients. After controlling for other covariates in the model, black race versus white or other races, having distant stage versus localized/regional stage, having grades 3 and 4 versus 1 and 2 and being of older age at diagnosis were associated with lower survival. Being married was associated with higher survival.

Table 6. Parameter estimates and 95% credible intervals (CI).

| Covariate | Estimate | 95% CI^a |
|---|---|---|
| Intercept | −7.01 | (−7.33, −6.71)* |
| Race: White and Other races | Ref. | |
| Race: Blacks | 0.42 | (0.27, 0.58)* |
| Marital status: Not Married | Ref. | |
| Marital status: Married | −0.39 | (−0.54, −0.24)* |
| Stage: Localized/Regional | Ref. | |
| Stage: Distant | 1.94 | (1.72, 2.13)* |
| Grade: 1 and 2 | Ref. | |
| Grade: 3 and 4 | 0.22 | (0.052, 0.38)* |
| Age at diagnosis (standardized) | 0.72 | (0.64, 0.79)* |
| Mean family income (standardized) | 0.0094 | (−0.083, 0.11) |
| No. cancer centers | 0.0032 | (−0.042, 0.047) |

a. * indicates that the credible interval does not include zero.

Figure 4 displays the sum of correlated and uncorrelated random effects. Larger spatial random effects indicate longer survival time, which is affected by the exponential of the random effects. We notice a spatial clustering with higher survival times in the North-East and lower survival times in the South-West.

Fig 4.


Map of UH+CH random effects using SEER data, Weibull random effects model.

Figure 5 displays the estimated survival probabilities from the Weibull random effects model, which shows an almost linear decreasing curve.

Fig 5.


Estimated survival curve, reference categories, Weibull random effects model.

Figure 6 displays the Kaplan-Meier survival probabilities, which are in a similar range to the estimated model-based survival probabilities.

Fig 6.


Kaplan-Meier Survival Plot.
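For readers unfamiliar with the estimator behind Figure 6, a minimal Python sketch of the Kaplan-Meier (product-limit) calculation follows. It is a generic textbook version applied to made-up data, not the code used to produce the figure, and the function name `kaplan_meier` is hypothetical.

```python
import numpy as np


def kaplan_meier(y, v):
    """Product-limit estimate of S(t) at each distinct event time."""
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=int)
    event_times = np.unique(y[v == 1])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(y >= t)                 # still under observation at t
        deaths = np.sum((y == t) & (v == 1))     # events occurring at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


# Toy data: observed times and event indicators (1 = death, 0 = censored).
times, surv = kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 1, 0])
```

In this toy example the estimate drops to 0.8, 0.6 and 0.3 at times 1, 2 and 3. Censored subjects leave the risk set without causing a drop.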

References

[1] American Cancer Society. Available at: http://www.cancer.org, 2015. Accessed February 15, 2015.
[2] American College Surgeons. Available at: http://www.facs.org/, 2013. Accessed December 30, 2013.
[3] American Community Survey. Available at: http://www.census.gov/acs/, 2013. Accessed December 22, 2015.
[4] Assael BM, Castellani C, Ocampo MB, Iansa P, Callegaro A, and Valsecchi MG. Epidemiology and survival analysis of cystic fibrosis in an area of intense neonatal screening over 30 years. Am. J. Epidemiol, 156(5):397–401, 2002.
[5] Banerjee S and Dey DK. Semiparametric proportional odds models for spatially correlated survival data. Lifetime Data Anal, 11(2):175–191, 2005.
[6] Banerjee Sudipto, Wall Melanie M., and Carlin Bradley P. Frailty modeling for spatially correlated survival data, with application to infant mortality in Minnesota. Biostatistics, 4:123–142, 2003.
[7] Bezanson Jeff, Karpinski Stefan, Shah Viral B., and Edelman Alan. Julia: A fast dynamic language for technical computing. CoRR abs/1209.5145, 2012.
[8] Calder CA, Holloman C, and Higdon D. Exploring space-time structure in ozone concentration using a dynamic process convolution model. In Kass et al., editor, Bayesian Case Studies VI. Springer, New York, 2002.
[9] Calder Catherine A. A dynamic process convolution approach to modeling ambient particulate matter concentrations. Environmetrics, 19(1):39–48, 2008.
[10] Carlin Bradley P. and Banerjee Sudipto. Hierarchical multivariate CAR models for spatio-temporally correlated survival data, volume 4 of Bayesian Statistics 7. eds Bernardo JB and Dawid AP and Berger JO and West M, Oxford University Press, New York, 2003.
[11] Chib Siddhartha and Greenberg Edward. Understanding the Metropolis-Hastings algorithm. The American Statistician, 49:327–335, 1995.
[12] Cooner F, Banerjee S, and McBean AM. Modelling geographically referenced survival data with a cure fraction. Stat Methods Med Res, 15(4):307–324, 2006.
[13] D’Hondt O, Lopez-Martiez C, Ferro-Famil L, and Pottier E. Spatially nonstationary anisotropic texture analysis in SAR images. IEEE Transactions on Geoscience and Remote Sensing, 45:3905–3918, 2007.
[14] Diva U, Banerjee S, and Dey DK. Modelling spatially correlated survival data for individuals with multiple cancers. Stat Modelling, 7(2):191–213, 2007.
[15] Diva Ulysses, Dey Dipak K., and Banerjee Sudipto. Parametric models for spatially correlated survival data for individuals with multiple cancers. Stat Med, 27(12):2127–2144, 2008.
[16] Gelfand Alan E., Diggle Peter J., Fuentes Montserrat, and Guttorp Peter. Handbook of Spatial statistics. CRC Press, 2010.
[17] Gelman A, Carlin JB, Stern HS, and Rubin D. Bayesian data analysis. CRC Press, 2004.
[18] Gelman Andrew and Rubin Donald B. Inference from iterative simulation using multiple sequences. Statistical Science, 7(4):457–511, 1992.
[19] Gelman Andrew, Hwang Jessica, and Vehtari Aki. Understanding predictive information criteria for bayesian models. Statistics and Computing, 24(6):997–1016, 2013.
[20] Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57:97–109, 1970.
[21] Henderson Robin, Shimakura Silvia, and Gorst David. Modeling spatial variation in leukaemia survival data. J. Am. Statist. Assoc, 27:965–972, 2002.
[22] Higdon D. Space and space-time modeling using process convolutions. In Quantitative Methods for Current Environmental Issues. Springer, London, 2002.
[23] Kim S, Khen MH, Dey DK, and Gamerman D. Bayesian dynamic models for survival data with a cure fraction. Lifetime Data Anal, 13(1):17–35, 2007.
[24] Lee Elisa T. and Go Oscar T. Survival analysis in public health research. Annual Review of Public Health, 18:105–134, 1997.
[25] Moraga Paula and Lawson Andrew B. Gaussian component mixtures and CAR models in Bayesian disease mapping. Comput Stat Data Anal, 56:1417–1433, 2012.
[26] National Research Council. Putting people on the map: Protecting Confidentiality with Linked Social-Spatial Data. Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data. Gutmann MP, Stern PC, editors. Committee on the Human Dimensions of Global Change. Division of Behavioral and Social Sciences and Education. Washington (D.C.) National Academy Press, 2007.
[27] Onicescu G, Lawson AB, Zhang J, Gebregziabher M, Wallace K, and Eberth JM. Bayesian accelerated failure time model for space-time dependency in a geographically augmented survival model. Stat Methods Med Res, Epub ahead of Print, 2015.
[28] Paciorek C. The importance of scale for spatial-confounding bias and precision of spatial regression estimators. Stat Sci, 25(1):107–125, 2010.
[29] R. R version 3.0.1, R Foundation for Statistical Computing, Vienna, Austria: http://www.R-project.org, 2013.
[30] Rossouw JE, Anderson GL, Prentice RL, LaCroix AZ, Kooperberg C, Stefanick ML, Jackson RD, Beresford SA, Howard BV, Johnson KC, Kotchen JM, Ockene J, and Writing Group for the Women’s Health Initiative Investigators. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the women’s health initiative randomized controlled trial. JAMA, 288(3):321–333, 2002.
[31] Spiegelhalter David J, Best Nicola G., and van der Linde Angelika. Bayesian measures of model complexity and fit (with discussion). J R Stat Soc Series B, 64:583–639, 2002.
[32] Thomas Andrew, O’Hara Bob, Ligges Uwe, and Sturtz Sibylle. Making bugs open. R News, 6(1):12–17, 2006. URL http://cran.r-project.org/doc/Rnews/.
[33] Waller L and Gotway C. Applied Spatial Statistics for Public Health Data. Wiley; Hoboken, New Jersey, 2004.
[34] Wang S, Zhang J, and Lawson AB. A Bayesian normal mixture accelerated failure time spatial model and its application to prostate cancer. Stat Methods Med Res. (to appear), 2012.
[35] Zhang Jiajia and Lawson Andrew B. Bayesian parametric accelerated failure time spatial model and its application to prostate cancer. J Appl Stat, 8(2):591–603, 2011.
[36] Zhao L and Hanson TE. Spatially dependent polya tree modeling for survival data. Biometrics, 67(2):391–403, 2011.
[37] Zhou H, Lawson AB, Hebert JR, Slate EH, and Hill EG. Joint spatial survival modeling for the age at diagnosis and the vital outcome of prostate cancer. Stat Med, 27(18):3612–3628, 2008.
[38] Zhou H, Lawson AB, Hebert JR, Slate EH, and Hill EG. A Bayesian hierarchical modeling approach for studying the factors affecting the stage at diagnosis of prostate cancer. Stat Med, 27(9):1468–1489, 2008.
