Biostatistics (Oxford, England). 2018 Jun 24;20(4):666–680. doi: 10.1093/biostatistics/kxy023

Temporally dependent accelerated failure time model for capturing the impact of events that alter survival in disease mapping

Rachel Carroll 1, Andrew B Lawson 2, Shanshan Zhao 1
PMCID: PMC8136284  PMID: 29939209

Summary

The introduction of spatial and temporal frailty parameters in survival models furnishes a way to represent unmeasured confounding in the outcome of interest. Using a Bayesian accelerated failure time model, we are able to flexibly explore a wide range of spatial and temporal options for structuring frailties as well as examine the benefits of using these different structures in certain settings. A setting of particular interest for this work involved using temporal frailties to capture the impact of events of interest on breast cancer survival. Our results suggest that it is important to include these temporal frailties when there is a true temporal structure to the outcome and including them when a true temporal structure is absent does not sacrifice model fit. Additionally, the frailties are able to correctly recover the truth imposed on simulated data without affecting the fixed effect estimates. In the case study involving Louisiana breast cancer-specific mortality, the temporal frailty played an important role in representing the unmeasured confounding related to improvements in knowledge, education, and disease screenings as well as the impacts of Hurricane Katrina and the passing of the Affordable Care Act. In conclusion, the incorporation of temporal, in addition to spatial, frailties in survival analysis can lead to better fitting models and improved inference by representing both spatially and temporally varying unmeasured risk factors and confounding that could impact survival. Specifically, we successfully estimated changes in survival around the time of events of interest.

Keywords: Accelerated failure time, Breast cancer, Event impact, Survival, Spatio-temporal

1. Introduction

Disease mapping by way of spatial frailty models has become commonplace in survival analysis (Banerjee and others, 2003; Bastos and Gamerman, 2006; Henderson and others, 2002; Li and Ryan, 2002; Silva and Amaral-Turkman, 2005). The introduction of spatial frailty terms into these models offers a way to represent unmeasured risk factors with geographic structure that are related to the outcome of interest in the survival regression model. Typical examples of these unmeasured risk factors include: health disparities, access to care, or environmental exposure. Beyond spatially varying unmeasured risk factors, there could also be temporally or spatio-temporally varying unmeasured risk factors. The case study explored here was a motivating example that involved breast cancer-specific (BrCa) mortality in Louisiana during the years 2000 to 2013. In addition to a general increase in BrCa related knowledge, education, and disease screenings, two notable events, Hurricane Katrina (HK, August 2005) and the passing of the Affordable Care Act (ACA, March 2010), occurred during this time, and we believed they had the ability to impact the survival of the women. It was our belief that a hurricane could cause women to have delayed breast cancer diagnosis, less access to care, and/or worse quality of care, among other things, following the storm; beyond that, we felt that this event could impact coastal regions more severely. Additionally, the Affordable Care Act offered insurance to those that might have remained uninsured in other circumstances and potentially led to earlier BrCa diagnoses and better care. Previous work demonstrated the benefits of including spatial frailty terms for the modeling of these data (Carroll and others, 2017). 
Incorporating temporal and spatio-temporal random effects in other disease mapping areas has proved to be advantageous (Batista and Antn, 2013; Carroll and others, 2016; Lawson and others, 2017; Li and others, 2012; Waller and others, 1997), and we believed that it also had a place in survival analysis.

There has been some previous work performed in the realm of spatio-temporal survival analysis. Two case studies offered some examples for employing spatio-temporal frailties in a semiparametric Cox model (Banerjee and Carlin, 2003; Banerjee and others, 2003). While these case studies illustrated the importance of including spatio-temporal frailties, without a simulation study, it was difficult to assess the performance of these frailties. Onicescu and others (2017) also offered a complex spatio-temporal structure that assumed a dependency between space and time for an accelerated failure time (AFT) model. This complex methodology offered flexibility; however, the spatial and temporal components were not easily separated for interpretation. Additionally, a dependence between space and time might not always be necessary.

The methodology employed in this exploration combined and extended the previously executed spatio-temporal survival analyses. First, our model was a Bayesian AFT model (Christensen and Johnson, 1988) that assumed a standard logistic distribution for the error term. This model has recently increased in popularity as it offers advantages compared to the commonly used Cox proportional hazards model in that it is not necessary to assume proportional hazards and the fixed and random effects have a direct relationship with the logarithm of time (Onicescu and others, 2017; Orbe and others, 2002; Zhang and Lawson, 2011). Second, the linear predictors of interest incorporated fixed effects as well as flexible frailty structures such as spatial frailties alone, additive spatial and temporal frailties, or spatio-temporal interaction frailties. These linear predictor definitions could have been ideal in any given setting; however, we wished to explore the effectiveness of temporally dependent parameters for survival data and little exploration has occurred in this area thus far, particularly related to one or more events that have the potential to alter the survival experience. These methods were tested via a simulation study accompanied by a real data case study.

2. Statistical methods

We considered disease mapping for Inline graphic predefined small areas across Inline graphic units of time. For subject Inline graphic in small area Inline graphic and time unit Inline graphic, we observed Inline graphic, where Inline graphic was the survival time such that Inline graphic was the true survival time, Inline graphic was the censoring time, and Inline graphic was the censoring indicator. The time scale associated with subscript Inline graphic differed from the time scale of the survival outcome. The primary survival time Inline graphic was such that an individual had their own time zero, e.g., time since diagnosis for the case study, while the Inline graphic time units could be defined as calendar time to capture yearly trends or influence of a major event, e.g., following HK or the passing of the ACA in the motivating example. Thus, the AFT model was expressed as follows for the Inline graphic individual in spatial area Inline graphic and temporal unit Inline graphic:

graphic file with name M17.gif (2.1)

where Inline graphic was individual survival time, Inline graphic was a linear predictor that incorporated the fixed and random effects, Inline graphic were random errors, and Inline graphic was a scale parameter. Common distributional assumptions for the errors include Weibull, standard logistic, and standard normal. Here, we assumed a logistic distribution as it did not require the proportional hazards assumption, which was a major advantage of the AFT, and it offered closed-form expressions of the survival and hazard functions (Collett, 2013). The proposed method applies to all alternative distributional assumptions. For the scale parameter Inline graphic, we assumed a flat, uniform prior distribution that ranged from 0.01 to 10 (Christensen and Johnson, 1988), which is a common choice for this scale parameter. Additionally, a sensitivity analysis presented in the supplementary material available at Biostatistics online suggested that different error distribution assumptions led to nearly identical posterior estimates, particularly for appropriately specified models. The corresponding survival and density functions can be expressed as follows:

graphic file with name kxy023-um1.jpg

where Inline graphic by rearranging equation 2.1 above. Following this, the likelihood was defined using: Inline graphic as a vector containing all Inline graphic survival times, Inline graphic as a matrix containing individual level covariate, spatial, and temporal information, and Inline graphic a vector containing all model parameters. This likelihood assumed conditional independence and was written as follows:

graphic file with name kxy023-m2-2.jpg (2.2)
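
As a minimal sketch of how a likelihood of this kind can be evaluated, the function below assumes the standard logistic-error AFT form, log t = mu + sigma * eps, so that S(t) = 1 / (1 + exp(eps)) and f(t) = exp(eps) / (sigma * t * (1 + exp(eps))^2) with eps = (log t - mu) / sigma. The names `times`, `events`, `mus`, and `sigma` are chosen here for illustration and are not the authors' notation; subjects with an observed death contribute the density and censored subjects contribute the survival function.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def aft_logistic_loglik(times, events, mus, sigma):
    """Log-likelihood of an AFT model with standard logistic errors.

    times  : observed times (minimum of survival and censoring time), > 0
    events : 1 if the death was observed, 0 if right censored
    mus    : linear predictor per subject (fixed effects plus frailties)
    sigma  : scale parameter, > 0
    """
    total = 0.0
    for t, d, mu in zip(times, events, mus):
        eps = (math.log(t) - mu) / sigma
        log_surv = -softplus(eps)  # log S(t) = -log(1 + exp(eps))
        if d == 1:
            # log f(t) = eps - log(sigma * t) - 2 * log(1 + exp(eps))
            total += eps - math.log(sigma * t) + 2.0 * log_surv
        else:
            total += log_surv
    return total
```

Because the contributions are summed over subjects, the sketch relies on the same conditional independence assumption stated for the likelihood.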

2.1. Spatial frailty models

When we considered spatial frailties alone, the definition of the AFT model only needed to be parameterized for individual Inline graphic in spatial area Inline graphic, constant across time unit Inline graphic, such that Inline graphic, Inline graphic, was defined as:

graphic file with name M33.gif

where Inline graphic represented the fixed effect parameters associated with individual known, important risk factors and Inline graphic was the spatial frailty term that represented the difference between means of a specific spatial unit and the population. The prior distributions for the Inline graphic, Inline graphic fixed effect parameters were assumed independent such that Inline graphic. The spatial frailty term Inline graphic could be defined as either uncorrelated (Inline graphic) or a convolution of uncorrelated and correlated (Inline graphic) heterogeneity. The uncorrelated frailty, Inline graphic, was defined with Inline graphic prior distribution, which allowed spatial units to be independent conditional on Inline graphic. Alternatively, the correlated frailty, Inline graphic, was defined by a conditional autoregressive (CAR) prior such that Inline graphic (Besag and Green, 1993), where Inline graphic is the number of first degree neighbors for parish Inline graphic; thus, a parish Inline graphic is only directly related to its nearest Inline graphic neighbors. Finally, the spatial frailty precisions (Inline graphic) and all precision parameters henceforth were defined by the following general uninformative prior distribution: Inline graphic. A sensitivity analysis displayed in the supplementary material available at Biostatistics online suggested that the results were robust to different prior assumptions. We denoted this spatial frailty only model as Inline graphic.
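
To make the neighborhood dependence of the CAR prior concrete, the sketch below computes the conditional mean and variance of one area's correlated frailty given its neighbors. It assumes the usual intrinsic CAR form, in which the conditional mean is the average of the neighboring frailties and the conditional variance is 1 / (tau * n_i); the function name, the adjacency dictionary, and the symbol `tau` are illustrative assumptions rather than the authors' code.

```python
def car_conditional(i, u, neighbors, tau):
    """Conditional mean and variance of u[i] under an intrinsic CAR prior.

    i         : index of the area of interest
    u         : current values of the correlated frailties
    neighbors : dict mapping each area to a list of its first degree neighbors
    tau       : precision parameter of the CAR prior, > 0
    """
    nbrs = neighbors[i]
    n_i = len(nbrs)  # number of first degree neighbors
    mean = sum(u[j] for j in nbrs) / n_i
    variance = 1.0 / (tau * n_i)
    return mean, variance


# Area 0 borders areas 1 and 2, so its conditional mean is their average.
adjacency = {0: [1, 2], 1: [0], 2: [0]}
print(car_conditional(0, [0.0, 1.0, 3.0], adjacency, tau=2.0))
```

Areas with more neighbors receive a smaller conditional variance, which is why the neighbor count enters the prior.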

2.2. Additive spatial and temporal frailty models

Next, we extended the spatial frailty only model Inline graphic to incorporate temporal variations. There were multiple avenues for incorporating a temporal structure within the frailty terms. The first we examined was an additive form of spatial and temporal frailties; these additive models assumed that the spatial variation was constant across time and the temporal variation constant across space. So, Inline graphic was defined as Inline graphic where Inline graphic, Inline graphic were defined as in Section 2.1 and Inline graphic was the temporal frailty term. Based on our motivating example, we considered two decompositions of the temporal frailty Inline graphic: Inline graphic, where Inline graphic for Inline graphic with Inline graphic as some meaningful temporal unit across the study period (e.g. by calendar year to represent the advances in screening and BrCa knowledge from one year to the next) and Inline graphic, where Inline graphic for Inline graphic and Inline graphic such that Inline graphic was as in Inline graphic and Inline graphic denoted the time unit defined by change points based on the effects of influential events within the study period. As discussed before, these events could range from natural disasters to government legislation to changes in treatment procedures, where the first two were considered in our real data case study.

We could define change points to allow the effect of events to start immediately or have a lagged start. The lag time choice was especially important for natural disasters, e.g., allowing the effect related to HK to begin one or more years following the storm. Individuals had Inline graphic (Inline graphic) if they were diagnosed prior to the change point for the first event of interest, Inline graphic if they were diagnosed between the change points for the first and second events, up to Inline graphic if they were diagnosed after the change point for the last event. When there are multiple events with overlapping time windows of impact, we can similarly define the change points to reflect the impact of all events.

For the event-related parameter, many different specifications could arise. The first specification that we considered was called “constant”; for this definition, there was a change point related to each event, of which there could be one or more. This definition was such that all those who entered the study following the change point for a given event were considered impacted by that event. As an example using HK, the change point could be September 2005; thus, women diagnosed before September 2005 had Inline graphic and those diagnosed after had Inline graphic. The next specification that we explored was called “jump and return” wherein two change points and an estimate defined a window of time for the event to have an impact. Using HK as an example, the window of impact for this event could be September 2006 to August 2008 such that women diagnosed between these time points had Inline graphic. All other women had Inline graphic. Finally, a “trend” option represented the progression of survival experience following an event. Here, the trend had several change points with a new estimate that corresponded to each. For an example related to HK, the trend parameter could produce four estimates corresponding to women who were diagnosed in the first year (September 2005–August 2006), second year (September 2006–August 2007), third year (September 2007–August 2008), and then the rest of the study time (September 2008 and later) following the storm. The trend frailty, along with Inline graphic, was used for identifying the appropriate event-frailty lag time and change points for the constant and jump and return options. For this manuscript, we predefined these change points to reflect our beliefs about the influential events.
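
The three specifications differ only in how a diagnosis date is mapped to an event index. The sketch below illustrates one possible mapping; the function names and the zero-based coding (0 meaning not impacted) are assumptions made for illustration, since the source does not state how the index was coded.

```python
from bisect import bisect_right


def months(year, month):
    """Convert a (year, month) pair to a single month count."""
    return year * 12 + (month - 1)


def trend_index(diag, change_points):
    """Number of change points at or before the diagnosis (year, month)."""
    cps = sorted(months(*cp) for cp in change_points)
    return bisect_right(cps, months(*diag))


def constant_index(diag, change_point):
    """0 before the single change point, 1 from the change point onward."""
    return trend_index(diag, [change_point])


def jump_and_return_index(diag, start, end):
    """1 inside the window [start, end), 0 before it and after it."""
    return 1 if trend_index(diag, [start, end]) == 1 else 0


# Constant HK effect with a September 2005 change point.
print(constant_index((2005, 8), (2005, 9)), constant_index((2005, 9), (2005, 9)))
# Jump and return window from September 2006 through August 2008.
print(jump_and_return_index((2007, 3), (2006, 9), (2008, 9)))
```

A lagged effect is obtained simply by moving the change point forward, e.g., passing (2006, 9) rather than (2005, 9) for a one year lag after HK.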

A random walk prior distribution was used to represent the non-event-defined temporal frailty parameter such that Inline graphic with Inline graphic. This random walk structure imposed an ideal temporal correlation due to the sequential nature of time. The prior distribution of the constant and jump and return event-defined temporal frailties was as follows: Inline graphic with Inline graphic. An intuitive prior distribution for the temporal trend frailty was also a random walk to impose correlation in time. The change points for all event-related frailties were based on the timing of the event, e.g., the month of September for HK, and this minimized identifiability issues between the non-event- and event-related frailties. There are many potential combinations and specifications related to this temporal frailty, and we addressed those believed to be most appropriate for these data. However, other specifications could be important in other circumstances, and this flexible modeling framework can handle other specifications. Details about the specifications used in the case study are defined in Section 4.2.
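
As an illustration of the temporal correlation a random walk prior induces, the sketch below draws one realization of a first-order random walk in which each value is normally distributed around the previous one with variance 1 / tau. The function name, the choice to center the first value at zero, and the seed handling are assumptions for this example, not details taken from the source.

```python
import math
import random


def simulate_rw1(n_units, tau, seed=0):
    """Draw one realization of a first-order random walk.

    n_units : number of temporal units (e.g., calendar years)
    tau     : precision of each increment, > 0
    seed    : seed for reproducibility
    """
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(tau)
    # Assumption: the first value is centered at zero.
    gamma = [rng.gauss(0.0, sd)]
    for _ in range(1, n_units):
        # Each later value is centered at its predecessor.
        gamma.append(rng.gauss(gamma[-1], sd))
    return gamma


print(simulate_rw1(6, tau=4.0, seed=1))
```

Larger values of tau give smoother trajectories because consecutive temporal units are pulled closer together.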

2.3. Spatio-temporal frailty models

A second technique for incorporating temporal structure via frailties into the linear predictor involved including a spatio-temporal interaction parameter (Knorr-Held, 2000; Wikle and others, 1999; Banerjee and Carlin, 2003; Banerjee and others, 2003); even more complex structures could be assumed here, e.g., also including additive spatial and/or temporal terms. In this model, survival experience differed per space and time. The subsequent linear predictor (Inline graphic) was defined as follows:

graphic file with name M86.gif

We considered two definitions of Inline graphic. The first, Inline graphic, assumed frailties that were correlated in time, uncorrelated in space defined as Inline graphic. We let the precision parameter vary by Inline graphic as this allowed the spatio-temporal interaction to accommodate separate precision parameters per study year to potentially represent the impact of an event during year Inline graphic. The other, Inline graphic, employed a multivariate conditional autoregressive (MCAR) model which furnished a frailty structure that was correlated in space and time (Banerjee and Carlin, 2003) defined as Inline graphic where Inline graphic were as in Section 2.1, Inline graphic was a Inline graphic matrix that contained adjacency information, and Inline graphic was a Inline graphic matrix that represented the conditional variance matrix. This definition was structured as in Banerjee and others (2003) where they go on to abbreviate the distribution by dictating that Inline graphic, and this is the definition we will use to refer to this distribution from here on. The precision matrix Inline graphic followed a Wishart distribution with degrees of freedom Inline graphic and parameter matrix Inline graphic where Inline graphic was the identity matrix with dimension Inline graphic. To complete the appropriate specification for the MCAR model, an intercept was included for each Inline graphic such that Inline graphic where Inline graphic follows the “dflat” prior distribution, a very wide, flat prior distribution. This spatial correlation could be important when considering the impact of events on survival as certain events, e.g., natural disasters, could impact spatial regions differently.

2.4. Notation and summary of fitted models

The fitted models described in Sections 2.1–2.3 are summarized in Table 1. These fitted models covered a wide range of options available for spatial and spatio-temporal specification within the AFT modeling framework and were defined in such a way that all simpler models are special cases of the more complex one(s); thus, Inline graphic is a special case of Inline graphic which is a special case of Inline graphic, and so on. Note, we only considered the uncorrelated spatial frailty in this simulation since (1) the focus of this project was on the temporal frailties and (2) this was the best spatial frailty structure for previous spatial only explorations involving the case study data (Carroll and others, 2017). All the methods could accommodate a correlated spatial frailty parameter.

Table 1.

Summary of fitted models

  Fixed Spatial Temporal Spatio-temporal
Fitted Models Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic        
Inline graphic      
Inline graphic    
Inline graphic      
Inline graphic        

The ✓ represents what is included in a given model. For example, Inline graphic has four checks representing the model contents such that Inline graphic.

2.5. Model comparison and evaluation tools

The model goodness-of-fit was evaluated using the standard measure of deviance information criterion (DIC) (Spiegelhalter and others, 2002). However, when some of the simulation scenarios were fitted with the MCAR model, a negative effective number of parameters estimate (pD) was produced; this estimate was needed for the DIC calculation but nonsensical when the value was below zero. Typically, negative estimates arise because of a strong prior-data conflict, and this type of result is understandable when an overly complex model is fitted. We adopted the slightly more conservative Gelman and others (2004) alternative calculation using Inline graphic where Inline graphic was the posterior deviance from the OpenBUGS sampler as this prevented a nonsensical negative pD estimate. Hence, the DIC calculation was such that Inline graphic for all scenarios.
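
The alternative calculation can be sketched as follows, assuming the Gelman and others (2004) form in which pD is half the posterior variance of the deviance and DIC is the posterior mean deviance plus pD. The function name and the use of the sample variance are assumptions for this illustration; the input would be the deviance values monitored in the sampler.

```python
import statistics


def dic_gelman(deviance_samples):
    """Return (DIC, pD) with pD computed as half the variance of the deviance.

    deviance_samples : posterior draws of the deviance from the sampler
    """
    d_bar = statistics.fmean(deviance_samples)
    # A variance cannot be negative, so pD is never below zero.
    p_d = statistics.variance(deviance_samples) / 2.0
    return d_bar + p_d, p_d


print(dic_gelman([10.0, 12.0, 14.0]))
```

Because pD is built from a variance, this version avoids the negative values that the standard calculation produced for some MCAR fits.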

The models' recovery abilities were evaluated via bias squared calculations in the simulation study. These calculations used the posterior mean as an estimate of Inline graphic (denoted Inline graphic), computed the bias squared per simulated data set (Inline graphic), and then averaged the bias squared over all simulated data sets. The same calculation was also performed for the frailty estimates alone such that, for example with Inline graphic, Inline graphic and Inline graphic were used in place of Inline graphic and Inline graphic, respectively.
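
A minimal sketch of this two-step averaging is shown below. It assumes that the bias squared for one data set is the mean squared difference between the posterior mean linear predictor and its simulated truth across individuals; whether the original calculation averaged or summed over individuals is not recoverable from the text, so that choice, along with the function and argument names, is an assumption.

```python
def bias_squared(estimates_per_set, truth):
    """Average bias squared over simulated data sets.

    estimates_per_set : one list of posterior mean estimates per data set
    truth             : simulated true values, aligned with each estimate list
    """
    per_set = []
    for estimates in estimates_per_set:
        # Assumption: average the squared differences across individuals.
        squared = [(e - t) ** 2 for e, t in zip(estimates, truth)]
        per_set.append(sum(squared) / len(truth))
    # Average the per-data-set values over all simulated data sets.
    return sum(per_set) / len(per_set)


print(bias_squared([[1.0, 2.0], [2.0, 4.0]], [1.0, 2.0]))
```

The same function applies to the frailty-only comparison by passing frailty estimates and true frailties in place of the linear predictor.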

Maps, plots, and secondary assessments were also useful tools of evaluation. In the simulation study, we calculated and plotted measures of bias squared to assess simulation ground truth recovery. Finally, we produced maps and plots of the spatial and temporal frailty estimates for (1) simulation ground truth comparison and (2) assessment in the real data case study. The case study evaluation was taken a step further by quantitatively assessing the spatial and temporal frailties via comparisons with risk factors available at the same spatial and temporal resolution (Carroll and Zhao, 2018).

2.6. Computational techniques

These analyses were accomplished using RStudio version 0.99.902 and R version 3.3.1. Specifically, the package R2OpenBUGS which calls the Bayesian inference software OpenBUGS from R was utilized for inference (Carroll and others, 2015; Lunn and others, 2013; Team, 2015; Thomas and others, 2014; 2006). To execute the AFT model in OpenBUGS, the zeros trick as described in the BUGS manual was employed to obtain the correct likelihood contribution since this likelihood was not among the standard distributions. Finally, the R package fillmap, which is available via GitHub, was utilized for producing maps and performing the quantitative secondary assessments (Carroll and Zhao, 2018; Carroll, 2016).
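
The zeros trick works by treating an observed zero as a Poisson draw with mean phi = C - log L, so that the probability of that zero, exp(-phi), is proportional to the desired likelihood contribution L. The sketch below checks this identity numerically; it is written in Python purely for illustration, is not BUGS code, and the function names and the constant C are assumptions.

```python
import math


def zeros_trick_phi(log_lik, constant):
    """Poisson mean used by the zeros trick for one observation.

    log_lik  : log-likelihood contribution of the observation
    constant : positive constant C chosen so that phi stays positive
    """
    phi = constant - log_lik
    if phi <= 0.0:
        raise ValueError("constant is too small: phi must be positive")
    return phi


def poisson_zero_probability(phi):
    """Probability that a Poisson(phi) variable equals zero."""
    return math.exp(-phi)


log_lik = -3.2
phi = zeros_trick_phi(log_lik, constant=10.0)
# exp(-phi) equals L * exp(-C), i.e., the likelihood up to a constant factor.
print(poisson_zero_probability(phi), math.exp(log_lik) * math.exp(-10.0))
```

Since the factor exp(-C) does not depend on the parameters, it leaves the posterior unchanged.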

In the OpenBUGS sampler, the following updaters were utilized for the given parameters: adaptive Metropolis mixed block for Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic; standard adaptive Metropolis block for Inline graphic and Inline graphic from Inline graphic; wrapper for chain graph for Inline graphic and Inline graphic from Inline graphic; conjugate Wishart for Inline graphic; and slice for the standard deviations of the random effects. Example R and BUGS code is available in the supplementary material available at Biostatistics online. Additionally, an R Shiny application for the case study is available on GitHub from user: carrollrm in repository: LAmortBCaShiny.

3. Simulation study

3.1. Assumptions

For these simulation studies, we assumed an area that contained 25 spatial regions on a Inline graphic grid. We further assumed that all individual diagnoses (Inline graphic) occurred in a span of 6 years with 200 per year and at least one per spatial region for each year. A month of diagnosis variable was also randomly generated.

The fixed effects comprised one standardized continuous (Inline graphic, Inline graphic), one dichotomous (Inline graphic, Inline graphic), and one categorical (Inline graphic, Inline graphic, Inline graphic) covariate, and the intercept Inline graphic was set to 2.5. These were included in all simulation scenarios. For the MCAR model, Inline graphic. For the random effects, spatial, temporal, and spatio-temporal structures were assumed where the indexing in space was Inline graphic for the spatial regions within the grid and time was related to diagnosis year, Inline graphic, and the event which occurred at month six of year three, Inline graphic. Specifically for random effects, we assumed an uncorrelated spatial variation (Inline graphic), two possibilities for a temporal variation: (1) an annual temporal random walk (Inline graphic with Inline graphic for Inline graphic) and (2) an annual temporal random walk plus an event-related temporal parameter that followed the “constant” specification via a single change point (Inline graphic where Inline graphic with Inline graphic for Inline graphic, Inline graphic, and Inline graphic), and two alternatives for spatio-temporal variation: (1) Inline graphic and (2) Inline graphic where Inline graphic. A single realization of these random effects was assumed for all simulated data sets in combinations to reflect the five fitted models described in Sections 2.1–2.4.

To produce the 50 simulated data sets under each of these scenarios, a simulated Inline graphic was calculated based on the specifications above. With that value, a survival time (in months) was calculated as in equation 2.1 with Inline graphic and an assumed error, Inline graphic, that was simulated separately for each of the 50 simulated data sets to offer variation within the simulation scenario. A censoring time was also generated from an exponential distribution such that roughly 90% of individuals were censored. This amount of censoring closely resembles the case study data as well as breast cancer statistics on survival (American Cancer Society, 2016).
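
The data-generating step can be sketched as below, assuming survival times follow log t = mu + sigma * eps with a standard logistic error drawn by inverting its distribution function, and censoring times follow an exponential distribution. The function name, the arguments, and the values in the usage line are illustrative assumptions; in particular, the censoring rate would need to be tuned to reach the roughly 90% censoring described in the text.

```python
import math
import random


def simulate_aft(mus, sigma, censor_rate, seed=0):
    """Simulate right-censored survival data from a logistic-error AFT model.

    mus         : linear predictor per individual
    sigma       : scale parameter, > 0
    censor_rate : rate of the exponential censoring distribution, > 0
    Returns a list of (observed_time, event_indicator) pairs.
    """
    rng = random.Random(seed)
    data = []
    for mu in mus:
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse distribution function
            u = rng.random()
        eps = math.log(u / (1.0 - u))  # standard logistic draw
        t_true = math.exp(mu + sigma * eps)
        c = rng.expovariate(censor_rate)
        data.append((min(t_true, c), 1 if t_true <= c else 0))
    return data


sample = simulate_aft([2.5] * 1200, sigma=1.0, censor_rate=0.5, seed=1)
print(sum(1 - d for _, d in sample) / len(sample))  # observed censoring proportion
```

Changing only the seed while holding `mus` fixed mirrors the design in which a single realization of the random effects is shared and the error is redrawn for each simulated data set.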

3.2. Simulation study results

Table 2 includes the model goodness-of-fit and recovery evaluation measures from the simulation study data scenarios that reflect all simulated data sets and fitted models. Over all scenarios, the true models performed very well, all with low DIC and bias. In terms of identifying the correct model, DIC was generally sufficient, but when two DICs were close, bias squared was useful. Further, the goodness-of-fit and recovery estimates largely agreed and were less definitive for the non-additive scenarios. Ultimately, these results suggested that failing to include a temporal frailty when needed (i.e., fitting Inline graphic to any of the simulated data sets not simulated as in Inline graphic) led to poor model fits and biased estimates; additionally, including a temporal frailty with simple structure when it was unneeded (i.e., data simulated as in Inline graphic fit with Inline graphic) did not sacrifice model goodness-of-fit or recovery. Both measures did well to identify if the frailty structure should be additive (Inline graphic, Inline graphic) or an interaction (Inline graphic, Inline graphic). Figure 2.1 of the supplementary material available at Biostatistics online displays plots of bias squared relating to the frailty portion of the simulated data scenario models. The bias squared estimates indicated how well the models recovered the assumed frailties for each simulation scenario. These results differed from those in Table 2 in that they only related to the frailty estimates. However, they agreed with DIC and some overall bias squared estimates in that the bias squared appeared to appropriately indicate if the frailty should be additive spatial and temporal terms or an interaction. Further, these results also suggested that including temporal frailties of any type when unnecessary does not largely increase the bias squared, aside from Inline graphic. 
Additionally, regardless of the frailty specification, the fixed effect estimates were recovered fairly well for all models and scenarios (Table 2.1 of the supplementary material available at Biostatistics online); however, the further the fitted model was from the true model in terms of specification, the more biased the estimates became.

Table 2.

Model goodness-of-fit for each simulation scenario averaged over the 50 simulated data sets

Model Parameterization DIC pD Inline graphic
Data simulated as in Inline graphic
Inline graphic Inline graphic 4263.15 31.11 7.79
Inline graphic Inline graphic 4266.42 33.59 7.80
Inline graphic Inline graphic 4269.62 34.78 11.54
Inline graphic Inline graphic 4290.03 70.19 7.98
Inline graphic Inline graphic 4361.11 161.04 935.08
Data simulated as in Inline graphic
Inline graphic Inline graphic 5145.22 31.11 20.06
Inline graphic Inline graphic 4401.93 36.24 7.79
Inline graphic Inline graphic 4419.90 38.73 7.75
Inline graphic Inline graphic 4567.28 189.93 8.19
Inline graphic Inline graphic 4509.07 158.68 935.67
Data simulated as in Inline graphic
Inline graphic Inline graphic 7248.81 29.12 988.68
Inline graphic Inline graphic 5043.69 34.62 204.13
Inline graphic Inline graphic 2889.37 37.99 8.52
Inline graphic Inline graphic 4814.55 117.13 191.50
Inline graphic Inline graphic 4827.53 134.36 1143.08
Data simulated as in Inline graphic
Inline graphic Inline graphic 5066.11 30.66 19.53
Inline graphic Inline graphic 5048.95 35.44 18.94
Inline graphic Inline graphic 5050.77 37.05 21.88
Inline graphic Inline graphic 4412.46 17.84 8.01
Inline graphic Inline graphic 4409.86 180.26 935.08
Data simulated as in Inline graphic
Inline graphic Inline graphic 7147.41 31.73 151.42
Inline graphic Inline graphic 6287.31 35.51 73.61
Inline graphic Inline graphic 6287.27 37.32 72.94
Inline graphic Inline graphic 4482.51 191.04 10.47
Inline graphic Inline graphic 4479.24 204.54 9.76

Bold highlighted estimates indicate models comparable to the best in terms of fit and recovery. Close for DIC is 3–4 units or less and close for bias squared is a 5% difference.

Figure 1 displays the temporal frailty estimates related to the data simulated as in Inline graphic and fitted with models Inline graphic and Inline graphic. All other true temporal and spatio-temporal frailties in the simulation and the corresponding estimates are included in Figures 2.2–2.14 of the supplementary material available at Biostatistics online. All these displays suggested the true simulation model frailties were recovered well under appropriate models. Specifically, both Inline graphic and Inline graphic recovered the truth for the data simulated as in Inline graphic. For the data simulated as in Inline graphic, the effect of the event was separable and represented accurately with Inline graphic by including Inline graphic. Fitted model Inline graphic attempted to recover the truth by giving an averaged estimate for the year in which the event occurred and, thus, was not able to recover the impact of the event as well as the true model, Inline graphic. We also explored events that occurred at month one of year four, i.e., Inline graphic occurred at the same time point as Inline graphic, and of different magnitudes. From these, it was apparent that Inline graphic could accurately recover the impact of these events; however, Inline graphic had an advantage through the ability to give an accurate estimate directly related to the event of interest, when the impact was strong enough. Similarly, the spatial frailty estimates from Inline graphic, Inline graphic, and Inline graphic reflected the 6 year average of the simulation truth for Inline graphic and Inline graphic (Figures 2.4 and 2.5 of the supplementary material available at Biostatistics online).

Fig. 1.

Temporal frailty simulation assumption (Inline graphic) and estimates (Inline graphic and Inline graphic).

4. Louisiana SEER BrCa-specific mortality case study

The case study illustrated the fitted models' abilities under a real situation rather than the ideal one laid out in the simulation study. Explicitly, this case study explored BrCa-specific mortality in Louisiana, USA for years 2000 to 2013. These data were previously explored in the spatial only setting (Carroll and others, 2017). Those results suggested that spatial frailties were important and explained survival differences that were independent of the individual-level risk factors adjusted for, but we believed this could be further improved by including temporal frailty terms.

4.1. Case study data

The data for this case study were obtained from the publicly available SEER data sets (release date: 15 April 2016), which offered a cause of death variable to indicate whether the individual's underlying cause of death was BrCa, as well as survival time in months. In addition, diagnosis month and year were recorded along with the FIPS county code for each woman. Together, these created the spatial and temporal survival scenario that we were interested in exploring.

The SEER data also provided several clinical and demographic risk factors of known association with BrCa mortality (American Cancer Society, 2016; Surveillance, Epidemiology, and End Results Program, 2015; Wieder and others, 2016). By including these available risk factors, the spatial, temporal, and spatio-temporal frailties represented latent combinations of unmeasured risk factors beyond these. Explicitly, the known risk factors of interest included: African American race, marital status at diagnosis, age at diagnosis (standardized to have a mean of 0 and standard deviation of 1), cancer grade, ER/PR tumor subtype status, BrCa surgery, and radiation therapy.
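The standardization of age at diagnosis mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function name `standardize` and the example ages are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev

def standardize(values):
    """Center values to mean 0 and scale them to standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; sample SD is equally plausible
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]

# Hypothetical ages at diagnosis, for illustration only (not SEER records).
ages = [42, 55, 61, 68, 74]
z_age = standardize(ages)
```

With this scaling, the age coefficient in the model is read as the effect of a one standard deviation difference in age at diagnosis.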

This real data case study furnished a situation for examining the use of temporal and spatio-temporal frailties in relation to risk factors and external events that could alter the survival time. HK made landfall on 29 August 2005 and moved through the state of Louisiana during the study period. Additionally, the ACA, nationwide legislation that offered affordable health care to all citizens, was signed into law by President Barack Obama on 23 March 2010. Finally, knowledge, education, and screenings related to BrCa improved over these years, which could influence breast cancer mortality; we believed that a temporal frailty could aid in adjusting for that as well.

4.2. Case study-specific statistical model details

For this case study, we considered a single definition of Inline graphic, diagnosis years, and several definitions of Inline graphic. From the definitions given in Section 2.2, we considered non-lagged and lagged options for the constant (single change point) and jump and return (double change point) specifications. A lagged effect indicated that, for example, HK’s impact on BrCa-specific mortality was not detected for those diagnosed immediately after the storm but instead began a year later. The trend specification was also considered. Here, we had estimates for five years following HK but only four years following ACA due to the end of the study. Then, from the best single change point constant options (non-lagged vs. lagged) per event, we considered the BOTH specification, which combined the change points from the best models for the two individual events. The event-related frailties in the alternative Inline graphic specifications allowed the HK-related frailty to change at the month of September and the ACA-related frailty to change at the month of April for the specified years. The alternative options led to slightly different interpretations of the temporal frailty.
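To make the change-point specifications concrete, the sketch below maps a diagnosis date to the level of the event-related frailty that would apply to it. It is an illustration under stated assumptions, not the authors' model code: the function name `event_frailty_level`, its arguments, and the default window and level counts are hypothetical, and level 0 is used here to mean "no event-specific effect".

```python
def event_frailty_level(diag_year, diag_month, change_point, spec="constant",
                        lag_years=0, window_years=2, max_levels=4):
    """Map a diagnosis date to an event-frailty level (0 = no event effect).

    change_point is the (year, month) of the first month assigned to the event.
    """
    cp_year, cp_month = change_point
    # Whole months between the (possibly lagged) change point and the diagnosis.
    elapsed = (diag_year - cp_year - lag_years) * 12 + (diag_month - cp_month)
    if elapsed < 0:
        return 0  # diagnosed before the change point
    if spec == "constant":
        return 1  # a single level kept for the rest of the study
    if spec == "jump_and_return":
        # The level applies inside the window, then returns to the baseline.
        return 1 if elapsed < 12 * window_years else 0
    if spec == "trend":
        # One level per year since the change point, capped at max_levels.
        return min(elapsed // 12 + 1, max_levels)
    raise ValueError(f"unknown specification: {spec}")

HK = (2005, 9)   # HK-related frailty changes in September
ACA = (2010, 4)  # ACA-related frailty changes in April

print(event_frailty_level(2005, 8, HK))               # diagnosed before the storm
print(event_frailty_level(2006, 3, HK, lag_years=1))  # still inside the lag year
print(event_frailty_level(2012, 6, ACA, "trend"))     # third year after the ACA
```

Keeping the specification as an argument makes it easy to build the index for each candidate model from the same diagnosis dates before fitting and comparing them.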

4.3. Case study results

The results in Table 3 were used for assessing the models’ goodness of fit. When we considered the events one at a time, the best models were ACA - constant and HK - constant lag. The DIC measures suggested that all models that accounted for ACA without a lag time were comparable and the best fitting. The fit with BOTH - constant did not appear to improve upon ACA - constant enough to warrant including an HK-related temporal frailty; thus, Inline graphic ACA - constant was the best model for these data.

Table 3.

Model assessment estimates and calculations

Model                  Parameterization    DIC        pD
Inline graphic         Inline graphic      54997.5    39.2
Inline graphic         Inline graphic      54930.0    48.6
Inline graphic         Inline graphic
  HK-constant                              54930.0    48.0
  HK-constant lag                          54928.2    48.4
  HK-jump and return                       54930.0    49.1
  HK-trend                                 54930.0    49.0
  ACA-constant                             54926.1    48.2
  ACA-constant lag                         54930.0    49.5
  ACA-trend                                54930.0    48.7
  BOTH-constant                            54926.1    48.7
ST1                    Inline graphic      54969.4    95.4
ST2                    Inline graphic      54986.8    128.3

For the Inline graphic models, HK and ACA indicate Hurricane Katrina (August 2005) and the Affordable Care Act (March 2010), respectively. Constant provides a single change point for the frailty and keeps that estimate for the rest of the study years. Jump and return indicates an estimate that changes for a window of time (specifically September 2007 to August 2009) and then returns to the previous parameter value (2 change points, Inline graphic). Constant lag means that a year of lag time was allowed following the event date (1 change point, Inline graphic). Trend indicates that the temporal frailty term offered a different estimate per year for several years following the event (HK: 4 change points, Inline graphic; ACA: 3 change points, Inline graphic). BOTH includes changes for both events, where the change point for HK is lagged (2 change points, Inline graphic). Highlighted estimates indicate models comparable to the best in terms of fit; DIC values within 3–4 units of the best are considered close.
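The DIC comparison described in the table note can be sketched as below. The (DIC, pD) values are transcribed from Table 3; the function name `rank_by_dic`, the default closeness threshold of 3 units, and the tie-break on pD are assumptions made for illustration and are not stated as the authors' selection rule.

```python
# (DIC, pD) pairs transcribed from Table 3. The first two rows are omitted
# because their model labels did not survive extraction.
table3 = {
    "HK-constant": (54930.0, 48.0),
    "HK-constant lag": (54928.2, 48.4),
    "HK-jump and return": (54930.0, 49.1),
    "HK-trend": (54930.0, 49.0),
    "ACA-constant": (54926.1, 48.2),
    "ACA-constant lag": (54930.0, 49.5),
    "ACA-trend": (54930.0, 48.7),
    "BOTH-constant": (54926.1, 48.7),
    "ST1": (54969.4, 95.4),
    "ST2": (54986.8, 128.3),
}

def rank_by_dic(fits, close=3.0):
    """Return (name, DIC difference from best, pD, comparable?) sorted by fit."""
    best = min(dic for dic, _ in fits.values())
    rows = []
    for name, (dic, p_d) in fits.items():
        delta = round(dic - best, 1)  # table values carry one decimal place
        rows.append((name, delta, p_d, delta <= close))
    # Assumption: break DIC ties in favor of the smaller pD.
    return sorted(rows, key=lambda row: (row[1], row[2]))

for name, delta, p_d, comparable in rank_by_dic(table3):
    print(f"{name:20s} dDIC={delta:5.1f}  pD={p_d:6.1f}  comparable={comparable}")
```

Setting `close=4.0` instead applies the upper end of the 3–4 unit range given in the note; the spatio-temporal models ST1 and ST2 remain far from the best fit under either threshold.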

Figure 2 displays the temporal frailty estimates for Inline graphic ACA - constant with the Louisiana SEER data. Estimates for other models are included in Figures 3.1 to 3.8 of the supplementary material available at Biostatistics online. The fixed effect estimates and spatial frailties were nearly identical for all fitted models, and the estimates associated with Inline graphic ACA - constant are included in Table 3.1 and Figure 3.9 of the supplementary material available at Biostatistics online. In general, the overall (Inline graphic for Inline graphic and Inline graphic for Inline graphic) temporal frailty suggested an increase in survival time across the study, with a period of more constant, slightly reduced estimates beginning at about 2007 and a large increase in 2010. We believed the overall increase represented advancements in BrCa education and screening, the midway leveling off could have been due to a small but inseparable impact from HK, and the large increase was related to the passing of the ACA. The Inline graphic estimate from the best model suggested that the change in 2010 was important to represent as a separate event-related temporal frailty. Figure 3.4 of the supplementary material available at Biostatistics online displays the temporal frailty estimate for ACA - trend, and these estimates suggested that we need not fit ACA - constant lag or ACA - jump and return models since the estimates appeared to change immediately and did not diminish prior to the end of the study time. This was supported by the DIC estimate for ACA - constant lag showing no improvement over that of Inline graphic. The spatio-temporal interaction frailties also appeared to have a slight decline and/or leveling off in the years 2007–2009 and an increase around 2010 (Figures 3.7 and 3.8 of the supplementary material available at Biostatistics online), but DICs suggested that these complicated models did not offer improvements in model fit.

Fig. 2.

Temporal frailty estimates for Inline graphic ACA - constant with the case study data.

5. Discussion

Our results suggested that it was important to consider temporal frailties in addition to spatial frailties for survival analysis. Both types of frailties allowed for modeling of the unmeasured confounding in the data beyond what the known fixed effect risk factors could explain. Further, the temporal frailty had the ability to represent changes in survival experience over time due to events such as a major hurricane or health-related government legislation.

The simulation results indicated that it was necessary to include a temporal frailty when temporal variation was present in the simulation ground truth. Moreover, even when there was no true temporal structure assumed in the simulation, the model did not suffer in terms of goodness-of-fit or recovery when a separate temporal frailty was included. However, the bias specifically related to the frailties was slightly increased. Beyond that, the simulation results also indicated that (1) the fixed effect estimates were recovered well even when the frailties were misspecified and (2) the frailty estimates were recovered nearly perfectly when appropriately specified or with some misspecification.

These results also illustrated the importance of choosing the appropriate type of frailty structure: additive spatial and temporal or a spatio-temporal interaction. With the additive spatial and temporal frailty structure (Inline graphic, Inline graphic), examination of the frailties’ associations could be performed independently for space and time, but the event effect must be defined a priori. Adding an ability for the data to select the change-points for the event-related frailty is of interest for future explorations. For models that contained the spatio-temporal frailty structure (Inline graphic and Inline graphic), interactions between space and time were allowed. This could potentially be a more flexible modeling approach and ideal in certain situations but interpretation and secondary assessments were more difficult. Both the simulation study and the case study illustrated that DIC was useful in distinguishing between the best of these two options for the data at hand.

Identifiability was a concern with these models as multiple temporal random effects were included in a single linear predictor. This issue has been discussed with other models of similar structure, e.g., wherein multiple spatial random effects were included (Waller and Carlin, 2010). However, our results suggested that we did not have the same issues with identifiability since the simulation study appropriately recovered parameters when the event effect was strong enough. Beyond that, the event effect differed from the annual effect in that the change point(s) fell at different points in time; however, this might not always be the case. Finally, the model goodness-of-fit measures indicated whether it was important to include the separate event frailty. Thus, we believed that if the event was not strong enough to be separable and identifiable from the annual frailty, the goodness-of-fit measures would indicate that the event frailty was not necessary. The case study illustrated this as there was likely still an impact from Hurricane Katrina, but that impact was captured in the annual frailty rather than by a separate event-related frailty. Ultimately, we felt that it was best to start with the simpler models (Inline graphic and Inline graphic) and then try to improve upon them with additional parameters.

Using these spatial and temporal frailties led to an improved understanding of our case study outcome of interest. By incorporating frailties into the model, the produced latent effects were examined and the unmeasured confounding in the data was assessed. We demonstrated how to make these interpretations with respect to BrCa-specific mortality in Louisiana SEER data for the years 2000–2013 as well as different ways of defining the temporal frailty parameter. First, we illustrated how to determine the best construction of the temporal frailty parameter, e.g., constant, lagged, jump and return, or trend, by initially fitting the trend parameterization. To assess the spatial and temporal frailties, we compared them to several available risk factors at the same spatial and temporal resolution. Descriptions of the risk factors explored, tables of estimates, plots of the comparisons, and correlations between the frailties and risk factors are included in the supplemental materials and displayed in Table 4.1 and Figure 4.1 of the supplementary material available at Biostatistics online. Based on these results, we believed that the overall temporal frailty was associated with increased number and quality of BrCa screenings (number of mammograms, Inline graphic), access to health care (total number of hospitals, Inline graphic), socio-economic status (% persons in poverty, Inline graphic), and the impacts of HK as well as the passing of the ACA for those diagnosed following the given event. The spatial frailty estimates were similar to the estimates seen in our previous work (Carroll and others, 2017), where we determined that they were associated with socio-demographic status, access to and quality of health care, access to fresh food, and chemical exposure related to working in agriculture. Table 4.1 of the supplementary material available at Biostatistics online displays secondary model fit results, which indicated that these risk factors continued to be the ones associated with this spatial frailty, and Section 4.2 of the supplementary material available at Biostatistics online describes the spatial and temporal secondary frailty assessments.

6. Conclusion

The incorporation of temporal frailties in addition to spatial frailties in survival analysis led to better fitting models and improved inference. Our methods addressed a wide range of spatial and temporal options for structuring frailties and examined the benefits of using these different structures in certain settings. Ultimately, we believed that the temporal frailties could play an important role in representing the unmeasured risk factors related to improvements in disease knowledge and screenings as well as events that have the potential to alter survival.

Supplementary Material

kxy023_Supplementary_Data

Acknowledgments

This research was supported by the Intramural Research Program of NIH, National Institute of Environmental Health Sciences. Conflict of Interest: None declared.

References

  1. American Cancer Society. (2016). Breast cancer facts and figures 2015-2016. https://goo.gl/CtmxyY (accessed May 2017).
  2. Banerjee, S. and Carlin, B. P. (2003). Semiparametric spatio-temporal frailty modeling. Environmetrics 14, 523–535.
  3. Banerjee, S., Carlin, B. P. and Gelfand, A. E. (2003a). Hierarchical multivariate CAR models for spatio-temporally correlated survival data (with discussion). In: Bernardo, J. M., Bayarri, M. J., Berger, J. O., Dawid, A. P., Heckerman, D., Smith, A. F. M. and West, M. (editors), Bayesian Statistics. Oxford: Oxford University Press, pp. 45–63.
  4. Banerjee, S., Wall, M. M. and Carlin, B. P. (2003b). Frailty modeling for spatially correlated survival data, with application to infant mortality in Minnesota. Biostatistics 4, 123–142.
  5. Bastos, L. S. and Gamerman, D. (2006). Dynamic survival models with spatial frailty. Biostatistics 12, 441–460.
  6. Batista, N. E. and Antn, O. A. (2013). Spatiotemporal analysis of lung cancer incidence and case fatality in Villa Clara Province, Cuba. MEDICC Review 15, 16–21.
  7. Besag, J. and Green, P. J. (1993). Spatial statistics and Bayesian computation. Journal of the Royal Statistical Society. Series B (Methodological) 55, 25–37.
  8. Carroll, R. (2016). fillmap: Create maps with spatialpolygons objects. R package version 0.0.0.9000.
  9. Carroll, R., Lawson, A. B., Faes, C., Kirby, R. S., Aregay, M. and Watjou, K. (2015). Comparing INLA and OpenBUGS for hierarchical Poisson modeling in disease mapping. Spatial and Spatio-temporal Epidemiology 14–15, 45–54.
  10. Carroll, R., Lawson, A. B., Faes, C., Kirby, R. S., Aregay, M. and Watjou, K. (2016). Spatio-temporal Bayesian model selection for disease mapping. Environmetrics 27, 466–478.
  11. Carroll, R., Lawson, A. B., Jackson, C. L. and Zhao, S. (2017). Spatial assessment of breast cancer-specific mortality using Louisiana SEER data. Social Science & Medicine (1982) 11, 1–7.
  12. Carroll, R. and Zhao, S. (2018). Gaining relevance from the random: Interpreting observed spatial heterogeneity. Spatial and Spatio-temporal Epidemiology 25, 11–17.
  13. Christensen, R. and Johnson, W. (1988). Modelling accelerated failure time with a Dirichlet process. Biometrika 75, 693–704.
  14. Collett, D. (2013). Modelling survival data in medical research. In: Faraway, J. J., Tanner, M. A., Carlin, B. P., Zidek, J. and Blitzstein, J. K. (editors), Texts in Statistical Science. Boca Raton: CRC Press, pp. 221–274.
  15. Henderson, R., Shimakura, S. and Gorst, D. (2002). Modeling spatial variation in leukemia survival data. Journal of the American Statistical Association 97, 965–972.
  16. Knorr-Held, L. (2000). Bayesian modelling of inseparable space-time variation in disease risk. Statistics in Medicine 19, 2555–2567.
  17. Lawson, A. B., Carroll, R., Faes, C., Kirby, R. S., Aregay, M. and Watjou, K. (2017). Spatio-temporal multivariate mixture models for Bayesian model selection in disease mapping. Environmetrics 28, e2465.
  18. Li, G., Best, N., Hansell, A. L., Ahmed, I. and Richardson, S. (2012). Baystdetect: Detecting unusual temporal patterns in small area data via Bayesian model choice. Biostatistics 13, 695–710.
  19. Li, Y. and Ryan, L. (2002). Modeling spatial survival data using semiparametric frailty models. Biometrics 58, 287–297.
  20. Lunn, D., Jackson, C., Best, N., Thomas, A. and Spiegelhalter, D. (2013). The BUGS Book: A Practical Introduction to Bayesian Analysis, 1st edition. Boca Raton: CRC Press.
  21. Onicescu, G., Lawson, A., Zhang, J., Gebregziabher, M., Wallace, K. and Eberth, J. M. (2017). Bayesian accelerated failure time model for space-time dependency in a geographically augmented survival model. Statistical Methods in Medical Research 26, 2244–2256.
  22. Orbe, J., Ferreira, E. and Nunez-Anton, V. (2002). Comparing proportional hazards and accelerated failure time models for survival analysis. Statistics in Medicine 21, 3493–3510.
  23. Silva, G. L. and Amaral-Turkman, M. A. (2005). Bayesian analysis of an additive survival model with frailty. Communication in Statistics A 33, 2517–2533.
  24. Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society Series B 64, 583–639.
  25. Surveillance, Epidemiology, and End Results Program. (2015). SEER Stat Fact Sheets: Female Breast Cancer. National Cancer Institute.
  26. R Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  27. Thomas, A., Best, N., Lunn, D., Arnold, R. and Spiegelhalter, D. (2014). GeoBUGS User Manual. Cambridge, UK: MRC Biostatistics Unit.
  28. Thomas, A., O’Hara, B., Ligges, U. and Sturtz, S. (2006). Making BUGS open. R News 6, 12–17.
  29. Waller, L. A. and Carlin, B. P. (2010). Disease mapping. In: Gelfand, A. E., Diggle, P. J., Fuentes, M. and Guttorp, P. (editors), Handbook of Spatial Statistics. Boca Raton: CRC Press, pp. 217–244.
  30. Waller, L. A., Carlin, B. P., Xia, H. and Gelfand, A. E. (1997). Hierarchical spatio-temporal mapping of disease rates. Journal of the American Statistical Association 92, 607–617.
  31. Wieder, R., Shafiq, B. and Adam, N. (2016). African American race is an independent risk factor in survival from initially diagnosed localized breast cancer. Journal of Cancer 7, 1587–1598.
  32. Wikle, C., Berliner, M. and Cressie, N. (1999). Hierarchical Bayesian space-time models. Environmental and Ecological Statistics 5, 117–154.
  33. Zhang, J. and Lawson, A. B. (2011). Bayesian parametric accelerated failure time spatial model and its application to prostate cancer. Journal of Applied Statistics 38, 591–603.


Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press
