Summary
This article develops a variety of influence measures for carrying out perturbation (or sensitivity) analysis of joint models of longitudinal and survival data (JMLS) in Bayesian analysis. A perturbation model is introduced to characterize individual and global perturbations to the three components of a Bayesian model: the data points, the prior distribution, and the sampling distribution. Local influence measures are proposed to quantify the degree of these perturbations to the JMLS. The proposed methods allow the detection of outliers or influential observations and the assessment of the sensitivity of inferences to various unverifiable assumptions in the Bayesian analysis of JMLS. Simulation studies and a real data set are used to highlight the broad spectrum of applications of our Bayesian influence methods.
Keywords: Bayesian influence measure, Longitudinal, Perturbation model, Sensitivity analysis, Survival
1. Introduction
There is an extensive research literature on joint modeling of longitudinal and survival data (JMLS) using either frequentist or Bayesian methods. For instance, the early development of joint models for longitudinal and survival data was primarily motivated by characterizing the relationship between features of CD4 or viral load profiles and time-to-event outcomes in HIV/AIDS clinical trials. JMLS has been further developed in other types of biomedical applications, such as cancer vaccine (immunotherapy) trials and quality of life studies. References include Pawitan and Self (1993); De Gruttola and Tu (1994); Tsiatis, DeGruttola, and Wulfsohn (1995); Faucett and Thomas (1996); Wulfsohn and Tsiatis (1997); Henderson, Diggle, and Dobson (2000); Wang and Taylor (2001); Xu and Zeger (2001); Law, Taylor, and Sandler (2002); Song, Davidian, and Tsiatis (2002); Chen, Ibrahim, and Sinha (2002, 2004a); Brown and Ibrahim (2003a, 2003b); Brown, Ibrahim, and DeGruttola (2005); and Chi and Ibrahim (2006, 2007), among many others. Nice overviews of JMLS are given in Tsiatis and Davidian (2004) and Yu et al. (2004) from a frequentist perspective, and in Ibrahim, Chen, and Sinha (2001) as well as Hanson, Branscum, and Johnson (2011) from a Bayesian perspective.
Recent advances in computation and prior elicitation have made Bayesian analysis of these “complex” JMLS feasible. For instance, Bayesian semiparametric approaches for longitudinal profiles include Gaussian processes and functional Dirichlet processes, among others. Nonparametric prior processes for the baseline cumulative hazard function include the gamma process prior, the correlated gamma process, and the Dirichlet process prior, among others. However, very little has been done on developing a general Bayesian influence approach to detect influential points and to assess the various unverifiable assumptions underlying the JMLS, which is the focus of this article.
Bayesian local and global influence (or robustness) approaches have been widely used to perturb the data, the prior, and the sampling distribution and to assess the influence of these perturbations on the posterior distribution and the associated posterior quantities. However, these Bayesian local and global influence approaches are not directly applicable to complex JMLS. Although some frequentist diagnostic tools (Dobson and Henderson, 2003; Rizopoulos, Verbeke, and Molenberghs, 2008; Rizopoulos and Ghosh, 2011) have been developed for specific JMLS, they are not sufficient for carrying out sensitivity analysis (e.g., prior sensitivity) for a complex Bayesian analysis of JMLS.
The development of the proposed methodology was primarily motivated by a clinical trial conducted by the International Breast Cancer Study Group (IBCSG) (Chi and Ibrahim, 2006). A subset of the IBCSG data set contains the longitudinal measurements of quality of life (QOL) taken at baseline and at months 3 and 18 after randomization. In addition, we have bivariate failure times, namely disease-free survival (DFS) and overall survival (OS), collected from n = 832 patients from Switzerland, Sweden, and New Zealand/Australia. The covariates are the different therapeutic procedures, age, estrogen receptor (ER) status (negative/positive), and the number of positive nodes of the tumor. Although a Bayesian analysis of a joint longitudinal and bivariate survival model has been used to fit this data set (Chi and Ibrahim, 2006), a general diagnostic framework for assessing such a model fit to the IBCSG data is completely lacking. There is a great need to develop diagnostic measures for the detection of outliers and/or influential observations and for the assessment of the sensitivity of inferences to the prior distributions and other unverifiable assumptions of the JMLS.
The article is organized as follows. In Section 2, we introduce a general model for jointly modeling multivariate longitudinal and survival data. In Section 3, we discuss various perturbation models and then calculate their associated Bayesian influence measures to quantify the effects of perturbing the data, the prior, and the sampling distribution on possible posterior quantities of interest. We present a detailed analysis of the IBCSG data in Section 4.
2. Joint Models of Longitudinal and Survival Data
Consider data from n independent subjects. For each subject, we observe a K × 1 vector of multiple longitudinal responses and an M × 1 vector of multivariate time-to-event outcomes. For the ith subject, let yik(tijk) be an assessment of the kth longitudinal response measured at time tijk and let Yik = (yik(ti1k), …, yik(ti,nik,k))T denote the observed longitudinal process for the kth response for i = 1, …, n, j = 1, …, nik, and k = 1, …, K. Moreover, for the ith subject, we observe the event time Tim = min(T*im, Cim) and the event indicator δim = 1(T*im ≤ Cim)
for the mth time-to-event outcome for m = 1, …, M, where 1(A) is the indicator function of an event A, and T*im and Cim denote the true event time and the censoring time, respectively.
We consider a general shared parameter model for jointly modeling the longitudinal and survival data as follows. Let bi = (bi1, …, biK) be time-independent random effects underlying both the longitudinal and survival processes for the ith subject. Conditional on bi, all components of the longitudinal outcomes and the time-to-event outcomes are independent. Let Yi = (Yi1, …, YiK), Ti = (Ti1, …, TiM)T, and δi = (δi1, …, δiM)T. The shared parameter model is defined by
p(Yi, Ti, δi, bi; θ) = p(Yi | bi; θy) p(Ti, δi | bi; θT) p(bi; θb),  (1)
where p(… | …) and p(…) denote the appropriate conditional density and density functions, respectively, and θ = (θy, θT, θb) is the vector containing all unknown parameters corresponding to each of the submodels. Moreover, corresponding to this partition, θ can be further decomposed into a finite-dimensional parameter vector θF and a vector of infinite-dimensional parameters θI, such as the baseline hazard function or cumulative baseline hazard function, for each of the submodels. This class of shared parameter models in equation (1) includes most JMLS in the existing literature as special cases (Ibrahim et al., 2001; Tsiatis and Davidian, 2004; Hanson et al., 2011).
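To make the factorization in equation (1) concrete, here is a minimal Python sketch that assembles the joint log-density of a single subject from the three submodel log-densities; the function names (log_p_y, log_p_surv, log_p_b) and the dictionary layout of θ are illustrative placeholders of ours, not part of the model specification.

```python
# Minimal sketch of the shared-parameter factorization in equation (1):
# log p(Y_i, T_i, delta_i, b_i; theta)
#   = log p(Y_i | b_i; theta_y) + log p(T_i, delta_i | b_i; theta_T) + log p(b_i; theta_b).
# The three component functions below are hypothetical placeholders supplied by the user.

def joint_log_density(y_i, t_i, delta_i, b_i, theta, log_p_y, log_p_surv, log_p_b):
    """Joint log-density of one subject's data and random effects under equation (1)."""
    return (log_p_y(y_i, b_i, theta["y"])                 # longitudinal submodel
            + log_p_surv(t_i, delta_i, b_i, theta["T"])   # survival submodel
            + log_p_b(b_i, theta["b"]))                   # random-effects submodel
```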
We specify each of the submodels in equation (1) as follows. First, we consider a multivariate generalized linear mixed model for the longitudinal process. Specifically, all components of Yik conditional on bik are independent and the conditional distribution of each yik (tijk) given bik is a member of the exponential family with a nonlinear link function given by
gk(E{yik(tijk) | bik}) = ηik(tijk, bik),  (2)
where gk (·) is a known monotonic link function and ηik (t, bik) is a parametric or nonparametric function of the random effects and t. Also, ηik (t, bik) may depend on other covariates of interest, such as gender. Furthermore, we consider a general form of ηik (t, bik) given by
ηik(t, bik) = Rik(t)βk + Wik(t)bik,  (3)
where Rik (t) and Wik (t) are, respectively, the fixed effects and random effects design matrices, and βk and bik are vectors of the corresponding fixed and random effects parameters.
For the specification in equation (3), we may consider a random varying-coefficient model as follows:
ηik(t, bik) = Σ_{l=1}^p xil Σ_s θikl,s fs(t),  (4)
where xi = (xi1, …, xip)T is a vector of covariates and fs (t) are known basis functions, such as B-splines. If θikl,s = βkl,s + bikl,s with E(bikl,s) = 0, then βk includes all coefficients βkl,s and bik includes all bikl,s for all k, l, and s.
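To illustrate the varying-coefficient construction in equations (3) and (4), the following sketch evaluates ηik(t, bik) = Σ_l xil Σ_s θikl,s fs(t) with θikl,s = βkl,s + bikl,s. A plain polynomial basis stands in for B-splines, and the array shapes and toy values are assumptions made only for this illustration.

```python
import numpy as np

def poly_basis(t, S):
    """Simple polynomial basis f_s(t) = t**s, s = 0, ..., S-1 (B-splines could be used instead)."""
    return np.array([t ** s for s in range(S)])

def eta_ik(t, x_i, beta_k, b_ik):
    """Varying-coefficient predictor of equation (4):
    eta_ik(t, b_ik) = sum_l x_il sum_s theta_ikl_s f_s(t), with theta_ikl_s = beta_kl_s + b_ikl_s."""
    theta = beta_k + b_ik                 # (p, S) array of subject-specific coefficients
    f = poly_basis(t, theta.shape[1])     # basis functions evaluated at time t
    return float(x_i @ (theta @ f))       # sum over covariates l and basis functions s

# Toy usage: p = 2 covariates and S = 3 basis functions, with illustrative values.
rng = np.random.default_rng(0)
x_i = np.array([1.0, 0.5])
beta_k = rng.normal(size=(2, 3))          # fixed-effect coefficients beta_kl_s
b_ik = 0.1 * rng.normal(size=(2, 3))      # random-effect coefficients b_ikl_s (mean zero)
print(eta_ik(0.75, x_i, beta_k, b_ik))
```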
Second, we consider a general multivariate survival model for the survival process as follows. Let S(t1, …, tM |zi, Hi (t, bi), bi) be the joint survivor function of (Ti1, …, TiM) given (zi, Hi (t, bi), bi), where Hi (t, bi) = {ηik (t̃, bik) : t̃ ∈ [0, t), k = 1, …, K} and zi is a vector of time-independent covariates. It is assumed that S(t1, …, tM |zi, Hi (t, bi), bi) takes the form
S(t1, …, tM | zi, Hi(t, bi), bi) = F(S1(t1 | zi, Hi(t, bi), bi), …, SM(tM | zi, Hi(t, bi), bi); φ),  (5)
where F (…) is a known function, φ is a vector of unknown parameters for characterizing the dependence or association structure, and Sm (t|zi, Hi (t, bi), bi) for m = 1, …, M are the marginal survival functions given (zi, Hi (t, bi), bi). For bivariate time-to-event outcomes, Chi and Ibrahim (2006) have proposed a joint survivor function that is a special case of (5).
For the mth time-to-event outcome, we assume that the marginal hazard function of the ith subject is given by
λm(t | zi, Hi(t, bi), bi) = λm0(t) exp{g̃m0(zi; γm) + Σ_{k=1}^K g̃mk(ηik(t, bik); αmk)},  (6)
where λm0(t) is an unknown baseline hazard function and θm,T contains all unknown parameters in λm(t | zi, Hi(t, bi), bi) except λm0(t). Moreover, g̃mk(·; ·) for k = 0, …, K are prespecified functions that characterize the effect of the kth longitudinal profile on the mth time-to-event outcome. Then, we calculate the marginal survival function Sm(t | zi, Hi(t, bi), bi) = exp{−∫_0^t λm(u | zi, Hi(u, bi), bi)du}. In the literature, it is common to assume that
λm(t | zi, Hi(t, bi), bi) = λm0(t) exp{γmT zi + Σ_{k=1}^K αmk ηik(t, bik)},  (7)
where γm is a vector of unknown parameters for the time-independent covariates and αmk are unknown parameters.
Third, we consider a multivariate model for the random effects bi as follows. Specifically, let p0(bi ; θb) be a prespecified density function, such as multivariate Gaussian. The density of bi, denoted by p(bi ; θb), is assumed to take the form p0(bi ; θb)ψ(bi ; θb), where ψ(·; ·) is a known and nonnegative function such that ∫ p(bi ; θb)dbi = 1. For instance, ψ(bi ; θb) can be the square of a polynomial function of individual components of bi or the density of a copula function. We may further consider nonparametric alternatives to the parametric model p(bi ; θb), such as a Dirichlet process.
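As a one-dimensional illustration of the construction p(bi; θb) = p0(bi; θb)ψ(bi; θb), the sketch below takes p0 to be a standard normal density and ψ to be the square of a polynomial, and normalizes the product numerically; these specific choices are ours and are made purely for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def make_density(coef):
    """Density proportional to p0(b) * psi(b), with p0 a standard normal density and
    psi(b) the square of a polynomial with coefficients `coef` (an illustrative choice)."""
    psi = lambda b: np.polyval(coef, b) ** 2              # known, nonnegative weight function
    unnorm = lambda b: norm.pdf(b) * psi(b)
    const, _ = quad(unnorm, -np.inf, np.inf)              # numerical normalizing constant
    return lambda b: unnorm(b) / const

p_b = make_density(coef=[0.3, 1.0])                       # psi(b) = (0.3 b + 1)^2
print(quad(p_b, -np.inf, np.inf)[0])                      # integrates to 1, as required
```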
A formal Bayesian analysis of (θ, b) also involves the specification of a prior distribution p(θ), where b = (b1, …, bn). A typical joint prior specification is to assume p(θ) = p(θF)p(θI), where p(θF) and p(θI), respectively, denote parametric prior distributions for the components of θF and non-parametric prior distributions for the components of θI. Let Do = {(Yi, Ti, δi): i = 1, …, n}. Then, we use Markov chain Monte Carlo (MCMC) methods to obtain samples from the joint posterior distribution of (θ, b), which is given by
p(θ, b | Do) ∝ {Π_{i=1}^n p(Yi | bi; θy) p(Ti, δi | bi; θT) p(bi; θb)} p(θ).  (8)
We focus on nonparametric priors for the nonparametric components in ηik(t, bik) and the baseline hazard or cumulative baseline hazard function. We can take different prior distributions for the coefficients {θikl,s} in model (4), including a Gaussian prior, zero-inflated priors, and stick-breaking priors, among others. For instance, a Dirichlet process prior for {θikl,s}, denoted by DP(αP0), usually clusters the longitudinal profiles into k ≤ n clusters, where P0 is the base probability measure and α is the confidence parameter. See Dunson (2009) for a nice review of Bayesian nonparametric methods for functional data.
Different prior distributions for the baseline hazard λm0(·) or cumulative baseline hazard Λm0(·) include the piecewise constant hazards model, the gamma process model, the beta process model, and the Dirichlet process model. As an illustration, we construct the piecewise constant hazards model for λm0(·). We start with a finite partition of the time axis, 0 < cm,1 < · · · < cm,L, with cm,L > Tim for all i, and then set λm0(t) = hmℓ for t ∈ Imℓ = (cm,ℓ−1, cm,ℓ]. Furthermore, a first-order autoregressive prior or an independent gamma prior can be taken for hm = (hm1, …, hmL)T (Sinha, 1993; Arjas and Gasbarra, 1994; Ibrahim et al., 2001). An important alternative is the gamma process prior for Λm0(·), that is, Λm0(t) ~ P(c0Λ*(t), c0), where c0 is a fixed constant and Λ*(t) is a known increasing function with Λ*(0) = 0.
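To make the piecewise constant specification and the hazard in equation (7) concrete, the sketch below evaluates λm0(t), the corresponding cumulative baseline hazard, and λm(t | zi, Hi(t, bi), bi) for a given linear predictor; the cut points, heights, and parameter values are illustrative assumptions, and the ηik(t, bik) values are assumed to have been evaluated already.

```python
import numpy as np

def baseline_hazard(t, cuts, h):
    """Piecewise constant baseline hazard: lambda_m0(t) = h_l for t in (c_{m,l-1}, c_{m,l}]."""
    idx = np.searchsorted(cuts, t, side="left")           # index of the interval containing t
    return h[min(idx, len(h) - 1)]

def cum_baseline_hazard(t, cuts, h):
    """Cumulative baseline hazard: sum of h_l times the length of each interval covered by [0, t]."""
    edges = np.concatenate(([0.0], cuts))
    covered = np.clip(t - edges[:-1], 0.0, np.diff(edges))
    return float(np.sum(h * covered))

def hazard_eq7(t, z_i, gamma_m, alpha_m, eta_vals, cuts, h):
    """Hazard in the form of equation (7):
    lambda_m0(t) * exp(gamma_m' z_i + sum_k alpha_mk * eta_ik(t, b_ik)),
    where eta_vals holds the values eta_ik(t, b_ik), k = 1, ..., K, already evaluated at t."""
    return baseline_hazard(t, cuts, h) * np.exp(z_i @ gamma_m + alpha_m @ eta_vals)

# Toy usage with L = 4 intervals and illustrative parameter values.
cuts = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([0.1, 0.2, 0.15, 0.3])
z_i, gamma_m = np.array([1.0, 0.0]), np.array([0.5, -0.2])
alpha_m, eta_vals = np.array([0.8]), np.array([0.4])
print(cum_baseline_hazard(2.5, cuts, h))                  # 0.375 for these cut points and heights
print(hazard_eq7(2.5, z_i, gamma_m, alpha_m, eta_vals, cuts, h))
```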
Our aim is to carry out Bayesian inference about parameters of interest, which requires a reasonably “robust” prior p(θ) and the correct specification of p(Yi, Ti, δi; θ). A nonrobust prior p(θ), the presence of outliers, and the misspecification of the JMLS may introduce serious bias in the estimation and inference on θ. Thus, it is crucial to assess the sensitivity of statistical inference to the prior, the sampling distribution, and outliers. We note that existing frequentist diagnostic tools are not sufficient for this endeavor (Dobson and Henderson, 2003; Rizopoulos et al., 2008; Rizopoulos and Ghosh, 2011).
Example 1
For the purposes of illustration, we consider an example with two longitudinal markers and bivariate survival times. In this case, K = M = 2. Specifically, each longitudinal response is given by
(9)
for i = 1, …, 100, k = 1, 2, and j = 1, …, ni, where the ri s represent a baseline covariate in the longitudinal model. Moreover, it is assumed that tij1 = tij2 for all i and j, εij = (εij1, εij2)T are independently and identically distributed as N2(0, Σ), and the random effects are distributed as N4(0, Φ), where bik = (bik0, bik1)T for k = 1, 2. Here Σ and Φ are covariance matrices. Conditional on bi, the two event and censoring times are assumed to be independent and their marginal hazard functions are given by
λm(t | zi, Hi(t, bi), bi) = λm0(t) exp{γmT zi + αm1 ηi1(t, bi1) + αm2 ηi2(t, bi2)},  (10)
for m = 1, 2, where zi = (zi1, zi2)T is a vector of time-independent covariates. Let Yi (t) = (yi1(t), yi2(t))T and ηi (t, bi) = (ηi1(t, bi1), ηi2(t, bi2))T. The density of (Yi, Ti, δi, bi) given θ for the ith subject, denoted by p(Yi, Ti, δi, bi ; θ), is given by
where C0 is a constant independent of θ.
To carry out a Bayesian analysis, we take a joint prior distribution for θ as follows:
(11)
for k, m = 1, 2, where R0, ρ0, and the remaining quantities in (11) are prespecified hyperparameters. For the baseline hazard λm0(·), we take a piecewise constant hazards model with 250 subintervals of equal length such that λm0(t) = hml for t ∈ (cm,l−1, cm,l], where the cm,l s are prespecified constants. We take hml ~ Γ(τ0l, τ1l) for l = 1, …, 250 and m = 1, 2. We use MCMC methods to conduct our Bayesian influence analysis on θ and b.
3. Bayesian Influence Analysis
We address three issues related to Bayesian influence analysis of JMLS: perturbation models for perturbing the JMLS, appropriate perturbations, and Bayesian influence measures.
3.1. Perturbation Models and Appropriate Perturbations
We introduce three classes of perturbation models to formally perturb JMLS. Let ω be a perturbation vector in a set Ω ⊂ RW, which represents a Euclidean space of dimension W, where W is an integer. The perturbed model {p(Do, b; θ, ω): ω ∈ Ω} characterizes various perturbations to the assumed density p(Do, b; θ) such that ∫ p(Do, b; θ, ω)dDo db = 1 and p(Do, b; θ, ω0) = p(Do, b; θ) for a unique ω0, which represents no perturbation. The first class of perturbations individually perturbs a subject’s longitudinal profile, repeated measures within each subject, covariates, survival time, and censoring indicator. For instance, we introduce a perturbation vector ωy,i into p(Yi |bi; θy) to perturb Yi such that the perturbed density is given by
p(Do, b; θ, ω) = Π_{i=1}^n p(Yi | bi; θy, ωy,i) p(Ti, δi | bi; θT) p(bi; θb),  (12)
in which ω = {ωy,1, …, ωy,n}. This single-case perturbation (12) of the individual longitudinal profiles is primarily designed to detect one or a few influential subjects whose longitudinal profiles differ substantially from those of the other subjects over time. Furthermore, we may introduce a perturbation vector ωi into p(Ti, δi |bi; θT) to perturb (Ti, δi) such that the perturbed density is given by
p(Do, b; θ, ω) = Π_{i=1}^n p(Yi | bi; θy) p(Ti, δi | bi; θT, ωi) p(bi; θb),  (13)
in which ω = {ω1, …, ωn}. This single-case perturbation (13) of the survival data is designed to reveal survival times that are influential relative to the other survival times.
The second class is to perturb the shared random effects that underlie both the longitudinal measurement and survival processes. For instance, we introduce a perturbation vector ωb,i to simultaneously perturb (Yi, Ti, δi) in the presence of bi such that
p(Do, b; θ, ω) = Π_{i=1}^n p(Yi | bi; θy, ωb,i) p(Ti, δi | bi; θT, ωb,i) p(bi; θb).  (14)
In equation (14), we use the same ωb,i to simultaneously perturb p(Yi |bi; θy) and p(Ti, δi |bi; θT) in order to delineate large discrepancies between the longitudinal profile and the corresponding survival time, even when neither is influential on its own. This single-case perturbation (14) is primarily designed to detect influential survival times whose chance of occurrence is low given the subject-specific longitudinal profiles. An alternative perturbation is to introduce a perturbation ωb,i into p(bi; θb) such that
p(Do, b; θ, ω) = Π_{i=1}^n p(Yi | bi; θy) p(Ti, δi | bi; θT) p(bi; θb, ωb,i).  (15)
Perturbation (15) is designed to detect “influential” random effects bi even when the corresponding Ti and Yi are not influential. Furthermore, by setting ωb,i = ωb, we can formally assess the parametric distributional assumptions on the random effects bi and the impact of such perturbations on statistical inference, such as their effect on parameter estimation.
The third class includes perturbations to the prior p(θ) and simultaneous perturbations to all three components of the Bayesian model. A fundamental issue associated with any Bayesian analysis is how sensitive posterior quantities, such as the Bayes factor, parameter estimates, and credible (or highest posterior density) intervals, are to changes in the prior distribution. Thus, it is important to assess both the Bayesian semiparametric assumptions for the longitudinal profiles and perturbations to the nonparametric prior processes for the cumulative baseline hazard function. For instance, we consider a prior perturbation to Λm0(t) ~ P(c0Λ*(t), c0) by assuming Λm0(t) ~ P(c0Λ*(t, ωP), c0) such that Λ*(t, 0) = Λ*(t). Combining the first two classes with the third class, we can obtain various simultaneous perturbations to the prior, the sampling distribution, and the data, which allows us to assess the simultaneous sensitivity of all components of a Bayesian analysis.
Example 1 (continued)
We consider some simultaneous perturbations as follows:
(16)
Thus, the perturbed (unnormalized) log-posterior is given by
(17)
In this case, ω contains (ωy,1, …, ωy,n, ωb,1, …, ωb,n, ωα1, ωα2, ωγ1, ωγ2)T and ω0 = (0, …, 0, 1, …, 1, 1, 0, 1, 0)T represents no perturbation.
After perturbing the JMLS, we need to quantify the amount of perturbation introduced by each perturbation, the extent to which each component of a perturbation vector contributes to it, and the degree of orthogonality among the components of a perturbation vector (Amari, 1990; Zhu, Ibrahim, and Tang, 2011). This is critical for properly pinpointing the cause (e.g., the prior) of a large effect. Specifically, we regard the perturbed model p(Do, b, θ; ω) as the probability density of (Do, b, θ) for a given ω and then calculate its score function ∂ωhℓ(ω), where ∂ωh = ∂/∂ωh, ℓ(ω) = log p(Do, b, θ; ω), and ωh is the hth component of ω. Thus, the Fisher information matrix with respect to ω, denoted by G(ω) = (ghs(ω)), is a W × W matrix whose (h, s) element is given by ghs(ω) = Eω[∂ωhℓ(ω)∂ωsℓ(ω)] for h, s = 1, …, W, where Eω denotes the expectation taken with respect to p(Do, b, θ; ω).
We call p(Do, b, θ; ω) an appropriate perturbation model if G(ω0) equals cIW, where c is any positive scalar and IW is the W × W identity matrix (Zhu et al., 2007). Specifically, ghh(ω) can be regarded as the amount of perturbation introduced by ωh, whereas the correlation ghs(ω)/{ghh(ω)gss(ω)}1/2 indicates an association between ωh and ωs. A diagonal G(ω) implies that all components of ω are orthogonal to each other. Orthogonal subcomponents of ω allow for easy detection of the cause of a large effect. If G(ω0) is not diagonal, then we choose a new perturbation vector ω̃, defined by
ω̃(ω) = ω0 + G(ω0)1/2(ω − ω0).  (18)
Based on ω̃, we can obtain a new perturbation model p(Do, b, θ; ω̃) such that G(ω̃) evaluated at ω0 equals IW.
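Assuming that draws of the score vector ∂ωℓ(ω) at ω0 are available (e.g., evaluated at posterior samples of (θ, b)), the sketch below approximates G(ω0) by a Monte Carlo average of score outer products and applies the square-root transformation of equation (18) as given above; the inputs are hypothetical.

```python
import numpy as np
from scipy.linalg import sqrtm

def estimate_G(score_draws):
    """Monte Carlo estimate of G(omega_0) = E[score score'], where each row of `score_draws`
    is the score d l(omega)/d omega at omega_0 evaluated at one posterior draw of (theta, b)."""
    S = np.asarray(score_draws)
    return S.T @ S / S.shape[0]

def to_appropriate_perturbation(omega, omega0, G):
    """Transformation of equation (18): omega_tilde = omega_0 + G(omega_0)^{1/2} (omega - omega_0),
    so that the metric tensor in the new perturbation, evaluated at omega_0, is the identity."""
    G_half = np.real(sqrtm(G))            # symmetric square root of G(omega_0)
    return omega0 + G_half @ (omega - omega0)

# Toy usage with W = 3 perturbation components and hypothetical score draws.
rng = np.random.default_rng(1)
scores = rng.normal(size=(1000, 3)) @ np.diag([1.0, 2.0, 0.5])
G = estimate_G(scores)
print(to_appropriate_perturbation(np.array([0.1, 0.1, 0.1]), np.zeros(3), G))
```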
Example 1 (continued)
Let gy,ii denote the diagonal element of G(ω0) corresponding to ωy,i for i = 1, …, n. It follows from the definition of G(ω0) that this matrix is diagonal, which indicates that all components of ω are orthogonal to each other. However, because G(ω0) does not take the form cIW, different components of ω introduce different amounts of perturbation. For instance, a large gy,ii indicates that ωy,i introduces a large perturbation for a subject with more repeated measures (larger ni). In practice, we can always switch to the appropriate perturbation scheme in equation (18).
3.2. Local Influence Measures
Let f(p(Do, b, θ; ω)) = f (ω) be a real objective function (e.g., Bayes factor) of the perturbed model, which defines the aspect of inference of interest for sensitivity analysis. We use f(ω) to measure the effect of a small perturbation ω to a JMLS around ω0. Specifically, we consider a smooth curve of the perturbed model p(Do, b, θ; ω(t)) such that p(Do, b, θ; ω(0)) = p(Do, b, θ). Then, the score function of ℓ(ω (t)) with respect to t is equal to ∂t ℓ(ω(t)) = hT∂ωℓ(ω(t)), where ∂t = ∂/∂t, ∂ω = ∂/∂ω, and ∂t ω(t)|t =0 = h ∈ RW. Then, we quantify the effects of introducing ω(t) to perturb the JMLS by using {f (ω(t)) − f (ω(0))}2 relative to the Kullback–Leibler divergence between p(Do, b, θ; ω(0)) and p(Do, b, θ; ω(t)), denoted by S(ω(0), ω(t)).
For small t, it follows from a Taylor series expansion that S(ω(0), ω(t)) ≈ 0.5t2hTG(ω0)h and f(ω(t)) − f(ω(0)) = ḟh(0)t + O(t2), where ḟh(0) = hT∇f, in which ∇f = ∂ωf(ω0). If ∇f ≠ 0, we use a quantity FIf,h, called the first-order influence measure (FI) in the direction h ∈ RW for the objective function f(ω), which is given by
FIf,h = hT∇f∇fTh/(hTGh),  (19)
where G = G(ω0). For the appropriate perturbation ω̃(ω) given in equation (18), FIf,h reduces to hTG−1/2∇f∇fTG−1/2h with the constraint hTh = 1.
We use the maximum value of FIf,h and its associated eigenvector, denoted by hmax, as influence measures to quantify the largest degree and most influential direction of local influence of ω̃ on the JMLS. It can be easily shown that FIf,hmax equals ∇fTG−1∇f and that hmax ∝ G−1/2∇f. In particular, hmax can be used to assess the robustness of priors or to identify influential observations and incorrect sampling distributional assumptions for single-case and global perturbations (Cook, 1986). Following Zhu and Lee (2001) and Zhu et al. (2007), we also suggest inspecting FIf,ei to identify the most significant components of ω̃, where ei is a W × 1 vector with a 1 for the ith element and 0 otherwise. For instance, we consider the Bayes factor given by f(ω) = log p(Do; ω) − log p(Do; ω0), where p(Do; ω) = ∫ p(Do, b, θ; ω)dbdθ. Under some smoothness conditions, it can be shown that ∇f = Eω0[∂ω log p(Do, b, θ; ω0) | Do]. To calculate the local influence measures associated with f(ω), we just need to compute ∇f and G. In practice, we can use MCMC methods to draw samples {(θ(s), b(s)): s = 1, …, S0} from p(θ, b; Do, ω0) and approximate ∇f via S0−1 Σ_{s=1}^{S0} ∂ω log p(Do, b(s), θ(s); ω)|ω=ω0.
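A minimal sketch of the first-order computations described above, under the assumption that ∇f has been approximated by averaging the score over posterior draws and that G = G(ω0) is available: it returns FIf,hmax = ∇fTG−1∇f, the normalized direction hmax ∝ G−1/2∇f, and the componentwise measures FIf,ei for the appropriate perturbation.

```python
import numpy as np
from scipy.linalg import sqrtm

def first_order_influence(grad_f, G):
    """First-order influence measures under the appropriate perturbation:
    FI_max = grad_f' G^{-1} grad_f, h_max proportional to G^{-1/2} grad_f (normalized),
    and FI_{f,e_i} given by the squared components of G^{-1/2} grad_f."""
    G_inv = np.linalg.inv(G)
    G_inv_half = np.real(sqrtm(G_inv))
    FI_max = float(grad_f @ G_inv @ grad_f)
    v = G_inv_half @ grad_f
    h_max = v / np.linalg.norm(v)
    FI_ei = v ** 2
    return FI_max, h_max, FI_ei

# Toy usage: grad_f approximated by averaging hypothetical score draws over posterior samples.
rng = np.random.default_rng(2)
score_draws = rng.normal(size=(1000, 3)) + np.array([0.3, 0.0, -0.1])
grad_f = score_draws.mean(axis=0)         # MCMC approximation of the gradient of f
G = score_draws.T @ score_draws / score_draws.shape[0]
print(first_order_influence(grad_f, G))
```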
We can also carry out Bayesian local influence analysis when ∇f = 0. Because f(ω(t)) = f(ω(0)) + 0.5hTHf ht2 + O(t3), where Hf = ∂2f(ω)/∂ω∂ωT evaluated at ω0, we introduce a second-order influence measure (SI) in the direction h ∈ RW, given by
SIf,h = hTHfh/(hTGh).  (20)
For ω̃(ω) in equation (18), SIf,h reduces to hTG−1/2HfG−1/2h, where hTh = 1. Moreover, we also use SIf,ei and the eigenvalue–eigenvector pairs of G−1/2HfG−1/2 as influence measures. In particular, we use the eigenvector associated with the largest eigenvalue of G−1/2HfG−1/2, again denoted by hmax, to quantify the most influential direction of local influence of ω̃ on the JMLS.
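When ∇f = 0, the second-order analysis rests on the eigen-decomposition of G−1/2HfG−1/2. A short sketch under the assumption that the Hessian Hf of f at ω0 has already been computed:

```python
import numpy as np
from scipy.linalg import sqrtm

def second_order_influence(H_f, G):
    """Second-order influence: eigenpairs of G^{-1/2} H_f G^{-1/2}; the eigenvector of the
    largest eigenvalue gives h_max, and the diagonal gives the componentwise measures SI_{f,e_i}."""
    G_inv_half = np.real(sqrtm(np.linalg.inv(G)))
    M = G_inv_half @ H_f @ G_inv_half
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    h_max = eigvecs[:, -1]                # eigenvector of the largest eigenvalue
    SI_ei = np.diag(M)                    # SI_{f, e_i}, i = 1, ..., W
    return eigvals, h_max, SI_ei

# Toy usage with a hypothetical Hessian H_f and metric G.
G = np.diag([1.0, 2.0, 0.5])
H_f = np.array([[0.2, 0.1, 0.0], [0.1, 0.5, 0.0], [0.0, 0.0, -0.1]])
print(second_order_influence(H_f, G))
```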
Finally, we examine the influence measures associated with three common objective functions, namely the φ-divergence, the posterior mean distance, and the Bayes factor, and include the detailed formulas in the supplementary document. Although all three objective functions can assess the local influence of a perturbation vector ω on the JMLS, there is a conceptual difference among these measures. The φ-divergence and the Bayes factor quantify the effects of introducing ω on the overall posterior distribution, whereas the posterior mean distance quantifies the effect of ω on the posterior mean of θ. Because the perturbation vector ω may influence various characteristics of the posterior distribution, such as the shape, mode, and mean, the φ-divergence and the Bayes factor can be more sensitive than the posterior mean distance to certain perturbation schemes ω. In contrast, the posterior mean distance may be more sensitive to perturbations that have a dramatic effect on the posterior mean.
4. Application to the IBCSG Data
We applied the proposed methodology to both simulated data and the IBCSG data described in Section 1. To save space, we present only selected influence analysis results for the IBCSG data here and refer the reader to the supplementary document for further details.
For the IBCSG data, we considered a JMLS for jointly investigating the relationship between the multidimensional QOL measures and the bivariate failure time variables DFS and OS. To satisfy the normality assumption for the four longitudinal QOL indicators considered (appetite, denoted as y1; perceived coping, denoted as y2; mood, denoted as y3; and physical well-being, denoted as y4), we transformed the observed QOL values as in Chi and Ibrahim (2006). The transformed QOLs decreased over time and were scaled between 0 and 10, with smaller values reflecting better QOL. There were 832 patients from Switzerland, Sweden, and New Zealand/Australia, with a total of 2154 QOL observations included in this analysis.
Let yi1(tij1), …, yi4(tij4) be the observed values of the transformed QOLs for the ith patient at the jth time point, respectively. We considered the following JMLS:
(21)
for i = 1, …, 832, j = 1, 2, 3, k = 1, …, 4, and m = 1, 2, where xi = (xi1, …, xi6)T includes the number of initial cycles, reintroduction, the interaction of the number of initial cycles and reintroduction, age, residency in Switzerland, and residency in Sweden. Moreover, zi includes the number of initial cycles, reintroduction, the interaction of the number of initial cycles and reintroduction, age, the number of positive nodes, and ER status. We assumed that εij = (εij1, …, εij4)T are independently and identically distributed as N4(0, Σ) and that the random effects bik = (bik0, bik1)T are independently and identically distributed as N2(0, φk) for i = 1, …, 832, j = 1, 2, 3, and k = 1, …, 4. For the baseline hazard λm0(·), we took the piecewise constant hazards model with L = 250 subintervals of equal length such that λm0(t) = hml for t ∈ (cm,l−1, cm,l], l = 1, …, L, where the cm,l s are prespecified constants.
To conduct a Bayesian analysis, we specified the following prior distributions:
(22)
for m = 1 and 2, and k = 1, …, 4, where h = {hml: m = 1, 2, l = 1, …, L}, and R0, ρ0, τ0l, τ1l, and the remaining quantities in (22) are prespecified hyperparameters. Moreover, several of these hyperparameters, including R0, were set to the Bayesian posterior means obtained from MCMC methods based on noninformative prior distributions for αm, γm, Σ−1, β, and φk. We used MCMC methods, whose key steps are described in the supplementary document, to carry out the Bayesian analysis.
To illustrate our influence analysis, we considered five different perturbations to the JMLS and carried out the associated influence analysis. Specifically, for each perturbation ω, we calculated the metric tensor G and then took the new appropriate perturbation ω̃ (ω) given in equation (18). Detailed derivations of influence quantities, such as G(ω0), can be found in the supplementary document. We chose the Bayes factor as the objective function and then calculated the local influence measure hmax of the Bayes factor for each perturbation scheme by using the MCMC output. Specifically, a total of 5,000 iterations after 5,000 burn-in samples were used to compute all local influence measures.
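Once hmax has been computed from the MCMC output, its unusually large subcomponents point to influential cases. The sketch below flags components exceeding an illustrative cutoff (mean of |hmax| plus two standard deviations); this screening rule is an assumption of ours and is not prescribed by the analysis above.

```python
import numpy as np

def flag_influential(h_max, labels, n_sd=2.0):
    """Flag subcomponents of h_max whose magnitude exceeds mean(|h_max|) + n_sd * sd(|h_max|).
    `labels` maps each component to a case identifier such as (subject i, time point j).
    The cutoff is an illustrative screening rule, not one prescribed by the article."""
    a = np.abs(np.asarray(h_max))
    cutoff = a.mean() + n_sd * a.std()
    return [(lab, float(val)) for lab, val in zip(labels, a) if val > cutoff]

# Toy usage: 10 components labelled by (i, j) pairs, one of them artificially inflated.
h_max = np.array([0.02, -0.01, 0.03, 0.90, -0.02, 0.01, 0.02, -0.03, 0.01, 0.02])
labels = [(i, j) for i in range(1, 6) for j in range(1, 3)]
print(flag_influential(h_max, labels))    # flags the component with |h_max| = 0.90
```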
The first perturbation is a single-case perturbation, which is obtained by perturbing each subject’s longitudinal profile as follows:
In this case, ω consists of the ωij = (ωij1, …, ωij4)T for i = 1, …, n = 832 and j = 1, …, 3, and ω0 = 0 represents no perturbation. This single-case perturbation is designed to detect influential transformed QOL values within the longitudinal profiles. Figure 1 presents the subcomponents of hmax corresponding to all cases (i, j) for k = 1, 2, 3, and 4, where (i, j) represents the ith patient at the jth time point. Inspecting Figure 1 reveals the most influential cases as (158,2), (306,1), (619,3), (788,2), (802,2), and (803,2) for k = 1; (39,2), (541,1), and (625,2) for k = 2; (103,2), (130,2), (209,1), (331,2), and (331,3) for k = 3; and (53,2), (103,2), (208,2), (306,1), (530,1), (780,2), and (792,3) for k = 4.
The second perturbation is also a single-case perturbation, which perturbs each marginal hazard function as follows:
for i = 1, …, n and m = 1, 2. In this case, ω = (ω11, ω21, …, ωn1, ωn2) with ω0 = 1 representing no perturbation. This single-case perturbation is used to detect influential disease-free survival and overall survival times in the survival process. Inspecting Figure 2 reveals patients 15, 18, and 28 as most influential for m = 1, and patients 15, 42, 85, 96, 136, 234, 281, and 282 for m = 2 by our local influence measure hmax for the Bayes factor.
The third perturbation is to simultaneously perturb the shared random effects bi in both the longitudinal profiles and the marginal hazard functions:
where ωi = (ωi1, …, ωi4). In this case, ω = {ω11, …, ω14, …, ωn1, …, ωn4} and ω0 = 1 represents no perturbation. This single-case perturbation is used to detect influential subjects whose survival times have a low chance of occurrence given their subject-specific longitudinal profiles. Inspecting Figures 3 and 4 reveals the following influential subjects. Specifically, in Figure 3, patients 15, 28, 40, 70, 81, 96, 117, 136, 228, and 282 were detected to be the most influential for k = 1 and patients 15, 28, 30, 40, 70, 94, 136, 220, and 234 were detected to be the most influential for k = 2. Figure 4 shows that patients 15, 28, 70, 81, 136, and 150 were detected to be the most influential for k = 3 and patients 9, 15, 18, 28, 70, 94, 96, and 136 were detected to be the most influential for k = 4.
The fourth perturbation involves perturbing the prior distributions as follows:
In this case, ω = {ωβ0, ωβ1, ωα0, ωα1, ωγ0, ωγ1, ωλ, ωΣ, ωφ} and ω0 = {0, 1, 0, 1, 0, 1, 1, 1, 1} represents no perturbation. This perturbation assesses the sensitivity of the Bayes factor to minor changes in the prior distributions. Based on the local influence measures |hmax| for the Bayes factor and the diagonal elements of G(ω0), Figures 5(a) and (b) show that perturbing the prior of hml has a large impact on the Bayesian analysis.
The last perturbation is a simultaneous perturbation. Specifically, we consider the following perturbation scheme:
and a subset of the model components above, together with the priors of βk, αm, γm, hml, Σ−1, and φk−1, are perturbed. In this case, we have
and ω0 = {1, …, 1, 1, 0, 1, 1, …, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1} represents no perturbation. We are interested in examining the sensitivity of all components of the Bayesian analysis to such simultaneous perturbations. Based on all the subcomponents of |hmax| and the gii s, Figure 5(c) identifies the hml s and the prior distribution of the hml s as influential perturbations, as well as the three most influential patients 103, 130, and 780. Finally, we deleted the three influential subjects 103, 130, and 780 and recalculated the posterior estimates of the parameters for the IBCSG data (Table 1). Inspecting Table 1 indicates that many subcomponents of βk and αm are very sensitive to the deletion of these three subjects.
Table 1. Posterior means and standard deviations (SD) of the parameters for the IBCSG data with and without the three influential subjects (103, 130, and 780).

Par. | With: Mean | With: SD | Without: Mean | Without: SD | Par. | With: Mean | With: SD | Without: Mean | Without: SD
---|---|---|---|---|---|---|---|---|---
α11 | 0.263 | 0.055 | 0.725 | 0.141 | β10 | 3.481 | 0.147 | 4.285 | 0.052 |
α12 | −0.838 | 0.042 | 0.090 | 0.142 | β11 | −0.051 | 0.154 | −0.265 | 0.082 |
α13 | −1.081 | 0.061 | 0.077 | 0.129 | β12 | −0.090 | 0.162 | −0.106 | 0.097 |
α14 | 1.245 | 0.076 | −1.655 | 0.215 | β13 | −0.050 | 0.222 | 0.349 | 0.086 |
α21 | 0.642 | 0.096 | 1.132 | 0.146 | β14 | 0.585 | 0.127 | −0.150 | 0.052 |
α22 | −1.363 | 0.067 | 0.062 | 0.179 | β15 | 0.140 | 0.182 | −0.320 | 0.059 |
α23 | −1.525 | 0.056 | 0.069 | 0.156 | β16 | 0.174 | 0.141 | −0.154 | 0.091 |
α24 | 1.650 | 0.121 | −2.276 | 0.334 | β17 | −0.735 | 0.056 | −0.903 | 0.019 |
γ11 | 0.047 | 0.240 | 0.062 | 0.214 | β20 | 5.177 | 0.604 | 5.806 | 0.140 |
γ12 | −0.437 | 0.235 | −0.398 | 0.201 | β21 | 0.268 | 0.124 | −0.195 | 0.154 |
γ13 | −0.149 | 0.346 | 0.084 | 0.399 | β22 | 0.152 | 0.200 | −0.197 | 0.153 |
γ14 | −1.185 | 0.243 | −0.250 | 0.164 | β23 | −0.329 | 0.219 | 0.440 | 0.205 |
γ15 | 1.304 | 0.166 | 1.058 | 0.159 | β24 | 0.344 | 0.246 | 0.140 | 0.158 |
γ16 | −0.790 | 0.212 | −0.322 | 0.154 | β25 | 0.298 | 0.135 | 0.204 | 0.130 |
γ21 | −1.280 | 0.480 | −1.044 | 0.367 | β26 | 0.377 | 0.254 | 0.012 | 0.122 |
γ22 | −1.851 | 0.449 | −1.685 | 0.377 | β27 | −0.863 | 0.112 | −1.014 | 0.048 |
γ23 | 1.525 | 0.682 | 1.695 | 0.723 | β30 | 4.289 | 0.162 | 5.048 | 0.120 |
γ24 | −2.911 | 0.542 | −1.317 | 0.300 | β31 | 0.347 | 0.094 | 0.059 | 0.118 |
γ25 | 2.129 | 0.245 | 1.589 | 0.334 | β32 | 0.213 | 0.136 | 0.158 | 0.125 |
γ26 | −2.929 | 0.430 | −1.726 | 0.239 | β33 | −0.428 | 0.205 | −0.089 | 0.180 |
σ11 | 3.941 | 0.165 | 4.259 | 0.205 | β34 | 0.594 | 0.120 | 0.148 | 0.109 |
σ12 | 1.588 | 0.115 | 1.642 | 0.120 | β35 | 0.268 | 0.161 | −0.061 | 0.119 |
σ13 | 2.444 | 0.125 | 2.524 | 0.131 | β36 | 0.820 | 0.104 | 0.159 | 0.098 |
σ14 | 2.537 | 0.122 | 2.533 | 0.130 | β37 | −0.823 | 0.037 | −0.855 | 0.057 |
σ22 | 4.163 | 0.173 | 3.660 | 0.164 | β40 | 3.974 | 0.623 | 4.560 | 0.071 |
σ23 | 2.716 | 0.128 | 2.386 | 0.129 | β41 | 0.047 | 0.192 | 0.017 | 0.058 |
σ24 | 2.112 | 0.131 | 2.322 | 0.129 | β42 | −0.182 | 0.241 | −0.029 | 0.065 |
σ33 | 4.672 | 0.171 | 4.449 | 0.170 | β43 | −0.019 | 0.293 | 0.130 | 0.076 |
σ34 | 3.129 | 0.137 | 3.395 | 0.142 | β44 | 0.853 | 0.290 | 0.215 | 0.036 |
σ44 | 4.695 | 0.173 | 5.243 | 0.177 | β45 | 0.151 | 0.102 | −0.170 | 0.055 |
φ1,11 | 1.516 | 0.179 | 1.073 | 0.207 | β46 | 0.724 | 0.140 | 0.158 | 0.028 |
φ1,12 | −0.567 | 0.110 | −0.356 | 0.098 | β47 | −0.655 | 0.115 | −0.661 | 0.011 |
φ1,22 | 0.479 | 0.078 | 0.380 | 0.074 | φ3,11 | 0.685 | 0.081 | 1.067 | 0.117 |
φ2,11 | 1.163 | 0.133 | 1.539 | 0.156 | φ3,12 | −0.267 | 0.056 | −0.376 | 0.076 |
φ2,12 | −0.241 | 0.079 | −0.289 | 0.121 | φ3,22 | 0.626 | 0.079 | 0.417 | 0.068 |
φ2,22 | 0.654 | 0.084 | 0.644 | 0.139 | |||||
φ4,11 | 0.537 | 0.088 | 0.328 | 0.072 | |||||
φ4,12 | −0.144 | 0.056 | −0.090 | 0.053 | |||||
φ4,22 | 0.403 | 0.057 | 0.404 | 0.057 |
Supplementary Material
Footnotes
The web-based supplementary document referenced in Sections 3 and 4 is available with this article at the Biometrics website on the Wiley Online Library.
References
- Amari SI. Differential-Geometrical Methods in Statistics. Lecture Notes in Statistics, Vol. 28, 2nd edition. Berlin: Springer-Verlag; 1990.
- Arjas E, Gasbarra D. Nonparametric Bayesian inference from right censored survival data, using the Gibbs sampler. Statistica Sinica. 1994;4:505–524.
- Brown ER, Ibrahim JG. A Bayesian semiparametric joint hierarchical model for longitudinal and survival data. Biometrics. 2003a;59:221–228. doi: 10.1111/1541-0420.00028.
- Brown ER, Ibrahim JG. Bayesian approaches to joint cure rate and longitudinal models with applications to cancer vaccine trials. Biometrics. 2003b;59:686–693. doi: 10.1111/1541-0420.00079.
- Brown ER, Ibrahim JG, DeGruttola V. A flexible B-spline model for multiple longitudinal biomarkers and survival. Biometrics. 2005;61:64–73. doi: 10.1111/j.0006-341X.2005.030929.x.
- Chen MH, Ibrahim JG, Sinha D. Bayesian inference for multivariate survival data with a surviving fraction. Journal of Multivariate Analysis. 2002;80:101–126.
- Chen MH, Ibrahim JG, Sinha D. A new joint model for longitudinal and survival data with a cure fraction. Journal of Multivariate Analysis. 2004;91:18–34.
- Chi YY, Ibrahim JG. Joint models for multivariate longitudinal and multivariate survival data. Biometrics. 2006;62:432–445. doi: 10.1111/j.1541-0420.2005.00448.x.
- Chi YY, Ibrahim JG. A new class of joint models for longitudinal and survival data accommodating zero and non-zero cure fractions: A case study of an international breast cancer study group trial. Statistica Sinica. 2007;17:445–462.
- Cook RD. Assessment of local influence (with discussion). Journal of the Royal Statistical Society, Series B: Methodological. 1986;48:133–169.
- De Gruttola V, Tu XM. Modelling progression of CD4 lymphocyte count and its relationship to survival time. Biometrics. 1994;50:1003–1014.
- Dobson A, Henderson R. Diagnostics for joint longitudinal and dropout time modeling. Biometrics. 2003;59:741–751. doi: 10.1111/j.0006-341x.2003.00087.x.
- Dunson D. Nonparametric Bayes applications to biostatistics. In: Hjort N, Holmes C, Muller P, Walker S, editors. Bayesian Nonparametrics in Practice. Cambridge, UK: Cambridge University Press; 2009. pp. 223–273.
- Faucett C, Thomas D. Simultaneously modeling censored survival data and repeatedly measured covariates: A Gibbs sampling approach. Statistics in Medicine. 1996;15:1663–1685. doi: 10.1002/(SICI)1097-0258(19960815)15:15<1663::AID-SIM294>3.0.CO;2-1.
- Hanson T, Branscum A, Johnson W. Predictive comparison of joint longitudinal-survival modeling: A case study illustrating competing approaches (with discussion). Lifetime Data Analysis. 2011;17:3–28. doi: 10.1007/s10985-010-9162-0.
- Henderson R, Diggle P, Dobson A. Joint modeling of longitudinal measurements and event time data. Biostatistics. 2000;1:465–480. doi: 10.1093/biostatistics/1.4.465.
- Ibrahim JG, Chen MH, Sinha D. Bayesian Survival Analysis. Springer Series in Statistics. New York: Springer-Verlag; 2001.
- Ibrahim JG, Chen MH, Sinha D. Bayesian methods for joint modeling of longitudinal and survival data with applications to cancer vaccine studies. Statistica Sinica. 2004;14:863–883.
- Law NJ, Taylor JMG, Sandler HM. The joint modeling of a longitudinal disease progression marker and the failure time process in the presence of cure. Biostatistics. 2002;3:547–563. doi: 10.1093/biostatistics/3.4.547.
- Pawitan Y, Self S. Modeling disease marker processes in AIDS. Journal of the American Statistical Association. 1993;88:719–726.
- Rizopoulos D, Ghosh P. A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Statistics in Medicine. 2011;30:1366–1380. doi: 10.1002/sim.4205.
- Rizopoulos D, Verbeke G, Molenberghs G. Shared parameter models under random effects misspecification. Biometrika. 2008;95:63–74.
- Sinha D. Semiparametric Bayesian analysis of multiple event time data. Journal of the American Statistical Association. 1993;88:979–983.
- Song X, Davidian M, Tsiatis AA. A semiparametric likelihood approach to joint modeling of longitudinal and time-to-event data. Biometrics. 2002;58:742–753. doi: 10.1111/j.0006-341x.2002.00742.x.
- Tsiatis AA, Davidian M. An overview of joint modeling of longitudinal and time-to-event data. Statistica Sinica. 2004;14:793–818.
- Tsiatis AA, DeGruttola V, Wulfsohn MS. Modeling the relationship of survival to longitudinal data measured with error: Applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association. 1995;90:27–37.
- Wang Y, Taylor JMG. Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. Journal of the American Statistical Association. 2001;96:895–905.
- Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997;53:330–339.
- Xu J, Zeger SL. The evaluation of multiple surrogate endpoints. Biometrics. 2001;57:81–87. doi: 10.1111/j.0006-341x.2001.00081.x.
- Yu M, Law NJ, Taylor JMG, Sandler HM. Joint longitudinal-survival-cure models and their application to prostate cancer. Statistica Sinica. 2004;14:835–862.
- Zhu H, Ibrahim JG, Lee SY, Zhang H. Perturbation selection and influence measures in local influence analysis. Annals of Statistics. 2007;35:2565–2588.
- Zhu H, Ibrahim JG, Tang NS. Bayesian local influence analysis: A geometric approach. Biometrika. 2011;98:307–323. doi: 10.1093/biomet/asr009.
- Zhu H, Lee SY. Local influence for incomplete-data models. Journal of the Royal Statistical Society, Series B: Statistical Methodology. 2001;63:111–126.